26th February 2026
Key Takeaways
Stress testing has become a central tool of macroprudential policy, but its informational value has diminished. Despite increasingly severe scenarios, recent supervisory stress tests have delivered highly predictable and reassuring outcomes that undermine their credibility. Taking the Bank of England’s 2021 stress test as an example, I argue that current exercises are biased towards optimism by modelling assumptions, behavioural simplifications, and institutional incentives that favour reassurance over discovery. As a result, stress testing risks becoming an exercise in implicit certification rather than genuine risk analysis. I call for a shift towards exploratory, diagnostic tools - emphasising vulnerability mapping, reverse stress testing, and sensitivity analysis - to better understand and manage tail risks.
Introduction
Stress testing has become a central pillar of modern macroprudential policy. In the years following the global financial crisis, supervisory stress tests played a valuable role in restoring confidence in banking systems that were demonstrably undercapitalised and poorly understood. By forcing banks to confront severe macroeconomic scenarios—and by linking the results to concrete capital actions—these exercises helped rebuild resilience and credibility at a critical juncture.
More than a decade on, however, it is reasonable to ask whether stress testing, at least in its current form, is still delivering commensurate informational and policy value. Stress scenarios have become progressively more severe and more complex, and the tests more resource-intensive. Yet the results have become increasingly predictable and reassuring. Major banking systems are routinely judged to be resilient to shocks that are not only extreme by historical standards, but in some cases rival the most severe macroeconomic contractions on record.
There are two possible interpretations of these outcomes. The first is that post-crisis reforms have indeed rendered banking systems extraordinarily resilient—robust even to shocks of exceptional magnitude. If so, it naturally raises the question of what type of scenario would now be capable of generating systemic distress. This interpretation also lends support to the view that regulation has overshot—that we have achieved a “stability of the graveyard”, and that capital standards could be relaxed without materially compromising resilience.
The second is more troubling: that current macroprudential stress testing exercises are no longer delivering credible or informative results, instead generating an excessively benign picture of system-wide risk. On that reading, the apparent resilience revealed by stress tests reflects limitations in the design, scope, and objectives of the exercises themselves, rather than genuine invulnerability of the financial system.
I argue the case for the second interpretation. Drawing on recent supervisory stress tests as a motivating example, I suggest that these exercises have come to function primarily as pass–fail capital adequacy assessments, with outcomes that are largely pre-ordained. The consequence is a framework that offers limited insight into where vulnerabilities actually lie, and that may even risk fostering complacency about tail risks. I conclude by outlining an alternative orientation for stress testing—one that places greater emphasis on vulnerability mapping, reverse stress testing, and genuinely exploratory analysis.
Exhibit A: the Bank of England’s 2021 solvency stress test
The Bank of England’s 2021 solvency stress test is a good place to start. The stated purpose of the exercise was to assess whether major UK banks would be resilient to a further severe shock to the global economy, layered on top of the economic damage already inflicted by the Covid-19 pandemic.
The macroeconomic scenario underpinning the test was extraordinarily severe by any measure. Global GDP growth at the beginning of the scenario fell by an amount equivalent to roughly seven standard deviations relative to its historical distribution (Figure 1). Events of this magnitude are vanishingly rare in the historical data and lie far beyond the range of shocks typically used to calibrate banks’ internal risk models.
Figure 1: Distribution of global real GDP growth with 2021 stress scenario marked
Note: Historical distribution of global real GDP growth, with the first-year shock in the Bank of England’s 2021 stress test highlighted. The scenario corresponds to an event around seven standard deviations below the mean. Source: author’s calculations.
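To make the "seven standard deviations" statement concrete, the following is a minimal sketch of how such a figure can be computed and what it would imply if growth were normally distributed. The growth series is made up for illustration and is not the data behind Figure 1; the function names are hypothetical.

```python
import math
import statistics


def shock_in_standard_deviations(history, shock_value):
    """Distance of shock_value from the historical mean, in sample standard deviations."""
    mean = statistics.mean(history)
    stdev = statistics.stdev(history)  # sample (n - 1) standard deviation
    return (shock_value - mean) / stdev


def normal_lower_tail_probability(z):
    """P(Z <= z) for a standard normal variable, via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))


# Hypothetical annual growth rates in per cent: illustrative only, not the Figure 1 data.
illustrative_growth = [3.0, 4.0, 2.0, 3.5, 2.5, 4.5, 3.0, 1.5, 3.5, 2.5]
z_score = shock_in_standard_deviations(illustrative_growth, shock_value=-4.0)
seven_sigma_probability = normal_lower_tail_probability(-7.0)

print(f"Illustrative shock: {z_score:.1f} standard deviations from the mean")
print(f"Gaussian probability of a -7 sigma outcome: {seven_sigma_probability:.1e}")
```

Under a normal distribution, a seven standard deviation shortfall has a probability of the order of 10^-12. Growth outcomes are fat-tailed, so the true likelihood is higher than this, but the calculation gives a sense of how far outside the historical range the scenario sits.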
In scale, the assumed shock was broadly comparable to the collapse in global activity experienced in 2020. That episode triggered a severe breakdown in market functioning, including widespread fire sales, acute liquidity shortages, and a global “dash for cash”. Core financial markets stabilised only after massive and rapid intervention by central banks. Absent such intervention, market functioning would almost certainly have deteriorated further, with potentially serious implications for bank balance sheets, funding costs, and liquidity positions.1
Yet the results were striking. Despite the severity of the macroeconomic path, the aggregate impact on UK banks’ capital positions was modest. The system-wide common equity Tier 1 (CET1) ratio fell by around five to six percentage points at its trough, while leverage ratios declined by roughly one percentage point. No major bank breached its hurdle rates. On this basis, the Bank concluded that the UK banking system was resilient to outcomes “much more severe” than the Monetary Policy Committee’s central forecast, supporting the Financial Policy Committee’s judgement that the system was robust.
It is hard to square this assessment with the scale of the underlying shock. A global recession of this magnitude would plausibly be associated with severe stress across a wide range of financial markets, the potential failure of highly leveraged non-bank financial institutions, and sharp tightening in funding and liquidity conditions. The finding that UK banks would remain resilient in such circumstances does not pass the “sniff test”.
Market-based indicators at the time showed much weaker bank resilience than that implied by the supervisory stress test. Estimates of expected capital shortfalls derived from equity prices suggested that banks were significantly more exposed to a large negative shock than the regulatory exercise indicated. Figure 2 shows the well-known “SRISK” measure, which points to an aggregate capital shortfall of £50-60bn at the time of the stress test.
Figure 2: SRISK-based capital shortfall estimates for major UK banks
Note: SRISK-implied capital shortfalls under a severe global equity market decline, assuming a 20% decline in global equity prices and a 3.5% leverage ratio hurdle for UK banks. Source: VLAB and author’s calculations.
Market-based measures are imperfect, of course. But the stark divergence is revealing nonetheless. It suggests that the stress test may be abstracting from channels of risk that investors perceive as salient in extreme states of the world - particularly those related to market dysfunction, liquidity spirals, and interactions with the non-bank financial sector.
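For readers unfamiliar with SRISK, the sketch below shows the calculation as it is commonly defined: the capital a bank would need to meet a prudential ratio after a market-wide equity decline, less the equity it would have left. The 3.5% hurdle follows the note to Figure 2; the balance sheet figures and the assumed fall in the bank's own equity (`lrmes`) are hypothetical inputs, not the values behind the chart, and flooring the result at zero is a presentational choice.

```python
def srisk(debt, market_equity, lrmes, k=0.035):
    """Capital shortfall conditional on a market-wide equity decline.

    debt: book value of liabilities
    market_equity: current market capitalisation
    lrmes: expected fractional fall in the bank's equity during the decline
    k: required capital as a fraction of (debt + stressed equity)
    """
    stressed_equity = (1.0 - lrmes) * market_equity
    required_capital = k * (debt + stressed_equity)
    # Floor at zero so that a surplus is not reported as a negative shortfall.
    return max(0.0, required_capital - stressed_equity)


# Hypothetical bank, £bn: liabilities of 1,000, market cap of 40, equity falls 40% in stress.
shortfall = srisk(debt=1000.0, market_equity=40.0, lrmes=0.40)
print(f"Illustrative SRISK: £{shortfall:.2f}bn")
```

Because the measure starts from market equity rather than book capital, it falls sharply when investors mark down a bank's value, which is one reason it can diverge from supervisory results based on accounting capital.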
The Bank of England’s 2021 exercise is the most egregious example of a wider issue. A macroprudential stress test is essentially a mapping from a scenario S to a change in bank capital:
Δk=f(S),
where S denotes the severity of the stress scenario, and Δk the implied deterioration in banks’ capital positions. While this mapping is necessarily complex, my contention is that our current technology for running these tests generates an f that is not credible at the tail - producing implausibly small capital impacts even as the scenario severity increases dramatically. I illustrate this point in Figure 3.
Figure 3: The mapping from severity to capital impact – modelled versus plausible
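The contrast can be expressed with two stylised versions of f. The sketch below is purely illustrative: the functional forms and parameters are assumptions chosen to show the shapes, not estimates and not the curves plotted in Figure 3. The "modelled" mapping saturates as severity rises, while the "plausible" mapping is convex, with a quadratic term standing in for amplification.

```python
import math


def modelled_impact(severity, cap=6.0, rate=0.2):
    """Saturating mapping: the capital hit levels off below `cap` as severity rises."""
    return cap * (1.0 - math.exp(-rate * severity))


def plausible_impact(severity, linear=1.2, amplification=0.15):
    """Convex mapping: a quadratic term stands in for amplification at the tail."""
    return linear * severity + amplification * severity ** 2


severities = [1, 3, 5, 7]  # scenario severity, in standard deviations (illustrative)
for s in severities:
    print(
        f"S={s}: modelled {modelled_impact(s):.1f}pp, "
        f"plausible {plausible_impact(s):.1f}pp"
    )
```

With these assumed parameters the two mappings are close for mild scenarios and diverge increasingly as severity rises, which is the sense in which f lacks credibility at the tail.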
Why are stress tests not delivering credible results?
The credibility gap stems from modelling choices, institutional incentives, and communication practices that all bias outcomes in a reassuring direction. Two explanations stand out.
Explanation 1: Wishful thinking
First, stress tests rely on excessively optimistic assumptions about how financial systems behave in extreme stress. While the macroeconomic scenarios themselves are severe, the surrounding environment in which banks are assumed to operate is strikingly benign.
Most exercises abstract from the possibility of disorderly failures of large foreign banks, from severe dysfunction in non-bank financial institutions, and from endogenous liquidity spirals that amplify initial shocks. Funding markets are generally assumed to continue functioning, albeit at higher spreads, and the interaction between solvency and liquidity risk is treated in a highly stylised manner, if at all. Yet it is precisely these amplification mechanisms that have proved decisive in past crises.
The behavioural assumptions are just as problematic. Banks are typically assumed to have perfect foresight over the shape of the stress scenario: they know ex ante where the trough lies and how conditions will evolve thereafter. But in reality, crises are characterised by extrapolative expectations, strategic interaction, and profound Knightian uncertainty. In practice, banks do not know whether an adverse shock represents the beginning of a short-lived downturn or the onset of a prolonged and deep crisis - and this uncertainty materially shapes their behaviour.
Stress tests also optimistically assume that banks will “use” their capital buffers to absorb losses exactly as intended. This is highly questionable. At the same time, banks are assumed to conserve capital by cutting dividends and variable remuneration. If these behavioural assumptions were fully credible, it is hard to understand why the Bank of England - and other major central banks - felt it necessary to impose explicit restrictions on dividends and payouts during the pandemic. The revealed preference of policymakers suggests considerably less confidence in banks’ willingness to act in this way under stress.
Compounding these issues is the heavy reliance on banks’ internal models to translate macroeconomic stress into losses on loan portfolios and trading books. These models are typically estimated using data drawn from relatively benign periods and, in many cases, from episodes that featured substantial official-sector support. As a result, stress test outcomes are highly exposed to model risk precisely where confidence in model outputs should be lowest - in the far tail of the distribution.
The upshot is that stress tests may be implicitly conditioned on a world in which the most disruptive elements of systemic crises are assumed away. Small wonder, then, that even extreme macroeconomic shocks translate into manageable capital impacts.
Explanation 2: Stress testing is a PR exercise
The second explanation has less to do with models than with institutional incentives. In many jurisdictions, macroprudential stress tests are designed, executed, and communicated by authorities that are also responsible for the ongoing supervision of the banking system. The tension is obvious. Stress tests are ostensibly intended to probe vulnerabilities and expose weaknesses. But unfavourable outcomes would simultaneously imply that supervisors have allowed those vulnerabilities to persist.
This dual role biases stress testing towards reassurance rather than discovery. Exercises that repeatedly confirm system-wide resilience can be presented as validation of supervisory effectiveness. By contrast, tests that reveal widespread capital inadequacies would raise uncomfortable questions about past regulatory judgements. Over time, this dynamic risks turning stress testing into a form of implicit certification, rather than a genuine attempt to learn about tail risks.2
Career and reputational incentives matter too. Senior officials involved in stress testing operate within a relatively small ecosystem spanning central banks, supervisory agencies, international institutions and, eventually, the regulated sector itself. In such an environment, there are few rewards to presiding over exercises that trigger market disruption or cast doubt on the robustness of the regime, and many to delivering results that are seen as steadying and confidence-enhancing. These incentives shape outcomes at the margin.
You can see this shift in the official language. In the UK, the 2016 stress test was framed in relatively nuanced terms:
“While the Prudential Regulation Authority (PRA) Board judged that some capital inadequacies were revealed for three banks (The Royal Bank of Scotland Group, Barclays and Standard Chartered), these banks now have plans in place to build further resilience.
The Financial Policy Committee (FPC) judged that, as a consequence of the stress test, the banking system is in aggregate capitalised to support the real economy in a severe, broad and synchronised stress scenario.”
Bank of England Stress Test Results, 2016
The tone since then has become more triumphalist. In 2017, “for the first time since the BoE launched its stress tests in 2014, no bank needs to strengthen its capital position as a result of the stress test”. In 2018, “the UK banking system is resilient to deep simultaneous recessions … more severe overall than the GFC”. The same formulation was repeated in 2019. By 2022, the conclusion was that “reflecting resilience built up by banks in recent years, the results indicate the UK banking system would be able to withstand the severe macroeconomic scenario and has the capacity to support households and businesses throughout the stress.”
The 2021 exercise is particularly revealing. Rather than testing resilience, the stress test was explicitly framed as confirming an existing judgement:
“The Bank’s 2021 solvency stress test has acted as a cross-check on the FPC’s judgement that the banking system is resilient to outcomes for the economy that are much more severe than the MPC’s central forecast”.
“These results support the FPC’s judgement that the system is resilient to outcomes for the economy that are much more severe than the Monetary Policy Committee’s (MPC’s) central forecast”.
Bank of England Financial Stability Report, December 2021.
This reads less like an independent stress test and more like a goal-seek exercise, designed to validate a prior conclusion.
The contrast with IMF-run exercises is telling. Where the institution conducting the test is not the supervisor, results are often markedly less reassuring. The IMF’s October 2020 stress test, for example, found that around 30 per cent of global systemically important banks would see CET1 ratios fall below 8 per cent under its stress scenario.
Finally, stress test results are presented in a way that actively inhibits external scrutiny. Point estimates are offered as if they were known with certainty. There is little transparency about the models used, their empirical accuracy, or the samples over which they were estimated. Alternative assumptions and sensitivity analysis are rarely front and centre. The implicit message is not “here is what we have learned under uncertainty”, but rather: “trust us - the system is resilient”.
To be clear, this is not a claim that stress test results are being deliberately engineered, or that scenarios are selected with the aim of avoiding adverse findings. It is also not a judgement about the intentions or integrity of those involved. Rather, the concern is that insufficient weight is placed on challenging whether the results are plausible, and on acknowledging the limits of what these exercises can realistically deliver. Stress testing is intrinsically difficult. Without a degree of humility about model uncertainty and system complexity, there is a risk that reassurance is mistaken for robustness.
Does this matter?
A common defence is that regulators deliberately “juice” the severity of stress scenarios to offset well-known modelling biases. On this view, extreme scenarios are a feature rather than a bug: they are meant to compensate for optimistic assumptions and structural limitations, delivering an appropriate degree of conservatism overall. This defence is unconvincing.
It reinforces the perception that stress tests have become goal-seek exercises - calibrated to deliver a reassuring bottom line rather than to interrogate vulnerabilities. And repeated publication of highly reassuring results may be counterproductive. If authorities consistently communicate that the probability of another banking crisis is vanishingly small, banks and investors may rationally take on more risk or devote fewer resources to monitoring tail events. If stress testing is akin to war-gaming - building analytical muscle memory for use under pressure - it’s not clear the current approach, with its polished point estimates and confirmatory narratives, is building that capability.
The analogy to pre-Covid pandemic preparedness scores is uncomfortable.
A way forward: from certification to exploration
The shortcomings of current macroprudential stress testing don’t mean stress testing itself has outlived its usefulness. The core insight remains sound: complex financial systems need examining under extreme but plausible conditions. The problem is in how this has been implemented.
Instead of running increasingly resource-intensive pass-fail exercises tied to arbitrary hurdle rates, macroprudential authorities should reorient stress testing toward a more exploratory and diagnostic role. The goal should shift from certifying resilience to understanding where vulnerabilities lie, which shocks matter most, and which institutions or markets the stability of the system is most reliant upon.
There are signs this is happening. In recent years, the Bank of England has complemented its traditional solvency stress tests with a series of exploratory exercises, including climate stress tests, system-wide liquidity scenarios, and a planned future exercise on risks in private markets. The difference: these are framed as tools for learning rather than for judgement. They aim to map exposures, identify amplification mechanisms, and improve understanding of complex interactions, rather than to deliver a binary verdict on resilience.
This shift is welcome and should be taken further.
Take reverse stress testing. Instead of specifying a macroeconomic scenario and examining its impact on bank capital, reverse stress tests start from adverse outcomes - such as severe capital depletion or widespread distress - and ask what combinations of macro-financial conditions would be required to generate them. This is especially useful for identifying non-linearities and interaction effects that are easily missed in forward-looking scenario design. It also forces explicit engagement with the question that current stress tests tend to avoid: what would it actually take to break the system?3
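The logic of a reverse stress test can be sketched in a few lines. The example below is a toy under stated assumptions: the mapping from shocks to capital depletion, its coefficients, and the grids are all hypothetical, with an interaction term standing in for amplification between solvency and funding stress. It starts from a target level of depletion and searches for the combinations of shocks that reach it.

```python
from itertools import product


def capital_depletion(gdp_fall, funding_spread):
    """Hypothetical toy mapping from two shocks to CET1 depletion (percentage points)."""
    # The product term is an assumed interaction between macro and funding stress.
    return 0.6 * gdp_fall + 0.4 * funding_spread + 0.3 * gdp_fall * funding_spread


def reverse_stress_test(target_depletion, gdp_grid, spread_grid):
    """Return every (gdp_fall, funding_spread) pair whose depletion reaches the target."""
    return [
        (g, s)
        for g, s in product(gdp_grid, spread_grid)
        if capital_depletion(g, s) >= target_depletion
    ]


gdp_grid = [0, 2, 4, 6, 8]      # peak-to-trough GDP falls, per cent
spread_grid = [0, 1, 2, 3, 4]   # increases in funding spreads, percentage points
breaking_scenarios = reverse_stress_test(10.0, gdp_grid, spread_grid)

for gdp_fall, spread in breaking_scenarios:
    print(f"GDP -{gdp_fall}%, spreads +{spread}pp: "
          f"depletion {capital_depletion(gdp_fall, spread):.1f}pp")
```

In this toy setting, neither the largest GDP fall nor the largest funding shock reaches the target on its own; only combinations do. That is the kind of interaction effect a single forward-looking scenario can miss.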
Or focus on sensitivity analysis rather than single-scenario outcomes. Instead of concentrating analytical resources on one highly specific stress path, authorities could require banks to report how their capital and liquidity positions respond to a range of well-defined shocks: for example, discrete declines in commercial real estate prices, a sharp steepening or inversion of the yield curve, or severe but plausible funding market disruptions. The goal would be to construct a “risk topography” of the financial system - an explicit mapping of sensitivities to different risk factors - rather than a single headline resilience metric.4
This would also reduce the reliance on opaque internal models operating far outside their estimation range. By focusing on marginal sensitivities and partial shocks, results would be easier to interpret, easier to challenge, and easier to compare across institutions. Crucially, they would also be more informative for policymakers seeking to understand where preventive macroprudential tools might be most effectively deployed.
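A minimal sketch of such a sensitivity exercise follows. Everything in it is an assumption for illustration: the two banks, their linear capital models, the coefficients, and the shock sizes are hypothetical. Each risk factor is shocked on its own and the change in the CET1 ratio is recorded, giving a small table of sensitivities by bank and by factor.

```python
def make_bank_model(base_ratio, cre_beta, rate_beta, funding_beta):
    """Build a hypothetical linear model of a bank's CET1 ratio (per cent)."""
    def cet1_ratio(factors):
        return (
            base_ratio
            - cre_beta * factors["cre_price_fall"]       # per cent fall in CRE prices
            - rate_beta * factors["curve_steepening"]    # percentage points
            - funding_beta * factors["funding_spread"]   # percentage points
        )
    return cet1_ratio


def sensitivities(model, baseline, shocks):
    """Change in the CET1 ratio when each factor is shocked on its own."""
    base = model(baseline)
    result = {}
    for name, size in shocks.items():
        shocked = dict(baseline)           # copy, so the baseline is left untouched
        shocked[name] = baseline[name] + size
        result[name] = model(shocked) - base
    return result


baseline = {"cre_price_fall": 0.0, "curve_steepening": 0.0, "funding_spread": 0.0}
shocks = {"cre_price_fall": 30.0, "curve_steepening": 2.0, "funding_spread": 3.0}
banks = {
    "bank_a": make_bank_model(14.0, cre_beta=0.08, rate_beta=0.2, funding_beta=0.3),
    "bank_b": make_bank_model(15.0, cre_beta=0.02, rate_beta=0.1, funding_beta=0.9),
}
topography = {name: sensitivities(model, baseline, shocks) for name, model in banks.items()}

for bank, row in topography.items():
    print(bank, {factor: round(change, 2) for factor, change in row.items()})
```

In this illustration the two banks have similar headline ratios but very different exposures: one is most sensitive to commercial real estate, the other to funding conditions. A single aggregate resilience number would hide that difference.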
This reframing would make stress testing more like war-gaming than examination. Its value lies not in providing reassurance ex ante, but in developing the analytical muscle memory needed to respond effectively when stress emerges. That, ultimately, was the original promise of stress testing in the aftermath of the global financial crisis. Reclaiming that purpose requires moving beyond the comfort of pass–fail exercises and embracing a more open, uncertain, and informative approach to analysing systemic risk.
Notes
This piece draws on material I presented at a London Quant Group (LQG) Seminar with the same title, in November 2023. I thank Aditya Mori for helpful research assistance in preparing for that seminar.
1 As Deputy Governor Jon Cunliffe put it: “Absent the massive intervention [by central banks], core markets would have continued to seize up and the liquidity crunch would have become worse”. Cunliffe (2022).
2 See Parlasca (2024) for a model of regulatory incentives around the disclosure of stress test results.
3 See Aikman et al. (2024) for an example of such an approach.
4 See Brunnermeier et al. (2011) and Duffie (2011) for more detailed descriptions of the approach.
References
Aikman, D, Angotti, R, and Budnik, K (2024), “Stress Testing with Multiple Scenarios: A Tale on Tails and Reverse Stress Scenarios”, ECB Working Paper No 2024/2941.
Brunnermeier, M, Gorton, G, and Krishnamurthy, A (2011), “Risk Topography”, NBER Macroeconomics Annual 2011.
Cunliffe, J (2022), “Learning from the dash for cash – findings and next steps for margining practices”, Keynote address at the FIA & SIFMA Asset Management Derivatives Forum 2022.
Duffie, D (2011), “Systemic Risk Exposures: A 10-by-10-by-10 Approach”, NBER Working Paper 17281, August 2011.
Parlasca, M (2024), “Stress Tests by an Informed Regulator”, Mimeo.
David Aikman is Director of the National Institute of Economic and Social Research (NIESR).
He joined NIESR in July 2025, having previously been Professor (in Practice) of Finance at King’s Business School and the inaugural Director of the Qatar Centre for Global Banking and Finance. He co-founded the Bank of England Watchers’ Conference (with Richard Barwell) and launched the Macroprudential Matters blog. David is a member of the Bank of England’s Macroprudential Roundtable and, until July 2025, served on the Prudential Regulation Authority’s Cost Benefit Analysis Panel. He has recently advised the Central Bank of Ireland, the Bank of Canada, and the Central Bank of the UAE, and was a Visiting Professor at Keio University in Tokyo in 2023.
Earlier in his career, David worked at the Bank of England in financial stability roles and was seconded to the Federal Reserve Board between 2013 and 2015. He has represented the UK internationally at the FSB, Basel Committee, and ESRB. He holds a PhD in Economics from the University of Warwick.
His research focuses on monetary policy, financial stability, and macro-financial risks.