2nd February 2026
"The bank capital regime provided little warning of the approaching abyss." - Tim Geithner, 20191
Key Takeaways
The success of stress testing in helping to end the Global Financial Crisis (GFC) motivated its inclusion in the post-GFC revisions to the regulatory capital regime. There is no perfect stress test design, but there are some desirable features:
• Multiple scenarios: One scenario is simply not enough to cover the space of relevant risks.
• Horizon length and time-step: Three-year scenarios with quarterly time-steps, to adequately capture different loss and revenue dynamics with acceptable model forecast deterioration, and to match financial reporting cadence.
• Models: Both banks and supervisors need to build their own set of models.
• Supervisor and bank (ICAAP): Supervisory (top-down) and bank (bottom-up) stress tests should be run simultaneously.
• How often: More often is better, but it is not costless. Annual for the largest and most complex banks, every other year for the next tier of banks.
• Disclosure: More is better, but because of strategic behavior, there can be too much disclosure.
• Exploratory scenarios and reverse stress testing: Both are excellent tools – supervisors should use them more extensively.
Overview
Modern stress testing for financial institutions, banks in particular, was born in the Global Financial Crisis (GFC). The then-existing regulatory capital regime proved woefully inadequate in assessing the capital strength of banks; another approach was needed. The success of stress testing in helping to end the crisis proved too seductive to give up once financial peacetime set in. It has since become not only one of the central pillars of the bank capital regime but also a useful tool for bank supervisors and for risk management broadly.
In this paper I discuss features and design choices for effective stress testing from the vantage point of the bank supervisor. A fundamental challenge for supervisors is that they are always at an informational disadvantage vis à vis the banks they supervise – except in one important dimension: the ability to compare across firms. As I will show, stress testing is particularly well suited for leveraging this one advantage to provide valuable insight into bank and banking system resilience.
The focus is on stress testing for capital adequacy purposes. Along the way I show that such stress testing can also be a very effective supervisory tool for assessing some of the core elements of risk management such as risk identification, risk appetite and capital planning. While the discussion centers on capital adequacy for banks, the principles and design features are readily applicable for determining liquidity adequacy via liquidity stress testing. Details on execution of bank capital stress tests, applications to other financial service providers such as insurers or asset managers and even central banks, as well as a broader discussion of stress testing and financial stability can be found in Farmer et al. (2022).
The challenge going into the GFC is captured perfectly in Geithner’s quote. The prevailing bank capital regime had two main planks: a leverage ratio, which is very simple but treats all assets, from treasury bonds to commercial real estate loans, the same; and a risk-based capital ratio, where assets are risk weighted based on either a set of simple weights (e.g. commercial loans are twice as risky as residential mortgages) or more complex model-based weights. Neither of these ratios was flashing red, or even amber, in 2007.
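To make the contrast between these two planks concrete, here is a minimal sketch in Python. The balance sheet, the capital figure and the risk weights are hypothetical illustrations, not regulatory values; the only feature taken from the text is that commercial loans carry twice the weight of residential mortgages under the simple-weight approach.

```python
# Hypothetical balance sheets in $ billions; all figures are illustrative.
safe_mix = {"treasuries": 30.0, "residential_mortgages": 40.0, "commercial_loans": 30.0}
risky_mix = {"treasuries": 0.0, "residential_mortgages": 40.0, "commercial_loans": 60.0}

# Assumed simple risk weights (not actual regulatory values): commercial
# loans are weighted twice as heavily as residential mortgages.
RISK_WEIGHTS = {"treasuries": 0.0, "residential_mortgages": 0.5, "commercial_loans": 1.0}

CAPITAL = 6.0  # hypothetical capital, $ billions


def leverage_ratio(capital, assets):
    """Capital over total assets: every asset counts the same."""
    return capital / sum(assets.values())


def risk_based_ratio(capital, assets, weights):
    """Capital over risk-weighted assets: riskier assets count for more."""
    rwa = sum(amount * weights[name] for name, amount in assets.items())
    return capital / rwa


for label, assets in (("safe mix", safe_mix), ("risky mix", risky_mix)):
    print(label,
          f"leverage={leverage_ratio(CAPITAL, assets):.1%}",
          f"risk-based={risk_based_ratio(CAPITAL, assets, RISK_WEIGHTS):.1%}")
```

In this toy example, shifting from treasuries into commercial loans leaves the leverage ratio unchanged while the risk-based ratio falls. Both remain static, point-in-time measures, which is the limitation the next paragraph takes up.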
Stress testing became the third plank in the bank capital regime. Among its many features and advantages, two stand out: stress testing is dynamic (a scenario unfolds over time), and the impact assessment is for the whole balance sheet (not just assets) and the income statement or profit and loss (P&L). This dynamism allows for different loss emergence profiles as well as revenue and cost arrival, mimicking how bank financials would actually manifest in a stress event.
Because stress testing is scenario based, it is more intuitive and typically easier to understand than the more abstract risk-weighted assets (RWA) approach. Describing relative and absolute riskiness in the context of a severe recession or a market shock with a flight to quality, perhaps triggered by a geopolitical event, is more natural and easier to grasp than the abstraction of a risk weighting formula. This feature is valuable not just for banks, and especially for board members, but also for supervisors who need to assess the soundness and resilience of individual banks and the banking system to shocks. The severity of the scenario(s) embodies a risk appetite, be it for the bank or the supervisor. The lower the risk appetite, the greater the need for high resilience, and the more severe the scenario to test against. And, with one or more common scenarios, supervisors are able to easily compare across institutions: same test at the same time. The table below compares the three planks of the bank capital regime.
While the updates to the capital regime via Basel 3 surely had an impact in raising capital levels, stress testing was the biggest driver, at least in the U.S. At the end of 2006, $486 billion in common equity2 was supporting about $10.1 trillion in total assets in the U.S. banking system. At year-end 2024, $2.1 trillion in CET1 capital was supporting about $23.6 trillion in assets. Controlling for inflation, assets have increased by a little over half but capital has nearly tripled! See also the figure below.
Total assets and capital in the U.S. banking system. “Capital” is CET1 for 2024 and common equity for 2006. 2006 “real” is in 2024 dollars. Source: regulatory (call) reports, H.8 data.
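The arithmetic behind this comparison can be checked with a short Python sketch. It uses only the four figures quoted above and reads the 2024 asset total as $23.6 trillion, which is an assumption on my part based on the scale of the other figures. No price index is needed: deflating assets and capital by the same factor leaves the capital-to-assets ratio unchanged.

```python
# Figures quoted in the text, in $ billions.
# Assumption: the 2024 asset total is read as $23.6 trillion.
capital_2006, assets_2006 = 486.0, 10_100.0
capital_2024, assets_2024 = 2_100.0, 23_600.0

ratio_2006 = capital_2006 / assets_2006  # roughly 4.8%
ratio_2024 = capital_2024 / assets_2024  # roughly 8.9%

# Nominal growth multiples. Deflating both series by the same price index
# changes each multiple but leaves their quotient unchanged.
asset_multiple = assets_2024 / assets_2006
capital_multiple = capital_2024 / capital_2006
relative_growth = capital_multiple / asset_multiple  # equals ratio_2024 / ratio_2006

print(f"capital/assets: {ratio_2006:.1%} -> {ratio_2024:.1%}")
print(f"capital grew {relative_growth:.2f}x as fast as assets")
```

On these figures capital grew a little under twice as fast as assets, which is consistent with assets rising by about half and capital nearly tripling in real terms.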
To better appreciate its versatility and flexibility, it is useful to recognize that stress testing can be thought of as a dynamic and bespoke risk weighting algorithm, able to adapt to evolving risks. For example, in the wake of the pandemic and the shift to work-from-home, downtown office space suffered a severely adverse demand shock. As a result, the riskiness of CRE loans to that sector jumped dramatically. A targeted scenario aimed at this sector could reveal banks’ vulnerabilities to increasing losses in downtown office space, effectively raising the risk weight on this specific asset class. That is exactly what was done in the Federal Reserve’s 2023 stress test.3
Design Choices
There are a number of design choices available to the supervisor when embarking on stress testing.
Scope of application
A natural starting point in the design process is which banks to include in the stress test, recognizing that supervisory resources are quite limited, and that large and more complex banks pose greater risk to the system. Should foreign banks operating in the jurisdiction be included? For example, the Federal Reserve includes domestic operations of large foreign banks, but the Bank of England’s PRA does not. Multi-country supervisors such as the ECB’s SSM have the additional complication of country representation. A bank that might be considered too small to be included for a large economy would be large enough for a smaller economy. Finally, if the objective of the exercise is more macro-prudential and thus would go beyond just the banking system, which non-bank financial institutions (NBFIs) would one want to include? A good example is the Bank of England’s system-wide exploratory scenario exercise.4
Scenario design
This is arguably one of the most important aspects of stress test design as it captures evolving risks that banks are facing around the time of the exercise. The more complex and nuanced those risks, the more scenarios are needed to cover the space. The scenario(s) should be designed to explore and probe the vulnerabilities faced by banks and the banking system. The wider the range of business models, portfolios and risk exposures, the harder it is to adequately capture the range of risks with just one scenario.
The 2023 cycle of stress testing is a good example of the pitfalls of a single scenario. The canonical scenario is a severely adverse shock to markets and the economy resulting in a flight to quality, decline in GDP and rise in unemployment, with the standard monetary policy response of lowering interest rates. However, in the face of rising inflation, the opposite monetary policy response is warranted. Since interest rates can’t move in two different directions at the same time, this requires a minimum of two scenarios. An inflation scenario would have been particularly helpful in 2023 when central banks were engaged in a steep rate hike cycle to fight inflation. Inflation risk remained on the horizon when the 2023 scenario was being designed, and both supervisors and banks would have learned from it. But the Federal Reserve relied solely on the single standard severely adverse shock scenario, with falling rates. Increasing the number of scenarios from just one, which is currently the case for nearly all supervisory stress testing regimes, may be the single most impactful design change from current practice.
Even with several scenarios, supervisors have incomplete information about the risk profile of banks. The banks themselves should and do have better information, and thus the scenario(s) that the banks design for their own stress testing provides very useful insights for the supervisor. Moreover, once those bank specific scenarios are aggregated across all banks, a more complete and nuanced risk profile facing the banking system emerges. A comparison to the set of supervisor-designed scenario(s) provides valuable insights into the relative perception of risks.
The severity of the scenario(s) is guided by the risk appetite of the supervisor; and by the bank’s risk appetite for the bank scenario(s). Indeed, if the supervisory (top down) and banks’ own (bottom up) exercises are done simultaneously, as I would advocate, then it is easier to compare scenarios both across banks and bank to supervisor, providing valuable insights and making it easier to assess differences in risk appetite. The ECB and the Bank of England separate the exercises into supervisory stress testing and Internal Capital Adequacy Assessment Process (ICAAP), while the Federal Reserve combines them.
Stress scenarios, whether designed by supervisors or banks, are largely characterized by financial and macroeconomic shocks, and thus consider mostly financial risks. Nonfinancial risks or sources of shock are less common. These could include cyber-attacks and other technology shocks (for instance, AI gone amok), unexpected legal risks (such as the subprime-related fines in the U.S. following the GFC), geopolitical events, pandemics and climate change. Of course, the two types are not rigidly separable: these nonfinancial shocks do have financial impacts, and it is that linkage which is important to explore.
Horizon length and time step
In addition to the number of scenarios and their severity, supervisors need to decide on the horizon length. The Federal Reserve’s stress test effectively has a two-year horizon (nine quarters, with the first accounting for the time it takes to run the exercise), the ECB uses three years, and the Bank of England extends the horizon to five years. Longer horizons can better account for the different loss emergence profiles across asset classes and allow for more nuanced scenario profiles (a shock with slow and delayed recovery, for instance). The trade-off is along two dimensions. First, a shorter (closer to two years) horizon aligns more closely with management actions: planning, execution, impact. Second, any forecast deteriorates as the horizon extends: a one-quarter-ahead forecast is more reliable than a one-year-ahead forecast, which in turn is more reliable than a three-year-ahead forecast, and so on. While stress testing is, strictly speaking, not forecasting, the models used are nonetheless built on the past, so one cannot escape the projection deterioration problem. Taking all of these aspects together, a three-year horizon may be the sweet spot.
The choice of time step will matter for model architecture and reporting. Practically the choice comes down to quarterly or annual time steps; the former is used by the Federal Reserve, the latter by the ECB and the Bank of England. Quarterly time steps allow for more granular loss emergence and revenue realization dynamics. These need not be coincident: some losses can occur before the revenue that would help absorb them arrives, so they eat into capital in the meantime. If the time-step is yearly, challenges the bank might face during the year because of this asynchronous loss/revenue profile will be missed. Moreover, a quarterly time step matches the reporting frequency of banks, and each quarterly report is an important information disclosure event to which the market can and does react. Since stress testing is supposed to be a facsimile of an actual stress event, matching to reporting frequency is desirable.
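The point about asynchronous losses and revenues can be illustrated with a small Python sketch. The starting capital and the quarterly loss and revenue paths are hypothetical numbers chosen for illustration; they are not drawn from any actual stress test.

```python
from itertools import accumulate

START_CAPITAL = 10.0  # hypothetical starting capital, $ billions

# Hypothetical three-year (12-quarter) paths, $ billions per quarter.
# Losses arrive early in year one; revenue catches up later in the year.
losses = [3.0, 3.0, 1.0, 0.5] + [0.5] * 8
revenue = [1.0, 1.0, 2.5, 2.5] + [0.75] * 8

# Capital at the end of each quarter: start plus cumulative net income.
net_income = [r - l for r, l in zip(revenue, losses)]
quarterly_capital = [START_CAPITAL + c for c in accumulate(net_income)]

# An annual time step only observes the year-end points (quarters 4, 8, 12).
annual_capital = quarterly_capital[3::4]

print("quarterly trough:", min(quarterly_capital))
print("annual trough:   ", min(annual_capital))
```

With these illustrative paths the quarterly view shows capital dropping to 6.0 in the second quarter, while the year-end view never falls below 9.5. The annual time step misses the trough entirely.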
Models
The choice of time step has implications for model design. There are two sets of models: the first to generate the scenario(s), the second to translate them to financial outcomes like losses, revenues and capital impact. There are a host of more detailed choices about granularity of modeling which are beyond the scope of this note. But regardless of those decisions, these models are built by both supervisors and banks, and for effective stress testing, both are needed. Banks know (or should know) their business dynamics and risks better, and thus their models should be better able to capture both – for their bank. Supervisors have the benefit of cross-bank information and can build models which make use of a much wider data set. For example, CRE loss models that use data from multiple banks are likely more robust than models that are limited to just one bank’s experience. Moreover, banks tend to be more optimistic about their ability to withstand shocks than supervisors. The banks, of course, would argue that supervisors are overly conservative or pessimistic, so between the two we have a useful range which should contain the most plausible outcome. At a minimum, supervisory modeling is needed to allow supervisors to effectively challenge and evaluate the banks’ own models and model outputs.
How often to stress test
Stress testing exercises are quite resource intensive, both for supervisors as well as for banks. Thus, the effort needs to justify the informational value that is gained from the exercise. The faster the evolution of risks in the real world, the faster the turnover of portfolios or balance sheets in the face of those evolving risks, the more often one should conduct a stress test.
The ECB and the Bank of England conduct their stress tests every other year while the Federal Reserve does so every year – but not for every bank subject to stress testing. Smaller and less complex banks (but still large enough to be in the program: at least $100 billion in assets) are on a biennial (every-other-year) cycle. This tailored approach seems like the right compromise between effort and insights where one would most want them: the largest, most complex and systemically relevant banks.
Disclosure
One of the most important reasons the stress test was so successful in helping to end the financial crisis was its disclosure regime and transparency. Because the existing regulatory capital regime was not that useful in differentiating fragile from healthy banks, the market and the public were understandably quite skeptical about supervisors’ ability to make those distinctions. To restore trust, the new approach of stress testing had to provide rich disclosure to demonstrate that it was better and more useful than the existing regulatory capital approach. Detailed results, bank by bank and asset class by asset class, were disclosed, in addition to details on the methodology. Over time, disclosure has expanded further, with very rich quantitative information disclosed especially for ECB stress tests and, recently, detailed information on supervisory models in the case of the Federal Reserve.5
Detailed and full disclosure of supervisory models does come with some risk of gaming: with the models in hand, banks can reposition their portfolio to optimize their capital requirement. This is true for any capital regime which is based on some explicit algorithm. An opaque regime makes gaming harder but places greater weight on supervisory discretion with its concomitant trade-offs.
On the whole, more disclosure is desirable to provide relevant information to market participants to complement their own evaluation of the capital adequacy of the banks. This is clearly highly relevant to bank creditors, including depositors, and to investors concerned about capital levels, dividends, and buybacks. But since supervisors have access to confidential and bank proprietary information that is not disclosed, market participants will parse what is disclosed through that lens. Supervisors know more than they have disclosed, so what conclusions should one draw then from what is disclosed? This strategic behavior consideration should inform what, how much and when to disclose. See the excellent discussion by Goldstein and Leitner (2022).6
Use of results
Now that the stress test is done, results are in hand, insights gathered and gained, how can they be best used? Most of the attention has been paid to the quantitative output: losses, revenues, capital impact. Indeed, that was the original purpose in the financial crisis to answer a seemingly simple question: do the banks have enough capital, and if not, which banks need more, and how much? This wartime task has evolved to a peacetime effort of assessing capital adequacy for banks by guiding or even explicitly setting a bank’s capital requirement.
There are broadly two approaches to moving from stress test results to capital requirements: direct and indirect. The direct approach, used by the Federal Reserve, is to take the quantitative modeled output of capital impact and directly include it in the capital requirement until the next stress test, when a new result might imply a new requirement. Mechanically this is done through the stress capital buffer (SCB), which is effectively a bank-specific capital conservation buffer in that it has a floor of 2.5% CET1 capital. Banks whose capital impact is larger will have a higher SCB. Supervisory models, not bank models, determine the SCB.
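The floor mechanic can be sketched in a few lines of Python. This is a deliberate simplification under stated assumptions: it treats the SCB as the larger of the 2.5% floor and the supervisory-model capital impact, and it omits any further components the actual rule may contain. The bank names and impact figures are hypothetical.

```python
SCB_FLOOR_PCT = 2.5  # floor cited in the text, in percent of CET1 capital


def stress_capital_buffer(capital_impact_pct: float,
                          floor_pct: float = SCB_FLOOR_PCT) -> float:
    """Simplified SCB: the supervisory-model capital impact, floored.

    Assumption: any additional components of the actual rule are ignored.
    """
    return max(floor_pct, capital_impact_pct)


# Hypothetical supervisory-model capital impacts, in percentage points.
impacts = {"Bank A": 1.8, "Bank B": 2.5, "Bank C": 4.1}

for bank, impact in impacts.items():
    print(f"{bank}: impact {impact:.1f} pp -> SCB {stress_capital_buffer(impact):.1f}%")
```

Bank A's impact is below the floor, so the floor binds. Bank C's larger impact passes straight through into a higher buffer, which is the sense in which the buffer is bank-specific.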
The indirect approach, used by the Bank of England and the ECB, uses the stress test results to inform the Pillar 2 capital requirement using supervisory judgment and discretion.
The direct approach is clearly more transparent, which is desirable. But that transparency raises the bar on transparency of the process that generated the capital impact numbers, especially the models. As discussed above in the Disclosure section, the Federal Reserve has evolved over time to disclose more detail about its models, starting in 2019 with functional form and variables used, to 2025 when parameter estimates were added, along with significant additional methodology documentation. No such disclosure has been provided by other supervisors, nor is there much disclosure on how exactly the stress test results are used to inform the Pillar 2 capital add-on.
In addition to the quantitative results, the stress testing exercise yields a rich trove of qualitative insights, especially on risk management practices. This includes risk identification (critical for scenario design), modeling capabilities, data integrity, governance and control – to name a few. These are key aspects of supervision: supervisors spend considerable time and effort examining banks to make assessments of these critical aspects of banks’ risk management capabilities to form a view on the bank’s safety and soundness. In this way, the stress test exercise, conducted at the same time with banks most important to the banking system and the economy, is one of the most valuable qualitative tools in the supervisory toolkit. This is often underappreciated and under-leveraged.
Exploratory scenarios and reverse stress testing
The choice of scenario(s) for the stress test is clearly one of the most important decisions and design choices, if for no other reason than its impact on the capital requirement of the bank. Because it is so impactful, there may be reluctance to explore risks and scenarios that are quite different from the set of scenarios used in the past. The willingness to explore and experiment is understandably low. With that in mind, the Bank of England introduced in 2015 its biennial exploratory scenario which, as the name implies, is meant to inform on bank resilience but not to set capital requirements. As a result, much less information is disclosed and only at the aggregate (so not bank-specific) level. This approach allows the supervisor to gather new insights on emerging risks without the burden of having to take the results to the capital requirements. The Federal Reserve has since adopted this useful approach with the full introduction of two exploratory scenarios in 2024.
No discussion of stress testing would be complete without touching on reverse stress testing. The idea is simple: instead of designing a scenario and running it through the bank financials to calculate an impact, in reverse stress testing one starts with a given impact, either in absolute terms, say a $20 billion capital impact, or in relative terms, say three percentage points of capital, and considers the range of ways one could plausibly lose that much capital. It has the advantage of forcing one to be more creative in considering the range of possible scenarios to get to a predetermined loss and is thus a very valuable risk management tool. It is less useful for actually determining the level of capital required, although in executing the reverse stress test, one might stumble on plausible scenarios that one would not have considered before, especially if the standard scenarios generate losses that are lower than the reverse stress test seeks to explore. A good example is the upcoming ECB stress test of geopolitical risk where banks are asked to consider a range of plausible scenarios that would generate an impact of three percentage points of CET1 capital.7
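A toy version of the reverse stress testing idea is sketched below in Python. Everything except the three-percentage-point target is an assumption made for illustration: the three shock dimensions, the grid of shock sizes, the linear sensitivities and the tolerance are all hypothetical, and a real exercise would run a bank's full loss and revenue models rather than a linear formula.

```python
from itertools import product

TARGET_IMPACT_PP = 3.0  # target capital impact in percentage points of CET1

# Hypothetical linear sensitivities: CET1 percentage points lost per unit shock.
SENSITIVITY = {
    "unemployment_rise_pp": 0.4,
    "house_price_fall_pct": 0.05,
    "rate_rise_pp": 0.3,
}

# Hypothetical grid of shock sizes to search over.
GRID = {
    "unemployment_rise_pp": [0, 2, 4, 6],
    "house_price_fall_pct": [0, 10, 20, 30],
    "rate_rise_pp": [0, 1, 2, 3],
}


def capital_impact(shock):
    """Toy impact model: a weighted sum of the shock sizes."""
    return sum(SENSITIVITY[name] * size for name, size in shock.items())


def reverse_stress_test(target, grid, tolerance=0.25):
    """Return every shock combination whose impact is close to the target."""
    names = list(grid)
    hits = []
    for sizes in product(*(grid[name] for name in names)):
        shock = dict(zip(names, sizes))
        if abs(capital_impact(shock) - target) <= tolerance:
            hits.append(shock)
    return hits


for scenario in reverse_stress_test(TARGET_IMPACT_PP, GRID):
    print(scenario, f"-> {capital_impact(scenario):.2f} pp")
```

The output is a set of quite different shock combinations that all arrive at roughly the same capital loss. That variety is the value of the exercise as a risk identification tool.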
“Mix me the perfect stress test”
This discussion has hopefully shown that stress testing is an exercise of tradeoffs. There is no perfect stress test design. But there are some desirable features which are worth collecting here.
• Multiple scenarios: One scenario is simply not enough to cover the space of relevant risks, especially if the world is uncertain (so a wide range of plausibly dangerous risks) with a wide range of business models, bank sizes and degrees of complexity. How many? Sarin and Schuermann (2025) suggest three to five, with three as a minimum, and to average the two worst (highest capital impact) for a given bank.8
• Horizon length and time-step: Three-year scenarios to adequately capture different loss and revenue dynamics and allow for richer risk factor dynamics with acceptable model forecast deterioration. Using a quarterly time-step is desirable for similar reasons of loss and revenue dynamics, and to match financial reporting cadence.
• Models: Both banks and supervisors need to build their own set of models. They will generate different results. There is valuable information in that difference. Supervisors must have their own models to robustly assess banks’ results.
• Supervisor and bank (ICAAP): The occasion of running the supervisory stress test should be used to also have banks run their own which, in many jurisdictions, is a key element of the ICAAP. It allows both supervisors and banks to focus on their vulnerabilities at the same time, for supervisors with the banking system in mind, and banks with their own, for easy and valuable comparison.
• How often: More often is better, but it is not costless. Annual for the largest and most complex banks, every other year for the next tier of banks, seems the right compromise.
• Disclosure: More is better, but because of strategic behavior, there can be too much disclosure. If supervisory models play a critical and determinative role in setting capital standards, there is a higher burden of providing transparency.
• Exploratory scenarios and reverse stress testing: Both are excellent tools – supervisors should use them more extensively.
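The aggregation rule in the multiple-scenarios recommendation above (at least three scenarios, then average the two with the highest capital impact for a given bank) can be sketched in Python as follows. The function name, the scenario labels and the impact figures are hypothetical; only the rule itself comes from the text.

```python
def average_of_worst(impacts_by_scenario: dict[str, float],
                     n_worst: int = 2,
                     min_scenarios: int = 3) -> float:
    """Average the n_worst highest capital impacts across scenarios."""
    if len(impacts_by_scenario) < min_scenarios:
        raise ValueError(f"need at least {min_scenarios} scenarios")
    worst = sorted(impacts_by_scenario.values(), reverse=True)[:n_worst]
    return sum(worst) / len(worst)


# Hypothetical capital impacts for one bank, in percentage points of CET1.
bank_impacts = {
    "severe_recession": 3.0,
    "inflation_with_rate_hikes": 2.5,
    "market_shock": 4.0,
}

print(average_of_worst(bank_impacts))  # averages 4.0 and 3.0
```

Because the two worst scenarios can differ from bank to bank, the rule responds to each bank's own vulnerabilities while still using a common scenario set.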
A note on return on effort. Stress testing is a time and resource intensive exercise, both for banks and for supervisors. Time and effort spent on stress testing is therefore not spent on running the bank, conducting other risk management activities, and doing other supervisory work. The more scenarios, the more model intensive, the more granular, the more often, the more resource intensive. If the recommendation is: multiple scenarios, quarterly time step, every year (at least for the large and complex banks), then something has to give lest the resource demands result in dysfunctional outcomes.
The “give” is a reduction in documentation, model sophistication and granularity. The model architecture has almost certainly become overly complicated, perhaps forgetting that stress tests focus on extreme and thus rare (but not implausible) outcomes. It is hard enough to model expected outcomes, let alone unexpected or more extreme tail outcomes. Simpler models tend to be more robust, and thus the complex model landscape that has evolved is almost certainly yielding a false sense of precision and accuracy. Simpler but more and more frequent stress testing is more useful and desirable than less frequent and very complex stress testing.
Notes
The author is grateful for excellent comments from Doug Elliott and Andy Kuritzkes. The views expressed here are those of the author.
1 Foreword to Farmer, J. Doyne, Alissa M. Kleinnijenhuis, Til Schuermann, and Thom Wetzer eds. 2022. Handbook of Financial Stress Testing, Cambridge: Cambridge University Press.
2 The CET1 standard did not exist pre-GFC. Tier 1 equity was the prevailing standard, with an expectation that common equity would make up the preponderance of Tier 1.
3 Board of Governors of the Federal Reserve System. 2023 Federal Reserve Stress Test Results.
4 Bank of England. Stress testing.
5 Federal Reserve. Dodd-Frank Act Stress Tests 2026.
6 Goldstein, Itay and Yaron Leitner. 2022. “Stress Tests Disclosure: Theory, Practice, and New Perspectives.” ch. 11 in Farmer et al. (2022).
7 Interview with Anneli Tuominen, Member of the Supervisory Board of the ECB, 10 July 2025. European Central Bank - Banking Supervision, Interview with Público
8 Sarin, Natasha and Til Schuermann. 2025. "Stressing the Stress Tests," Journal of Financial Crises: Vol. 7: Iss. 3, 20-49.
Til Schuermann is a Partner and the Global Head of the Finance & Risk Practice at Oliver Wyman. Til advises private and public sector clients on enterprise risk management, stress testing, governance including board effectiveness, capital adequacy and capital management, macroeconomic and geopolitical risk.
Until March 2011, Til was a Senior Vice President at the Federal Reserve Bank of New York where he played a central role in the Fed’s response to the 2008 financial crisis. Til serves on the board of the Social Science Research Council, is on the Financial Risk Manager (FRM) exam committee for the Global Association of Risk Professionals (GARP) and has taught at Columbia University and at the Wharton School. He has numerous publications in both academic and practitioner journals, including as a contributing editor to the Handbook of Financial Stress Testing (Cambridge Univ. Press, 2022). Til started his career at Bell Labs. He received a PhD in Economics from the University of Pennsylvania.