Results and Performance of the World Bank Group 2021

Chapter 5 | Conclusions

The recent increases in the World Bank’s project outcome ratings and IFC’s development outcome ratings are positive news. The World Bank’s outcome ratings steadily improved from FY10 onward before increasing by an impressive 9 percentage points in FY20, reaching 88 percent of projects with outcome ratings of MS or higher, a historic high. The increase resulted from ratings improvements for virtually all categories of projects rather than from shifts in portfolio composition or improvements limited to specific portfolio segments: ratings rose in all Practice Groups (especially Sustainable Development), all Regions (especially Europe and Central Asia and Western and Central Africa), and almost all lending sizes (especially the largest projects of $100 million or more). Ratings even increased in IDA and FCS countries, the most difficult operating environments. Our analysis shows that disruptions caused by COVID-19 did not have a discernible impact on the ratings jump during FY20. IFC and MIGA saw ratings improvements as well. In 2019, IFC’s ratings increased for the first time in 10 years, though there are not enough data to confirm whether this improvement was sustained in 2020. MIGA’s project development outcome ratings have been steadily increasing for 10 years. Ratings increases across the Bank Group signal the institutions’ ongoing commitment to development effectiveness. The analysis in this report also shows, however, that project ratings alone provide little evidence on the types of outcomes the Bank Group is achieving and the quality of associated targets and indicators.

The analysis in this report shows that although the presence and implementation of project-level M&E frameworks have improved, many World Bank projects still do not adequately measure their outcomes. Targets and indicators are a critical element of the World Bank’s self-evaluation methodology and are a key driver of the assessment of how well projects perform. But the logic and quality of these targets and indicators vary widely, and the correlation between ratings and the quality of targets and indicators is inconsistent, meaning that projects can still achieve high efficacy ratings even when they lack proper baselines or when they measure outputs and activities that do not match a project’s intended outcomes.

  • Implication: The World Bank and IEG could pay more attention to how well indicators measure project objectives. To do so would require a more systematic approach to gauging the appropriateness of indicators and targets early in the project cycle. A successful approach would include clarifying the links between indicators and project objectives and defining targets in relation to scrutinized baselines. The World Bank’s recent ICR reforms, which require an explicit reference to theories of change, are a step in the right direction. The World Bank could, however, make further efforts to select robust, direct, and attributable indicators and targets.

Like any other metric, aggregate ratings need to be interpreted correctly by understanding what they do and do not measure. As discussed above, ratings measure a project’s success in meeting self-defined targets and objectives, but ratings are not meant to assess either the nature of a project’s development outcomes or the extent to which the project addresses a country’s development needs. This means that individual ratings rest on indicators at different levels of ambition and complexity and are not measured against an absolute standard. As Bulman, Kolkma, and Kraay (2015) observe, this introduces the possibility that at least some of the variation in aggregate project outcome ratings is caused by differences in the ambition or attainability of the stated development objective, rather than by any differences in actual outcomes. This problem is less acute in IFC and MIGA because their evaluation frameworks include some objective criteria and standards (such as a project’s financial performance and comparisons with peers and industry benchmarks).

  • Implication: The World Bank could provide a fuller explanation of ratings and how they relate to underlying development outcomes. IEG and the World Bank could carry out periodic syntheses and report on development outcomes, following in the footsteps of IEG’s outcome orientation agenda. Potentially, the World Bank could devise a system to regularly harvest project outcomes and key activities and match this information with ratings data for more integrated monitoring of results and performance.

A project’s development outcomes are affected by a host of factors not directly considered in ratings. Ratings serve an important purpose in evaluating a project’s performance, but supplementing ratings with information about a project’s size, type, country, outcome type, client type, outcome potential, corporate priorities, and other characteristics can help teams attain a fuller, more objective assessment of a project’s development outcome and its risk, as IFC and MIGA analysis has shown. This report shows that some of these characteristics may have a direct impact on ratings (for example, IFC projects with repeat clients tend to have higher ratings), other characteristics have a tenuous link to ratings (for example, pursuing certain corporate priorities), and still other characteristics have no confirmed effect on ratings (for example, outcome types on overall ratings). Even so, these characteristics can still provide context for ratings and help project teams better understand a project’s probability of success or the risk of not achieving certain development outcomes.

  • Implication: IFC and MIGA could use information on outcome types and other characteristics to better assess projects’ risks, ratings, and development outcomes. IFC’s AIMM framework and MIGA’s Impact Measurement and Project Assessment Comparison Tool framework already account for a project’s estimated and actual development potential and development outcome risks. IFC and MIGA could go a step further by assessing the prevalence of different outcome types and other characteristics in projects to help enhance their frameworks. For example, the potential risk severity of outcome types (as manifested in ratings variance) and the difficulty of achieving certain outcome types could be incorporated into a project’s development outcome and risk assessment. Adding this information to a typical assessment of development outcomes would contribute to the Bank Group’s learning and possibly improve the ratings system itself.

High ratings do not appear to signal risk aversion. Early in the RAP process, we hypothesized that ratings increases could have resulted from operational teams taking less risk. Indeed, the RAP 2020 and past evaluations identified this as a potential danger (World Bank 2016a, 2020c). However, this report shows that in the two GPs analyzed—Transport and Education—successor projects that introduced novelty (that is, that introduced new or expanded elements over the previous project) performed as well as or better than projects that closely replicated the predecessor project. The analyses of IFC’s XPSR ratings of projects with market-level outcomes and the relationship between outcome potential and XPSR ratings suggest that projects addressing high-magnitude development outcomes are not destined for lower ratings and that a project’s outcome potential and XPSR rating may actually be moving in the same direction. All this suggests that the World Bank and IFC operational teams can discern when the conditions are right for projects to support novel and complex activities. In this way, teams can take informed risks and selectively build on past experiences to elevate a project’s objectives without suffering lower project performance ratings.

  • Implication: The Bank Group could further emphasize operating “on the frontier” (that is, selecting the best combination of risks and opportunities) as a goal in addition to meeting the Corporate Scorecards rating targets. This shift in emphasis would provide a broad set of incentives and encourage the Bank Group to inquire further about the motivations for risk taking; the evolution of project designs; the pursuit of corporate priority goals; and the best way to leverage internal resources and the client’s engagement, commitment, and capacity to deliver development results. This could help ensure that the Bank Group continues to selectively take risks to improve development outcomes.