Results and Performance of the World Bank Group 2020

2 | Part I: Assessing Performance through Ratings

This chapter reports Bank Group ratings trends for World Bank projects, the Bank Group’s country programs, IFC investment projects and advisory services, and Multilateral Investment Guarantee Agency (MIGA) projects. It also explains some major trends and patterns in ratings, focusing on World Bank projects and IBRD country programs’ positive performance trends; programs in countries affected by fragility, conflict, and violence (FCV); and IFC’s less positive performance. In line with common practice, the chapter treats ratings as a success metric. Ratings measure projects’ achievement relative to objectives and targets stated at approval or revised subsequently. Ratings are not comparable across the three Bank Group institutions because of differences in mandates and business models.

World Bank Projects

Overall outcome ratings for World Bank lending are high. Of the 167 projects that closed in fiscal year (FY)19 and whose completion reports were validated by the Independent Evaluation Group (IEG), 79 percent were rated moderately satisfactory or above (MS+) on achieving their stated outcomes. This is a slight decrease from 81 percent in FY18.

Looking back over a 10-year period, outcome ratings declined from 71 percent MS+ for project closures in FY09 to 68 percent MS+ in FY13, and they increased again to 81 percent MS+ in FY18 and 79 percent in FY19. A numerical conversion of the ratings scale, done to test the trend’s robustness, shows the same pattern of ratings declines until FY13 and increases afterward (figure 2.1). Because of the improved project performance, outcome ratings increasingly cluster in the moderately satisfactory or satisfactory points of the scale.

Figure 2.1. World Bank Project Outcome Ratings, Annual

Image

Source: Independent Evaluation Group.

Note: The dark blue line shows the numerical value of the six-point rating scale, which assigns 1 for highly unsatisfactory, 2 for unsatisfactory, and so on, with 6 being highly satisfactory. The light blue line represents the conventional percentage of projects rated moderately satisfactory or above. MS+ = moderately satisfactory or above.
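The two trend measures in the note can be illustrated with a short sketch (hypothetical ratings, not IEG's code): the numerical conversion averages positions on the six-point scale, while the conventional measure counts the share of projects rated MS+.

```python
# Illustrative sketch (hypothetical data, not IEG's code) of the two trend
# measures in figure 2.1: the numerical conversion of the six-point scale
# and the conventional share of projects rated MS+.
RATING_SCALE = {
    "highly unsatisfactory": 1,
    "unsatisfactory": 2,
    "moderately unsatisfactory": 3,
    "moderately satisfactory": 4,
    "satisfactory": 5,
    "highly satisfactory": 6,
}

def mean_numeric_rating(ratings):
    """Average position on the six-point scale (dark blue line)."""
    return sum(RATING_SCALE[r] for r in ratings) / len(ratings)

def share_ms_plus(ratings):
    """Share rated moderately satisfactory or above (light blue line)."""
    return sum(RATING_SCALE[r] >= 4 for r in ratings) / len(ratings)

cohort = ["satisfactory", "moderately satisfactory",
          "unsatisfactory", "moderately satisfactory"]
print(mean_numeric_rating(cohort))  # 3.75
print(share_ms_plus(cohort))        # 0.75
```

Because MS+ collapses the scale to a binary cutoff, the numeric average is the more sensitive of the two measures, which is why the report uses it to test the trend's robustness.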

Outcome ratings have been more stable over time when measured by project volume, increasing from 78 percent MS+ in FY08 to 82 percent in FY13, 84 percent in FY18, and 82 percent in FY19. IEG ratings for Bank performance improved from 69 percent of projects rated MS+ in FY13 to 84 percent in FY18 and 82 percent in FY19, reflecting better ratings for quality of supervision and quality at entry, the two components of the Bank performance rating (box 2.1).

Box 2.1. Aspects of Bank Performance

Quality at entry refers to the extent to which the World Bank identified, prepared, and appraised the operation so that it was most likely to achieve planned development outcomes.

Quality of supervision refers to the extent to which the World Bank identified and resolved threats to the achievement of development outcomes and to fiduciary aspects. The rating for quality at entry combined with the rating for quality of supervision determines the Bank performance rating.

Monitoring and evaluation (M&E) quality refers to the design and implementation of the project’s M&E arrangements and the extent to which the data are used to improve performance. M&E quality is not a formal dimension of the Bank performance rating, though aspects of M&E overlap with quality at entry and quality of supervision.

Source: Independent Evaluation Group.

Outcome ratings increased in most parts of the portfolio. To have robust sample sizes, IEG compared the project closings of three-year cohorts. It compared FY12–14, when outcome ratings were at their lowest point (69 percent MS+), with FY17–19, when ratings were 80 percent MS+ (figure 2.2). Ratings for investment project financing (IPF) operations rose from 68 percent MS+ to 81 percent, and ratings for development policy financing (DPF) operations declined from 72 percent MS+ to 69 percent between FY12–14 and FY17–19, based on preliminary FY19 data.

Outcome ratings moved upward in nearly all Global Practices (GPs). Outcome ratings decreased in the Macroeconomics, Trade, and Investment (MTI) GP, which has the lowest ratings among all GPs, at 55 percent MS+ in FY17–19.1 The MTI GP leads on many DPFs. Among the GPs with sizeable portfolios, the Education and Environment GPs’ project ratings increased the most. Currently, the Education GP has the highest ratings, at 92 percent MS+. Part II examines the reasons for the ratings differential between IPFs and DPFs and between the highest- and lowest-rated GPs.

Ratings for projects in IBRD countries increased from 71 percent MS+ in FY12–14 to 82 percent in FY17–19. Ratings for projects in IDA countries increased from 68 percent MS+ to 78 percent over the same period. Ratings increased in nearly all Regions. The Middle East and North Africa Region saw the largest increase in outcome ratings and now has the highest rating, at 93 percent MS+ in FY17–19. The Africa Region was split into two vice presidential units, effective July 1, 2020. Although the two Africa Regions were both at 71 percent MS+ in FY17–19, their trends differ: Western and Central Africa rose from 52 percent MS+ in FY12–14, whereas Eastern and Southern Africa held roughly steady from 72 percent MS+ in FY12–14.

Outcome ratings in FCV-affected countries increased modestly but remained below those in non-FCV-affected countries. Projects in FCV- and non-FCV-affected countries were both at 69 percent MS+ in FY12–14. Outcome ratings rose to 77 percent MS+ in FCV-affected countries in FY17–19 compared with 81 percent in non-FCV-affected countries (figure 2.2).

Figure 2.2. Project Outcome Ratings, FY12–14 and FY17–19 (percent rated MS+)

Image

Source: Independent Evaluation Group.

Note: DPF = development policy financing; FCV = fragility, conflict, and violence; FY = fiscal year; IPF = investment project financing; MS+ = moderately satisfactory or above.

The underlying improved performance trends are also seen in higher ratings for projects' quality at entry (see the definition in box 2.1). The share of MS+ quality at entry ratings increased from 58 percent for projects that closed in FY12–14 to 75 percent for projects that closed in both FY18 and FY19. Quality at entry ratings increased in all Regions and all Practice Groups except Equitable Growth, Finance, and Institutions. Projects in FCV-affected countries have quality at entry ratings similar to the rest of the portfolio: 73 percent MS+ in FY18, a pattern also seen in previous years.

The World Bank maintained strong quality at entry even as it responded to the global financial crisis and increased its annual commitments to client countries by 130 percent. In fact, quality at entry improved for projects that had been approved since FY09. In part, this was possible because the World Bank increased the size of projects under preparation during the global financial crisis more than it increased the number of new projects.2

Monitoring and evaluation (M&E) quality ratings increased over the past 10 years for projects in all Practice Groups and Regions. M&E ratings rose from 31 percent of projects rated substantial or above in FY09 to 51 percent rated the same in FY19. All Regions increased M&E ratings over this period, and the Middle East and North Africa Region’s ratings increased the most.

The overall increase in M&E quality ratings masks variation among GPs. Only four GPs achieved good M&E ratings (substantial or above) on at least half of their projects in FY16–18: Social Protection and Jobs (72 percent); Education (64 percent); Health, Nutrition, and Population (55 percent); and Urban, Resilience, and Land (51 percent). There have been many efforts to enhance tools, guidance, and training for staff to strengthen project M&E quality. Some examples include focusing attention on theories of change in project documents, restructuring projects to improve results frameworks, and building staff capacity by training existing staff and recruiting dedicated M&E specialists. Even so, interviews and desk reviews suggest that project M&E struggles for attention amid competing operational agendas.

About 60 percent of the projects that closed between FY07 and FY18 have a mismatch, or disconnect, between the M&E quality rating given in the last supervision report (the Implementation Status and Results Report) and IEG's validation of M&E quality based on the Implementation Completion and Results Report. The size of the mismatch varies widely by GP. Optimism bias may be affecting assessments of M&E quality during implementation. The elements that drive M&E quality are fairly intuitive, as described in box 2.2.

Box 2.2. Elements of Monitoring and Evaluation Quality

Good project monitoring involves collecting the right data and using it in the right way. Projects with successful monitoring and evaluation (M&E) have outcome indicators that reflect project objectives without being too complicated. These projects plan and execute data collection that is computerized, quality controlled, aligned with client systems, and integrated into the operation rather than an ad hoc process. Teams use the data to track progress and identify implementation challenges. For example, an irrigation project in Mozambique had a specific project objective with clear, measurable, and directly linked indicators. The theory of change was sound, data collection was planned and executed regularly, weaknesses were corrected, and the team used the data to track progress, adjust the results framework during restructuring, and document project outcomes. Even better M&E also ensures country ownership over M&E arrangements, seeks to embed project M&E into client monitoring systems, and focuses on collecting useful data that can inform project implementation (versus more compliance-focused data). By contrast, projects with unsuccessful M&E had overambitious or complicated data collection plans and unclear results frameworks, resulting in delayed baseline data, irregular reporting, and information that lacked credibility.

Sources: Independent Evaluation Group; World Bank 2016a.

Country Programs

Bank Group country program outcome ratings have improved over the past 10 years in IBRD countries but not in IDA and FCV-affected countries. Bank Group country program outcome ratings increased from 51 percent MS+ in FY09 to 74 percent in FY17 across all reviewed country program cycles. However, country program outcome ratings stayed flat in IDA and FCV-affected countries. These data are after smoothing, as explained in box 2.3. Among the six Regions, Europe and Central Asia and South Asia had the highest country program ratings in FY08–19, both at 79 percent MS+, and Africa, and East Asia and Pacific had the lowest, at 44 and 57 percent MS+, respectively. FCV-affected countries had lower country program outcome ratings over the period, at 50 percent MS+, compared with 66 percent MS+ for non-FCV (figure 2.3).

Figure 2.3. Country Program Outcome Ratings, FCV and Non-FCV Countries

Image

Source: Independent Evaluation Group.

Note: The data are smoothed as described in box 2.3. FCV = fragility, conflict, and violence; MS+ = moderately satisfactory or above.

Box 2.3. Smoothing Country Program Ratings

This Results and Performance of the World Bank Group uses a new data smoothing method to compare country program ratings over time. The Independent Evaluation Group reviews Completion and Learning Reviews (CLRs) for country programs at the end of every country program cycle, usually every four to five years. With only about 20 CLR reviews per year, the sample size is too small to allow many comparisons or identify meaningful trends. To overcome this data challenge, this report smooths annual data fluctuations by averaging country program outcome ratings over the four-to-five-year CLR period rather than attributing them only to the CLR's exit year. This method increases the number of data points per year and smooths country program outcome ratings over time.

Source: Independent Evaluation Group.
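The smoothing described in box 2.3 can be sketched as follows (an illustrative reading of the method, with hypothetical program dates and ratings): each completed program's outcome rating is attributed to every fiscal year of its cycle rather than only its exit year, so each year's average draws on more data points.

```python
# Illustrative sketch of the smoothing in box 2.3 (program dates and
# ratings below are hypothetical): each completed program's outcome
# rating counts toward every fiscal year of its cycle, not just the
# exit year.
from collections import defaultdict

def smoothed_ms_plus(programs):
    """programs: iterable of (start_fy, exit_fy, ms_plus), where ms_plus
    is 1 if the program outcome was rated MS+ and 0 otherwise."""
    by_year = defaultdict(list)
    for start_fy, exit_fy, ms_plus in programs:
        for fy in range(start_fy, exit_fy + 1):
            by_year[fy].append(ms_plus)
    # Share of MS+ ratings among all program-years covering each year.
    return {fy: sum(v) / len(v) for fy, v in sorted(by_year.items())}

programs = [(2009, 2013, 1), (2011, 2015, 0), (2013, 2017, 1)]
shares = smoothed_ms_plus(programs)
```

With only three programs, the exit-year series would have a single observation per year; the smoothed series instead averages over every program whose cycle overlaps that year.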

Explaining the Trends

The World Bank operates within country programs. Transforming its technical and financial support into results depends on both the country's capacity and economic environment and the quality of the World Bank's support. This Results and Performance of the World Bank Group (RAP) explores some of these external and internal factors further for World Bank projects and Bank Group country programs. It finds that improvements in project design, M&E, and supervision, combined with broadly conducive economic and institutional conditions during project implementation (that is, before the pandemic) in many of the larger countries, help explain the overall positive ratings trends. The weaker performance in FCV-affected countries can be partly explained by difficult contexts and large shocks for which country programs in those countries were not sufficiently prepared.

IEG used decomposition analysis to account for the factors behind the increase in project outcome ratings between FY12–14 and FY16–18. The analysis decomposed the overall increase in World Bank project ratings over the period into changes in the size of different portfolio elements (such as Region, country, GP, lending instrument, and so on) and changes in the ratings for the portfolio elements. Figure 2.4 shows how much each portfolio element contributed to the total ratings increase. Decomposed this way, the increased portfolio share of projects in the South Asia Region (from 10 to 15 percent of the total portfolio size), together with modest improvements in project ratings, was an important contributor to improved performance ratings overall. Bangladesh, China, and Pakistan all had growing ratings and portfolios, thus increasing the total. IPF projects were the biggest contributor to improved average project outcome ratings.
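The report does not publish IEG's exact decomposition formula, but the general form it describes can be sketched as a shift-share decomposition: a composition term (changes in each element's portfolio share) plus a performance term (changes in each element's own rating). Midpoint weights make the two terms sum exactly to the total change. All numbers below are hypothetical.

```python
# A stylized shift-share decomposition of the change in the portfolio-wide
# MS+ share (an assumed general form, not IEG's published code). Midpoint
# weights ensure composition + performance equals the total change exactly.
def shift_share(base, current):
    """base/current: {element: (portfolio_share, ms_plus_rate)}."""
    composition = performance = 0.0
    for g in base:
        s0, r0 = base[g]
        s1, r1 = current[g]
        composition += (s1 - s0) * (r0 + r1) / 2  # effect of share changes
        performance += (s0 + s1) / 2 * (r1 - r0)  # effect of rating changes
    return composition, performance

# Hypothetical two-element portfolio, loosely echoing the South Asia example.
base = {"South Asia": (0.10, 0.75), "Rest of portfolio": (0.90, 0.68)}
current = {"South Asia": (0.15, 0.80), "Rest of portfolio": (0.85, 0.77)}
comp, perf = shift_share(base, current)
# comp + perf equals the total change in the weighted MS+ share (0.0875 here).
```

In this stylized example most of the improvement comes from the performance term, with a smaller contribution from the shift toward the better-rated element, mirroring how figure 2.4 attributes the total increase across portfolio elements.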

Figure 2.4. Decomposing the World Bank Project Rating Increase over FY12–14 and FY16–18

Image

Source: Independent Evaluation Group.

Note: The circle sizes represent how much each portfolio element contributed to the total ratings increase. FCV = fragility, conflict, and violence; FY = fiscal year; GP = Global Practice; M = millions of US dollars; PG = Practice Group.

A conducive institutional and economic environment and good performance in many of the World Bank’s larger client countries contributed to improved project ratings. Many of the larger client countries saw good rates of economic growth and an uptick in their Country Policy and Institutional Assessment scores over this period.3 Studies have found a positive and statistically significant influence of economic growth and Country Policy and Institutional Assessment score on World Bank projects’ performance (Geli, Kraay, and Nobakht 2014; World Bank 2018b). IEG’s qualitative analysis of 14 projects rated highly satisfactory and 14 rated highly unsatisfactory found that the successful projects often benefited from a conducive context with strong political support and an enabling policy and regulatory framework. The opposite was true for the unsuccessful projects, which also suffered from political instability and clients’ weak implementation and coordination capacity.

There is evidence of improvement across several aspects of the World Bank's work quality. Most projects are designed well, as judged from the positive quality at entry ratings. The increase in projects' M&E quality helps explain the increasing outcome ratings. Ratings methodology plays a role because IEG gives poor ratings to projects with insufficient evidence of their achievement. Regression analysis that attempts to control for the role of ratings methodology has shown that World Bank projects with good-quality M&E tend to have substantially and statistically significantly higher outcome ratings than similar projects do (Raimondo 2016). The correlation between M&E quality and outcome ratings has held up over time and, in fact, has increased somewhat: when outcome ratings are plotted against M&E quality, the slope has become modestly steeper (figure 2.5). The analysis of 14 projects rated highly satisfactory and 14 rated highly unsatisfactory found that M&E data collection and the use of data for decision-making were among the most frequent distinguishing factors. IEG ratings for supervision quality are also high, at 86 percent MS+ in FY19. This matters because studies have found that a task team's ability to identify and mitigate potential risks during supervision improves project outcome ratings.

Figure 2.5. Outcome Rating Plotted against M&E Quality Rating

Image

Source: Independent Evaluation Group.

Note: Circle sizes indicate how many projects fall in each category. FY = fiscal year; M&E = monitoring and evaluation.

Project outcomes can be achieved despite serious challenges if the task team can identify risks early, elicit support from managers, and act quickly to mitigate these risks, for example, by restructuring the project.4 The analysis of projects rated highly satisfactory found that these projects often benefited from collaborative supervision (active engagement of clients and partners, local presence, and a good mix of skills in the World Bank team) and timely reactions to challenges. Some non-IEG data also point to World Bank performance that is often strong in the field. Country Opinion Surveys since 2012 indicate that country clients generally perceive the Bank Group positively as a long-term partner that collaborates well with government and contributes quality knowledge work, especially on good development and M&E practices. Survey respondents in a different survey conducted by AidData perceived the World Bank to be among the most influential donors, with particularly high influence of its knowledge products (Custer and others 2015).5

The weaker performance in FCV-affected countries can be explained by a vicious cycle in which large shocks prevent these countries from building capacity and improving governance. This has to be understood in the context of a somewhat rigid results framework architecture that requires forecasting results and is not sufficiently adaptable to dynamic circumstances, shocks, and high levels of uncertainty. These factors affect country program outcomes in various ways.

In a sample of 15 FCV-affected countries, all experienced large shocks, such as Ebola outbreaks, disasters, oil price shocks, and political crises. These shocks altered national priorities, prevented countries from building stable and credible institutions, and compelled the country team to reallocate resources and adjust country programs’ implementation. Political shocks and armed conflict (for example, in Madagascar and the Republic of Yemen) are especially challenging. The reduced staff presence during a political crisis naturally made it hard to reengage and achieve program objectives after the crisis subsided.

Institutional and governance reforms in FCV-affected countries are often unsuccessful. About half of country program objectives in FCV-affected countries focused on institutional and governance reforms, and the other half focused on infrastructure development and public service delivery. Only 22 percent of objectives focused on institutional and governance reform were achieved or mostly achieved, compared with 66 percent of objectives focused on service delivery (figure 2.6).6 Institutional and governance reforms are harder to insulate from FCV-affected governments’ weak political and technical capacity and need more time to achieve objectives than public service delivery projects do, which could explain the lower achievement rates of these reforms. The longer timeline exposes these reforms to more shocks and government and World Bank staff turnover.

Figure 2.6. Ratings for Country Program Objectives by Type of Objective

Image

Source: Independent Evaluation Group.

Note: MS+ = moderately satisfactory or above.

Overburdened country programs performed worse in response to large shocks and crises. In this context, overburdened refers to FCV-affected country programs with weak relevance and selectivity. Bank Group country programs performed better during shocks when they limited and consolidated interventions, such as in Haiti, Kosovo, Lebanon, Nepal, and Timor-Leste. Some other shock-affected countries saw an influx of new projects or made existing projects more complex, overstretching the World Bank’s and clients’ capacity.

Most FCV-affected country programs were already stretched to capacity before the shocks occurred. In a sample of 15 recently evaluated FCV-affected country programs, 11 had low or weak relevance, defined as the likelihood a program will achieve its intended objectives given the program’s resources and instruments. Nine programs had low or weak selectivity, which is defined as concentrating resources on priority objectives in a way that maximizes development impact and does not overburden the client’s or the World Bank’s implementation capacity. Eight programs were neither relevant nor selective. Four FCV-affected country programs were both relevant and selective, and three of these—Liberia, the Solomon Islands, and The Gambia—performed well despite major shocks.

Mirroring these findings, IEG’s project validations often find that project designs that are too complex relative to clients’ capacity lead to weak ratings in FCV-affected countries. Some econometric studies have associated active conflict, inflation, natural resource dependence, and distortionary trade, fiscal, and monetary policies with lower project performance.7

Staff quality and presence also matters for performance. Studies have linked the quality and stability of the project’s task team leader to project performance—see, for example, Denizer, Kaufmann, and Kraay (2011); Geli, Kraay, and Nobakht (2014); Ralston (2014); Moll, Geli, and Saavedra (2015); and World Bank (2016b). Yet client perceptions of the Bank Group staff’s availability and the quality of its work are often less favorable in FCV-affected countries. On most Country Opinion Survey questions, perceptions of the Bank Group are worse in FCV-affected countries than in non-FCV-affected countries. This is especially true of the Bank Group’s respectful treatment of clients and stakeholders, the technical quality of its knowledge work, its value as an information source on global development practices, its project M&E, and its staff accessibility (figure 2.7). Similarly, responses in AidData’s survey on the World Bank’s perceived influence were markedly lower in FCV-affected countries than in others.8 It is not clear why FCV clients respond less favorably to perception surveys. The Bank Group has increased its budget and staff resources for FCV-affected countries over time, though recruiting qualified staff to work in FCV-affected countries has often been difficult, something that the Bank Group’s strategy for FCV (2020–25) is aiming to address through enhanced support, training, and incentives for staff working in fragile settings.

Figure 2.7. Country Client Perceptions in FCV and Non-FCV Countries

Image

Source: Independent Evaluation Group, based on World Bank Group Country Opinion Survey data collected annually from 2012 to 2019.

Note: All scores except for one are measured with the following Likert scale: 1 = to no degree at all; 10 = to a very significant degree. Technical quality of knowledge work is measured with the following Likert scale: 1 = very low technical quality; 10 = very high technical quality. Averages ("N") are based on number of country-years. Statistical significance is for difference of means tests between question responses in FCV and non-FCV countries. Two-sample mean tests are used, assuming equal variances. FCV = fragility, conflict, and violence; M&E = monitoring and evaluation.

**p < .01.
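The test named in the figure note, a two-sample difference-of-means t test assuming equal variances, can be sketched as follows. The Likert responses below are hypothetical, not actual Country Opinion Survey data.

```python
# Minimal sketch of a two-sample difference-of-means t test with pooled
# (equal) variances, as described in the note to figure 2.7. The data
# are hypothetical 1-10 Likert responses, not actual survey results.
import math

def pooled_t(a, b):
    """Return the t statistic and degrees of freedom for mean(a) - mean(b)."""
    na, nb = len(a), len(b)
    ma, mb = sum(a) / na, sum(b) / nb
    va = sum((x - ma) ** 2 for x in a) / (na - 1)  # sample variances
    vb = sum((x - mb) ** 2 for x in b) / (nb - 1)
    # Pooled variance under the equal-variance assumption.
    sp2 = ((na - 1) * va + (nb - 1) * vb) / (na + nb - 2)
    t = (ma - mb) / math.sqrt(sp2 * (1 / na + 1 / nb))
    return t, na + nb - 2

non_fcv = [7, 8, 6, 9, 7, 8]  # hypothetical responses, non-FCV countries
fcv = [5, 6, 4, 7, 5, 6]      # hypothetical responses, FCV countries
t, df = pooled_t(non_fcv, fcv)
```

The resulting t statistic is compared against the t distribution with the stated degrees of freedom; the figure's asterisks flag differences significant at p < .01.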

Responding to COVID-19 and Other Shocks

The study of shocks and their impact on project and program results can also contribute some insights for the World Bank’s ongoing pandemic response. Teams are preparing pandemic response projects (many of which are new rather than additional financing) under tight time pressures and amid complex political, economic, and public health contexts and logistical challenges, such as the inability to travel or conduct meetings in person. According to IEG evaluations, there is sometimes less time for data collection, technical studies, learning from past lessons, and designing strong results frameworks when the World Bank rushes to prepare crisis responses—see, for example, World Bank 2010a, 2010b, and 2017. World Bank (2019) analyzed comprehensively the factors that influence quality at entry and found that foundational work matters. Less foundational work limits the World Bank’s understanding of local policy, capacity, and institutions and its ability to fine-tune procurement arrangements and other elements of project design. The logistical challenges could adversely affect the World Bank’s local staff presence and ability to build trusting relationships and partnerships—factors that World Bank (2019) also found critical for quality at entry.9

There is a statistical association between time pressures during preparation and projects’ quality at entry. IEG calculated a variable for project preparation time in a sample of more than 3,000 evaluated projects.10 Projects in the first three deciles of this variable, meaning projects with low preparation time relative to duration, were rated significantly lower on quality at entry than projects at or above the median of project preparation time (figure 2.8).11
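The decile comparison can be sketched as follows. The report's exact construction of the preparation-time variable is not reproduced here, so the grouping below (deciles of a sorted sample) is an assumption, and the data are hypothetical.

```python
# Sketch of the decile comparison behind figure 2.8 (the variable's exact
# construction is an assumption; data are hypothetical). Projects are
# ranked by preparation time, and mean quality at entry (QAE) ratings on
# the six-point numeric scale are compared between the bottom three
# deciles and projects at or above the median.
def decile_of(rank, n):
    """1-based decile for a 0-based rank in a sample of size n."""
    return rank * 10 // n + 1

def compare_prep_time(projects):
    """projects: list of (prep_time, qae_rating); returns mean QAE rating
    for the bottom three preparation-time deciles and for projects at or
    above the median preparation time."""
    ranked = sorted(projects)  # ascending preparation time
    n = len(ranked)
    low = [q for i, (_, q) in enumerate(ranked) if decile_of(i, n) <= 3]
    high = [q for i, (_, q) in enumerate(ranked) if i >= n // 2]
    return sum(low) / len(low), sum(high) / len(high)

# Hypothetical sample: short-preparation projects rated lower on QAE.
sample = [(weeks, 3 if weeks < 20 else 4) for weeks in range(5, 55, 5)]
low_mean, high_mean = compare_prep_time(sample)
```

In this constructed sample, the bottom three deciles average a full point lower on the six-point scale, the direction of the difference that figure 2.8 reports for the actual portfolio.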

Figure 2.8. Relationship between Quality at Entry and Project Preparation Time

Image

Source: Independent Evaluation Group.

Note: HS = highly satisfactory; HU = highly unsatisfactory; MS = moderately satisfactory; MU = moderately unsatisfactory; S = satisfactory; U = unsatisfactory.

Robust implementation support can counter shocks, problems, and quality at entry weaknesses. The difference between projects rated highly satisfactory and highly unsatisfactory was less about the presence of shocks or the number of supervision missions than about the World Bank teams' timeliness in flagging concerns, taking corrective measures, complying with mandated safeguards, undertaking Mid-Term Reviews, revising objectives, and collecting data. For example, Somalia's Emergency Drought Response and Recovery Project, which IEG rated highly satisfactory on outcomes, was prepared in five weeks and required complex support to implement. It involved intense collaboration and overcoming institutional differences between the World Bank and the International Committee of the Red Cross on rules and procedures for M&E, procurement, financial management, and even protocols for communicating with government officials.

IFC Projects

Investments

IFC investment projects’ development outcome ratings have declined over the past 10 years, but there are early signs that this decline has stopped or may be starting to reverse. IFC development outcome ratings declined from a peak of 75 percent of projects rated mostly successful or better by IEG in calendar year (CY)08 to 40 percent in CY17 and 43 percent in CY18 (figure 2.9). These ratings are based on a stratified random representative sample, which in CY18 covered 99 projects, or 39 percent of all projects approved in CY13 and eligible for evaluation. Average ratings can also be measured by net commitment volumes rather than the number of projects and by using three-year instead of annual averages. Calculated this way, IFC’s development outcome ratings declined from 83 percent rated mostly successful or better in CY07–09 to 43 percent in CY16–18 and 48 percent in CY17–19. As these numbers suggest, the ratings decline may have stopped or reversed since CY17.12

Figure 2.9. IFC Investment Project Development Outcome Rating (annual data)

Image

Source: Independent Evaluation Group.

Note: IFC = International Finance Corporation.

IFC infrastructure projects’ development outcome ratings fell from 63 percent mostly successful or better in CY13–15 to 40 percent in CY16–18. Development outcome ratings for projects involving oil and gas exploration and junior mining companies declined sharply (from 73 percent mostly successful or better in CY13–15 to 13 percent in CY16–18); IFC has halted or reoriented most of its oil, gas, and mining investing. Core infrastructure projects (that is, excluding oil, gas, and mining) were 46 percent mostly successful or better, which is similar to other Industry Groups. IEG’s review of infrastructure projects contains another lesson with wider applicability for IFC project success. These projects have shown that essential client actions, such as obtaining operating permits or licenses or reporting monitoring data, must be completed before disbursing the equity investment to the client. This is because IFC is a minority shareholder with limited recourse or influence after investments are disbursed. These projects also show that IFC’s early and continuous project engagement contributed to successful social and environmental ratings, particularly when companies expanded into different sectors and countries and thus benefited more from IFC’s advice. In recent years, IFC has expanded its advice to existing and prospective clients on social, gender, environmental, and community engagement issues.

Advisory Services

Development effectiveness ratings for IFC advisory services projects show signs of improvement. Development effectiveness ratings peaked in FY12–14, when 65 percent of advisory projects were rated mostly successful or better (figure 2.10). This declined to 38 percent in FY15–17 before increasing to 41 percent for projects evaluated in FY16–18 and 50 percent for FY17–19 (based on very preliminary FY19 data and therefore subject to change).13 When calculated by the advisory project’s funding amount rather than the number of projects, development effectiveness ratings declined from 70 percent mostly successful or better in FY12–14 to 33 percent in FY15–17, before increasing to 49 percent in FY17–19.
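
The three-year figures quoted above smooth year-to-year noise in small annual evaluation cohorts. A minimal sketch, assuming a simple unweighted average of annual shares (IEG may instead pool projects across the three cohorts); the annual figures below are invented purely to illustrate the calculation:

```python
# Hedged sketch: three-year moving average of annual "mostly successful
# or better" shares. Annual values are illustrative, not IEG data.
annual_ms_share = {2012: 0.66, 2013: 0.64, 2014: 0.65,
                   2015: 0.40, 2016: 0.38, 2017: 0.36}

def three_year_avg(shares, end_year):
    """Unweighted mean of the three fiscal years ending in end_year."""
    return sum(shares[end_year - k] for k in range(3)) / 3

print(f"FY12-14: {three_year_avg(annual_ms_share, 2014):.0%}")  # 65%
print(f"FY15-17: {three_year_avg(annual_ms_share, 2017):.0%}")  # 38%
```

A pooled calculation (total successful projects across the three cohorts divided by total evaluated projects) would weight larger cohorts more heavily and can give slightly different results.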

Figure 2.10. IFC Advisory Project Development Effectiveness Rating, Three-Year Moving Averages

Image

Source: Independent Evaluation Group.

Note: IFC = International Finance Corporation; MS+ = mostly successful or better.

Explaining the IFC Trends

IEG researched many possible explanations for the long decline in IFC investments’ development outcome ratings. The underlying joint IFC-IEG evaluation and ratings methodologies did not change during this period, so methodological changes cannot explain the decline. The ratings trends for IFC investments also differ substantially from those of the Asian Development Bank and the European Bank for Reconstruction and Development, the only other multilateral development banks that publish ratings for private sector operations. Both institutions’ development outcome ratings for private sector investment projects increased over the same 10-year period in which IFC’s ratings dropped, so global economic conditions alone cannot explain the decline. Additionally, IFC’s development outcome ratings declined in all Regions; in all four industry groups (figure 2.11); in IDA-eligible, FCV-affected, and IBRD countries; for both equity and loan instruments; and in both greenfield and expansion projects. Because the declines were across the board, major declines in specific project categories cannot explain them either. Furthermore, IFC’s business volume stayed approximately the same over the past 10 years, with no major investment increases in low-capacity countries, so rapid business growth cannot explain the decline. In fact, IFC ratings in IDA-eligible countries are slightly higher than in IBRD countries.

Figure 2.11. IFC Investment Project Development Outcome Ratings by Industry Group

Image

Source: Independent Evaluation Group.

Note: CDF = Disruptive Technology and Funds; IFC = International Finance Corporation; FM = Financial Markets; Infra = Infrastructure; MAS = Manufacturing, Agribusiness, and Services.

A combination of internal work quality issues, external risk factors, and broader market trends helps explain IFC’s investment performance trends. A joint IFC-IEG study from 2017 identified work quality and credit and country risks as significant drivers of investment projects’ development outcome ratings. Staffing, incentives, organizational culture, a focus on volume targets over development results, and diffused accountability were the main factors affecting IFC’s work quality. IFC endorsed those findings and has since taken many steps to implement the joint study’s recommendations, including setting up a vice presidential unit to focus on development results, seeking stronger country engagement with improved analytics, and screening projects ex ante for anticipated outcomes.14

External risk factors also influence projects’ performance. IFC invests in many domestic, medium-size firms affected by a variety of risks. IEG’s review of project validations found that market, country, and sponsor risks and transaction structuring were the factors most clearly associated with IFC investment projects’ performance. IEG reviewed nearly two-thirds of the projects it had evaluated from 2016 to 2018 and used a machine learning framework to analyze all 720 IFC investment projects it evaluated between 2010 and 2018. Both reviews sought to identify factors associated with projects’ success and underperformance, and both identified sponsor selection risks, market risks, country risks, and transaction structuring as the factors that most frequently distinguished projects with good ratings from less successful projects (figure 2.12). The machine learning algorithm clustered projects into two groups: those with high development outcome and high work quality ratings (“high work quality,” 318 projects) and those with low development outcome and low work quality ratings (“low work quality,” 213 projects).

  • Sponsor risks (risks linked to the client company in which IFC invests) were 1.8 times more frequent in projects with low work quality than in those with high work quality. IFC knew the sponsor well in the positively rated projects (high work quality). Either the sponsor was a repeat client in good standing and had strong business fundamentals or IFC’s due diligence had concluded that the client had the necessary knowledge and experience. By contrast, sponsors of projects with low work quality had started new business lines in which they lacked relevant experience or were highly leveraged.
  • Projects with high work quality coped better with market risks, which affected all types of projects. For example, slowdowns in visiting tourists and an oversupply of hotel rooms affected some tourism projects. Weak consumer demand caused by slowing economic growth and currency devaluations affected some agribusiness and forestry projects. Demand that was weaker than expected or competitors adding infrastructure capacity affected some infrastructure projects. Such market risks had only temporary effects on projects with sound underlying business fundamentals, strong sponsors, and enough liquidity. Market risks had the most lasting impacts on the success of projects without these strong fundamentals.
  • Country risks increased in relative influence on investment projects. The most common country risks were currency devaluations and political and regulatory risks. These risks increased for projects with both low work quality and high work quality that IEG evaluated in 2013–15 and 2016–18. However, the machine learning algorithm found that in projects with high work quality, IFC and its clients adapted well to country risks. Sometimes, IFC successfully mitigated the impact of currency devaluations on projects through local currency loans. However, there are examples where IFC lent to clients in foreign currency even though local currency IFC loans were also available. Other projects with low work quality relied on anticipated regulatory changes for project viability. However, these changes often took much longer than anticipated or did not happen at all, adversely affecting project results.
  • The quality of IFC’s transaction structuring, additionality, and sensitivity analyses varied between projects with high work quality and low work quality. Details vary across industry groups. Examples of strong IFC transaction structuring included good selection of IFC investment products, careful scrutiny of intragroup risks when investing in holding companies, rigorous analysis of market and exchange rate risks, and realistic consideration of a bank’s condition and priorities before investing in those banks.
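
The “1.8 times more frequent” comparison above is a per-project frequency ratio between the two clusters. A minimal sketch under assumed tallies: the cluster sizes follow the text (318 high work quality, 213 low work quality), but the factor counts are hypothetical, chosen only to reproduce the approximate sponsor-risk ratio:

```python
# Hedged sketch: how many times more often a risk factor appears per
# project in the low-work-quality cluster than in the high-work-quality
# cluster. Cluster sizes are from the text; factor tallies are assumed.
HIGH_N, LOW_N = 318, 213
high_counts = {"sponsor_risk": 60, "market_risk": 120}  # hypothetical
low_counts = {"sponsor_risk": 72, "market_risk": 95}    # hypothetical

def frequency_ratio(factor):
    """Per-project frequency in the low cluster divided by the
    per-project frequency in the high cluster."""
    return (low_counts[factor] / LOW_N) / (high_counts[factor] / HIGH_N)

print(round(frequency_ratio("sponsor_risk"), 1))  # 1.8
```

Normalizing by cluster size matters here: the raw counts alone would mislead because the two clusters contain different numbers of projects.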

Figure 2.12. Factors Affecting IFC Investment Performance

Image

Source: Independent Evaluation Group.

Note: DO = development outcome; IFC = International Finance Corporation.

Broader market trends may have made IFC’s business model more exposed to certain risks. IFC screens for risks when selecting projects, but there is a finite pool of repeat clients and bankable or viable investment projects, so IFC must accept certain risks when it invests. Moreover, the pool of viable projects available to IFC may have shrunk because rival financiers, both private and multilateral, have expanded into emerging and developing markets over the past few decades. A weaker pool of viable investment projects can translate into less attractive risk-reward profiles, which would contribute to the ratings decline. This means that better internal identification of risks during project preparation may not suffice. In recent years, IFC has taken many steps to grow the pool of bankable investment projects and to better identify market opportunities and constraints, as described in box 2.4. It has also taken other steps to increase its focus on outcomes, including providing specialist resources to advise teams, encouraging midcourse corrections, and introducing a new tool (Anticipated Impact Measurement and Monitoring) to assess and screen projects for expected outcomes before approval.15 These steps may help align IFC’s business model with market, country, and sponsor risks, though it is too early to tell whether they will improve ratings. Additional steps IFC could consider include enhanced tools and processes to identify and mitigate risks during supervision.

Box 2.4. IFC’s Reforms to Strengthen Upstream Engagement

The International Finance Corporation (IFC) has prioritized upstream engagement in its IFC 3.0 strategy. Upstream engagement can increase the number of bankable investment opportunities through regulatory reforms that unlock private investment and through the development of viable investment projects. To support this shift, IFC has updated its funding and operating model to encourage upstream engagement and invested significant resources in developing its project pipeline.

IFC has strengthened its focus on country outcomes through new IFC Country Strategies and analytical tools such as Country Private Sector Diagnostics and IFC Sector Deep Dives. These tools aim to provide a deeper understanding of market constraints and opportunities and to support better-coordinated upstream engagements that may yield greater development outcomes. IFC has also integrated advisory teams into industry groups and introduced a new additionality framework.

Source: Independent Evaluation Group, based on documents from the International Finance Corporation.

For IFC advisory projects, project size and duration and a change of team leader had a statistically significant negative association with project success.16 Some of the larger and longer-lasting projects were riskier, for example, if they involved public sector clients and complex regulatory reforms such as those in business climate and public-private partnerships. Some of these larger advisory projects were more likely to encounter difficulties with political economy and counterparts’ capacity compared with simpler projects with private sector clients. Such difficulties could increase in importance as IFC expands its upstream engagements and its program in challenging and fragile markets.

Other factors that mattered for IFC advisory projects’ success included the client’s commitment, IFC’s flexible and proactive supervision, and robust project monitoring and evaluation (M&E).17 Client commitment was a major driver of advisory projects’ success. Indications of client commitment include alignment with the client’s established business plan or ongoing activities, client contributions to project costs, and the seniority of interlocutor staff. Commitment can be fostered by aligning client and project objectives, involving clients closely in project design, and establishing a variety of client interlocutors beyond the project’s day-to-day counterparts. Staff patience and flexibility in responding to changing circumstances (such as government personnel changes) also contributed to success, and detecting signs of waning client commitment and restructuring projects accordingly proved important. However, such restructurings helped only when clients showed continued commitment, for example, through responsiveness and engagement; otherwise, canceling the projects was preferable. IFC staff and managers’ proactive involvement in decisions to restructure, cancel, or shorten projects was important because consultants contracted to a project may lack the incentive to recommend such actions. Robust project M&E gives IFC teams a more detailed understanding of projects’ achievements and challenges so that they can adjust implementation as needed and achieve results. As reported in RAP 2018, IFC has worked to strengthen its work quality for some years, including through greater attention to projects’ scope and results frameworks, self-evaluations, and staff training.

MIGA Projects

Ratings for MIGA projects’ development outcomes increased over the past 10 years. Specifically, the ratings increased from 64 percent satisfactory or better (S+) in FY07–12 to 69 percent in FY13–18 (figure 2.13). When calculating ratings by gross issuance amounts, MIGA development outcome ratings increased from 61 percent S+ to 75 percent over the same time frame. These higher ratings have continued into the most recent ratings period. The increases are driven by higher ratings for MIGA projects in IDA countries (from 59 percent S+ in FY07–12 to 77 percent in FY13–18), in Europe and Central Asia (from 56 percent S+ to 73 percent), and in the Energy and Extractive Industries sector (from 67 percent S+ to 79 percent). MIGA projects are rated at 77 percent S+ in IDA countries, which are a strategic focus for MIGA, compared with 63 percent S+ in non-IDA countries. Projects in FCV-affected countries also had high ratings at 88 percent S+ in FY13–18.

Figure 2.13. MIGA Project Development Outcome Rating, Six-Year Rolling Basis

Image

Source: Independent Evaluation Group.

Note: FY = fiscal year; MIGA = Multilateral Investment Guarantee Agency; S+ = satisfactory or better.

The financial markets sector had the lowest ratings at 58 percent S+. These low ratings were caused by the global financial crisis’s adverse impact on financial markets in Eastern Europe and Central Asia and by issues with MIGA’s assessment, underwriting, and monitoring. MIGA has diversified its portfolio in Eastern Europe and Central Asia away from financial markets to other sectors, which has helped improve MIGA’s performance trend in that region.

MIGA’s work quality has improved. Ratings for MIGA’s Assessment, Underwriting, and Monitoring increased from 54 percent S+ in FY07–12 to 59 percent in FY13–18. Ratings for the environmental and social effects of MIGA guarantee projects increased from 50 percent S+ in FY07–12 to 83 percent in FY13–18, following MIGA’s adoption of its Performance Standards on Social and Environmental Sustainability in 2007.

MIGA’s clients are larger multinational investors, reflecting its mandate to promote cross-border investment in developing countries by providing guarantees to investors and lenders. MIGA guarantees against political risks; other investors carry the credit risk. The relatively large size of MIGA-supported projects—on average $109 million in gross issuance—makes them visible to host countries and motivates governments to help them succeed, for example, by undertaking planned regulatory reforms. MIGA originates about 62 percent of its projects from Part 1 countries.

MIGA played an active and important role in promoting private sector investment through projects in IDA and FCV-affected countries, based on IEG’s review of 13 MIGA projects in such countries evaluated in FY17 and FY18. The reviewed projects were all relevant because they fit with MIGA’s and host countries’ strategic priorities. Successful infrastructure projects were sponsored by capable international investors who introduced competitive power generation or other technologies; some large-scale power projects were the first of their kind in the country. MIGA helped deter political risks and resolve emerging issues, for example, on arrears payments by governments. In successful agribusiness projects, MIGA provided reinsurance for foreign direct investment in IDA countries, created new supply chains, provided trademark license agreement guarantees, and integrated farmers and others into new processing facilities, irrigation networks, or distribution networks. Generally, the agribusiness projects were socially, economically, and environmentally sustainable, and their demonstration effect encouraged future private sector participation in the sector. This highlights a main difference between successful and less successful MIGA projects: the project’s market and business sustainability.18 For example, unsuccessful power sector projects had low market and business sustainability because of lower consumer demand for power and intense competition from rival sources of power generation. In the telecom sector, some projects were unsustainable because episodes of violence or increased competition led to fewer subscribers than expected.

  1. Again, the fiscal year (FY)19 data are preliminary, so these numbers will change as more projects complete their evaluations.
  2. The World Bank nearly doubled the average size of new projects, from $87 million in FY05–07 to $157 million in FY09–10 (see also World Bank 2012).
  3. The Country Policy and Institutional Assessment score is an indicator of countries’ policy framework and institutional capacity.
  4. Two World Bank reports (2016a, 2018b) summarize the evidence, including an internal audit study.
  5. Unlike the International Finance Corporation (IFC), the World Bank does not have a rating system for its knowledge products. Perception surveys suggest they can be influential. AidData’s survey in Custer and others (2015) was updated in AidData’s 2014 Reform Efforts Survey Aggregate Data Set (2017).
  6. This is according to Completion and Learning Reviews, the Independent Evaluation Group’s (IEG) reviews of closed country programs.
  7. World Bank (2018b) summarizes the studies.
  8. According to a custom calculation that AidData provided to the Results and Performance of the World Bank Group, the average score for World Bank influence was 3.76 among non–fragility, conflict, and violence respondents, compared with 3.28 for respondents in countries that were classified as fragility, conflict, and violence–affected in FY14, the year before the survey. This is based on data documented in Custer and others (2015) and updated in AidData’s 2014 Reform Efforts Survey Aggregate Data Set (2017).
  9. Mirroring this, econometric research has linked project outcomes to the project team’s access to time, budget, and knowledge (see Ika 2015; and World Bank 2016b, 2017).
  10. This index is the absolute difference in months between projects’ approval time (time from inception to approval) and projects’ duration (time from effectiveness to project close).
  11. The score for quality at entry was calculated by converting the 6-point scale into numerical values: highly unsatisfactory = −3, unsatisfactory = −2, moderately unsatisfactory = −1, moderately satisfactory = 1, satisfactory = 2, and highly satisfactory = 3.
  12. The tentative reversal in IFC’s ratings trend is, however, within the margin of error, given that only a sample of IFC projects undergo ex post evaluation and that not all of the projects sampled for evaluation in the calendar year 2019 cohort have finished their evaluations.
  13. Although the FY17–19 estimate is based on 171 evaluated projects, the FY19 data are based on only 36 evaluated projects out of 54 projects sampled for evaluation. Estimates will therefore change as more projects finish their evaluations.
  14. IFC’s managerial actions translate into ratings for the mature portfolio with a long delay. That is because the projects rated this year were approved years before the mentioned actions.
  15. See World Bank (2019, 19) for a fuller description of IFC’s efforts to improve work quality.
  16. This is based on all 169 advisory projects evaluated between FY16 and FY18.
  17. This is based on the IEG’s review of 42 advisory projects evaluated in FY18.
  18. Of the 13 projects in International Development Association and fragility, conflict, and violence–affected countries evaluated in FY17 and FY18, 10 projects were rated satisfactory or better and 3 projects were rated less than satisfactory. All of the projects fit with the Multilateral Investment Guarantee Agency’s and host countries’ strategic priorities.