The World Bank Group Outcome Orientation at the Country Level

Chapter 3 | Capturing Outcomes


The World Bank Group’s country-level results system does not capture the Bank Group’s contributions to country-level outcomes because it relies on results frameworks predicated on three principles that fit the nature of country programs poorly: metrics, attribution, and time-boundedness.

Country Partnership Framework results frameworks do not capture well the country strategies’ objectives or the Bank Group’s contributions to country outcomes, which are difficult to measure because they are often not quantifiable and because of data gaps in many countries.

Country strategies lay out future objectives, but Country Partnership Framework results frameworks primarily capture past operations and do not capture medium- to long-term effects. Most country strategy indicators come from lending projects and do not properly capture the contribution of advisory services, analytics, policy dialogues, or stakeholders’ convening. The contributions from the International Finance Corporation and the Multilateral Investment Guarantee Agency are also not well captured.

There are costs associated with generating indicators that are not used.

The Bank Group’s country self-evaluation product, the Completion and Learning Review, is little used within the institution and provides a partial picture of Bank Group contributions.

Although some country teams carry out other evaluative exercises to assess the achievement of outcomes, these are rare and overshadowed by the results system’s emphasis on targets and indicators, an emphasis that also consumes the time of monitoring and evaluation staff.

This chapter reviews how country teams measure the Bank Group’s influence over country outcomes. There are several tools used in the country-level results system: CPF results frameworks, which are developed at the CPF design stage; the CLR, which is a mandatory self-evaluation of the country engagement carried out by the country team at the end of the cycle and is subsequently validated by IEG in the CLRR; and other nonmandatory evaluative activities. This chapter reviews how country teams use these three sets of tools. The chapter shows that country-level results systems provide an incomplete representation of the Bank Group’s contribution to country outcomes. The cause of this does not lie in country teams’ practices—they strive to faithfully follow the guidelines—but rather in the system’s measurement principles and tools and their institutional execution.

Results Frameworks

The CPF results framework is the cornerstone of the Bank Group’s country-level results system. CPF results frameworks, also referred to as “results matrixes,” assign a set of indicators and targets to CPF objectives. CPF results frameworks are developed at the beginning of the country engagement cycle, can be revised midway through the cycle in the PLR, and are used to rate the Bank Group’s results and performance at the end of the cycle in the CLR. As the principle of results-based management expanded from the project level to the country level, the main features of the project results measurement system were transposed to country programs (World Bank 2016f). The system measures objectives through time-bounded quantifiable metrics that are directly attributable to Bank Group interventions. Like projects, country programs are rated, through an objective-based assessment, on whether they have achieved their stated objectives during the CPF cycle. The ratings methodology adopted by the Bank Group and IEG assesses the extent to which the target indicators for each objective have been met and whether the indicators sufficiently reflect the objective. The purpose of this approach is to promote accountability for delivering results by judging success or failure by how far results ex post deviate from targets that were forecast ex ante, through indicators that can be verified by external validators and are time bound, as only achievements made during the CPF cycle are counted. The rest of this chapter shows how these principles shape country teams’ practice and inadequately capture the Bank Group’s contributions to country-level outcomes.

CPF results frameworks’ measurements do not capture well the objectives pursued and about half the time are framed below the level of the objectives they seek to capture, particularly because of the expectation of attribution. A comparison between CPF objectives and indicators shows a sizable mismatch between the outcome focus of objectives and their associated indicators. More than half (52 percent) of CPF objectives are measured by indicators that are below the objective’s level. For example, figure 3.1 shows that when the Bank Group pursues intermediate or long-term outcomes, it tends to measure them with lower-level indicators. In interviews, staff suggested that this is because the guidance on CPF results frameworks limits them to selecting indicators that are mostly attributable to Bank Group efforts and can be verified by IEG. Country staff are also understandably reluctant to be held accountable for achieving a target indicator that is beyond their control, as illustrated in figure 3.1. They also find it challenging to identify quantifiable, time-bound, attributable, and verifiable indicators that can capture the CPF’s objectives. As a result, the indicators they select often do not accurately reflect the CPF’s objectives or fundamentally capture the country engagement’s most important development results.

Figure 3.1. Indicator Outcome Level by Type of Objective

Source: Independent Evaluation Group.

Note: n = 361 Country Partnership Framework indicators. Level 1: Outputs; level 2: Immediate outcome; level 3: Intermediate outcome; level 4: Long-term outcome. The size of the circles is proportional to the share of indicators at each level.

Although CPFs lay out objectives for the future, their results frameworks primarily capture past operations. There is a built-in timing mismatch between CPF objectives and their measurement that country teams cannot reconcile. Country teams are asked to lay out objectives for the future that support high-level country outcomes; however, they are also tasked with finding result indicators that can be achieved and measured within a typical three-to-five-year CPF cycle (World Bank 2019u, 20). Because it often takes many years for a Bank Group intervention to move from conception to impact, in practice this means that when country teams are designing a CPF, they must select indicators from interventions that were designed long before the CPF objectives were formulated. They do so because these result indicators will materialize and be measurable by the end of the CPF, even though they may not align well with the new CPF’s objectives. Alternatively, teams must speculate on the measurable results of a forward-looking pipeline, which can be uncertain, especially for private sector investments or in volatile country contexts. Country teams’ widely shared sentiment is that this approach to developing results frameworks is artificial and detracts from outcome orientation. There are cases where this mismatch is particularly salient. An extreme example is Vietnam, where new lending dropped dramatically after the country’s graduation from IDA, from roughly $10 billion in lending to $67 million in the first year after graduation (though this is expected to rise). Because of this, roughly 70 percent of Vietnam’s CPF results framework was based on the legacy portfolio.

The country-level results system does not adequately capture the medium- to long-term effects of Bank Group interventions. A consequence of the reliance on metrics and on project-level monitoring systems is that results frameworks capture only the results that occur by the time a project closes. Yet many effects, especially indirect effects, take longer to manifest. For example, some of the most important benefits from slum upgrading and off-grid power projects in Vietnam were the way in which they influenced the design of much larger government programs. Similarly, IFC’s support for mobile phone providers in the Pacific Islands has been transformative for telecommunications, but it took time for these effects to be fully realized. These types of effects are not well captured by the country-level results system.

The country-level results system relies on quantitative metrics, but these do not capture many of the Bank Group’s outcomes at the country level. There are several elements in the CLR methodology that privilege metrics and numbers. For instance, the guidance emphasizes that indicators should be observable and externally verifiable by IEG. Yet, as one staff member noted, “By focusing on what is measurable, we are not capturing what is meaningful or useful, especially in terms of the big picture.” Most interviewed Bank Group staff perceived metrics as a challenge, or as a distortion, and shared the concern that the current results measurement system does not measure the most valuable Bank Group contributions to country outcomes, especially in sectors that are inherently harder to measure than others. To illustrate, among the five sectors reviewed by IEG, the discrepancy between objectives and indicators is the sharpest in the governance outcome area, which includes many of the Bank Group’s institutional capacity-building efforts and is the hardest to adequately capture with simple metrics (figure 3.2). In other sectors—where the Bank Group can deliver countable, verifiable outputs such as building a road network—the discrepancy is smaller.

Figure 3.2. Discrepancy between Objective and Indicator Level of Outcomes


Source: Independent Evaluation Group.

Note: n = 361 Country Partnership Framework indicators. The Independent Evaluation Group calculated the average difference between the outcome level of objectives and their attached indicators and found that, on average, indicators are formulated at an outcome level that is below the outcome level of their measured objectives. This measurement gap is higher for the Governance Global Practice, where the formulated indicators are, on average, 1.3 levels below the outcome level of their measured objectives.
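The calculation described in the note can be illustrated with a minimal Python sketch. It assumes each indicator is coded with the outcome level of its objective and its own outcome level on the 1 to 4 scale from the note to figure 3.1. The records, sector labels, and function names below are hypothetical and invented for illustration; they are not IEG's data or code. The second function counts objective-indicator pairs, which simplifies the chapter's objective-level statistic.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records, invented for illustration only:
# (sector, objective_level, indicator_level)
# Levels: 1 = output, 2 = immediate outcome,
#         3 = intermediate outcome, 4 = long-term outcome.
SAMPLE = [
    ("Governance", 4, 2),
    ("Governance", 3, 2),
    ("Transport", 2, 2),
    ("Transport", 3, 2),
    ("Health", 3, 3),
]


def average_gap_by_sector(records):
    """Mean of (objective level - indicator level) for each sector."""
    gaps = defaultdict(list)
    for sector, objective_level, indicator_level in records:
        gaps[sector].append(objective_level - indicator_level)
    return {sector: mean(values) for sector, values in gaps.items()}


def share_below_objective(records):
    """Share of objective-indicator pairs whose indicator sits below the objective."""
    below = sum(1 for _, objective_level, indicator_level in records
                if indicator_level < objective_level)
    return below / len(records)


if __name__ == "__main__":
    for sector, gap in sorted(average_gap_by_sector(SAMPLE).items()):
        print(f"{sector}: average gap of {gap:.1f} levels")
    print(f"Pairs measured below objective level: {share_below_objective(SAMPLE):.0%}")
```

A positive average gap means that indicators are, on average, framed below the outcome level of the objectives they are meant to measure.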

Data gaps in countries also make it difficult for country teams to select outcome indicators. Many countries, particularly IDA and FCV-affected countries, lack strong statistical and monitoring and evaluation systems, and some are reluctant to share the data they have; as a result, less data on country outcomes is available, even for sectors that are easier to measure (World Bank 2017a). Some sectors, such as governance, that seek system changes are inherently difficult to measure. In other sectors, like health, energy, or social protection, outcome data can be routinely collected through regular sectoral operations (for example, electricity or energy generation and consumption, prices, access, system losses, stunting rate). But even for these, many World Bank clients do not have adequate information systems to track development progress. For instance, the 2018 Global Partnership Monitoring Round report reveals that 64 percent of countries have good-quality national development strategies, but only 35 percent of governments have data and monitoring and evaluation systems to track these strategies (OECD and UNDP 2019). Country teams working in IDA, and especially in FCV countries, emphasized that they struggle with scarce and low-quality data. For example, in the Solomon Islands, there have only been two major household surveys in the past two decades, and methodological differences between them mean they are not easily comparable, limiting the country’s ability to monitor poverty. FCV governments often lack a system to track their own national development strategies and rely on development partners to track even basic outputs. However, partners have their own results measurement and reporting systems, making data from different partners difficult to reconcile. In many FCV contexts, poor security situations have made basic supervision and monitoring tasks difficult to carry out. The Bank Group has made efforts to mitigate these issues by investing in third-party monitoring and other monitoring and evaluation platforms, such as the Geo-enabling Initiative for Monitoring and Supervision, to compile data collection and map portfolios. But these solutions are only short-term fixes and do not address the larger data scarcity issues in FCV countries.

Box 3.1. Illustrating the Mismatch: Nicaragua Early Childhood

Nicaragua preschool education: In Nicaragua, the Country Partnership Strategy (CPS) FY13–17 sought to increase coverage of preschool education (ages three to five; World Bank 2012b). The team included an indicator on access in the CPS results framework, but the indicator was dropped at the Performance and Learning Review stage. The reason was that it was considered too ambitious to expect results within one CPS cycle. The outcome indicator was replaced with an output measure or milestone that was within the control of the World Bank: “system to assess early childhood development outcome in place and operational.” Once the CPS cycle closed, the Completion and Learning Review failed to take stock of progress toward access. At the beginning of the next cycle, the Systematic Country Diagnostic noted that Nicaragua had the second-lowest rate of access to preschool in Central America, after Guatemala (World Bank 2017c, 75), thus again signaling the importance of moving toward increased access. Yet the Country Partnership Framework did not measure progress toward this outcome and again included a more output-oriented measure within the control of the World Bank Group: the percentage of preschool teachers with desired teaching practices (World Bank 2018d). In short, two country engagement cycles have passed, and the World Bank Group’s documents are still unclear on whether progress in access has been achieved and how Bank Group interventions influenced these changes.

Most CPF results indicators are taken from lending projects. For instance, among the 361 CPF indicators reviewed by IEG, two-thirds were traced to indicators of a single or multiple Bank Group projects. This was hard to calculate because 63 percent of indicators did not list their source in country documents, so IEG had to collect this information through the World Bank’s operations portal, the World Bank’s operations database, and the Bank Group’s project documents. The remaining one-third of indicators that did not come from projects typically measure an action taken by a government with the Bank Group’s support—usually through development policy financing—and only rarely measure changes to country outcomes. Interviewees explained that country teams harvest indicators from discrete lending and investment projects because this ensures the data are available and attributable to the World Bank’s interventions. Another reason for using project indicators is that monitoring and evaluation is funded through projects but under-resourced at the country level. Country teams try to avoid additional evidence gathering because of the limited resources and expertise available.

CPF results frameworks do not properly capture the contributions of advisory services, analytics, policy dialogues, or convening power. The country-level results system is meant to capture the effects of various types of interventions, lending and nonlending. Interviewed World Bank staff recognized the challenge of quantifying verifiable metrics for analytical and capacity-building work. At the same time, there is emerging empirical evidence (from AidData microlevel surveys with 1,244 public sector officials in 121 developing countries) that the World Bank has substantially greater influence over the direction, design, and implementation of government policies than most of its bilateral and multilateral peers. This policy influence is at least in part because of the World Bank’s analytical work, advisory services, and technical assistance. Country teams, especially interviewed staff working on IBRD countries, find it difficult in the current results system to do justice to the World Bank’s ASA contributions to institutional strengthening and capacity building. Box 3.2 illustrates an attempt at going beyond what is expected from the results system and carrying out a dedicated evaluation of the CMU’s RAS program. IFC has developed a project-level evaluation system for advisory services that seeks to capture the influence of the project on direct recipients of the advice as well as the extent to which project activities have made a difference at the market or sector level. However, IFC’s contribution is underrepresented in CPF results frameworks, and only a very small number of indicators in the sample reviewed captured the results of IFC’s advisory services.

Box 3.2. Evaluating Advisory Services in Romania

The Romania country engagement has a large reimbursable advisory services (RAS) program, and the country office decided to commission two evaluations of its program to better understand its contribution to institutional change in the country. The first covered the RAS program between 2012 and 2015; the second is in process. The first evaluation assessed the 39 RAS agreements conducted with line ministries, government agencies, and local authorities for a total value of approximately $59 million. The evaluation classified the type of capacity challenges that the RAS program was seeking to address and traced the effect of RAS along an institutional capacity change pathway. The evaluation concluded that the World Bank had strengthened institutional capacity across 15 Romanian agencies at the central, regional, and country levels through three main channels: enhancing the effectiveness of organizational arrangements, enhancing the efficiency of policy instruments, and strengthening the ownership for reforms among various levels of institutional actors. The RAS program had played a key catalytic role in helping Romania leverage European Union funding for achieving development objectives. The Romania country office found the evaluation useful and better adapted to its knowledge and accountability needs than Country Partnership Framework results frameworks. To dig deeper into assessing and understanding its contribution to Romania’s institutional strengthening, the country team is partnering with the Governance Global Practice to carry out a second evaluation and test specific parts of its theory of institutional change.

Source: World Bank 2016d.

Staff raised concerns that country-level results systems have a disproportionate focus on lending projects, rather than on other instruments, and on medium-term goals rather than long-term outcomes. Current country-level results systems collect information primarily on lending and investments because they supply measurable indicators of progress. However, this approach misses the results contributions from other instruments, such as capacity building, analytical work and policy dialogue, and combined interventions; thus, it does not promote synergy across interventions to support objectives. There are signs that complying with the country-level results system has subtle but pernicious effects on teams’ behaviors. Staff argue that if the results system disproportionately focuses on lending, it can have the adverse effect of driving attention and resources away from other instruments that in certain contexts are better suited to pursue certain types of outcomes. Many staff also argue that the emphasis on results frameworks and attributable time-bound indicators has other negative effects. It encourages country teams to pay too much attention to outputs and immediate goals rather than the big picture; it misses opportunities to identify synergies and complementarities across instruments; and it encourages risk aversion by favoring safe and achievable projects that might have solid but incremental effects rather than risky endeavors that might have transformative impacts or might miss their targets.

IFC’s and MIGA’s contributions to outcomes are not well captured in CPF results frameworks. Forty percent of the CPF objectives reviewed for this evaluation mention IFC and MIGA as contributing to the objective, yet only 16 percent of CPF objectives had an indicator that captures these IFC or MIGA contributions. This gap is partially explained by IFC’s and MIGA’s project pipelines being largely unknown at the CPF design stage, making it difficult for teams to establish clear IFC and MIGA metrics and targets. Interviews with IFC and MIGA staff highlighted the challenges they face in linking their interventions with country outcomes. They find it more reasonable to link IFC and MIGA work to country outcomes in cases where they are supporting projects that are of national significance, such as when they are supporting the first of its kind in a particular industry or technology or a major player with high market share. This is more common in certain industries, such as infrastructure. In the five sectors reviewed by IEG, the transport sector had the highest share of IFC or MIGA contributions covered by indicators (76 percent), followed by the water sector (44 percent). IFC investments in sustainable mining in Guyana or renewable energy in Argentina are examples of how IFC can generate impacts beyond a specific client (see box 3.3). The IFC Anticipated Impact Measurement and Monitoring and MIGA IMPACT tools incentivize teams to consider development impact when structuring transactions by more systematically assessing potential impacts. For example, Anticipated Impact Measurement and Monitoring develops ex ante impact claims based on IFC’s anticipated market creation and stakeholder and environmental effects. IMPACT assesses the ex ante development effects of MIGA guarantee projects, including their ability to attract and mobilize foreign direct investment. These tools are designed to encourage staff to prioritize potential development impacts as well as financial returns. This is different from the challenge in the World Bank, which is to create a clearer line of sight between country programs and country-level outcomes.
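The sector coverage shares cited above can be illustrated with a short Python sketch. It assumes that a sector's share is the number of objectives with an indicator capturing the IFC or MIGA contribution, divided by the number of objectives in that sector that mention IFC or MIGA; the evaluation does not spell out the denominator, so this is an assumption. The `Objective` record, the sample data, and the function name are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Objective:
    sector: str
    mentions_ifc_miga: bool       # objective names IFC or MIGA as a contributor
    has_ifc_miga_indicator: bool  # an indicator captures that contribution


# Hypothetical objectives, invented for illustration only.
SAMPLE_OBJECTIVES = [
    Objective("Transport", True, True),
    Objective("Transport", True, True),
    Objective("Transport", True, False),
    Objective("Transport", False, False),
    Objective("Water", True, True),
    Objective("Water", True, False),
    Objective("Governance", False, False),
]


def coverage_by_sector(objectives):
    """Per sector: share of IFC/MIGA-mentioning objectives with a matching indicator.

    Sectors with no IFC/MIGA-mentioning objective are omitted (assumed convention).
    """
    mentioned = {}
    covered = {}
    for obj in objectives:
        if not obj.mentions_ifc_miga:
            continue
        mentioned[obj.sector] = mentioned.get(obj.sector, 0) + 1
        if obj.has_ifc_miga_indicator:
            covered[obj.sector] = covered.get(obj.sector, 0) + 1
    return {sector: covered.get(sector, 0) / count
            for sector, count in mentioned.items()}


if __name__ == "__main__":
    for sector, share in sorted(coverage_by_sector(SAMPLE_OBJECTIVES).items()):
        print(f"{sector}: {share:.0%} of IFC/MIGA contributions covered by an indicator")
```

Sectors where no objective mentions IFC or MIGA are left out of the result rather than reported as zero, so that an absent contribution is not confused with an unmeasured one.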

Box 3.3. The International Finance Corporation’s Contributions to Country Outcomes in Guyana

After the International Finance Corporation’s (IFC) initial investments in Guyana Goldfields and Aurora Gold (mining), IFC played an increasingly active role in the country. IFC joined the World Bank’s efforts in playing an important role in the implementation of the country’s Low Carbon Emissions Development Strategy through its contribution to the development of sustainable mining, transfer of environmental management standards and practices, and introduction of appropriate resource use and conservation methods. In fact, IFC’s engagement with Guyana Goldfields helped establish a viable model for sustainable mining development in the country. It fostered local economic development, skills transfer, and the building of environmental management capacity. IFC has made a second equity investment of $4.15 million in the project as a continuation of a partnership focused on introducing clean energy, preserving the biodiversity in the rain forest, and implementing international standards for sustainable mining in a remote region where artisanal mining practices had a significant adverse impact on the environment. The project contributed to local economic growth by generating more than 300 jobs and $1.5 million in purchases of local goods and services.

Source: Independent Evaluation Group.

Country teams make great efforts to comply with the results system, though formal rules allow for some flexibility that is not fully used. Teams take the design of the results framework seriously and apply the methodology as intended. World Bank, IFC, and MIGA staff interviewed for this evaluation say there is a lot of organizational effort and scrutiny that goes into designing CPF results frameworks. Staff tend to identify indicators that meet the system’s basic requirements: for example, among the 361 CPF indicators IEG reviewed, most identified baselines (95 percent) and targets (94 percent). In most cases, staff felt there were not better indicators to use, though they recognized that the indicators they chose were often not meaningful. Overall, staff seemed resigned to complying with a system that did not really work for them. The formal guidance for results frameworks, the Shared Approach for Assessing CPF, does allow some flexibility that is not fully used. It does not require direct attribution; it allows for “alternative indicators” to be used to provide evidence when the results framework evidence is weak; and in principle it assigns ratings based on the degree to which objectives, rather than targets, are achieved. However, teams have internalized a stricter interpretation of the guidance because of advice provided by quality assurance processes, peer reviewers, and interactions between teams and Operations Policy and Country Services or IEG, which has established a strong culture of attribution. In particular, teams are understandably resistant to being held accountable for the achievement of objectives that are influenced by many other factors beyond their control. Yet relaxing this stricter interpretation would still not address the inherent challenges of the results system.

There are costs associated with compiling, processing, quality controlling, and reporting on indicators that are not used. The direct costs associated with the country-level results system are hard to track, as described in box B.17 of appendix B. World Bank and IFC country staff, portfolio analysts, and monitoring and evaluation officers have described many transaction and opportunity costs associated with the time and effort they spend on developing CPF results frameworks at the design stage and revising them later, and then updating indicators for the PLR and CLR through a laborious process that involves individual requests for updates on indicators to sector and industry staff. The difficulty of forecasting a country program’s eventual portfolio at the CPF stage often means that the initial CPF indicators will not be a good representation of eventual outcomes, which adds to the teams’ burden of continually revising indicators. Given the frequent development of country products, some World Bank and IFC staff dedicate a large portion of their work program to feeding results systems, especially in multicountry CMUs with many products. IFC staff especially struggle to keep up, given the additional IFC country products over and above the Bank Group products. These costs of managing indicators tend to overshadow or discourage country teams’ attempts at bringing more relevant and eclectic evidence to any assessment of their achievements and failures. These opportunity costs also crowd out the space afforded by PLRs and CLRs to collectively reflect and learn from experience. The time spent on creating these products and shepherding them through formal Board review processes is time that could be devoted to more useful evaluative or collective thinking tasks.

Completion and Learning Reviews

The CLRs provide a partial picture of the Bank Group’s development outcomes at the country level. In CLRs, country teams tend to focus their efforts on the part of the methodology that assesses the deviation of reported results from the CPF’s forecasted result targets and indicators. IEG validators often have to conduct additional research, consulting databases and external reports, to find additional and more relevant sources of information on progress toward objectives. Thus, the CLR also falls prey to the limitations of the results frameworks by overly focusing on quantifiable, time-bounded, and attributable indicators from individual lending projects that do not capture country outcomes or consider the contribution of IFC and MIGA or ASA products and other tools. For example, in China, many of the World Bank’s most important contributions have come from capacity-building efforts and learning-by-doing facilitated by the World Bank. Government officials gained exposure and experience by applying international best practice in procurement, social and environmental risk management, and project management to their own projects. In Chad, some of the World Bank’s most important contributions were from persuading and convening stakeholders to help Chad obtain heavily indebted poor countries’ debt relief. Neither of these impressive country outcomes was well captured in those countries’ CLRs.

CLRs do not establish whether the Bank Group’s contribution to country outcomes is greater than the sum of individual project objectives. At the CPF design stage, country teams articulate how Bank Group interventions will together contribute to CPF objectives. This articulation is thorough and typically included in both the CPF document’s narrative and the results framework annex. However, at the CLR stage, there is a tendency to engage in “bean counting,” or a mechanical accounting of project results achieved under each CPF objective. For the most part, CLRs do not assess the cumulative effects from multiple projects. This is partially explained by the CLR’s overreliance on results framework indicators as the main source of evidence. IEG found that results frameworks measure over half of CPF objectives with only one or two indicators pulled directly from a single Bank Group operation, whereas most CPF objectives’ intervention logics rely on the combined effects of multiple activities. For instance, the Bank Group has used various instruments in 30 or more relevant interventions to help Serbia reform its state-owned enterprises. Obviously, these multiple and sustained efforts are likely to have a greater cumulative country impact than a simple sum of individual lending outcomes. Box 3.4 illustrates how country programs’ results frameworks reveal very few cumulative results from Serbia’s state-owned enterprises agenda and shows how difficult it is to assess the agenda’s overall progress toward achieving country outcomes.

Box 3.4. A Partial View of Bank Group Progress on Long-Term Reform Agendas: Examples of State-Owned Enterprise Reforms in Serbia

The most impactful contributions of the World Bank Group to country outcomes in the Western Balkans are not quantifiable—they have to do with strengthening the implementation capacity of the governments in key areas of their European Union accession process, including public administration reforms, state-owned enterprise (SOE) reforms, and railway and energy provision reforms. The Bank Group has been a change agent in SOEs and in building various ministries’ capacity to undertake complex and politically sensitive SOE reforms, including in railways and energy.

However, only a partial view of these meaningful contributions can be gleaned from country documents and their results frameworks. The fiscal years 2012–15 strategy tracked only the introduction of rules for the appointments of managers and the coverage of payments for workers made redundant, and the 2016 indicator on the number of companies still to be privatized (a measure of overall progress on privatization) was replaced by an indicator tracking the resolution of unproductive SOEs, as privatization turned out to be more difficult than anticipated.

The Completion and Learning Review considered the objective achieved because the targets were met: the parliament adopted a law and allocated enough resources in the budget to provide severance payments to redundant workers. The institution and capacity building, as well as the transfer of know-how from the Bank Group to the client through years of joint work on SOEs, are missing from these assessments.

Sources: World Bank 2015i, 2015j, 2019q.

Because of their timing and limited content, CLRs are little used within the institution. The CLR is intended to be a learning tool for country teams to collectively reflect on achievements, failures, and experiences (World Bank 2018c, 19). The CLR is carried out by the country team concomitantly with the new CPF and SCD and presented to the Board as a CPF appendix. This is too late for the CLR to meaningfully inform the CPF and means that CLRs are overshadowed by the CPF or SCD, both of which receive much more attention from vice presidents, the Board, and other stakeholders. As a result, very few people read the CLR and little learning emerges from it. Country staff working on CLRs expressed the sentiment that they were writing mostly for IEG validators. They viewed the CLR as a compliance exercise because of its emphasis on updating results frameworks and applying ratings, and they had a tendency to outsource it or assign it to junior staff. Country teams find the lessons from the CLR and its validation by IEG, the CLRR, to be generic. Box 3.5 provides examples of the types of lessons found in CLRs. As one country manager summed it up: “Real evaluation—meaning collecting proper evidence, reflecting on what we do, how we do it, and then distilling these lessons learned—is absolutely critical. The devil is in what the system pushes teams to do. In practice we spend too much time on meaningless things, such as revising targets so that they look ‘perfect,’ and on determining the rating. This type of bean-counting mentality is detrimental to learning and innovating.”

Box 3.5. Examples of Lessons from the Completion and Learning Review

  • Open dialogue with a broad spectrum of stakeholders increases the ownership of the program (Albania CLR for CPS FY11–14).
  • It is important to quickly respond to changes in the country context and to monitor the program promptly (Benin CLR for CPS FY13–18).
  • Proactive engagement with the donor community is important in leveraging our support (Arab Republic of Egypt CLR for CPS FY06–14).
  • The World Bank Group’s contribution to the country’s development agenda is not limited by the size of its portfolio (Honduras CLR for CPS FY11–15).
  • Shifting client demand has required a high degree of flexibility (Sri Lanka CLR for CPS FY13–16).

Sources: World Bank 2015a, 2015b, 2015d, 2016f, 2018b.

Note: CLR = Completion and Learning Review; CPS = Country Partnership Strategy; FY = fiscal year.

Staff almost invariably say that the CLR’s accountability elements lack bite. The CLR and the CLRR are tools that are meant to uphold the Bank Group’s accountability for achieving results. Although this evaluation finds that these tools enable accounting and transparent reporting, there is little evidence that the CLR and its validation actually support accountability, which requires better staff incentives and signals from management and the Board, as described in chapter 5. Interviews with country staff and management and a review of Regional Memorandums of Understanding, Scorecards, and managerial dashboards show that staff are incentivized to focus on commitments, disbursements, and the preparation of documents, such as the CPF, rather than on results management. The CLR and the CLRR are hardly discussed by the Board when a new CPF is approved. The aggregation of ratings from CLRs and their validations into corporate-level scorecards provides some indication of trends, but this is not used to orient regional strategies. Regional briefings to the Board do not discuss outcomes or lessons from country-level results systems, including the CLR. Interviewed staff also highlighted that frequent staff rotation reduces country teams’ accountability for outcome management. Staff also argue that promotions or appointments to desirable posts are not based on a track record of achieving development results.

Other Evaluative Exercises

Considerable evaluative thinking is embedded in sector work, but it is rarely captured by country engagement cycle documents. IEG’s desk review found 12 out of 29 CPFs with planned activities for assessing intervention outcomes, including through impact evaluations related to CPF objectives. However, these assessments covered only a very small portion of programmed activities because of a lack of impetus and high costs. Other evaluative activities mentioned in PLRs or CLRs include client and beneficiary satisfaction surveys, third-party monitoring and evaluation, census and household survey data reviews, and gender-sensitive analyses with gender-disaggregated data. In interviews, GP staff often pointed to sectoral ASAs as the primary conduit for informing future actions to progress on sector outcomes. However, these ASA products were largely forward-looking and typically did not derive lessons from experience about what worked and what did not. Moreover, the knowledge and evidence stemming from these ASA products and evaluative exercises are not well featured in CLRs and have little weight in the assessment of results. For its part, the CLR rarely synthesizes existing knowledge about what does and does not work and why, including by synthesizing project Implementation Completion and Results Report Reviews.

Some country teams carry out helpful country-level evaluation activities outside of the CLR, but these are rare and overshadowed by the results system’s emphasis on indicators and targets. The Bank Group’s cultural orientation toward generating reportable metrics crowds out space for mixed-method evaluations of the overall successes and failures of multiple, complex interventions. There are some promising examples of such evaluative exercises taking place, but case studies and the structured document review found these examples to be rare. In Haiti, the World Bank and other major donors carried out an assessment of the first two years of donor responses to the 2010 Port-au-Prince earthquake. The motivation for this assessment was the vast sums donors spent on the response, the high profile of the interventions, and the recognition that there were significant shortcomings. In Vietnam, the country team is working on a retrospective of the 25 years of partnership between the World Bank and Vietnam that connects the World Bank’s long-term engagement to high-level country development outcomes, such as power generation capacity/access, social and health insurance beneficiaries, educational attainment, and poverty. The report also outlines the World Bank’s main projects and knowledge work contributing to key policy reforms and outcomes in those sectors.

The Bank Group’s monitoring and evaluation staff are consumed by corporate reporting requirements, leaving little time to support country teams’ outcome measurements. Country team leadership pointed out that the monitoring and evaluation functions that could play a meaningful role in assessing the Bank Group’s contribution to country outcomes are instead focused on reporting on corporate requirements and checking compliance with methodologies. For example, country teams believe that Regional development effectiveness units are helpful at the time of Regional Operations Committees (ROCs) and help them navigate corporate requirements. However, as one staff member put it: “If they had to do less ‘corporate tick the box’ and instead focused on helping us design and carry out meaningful monitoring and evaluation activities, wouldn’t it be better?” Similarly, country teams mentioned IEG’s important role in assessing the Bank Group’s country-level contributions through mixed-method evaluations but also felt that “IEG reinforces a broken system.” They noted that CLRRs reinforced the practice of validating results frameworks and prioritizing attributable, quantifiable, verifiable, and time-bounded indicators. Country program evaluations were seen as useful but too infrequent and too focused on ratings. In a nutshell, many country staff regretted that “IEG’s expertise is spent on judging and rating and not on helping us assess and learn our true contribution to outcomes.”