Meta-Evaluation of IEG Evaluations

Chapter 5 | Using Innovative Methods in Independent Evaluation Group Evaluations

Evaluation question 5. What do evaluation reports, Approach Papers, and interviews with IEG staff tell us about the use of innovative methods in the context of evaluation in IEG?

As noted in chapter 3, conventional methods such as case studies, structured interviews, and statistical analysis were relatively common across the sample, whereas innovative or broadened methods appeared in a minority of the reports studied. Nearly all evaluations employed some combination of interviews, case studies, desk reviews, and surveys. The total count of conventional methods tended to be higher in the final evaluation reports than initially proposed in the Approach Papers. Analysis of temporal trends also suggested that the adoption of innovative methods had increased in more recent evaluations.

Given that one of the goals of the meta-evaluation was to “provide IEG’s Leadership Team with an external perspective on how to improve the quality and credibility of IEG’s evaluations,” attention was paid to the use of innovative evaluation methods both in the review of Approach Papers and reports and in interviews with IEG staff. The interviews indicated that several ongoing evaluations have expanded the scope of methods employed, suggesting a growing trend. Among the methods used, the meta-evaluation found growing application of geospatial analysis, process tracing, qualitative comparative analysis (QCA), machine learning, and social network analysis. A non-exhaustive set of examples is discussed below. Because we did not pass summative judgment on the use of innovative methods, we cite examples both from the sample and from other (including more recent, ongoing) evaluations.

Geographically targeted analysis of georeferenced data on World Bank investments was used in the Mexico Country Program Evaluation: An Evaluation of the World Bank Group’s Support to Mexico (2008–17). The background of this approach is described as follows in appendix 1 of the report: “geo-referenced poverty and aid data allow to evaluate targeting effectiveness of development interventions. Initially, this can be done by correlating the geographical allocation of World Bank projects at regional level with regional measures of (under)development. Relatively high correlations are consistent with effective geographic targeting, whereby most resources are directed toward underdeveloped regions. However, finding low correlations may not necessarily point to poor targeting as there are many factors potentially affecting the allocation of World Bank projects. Therefore, a regression approach is necessary, controlling for other factors such as conflict, public spending and other factors.”
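To make this two-step logic concrete, the following is a minimal sketch in Python. The variable names and region-level values are hypothetical placeholders, not data from the evaluation; the sketch first correlates allocation with underdevelopment and then regresses allocation on underdevelopment plus controls.

```python
import pandas as pd
import statsmodels.formula.api as smf

# Hypothetical region-level data (placeholder values, for illustration only).
regions = pd.DataFrame({
    "wb_commitments":  [120, 85, 40, 200, 95, 60, 150, 30],
    "poverty_rate":    [0.42, 0.31, 0.18, 0.55, 0.36, 0.22, 0.47, 0.15],
    "conflict_index":  [0.2, 0.1, 0.0, 0.4, 0.1, 0.0, 0.3, 0.0],
    "public_spending": [300, 280, 260, 350, 290, 250, 320, 240],
})

# Step 1: correlate allocation with regional underdevelopment. A relatively
# high correlation is consistent with effective geographic targeting.
print(regions["wb_commitments"].corr(regions["poverty_rate"]))

# Step 2: regress allocation on underdevelopment while controlling for other
# factors that may drive allocation, since a low raw correlation need not
# imply poor targeting.
model = smf.ols(
    "wb_commitments ~ poverty_rate + conflict_index + public_spending",
    data=regions,
).fit()
print(model.summary())
```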

The carbon finance and engaging citizens evaluations provide clear examples of the benefits of process tracing in evaluation. In the latter, “the evaluation team piloted an in-depth causal analysis method called process tracing in the case of the Reportes Comunitarios of the national CCT of the Dominican Republic. Process tracing was used to assess the impact of embedding a participatory monitoring in the CCT and to evaluate the significance of the World Bank’s contribution. Process tracing is a rigorous method of within-case causal inference that relies on Bayesian updating logic to transparently assess the probative value of pieces of evidence provided to justify specific contribution claims.”1
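The Bayesian updating logic at the heart of process tracing can be illustrated with a short sketch in Python. The contribution claim, evidence items, and probabilities below are hypothetical; the point is only to show how each piece of evidence shifts belief in a claim in proportion to its probative value.

```python
def update(prior: float, p_e_given_h: float, p_e_given_not_h: float) -> float:
    """Posterior probability of hypothesis H after observing evidence E,
    by Bayes' rule: P(H|E) = P(E|H)P(H) / [P(E|H)P(H) + P(E|~H)P(~H)]."""
    numerator = p_e_given_h * prior
    return numerator / (numerator + p_e_given_not_h * (1 - prior))

# Hypothetical contribution claim, starting from an agnostic prior of 0.5.
belief = 0.5

# Each tuple gives (P(E | claim true), P(E | claim false)) for one piece of
# evidence; the ratio of the two captures its probative value.
evidence = [
    (0.8, 0.3),  # e.g., stakeholder accounts consistent with the claim
    (0.7, 0.4),  # e.g., timing of observed changes matches the intervention
    (0.9, 0.2),  # e.g., documents trace the mechanism step by step
]

for p_eh, p_enh in evidence:
    belief = update(belief, p_eh, p_enh)
    print(f"updated belief in the contribution claim: {belief:.2f}")
```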

The use of a semisupervised machine learning approach presented another example of innovation in evaluation. In the Approach Paper for Evaluation of the World Bank’s Support to Improving Child Undernutrition and Its Determinants, such an approach was piloted to assess the Bank Group’s contribution to reducing undernutrition, exploring the effectiveness of various interventions relative to the outcome. Having identified key concepts from the underlying theory of change, the team used machine learning to explore a large portfolio of projects across sectors and databases more efficiently. Because nutrition interventions can be nested in a broad pool of projects (such as those involving health, agriculture, water, governance, and social protection), a machine learning–supported portfolio analysis was a more effective means of examining the pool of over 4,000 projects considered in the evaluation scope. This was complemented by automatically generated knowledge graphs that explicitly encoded expert knowledge that would otherwise have been difficult to capture.2 The combination resulted in a more nuanced theory of change as well as a streamlined portfolio review process.3
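As an illustration of how a semisupervised approach can extend a small set of hand-labeled project descriptions to a large unlabeled pool, consider the following sketch using scikit-learn. The texts, labels, and relevance criterion are hypothetical, and the evaluation’s actual pipeline is not reproduced here.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

# Hypothetical project descriptions: a few labeled seeds
# (1 = nutrition-relevant, 0 = not) plus unlabeled documents (-1).
docs = [
    "micronutrient supplementation for children under five",  # labeled 1
    "road rehabilitation and highway maintenance",            # labeled 0
    "school feeding and stunting reduction program",          # labeled 1
    "urban transit corridor financing",                       # labeled 0
    "community health workers promote infant feeding",        # unlabeled
    "irrigation upgrades to diversify household crops",       # unlabeled
]
labels = [1, 0, 1, 0, -1, -1]

# Represent texts as TF-IDF vectors and self-train a classifier that
# iteratively pseudo-labels the unlabeled pool.
X = TfidfVectorizer().fit_transform(docs)
clf = SelfTrainingClassifier(LogisticRegression())
clf.fit(X, labels)

# Predicted relevance of the previously unlabeled projects.
print(clf.predict(X[4:]))
```

In a real portfolio review, the same pattern would be applied to thousands of project documents, with the labeled seeds reflecting the key concepts identified in the theory of change.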

Finally, social network analysis was applied in several reports, including the evaluations Knowledge Flow and Collaboration under the World Bank’s New Operating Model (FY19) and World Bank Group Support to Health Services: Achievements and Challenges. The World’s Bank: An Evaluation of the World Bank Group’s Global Convening (FY20) also used this approach, analyzing Twitter data “to assess the reach and visibility of the Bank Group on Twitter and to compare its connectedness in its social networks on selected issue areas with that of key actors (by virtue of their mandate and comparative strengths) in said area” (World Bank 2020c, 50).4
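A minimal sketch of this kind of analysis, assuming a hypothetical mention/retweet graph rather than actual Twitter data, might compare accounts’ centrality scores as proxies for reach and connectedness:

```python
import networkx as nx

# Hypothetical directed edges: source account mentions or retweets target.
edges = [
    ("user_a", "WorldBank"), ("user_b", "WorldBank"),
    ("user_c", "PartnerOrg"), ("WorldBank", "PartnerOrg"),
    ("user_d", "WorldBank"), ("PartnerOrg", "user_c"),
]
G = nx.DiGraph(edges)

# In-degree centrality as a proxy for reach and visibility; betweenness
# centrality as a proxy for brokerage within the issue-area network.
in_deg = nx.in_degree_centrality(G)
between = nx.betweenness_centrality(G)
for account in ("WorldBank", "PartnerOrg"):
    print(account, round(in_deg[account], 2), round(between[account], 2))
```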

In several interviews with task team leaders and senior evaluators, attention was paid to the importance of integrating innovative methods more broadly into IEG’s evaluations, and the interviews suggested a generally positive trend in recent years toward such integration. In some cases, however, innovation was perceived to be coming “from the outside or from above” without due consideration of the relevance of these methods to the subject of evaluation. Interviewees noted that innovation imposed from the outside could contribute to a (less than optimal) fragmentation of resources and evaluation results.5

Overall, the meta-evaluation noted that the use of innovative methods has increased in IEG evaluations over time, an assertion supported by the inventory of methods from IEG evaluations (chapter 3). As noted previously, innovative methods include the analysis of big data from social media sources, geospatial analysis, and “text-as-data” approaches (including machine learning in portfolio analysis), as well as specialized theory-based evaluation designs. Theory-based evaluation methods can be used to reconstruct and test the underlying assumptions about mechanisms (behavioral, cognitive, economic, and institutional) that can explain how and under what circumstances Bank Group interventions can have an impact.6

The meta-evaluation also noted that innovative methods can be classified into two categories. First, some innovations may significantly influence the overall design and approach of an evaluation; for example, some of the new text analytics and machine learning approaches change the way portfolios are identified and analyzed. Other innovative approaches are better classified as “boutique studies,” a term that carries both a positive connotation and certain implications of detachment. In principle, innovative boutique studies should be stimulated: experimentation with innovative methods can be a strong incentive for staff and can help IEG maintain its edge as a leading evaluation institution. Yet prudence is in order. Though interviewees emphasized the importance of innovation, they also noted that the relevance of such approaches was not always fully articulated or integrated into the evaluation design matrix, which may have contributed to the perceived fragmentation noted above. While the trend of increasing methodological diversity identified in the inventory of methods should be applauded, innovation should not become an end in itself. Evaluation teams should always weigh the costs and benefits of innovation and the logic of using specific methods to address evaluation questions, making sure that each new approach adds value to the analysis.

  1. Elsewhere in this report, it is indicated that “The process tracing study in the Dominican Republic was used to test formally the theoretical framework emerging from the literature review.” See box A.4: Process Tracing of Citizen Engagement in the Dominican Republic, p. 78.
  2. As noted in the report, “knowledge graphs allow for a ‘smart’ theory of change that integrates the theory of change and project outcome data to streamline the portfolio reviewing process, as well as to assist reporting, strategic analysis, and portfolio management. Knowledge graphs are complementary to machine learning because they can explicitly encode expert knowledge in ways that are difficult with machine learning models.”
  3. As the theory of change is “a static object, which keeps the task of validating project indicators and outcomes manual hitherto, the challenge for AI-based decision support is to formulate the theory of change as an instantiated machine-readable artifact” (World Bank, forthcoming).
  4. Published April 1, 2020. While social media analysis provides certain clear advantages, it should be noted that there are also serious analytical limitations tied to the nature of the underlying data analyzed. Such issues are outside of the scope of the meta-evaluation.
  5. The reasoning seems to be that they are perceived as an extra lens leading to new and possibly different insights.
  6. See Pawson (2013) and the earlier references to the Coleman Boat Model for assessing macro-meso-micro links.