Organization
World Bank
Report Year
2015
1st MAR Year
2016
Accepted
Yes
Status
Active
Recommendation

Develop and adopt explicit evaluation protocols for piloted interventions to capture lessons from experience on poverty reduction, with a view toward opportunities for scaling up successful interventions.

Recommendation Adoption
IEG Rating by Year: N, M, NT, NT
Management Rating by Year: N, S, NT, NT
C = Complete
H = High
S = Substantial
M = Moderate
N = Negligible
NA = Not Accepted
NR = Not Rated
Findings and Conclusions

Pilots are used to strengthen the design, implementation, and scaling-up of projects and to enhance the poverty focus of the Bank's projects. Early intermediate outcomes can attract additional resources leading to scaling up. The impact of Bank support for poverty reduction will depend in part on successful use of pilots to leverage and crowd-in external resources to scale up. However, evidence on whether pilots are successful and serve to leverage non-Bank resources is rare.

Original Management Response

WB: Agrees. Each piloted intervention, particularly those that aim to reduce poverty or enhance shared prosperity, should develop a well thought-through process and protocols for evaluating its impact and create the evidence base for scaling-up.

Action Plans
Action 1
Action 1 Number:
0376-01
Action 1 Title:
Action 8a: Toolkit for distributional impact analysis.
Action 1 Plan:

Action 8a: Toolkit for distributional impact analysis.
Indicator: Toolkit issued and disseminated.
Baseline: No comprehensive source of know-how exists.
Target: a) Working Paper prepared, b) Internal workshop to disseminate the handbook internally.
Timeline: FY17

Action 2
Action 2 Number:
0376-02
Action 2 Title:
Action 8b: Improve monitoring of pilots designed for scaling up and capture lessons learned from these pilots.
Action 2 Plan:

Action 8b: Improve monitoring of pilots designed for scaling up and capture lessons learned from these pilots.
Indicator: Update the "Results Framework and M&E Guidance Note" to provide advice on monitoring of pilots designed for scaling up, and update the ICR guidance note to capture lessons learned from these pilots.
Baseline: No guidance on monitoring of such pilots or systematic capture of resulting learning.
Target: Update the "Results Framework and M&E Guidance Note" and the ICR guidance note.
Timeline: FY17

2019
IEG Update:
No Updates
Management Update:
No Updates
2018
IEG Update:
No Updates
Management Update:
No Updates
2017
IEG Update:

Action 8a: The toolkit for distributional impact analysis has been produced. Dissemination is underway.
Action 8b: No guidance exists yet on how to monitor pilot projects and how to capture and assess learning from these projects. As IEG has stressed in the past, an essential step for the implementation of this recommendation involves setting up a definition of pilot projects (as management recognizes) and a repository for monitoring pilots, in order to be able to identify them and document their results. The overall rating of Moderate proposed by IEG for Recommendation 8 stems from an average of "Substantial" for Action 8a and "Negligible" for Action 8b.

Management Update:

Action 8b:
Discussions have been initiated with IEG on how to set PDOs and indicators for pilot projects, and how to rate them appropriately. The next step, after completion of these discussions, will be to include guidance in the PAD guidance note on how to define pilot projects and in the ICR guidance note on how pilots will be rated. Communication with staff on pilot projects would then be integrated as part of core training.
8.b. Rating: Negligible
Action 8a: Toolkit for distributional impact analysis. Indicator: Toolkit issued and disseminated. Baseline: No comprehensive source of know-how exists. Target: a) Working Paper prepared, b) Internal workshop to disseminate the handbook internally. Timeline: FY17
We are attaching the toolkit "Distributional Impact Analysis: Toolkit and Illustrations of Impacts Beyond the Average Treatment Effect," by Guadalupe Bedoya, Luca Bittarello, Jonathan Davis, and Nikolas Mittag. This toolkit was published in the World Bank Policy Research Working Paper Series in July 2017 (WPS 8139). As of September 6, 2017, it had been downloaded 1,237 times from one of the sources where it was published.
A training for researchers, based on a preliminary version of the toolkit, was delivered in December 2016 (see related information attached), and we plan to conduct additional events in the Bank and potentially at DFID, which has disseminated the tool within its teams and has expressed interest in organizing an event as well.
Additional dissemination activities will take place in the coming weeks through the World Bank Research Twitter account and within DIME's impact evaluation network through email and the DIME newsletter.
About the Toolkit
This toolkit aims to contribute to improving the use of state-of-the-art methodologies to learn more about development programs that are being implemented and evaluated. The toolkit was disseminated among more than 200 impact evaluation researchers in DIME's network, and it was tweeted by the DECRG Director and subsequently retweeted.
About the content of the toolkit (extracts from the introduction of the toolkit)
Traditional methods to evaluate the impacts of social programs, and the vast majority of applied econometric policy evaluations, focus on the analysis of means. However, there is also a large and growing literature on methods to evaluate the effects of programs and policies beyond their mean impact. While less frequently applied, these methods can provide information that is valuable or even necessary in assessing the consequences of policies and their desirability.

The purpose of this toolkit is to provide an overview of the questions such methods can address and the core approaches that have been developed to answer them, including discussions of the assumptions they require and practical issues in their implementation.

Mean impacts are a natural first summary statistic to describe the effect of a policy. The mean impact of a policy or intervention tells us by how much the outcome would increase or decrease on average when every member of a particular population is exposed to the policy or intervention. Thereby, mean impacts provide the central piece in any cost-benefit analysis. However, a decision-maker usually requires information on the effects of a policy beyond its mean impact. For example, mean impacts allow us to calculate the total gain from a program or policy, but they do not allow us to say anything about the distribution of the gain or about how the outcome distribution is affected by the program beyond changes in its mean. A positive average program effect tells us that a program can generate social surplus, but it may not be sufficient to judge whether the program is desirable if any weight is placed on distributional concerns, such as whether inequality is affected by the program, whether some people are harmed by the policy, or whether a particular demographic group benefits. The common theme of these issues is that they cannot be addressed by mean impacts alone.
Finding answers requires thinking about the impact of a program or policy as a collection of distributional parameters rather than a single scalar parameter such as the mean. Hence, we refer to these types of analyses as "distributional impact analysis" (DIA). DIA concerns the study of the distributional consequences of interventions due to participants' heterogeneous responses or participation decisions. In particular, DIA investigates features beyond the gross total gain of a program by studying where the gains or losses of a program, if any, were produced, and who wins or loses as a result of program participation.

The benefits of DIA outlined above raise the question of why we do not see more of these methods applied in practice. The literature is still nascent compared to the importance of the topic and the tools available. There are often good reasons to focus primarily on mean impact, and the motives for not looking beyond it depend on the application. For example, DIA usually requires larger sample sizes, and many of the methods below are justified asymptotically and are not necessarily unbiased in finite samples, which can be problematic in applications with small samples such as most RCTs. Some methods rely on additional assumptions that are reasonable in some applications but not in others. However, part of the hesitation to apply these methods also seems to stem from inertia and the fact that they are relatively new or have only recently become computationally feasible. Inertia may be an obstacle if researchers or their audiences are more used to, and hence more comfortable, interpreting mean impacts and the conditions under which they are valid. To mitigate these issues, we review the practical considerations of statistical inference and power calculations for DIA. Finally, we illustrate select DIA methods and what can be learned from them by re-analyzing the impacts of a financial literacy program in Brazil and a school management program in The Gambia.
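To make the contrast between mean impact and distributional analysis concrete, the following sketch simulates a simple RCT and compares the difference in means with quantile treatment effects (QTEs). This is an illustration only, not code from the toolkit: the data are synthetic, and the design deliberately concentrates gains at the top of the outcome distribution so that a positive mean impact coexists with near-zero effects at the bottom.

```python
# Illustrative sketch (not from the toolkit): mean impact vs. quantile
# treatment effects in a simulated RCT. All data are synthetic.
import numpy as np

rng = np.random.default_rng(0)
n = 10_000

# Control outcomes: a log-normal, income-like variable.
y0 = rng.lognormal(mean=1.0, sigma=0.5, size=n)

# Treated outcomes: same baseline law, but with a multiplicative boost
# that is largest for a small fraction of units, so the mean impact
# masks strong distributional heterogeneity.
y1 = rng.lognormal(mean=1.0, sigma=0.5, size=n) * (1 + 0.3 * rng.uniform(size=n) ** 3)

# Mean impact: the usual difference in means between the two arms.
ate = y1.mean() - y0.mean()

# QTEs: differences between the marginal outcome quantiles of the arms.
# Note these identify effects on the outcome *distribution*, not the
# distribution of individual treatment effects, which requires the
# additional assumptions the toolkit discusses (e.g. rank invariance).
qs = [0.1, 0.5, 0.9]
qte = np.quantile(y1, qs) - np.quantile(y0, qs)

print(f"mean impact: {ate:.3f}")
for q, d in zip(qs, qte):
    print(f"QTE at q={q}: {d:.3f}")
```

In this simulation the QTE at the 90th percentile is much larger than at the 10th, information that the single mean-impact number cannot convey.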
8a Rating: Complete
Total rating: Substantial

2016
IEG Update:

Action 8a: Management indicates that a draft toolkit exists, although it was not attached, so it was not possible to verify to what extent it is helpful for distributional impact analysis. It has not yet been issued and disseminated.

Action 8b: No response has been provided by management. As IEG highlighted when the action plan was under preparation, an essential step for the implementation of this recommendation involves setting up a repository for monitoring pilots, in order to be able to identify them and document their results.

Management Update:

The first draft of the Toolkit has been prepared. The toolkit covers the distributional consequences of programs/policies due to participants' heterogeneous responses or participation decisions. In particular, it examines features beyond the gross total gain of a program by studying where the gains/losses of a program - if any - were produced, or who wins or loses as a result of program participation. Sections include: (i) Questions of Interest and Definitions; (ii) Impact on the Outcome Distributions for RCTs with full compliance and RCTs with less than full compliance; (iii) Distribution of Treatment Impacts, incl. Bounding the Distribution of Treatment Impacts and Point Identification of Features of the Distribution of Treatment Impacts; (iv) Subgroup/Conditional Analysis (Subgroup Means and Machine Learning); (v) Applications; (vi) Practical Issues, incl. Sampling/Power, Inference and Multiple Hypothesis Testing; and (vii) Appendices and Code Samples.
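Of the sections listed above, subgroup/conditional analysis is the simplest to illustrate. The sketch below, on synthetic data with made-up variable names (it is not drawn from the toolkit itself), splits a simulated RCT sample by a baseline covariate and shows how an overall mean impact can hide large differences in subgroup impacts.

```python
# Hedged illustration of subgroup/conditional analysis on synthetic data:
# the overall mean impact (~0.6) masks subgroup effects of ~1.0 and ~0.2.
import numpy as np

rng = np.random.default_rng(1)
n = 20_000

# Random assignment and a baseline covariate (both synthetic).
treated = rng.integers(0, 2, size=n).astype(bool)
female = rng.integers(0, 2, size=n).astype(bool)

# Outcome: true treatment effect of 1.0 for one subgroup, 0.2 for the other.
effect = np.where(female, 1.0, 0.2)
y = rng.normal(size=n) + treated * effect

def mean_impact(mask):
    """Difference in mean outcomes between arms within a subgroup."""
    return y[mask & treated].mean() - y[mask & ~treated].mean()

overall = mean_impact(np.ones(n, dtype=bool))
by_group = {g: mean_impact(female == g) for g in (True, False)}
print(f"overall: {overall:.2f}, "
      f"group A: {by_group[True]:.2f}, group B: {by_group[False]:.2f}")
```

With pre-specified subgroups this stays simple; the toolkit's machine-learning material addresses the harder problem of discovering heterogeneity without pre-specifying the splits, and its multiple-hypothesis-testing section covers the inference issues that many subgroup comparisons raise.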

The Toolkit will be circulated for comments in August 2016 and is planned to be submitted to the Working Paper series in the fall.

The internal training will be delivered in late fall or early spring.

8b. Management has undertaken steps towards reforming the ICR, which is expected to be completed by the end of FY17. Once the decision is made on how to reform the ICR's content and process, a new ICR guideline will be developed and the M&E Guidance Note will also be updated.