Embrace All Evaluation Methods – Can We End the Debate?

In my own organization I drew many lessons from observing committee meetings in which new projects were being considered. Typically such committees had a long list of agenda items to cover and limited time to cover them. Out of, say, six key considerations bearing on whether a given project deserved to move on to the next stage, only one or two would be discussed in detail. Which handful of issues actually got mulled over was, I believe, important to the quality of the project and (to the extent that the project would become a precedent for future projects) to the institution’s ongoing operations. Who or what determined where the committee focused its time and attention? The chairperson’s predispositions, for one; and then there was a kind of unwritten rule about what types of input were culturally acceptable, generating either enthusiastic nods or impatient sighs around the room. I am sharing this aspect of my experience to jump-start the debate to which Caroline refers above: while I agree that all methods have their place, given the subjective nature of the factors I described and the complexity of the organization in which these decisions played out, I would place special value (at least for purposes of corporate and process evaluations) on a highly contextualized and participatory method.

Reply

Thank you for raising this important topic, Caroline. In fact, at the AEA Conference in Washington, DC this October, we were discussing how we might invite the RCTers to present, and how to promote cross-learning. I would love a follow-up blog or longer publication showcasing specific experiences at IEG. We need more examples of what it means "to draw from RCTs and other methods." That is, at IEG, do you COMBINE elements of RCTs and other methods, do you apply them SEPARATELY and draw lessons, or is there another way? It would enhance our discussion to be led through one such evaluation in terms of process, methods, and evaluation outcomes. I look forward to learning more from IEG's experience.

Reply

Delighted. Thank you.

Reply

Embrace all APPROPRIATE evaluation methods. The challenge with inclusiveness is that it can lead to a watered-down inability to draw any useful conclusions: when all methods are equally valid, we are left with no way to discern between competing claims. Rather, we should endeavor to apply the right method to the right question. In doing so we must be clear about the explicit identification assumptions as well as the too-often implicit ontological and epistemological assumptions behind different methods of evaluation. Although there is no hierarchy of evaluation methods generally, there is for specific questions. So while an experimental or quasi-experimental design is probably not the right method to answer the question about how stakeholders viewed the rollout process of a particular type of intervention, it is the right method to answer the question of the causal, attributable effect of a defined intervention in raising a defined construct of welfare for a defined population at a particular place and time. The question of external validity is a valid one. As Lant points out, there are few “invariance laws” for social science. In practice, therefore, the general principle of “external validity” is not likely to be accurate: there is no such thing as general generalizability in social sciences. Moreover, the question of out-of-sample prediction, which is what critics usually mean by external validity, is not limited to any particular type of evaluation method; all have the same challenge. Rather, we must have a particular, defined context in mind when discussing the transferability of programmatic findings, regardless of the evaluation method that generated those findings.
The advantage of systematic reviews is not that they prescribe universal policy, but that they present those interventions that tend to be robust to the many ways things can go wrong and describe the contexts in which they worked to allow policy makers to form their own judgments on the appropriateness of local application. Beyond the oft-repeated and as often agreed to bit of useless wisdom that “context matters”, the challenge for evaluation—of all stripes—moving forward will be understanding which are the important elements of context that matter. While the applicability of a finding from one intervention to another place, time, or scale is never 100%, neither is it zero. Discovering the key elements upon which transferability is likely to hinge may well rely on both social science theory and the arsenal of evaluation methods, appropriately applied.

Reply

Sandra, thanks for flagging these important issues. Yes, culture matters, whether at entry, during implementation or at the end of a project's life. When you look at IEG evaluations you will often find a reference to "incentives", which are part of the culture that drives behaviors. We are in the midst of an evaluation that is trying to understand the conditions that stimulate or hinder learning in World Bank lending, which should help us generate very interesting insights.

Reply

Tessie: you will see a new post today on a Systematic Review of Impact Evaluations of mother and child health interventions. This review used existing impact evaluations and analyzed them first for their quality and then for their findings, including reliability. Building on this experience we are now undertaking a study of gender and social protection, which combines a systematic review of existing impact evaluations with a review of the World Bank's portfolio and a review of qualitative evaluations on the subject. The exercise is drawing on existing information rather than going out to collect new data. In addition, we are doing systematic reviews for other large-scale evaluations, for instance, on access to electricity, where the systematic review will be combined with literature and portfolio reviews, in-depth project evaluations and country case studies. We will make sure to share more information about how we went about these processes and not only what we found. Thanks for stimulating the discussion.

Reply

Jeff, you are entirely right -- it is about choosing the appropriate evaluation method for the question. And, since our evaluations often cover a complex set of questions, it is about the right combination of methods within an evaluation to triangulate findings from different sources and through different methods to develop a better understanding of what happens, why and how.

Reply

The selection of an appropriate method can be highly influential in any evaluation. Comparing two idiosyncratic environments is by no means culturally justified, whether the method is experimental or quasi-experimental, quantitative or qualitative, a systematic review or something else. However, it is a way of learning, not of finding a truth. No two things behave in exactly the same way. Thus, it is better to be adaptive, based on rationality, relevance, effectiveness, equity, impact and sustainability and, if the budget permits, to verify the outcomes of one method with another, triangulating to get closer to reality. Every evaluation method has some merits and some demerits. Thus, the capacity to weigh the value of a particular method for a given context and to find the best among the options is the ingenuity of an evaluator, and it matters a lot.

Reply

Very much enjoyed reading the discussion here. Evaluation in a complex environment of a complicated programme/policy requires a careful selection of methods and design to suit the question we have at hand. We cannot use the same tool for all purposes but must try to select the best tools/methods for the issue at hand. The other variables that need to be taken into account in the final selection of methods would be cost and the use of the findings. In the end, experience (not much literature as evidence) has shown that the simpler the method, the better the final quality of the evaluation report. There are many reasons for this. One is that, just as the context changes during a programme/policy intervention, so does the context change during the conduct of the evaluation. Another is that, despite good methods and design, the team conducting the evaluation may not have the capacity to conduct it as designed.

Reply
