Over the past 30 years, evaluation in the development field has gone through multiple cycles of questioning which methods are better than others. But few in the development circles in which I have operated have questioned the standard evaluation criteria that we use.

Many development institutions, including the World Bank, regional development banks, the UN, and bilateral aid agencies, subscribe to what has come to be known as the DAC evaluation criteria. Specifically, there are five criteria – relevance, effectiveness, efficiency, impact, and sustainability, or R/E/E/I/S for short – that underpin most evaluation systems in international development.

Evaluation questions get framed around these criteria, and reports get written up using this language. But many an evaluation struggles to apply these criteria in earnest. Others are accused of using too much jargon as they report faithfully on each criterion. And too often, the evaluations leave readers with unanswered questions.

After nearly 15 years of adhering to the DAC evaluation criteria, is it time for a rethink? Have we reached a Copernican moment where we realize the “earth isn’t flat”, and our definitions and “understanding of the world” need to be reset? Leaving aside jargon and methodological challenges, there are other good reasons to revisit the evaluation criteria we use.

Values

As our societies develop, norms and values shift. While the evaluation criteria appear neutral and should be applied as such, they were formed by a particular set of values. The post-2015 agenda has declared its intention to be more inclusive and to respect under-privileged groups of people, which means we as evaluators need to reflect on whether the criteria represent such diverse views. Shaping norms that are more inclusive of diversity, rather than judging everyone through more limiting norms, will be a necessity if 2030 is to bring the world we want.

End Game

The adoption of the Sustainable Development Goals (SDGs) signals that we need to shift our understanding of development outcomes. Our development and economic models are premised on ever-increasing consumption. By contrast, the SDGs recognize that such consumption levels are unsustainable from an environmental, economic, and social point of view. This new commitment should lead to a paradigm shift toward desirable development pathways that are not premised on escalating consumption patterns. Evaluation tools that unpack impacts on consumption patterns will be needed to determine whether the world is evolving in the desired direction.

Complexity

The world has become more complex, or rather: our ability to accept and understand complexity has increased. International development has relied on often linear and simplified logical frameworks or results chains that string inputs-activities-outputs-outcomes-impacts into a straight causal path. Development practitioners as much as evaluators know that development processes do not follow such linear assumptions. Instead, one action might cause a number of reactions that play out in rather diverse ways. Hence, we need to develop evaluation models that capture the effects of complexity to inform policy-makers and practitioners about the actual effects of the choices they make and the actions they take (see the excellent book on this topic by Jos Vaessen et al.).

Technology 

The pace at which technology develops and influences lives has far-reaching effects on societies. Solutions to complex problems can be generated in previously unthought-of ways, often through unconventional networks of people. Information travels, is demanded, and influences large groups of people faster and in a more interconnected fashion than ever before. We are faced with an avalanche of data, a dearth of facts, and an unprecedented ease of spreading (mis)information. Evaluation can benefit from technology, be it to construct models that reflect theories of change with greater ease, to help with data collection and processing, or to share evaluation evidence with a much wider audience than before. But it does so in an environment of multiple realities that may or may not lead to evidence-based decision-making, especially if a “post-fact” era were to prove inevitable.

Cost & Benefits 

Current considerations of efficiency, cost savings, or cost-benefit analyses struggle to take long-term impacts into account. Something that appears efficient today might have inadvertent, devastating long-term effects on natural resources or the social capital of communities. Likewise, the distribution of costs and benefits has been uneven, as witnessed by those who bear the brunt of eroded natural resources, or of development outcomes that benefit some groups in society and not others.
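To see one reason why, consider a stylized discounting example (the figures are assumed, purely for illustration): in a standard cost-benefit analysis, a cost C incurred t years from now is valued today at C / (1 + r)^t. At a discount rate of r = 5%, an environmental damage of $100 arising 50 years from now counts for only about $8.70 in today's appraisal ($100 / 1.05^50 ≈ $8.70), small enough for an intervention to look “efficient” now despite substantial long-term harm.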

Do these issues really necessitate a Copernican shift in the evaluation field, one that would require questioning the established five evaluation criteria? Are the criteria so inflexible that they cannot be adapted to address these challenges? Does this even matter for anyone else, other than the nerdy evaluators and their jargon-filled reports?

I say yes to all three questions. And particularly so in a world that lives by the mantra “what gets measured, gets done”.


The Rethinking Evaluation series is dedicated to unpacking and debating evaluation criteria by which we judge success and failure, and whether they are fit for the future. Stay tuned and contribute your views.

Read Part II of this series: Rethinking Evaluation – Is Relevance Still Relevant?

Comments

Submitted by Mark Clayton on Tue, 01/10/2017 - 14:50


A "Copernican" moment in time or the beginning of an Archimedial "Eureka" and a revolutionary movement for change? Are the SDG's an opportunity to reframe the Evaluation dialogue and build the foundations for a more embracing, resilient, inclusive and sustainable world? A time for more innovation and creativity in both approaches and methods and a recognition of the longer term trajectory of change - and dare I say it - behavioural science and economics - it's a messy and risky business we're in!

Submitted by Ting on Tue, 01/10/2017 - 20:44


Thanks Caroline for such a thought-provoking message. There's debate and concern about the suitability and appropriateness of choosing and applying evaluation methods in various contexts; perhaps, similarly, how those ‘gold standard’ DAC evaluation criteria get used in reality is worth reviewing as well? Are they simply used as a ‘checklist’ to fulfil bureaucratic requirements, or are they applied to really reflect and capture the change stories? Are those criteria more relevant for some types of programmes/projects than others, or are they applicable to most cases? This seems an interesting research area.

Thanks, Ting, for your point about asking also whether the criteria are being used -- at all, as a checklist, or otherwise. A number of blogs speak to exactly this point. I hope you will comment and add your experience to the discussion.

Submitted by Alex Kremer on Wed, 01/11/2017 - 03:07


The R/E/E/I/S framework can capture diversity and unintended consequences if we allow it to. The problem is not R/E/E/I/S but the way we use the Results Framework, like a racehorse's blinkers, to focus our attention on our original intentions.

Thanks, Alex. Yes, there is a question about application. Like any tool, it can be designed with the best intentions and ideas in mind, but when used badly even the best tool cannot produce good results. But I do think, and will argue in this blog series, that a number of the tools need a facelift. Hope you stay engaged and post your contributions as we go along.

Caroline and Alex, thank you so much for your comments. Not only do we need to rethink evaluation, but we need to rethink the aims of development. Too often it is we who design development projects with what we think they should have, rather than consulting local partners and participants in-country from the outset on what they think they should have. Far too often we evaluate how well they fulfilled our plans, rather than them evaluating us on how well we have helped them achieve sustainable health, livelihoods, and environmental conditions (as your article alludes to). Do we design for sustained impact? Not yet ;). Thank you.
More at www.ValuingVoices.com

Submitted by rick davies on Wed, 01/11/2017 - 10:24


Readers will probably have many candidates for additional or replacement criteria, but one which I think quite a few people would like to see included is: equity

Many thanks, Rick. Very useful. We are right now in the process of evaluating what the World Bank Group calls "shared prosperity", which is all about distributional effects and equity. There will be a lot to learn in terms of evaluation methods. Looking forward to your suggestions.

Submitted by May Pettigrew on Thu, 01/12/2017 - 08:51


There is a trend towards technicist instrumentalism, a kind of elaborated tick-box approach to evaluation, that answers the questions of the evaluation criteria but in the end often leaves me feeling flat, dulled by the blandness of reports that lack sharply observed engagement with the complex realities of the programme. Evaluation criteria and the associated norms, standards and quality assurance measures were needed to support the many new evaluators who came into the field as it expanded in the 2000s. That helped make sense of evaluation, but now we are in danger of losing its real meaning. The criteria have done their time, bravo, but let's now loosen the straitjacket.

Submitted by Lee Alexander Risby on Thu, 01/12/2017 - 11:31


Thanks Caroline for a stimulating blog post. What we need are evaluative omnivores with more flexibility - willing to try new approaches and methods (developing the menu, not sticking to it) that are driven by context, questions, and the demands of those wanting evaluation (who, at least outside of the bilaterals and multilaterals, are not always aware of R/E/E/I/S, or do not want it when they are made aware). We use it for some evaluations (when appropriate); for others we are trying out feedback (constituent voice, with social entrepreneurs) and exploring the use of developmental evaluation for impact investing - where flexibility, embedding evaluation and the evaluator into operations, quick feedback, and learning are needed in interventions which are not neatly conceptualised and are subject to change. R/E/E/I/S in the way it is presently applied does not offer much flexibility. It has probably contributed to the mass of impenetrable evaluation reports left on the shelf - the same old reports spat out to the same old formula. We are in the business of evidence, learning, influence and change - can we always get that with the OECD DAC criteria? No. So I agree, time for a rethink, and a fun and interesting debate to make evaluation relevant for the challenges of the SDGs, impact investing, and corporations who want to be 'a force for good'.

Submitted by Anand on Thu, 01/12/2017 - 20:25


Thanks Caroline for a succinct but very powerful message on the need to rethink current practices in evaluation. I am relatively new to evaluation. A couple of observations: (a) Evaluation criteria are set to measure the "project mode" of work. There is a need to adapt evaluations and use flexible approaches in several areas where global organizations work that involve supporting political processes and aiming at policy changes. As you rightly pointed out, evaluation in such cases is complex in nature, and attribution is often difficult; (b) The impact of the normative work of global agencies is not easy to measure, as decisions on whether and how to use normative tools/products/instruments are political.

Look forward to more thought provoking posts.

Submitted by Roxana Salehi on Fri, 01/13/2017 - 09:50


Thank you, I enjoyed reading the ideas here. Doing “complexity-aware evaluation”, or one that has “equity” at its core, is a lens through which evaluation projects could take place regardless of what criteria are used. But I do agree that developing these concepts into explicit evaluation criteria, or adapting the current ones to better represent these ideas, helps keep them at the forefront of evaluation projects. It does matter to non-evaluators, absolutely, particularly to program designers, but also to anyone who genuinely wants to help put data into action.

Submitted by Michel Laurendeau on Sat, 01/14/2017 - 12:19


It is time to improve evaluation practice by adopting better and more robust approaches to the assessment of program impacts and of their relative contributions to observed results in the normal context in which these programs operate. Politicians need more reliable assessments of the cost-effectiveness of program interventions to support (re)allocation decisions. Unfortunately, because of capacity issues, the evaluation community has been focusing too much on efficiency and has generally failed to deliver critical information on effectiveness and cost-effectiveness. Economists have been able to develop complex monitoring systems to measure and manage the impact of fiscal and monetary programs/policies. The technology and methodology have been there for some time. The evaluation community should start pushing for equivalent systems and capabilities to monitor and manage the impact of public programs that are more oriented towards social development, health, and environmental issues.

Submitted by Benedetta on Sun, 01/15/2017 - 05:10


Thank you for challenging the status quo. The five DAC criteria have indeed been very helpful in bringing discipline, comparability and reliability into the evaluation world, but they often need to be adapted, stretched, or accompanied by other criteria to better analyse and unpack the complexity of the world we try to evaluate - for instance, by including criteria such as equity, accessibility, sustainability, policy coherence, etc. The new SDGs indeed offer us a great opportunity to stop and rethink how best we can meet the increased demand and desire to unpack and analyse our reality to guide future policy interventions.

Submitted by Ansgar Eussner on Tue, 01/17/2017 - 06:01


Interesting discussion. I agree that diversity, social distribution, environment, and unintended effects are increasingly important to consider in evaluations, but in my opinion the five criteria will remain valid and can cover these issues, albeit with enlarged and rethought methods. In the Council of Europe we often address the political framework conditions in legal and political terms, providing advice on changes to constitutions and laws in various fields, but also through projects ranging from policy advice to training and technical assistance in many areas. These are not growth-oriented, nor do they affect the environment or income distribution, but they nevertheless can be and are evaluated by us using the five criteria and mostly qualitative methods of data collection. Our main problem is documenting impact, as this is long-term and the causal chain is difficult to isolate.

Submitted by Ansgar Eussner on Tue, 01/17/2017 - 08:20


Good to discuss, but I think that you can build most of the social, equity, and environment-related issues into the initial set of objectives, and then they will show up in the evaluation, even when using the five DAC criteria. Afterwards it is rather a question of how to get the data, including on unintended effects.

Submitted by federico bastia on Tue, 01/17/2017 - 10:44


Very interesting discussion. From my perspective, the big five remain a valid framework for discussing evaluation objectives. What I’ve found frustrating is when these criteria are put in ToRs as a sort of shopping list, where implementers or donors (more often implementers) ask for everything without really focusing on specific, realistic, and useful key research questions.
