The growing interest in strengthening development outcomes has stirred increasing debate about the effectiveness of evaluation. Today, many development institutions subscribe to what have come to be known as the DAC evaluation criteria. Specifically, these are five criteria – relevance, effectiveness, efficiency, impact, and sustainability (in short, R/E/E/I/S) – that underpin most evaluations in international development.
In a recent blog series that was read by over 12,000 readers, Caroline Heider, Director-General, Evaluation at the World Bank Group, suggested that it was time to review the criteria. Over 100 readers shared their comments and questions. The Conversations team invited Ms. Heider and Hans Lundgren, Manager of the OECD/DAC Network on Development Evaluation, to respond to some of the questions that were submitted online, and to share their thoughts on the state of development evaluation today.
Question 1: Let’s start with you, Caroline. For many years, the development community has used a common set of evaluation criteria, known as the DAC evaluation criteria. In one of your recent blogs, you suggested that now is a good time to revisit the DAC evaluation criteria, and that we may be at a "Copernican" moment. Why do you think so?
|"It is (in my view) time to move to development models – theories of change – that are less linear, more representative of complex realities, and that build on adaptive management. These approaches require evaluation to become more dynamic as well, adopting methods that capture complexity and unintended effects. In addition, there is a need to assess the adaptiveness of project management." - Caroline Heider
Caroline Heider: Copernicus is a famous symbol for rethinking how we see the world. For a long time, models have been developed that made assumptions or simplifications. These assumptions were necessary to make the models work, but removed them from the complexity of reality. Today, we are increasingly able to cope with complexity, at least in our thinking and in our modelling capacity. Therefore, it is (in my view) time to move to development models – theories of change – that are less linear, more representative of complex realities, and that build on adaptive management. These approaches require evaluation to become more dynamic as well, adopting methods that capture complexity and unintended effects. In addition, there is a need to assess the adaptiveness of project management: for instance, are adaptations happening at the right time, what causes them, and so on.
2) Hans - you were involved in the process that led to the DAC evaluation criteria. Tell us about that experience and how these criteria came to be adopted so widely by the development community?
Hans Lundgren: The DAC evaluation criteria have their origin in the DAC principles for evaluation, which were one of the first tasks I was responsible for when I assumed responsibility for the DAC Evaluation Network back in 1989. The criteria were then updated in 2002 with the Glossary of evaluation terms, which was developed in collaboration with IEG. Both processes involved extensive consultations and consensus-building efforts, and the results were finally agreed to by all member countries and agencies. The criteria are part of a broader package of principles, guidance and standards developed by the DAC Evaluation Network. They were conceived to help evaluation managers reflect upon and structure the key questions in an evaluation. I think one reason behind their widespread use is that they are relatively easy to understand and to use when framing evaluation questions. Moreover, they relate to some key issues in assessing the success or failure of a programme.
Caroline: I agree with Hans that the criteria have been useful in shaping overall questions about what we aim to assess. But in practice I have seen too many evaluations that ask these questions without thinking. They use standardized questions – what made the program effective? how efficient was the project? – without asking whether these questions are the most important and useful ones. There are many other ways of asking questions that are more responsive to program managers, less jargonistic, and that will still lead to an assessment – or evaluative conclusion – of the relevance, effectiveness, efficiency, sustainability, and impact of the programs being evaluated.
3) Are all five criteria in the R/E/E/I/S framework still relevant? Is it time to review or replace all or some of them?
|"I am personally open to looking again at the criteria and seeing how they can be refreshed. But before throwing the adolescent out with the bathwater – the criteria have been in place for fifteen years now, no longer a baby – we should reflect on what we can build on, and on the fact that their widespread use suggests many consider them useful in practical work." - Hans Lundgren
Hans: Since your question asks if they are still relevant, I guess the criterion of relevance, at least, is still relevant! More seriously, I am personally open to looking again at the criteria and seeing how they can be refreshed. But before throwing the adolescent out with the bathwater – the criteria have been in place for fifteen years now, no longer a baby – we should reflect on what we can build on, and on the fact that their widespread use suggests many consider them useful in practical work.
Caroline: True. It is not a matter of throwing the criteria out and starting all over. But, as evaluators, we should take stock of how well they have worked and how they can be improved. I have made a number of suggestions in my recent blog series, and we will take stock of all of the comments to think through the next steps.
4) Do you see some criteria as being more relevant for some types of programs/projects than others or are they applicable to most cases?
Hans: The five criteria should not necessarily be used in all evaluations. The application of these or any other criteria depends on the evaluation questions and the objectives of the evaluation. Furthermore, we have developed additional criteria for evaluating humanitarian aid and peacebuilding activities in settings of conflict and fragility. I am in favour of a thoughtful application of these or other criteria, not a mechanical one.
5) Revising the evaluation criteria is likely to be messy and difficult. Is it worth it? Can’t we just work with what we have?
Caroline: On the messiness of the process, Hans has a lot of experience in negotiating consensus among different parties. In addition to the challenges he points out, I would say that the tent has become bigger: there are more actors involved in development, which means there are more involved in evaluation. I would hope that a body like the OECD/DAC remains a standard setter and the legitimate convener for building consensus, even with an enlarged group of players. But is it worth it? Yes, I do think so! The widespread use of the criteria demonstrates how important they – and the consensus around them – were. For evaluation – as a profession or practice – to adapt to modern times, it has to redefine itself periodically. Research into evaluation methods and their practical application is leading the way, but eventually we will have to update and redefine the norms.
Hans: It is true that developing and building consensus around internationally agreed norms and standards is not a simple process, and I have spent years of my career facilitating such consensus-building processes. It is not only because of the number of actors but because some countries and agencies may hold very firm positions. For instance, the DAC evaluation standards took three years to develop, test, revise and reach consensus on. An alternative to agreed, common approaches is of course that each agency and development bank develops its own criteria, norms and standards. However, this would limit the possibilities of collaboration and reduce comparability.
6) One unintended consequence is that the criteria have potentially become somewhat of a straitjacket and lack the necessary flexibility. In other words, they foster a rigid structure that produces the same old reports spat out to the same old formula. Is this a fair criticism?
Caroline: This critique is not new to me, and it often takes the shape of complaints about jargon that only evaluators can understand. I don’t think this is a problem of the criteria as such; it has to do with their use, that is, the practice of evaluation. As I mentioned before, I have found evaluators – in many of the institutions I have worked for – who have stuck rigidly to the criteria and were unable to use them as the tool they were meant to be.
Hans: I am not sure which agency or development bank you have in mind when you say that they produce the same old reports spat out to the same old formula. The application of the criteria has not blocked innovation, as new methods and approaches have been developed during the last 15 years for both qualitative and quantitative evaluations. The criteria do not prescribe a specific method for evaluation; rather, they offer a way to help evaluators think about and structure the evaluation questions.
7) Are some criteria more important than others? Some have argued, for instance, that impact and sustainability matter more than efficiency, relevance, and effectiveness.
Hans: Which criteria are most important depends on the focus of the evaluation. There is obviously some interdependence between the criteria – if you get a number of positive effects, it is also likely that your programme was implemented effectively. One way of dealing with complexity and interdependence would be to merge criteria, as mentioned in the blog series. At the same time, any changes need to be clear and practical in order to be applied.
8) In reviewing the criteria, how do we avoid the danger of being trapped in an even more elaborate box-ticking approach to evaluation?
Caroline: The problem you raise is real, and not just for evaluation. I have seen this happen in many circumstances in the development field, and have commented on the problem in evaluations I have written. I have not yet found the answer to why this behaviour occurs: is it the normal course of bureaucracies, or a natural response to ever more demanding agendas that ask too much of people? At least initially, I do hope that we can keep the discussion of evaluation criteria and methods sufficiently “charged” to hold off the more standardized responses or practices that you describe as “box-ticking”. In addition, my hope is that the increasing number of evaluators who have dedicated their studies, research, and professional practice to evaluation will carry the banner of continuously renewing practices, including methods and criteria, to counter any risk of falling into stale routines.
Hans: As I am not in favour of a box ticking approach with the current set of criteria, I would not be in favour of a box ticking approach with a different set either.
9) There is a risk that incorporating a new criterion into evaluations will add complexity to what some already see as a complex endeavor, and entail a new learning curve. Is this where the development community should be spending its resources?
Hans: In my view, to gain widespread use, any new criterion needs to be clear and not overly complex. I think there are other issues around “re-thinking evaluation” that the community needs to reflect on. An important one is whether evaluation in its current form really provides policy makers with the evidence needed to make decisions on trade-offs between choices. Policy makers need to take decisions on alternative options, involving uncertainty and sometimes limited information. Perhaps evaluation work needs to become more exploratory in nature, rather than generating a historic record of accountability. Moreover, current evaluation and knowledge systems do not always function optimally, and work remains to be done to improve the use of evaluation findings and promote learning.
Caroline: Indeed, there are many things we need to work on, and the criteria are only one of them. And while Hans is right that decision-makers have to have evidence to weigh trade-offs between choices, this should not be limited to, or even be primarily the responsibility of, evaluation. In development banks, the appraisal of projects should include a comparison of the proposed solution with alternative options. Only, in practice that hardly ever happens. And I do believe that an update to the evaluation criteria could incentivize evaluation practice to address issues of importance to decision-makers. For instance, an evaluation that evolved from assessing a project's relevance in its policy context to producing evidence on whether the most impactful development challenge was addressed – as suggested in our blog series – would be a step towards answering questions in a more complex and uncertain world.
10) Complexity, agility, coherence, sustainability, and equity are examples of emerging issues in the field of evaluation. How are evaluators addressing these and other emerging issues?
Hans: I think new approaches, new methods and new evaluation thinking are all to be welcomed. Evaluation research is leading the way and increasingly finding its way into practical evaluation work on issues such as complexity and equity. But it would be good to see more experimentation and broader uptake of a variety of methods. For instance, the use of big data in evaluation still seems to be in its infancy, at least in development evaluation work. Further work on unintended effects would also seem to warrant more attention. Re-thinking evaluation, however, goes far beyond the discussion on criteria.
Caroline: Hans is right to say that rethinking evaluation goes beyond the criteria. As the past has shown, the criteria have incentivized a focus on certain aspects of development practice and can therefore be transformative if they are defined in line with current needs. That is not to replace the development, testing, and experimentation of new methods, but to stimulate and support these developments and keep up with the times.
11) Are we keeping up with trends outside the world of development evaluation? There is a vibrant and much larger universe of evaluation, beyond that of the development industry, that is continuously evolving and flourishing, and for which "rethink, reframe, revalue, relearn, retool and engage" is an embedded and ongoing process.
Caroline: By all means: we are open to new ideas and improved practices. At IEG, we have hired a number of evaluation experts with the vision of upgrading our methodologies and evaluation practices. In addition, we are drawing on expertise and literature from across the field of evaluation to continuously grow.
Hans: I don’t have the impression that the development evaluation field has gone stale and is inward-looking. New articles in evaluation journals and books are being published constantly. And I am certainly in favour of promoting cross-fertilization from other areas.
12) Do the SDGs present an opportunity to reframe the evaluation dialogue and build the foundations for a more embracing, resilient, inclusive and sustainable world? What other drivers do you see as pushing the need to change?
Hans: The Sustainable Development Goals as a vision for 2030 are certainly both an opportunity and a challenge. One lesson from the MDG era was that monitoring took the main role while evaluation was in the backseat. The implementation of the ambitious 17 goals and 169 targets, and the monitoring of 230 indicators, certainly poses a number of challenges. From an evaluation perspective, I would like to see some more critical thinking: What is the theory of change? What about the assumptions behind reaching the goals and targets? What steps need to be taken to enable evaluation to play a useful role in supporting implementation? A number of factors are driving change and disruption in our societies, including technology, violent extremism, and competition – not only collaboration – between private firms and states. Evaluators need to think outside the box.
Caroline: In addition, the SDGs include some targets on consumption patterns. If all countries aimed for consumption levels like those in OECD countries, the world overall would face considerable constraints and not achieve sustainability. Everyone needs to rethink consumption, including how we evaluate progress towards new consumption patterns. For instance, the efficiency criterion asks whether project resources were used as efficiently as possible, but not whether the project (by design and in its final implemented state) contributes to wasteful or sustainable consumption patterns. It is the most difficult part of the SDG agenda, it is uncomfortable, and it falls under no one's mandate in particular – the ingredients for a “forgotten” agenda that will be revived far too late, that is, close to the 2030 target year.
13) Given the amount of interest that this topic has generated, how and where can stakeholders engage with you to build on the existing R/E/E/I/S framework going forward?
Hans: The stakeholder group that I am most involved with is the DAC Evaluation Network, which consists of some 40 evaluation departments from ministries, development agencies and banks. I believe there is an openness to discussing issues around “re-thinking evaluation”. If a process of revisiting the criteria is launched, it would be important to reach out widely to partners, civil society and evaluators in a consultative mode of engagement.
Caroline: Our first step will be to review the many comments and contributions we received on the blog series and then discuss with stakeholders, like Hans, whether and where to take this discussion. I agree with Hans that such a process would need to be open to wide-ranging consultation.
Read the #WhatWorks series Rethinking Evaluation:
Have we had enough of R/E/E/I/S?, Is Relevance Still Relevant?, Agility and Responsiveness are Key to Success, Efficiency, Efficiency, Efficiency, What is Wrong with Development Effectiveness?, Assessing Design Quality, Impact, the Reason to Exist, and Sustaining a Focus on Sustainability.
Conversations: Making Evaluation Work for the World Bank, the African Development Bank, and Beyond
Caroline Heider and Rakesh Nangia, Evaluator General of the African Development Bank, explore the role of independent evaluation in their respective institutions, and some of the key issues they have encountered.