In previous blogs, I have discussed why, after nearly 15 years, the time is ripe for the development community to revisit the DAC evaluation criteria. These criteria have been widely adopted, and today they largely govern how many institutions evaluate development projects and operations (see earlier blogs).
Recently I participated in a workshop organized by the DAC Evaluation Network to help shape the process of updating the evaluation criteria. It was a good discussion that built on previous conversations, recognized the need for action, found some alignment among the participants on some of the issues, and took concrete steps towards broader consultation. Overall, the meeting confirmed the need for a combination of three things: (1) genuinely new criteria; (2) updates to existing criteria; and (3) better implementation or application of what we have now.
The workshop made me realize that I had – mistakenly – assumed that the rationale for the “Big 5” evaluation criteria was well understood. After so many years of use, it became almost second nature to use these criteria, as opposed to other options, in our evaluations. The discussions revealed that while the DAC evaluation criteria were indeed well known, the underlying rationale for using them was not.
If we are going to revisit the criteria or test whether they remain “fit-for-purpose”, it is useful to trace their origins. Not doing so would mean missing an important part of the conversation.
In this blog, I try to address this gap, at least in part.
How the DAC Evaluation Criteria Emerged
The underlying assumption at the time was that “aid” should help “recipient countries” achieve positive development results. To do so, aid needed to be:
- Relevant → address actual needs and align with country policies and the comparative advantages of the agency;
- Effective → attain objectives that were agreed among the stakeholders that are working together to carry out the intervention;
- Efficient → make the best use of scarce resources so that the country and its people got the most benefits from the intervention;
- Impactful → produce effects beyond the objectives of the intervention; and
- Sustainable → have lasting effects that would continue benefitting the recipient country and its people where the intervention was carried out.
Given this context, the ongoing conversation around rethinking the evaluation criteria is based on the premise that the world has changed and that evaluation needs to adjust its criteria accordingly. The reason: to ensure evaluation incentivizes development practices that promise success in terms of lasting, positive results for partner countries.
So, what does that mean for the evaluation criteria? Is each of them sufficient to determine success or failure?
My take reaffirms that the criteria provide a strong foundation, but that they can be strengthened through revisions, additions, and better practices.
- Relevance is a necessary condition for success. But is it sufficient? In a context where the interdependence of development processes has become stronger, and the relative size of aid has become smaller, success will also be determined by strategic selectivity. Understanding the complex interplay of various actors, policies, interventions, etc. and knowing where best to contribute will be important for development institutions’ comparative advantage. Evaluation could incentivize practitioners’ search for such opportunities by assessing strategic selectivity.
- Effectiveness has suffered from poor practices. Objectives are often poorly defined, and only limited relevant data is collected to assess progress. Evaluations should be capturing intended and unintended outcomes, but often struggle to determine outcomes as stated. Starting with development practices, we need to see an improvement in defining objectives. Not just for the sake of evaluators, but to ensure all parties have a clear understanding and agreement about the results they intend to achieve. Lack of clarity might result from a lack of agreement or ownership, something that has been identified as key to success but inherently difficult to evaluate.
- Straddling relevance and effectiveness is the question of agility and responsiveness. As the development context has become more dynamic, interventions must – and often do – respond to changing contexts. Evaluation can incentivize such adaptation (where it does not happen) by providing feedback on whether an institution acted in time and based on a good understanding of changing needs. In theory this assessment could be done by comparing “relevance at design” with “relevance at evaluation”, but in practice the comparison often does not happen, or fails to assess whether well-informed and timely decisions led to continued relevance.
- Efficiency is a criterion where improved application is essential. Anyone should want to make the best use of their resources, whether borrowed, generated domestically, or even given as grants (given the opportunity cost of a wasted grant). This is even more so for two reasons. First, resource limitations (among all parties) dictate that we make the best use of what we have. Second, as partner countries develop their capacity to evaluate their own programs, they should want to make sure public resources are used as efficiently as possible to get the most services and goods for their people.
- Impact in a complex context needs to move towards assessments of (positive and/or negative) synergy effects. For instance, an intervention might have achieved its objectives/been effective, but due to other factors (contextual or other policies and interventions), its impact is limited or negated. To ensure success, these wider effects should be considered during planning (in design), implementation (through monitoring), and evaluation to learn how interventions can leverage positive synergy and mitigate negative synergy.
- From a normative point of view, impact assessments could be expanded to include the distribution of impacts on different population groups (income groups, vulnerable groups, etc.) to understand better whether concerns of poverty and inequality have been alleviated or deepened.
- Sustainability needs to recognize a much broader concept than environmental sustainability, and assess whether an intervention contributed to increasing resilience – environmental, economic, and social.