Rethinking Evaluation - What is Wrong with Development Effectiveness?
The way we look at development effectiveness needs a facelift.
Focusing on intended results could circumvent examination of unintended consequences
Attributing change to a single actor or intervention ignores many forces that could be at play
Distributional effects need to be assessed if we are serious about boosting shared prosperity
If development planners were to use complexity models to understand the web of interrelated processes to identify their objectives, intended and possible unintended effects would become clearer, possibly increasing evaluability.
Effectiveness is central to international development and its evaluation. The OECD/DAC Glossary of Terms defines development effectiveness as “the extent to which a given development intervention’s objectives were achieved, or are expected to be achieved, taking into account their relative importance.”
By itself, the DAC definition embodies the accountability dimension of evaluation. Complemented with an evaluative question of “why” objectives were achieved (or not), one gets to learning about the experience of trying to achieve a particular objective in a particular context.
The term embodies the fundamental concept that development assistance is measured against the yardstick that it sets for itself, because it is the development partners who decide on the objectives they aim to pursue. This notion is very different from assessment tools like benchmarking (a comparison with an agreed standard) or competition, where success is defined in comparison with others.
When viewed among these options, effectiveness seems rather lenient, given that the development partners define what success looks like. Nonetheless some development practitioners argue that effectiveness is too tough, and too rigid to account for adaptation during the life of the intervention (read our earlier blog – Rethinking Evaluation: Agility and Responsiveness are key to success). Others, mostly evaluators, argue that it is the practitioners’ risk aversion that makes them shy away from effectiveness as a measure of accountability, and has incentivized behaviors to “game the system”. In such a scenario, objectives are written to get a good rating at the end rather than as the intended results that development partners try to achieve. There are good points to each of these arguments.
But, from my perspective, there are additional reasons why the way we look at development effectiveness needs a facelift!
With our increasing understanding of and ability to work with complexity there will be different demands on project planners and evaluators, as discussed in an earlier blog about relevance. This might change the nature in which objectives are set, which will either make it more challenging to assess whether they were met, or demand an equally dynamic evaluation tool, or both. It raises questions about the differentiation between effectiveness and impacts – something many practitioners have struggled with – and might call for merging these two criteria.
In addition, the way effectiveness has been defined has kept attention focused on intended results. Most evaluations grapple with getting evidence to determine whether objectives were achieved and to measure an intervention’s contributions. Fewer evaluations are able to collect evidence on effects outside the immediate results chain and identify unintended consequences. If development planners were to use complexity models to understand the web of interrelated processes to identify their objectives, intended and possible unintended effects would become clearer, possibly increasing evaluability. And even if planners do not use such tools, evaluators should explore how they can become part of defining program theory and evidence collection.
At the same time, complexity models make it clearer that attributing change to a single actor or intervention ignores the many forces at play. The question of attribution has been at the heart of many a debate about the rigor and validity of evidence and whether it could prove one policy or action was better than another. A better understanding of complexity might help join up interventions of different development partners, and suggests that (in the long term) evaluations have to be undertaken from a systemic point of view rather than focused on a single development agency or intervention.
Likewise, distributional effects of interventions, whether an explicit part of the intended outcomes or not, need to be assessed if we are serious about goals like “no-one left behind” (proclaimed by the global community through the SDGs), or boosting shared prosperity, one of the goals of the World Bank Group. Too little attention is paid to the assumptions we make about interventions that are not targeted and supposedly have no distributional effects. If the analysis of intended and unintended effects is differentiated by stakeholder group (rather than treating “beneficiaries” as one homogeneous category), we can get a better understanding of the actual effects or impacts of interventions.
In short, the criterion “effectiveness” needs a facelift, not just for the purpose of addressing counter-productive behaviors. The spotlight that we evaluators shine has incentivized certain behaviors of decision-makers, program planners and implementers. Let’s do so intentionally, rethinking evaluation criteria and methods that incentivize behaviors for better development outcomes.
Have we had enough of R/E/E/I/S?, Is Relevance Still Relevant?, Agility and Responsiveness are Key to Success, Efficiency, Efficiency, Efficiency, and Rethinking Evaluation - Assessing Design Quality
Thank you very much for your critical reflection on what I believe is the OECD criterion that receives the most attention in evaluation reports. In particular, I appreciated that you brought up the issue of unintended outcomes in evaluation. This is a topic very dear to me: I researched it with Michael Bamberger and I have been discussing it with government officials and NGO staff as well as workshop trainees for the past 10 years.
In an effort to better understand why unintended or unexpected outcomes are often invisible in evaluation, I identified two possible causes that I hope could contribute some food for thought for further discussion. Needless to say, my contribution intends to go beyond the common belief that "organizations never want to look bad and therefore have no incentive to ask about occurrences within the scope of their programs that were not adequately envisaged at planning".
The first cause is what I called the "RBM-isation of the evaluation culture" in a brief article that I wrote for OECD back in 2014. In other words, too many people still associate evaluation with logical frameworks and believe that its focus should be only on logframe boxes. When I engage in a discussion with planners and managers who embrace this perspective, I always try to explain to them that evaluation is certainly an integrated component of the Results-Based Management Cycle but that it also has some specific features that set it apart from monitoring (e.g., assessing program or policy effects, good or bad, that were not contemplated at the planning stage). In such cases, one possible solution would be to clarify to stakeholders (e.g., decision-makers and planners in ministries as well as NGOs) that the evaluation function does not simply consist of verifying the compliance between what's found in a project logframe and what happened in reality during implementation. In doing so, using RBM terminology and referring to the RBM cycle as a starting point for discussion might be particularly effective in capacity development programs aimed at professionals who have been exposed to RBM training for over a decade and who are only now starting to familiarize themselves with evaluation.
The second cause that I was able to identify in relation to the limited visibility of unintended outcomes in evaluation is the neglect of critical assumptions at planning or the lack of adequate monitoring of them during implementation. Put simply, critical assumptions, which, by definition, are taken for granted and never measured or assessed, often do not hold, and that's exactly why unintended outcomes often come about. In this vein, one possible solution would be to make a greater effort to monitor program and policy assumptions as much as possible, making sure to allocate adequate budget resources for it during the planning stage.
On a more general and final note, I believe that your blog series reflects a new trend in the current evaluation discourse where practitioners are finally taking a stance vis-a-vis some of the dogmas of our profession (e.g. the OECD criteria). Therefore, blogs like yours are extremely useful for enriching the ongoing discussion. Once again, thank you very much for bringing up this very important issue within our community!
Thanks Caroline for another thought-provoking blog. It is indeed key to differentiate terminology in the first instance. When the same term is used to refer to very different notions, or different terms are used for the same meaning, any comparison or conclusion made on that basis becomes problematic. Might it help to differentiate and clarify the underlying considerations and hypotheses involved in 'effectiveness' or 'impacts'? For example, which specific layer or level of results 'effectiveness' might correspond to, or what its relevant domain might be along a results chain...
Besides, on measuring against a preset yardstick, as you mentioned, and on Michele’s comment about the function of the logframe in evaluation, I think it also depends on how the logframe gets used. Perhaps we might use the logframe mainly as an evaluation framework that provides a theory or theories of change and facilitates evaluative thinking, rather than as a major evaluation tool in itself; or, if the logframe has to be used to fulfil some bureaucratic imperatives, as many prefer, it can perhaps be used together with other tools rather than as the sole or central tool, so that some unintended results might be captured.
Very intense blog, thank you! Our improved understanding and ability to work with complexity indeed require serious revision of how we shape interventions, implement and evaluate them. I’d think that our focus should be on understanding the ‘risk landscape’ of individuals, households, communities and how we alter them with different interventions. Measuring outcomes therefore should be focused on how we change the evolving dynamic of each particular risk landscape rather than vis-à-vis intended results here and now.