"Selfies" – self-portraits taken with cellphones, at arm’s length and posted on social media – are being generated at an astounding rate.  This provides, for the sociologist or artist, an immense and diverse gallery of photos for study.  It must be admitted, though, that the average quality of selfies is low.  And the objectivity of these self-portrayals is questionable, with images skewing towards the vain, the lewd, and the blurred.

Here, then, is the dilemma of project self-evaluation, as practiced at the World Bank Group and other development agencies. These agencies need to track project experience, both for accountability and for learning. Self-evaluation, in principle, provides information about many more projects than a limited corps of evaluators could possibly visit. It also promotes the laudable practices of self-examination and lesson-learning by development practitioners. But of course it is difficult to be dispassionately objective about one's own work. Individuals, teams, and organizations also face strong incentives to present their work in the best possible light.

To counteract these natural biases, the World Bank's self-evaluation system incorporates independent validation. The project team submits a completion report, which includes self-ratings of outcomes, Bank performance, and other dimensions of project experience. The Independent Evaluation Group (IEG) does a desk review of the completion report and, in about a third of cases, downgrades the outcome rating (only 2% of outcome ratings are upgraded).

The status quo: a self-evaluation system that doesn’t rely on self-evaluation

There are two problems with this approach. First, IEG's desk-based validation rests solely on an assessment of the appraisal document and the completion report. But completion reports rarely document the full evidence base that generated the reported results. It is as if a financial auditor checked only a company's balance sheet without even glancing at the underlying accounting data. When IEG does a detailed field-based evaluation, with access to more evidence, 23% of outcome ratings are downgraded relative to the validation stage, and 8% are upgraded. (Part of this, though, may reflect changes in performance after the project closed.)

The second problem is that, paradoxically, World Bank Group staff and managers ultimately do not rely on the self-evaluations. Because every self-rating is reviewed, and subject to revision, by IEG, the self-rating is essentially superfluous for tracking project or Bank performance.

Paradigm change: validate the system, not the documents

Is there an alternative?  One possibility would be a paradigm change:  instead of validating each individual completion report, let’s validate the integrity of the self-evaluation system as a whole.  The idea is to use a system of selective review, with incentives and disincentives, to both motivate and verify accurate self-reporting.

A familiar example is the way many tax agencies, including the US Internal Revenue Service, handle tax returns. The agency does not fully audit every return. Rather, it quickly checks each return for internal consistency and for "red flags" that, in past experience, have been associated with discrepancies between the taxpayer's estimate of taxes due and the agency's. Disincentives are put in place to discourage inaccurate reporting.

Another example is Indonesia's PROPER system for monitoring industrial pollution. Companies are rated on a five-point scale. The Ministry of Environment celebrates companies whose exemplary performance goes well beyond regulatory standards, and publishes the names of those found to be deliberately contravening pollution laws (if they fail to respond to a warning). Companies that establish a record of good performance are visited by field inspectors less frequently than those that consistently or egregiously do poorly.
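As a rough sketch of that kind of risk-based scheduling, the Python snippet below assigns an inspection interval based on a company's recent rating history. The rating scale, interval lengths, and function name are illustrative assumptions for this post, not PROPER's actual rules.

```python
def inspection_interval_months(rating_history):
    """Months until the next field visit, given past ratings on a
    1 (worst) to 5 (best) scale, most recent last. A good track
    record earns less frequent visits; poor or slipping performance
    earns more. (Hypothetical thresholds for illustration only.)"""
    recent = rating_history[-3:] or [3]   # no history: assume the middle of the scale
    average = sum(recent) / len(recent)

    if min(recent) <= 1:      # any egregious recent failure
        return 3              # quarterly inspections
    if average >= 4.5:        # consistently exemplary
        return 24             # every two years
    if average >= 3.5:        # broadly compliant
        return 12
    return 6                  # middling performance: semi-annual visits
```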

Clear objectives and robust monitoring = less scope for disagreement

Could this work for project evaluation? The trick is, first, to put in place incentives that truly reward accurate self-reporting; and second, to adopt standards and procedures that better align the evaluation criteria applied by IEG and by WBG staff.

Much of the scope for disagreement on ratings can be eliminated simply by ensuring that project objectives are unambiguously defined, indicators are clearly linked to objectives, and monitoring systems robustly track indicators. A sample-based audit could then verify the accuracy of the self-reports. If there is good agreement between self-ratings and independent ratings in the sample, then all the self-ratings would be considered validated.  
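To make the sample-based audit concrete, here is a minimal sketch in Python. Everything in it is an assumption for illustration (the sample size, the 90% agreement threshold, and the function and variable names), not an actual WBG procedure.

```python
import random

def sample_audit(self_ratings, independent_rater, sample_size=50,
                 agreement_threshold=0.9, seed=0):
    """Audit a random sample of self-rated projects.

    self_ratings      -- dict: project_id -> self-reported outcome rating
    independent_rater -- callable(project_id) -> independent rating
    Returns the sampled ids, the agreement rate, and whether the whole
    batch of self-ratings is accepted as validated.
    """
    population = sorted(self_ratings)
    if not population:
        raise ValueError("no self-ratings to audit")

    rng = random.Random(seed)
    sampled = rng.sample(population, min(sample_size, len(population)))

    matches = sum(1 for pid in sampled
                  if independent_rater(pid) == self_ratings[pid])
    agreement = matches / len(sampled)

    return {
        "sampled_projects": sampled,
        "agreement_rate": agreement,
        # Accept all self-ratings only if the sample agrees closely enough;
        # otherwise flag the batch for wider independent review.
        "batch_validated": agreement >= agreement_threshold,
    }
```

In practice the sample would likely be stratified by region or sector, and near-misses (ratings one notch apart) would be treated differently from large discrepancies, but the basic logic is the same: audit a random sample, and let the result of the sample stand in for the whole batch.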

Given proper incentives, clear objectives, and reliable evidence, perhaps we can move towards a self-evaluation system that produces higher quality "selfies" -- and a more accurate snapshot of the organization.

Comments

Submitted by Yemisi Ajumobi on Wed, 04/30/2014 - 01:45

I like the comparison. I'm glad to read an acknowledgment that, by itself, the self-rating function of the ICR and the PAD is evidently prone to bias. In line with the paradigm change you speak of, what's essentially needed is a proper field assessment that uses different techniques to collect feedback to validate or revalidate what has been rated. This would go well beyond self-reporting as a bare "due diligence" process. I hope evaluation specialists can also draw on the lessons learned about those areas rated as "not working," as this is a critical way we can get to designing and implementing successful interventions based on thorough, context-specific evidence.

Submitted by Ken Chomitz on Thu, 05/01/2014 - 06:34

In reply to by Yemisi Ajumobi

Thanks, Yemisi. I think that in a good self-evaluation system the practitioners as well as the evaluators should be drawing on rich rapid feedback, learning quickly what's working and what's not and gaining insights into why. This moves in the direction of a self-validating system that is not a burden, but a tool for increased effectiveness.
