Process Tracing Method in Program Evaluation

Chapter 1 | Theorizing How Interventions Produce Contributions

A pToC, in effect, answers a specific and important impact evaluation question: How did intervention x work to produce outcome y? A pToC combines elements of ToCs, which describe the overall causal logic of an intervention, with theories of action, which articulate the mechanisms through which particular activities produce a specific contribution. A pToC differs from a more conventional ToC in two key ways (Camacho and Beach 2023; Raimondo and Beach 2024). First, a ToC describes, in terms of assumptions, the causal links between a project’s inputs and activities and its outputs, outcomes, and impacts (for example, Mayne 2017, 2019). In contrast, a pToC tries to break down each of these links into a set of activities and interactions that illustrate the actual behavioral link between an input and an output. Second, ToCs tend to depict interventions in snapshot form, describing inputs and activities followed by outputs and outcomes. A pToC offers a more dynamic picture that breaks down the interactions between program actors and beneficiaries into a series of episodes.

A good pToC avoids the extremes of either drowning in detail or simplifying so much that the workings of the intervention are not transparent. An excessively detailed pToC would not be practical or useful in an evaluation because producing it would require finding and assessing evidence related to (almost) every action and interaction by every actor during the implementation period of a program or policy. Excessive detail would also make learning across cases difficult, because the theory would in effect be unique to the studied case. In contrast, an oversimplified pToC would not provide the theoretical scaffolding required to make credible evidence-based claims about an intervention’s contributions to the outcomes of a policy or program; the oversimplification would obscure the activities and links that can be expected to leave observable traces that evaluators can assess empirically.

A good pToC includes only the key episodes of interaction that link the actions of program actors and those affected by these actions in a causal sequence that produces one or more contributions to program or policy outcomes; these episodes in a causal process are also known as causal hotspots in the literature (Apgar and Ton 2021). To identify key episodes, evaluators must ask themselves which interactions have been critical in enabling actors to overcome challenges and barriers that otherwise would have stood in the way of achieving the change the program or policy intended to accomplish. The pToC can also depict these challenges or barriers, together with corresponding episodes whose series of actions could, in theory, have overcome them.

When developing a pToC, it is important not to include everything that occurred during the implementation of the program or policy being evaluated. Because process tracing tends to be resource intensive, a pToC should focus on only one or a small number of key episodes, even for the most complex interventions. Evaluators will need to consider how feasible it is to evaluate a particular pToC with the resources they have available (Aston and Wadeson 2023). Focusing an evaluation on the most interesting parts of an intervention, the most important challenges and barriers, and the contribution pathways used to overcome them is a way to avoid drowning in excessive details. A pToC will never be a perfect representation of what actually happened in a particular intervention; it will always be a simplification of reality. But it should be detailed enough to give its readers a better understanding of how the intervention worked in the real world.

Within each key episode in a pToC, evaluators can use the language of actors and actions to conceptualize the interactions that constitute a process of change (Schmitt and Beach 2015; Wauters and Beach 2018). In simple terms, this means identifying who is doing what. A pToC typically presents the actors in more abstract terms that depict the causal roles they play in the process rather than using their formal titles and names. In IEG reports, for instance, we use terms such as “World Bank official” or “national ministry official.”

Identifying what actors are doing is not enough to enable evaluators to understand why the actions of one actor led other actors to do the things they did. That level of understanding requires that evaluators supplement the process tracing with what Cartwright and Hardie (2012) term causal principles, defined as reasons why a given action by one actor might plausibly lead another actor to do something. A pToC can be visualized in generic terms, as in figure 1.1, in which each part is composed of an actor (who), action (what), and causal principle (why), formulated using a “because…” clause, along with the contextual conditions that evaluators might expect to find if the pToC is to work as theorized.

Figure 1.1. A Generic Process Theory of Change

Source: Adapted from Camacho and Beach 2023.
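
The generic structure in figure 1.1 can also be written down as a small data record, which some evaluators may find helpful when drafting parts of a pToC. The following is a minimal sketch in Python, not a format used in the evaluation: the class name `PToCPart`, its field names, and the example wording are all assumptions made for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class PToCPart:
    """One part of a pToC: who does what, and why (hypothetical structure)."""
    actor: str             # causal role, not a formal title or personal name
    action: str            # what the actor does
    causal_principle: str  # why the action might plausibly lead another actor to act
    context: list[str] = field(default_factory=list)  # conditions expected to hold

    def describe(self) -> str:
        # Formulate the causal principle as a "because ..." clause.
        return f"{self.actor} {self.action}, because {self.causal_principle}"


# Illustrative wording only; it is not taken from figure 1.1.
part = PToCPart(
    actor="World Bank official",
    action="presents benchmarking data to national policy makers",
    causal_principle="an unexpected comparative ranking can draw attention to a problem",
    context=["policy makers have not yet recognized the problem"],
)
print(part.describe())
```

Keeping the causal principle as a required field is a deliberate choice in this sketch: a part that names only an actor and an action would describe what happened without explaining why it mattered.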

There are three steps in developing a pToC for how an intervention worked (Camacho and Beach 2023):

  • Descriptive analysis of what happened, identifying what activities were associated with the intervention and what potential contributions to the final policy or program outcome each activity might have produced;
  • Initial exploration of how each contribution to the final outcome might have been plausibly produced, including identification of key episodes;
  • Disaggregation of the pToC into episodes of interaction between actors.

Step 1: Descriptive Analysis of What Happened During an Intervention

A pToC is a theoretical model that aims to capture the dynamics of how a contribution was actually produced in a particular case, based on empirical research. Before theorizing how a specific intervention produced a particular contribution in an individual case, however, it is vital to know something about what happened, empirically, in the case. Therefore, the first step in generating a pToC is to describe analytically what a particular intervention was, what activities were performed, and what potential contributions to the final policy or program outcome could have resulted. Evaluators can undertake the necessary analysis through desk research involving program documents, but ideally they will supplement this research using exploratory interviews with program actors and managers to gain a better understanding of what actually happened during implementation of the intervention.

The initial empirical research should result in a timeline of potentially important moments in the implementation of the intervention, as well as initial hypotheses about what contribution to program or policy outcomes the intervention might plausibly have produced.

In the example IEG evaluation, we attempted to assess whether policy dialogue in the client country based on World Bank data and diagnostic work made a positive contribution to the coherence of policy in that country and, if so, how this contribution was achieved.

As the first step in our evaluation, we produced a timeline of major events in the country related to reform processes. The timeline included domestic events such as unrest and protests, political events such as major speeches or statements, and the production of data and diagnostic work by the World Bank and other relevant actors, such as domestic think tanks, with a view to identifying windows of opportunity for reform and the World Bank’s potential contribution to any reforms that were achieved. To reconstruct the timeline and pinpoint key events in the case we were studying, we used as our data sources desk research that

  • Identified relevant factors in the client country’s political economy, including slowing economic growth and patterns of domestic unrest;
  • Pinpointed central events, including analysis of speeches delivered by the client country’s head of state, parliamentary decisions, and media data covering important events and World Bank activities during the period under review;
  • Reviewed important documents, including the World Bank diagnostic report on the intervention itself, other internal documents related to the intervention, diagnostic reports by other actors (including a domestic think tank that had reached conclusions similar to those in the World Bank’s diagnostic report), and the final reform policy document produced by the country.

On the basis of this preliminary timeline, we identified, as one potential contribution to reforms in the client country, World Bank ideas included in the diagnostic report that could have helped shape changes in the country’s development model. However, at this stage, it was merely a working hypothesis that these ideas were an actual World Bank contribution to these changes, and we would subsequently subject our hypothesis to critical empirical scrutiny and further theorization.
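
A timeline of this kind can be kept as a simple list of dated events that is sorted and then filtered for candidate windows of opportunity. The sketch below shows one way this might be done in Python. The names `TimelineEvent`, `build_timeline`, and `events_between` are hypothetical, and every date and description is an invented placeholder, not data from the IEG case.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class TimelineEvent:
    when: date
    category: str     # e.g. "domestic", "political", "knowledge product"
    actor: str
    description: str


def build_timeline(events):
    """Return the events in chronological order."""
    return sorted(events, key=lambda e: e.when)


def events_between(timeline, start, end):
    """Return events inside a candidate window of opportunity (inclusive)."""
    return [e for e in timeline if start <= e.when <= end]


# Placeholder entries: all dates and descriptions are invented for illustration.
events = [
    TimelineEvent(date(2001, 6, 1), "knowledge product", "World Bank", "diagnostic report released"),
    TimelineEvent(date(2001, 1, 15), "domestic", "citizens", "protests over regional disparities"),
    TimelineEvent(date(2001, 3, 10), "political", "head of state", "major speech on reform"),
    TimelineEvent(date(2002, 2, 1), "political", "government", "reform policy document adopted"),
]

timeline = build_timeline(events)
window = events_between(timeline, date(2001, 1, 1), date(2001, 12, 31))
for e in window:
    print(e.when.isoformat(), e.category, "-", e.description)
```

Recording a category and an actor for each event makes it easier to see, for a given window, whether World Bank activities preceded or followed domestic and political events. The ordering alone does not establish a contribution; it only helps generate working hypotheses of the kind described above.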

Step 2: Initial Exploration of How the Contribution Could Have Been Produced

After figuring out what happened and identifying potential contributions to policy or program outcomes, an evaluation can start to explore how the intervention might have produced the identified contributions. This step involves moving beyond the timeline developed in step 1 to identify what might plausibly have linked particular interventions and specific contributions to outcomes. This in turn involves identifying what plausible barriers to a particular desired change might have been present in the case being studied and trying to pin down which causal pathways could have driven the interactions that overcame those barriers. The initial theorization that results is the product of both systematic searches of existing literature (academic literature and other evaluations) and initial empirical fieldwork involving a small subset of participants in the intervention.

On the basis of a review of existing academic research and evaluations of other cases involving similar issues relating to policy advice and dialogue, evaluators should try to identify plausible challenges or barriers to change in the case they are evaluating. Doing so enables evaluators to make a more targeted search of the existing social science literature, as well as of the findings of other evaluations, and to identify potential pathways that plausibly describe how those challenges or barriers could be overcome. Social science theories and evaluations of other cases can be very helpful in identifying such potential pathways.

For instance, a common barrier to policy influence is getting national policy makers to take notice of ideas presented by other actors. Before this particular barrier can be overcome, policy makers must recognize that they face a problem; they may then be willing to listen to ideas about how to fix the problem. We noticed early in the IEG evaluation that national discussions about the problems the client country faced focused on disparities in poverty from region to region within the country. However, we found when we examined the existing policy documents, speeches, and parliamentary debates that they did not recognize broader problems with the country’s development model during this time period, despite slowing growth and increasing inequalities.

At this stage of the evaluation, we therefore searched existing social science literature, focusing on the specific barrier of failures in problem recognition. One relevant dynamic that we identified in the policy studies literature for overcoming this barrier was incorporating benchmarking techniques in a knowledge product (de la Porte et al. 2001; Mahon and McBride 2009). As the literature discusses, an unexpectedly positive (or negative) comparative ranking in a benchmarking exercise, presented either directly to policy makers or through the media, can draw attention to a problem. We then explored whether there was evidence of benchmarking techniques having been used to get policy makers in the client country to notice World Bank ideas relevant to solving the problem. In the case we were studying, we found that the World Bank leveraged its management of global benchmarking data, specifically indicators from The Changing Wealth of Nations reports, to open a window of opportunity for starting a process of policy dialogue in the country about possible reforms. In this particular case, it was the positive ranking of the client country compared with the rankings of its neighbors when intangible capital was included in measures of national wealth that prompted national authorities to initiate a dialogue on how the country’s development model could make better use of human and institutional capital in the country’s growth and development trajectory. We concluded, therefore, that the World Bank’s use of benchmarking to overcome the lack of problem recognition among the country’s policy makers was the first key episode in the case (illustrated in figure 1.2). Had this barrier not been overcome, the process would have stalled: no amount of policy dialogue based on data and ideas could have succeeded while policy makers remained unaware of the problem or did not feel compelled by the data and their implications to ask the World Bank for an expert diagnosis.

Depending on the complexity of a particular policy or program, there can be multiple challenges or barriers to change that form key episodes. In a specific case, there might, for example, be a sequence of key episodes in which one challenge is overcome, producing an intermediate outcome (contribution), followed by the next key episode, defined by a new challenge. In more complex interventions, there can also be multiple causal pathways operating in parallel.

In the IEG evaluation, to enable us to develop initial hypotheses about what types of processes might have linked data and diagnostics with policy influence, we also undertook a round of exploratory interviews with key participants to map what happened in the interactions between the World Bank and national officials during the intervention. On the basis of these interviews, we developed another initial working hypothesis focused on a later stage of the process, involving policy dialogue between World Bank officials and key national policy makers in the client country.

Through this combination of systematic literature searches and exploratory empirical fieldwork, we identified several challenges and key episodes in sequence during the time period studied. Figure 1.2 depicts these challenges and key episodes. A lack of problem awareness among policy makers can be overcome, in theory, through actions related to promoting problem recognition. Once policy makers recognize a problem, they face the next challenge: how to fix it. In the corresponding phase of the process in the case studied here, there was still the question of how to supply the expert diagnosis of the problems and solutions. The World Bank needed to strike a balance between writing its diagnostic report in an overly diplomatic manner that would fail to help the country reform and overstepping national red lines that would make the country’s policy makers dismiss the World Bank’s ideas out of hand. Intense policy dialogue enabled the World Bank to overcome this challenge and supply a diagnosis that could speak hard truths without going too far (see further discussion of this later in the chapter). Finally, after it supplied the diagnosis, the World Bank still faced the challenge of getting these ideas and this advice into the country’s final policy reform document without triggering national sensitivities about outside intervention.

Figure 1.2. Initial Aggregate Process Theory of Change for the Independent Evaluation Group Case Study

Source: Independent Evaluation Group.
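
The sequential logic of an aggregate pToC, in which each key episode must overcome its barrier before the next can begin, can be sketched as a simple chain. The Python example below is an illustration under stated assumptions: `KeyEpisode` and `trace` are hypothetical names, the short labels paraphrase the three challenges described above, and the `overcome` flags are placeholders for an evaluator's judgment, not findings.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class KeyEpisode:
    barrier: str               # challenge that would otherwise block change
    pathway: str               # theorized way the barrier is overcome
    intermediate_outcome: str  # contribution produced if it is overcome
    overcome: bool             # placeholder for the evaluator's judgment


def trace(episodes):
    """Walk the episodes in order and stop at the first barrier not overcome."""
    reached = []
    for ep in episodes:
        if not ep.overcome:
            return reached, ep.barrier  # the process stalls here
        reached.append(ep.intermediate_outcome)
    return reached, None


episodes = [
    KeyEpisode("lack of problem awareness", "benchmarking",
               "problem recognized", True),
    KeyEpisode("how to supply an expert diagnosis", "policy dialogue",
               "diagnosis supplied", True),
    KeyEpisode("national sensitivities about outside intervention", "to be theorized",
               "ideas reflected in reform document", True),
]

reached, stalled_at = trace(episodes)

# Counterfactual: suppose the first barrier had not been overcome.
counterfactual = [replace(episodes[0], overcome=False)] + episodes[1:]
cf_reached, cf_stalled_at = trace(counterfactual)
print(cf_reached, cf_stalled_at)
```

The counterfactual run mirrors the reasoning used to identify key episodes: if removing one episode stops every later outcome from being reached, that episode is a plausible candidate for closer empirical scrutiny.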

Step 3: Disaggregation of the Process Theory of Change

Once they have identified key episodes in a case, including the challenges and barriers to change and one or more plausible pathways for overcoming them, evaluators should break down each episode into a more granular, step-by-step process. This process needs to explain how the specific case moves, causally, from the original situation to the expected intermediate or final program or policy outcomes, and what contribution was made by the activities and links involved in the interactions that overcame the barriers. Evaluators can attempt to identify through desk research and initial fieldwork what types of interactions might have taken place to overcome these barriers. Additionally, empirical examples from existing evaluation studies might provide ideas about what actions and reactions might have taken place. As another key source, evaluators can explore relevant social science literature related to how particular challenges or barriers might be overcome (for example, how to get data noticed). Finally, evaluators can engage in logical brainstorming about how particular challenges in the case could plausibly have been overcome, by looking either forward or backward from the initial intervention in the case to the potential contribution to final program or policy outcomes.

In practice, the formulation of a pToC often involves a back and forth between empirical observables and theory. This means that the distinction between step 3 and subsequent empirical testing can become blurred in practice. Evaluators begin with hunches about a possible pToC for a particular case—based on either logical brainstorming or case knowledge—and then engage in a preliminary round of fieldwork to probe the plausibility of these hunches. At this stage, the pToC will be incomplete, and several actions or causal principles in parts of episodes may remain unknown. Putting hunches and question marks into their preliminary pToC flags for evaluators everything that they do not yet know and that they should therefore make the focus of their empirical probing, in fieldwork, by asking, “How did it work?” Initial hunches often prove wrong, and when they do, evaluators should update their preliminary pToC to match what they have actually found in studying the case. If possible, they should engage in exploratory fieldwork to assess their initial pToC and, based on the findings of this initial fieldwork, revise and assess the pToC more systematically in a second round of fieldwork.
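
One way to keep track of hunches and question marks in a preliminary pToC is to leave unknown elements empty and list them before each round of fieldwork. The following Python sketch assumes a hypothetical `DraftPart` record and an `open_questions` helper; the example entries are illustrative and deliberately incomplete.

```python
from dataclasses import dataclass, fields
from typing import Optional


@dataclass
class DraftPart:
    # None marks a question mark in the preliminary pToC.
    actor: Optional[str] = None
    action: Optional[str] = None
    causal_principle: Optional[str] = None


def open_questions(parts):
    """Return (part number, element name) for everything still unknown."""
    return [
        (number, f.name)
        for number, part in enumerate(parts, start=1)
        for f in fields(part)
        if getattr(part, f.name) is None
    ]


# Illustrative draft with gaps left open on purpose.
draft = [
    DraftPart("World Bank officials", "identify potential champions of reform", None),
    DraftPart("national officials", None, None),
]

for number, name in open_questions(draft):
    print(f"Part {number}: {name} unknown. Ask in fieldwork: how did it work?")
```

Rerunning the helper after each round of fieldwork gives a quick view of which gaps have closed and which hunches still need to be probed or revised.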

Figure 1.3 illustrates how, in the IEG evaluation, we theorized the policy dialogue episode, in which the challenge of national policy makers’ lack of knowledge about possible solutions was potentially overcome. Theorizing the episode involved an iterative dialogue between our initial hypotheses, the existing academic literature on policy learning and dialogue, and empirical fieldwork on the case.

In this step, we wanted to know what activities World Bank officials engaged in, with whom they interacted, and how these interactions were conducted. In the initial pToC for the policy dialogue episode, the intervention operated through two parallel pathways. We originally thought that these pathways were rival contribution theories, but through empirical testing using interviews, we realized that they actually worked together, reinforcing each other, to explain the intervention’s final contribution to policy reform. Here we describe the pToC for the top part of figure 1.3, which involved national authorities asking the World Bank to produce a diagnostic report on the development situation in the country. We look at the pToC for the bottom part of the figure (engagement with a domestic think tank in the client country [grayed out in figure 1.3]) when discussing operationalization in chapter 2.

As shown in the top part of the figure, we found through exploratory interviews that once World Bank officials gained a “seat at the table” to discuss with key economic policy makers in the client country the shortcomings of the country’s existing set of policies and explore alternatives, other challenges for influencing policy surfaced. These challenges included finding the right framework for focusing the policy agenda: a framework that would speak hard truths and propose sufficiently ambitious changes while not crossing red lines that would make it easy for national policy makers to dismiss the report. Further probing using additional interviews and internal project team documents revealed that engagement by World Bank officials involved first identifying potential supporters or “champions” of reforms who could make a difference within governmental deliberations, then engaging in multiple conversations with these national officials to sound out what areas were national priorities and to identify red lines. Through this engagement, key national officials felt “heard,” leading them to (i) champion reforms along the lines identified by World Bank officials, because as the national officials provided input and advice, they began to feel ownership over the ideas, and (ii) provide feedback that the World Bank team could use to revise its diagnostic report to make it more palatable to the country’s decision makers (and thus more likely to be taken into account in final policy decisions). The final outcome was a reform agenda that was partly shaped by the contribution of the World Bank (the diagnostic report).

Figure 1.3. Policy Dialogue Episode (Engagement with National Officials)

Source: Independent Evaluation Group.