Poverty Mapping: Innovative Approaches to Creating Poverty Maps with New Data Sources

Chapter 1 | Survey and Census Data

This section discusses the methodological implications of using traditional data sources such as survey and census data for generating poverty maps.

Definition

Household surveys are useful for collecting reliable information on the demographic and socioeconomic characteristics of a population of interest. Data are usually collected from a sample of randomly selected households, and statistical inference is used to estimate population parameters. The Living Standards Measurement Study (LSMS) is the World Bank’s flagship household survey program. It focuses on strengthening household survey systems in client countries and improving the quality of microdata to better inform development policies. The LSMS encompasses a multitopic survey that is customized to local contexts and uses best practices in survey methodology. The data collection and data management processes are coordinated by national statistical offices with the support of the World Bank LSMS team.

The Demographic and Health Surveys (DHS) are nationally representative household surveys that provide data for a wide range of indicators, such as population, health, and nutrition. Standard DHS surveys have large sample sizes (usually between 5,000 and 30,000 households) and are typically conducted every five years to allow comparisons over time. The DHS project is funded by the United States Agency for International Development. Household surveys, including the LSMS, DHS, and national surveys, can be used in combination with household-level census data to generate poverty maps. They provide detailed information about key variables contributing to poverty, with census data providing broader geographic coverage of household information.

Data Sources

For this method, survey and census data are combined to estimate poverty rates. A typical household survey collects data on a range of dimensions related to household and individual well-being, including but not limited to demographics, education, health, fertility, migration, labor, housing, savings and credit, income, and consumption. Surveys can be customized to a specific country context. Data can be collected via face-to-face methods, remotely, or both (significant innovations in remote data collection were prompted by the coronavirus [COVID-19] pandemic). Country-level implementation of household surveys relies on instruments such as household, community, price, and facility questionnaires.

Variables on consumption and income can be used to determine the economic status (and poverty levels) of households across a country. These variables can be complemented by other data sources, such as the socioeconomic modules included in household surveys, to derive a multidimensional poverty estimate. The concept of multidimensional poverty includes not only the insufficiency of economic resources but also the lack of basic rights (such as access to food, health, housing, social security, and education [CONEVAL 2017]).
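As a minimal sketch of how a consumption variable translates into poverty indicators, the Python snippet below computes a weighted poverty headcount and poverty gap (the Foster-Greer-Thorbecke family with alpha equal to 0 and 1). The function name, the poverty line, the sampling weights, and the household values are hypothetical and serve only to illustrate the arithmetic; they are not taken from any survey.

```python
def fgt(consumption, weights, poverty_line, alpha):
    """Weighted Foster-Greer-Thorbecke index.

    alpha = 0 gives the poverty headcount ratio,
    alpha = 1 gives the poverty gap index.
    """
    total_weight = sum(weights)
    total = 0.0
    for c, w in zip(consumption, weights):
        if c < poverty_line:
            # Relative shortfall from the line, raised to alpha.
            total += w * ((poverty_line - c) / poverty_line) ** alpha
    return total / total_weight


# Hypothetical per capita consumption and sampling weights for four households.
consumption = [50.0, 80.0, 120.0, 200.0]
weights = [100.0, 200.0, 300.0, 400.0]

headcount = fgt(consumption, weights, poverty_line=100.0, alpha=0)
poverty_gap = fgt(consumption, weights, poverty_line=100.0, alpha=1)
print(headcount, poverty_gap)
```

The weights matter because each sampled household stands in for a different number of households in the population; an unweighted average would describe the sample rather than the population.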

Methods

The primary method for estimating geolocated poverty incidence rates using survey and census data is small-area estimation. Small-area estimation refers to a family of statistical imputation techniques that combines census and survey data to derive poverty estimations disaggregated to small geographical units (such as cities, towns, villages, or census divisions).

The creation of poverty maps for small areas is complex. Household survey data typically include income or consumption variables (which are required to derive poverty estimates) but are not representative at lower levels of disaggregation because of insufficient sample sizes. The opposite is true of census data, which are representative at lower levels but do not usually contain sufficient information on consumption or income. To overcome these data limitations, the small-area estimation method links income or consumption variables from survey data to other variables available in the census, so that the estimated relationship can be applied to census-level units of observation.

Using the small-area estimation method requires both national census data and household survey data (such as DHS, LSMS, or national income-expenditure surveys) for the country of interest. The survey and census data used to apply this method must share a common set of variables (associated with poverty levels). Furthermore, the data sources must be close in time (a gap of three to five years is generally considered acceptable). This is important to ensure that the characteristics of the population have not changed significantly between the two data collections, since the method relies on the assumption that the model of consumption or income estimated from the survey is applicable to the census-level observations.
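The two requirements above (shared variables and closeness in time) can be screened before any modeling. The following is a small sketch, not a standard tool: the function name, variable names, and the default five-year threshold are illustrative assumptions based on the three-to-five-year range mentioned in the text.

```python
def check_compatibility(survey_vars, census_vars, survey_year, census_year,
                        max_gap_years=5):
    """Report which variables both sources share and whether they are close in time."""
    common = sorted(set(survey_vars) & set(census_vars))
    gap = abs(survey_year - census_year)
    return {
        "common_variables": common,
        "gap_years": gap,
        "gap_ok": gap <= max_gap_years,
    }


# Hypothetical variable lists for a survey and a census.
report = check_compatibility(
    survey_vars=["hh_size", "head_educ", "roof_type", "consumption"],
    census_vars=["hh_size", "head_educ", "roof_type", "district"],
    survey_year=2005,
    census_year=2001,
)
print(report)
```

Matching names is only a first filter. In practice, the shared variables would also need comparable definitions and response categories in both questionnaires.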

The small-area estimation method usually consists of two steps: (i) calibration of a statistical model based on survey data, and (ii) application to the comprehensive census data. In the first step, multiple linear regression analysis is used to estimate a model of household income or consumption based on survey data, restricting the explanatory variables in the model to the subset available in both the survey and the census. In the second step, the estimated model parameters are applied to the census data. The output from these two steps is an estimate of income or consumption for every household in the census. These estimates are then aggregated at the desired geographical level (for example, municipalities, districts, or villages).
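The two steps above can be sketched in Python with NumPy. This is a deliberately simplified illustration on synthetic data, not the ELL procedure or the World Bank implementation: it fits ordinary least squares on log consumption, ignores area-level random effects and parameter uncertainty, and simulates only a household-level error term. All variable names, coefficients, and the poverty line are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)
N_AREAS = 5


def simulate_households(n, n_areas):
    """Synthetic households with two covariates shared by survey and census."""
    area = rng.integers(0, n_areas, size=n)
    hh_size = rng.integers(1, 9, size=n).astype(float)
    head_educ = rng.integers(0, 16, size=n).astype(float)
    log_cons = 6.0 - 0.10 * hh_size + 0.08 * head_educ + rng.normal(0, 0.4, size=n)
    X = np.column_stack([np.ones(n), hh_size, head_educ])
    return area, X, log_cons


# Survey: small sample in which consumption is observed.
_, X_survey, y_survey = simulate_households(500, N_AREAS)
# Census: many households; the simulated consumption is discarded.
area_census, X_census, _ = simulate_households(20_000, N_AREAS)

# Step (i): estimate the consumption model on the survey.
beta, *_ = np.linalg.lstsq(X_survey, y_survey, rcond=None)
resid = y_survey - X_survey @ beta
sigma = np.sqrt(resid @ resid / (len(y_survey) - X_survey.shape[1]))


# Step (ii): apply the model to census households and aggregate by area.
def area_headcounts(poverty_line, n_draws=100):
    """Mean and spread of simulated poverty headcounts for each area."""
    z = np.log(poverty_line)
    rates = np.zeros((n_draws, N_AREAS))
    for d in range(n_draws):
        # Add a simulated error so predicted consumption keeps its dispersion.
        y_sim = X_census @ beta + rng.normal(0, sigma, size=len(X_census))
        poor = y_sim < z
        for a in range(N_AREAS):
            rates[d, a] = poor[area_census == a].mean()
    return rates.mean(axis=0), rates.std(axis=0)


rates, spread = area_headcounts(poverty_line=300.0, n_draws=20)
print(rates, spread)
```

The error draw is included because classifying households on the fitted values alone would compress the consumption distribution and distort the share falling below the poverty line.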

Various techniques can be used to conduct small-area estimations. The most commonly employed methods include the ELL method (named after the researchers Elbers, Lanjouw, and Lanjouw [2003]), Empirical Bayes Prediction, Hierarchical Bayes, and Best Linear Unbiased Prediction. The World Bank has been a pioneer in the development of the small-area estimation method for creating poverty maps. Different variations of this technique have been applied to countries such as Albania, Bolivia, Bulgaria, Cambodia, China, Ecuador, Indonesia, Mexico, Morocco, Thailand, and Vietnam (Bedi, Coudouel, and Simler 2007).
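For orientation, ELL-type approaches are usually written as a regression of log consumption with a nested error term. The notation below is a common textbook form and is an assumption here, not a formula taken from this publication: household h lives in cluster c, x denotes the covariates shared by survey and census, z is the poverty line, and R is the number of simulation draws.

```latex
\ln y_{ch} = x_{ch}^{\top}\beta + \eta_{c} + \varepsilon_{ch},
\qquad \eta_{c} \sim (0, \sigma_{\eta}^{2}),
\quad \varepsilon_{ch} \sim (0, \sigma_{\varepsilon}^{2})

\hat{P}_{a} = \frac{1}{R}\sum_{r=1}^{R} \frac{1}{N_{a}}
\sum_{h \in a} \mathbf{1}\!\left[\hat{y}_{h}^{(r)} < z\right]
```

The first line is the model estimated on the survey; the second averages, over simulated census consumption values, the share of the N_a households in area a that fall below the poverty line.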

Applicability Considerations

In the context of evaluation, the use of household survey data for poverty mapping is limited mainly by its substantial data requirements. Costs for this method vary. If household survey data are not available, data collection costs are likely to be prohibitive for individual evaluations. For example, the LSMS is estimated to cost approximately US$1.7 million per survey per country (commensurate with similar survey efforts by other organizations; see, for example, SDSN TReNDS [2018]). However, piggybacking on existing data from LSMS and national household surveys is quite feasible.

LSMS data are publicly available via a database of completed surveys conducted in 38 countries from 1980 to the present (see note 3). Data access is characterized as (i) direct data access; (ii) public-use data files; or (iii) data available from an external repository. Data sets categorized as direct data access can be downloaded immediately. Data sets categorized as public-use data files can be accessed after registering with the World Bank Microdata Library and applying for access. This application requires a description of the intended use of the data. For data sets categorized as available from an external repository, the World Bank Microdata Library provides links to partners’ websites. In addition, an access policy is outlined for each data set in the study description; this policy includes the name of a contact individual, access conditions, and citation requirements.

In some countries, national survey data are available and can be used for evaluative purposes. Similarly, the DHS program has been running for over 30 years and has produced over 320 surveys in 90 countries. DHS surveys can be directly accessed from the website of the United States Agency for International Development, but viewing and downloading DHS microdata requires registration as a DHS data user (see note 4). DHS data set access is granted only for legitimate research purposes.

Such analysis can be completed using standard software packages for statistical analysis such as R, Stata, or SPSS. The World Bank has also released a publicly available Stata package, which can be used to conduct small-area estimation (see note 5). The visualization of the poverty estimates in maps might also require access to geospatial software such as QGIS (open source) or ArcGIS. The application of such estimation techniques requires knowledge of multivariate statistics and data manipulation and processing skills.

Examples

Example 1: Poverty Maps Using Household Surveys in Brazil

Elbers, Lanjouw, and Leite (2008) validated the application of the ELL method based on a poverty map of Minas Gerais, a state in southeastern Brazil. This exercise was motivated by the fact that the 2000 Brazil census included additional income information as part of the census data collection procedure: (i) a single question on income of the household head was added to the traditional questionnaire collected from all households, and (ii) a more detailed questionnaire on income was fielded to 12.5 percent of households. These additional data provided an opportunity to compare the predicted poverty estimates produced by the ELL method with the actual household income figures obtained from the census data. For computational ease, the analysis focused on the state of Minas Gerais only.

After examining the estimates in nearly 1,000 municipalities, the researchers concluded that the poverty estimates produced by the ELL method were closely aligned to the actual observed poverty rates in those municipalities. Furthermore, the authors found that confidence intervals for those estimates were moderate.

Example 2: Poverty Maps Using National Household Surveys in Bolivia

As described by Arias and Robles in “The Geography of Monetary Poverty in Bolivia: The Lessons of Poverty Maps,” the World Bank, in conjunction with the Social and Economic Policy Analysis Unit and the National Institute of Statistics, developed a poverty map of Bolivia using the ELL method (Arias and Robles 2007).

The main data sources for this exercise were the National Population and Housing Census of 2001 and household surveys that were conducted through the Program for the Improvement of Household Surveys and the Measurement of Living Conditions and carried out by the National Institute of Statistics in 1999, 2000, and 2001. Data from these sources were combined to obtain a larger sample that could be disaggregated according to the main administrative regions (departments) and areas in Bolivia. The method linked household consumption expenditure with variables measured in the household surveys and the census to impute the missing expenditure data.

Example 3: Multidimensional Poverty Maps Using National Household Surveys in Mexico

CONEVAL (2017) developed a poverty map of Mexico, disaggregated at the municipality level, using the small-area estimation method. A novel element in this case was the use of a multidimensional poverty measure. This estimation was based on a combination of data on economic well-being (income) and social rights (such as access to food, health, education, social security, or dignified housing). Income data were obtained from the Intercensal Survey, and information for the multidimensional measurement of poverty was extracted from the Socioeconomic Conditions Module of the National Survey of Household Income and Expenditure. The study, within the multidimensional approach of measuring poverty, also produced granular estimates on food insecurity and lack of access to social security.
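To illustrate how an income criterion and social-rights indicators can be combined, the sketch below flags a household as multidimensionally poor when its income is below an income line and it has at least one social deprivation. This rule, the indicator names, and the threshold of one deprivation are assumptions made for illustration; the criteria actually applied should be taken from CONEVAL's own methodology.

```python
def is_multidimensionally_poor(income, income_line, deprivations, min_deprivations=1):
    """Combine an income test with a count of social deprivations.

    deprivations maps an indicator name to True when the household is deprived.
    """
    n_deprived = sum(1 for deprived in deprivations.values() if deprived)
    return income < income_line and n_deprived >= min_deprivations


# Hypothetical household: low income and no access to social security.
household = {
    "food": False,
    "health": False,
    "education": False,
    "social_security": True,
    "housing": False,
}
print(is_multidimensionally_poor(income=900.0, income_line=1200.0,
                                 deprivations=household))
```

Under this rule, a household with low income but no deprivations, or with deprivations but income above the line, is not counted as multidimensionally poor, which is what separates the measure from a purely monetary one.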

Example 4: Poverty Maps Using Living Standards Measurement Study Survey Data in Nicaragua

Sobrado and Rocha used data from a 2005 LSMS in Nicaragua to create a poverty map of the country (World Bank 2008). The 2005 Census of Nicaragua and the 2005 LSMS were used as data sources, and the authors included only data from questions that were either the same or similar in both the sources of information. The authors then compared the 2005 poverty map with one created in 1995 to identify changes in the distribution of poverty. Through this exercise, the authors found a decrease in the incidence of poverty and in the poverty gap index for almost all regions of Nicaragua. The authors recommended that policy makers in Nicaragua use the 2005 poverty map as a targeting tool (in addition to other tools) because the map showed both the distribution of poverty and how the distribution had shifted since 1995.

Example 5: Poverty Maps Using Household Survey Data in Ecuador, Madagascar, and South Africa

Demombynes et al. (2002) created poverty maps for Ecuador, Madagascar, and South Africa by combining survey and census data. Although the three countries differ significantly in geography, stage of development, and other characteristics, the researchers found that the poverty estimates generated from this exercise were plausible (that is, the estimates generated from the census data matched well with estimates calculated directly from the survey data) and sufficiently precise (that is, at a lower level of disaggregation than was possible through the household survey data alone).

For the Ecuador map, the researchers used data from a 1990 census conducted by the National Statistical Institute of Ecuador and a 1994 household survey based on the LSMS. For the Madagascar map, they used data from a 1993 census conducted by the National Institute of Statistics, a 1993–94 household survey conducted by the Ongoing Household Survey, and data on spatial and environmental outcomes at the fivondrona (communes) level. For the South Africa map, they used data from the 1995 October Household Survey, an Income and Expenditure Survey conducted at approximately the same time, and a 1996 population census.

The researchers examined the extent to which the poverty estimates from the census matched the poverty estimates from the household surveys (at the level represented in the survey). The poverty estimates for Ecuador were relatively close to the results of the census, with all but two regions within 95 percent confidence intervals. The estimates for Madagascar were also relatively close, except for one or two strata that were not well explained by the first-stage regression (for example, the adjusted R² for the rural Antsiranana stratum was 0.292, the lowest of any of the models explored). The estimates for South Africa were also deemed satisfactorily close.
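The comparison described above amounts to checking whether a census-based estimate falls inside the confidence interval of the corresponding survey estimate. A minimal sketch follows, assuming a normal approximation with a 1.96 multiplier; the stratum names and all numbers are hypothetical and are not the figures reported by the researchers.

```python
def within_survey_ci(survey_estimate, survey_se, census_estimate, z=1.96):
    """True if the census-based estimate lies inside the survey confidence interval."""
    lower = survey_estimate - z * survey_se
    upper = survey_estimate + z * survey_se
    return lower <= census_estimate <= upper


# Hypothetical stratum-level headcounts: (survey estimate, survey SE, census-based estimate).
strata = {
    "stratum_a": (0.40, 0.03, 0.43),
    "stratum_b": (0.25, 0.02, 0.31),
}
for name, (survey_est, survey_se, census_est) in strata.items():
    print(name, within_survey_ci(survey_est, survey_se, census_est))
```

A stratum that fails this check, like the second hypothetical one, would be a candidate for revisiting the first-stage regression used for that stratum.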

Based on these results, the researchers found that across the three countries, the poverty estimates at the census level aligned overall with the household survey estimates, with the standard errors at the stratum level being consistently lower than those derived solely from the household survey data.

The researchers also explored how far the census-based poverty estimates can be disaggregated, using the household survey sampling errors to benchmark acceptable levels of precision. For all three countries, they could generate poverty estimates at the third administrative level with similar levels of precision to the household survey data (at the representative stratum level of the survey). This exercise demonstrated how this method can provide useful information about the incidence of poverty levels across regions.

  1. For example, poverty maps can identify small pockets of poverty within wealthier areas, information that would otherwise be masked by national poverty averages. Although geolocated survey data offer similar benefits, surveys are more costly to implement and have lower coverage (in time and space) than the aforementioned proxies.
  2. Poverty maps also have a broader relevance in the development community at large. They can be used to enhance key policy and programmatic aspects, such as targeting and coordinating strategies at local levels. Timely poverty estimates also aid real-time decision-making during crises, such as pandemics and natural disasters. Although the broader applicability of poverty maps contributes to their overall usefulness and relevance for development programs and policies, this publication focuses specifically on creating and using poverty maps in the context of evaluation.
  3. See the Living Standards Measurement Study database at https://microdata.worldbank.org/index.php/catalog/lsms.
  4. See the Demographic and Health Surveys data sets: https://dhsprogram.com/data/available-datasets.cfm.
  5. The Stata package is available at https://github.com/pcorralrodas/SAE-Stata-Package.