Geographic microtargeting of social assistance with high-resolution poverty maps

Significance
Many antipoverty programs use geographic targeting to prioritize benefits to people living in specific locations. This paper shows that high-resolution poverty maps, constructed with machine learning algorithms from satellite imagery, can improve the geographic targeting of benefits to the poorest members of society. This approach was used by the Nigerian government to distribute benefits to millions of the extreme poor. As high-resolution poverty maps become globally available, these results can inform the design and implementation of social assistance programs worldwide.


Correlations between imputed DHS wealth estimates and ground truth
Targeting simulations for the survey benchmark poverty map are conducted by replacing a portion of true DHS wealth estimates with imputed wealth estimates. Imputation is done using DHS data from nearby regions; see Section E.2 for details. Figure S7 plots these imputed wealth estimates against NLSS ground truth wealth estimates for the corresponding regions. As expected, the correlations between imputed DHS estimates and ground truth wealth are lower than those obtained with the original DHS estimates (correlation declines from 0.787 to 0.647 for LGAs, and from 0.779 to 0.663 for wards). This reflects a decline in the accuracy of DHS wealth estimates in regions where data are unavailable.
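The comparison above reduces to computing a Pearson correlation between two sets of per-region wealth estimates. The following is a minimal sketch of that calculation, not the authors' code: the function name `pearson_r` and all numeric values are hypothetical and chosen only to illustrate how a noisier (imputed) estimate yields a lower correlation with ground truth.

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)

# Hypothetical per-region wealth estimates (illustrative values, not from the paper).
ground_truth = [0.10, 0.35, 0.50, 0.72, 0.90]   # stands in for NLSS
direct_dhs   = [0.15, 0.30, 0.55, 0.70, 0.85]   # stands in for surveyed DHS regions
imputed_dhs  = [0.40, 0.25, 0.60, 0.45, 0.80]   # stands in for values imputed from neighbors

r_direct = pearson_r(direct_dhs, ground_truth)
r_imputed = pearson_r(imputed_dhs, ground_truth)
print(f"direct: {r_direct:.3f}, imputed: {r_imputed:.3f}")
```

In this toy example the imputed estimates track ground truth less closely than the direct estimates, which is the qualitative pattern reported above.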

Targeting all wards in the ground truth sample
Our main analysis compares the performance of targeting using optimal, ML-based, and survey benchmark poverty maps on the matched sample of 5.3% of wards where both the optimal (NLSS) and benchmark (DHS) surveys have coverage. It is also possible to evaluate performance on the full 22.9% of wards for which NLSS (but not necessarily DHS) data are available by using the process described in Section E.2 to impute wealth for wards where no DHS data exist ("minimum imputation DHS"). Of the 2,016 wards that have NLSS data (22.9% of all wards in Nigeria), the DHS contains data for 464 wards (23.0% of the 22.9%); the wealth estimates for the remaining 1,552 wards are imputed. It is reassuring that the performance of the ML-based approach is nearly the same in this sample, as it suggests that missingness of validation data was not biasing our estimates. The modest increase that we observe in the performance of DHS-based ward-level targeting is expected, since it uses DHS survey data directly for 23.0% of wards, and imputed estimates

Fig. S2. Locations of clusters included in NLSS and DHS surveys
Notes: For privacy reasons, cluster locations are displaced by up to two kilometers in urban settings and up to five kilometers in rural ones. One percent of clusters, selected at random, are displaced by up to ten kilometers.

Fig. S7. Correlation between imputed DHS wealth indices and ground truth
Notes: Scatterplots compare the imputed survey benchmark (DHS) wealth estimates for each administrative unit against the ground truth (NLSS) wealth estimate. Wealth estimates are imputed using data from surrounding areas to simulate the accuracy of DHS in regions where it does not contain surveyed households; see section E.2 for details. Left plot shows Local Government Areas (LGAs), the Admin-2 unit, and the right plot shows Wards, the Admin-3 unit. All correlations are significant at p=0.001.

For each household, an error term is computed using their percentile in poverty rankings minus their percentile in targeting order; thus a negative error term implies the household receives aid too late in the targeting order, and a positive error term implies they receive aid too early. Plot shows the difference in mean error for demographic and counterfactual groups; that is, how much earlier or later on average households in this demographic group are targeted relative to similar households outside the demographic group.

0.609 (0.408, 0.754) 0.200 * (-0.071, 0.444)
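The household-level error term described in the notes above (percentile in the poverty ranking minus percentile in the targeting order) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: it assumes that percentile 0 denotes both the poorest household and the first household targeted, it breaks ties by input order, and it compares a group with all non-members rather than with matched counterfactual households. The names `percentile_ranks`, `targeting_errors`, and `mean_error_gap` are hypothetical.

```python
def percentile_ranks(values):
    """Percentile (0-100) of each value in ascending order; ties broken by input order."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    for position, index in enumerate(order):
        ranks[index] = 100.0 * position / (n - 1) if n > 1 else 0.0
    return ranks

def targeting_errors(true_wealth, estimated_wealth):
    """Error = poverty percentile minus targeting percentile.

    Assumption: households are targeted from lowest to highest estimated wealth.
    Negative error: targeted later than the true poverty rank warrants.
    Positive error: targeted earlier than the true poverty rank warrants.
    """
    poverty_pct = percentile_ranks(true_wealth)
    targeting_pct = percentile_ranks(estimated_wealth)
    return [p - t for p, t in zip(poverty_pct, targeting_pct)]

def mean_error_gap(errors, in_group):
    """Mean error of group members minus mean error of non-members (unmatched comparison)."""
    members = [e for e, flag in zip(errors, in_group) if flag]
    others = [e for e, flag in zip(errors, in_group) if not flag]
    if not members or not others:
        raise ValueError("both groups must be non-empty")
    return sum(members) / len(members) - sum(others) / len(others)

# Hypothetical data: the poorest household (index 0) is ranked second by the estimate.
true_wealth = [1.0, 2.0, 3.0, 4.0]
estimated_wealth = [2.0, 1.0, 3.0, 4.0]
errors = targeting_errors(true_wealth, estimated_wealth)
gap = mean_error_gap(errors, [True, False, False, False])
print(errors, gap)
```

Here the first household is the poorest but is reached second, so its error is negative (aid arrives too late), while the household that displaced it has a positive error of the same size.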

Table S1. Accuracy of ML-based poverty maps in regions with less DHS training data
Notes: The ML-based poverty map is trained using DHS data. This table reports the accuracy of the ML-based maps in regions where the nearest DHS cluster is progressively more distant. The first column indicates the number of wards that meet the criteria listed in the row heading. The second column indicates the correlation between ML-based and ground truth (NLSS) poverty estimates for this subset of wards, and the third shows this correlation using the DHS-based poverty map, with missing wards imputed. Row 1 considers all wards for which DHS and ground truth (NLSS) data are available. Row 2 considers wards with ground truth data but no DHS data. Rows 3-7 consider subsets of the wards in row 2 for which the nearest DHS cluster to the ward is at least some distance away. All correlations are significant at p=0.001 except where marked with * (DHS wards at least 20km from the nearest DHS cluster, p=0.146); parentheses show 95% confidence intervals.
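The notes above report 95% confidence intervals for correlations but do not state how they were computed. One standard construction is the Fisher z-transformation, sketched below purely as an assumption about a plausible method. The function name `fisher_ci` and the example inputs are hypothetical and are not values from Table S1.

```python
import math
from statistics import NormalDist

def fisher_ci(r, n, level=0.95):
    """Approximate confidence interval for a Pearson correlation via Fisher's z.

    r: sample correlation, strictly between -1 and 1
    n: number of paired observations (must exceed 3)
    """
    if n <= 3:
        raise ValueError("need more than 3 observations")
    if not -1.0 < r < 1.0:
        raise ValueError("r must lie strictly between -1 and 1")
    z = math.atanh(r)                       # Fisher transform of r
    se = 1.0 / math.sqrt(n - 3)             # approximate standard error of z
    z_crit = NormalDist().inv_cdf(0.5 + level / 2)
    return math.tanh(z - z_crit * se), math.tanh(z + z_crit * se)

# Hypothetical inputs: a moderate correlation estimated from few wards
# gives a wide interval that can include zero.
low_small, high_small = fisher_ci(0.2, 30)
low_large, high_large = fisher_ci(0.2, 300)
print((low_small, high_small), (low_large, high_large))
```

With the same sample correlation, the interval narrows as the number of wards grows, which is why subsets containing few wards can yield correlations that are not statistically distinguishable from zero.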