Exposure to open defecation can account for the Indian enigma of child height

Physical height is an important measure of human capital. However, differences in average height across developing countries are poorly explained by economic differences. Children in India are shorter than poorer children in Africa, a widely studied puzzle called “the Asian enigma.” This paper proposes and quantitatively investigates the hypothesis that differences in sanitation — and especially in the population density of open defecation — can statistically account for an important component of the Asian enigma, India's gap relative to sub-Saharan Africa. The paper's main result computes a demographic projection of the increase in the average height of Indian children, if they were counterfactually exposed to sub-Saharan African sanitation, using a non-parametric reweighting method. India's projected increase in mean height is at least as large as the gap. The analysis also critically reviews evidence from recent estimates in the literature. Two possible mechanisms are effects on children and on their mothers.

to also shape population average height (Bozzoli et al., 2009; Hatton, 2013). Coffey et al. (2017) show that sub-regions of Nepal where open defecation has improved more quickly after a government sanitation program have also seen more rapid improvements in anemia, on average, a nutritional consequence that could plausibly be an effect of sanitation-related intestinal disease. Collectively, these new studies suggest that open defecation in India could plausibly have a large effect on average early-life net nutrition, the key determinant of average child height.
Finally, two studies have recently been discussed at conferences but not yet published. Reese et al. (2017) describe a matched-cohort study in Orissa, India that is designed to assess the effectiveness of a combined piped water and sanitation intervention by Gram Vikas. Preliminary findings in Heather Reese's dissertation show an improvement in child height. [23] Humphrey et al. (2015) describe the carefully conducted SHINE trial of sanitation and nutrition, which is located in rural Zimbabwe. Population density in rural Zimbabwe is much lower than in rural India: Hathi et al.'s (2017) analysis of all GIS-coded DHS surveys predicts that open defecation would not be steeply associated with child height in sub-Saharan African contexts where population density is low. In the supplementary appendix, figure A.2 shows, using Zimbabwe's 2015 DHS, that this prediction is borne out in the data: in rural Zimbabwe, child height-for-age is essentially uncorrelated with the same type of PSU open defecation variable that is constructed for this paper's main analysis.

B Non-parametric reweighting decomposition
This section presents further details on the child-level results from section 5 of the paper.

B.1 Open defecation interacts with population density to predict height
The child-level results in section 4 of the paper use the log of open defecation density as the independent variable. For reference, Gertler et al. (2015) estimate an instrumented coefficient of -0.46, using results from three randomized experiments.
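As a rough illustration of what such an independent variable could look like, the sketch below assumes that open defecation density is the PSU fraction of households defecating in the open multiplied by local population density (people per square kilometer). That construction, the function name, and the example numbers are assumptions made for illustration only; they are not taken from the paper's replication code.

```python
import math

def log_open_defecation_density(od_fraction, people_per_km2):
    """Log of the (assumed) number of open defecators per square kilometer.

    ASSUMPTION: density of open defecation = PSU open defecation fraction
    times population density. The paper may construct it differently.
    """
    if not 0.0 < od_fraction <= 1.0:
        raise ValueError("od_fraction must be in (0, 1]; the log is undefined at 0")
    if people_per_km2 <= 0:
        raise ValueError("people_per_km2 must be positive")
    return math.log(od_fraction * people_per_km2)

# Hypothetical PSUs: same open defecation fraction, different population density.
dense_psu = log_open_defecation_density(0.5, 400.0)
sparse_psu = log_open_defecation_density(0.5, 40.0)
gap_in_logs = dense_psu - sparse_psu  # equals log(10): density alone moves the measure
```

Under this assumed construction the log separates additively into log(fraction) plus log(density), so two places with identical household sanitation behavior can still differ in exposure.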
Section 4 uses the same pooled DHS dataset as Jayachandran and Pande (2013, 2017). They report an open defecation coefficient of -0.358, pooling data from India and Africa and using a binary indicator of household open defecation. There are two reasons to believe that this is an underestimate of the relevant parameter, which is the extent to which the height of the average Indian child would increase if counterfactually exposed to African open defecation. First, this household-level measure ignores externalities of open defecation, capturing only differences across households. Second, this estimate pools the large Indian effect of interest with the smaller effect in Africa, where population density is lower. Finally, columns 5 and 6 verify the statistical significance of the interaction between open defecation and population density, which informs our preferred specification's use of the density of open defecation as an explanatory variable. Recall that population density itself explains essentially none of the India-Africa height gap and that section 3 of the paper confirmed that population density itself is not correlated with DHS-average child height.
Table A.2 presents summary statistics for these data. Demographic categories are correlated with child height, especially in India. In particular, Jayachandran and Pande (2017) have recently emphasized a correlation between child birth order and height: they observe that birth order has a more steeply negative gradient with child height in India than in Africa. Table A.2 is structured by child birth order and mother's fertility [24] in order to permit the reader to visualize separately the correlates of a child's birth order and of her mother's fertility, that is to say, of the size of her siblingship at the time of the DHS survey, when height is measured.

B.2 Summary statistics by birth order and sibsize
A well-developed literature in economic demography emphasizes the importance of separately controlling for sibsize and birth order when trying to identify the effects of either; they are mechanically correlated because high birth order children must come from high fertility mothers (Blake, 1989; Black et al., 2015). These data requirements matter here because the Indian DHS only measures the height of children under 5, and some African DHS surveys do not even measure height up to age 5. One consequence of this is that only the youngest children in each family have height data: 97% of children with measured height in the DHS are either the most recent or second most recent of their mother's births. This means that child birth order and the size of a mother's fertility are particularly strongly correlated in the sub-sample of children in the DHS with measured height: the correlation between a child's birth order and the number of children born to her mother is 0.64 among all children observed in the DHS birth history, but is 0.96 in the sub-sample with measured height.
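The mechanical nature of this correlation can be seen in a small stylized example. The sketch below uses a hypothetical population with one mother at each fertility level from 1 to 6 (not DHS data) and compares the birth order and sibsize correlation among all births with the correlation among only the two most recent births of each mother.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical population: one mother at each fertility level 1..6.
# Each child is a pair (birth order, sibsize).
children = [(order, sibsize)
            for sibsize in range(1, 7)
            for order in range(1, sibsize + 1)]

# Keep only the most recent and second most recent birth of each mother,
# mimicking a survey that measures only the youngest children.
recent = [(order, sibsize) for order, sibsize in children
          if order >= sibsize - 1]

full_corr = pearson(*zip(*children))   # 0.50 in this toy population
recent_corr = pearson(*zip(*recent))   # about 0.95 in this toy population
```

The toy numbers are not meant to reproduce the 0.64 and 0.96 reported above; they only show that restricting to the youngest children pushes the correlation toward one even when nothing else changes.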
Despite this correlation, the large size of the dataset permits some scope to distinguish between the role of mother's fertility (that is, the size of the child's siblingship) and the role of the child's birth order. Table A.2 presents means for each observed combination of mother's fertility (in rows) and child birth order (in columns) for each of five variables: child height-for-age, [25] an indicator for the mother's literacy, the mother's height in centimeters, the fraction of households in the child's local PSU who report defecating in the open, and our constructed measure of open defecation density.
Comparisons down rows show that higher fertility is considerably more correlated with disadvantage in India than in Africa. For an example from the top corner of the table, in India first-born children in siblingships of 2 are about 0.09 standard deviations shorter than first-borns in siblingships of 1; in Africa first-born children from siblingships of 2 are about 0.02 standard deviations taller than first-borns from siblingships of 1. Mother's literacy in India declines by almost 40 percentage points from singletons to those with 4 children (Panel B) and mother's height declines by about a centimeter (Panel C); in Africa over the same range mother's literacy declines by less than 20 percentage points and mother's height increases. Similarly, children born to higher-fertility mothers are exposed to much more open defecation in India than children born to lower-fertility mothers, by a steeper gradient than in Africa.
[24] In the table, we follow the DHS documentation and convention in demography by calling this variable "children ever born."
[25] The table presents residuals after height-for-age is regressed on 119 age-in-months by sex dummies. Age-in-months by sex is the level at which the WHO height reference charts are constructed; therefore, these indicators control for any bias introduced by any discrepancy between average height in our sample and in the reference sample. Child height-for-age is well known to be declining in the first two years of life in poor countries where stunting is common (Victora et al., 2010); figure 1 shows that the average child in the DHS with measured height is older in India than in Africa, because early-life mortality is greater in Africa, so these controls for age are necessary.
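The residualization described in the footnote on age-in-months by sex dummies can be sketched as follows. Because the dummies are a saturated set of cell indicators, the regression residual for each child equals her height-for-age minus the mean of her age-by-sex cell. The records below are hypothetical and the function name is illustrative.

```python
from collections import defaultdict

def residualize_by_cell(records):
    """Return height-for-age residuals net of age-in-months by sex cell means.

    records: list of (age_in_months, sex, height_for_age) tuples.
    Regressing on a full set of cell dummies gives the same residuals
    as subtracting each cell's mean, which is what is done here.
    """
    totals = defaultdict(lambda: [0.0, 0])  # cell -> [sum, count]
    for age, sex, haz in records:
        totals[(age, sex)][0] += haz
        totals[(age, sex)][1] += 1
    return [haz - totals[(age, sex)][0] / totals[(age, sex)][1]
            for age, sex, haz in records]

# Hypothetical children: two girls aged 6 months, two boys aged 30 months.
toy_records = [(6, "f", -1.0), (6, "f", -2.0), (30, "m", -2.5), (30, "m", -1.5)]
toy_residuals = residualize_by_cell(toy_records)  # [0.5, -0.5, -0.5, 0.5]
```

After this step, differences in the age composition of the measured sample can no longer drive differences in average residual height across groups.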
In contrast, comparisons across columns show that, at the same level of mother's fertility, later-born children in India do not appear more disadvantaged than later-born children in Africa. Panel A considers our variable of interest, child height-for-age. In all cases, comparing adjacent cells within a row, later-born children of the same siblingship size are taller than earlier-born children. In India, this apparent advantage is much larger than in Africa for siblingships of size 3 or 4: for example, Indian third-borns are 0.12 standard deviations taller than Indian second-borns, both born to mothers with 3 children; African third-borns are only 0.01 standard deviations taller. In general, in Africa, where fertility is higher, on average, and high fertility is less of a marker of disadvantage, there is not much difference by fertility or birth order: no average differs by as much as 0.04 standard deviations from the average for singleton first-borns. In India, however, there is a steep gradient in mother's fertility. Spears, Coffey, and Behrman (2018) document that empirical strategies, such as mother fixed effects, that succeed in other studies in the birth order literature are unable to convincingly separate sibsize from birth order in the DHS height subsample, because the DHS only measures the youngest children and because age, height-for-age, and birth order are correlated.
Later birth order children may appear to be relatively disadvantaged in India because high fertility is more negatively selective there. As a result, our decomposition analysis takes care to allow for different consequences of fertility between India and Africa (both by controlling for the mean difference in fertility and by computing projections within demographic subsamples in case of any interactions) as well as any role of birth order.
Note: Standard errors clustered by survey PSU in parentheses. p-values: † p < 0.1, * p < 0.05, ** p < 0.01, *** p < 0.001. Open defecation is a fraction 0 to 1.

B.3 Robustness to changes in the African sample
This table implements the OLS-as-decomposition regression of equation 3 with two different ways of "controlling" for the correlation between child age and child height-for-age: columns 5 and 6 control for 59 age-in-months fixed effects, while columns 3 and 4 simply omit children under 2 years old. This is important because height-for-age falls in age over the first two years of life, on average, and children with measured height in the DHS in India are older, on average, than children in sub-Saharan Africa (see figure 1), due to differences in mortality rates. Three conclusions are visible in the results:
• Comparing column 1 with columns 3 and 5, controlling for age changes the amount of the "Asian Enigma" gap to be explained, although a substantial difference remains. Part of the reason Indian children appear shorter is that they are older, and have had more time to accumulate height shortfalls.
• Comparing columns 3 and 4 against columns 5 and 6, it appears that these two different methods of "controlling" for age produce very similar results.
• Comparing columns 2, 4, and 6, the coefficient on the density of open defecation is not sensitive to either of these age controls.
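The logic of reading an OLS regression as a decomposition can be sketched in a few lines: the coefficient on an India indicator is the raw height gap when no controls are included, and the part of the gap left unexplained once a covariate is added. The example below is a minimal sketch on made-up data with a single control; the function name, the data, and the slope of -0.1 are hypothetical, and the paper's equation 3 may include further terms.

```python
def india_gap(height, india, control=None):
    """OLS coefficient on the India indicator, optionally with one control."""
    def demean(values):
        mean = sum(values) / len(values)
        return [v - mean for v in values]

    y, d = demean(height), demean(india)
    s_dd = sum(a * a for a in d)
    s_dy = sum(a * b for a, b in zip(d, y))
    if control is None:
        return s_dy / s_dd  # equals the difference in group means

    x = demean(control)
    s_xx = sum(a * a for a in x)
    s_dx = sum(a * b for a, b in zip(d, x))
    s_xy = sum(a * b for a, b in zip(x, y))
    # Solve the 2x2 normal equations for the coefficient on the indicator.
    return (s_xx * s_dy - s_dx * s_xy) / (s_dd * s_xx - s_dx * s_dx)

# Hypothetical data: height depends only on log open defecation density,
# which happens to be higher for the Indian children.
india = [1, 1, 1, 0, 0, 0]
log_od_density = [5.0, 6.0, 7.0, 2.0, 3.0, 4.0]
height = [-1.0 - 0.1 * x for x in log_od_density]

raw_gap = india_gap(height, india)                       # about -0.3
adjusted_gap = india_gap(height, india, log_od_density)  # about 0
```

In this constructed case the control accounts for the entire gap; in real data the comparison of the two coefficients indicates how much of the gap the control can statistically account for.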
One question is whether mortality selection might bias the estimate of the association between open defecation and child height, if children who would be unusually short are the ones who are likely to die from exposure to open defecation. However, an analysis of historical infant mortality rates (from times when early-life mortality was more common) by Bozzoli et al. (2009) finds that the required mortality rate for such selection to be quantitatively important is very large. We confirm this in our sample with the following procedure. Four percent of the births reported in the five years before the survey do not have measured height; these are children who would have been eligible to have their height measured if they survived. We hypothetically assign each of these the tenth-percentile height among measured children who share their age-in-months and sub-national region (such as Uttar Pradesh); thus, we assume that they would have been unusually short, if measured. When we re-estimate column 2 with these 177,261 real and counterfactual observations, we find a coefficient on open defecation density of -0.099, which is very similar to the -0.092 observed on the real sample of 170,149.
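The imputation step of this procedure can be sketched as follows. The sketch assigns each unmeasured child the tenth-percentile height-for-age within her age-in-months by region cell, using a nearest-rank percentile; the percentile definition, the function name, and the toy records are assumptions for illustration, since the text does not specify them. The last lines check that the two sample sizes reported above are consistent with the stated 4% share.

```python
def impute_tenth_percentile(measured, unmeasured):
    """Assign unmeasured children the 10th-percentile height of their cell.

    measured:   list of (age_in_months, region, height_for_age)
    unmeasured: list of (age_in_months, region)
    ASSUMPTIONS: nearest-rank percentile; every unmeasured child's cell
    contains at least one measured child.
    """
    cells = {}
    for age, region, haz in measured:
        cells.setdefault((age, region), []).append(haz)

    imputed = []
    for age, region in unmeasured:
        values = sorted(cells[(age, region)])
        rank = (len(values) + 9) // 10  # integer ceil(n / 10), nearest rank
        imputed.append((age, region, values[rank - 1]))
    return imputed

# Hypothetical cell: ten measured 12-month-olds in one region.
toy_measured = [(12, "Uttar Pradesh", -3.0 + 0.5 * i) for i in range(10)]
toy_unmeasured = [(12, "Uttar Pradesh")]
toy_imputed = impute_tenth_percentile(toy_measured, toy_unmeasured)
# The unmeasured child receives the shortest of the ten measured values here.

# Consistency check on the sample sizes reported in the text.
counterfactual_children = 177_261 - 170_149            # 7,112 added observations
share_unmeasured = counterfactual_children / 177_261   # roughly 0.04
```

Re-estimating the regression on the measured and imputed records together then shows how much the coefficient could move if the missing children had been unusually short.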
I thank Jere Behrman and Subha Mani for suggestions that led to this table.