Early life height and weight production functions with endogenous energy and protein inputs

Highlights
• We estimate height and weight production functions for infants.
• We focus on the role of energy and protein intake.
• We use IV to control for endogeneity and estimate a number of models.
• The results indicate that protein plays an important role in height and weight change.

We examine effects of protein and energy intakes on height and weight growth for children between 6 and 24 months old in Guatemala and the Philippines. Using instrumental variables to control for endogeneity and estimating multiple specifications, we find that protein intake plays an important and positive role in height and weight growth in the 6-24 month period. Energy from other macronutrients, however, does not have a robust relation with these two anthropometric measures. Our estimates indicate that in contexts with substantial child undernutrition, increases in protein-rich food intake in the first 24 months can have important growth effects, which previous studies indicate are related significantly to a range of outcomes over the life cycle.

2007; Engle et al., 2007, 2011; Victora et al., 2008; Hoddinott et al., 2008, 2013; Behrman et al., 2009; Maluccio et al., 2009). Despite widespread concern about early-life undernutrition, there is limited systematic knowledge about production technologies for key outcomes, particularly height and weight, needed to inform more-effective program and policy design. This gap is partially due to inherent difficulties in modeling these complex biological and behavioral processes; strong assumptions are often required for estimation, so that it is difficult to draw definitive conclusions.

A major challenge in estimating production functions for height and weight is that inputs reflect behavioral choices. Using data from the same Philippine study analyzed in this paper, Akin et al. (1992) and Liu et al. (2009) find that families allocate nutrients to compensate for prior poor health. Where allocations reflect compensatory behaviors that are not controlled for in the estimation, the estimated effect of nutrients on growth can be biased.
Another challenge is measurement error in inputs. Using related data from Guatemala, Griffen (2016) finds that estimates of energy effects on height are substantially larger using instrumental variables (IV) than with ordinary least squares (OLS), probably in part due to measurement error.
In this paper, we examine relations between energy intake and: (1) linear growth and (2) weight gain. We use longitudinal data from Guatemala and the Philippines that include detailed information on anthropometric outcomes, nutrition and other inputs collected at intervals of two to three months to estimate height and weight production functions for children in the critical age range 6-24 months. In our specifications, height and weight depend on lagged height and weight, energy intakes, breastfeeding, diarrhea, and individual fixed endowments. We combine individual fixed-effects (FE) with instrumental variables (IV) to control for both endogeneity and measurement error.
This paper presents three important methodological contributions. First, we estimate production functions for two countries, Guatemala and the Philippines, and for two anthropometric measures, height and weight, which allows us to compare the robustness of our findings across different settings and anthropometric outcomes. Second, we improve on previous IV literature on growth by providing details of instrument selection and an assessment of how robust the results are to changes in the instrument set. We present estimates for numerous instrument combinations, putting emphasis on those judged more reliable based on over-identification and weak instrument tests. Third, in addition to considering total energy intake, which is the nutritional input usually considered in the economics literature, we disaggregate energy intake into two components: proteins and (all) other macronutrients (which we refer to as "non-proteins", meaning fat and carbohydrates). This emphasis on dietary quality, highlighted by Arimond and Ruel (2004), is especially relevant because it may help design interventions that better reduce stunting and underweight. We find robust and positive effects of proteins on height and weight growth. Energy from other macronutrient consumption (non-proteins) is not systematically related to these anthropometric measures, which suggests that protein-rich foods are particularly important for growth of undernourished children.

Input selection
Our choice of inputs is guided by Black et al. (2008), who argue that inadequate diet and disease are the main immediate causes of stunting and wasting. With respect to diet, two energy sources have been identified as being especially important for child growth: proteins and non-protein energy from other macronutrients. Infants require certain minimum amounts of energy and proteins to maintain long-term good health, but these requirements are heterogeneous and depend on several factors including weight and whether the child is breastfed (FAO, 2001; WHO, 2007). Children's energy requirements are partly driven by energy costs of linear growth, which has two components: (1) energy needed to synthesize growing tissues and (2) energy stored in these tissues (FAO, 2001). These comprise approximately one-third of total energy requirements during the first three months of life, but despite increasing in absolute terms they decline to only 3% by age 24 months, in part because overall energy requirements increase substantially with body size. Proteins are needed to balance nitrogen loss, maintain the body's muscle mass, and fulfill needs related to tissue deposition (WHO, 2007). There is also evidence from research on animals that protein provides anabolic drive for linear bone growth (WHO, 2007). 2 To study the relative importance of protein and non-protein sources, we first examine the relationship between total energy and height and weight and then consider the potential for separate roles of the two at once in a single growth model. The comparison of proteins with non-proteins highlights the relative importance of proteins in children's diets and informs what types of interventions might have greater impact on height and weight. 3

2 Micronutrients also play important roles in tissue building (WHO, 2007), but there is limited information about them in our data; hence our focus on protein and non-protein energy.
3 While other individual macronutrients may have different relationships with growth (WHO, 2007), separating them into their components while still treating them as endogenous was empirically infeasible.

There is a limited literature focused on the distinction between total energy and protein energy. Pucilowska et al. (1993) find that high-protein supplementation in Bangladeshi children with shigellosis, a severe bacterial disease, increased weight compared to normal protein diets. A randomized evaluation for children up to 2 years of age in several European countries demonstrated that receiving baby formula with high protein content (% calories from protein) increased weight, but not height (Koletzko et al., 2009). Both of these study populations, however, are different from the ones we examine. The Bangladeshi sample is restricted to children recovering from shigellosis, while the European sample had not experienced the same nutritional deficiencies found in our samples. Using a sample more similar to ours, Moradi (2010) finds that access to high-quality protein, such as from livestock farming, better predicts height in some African countries than other energy sources. Similarly, Baten and Blum (2014), using global information for the first part of the twentieth century, which includes Guatemala and the Philippines, also find that local availability of cattle, milk and meat was an important predictor of adult height. 4

A related issue is protein quality. Proteins are composed of amino acids with specific cell functions, and amino acid content defines protein quality. For instance, plant-based proteins lack essential amino acids, unlike animal-based proteins (Dewey, 2013). In addition, plant-based diets have high levels of phytic acid, which might inhibit zinc absorption (Gibson, 2006), and zinc plays a key role in cellular growth and differentiation (Imdad and Bhutta, 2011). For animal-based protein, Mølgaard et al. (2011) argue that dairy intake has positive impacts on child growth. Although the mechanism is not entirely clear, this may be due to the stimulating effect on plasma insulin-like growth factor (IGF-1) (Michaelsen, 2013).
Breastfeeding is another critically important source of nutrition in early life (Black et al., 2013). In this paper, we have data on breastfeeding status but not on the amount of breast milk consumed. Thus, our energy intake measures exclude energy from breast milk, requiring us to control for breastfeeding status in the models.
Among diseases that affect growth, Walker et al. (2011) suggest that persistent diarrhea and other diseases can have long-lasting effects on children's physical development. Therefore, in our analyses, we incorporate diarrhea as an input, as it is considered a major contributor to stunting, wasting and child mortality (Black et al., 2013).

Height and weight production functions
The main challenges for estimating height and weight production functions include the endogeneity of inputs and measurement error (Behrman and Deolalikar, 1988). To overcome these, we follow the general approach developed in recent research on production function estimation for cognitive and non-cognitive skills (Todd and Wolpin, 2003, 2007; Cunha and Heckman, 2007).
Let $h_{i,t}$ denote the height of child $i$ at age $t$, $w_{i,t}$ the child's weight at age $t$, and $x_{i,j}$ the input (e.g., proteins, non-proteins, or disease) at age $j$. (For simplicity, we present the model with a single input, but generalization to several inputs is straightforward.) Fairly general height and weight production functions are:

$$h_{i,t} = \sum_{j \le t} \beta_{t-j}\, x_{i,j} + \alpha \mu_i + \epsilon^{h}_{i,t} \qquad (1)$$

$$w_{i,t} = \sum_{j \le t} \delta_{t-j}\, x_{i,j} + \sigma \mu_i + \epsilon^{w}_{i,t} \qquad (2)$$

where $\mu_i$ is an individual fixed effect (including genetic endowments and fixed parental and household characteristics) and $\epsilon^{h}_{i,t}$ and $\epsilon^{w}_{i,t}$ are error terms. This formulation allows the entire input history to enter into both equations up to time $t$. Furthermore, it allows for impacts of past inputs on current height and weight and for the possibility that such impacts differ by age. This approach also distinguishes our work from other studies using the same data. Griffen (2016) relies on the fairly strong assumption that past inputs have constant effects on height in Guatemala, so that history plays little role in growth. Similarly, height production functions estimated by de Cao (2015) in the Philippines assume that height growth depends only on current inputs.
Because they include individual fixed effects and the entire input history, Eqs. (1) and (2) are difficult to estimate. For example, if inputs are treated as endogenous and an IV approach were used, it would be necessary to have at least one instrument for each period in the entire input history. Thus, instead of directly estimating these two equations, we make two further assumptions that allow less demanding specifications in terms of data and instrument requirements, while remaining more flexible than previous specifications in the literature.
Assumption 1. Effects of past inputs follow a monotonic (likely decreasing) pattern at a constant rate $\gamma$ for each period. 5 That is: $\beta_{t-j} = \gamma\,\beta_{t-1-j}$ and $\delta_{t-j} = \gamma\,\delta_{t-1-j}$.
Assumption 2. The coefficients on inputs in the height function are the same as those in the weight function, up to a multiplicative constant: $\delta_{t-1-j} = \frac{1+\sigma}{\alpha}\,\beta_{t-1-j}$.
Together, these assumptions reduce the set of endogenous variables to a tractable number, thereby reducing the number of required instrumental variables.
From Eq. (1) and taking first-differences in height we obtain:

$$\Delta h_{i,t} = h_{i,t} - h_{i,t-1} = \beta_0\, x_{i,t} + \sum_{j \le t-1} \left(\beta_{t-j} - \beta_{t-1-j}\right) x_{i,j} + \epsilon^{h}_{i,t} - \epsilon^{h}_{i,t-1}$$

Incorporating the first assumption that $\beta_{t-j} = \gamma\,\beta_{t-1-j}$, we obtain:

$$\Delta h_{i,t} = \beta_0\, x_{i,t} + (\gamma - 1) \sum_{j \le t-1} \beta_{t-1-j}\, x_{i,j} + \epsilon^{h}_{i,t} - \epsilon^{h}_{i,t-1}$$

4 Relatedly, and using the same data from the Philippines that we use, Bhargava (2016) studies the association of macronutrients (proteins) and micronutrients (calcium) with anthropometrics, finding that both protein and calcium are strongly associated with height and weight in the first 24 months of life and also in adolescence. However, Bhargava (2016) only controls for individual effects, treating several time-varying variables as exogenous.
5 While it seems most likely that nutritional inputs would have a larger impact during the 6-24 month age window we model, assuming that the pattern is decreasing is not strictly necessary. The rate can be different for the height and weight equations; we assume that it is similar only for illustration purposes.
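Next, consider the difference in Eqs. (1) and (2) at age $t-1$ (after cross multiplication with $\sigma$ and $\alpha$):

$$\sigma h_{i,t-1} - \alpha w_{i,t-1} = \sum_{j \le t-1} \left(\sigma \beta_{t-1-j} - \alpha \delta_{t-1-j}\right) x_{i,j} + \sigma \epsilon^{h}_{i,t-1} - \alpha \epsilon^{w}_{i,t-1}$$

Under the second assumption that $\delta_{t-1-j} = \frac{1+\sigma}{\alpha}\,\beta_{t-1-j}$, we have:

$$\Delta h_{i,t} = \beta_0\, x_{i,t} + (\gamma - 1)\left(\alpha\, w_{i,t-1} - \sigma\, h_{i,t-1}\right) + v^{\Delta h}_{i,t} \qquad (3)$$

where $v^{\Delta h}_{i,t} = \epsilon^{h}_{i,t} + \left[\sigma(\gamma - 1) - 1\right] \epsilon^{h}_{i,t-1} - \alpha(\gamma - 1)\, \epsilon^{w}_{i,t-1}$. Under these assumptions, height growth can be expressed as a function of current inputs, past height and weight, and an error involving current ($t$) and previous period ($t-1$) shocks. Current inputs enter directly; the full history of past inputs enters indirectly through the lagged height and weight.

We proceed in similar fashion for weight and obtain:

$$\Delta w_{i,t} = \delta_0\, x_{i,t} + (\gamma - 1)\left[(1+\sigma)\, w_{i,t-1} - \frac{\sigma(1+\sigma)}{\alpha}\, h_{i,t-1}\right] + v^{\Delta w}_{i,t} \qquad (4)$$

where $v^{\Delta w}_{i,t} = \epsilon^{w}_{i,t} + (\gamma - 1)\frac{\sigma(1+\sigma)}{\alpha}\, \epsilon^{h}_{i,t-1} - \left[(\gamma - 1)(1+\sigma) + 1\right] \epsilon^{w}_{i,t-1}$. As with the change-in-height Eq. (3), the change-in-weight Eq. (4) depends on current inputs, past height and weight, and an error including current and previous period shocks. 6 This framework forms the core of our approach to estimating production functions for height and weight.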
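The reduction from the full input history to lagged height and weight can be checked numerically. The sketch below is illustrative only and is not part of the original analysis: it assumes that Eqs. (1) and (2) take the form h = Σ β x + αμ + ε^h and w = Σ δ x + σμ + ε^w, switches the error terms off, imposes Assumptions 1 and 2, and verifies that growth in height and weight is then reproduced exactly by current inputs plus lagged height and weight. All parameter values and function names are hypothetical.

```python
import random


def simulate_paths(T, beta0, gamma, alpha, sigma, mu, seed=0):
    """Noise-free height and weight paths built from the full input history."""
    rng = random.Random(seed)
    x = [rng.uniform(0.5, 1.5) for _ in range(T + 1)]   # input at ages 0..T
    beta = [beta0 * gamma ** k for k in range(T + 1)]    # Assumption 1
    delta = [(1 + sigma) / alpha * b for b in beta]      # Assumption 2
    # Assumed fixed-effect loadings: alpha*mu in height, sigma*mu in weight.
    h = [sum(beta[t - j] * x[j] for j in range(t + 1)) + alpha * mu
         for t in range(T + 1)]
    w = [sum(delta[t - j] * x[j] for j in range(t + 1)) + sigma * mu
         for t in range(T + 1)]
    return x, h, w, delta[0]


def max_reduced_form_gap(T=8, beta0=0.4, gamma=0.7, alpha=1.3, sigma=0.6, mu=2.0):
    """Largest discrepancy between actual growth and the lagged-state formulas."""
    x, h, w, delta0 = simulate_paths(T, beta0, gamma, alpha, sigma, mu)
    gap = 0.0
    for t in range(1, T + 1):
        # Change in height from current input and lagged height/weight only.
        dh = beta0 * x[t] + (gamma - 1) * (alpha * w[t - 1] - sigma * h[t - 1])
        # Change in weight from current input and lagged height/weight only.
        dw = delta0 * x[t] + (gamma - 1) * (
            (1 + sigma) * w[t - 1] - sigma * (1 + sigma) / alpha * h[t - 1]
        )
        gap = max(gap, abs(h[t] - h[t - 1] - dh), abs(w[t] - w[t - 1] - dw))
    return gap


if __name__ == "__main__":
    print(max_reduced_form_gap())  # differs from zero only by rounding error
```

Because the fixed effect cancels in the combination of lagged height and weight, the check holds for any value of `mu`, which is the sense in which differencing removes the individual endowment.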

Estimation and identification
Although differencing removes individual-level fixed effects and thus controls for important sources of potential bias (unobserved persistent heterogeneity including, e.g., genetic endowments and fixed parental and household characteristics), to consistently estimate the parameters in the relations for change in height (Eq. (3)) and change in weight (Eq. (4)), we still need to overcome several endogeneity problems. First, by construction previous height and weight are correlated with the error terms of Eqs. (3) and (4) (see Eqs. (1) and (2)). Moreover, if we assume that the household responds to past shocks, as is likely and for which there is evidence for the Philippines (Akin et al., 1992; Liu et al., 2009), current inputs may be correlated with the error terms.

We address potential endogeneity by using IV, which also addresses bias due to random measurement error in $x$ under the assumption that the instruments are uncorrelated with that measurement error. The set of candidate instruments we use differs by country but draws on plausibly exogenous factors including a randomized intervention in Guatemala and prices of common foods in both countries. We treat market prices as exogenous to households (as in Liu et al., 2009). Using prices as instruments for inputs is a well-established approach in the estimation of production functions (Todd and Wolpin, 2003). We also include past height and weight measures, $h_{i,t-2}$ and $w_{i,t-2}$, as instruments to help identify the effects of lagged height and weight. (Instruments are described in further detail in Section 3.3.)

Using the available instruments, we endogenize protein and non-protein intakes, as well as lagged height and weight. However, we do not have access to instruments in both countries that also would allow us to control for the potential endogeneity of breastfeeding or diarrhea. 7 Controlling for individual-level fixed effects is an important aspect of our approach, however, and goes part way toward addressing their potential endogeneity. For example, fixed effects control for the possibility that certain children have a predisposition for diarrhea, or live in particularly unsanitary households. However, if households change breastfeeding practices when health shocks affect their children's health, or change sanitary conditions to reduce diarrhea prevalence, the estimated effects of breastfeeding and diarrhea could be downward-biased. For instance, households that have increased breastfeeding could be compensating for negative health shocks, suggesting a negative relationship between growth and breastfeeding, while correcting for endogeneity could show a positive relationship (and similarly for diarrhea). Because our principal objective is to study the roles of proteins and non-proteins in the production functions, however, we do not emphasize the coefficients for diarrhea and breastfeeding but instead make clear the assumptions under which our primary coefficients of interest are consistently estimated even if breastfeeding or diarrhea are endogenous in the model. Our estimation approach is consistent provided the instruments are not correlated with the error term in the production function, conditional on breastfeeding and diarrhea as well as other covariates mentioned below. This is plausible for the same reason that the instruments are exogenous in relation to the energy inputs, e.g., that they are not correlated with individual-level time-varying health shocks. 8

In principle, there also could be interactions among inputs in the production function, such as between nutrient intakes and diarrhea, or between breastfeeding and other nutrient intakes, but a specification incorporating such interactions would be even more challenging to estimate, requiring additional instruments. Given that there are already four variables that we treat as endogenous in our main models (protein, non-protein, lagged height, and lagged weight), we do not estimate models with such potential interactions; instead, we studied possible interactions by splitting the sample. For instance, to examine whether diarrhea or breastfeeding interacts with diets, we estimated specifications for the sample that is breastfed and compared the results with the sample that is not breastfed. We carried out a similar exercise for diarrhea. Our results indicate that coefficients are not affected when we split the sample by breastfeeding status. For diarrhea, there was some evidence of interaction effects, where diarrhea lowers the effects of macronutrients, but because most of the specifications suffer from problems of weak instruments, we are unable to draw strong conclusions. The estimation of the growth equations also includes an indicator for whether the child was female, number of days since the previous measurement, and age and age squared at time $t$.

6 Specifications of the change-in-height equation that exclude lagged weight, and of the change-in-weight equation that exclude lagged height, were also estimated. Results were similar to the more general specification (available on request).
7 Previous work using the Philippine data has used rainfall as an instrument for diarrhea (Akin et al., 1992). We attempted to endogenize diarrhea using spatial and temporal variation in rainfall and temperature as instruments in Guatemala, but they had minimal predictive power. To keep the structure parallel across the countries, we do not use rainfall to endogenize diarrhea in either country.
8 For instance, if some other disease is important in the production function, and we are not including it, our results hold if the instrumental variables are orthogonal to this other disease.
Our methods permit us to improve upon the previous literature that investigates the effects of total energy on anthropometrics. Because we do not rely on a single preferred set of instruments, we are able to study how robust the effects of total energy on height and weight are across two settings. We do this by estimating the changes in height and weight, first using total energy intakes and then separating protein and energy from other macronutrient intakes to examine their relative partial effects in each model.
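The final estimating equations for the change in each anthropometric measure $A_{i,t}$ (with $A \in \{h, w\}$), adding the additional controls to Eqs. (3) and (4), are:

$$\Delta A_{i,t} = \theta_0 + \theta_1\, \mathit{Energy}_{i,t} + \theta_2\, h_{i,t-1} + \theta_3\, w_{i,t-1} + \theta_4\, \mathit{days\_no\_diar}_{i,t} + \theta_5\, \mathit{bf}_{i,t} + \theta_6\, \mathit{age}_{i,t} + \theta_7\, \mathit{age}^2_{i,t} + \theta_8\, \mathit{female}_{i,t} + \theta_9\, \mathit{gap\_msmt}_{i,t} + v^{\Delta A}_{i,t} \qquad (5)$$

$$\Delta A_{i,t} = \pi_0 + \pi_1\, \mathit{Prot}_{i,t} + \pi_2\, \mathit{Non\_Prot}_{i,t} + \pi_3\, h_{i,t-1} + \pi_4\, w_{i,t-1} + \pi_5\, \mathit{days\_no\_diar}_{i,t} + \pi_6\, \mathit{bf}_{i,t} + \pi_7\, \mathit{age}_{i,t} + \pi_8\, \mathit{age}^2_{i,t} + \pi_9\, \mathit{female}_{i,t} + \pi_{10}\, \mathit{gap\_msmt}_{i,t} + v^{\Delta A}_{i,t} \qquad (6)$$

where $\mathit{Energy}_{i,t}$, $\mathit{Prot}_{i,t}$ and $\mathit{Non\_Prot}_{i,t}$ correspond to the total energy intake, protein intake and non-protein intake; $\mathit{days\_no\_diar}_{i,t}$ is the number of days without diarrhea between measurements; $\mathit{bf}_{i,t}$ is a dummy variable equal to 1 if the child was breastfed during the period leading up to age $t$; $\mathit{age}_{i,t}$ and $\mathit{age}^2_{i,t}$ are age and age squared; $\mathit{female}_{i,t}$ is a dummy variable equal to 1 if the child is female; and $\mathit{gap\_msmt}_{i,t}$ is the number of days between measurements.

Finally, the error terms in Eqs. (3) and (4) exhibit serial correlation of order one by construction. We cluster standard errors at the individual level to take into account this serial correlation, and also any possible correlation of individual error terms; clustering is more general than a correction for serial correlation. Additionally, the error terms are correlated between equations, so there are possible efficiency gains from estimating a system of equations. Nonetheless, given the already complex nature of the estimation, we estimate single equations. The clustered standard errors we calculate can therefore be seen as an upper bound of the standard errors.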
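The measurement-error motivation for IV discussed in this section can be illustrated with a stylized simulation. The sketch below is not the estimator used in the paper (which combines first differences with GMM or LIML and four endogenous variables); it is a single-regressor, just-identified example in which intake is observed with random recall error and a price-like instrument is unrelated to that error. The data-generating process, parameter values and function names are all assumptions made for illustration.

```python
import random


def cov(a, b):
    """Sample covariance of two equal-length sequences."""
    ma = sum(a) / len(a)
    mb = sum(b) / len(b)
    return sum((ai - ma) * (bi - mb) for ai, bi in zip(a, b)) / len(a)


def simulate_estimates(n=20000, beta=2.0, seed=1):
    """Return (OLS slope, IV slope) when the regressor is measured with error."""
    rng = random.Random(seed)
    z = [rng.gauss(0, 1) for _ in range(n)]              # instrument, e.g. a food price
    x_true = [zi + rng.gauss(0, 1) for zi in z]          # true intake over the period
    x_obs = [xi + rng.gauss(0, 1) for xi in x_true]      # intake observed with recall error
    y = [beta * xi + rng.gauss(0, 1) for xi in x_true]   # growth depends on true intake
    ols = cov(x_obs, y) / cov(x_obs, x_obs)              # attenuated toward zero
    iv = cov(z, y) / cov(z, x_obs)                       # uses only instrument-driven variation
    return ols, iv


if __name__ == "__main__":
    ols, iv = simulate_estimates()
    print(f"true effect 2.0, OLS {ols:.2f}, IV {iv:.2f}")
```

In this setup the OLS slope converges to two-thirds of the true effect because one-third of the variance in observed intake is noise, while the IV ratio recovers the true effect. This is the same direction of bias that Griffen (2016) attributes in part to measurement error.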

Data
Estimation of (5) and (6) requires high-frequency longitudinal data in early life that contain information on the outcomes (height 9 and weight) and inputs (proteins and other macronutrients, breastfeeding, and diarrhea), as well as plausibly exogenous instruments. We now describe the data and contexts for two unique studies that fulfill these substantial requirements relatively well, one in Guatemala from the 1970s and the other in the Philippines from the 1980s.

Guatemala
We use data from the Institute of Nutrition of Central America and Panama (INCAP) 1969-1977 nutritional supplementation trial. Four rural villages from eastern Guatemala were selected, one relatively large pair (~900 residents) and one smaller pair (~500 residents). At the outset, the villages were similar in terms of child nutritional status, measured as height at age three years, and were highly malnourished, with over 50% of children severely stunted, i.e., with height-for-age z-score < −3. One large and one small village were randomly selected to receive a high-protein supplement (Atole); the others received an alternative supplement devoid of protein (Fresco). A 180 ml serving of Atole contained 11.5 grams of protein and 163 kcal. Fresco had no protein and a 180 ml serving had 59 kcal. The main hypothesis was that increased protein would accelerate mental development; additionally, it was expected that the high-protein nutritional supplement would affect physical growth. The nutritional supplements were distributed in centrally-located feeding centers in each village. Virtually all (>98%) families participated.
From 1969 to 1977, anthropometric measures (height and weight) were taken every three months for all children 24 months of age or under (including newborns entering the study) in the four villages. This yields a maximum usable sample for our analyses of 878 children measured at least twice by the age of 24 months. The amount of supplement intake was recorded daily in all villages. Home dietary information was collected every three months, including the types and amounts (except for breastmilk) of all foods and liquids consumed. These dietary histories were based on a 24-h recall period in the larger villages and a 72-h period in the smaller villages (from which we construct daily averages), and permit calculation of protein and non-protein intakes for the 24-h period by summing the nutritional content for each food item. The survey recorded the total months a child was breastfed. Nutrients from breastfeeding were not included in the nutritional intake calculations. Retrospective information on illness, specifically the length in days of episodes of diarrhea and fever, was collected semi-monthly.

The Philippines
We use the Cebu Longitudinal Health and Nutrition Survey, a survey of Filipino children born between May 1983 and April 1984 in 33 rural and urban communities (barangays) in Metropolitan Cebu. The baseline survey included 3327 women sampled at a median of 30 weeks of gestation, and yielded a sample of 3080 singleton live births. This sample also exhibits high levels of undernutrition; at age 24 months, 62% of the children were stunted and 32% underweight. During the first two years of each child's life, data were collected every two months. This included anthropometric measurements, 24-h dietary recall of types and amounts (except breast milk) of all foods and liquids eaten, breastfeeding, and recent illness history. For breastfed children, the survey also collected the frequency and length of time spent breastfeeding. Total protein and energy intakes were calculated from foods consumed the previous day (24-h recall method). At each survey, mothers reported whether the child had diarrhea in the past 24 h, and if so, when the episode began, and the number of days the child had diarrhea during the previous week (Adair et al., 2011). The maximum usable sample of children between 6 and 24 months of age for the Philippines is 2713.

Variable construction
Linear growth and weight gain are calculated as the difference between consecutive measurements. Although measurements were scheduled at specified intervals (every three months in Guatemala, every two in the Philippines), there were deviations including instances where a scheduled measurement did not occur. Because children experience high growth and growth spurts during the first two years of life, even differences of several days can be associated with significant differences in growth. We account for this by controlling for the exact number of days between measurements.
Ideal data for this analysis would have information on protein and non-protein intakes over the entire period between measurements, but even in these uniquely comprehensive studies such detailed information is not available. Therefore, we approximate intakes over the entire period by using the average of the 24-h intakes calculated from the dietary recall information at the beginning and end of each period (which decreases measurement error relative to using only one point in time) multiplied by the exact number of days between measurements. For Guatemala, we add to this figure the intakes from the supplement (which were measured daily throughout the period) to obtain total protein and other intakes (as well as their sum, measured as total energy). 10 For breastfeeding, we create a dummy indicator for whether the child was breastfed in the month previous to measurement at time t. While this does not fully exploit the detailed information available for the Philippines, it is done to have similar specifications across countries.
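The intake approximation described above amounts to simple arithmetic, sketched here for concreteness. The function name, argument names and the numbers in the example are hypothetical; only the rule itself (average of the two 24-h recalls, multiplied by the exact number of days between measurements, plus the recorded supplement intake in Guatemala) comes from the text.

```python
def period_intake(recall_start, recall_end, days_between, supplement_total=0.0):
    """Approximate intake over one measurement interval.

    recall_start, recall_end: 24-h intakes from the dietary recalls at the
        beginning and end of the interval (same unit, e.g. kcal per day).
    days_between: exact number of days between the two measurements.
    supplement_total: intake from the supplement summed over the interval
        (recorded daily in Guatemala; zero where no supplement applies).
    """
    if days_between <= 0:
        raise ValueError("days_between must be positive")
    daily_average = (recall_start + recall_end) / 2.0
    return daily_average * days_between + supplement_total


if __name__ == "__main__":
    # Hypothetical child: 600 and 700 kcal/day at the two recalls, measured
    # 91 days apart, with 5000 kcal of supplement consumed in between.
    print(period_intake(600.0, 700.0, 91, supplement_total=5000.0))  # 64150.0
```

Scaling by the exact gap rather than the scheduled interval keeps the input measure consistent with the growth outcome, which is also computed between the actual measurement dates.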
The final input we include is diarrhea. For Guatemala, the protocol was to collect information every 15 days, so it is possible to construct the number of days experiencing diarrhea for the complete periods between anthropometric measurements. 11 For the Philippines, it is only possible to construct the number of days with diarrhea during the week previous to each bimonthly anthropometric measurement. To extrapolate this to the full period between measurements, we estimate a count model for number of days with diarrhea for each two-month period with the Guatemalan data and use the estimated parameters from that model to predict the number of days each Filipino child had diarrhea in each two-month period. 12

As outlined in Section 2.3, in our main specifications we instrument for protein, other macronutrient intakes, and lagged height and weight. We now describe in detail the other instruments besides twice lagged height and weight.
In both countries we use unit prices for various food items, selected with emphasis on foods with high protein content and/or important in the local diet. For Guatemala, prices are averages of national-level prices measured during December each year. We use lagged prices of eggs, chicken, pork, beef, dry beans, corn, and rice. Unit price variables for Guatemala are deflated and measured over the eight-year study period. For the Philippines, we use community-specific prices collected as part of the broader study. Between January 1983 and May 1986, enumerators visited two stores in each community, every other month, and collected prices (and quantity units) for a list of items. Not all items, however, were sold at each store at each visit. Consequently, there is not a complete set of prices for each item from each store (or even from each community in instances where no price was available from either store) in each measurement period. We selected as instruments the prices of dried fish, eggs, corn and tomatoes since these are the ones with the highest frequency in the sample. 13 We use both current and lagged prices of those selected food items. By estimating a large set of instrument combinations, our approach does not depend on any one particular price, avoiding subjective instrument selection.
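To make the idea of estimating over many instrument combinations concrete, the sketch below enumerates candidate instrument sets for the Philippines. It is an illustration under stated assumptions rather than the procedure actually used: the variable names are hypothetical, twice-lagged height and weight are assumed to be in every set, and the only restriction imposed is the order condition that a set contain at least as many instruments as endogenous variables (four here). The paper reports estimating large subsets of the possible combinations, not necessarily all of them.

```python
from itertools import combinations

# Four variables are treated as endogenous in the main specifications.
ENDOGENOUS = ["protein", "non_protein", "lag_height", "lag_weight"]

# Assumption for illustration: twice-lagged anthropometrics enter every set.
ALWAYS_INCLUDED = ["height_lag2", "weight_lag2"]

# Current and lagged prices of the four selected food items.
FOODS = ["dried_fish", "eggs", "corn", "tomatoes"]
PRICE_CANDIDATES = [f"price_{f}" for f in FOODS] + [f"lag_price_{f}" for f in FOODS]


def instrument_sets(candidates, always, n_endogenous):
    """All instrument sets that satisfy the order condition."""
    min_extra = max(0, n_endogenous - len(always))
    sets = []
    for k in range(min_extra, len(candidates) + 1):
        for combo in combinations(candidates, k):
            sets.append(list(always) + list(combo))
    return sets


if __name__ == "__main__":
    all_sets = instrument_sets(PRICE_CANDIDATES, ALWAYS_INCLUDED, len(ENDOGENOUS))
    exactly_identified = [s for s in all_sets if len(s) == len(ENDOGENOUS)]
    print(len(all_sets), len(exactly_identified))  # 247 28
```

Each enumerated set would then be passed to the estimator and screened with over-identification and weak-instrument tests, so that conclusions rest on the pattern across sets rather than on one hand-picked price.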
For Guatemala, we also exploit the experimental variation resulting from the randomized allocation. We use a dummy variable that indicates whether the village had a feeding center that provided the high-protein supplement. We also interact this indicator with the distance from the home of the child to that feeding center.

While the presence of a randomized allocation of a high-protein supplement provides an important source of exogenous variation, since there are four endogenous variables, additional instrumental variables also are used, i.e., twice lagged anthropometrics and food prices. For the Philippines we rely on price variation, which, unlike the annual Guatemalan food price data, varies both within years and spatially, with information on these food items for the majority of measurement periods and each of the 33 communities.

10 For Guatemala we use an individual-level fixed-effects model to impute nutrient intakes for approximately 5% of missing observations. See Data Appendix Section 1.
11 Approximately 45% of such 15-day visits were missed. In those instances, we assume the child had similar diarrhea patterns across all 15-day intervals during that growth period and scale up the observed number of days accordingly.
12 See Data Appendix Section 2 for details of the estimation of the count model for diarrhea.
13 See Data Appendix Section 3 for further details on prices.

Descriptive statistics
Over the period from ages 6 to 24 months, each Guatemalan child is observed an average of 4.3 times and each Filipino child 9.1 times. The sample we describe includes all observations (measurements of children at different ages) with complete information for the following variables: change in height between consecutive measurement periods (linear growth), change in weight between consecutive periods (weight gain), total energy, energy from protein, energy from non-protein, breastfeeding indicator, and days with diarrhea. 14 The final number of observations used in each specification varies depending on the availability of the instrumental variables used in that specification, since instruments for some observations are missing.

Table 1 compares the main variables for both samples. On average and at all ages, the Filipino children in the early 1980s were taller than the Guatemalan children in the 1970s. For example, at 12 months of age, Filipino children were on average 70.7 cm tall, while their Guatemalan counterparts were 1.8 cm shorter. In terms of average weight, however, there were no significant differences between countries: at 24 months, children from both countries averaged 9.8 kg. Among the Guatemalan children, 44% were stunted and 27% underweight. The corresponding levels were lower, 25% and 11%, for Filipino children. In 2011, average levels of stunting were 28% and of underweight 17% for low- and middle-income countries, and 36% and 18% in Africa (Black et al., 2013). Thus, with broadly similar levels of stunting and underweight, our historical samples remain relevant to understanding undernutrition in many countries and regions.

Table 2 shows that Guatemalan children appear more likely to have been breastfed at all ages. In both countries, breastfeeding declines with age. At six months, 99% of Guatemalan children were breastfed, while at 24 months only 18% were; the proportions were 76% and 14% for Filipino children.
Patterns between diarrhea and age are less clear. In Guatemala, average number of days with diarrhea (per 3-month measurement period) increases with age to 15 months, after which it declines. Levels are relatively lower in the Philippines, fluctuating between about 2 and 6 days (per 2-month period), with no clear age pattern.
For Guatemala, information is complete on all of the instruments except the distance to the feeding center, which is missing for approximately 5% of observations. For the Philippines, on the other hand, incomplete price availability leads to larger reductions in the sample size. The potential sample has 24,820 child-age observations; the lagged price of corn, which is the most complete, has 18,710 observations and the lagged price of tomatoes, the least complete, has 16,084 observations.

Overview
We estimate height and weight production functions for children 6-24 months, the period widely considered to be a critical window for post-birth nutritional investment. 15 We use Generalized Method of Moments (GMM) for exactly-identified models and Limited Information Maximum Likelihood (LIML) for over-identified models because the latter allows for smaller finite-sample bias (Stock and Yogo, 2005). As noted, we cluster error terms at the individual level to take into account correlation of individual error terms and serial correlation (Baum et al., 2007). 16 We first estimate height and weight production functions using only total energy (i.e., the sum of calories from protein and other sources), then we analyze separately the roles of proteins and non-proteins. In all specifications, the energy intakes, lagged height, and lagged weight are treated as endogenous, and we control for breastfeeding, number of days without diarrhea since the previous measurement, child sex, number of days since the previous measurement, and age and age squared.
14 For the Philippines, the number of available observations is constant across variables, but decreases with child age due to attrition. For Guatemala, the number of children with available information on intakes and diarrhea is smaller than the number with anthropometric measures because the dietary and morbidity information for infants under 12 months was not collected until 1973.
15 There are additional substantive, as well as practical, reasons for the 6-24 month window. First, during the first six months most infants are breastfed; indeed WHO recommends exclusive breastfeeding from birth to age six months. Therefore, before that age proteins and non-proteins in the diet reflect non-exclusive breastfeeding that could be detrimental to growth. Second, it is not possible to study the production function at earlier ages because our final specification models growth and the candidate instrumental variables include second lags of height and weight (Section 2.2). Because we model growth and use these second lags, however, the analysis does incorporate information on individuals prior to six months of age. Third, while the frequency of measurements differs, both samples have measurements at ages six and 24 months, facilitating comparability.
16 The specifications also include predicted days of diarrhea. We do not explicitly account for this in calculating the standard errors, instead relying on the general correction provided by clustered standard error calculations.
Because there are many potential instrument combinations, to establish general results that do not depend on one specific instrument combination, we estimated large subsets of all possible combinations. For Guatemala we first restricted the instrument sets to combinations that always had the Atole experiment indicator. Then, we systematically varied inclusion of distance interactions with the Atole indicator, second lags of height, second lags of weight, and from two to four of the seven food prices (eggs, chicken, pork, beef, rice, beans and corn). For the Philippines, we systematically varied inclusion of second lags of height, second lags of weight, and from two to six of the eight (four current and four lagged) food prices (eggs, fish, tomatoes and corn). A summary of our instrument combinations is found in the Data Appendix Section 6. For Guatemala, there are 546 specifications (i.e., each with a different instrument set) for the version of the model with total energy (Eq. (5)) and 525 when proteins and non-proteins are included separately (Eq. (6)). 17 The total number of specifications estimated for the Philippines is 602 for both models.
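The enumeration of instrument sets for the Philippines can be illustrated with a minimal sketch. The variable labels below are hypothetical, and the sketch applies only the rules stated in this paragraph plus a simple order condition (at least as many instruments as endogenous variables). It is not expected to reproduce the 602 specifications reported, since the exact selection rules are given in the Data Appendix and are not restated here.

```python
from itertools import combinations, product

# Hypothetical labels: four current and four lagged food prices (Philippines).
FOODS = ("eggs", "fish", "tomatoes", "corn")
PRICES = [f"price_{f}" for f in FOODS] + [f"lag_price_{f}" for f in FOODS]


def instrument_sets(prices, min_prices=2, max_prices=6):
    """Yield every candidate instrument set as a tuple of labels."""
    # Each second lag (height, weight) is either included or left out.
    for use_height, use_weight in product((False, True), repeat=2):
        lags = []
        if use_height:
            lags.append("height_lag2")
        if use_weight:
            lags.append("weight_lag2")
        # Choose between min_prices and max_prices of the available prices.
        for k in range(min_prices, max_prices + 1):
            for chosen in combinations(prices, k):
                yield tuple(lags) + chosen


def satisfies_order_condition(instruments, n_endogenous):
    """True when there are at least as many instruments as endogenous variables."""
    return len(instruments) >= n_endogenous


all_sets = list(instrument_sets(PRICES))
# Assumption: four endogenous variables, as in the model with protein and
# non-protein entered separately (plus lagged height and lagged weight).
usable = [s for s in all_sets if satisfies_order_condition(s, n_endogenous=4)]
print(len(all_sets), len(usable))
```

Generating the sets programmatically makes it straightforward to apply further restrictions, such as those described in the Data Appendix, as additional filters.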
For each specification, we calculate the robust versions of the Hansen-J (HJ) over-identification test, the Anderson-Rubin under-identification test (Anderson and Rubin, 1949), and the Wald F-statistic (robust Cragg-Donald or CD statistic) to detect weak instruments. Since our main models have four endogenous variables and we estimate them assuming heteroskedasticity, it is not possible to compare CD statistics with critical values from Stock and Yogo (2005). The robust versions of these tests were developed in Kleibergen and Paap (2006). We also calculate for each endogenous variable Angrist and Pischke's (AP) partial F (Angrist and Pischke, 2009), which is informative about the presence of weak instruments. Finally, for all over-identified models we calculate the Hausman test of equality of OLS and IV estimates.
We use the HJ over-identification and the CD statistics to focus our analysis on specifications with stronger and more exogenous instruments. In general, the Anderson-Rubin and Hausman tests strongly support our identification strategy. Based on the Anderson-Rubin test, we reject under-identification in all specifications for Guatemala, while for the Philippines we reject under-identification in 96% of the specifications. The Hausman test rejects equality of OLS and IV estimates in 99% of the specifications with total energy and 90% of the specifications with protein and non-protein separate in Guatemala and 87% and 98%, respectively, for the Philippines. Finally, we calculate the AP partial F statistic for the energy coefficient ($\lambda^{h}_{\mathrm{energy}}$ and $\lambda^{w}_{\mathrm{energy}}$) from Eq. (5) and the protein ($\lambda^{h}_{\mathrm{prot}}$ and $\lambda^{w}_{\mathrm{prot}}$) and non-protein coefficients ($\lambda^{h}_{\mathrm{non\,prot}}$ and $\lambda^{w}_{\mathrm{non\,prot}}$) from Eq. (6). These statistics are useful to make comparisons across equations and variables, but do not provide formal statistical support against weak instruments, since there are no critical values available for them. In general, the results suggest that the instruments are stronger for Guatemala: the AP partial F tends to be over 30 for the protein coefficients and over 7 for energy and non-protein coefficients. For the Philippines, the AP partial F for the total energy coefficient tends to be over 20. However, it is mostly below 5 for the protein and non-protein coefficients, which suggests that instruments are weaker in the more general specification for the Philippines. 18 Despite these differences in AP statistics, results are broadly similar across countries, which suggests that we are identifying structural relationships between nutrients and anthropometrics.
Since each production function is estimated multiple times, we explore distributions of estimated coefficients rather than a single or small set of "preferred" specifications, allowing us to draw more general conclusions. We do not choose or define a preferred specification because there are no obvious criteria for doing so and because of the concern that any potential preferred specification would not be robust to changes in the set of instruments. Although a priori the instruments we propose are plausibly exogenous and strong, we put relatively more confidence in those instrument sets that better satisfy over-identification and weak instrument tests.
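Summarizing a distribution of coefficient estimates, rather than one preferred estimate, can be sketched as follows. This is a minimal illustration under stated assumptions: the record fields (`coef`, `se`, `cd`, `hj_p`) are hypothetical names, the toy numbers are made up and are not results from the paper, and significance uses a two-sided normal approximation at the 5% level.

```python
from statistics import quantiles

Z_95 = 1.96  # two-sided 5% critical value under a normal approximation (assumption)


def summarize(specs, min_cd=None, min_hj_p=None):
    """Summarize coefficient estimates across specifications passing the filters."""
    kept = [
        s for s in specs
        if (min_cd is None or s["cd"] > min_cd)
        and (min_hj_p is None or s["hj_p"] > min_hj_p)
    ]
    if len(kept) < 2:
        return None  # too few specifications to describe a distribution
    coefs = [s["coef"] for s in kept]
    p25, p50, p75 = quantiles(coefs, n=4, method="inclusive")
    # A coefficient is significantly positive (negative) when its 95% interval
    # lies entirely above (below) zero.
    n_pos = sum(1 for s in kept if s["coef"] - Z_95 * s["se"] > 0)
    n_neg = sum(1 for s in kept if s["coef"] + Z_95 * s["se"] < 0)
    return {
        "n": len(kept),
        "p25": p25,
        "p50": p50,
        "p75": p75,
        "pct_pos_sig": 100 * n_pos / len(kept),
        "pct_neg_sig": 100 * n_neg / len(kept),
    }


# Made-up toy records for illustration only; not estimates from the paper.
toy_specs = [
    {"coef": 1.0, "se": 1.0, "cd": 0.5, "hj_p": 0.30},
    {"coef": 2.0, "se": 0.5, "cd": 2.0, "hj_p": 0.20},
    {"coef": 3.0, "se": 1.0, "cd": 4.0, "hj_p": 0.01},
    {"coef": 4.0, "se": 3.0, "cd": 5.0, "hj_p": 0.40},
    {"coef": 5.0, "se": 1.0, "cd": 8.0, "hj_p": 0.60},
]

all_rows = summarize(toy_specs)
stronger = summarize(toy_specs, min_cd=1, min_hj_p=0.05)
print(all_rows)
print(stronger)
```

Calling `summarize` with progressively stricter thresholds yields one summary row per restriction, so the stability of the estimates can be compared as weaker or less plausibly exogenous instrument sets are dropped.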
The results of each type of specification are presented in Tables 3-6 and Figs. 1-3. In Tables 3 and 5, and Fig. 1, we present the estimated overall energy coefficients. In Tables 4 and 6 (Panels A and B), and Fig. 2, we present the estimated protein coefficients, and in Tables 4 and 6 (Panels C and D), and Fig. 3, the estimated non-protein coefficients. Each table presents the 25th, 50th and 75th percentiles of the estimated coefficient distributions and, in the final two columns, the percentages of the coefficient estimates that are significantly (p < 0.05) positive or negative. For each Panel in each table, the first row reports distributions for all estimated specifications and, in subsequent rows, for specifications that are over-identified, and for those that have HJ P-values > 0.05 and CD statistics > 1, 3, or 7 (provided there are more than 10 such specifications in each case). 19 These sets of specifications focus on results for which relatively strong and exogenous instruments are available. Figs. 1-3 present point estimates (and associated 95% confidence intervals) for all specifications that have HJ P-values > 0.05 and CD > 1 (corresponding to the third rows in Tables 3-6). The scale of the x-axis corresponds to the natural logarithm of CD statistics and the y-axis the coefficient values. 20 To facilitate interpretation of the coefficient magnitudes, we simulate changes in height and weight when energy intakes increase ceteris paribus. For this exercise, we use the most restrictive specifications with CD > 7 (or CD > 3 if there are fewer than ten specifications with CD > 7) and HJ P-values > 0.05. Within that set of specifications, we select the median coefficient and simulate effects of increasing energy intakes by 300 kcal per day, protein intakes by 10 g per day, or non-protein intakes by 250 kcal per day. Each of these is approximately one SD of respective intakes of 18-month-old infants in both countries.
This hypothetical daily increase is then multiplied by 90 in Guatemala and by 60 in the Philippines to approximate total intakes for a given measurement period, and then multiplied by corresponding coefficients to obtain anthropometric changes. We call this exercise median prediction.
Table 3 summarizes for Guatemala distributions of coefficient estimates on total energy in the height and weight equations, and Fig. 1A and B show the coefficients and confidence intervals for the corresponding specifications with CD > 1. Total energy positively affects height and weight changes. These positive relationships are most evident for specifications with relatively stronger and more exogenous instruments. Our findings are consistent with previous literature that uses stronger identification assumptions estimating similar relationships from the same data sources (Griffen, 2016).
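The median prediction exercise amounts to a short calculation: take the median coefficient, scale the daily intake increase to a measurement period, and multiply. The sketch below assumes the coefficient is expressed per unit of intake accumulated over one measurement period; the coefficient values are made up for illustration and are not estimates from the paper.

```python
from statistics import median

# Approximate days per measurement period, as stated in the text.
DAYS_PER_PERIOD = {"Guatemala": 90, "Philippines": 60}


def median_prediction(coefs, daily_increase, country):
    """Predicted anthropometric change for a given daily intake increase.

    coefs: coefficient estimates from the retained specifications, in outcome
           units per unit of intake over a measurement period (assumption).
    daily_increase: added intake per day (e.g. kcal or g).
    """
    period_increase = daily_increase * DAYS_PER_PERIOD[country]
    return median(coefs) * period_increase


# Hypothetical height coefficients (cm per kcal); illustration only.
toy_coefs = [1e-5, 2e-5, 3e-5]
predicted_cm = median_prediction(toy_coefs, daily_increase=300, country="Guatemala")
print(round(predicted_cm, 3))
```

Because the calculation is linear, the predicted change scales proportionally with both the size of the daily increase and the length of the measurement period.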

Guatemala
For height in Guatemala, estimated coefficients on total energy are positive in the vast majority of cases, positive and significant (p < 0.05) in 35% of cases, and never negative and significant. The positive relationship is more robust when we consider specifications with relatively stronger and more exogenous instruments, according to the tests. Restricting to over-identified specifications in which HJ P-values > 0.05 and CD > 3, total energy coefficient estimates are positive and significant 57% of the time.
Table 3 notes: 1st column: number of specifications that meet criteria; 2nd-4th columns: percentiles of the distribution of estimated coefficients; 5th (6th) column: percent of estimated coefficients that are positive (negative) and significant at the 5% significance level. 1st row: all specifications; 2nd row: all over-identified specifications, for which the number of IVs exceeds the number of endogenous variables. Other rows include all specifications satisfying the indicated criteria based on the CD and HJ tests. All specifications include breastfeeding, diarrhea, sex, age, and age squared as covariates and a seasonal dummy for the Philippines, and lagged height and lagged weight, both of which are treated as endogenous. Height coefficients are divided by 1000 for presentation purposes.
To provide further interpretation of the magnitude of the coefficients, we calculate the median prediction (Section 4.1), taking the median coefficient of the specifications with CD > 3; we calculate the effect of increasing energy per day by 300 kcal. For Guatemala, this implies a 0.62 cm predicted change in height. For weight production functions, estimated coefficients on total energy are positive and significant for 36% of specifications, and are never significantly negative.
Specifications with higher CD statistics have larger proportions of positive significant coefficient estimates. Fig. 1B shows that while there are fewer specifications with higher CD statistic levels compared to the height model, for those with stronger instruments, the estimates are generally positive. The median prediction exercise indicates increasing energy intake by 300 kcal per day yields a predicted 620 g change in weight.
Next, we consider the roles of protein and non-protein energy separately in the growth model. Proteins robustly and positively affect growth in height and weight in Guatemala, but the relationship of non-proteins (after controlling for protein) with these anthropometric measures is non-positive. Panel A of Table 4 (and Fig. 2A) shows that for 53% of all specifications, protein coefficient estimates are positive and significant. In specifications with CD > 3, the estimates are always positive and significant. In specifications with stronger instruments, the estimated coefficient dispersion (i.e., the distance between the 25th and 75th percentiles) decreases; for specifications with CD > 1 the ratio of the coefficients in the 75th and 25th percentiles is 1.3, while for the specifications with CD > 3 the ratio is 1.06. Our median prediction exercise indicates that if protein were to increase by 10 g per day, the predicted change in height is 0.39 cm.
For weight change (Panel B of Table 4 and Fig. 2B), we find an even more robust pattern for proteins. In nearly all specifications (92%), protein coefficient estimates are positive and significant, and for specifications with CD > 1, they are always positive and significant. For all specifications, the estimate at the 75th percentile is only 1.2 times larger than that at the 25th percentile. This pattern of stability and significance of coefficient estimates also can be seen in Fig. 2B where the dispersion of the estimated coefficients is small, and there is a clear pattern of positive and significant effects of protein intake on weight growth. An increment in protein intake of 10 g per day results in a predicted 195 g change in weight.
By contrast, there is little evidence that energy from non-proteins affects changes in height and weight. Panel C in Table 4 and Fig. 3A show that for Guatemala, in nearly all cases (98%) the estimated coefficient is insignificant in the height model. For the weight production function (Panel D of Table 4 and Fig. 3B), the point estimates are never significant.
Table 5 shows the distribution of the total energy coefficient estimates for the Philippines and Fig. 1A and B the corresponding coefficients and confidence intervals for specifications with CD > 1. As in Guatemala, positive relations are most evident for specifications with relatively stronger and more exogenous instruments. The positive impacts of total energy on height and weight are consistent with those found under somewhat stronger identification assumptions and using the same data, by Liu et al. (2009) and de Cao (2015).

Philippines
Across all specifications summarized in Panel A of Table 5, 13% have positive and significant coefficient estimates (p < 0.05), while none have negative and statistically significant estimates. Restricting results to the 45 specifications with HJ test P-values > 0.05 and CD > 7, 64% of estimated total energy coefficients are positive and significant. Specifications with higher CD statistics tend to have more concentrated coefficient estimate distributions. If daily energy intake increases by 300 kcal, the predicted change in height is 0.18 cm.
For weight, evidence is similar regarding the role of total energy. The bottom panel of Table 5 indicates that for 15% of all the specifications in the Philippines, the estimated coefficient on total energy is positive and significant and never negative and significant. Specifications with the highest CD statistics tend to have larger shares of positive and significant coefficient estimates. Our median prediction results in a predicted change in weight of 37 g.
Panel A of Table 6 (and Fig. 2A) shows that for 39% of all specifications, protein coefficient estimates are positive and significant. While there are fewer specifications with strong instruments than in Guatemala, for specifications with CD > 3, 100% of the coefficient estimates are positive and significant. In specifications with stronger instruments, the estimated coefficient dispersion decreases. Increasing protein consumption by 10 g per day is predicted to result in a 2.24 cm change in height.
For all specifications (Panel B of Table 6 and Fig. 2B), 48% of estimated coefficients on protein for weight are positive and significant, rising to 100% in specifications with CD > 3. Similar to Guatemala, coefficient estimate dispersion decreases with stronger instruments. Increasing protein consumption by 10 g per day results in a predicted 703 g change in weight.
Somewhat surprisingly, non-protein intakes are generally negatively related to both height and weight gain. For height, Panel C of Table 6 reports that 88% of the specifications with the strongest instruments (CD > 3) yield negative and significant estimated coefficients. For weight, 100% of estimates in specifications with the strongest instruments are negative and significant.
These findings for non-protein energy for the Philippines are somewhat counter-intuitive, because they suggest that such energy intakes are detrimental to growth. Most individual foods (including those consumed in these regions during the study periods), however, include both proteins and non-proteins and virtually all diets do. Consequently, it is unlikely that actual intakes would change in a fashion that increased energy from nonproteins while simultaneously holding proteins constant. Since Filipino children's diets included both intakes, on net any negative effects of other macronutrient sources would have been partly or fully offset by protein effects. For example, not including breastmilk, at age 6 months, 93% of children had some protein consumption and from ages 14 to 24 months, all did. Moreover, at age 6 months 75% of children are breastfed, which also provides protein intakes. In Section 4.5, we show that the model predicts that a dietary change (relatively rich in proteins but with some energy from other sources) indeed has positive effects on height and weight, despite negative coefficient estimates on non-proteins.
There are several potential explanations for the finding that non-proteins are less robustly related to anthropometrics than proteins. First, it is possible that energy from macronutrients other than proteins does not affect height and weight, at least aggregating the other macronutrients as we do. Second, it may be that non-linearities are not captured. For instance, carbohydrates and fat may need some proteins to have an effect on anthropometrics: if protein intakes are zero or very low, other intakes would not affect height and weight. Third, dietary changes after children stop breastfeeding can result in poorer quality diets, especially poor quality of carbohydrates and low micronutrient density, weakening any potential link to anthropometrics. Fourth, the available instruments simply may not be powerful enough to detect effects of other macronutrients. Protein and non-protein intakes are highly correlated (even before instrumentation), making it difficult econometrically to identify their distinct effects. In that sense, Guatemala greatly benefits from the experimental Atole intervention, which provides clear and strong exogenous variation for protein, though it is less powerful for other macronutrients.

Effects of other inputs and controls
In addition to the different nutrition intakes, our analysis provides estimates of the coefficients on lagged height, lagged weight, breastfeeding, and diarrhea. The results clearly indicate some catch-up height and weight growth. The lagged height coefficient is consistently negative and mostly significant in the change-in-height equation, indicating that shorter children at the end of one period tend to grow more in the next period. Similarly, the lagged weight coefficient is consistently negative and mostly significant in the weight equation so that lighter children at the end of one period gain more weight in the following period. With the caveat that the estimates for breastfeeding and diarrhea are potentially biased due to endogeneity, our coefficient estimates for number of days without diarrhea are consistently positive and significant for weight in both samples, suggesting that diarrhea has detrimental effects on weight gain as generally found in the literature. The coefficient estimates for breastfeeding are positive and mostly significant for Guatemala. In the Philippines, the coefficient estimates generally show a positive association between breastfeeding and height while the associations between breastfeeding and weight show no consistent pattern, similar to findings from Adair and Popkin (1996). 21

Counterfactual exercise: increasing nutritional intakes
We next simulate the full effects of additional protein and non-protein intakes on child height and weight for the Philippines, complementing the simpler median predictions we used when interpreting individual coefficients. From the set of specifications with HJ P-values > 0.05, we select the specification with the highest CD. The simulation is based on adding one egg per week to a child's diet, assuming no other changes in diet and no change in diarrhea. Eggs are well suited to such simulations. They were widely available in the localities where these studies are situated and are easily consumed by infants. They not only contain highly bioavailable protein, but also contain energy from other macronutrients, similar to many other naturally protein-rich foods. A medium (44 g), whole raw egg contains on average 5.5 g of protein and 40.9 calories from non-protein. 22,23 Based on our parameter estimates, a child who consumed an additional egg per week on top of the existing diet for 18 months (from 6 to 24 months of age) would gain an additional 0.72 cm in height and 265 grams in weight.
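The input side of this counterfactual can be tallied with a few lines of arithmetic. The sketch below only accumulates the added intakes implied by one egg per week; it does not reproduce the reported height and weight gains, which depend on the estimated model. The 78-week horizon is an assumption (18 months at 52 weeks per year), and the function name is hypothetical.

```python
# Nutrient content of one medium (44 g) whole raw egg, as stated in the text.
EGG_PROTEIN_G = 5.5
EGG_NONPROTEIN_KCAL = 40.9

# Assumption: 18 months is approximated as 78 weeks (18 * 52 / 12).
WEEKS_6_TO_24_MONTHS = 78


def added_intake(eggs_per_week=1, weeks=WEEKS_6_TO_24_MONTHS):
    """Total added protein (g) and non-protein energy (kcal) over the horizon."""
    n_eggs = eggs_per_week * weeks
    return {
        "eggs": n_eggs,
        "protein_g": n_eggs * EGG_PROTEIN_G,
        "nonprotein_kcal": n_eggs * EGG_NONPROTEIN_KCAL,
    }


totals = added_intake()
print(totals)
```

Spread over the full horizon, the added intake per day is small relative to the one-SD increases used in the median predictions, which helps put the simulated gains in context.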

Conclusions
Arimond and Ruel (2004) described associations between children's dietary diversity and their height. We build on their insights, examining effects of diet and particularly diet composition on height and weight growth for children between ages 6 and 24 months, giving special attention to differences between diets rich and poor in proteins. We improve upon previous literature by making weaker identifying assumptions, considering two important anthropometric measures (height and weight), investigating the robustness of our results to the use of a number of different instruments, and separately investigating the effects of energy from proteins and from non-proteins while controlling for breastfeeding and diarrhea. We take advantage of two rich databases, one for Guatemala and the other for the Philippines, which have longitudinal information on height, weight, and protein and energy intakes with high frequencies of observations. IV estimation strategies are used to overcome endogeneity and measurement error problems, using food prices and, in the case of Guatemala, a randomized nutritional intervention, as instruments. Because there are many instruments and instrument combinations available, we present results that comprehensively summarize these combinations rather than selecting only a single set of instruments. Our findings indicate that increasing energy intake increases both height and weight in both countries. But the source of that energy, protein versus non-protein, matters. In these poor populations characterized by high levels of chronic undernutrition, increases in protein intake drive increases in child height and weight.
These results provide evidence on an important puzzle in the literature while pointing to possible modifications to interventions designed to improve children's nutritional status. A systematic review by Manley et al. (2013) using meta-analysis techniques shows that while the average impact of income transfers from social protection programs on height-for-age is positive, effect sizes are small and not statistically significant. If households use these transfers largely to increase the quantity of calories consumed, if the increase in protein consumption is small in magnitude, or if these proteins are not allocated to children, then our results suggest that such transfers will have little impact on child height, which is precisely what Manley et al. (2013) find. Headey and Hoddinott (2015) examine impacts of Green Revolution-induced increases in rice productivity on children's anthropometric status. They find no impact of these on child height, results also consistent with what we observe here. Our findings, in conjunction with these other studies, suggest that interventions designed to increase household incomes may only improve children's nutritional status when they are linked to mechanisms that also improve the quality of children's diets. Such interventions, e.g., linking nutritional behavior change communication to social protection interventions or "nutrition-sensitive agriculture", await further study.

Funding
The authors thank Grand Challenges Canada (Grant 0072-03), the Bill and Melinda Gates Foundation (Global Health Grant OPP1032713), and the Eunice Kennedy Shriver National Institute of Child Health and Human Development (Grant R01 HD070993) for financial support. The funders had no involvement in the analysis and interpretation of the data, writing of the paper, or the decision to submit the paper for publication.

Conflict of interest
There are no conflicts of interest.

Acknowledgements
This version of the paper has benefited from comments made by two anonymous referees and by participants in seminars at LACEA, PAA and the University of Pennsylvania.
21 All results available on request.
22 Agricultural Research Service of the United States Department of Agriculture. http://ndb.nal.usda.gov/ndb/foods/show/112, accessed on 17 September 2014.
23 If households were to purchase the eggs, the cost would have been approximately 0.37% of the annual average income.