Differences in Tsimane children’s growth outcomes and associated determinants as estimated by WHO standards vs. within-population references

Anthropometric measures are commonly converted to age stratified z-scores to examine variation in growth outcomes in mixed-age and sex samples. For many study populations, z-scores will differ if calculated from World Health Organization (WHO) growth standards or within-population references. The specific growth reference used may influence statistical estimates of growth outcomes and their determinants, with implications for biological inference. We examined factors associated with growth outcomes in a sample of 152 Tsimane children aged 0–36 months. The Tsimane are a subsistence-scale population in the Bolivian Amazon with high rates of infectious disease and growth faltering. To examine the influence of growth reference on statistical inferences, we constructed multiple plausible models from available infant, maternal, and household attributes. We then ran identical models for height-for-age (HAZ), weight-for-age (WAZ), and weight-for-height (WHZ), with z-scores alternately calculated from WHO and robust Tsimane Lambda-Mu-Sigma growth curves. The distribution of WHO relative to Tsimane HAZ scores was negatively skewed, reflecting age-related increases in lower HAZ. Standardized coefficients and significance levels generally agreed across WHO and Tsimane models, although the strength and significance of specific terms varied in some models. Age was strongly, negatively associated with HAZ and WAZ in nearly all WHO, but not Tsimane models, resulting in consistently higher R2 estimates. Age and weaning effects were confounded in WHO models. Biased estimates of determinants associated with WHO HAZ may be more extreme in small samples and for variables that are strongly age-patterned. Additional methodological considerations may be warranted when applying WHO standards to within-population studies, particularly for populations with growth patterns known to systematically deviate from those of the WHO reference sample.

Introduction Anthropometric measures of body size are widely used to assess growth, nutritional status, and biological fitness [1][2][3]. For mixed age and sex samples, these measures are often converted to age-stratified z-scores (i.e. height-for-age, HAZ; weight-for-age, WAZ; weight-for-height zscore, WHZ), calculated against a large internal or established external reference-e.g. the U.S. CDC growth charts or the World Health Organization (WHO) Growth Standards [4]. The current WHO standards, in place since 2006, were derived from a large, longitudinal multiethnic survey and are ideal for cross-population comparisons. Unlike the previous CDC/ NCHS reference, the WHO standards also importantly established growth of breastfed infants-who have slower growth trajectories than formula-fed infants-as the normative baseline for children 0-24 months of age [5][6][7].
However, the distinction between "reference" and "standard" has meaningful methodological implications. A "reference" represents growth outcomes in a particular place and time, whereas the WHO "standard" represents optimal growth potential, i.e. how children "ought to grow under optimal conditions" [5,8,9]. Systemic negative deviations from the WHO standards are generally interpreted as evidence of suboptimal growth attributed to nutritional and pathogenic exposures. However, genetic and other factors also influence population-specific growth trajectories [10][11][12][13][14]. For example, the WHO standards have been shown to alternately underor overestimate the prevalence of stunting, underweight, and overweight in affluent populations in China, Japan, and India [15][16][17][18]. Even populations included in the WHO reference samples are not fully represented, as the survey excluded low socioeconomic status families, families living above 1500 m altitude, mothers who smoked during pregnancy or lactation, children born at < 37 weeks or � 42 weeks, and children with substantial morbidities [5,14]. As such, an individual child's growth relative to other children in their population will always be more faithfully represented by within-population z-scores, even within affluent populations [4,19].
Cole advises considering whether the goal is to examine variation in "healthy growth" or "representative growth" in deciding whether to use the standards or a local reference, if available [8]. Kramer et al. have further cautioned that deviance from an optimal standard may have little bearing on a child's relative biological fitness within a population [20]. We further propose that the choice of local reference vs. growth standard may differently influence statistical relationships between estimated growth outcomes and locally varying social, economic, or biological factors-with implications for inferring biological relevance. Mean WHO HAZ scores decline systematically across early childhood in resource poor-settings due to nutritional and infectious conditions [21][22][23][24], resulting in increased age-related variance in WHO HAZ scores that may bias parameter estimates in mixed-age samples [25,26]. Although researchers often control for child age in statistical models [27,28,29], the systematic deviance in WHO-derived z-scores may bias or confound estimates of size differences associated with locally varying determinants, particularly those correlated with age or developmental changes. In contrast, within-population growth references should minimize the influence of endemic influences in estimating relative size, resulting in more accurate and biologically relevant estimates of local growth determinants in regression models.
The Tsimane are a high-fertility, high-mortality population of forager-horticulturalists residing in the Bolivian Amazon [30]. Tsimane infants are exclusively breastfed for four months and weaned later than two years on average [31]. However, no Tsimane households have access to improved or safely managed water sources, and few households have electricity. Endemic parasitism and infectious diseases impose substantial immune and energetic costs [31][32][33][34]. Infant mortality rates have been previously estimated at more than double the national rates for Bolivia-largely owing to respiratory and gastrointestinal infection [35]with higher parity and shorter IBI associated with increased risk of infant mortality and growth faltering [36,37]. In previous surveys of child nutritional status as assessed by WHO standards, 47% of children aged 0-5 were classified as stunted and 18% as underweight [12]. Given the co-occurrence of protective and risk factors that may influence Tsimane growth patterns (e.g. prolonged breastfeeding vs. endemic infectious disease), developing appropriate initiatives to improve Tsimane child welfare requires accurate identification of local growth determinants and at-risk individuals.
We examined growth outcomes of Tsimane children aged 0-36 months in association with different age-related and fixed infant, maternal, and household variables. We assessed model parameters in side-by-side comparisons of identical models with HAZ, WAZ, and WHZ scores calculated from WHO and Tsimane Lambda-Mu-Sigma (LMS) growth curves. The Tsimane LMS curves were generated from 30,118 mixed-longitudinal measures from 9,614 individuals, using methods identical to those used in formulating WHO standards, allowing for robust comparisons [12,38]. We observed that coefficient estimates and significance levels generally agreed between WHO and Tsimane-derived WAZ and WHZ models, but differed for specific terms in HAZ models, largely owing to age-related confounding in WHO scores.

Data collection
MM conducted a mixed-longitudinal study of infant feeding practices and maternal and infant health outcomes across nine Tsimane villages from September 2012-April 2013. The villages varied with respect to river access and distance to the market town of San Borja (pop. 24,000). All families with children aged 0-35 months were asked to participate, resulting in a sample of 156 families from 150 households, and representing 92% of all eligible families present. Anthropometric measures were collected in participants' homes during initial ethnographic interviews. A subsample of 41 infants who were less than one year of age at the time of initial interview were recruited for prospective follow-up study, with follow-up measures taken approximately every 6 weeks for the next 8 months. Subjects followed prospectively contributed 2-6 measures total (mean ± SD = 3.2 ± 1.4 measures per subject), with the number of measures varying due to age at entry and intermittent absences. A total of 287 anthropometric measures are included in the final mixed-longitudinal sample (156 from initial interviews, 131 from follow-up). Male infants and remote villages were over-represented in the follow-up group as compared to the cross-sectional only group (Table 1).
Infant recumbent length was measured to the nearest 0.5 cm using a pediatric measure mat. Maternal and child standing heights were measured to the nearest 0.1 cm using a portable Seca 217 stadiometer. All heights were measured in duplicate and averaged in the event of a discrepancy. Maternal and child weights were measured to the nearest 0.1 kg with a digital scale (Tanita BF680W Duo Scale), using the tare method to weigh infants-in-arms. The scale was placed on a small raised wooden platform to minimize measurement error on the uneven surfaces of participant homes. All subjects were weighed fully clothed (Tsimane women typically wear lightweight skirts and tops). Infants and young children are typically dressed in only a tshirt or a t-shirt with lightweight cotton pants (they do not wear diapers). Infants were removed from swaddling materials before measurement. Weight and standing height were measured barefoot.
Current feeding status (exclusive breastfeeding, breastfeeding, weaned) was determined by maternal 24-hour recall reported at all interviews. Age of complementary feeding (CF) introduction was recorded from maternal recall at initial interview (for non-exclusively breastfeeding children) or when a change in feeding status was first reported (for exclusively breastfeeding children in the prospective sample). Birth order, preceding interbirth interval, and number of live siblings under the age of five were determined through maternal interviews and checked against family health records and demographic and census information previously collected by the Tsimane Health and Life History Project [30].

Ethics statement
All study protocols were approved by the University of California Santa Barbara Institutional Review Board. Participant consent and approvals to conduct the research in Bolivia were obtained through several channels. The Tsimane Health and Life History Project maintains formal agreements with the local municipal government of San Borja and the Tsimane governing body (Gran Consejo Tsimane') to conduct research with Tsimane communities. MM additionally arranged independent agreements to conduct the present study with the Gran Consejo and leaders of participating study communities. In compliance with national requirements for conducting scientific research in Protected Areas of Bolivia, approval for the study was granted from the Estación Biológica del Beni and the Ministerio del Medio Ambiente y Agua. The purpose of the study was explained to each of the study villages in community meetings held prior to beginning data collection and individually during participant recruitment. Participants gave verbal informed consent before each interview and follow-up visit, as most Tsimane women are illiterate. Mothers gave verbal consent for infant participation. Verbal consent was not recorded. The verbal consent procedure was described in the protocol approved by the Institutional Review Board. Participants were compensated with small care packages that included household goods (e.g. yarn, thread, combs) and over-the-counter medicines (e.g. paracetamol, salve).

Z score calculations and statistical analyses
WHO and Tsimane LMS HAZ, WAZ, and WHZ scores were calculated using the open-source 'localgrowth' R package (https://github.com/adblackwell/localgrowth), which was developed from previously published databases and R code [12,38]. To maintain comparable sample sizes across models, we removed observations from two subjects with unknown previous IBIs, and four observations with missing height or weight measures. The final sample included 152 subjects and 281 mixed-longitudinal measures.
We examined WHO-and Tsimane-derived z-scores in association with several locally relevant static and age-related independent variables expected to influence growth outcomes. Factors expected to be associated with poorer growth outcomes included shorter interbirth intervals and higher parity [39][40][41], remote village residence or rainy season birth [42][43][44][45] greater number of household dependents under the age of five [46], and relatively early complementary feeding introduction (0-3 months) or weaning [47,48] (see S1 Text for extended discussion of selected variables). We first constructed a baseline linear mixed-effects model consisting of child ID as a random effect, and child sex, age (in months), and maternal height as fixed effects. This baseline model was run separately for each of WHO and Tsimane HAZ, WAZ, and WHZ scores, using the full sample of all observations for children aged 0-35 months (n subjects = 152, n observations = 28, see Models 1a-e in S1 Table). The following variables were then considered additively in separate models (Models 2-6a-e in S1 Table): previous IBI, birth order, number of dependents under the age of five, village distance to market, and birth season. Additional models considered age-related feeding practices-duration of exclusive breastfeeding and weaning status-and were run on age-specific subsets: exclusively breastfeeding vs. breastfeeding with complementary feeding (ages 0-5 months, Models 7a-f in S1 Table); complementary feeding introduction at 0-3 vs. 4-6 months (ages 6-35 months, Models 8a-f in S1 Table); breastfeeding vs. weaned (ages 6-35 months, Models 9a-f in S1 Table). Because the aim of this study was to compare statistical relationships between independent variables and WHO-vs. Tsimane-derived z-scores in a variety of plausible models, we did not correct for multiple comparisons or apply model selection criteria to individual models.
All models were run using the lme4 package in R. AIC and BIC were extracted from summary results. Due to differences in sample sizes, AIC values are only comparable for models 1a, 1b -6a, 6b. Wald confidence intervals were extracted from standard errors using built-in commands. Marginal and conditional R 2 values and p-values were estimated using command features of the piecewiseSEM and lmerTest packages. Data

Model comparisons
Standardized beta coefficients estimated separately for Tsimane and WHO z-score derived models generally agreed in the direction, magnitude, and significance of many, but not all independent variables (see Table 3 for a visual summary of model comparisons, and S1 Table for full model results). WHO and Tsimane-derived HAZ scores were positively and similarly associated in magnitude for maternal height, IBI � 33 vs. < 33 months, parity � 7 vs. 2-6 and CF introduction at 0-3 vs. 4-6 months. In neither WHO or Tsimane-derived models were growth outcomes associated with EBF vs. CF status in children 0-5 months, or village region or birth season in the full mixed-age sample. No additive independent variables examined were associated with WHZ in either WHO or Tsimane models (Table 3).
WHO-derived outcomes differed in specific models estimating significant associations with sex (HAZ and WAZ Models 2-3 in S1 Table), primiparous vs. 2-6 births (HAZ Model 3 in S1 Table), number of siblings under age 5 (HAZ and WAZ Model 4 in S1 Table), CF at 0-3 vs. 4-5 months (WAZ Model 8 in S1 Table), and weaned vs. breastfed (HAZ Model 9 in S1 Table). Child age was strongly and negatively associated with WHO but not Tsimane z-scores in all HAZ and WAZ models run on the full, mixed-age sample (Models 1-6 in S1 Table), and with HAZ in models restricted to children ages 6-35 months (Models 8a-b, 9ab in S1 Table). Age was positively associated with WHO and Tsimane-derived HAZ, WAZ, and WHZ among infants 0-5 months, though the association was significant only in Tsimane models. Fig 2 illustrates similarities in Model 8a-b (S1 Table) for HAZ in association with all coefficients except for age. Fig 3 illustrates differences estimated in Model 9a-b (S1 Table) for HAZ in association with age and weaning status Removing age from Model 2a but retaining maternal height, infant sex, and IBI did not change the effect of IBI, but resulted in a much poorer model fit: AIC increased from 788 to 844, while the marginal R 2 decreased from 33% to 10% (Age interactions in S1 Table).
Conversely, removing age from the corresponding Tsimane model resulted in a slightly improved model fit, with no change in the estimated effect of IBI (Age interactions in S1 Table). Similar results were obtained by removing age from models of the effect of age of CF  introduction on HAZ (Age interactions in S1 Table). Although interaction terms for age and CF were not significant, WHO and Tsimane models predicted different effects of CF on HAZ across infant ages (Fig 4). In Tsimane LMS models, children introduced CF earlier are predicted to have consistently higher HAZ scores across ages. At 9, 26, and 36 months, predicted height centiles for female children with mothers of average height and CF at 0-3 vs. 4-6 months were: 64 th , 67 th , 68 th vs. 52 nd , 55 th , 57 th . In contrast, WHO models predicted declining effects of CF on HAZ with age. At 9 and 26 months and holding other model terms constant, heights of children introduced CF at 0-3 vs.  4-6 were predicted at the 35 th and 7 th vs. the 16 th and 2 nd centiles. By 30 months, predicted height centiles converge to zero for both CF groups. Weaning, which was negatively associated with Tsimane-derived HAZ, was associated with WHO HAZ only after removing age from the model (Age interactions in S1 Table). There was a significant interaction between age and weaning status on WHO-derived HAZ (Age interactions in S1 Table), but component terms were highly correlated (variance inflation factors > 20).  Largely owing to differences in predicted age and sex effects, marginal R 2 values (total model variance explained by fixed effects) were consistently larger for WHO as compared to Tsimane LMS models. In other words, because the WHO references produced more variance in z-scores, and this variance was due to divergence from the references with sex and age, these two parameters necessarily accounted for a large proportion of the variance when added to the models. For example, fixed effects included in the baseline model (Models 1a-f, which include maternal height, infant sex, and infant age) explained 27%, 15%, and 1% of total variance in WHO-derived HAZ, WAZ, and WHZ scores, but 10%, 6%, and 1% of variance in corresponding Tsimane LMS scores (S1 Table). After adding IBI (Models 2a-f in S1 Table), marginal R 2 values increased to 33%, 20%, and 2% in, respectively, WHO HAZ, WAZ, and WHZ models, and to 15%, 10%, and 2% in corresponding Tsimane models.

Discussion
Accurate identification of local factors associated with child growth variance is relevant to anthropological research on child health disparities and fitness attributes [20], as well as for the development of population-specific interventions to improve child health [26].We examined if the use of WHO standards vs. robust within-population references differently influences growth determinant estimates in a mixed-age sample of indigenous Tsimane children. The standardized coefficients estimated by WHO and Tsimane models agreed across many models, however specific models differed in estimates of total variance explained, as well as the magnitude and significance of some coefficients (Table 3, S1 Table).
Particularly for HAZ, differences in model outcomes largely reflect greater age-related variance in WHO as compared to Tsimane-derived z-scores. These results are not wholly unexpected given the respective reference populations [4], and the fact that Tsimane LMS scores are pre-adjusted for within-population age and sex specific growth. In previous side-by-side comparisons, Tsimane mean 50 th centile values for height were equivalent to WHO 10 th centiles up through age two. Tsimane and WHO height velocity curves were similar until about three months of age, but then fall between WHO 10 th and 35 th percentiles up through age two [12]. Similarly, in this study, WHO and Tsimane HAZ scores did not substantively vary before 6 months of age (Models 7a-b in S1 Table), but age was consistently and negatively associated with WHO HAZ in mixed-age samples (Models 1-6 and 8-9a-b in S1 Table). The patterns observed here and previously may reflect the growth-inhibiting effects of infectious diseases and associated inflammatory responses generally during critical growth periods in infancy and early childhood [49][50][51].
Age-related increases in growth faltering relative to the WHO standards have been widely observed across low-and middle-income populations, reflecting systemically poorer nutritional and pathogenic conditions that critically influence growth prenatally and during the first 1000 days of life [21][22][23][24]. Knowing this, researchers may control for age or age group in regression models investigating growth determinants in mixed-age samples [28,29]. However, our results suggest additional methodological approaches may be warranted. For example, Tsimane children who are weaned are smaller than those who are still breastfeeding, though this relationship is only apparent in Tsimane LMS models. The main factors influencing weaning likelihood among the Tsimane are advancing child age and subsequent maternal pregnancyin itself a factor influenced by the time elapsed since birth [47,52]. Removing age from the Tsimane model weakens the association between HAZ and weaning but slightly improves model fit, while removing weaning does not affect the age term (Age interactions in S1 Table). In contrast, adjusting for age confounds the relationship between weaning and HAZ in the WHO models, but removing age results in a substantially poorer model fit (Age interactions in S1  Fig A in S1 Fig). Such a difference could influence modeling decisions or inferences in studies in which age-related confounding was not specifically being considered.
Age and age-related variables may be irrevocably confounded in mixed-age samples, or require larger sample sizes, more complex non-linear age terms, or more discrete variable measures to assess interactions or lagged effects. We also suggest that while greater variability generally reduces statistical power, associations for some variables-as appears to be the case for siblings under the age of five-may be disproportionately influenced by more extreme measures resulting from negative skew in WHO HAZ distributions (Fig B in S1 Fig). Studies with small sample sizes may be particularly susceptible to biasing due to extreme measures and age-confounding.
The two variables most robustly associated with growth outcomes in both WHO and Tsimane models-relatively longer IBIs and relatively earlier CF-merit additional brief commentary. The association between shorter prior IBI and lower HAZ is consistent with previous research linking shorter prior IBI to poorer child nutritional status and greater morbidity and mortality risks [40,53]. Future research in this population should consider if and how IBI correlates with risks at specific stages--e.g. reflecting a relationship between restricted fetal growth and size in early infancy, or sibling competition affecting growth at later stages.
We had hypothesized that early CF would be associated with poorer growth outcomes, due to likely increased pathogen exposure and reduced protective immunity and nutritional buffering from breastmilk [54][55][56][57][58][59]. However, CF relative to EBF was not associated with any growth outcomes in Tsimane infants aged 0-5 months, while earlier CF introduction was associated with greater mean HAZ in children 6-35 months of age (Model 8a-b in S1 Table). These results were robust to both WHO and Tsimane LMS models, suggesting that systemic growth faltering after 6 months in this population should not be attributed to the transition from EBF to CF. The relationship between earlier CF introduction and better growth outcomes may reflect reverse causality as has been observed elsewhere-i.e. with faster growing infants introduced complementary feeding earlier [60][61][62][63]. Continued intensive breastfeeding among the Tsimane may be sufficient to buffer additional infectious risks, or the quantity of foods and liquids given may be negligible enough to supplement without supplanting breast milk intake [47]. Additional longitudinal measures of infant size before and after CF introduction are needed to better assess the relative costs and benefits of early CF in this population.
These and other model associations should be cautiously interpreted, however, as our main objective was to assess differences in model estimates using WHO standards vs. Tsimane LMS references. We did not consider different modeling methods [4,64], adjust for multiple comparisons [65], or perform any variable selection procedures to determine the best approximating models [66]. We also stress that the differences in coefficients and significance estimates between WHO vs. Tsimane models were generally modest, and may be further minimized by a larger sample size or adjusted p-values. Our model comparisons must be reproduced in other populations to corroborate our conclusions regarding statistical inferences.
If our results are substantiated, the methodological implications may be particularly relevant for researchers working in small-scale populations and interested in fitness-relevant or modifiable environmental factors that influence growth outcomes. We have observed that variance in Tsimane children's WHO HAZ scores, which ultimately reflect differences in growth relative to an optimal potential, does not scale linearly with variance in stature in their own community. Population-wide genetic and environmental factors that influence systemic deviation from WHO standards may have little bearing on within-population variation in biological fitness or health outcomes [20]. As a result, variation in factors associated with differences in WHO-derived but not within-population-derived growth outcomes may have little biological relevance locally. As another example, WHO WAZ was lower in Tsimane males, which could be interpreted as reflecting sex-biased parental investment. However, growth curves for male and female Tsimane children do not substantially differ before 5 years of age [12], and sex differences were not apparent in Tsimane models. Tsimane boys may deviate from WHO standards to a greater extent than do girls for a variety of reasons, but sex-biasing in local behavioral or other growth determinants would only be inferred if sex had been a predictive factor of Tsimane WAZ scores.
If the research objective is to assess local determinants of variability in growth outcomes, longitudinal changes in age-specific cohorts or within-population growth references should ideally be used. Of course, adequately powered longitudinal studies are often not logistically feasible, and robust, up-to-date references are not available for many populations. Hermanussen and colleagues have devised a method for generating synthetic LMS growth reference charts for any population using limited sets of mean measurements [67]. Another approach might be to integrate the modelling of local growth curves and predictors of growth into a single model. This should ideally be more sophisticated that simply modelling, for example, height as a dependent variable with a linear age term in the model, since growth is rarely linear; rather nonlinear fits (i.e. using GAMLSS) should be used. Bayesian approaches (i.e. STAN/ brms, MCMCglmm) might also be useful for comparing or model averaging the posterior parameter estimates obtained with different growth modelling procedures.
Additional methods can also be employed to better estimate and substantiate the biological relevance of WHO-derived growth outcomes. For example, WHO HAZ scores and their determinants may be underestimated in statistical models run on mixed age samples (0-59 months), as the standard deviations used to derive them increase with age [25] and infants aged 0-23 months have been only partially exposed to harmful or protective environmental factors that cumulatively influence growth [26]. These biases may be avoided by analyzing absolute height-for-age differences (i.e. difference between observed height and WHO reference value) [25] and running separate multiple regressions for children 0-23 and 24-59 months of age [26]. In our study, WHO WHZ scores were approximately normally distributed and produced more concurrent results between WHO and Tsimane models. WHZ may be a less biased proxy of within-population variability in size, though there we observed very little variability in WHZ in our study. Finally, researchers can emphasize the biological relevance, rather than statistical significance of, observed variation in WHO z-scores by a priori establishing clinically or epidemiologically relevant thresholds for interpreting coefficient estimates of WHO-or within-population-derived z-scores [20].
In closing, we stress that the WHO growth standards remain ideal for between-population comparisons and assessment of large-scale health interventions or secular trends that may impact the prevalence of poor or excess nutrition within specific populations. We do not dispute that Tsimane children's smaller sizes relative to those of children living under more optimal conditions are in large part due to modifiable environmental conditions that severely impact their health [68,69]. However, these conditions are experienced similarly across ages and families. Individual, maternal, and household factors likely exert more influence on local variation in growth outcomes than these systemic conditions. Age-related skew in WHO HAZ scores may hinder accurate identification of these factors and the magnitude of their effects. For future studies with this population, Tsimane LMS z-scores will likely be more accurate than WHO z-scores in identifying local growth determinants and individuals with locally aberrant growth patterns [12]. Researchers working with other small-scale populations may consider other approximate growth references or additional methodological steps when examining the influence of local growth determinants.