Enteropathogen seroepidemiology among children in low-resource settings

Little is known about enteropathogen seroepidemiology among children in low-resource settings. We measured serological IgG response to eight enteropathogens (Giardia intestinalis, Cryptosporidium parvum, Entamoeba histolytica, Salmonella enterica, enterotoxigenic Escherichia coli, Vibrio cholerae, Campylobacter jejuni, norovirus) using multiplex bead assays in cohorts from Haiti, Kenya, and Tanzania. By age 2 years, most children had evidence of exposure by IgG response to the pathogens studied. We discovered a shift in IgG distributions for many pathogens as children age, caused by boosting and waning from repeated exposures, which complicates interpretation of seroprevalence among older children. Longitudinal profiles revealed important variation in enteropathogen IgG response above seropositivity cutoffs, underscoring the importance of longitudinal designs to estimate seroincidence rates as a measure of force of infection. In longitudinal cohorts there was a linear relationship between seroprevalence and prospective seroincidence rates, suggesting the two measures provide similar information about variation in pathogen transmission.


Introduction
A broad set of viral, bacterial, and parasitic enteropathogens are leading causes of the global infectious disease burden, with the highest burden among young children living in lower income countries [1] . Infections that result in acute diarrhea and related child deaths drive disease burden estimates attributed to enteropathogens, but recent advances in molecular assays have shown that asymptomatic infections are extremely common [2,3] . Understanding the full picture of enteropathogen exposure and infection in populations is important to assess transmission, disease burden, naturally acquired immunoprotection, as well as to design public health prevention measures, and measure intervention effects.
One approach to developing a higher-resolution picture of exposure to enteropathogens is to measure antibody responses in saliva or serum [4][5][6][7]. Since many enteropathogen infections are asymptomatic, antibody-based measures provide a more complete assessment of population exposure compared with clinical surveillance. For example, antibody-based estimates of incident exposure to Salmonella enterica and Campylobacter jejuni were 2-6 orders of magnitude higher than standard case-based surveillance in European populations [8][9][10][11], and a recent study of Salmonella enterica serotype Typhi in Fiji found similarly high discordance between antibody-based cumulative incidence and case-based surveillance [12]. Beyond surveillance, studies have started to use enteropathogen-specific antibody responses to measure the effects of household- and school-based water, sanitation, and handwashing interventions [13][14][15] and to examine heterogeneity in exposure across environmental conditions [16][17][18][19].
Many enteropathogens elicit a transiently high antibody response after infection that wanes over time. In lower transmission settings where antibody responses could be monitored longitudinally after distinct infections, Salmonella enterica , Campylobacter jejuni , Cryptosporidium parvum, and Giardia intestinalis (syn. Giardia lamblia , Giardia duodenalis ) immunoglobulin G (IgG) levels in blood have been shown to wane over a period of months since infection; IgM and IgA levels decline even more quickly [10,[20][21][22][23][24] . Using information about a population's mean antibody decay profile from the time of infection has inspired methods to estimate seroincidence rates from cross-sectional studies in low transmission settings [10,25] . As examples in this study will illustrate, in higher transmission settings frequent exposure and antibody boosting prevails, which could favor longitudinal designs to estimate seroincidence.
To our knowledge, there has been no detailed characterization of enteropathogen seroepidemiology among children in low-resource settings where transmission is intense beginning early in life [3] . Drawing on cohorts of young children in Haiti, Tanzania, and Kenya that measured serological antibody response to eight enteropathogens using multiplex bead assays [6,7,26,27] , our objectives were to identify general seroepidemiologic patterns across pathogens and populations and to evaluate methods that estimate epidemiologic parameters of interest, such as the force of infection. Our results provide new insights into the seroepidemiology of enteropathogens among children living in low-resource settings, provide guidance for future study design and analysis, and highlight key remaining knowledge gaps.

Study populations
The analysis included measurements from cohorts in Haiti, Kenya, and Tanzania. Blood specimens were tested for IgG levels to eight enteropathogens using a multiplex bead assay on the Luminex platform (Table 1). The Haitian cohort included repeated measurements among children enrolled in a study of lymphatic filariasis transmission in Leogane from 1990-1999. Leogane is a coastal agricultural community west of Port au Prince. At the time of the study its population was approximately 15,000; most homes had no electricity and none had running water. In total, the Haiti study tested 771 finger prick blood specimens collected from 142 children ages birth to 11 years old, with each measurement typically separated by one year (median measurements per child: 5; range: 2 to 9). In Kenya, a 2013 prospective trial of locally-produced, in-home ceramic water filters enrolled 240 children in a serological substudy [26]. Study participants were identified through the Asembo Health and Demographic Surveillance System, which is located in a rural part of Siaya County, western Kenya along the shore of Lake Victoria. Only 29% of the population had piped drinking water (public taps), water source contamination with E. coli prevailed (93% of samples tested), and the average age at which children began consuming water was 4 months [26]. Children aged 4 to 10 months provided blood specimens at enrollment, and again 6 to 7 months later (n=199 children measured longitudinally). In Tanzania, 96 independent clusters across 8 trachoma-endemic villages in the Kongwa region were enrolled in a randomized trial to study the effects of annual azithromycin distribution on Chlamydia trachomatis infection [27]. The population is very rural, and water is scarce in the region: at enrollment, 69% of participants reported their primary drinking water source, typically an unprotected spring, was >30 minutes walk one-way.
From 2012 to 2015, the Tanzania study tested dried blood spots from between 902 and 1,577 children ages 1-9 years old in annual cross-sectional surveys (total measurements: 4,989). Although children could have been measured repeatedly over the four-year study in Tanzania, they were not tracked longitudinally. There was no evidence that the Kenya and Tanzania interventions reduced enteropathogen antibody response ( Supplementary Information File 1 ), so this analysis pooled measurements over arms in each population.

Age-dependent shifts in population antibody distributions
We estimated seropositivity cutoffs using receiver operator characteristic (ROC) curve analyses for Giardia, Cryptosporidium, and Entamoeba histolytica including a panel of external, known positive and negative specimens, as previously reported [6,26] . We used two within-sample approaches to estimate seropositivity cutoffs for pathogens without ROC-based cutoffs: we used Gaussian mixture models [30] fit to measurements among children ages 0-1 year old (to ensure a sufficient number of unexposed children), and we used presumed seronegative distributions among children who experienced large increases in antibody levels (details in Methods). Classification agreement was high between the different estimation approaches (agreement >95% for most comparisons; Supplementary Information File 2 ). Mixture models failed to estimate realistic cutoff values if there was an insufficient number of unexposed children, which was the case for enterotoxigenic Escherichia coli heat labile toxin β subunit (ETEC LT β subunit) and Vibrio cholerae toxin β subunit in all cohorts, and for nearly all antigens in Tanzania where the study did not enroll children <1 year old ( Table 1 ).
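The mixture-model approach described above can be illustrated with a minimal sketch. It fits a two-component Gaussian mixture to log10 IgG values by expectation-maximization and derives a cutoff from the lower (presumed seronegative) component. The cutoff rule used here (mean plus 3 standard deviations of the lower component), the function names, and the simulated values are all assumptions for illustration; the study's exact specification is given in its Methods and is not reproduced here.

```python
import math
import random

def normal_pdf(x, mu, sd):
    """Density of a normal distribution with mean mu and standard deviation sd."""
    z = (x - mu) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2.0 * math.pi))

def fit_two_gaussians(values, n_iter=200):
    """Fit a two-component Gaussian mixture by EM.

    Returns two (weight, mean, sd) tuples ordered by mean.
    """
    xs = sorted(values)
    half = len(xs) // 2
    # Crude starting values: means of the lower and upper halves of the data.
    mu = [sum(xs[:half]) / half, sum(xs[half:]) / (len(xs) - half)]
    sd = [1.0, 1.0]
    w = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: probability that each value belongs to component 0.
        resp0 = []
        for x in xs:
            p0 = w[0] * normal_pdf(x, mu[0], sd[0])
            p1 = w[1] * normal_pdf(x, mu[1], sd[1])
            resp0.append(p0 / (p0 + p1) if (p0 + p1) > 0 else 0.5)
        # M-step: update weights, means, and SDs from the responsibilities.
        for k in (0, 1):
            r = resp0 if k == 0 else [1.0 - p for p in resp0]
            nk = max(sum(r), 1e-12)
            mu[k] = sum(ri * x for ri, x in zip(r, xs)) / nk
            var = sum(ri * (x - mu[k]) ** 2 for ri, x in zip(r, xs)) / nk
            sd[k] = max(math.sqrt(var), 1e-3)
            w[k] = nk / len(xs)
    return sorted(zip(w, mu, sd), key=lambda comp: comp[1])

# Simulated log10 IgG values for a young cohort (NOT study data):
# 150 presumed unexposed and 100 presumed exposed children.
rng = random.Random(1)
log10_igg = ([rng.gauss(1.0, 0.3) for _ in range(150)]
             + [rng.gauss(3.5, 0.4) for _ in range(100)])

components = fit_two_gaussians(log10_igg)
_, mu_neg, sd_neg = components[0]

# Assumed cutoff rule: mean + 3 SD of the lower component.
cutoff = mu_neg + 3.0 * sd_neg
seropositive = [x > cutoff for x in log10_igg]
```

If the simulated sample contained few or no unexposed children, the lower component would no longer describe a seronegative group and the derived cutoff would be unreliable, which mirrors the failure mode described in the text for ETEC and cholera toxin β subunit.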
Among children <2 years old, antibody levels clearly distinguished subpopulations of seronegative and seropositive, but there were not distinct seronegative and seropositive subpopulations by age 3 years for most pathogens measured in Haiti ( Figure 1 ) and Tanzania ( Figure 1 -supplement 1 ). By age 3 years, the majority of children were seropositive to Cryptosporidium , ETEC LT β subunit, and norovirus GI.4 and GII.4; in all cases antibody distributions were shifted above seropositivity thresholds. In contrast, there was a qualitative change in the antibody response distributions to Giardia , E. histolytica , Salmonella and Campylobacter with increasing age, shifting from a bimodal distribution of seronegative and seropositive groups among children ≤1 year old to a unimodal distribution by age 3 years and older ( Figure 1, Figure 1 -supplement 1 ). A direct comparison of age-dependent shifts in antibody distributions to Giardia VSP-3 antigen and C. trachomatis Pgp3 antigen in Tanzania illustrates stark differences in enteropathogen-generated immune responses versus pathogens like C. trachomatis that elicit a response that consistently differentiates exposed and unexposed subpopulations as children age ( Figure 1 -supplement 2 ). Distributions of IgG levels in the younger Kenyan cohort (ages 4-17 months) showed distinct groups of seropositive and seronegative measurements for most antigens ( Figure 1 -supplement 3 ). IgG responses to ETEC LT β subunit and cholera toxin β subunit were near the maximum of the dynamic range of the assay for nearly all children measured in the three cohorts ( Figure 1 , Figure 1 -supplement 1 , Figure 1 -supplement 3 ), with evidence of waning IgG levels as children aged, presumably from adaptive immunity ( Figure 1 , Figure 1 -supplement 1 ).

Joint variation in antibody response
We examined joint variation in IgG responses to determine whether multiple recombinant antigens for the same pathogen provide additional information for individual-level immune response and to determine whether there was evidence for cross-reactivity between closely related antigens. Responses to Giardia recombinant VSP antigens were strongly correlated in Haiti (Spearman rank: ρ = 0.99), Kenya (ρ = 0.84) and Tanzania (ρ = 0.97), which would be expected due to cross-reactivity in the antigens (Figure 2). Cryptosporidium (Cp17, Cp23) and Campylobacter (p18, p39) antigens were strongly correlated but high within-individual variability suggests that measuring multiple recombinant antigens for these pathogens, derived from different proteins, yields more information about exposure than measuring one alone (Figure 2). High correlation between Salmonella LPS Groups B and D, between norovirus GI.4 and GII.4, and between ETEC and V. cholerae likely reflected antibody cross-reactivity. Correlation could also result from multiple previous infections with different Salmonella serogroups or different norovirus genogroups. We excluded cholera β toxin antibody responses from remaining analyses because of the difficulty of interpreting its epidemiologic measures in light of high levels of cross-reactivity with ETEC heat labile toxin β subunit, since heat labile toxin-producing ETEC is very common among children in low-resource settings [3], and there was no documented transmission of cholera in the study populations during measurement periods. A broader comparison across all antigens in each cohort revealed no other antigen combinations with high correlation (Supplementary Information File 3).

Birth to three years of age: a key window of antibody acquisition and primary seroconversion
IgG levels increased rapidly with age for most pathogens in all three cohorts (Figure 3). There was some evidence of maternally-derived IgG among children under 6 months old with a drop in mean IgG levels by age, but this pattern was only evident for norovirus GI.4 in Haiti (Figure 3) and Cryptosporidium Cp17 and Cp23 in Kenya (Figure 3 -supplement 2). Age-dependent mean IgG responses to many pathogens declined from a young age, presumably from exposure early in life and acquired immunity. Mean IgG levels declined after age 1 year for Giardia in Haiti

Longitudinal antibody profiles
In Haiti and Kenya we examined longitudinal IgG profiles among children and compared them with seropositivity cutoffs. Vaccine immunogenicity and pathogen challenge studies among healthy adults often use a 4-fold increase in antibody levels (difference of +0.6 on the log10 scale) as a criterion for seroconversion [31][32][33]. In Kenya, 4-fold increases in IgG largely coincided with a change in status from seronegative to seropositive, presumably because increases in IgG followed primary exposure in the young cohort ages 4-17 months (Figure 5). Many Kenyan children exhibited >4-fold increases and decreases in IgG response to Campylobacter p18 and p39 antigens above the seropositivity cutoff, a result of earlier primary infection and/or additional infection and boosting during the study period (Figure 5). In the Haitian cohort, which was followed beyond the window of primary exposure, children commonly had >4-fold increases and decreases in IgG while remaining above seropositivity cutoffs, a pattern observed across pathogens but particularly clear for Cryptosporidium (Figure 5 -supplement 1).
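The correspondence between a 4-fold change and a difference of about 0.6 on the log10 scale can be checked directly. The short sketch below uses hypothetical helper names and illustrative values; it shows that the exact threshold is log10(4), roughly 0.602, so the +0.6 quoted in the text is a rounded figure.

```python
import math

# A 4-fold change expressed as a difference on the log10 scale.
FOUR_FOLD_LOG10 = math.log10(4)  # approximately 0.602

def fold_change(log10_before, log10_after):
    """Fold change implied by two measurements on the log10 scale."""
    return 10 ** (log10_after - log10_before)

def is_four_fold_increase(log10_before, log10_after):
    """True if the later measurement is at least 4 times the earlier one."""
    return (log10_after - log10_before) >= FOUR_FOLD_LOG10

# Illustrative values: a rise from 2.0 to 2.7 on the log10 scale
# is about a 5-fold increase; a rise from 2.0 to 2.5 is about 3.2-fold.
example_large = fold_change(2.0, 2.7)
example_small = fold_change(2.0, 2.5)
```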

Force of infection estimated by seroconversion rates
The seroconversion rate, an instantaneous rate of seroconversion among those who are susceptible, is one estimate of a pathogen's force of infection and a fundamental epidemiologic measure of transmission [34] . Longitudinal designs in Haiti and Kenya enabled us to use individual child antibody profiles to estimate average incidence rates of seroconversion and seroreversion during the study. We defined incident seroconversions and seroreversions as a change in IgG across a pathogen's seropositivity cutoff. In a secondary analysis, we defined incident boosting as a ≥4-fold increase in IgG to a final level above a seropositivity cutoff, and incident waning as ≥4-fold decrease in IgG from an initial level above a seropositivity cutoff. The secondary definition captured large changes in IgG above seropositivity cutoffs, which aligned with repeated boosting and waning observed in the Haitian cohort ( Figure 5 -supplement 1 ).
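The two event definitions above can be sketched as simple classification rules applied to consecutive measurements from one child. The function names, the convention that a value strictly above the cutoff counts as seropositive, and the example profile are assumptions for illustration, not the study's code.

```python
import math

FOUR_FOLD = math.log10(4)  # 4-fold change on the log10 scale

def cutoff_event(log10_before, log10_after, cutoff):
    """Primary definition: a change in IgG across the seropositivity cutoff."""
    before_pos = log10_before > cutoff
    after_pos = log10_after > cutoff
    if not before_pos and after_pos:
        return "seroconversion"
    if before_pos and not after_pos:
        return "seroreversion"
    return None

def fold_event(log10_before, log10_after, cutoff):
    """Secondary definition: >=4-fold changes relative to the cutoff."""
    change = log10_after - log10_before
    # Boosting: >=4-fold increase ending above the cutoff.
    if change >= FOUR_FOLD and log10_after > cutoff:
        return "boosting"
    # Waning: >=4-fold decrease starting above the cutoff.
    if change <= -FOUR_FOLD and log10_before > cutoff:
        return "waning"
    return None

# Hypothetical longitudinal profile (log10 IgG) and cutoff for one child.
cutoff = 2.0
profile = [1.2, 2.9, 3.8, 3.0, 3.1]
pairs = list(zip(profile, profile[1:]))

primary = [cutoff_event(b, a, cutoff) for b, a in pairs]
secondary = [fold_event(b, a, cutoff) for b, a in pairs]
```

In this hypothetical profile the primary definition records a single seroconversion, while the secondary definition also picks up a later boost and a wane that occur entirely above the cutoff.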
Seroconversion rates in Kenya ranged from 0.1 seroconversions per child-year for E. histolytica to >3 seroconversions per child-year for ETEC and Campylobacter ( Figure 6) . Highest transmission pathogens (ETEC, Campylobacter ) were dominated by seroconversion, whereas seroreversion rates were higher than seroconversion rates for E. histolytica and Salmonella ( Figure 6 ). The proportion of children who changed serostatus over the 6 month follow-up period reflected the same pattern as seroconversion rates, and exceeded 90% for ETEC and Campylobacter ( Figure 6 ).
In Haiti, seroconversion rates ranged from 0.34 seroconversions per child-year (E. histolytica) to 1.13 seroconversions per child-year (ETEC) (Table 2). Seroincidence rates estimated from 4-fold changes in IgG led to more events identified and higher rates compared with those estimated from seroconversion alone (Table 2). For example, Cryptosporidium incident cases increased from 70 to 190 (a 2.7 fold increase) and the average rate increased from 0.64 (95% CI: 0.54, 0.77) to 0.80 (95% CI: 0.70, 0.90) per child-year (a 1.3 fold increase) when using a 4-fold IgG change criterion because of substantial IgG boosting and waning above the seropositivity cutoff (Figure 5 -supplement 1).
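The arithmetic behind the Cryptosporidium example can be laid out as a short worked calculation. An incidence rate is the number of events divided by child-years at risk, so the reported event counts and rates imply approximate denominators. The child-year figures below are back-calculated from the rounded values quoted in the text and are therefore approximations, not values reported by the study; the function names are hypothetical.

```python
def incidence_rate(n_events, child_years_at_risk):
    """Average incidence rate: events per child-year at risk."""
    return n_events / child_years_at_risk

def implied_child_years(n_events, rate):
    """Child-years at risk implied by an event count and a rate."""
    return n_events / rate

# Values quoted in the text for Cryptosporidium in Haiti.
events_cutoff, rate_cutoff = 70, 0.64   # seroconversion across the cutoff
events_fold, rate_fold = 190, 0.80      # 4-fold change criterion

event_ratio = events_fold / events_cutoff  # about 2.7
rate_ratio = rate_fold / rate_cutoff       # 1.25 from the rounded rates

# Approximate denominators implied by the rounded published values.
years_cutoff = implied_child_years(events_cutoff, rate_cutoff)
years_fold = implied_child_years(events_fold, rate_fold)
```

The rate rises far less than the event count because the time at risk also grows under the 4-fold criterion: children already above the cutoff remain at risk of a further boost. The ratio of the rounded rates is 1.25; the 1.3 fold increase stated in the text was presumably computed from unrounded rates.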
There was an approximately linear relationship between seroprevalence and prospectively estimated seroconversion rates in Kenya and Haiti ( Figure 7 ). Overall levels and slope of the relationship differed considerably between cohorts, presumably because Kenya measurements were within a window of primary exposure for most children (4-17 months) whereas Haiti measurements extended from birth to 11 years and captured lower incidence periods with overall higher seroprevalence as children aged.
Since seroprevalence and seroconversion rates appeared to embed similar information about relative strength of enteropathogen transmission, we attempted to estimate seroconversion rates directly from age-structured seroprevalence in Kenya. We estimated seroconversion rates from seroprevalence curves using methods developed for cross-sectional ("current status") data, a common approach in serosurveillance of vaccine preventable diseases [34] , malaria [35] , and dengue [36] ( Supplementary Information File 4 ). Seroconversion rates estimated from cross-sectional, semiparametric spline models were similar to rates estimated prospectively for all pathogens, but had substantially wider confidence intervals owing to loss of information by ignoring the longitudinal data structure ( Figure 6 -supplement 1 ). Parametric approaches including an exponential proportional hazards model [37] and a reversible catalytic model [35] , which both assumed a constant seroconversion rate over the age range, had narrower confidence intervals than the semiparametric spline model but tended to underestimate seroconversion rates compared with longitudinal estimates ( Figure 6 -supplement 1 ).
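To make the parametric current-status approach concrete, the sketch below writes out the standard textbook forms of a constant-rate (exponential) model and a reversible catalytic model for age-specific seroprevalence, then estimates a constant seroconversion rate from hypothetical cross-sectional counts by maximizing a binomial likelihood over a grid. The parameterization, the grid search, and the data are assumptions for illustration; the study's own models are described in Supplementary Information File 4.

```python
import math

def seroprev_constant_rate(age, lam):
    """Seroprevalence at a given age under a constant seroconversion
    rate lam with no seroreversion: P(a) = 1 - exp(-lam * a)."""
    return 1.0 - math.exp(-lam * age)

def seroprev_reversible(age, lam, rho):
    """Reversible catalytic model with seroconversion rate lam and
    seroreversion rate rho (requires lam + rho > 0):
    P(a) = lam / (lam + rho) * (1 - exp(-(lam + rho) * a))."""
    total = lam + rho
    return (lam / total) * (1.0 - math.exp(-total * age))

def log_likelihood(lam, data):
    """Binomial log-likelihood of current-status data under the
    constant-rate model. data: (age_years, n_seropositive, n_tested)."""
    ll = 0.0
    for age, n_pos, n_total in data:
        p = min(max(seroprev_constant_rate(age, lam), 1e-12), 1.0 - 1e-12)
        ll += n_pos * math.log(p) + (n_total - n_pos) * math.log(1.0 - p)
    return ll

# Hypothetical cross-sectional data (NOT study data).
data = [(0.4, 11, 50), (0.8, 19, 50), (1.2, 26, 50)]

# Simple grid search for the maximum likelihood estimate of lam
# (seroconversions per child-year).
grid = [i / 1000 for i in range(1, 5001)]
lam_hat = max(grid, key=lambda lam: log_likelihood(lam, data))
```

Under the reversible model seroprevalence plateaus at lam / (lam + rho) rather than approaching 1, so a model that ignores seroreversion will tend to underestimate the seroconversion rate when antibody waning is common.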

Discussion
Across cohorts in Haiti, Kenya, and Tanzania, we identified consistent patterns in IgG responses that provide new insights into enteropathogen seroepidemiology among children in low-resource settings. We observed the most population-level heterogeneity in IgG levels and seroconversion between birth and 3 years, reflecting high transmission and early life primary exposure. In older children, IgG levels either remained uniformly high, or began to wane through adaptive immunity. For particularly high transmission pathogens (e.g., ETEC, Campylobacter ), most variation in IgG levels was observed among children <1 year old. A novel discovery was that for Giardia, Cryptosporidium, E. histolytica, and Campylobacter population antibody level distributions shifted with age: distributions among children <1 year were generally bimodal and reflected seronegative and seropositive subpopulations, but by age 3 years were unimodal and not centered clearly above or below seropositivity cutoffs. The shift of IgG distributions from bimodal to unimodal, resulting from a combination of antibody boosting, waning and adaptive immunity, complicates the interpretation of seroprevalence at older ages: among older children a seronegative response could either mean the children were never exposed or they were previously exposed but antibody levels waned below seropositivity cutoffs. Our findings suggest that among children in low-resource settings, seroprevalence and other derived epidemiologic parameters like seroconversion rates are most clearly interpretable when estimated within a window of primary infection, and the window likely closes by age 2-3 years. 
High levels of between-individual variability in antibody response identified through longitudinal profiles in Haiti and Kenya, with significant boosting and waning of antibody levels above seropositivity cutoffs, reinforce the value of longitudinal designs to derive antibody distributions among unexposed and to estimate seroincidence rates.
Measuring heterogeneity in exposure with mean IgG levels avoids the need to define seropositivity cutoffs and the difficulty of shifting interpretations of seroprevalence as children age, which are both germane considerations for enteropathogens. For comparisons within-study and within-pathogen, such as endpoints in randomized trials, comparisons based on quantitative IgG response will have high internal validity and require almost no assumptions [7] . Yet, the absence of global reference standards to translate arbitrary units of an assay into antibody titers can complicate comparisons across assays or studies; moreover, quantitative responses provide only indirect information about actual transmission in a population, which prevents direct comparisons between pathogens. Classifying responses as seronegative and seropositive to estimate seroprevalence and seroconversion rates resolves these limitations and extends inference to more epidemiologically interpretable parameters, such as force of infection.
The step of defining seropositivity cutoffs presents challenges for enteropathogens, but our findings provide guidance for future studies. Our results demonstrate that within-sample approaches to estimate seropositivity cutoffs are feasible for enteropathogens, but there are important considerations that must be addressed through design and analysis. Two-component mixture models fit the data and provided reasonable cutoff estimates only when restricted to an age range that included two clearly delineated subpopulations of seronegative and seropositive responses. For most pathogens studied, this required measurements among children <1 year old, an age range during which IgG responses still followed a bimodal distribution. The Tanzania study enrolled children 1 to 9 years old, and it was impossible to characterize the distribution of unexposed IgG levels with a mixture model for any enteropathogens except Giardia and E. histolytica (Figure 1 -supplement 1). Even in the Haiti and Kenya cohorts, which measured children <1 year, the only reliable approach to estimate seropositivity cutoffs for the highest transmission pathogens like ETEC and Campylobacter was to estimate a distribution among presumed unexposed by identifying measurements among children who subsequently experienced a large increase in IgG (Figure 1, Figure 1 -supplement 3), a strategy only possible in a longitudinal design. High levels of agreement (>95%) across multiple cutoff approaches, consistent with findings from trachoma serology [38], support a pragmatic approach that could include multiple strategies within the same study, depending on data availability and on the empirical distribution of IgG response. Measuring a sufficient number of young children before primary infection, preferably with longitudinal measurements, will help ensure that within-sample seropositivity cutoff estimation is possible.
The instantaneous seroconversion rate is considered a more direct measure of transmission than seroprevalence and is widely used as one measure of a pathogen's force of infection [34]. Pathogens with the fastest rising IgG and seroprevalence curves (e.g., ETEC, Campylobacter; Figure 3, Figure 4) also had the highest seroconversion rates measured prospectively over the study period. Moreover, average seroprevalence, equal to the area under the age-seroprevalence curve [7], was linearly related to seroconversion rates in both Haiti and Kenya (Figure 7). These results suggest that seroprevalence will likely be rank-preserving for comparing relative pathogen transmission if measured over an age range with increasing seroprevalence. Yet, differences in the slope of the relationship between seroprevalence and seroconversion rates across countries suggest that the mapping between epidemiologic measures is context dependent, and in this study was likely influenced by differences in age and rates of seroreversion across countries. Finally, our comparison of seroconversion rate estimates in the Kenya cohort derived from age-structured seroprevalence that ignored the longitudinal data structure showed that although the estimates were broadly comparable, cross-sectional methods were generally more variable, biased, or both compared with longitudinal estimates (Figure 6 -supplement 1). If enteropathogen seroincidence rates (force of infection) are a key parameter of interest, our results support the use of longitudinal designs that are more efficient and can estimate seroreversion rates, which were substantial for many enteropathogens (Figure 6, Table 2); cross-sectional measurements technically contain no information about seroreversion (Supplementary Information File 4), which poses difficulties for enteropathogens.
Since many enteric infections are subclinical [2,3], differences between pathogens in seroprevalence or seroconversion rates would not necessarily translate to differences in disease burden. For example, about 60% of ETEC strains globally produce heat labile toxin (LT), but of those only 33% also produce heat stable toxin [39], which are the strains thought to contribute more to severe, acute diarrheal illness [3]. Higher quantities of enteropathogen genetic material in stool have been shown to distinguish clinical disease from asymptomatic infection [40,41], but it is unlikely that serum IgG levels could be used in a similar way because of their potential role in protective immunity.
High correlation between Salmonella LPS Groups B and D, between ETEC LT β subunit and cholera toxin β subunit, and between norovirus GI and norovirus GII at the individual level ( Figure 2 ) was consistent with cross-reactivity, and suggests that antibody-based measures of exposure will be less specific than molecular assays for these enteropathogens [40] . Salmonella LPS Groups B and D have antigenic overlap in their oligosaccharides [42] , and ETEC LT β subunit and cholera toxin β subunit are known to be immunologically cross-reactive [43] . There was no known cholera transmission in the study populations so we assumed the elevated responses to the cholera toxin β subunit reflected exposure to LT-producing ETEC, but without confirmed measures of infection that remains an assumption. Norovirus GI and GII virus-like particles are antigenically different, but cross-reactivity between norovirus genogroups is possible [44] . In addition, a recent study from Uganda found high seroprevalence for both GI and GII norovirus suggesting repeated infections by viruses of the same or different genogroup which potentially may boost cross-reactive antibodies [45] .
This study had limitations. First, when measurement timing is independent from known infections, measurements will typically fall at intermediate points in the IgG boosting and waning process, and will thus be more variable than if they were carefully timed to precede and follow infection. Second, the Haiti and Tanzania cohorts were not originally designed to assess enteropathogens, and antibody measurements were not paired with measures of clinical symptoms of diarrhea or with measures of patent infection in stool. In Kenya, stool testing for pathogens was limited and low prevalence prevented detailed comparisons between serological and molecular results: for example, only 14 children were positive by PCR to Cryptosporidium [26]. Third, without definitive measures of infection we could only infer differences in transmission through antibody-based measures of exposure. Analyses of Plasmodium falciparum [46,47] and dengue virus [48] illustrate how paired, longitudinal measurements of patent infection and antibody response enable richer characterizations of antibody dynamics and more comprehensive assessments of infectious disease transmission. High-resolution, longitudinal assessment of paired enteropathogen infection and antibody measurements among children could provide valuable, additional insights into the pathogen-specific antibody kinetics following primary and secondary infections. Such studies could also assess whether approaches to estimate enteropathogen incidence from cross-sectional samples, which rely on estimates of antibody decay with time since infection [10,25], could be used among children in low-resource settings.

Conclusions
Among children in Haiti, Kenya, and Tanzania, antibody-based measures of enteropathogen exposure reflected high transmission with primary exposure to most pathogens occurring by age 1-2 years. For many enteropathogens, we observed a consistent shift in population-level IgG distributions from bimodal to unimodal between birth and age 3 years, a result of antibody boosting and waning from repeated exposure. This shift in distribution complicates the interpretation of seroprevalence among children beyond the window of primary exposure, and seroincidence rates must account for IgG boosting and waning above seropositivity cutoffs.
Antibodies are a promising approach to measure population-level enteropathogen exposure, and seroepidemiologic measures of heterogeneity and transmission are central considerations for their use in trials or in serologic surveillance. Our findings show that for most enteropathogens, the ideal window to measure heterogeneity in antibody response closes by age two years in low-resource settings, and studies that plan to estimate seroincidence rates (force of infection) should favor longitudinal designs with multiple measurements in this early age window.

Ethics statement
In Haiti, the human subjects protocol was reviewed and approved by the Ethical Committee of St. Croix Hospital (Leogane, Haiti) and the institutional review board at the US Centers for Disease Control and Prevention (CDC). After listening to an overview of the study, individuals were asked for verbal consent to participate. Verbal consent was deemed appropriate by both review boards because of low literacy rates in the study population. With each longitudinal visit, the study team re-consented participants before specimen collection. Mothers provided consent for children under 7, and children 7 years and older provided additional verbal assent. In Kenya, the human subjects protocol was reviewed and approved by institutional review boards at the Kenya Medical Research Institute (KEMRI) and at the US CDC. Primary caretakers provided written informed consent for their infant child's participation in the trial and blood specimen collection and testing [26] . The original trial was registered at clinicaltrials.org (NCT01695304). In Tanzania, the human subjects protocol was reviewed and approved by the Institute for Medical Research Ethical Review Committee in Dar es Salaam, Tanzania and the institutional review board at the US CDC. Parents of enrolled children provided consent, and children 7 years and older also provided verbal assent before specimen collection.

Multiplex bead assays
Sera from the Haiti cohort study were diluted 1:400 and analyzed by multiplex bead assay as described in detail elsewhere [6,7,29]. Lipopolysaccharides (LPS) from Group D Salmonella enterica serotype Enteritidis, LPS from Group B S. enterica serotype Typhimurium, and recombinant heat labile toxin β subunit protein from enterotoxigenic Escherichia coli (ETEC LT β subunit) were purchased from Sigma Chemical (St. Louis, MO). Recombinant Campylobacter p18 and p39 antigens [49] were expressed and purified as previously described [14]. Virus-like particles from norovirus GI.4 and GII.4 New Orleans were purified from a recombinant baculovirus expression system [7,50]. These antigens were coupled to SeroMap (Luminex Corp, Austin, TX) beads in buffer containing 0.85% NaCl and 10 mM Na2HPO4 at pH 7.2 (PBS) using 120 micrograms for 1.25 x 10^7 beads using the methods described by Moss and colleagues [51]. Coupling conditions and externally defined cutoff values for the Giardia, Cryptosporidium, and E. histolytica antigens as well as for the Schistosoma japonicum glutathione-S-transferase (GST) negative control protein have been previously reported [6].
For the Kenya cohort study, an optimized bead coupling technique using less total protein was performed in buffer containing 0.85% NaCl and 25 mM 2-(N-morpholino)-ethanesulfonic acid at pH 5.0. The β subunit protein from cholera toxin was purchased from Sigma Chemical. The GST negative control protein (15 μg), Cryptosporidium Cp17 (6.8 μg) and Cp23 (12.5 μg) proteins and the Campylobacter p39 protein (25 μg) were coupled to 1.25 x 10^7 beads using the indicated protein amounts [14]. The Giardia, E. histolytica, ETEC, cholera, and Campylobacter p18 proteins were coupled using 30 μg of protein per 1.25 x 10^7 beads. Salmonella LPS B and LPS D were coupled to the same number of beads using 60 μg and 120 μg, respectively. Blood spot elutions from the Kenya study were diluted to a final serum concentration of 1:400 (assuming 50% hematocrit) and analyzed by multiplex bead assay as described by Morris and colleagues [26].
For the Tanzania cohort study, the same conditions described in the Kenya cohort study were used to couple antigens from Giardia, Cryptosporidium, E. histolytica, ETEC β toxin subunit, cholera β toxin subunit, GST, and Salmonella LPS group B and LPS group D. Campylobacter p39 and p18 were both coupled at 25 μg per 1.25 x 10^7 beads in buffer containing 0.85% NaCl and 25 mM 2-(N-morpholino)-ethanesulfonic acid at pH 5.0. Dried blood spots were eluted in the casein-based buffer described previously [24] and samples were diluted to either 1) 1:400 serum dilution with 50 µl run per well for year 1 or 2) 1:320 serum dilution with 40 µl run per well for years 2-4. The incubation steps, washes, and data collection methods used in the multiplex bead assay were performed as described previously [24]. All samples were run in duplicate, and the average median fluorescence intensity minus background (MFI-bg) value was recorded. The Tanzania study used different bead lots in year 1 and years 2-4; we confirmed that the use of different bead lots had no influence on the results (Supplementary Information File 1).

Antibody distributions and determination of seropositivity
In all analyses, we transformed IgG levels to the log10 scale because the distributions were highly skewed. Means of the log10-transformed data correspond to geometric means on the original scale. We summarized the distribution of log10 IgG response using kernel density smoothers. In the Tanzania and Haiti cohorts, where children were measured across a broad age range, we stratified IgG distributions by each year of age <3 years to examine age-dependent changes in the population distributions. To assess potential cross-reactivity between antigens, we estimated pairwise correlations between individual-level measurements in each cohort using a Spearman rank correlation [52] and visualized the relationship for each pairwise combination with locally weighted regression fits [53].
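As an illustration of the summaries described above, the following is a minimal sketch in Python (the original analyses were run in R, so this is not the authors' code). The function names are hypothetical, and the sketch assumes MFI-bg values are strictly positive so that the log10 transform is defined.

```python
import math

def geometric_mean(mfi_values):
    """Back-transform the mean of log10 values to the original MFI-bg scale."""
    logs = [math.log10(v) for v in mfi_values]  # assumes every value is > 0
    return 10 ** (sum(logs) / len(logs))

def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because the Spearman correlation depends only on ranks, it gives the same value whether it is computed on raw or log10-transformed MFI-bg.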
We compared three approaches to estimate seropositivity cutoffs. Approach 1: External known positive and negative specimens were used to determine seropositivity cutoffs for Giardia VSP-3 and VSP-5 antigens, Cryptosporidium Cp17 and Cp23 antigens, and E. histolytica LecA antigen. Cutoffs were determined using ROC analysis as previously described [6,26] for all antigens except for LecA, VSP-3, and VSP-5 in Haiti; in these cases, the mean plus 3 standard deviations of 65 specimens from citizens of the USA with no history of foreign travel were used to estimate cutoffs [6]. Approach 2: We fit a 2-component, finite Gaussian mixture model [30] to the antibody distributions among children 0-1 years old, and estimated seropositivity cutoffs using the lower component's mean plus three standard deviations. We restricted the mixture model estimation in Haiti and Tanzania to children 0-1 years old because initial inspection of the age-stratified IgG distributions revealed a shift from bimodal to unimodal distributions by age 3 (Figure 1), and because the restriction ensured a sufficiently large fraction of unexposed children in the sample to more clearly estimate the distribution among seronegative children. Approach 3: In the longitudinal Haiti and Kenya cohorts we identified children <1 year old who presumably seroconverted, defined as an increase in MFI-bg values of +2 or more on the log10 scale. A sensitivity analysis showed that an increase of 2 on the log10 scale was conservative to identify seroconversion for most antibodies considered in this study; an increase of between 0.3 and 2.16 MFI-bg led to optimal agreement with ROC-based and mixture model-based classifications in Kenya, and an increase of 0.92 to 2.41 led to optimal agreement across antigens and references in Haiti (Supplementary Information File 5). We then used the distribution of measurements before seroconversion to define the distribution of IgG values among the presumed unexposed. We used the mean log10 MFI-bg plus three standard deviations of the presumed unexposed distribution as a seropositivity cutoff. We summarized the proportion of observations that were in agreement between the three classification approaches, and estimated Cohen's Kappa [54]. Additional details and estimates of seropositivity cutoff agreement are reported in Supplementary Information File 2.
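The mean-plus-three-standard-deviations cutoffs and the agreement statistic can be sketched as follows. This is an illustrative Python sketch, not the authors' R implementation: the function names are hypothetical, the mixture model is fit with a basic EM algorithm with no safeguards against a collapsing component, and Cohen's Kappa is shown for one pair of binary classifications at a time.

```python
import math
import statistics

def normal_pdf(x, mu, sd):
    z = (x - mu) / sd
    return math.exp(-0.5 * z * z) / (sd * math.sqrt(2 * math.pi))

def fit_lower_component(log_mfi, n_iter=200):
    """Fit a 2-component Gaussian mixture by EM; return (mean, sd) of the lower component."""
    xs = sorted(log_mfi)
    half = len(xs) // 2
    mu = [statistics.mean(xs[:half]), statistics.mean(xs[half:])]  # crude starting values
    sd = [statistics.pstdev(xs)] * 2
    w = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: posterior probability that each value belongs to each component
        resp = []
        for v in log_mfi:
            d0 = w[0] * normal_pdf(v, mu[0], sd[0])
            d1 = w[1] * normal_pdf(v, mu[1], sd[1])
            total = d0 + d1
            resp.append((d0 / total, d1 / total))
        # M-step: update weights, means, and standard deviations
        for k in range(2):
            nk = sum(r[k] for r in resp)
            w[k] = nk / len(log_mfi)
            mu[k] = sum(r[k] * v for r, v in zip(resp, log_mfi)) / nk
            var = sum(r[k] * (v - mu[k]) ** 2 for r, v in zip(resp, log_mfi)) / nk
            sd[k] = max(math.sqrt(var), 1e-6)
    lower = 0 if mu[0] <= mu[1] else 1
    return mu[lower], sd[lower]

def mixture_cutoff(log_mfi):
    """Approach 2 style cutoff: lower component mean + 3 SD (log10 scale)."""
    mean, sd = fit_lower_component(log_mfi)
    return mean + 3 * sd

def unexposed_cutoff(presumed_unexposed_log_mfi):
    """Approach 3 style cutoff: mean + 3 SD of presumed unexposed values (log10 scale)."""
    return (statistics.mean(presumed_unexposed_log_mfi)
            + 3 * statistics.stdev(presumed_unexposed_log_mfi))

def cohen_kappa(a, b):
    """Cohen's Kappa for two binary (0/1) classifications of the same observations."""
    n = len(a)
    observed = sum(1 for x, y in zip(a, b) if x == y) / n
    pa, pb = sum(a) / n, sum(b) / n
    expected = pa * pb + (1 - pa) * (1 - pb)
    return (observed - expected) / (1 - expected)  # undefined if expected == 1
```

In this sketch a measurement would be classified seropositive when its log10 MFI-bg exceeds the chosen cutoff, and `cohen_kappa` would be applied to each pair of classification approaches in turn.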
In analyses of seroprevalence and seroconversion, we classified measurements as seropositive using ROC-based cutoffs if available, and mixture model-based cutoffs otherwise. There were three exceptions. By age 1 year, a majority of children across the cohorts had IgG levels near the maximum of the assay's dynamic range for enterotoxigenic Escherichia coli heat labile toxin β subunit (ETEC LT β toxin) and Vibrio cholerae toxin β subunit. Because too few children were unexposed to ETEC LT β toxin, V. cholerae, and in some cases Campylobacter, mixture models either failed to converge or estimated unrealistically high seropositivity cutoffs beyond the range of quantifiable levels. For these pathogens, we used seropositivity cutoffs estimated from presumed unexposed measurements in the longitudinal Haiti and Kenya cohorts (approach 3, above). High levels of agreement between classifications (Supplementary Information File 2) meant results were insensitive to choice of approach in these cohorts.

Age-dependent antibody levels and seroprevalence curves
We estimated population mean IgG levels and seroprevalence by age using semiparametric cubic splines in a generalized additive model, specifying binomial errors for seroprevalence, and random effects for children or clusters in the case of repeated observations [55,56]. We also estimated the relationships by age using a stacked ensemble approach called "super learner" that included a broader and more flexible library of machine learning algorithms [7,57,58], and found similar fits to cubic splines. We estimated approximate, simultaneous 95% credible intervals around the curves using a parametric bootstrap from posterior estimates of the model parameter covariance matrix [59]. Supplementary Information File 6 includes additional details.

Seroincidence and force of infection
In the longitudinal cohorts in Kenya and Haiti, we estimated prospective seroconversion rates by dividing the number of children who seroconverted by the person-time at risk between measurements. We defined incident seroconversions and seroreversions as a change in IgG across a pathogen's seropositivity cutoff. In a secondary analysis aimed at capturing events that might occur above a pathogen's seropositivity cutoff, we defined incident boosting episodes as a ≥4-fold increase in IgG to a final level above a seropositivity cutoff, and incident waning episodes as a ≥4-fold decrease in IgG from an initial level above a seropositivity cutoff. In the secondary definition, individuals were considered at risk for an incident boosting episode if they were seronegative, if they experienced a ≥4-fold increase in IgG in their first measurement period, or if they experienced a ≥4-fold decrease in IgG in a preceding period (Haiti). We assumed that incident changes in serostatus were interval-censored at the midpoint between measurements. We estimated 95% confidence intervals for incidence rates with the 2.5 and 97.5 percentiles of a nonparametric bootstrap [60] distribution that resampled children with replacement to account for repeated observations.
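The rate calculation described above can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' R code: each child is a list of hypothetical (age in years, log10 MFI-bg) pairs sorted by age, seropositive means strictly above the cutoff, a child who seroreverts is assumed to re-enter the risk set at the interval midpoint, and all function names are illustrative.

```python
import math
import random

FOURFOLD = math.log10(4)  # a 4-fold change is about 0.602 on the log10 scale

def is_boost(prev_log, next_log, cutoff_log):
    """>=4-fold increase in IgG to a final level above the cutoff."""
    return (next_log - prev_log) >= FOURFOLD and next_log > cutoff_log

def is_wane(prev_log, next_log, cutoff_log):
    """>=4-fold decrease in IgG from an initial level above the cutoff."""
    return (prev_log - next_log) >= FOURFOLD and prev_log > cutoff_log

def child_events_and_time(obs, cutoff_log):
    """Return (seroconversions, years at risk) for one child's measurements."""
    events, at_risk = 0, 0.0
    for (age0, y0), (age1, y1) in zip(obs, obs[1:]):
        interval = age1 - age0
        neg0, neg1 = y0 <= cutoff_log, y1 <= cutoff_log
        if neg0 and not neg1:      # seroconversion, placed at the midpoint
            events += 1
            at_risk += interval / 2
        elif neg0 and neg1:        # seronegative throughout the interval
            at_risk += interval
        elif neg1:                 # seroreversion: at risk after the midpoint
            at_risk += interval / 2
    return events, at_risk

def seroconversion_rate(children, cutoff_log):
    """Seroconversions per child-year at risk, pooled over children."""
    totals = [child_events_and_time(obs, cutoff_log) for obs in children]
    return sum(e for e, _ in totals) / sum(t for _, t in totals)

def bootstrap_ci(children, cutoff_log, n_boot=1000, seed=1):
    """Percentile 95% CI from resampling children (not measurements) with replacement."""
    rng = random.Random(seed)
    rates = []
    for _ in range(n_boot):
        sample = [rng.choice(children) for _ in children]
        try:
            rates.append(seroconversion_rate(sample, cutoff_log))
        except ZeroDivisionError:  # resample with no person-time at risk
            continue
    rates.sort()
    last = len(rates) - 1
    return rates[int(0.025 * last)], rates[int(0.975 * last)]
```

Resampling whole children keeps each child's repeated measurements together, which is how the bootstrap accounts for within-child correlation.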
In Kenya, we estimated the proportion of children who seroconverted over a 6-7 month period by dividing the number of children who seroconverted by the number at risk at the beginning of the period. This analysis excluded 6 of 199 children measured prospectively whose measurements were separated by < 6 months (n=5) or by 8 months (n=1). We estimated 95% confidence intervals for incidence proportions using an exact binomial distribution.
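For the incidence proportion, an exact binomial interval can be sketched as below. The text does not name the specific exact method, so this Python sketch assumes the Clopper-Pearson interval, solved by bisection on the binomial distribution function; the function names are hypothetical.

```python
from math import comb

def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1 - p) ** (n - i) for i in range(k + 1))

def solve_decreasing(f, target):
    """Find p in [0, 1] with f(p) = target, for f decreasing in p, by bisection."""
    lo, hi = 0.0, 1.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

def exact_binomial_ci(x, n, alpha=0.05):
    """Clopper-Pearson interval for x seroconversions among n children at risk."""
    # Lower limit solves P(X >= x | p) = alpha/2; upper solves P(X <= x | p) = alpha/2.
    lower = 0.0 if x == 0 else solve_decreasing(
        lambda p: binom_cdf(x - 1, n, p), 1 - alpha / 2)
    upper = 1.0 if x == n else solve_decreasing(
        lambda p: binom_cdf(x, n, p), alpha / 2)
    return lower, upper
```

The point estimate is simply `x / n`, the number of children who seroconverted divided by the number at risk at the start of the period.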
We considered three additional estimators of the seroconversion rate in the Kenyan cohort using methods for "current status" (cross-sectional) data to see if methods based on age-structured seroprevalence curves alone were consistent with prospectively estimated rates. Current status approaches included: semiparametric cubic splines that estimate age-specific rates [34], which were averaged over the empirical age distribution; an exponential proportional hazards model that assumed a constant rate over ages [37]; and a reversible catalytic model with constant seroconversion and seroreversion rate parameters [35]. The first two methods assume no seroreversion. Supplementary Information File 4 includes additional details.
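The two parametric current status models imply simple age-seroprevalence curves, sketched below in Python using their standard textbook forms. The function and parameter names are hypothetical, the curves assume all children are seronegative at birth, and the fitting procedures the authors used are described in Supplementary Information File 4 rather than here.

```python
import math

def exponential_seroprev(rate, age):
    """Constant seroconversion rate, no seroreversion: P(a) = 1 - exp(-rate * a)."""
    return 1 - math.exp(-rate * age)

def reversible_catalytic_seroprev(conv_rate, rev_rate, age):
    """Constant seroconversion and seroreversion rates (requires conv_rate + rev_rate > 0)."""
    total = conv_rate + rev_rate
    return (conv_rate / total) * (1 - math.exp(-total * age))

def current_status_loglik(seroprev_at_age, ages, seropositive):
    """Bernoulli log-likelihood of 0/1 serostatus given a seroprevalence curve."""
    loglik = 0.0
    for age, y in zip(ages, seropositive):
        p = seroprev_at_age(age)
        loglik += math.log(p) if y else math.log(1 - p)
    return loglik
```

With the seroreversion rate set to zero the reversible catalytic curve reduces to the exponential curve, and with a positive seroreversion rate it plateaus at `conv_rate / (conv_rate + rev_rate)` rather than at 1. A rate estimate could then be obtained by maximizing `current_status_loglik` over the rate parameter, for example with `lambda a: exponential_seroprev(rate, a)` as the curve.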

Data availability and replication files
Analyses were not pre-specified since our objectives were descriptive and methodologic. Analyses were conducted in R version 3.5.0. We have reported all analyses conducted. All data and computational notebooks used to complete the analyses are available through GitHub and the Open Science Framework (osf.io/r4av7).

Supplementary Information Files
Supplementary Information File 1. No effect of intervention or bead lot on enteropathogen antibody response in Kenya and Tanzania (osf.io/vdp9a).

Supplementary Information File 5. Sensitivity analysis of change in IgG used to identify presumed unexposed measurements in Haiti and Kenya (osf.io/gq9px).

Supplementary Information File 6. Estimation of age-dependent means and seroprevalence using multiple approaches (osf.io/azsbf).

Age-dependent seroprevalence curves fit with cubic splines among children ages birth to 10 years in Leogane, Haiti, 1990-1999.

Horizontal dashed lines mark seropositivity cutoffs for each antibody. The number of children measured at each visit was: n1=142, n2=142, n3=140, n4=131, n5=111, n6=66; 29 children had >6 measurements that are not shown. IgG response measured in multiplex using median fluorescence units minus background (MFI-bg) on the Luminex platform. Created with notebook (https://osf.io/wnt3r), which includes additional visualizations, and data (https://osf.io/3nv98).

Figure 6. Enteric pathogen seroconversion and seroreversion incidence rates and proportion that seroconverted among 199 children ages 4 to 17 months measured longitudinally in Asembo, Kenya, 2013. Vertical lines mark bootstrap 95% confidence intervals. Children who seroconverted were those who were seronegative at enrollment and seropositive at follow-up 6-7 months later; seroreversion was the opposite. The upper bound of the confidence interval for Campylobacter seroconversion rate extends to 7.5 and is truncated in the figure. IgG response measured in multiplex using median fluorescence units minus background (MFI-bg) on the Luminex platform (N=398 measurements from 199 children). Created with notebook (https://osf.io/sqvj7) and data (https://osf.io/2q7zg). Supplement 1 compares prospective seroconversion rates with estimates from age-seroprevalence curves.

Cross-sectional estimates of incidence were derived from age-specific seroprevalence curves using nonparametric cubic splines (splines), a parametric constant rate survival model (exponential), and a reversible catalytic model (RCM) that assumed a constant seroconversion rate and fixed seroreversion rate (estimated from the prospective analysis). Vertical lines mark 95% confidence intervals (N=398 measurements from 199 children). Created with notebook (https://osf.io/3kczj) and data (https://osf.io/2q7zg).

Note that axis scales differ between panels to best illustrate the estimates. Created with notebook (https://osf.io/pz8bd) and data (https://osf.io/2q7zg, https://osf.io/3nv98).