Enteropathogen antibody dynamics and force of infection among children in low-resource settings

Little is known about enteropathogen seroepidemiology among children in low-resource settings. We measured serological IgG responses to eight enteropathogens (Giardia intestinalis, Cryptosporidium parvum, Entamoeba histolytica, Salmonella enterica, enterotoxigenic Escherichia coli, Vibrio cholerae, Campylobacter jejuni, norovirus) in cohorts from Haiti, Kenya, and Tanzania. We studied antibody dynamics and force of infection across pathogens and cohorts. Enteropathogens shared common seroepidemiologic features that enabled between-pathogen comparisons of transmission. Overall, exposure was intense: for most pathogens the window of primary infection was <3 years old; for highest transmission pathogens primary infection occurred within the first year. Longitudinal profiles demonstrated significant IgG boosting and waning above seropositivity cutoffs, underscoring the value of longitudinal designs to estimate force of infection. Seroprevalence and force of infection were rank-preserving across pathogens, illustrating the measures provide similar information about transmission heterogeneity. Our findings suggest antibody response can be used to measure population-level transmission of diverse enteropathogens in serologic surveillance.


Introduction
A broad set of viral, bacterial, and parasitic enteropathogens are leading causes of the global infectious disease burden, with the highest burden among young children living in lower income countries (GBD 2016 DALYs and HALE Collaborators, 2016). Infections that result in acute diarrhea and related child deaths drive disease burden estimates attributed to enteropathogens, but asymptomatic infections are extremely common and the full scope of sequelae is only partially understood (Liu et al., 2016; Platts-Mills et al., 2018). Much of what we know about enteropathogen transmission is based on passive clinical surveillance, which reflects a small fraction of all infections. For example, antibody-based estimates of the incidence of infection with Salmonella enterica and Campylobacter jejuni were 2-6 orders of magnitude higher than estimates from case-based surveillance in European populations (Simonsen et al., 2008; Falkenhorst et al., 2012; Teunis et al., 2012; Teunis et al., 2013), and a study of Salmonella enterica serotype Typhi in Fiji found similarly high discordance between antibody-based incidence and case-based surveillance (Watson et al., 2017). A more complete picture of enteropathogen infection in populations would help to understand drivers of transmission, disease burden, and naturally acquired immunoprotection, as well as to design public health prevention measures and measure intervention effects.
Stool-based, high-throughput PCR assays have helped solve the logistical difficulties of single-pathogen testing for enterics and have provided new insights into pathogen-specific infections and disease burden (Liu et al., 2016; Platts-Mills et al., 2018). Yet, stool is not routinely collected in population-based surveys, and infection with many globally important enteric pathogens can be sufficiently rare and relatively short-lived to require designs with almost continuous surveillance (Platts-Mills et al., 2018; Lin et al., 2018). At the same time, large-scale serological surveillance platforms create new opportunities for expanded enteropathogen surveillance alongside other infectious diseases (Metcalf et al., 2016; Arnold et al., 2018). These challenges and opportunities have generated interest in antibody-based measurement as a complement to PCR for population-based enteropathogen surveillance (Griffin et al., 2011; Exum et al., 2016; Moss et al., 2014), and for endpoints in observational and randomized studies (Crump et al., 2007; Zambrano et al., 2017; Chard et al., 2018; Vargas et al., 2017; Mosites et al., 2018; Wade et al., 2018; Egorov et al., 2018).
After infection, many enteropathogens elicit a transiently elevated antibody response that wanes over time. In lower transmission settings where antibody responses could be monitored longitudinally after distinct infections, Salmonella enterica, Campylobacter jejuni, Cryptosporidium parvum, and Giardia intestinalis (syn. Giardia lamblia, Giardia duodenalis) immunoglobulin G (IgG) levels in blood have been shown to wane over a period of months since infection; IgM and IgA levels decline even more quickly (Strid et al., 2001; Priest et al., 2001; Priest et al., 2010; Falkenhorst et al., 2013; Hjøllo et al., 2018). Compared with permanently immunizing infections such as measles, transient immunity adds a layer of complexity to seroepidemiologic inference and methods. To our knowledge, there has been no detailed study of enteropathogen seroepidemiology among children in low-resource settings where transmission is intense beginning early in life (Platts-Mills et al., 2018). Such studies are needed to determine if serology is a viable approach to measure enteropathogen transmission in low-resource settings.
We conducted a series of analyses in cohorts from Haiti, Tanzania, and Kenya that measured serological antibody responses to eight enteropathogens using multiplex bead assays. Our objectives were to identify common patterns in antibody dynamics shared across enteropathogens and populations, and to evaluate serological methods to compare between-pathogen heterogeneity in infection, including estimates of force of infection. Our results provide new insights into the seroepidemiology of enteropathogens among children living in low-resource settings, and contribute advances to inform the design and analysis of surveillance efforts whose goal is to quantify heterogeneity in enteropathogen transmission through antibody response.

eLife digest
Diarrhea, which is caused by bacteria such as Salmonella or by viruses like norovirus, is the fourth leading cause of death among children worldwide, with children in low-resource settings being at highest risk. The pathogens that cause diarrhea spread when stool from infected people comes into contact with new hosts, for example, through inadequate sanitation or by drinking contaminated water. Currently, the best way to track these infections is to collect stool samples from people and test them for the presence of the pathogens. Unfortunately, this is costly and difficult to do on a large scale outside of clinical settings, making it hard to track the spread of diarrhea-causing pathogens.
The body produces antibodies (small proteins that can detect specific pathogens) in response to an infection. These antibodies help ward off future infections by the same pathogen, so if they are present in the blood, this indicates a current or previous infection. Scientists already collect blood samples to track malaria, HIV and vaccine-preventable diseases in low-resource settings. These samples could be tested more broadly to measure the levels of antibodies against diarrhea-causing pathogens. Now, Arnold et al. have used blood samples collected from children in Haiti, Kenya, and Tanzania to measure antibody responses to 8 diarrhea-causing pathogens. The results showed that many children in these settings had been infected with all 8 pathogens before age three, and that all of the pathogens shared similar age-dependent patterns of antibody response. This finding enabled Arnold et al. to combine antibody measurements with statistical models to estimate each pathogen's force of infection, that is, the rate at which susceptible individuals in the population become infected. This is a key step for epidemiologists to understand which pathogens cause the most infections in a population.
The experiments show that testing blood samples for antibodies could provide scientists with a new tool to track the transmission of diarrhea-causing pathogens in low-resource settings. This information could help public health officials design and test efforts to prevent diarrhea, for example, by improving water treatment or developing vaccines.

Study populations
The analysis included measurements from cohorts in Haiti, Kenya, and Tanzania. Blood specimens were tested for IgG levels to eight enteropathogens using a multiplex bead assay on the Luminex platform (Table 1). The Haitian cohort included repeated measurements among children enrolled in a study of lymphatic filariasis transmission in Leogane from 1990 to 1999 (Lammie et al., 1998; Hamlin et al., 2012). Leogane is a coastal agricultural community west of Port-au-Prince. At the time of the study its population was approximately 15,000; most homes had no electricity and none had running water. In total, the Haiti study tested 771 finger prick blood specimens collected from 142 children ages birth to 11 years old, with each measurement typically separated by one year (median measurements per child: 5; range: 2 to 9). In Kenya, a 2013 prospective trial of locally produced, in-home ceramic water filters enrolled 240 children in a serological substudy (Morris et al., 2018). Study participants were identified through the Asembo Health and Demographic Surveillance System, which is located in a rural part of Siaya County, western Kenya along the shore of Lake Victoria. Only 29% of the population had piped drinking water (public taps), water source contamination with E. coli was widespread (93% of samples tested), and the average age at which children began consuming water was 4 months (Morris et al., 2018). Children aged 4 to 10 months provided dried blood spot specimens at enrollment (February 2013), and again 6 to 7 months later (August to September, 2013; n = 205 children measured longitudinally). The Kenya study period encompassed seasonally heavy rains from March through June. In Tanzania, 96 independent clusters across eight trachoma-endemic villages in the Kongwa region were enrolled in a randomized trial to study the effects of annual azithromycin distribution on Chlamydia trachomatis infection (Wilson et al., 2019).
The population is very rural, and water is scarce in the region: at enrollment, 69% of participants reported that their primary drinking water source, typically an unprotected spring, was more than a 30-minute walk one way. From 2012 to 2015, the Tanzania study collected dried blood spots from between 902 and 1577 children ages 1-9 years in annual cross-sectional surveys that took place from October through December at the conclusion of the dry season before heavy seasonal rains (total measurements: 4,989). Although children could have been measured repeatedly over the four-year study in Tanzania, they were not tracked longitudinally. There was no evidence that the Kenya and Tanzania interventions reduced enteropathogen antibody response (Supplementary file 1), so this analysis pooled measurements from the study arms in each population.

Age-dependent shifts in population antibody distributions
We estimated seropositivity cutoffs using three approaches: receiver operating characteristic (ROC) curve analyses for Giardia, Cryptosporidium, and Entamoeba histolytica that included a panel of external, known positive and negative specimens, as previously reported (Morris et al., 2018); Gaussian mixture models (Benaglia et al., 2009) fit to measurements among children ages 0-1 year old to ensure a sufficient number of unexposed children; and presumed seronegative distributions among children who experienced large increases in antibody levels (Table 1). Classification agreement was high between the different approaches (agreement >95% for most comparisons; Supplementary file 2). Among children < 2 years old, antibody levels clearly distinguished seronegative and seropositive subpopulations, but there were not distinct seronegative and seropositive subpopulations by age 3 years for most pathogens measured in Haiti (Figure 1) and Tanzania (Figure 1-figure supplement 1). By age 3 years, the majority of children were seropositive to Cryptosporidium, enterotoxigenic Escherichia coli heat labile toxin B subunit (ETEC LT B subunit), and norovirus GI.4 and GII.4; in all cases antibody distributions were shifted above seropositivity thresholds. In contrast, there was a qualitative change in the antibody response distributions to Giardia, E. histolytica, Salmonella and Campylobacter with increasing age, shifting from a bimodal distribution of seronegative and seropositive groups among children 1 year old to a unimodal distribution by age 3 years and older (Figure 1, Figure 1-figure supplement 1). A direct comparison of age-dependent shifts in antibody distributions to Giardia VSP-3 antigen and Chlamydia trachomatis pgp3 antigen in Tanzania illustrates stark differences in enteropathogen-generated immune responses versus pathogens like Chlamydia that elicit a response that consistently differentiates exposed and unexposed subpopulations as children age (Figure 1-figure supplement 2). Distributions of IgG levels in the younger Kenyan cohort (ages 4-17 months) showed distinct groups of seropositive and seronegative measurements for most antigens (Figure 1-figure supplement 3). IgG responses to ETEC LT B subunit and cholera toxin B subunit were near the maximum of the dynamic range of the assay for nearly all children measured in the three cohorts (
‡ Measured only in years 2-4 of the study (2013-2015). DOI: https://doi.org/10.7554/eLife.45594.003
Figure 1. Age-stratified IgG distributions among a longitudinal cohort of 142 children ages birth to 11 years in Leogane, Haiti, 1990-1999. IgG response measured in multiplex using median fluorescence units minus background (MFI-bg) on the Luminex platform in 771 specimens, marked with rug plots below each distribution. Vertical lines mark seropositivity cutoffs based on ROC analyses (solid), finite Gaussian mixture models (heavy dash), or distribution among presumed unexposed (light dash). Mixture models failed to converge for ETEC LT B subunit. Created with notebook (https://osf.io/dk54y) and data (https://osf.io/3nv98).
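The mixture-model approach to seropositivity cutoffs can be illustrated with a minimal sketch. It assumes a two-component Gaussian mixture fit by expectation-maximization to log10 MFI-bg values, with the cutoff placed at the lower component's mean plus 3 standard deviations. The 3 SD rule, the function names, and the starting values are assumptions made for illustration, not details stated in the text, and this is not the authors' implementation (which cites Benaglia et al., 2009).

```python
import math

def fit_two_gaussians(x, iters=200):
    """Fit a two-component 1-D Gaussian mixture by EM; returns (weights, means, sds)."""
    xs = sorted(x)
    n = len(xs)
    # Initialise the two means from the lower and upper halves of the sorted data.
    lower, upper = xs[: n // 2], xs[n // 2:]
    mu = [sum(lower) / len(lower), sum(upper) / len(upper)]
    sd = [1.0, 1.0]
    w = [0.5, 0.5]
    for _ in range(iters):
        # E-step: probability that each point belongs to component 0.
        resp = []
        for v in xs:
            dens = [w[k] / sd[k] * math.exp(-0.5 * ((v - mu[k]) / sd[k]) ** 2) for k in (0, 1)]
            total = dens[0] + dens[1]
            resp.append(dens[0] / total if total > 0 else 0.5)
        # M-step: update weight, mean and SD of each component.
        for k in (0, 1):
            r = resp if k == 0 else [1.0 - a for a in resp]
            mass = sum(r)
            mu[k] = sum(a * v for a, v in zip(r, xs)) / mass
            var = sum(a * (v - mu[k]) ** 2 for a, v in zip(r, xs)) / mass
            sd[k] = max(math.sqrt(var), 1e-3)  # floor avoids a collapsed component
            w[k] = mass / n
    return w, mu, sd

def mixture_cutoff(mfi_values, n_sd=3.0):
    """Cutoff in MFI units: lower-component mean + n_sd SDs on the log10 scale.

    Assumes all MFI-bg values are positive; the n_sd=3 rule is an assumption.
    """
    logs = [math.log10(v) for v in mfi_values]
    _, mu, sd = fit_two_gaussians(logs)
    k = 0 if mu[0] <= mu[1] else 1  # index of the presumed seronegative component
    return 10 ** (mu[k] + n_sd * sd[k])
```

As the text notes, a fit of this kind is only informative when the age range contains clearly separated seronegative and seropositive groups; on a unimodal distribution the two components are not identifiable in any useful sense.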

Joint variation in antibody response
We hypothesized that IgG responses to closely related antigens would co-vary but that IgG responses to unrelated antigens would be uncorrelated. Joint variation in individual-level IgG responses aligned with hypothesized relationships based on antigenic overlap and shared epitopes. Responses to Giardia VSP antigens were strongly correlated in Haiti (Spearman rank: ρ = 0.99), Kenya (ρ = 0.84) and Tanzania (ρ = 0.97), as would be expected for antigens with conserved conformational epitopes (Figure 2). Cryptosporidium (Cp17, Cp23) and Campylobacter (p18, p39) antigens were strongly correlated, but high within-individual variability suggests that measuring responses to multiple unique recombinant protein antigens yields more information about infection than measuring responses to one alone (Figure 2). High correlation between Salmonella LPS Groups B and D, between norovirus GI.4 and GII.4, and between ETEC and V. cholerae likely reflected antibody cross-reactivity. Correlation could also result from multiple previous infections with different Salmonella serogroups or different norovirus genogroups. A comparison across all antigens revealed no other combinations with high correlation (Supplementary file 3).
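As an illustration of the correlation measure used above, the following sketch computes a Spearman rank correlation as the Pearson correlation of average ranks. The function names and the example values are hypothetical; in practice a statistics library would normally be used instead.

```python
def average_ranks(values):
    """Return 1-based ranks, assigning tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman rank correlation between paired measurements x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    return cov / (vx * vy) ** 0.5

# Hypothetical MFI-bg responses to two antigens of the same pathogen.
vsp3 = [10, 200, 3000, 45000]
vsp5 = [12, 150, 5000, 30000]
rho = spearman_rho(vsp3, vsp5)
```

Because the statistic depends only on ranks, it is unaffected by the log transformation commonly applied to MFI values, which makes it a natural choice for responses that span several orders of magnitude.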
We excluded cholera toxin B subunit antibody responses from remaining analyses because of the difficulty of interpreting its epidemiologic measures in light of high levels of cross-reactivity with ETEC LT B subunit. Heat labile toxin-producing ETEC is very common among children in low-resource settings (Platts-Mills et al., 2018), and there was no documented transmission of cholera in the study populations during measurement periods.

Birth to three years of age: a key window of antibody acquisition and primary seroconversion
Despite enormous individual-level variation, age-dependent mean IgG curves exhibited characteristic shapes seen across diverse pathogens, and reflected high levels of early-life exposure. In Haiti, seroprevalence ranged from 66% (E. histolytica) to 100% (ETEC LT B subunit) by age 3 years (Figure 3B), and in Tanzania, the majority of 1 year olds were already seropositive for Giardia (77%) and Cryptosporidium (85%) (Figure 3-figure supplement 2B). There was some evidence of maternally-derived IgG among children under 6 months old with a drop in mean IgG levels by age, but this pattern was only evident for norovirus GI.4 in Haiti (Figure 3A [...]
Figure 2. Joint distributions of select enteric pathogen antibody responses among children in three cohorts from Haiti, Kenya, and Tanzania. Each panel includes Spearman rank correlations (ρ) and locally weighted regression smoothers with default parameters, trimmed to 95% of the data to avoid edge effects. Antibody response measured in multiplex using median fluorescence units minus background (MFI-bg) on the Luminex platform.

Longitudinal antibody dynamics show significant boosting and waning above seropositivity cutoffs
Based on age-dependent shifts in IgG distributions (Figure 1), we hypothesized that conversion of IgG levels to seropositive and seronegative status could mask important dynamics of enteropathogen immune response above seropositivity cutoffs, particularly among ages beyond the window of primary infection. In Haiti and Kenya we examined longitudinal IgG profiles among children. In the Haitian cohort, which was followed beyond the window of primary infection, children commonly had >4-fold increases and decreases in IgG while remaining above seropositivity cutoffs, a pattern observed across pathogens but particularly clear for Cryptosporidium (Figure 4).
In Kenya, 4-fold increases in IgG largely coincided with a change in status from seronegative to seropositive, presumably because increases in IgG followed primary infection in the young cohort ages 4-17 months (

Comparison of serology with stool-based measures of infection
The Kenya study monitored diarrhea symptoms in weekly visits between enrollment and follow-up. The study collected stool from children whose caregivers reported diarrhea symptoms in the past 7 days, and tested stool for Cryptosporidium and Giardia infections using an immunoassay with additional PCR testing for Cryptosporidium as previously described (Morris et al., 2018). Among 132 children with paired serology measurements, those infected with Cryptosporidium (n = 17) and Giardia (n = 25) enabled us to compare stool-based measures of infection with IgG responses. Children with confirmed infections in diarrheal stools had higher IgG levels and seroprevalence at both time points compared with those who did not have confirmed infections, but many children without stool-confirmed infections seroconverted during the study. Among children without confirmed Giardia infection in diarrheal stools, seroprevalence to VSP-3 or VSP-5 antigens increased from 1% (95% CI: 0%, 5%) at enrollment to 22% (14%, 32%) at follow-up; among children without confirmed Cryptosporidium infection, seroprevalence to Cp17 or Cp23 antigens increased from 16% (10%, 24%) at enrollment to 47% (37%, 57%) at follow-up. These findings suggest that many children were not shedding genetic material at the time of diarrheal stool collection, or many infections with these two pathogens were asymptomatic. Supplementary file 4 includes additional details.

Serological estimates of force of infection
The seroconversion rate, an instantaneous rate of seroconversion among those who are susceptible, is one estimate of a pathogen's force of infection and a fundamental epidemiologic measure of transmission (Hens et al., 2012). Serologically derived force of infection is useful for pathogens that commonly present asymptomatically, such as many enteric infections. Across diverse pathogens, steeper age-seroprevalence curves typically reflect higher transmission intensity (Corran et al., 2007; Pinsent et al., 2018), and age-adjusted seroprevalence equals the area under the age-seroprevalence curve (a summary measure). We therefore hypothesized that seroprevalence and prospectively estimated force of infection should embed similar information about infection heterogeneity across pathogens. We also hypothesized that standard methods to estimate force of infection from age-structured seroprevalence would underestimate force of infection derived from longitudinal data because of significant antibody boosting and waning above seropositivity cutoffs.
Longitudinal designs in Haiti and Kenya enabled us to use individual child antibody profiles to estimate average rates of prospective seroconversion and seroreversion during the studies. We defined incident seroconversions and seroreversions as a change in IgG across a pathogen's seropositivity cutoff and estimated force of infection as incident changes in serostatus divided by person-time at risk. In a secondary analysis, we defined incident boosting as a ≥4-fold increase in IgG to a final level above a seropositivity cutoff and incident waning as a ≥4-fold decrease in IgG from an initial level above a seropositivity cutoff. The secondary definition captured large changes in IgG above seropositivity cutoffs, which aligned with repeated boosting and waning observed in the Haitian cohort (Figure 4).
Figure 3. [...] Haiti, 1990-1999. Shaded bands are approximate, simultaneous 95% confidence intervals. IgG response measured in multiplex using median fluorescence units minus background (MFI-bg) on the Luminex platform (N = 771 measurements from 142 children). Created with notebook (https://osf.io/jeby3) and data (https://osf.io/3nv98). Data for some antigens measured among children < 5 years [...]
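The primary definition, incident seroconversions divided by person-time at risk, can be sketched as follows. This is a minimal illustration rather than the authors' code: the data layout and function name are hypothetical, seropositivity is taken to mean IgG strictly above the cutoff (an assumption), and events are assumed to occur at the midpoint of the measurement interval, as described in the Figure 6 caption.

```python
def seroconversion_rate(visits_by_child, cutoff):
    """Estimate a seroconversion rate from longitudinal IgG measurements.

    visits_by_child maps a child id to a list of (age_years, mfi) tuples.
    Returns (events, person_years_at_risk, rate_per_child_year).
    """
    events = 0
    person_years = 0.0
    for visits in visits_by_child.values():
        ordered = sorted(visits)
        for (age0, mfi0), (age1, mfi1) in zip(ordered, ordered[1:]):
            if mfi0 > cutoff:
                # Already seropositive at the start: not at risk of seroconversion.
                continue
            interval = age1 - age0
            if mfi1 > cutoff:
                events += 1
                person_years += interval / 2  # event assumed at interval midpoint
            else:
                person_years += interval
    rate = events / person_years if person_years > 0 else float("nan")
    return events, person_years, rate

# Hypothetical example with a seropositivity cutoff of 300 MFI.
cohort = {
    "A": [(0.5, 50), (1.5, 900), (2.5, 1200)],
    "B": [(0.5, 40), (1.5, 60), (2.5, 100)],
    "C": [(1.0, 100), (2.0, 5000)],
}
events, person_years, rate = seroconversion_rate(cohort, cutoff=300)
```

A seroreversion rate could be computed the same way by treating seropositive children as the group at risk and counting drops below the cutoff.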
We found a rank-preserving relationship between pathogen seroprevalence and average force of infection in Kenya and Haiti (Figure 5). Overall levels and steepness of the relationship differed between cohorts, presumably because Kenya measurements were within a window of primary infection for most children (4-17 months) whereas Haiti measurements extended from birth to 11 years and captured lower incidence periods with overall higher seroprevalence as children aged. Consistent with this interpretation, when we progressively narrowed the age range of the Haitian cohort and repeated the analysis, the relationship was steeper when estimated among children ages 0-2 years and flattened as measurements among older children were added (Figure 5-figure supplement 1).
Force of infection varied widely across pathogens in Kenya, ranging from 0.1 seroconversions per year for E. histolytica to >5 for Campylobacter (Figure 5). In Haiti, force of infection ranged from 0.3 E. histolytica seroconversions per year to 1.1 ETEC seroconversions per year (Figure 5). Force of infection estimated from 4-fold changes in IgG led to more events and slightly higher rates compared with those estimated from seroconversion alone (Table 2). For example, Cryptosporidium incident cases increased from 70 to 204 (a 2.9-fold increase) and the average rate increased from 0.6 (95% CI: 0.5, 0.8) to 0.9 (95% CI: 0.7, 1.0) per child-year when using a 4-fold IgG change criterion because of substantial IgG boosting and waning above the seropositivity cutoff (Figure 4). Sensitivity analyses that defined incident boosting over a range of 2-fold to 10-fold increases in IgG showed force of infection estimates were relatively stable across a wide range of definitions. In Haiti, the only pathogen for which force of infection estimated using a 4-fold increase in IgG was significantly higher than the seroconversion rate was Cryptosporidium (Supplementary file 5).
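The fold-change definition of incident boosting and waning, and the sensitivity analysis over different fold thresholds, can be sketched as below. The function name and the example profile are hypothetical, and MFI-bg values are assumed to be positive so that ratios are well defined.

```python
def count_fold_changes(mfi_series, cutoff, fold=4.0):
    """Count incident boosting and waning events in one child's IgG profile.

    Boost: at least a `fold` increase to a final level above the cutoff.
    Wane:  at least a `fold` decrease from an initial level above the cutoff.
    """
    boosts = wanes = 0
    for before, after in zip(mfi_series, mfi_series[1:]):
        if after >= fold * before and after > cutoff:
            boosts += 1
        elif before >= fold * after and before > cutoff:
            wanes += 1
    return boosts, wanes

# Hypothetical profile of annual measurements with a cutoff of 300 MFI.
profile = [100, 2000, 400, 1800, 1700]
by_threshold = {f: count_fold_changes(profile, cutoff=300, fold=f) for f in (2, 4, 10)}
```

In this made-up profile the child crosses the cutoff only once (100 to 2000), yet the 4-fold definition also registers a wane (2000 to 400) and a second boost (400 to 1800) that both occur entirely above the cutoff. This is the kind of event a serostatus-only definition cannot see.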
We evaluated whether model-based force of infection estimates from age-structured seroprevalence could accurately recover estimates from the longitudinal analyses. We focused on the Kenya cohort since children were measured repeatedly during the ages of primary infection and because longitudinal force of infection and seroreversion rate estimates varied considerably across pathogens (Figure 5). We estimated force of infection from seroprevalence curves using methods developed for cross-sectional, 'current status' data, a common approach in serosurveillance of vaccine preventable diseases (Hens et al., 2012), malaria (Corran et al., 2007), and dengue (Ferguson et al., 1999; Katzelnick et al., 2018). Force of infection estimates from semiparametric spline models were similar to estimates from the longitudinal analysis for all pathogens, but had substantially wider confidence intervals owing to the loss of information from ignoring the longitudinal data structure (Figure 6). Parametric approaches including an exponential survival model (Jewell and Laan, 1995) and a reversible catalytic model (Corran et al., 2007) yielded narrower confidence intervals than the semiparametric model but tended to underestimate force of infection compared with longitudinal estimates (Figure 6). Across pathogens, model-based force of infection estimates derived from seroprevalence were rank-preserving compared with nonparametric longitudinal analyses.
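The two parametric current-status models named above can be sketched in a few lines. Under a constant seroconversion rate λ and seroreversion rate ρ, the reversible catalytic model gives seroprevalence at age a as P(a) = λ/(λ+ρ) × (1 − exp(−(λ+ρ)a)), which reduces to the exponential model P(a) = 1 − exp(−λa) when ρ = 0. The code below is an illustrative sketch and not the authors' implementation: function names are hypothetical, ρ is treated as known, and a simple grid search stands in for a proper likelihood optimizer.

```python
import math

def rcm_seroprev(age, lam, rho=0.0):
    """Expected seroprevalence at `age` under constant rates lam and rho (per year)."""
    total = lam + rho
    return lam / total * (1.0 - math.exp(-total * age))

def fit_lambda(ages, seropositive, rho=0.0, grid=None):
    """Maximum-likelihood lam for current-status data, by grid search.

    ages: age in years for each child; seropositive: 0/1 status for each child.
    rho is held fixed (for example, estimated separately from longitudinal data).
    """
    grid = grid or [i / 100 for i in range(1, 1001)]  # 0.01 to 10 per year

    def loglik(lam):
        ll = 0.0
        for a, y in zip(ages, seropositive):
            p = min(max(rcm_seroprev(a, lam, rho), 1e-12), 1 - 1e-12)
            ll += math.log(p) if y else math.log(1.0 - p)
        return ll

    return max(grid, key=loglik)

# Hypothetical survey: 20 one-year-olds, half of them seropositive.
ages = [1.0] * 20
status = [1] * 10 + [0] * 10
lam_exponential = fit_lambda(ages, status)       # ignores seroreversion
lam_rcm = fit_lambda(ages, status, rho=0.5)      # allows seroreversion
```

With the same data, the model that ignores seroreversion returns the smaller λ, because some previously infected children are counted as never infected. This gives one intuition for why constant-rate models can underestimate force of infection when antibody waning is substantial.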
We conducted a simulation study to investigate whether longer sampling intervals in the cohorts (6 months in Kenya, 12 months in Haiti) could lead us to miss more frequent exposures and thus underestimate force of infection. For each cohort, we created 100 imputed datasets that reconstructed a child's daily IgG levels, assuming that each infection boosted IgG and that it would wane exponentially. The simulation drew IgG boosts from empirical distributions in each cohort and used antibody-specific decay rates. We allowed for the maximum number of intermediate exposures between measurements as long as IgG levels could wane sufficiently to follow a child's empirical measurements, thus providing an approximate upper bound of the seroconversion rates (force of infection) that could plausibly be detected for each antibody. We down-sampled the daily datasets at intervals of 30, 90, 180, and 360 days to reflect realistic measurement intervals and estimated seroconversion rates. We found that for most pathogens studied, higher resolution sampling would not substantially increase seroconversion rates in absolute terms. Rates estimated through simulation increased by between 0.1 and 0.9 episodes per child-year at risk when measured with a sampling interval of 30 days instead of annually (Haiti) or every six months (Kenya). However, for the pathogens with the highest seroconversion rates, ETEC and Campylobacter, rates estimated with 30-day sampling intervals detected a median of 4 to 8 additional seroconversions per child-year at risk compared with empirical rates (Figure 7). In relative terms, seroconversion rates in Haiti had a larger discrepancy: rates more than doubled when using a 30-day sampling interval compared with the annual interval used in the study.
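The effect of the sampling interval can be illustrated with a much simpler toy model than the imputation procedure described above. The sketch below assumes a fixed boost level, a fixed baseline, a single exponential decay rate, and known infection days; all names and parameter values are hypothetical and chosen only to make the down-sampling effect visible.

```python
import math

def daily_igg(n_days, infection_days, boost_to=10000.0, baseline=10.0,
              half_life_days=70.0):
    """Simulate daily IgG: each infection boosts to `boost_to`, then decays exponentially."""
    decay = math.log(2) / half_life_days
    infections = set(infection_days)
    level = baseline
    series = []
    for day in range(n_days):
        if day in infections:
            level = max(level, boost_to)
        else:
            level = max(baseline, level * math.exp(-decay))
        series.append(level)
    return series

def count_seroconversions(series, interval, cutoff=300.0):
    """Count crossings from at-or-below to above the cutoff between sampled days."""
    sampled = series[::interval]
    return sum(1 for a, b in zip(sampled, sampled[1:]) if a <= cutoff < b)

# Three infections over two years, with a deliberately short 20-day half-life.
series = daily_igg(721, infection_days=[15, 200, 400], half_life_days=20.0)
detected = {k: count_seroconversions(series, k) for k in (30, 90, 180, 360)}
```

In this toy setting, sampling every 30 or 90 days detects all three seroconversions, while sampling every 180 or 360 days detects none, because the antibody response has waned below the cutoff again by each visit. With a longer half-life the loss would be smaller, which is consistent with the finding that finer sampling mattered most for the highest-incidence pathogens.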

Discussion
In cohorts from Haiti, Kenya, and Tanzania we identified consistent patterns in IgG responses that provide new insights into enteropathogen seroepidemiology among children in low-resource settings. Most population-level heterogeneity in IgG levels and seroconversion was between birth and 3 years, reflecting high transmission and early life primary infection. For particularly high transmission pathogens (e.g., ETEC, Campylobacter), most variation in IgG levels was observed among children < 1 year old. In these study populations, endemic force of infection for enteropathogens was as high as, and in many cases several-fold higher than, force of infection estimated during epidemics of new dengue serotype introductions in Nicaragua and Peru (approximately 0.4 to 0.6 seroconversions per year; Reiner et al., 2014). Significant boosting and waning of antibody levels above seropositivity cutoffs identified through longitudinal profiles in Haiti and Kenya reinforce the value of longitudinal designs to derive antibody distributions among unexposed and to estimate force of infection. The shift of IgG distributions from bimodal to unimodal for many pathogens (Giardia, Cryptosporidium, E. histolytica, and Campylobacter), resulting from a combination of antibody boosting, waning and acquired immunity, complicates the interpretation of seroprevalence at older ages: among older children a seronegative response could either mean the children were never exposed or they were previously exposed but antibody levels waned below seropositivity cutoffs. The age-dependent shift contrasts with more stable differentiation of seronegative and seropositive groups observed for some other antibody responses (e.g., C. trachomatis pgp3 in Figure 1-figure supplement 2) and likely results from less robust and sustained IgG response following infection.
Estimates of IgG half-life in the Haiti cohort were on the order of 10 weeks for most pathogens (Supplementary file 8), implying that time to seroreversion would be approximately 1 year without additional exposure (assuming exponential decay with rate $\lambda = 0.01$ per day, corresponding to a 10 week half-life, a starting level of $N_0 = 10{,}000$ MFI, and a seropositivity cutoff of $N_t = 300$ MFI: $t = -\log(N_t/N_0)/\lambda \approx 350$ days). Seroreversion is therefore [...] Increases in mean IgG levels and seroprevalence with age imply IgG boosting from new infections or repeated infections outpaced IgG decay for all enteropathogens studied until at least age 3 years, and for many pathogens through age 10 years; seroprevalence thus reflects a conservative lower bound of a population's cumulative exposure over this age range. The age range over which seroprevalence provided useful epidemiologic information varied by pathogen and cohort. In Haiti, 100% of children were seropositive to ETEC LT B toxin before age 12 months, though seroprevalence did not exceed 90% for most other pathogens until age 5 years in Haiti (Figure 3). In Kenya, the age range of 4-18 months captured wide variation in seroprevalence for most pathogens. The Tanzania study enrolled children ages one and older due to a primary focus on trachoma monitoring, but missed the key window of variation in antibody response for all enteropathogens except E. histolytica and Salmonella (Figure 3-figure supplements 1 and 2). Studies that extend beyond 10 years into adolescence and adulthood would help determine whether enteropathogen seroprevalence remains sufficiently high that it no longer provides useful epidemiologic information.
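The time-to-seroreversion arithmetic at the start of the preceding paragraph can be checked with a short calculation. The sketch assumes pure exponential decay with no further exposure, and the function names are hypothetical.

```python
import math

def decay_rate_from_half_life(half_life_days):
    """Exponential decay rate (per day) implied by a given half-life."""
    return math.log(2) / half_life_days

def days_to_cutoff(start_mfi, cutoff_mfi, decay_rate_per_day):
    """Days for IgG to decay from start_mfi to cutoff_mfi: t = -log(N_t / N_0) / lambda."""
    return -math.log(cutoff_mfi / start_mfi) / decay_rate_per_day

# Values from the text: 10,000 MFI starting level, 300 MFI cutoff.
t_rounded = days_to_cutoff(10000, 300, 0.01)                           # rounded rate
t_exact = days_to_cutoff(10000, 300, decay_rate_from_half_life(70))    # 10-week half-life
```

Using the rounded rate of 0.01 per day gives roughly 351 days, and using the unrounded rate for a 70-day half-life gives roughly 354 days, so both agree with the statement that seroreversion would take approximately one year.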
The shift in IgG distributions for some enteropathogens raises the question of whether population mean IgG levels stabilize at a new 'set point' with repeated infections as has been observed for dengue serotypes (Salje et al., 2018); if so, then the use of fold-changes in IgG would be preferred to seropositivity cutoffs to identify incident infections among older ages.
In low-resource settings, measuring a sufficient number of young children before primary infection, preferably with longitudinal measurements, will help ensure that within-sample seropositivity cutoff estimation is possible. Two-component mixture models fit the data and provided reasonable cutoff estimates only when restricted to an age range that included clearly delineated subpopulations of seronegative and seropositive responses. For most pathogens studied, this required measurements among children < 1 year old, an age range during which IgG responses still followed a bimodal distribution. The only reliable approach to estimate seropositivity cutoffs for the highest transmission pathogens like ETEC and Campylobacter was to estimate a distribution among presumed unexposed by identifying measurements among children who subsequently experienced a large increase in IgG, a strategy only possible in a longitudinal design. Although not attempted here, studies that wish to compare antibody response and seroprevalence between different sites should use a common assay platform and materials (e.g., shared bead coupling) with jointly estimated seropositivity cutoffs to help ensure comparability across sites, since there are currently no global reference standards to translate arbitrary units into antibody titers for enteropathogens. Seroepidemiologic measures that can be estimated from cross-sectional surveys are of particular interest for infectious diseases because most large-scale, population-based serosurveillance platforms use cross-sectional designs. Our results show that seroprevalence and force of infection estimated from seroprevalence models adequately summarize between-pathogen heterogeneity in transmission when compared with longitudinal estimates of force of infection.
Pathogens with fastest rising mean IgG and seroprevalence with age (e.g., ETEC, Campylobacter, norovirus GII.4; Figure 3, Figure 3-figure supplement 1) had highest force of infection measured prospectively over the study period, and seroprevalence was rank-preserving with prospective force of infection in both Haiti and Kenya (Figure 5). Seroprevalence alone thus appears to be sufficient to assess relative pathogen transmission if measured in an age range that captures ample heterogeneity in response (in these cohorts < 3 years old). Our findings align with modeling studies of other infectious diseases such as malaria (Corran et al., 2007), trachoma (Pinsent et al., 2018), and dengue, and suggest that enteropathogens share similar seroepidemiologic features conducive to population-based surveillance in cross-sectional surveys despite different
Figure 6. Enteropathogen seroconversion and seroreversion rates among 205 children ages 4 to 17 months measured longitudinally in Asembo, Kenya, 2013. The seroconversion rate is a measure of a pathogen's force of infection. Longitudinal estimates are non-parametric rates of incident seroconversions and seroreversions among children at risk, assumed to occur at the midpoint of the measurement interval. Cross-sectional estimators were derived from age-specific seroprevalence curves using semiparametric cubic splines (spline), a reversible catalytic model (RCM) that assumed constant seroconversion and seroreversion rates with the seroreversion rate estimated from prospective data, and a parametric constant rate survival model (exponential). Error bars mark 95% confidence intervals. IgG response measured in multiplex using median fluorescence units minus background (MFI-bg) on the Luminex platform (N = 410 measurements from 205 children). Created with notebooks (https://osf.io/sqvj7, https://osf.io/j9nh3) and data (https://osf.io/2q7zg). DOI: https://doi.org/10.7554/eLife.45594.019
children 7 years and older provided additional verbal assent. In Kenya, the human subjects protocol was reviewed and approved by institutional review boards at the Kenya Medical Research Institute (KEMRI) and at the US CDC. Primary caretakers provided written informed consent for their infant child's participation in the trial and blood specimen collection and testing (Morris et al., 2018). The original trial was registered at ClinicalTrials.gov (NCT01695304). In Tanzania, the human subjects protocol was reviewed and approved by the Institute for Medical Research Ethical Review Committee in Dar es Salaam, Tanzania and the institutional review board at the US CDC. Parents of enrolled children provided consent, and children 7 years and older also provided verbal assent before specimen collection.
analyzed by multiplex bead assay as described by Morris and colleagues (Morris et al., 2018). Each multiplex bead assay plate (N = 11) included five control sera: one negative sample and four positive control samples offering a range of responses to various antigen markers. For the positive control sample responses to the 11 enteric antigens used in this study, the average CV% was 6.3 with a standard deviation of 3.7. The median CV% was 5.3 with a range of 1.1% to 14.9%.

Tanzania
For the Tanzania study, the same conditions described in the Kenya study were used to couple antigens from Giardia, Cryptosporidium, E. histolytica, ETEC B toxin subunit, cholera B toxin subunit, GST, and Salmonella LPS group B and LPS group D. Campylobacter p39 and p18 were both coupled at 25 µg per 1.25 × 10^7 beads in buffer containing 0.85% NaCl and 25 mM 2-(N-morpholino)-ethanesulfonic acid at pH 5.0. Dried blood spots were eluted in the casein-based buffer described previously and samples were diluted to either 1:400 serum dilution with 50 µl run per well for year 1, or 1:320 serum dilution with 40 µl run per well for years 2-4. The incubation steps, washes, and data collection methods used in the multiplex bead assay were performed as described previously. All samples were run in duplicate, and the average median fluorescence intensity minus background (MFI-bg) value was recorded. Each multiplex bead assay plate (N = 37) included four control sera: one negative sample and three positive control samples offering a range of responses to various antigen markers. For the positive control sample responses to the nine enteric antigens used in this study, the average CV% was 8.4 with a standard deviation of 5.3. The median CV% was 5.3 with a range of 2.6 to 15.1. The Tanzania study used different bead lots in year 1 and years 2-4; we confirmed that the use of different bead lots had no influence on the results (Supplementary file 1).

Antibody distributions and determination of seropositivity
We transformed IgG levels to the log10 scale because the distributions were highly skewed. Means of the log-transformed data represent geometric means. We summarized the distribution of log10 IgG response using kernel density smoothers. In the Tanzania and Haiti cohorts, where children were measured across a broad age range, we stratified IgG distributions by each year of age <3 years to examine age-dependent changes in the population distributions. To assess potential cross-reactivity between antigens, we estimated pairwise correlations between individual-level measurements in each cohort using a Spearman rank correlation (Zar, 2005) and visualized the relationship for each pairwise combination with locally weighted regression fits (Cleveland and Devlin, 1988). We compared three approaches to estimate seropositivity cutoffs.
Approach 1: External known positive and negative specimens were used to determine seropositivity cutoffs for Giardia VSP-3 and VSP-5 antigens, Cryptosporidium Cp17 and Cp23 antigens, and E. histolytica LecA antigen. Cutoffs were determined using ROC analysis as previously described (Morris et al., 2018) for all antigens except for LecA, VSP-3, and VSP-5 in Haiti; in these cases, the mean plus three standard deviations of 65 specimens from citizens of the USA with no history of foreign travel were used to estimate cutoffs.
Approach 2: We fit a 2-component, finite Gaussian mixture model (Benaglia et al., 2009) to the antibody distributions among children 0-1 years old, and estimated seropositivity cutoffs using the lower component's mean plus three standard deviations. The rationale for restricting the mixture model estimation in Haiti and Tanzania to children 0-1 years old was based on initial inspection of the age-stratified IgG distributions that revealed a shift from bimodal to unimodal distributions by age 3 (Figure 1). This approach ensured that there was a sufficiently large fraction of unexposed children in the sample to more clearly estimate a distribution among seronegative children.
Approach 3: In the longitudinal Haiti and Kenya cohorts we identified children < 1 year old who presumably seroconverted, defined as an increase in MFI-bg values of +2 or more on the log10 scale. A sensitivity analysis showed that an increase of 2 on the log10 scale was a conservative approach to identify seroconversion for most antibodies considered in this study; an increase of between 0.3 and 2.16 log10 MFI-bg led to optimal agreement with ROC-based and mixture model-based classifications in Kenya, and an increase of 0.92 to 2.41 led to optimal agreement across antigens and references in Haiti (Supplementary file 5). We then used the distribution of measurements before seroconversion to define the distribution of IgG values among the presumed unexposed. We used the mean log10 MFI-bg plus three standard deviations of the presumed unexposed distribution as a seropositivity cutoff.
We summarized the proportion of observations that were in agreement between the three classification approaches, and estimated Cohen's Kappa (Cohen, 1960). Additional details and estimates of seropositivity cutoff agreement are reported in Supplementary file 2. Mixture models failed to estimate realistic cutoff values if there was an insufficient number of unexposed children, which was the case for ETEC LT B subunit and cholera toxin B subunit in all cohorts, and for nearly all antigens in Tanzania where the study did not enroll children < 1 year old (Table 1).
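As an illustration of Approach 3 and of the agreement statistic, the following minimal Python sketch computes a cutoff from presumed unexposed measurements and Cohen's Kappa between two binary classifications. It is not the study's code. The function names, the input layout, and the choice to use only the measurement immediately preceding a qualifying increase are assumptions made for this example.

```python
from statistics import mean, stdev

def unexposed_cutoff(series_by_child, min_increase=2.0, n_sd=3.0):
    """Cutoff = mean + n_sd * SD of measurements taken just before a large rise.

    series_by_child maps a child id to log10 MFI-bg values in visit order
    (hypothetical layout). A rise of at least `min_increase` log10 units
    between consecutive visits is treated as a presumed seroconversion.
    """
    presumed_unexposed = []
    for values in series_by_child.values():
        for before, after in zip(values, values[1:]):
            if after - before >= min_increase:
                presumed_unexposed.append(before)
    if len(presumed_unexposed) < 2:
        raise ValueError("need at least two pre-seroconversion measurements")
    return mean(presumed_unexposed) + n_sd * stdev(presumed_unexposed)

def cohens_kappa(a, b):
    """Cohen's Kappa for two equal-length lists of 0/1 classifications."""
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    p_a, p_b = sum(a) / n, sum(b) / n
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)  # agreement expected by chance
    return (observed - expected) / (1 - expected)

# Toy data: three children with a rise of 2 or more log10 units, one without.
toy = {"c1": [1.0, 3.5], "c2": [1.2, 3.4], "c3": [0.8, 3.0], "c4": [3.0, 3.1]}
cutoff = unexposed_cutoff(toy)
kappa = cohens_kappa([1, 1, 0, 0], [1, 0, 0, 0])
```

With real data, every measurement before a child's first qualifying increase could be pooled instead of only the preceding one; the sketch keeps the simpler rule for brevity.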
In analyses of seroprevalence and seroconversion, we classified measurements as seropositive using ROC-based cutoffs if available, and mixture model-based cutoffs otherwise. There were three exceptions. By age 1 year, a majority of children across the cohorts had IgG levels near the maximum of the assay's dynamic range for ETEC LT B toxin and cholera toxin B subunit. The absence of a sufficient number of unexposed children to ETEC LT B toxin, cholera B toxin, and in some cases Campylobacter p18 or p39 led mixture models either to not converge or to estimate unrealistically high seropositivity cutoffs beyond the range of quantifiable levels. For these pathogens, we used seropositivity cutoffs estimated from presumed unexposed measurements in the longitudinal Haiti and Kenya cohorts (approach 3, above). High levels of agreement between classifications (Supplementary file 2) meant results were insensitive to choice of approach in these cohorts. We classified children as seropositive to Giardia, Cryptosporidium, Campylobacter, or Salmonella if antibody levels against either of the antigens from each pathogen were above estimated seropositivity cutoffs.

Age-dependent antibody levels and seroprevalence curves
We estimated mean IgG levels and seroprevalence by age using semiparametric cubic splines in a generalized additive model, specifying binomial errors for seroprevalence, and random effects for children or clusters in the case of repeated observations (Wood, 2017; Wood, 2012). We also estimated the relationships by age using a stacked ensemble approach called 'super learner' that included a broader and more flexible library of machine learning algorithms (van der Laan et al., 2007; Polley et al., 2018), and found similar fits to cubic splines. We estimated approximate, simultaneous 95% confidence intervals around the curves using a parametric bootstrap from posterior estimates of the model parameter covariance matrix (Ruppert et al., 2003). Supplementary file 6 includes additional details.

Force of infection from longitudinal data
In the Kenya and Haiti longitudinal cohorts, we estimated prospective seroconversion rates as a measure of force of infection by dividing the number of children who seroconverted by the person-time at risk between measurements. We defined incident seroconversions and seroreversions as a change in IgG across a pathogen's seropositivity cutoff. Vaccine immunogenicity and pathogen challenge studies among healthy adults often use a 4-fold increase in antibody levels (difference of +0.6 on the log10 scale) as a criterion for seroconversion (Bernstein et al., 2015; Jin et al., 2017; Chakraborty et al., 2018). In a secondary analysis aimed to capture significant changes above a pathogen's seropositivity cutoff, we defined incident boosting episodes as a ≥ 4-fold increase in IgG to a final level above a seropositivity cutoff, and incident waning episodes as a ≥ 4-fold decrease in IgG from an initial level above a seropositivity cutoff. In the secondary definition, individuals were considered at risk for an incident boosting episode if they were seronegative, if they experienced a ≥ 4-fold increase in IgG in their first measurement period, or if they experienced a ≥ 4-fold decrease in IgG in a preceding period (Haiti). To estimate person-time at risk used for rates and force of infection, we assumed incident changes were interval-censored and occurred at the midpoint between measurements. We estimated 95% confidence intervals for rates with 2.5 and 97.5 percentiles of a nonparametric bootstrap distribution (Wasserman, 2004) that resampled children with replacement to account for repeated observations.
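The primary rate calculation can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's implementation: the data layout (one list of (age, log10 IgG) visits per child), the function names, and the convention that "seropositive" means strictly above the cutoff are all hypothetical choices for the example.

```python
import random
import statistics

def person_time_and_events(children, cutoff):
    """Count incident seroconversions and person-time at risk.

    children: list of per-child visit lists [(age, log10_igg), ...] sorted by age.
    Seroconversions are placed at the midpoint of the measurement interval.
    """
    events, time_at_risk = 0, 0.0
    for visits in children:
        for (age0, igg0), (age1, igg1) in zip(visits, visits[1:]):
            if igg0 > cutoff:
                continue  # already seropositive: not at risk of seroconversion
            interval = age1 - age0
            if igg1 > cutoff:
                events += 1
                time_at_risk += interval / 2.0  # at risk until the midpoint
            else:
                time_at_risk += interval
    return events, time_at_risk

def seroconversion_rate(children, cutoff):
    events, time_at_risk = person_time_and_events(children, cutoff)
    return events / time_at_risk

def bootstrap_ci(children, cutoff, n_boot=1000, seed=1):
    """Percentile 95% interval, resampling whole children with replacement."""
    rng = random.Random(seed)
    rates = []
    for _ in range(n_boot):
        resample = rng.choices(children, k=len(children))
        events, time_at_risk = person_time_and_events(resample, cutoff)
        if time_at_risk > 0:  # skip resamples with nobody at risk
            rates.append(events / time_at_risk)
    cuts = statistics.quantiles(rates, n=40)  # cut points every 2.5%
    return cuts[0], cuts[-1]

# Toy cohort with ages in months: one seroconverter, one child who stays
# seronegative, and one child who is seropositive at enrollment.
toy = [[(4, 1.0), (10, 3.0)], [(4, 1.0), (10, 1.2)], [(4, 3.0), (10, 3.2)]]
rate = seroconversion_rate(toy, cutoff=2.0)
lower, upper = bootstrap_ci(toy, cutoff=2.0)
```

Resampling children rather than individual measurements keeps each child's repeated observations together, which is the reason given above for the bootstrap design.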

Force of infection from age-structured seroprevalence in Kenya
In the Kenya cohort, we estimated force of infection through age-structured seroprevalence using multiple approaches. There is a long history of methods development to estimate force of infection from age-dependent seroprevalence (Hens et al., 2012), which is of particular interest to large-scale, cross-sectional surveillance platforms. Our rationale was to determine if force of infection estimates from age-structured seroprevalence were comparable to estimates from the longitudinal analysis based on incident changes in serostatus.
As we show in Supplementary file 7, the age-dependent seroprevalence curve is the difference between the cumulative distribution functions of seroconversion times and seroreversion times. In the special case of no seroreversion, age-specific seroprevalence is thus the cumulative distribution function of seroconversion times. The age-specific force of infection can then be estimated as the hazard of seroconverting at age A = a: $\lambda(a) = F'(a) / [1 - F(a)]$, where $F(a) = P(Y \mid A = a)$ is the proportion of the population who are seropositive at age a and $F'(a)$ is the derivative of $F(a)$ with respect to a. Key assumptions include stationarity/homogeneity (i.e., no intervention or cohort effects) and that there is no seroreversion (Hens et al., 2012). There was no evidence for large changes in transmission during the studies, even due to intervention (Supplementary file 1). We know that for many enteric pathogens children in the Kenya cohort did serorevert (e.g., Figure 6); when this assumption is violated, estimates provide a lower bound of a pathogen's force of infection. We considered three different estimation approaches for force of infection from age-structured seroprevalence.
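As a numerical illustration of the hazard formula above, the following Python sketch applies it to two hypothetical seroprevalence curves: one generated with a constant seroconversion rate and no seroreversion, and one that also includes a constant seroreversion rate. The curves, the rates (0.1 and 0.05 per month), and the function names are assumptions for the example, not study estimates.

```python
import math

def hazard_from_seroprevalence(seroprev, age, h=1e-5):
    """F'(a) / [1 - F(a)], with F'(a) from a central finite difference."""
    slope = (seroprev(age + h) - seroprev(age - h)) / (2.0 * h)
    return slope / (1.0 - seroprev(age))

def no_reversion_curve(age, rate=0.1):
    """Seroprevalence under a constant seroconversion rate, no seroreversion."""
    return -math.expm1(-rate * age)  # 1 - exp(-rate * age)

def reversible_curve(age, rate=0.1, reversion=0.05):
    """Seroprevalence when seropositive children also serorevert at a constant rate."""
    total = rate + reversion
    return rate / total * -math.expm1(-total * age)

recovered = hazard_from_seroprevalence(no_reversion_curve, 6.0)
attenuated = hazard_from_seroprevalence(reversible_curve, 6.0)
```

In the first case the formula recovers the seroconversion rate used to generate the curve. In the second it returns a smaller value, consistent with the statement that estimates are a lower bound when children serorevert.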

Exponential model (SIR model)
The simplest catalytic model, a susceptible-infected-recovered (SIR) model, assumed a constant force of infection over different ages, $\lambda(a) = \lambda$, and no seroreversion (Hens et al., 2012). In the survival analysis context, this is equivalent to assuming a constant hazard, which can be estimated with an exponential survival model. We modeled the probability of being seropositive conditional on age with a generalized linear model fit with maximum likelihood that assumed a binomial error structure and complementary log-log link (Jewell and Laan, 1995). We estimated average force of infection from the model's intercept term:
$\log[-\log(1 - P(Y \mid A = a))] = \log \lambda + \log a$
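Under a constant rate with no seroreversion, seroprevalence at age a is 1 - exp(-rate × a), so a complementary log-log model with log age as an offset has log(rate) as its intercept. The Python sketch below is an illustration only: rather than calling a GLM routine, it maximizes the same binomial likelihood directly by bisection on the score function. The function names and toy data are hypothetical, and ages are assumed to be strictly positive.

```python
import math

def score(rate, ages, seropositive):
    """Derivative of the binomial log-likelihood with respect to the rate."""
    total = 0.0
    for age, y in zip(ages, seropositive):
        x = rate * age
        if y:
            total += age * math.exp(-x) / -math.expm1(-x)
        else:
            total -= age
    return total

def fit_constant_rate(ages, seropositive, tol=1e-10):
    """Maximum likelihood estimate of a constant seroconversion rate."""
    if not any(seropositive) or all(seropositive):
        raise ValueError("need both seropositive and seronegative observations")
    lo = hi = 1e-8
    while score(hi, ages, seropositive) > 0:  # expand until the score changes sign
        hi *= 2.0
    while hi - lo > tol * hi:  # the score is decreasing in rate, so bisect
        mid = (lo + hi) / 2.0
        if score(mid, ages, seropositive) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Toy check: four children all aged 10 months, half of them seropositive.
# The estimate should then satisfy 1 - exp(-10 * rate) = 0.5.
estimate = fit_constant_rate([10.0, 10.0, 10.0, 10.0], [1, 1, 0, 0])
```

The log of the returned rate corresponds to the intercept of the complementary log-log model described above.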

Reversible catalytic model (SIS model)
For some infectious diseases, like malaria, reversible catalytic models have been proposed to estimate force of infection from an age-seroprevalence curve while accounting for antibody waning with time since infection (Corran et al., 2007). The model assumes a constant rate of seroconversion, $\lambda$, but extends the SIR model by also assuming a constant seroreversion rate, $\rho$, equivalent to a susceptible-infected-susceptible (SIS) model. We modeled the probability of a child being seropositive conditional on age as a function of these two parameters:
$P(Y \mid A = a) = \frac{\lambda}{\lambda + \rho}\left[1 - \exp\left(-(\lambda + \rho)a\right)\right]$
We fit the model with maximum likelihood assuming a binomial error structure and a fixed seroreversion rate. To incorporate information about the seroreversion rate in the model, we bootstrapped the dataset 1000 times, resampling children with replacement. In each bootstrap replicate, we estimated each pathogen's seroreversion rate using information from longitudinal data, and then fit the reversible catalytic model assuming a cross-sectional sample. The results are thus optimistic because they incorporate some information from the longitudinal design. Attempts to fit an SIS model assuming only a cross-sectional design with the seroreversion rate as a second free parameter led to highly unstable estimates, consistent with results presented in Supplementary file 7 that show age-dependent seroprevalence alone does not technically contain information about seroreversion ($\rho$). As an internal validity check, we confirmed that force of infection estimates from the SIS model matched those from the SIR model for ETEC and Campylobacter, pathogens which had seroreversion rates that approached 0.
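A minimal Python sketch of fitting the reversible catalytic model with the seroreversion rate held fixed is shown below. It is an illustration, not the study's code: the golden-section search over the log seroconversion rate, the search bounds, the clamping constant, and the function names are assumptions for the example, and the bootstrap step that re-estimates the seroreversion rate in each replicate is omitted.

```python
import math

def seroprevalence(rate, reversion, age):
    """Reversible catalytic curve with constant seroconversion and seroreversion."""
    total = rate + reversion
    return rate / total * -math.expm1(-total * age)

def log_likelihood(rate, reversion, ages, seropositive):
    ll = 0.0
    for age, y in zip(ages, seropositive):
        p = seroprevalence(rate, reversion, age)
        p = min(max(p, 1e-12), 1.0 - 1e-12)  # keep the logarithms finite
        ll += math.log(p) if y else math.log1p(-p)
    return ll

def fit_rate(ages, seropositive, reversion, bounds=(1e-6, 10.0)):
    """Maximize the binomial likelihood over the seroconversion rate only."""
    def f(u):
        return log_likelihood(math.exp(u), reversion, ages, seropositive)
    lo, hi = math.log(bounds[0]), math.log(bounds[1])
    inv_phi = (math.sqrt(5.0) - 1.0) / 2.0
    c, d = hi - inv_phi * (hi - lo), lo + inv_phi * (hi - lo)
    fc, fd = f(c), f(d)
    while hi - lo > 1e-9:
        if fc < fd:  # maximum lies to the right of c
            lo, c, fc = c, d, fd
            d = lo + inv_phi * (hi - lo)
            fd = f(d)
        else:  # maximum lies to the left of d
            hi, d, fd = d, c, fc
            c = hi - inv_phi * (hi - lo)
            fc = f(c)
    return math.exp((lo + hi) / 2.0)

# Toy data: four children all aged 10 months, half of them seropositive.
ages, status = [10.0] * 4, [1, 1, 0, 0]
no_reversion_fit = fit_rate(ages, status, reversion=0.0)
with_reversion_fit = fit_rate(ages, status, reversion=0.05)
```

With the seroreversion rate set to 0 the fit reduces to the constant-rate model without seroreversion, mirroring the internal validity check described above, and a positive seroreversion rate yields a higher seroconversion estimate for the same data.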

Semiparametric spline model
We fit a model that allowed force of infection to vary flexibly by age using cubic splines in a generalized additive model (Wood, 2017) for an arbitrary function $g(\cdot)$, which we fit with cubic splines that had smoothing parameters chosen through cross-validation.