Protection From Natural Immunity Against Enteric Infections and Etiology-Specific Diarrhea in a Longitudinal Birth Cohort

Abstract
Background. The degree of protection conferred by natural immunity is unknown for many enteropathogens, but it is important to support the development of enteric vaccines.
Methods. We used the Andersen-Gill extension of the Cox model to estimate the effects of previous infections on the incidence of subsequent subclinical infections and diarrhea in children under 2 using quantitative molecular diagnostics in the MAL-ED cohort. We used cross-pathogen negative control associations to correct bias due to confounding by unmeasured heterogeneity of exposure and susceptibility.
Results. Prior rotavirus infection was associated with a 50% lower hazard (calibrated hazard ratio [cHR], 0.50; 95% confidence interval [CI], 0.41–0.62) of subsequent rotavirus diarrhea. Strong protection was evident against Cryptosporidium diarrhea (cHR, 0.32; 95% CI, 0.20–0.51). There was also protection due to prior infections for norovirus GII (cHR against diarrhea, 0.67; 95% CI, 0.49–0.91), astrovirus (cHR, 0.62; 95% CI, 0.48–0.81), and Shigella (cHR, 0.79; 95% CI, 0.65–0.95). Minimal protection was observed for other bacteria, adenovirus 40/41, and sapovirus.
Conclusions. Natural immunity was generally stronger for the enteric viruses than bacteria, potentially due to less antigenic diversity. Vaccines against major causes of diarrhea may be feasible but likely need to be more immunogenic than natural infection.

Vaccines are currently under development for several leading causes of diarrhea among children in low-resource settings, including Shigella [1], enterotoxigenic Escherichia coli [2], norovirus [3], and Campylobacter [4]. The development of immunity after natural infection is an important guide towards a successful vaccine. For rotavirus, observational analyses of natural immunity in low-resource settings found levels of protection that were comparable to vaccine efficacy [5][6][7], suggesting that natural immunity was a good predictor of vaccine performance. Assessment of natural immunity has informed cholera vaccine development, and levels of natural immunity are frequently used as a benchmark against which to judge vaccine candidates [8,9]. However, for many of the highest burden diarrhea etiologies, the degree of protection conferred by natural immunity is unknown, and immunologic surrogates of protection are imperfect. As large-scale investments in new vaccines are being made, better understanding of the magnitude, if any, of natural immunity to enteric pathogens is needed to inform expectations for vaccine effectiveness.
Estimates of natural immunity are confounded by the fact that children who are infected with an enteric pathogen may be more likely to be infected again due to high exposure risk and/or greater host susceptibility compared with other children. For example, children with a water source that is contaminated with Cryptosporidium will be more likely than other children to be infected again, even if they acquire some immunity after their first infection. Because this bias is expected to be in the opposite direction of a protective effect of natural immunity, confounding may completely mask evidence of protection or identify prior infection as a risk factor for subsequent infection.
Although observed factors such as sociodemographics, environmental characteristics, and markers of malnutrition may be able to explain some of this heterogeneity, these variables have been only modestly associated with acquisition of specific enteric infections among children in low-resource settings [10][11][12][13][14], suggesting that it is difficult to predict exposure and susceptibility based on readily observed characteristics. Unmeasured factors, such as innate immunity (major histocompatibility complex variation), are likely also important to characterize individual susceptibility, but they are difficult to account for in this setting. Alternatively, exposure to other enteropathogens can be used as negative controls because prior exposure to other pathogens would not be expected to elicit immunity, but it may be a good proxy for exposure and susceptibility due to common transmission pathways and host-related risk factors.
In this study, we estimated the protective effects of natural infection by common enteric pathogens against subsequent subclinical infection and etiology-specific diarrhea identified by quantitative molecular diagnostics in a longitudinal birth cohort. We used cross-pathogen estimates of protection as negative controls to estimate the magnitude of confounding bias due to unmeasured heterogeneity of exposure and susceptibility and to calibrate effect estimates to account for this systematic error.

METHODS
The study design and methods of the MAL-ED study have been previously described [15]. In brief, children were enrolled within 17 days of birth from November 2009 to February 2012 at 8 sites: Dhaka, Bangladesh; Fortaleza, Brazil; Vellore, India; Bhaktapur, Nepal; Naushahro Feroze, Pakistan; Loreto, Peru; Venda, South Africa; and Haydom, Tanzania. Each site obtained ethical approval from their institutions, and written informed consent was obtained from participants. Children were excluded if their mother was <16 years of age, their family intended to move from the study area, they were from a multiple pregnancy, their birthweight was ≤1500 grams, or they were diagnosed with congenital or severe neonatal disease. Surveillance was conducted for diarrhea, defined as maternal report of ≥3 loose stools in 24 hours or 1 stool with visible blood, at twice-weekly home visits until 2 years of age. Stool samples were collected monthly and during diarrhea episodes.
Stool samples from the subset of children with complete follow-up to 2 years of age were tested for 29 enteropathogens (Supplementary Table S1) by quantitative polymerase chain reaction (qPCR) using custom-designed TaqMan Array Cards (Thermo Fisher Scientific, Carlsbad, CA). Details of the assays and quality control have been previously described [16,17]. Pathogen-attributable diarrhea episodes were defined using adjusted attributable fractions (AFes) for each episode to account for subclinical infections [16,18]. In brief, we used the pathogen quantity-, age-, and sex-specific odds ratios (ORs) for diarrhea to estimate the AFe for each diarrhea episode as follows: AFe = 1 - 1/OR. We defined pathogen-attributable episodes as those in which the pathogen quantity-derived AFe was .5 or higher (ie, majority attribution), as previously described [19]. In a sensitivity analysis, we excluded diarrhea episodes in which more than 1 etiology was identified. Severe pathogen-attributable diarrhea episodes were defined by a severity score greater than 6, derived from components of the Vesikari score [20].
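The attribution rule above can be illustrated with a minimal sketch. The function names and the odds ratios used below are hypothetical, and flooring the AFe at 0 for ORs below 1 is an assumption of this sketch rather than something stated in the text. The OR is taken as given; how it is derived from pathogen quantity, age, and sex is not modeled here.

```python
def attributable_fraction(odds_ratio: float) -> float:
    """Episode-level attributable fraction, AFe = 1 - 1/OR.

    Assumption of this sketch: values below 0 (OR < 1) are floored at 0.
    """
    if odds_ratio <= 0:
        raise ValueError("odds ratio must be positive")
    return max(0.0, 1.0 - 1.0 / odds_ratio)


def is_attributable(odds_ratio: float, threshold: float = 0.5) -> bool:
    """Majority attribution: the episode is attributed when AFe >= threshold."""
    return attributable_fraction(odds_ratio) >= threshold


# Hypothetical odds ratios for three detections in diarrheal stools.
for hypothetical_or in (1.5, 2.0, 4.0):
    print(hypothetical_or, attributable_fraction(hypothetical_or),
          is_attributable(hypothetical_or))
```

Under this rule, an OR of 2 is the smallest value that yields majority attribution, because 1 - 1/2 = .5.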
Because low-quantity detection of pathogen nucleic acid by qPCR may not indicate an established infection, we defined infections that could confer natural immunity as any detection at a quantity corresponding to qPCR cycle threshold (Cq) value ≤30. For pathogens in which lower quantities were associated with diarrhea (where AFe ≥.5), the quantity associated with diarrhea was used (rotavirus, Cq ≤32.638; Shigella, Cq ≤30.507; adenovirus 40/41, Cq ≤30.424; and norovirus GII, Cq ≤30.357). In sensitivity analyses, we (1) used the more sensitive analytical cutoff of Cq <35 to define infections and (2) considered only attributable diarrhea episodes as able to confer natural immunity, a more specific definition. We further limited to new infections that occurred at least 21 days after a previous detection. In sensitivity analyses, we defined new infections after 14 and 31 days.
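A rough sketch of this infection definition follows. The Cq cutoffs are those listed above; the function name, the data layout, and the example detections are hypothetical. The sketch also assumes that the 21-day gap is measured from the most recent detection that met the quantity cutoff, which is one possible reading of the text.

```python
DEFAULT_CQ_CUTOFF = 30.0
# Diarrhea-associated cutoffs given in the text for four pathogens.
CQ_CUTOFFS = {
    "rotavirus": 32.638,
    "Shigella": 30.507,
    "adenovirus 40/41": 30.424,
    "norovirus GII": 30.357,
}


def new_infections(detections, pathogen, min_gap_days=21):
    """Return the ages (in days) at which new infections begin.

    detections: iterable of (age_days, cq) pairs for one child and pathogen.
    A detection counts only if its Cq is at or below the pathogen's cutoff,
    and it starts a new infection only if it falls at least min_gap_days
    after the previous qualifying detection (assumption of this sketch).
    """
    cutoff = CQ_CUTOFFS.get(pathogen, DEFAULT_CQ_CUTOFF)
    qualifying_ages = sorted(age for age, cq in detections if cq <= cutoff)
    infections = []
    last_age = None
    for age in qualifying_ages:
        if last_age is None or age - last_age >= min_gap_days:
            infections.append(age)
        last_age = age  # the gap is measured from the latest detection
    return infections


# Hypothetical detections: (age in days, Cq value).
example = [(30, 25.0), (40, 28.0), (70, 29.0), (100, 32.0)]
print(new_infections(example, "Cryptosporidium"))
print(new_infections(example, "rotavirus"))
```

Changing `min_gap_days` to 14 or 31 corresponds to the sensitivity analyses on the minimum duration between infections.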
To model these effects, we used the Andersen and Gill extension of the Cox model for recurrent events [21] with a counting process formulation. Each risk period was defined by birth or age at 21 days after a prior infection to age at subsequent outcome, and each child contributed multiple risk periods from birth to 2 years of age (Supplementary Figure S1). Therefore, baseline hazards by age were estimated non-parametrically, and hazards of subsequent outcomes were conditioned on age. We estimated protection due to natural immunity as follows: (1-hazard ratio [HR]) × 100, in which the HR compared children who had experienced 1 or 2+ prior infections to children who experienced no prior infections. Robust variance accounted for correlation between risk periods within each child [22]. Models were adjusted for site and prespecified risk factors for enteric infections identified in previous analyses of MAL-ED [10][11][12][13][14]: sex, socioeconomic status (WAMI index [23]), enrollment weight-for-age z-score, maternal height, maternal education, crowding in the home, and percentage of days exclusively breastfed (from birth to the current risk period up to 6 months of age). In a sensitivity analysis, we stratified effects by age.
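As an illustration of the counting process layout, the sketch below builds risk periods for one child in the simplest case, where the outcome and the prior exposure are both infections with the same pathogen. The function names and the example ages are hypothetical, and ending follow-up at 730 days is an assumption standing in for 2 years of age. Fitting the Andersen-Gill model itself is not shown.

```python
def risk_periods(infection_ages, end_of_followup=730, washout=21):
    """Build (start, stop, event, prior_infections) rows for one child.

    Each period runs from birth, or from `washout` days after the prior
    infection, to the next infection (event = 1) or to the end of
    follow-up (event = 0). Prior infections are coded 0, 1, or 2 (for 2+).
    """
    periods = []
    start, n_prior = 0, 0
    for age in sorted(infection_ages):
        if age > end_of_followup:
            break
        if age > start:  # skip events that fall inside the washout window
            periods.append((start, age, 1, min(n_prior, 2)))
        start = age + washout
        n_prior += 1
    if start < end_of_followup:
        periods.append((start, end_of_followup, 0, min(n_prior, 2)))
    return periods


def protection_percent(hazard_ratio):
    """Protection due to natural immunity, (1 - HR) x 100."""
    return (1.0 - hazard_ratio) * 100.0


# Hypothetical child with new infections at 100 and 300 days of age.
for row in risk_periods([100, 300]):
    print(row)
print(protection_percent(0.50))
```

Because every row carries its own start and stop age, a model fit to rows of this shape compares children at the same age, which is how the hazards are conditioned on age.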

Bias Calibration
We used cross-pathogen estimates of protection, that is, associations between each enteric infection outcome and prior exposure to a different pathogen (eg, the protection due to prior Campylobacter infection on subsequent Shigella diarrhea) as negative control associations. We considered pathogens of the same type (bacteria, viruses, and parasites) as negative controls for each of the enteric infections. Because there were only 4 parasites, both bacteria and parasites were included as negative controls for Cryptosporidium. In sensitivity analyses, we included all pathogens as negative controls and compared calibrated estimates to estimates additionally adjusted for prior exposure to other pathogens.
We calibrated estimates with the negative controls using methods previously described [24,25] with the EmpiricalCalibration package in R. After estimating the negative control associations using the models above, we used maximum likelihood to generate a systematic error model for each pathogen outcome that fit a Gaussian distribution to the negative control estimates and accounted for the sampling error of each estimate. Assuming the systematic error does not change as a function of the true effect size, we then estimated a calibrated distribution for the effect of interest that incorporated both random error and the systematic error model estimated above. We computed calibrated effect estimates and confidence intervals (CIs) as the .5, .025, and .975 percentiles of the corresponding cumulative distribution function [25].
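The calibration steps can be sketched in simplified form as follows. This is not the EmpiricalCalibration package's implementation: it is a plain-Python illustration that assumes a constant systematic error (mean mu and standard deviation sigma on the log HR scale), fits it by a grid search over sigma with the maximum likelihood mu at each value, and then shifts and widens the estimate of interest. All function names and input values are hypothetical.

```python
import math


def neg_log_lik(mu, sigma, estimates, ses):
    """Negative log-likelihood of negative control log HRs, each modeled
    as Normal(mu, sigma^2 + se_i^2) to account for sampling error."""
    total = 0.0
    for est, se in zip(estimates, ses):
        var = sigma ** 2 + se ** 2
        total += 0.5 * (math.log(2 * math.pi * var) + (est - mu) ** 2 / var)
    return total


def fit_systematic_error(estimates, ses, sigma_max=2.0, steps=2000):
    """Grid search over sigma; for each sigma the ML estimate of mu is the
    inverse-variance weighted mean of the negative control estimates."""
    best = None
    for k in range(steps + 1):
        sigma = sigma_max * k / steps
        weights = [1.0 / (sigma ** 2 + se ** 2) for se in ses]
        mu = sum(w * e for w, e in zip(weights, estimates)) / sum(weights)
        nll = neg_log_lik(mu, sigma, estimates, ses)
        if best is None or nll < best[0]:
            best = (nll, mu, sigma)
    return best[1], best[2]


def calibrate(log_hr, se, mu, sigma):
    """Return (cHR, lower, upper): the .5, .025, and .975 percentiles of the
    calibrated distribution under a constant systematic error model."""
    z = 1.959963984540054  # 97.5th percentile of the standard normal
    center = log_hr - mu
    total_sd = math.sqrt(se ** 2 + sigma ** 2)
    return tuple(math.exp(center + k * total_sd) for k in (0.0, -z, z))


# Hypothetical negative control HRs (log scale) and standard errors.
nc_log_hrs = [math.log(x) for x in (1.10, 1.25, 1.05, 1.30, 1.18)]
nc_ses = [0.10, 0.12, 0.09, 0.15, 0.11]
mu, sigma = fit_systematic_error(nc_log_hrs, nc_ses)
print(calibrate(math.log(0.80), 0.10, mu, sigma))
```

When the negative control associations sit above the null (mu > 0), the calibrated HR is lower than the adjusted HR, which corresponds to a larger estimate of protection.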

RESULTS
Among 1715 children with follow-up to 2 years and molecular testing of stool samples, there were a total of 52 382 detections of the top 10 causes of diarrhea (Table 1). Nearly half of these detections (n = 24 520, 46.8%) were at quantities of Cq ≤30 (or greater than the diarrhea-associated quantity) and 21 days distant from a prior infection. Among these infections, the children experienced 3526 attributable diarrhea episodes and 539 severe attributable diarrhea episodes. For each pathogen, more than half of children had at least 1 infection, except for rotavirus (48.9% of children). Shigella (n = 719 episodes; 29.6% of children with 1+ episodes) and rotavirus (n = 552; 26.0%) were the most common causes of diarrhea (Table 1).
Covariate-adjusted estimates of protection against subsequent subclinical infection and attributable diarrhea due to prior infection (subclinical or diarrhea) (Figure 1A) were nearly equivalent to the corresponding unadjusted estimates (Supplementary Table S2). However, cross-pathogen negative control associations generated systematic bias distributions that were above the null for almost all pathogen outcomes, suggesting that the adjusted estimates were biased upwards and underestimated natural immunity (Supplementary Figure S2). The magnitude of estimated bias (Table 2) was generally larger for the diarrhea outcomes than the infection outcomes, and it was larger for estimates assessing 2+ prior infections compared with 1 prior infection. Shigella demonstrated the largest magnitude of bias. Hazard ratios between Shigella diarrhea and 2+ prior infections by negative control pathogens were systematically 42% higher than the expected null association (mean adjusted HR for systematic error, 1.42; 95% CI, 1.34-1.51). The viruses rotavirus, astrovirus, sapovirus, and adenovirus 40/41 had similar bias distributions, with negative control associations an average of 18% higher than the null for the diarrhea outcomes. In contrast, there was no systematic error identified for norovirus GII or C jejuni/C coli (Table 2).
After calibrating the adjusted estimates, estimates of protection generally increased (Figure 1B, Table S3).
There was no evidence of protection against bacteria or parasites that were infrequently associated with diarrhea (Table S5). Estimates of protection against rotavirus diarrhea were similar in the sites that had (Brazil, Peru, and South Africa) and had not introduced rotavirus vaccine. Protection against rotavirus infection was slightly stronger in sites that had introduced vaccine (cHR, 0.65; 95% CI, 0.42-1.00 for 1+ prior infection) compared with sites that had not (cHR, 0.76; 95% CI, 0.63-0.92).
Defining infections at a quantity cutoff of Cq <35 instead of Cq ≤30 resulted in inconsistent shifts in the estimates (Supplementary Table S6). For example, estimates of natural protection against diarrhea for rotavirus and astrovirus were closer to the null; estimates for norovirus GII and sapovirus were further from the null. Prior infections with Cryptosporidium were strongly predictive of subsequent infections, which may indicate that low-quantity detections are more likely to identify persistent infections rather than new infections (Supplementary Table S6).
Modification of the minimum duration between new infections (14 and 31 days instead of 21) resulted in minor changes to the estimates (Supplementary Tables S7 and S8). Defining prior exposure by prior attributable diarrhea instead of prior infection generally resulted in smaller estimates of protection, suggesting potential misclassification of immunity acquired in the absence of diarrheal symptoms (Supplementary Table S9). The exclusion of diarrhea episodes with multiple attributable pathogens resulted in slightly stronger estimates of protection for the enteric viruses (Supplementary Table S10). Including all pathogens as negative controls resulted in smaller estimates of bias, especially for the viruses, with calibrated estimates generally closer to the covariate-adjusted estimates (Supplementary Table S11). Finally, adjusting for prior exposure to other pathogens did not completely correct the bias identified by the negative control calibration approach (Supplementary Table S12).

DISCUSSION
We estimated strong protection due to prior infection against rotavirus, Cryptosporidium, and astrovirus diarrhea, which suggests that vaccine development for the latter 2 may be relatively feasible. Norovirus GII and Shigella also exhibited protection against diarrhea, but more strongly after multiple infections, which may reflect low potency or heterotypic protection. Less protection was observed for sapovirus and adenovirus 40/41. Estimates of protection against infection were universally smaller in magnitude, whereas estimates against severe diarrhea were larger, compared with diarrhea of any severity. Observed levels of natural protection were generally consistent with previous literature. Estimates for rotavirus were similar to those in a previous study from India (40%-80% protection against diarrhea) [5], and they were smaller in magnitude than estimates from Mexico [6] and Guinea-Bissau [7] (70%-90%). These estimates were also consistent with estimates of rotavirus vaccine efficacy in low-resource settings [26], which supports the use of the models for other pathogens. Levels of natural protection against norovirus GII were consistent with previous analyses of MAL-ED using a subset of samples (25%-30% protection against diarrhea) [27] and smaller than estimates from Peru (50%-80%) [28]. In contrast, no natural immunity to norovirus was observed in a study in Ecuador [29]. Estimates for astrovirus were larger than those from previous analyses of MAL-ED using the enzyme immunoassay in a subset of samples [30] and those from a study in rural Egypt [31]. Protection for Cryptosporidium was larger against diarrhea (70% vs 25%) and smaller against subclinical infection (20% vs 50+%) compared with a study in India [32].
The evidence for natural immunity to Cryptosporidium is supported by protection observed among adults with pre-existing serum antibodies [33] and delayed cryptosporidiosis among children with higher levels of anti-Cryptosporidium fecal immunoglobulin A (IgA) [34].
Because many of the pathogens included in this analysis are immunologically heterogeneous, these results should be interpreted as estimates of "functional protection" that reflect both homotypic and heterotypic immunity at the population level. Different degrees of pathogen heterogeneity may explain variations in levels of protection. Shigella is antigenically diverse with 4 species and more than 50 serotypes [35], such that poor cross-protection may explain the relatively limited observed natural protection. A previous analysis of infection-derived immunity to Shigella in Chile found 14% protection overall but more than 70% serotype-specific protection [36].
The lack of natural protection observed for sapovirus, ST-ETEC, and C jejuni/C coli may also be explained by antigenic diversity. Sapovirus has 4 genogroups and 16 genotypes [37], with a lack of cross-protection [38]. Protection against ETEC is conferred by immune responses to more than 20 different colonization factors [39,40], and it was previously only observed for ETEC infections of the same toxin-colonization factor profile [41]. Likewise, protection against Campylobacter may be related to the polysaccharide capsule, which has a broad diversity of types [42].
The bias analysis using negative controls identified substantial unmeasured confounding in the covariate-adjusted estimates. Larger magnitudes of bias observed for the diarrhea than infection outcomes suggest that confounding by heterogeneity of susceptibility may be more important than heterogeneity of exposure because the latter would not be expected to differ based on the severity of the outcome. One source of heterogeneity in host susceptibility may be histo-blood group antigen status [43]. It is interesting that bias was not observed for norovirus GII and C jejuni/C coli, which may reflect the tendency for norovirus to spread indiscriminately in populations (because it is a cause of outbreaks in other settings) [44] and the uniformly high frequency of Campylobacter detection [10,19]. Persistent carriage of Giardia [11], Salmonella [45], and H pylori [46] may explain the strong positive associations with repeated detections for these pathogens, because subsequent detections may not be capturing new infections.
This analysis leverages previous work by using molecular diagnostics across pathogens and attributable fractions to adjudicate diarrhea etiology in the context of frequent subclinical infection. This study design allowed for the novel bias correction based on multiple negative control associations, which generated an empirical null distribution that captured a common distribution of biases [24], but did not require the structure and magnitude of confounding to be identical [47]. However, these analyses were limited by the inability to assess homotypic immunity. The poor sensitivities of typing assays applied directly to stool specimens for rotavirus, Shigella, and ST-ETEC resulted in more than 50% of detections being typed as other, which prohibited assessment of homotypic immunity. Furthermore, speciation and typing directly from stool rather than isolates limited our ability to ensure that (1) G and P types for rotavirus and (2) toxins and colonization factors for ETEC were detected in the same organism. In addition, with only monthly sampling, there was likely under-ascertainment of infections. In the 5 sites that had not introduced rotavirus vaccine, 30.4% (n = 238) of first rotavirus infections based on serologic testing at 7 and 15 months of age (IgA ≥20 U/mL) were not detected by qPCR. A total of 36.6% of these children were qPCR positive at an older age, at a median age delay of 5.0 months (interquartile range, 2.8-9.1). Because we were unable to make this comparison for the other pathogens, serology was not included as evidence of prior infection in the analysis.

CONCLUSIONS
Estimates of natural protection for most enteric pathogens were modest and suggest that vaccines that simulate natural infections for the enteric bacteria are unlikely to provide important levels of protection. Vaccines currently in development, such as for Shigella, will likely need to provide heterotypic protection, be conjugated, and/or require boosting. Furthermore, protection against infection was limited, such that vaccines are unlikely to provide sterilizing immunity. Therefore, although vaccines could limit the acute burden of diarrheal illness and mortality, they may not effectively address the long-term impacts of subclinical infections, such as growth impairment [19].

Supplementary Data
Supplementary materials are available at The Journal of Infectious Diseases online. Consisting of data provided by the authors to benefit the reader, the posted materials are not copyedited and are the sole responsibility of the authors, so questions or comments should be addressed to the corresponding author.