Association of intestinal bacteria with immune activation in a cohort of healthy adults

ABSTRACT Interactions among intestinal bacteria and the immune system contribute to the maintenance of a functional intestinal barrier in healthy individuals, and possibly to systemic immune activity. We hypothesized that intestinal bacteria would be associated with systemic biomarkers of innate and adaptive immune responses in healthy adults. 79 immune function markers were subjected to factor analysis resulting in 17 Immune Factors (IFs), each composed of 2–10 immune variables. Bacterial taxa from stool samples were identified at the family and genus levels by 16S rRNA amplicon sequence analysis and their read counts and relative abundances were utilized in a multiple linear regression model to identify microbial taxa associated with the IFs. A total of 10 significant associations were identified between bacterial taxa and IFs. The family Rikenellaceae showed a positive association with innate IF5 (including 5 chemokines, 2 cytokines, 2 adhesion molecules, and the macrophage metabolite neopterin) and a negative association with adaptive IF4 (including T-cells with activation marker HLA-DR). Additionally, Pseudomonadaceae and its genus Pseudomonas showed a negative relationship with innate IF5, and adaptive IF13 (including T-cell cytokines IL-10, IL-17, and IFN-γ) was negatively associated with Butyrivibrio and positively associated with Slackia. These associations suggest ongoing interactions between gut bacteria and the systemic immune system in healthy adults. The association of these taxa with the IFs may result from specific microbial-immune system interactions that play a role in maintenance of a healthy barrier integrity in our cohort of healthy adults.

IMPORTANCE Chronic inflammation may develop over time in healthy adults as a result of a variety of factors, such as poor diet directly affecting the composition of the intestinal microbiome, or by causing obesity, which may also affect the intestinal microbiome. These effects may trigger the activation of an immune response that could eventually lead to an inflammation-related disease, such as colon cancer. Before disease develops it may be possible to identify subclinical inflammation or immune activation attributable to specific intestinal bacteria normally found in the gut that could result in future adverse health impacts. In the present study, we examined a group of healthy men and women across a wide age range with and without obesity to determine which bacteria were associated with particular types of immune activation to identify potential preclinical markers of inflammatory disease risk. Several associations were found that may help develop dietary interventions to lower disease risk.

direct interaction with the intestinal epithelial barrier or by the secretion of bioactive small molecules (e.g., short-chain fatty acids or other metabolites). One example of common host-microbe interaction in the intestine is the recognition of microbial-associated molecular patterns via pattern-recognition receptors. Antigen-specific activation of memory T and B lymphocytes may also occur (1–4). In this study, our goal is to identify microbial taxa modulating innate and adaptive components of the systemic immune response in a cohort of healthy adults. We hypothesized that intestinal bacteria would be associated with function markers of innate and adaptive immune responses in healthy adults.
We used a uniquely large and diverse set of immune biomarkers measured in a population of clinically healthy adults. We applied factor analysis to these immune biomarkers to develop a smaller set of composite variables containing multiple specific measures of immune function that might be useful in identifying associations with types of immune activity (e.g., T-cell activation or blood levels of innate immune cells or inflammatory mediators). 17 Immune Factors (IFs) were defined, including 8 innate IFs and 9 adaptive IFs. The obtained IFs were tested for association with intestinal bacteria at the family and genus levels. Our study identified microbial taxa significantly associated with innate and adaptive immune responses.

Study design
Healthy adults were recruited into a Nutritional Phenotyping Study conducted at the USDA Western Human Nutrition Research Center (ClinicalTrials.gov: NCT02367287) as previously described (5,6). Data on race and ethnicity of the participants were presented previously (5). Data used for the analysis of the association of microbial taxa with immune biomarkers in this observational study are from 355 females and males categorized into three age categories (18–34, 35–49, and 50–65 yr) and three BMI categories (18.5–24, 25–29, and 30–44 kg/m²) (Table 1). A stool sample for microbial sequencing was collected by each participant, and to measure immune function, fasting blood was collected at the center after a 12-h overnight fast following consumption of a standard meal the evening before. Ethical approval for this study was received from the Institutional Review Board of the University of California, Davis, USA, as previously described (5,6).

Immune function markers
79 immune variables were measured from 362 healthy, fasting adults as described previously (7). The immune variables include soluble immune biomarkers measured in plasma and supernatants of cultured peripheral blood mononuclear cells (PBMC), and cellular markers measured using flow cytometry and a complete blood count (CBC). The immune markers are defined using the following categories: effector/memory T-cells and activation levels (n = 24), other lymphocytes including Th cells, NK, NK T-cells, and B cells (n = 11), PBMC cytokines (n = 8), CBC (n = 6), innate cell activation (n = 11), and plasma markers (n = 19) (Table S1).

CMV antibody test
A history of infection with cytomegalovirus (CMV) may expand memory/effector T-cells and increase the level of the CD4 and CD8 T-cell subsets (8,9). For this reason, we controlled the regression analysis of the association of microbial taxa with IFs for the CMV infection status. The study participants were assessed for a history of infection with CMV by measuring the CMV IgG level as described previously (7).

Bacterial 16S rRNA gene sequence analysis
Amplification and sequencing of the 16S rRNA V4-V5 region from bacterial DNA extracted from stool was performed by the Dalhousie University Integrated Microbiome Resource using primers 515F, GTGYCAGCMGCCGCGGTAA, and 926R, CCGYCAATTYMTTTRAGTT (10–12) as previously described (13). Sequences were analyzed using Qiime2 version 2019.10 (14) also as previously described (13). Except for two samples with 5,211 and 9,860 sequence counts, all samples included in the analysis had at least 10,000 sequence counts after quality filtering.

Statistical analysis
All statistical analyses were conducted in SAS 9.4 (SAS Institute, Cary, North Carolina, USA) other than the adjustment of microbial read counts, which was performed using R software (v 3.6.3) (15).

Factor analysis
We used factor analysis as a method of data reduction to explain the large number of immune variables by a few factors through grouping correlated variables together. To optimize the effect estimates, the immune variables with missing values were imputed prior to factor analysis by the multiple imputation procedure using PROC MI. The number of missing variables in the data set is shown in Table S2. To achieve a normal distribution, immune variables were normalized using rank-based normal transformation. Factor analysis was applied to the rank-normalized immune variables using PROC FACTOR and the principal method, and a correlation matrix was computed for all the immune variables included (Table S3). The correlation matrix includes correlation coefficients between the immune variables and IFs. Factors were extracted on the basis of their eigenvalues (the sum of the squared factor loadings for each factor), with eigenvalues > 1 as the extraction threshold, leaving 17 IFs. Orthogonal rotation was applied to the component matrix using the varimax method, in which the factors are assumed to act independently and to be uncorrelated. Positive factor loadings indicate a positive correlation between the factor and the raw data, and negative factor loadings indicate a negative correlation. Factor scores (the load coefficient scores) obtained for each participant were used as dependent variables in multiple linear regression analysis. The scree plot showing the relationship between eigenvalues and IFs (Fig. 1) and the heatmap showing immune variable loadings on the IFs (Fig. 2) were created in GraphPad Prism v9.5.0.
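The rank-based normal transformation and the eigenvalue > 1 retention rule described above can be illustrated with a minimal Python sketch. This is not the SAS code used in the study: the Blom offset (c = 3/8), the averaging of tied ranks, the function names, and the example eigenvalues are all assumptions made for illustration.

```python
from statistics import NormalDist


def rank_normalize(values, c=3.0 / 8.0):
    """Rank-based inverse normal transform.

    Assumption: Blom offset c = 3/8 and average ranks for ties; the study
    does not state which variant of the transformation was used.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied values
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # 1-based average rank of the run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    nd = NormalDist()
    # Map each rank to a probability strictly inside (0, 1), then to a z-score
    return [nd.inv_cdf((r - c) / (n + 1.0 - 2.0 * c)) for r in ranks]


def retained_factors(eigenvalues, threshold=1.0):
    """Count factors whose eigenvalue exceeds the threshold (eigenvalue > 1 rule)."""
    return sum(1 for e in eigenvalues if e > threshold)


# Hypothetical values, for illustration only
z_scores = rank_normalize([10.0, 2.0, 7.0])
n_kept = retained_factors([3.2, 1.4, 1.01, 0.9, 0.3])
```

Because the transform depends only on ranks, it is insensitive to outliers in the raw immune measurements, which is one common reason for applying it before factor analysis.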

Microbial taxa
16S rRNA sequencing libraries for microbial taxa were available from 355 participants. Microbial taxa found in 5% or fewer of the stool samples were filtered out from the data set. Taxa that were found in fewer than 25% of the subjects were represented as binary variables on a presence-absence scale, where absence is indicated by "0" and presence is indicated by "1." The remaining taxa (found in ≥25% of the subjects) were analyzed as continuous variables. Our data set is composed of 26 families and 50 genera as continuous variables and 9 families and 25 genera as binary variables. Sequence counts attributed to microbial taxa were adjusted using two different approaches. In the first approach, total sum scaling (the number of sequence counts attributed to each taxon divided by the total number of sequence counts within each sample) was used to adjust for variation in sample library size. The resulting percent abundance values were then rank transformed to achieve a normal distribution. In the second approach, sequence counts were adjusted for variation in library size with the DESeq2 package (v 1.26.0) (16) in R (v 3.6.3) (15). The DESeq2-adjusted sequence counts for continuous variables were transformed using the log function in SAS, which returns natural logarithm values (with pseudo-counts of one added to all input data with zero read counts prior to log transformation so that zero counts would be handled by the normalization scheme explicitly). In addition, prior to modeling, all the DESeq2-adjusted counts (continuous and binary variables) were standardized using PROC STANDARD in SAS. Only associations that were significant by linear regression analysis using both approaches for normalization of sequence counts are reported in the body of this manuscript.
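The prevalence thresholds, total sum scaling, and log transformation described above can be sketched as follows. This is an illustrative Python sketch, not the study's SAS or R code. The function names are hypothetical, and the handling of zeros (replacing a zero count with one before taking the natural log) is one reading of the pseudo-count description, stated here as an assumption.

```python
import math


def prevalence(counts):
    """Fraction of samples in which a taxon has a nonzero read count."""
    return sum(1 for c in counts if c > 0) / len(counts)


def classify_taxon(counts, drop_at=0.05, binary_below=0.25):
    """Apply the prevalence rules from the text to one taxon.

    Taxa found in 5% or fewer of samples are dropped, taxa found in fewer
    than 25% are treated as binary (presence/absence), and the rest are
    treated as continuous.
    """
    p = prevalence(counts)
    if p <= drop_at:
        return "dropped"
    if p < binary_below:
        return "binary"
    return "continuous"


def total_sum_scale(sample_counts):
    """Relative abundance of each taxon within a single sample."""
    total = sum(sample_counts.values())
    return {taxon: c / total for taxon, c in sample_counts.items()}


def log_with_pseudocount(counts):
    """Natural log of counts, with zeros replaced by one first (assumption)."""
    return [math.log(c if c > 0 else 1.0) for c in counts]


# Hypothetical read counts for one sample, for illustration only
abundances = total_sum_scale({"Rikenellaceae": 30, "Lachnospiraceae": 70})
```

Under this reading, a zero count maps to a log value of zero, the same value as a count of one.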

Association of microbial taxa with Immune Factors
Associations of microbial taxa at the family and genus levels with IFs were assessed using a linear regression model fit with PROC GLM in SAS. The models describing associations with the 17 IFs as dependent variables included microbial taxa as independent variables and the following categorical covariates: the three age categories, the three BMI categories, and sex. The model was also controlled for CMV IgG antibody status (positive/negative), as it was correlated with some immune markers in our study, primarily with different T-cell measures.
To minimize the effect of technical variation on the linear regression model, the GLM procedure was implemented on the microbial taxa using the two abundance scales, i.e., relative abundances and sequence counts, and two normalization methods, i.e., normalization by rank and natural logarithm, as explained above. Where bacteria were evaluated by sequence counts, taxa present in fewer than 25% of the subjects were subjected to multiple regression analysis for presence/absence. To account for multiple comparisons, Hochberg adjusted P-values were computed. Statistical significance was set at P < 0.05.
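For readers unfamiliar with the Hochberg adjustment mentioned above, the following minimal Python sketch shows how step-up adjusted P-values are obtained from a set of raw P-values. It is not the SAS procedure used in the study; the function name and the example P-values are hypothetical, and the grouping of tests into families for adjustment is not specified here.

```python
def hochberg_adjust(pvalues):
    """Hochberg step-up adjusted P-values, returned in the input order.

    The largest P-value is multiplied by 1, the second largest by 2, and so
    on; each adjusted value is the running minimum of these products taken
    from the largest P-value downward, capped at 1.
    """
    m = len(pvalues)
    # Indices sorted from the largest raw P-value to the smallest
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        candidate = (position + 1) * pvalues[idx]
        running_min = min(running_min, candidate)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P-values, for illustration only
raw = [0.01, 0.04, 0.03]
adjusted = hochberg_adjust(raw)
significant = [p < 0.05 for p in adjusted]
```

An association would then be called significant when its adjusted value falls below the 0.05 threshold stated in the text.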
The results of the analyses by the relative abundance and sequence count scales were compared. The results reported in this paper are from the analysis of microbial taxa by sequence counts and represent the significant associations that overlapped across the two approaches (Tables 2 and 3). All the significant associations found by each of the two analyses are presented separately in Tables S6 and S7. To compare regression coefficients between the two analyses, standardized regression coefficients were calculated.
To calculate the standardized regression coefficients, prior to modeling, the immune data set was standardized by rank-normalization and also by factor analysis, which produces standardized data by default. Where taxa were evaluated using relative abundance, data were standardized using rank transformation. Where taxa were evaluated using read counts, data were standardized using PROC STANDARD in SAS so that we would be able to compare regression coefficients between microbial taxa below and above the threshold of 25% (binary and continuous variables).
The standardized regression coefficients of the associations of microbial taxa with IFs represent the effect size of the associations according to the conventionally used values of 0.1–0.3 for small effect, 0.3–0.5 for moderate effect, and 0.5–1 for large effect (17–19).
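As a worked illustration of these conventions, the sketch below converts an unstandardized slope into a standardized coefficient (slope multiplied by the ratio of the predictor's standard deviation to the outcome's) and assigns an effect-size label. This is a hypothetical Python sketch rather than the study's procedure, which standardized the variables before modeling. The function names, the "negligible" label for values below 0.1, and the treatment of boundary values (lower bound inclusive) are assumptions.

```python
from statistics import stdev


def standardized_coefficient(b, x, y):
    """Standardized slope: b * sd(x) / sd(y)."""
    return b * stdev(x) / stdev(y)


def effect_size_label(beta):
    """Label a standardized coefficient using the 0.1 / 0.3 / 0.5 cut-points.

    Assumptions: the absolute value is used, lower bounds are inclusive, and
    values below 0.1 are called "negligible" (a label not used in the text).
    """
    magnitude = abs(beta)
    if magnitude < 0.1:
        return "negligible"
    if magnitude < 0.3:
        return "small"
    if magnitude < 0.5:
        return "moderate"
    return "large"


# Hypothetical coefficient, for illustration only
label = effect_size_label(-0.2)
```

When predictor and outcome are both standardized before fitting, as described in the text, the fitted slope is already on this scale and no conversion is needed.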
The unstandardized regression coefficients for the significant associations are summarized in Table S8. GraphPad Prism v9.5.0 was used for the graphical representation of the relationship of the IFs with microbial taxa present in <25% of the subjects (violin plots). Associations of microbial taxa present in ≥25% of the subjects with IFs were plotted using SAS (scatter plots).
For the 4 families and 3 genera that had significant associations with one or more IFs, a secondary, post hoc regression analysis was performed with all 79 immune variables in a regression model adjusted for sex, age, and BMI categories, and CMV infection status to identify further significant associations that might help in interpreting the primary findings with the IFs. A Benjamini-Hochberg (BH) P-value of <0.10 was used for this secondary analysis. Results are shown in Tables S11 and S12.
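The Benjamini-Hochberg adjustment used in this secondary analysis differs from the Hochberg adjustment used in the primary analysis in that it controls the false discovery rate. A minimal Python sketch follows; the function name and the example P-values are hypothetical, and this is not the code used in the study.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P-values, returned in the input order.

    Each raw P-value is multiplied by m / rank (rank 1 = smallest), and the
    adjusted value is the running minimum of these products taken from the
    largest P-value downward, capped at 1.
    """
    m = len(pvalues)
    # Indices sorted from the largest raw P-value to the smallest
    order = sorted(range(m), key=lambda i: pvalues[i], reverse=True)
    adjusted = [0.0] * m
    running_min = 1.0
    for position, idx in enumerate(order):
        rank = m - position  # ascending rank of this P-value
        running_min = min(running_min, pvalues[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P-values, for illustration only
raw = [0.005, 0.02, 0.03, 0.2]
adjusted = benjamini_hochberg(raw)
flagged = [p < 0.10 for p in adjusted]  # threshold stated in the text
```

The more permissive 0.10 threshold is in keeping with the exploratory purpose of the post hoc analysis.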

Characteristics of study participants
The study data set, including the immune function and microbial sequencing data, was collected from 355 clinically healthy adults recruited into the cross-sectional Nutritional Phenotyping Study. The participants were men and women enrolled into sampling bins defined by the three age and three BMI categories. The participants were also grouped into two categories by CMV IgG status, seronegative and seropositive. Measurement of the plasma IgG level showed that 176 (49.6%) individuals were CMV seropositive, indicating a history of CMV infection, and 179 (50.4%) individuals were seronegative. The participants' characteristics are summarized in Table 1.

Immune Factors
Factor analysis was carried out on 79 immune biomarkers from 362 healthy adults. The variance in the original data set of immune variables is distributed among the retained IFs, with the first IF explaining more variation than the other IFs (Fig. 1). 17 IFs were extracted, which jointly explained 86.6% of the variance in the data. Extracted components for each factor are shown in Fig. 2, and the related explained variances after the orthogonal rotation are reported in Table S4. 8 of the IFs are composed primarily of innate immune system biomarkers and 9 are composed primarily of adaptive immune system biomarkers (Table S5).

Association of microbial taxa with Immune Factors
35 bacterial families were examined for associations with the 17 IFs. 4 families were found to have significant associations with 4 different IFs with small effect sizes (Table 2; Fig. 3 and 4). Pseudomonadaceae with a range of relative abundance between 0 and 0.3% had a negative association with IF5, an unknown family of the order ML615J-28 with a range of relative abundance between 0 and 0.9% had a negative association with IF15, family Muribaculaceae with a range of relative abundance between 0 and 12.6% had a negative association with IF16, and Rikenellaceae with a range of relative abundance between 0 and 5.2% had a negative association with IF4 and a positive association with IF5.
Independently from the family-level analysis, 75 genera were examined for associations with the 17 IFs. 4 genera were found to have significant associations with 3 different IFs with small effect sizes (Table 3; Fig. 5 and 6), including genera from two families (Pseudomonadaceae and Rikenellaceae) already identified as having associations with IF4 and IF5. Pseudomonas, the only genus present in the family Pseudomonadaceae in our data set, with a range of relative abundance between 0 and 0.3% had a negative association with IF5, similar to the association seen with Pseudomonadaceae at the family level, and an unknown genus of the family Rikenellaceae with a range of relative abundance between 0 and 4.4% had a negative association with IF4 as well as a positive association with IF5, as was seen with Rikenellaceae at the family level. In addition, the genus Butyrivibrio (family Lachnospiraceae) with a range of relative abundance between 0 and 4% had a negative association with IF13, while the genus Slackia (family Coriobacteriaceae) with a range of relative abundance between 0 and 0.3% had a positive association with the same factor, IF13.
Associations of these taxa at the family and genus level with the individual components of the identified IFs are summarized in Tables S9 and S10. From 25% to 67% of the constituent variables within each factor were also significantly associated with the same taxa. In addition, we conducted a secondary, post hoc analysis of these taxa with all 79 immune variables to identify variables from other IFs that might be significantly associated with these taxa, as described in Methods. Only two additional immune variables were identified in this manner: blood lymphocyte concentration was negatively associated with Rikenellaceae (Table S11), and percent intermediate monocytes were negatively associated with Slackia (Table S12).

DISCUSSION
Weak associations (i.e., standardized regression coefficients < 0.3) identified among specific microbial taxa and IFs in healthy adult human gut microbial communities suggest that even in healthy humans Rikenellaceae might be associated with innate immune activation and Slackia with adaptive immune activation. On the other hand, Pseudomonas, Muribaculaceae, and Butyrivibrio were correlated with dampened innate immunity, monocyte activation, and T-cell activation, respectively, in our data set.
The positive association of Rikenellaceae, a family in the phylum Bacteroidetes, with innate IF5 (particularly the constituent variables ICAM-1, VCAM-1, and MCP-1) and its negative association with both adaptive IF4 (indicating T-cell activation) and blood lymphocyte concentration indicate that this taxon is associated with both increased innate immune response (e.g., vascular adhesion molecules and chemokines  cytokines IL-1β and IL-6, atherosclerotic lesion in the aortic area, and macrophage infiltration within the lesions (26–28). These latter findings are consistent with the positive association of Rikenellaceae with markers of inflammation in IF5, including VCAM-1, ICAM-1, and MCP-1, found by this study, indicating that this taxon might be associated with enhanced migration and adhesion of innate (and other) immune cells to the site of inflammation. However, future studies are required to determine whether the increased abundance of Rikenellaceae is a consequence or a cause of modulations of immune activity in subjects with higher BMI.
The negative association of the family Muribaculaceae, another family in the phylum Bacteroidetes, with IF16, which includes monocytes expressing HLA-DR, suggests that this family might be associated with suppressed monocyte activation and inflammation. Association of this taxon, which is positively associated with short-chain fatty acids such as butyrate and propionate (29–33), with reduced intestinal inflammation has been demonstrated in an obesity mouse model fed with resistant starch (34). Moreover, the abundance of Muribaculaceae was shown to decrease in a mouse model of colitis, and in a variety of inflammatory diseases in humans such as obesity and irritable bowel syndrome (35–38), suggesting a possible association of this taxon with reduced inflammation.
Negative associations of both Pseudomonadaceae and Pseudomonas with innate IF5 may suggest that Pseudomonas is associated with dampened systemic inflammation, which seems counterintuitive based on the known role of Pseudomonas as a pathogen. Some Pseudomonas species are opportunistic pathogens that cause infections in immunocompromised and hospitalized patients, including individuals with cystic fibrosis, cancer, and diabetic wounds (39). However, previous research shows that secondary metabolites of Pseudomonas such as indole, amino acids, and peptides have significant biological activity, including anti-inflammatory activity. For example, in vitro studies of macrophage cell lines showed that supernatants from P. aeruginosa culture broth significantly reduced the release of TNF-α, one of the components of IF5, and its mRNA expression level and inhibited the LPS-induced polarization of macrophages (40,41). In addition, Khan et al. (41) reported a decreased plasma level of TNF-α in rats given oral administration of the bacterial supernatant, or of a proline-based cyclic dipeptide identified in the supernatant, followed by LPS injection. These findings are consistent with the negative association of this microbial taxon with IF5 and the possibility that the Pseudomonas species found in these healthy volunteers may dampen inflammation, though this observational study does not assess causality. In addition, we found no previous human study that validates this association. On the other hand, Pseudomonas rarely causes gastrointestinal infections in healthy individuals. A small human study showed that oral administration of live cultures of P. aeruginosa produced no clinical symptoms in healthy individuals (42,43). The reason is not known. However, detection in the stool was transient unless antibiotics were administered prior to inoculation, suggesting that in most healthy individuals, Pseudomonas ingested with food does not colonize the gastrointestinal tract. Pseudomonas species are found in milk and vegetable products (44). It is therefore possible that the detection of higher amounts of Pseudomonas in this study is due to the consumption of foods more likely to contain Pseudomonas. We speculate that higher consumption of minimally processed vegetables and milk products may be associated with a healthier diet in our study volunteers. Therefore, we suggest that Pseudomonas may be acting as a marker of a healthy dietary pattern, though this speculation does not have a precedent in the literature.
The negative association of an unknown family of the order ML615J-28 with IF15 (inflammation markers; the MMPs) is supported by studies showing that the abundance of this taxon is reduced in patients with disease-associated chronic inflammatory states such as that seen in obesity (37,38,45). In addition, the abundance of this taxon was shown to be increased in lean-BMI individuals, and another study found it to be negatively associated with BMI and multiple disease biomarkers in blood such as LDL, triglycerides, and uric acid (46,47).
The positive association of the genus Slackia (a member of the family Coriobacteriaceae) with adaptive IF13, which includes production of the T-cell cytokines IL-10, IFN-γ, and IL-17, has not been reported previously. However, the family Coriobacteriaceae and genus Slackia have been associated with increased gut permeability and inflammation (48–53), which could cause increased activation and development of effector/memory T-cells producing these effector cytokines. The negative association between Slackia and the percentage of intermediate monocytes, identified by the secondary analysis of individual immune variables, is also a novel finding and should be assessed by experimental studies.
On the other hand, Butyrivibrio, a genus in the family Lachnospiraceae, was negatively associated with adaptive IF13. Multiple studies have shown that butyrate-producing Butyrivibrio is associated with gut health. For example, a report using a mouse model of amyotrophic lateral sclerosis showed that the reduction in the relative abundance of Butyrivibrio was associated with a significant reduction in the expression level of tight junction protein zonula occludens-1 (ZO-1), increased intestinal permeability, and increased intestinal and plasma levels of the inflammatory cytokine IL-17 (54). Also, studies have shown that administration of Butyrivibrio to mice can decrease intestinal damage and inflammation caused by Campylobacter-induced enterocolitis or chemical damage (55–57). It is possible that induction of Treg cells by butyrate (58,59) could account for such protection. Butyrate may also dampen local innate immune activation and support enterocyte growth and survival as additional protective mechanisms (60). Thus, the negative association of Butyrivibrio with adaptive IF13 could be related to higher Treg activity dampening the development of memory CD4 and CD8 T-cells that would produce these cytokines. However, in our secondary, post hoc analyses after identification of this association of IF13 with Butyrivibrio, we did not observe a positive association of Butyrivibrio with Tregs, which would have supported a role for Tregs in our observations, nor did we observe a negative association with innate immune variables. Nonetheless, in support of the idea that Butyrivibrio may be protective against tissue damage in the intestine (or elsewhere), our secondary analysis did find a marginally significant (P = 0.072 after adjustment for multiple comparisons) negative association of Butyrivibrio with plasma matrix metalloproteinase-3 (MMP-3; stromelysin-1). MMPs in plasma are used as indicators of tissue damage, and MMP-3 is known to be increased in a human model of small bowel ischemia-reperfusion injury in conjunction with enterocyte apoptosis (61). This association indirectly supports the hypothesis that higher Butyrivibrio levels in the gut may protect against intestinal epithelial damage. Though the mechanism behind this possible protection is unknown, it may involve dampened T-cell activity, as suggested by the negative association with IF13.

Strengths and limitations
One strength of our study is that we used a broad range of immune markers to identify associations of both the innate and adaptive immune systems with intestinal bacteria. A second strength is that we examined these associations in a population of healthy adults, perhaps allowing identification of less robust associations than have previously been identified in other populations including both healthy individuals and those with diagnosed, immune-mediated diseases. A third strength is that we were careful to identify statistically significant associations using adjustment for multiple comparisons as well as reproducible associations with bacterial taxa by highlighting associations found using two different approaches to quantifying bacteria (i.e., relative abundance and DESeq2-adjusted sequence counts).
A limitation of our study is that its cross-sectional design does not allow conclusions to be drawn about cause and effect in the associations between the microbial taxa and IFs. Therefore, future experimental studies in human and animal models are required to further examine the causality of these associations. However, the associations of microbial taxa with IFs identified in our cohort of healthy adults may be indicative of potential preclinical markers of inflammatory diseases and may represent potential targets of dietary intervention to lower disease risk. A second potential limitation of our study is that we retained low-abundance microbial taxa (i.e., <1% relative abundance in all study participants) that might not reach a threshold to trigger immune activation. However, previous work has shown that even small numbers of certain bacteria can affect immune function (62,63); thus, we included these taxa in our analysis. Although our findings, particularly those on the low-abundance microbial taxa identified in this study, including Pseudomonadaceae, Pseudomonas, ML615J-28, and Slackia, should be interpreted with caution, the identified associations between these taxa and immune responses provide hypothesis-generating results to support the design of future studies examining the relationships between commensal microbes and immune response in healthy adults.

Conclusion
In conclusion, the associations identified in this study between the microbial taxa and innate and adaptive IFs suggest that there are multiple, ongoing interactions between these taxa and the immune system. These associations may contribute to low level, chronic immune activation, but may also contribute to the maintenance of an intact intestinal barrier and lower systemic inflammation in our group of healthy individuals without acute or chronic disease. Some of these associations may have been more difficult to identify in more diverse study populations, containing both healthy individuals and those with diagnosed inflammatory diseases.

FIG 3 FIG 4
FIG 3 Linear regression plots showing association of family Rikenellaceae with the adaptive IF4 and innate IF5 with the best fit line, shaded area representing 95% confidence intervals for the linear regression model, and dotted lines representing 95% prediction limit. The read counts were adjusted for varying sequencing depths by the median normalization method in DESeq2 and normalized by the log ratio based on natural logs (Ln) for varying distribution across the subjects.

FIG 5 Linear regression plots showing the association of an unknown genus of the family Rikenellaceae with the adaptive IF4 and innate IF5 with the best fit line, shaded area representing 95% confidence intervals for the linear regression model, and dotted lines representing 95% prediction limit. The read counts were adjusted for varying sequencing depths by the median normalization method in DESeq2 and normalized by the log ratio based on natural logs (Ln) for varying distribution across the subjects.

FIG 6 Violin plots demonstrating the distribution of IFs by microbial genera with read counts in <25% of the subjects. "0" indicates absence and "1" indicates the presence of the microbial taxa in 355 subjects. Median values are annotated with full lines, and lower and upper quartiles with dashed lines.

TABLE 1
Demographic characteristics of healthy adult participants (n = 355)

TABLE 2
Microbial families associated with IFs with adjusted P-values < 0.05
a Indicates microbial families with read counts in <25% of the subjects which were used as binary variables.
Research Article, Microbiology Spectrum, November/December 2023, Volume 11, Issue 6, 10.1128/spectrum.01027-23

TABLE 3
Microbial genera associated with IFs with adjusted P-values < 0.05
a Indicates microbial genera with read counts in <25% of the subjects which were used as binary variables.