Geographic social vulnerability is associated with the alpha diversity of the human microbiome

ABSTRACT The human microbiome ecosystems living within the human body are exposed to exogenous foreign substances from the various environments that humans live within. Therefore, dynamics between microbes and human hosts can be influenced by environmental changes, which potentially impact microbiome composition and diversity. Geographic area, as a microbiome-relevant environmental factor, has been well reported. Human geography, however, is often linked to socioeconomic status, racial and ethnic population enclaves, and disparities in area disadvantage. Potential mechanisms linking the microbiome to factors tied to area disadvantage include household crowding, use of public transportation, and lack of exposure to biodiverse natural environments. Through an analysis of data from the Human Microbiome Project including healthy adults who reported residential area information at the time of microbiome sampling (n = 201), we found a significant relationship between the social vulnerability index (SVI), as a measure of area disadvantage, and multiple alpha diversity measures across oral, airways, and urogenital sites when controlling for age, gender, and body mass index. With regard to race/ethnicity, we found significant mediation by SVI score to explain racial/ethnic differences in urogenital microbiome diversity in females. Our results highlight the importance of considering environmental variables such as area social vulnerability as a variable of interest in microbiome studies within healthy individuals and suggest a potential role to explain urogenital race/ethnicity differences. Future studies including a diverse, representative community-based population, more precise residential location, and inclusion of related risk factors such as dietary intake, are needed to further understand the implications of these results. 
IMPORTANCE Because social vulnerability is a risk factor for conditions related to the microbiome, understanding the role of the SVI in microbiome diversity may assist in identifying public health implications for microbiome research. Here we found, using a sub-sample of the Human Microbiome Project phase 1 cohort, that SVI was linked to microbiome diversity across body sites and that SVI may influence race/ethnicity-based differences in diversity. Our findings, which build on current knowledge regarding the role of human geography in microbiome research, suggest that measures of geographic social vulnerability be considered as additional contextual factors when exploring microbiome alpha diversity.

from a sampling site into an index or group of indices used to describe the microbiome population (2). Historically, lower gut microbiome alpha diversity has been associated with disease in humans (3, 4), while higher alpha diversity may be pathogenic in other body sites (like the oral cavity) (5, 6). As a field, there is uncertainty in the mechanisms underlying associations between increased or decreased microbial diversity and disease, as well as the impact of human and environmental heterogeneity on healthy versus unhealthy microbiome ecosystems (7)(8)(9). Nevertheless, the role of microbiome-associated bacteria on physiologic function is well described, and bacteria at various sites along the human body (biogeography) contribute to different physiologic processes essentially creating a symbiont relationship. As a result of this interconnected relationship, factors that impact human health or transformations within human lifestyle or experiences may impact the human microbiome.
Traditionally, human lifestyles' impact on microbiome function and composition is attributed mostly to diet and antibiotic exposures. However, humans are inhabitants of physical environments in which microbes live and populate, and human experiences, or differences in experiences, may reasonably impact human-microbiome interconnections. Within the physical environment, these interactions may include the influence of indoor water sources, building materials, and greenspace (trees, grass areas) within a person's physical environment (10). These influences on the microbiome differ not only in geography but also potentially by socioeconomic position, which can be measured as area disadvantage or the Centers for Disease Control and Prevention (CDC) validated measurement, the social vulnerability index (SVI) (11). The literature to date has sparsely focused on the neighborhood connection to the microbiome (12, 13). Notably, factors within residential areas (neighborhoods) are linked to multiple health outcomes including cardiovascular disease (CVD) (14), asthma (15), and all-cause mortality (16) that are associated with or influenced by the microbiome.
Physical environments, such as residential areas, are often disparate based on long-standing discriminatory policies that have led to the disinvestment of certain neighborhoods and residential segregation of individuals by racial/ethnic group, socioeconomic status (SES), and/or class. Differences in the physical environment that can influence the microbiome may then account for differences reported in the microbiome by SES (10, 17). The gut microbiome may be influenced by disparities in physical environments along with disparities by SES and race/ethnicity. Miller et al. (18) reported that alpha diversity measurements differed by residential SES status for adults in Illinois. Potential mechanisms to explain microbiome variation based on the residential area include directly influential factors associated with residence in a disadvantaged area, such as limited access to resources for a healthy diet, disproportionate exposure to chronic stressors, and disparities in exposures to cesarean births and breastfeeding (17). Additionally, neighborhood disadvantage status can lead to exposure to environmental pollutants that increase microbiome metabolites linked to CVDs, such as trimethylamine N-oxide (TMAO) (19).
Several studies, including ones using Human Microbiome Project (HMP) data, have reported differences in diversity and species composition across racial and ethnic groups (20)(21)(22)(23)(24). Recent calls in the literature have emerged for contextualization when race or ethnicity is used as participant phenotype variables (25, 26). Ethnicity is defined as social groupings based on a combination of shared language, history, religion, and culture, while race is defined as individuals identified within a social group that reflects exposures to inequities which then impact health. Ethnicity and race are often represented as biological representations or independent determinants of health or the microbiome. But they are not. Race and ethnicity are representations of intersecting social factors and inequities that influence and shape the microbiome. When presented as independent determinants of the microbiome, genetic causes are often the primary presented mechanism. In a study evaluating differences across ethnicities in the American Gut Project and the Human Microbiome Project, Brooks et al. (27) found differences within the gut microbiome across ethnicities, with a significant but not overwhelming amount of the variation explained by a few taxa and genetic influences. However, in an analysis of adults at a Cleveland, Ohio Medical Center, Sun et al. (28) showed that certain microbiotas remained significantly associated with neighborhood disadvantage when controlling for ethnicity.
Because of the evidence showing an association of geography, physical environment exposures, and ethnicity/race with microbiome composition, but still limited investigations to the role of intersecting variables such as residential area disadvantage, we conducted an analysis on the processed 16S ribosomal RNA (rRNA) sequenced data from phase 1 of the HMP to evaluate the role of residential area social vulnerability on microbiome diversity. The aim of this work is to examine the association between geographic disadvantage using the SVI with microbiome diversity at multiple human sites (oral, gastrointestinal, urogenital, skin, and nasal) in healthy human subjects while controlling for participant clinical and demographic characteristics. We hypothesize that more disadvantaged residential areas will have less beneficial diversity at specific body sites collected within the HMP. Secondarily, we predict that the interaction of ethnicity and residential disadvantage will explain a significant amount of the known variation across ethnicities.

Human Microbiome Project overview
The HMP study was the first population-level, body-wide metagenomic microbiome study conducted with the intent to characterize the ecology of human-associated microbial communities at clinically relevant body habitats (20, 29, 30). The phase 1 of the HMP study cohort comprised 300 donors with no known medical diagnoses recruited in two locations in the USA: St. Louis, MO, and Houston, TX. The HMP cohort design and sample collection methods have been described in detail elsewhere (29). In brief, microbiome and blood samples, and participant phenotype data were collected from healthy adults recruited to one of the two study centers. Microbiome samples were collected up to three times at 15-18 body sites, with 15 sites in men and 3 additional vaginal sites in women. Participant phenotype data such as age, gender, occupation, race, and ethnicity were collected at the initial sampling visit, and further phenotype data, including breastfed status and residential geography at the time of first sampling visits, were collected through follow-up interviews with participants. The data from the HMP has served as a reference data source for understanding distinct biogeographic sites across the human body and studying site-specific ecology such as gut antibiotic resistome, skin, and oral microbial communities. When evaluating clinical metadata, ethnic/racial background is reported as one of the strongest factors in the associations of metagenomic pathways and microbes (29).

Participant phenotype variables
Access to participant phenotype variables was requested and approved through the Intramural National Institutes of Health (NIH) database of Genotypes and Phenotypes (dbGaP). Following NIH Intramural Research Program guidelines, the analysis was exempt from Institutional Review Board approval. Phenotype variables were selected based on the stated research objectives: to evaluate associations of residential disadvantage and microbiome diversity and to explore associations between residential disadvantage and race/ethnicity-related microbiome diversity. Phenotype data were included if factors were known from the literature to have an impact or to be influenced by either residential disadvantage, microbiome diversity, or race/ethnicity. As a result, variables included in this analysis were the following: age, gender, body mass index (BMI), occupation, participant birthplace, self-identified race/ethnicity, parental birthplace, and whether the subject was breastfed. Participant's residential location was determined from the first three digits of the participant's ZIP code obtained during the HMP follow-up interview at the first sampling visit. The first three digits designate a US postal sectional center facility or postal sorting area, providing a wider level of the area than the full five-digit ZIP code. The broad diet characterization variable on usual meat consumption that was collected through the post-visit interview was not included.
Race and ethnicity are defined as descent-associated characterizations by the National Academy of Science, Engineering, and Medicine report on characterizing populations (30). One recommendation for the use of descent-associated characteristics is to avoid the use of multiple descriptors within an analysis to allow for a clear interpretation of results. Therefore, race and ethnicity data were re-coded using the participant's stated race as the primary identifier. Ethnicity was not chosen as the sole descent-associated descriptor because ethnicity was not assessed for all self-identified groups, other than Latino/LatinX ethnicity. Thus, the use of ethnicity as a sole descriptor would have not adequately described the HMP 1 study population. When Hispanic/LatinX participants did not state a race (White, Black, or Asian), the participants were included within the race/ethnicity group of Hispanic/LatinX. When Black or White was chosen with Hispanic/LatinX as an ethnicity, the participant was categorized by racial category. Within the sample, a total of three participants had this category. Birthplace for the participant and parental birthplace were recorded based on US/Canada or non-US or Canada nativity.
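The re-coding rule described above can be sketched as a small function. This is only an illustration under stated assumptions: the function name, the argument names, and the fallback to "Other" for participants with no listed race and no Hispanic/LatinX ethnicity are hypothetical and are not taken from the HMP data dictionary.

```python
# Hypothetical sketch of the race/ethnicity re-coding rule; names are illustrative.
STATED_RACES = ("White", "Black", "Asian")

def recode_race_ethnicity(race, hispanic_latinx):
    """Return a single descent-associated category for one participant.

    race            -- stated race as a string, or None if not stated
    hispanic_latinx -- True if the participant reported Hispanic/LatinX ethnicity
    """
    # Stated race is the primary identifier, even when ethnicity is also reported.
    if race in STATED_RACES:
        return race
    # No stated race, but Hispanic/LatinX ethnicity reported.
    if hispanic_latinx:
        return "Hispanic/LatinX"
    # Assumption: everything else (e.g., mixed race) falls into "Other".
    return "Other"
```

For example, a participant who reported White race and Hispanic/LatinX ethnicity would be categorized as White under this rule, while one who reported only Hispanic/LatinX ethnicity would be categorized as Hispanic/LatinX.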

DNA extraction, sequencing, and feature table generation
DNA extraction, 16S rRNA gene amplicon sequencing, sequence quality control, and operational taxonomic unit (OTU) feature table generation procedures have been described in detail by the HMP study investigators in previous HMP manuscripts (31)(32)(33)(34). Briefly, sequence data from 16S rRNA variable (V) regions V1-3 and V3-5 were preprocessed and demultiplexed using the QIIME microbiome analysis software package. To generate the feature tables used in this analysis, OTU picking was performed at 97% sequence similarity using OTUPipe, UCHIME, and UCLUST (35, 36). Taxonomy was assigned using the Ribosomal Database Project classifier version 2.2.

Human Microbiome Project OTU data processing and filtering
All microbiome 16S data used for this analysis were already processed and publicly available for download at the HMP data portal (https://www.hmpdacc.org/hmp/HMQCP/all/). Please see the full details of the 16S amplicon data processing and OTU table generation on the HMP Data Analysis and Coordination Center data portal (https://portal.hmpdacc.org/). OTU tables generated from phase 1 of the HMP healthy volunteers were downloaded from the HMP Data Analysis and Coordination Center. When examining the overlap between the Primary Sample Numbers (PSNs) from the OTU tables from hypervariable regions V1-3 and V3-5, a large proportion of the PSNs overlapped between the two feature tables. Because of this, few samples were lost by excluding the data from V1-3 (Fig. S1). Therefore, to prevent primer bias-associated variability in microbiome results (37), PSNs from only V3-5 were included in this analysis (accessed from http://downloads.hmpdacc.org/data/HMQCP/otu_table_psn_v35.txt.gz).

Human Microbiome Project microbiome sample metadata processing and filtering
Microbiome sample metadata and mapping files, corresponding to the OTU tables, were also downloaded from the HMP Data Analysis and Coordination Center website. The 15 and 18 specific sampling sites that were collected in males and females (30), respectively, were categorized into broad body sampling site categories by the HMP investigators (Fig. 1).
Sample metadata files and OTU tables were imported into JMP Statistical Discovery Software (JMP v16; SAS Headquarters, Cary, NC). Only microbiome samples from the first study visit were used in this analysis, as postal sorting code (PSC) information was collected in association with this time point. If more than one sub-site was collected for the respective participant (from visit 1 samples), the sample with the maximum OTU count was used so that one broad site sample (if available) was analyzed for each study subject. The final data matrix derived from the V3-5 region used for further data merging included 45,383 OTUs and 824 summarized broad site samples from 216 HMP participants from phase 1 of the HMP. Alpha diversity metrics including the total number of present OTUs with at least a count of 1, the Shannon Index, and the Inverse Simpson were calculated on the OTU final data matrix. The total number of present OTUs was calculated by summing up the total number of counts found within an individual's broad site sample.
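As an illustration of the metrics named above, the following minimal Python sketch computes observed OTUs, the Shannon Index, and the Inverse Simpson index from one sample's OTU counts, and selects one sub-site sample per broad site. It is not the JMP workflow used in the study. The function names are hypothetical, the natural logarithm is an assumption (the log base is not stated in the text), and "maximum OTU count" is assumed here to mean the largest total count.

```python
import math

def alpha_diversity(counts):
    """Return (observed OTUs, Shannon Index, Inverse Simpson) for one sample."""
    present = [c for c in counts if c > 0]       # OTUs with a count of at least 1
    total = sum(present)
    if total == 0:
        raise ValueError("sample contains no counts")
    props = [c / total for c in present]         # relative abundances
    observed = len(present)
    shannon = -sum(p * math.log(p) for p in props)   # assumption: natural log
    inverse_simpson = 1.0 / sum(p * p for p in props)
    return observed, shannon, inverse_simpson

def pick_broad_site_sample(subsite_counts):
    """Pick the sub-site with the largest total count (assumed reading of the text).

    subsite_counts -- dict mapping sub-site name to its list of OTU counts
    Returns the (sub-site name, counts) pair that was selected.
    """
    return max(subsite_counts.items(), key=lambda item: sum(item[1]))
```

A call such as `alpha_diversity(pick_broad_site_sample(oral_subsites)[1])`, where `oral_subsites` is a hypothetical dict of visit 1 oral sub-site samples for one participant, would then yield one set of metrics per participant and broad site.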

Social vulnerability index: introduction and overview
The social vulnerability index (SVI) is an index of community-level vulnerability initially developed by the CDC to support governments in directing assistance during crises (11). It is a relative rank of area-level risk (e.g., census tract or county) in four separate themes, with a fifth composite score over all themes. The themes are SES (hereafter referred to as SVI-SES) including risk factors such as higher rates of poverty or lower educational attainment, household characteristics (SVI-HC) including higher proportions of households with vulnerable members (e.g., children and older adults) or with poor English proficiency, racial and ethnic minority composition (SVI-MC), and housing type and transportation (SVI-HT) including proportions of households without access to a private vehicle or living in vulnerable housing situations like mobile homes or group quarters. An overall measure (SVI-O) is calculated based on the relative ranking of each of the individual themes weighted evenly. While the CDC publishes the SVI at the census tract and county levels, data are easily accessible and rankings can be calculated for neighborhoods and areas specific to the need at hand (e.g., ZIP codes for a city). The SVI has been used in wide-ranging public health and disaster recovery research, from the spatial distribution of emergency shelters to COVID-19 (coronavirus disease 2019) vaccine uptake.

FIG 1 Broad sampling sites for microbiome characterization. Specific sampling sites were categorized into "broad" body sites by HMP investigators. Sub-sampling sites were grouped into a total of four broad sites for men and five broad sites for women. Samples collected from mid-vagina, posterior fornix, and vaginal introitus sites were classified as "urogenital tract"; samples collected from attached keratinized gingiva, buccal mucosa, hard palate, palatine tonsils, saliva, subgingival plaque, supragingival plaque, throat, and tongue dorsum were classified as "oral" samples; samples collected from the throat and anterior nares were classified as "airways" samples; samples collected from the left antecubital fossa, left retroauricular crease, right antecubital fossa, and right retroauricular crease were classified as "skin" samples; and stool samples were classified as "gastrointestinal tract" samples. Therefore, males had four broad sites sampled (oral, airways, skin, and gastrointestinal tract), while females had five broad sites sampled (oral, airways, skin, gastrointestinal tract, and urogenital tract). These broad sites are the sites from which alpha diversity metrics for this analysis were derived.

Construction of SVI measure from postal sorting code
For this project, the SVI was constructed at the PSC level based on methodology published by the CDC. The PSC corresponds to the first three digits of the five-digit ZIP code, which was reported by the HMP and therefore the smallest geographic level available to assign participants. Census tracts were assigned to a PSC if their geographic centroid fell within that PSC. Raw population counts from the 2010 census tract-level SVI were aggregated to the PSC level based on census tract PSC assignment. These data were supplemented with missing denominators from census data (e.g., an estimate of total occupied housing units). SVI values were then calculated at the national level for PSCs following the CDC protocol (Fig. 2). Intraclass correlation coefficients (ICCs) of SVI indices across census tracts provide information on variance in SVI within versus between census tracts (38). At the national scale, the PSC ICC of the census tract SVI themes ranges from 0.06 to 0.54, indicating that 6%-54% of the variability found in these indices is between, rather than within, the PSC level (Table S1). Therefore, regression estimates for determining SVI indices are biased toward zero.
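The aggregation and ranking steps described above might look roughly like the sketch below. It is a simplified illustration rather than the CDC protocol itself: it handles a single indicator, the function and field names are hypothetical, and the percentile-rank formula (rank − 1)/(N − 1) with tied values sharing the lowest rank is assumed here to match the CDC documentation.

```python
def zip3(zip_code):
    """Return the postal sorting code (first three digits) of a five-digit ZIP code."""
    return str(zip_code).zfill(5)[:3]

def aggregate_to_psc(tracts):
    """Sum tract-level counts within each PSC and return one rate per PSC.

    tracts -- iterable of (psc, numerator, denominator) tuples, where each tract
              has already been assigned to the PSC containing its centroid
    """
    sums = {}
    for psc, numerator, denominator in tracts:
        num, den = sums.get(psc, (0, 0))
        sums[psc] = (num + numerator, den + denominator)
    return {psc: num / den for psc, (num, den) in sums.items()}

def percentile_ranks(values):
    """Percentile rank (rank - 1) / (N - 1); ties share the lowest rank (assumption)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two areas to rank")
    first_rank = {}
    for position, value in enumerate(sorted(values), start=1):
        first_rank.setdefault(value, position)   # keep the lowest rank for ties
    return [(first_rank[v] - 1) / (n - 1) for v in values]
```

In a fuller version, each theme score would combine the percentile ranks of its component indicators before the themes are ranked again and combined into the overall score.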

Statistical analyses
Descriptive statistics (mean and SD for normally distributed continuous data, median and interquartile range for ordinal and non-normal data, and frequencies and percentages for categorical data) were used to describe the demographic characteristics, SVI levels, and alpha diversity measurements in the study population. Parametric [Pearson correlation, t-test, and analysis of variance (ANOVA)] and non-parametric (Spearman's correlation, Wilcoxon rank-sum test, and Kruskal-Wallis test) tests were used to examine relationships between demographic, SVI levels, and alpha diversity measurements. Games-Howell post hoc tests were used for all significant one-way ANOVA tests. Dunn-Bonferroni post hoc tests were used for all significant Kruskal-Wallis tests. Factors that were significantly related to each Shannon Index for the alpha diversity outcome in the bivariate analysis were entered in the multiple linear regression models. Age, sex, and BMI were controlled in all models based on a priori information. Normality and homoscedasticity were checked by residual and normal plots. Multicollinearity was checked by tolerance and variance inflation factor. SPSS PROCESS macro (39) was used to assess whether SVI mediated the relationship between race and urogenital tract Shannon Index. All data analyses were conducted using IBM SPSS Statistics, version 28 (IBM, Armonk, NY). Statistical significance was defined as a P-value < 0.05.
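To make the mediation step more concrete, here is a minimal sketch of a product-of-coefficients indirect effect with a percentile bootstrap confidence interval. It is not the SPSS PROCESS macro and does not reproduce the reported models: covariates (age, BMI) are omitted for brevity, the bootstrap settings are arbitrary, and all function names are hypothetical. Here x would be a 0/1 group indicator (e.g., comparison group versus White), m the SVI score, and y the urogenital Shannon Index.

```python
import random

def _cov(u, v):
    """Sample covariance of two equal-length sequences."""
    mean_u, mean_v = sum(u) / len(u), sum(v) / len(v)
    return sum((a - mean_u) * (b - mean_v) for a, b in zip(u, v)) / (len(u) - 1)

def indirect_effect(x, m, y, tol=1e-12):
    """Return a*b, where a is the slope of m on x and b is the coefficient of m
    in the regression of y on x and m. Returns None for degenerate data."""
    sxx, smm = _cov(x, x), _cov(m, m)
    sxm, sxy, smy = _cov(x, m), _cov(x, y), _cov(m, y)
    det = sxx * smm - sxm ** 2
    if sxx <= tol or det <= tol:     # x constant, or m collinear with x
        return None
    a = sxm / sxx
    b = (smy * sxx - sxm * sxy) / det
    return a * b

def bootstrap_ci(x, m, y, n_boot=2000, alpha=0.05, seed=0):
    """Percentile bootstrap interval for the indirect effect (illustrative settings)."""
    rng = random.Random(seed)
    n = len(x)
    estimates = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]    # resample participants
        est = indirect_effect([x[i] for i in idx],
                              [m[i] for i in idx],
                              [y[i] for i in idx])
        if est is not None:                            # skip degenerate resamples
            estimates.append(est)
    estimates.sort()
    lower = estimates[int((alpha / 2) * len(estimates))]
    upper = estimates[min(len(estimates) - 1, int((1 - alpha / 2) * len(estimates)))]
    return lower, upper
```

Under this approach, mediation is typically judged significant when the bootstrap interval for the indirect effect excludes zero.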

Distance-based correlation analysis
A correlation analysis was conducted using JMP version 16 (SAS Headquarters, Cary, NC) between PSC geographic distances and the average Shannon Index differences at each PSC to examine if there was an overall effect of geographic location differences on alpha diversity differences. To determine the distance between PSCs, the geographic distances (in meters) between the centroids of the various ZIP3 areas were used to form pairwise comparisons where each PSC was compared to every other PSC (NEAR_DIST). This computation created 78 unique differences. To compute the difference of the alpha diversity metric (Shannon Index), the average Shannon Index for each broad site and each PSC was calculated. The absolute value difference between all average Shannon Index values compared to all other average Shannon Index values was determined. This computation created 78 unique absolute Shannon Index differences (when comparing the Shannon Index of each PSC to that of every other PSC). Oral, airways, and urogenital broad sites were included in the analysis as these were the sites significantly associated with alpha diversity (Shannon Index). To test the effect of SVI differences (by PSC locality), the same pairwise difference for each SVI of each PSC to every other SVI at all other PSCs was calculated and the absolute value of the difference was computed. The overall correlation of the geographic distance to the absolute Shannon Index difference was computed using a Spearman correlation analysis. A partial correlation analysis comparing the diversity index differences to the geographical distances while controlling for the SVI difference is reported.
STORMS (Strengthening The Organizing and Reporting of Microbiome Studies) Checklist has been made available at https://osf.io/8nauz/.

Study population characteristics
Among the final sample of participants (Fig. 3; n = 201), the mean age was 26.68 (SD 5.2) and the mean BMI was 24.4 (SD 3.7) with 51.2% (n = 103) of the sample reporting male gender. The most frequently reported employment status was student (48.3%, n = 97). Of the 201 included participants, 80.2% (n = 142) were breastfed as an infant, 90.5% (n = 182) were born in the USA or Canada, and 79.5% (n = 159) and 79.0% (n = 158) had a mother or father born in the USA or Canada, respectively. Regarding race/ethnicity, most participants were White (72.1%, n = 145), 10% (n = 20) were Asian, 8% (n = 16) were Hispanic/LatinX, 6% (n = 12) were designated as Other which includes mixed-race individuals, and 4% (n = 8) were Black. The mean number of weeks since the last dental visit was 73.2 (SD 15.3). Participant characteristics including mean, SD, and percentage of the study population are described in Table 1.
The PSC distribution of the sample population is shown in Fig. 4. Overall, 55.9% (n = 112) of the participants were from PSCs in Texas, and 44.3% (n = 89) were from PSCs in Missouri. Within each state, most participants were from the postal sorting area associated with the recruitment center: the 770 postal sorting area in Houston, TX (n = 84, 41.7%), for the recruitment center at Baylor University, and the 631 postal sorting area in St. Louis, MO (n = 52, 25.8%), for the recruitment center at Washington University, St. Louis. Among these PSCs, the most frequently self-reported career category was student (Fig. S2). Four postal sorting areas contained only one participant: 652, 741, 786, and 787.

Alpha diversity distribution across broad sites and participant-specific variables
Across the included cohort, oral tract samples had the highest average alpha diversity (Shannon Index, Inverse Simpson, observed OTUs) compared to other body sites (Fig. 5). Alpha diversity of oral samples was significantly higher in males versus females [Shannon Index (t(179) = 0.976, P = 0.037, d = 0.31) and inverse Simpson (U = 3,177, P = 0.010, r = 0.19), respectively] but did not differ by sex at any other broad sampling site. A history of breastfeeding as a child was associated with lower oral broad site inverse Simpson values (U = 1,379, P = 0.040, r = 0.16). For race/ethnicity, Asian participants had significantly lower urogenital alpha diversity (Shannon Index), compared to White (Games-Howell post hoc test P = 0.025), Hispanic/LatinX (Games-Howell post hoc test P = 0.002), and Other (Games-Howell post hoc test P = 0.021) participants, and lower stool Shannon indices versus Other (Games-Howell post hoc test P = 0.028) participants as well. Airways, skin, and oral broad sites did not have significant race/ethnicity differences, and participant occupation was not associated with alpha diversity across broad sampling sites. Maternal or participant place of birth was not associated with any differences in broad site alpha diversity, but having a father not born in the USA or Canada was associated with higher airway observed OTUs (U = 1,210, P = 0.030, r = 0.18) and lower urogenital observed OTUs (U = 343, P = 0.026, r = 0.24). Parental place of birth was significantly different by race/ethnicity (P < 0.001). All Asian sample participants had fathers not born in the USA/Canada (Table S2). Participant's BMI was positively associated with stool (Shannon Index: r = 0.21, P = 0.005; inverse Simpson: r_s = 0.16, P = 0.033; and observed OTUs: r_s = 0.17, P = 0.028) and oral (inverse Simpson: r_s = 0.15, P = 0.044) alpha diversity. Age was also positively associated with stool (Shannon Index: r = 0.16, P = 0.035 and observed OTUs: r_s = 0.19, P = 0.014) and oral (observed OTUs: r_s = 0.16, P = 0.031) alpha diversity. Time (weeks) since subjects had last visited the dentist was significantly associated with the alpha diversity of broad sampling sites in addition to the expected impact on the oral microbiome (Fig. S3A through C). Time since the last dental visit was positively associated with oral (observed OTUs: r_s = 0.24, P = 0.001) and stool (observed OTUs: r_s = 0.43, P < 0.001) alpha diversity, while this metric was negatively associated with alpha diversity in airways (Shannon Index: r = −0.18, P = 0.031; inverse Simpson: r_s = −0.30, P < 0.001; Fig. S3D).

Residential area variables: postal sorting codes and social vulnerability index and race/ethnicity associations
Significant variations in SVI scores for overall and sub-themes were found across postal sorting areas within the sample population (Table 2). The range of the SVI overall score varied from 0.071 to 0.904, suggesting a broad range of socially vulnerable areas within the sample. The two PSCs with the highest number of participants, 770 and 631, had scores of 0.904 and 0.654, respectively. Race/ethnicity was significantly different across SVI indices, P < 0.001 (Fig. S4). Monte Carlo Pearson χ² and likelihood ratio test results showed no statistical significance between three-digit ZIP code SVI and race/ethnicity (P = 0.086; Fig. S5).

Distance-based correlation analysis
The pairwise distance-based correlation analysis for airways, urogenital, and oral sites is displayed in Fig. S7. When the Shannon Index differences by PSC in airways were compared to geographic distance, no significant association was observed, r = −0.005 (P = 0.971), nor was there an association observed with the Shannon Index differences by PSC in urogenital samples, r = −0.064 (P = 0.640). However, a statistically significant positive association was found when comparing the Shannon Index difference by PSC in oral sites with geographic distance, r = 0.413 (P < 0.001). This finding suggests a relationship between geographic distances across PSCs and Shannon Index differences in oral sites; that is, as the geographic distance between PSCs increases, so does the difference in oral alpha diversity between them. When controlling for SVI differences, the positive association between geographic distance and oral Shannon Index differences remained significant (r = 0.423, P < 0.001).

Evaluating SVI as mediator of race and ethnicity on alpha diversity
The urogenital site was the only Shannon Index measurement that was associated with both race and SVI indices; thus, regression and mediation analysis to evaluate race/ethnicity differences were only done with this body site. Within this sub-sample, race/ethnicity per group sample size was White n = 55, Asian n = 9, Black n = 4, Hispanic/LatinX n = 9, and Other n = 6. When adjusted for age, BMI, and SVI overall score, pairwise comparisons of mean urogenital Shannon Index across race/ethnicity categories showed significantly lower Shannon Index from samples obtained from Asian women when compared to samples from White women (mean difference = −0.91, P = 0.004), Hispanic/LatinX women (mean difference = −1.10, P = 0.007), and women categorized as Other (mean difference = −1.07, P = 0.030). There was not a statistically significant difference when controlling for SVI overall score between Asian and Black women (mean difference = −0.822, P = 0.362), but there was when controlling for SVI-HC (mean difference = −1.20, P = 0.046).
Mediation analysis to evaluate a mediating role for SVI on the association between race/ethnicity and Shannon Index at the urogenital site using samples from White women as the reference showed that the SVI overall score partially mediated the association between race/ethnicity and urogenital Shannon Index. As shown in Fig. 9, the SVI overall score mediated 53.5% of the relationship between White and Hispanic/LatinX women (indirect effect 95% confidence interval [CI] 0.06-0.37) and 57.2% of the relationship between White and women within the Other category (indirect effect 95% CI 0.05-0.38). All SVI sub-themes, except for SVI-SES, showed a significant mediation effect when comparing both Hispanic/LatinX and Other to White (Table S2). When comparing Asian to White, only significant mediation for the SVI-MC sub-theme was found (indirect effect 95% CI 0.037-0.39). Mediation of SVI comparing White to Black for SVI overall and for SVI sub-themes did not show a significant mediation effect from SVI (Table S2).

DISCUSSION
In this present analysis of data from the HMP including healthy adults who reported residential area information at the time of microbiome sampling, we found a significant relationship between the SVI, a measure of area disadvantage, and multiple alpha diversity measures across oral, airways, and urogenital sites when controlling for age, gender, and BMI. We also observed a significant mediation of area social vulnerability score to explain racial/ethnic differences in urogenital microbiome diversity in females. These main findings suggest the importance of the inclusion of residential area deprivation in microbiome studies, especially when race is used as a variable of interest.
The human microbiome ecosystems living within the human body are exposed to exogenous foreign substances from the various environments that humans live within (12). Therefore, dynamics between microbes and human hosts can be influenced by environmental changes, which may in turn affect microbiome composition and diversity. The influence of residential area as an environmental factor on the characteristics of microbiome bacterial communities has been well reported (40,41). Human geography, however, is often linked to socioeconomic status, racial and ethnic population enclaves, and disparities in area disadvantage, including social vulnerability. Specific potential mechanisms linking the microbiome to factors tied to area disadvantage include household crowding, use of public transportation, and lack of exposure to biodiverse natural environments. Below, we discuss the implications for understanding how area deprivation may influence microbiome diversity, along with future directions for its inclusion in microbiome research.
Notably, our analysis is not the first HMP analysis to evaluate the role of geography. Lloyd-Price et al. (41) evaluated the role of PSC across species-level bacterial taxa using HMP shotgun metagenomics sequencing data but found no significant differences across PSCs from Houston, TX, and St. Louis, MO. However, to our knowledge, ours is the first analysis to connect the PSC data to geographic SVI, assigning a proxy for area disadvantage to the categorical PSC data. SVI has been associated with chronic conditions such as chronic respiratory disease (42) and CVD (43), which have also been linked to the human microbiome. We found that oral and airways site alpha diversity was significantly associated with all SVI indices. Interestingly, we found a significant relationship between distances across PSCs and Shannon Index differences for the oral broad site. These results suggest that geographically located factors may have a role in oral alpha diversity. The oral microbiome ecosystem composition is shaped throughout life by multiple factors that may account for this association, including individual SES, oral health behaviors, and diet (44,45). The airway microbiome is additionally shaped to a significant degree by environmental contaminants such as pathogens and air pollution (46), along with obesity (47), a risk factor linked to SVI (48). Future studies are needed to confirm our findings and, in particular, to explore geographic factors that may contribute to oral diversity.
An interesting finding from our analysis was the association between the last dental visit and alpha diversity at the oral site and at other broad sampling sites, including the gastrointestinal tract and airways. Dental care and oral health are underappreciated markers of social determinants of health, and reduced access to preventive dental care can result from factors ranging from lack of supplemental dental insurance to the location and availability of dental resources. We found that the time since participants last visited the dentist was significantly and positively associated with observed OTUs in both oral and gastrointestinal tract microbiome samples. Previous work has reported higher oral diversity with poor overall oral health (49), and the association here with time since the last dental visit (average of 1.4 years, 74 weeks) reinforces this. The oral microbiome is connected to the gut through the gastrointestinal tract, and transmission of oral-associated microbes has been demonstrated in previous research (50,51). Therefore, the possibility that alterations in the oral microbiome caused by environmental or socioeconomic factors affect other microbiome niches should be considered. Whether this increase in oral and gastrointestinal alpha diversity is associated with beneficial or pathogenic microbes is beyond the scope of the current analysis but will be of great interest in future work. Nevertheless, these findings contribute to the mounting evidence that universal oral screening and increased access to dental care are needed to achieve whole-person health and reduce the risk of disease across populations.
Although the urogenital site is reported to have the lowest alpha diversity across body sites (20), we were still able to identify a significant association between SVI and urogenital microbiome diversity. For the total sample, household composition, minority composition of the area, and housing and transportation type within the PSC area were significant, suggesting area-specific measurements that may be needed in future studies. Determinants of the urogenital microbiome range from personal habits to sexual behaviors and the physiologic role of sex hormones (52). Relevant to our results, sex hormones may be influenced by exposures linked to social vulnerability area characteristics, as areas with higher minority composition and particular housing characteristics are reported to have disparities in water and air pollution exposures (53). Future studies conducting vaginal microbiome sampling should therefore examine potential exposures that may be related to geographic area disadvantage.
Interestingly, we found a significant relationship between the skin microbiome and both the minority composition of the area and housing and transportation type. Factors known to influence the skin microbiome include skin hydration level and sebum level (54), both of which may be influenced by area-level factors such as climate and heat exposures. The literature suggests that areas with higher minority populations, as would be represented by higher SVI minority composition levels, have disparate climate and heat exposures (55). The potential role of these and other area-level factors in the skin microbiome remains to be determined.
The HMP found that for all body sites, ethnicity was the host phenotypic variable with the most associations (20). Our analysis using a sub-sample of the HMP participants found that race/ethnicity was associated with stool and urogenital alpha diversity. A lower Shannon Index was found among Asian women compared to all other groups. Interestingly, all Asian women in our sample had fathers born outside of the USA/Canada, which was significantly associated with lower urogenital observed OTUs. Thus, the role of parental birthplace in race/ethnicity differences in the urogenital microbiome should be further explored. With regard to our research objective to contextualize the microbiome beyond race, we identified a mediating role of SVI in explaining race/ethnicity associations with Shannon Index. When comparing samples from White women to those from Hispanic/LatinX and Other women, there was a significant role of overall SVI in explaining the racial/ethnic differences. Race/ethnicity-specific differences in the vaginal microbiome at the species level have been well reported (21,24). Although our alpha diversity results are not directly comparable to species-level differences, they suggest that area deprivation plays at least some role in diversity when evaluating race/ethnicity differences. Furthermore, our use of race and ethnicity as characterizing variables in this analysis is consistent with recommendations from the National Academies of Sciences, Engineering and Medicine (NASEM) report (30) on characterizing populations, as the intended use of these descent-associated variables is to explore disparities in outcomes that occur due to differences in sociopolitical conditions and not to explore biological differences or to serve as general proxies for individual behavioral exposures. Whether area deprivation is also involved at the species level will need to be determined in future analyses and studies. Given the recent calls to contextualize race/ethnicity-specific microbiome results, our results suggest that area social vulnerability should be considered as a contextual variable.

Limitations
Although we used rigorous statistical methods and an established marker of social vulnerability to examine the association between geographic disadvantage and microbiome diversity at multiple human sites, there are limitations of our analysis that are important to consider. First, the HMP sample overall included a large proportion of students, which could impact the external validity of our results. Within our analysis, most participants lived within one of two PSCs that correspond to the research recruitment sites in St. Louis, MO, and Houston, TX. We anticipate that this skewed sample may have impacted our results, as those living near urban academic centers may have different microbiome-relevant exposures because urban academic settings are often located within areas with significant socioeconomic disadvantages (56). An additional limitation is that the length of time within the residential area was not measured. With regard to the distance analysis in this report, it is based on the geographic distances between the centroids of each PSC and not the actual locations of the individuals for whom we determined the absolute value of Shannon Index differences. Nonetheless, the relationship between these distances and alpha diversity in the oral broad site opens up potential exploratory pathways for future microbiome studies.
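The centroid-based distance analysis described above can be sketched as follows. This is a minimal illustration, assuming great-circle (haversine) distances between PSC centroids; the distance formula, the centroid coordinates, the participant records, and the function names are hypothetical and are not taken from the study. Note that participants sharing a PSC receive a distance of zero, which is the loss of resolution that the limitation describes.

```python
import math
from itertools import combinations

EARTH_RADIUS_KM = 6371.0088  # mean Earth radius


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometers between two points given in decimal degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def pairwise_table(participants, centroids):
    """For every participant pair, return (id1, id2, centroid distance in km, |Shannon difference|)."""
    rows = []
    for (id1, psc1, h1), (id2, psc2, h2) in combinations(participants, 2):
        dist = haversine_km(*centroids[psc1], *centroids[psc2])
        rows.append((id1, id2, dist, abs(h1 - h2)))
    return rows


# Hypothetical centroids (latitude, longitude) and participant records
# (participant id, PSC label, Shannon Index); values are illustrative only.
psc_centroids = {"PSC_A": (38.63, -90.20), "PSC_B": (29.76, -95.37)}
participants = [("p1", "PSC_A", 3.1), ("p2", "PSC_A", 2.7), ("p3", "PSC_B", 3.6)]

pairs = pairwise_table(participants, psc_centroids)
```

The resulting distance and absolute-difference columns could then be passed to a correlation or permutation-based test of choice.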
Based on our primary research objective, our analysis was limited to participants who provided three-digit ZIP code data and to samples collected at the first visit. Thus, our sample size is smaller than that of other published HMP analyses. This may have impacted our results in several ways. Unlike previously reported HMP studies (20), we did not identify race/ethnicity differences across several broad body sites. There was a further reduction in sample size for the urogenital site analysis, leading to a notable reduction in the number of samples from non-White women, particularly from Black women. Despite this limitation, we still found meaningful associations between SVI and microbiome characteristics, which highlight the importance of considering geographic area disadvantage in future studies of the human microbiome. Importantly, although the HMP 1 study design sampled a percentage of minoritized individuals that was representative of population-level estimates, the study sample was predominantly self-identified White individuals. For future studies, evaluating differences between the minoritized groups within the sample likely requires over-sampling beyond a grouped population-level estimate, especially when evaluating the role of geographic exposures.
Other limitations in this analysis include our use of the original HMP bioinformatics pipelines based on the OTU picking methodology. Since the publication of that work, other pipelines have been established, such as amplicon sequence variant picking strategies, and data from these workflows could have generated slightly different alpha diversity metrics. Nevertheless, we wanted the focus of this analysis to be on associations between microbial diversity and SVI, and not on the specific bioinformatics pipeline, and therefore chose to use the original well-documented and benchmarked HMP feature tables to generate the alpha diversity metrics used in this analysis. There were also potentially relevant variables collected in the HMP, such as prior pregnancy and meat consumption diet pattern, that were not included in this analysis due to the reduction in sample size. As geographic data were only obtained at one time point, our analysis is cross-sectional, and therefore causal conclusions cannot be made. As a result, our use of mediation analysis at the urogenital site was not to determine causality but to provide contextualization for the relationship between race/ethnicity and alpha diversity.
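For readers less familiar with the three alpha diversity metrics referred to throughout (observed OTUs, Shannon Index, and inverse Simpson Index), the sketch below computes them from a single sample's row of an OTU feature table. It uses the standard textbook definitions with the natural logarithm; the log base and any rarefaction or filtering applied in the HMP pipelines are not stated in this section, so this is an assumption rather than a reproduction of that workflow. The function name and example counts are hypothetical.

```python
import math


def alpha_diversity(counts):
    """Return (observed_otus, shannon, inverse_simpson) for one sample's OTU counts."""
    present = [c for c in counts if c > 0]  # OTUs with zero reads do not contribute
    total = sum(present)
    if total == 0:
        raise ValueError("sample has no reads")
    props = [c / total for c in present]  # relative abundance of each observed OTU
    observed = len(present)
    shannon = -sum(p * math.log(p) for p in props)  # natural-log Shannon Index (assumed base)
    inverse_simpson = 1.0 / sum(p * p for p in props)
    return observed, shannon, inverse_simpson


# Two hypothetical samples with the same richness but different evenness.
even = alpha_diversity([10, 10, 10, 10])
skewed = alpha_diversity([37, 1, 1, 1, 0])
```

Both example samples contain four observed OTUs, but the skewed sample has lower Shannon and inverse Simpson values. This illustrates why richness-based and evenness-sensitive metrics can show different associations with the same variable.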
Further limitations exist in our analysis regarding geographic data. First, our estimate of geographic disadvantage is limited to areas corresponding to PSCs, which lack the specificity of neighborhood-level data, such as census tracts. This may have impacted our ability to find an association between stool diversity and SVI, compared with studies using more precise neighborhood-level measures of area. For example, Sun et al. (28) and Miller et al. (18) previously identified an association between stool microbiome composition and neighborhood disadvantage that we did not replicate in our current analysis. Because of the measurement error associated with aggregating smaller geographic areas into larger ones, findings are expected to be biased toward zero. It is important to note that there are potential ethical and legal considerations with the collection of more precise geographic area data within microbiome studies that will need further discussion and consideration for inclusion in primary data collection studies (57). Second, data regarding the length of time at the residence were not obtained; therefore, inferences about the effect of length of exposure on microbiome diversity cannot be made. Third, SVI, as with any index of deprivation or vulnerability, does not present the full picture of any community but rather presents an objective and comparable starting point toward understanding the role of structural and environmental issues in community needs. Although these limitations posed challenges in our analysis of geographic disadvantage and microbial diversity, we hope our methods and included variables will assist researchers in incorporating neighborhood and environmental variables in future study designs of the human microbiome.

Future directions for microbiome research
Evaluating the human microbiome through the context of geographic area disadvantage may provide a pathway to further understanding the link between the microbiome and human health, and the need for geographic area-based interventions or policy changes relevant to public health (39). For instance, local changes in the oral microbiome and oral cavity have been associated with systemic pathology, including increased circulating inflammatory biomarkers (58), along with increased risk for diabetes mellitus (59) and CVD (60). Microbiome metabolites, such as TMAO, have been linked to neighborhood disadvantage (19). With regard to the skin microbiome, consideration of microbiome and area disadvantage associations may explain disparities present in microbiome-linked skin conditions, such as psoriasis (61,62). In addition, the integration of recent literature on factors linked to both the microbiome and health disparities, such as diet and breastfeeding status, may provide insight into microbiome-related mechanisms for health disparities (17). The current work focused on associations between alpha diversity across different sampling sites and both environmental and clinical factors in healthy human subjects. Relationships between alpha diversity and human disease have been well established in some microbiome niches, such as the relationship between increased oral alpha diversity and both local oral-associated diseases [e.g., periodontal disease (63,64)] and systemic diseases that are associated with poor oral health [e.g., alcohol use disorder (6)]. Future work will also benefit from incorporating taxonomic-based analyses in the study of relationships between social determinants of health and the microbiome to inform potential mechanisms linking SVI and human disease and to provide preliminary information for future targeted research.
Despite the likely role of the student enclave effect in the HMP sample population, our urogenital results suggest a need to include women of Asian descent, and to include parental birthplace as a potential contextual factor, in future microbiome studies. This will require diversification of research teams, identification of specific barriers to participation, and community-based outreach within multiple heritage Asian communities (65). It is important to note that considering the intersection of area disadvantage with race/ethnicity is not intended to engender the notion that being disadvantaged is akin to being a member of a racialized or minoritized group. Rather, its consideration potentially allows for the identification of racialized mechanisms (e.g., segregation or housing discrimination that limits economic investment) associated with microbiome differences that may further explain mechanisms of health disparities. Because a disproportionate number of minoritized individuals in the USA live in geographically disadvantaged areas due to systemically racist policies that impact community needs, future microbiome studies should not only specifically recruit these participants but should also include geographic area data to contextualize the role of race/ethnicity.

Conclusions
Using the validated measure of SVI for area social vulnerability, we identified significant associations with alpha diversity within oral, airways, skin, and urogenital broad sites within a sub-sample of HMP participants. Race/ethnicity was significantly associated with SVI and urogenital diversity, and SVI partially mediated race/ethnicity differences when comparing White women to Hispanic/LatinX women and women categorized as Other. Our results suggest a potential role for residential area social vulnerability, as measured by SVI, in explaining urogenital race/ethnicity differences and highlight a need to include area disadvantage in microbiome studies within healthy individuals. However, our race/ethnicity results are based on a reduced sample size. Future studies including a diverse, representative community-based population, more precise residential location, and related risk factors such as dietary intake are needed to further understand the implications of these results and improve external validity and reproducibility in future microbiome studies.

DATA AVAILABILITY
All microbiome 16S data used for this analysis were processed and are publicly available for download at the Human Microbiome Project data portal. Access to participant phenotype information requires approval, which must be requested through the National Institutes of Health database of Genotypes and Phenotypes.

ETHICS APPROVAL
Access and approval to the HMP participant phenotype data set were granted to N.F. (principal investigator), K.A.M., and G.R.W. from the Intramural National Institutes of Health database of Genotypes and Phenotypes.

ADDITIONAL FILES
The following material is available online.

FIG 3 Diagram of HMP data sampling, study population selection, and inclusion of the geographic disadvantage variable, the social vulnerability index. Flowchart of the variables of interest and included participants from phase 1 of the HMP. DACC, Data Analysis and Coordination Center; SVI, social vulnerability index.

FIG 4 Distribution of postal sorting codes (PSCs) included in the analysis and the number of people for each PSC. Number of people (y-axis) by postal sorting code (x-axis). The color of each bar represents the residential state of the participants, either Missouri (blue) or Texas (green).

FIG 7 Observed OTUs' alpha diversity indices colored by SVI associations. Observed OTUs by broad site colored by coefficient. Top panel shows the observed OTUs' alpha diversity metric (y-axis) by broad site (x-axis) colored by the Spearman coefficient between broad site observed OTUs and the SVI total. Bottom panel shows the Spearman correlation coefficient (colored by the Spearman coefficient) between broad site observed OTUs and the SVI sub-themes of socioeconomic status (SES), housing composition (HC), minority language (ML), and housing transportation (HT). The size of the dot represents the absolute value of the Spearman coefficient of the diversity index versus the SVI sub-theme for that broad site. Black boxes around sub-SVI theme data points indicate statistical significance at P < 0.05.

FIG 8 Alpha diversity measures by postal sorting code. Alpha diversity metrics by broad site across the postal sorting codes. Colors represent the broad sampling site. (A) Shannon Diversity Index for broad site (y-axis) by the postal sorting code (x-axis). (B) Inverse Simpson Index for broad site (y-axis) by the postal sorting code (x-axis). (C) Total observed OTUs for broad site (y-axis) by the postal sorting code (x-axis). To evaluate the alpha diversity across postal sorting codes, Kruskal-Wallis tests were performed with post hoc testing with Bonferroni's correction when appropriate. * indicates significance at P < 0.05. GI, gastrointestinal.

FIG 9 Mediation of the relationship between race/ethnicity and urogenital Shannon Index with SVI. (A) Mediation model of the observed relationship between race/ethnicity (comparing Hispanic/LatinX versus White) and urogenital Shannon Index with SVI overall as a mediator, model adjusted for age and BMI (n = 83). (B) Mediation model of the observed relationship between race/ethnicity (comparing Other versus White) and urogenital Shannon Index with SVI overall as a mediator, model adjusted for age and BMI (n = 83). Dashed line indicates total effect.

TABLE 1
Participant characteristics of Human Microbiome Project sub-sample, n = 201

TABLE 2
Postal sorting code (PSC) by social vulnerability indices, including the number of the sample population within each PSC^a
^a SVI-SES, Social Vulnerability Index_Socioeconomic status; SVI-HC, Social Vulnerability Index_Housing composition; SVI-MC, Social Vulnerability Index_Minority Composition; SVI-HT, Social Vulnerability Index_Housing and Transportation.

TABLE 4
Mean differences of urogenital Shannon Index and social vulnerability indices by race/ethnicity comparison (n = 83)^a
^a Bold indicates P value < 0.05. All P values adjusted for multiple comparisons: Sidak. All models adjusted for age, BMI, and the respective SVI index.