Obesity differs significantly across the epidemiological transition. From 2018 to 2019, the METS-Microbiome study recruited 2,085 participants (~60% women) aged 35-55 years from five sites (Ghana, South Africa, Jamaica, Seychelles, and US). Of these participants, 1,249 have been followed on a yearly basis since 2010 under the parent METS study. Data from 1,867 participants with complete data sets were used in this analysis. Overall mean age was 42.5 ± 8.0 years (Table 1). Mean fasted blood glucose was 105.2 ± 39.4 mg/dL, mean systolic blood pressure was 123.4 ± 18.1 mm Hg and mean diastolic blood pressure was 77.2 ± 13.1 mm Hg (Table 1). When compared to the high-income countries (Jamaica, Seychelles, and US), both women and men from the lower- and middle-income countries (Ghana and South Africa) had significantly lower BMI, fasted blood glucose and blood pressure (systolic and diastolic). Mean BMI was lowest in South African men (22.3 ± 4.1 kg/m²) and highest in US women (36.3 ± 8.8 kg/m²). When compared to the US, all sites had a significantly lower prevalence of obesity (p<0.001 for all sites except Seychelles: p=0.02). Prevalence of hypertension was lowest in Ghanaian men (33.1%) and highest in US men (72.7%). Prevalence of diabetes was lowest in South African women and men (3.5% for both) and highest in Seychellois men (22.8%). Prevalence of hypertension and diabetes was significantly lower in the countries at the lower end of the HDI spectrum (i.e., Ghana and South Africa) than in the US (p<0.001).
Microbial community composition and predicted metabolic potential differ significantly between countries and correlate with obesity. Following the removal of samples that had fewer than 6,000 reads and of features with fewer than ten reads in the entire dataset, a total of 433,364,873 16S rRNA gene sequences were generated from the 1,873 fecal samples, which were clustered into 13,254 ASVs. Country of origin describes most of the variation in microbial diversity and composition, with significant differences in both alpha and beta diversity. Although there were major variations in alpha diversity between countries and a large degree of inter-individual variation within countries, Ghana showed significantly greater diversity for all the alpha diversity metrics (Observed ASVs, Shannon Diversity and Faith’s phylogenetic diversity) when compared to all other countries. The Seychelles and US had the lowest alpha diversity (Fig. 1). The stool microbiota alpha diversity of non-obese individuals was significantly greater than that of obese individuals (Fig. 1). Beta diversity was also significantly different between countries (Fig. 1, Supplementary Tables 2 & 3; principal coordinate analysis; weighted UniFrac distance: F = 58.67, p < 0.001; unweighted UniFrac distance: F = 39.87, p < 0.001) and between obese and non-obese groups (weighted UniFrac distance: F = 2.39, p = 0.031; unweighted UniFrac distance: F = 6.06, p < 0.001).
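The filtering thresholds and two of the alpha diversity metrics named above can be illustrated with a minimal sketch. It is not the study's pipeline: the toy count table, the function names, the order of the two filtering steps, and the use of the natural logarithm for the Shannon index are all assumptions made for illustration.

```python
import math

def filter_counts(table, min_sample_reads=6000, min_feature_reads=10):
    """Filter a count table given as {sample: {asv: read_count}}.

    Assumed order (not stated in the text): drop low-depth samples first,
    then drop features whose total across the remaining samples is too low.
    """
    kept = {s: c for s, c in table.items()
            if sum(c.values()) >= min_sample_reads}
    totals = {}
    for counts in kept.values():
        for asv, n in counts.items():
            totals[asv] = totals.get(asv, 0) + n
    keep_features = {a for a, n in totals.items() if n >= min_feature_reads}
    return {s: {a: n for a, n in c.items() if a in keep_features and n > 0}
            for s, c in kept.items()}

def observed_asvs(counts):
    """Number of ASVs with at least one read in a sample."""
    return sum(1 for n in counts.values() if n > 0)

def shannon(counts, base=math.e):
    """Shannon index H = -sum(p_i * log(p_i)); the log base is an assumption."""
    total = sum(counts.values())
    return -sum((n / total) * math.log(n / total, base)
                for n in counts.values() if n > 0)

# Hypothetical toy data, not taken from the study.
toy_table = {
    "s1": {"asv_a": 4000, "asv_b": 4000, "asv_rare": 3},
    "s2": {"asv_a": 9000, "asv_b": 1000, "asv_rare": 4},
    "s3": {"asv_a": 100, "asv_b": 50},  # below the 6,000-read threshold
}
filtered = filter_counts(toy_table)
```

In this toy table, sample `s3` is removed for low depth and `asv_rare` is removed because it has only 7 reads in total, so `s1` is left with two equally abundant ASVs.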
Next, we compared fecal microbiota diversity between obese individuals and their non-obese counterparts within each country independently. Greater alpha diversity was detected in non-obese subjects in the Ghanaian (Observed ASVs, Faith PD; p<0.05) and South African (Observed ASVs; p<0.05) cohorts only (Supplementary Table 1). Similarly, significant differences in beta diversity between obese and non-obese microbiota were observed in the Ghana (Unweighted UniFrac; p<0.05), South Africa (Unweighted UniFrac; p<0.05) and US (Weighted UniFrac; p<0.05) data sets (Supplementary Tables 2 & 3). These results suggest that the beta diversity differences observed in the Ghanaian and South African participants may partly be due to the presence of more abundant fecal microbiota taxa in the fecal samples, whereas among the US participants the differences may be related to the abundance of rare taxa. Collectively, these observations suggest that country is a major driver of the variance in gut microbiota diversity and composition among participants with or without obesity, with marked contributions from Ghana and South Africa and a modest contribution from the US in the overall cohort.
We also examined whether country of origin or obesity relates to the presence of specific microbial genera frequently used to stratify humans into enterotypes (Arumugam et al. 2011). As expected, large differences in enterotype between the countries were observed. The Prevotella enterotype (P-type) was enriched on the African continent, accounting for 81% of Ghanaians and 62% of South Africans, whereas the Bacteroides enterotype (B-type) was dominant in the US (75%) and Jamaican (68%) cohorts; individuals from Seychelles showed comparable proportions of both enterotypes. Further, obese individuals displayed a greater abundance of B-type, whereas a higher proportion of the P-type was associated with the non-obese group (Supplementary Table 4). Consistent with this observation, the B-type was associated with higher BMI than the P-type (p=0.004). Significantly greater diversity and increased levels of total SCFA were observed in participants with the P-type (Supplementary Table 4). The relative abundance of shared and unique features between the different countries, illustrated by the Venn diagram, showed that Ghana carries the largest proportion of unique taxa among the countries, and the US the lowest (Fig. 1).
Microbial taxa differ significantly between countries and between lean and obese individuals. In comparison with the US, South African fecal microbiota had a significantly greater proportion of Clostridium, Olsenella, Bacilli and Mogibacterium; Jamaican samples had a significantly greater proportion of Bacilli, Bacteroides, Clostridia, Dialister, Enterobacteriaceae, and Oscillospiraceae; Seychelles samples had a significantly greater proportion of Clostridium, Olsenella and Haemophilus; and Ghanaian samples had a significantly greater proportion of Clostridium, Prevotella, Weissella, Enterobacteriaceae and Butyricicoccaceae. The US samples had a significantly greater proportion of Adlercreutzia, Anaerostipes, Clostridium, Eggerthella, Eisenbergiella, Ruminococcaceae and Sellimonas compared to the other four countries (Fig. 2 and Supplementary Fig. 1).
When adjusted for country, age, and sex (p < 0.05; false discovery rate (fdr)-corrected), 38 Amplicon Sequence Variants (ASVs) were significantly different between obese and non-obese groups. The obese group was characterized by an increased proportion of Allisonella, Dialister, Oribacterium, Mitsuokella, and Lachnospira, whereas non-obese microbiota had a significantly greater proportion of Alistipes, Bacteroides, Clostridium, Parabacteroides, Christensenella, Oscillospira, Ruminococcaceae (UBA1819), and Oscillospiraceae (UCG010) (Fig. 2).
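The text reports FDR-corrected p-values but does not name the correction procedure. As an illustration only, the sketch below implements the Benjamini-Hochberg step-up adjustment, which is a common choice and is assumed here; the p-values are hypothetical and the function name is not from the study.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Assumption: BH is used purely as an example of an FDR correction;
    the source does not state which procedure was applied.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that each adjusted
    # value is the minimum of p * m / rank over all ranks at or above it.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

# Hypothetical raw p-values for five features.
raw_p = [0.001, 0.008, 0.039, 0.041, 0.60]
adj_p = benjamini_hochberg(raw_p)
significant = [i for i, q in enumerate(adj_p) if q < 0.05]
```

With these example values, the third and fourth features have raw p-values below 0.05 but are no longer significant after adjustment, which is the behaviour an FDR correction is meant to provide when many features are tested at once.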
Microbial taxonomic features predict obesity overall and within each country. Using supervised Random Forest machine learning, the capacity of gut microbiota features to stratify individuals by country of origin, sex, or metabolic phenotype was assessed. The predictive performance of the model was calculated by area under the receiver operating characteristic curve (AUC) analysis, which showed a high accuracy for country of origin (AUC = 0.97), and a comparatively lower level of predictive accuracy for obese state (AUC = 0.65) (Fig. 3). Sex was predicted with AUC = 0.75, diabetes status with AUC = 0.63, hypertensive status with AUC = 0.65 and glucose status with AUC = 0.66. Random Forest analysis was also used to identify the top 30 microbial taxonomic features that differentiate between countries and obese states. Similar to the ANCOM-BC results, Prevotella and Streptococcus were at a greater proportion in the microbiota of Ghanaian and non-obese individuals, whereas Mogibacterium was at a greater proportion in the South African cohort. A greater proportion of Megasphaera was associated with the Jamaican cohort, while a greater proportion of Ruminococcaceae was observed in the American microbiota. Weissella, which was identified as having a significantly greater proportion in the Ghanaian cohort using ANCOM-BC, was observed to be a discriminatory feature for Seychelles microbiota using Random Forest (Supplementary Fig. 2).
Similarly, the predictive capacity of the gut microbiota features in stratifying individuals by obese state was assessed at each of the five study sites. The predictive performance of the model was calculated by AUC analysis, which showed limited accuracy for obese state at all sites, with values at or near the chance level of 0.5: Ghana (AUC = 0.57), South Africa (AUC = 0.52), Jamaica (AUC = 0.48), Seychelles (AUC = 0.43) and US (AUC = 0.52) (Supplementary Fig. 3).
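To help read the AUC values above, recall that the AUC equals the probability that a randomly chosen positive case (here, an obese participant) receives a higher classifier score than a randomly chosen negative case, so 0.5 corresponds to chance and 1.0 to perfect separation. The sketch below computes the AUC directly from that definition. It does not reproduce the Random Forest models; the labels and scores are hypothetical.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: iterable of 0/1 class labels; scores: classifier scores.
    Tied pairs count as half a correct ranking.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Hypothetical examples: perfect, uninformative, and partial separation.
auc_perfect = roc_auc([1, 1, 0, 0], [0.9, 0.8, 0.2, 0.1])
auc_chance = roc_auc([1, 1, 0, 0], [0.5, 0.5, 0.5, 0.5])
auc_partial = roc_auc([1, 0, 1, 0], [0.9, 0.8, 0.4, 0.3])
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a small illustration; rank-based formulas give the same value more efficiently on large data sets.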
Predicted genetic metabolic potential differs by country and obesity status. The predicted microbial functional traits resulting from the compositional differences in microbial taxa between countries and obese states were assessed. PICRUSt2 predicted a total of 372 MetaCyc functional pathways. ANCOM-BC analysis adjusted for sex, age and BMI identified 67 pathways (fdr-corrected p < 0.05; LFC > 1.4) that discriminated the other four countries from the US (Supplementary Fig. 4). In comparison with the US, MetaCyc pathways differentially increased in Ghana and Jamaica included methylgallate degradation, norspermidine biosynthesis (PWY-6562), gallate degradation I pathway, gallate degradation II pathway, histamine degradation (PWY-6185), and toluene degradation III (via p-cresol) (PWY-5181). South African samples had a greater proportion of L-glutamate degradation VIII (to propanoate) (PWY-5088), isopropanol biosynthesis (PWY-6876), creatinine degradation (PWY-4722), adenosyl cobalamin biosynthesis (anaerobic) (PWY-5507), and respiration I (cytochrome c) (PWY-3781). MetaCyc pathways linked to norspermidine biosynthesis (PWY-6562) and mycothiol biosynthesis (PWY1G-0) were at a greater proportion in the Seychelles samples, whereas reductive acetyl coenzyme A (CODH-PWY) and chorismate biosynthesis II (PWY-6165) were depleted in the US samples. ANCOM-BC analysis adjusted for site, sex and age identified 24 predicted pathways that differentiated between obese and non-obese individuals (Supplementary Fig. 4). Notably, the microbiota of non-obese individuals had a greater proportion of predicted pathways including the TCA cycle, amino acid metabolism (P162-PWY, PWY-5154, PWY-5345), ubiquinol biosynthesis-related pathways (PWY-5855, PWY-5856, PWY-5857, PWY-6708, UBISYN-PWY), and cell structure biosynthesis and nucleic acid processing (PWY0-845, PYRIDOXSYN-PWY).
Next, KEGG orthologs (KOs) involved in pathways related to butanoate (butyrate) metabolism and LPS biosynthesis were investigated. Among predicted genes involved in butyrate biosynthesis pathways, enoyl-CoA hydratase enzymes (K01825, K01782, K01692), lysine/glutarate/succinate enzymes (K07250, K00135, K00247), and glutarate/acetyl-CoA enzymes (K00175, K00174, K00242, K00241, K01040, K01039) were differentially abundant in participants from Ghana, South Africa, Jamaica, and Seychelles in comparison to the US cohort. The relative abundance of succinic semialdehyde reductase (K18122) was significantly increased only in the South African, Jamaican, and Seychelles populations. Further, predicted genes proportionally abundant only in specific countries were observed. For instance, succinate semialdehyde dehydrogenase (K18119) was enriched only in the Ghanaian cohort, 4-hydroxybutyrate CoA-transferase (K18122) was enriched among South African participants, and a lysine/glutarate/succinate enzyme (K14268) was differentially abundant within the Seychelles population. The relative abundances of predicted genes encoding enzymes such as maleate isomerase (K10799), 3-oxoacid CoA-transferase (K01027) and pyruvate/acetyl-CoA enzymes (K00171, K00172, K00169) were greater in the US participants than in participants from the other four countries (Supplementary Fig. 5). The non-obese group exhibited a significantly greater abundance of genes that catalyze the production of butyrate via the fermentation of pyruvate or branched-chain amino acids, such as enoyl-CoA hydratase enzyme (K0182), leucine/acetyl-CoA enzyme (K01640) and pyruvate/acetyl-CoA enzymes (K00171, K00172, K00169, K1907). By contrast, obese individuals were differentially enriched for succinyl-CoA:acetate CoA-transferase (K18118) (Supplementary Fig. 5). All analyses were adjusted for country, sex, BMI and age (fdr-corrected p < 0.05).
Several predicted gut microbial genes involved in LPS biosynthesis were differentially enriched among the countries (fdr-corrected p < 0.05). In particular, specific LPS genes (K02560, K12973, K02849, K12979, K12975, K12974) were significantly enriched in Ghana, South Africa, Jamaica, and Seychelles when compared with the US. LPS genes including K12981, K12976, K09953, and K03280 were significantly increased in Seychelles samples in comparison with US samples and also significantly increased in the US cohorts in comparison with participants from Ghana, South Africa, and Seychelles. US samples had a greater proportion of the following genes (K15669, K09778, K07264, K03273, K03271) in comparison with the other four countries (Supplementary Fig. 6). Non-obese individuals had a greater abundance of predicted genes encoding LPS biosynthesis (K02841, K02843, K03271, K03273, K19353, K02850), whereas only one LPS gene (K02841) was differentially elevated in the non-obese group (Supplementary Fig. 6). All analyses were adjusted for country, sex, BMI and age (fdr-corrected p < 0.05).
Microbial community composition and taxonomy correlate with observed fecal SCFA concentrations. All countries had significantly higher weight-adjusted fecal total SCFA levels when compared to the US participants (p<0.001), with Ghanaians having the highest weight-adjusted fecal total SCFA levels (Supplementary Table 5). When compared to their obese counterparts, non-obese participants had significantly higher weight-adjusted fecal total and individual SCFA levels (Supplementary Table 6). Total SCFA levels displayed a weak but significantly positive correlation with Shannon diversity (r = 0.074). A similar trend was observed for the individual SCFAs, namely valerate (r = 0.19), butyrate (r = 0.12), propionate (r = 0.073) and acetate (r = 0.058) (Fig. 4). Observed ASVs were not significantly correlated with total SCFAs (p>0.05). Levels of acetate, butyrate and propionate exhibited strong, significant correlations with total SCFAs, whereas valerate levels correlated significantly negatively (r = -0.09) with total SCFAs. Next, we assessed whether levels of total SCFAs could be predicted by a mixed model. Country explained 45.7% of the variation in SCFAs. Neither obesity nor Shannon diversity had a significant effect.
To explore the connection between SCFAs and the gut microbiota, Spearman correlations between concentrations of SCFAs and taxa whose proportions differed significantly between countries were determined. Valerate negatively correlated with the proportions of Clostridium, Prevotella, Faecalibacterium, Roseburia and Streptococcus, which were all positively correlated with acetate, propionate, and butyrate. Conversely, the proportions of Christensenellaceae, Eubacterium, and UCG 002 (Ruminococcaceae) were significantly positively associated with valerate, and negatively correlated with acetate, propionate, and butyrate. In addition, only a single ASV annotated to Ruminococcus was observed to be positively associated with all four SCFAs (Fig. 5). Similarly, Spearman’s rank correlation coefficients were calculated between concentrations of SCFAs and the differentially abundant ASVs identified between the obese and non-obese groups. Broadly, the proportions of most ASVs were significantly positively associated with acetate in comparison with the other three SCFAs. Consistent with the correlations mentioned above, valerate negatively correlated with most ASVs that were positively correlated with the three major SCFAs (acetate, propionate, and butyrate), and vice versa. The relative proportions of ASVs belonging to Allisonella, Erysipelotrichaceae and Libanicoccus positively correlated with acetate, propionate, and butyrate, whereas significantly negative relationships were observed between Parabacteroides and Bacteroides abundances and the aforementioned SCFAs. Valerate showed significantly positive associations with Oscillospirales and Ruminococcaceae abundances and significantly negative correlations with Lachnospira and Eggerthella abundances (Fig. 5).
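Spearman's rank correlation, used throughout the paragraph above, is the Pearson correlation computed on the ranks of the two variables, with tied values sharing their average rank. The following minimal sketch shows the calculation on hypothetical values; the variable names and numbers are illustrative assumptions and do not come from the study data.

```python
import math

def average_ranks(values):
    """1-based ranks of values, with ties assigned their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def pearson(x, y):
    """Pearson correlation; undefined (division by zero) for constant input."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def spearman(x, y):
    """Spearman's rho as the Pearson correlation of average ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical relative abundances of one taxon and one SCFA concentration.
taxon_abundance = [0.10, 0.25, 0.40, 0.55, 0.90]
scfa_level = [12.0, 18.0, 30.0, 31.0, 95.0]
rho = spearman(taxon_abundance, scfa_level)
```

Because only ranks enter the calculation, the coefficient captures any monotonic relationship and is insensitive to the skewed distributions typical of relative abundance data.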