The Gut Microbial Diversity of Newly Diagnosed Diabetics but Not of Prediabetics Is Significantly Different from That of Healthy Nondiabetics

Gut microbiota is considered to play a role in disease progression, and previous studies have reported an association of microbiome dysbiosis with T2D. In this study, we have attempted to investigate gut microbiota of ND, PreDMs, NewDMs, and KnownDMs. We found that the genera Akkermansia and Blautia decreased significantly (P < 0.05) in treatment-naive diabetics and were restored in KnownDMs on antidiabetic treatment. To the best of our knowledge, comparative studies on shifts in the microbial community in individuals of different diabetic states are lacking. Understanding the transition of microbiota and its association with serum biomarkers in diabetics with different disease states may pave the way for new therapeutic approaches for T2D.

(P = 0.0041) compared to ND. Adiponectin did not change in any group compared to ND. Lipid peroxides, a marker for oxidative damage, were found to be significantly increased in both NewDMs (P = 0.0008) and KnownDMs (P = 0.0014) but not in PreDMs compared to ND, while total antioxidant capacity was found to be significantly (P = 0.029) low only in NewDMs compared to ND (Table 1).
Microbial diversity analysis and identification of differentially abundant microbial signatures. A total of nearly 44 million (43,902,890) high-quality sequences were retained after removal of low-quality sequences for taxonomic classification with average sequence reads of 430,420.49 ± 239,742.68 per sample (see Table S1 in the supplemental material). A total of 12,827 operational taxonomic units (OTUs) were observed from all four study groups after removing singleton OTUs. Taxonomic assignment was performed using a 97% similarity cutoff with Greengenes reference database v13_8. Good's coverage of ≥99% indicated a high degree of sequence coverage. In alpha diversity analysis, nonparametric indices such as the number of observed OTUs for richness and Simpson index for evenness were calculated. The observed number of OTUs showed that alpha diversity decreased significantly in NewDMs compared to ND (P = 0.0055) and KnownDMs (P = 0.0011), whereas a significant difference was not observed between KnownDMs and ND (Fig. 1A). Simpson index showed a significant increase in alpha diversity only in KnownDMs compared to ND (P = 0.0002) (Fig. 1A). Overall bacterial community composition was analyzed by using generalized UniFrac distances (20) followed by permutational multivariate analysis of variance (PERMANOVA) test (R = 0.07, P = 0.001) (Fig. 1B). The generalized UniFrac distance combines unweighted and weighted UniFrac distances in a common framework and is therefore able to detect a much wider range of biologically relevant changes. Two distinct clusters of KnownDMs and NewDMs were observed, whereas PreDMs formed an overlapping cluster with ND, indicating that the bacterial diversity of PreDMs is similar to that of ND. Interestingly, the diversity cluster of KnownDMs was found to be closer to ND than that of NewDMs.
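The alpha diversity and coverage measures named above can be illustrated with a minimal Python sketch. It assumes the Simpson index is taken as 1 minus the sum of squared relative abundances and Good's coverage as 1 minus the fraction of reads that are singletons; the function names and the example counts are hypothetical and are not taken from the study data.

```python
def observed_otus(counts):
    """Richness: number of OTUs with at least one read in the sample."""
    return sum(1 for c in counts if c > 0)


def simpson_index(counts):
    """Simpson index, assumed here to be 1 - sum(p_i^2)."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no reads")
    return 1.0 - sum((c / total) ** 2 for c in counts)


def goods_coverage(counts):
    """Good's coverage: 1 - (number of singleton OTUs / total reads)."""
    total = sum(counts)
    if total == 0:
        raise ValueError("sample has no reads")
    singletons = sum(1 for c in counts if c == 1)
    return 1.0 - singletons / total


# Hypothetical per-OTU read counts for one sample.
example_counts = [50, 30, 20, 0, 1]
print(observed_otus(example_counts))
print(round(simpson_index(example_counts), 4))
print(round(goods_coverage(example_counts), 4))
```

Applied per sample to a rarefied OTU table, such values could then be compared between groups with a nonparametric test.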
Significant differences in bacteria belonging to five phyla, namely, Bacteroidetes, Firmicutes, Proteobacteria, Actinobacteria, and Verrucomicrobia, were observed in the gut microbiota of diabetic subjects (Fig. 2). Bacteria belonging to the phyla Firmicutes and Proteobacteria were significantly increased, whereas those from Bacteroidetes were significantly reduced in NewDMs (P = 0.0009 and log2 fold change [log2 FC] = 1.09, P = 0.001 and log2 FC = 1.51, and P = 0.007 and log2 FC = −0.62, respectively) and KnownDMs (P = 0.0009 and log2 FC = 0.58, P = 0.006 and log2 FC = 0.99, and P = 0.0009 and log2 FC = −0.37, respectively) compared to ND (Fig. 2 and Table S2). The ratio of Firmicutes to Bacteroidetes was calculated for all study groups. It was 1:4.94 for ND and 1:4.24 for PreDMs, and it changed significantly in NewDMs to 1:1.49. In KnownDMs on antidiabetic treatment, it was found to be changed to 1:1.23 (Table S3). The phylum Verrucomicrobia was found to be significantly decreased in NewDMs compared to ND (P = 0.0009 and log2 FC = −14.2). In KnownDMs, the phylum Actinobacteria was found to be significantly increased compared to ND (P = 0.011 and log2 FC = 1.16). A total of 1,127 OTUs were found to be significantly different among the four study groups (P < 0.05). Of these OTUs, 10 OTUs belong to genus Akkermansia, 36 to Prevotella, 74 to Blautia, 24 to Ruminococcus, 45 to Escherichia, 50 to Lactobacillus, 4 to Megasphaera, 3 to Sutterella, and 5 to Acidaminococcus (Table S4). In all the study groups, 519 genera were identified after merging all the OTUs belonging to the same genus, though they differed in their abundance. Of these genera, Prevotella, Megasphaera, Akkermansia, Escherichia, Sutterella, Lactobacillus, Acidaminococcus, Blautia, and Ruminococcus were found to have higher abundance than other genera in all four groups (Fig. 3).
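The Firmicutes-to-Bacteroidetes ratio can be sketched in Python as follows. This is only an illustration, assuming the ratio is reported in the form 1:x, where x is Bacteroidetes abundance divided by Firmicutes abundance, and that per-sample ratios are averaged within a group; the function names and the abundance values are hypothetical.

```python
def fb_ratio(firmicutes, bacteroidetes):
    """Return x in the ratio 1:x (Firmicutes:Bacteroidetes) for one sample."""
    if firmicutes <= 0:
        raise ValueError("Firmicutes abundance must be positive")
    return bacteroidetes / firmicutes


def group_fb_ratio(samples):
    """Average the per-sample ratios over a group.

    samples: list of (firmicutes_abundance, bacteroidetes_abundance) pairs.
    """
    if not samples:
        raise ValueError("group has no samples")
    ratios = [fb_ratio(f, b) for f, b in samples]
    return sum(ratios) / len(ratios)


def format_ratio(x):
    """Format x in the 1:x style used in the text."""
    return f"1:{x:.2f}"


# Hypothetical abundances for two samples in one group.
example_group = [(10.0, 40.0), (20.0, 100.0)]
print(format_ratio(group_fb_ratio(example_group)))
```

Averaging per-sample ratios, rather than dividing group means, gives every subject equal weight regardless of sequencing depth.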
In NewDMs, Akkermansia, Blautia, and Ruminococcus showed significantly decreased abundance (P = 0.0009 and log2 FC = −14.2, P = 0.0009 and log2 FC = −2.52, and P = 0.006 and log2 FC = −0.39, respectively), and a similar trend was observed for Prevotella (P = 0.054 and not significant), one of the dominant genera found in Indian gut (19,21,22), while Lactobacillus (P = 0.01 and log2 FC = 5.27) showed increased abundance compared to ND. Significantly increased abundance of Megasphaera (P = 0.005 and log2 FC = 1.42), Escherichia (P = 0.003 and log2 FC = 1.96), and Acidaminococcus (P = 0.008 and log2 FC = 2.90) and decreased abundance of Sutterella (P = 0.003 and log2 FC = −0.66) were observed in KnownDMs compared to ND. In KnownDMs, increased abundance of Akkermansia was observed compared to NewDMs (P = 0.0009 and log2 FC = 13.48) (Fig. 3 and Table S2).
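The log2 fold changes quoted above compare the abundance of a taxon in a case group with that in a reference group. The text does not state how they were computed, so the following Python sketch is only one plausible reading: a log2 ratio of group means with a small pseudocount to avoid division by zero. The function name, the pseudocount value, and the example abundances are all hypothetical.

```python
import math


def log2_fold_change(case_values, control_values, pseudocount=1e-6):
    """log2 of (mean case abundance / mean control abundance).

    The pseudocount is an assumption of this sketch; it keeps the ratio
    finite when a taxon is absent from one group.
    """
    if not case_values or not control_values:
        raise ValueError("both groups need at least one value")
    case_mean = sum(case_values) / len(case_values)
    control_mean = sum(control_values) / len(control_values)
    return math.log2((case_mean + pseudocount) / (control_mean + pseudocount))


# Hypothetical relative abundances of one genus in two groups.
control = [0.020, 0.030, 0.025]
case = [0.004, 0.006, 0.005]
print(round(log2_fold_change(case, control), 2))
```

A negative value indicates lower mean abundance in the case group, and a very large magnitude typically arises when the taxon is almost absent from one of the groups.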
Random forest analysis. We used random forest analysis to identify differentially abundant or most discriminant features of microbiome and serum metabolites associated with the disease. Analyzing the microbial features, we found that Akkermansia and Sutterella are highly discriminative genera among four study groups with the highest mean decrease score (see Fig. S1A in the supplemental material). Among the serum biomarkers, fasting glucose, HbA1c, methionine, and total antioxidants are found to be highly discriminative parameters with the highest mean decrease score among four study groups (Fig. S1B).
Identification of driver genera between four study groups based on NetShift analysis. We generated microbial association networks for ND, PreDMs, NewDMs, and KnownDMs followed by mining only statistically significant (P < 0.05) positive association networks separately using the CCREPE (Compositionality Corrected by REnormalization and PErmutation) tool (http://huttenhower.sph.harvard.edu/ccrepe). To identify the driver genera between the case and control, NetShift workflow was performed (24). Driver genera can be identified based on the NESH score and node size. NESH is a Neighbor Shift score which represents directional changes in individual node associations, and a node represents each taxon. The node size is proportional to their respective NESH score, and a node is colored red if its betweenness increases from control to case. The nodes that are big and red are important community drivers (24). Comparison of ND (control) and PreDM (case) network (Fig. 4A) revealed Bifidobacterium, Faecalibacterium, Sutterella, and Phascolarctobacterium as the driver nodes (genera) with higher NESH scores (red color and bigger nodes), followed by Bacteroides, Blautia, Dorea, and Parabacteroides with low NESH scores (red color and smaller nodes) (Table S6). Among these driver genera, Blautia was found to be positively associated with major abundant genera such as Akkermansia, Clostridium, and Ruminococcus, along with other less abundant genera in ND (control). However, in PreDMs, Blautia showed positive association with Bacteroides, Butyricicoccus, and Faecalibacterium and not with Akkermansia, Clostridium, and Ruminococcus similar to ND, suggesting that Blautia may be a community driver for PreDMs. Another major driver Sutterella, which was found to be associated only with Bacteroides in ND, was associated with many other genera such as Bacteroides, Bifidobacterium, Butyricicoccus, Faecalibacterium, and Roseburia in PreDMs.
Comparison of ND (control) with NewDMs (case) revealed that Prevotella, Parabacteroides, Roseburia, Ruminococcus, and Sutterella had high NESH scores, indicating that these were the driver nodes (Fig. 4B and Table S6). Sutterella, one of the main drivers, was found to be associated with Bacteroides in ND and shifted its association in NewDMs to Dorea and Lachnospira. Another driver, Prevotella, showed association with Dialister and Oscillospira in ND, which was shifted to Blautia and Clostridium in NewDMs. Similarly, driver Ruminococcus was associated with Blautia, Clostridium, Coprococcus, and Dorea in the ND group and shifted its association to Oscillospira and Roseburia in NewDMs. Comparison of ND (control) with KnownDM (case) network revealed Dialister, Faecalibacterium, Haemophilus, Lachnospira, Phascolarctobacterium, Oscillospira, and Sutterella as top driver nodes (high NESH score), followed by Blautia, Akkermansia, and Streptococcus with low NESH scores (Fig. 4C and Table S6). In KnownDMs, Sutterella was found to be associated with Bacteroides, Bifidobacterium, Megasphaera, and Ruminococcus, but in ND, it showed association only with Bacteroides. Similarly, the genus Akkermansia in KnownDMs was found to be associated with Clostridium, Dialister, and [Eubacterium], while in ND, it was found to be associated with many different genera along with Clostridium and [Eubacterium]. Thus, from these analyses, Sutterella was found to be a common driver genus across three disease groups. Besides driver genus analysis, core hub communities among the four study groups were identified using the NetShift workflow (a detailed description of the NetShift workflow used for this analysis is given in Text S1 in the supplemental material) (Fig. S2A to C). We found significant change in core hub communities in NewDMs, while the core hub communities were similar in PreDMs and KnownDMs compared to ND.
Association of key taxa with biochemical parameters. To identify associations of microbial taxa with significantly altered serum biomarkers (P < 0.05), we applied the Spearman correlation method to nine significantly (P < 0.05) differentially abundant bacterial genera, namely, Prevotella, Akkermansia, Blautia, Megasphaera, Escherichia, Lactobacillus, Ruminococcus, Sutterella, and Acidaminococcus, that were found in all four groups and that were also identified as driver genera in NetShift analysis (Fig. 5).
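The Spearman correlation used for these genus-to-biomarker associations is the Pearson correlation of the ranked values. A minimal Python sketch follows, using average ranks for ties and omitting P values; the function names and the example values are hypothetical, and the study itself reports using the R package Hmisc for this step.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the whole run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman rank correlation coefficient between two equal-length lists."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of equal length >= 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    sxy = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sxx = sum((a - mx) ** 2 for a in rx)
    syy = sum((b - my) ** 2 for b in ry)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical genus abundance and HbA1c values for five subjects.
abundance = [0.050, 0.040, 0.030, 0.010, 0.002]
hba1c = [5.1, 5.4, 6.0, 7.2, 8.5]
print(spearman(abundance, hba1c))
```

In this toy example abundance falls as HbA1c rises, so the coefficient is strongly negative, which is the direction of the inverse association described for Akkermansia.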

DISCUSSION
T2D is a widespread metabolic disorder that leads to various chronic health complications. Recently, the gut microbiome has been recognized as a major driver in the establishment of T2D. There are reports indicating a dysbiosis of gut microbiota in T2D subjects in Caucasian and Indian populations (4,19). Vrieze et al. have reported that transfer of intestinal microbiota from lean donors to individuals with metabolic syndrome decreases insulin resistance (25).
Our earlier study has reported dysbiosis of gut microbiota in Indian diabetic subjects (19). In this study, we have analyzed the gut microbiome of PreDMs, NewDMs, KnownDMs on antidiabetic treatment, and ND individuals. Twenty-five different serum biomarkers were checked and compared with the gut microbiota to assess the different states of diabetes. Targeted 16S rRNA amplicon sequencing was used to assess the microbial diversity, community shuffling, and identification of driver taxa for the disease state. We have investigated relationships between a wide array of serum biomarkers responsible for progression of T2D with significantly diverged and differentially abundant taxa in each study group. Significantly different patterns were observed in the gut microbiota of PreDMs, NewDMs, and KnownDMs compared to ND. In KnownDMs, abundance of some microbial taxa was found to be similar to that of ND group.
[Fig. 4 legend fragment: colored nodes (circles), with a higher NESH score resulting in a bigger node. An edge (line) is assigned between the nodes; green represents microbial association only in the control, red represents association only in the case, and blue represents common microbial association of a node in the case and control.]
Increased levels of BCAA and AA are found to be associated with insulin resistance, obesity, and T2D (26). Adams reported that BCAA and its metabolites are elevated in the blood of diabetic subjects (27), and their increased levels are associated with inflammation and insulin resistance, characteristics of T2D (10,28). In our data, we found that BCAA and AAA remained elevated in both NewDMs and KnownDMs but not in PreDMs. We also found significantly low levels of histidine in NewDMs but not in PreDMs and KnownDMs. It has been reported that histidine supplementation in obese women with metabolic syndrome (29) and obese rats fed a high-fat diet (30) reduced insulin resistance, obesity, and metabolic syndrome by lowering inflammation and oxidative stress. Diabetic individuals are known to have a low-grade inflammation, and inflammatory markers are found to be elevated in their blood. We also found elevated levels of IL-6, an inflammatory cytokine (31), in NewDMs and KnownDMs but not in PreDMs, while LPS, a marker of low-grade inflammation, which induces metabolic endotoxemia (32) was found to be increased only in NewDMs. Since oxidative stress is known to be involved in the establishment of insulin resistance and diabetic complications (33), we measured total antioxidant capacity and lipid peroxides, a marker of oxidative damage to lipids in the blood. We found a significant decrease in total antioxidant capacity and increase in lipid peroxidation in treatment-naive NewDMs but not in PreDMs. In KnownDMs on treatment with metformin, an increase in total antioxidant capacity and decrease in lipid peroxidation were observed.
Earlier reports have demonstrated association of lower bacterial diversity with the disease condition (34,35). In our study, a significantly lower number of observed OTUs was found in NewDMs compared to ND, which increased in KnownDMs on antidiabetic treatment (Fig. 1A). A lower alpha diversity in NewDMs and higher alpha diversity in KnownDMs suggests that there is loss of bacterial diversity in the disease condition, and interestingly, antidiabetic treatment helps in regaining bacterial diversity. Additionally, we have analyzed diversity of rare taxa to understand its community structure along with abundant taxa in different study groups. Interestingly, a higher number of rare taxa in KnownDMs were observed compared to ND and NewDMs. These results suggest that altered diversity of rare taxa may play an important role in structural as well as functional attributes of gut microbiota after antidiabetic treatment. On the basis of the results of beta diversity analysis, we found that the microbial diversity of prediabetics (PreDMs) is similar to that of nondiabetics (ND). However, the bacterial diversity of treatment-naive diabetics (NewDMs) was found to be different from that of nondiabetics (ND) and diabetics on antidiabetic treatment (KnownDMs). Interestingly, in KnownDMs, the microbial diversity is observed to be trending toward that of ND, probably due to antidiabetic treatment. Microbial diversity analysis at the phylum level revealed higher abundance of Firmicutes and Proteobacteria and decreased abundance of Bacteroidetes among NewDMs and KnownDMs, similar to earlier reports (19,21).
At the genus level, microbial diversity analysis indicated that the levels of Prevotella, Akkermansia, Megasphaera, Blautia, Lactobacillus, Escherichia, Ruminococcus, Sutterella, and Acidaminococcus varied in the different study groups. Abundance of Akkermansia decreased significantly in NewDMs compared to ND. Decreased abundance of this mucin-degrading bacterial species is correlated with the onset of inflammation and metabolic disorders in mice (36,37). Protein AMuc_1100 from Akkermansia or pasteurized bacterium has been linked to reduction in fat mass development, insulin resistance, and dyslipidemia in mice (38). Metformin treatment commonly prescribed for diabetes has also been linked with higher levels of Akkermansia in diabetic patients (39) due to enhancement of mucin-producing goblet cells (40). We did not find any change in the abundance of Ruminococcus in PreDMs, in contrast to the report of Ciubotaru et al. (41). Additionally, we observed decreased abundance of Prevotella, Blautia, and Ruminococcus and increased abundance of Lactobacillus in NewDMs. Prevotella is one of the dominant taxa in the Indian population (19,21) and is known to be associated with a diet rich in plant-based polysaccharides (42,43). Prevotella is also known to produce propionate, a short-chain fatty acid (SCFA) (44), which promotes reduction of hepatic lipogenesis and helps in the reduction of lipids in blood (45). Taken together, these observations may indicate that a high abundance of Prevotella in ND and low abundance in NewDMs can be a distinct biomarker of diabetes in the Indian population. In a recent study, it was reported that host genetics-driven changes in microbiome composition result in increased levels of SCFAs, such as propionate, which increases the risk of developing T2D, suggesting a causal relationship between microbiota and type 2 diabetes (46). 
This warrants conducting genetics-driven microbiome association studies in the Indian diabetic population to understand the functional impact of SCFAs on host metabolism at the population level. Among Firmicutes, we observed decreased abundance of Blautia, a known producer of short-chain fatty acids (47), in NewDMs. In KnownDMs, recovery of Blautia was probably associated with antidiabetic treatment, as described in a study on an Asian population (48). Observations of high abundance of Lactobacillus (3) and decreased abundance of Akkermansia in NewDMs corroborate previous findings (5). In KnownDMs, we found increased abundance of Megasphaera, Escherichia, and Acidaminococcus and decreased abundance of Sutterella, which is similar to earlier findings (6,49–51). de la Cuesta-Zuluaga et al. reported that metformin treatment in diabetics is associated with increased abundance of Megasphaera in the Colombian population (39).
In gut microbiota, microbial community survives through their characteristics of mutualism and commensalism. During disease progression, the physiology of the host changes significantly, which affects the gut microbial community and their interaction pattern. Under these circumstances, some microbes act as key players in the community, known as driver microbes (24). Different microbes interacting with each other in the community constitute core taxa. We analyzed positive associations among highly abundant genera in each group. NetShift analysis of core hub communities revealed that ND subjects have the maximum number of core hubs representing common genera, which changed significantly in NewDMs. In PreDMs, in addition to core hubs observed in ND, Sutterella was identified as an additional core hub. Earlier reports have suggested that the genus Sutterella is found to be associated with many diseases such as type 1 diabetes and inflammatory bowel disease (IBD) (50,51). In KnownDMs, core hub communities were found to be similar to ND. Increased abundance of genus Sutterella has been reported earlier in prediabetic gut microbiota (52). We found that Sutterella was a major and common driver across all disease groups.
Further, we analyzed correlation of microbiota with biochemical parameters measured to assess the status of diabetes. We observed significant decrease in total antioxidant capacity and increase in lipid peroxides in NewDMs compared to ND. The abundance of Akkermansia was positively correlated with total antioxidant capacity and inversely correlated with lipid peroxides in all groups. Administration of live or attenuated Akkermansia to diabetic rats led to decrease in oxidative stress, lipotoxicity, GLP-1, LPS, inflammation, and increase in HDL and improvement in liver function (53). We did not find any significant inverse association of Akkermansia and inflammatory markers, although Akkermansia is reported to reduce low-grade inflammation (36). We observed a strong inverse association of Akkermansia with glucose and HbA1c, similar to that reported by Schneeberger et al. (36). Recently, administration of Akkermansia has been shown to improve glucose homeostasis in mice fed a HFD (high-fat diet) (40). A significant association between the genus Prevotella, the most abundant genus in the Indian gut, and parameters such as glucose, lipids, BCAA, and AAA was not observed. A study by Pedersen et al. (17) demonstrated a positive association of Prevotella with BCAA and suggested that increased levels of circulating BCAA are due to the high prevalence of Prevotella copri, which was not found in our study. We observed a higher level of Prevotella in ND than in NewDMs. Kovatcheva-Datchary et al. (54) demonstrated that consumption of a diet rich in plant-derived fibers improved glucose metabolism through increased abundance of Prevotella in the Caucasian responder group and that increasing Prevotella by fecal transplantation improved glucose metabolism in germfree mice.
Previously, an increased abundance of Lactobacillus was observed in Indian type 2 diabetic patients (19), and a positive correlation of Lactobacillus-derived metagenomic clusters with fasting glucose and HbA1c was observed in Caucasian type 2 diabetic patients (4). In our study, we also found a positive correlation of Lactobacillus abundance with glucose and HbA1c levels. This could be due to the higher genetic potential of Lactobacillus to utilize carbohydrates (55). However, analysis at a lower taxonomic level such as species or strain is required, since probiotic strains of Lactobacillus are reported to be beneficial for lowering blood glucose (56). In our study, we found increased abundance of Escherichia in KnownDMs, which was positively correlated with blood metabolites such as glucose, tyrosine, and lipid peroxides. It is known that metformin, which is commonly used as an antidiabetic agent, leads to disturbance of the intestinal microbiota and increases the abundance of opportunistic pathogens such as Escherichia (6,57). The increased abundance of genus Escherichia observed in our KnownDMs was possibly due to metformin. Further investigations are necessary to understand its positive correlation with blood metabolites in diabetic subjects.
Thus, this study gives us insights into the altered microbial community composition among different diabetic groups compared to ND and their association with clinical biomarkers in the Indian population. We are aware that the key limitation of this study is the sample size for PreDMs and NewDMs compared to both ND and KnownDMs. A larger study with more samples would help to generalize these findings. We also propose comparing prospectively gut microbiota changes in the same patient group before and after therapeutic introduction and to match it with prediabetic and nondiabetic subjects in future studies.
Conclusions. Our findings show differences in the gut microbiome in PreDMs, NewDMs, and KnownDMs compared to ND. In PreDMs, the gut microbiome does not change significantly from that of ND, whereas in NewDMs, both the abundance and diversity changed significantly, which in KnownDMs on antidiabetic treatment seems to be restored to some extent.

MATERIALS AND METHODS
Study population and sample collection. This is a retrospective study using a total of 102 subjects from the western region of India who were selected for this study during 2015 to 2016. All subjects were 30 to 60 years old. Healthy subjects with HbA1c of ≤5.7% were termed nondiabetic subjects (ND) (n = 35). Diabetic subjects with antidiabetic treatment for at least the past year with HbA1c of ≥6.5% were termed known diabetes mellitus subjects (KnownDMs) (n = 39). Newly diagnosed diabetic subjects who were not on any antidiabetic medication with HbA1c of ≥6.5% were termed newly diagnosed diabetes mellitus subjects (NewDMs) (n = 11, of which n = 5 are obese), and prediabetic subjects with HbA1c of 5.7% to 6.4% were termed prediabetics (PreDMs) (n = 17). All the study groups were differentiated based on the HbA1c level by ADA (American Diabetes Association) guidelines (58). The study and the experimental protocols were approved by the institutional ethical committee of the National Centre for Cell Science (NCCS) (Pune, India), and informed consent and metadata were obtained from all participants.
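The grouping rules above can be written as a small Python sketch. Two points are assumptions of the sketch rather than statements from the text: an HbA1c of exactly 5.7% is assigned to PreDM, because the stated ND and PreDM ranges both include that value, and subjects who do not fit any stated definition return None. The function and parameter names are hypothetical.

```python
def classify_subject(hba1c, on_treatment, treatment_years=0.0):
    """Assign a study group from HbA1c (%) and antidiabetic treatment status.

    Returns "ND", "PreDM", "NewDM", "KnownDM", or None when the subject
    does not match any of the group definitions.
    """
    if hba1c >= 6.5:
        if not on_treatment:
            return "NewDM"  # diabetic range, treatment naive
        # Known diabetics were on treatment for at least the past year.
        return "KnownDM" if treatment_years >= 1.0 else None
    if on_treatment:
        return None  # treated subjects below 6.5% are not defined in the text
    # Assumption: the shared boundary value 5.7% is counted as PreDM.
    return "PreDM" if hba1c >= 5.7 else "ND"


# Hypothetical subjects.
print(classify_subject(5.2, on_treatment=False))
print(classify_subject(6.0, on_treatment=False))
print(classify_subject(7.1, on_treatment=False))
print(classify_subject(7.1, on_treatment=True, treatment_years=2.0))
```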
The exclusion criteria for all four groups included antibiotic consumption in the last 3 months, any major gastrointestinal surgery, and presence of any known chronic or clinical disorder. All participants were screened before sampling, and an early morning stool sample was collected on the following day in a sterile stool container. Early morning fasting blood sample was also collected on the same day by phlebotomists from the clinical laboratory (Golwilkar Metropolis, Pune, India) to assess serum biomarkers. Fecal samples from all the subjects were collected and stored at −80°C until further processing, whereas blood samples were processed immediately.
Biochemical analysis. Fasting plasma glucose and glycated hemoglobin (HbA1c) were measured using hexokinase and high-performance liquid chromatography (HPLC) (Tosoh Bioscience, USA) method, respectively. Total cholesterol, triglycerides, and HDL cholesterol were measured by the serum enzymatic method. Apolipoproteins A1 and B were estimated by serum nephelometry (BN ProsPec system, Siemens, Germany). Vitamin B12, folic acid, and homocysteine were measured using competitive-binding immunoenzymatic assay. All measurements were done on an autoanalyzer (Architect Integrated CI-2800; Abbott, USA) at Golwilkar Metropolis, Pune, India. IL-6 and LPS levels in serum were estimated using a human IL-6 Quantikine high-sensitivity (HS) enzyme-linked immunosorbent assay (ELISA) kit (catalog no. HS600B; R&D Systems, MN, USA) and LPS ELISA kit (catalog no. CEB52Ge; Cloud Clone Corp, USA). Serum samples diluted 1:100 were used to measure adiponectin by ELISA (catalog no. DRP 300; R&D Systems, MN, USA). Blood plasma samples were deproteinated using sulfosalicylic acid (SSA). Deproteinated samples were used for the quantification of plasma amino acids by HPLC coupled with solvent delivery systems, autosampler, and photodiode array detector (all from Agilent 1100 series, Agilent Technology, Germany). A precolumn derivatization was done for analysis of the amino acids using a derivatizing agent, o-phthalaldehyde. From serum samples, assessment of total antioxidants was performed and measured spectrophotometrically at 450 nm by the protocol of Kambayashi et al. (59). Lipid peroxides were measured in plasma in nanomoles of malondialdehydes formed by the protocol of Acharya et al. (60).
DNA extraction and 16S rRNA gene amplicon sequencing. Total community DNA was extracted from all 102 samples using QIAamp stool DNA minikit (Qiagen, Germany) per the manufacturer's instructions. DNA was quantified using NanoDrop (ND-1000; Thermo Fisher Scientific, USA), and the quality of DNA was checked by gel electrophoresis. The DNA samples were subjected to amplification of 16S rRNA gene using V4 region-specific primers (V4 Forward [5′GTGCCAGCMGCCGCGGTAA3′] and V4 Reverse [5′GGACTACHVGGGTWTCTAAT3′]) (61). PCR was performed using the following conditions: initial denaturation at 95°C for 3 min; 25 cycles with 1 cycle consisting of 95°C for 30 s, 55°C for 30 s, and 72°C for 30 s; and a final extension step at 72°C for 7 min (61). A 2.5-µl DNA (5-ng/µl concentration) sample was used as a template in each 25-µl PCR mixture. After amplification, products were cleaned using AMPure XP beads (catalog no. A63882; Beckman Coulter, Inc., USA) and subjected to library preparation using Nextera XT library preparation kit (Illumina, USA) followed by limited cycle PCR to enrich the adapter ligated DNA molecules. Final cleanup was performed using AMPure XP beads to obtain libraries which were assessed for fragment size distribution using TapeStation (catalog no. 5067-5582; Agilent Technologies, USA) and were quantified using Qubit DNA (catalog no. Q32854; Thermo Fisher Scientific, USA) before sequencing. The quantified libraries were clonally amplified on cBOT and sequenced using Illumina HiSeq 2500 (Illumina Inc., USA) with 2 × 250 bp paired-end chemistry. Sequences retrieved from Illumina HiSeq sequencing are available at the NIH Sequence Read Archive (SRA) under the Bioproject identifier (ID) or accession no. PRJNA448494.
Bioinformatic analysis. Paired-end reads were assembled using PEAR v0.9.10 software (62). The assembled reads were trimmed by using cutadapt version 1.13 (63) to remove adapter sequences from both ends. The quality filtered sequences were used for further analysis using Quantitative Insights Into Microbial Ecology (QIIME) v. 1.9 (64). Operational taxonomic units (OTUs) were binned by using closed reference OTU picking strategy using UCLUST algorithm (64) at 97% sequence similarity using Greengenes database v13_8 (65). Representative sequences from each OTU were used for taxonomic assignment using the RDP classifier (66). Singletons were removed from the OTU table, and the OTU table was normalized for the least number of sequences (86,770 sequences per sample) and used for downstream analysis. To calculate the Firmicutes-to-Bacteroidetes ratio, the formula used was ratio = (A/B : B/B), where A is mean abundance for Bacteroidetes and B is mean abundance for Firmicutes. The ratio for each sample was calculated, and the average ratio is mentioned for each group.
Random forest analysis. Genera and metabolites important for differentiating disease status were identified using random forest algorithm. The top 30 most abundant genera present in all samples and serum metabolites were included for analysis. The ranking of genera and serum metabolites according to mean decrease in accuracy (mean decrease Gini score) was obtained from the random forest algorithm using default parameters (with ntree = 1,000) in the "randomForest" package in the R 3.4.0 environment, as mentioned in previous reports (67,68).
NetShift analysis. CCREPE (version 1.7.0) analysis (http://huttenhower.sph.harvard.edu/ccrepe) was performed separately for all study groups to identify the significant (P < 0.05) positive correlations among the highly abundant genera, which resulted in a positive edgelist. This edgelist was then used to analyze the driver microbes and core hub genera of study groups using the NetShift tool (24) available at https://web.rniapps.net/netshift/index_file.php.
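How a control edgelist and a case edgelist differ can be illustrated with a short Python sketch that splits edges into those found only in the control, only in the case, or in both. This is not the NESH score calculation and is not the NetShift tool itself; it only shows the edge comparison that underlies such network plots. The function names are hypothetical, and the example edges are limited to the Sutterella associations reported for ND and KnownDMs.

```python
def undirected(edge):
    """Return an edge as an ordered tuple so (a, b) and (b, a) compare equal."""
    a, b = edge
    return (a, b) if a <= b else (b, a)


def compare_edgelists(control_edges, case_edges):
    """Split edges into control-only, case-only, and common sets."""
    control = {undirected(e) for e in control_edges}
    case = {undirected(e) for e in case_edges}
    return {
        "control_only": control - case,
        "case_only": case - control,
        "common": control & case,
    }


def neighbors(edges, node):
    """Return the set of genera directly associated with a node."""
    result = set()
    for a, b in edges:
        if a == node:
            result.add(b)
        elif b == node:
            result.add(a)
    return result


# Sutterella associations as reported for ND (control) and KnownDMs (case).
nd_edges = [("Sutterella", "Bacteroides")]
knowndm_edges = [
    ("Sutterella", "Bacteroides"),
    ("Sutterella", "Bifidobacterium"),
    ("Sutterella", "Megasphaera"),
    ("Sutterella", "Ruminococcus"),
]

comparison = compare_edgelists(nd_edges, knowndm_edges)
print(sorted(comparison["common"]))
print(sorted(neighbors(knowndm_edges, "Sutterella") - neighbors(nd_edges, "Sutterella")))
```

A genus whose neighbor set changes substantially between control and case, as Sutterella does here, is the kind of node that a neighbor shift analysis is designed to highlight.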
Statistical analysis. All biochemical parameters were compared between study groups, two groups at a time, using the nonparametric Mann-Whitney test. The differentially abundant genera were analyzed by mean difference with the false-discovery rate (FDR) correction using the discrete FDR (DS-FDR) method (69). Kruskal-Wallis test followed by FDR correction was applied to the OTU table to derive differentially abundant diabetes-related biomarkers (OTUs) using the QIIME command group_significance.py. Generalized UniFrac distances were calculated using the GUniFrac package (20) to identify the differences among four study groups, i.e., ND, PreDMs, NewDMs, and KnownDMs, and permutational multivariate analysis of variance (PERMANOVA) was performed using the vegan package in R (https://cran.r-project.org or https://github.com/vegandevs/vegan). Spearman correlation was performed to identify associations among biochemical parameters and microbial genera using R package Hmisc (https://cran.r-project.org/web/packages/Hmisc/index.html), and visualization was done using ggplot2 package in R (70).
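An FDR correction of a list of P values can be sketched in Python as below. The sketch uses the Benjamini-Hochberg step-up procedure as a stand-in, which is an assumption: the text does not name the procedure applied after the Kruskal-Wallis test, and this is not the DS-FDR method used for the genus-level comparison. The function name and the example P values are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw P values for four OTUs.
raw = [0.01, 0.04, 0.03, 0.005]
adjusted = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adjusted]
print([round(p, 4) for p in adjusted])
print(significant)
```

Features whose adjusted P value stays below the chosen threshold would be retained as differentially abundant.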
Data availability. The data sets generated and/or analyzed during the current study are available in the following repositories. Raw data are available on NIH Sequence Read Archive (SRA) under the Bioproject ID PRJNA448494. To enable future analysis, metadata and OTU table data are available at https://github.com/aksbiome/Type-2-Diabetes-and-gut-microbiome.

SUPPLEMENTAL MATERIAL
Supplemental material is available online only. TEXT S1, PDF file, 0.1 MB.