Upper respiratory tract microbiome profiles in SARS-CoV-2 Delta and Omicron infected patients exhibit variant specific patterns and robust prediction of disease groups

ABSTRACT The SARS-CoV-2 pandemic has caused many deaths globally. It is hypothesized that the upper respiratory tract (URT) microbiome shares similarities with the lung microbiota and can modulate host immune responses to the virus. During the pandemic, several SARS-CoV-2 variants have emerged with different clinical outcomes and immune dysfunction, yet their association with changes in the URT microbiome, which may provide an assessment of lung health in the presence of variants with differential disease outcomes, has not been identified. In this study, we sequenced the V3–V4 region of the 16S rRNA gene from the URT microbiome of healthy controls (HC) and Delta- and Omicron-infected patients from Eastern India. COVID-19 patients showed higher inter-individual diversity (β) and lower intra-individual diversity (α) than healthy controls. The healthy control microbiome showed enrichment of commensals like Streptococcus symci, Prevotella melaninogenica, Neisseria perflava, Veillonella tobetsuensis, Veillonella nakazawae, Haemophilus parainfluenzae, Fusobacterium pseudoperiodonticum, and Bifidobacterium longum compared to COVID-19 samples. The Ct value of patients was significantly positively correlated with Streptococcus symci and Streptococcus toyakuensis, indicating a possible inverse relation with viral load. We also observed that a few bacterial taxa, like Staphylococcus caprae, Pseudomonas aeruginosa, Vibrio tritonious, and Rothia mucilaginosa, discriminate Omicron from Delta, which is, in turn, enriched mostly with Enterobacter mori, Acinetobacter baumannii, and Klebsiella pneumoniae. Further investigation showed that nine control-associated bacteria followed a decreasing trend in abundance across the three groups, viz., Control > Omicron > Delta. These bacteria also contributed to the successful prediction of disease groups with high accuracy (90% ± 0.5%) by random forest analysis.
IMPORTANCE The role of the upper respiratory tract (URT) microbiome in predicting lung health has been documented in several studies. Dysbiosis in COVID-19 patients has been associated with disease outcomes through modulation of the host immune system. However, although different SARS-CoV-2 variants are known to manifest distinct transmissibility and mortality rates in human populations, their effect on the composition and diversity of the URT microbiome has not been studied to date. Unlike the older variant (Delta), the newer variant (Omicron) has become more transmissible with lower mortality, and the symptoms have also changed significantly. Hence, in the present study, we have investigated the change in the URT microbiome associated with the Delta and Omicron variants and identified variant-specific signatures that will be useful in the assessment of lung health and can be utilized for nasal probiotic therapy in the future.

SARS-CoV-2 infection has emerged as a major public health concern in the 21st century. This novel coronavirus disease, known as COVID-19, was first reported from Wuhan, China, and has since spread rapidly to almost every country (1). According to WHO, as of 29 July 2022 approximately 572 million people had contracted COVID-19 and 6 million had died globally (2). In India, a total of ~44 million people have been affected and ~0.5 million deaths have occurred due to COVID-19 to date (3). The major route of infection for COVID-19 is through the nasal cavity. At first, the virus accumulates in the nasopharynx and/or oropharynx and is then aspirated into the lungs, causing more serious infection (4). Hence, it is of immense importance to maintain nasal and oral health during COVID-19 infection to prevent severe outcomes (5). A healthy nasal and/or oral microbiome, collectively termed the upper respiratory tract (URT) microbiome, is considered to be a protector of respiratory health and can provide resistance to invading pathogens (6). Alterations in the URT microbiome can potentially change the host immune response against viral and secondary bacterial infections (7,8). Several 16S rRNA gene sequencing and a few RNA seq-based studies from Europe, Asia, and America on the URT microbiome of COVID-19 patients have been reported. Most of these studies have observed that specific genera like Acinetobacter, Pseudomonas, Staphylococcus, Fusobacterium, etc. are high in abundance in the URT of patients (9-23).
Since the beginning of the COVID-19 pandemic, several prominent variants of SARS-CoV-2 have spread globally, including Alpha, Beta, Delta, and Omicron. Alpha and Beta emerged in the UK and South Africa, respectively, in late 2020 (24). Delta was first identified in India in late 2020 and became the major cause of a disastrous second wave in March-June 2021, killing more than 0.4 million people in the country (25). In late 2021, Omicron and its sub-variants were identified in South Africa, and since then it has been the major variant dominating globally (26). Delta resulted in a more severe form of the disease requiring oxygen and hospitalization compared to the newer variant Omicron, which showed relatively milder symptoms but greater transmissibility (27). It is evident from previous studies that the URT microbiome significantly overlaps with the lung microbiome and is therefore considered a good predictor of overall lung health (7). On the other hand, different variants of SARS-CoV-2 with varying symptoms and host immunological responses might have an impact on the URT microbiome, further impacting lung health. The majority of the studies published to date have characterized the URT microbiome in COVID-19 patients in comparison to healthy controls, but the effect of specific SARS-CoV-2 variants on driving dysbiosis in the URT has not been identified before (18). In our current study, we aimed to investigate the changes in URT microbiome composition and diversity associated with either the Delta or the Omicron SARS-CoV-2 variant in COVID-19 patients and compared them with those of healthy controls. To the best of our knowledge, such extensive results on the variant-specific signatures of the URT microbiome have never been reported in COVID-19 patients globally.

Study participants and their characteristics
In total, 43 COVID-19 patients (24 Delta and 19 Omicron) and 19 HC were recruited in this study. All 19 Omicron and 5 of the Delta samples were collected by the clinicians of COM&JNM Hospital in Kalyani from December 2021 to January 2022, and the remaining 19 Delta samples were collected from ICMR-Regional Medical Research Centre, Bhubaneswar during the COVID second wave and published recently by members of our group (28). Healthy control samples were collected from Kalyani during November-December 2021.
The average age of the patients and healthy controls was 35.95 Y ± 15.41 SD and 33.10 Y ± 12.26 SD, respectively. The average Ct value of the patients was 20.44 cycles ± 3.84 SD, with 44% (19/43) having Ct < 20 (29). The Charlson comorbidity index was > 1 (min-max: 0-3) in 34.88% of patients (n = 15). Among the 43 patients, post-COVID complications were found as follows: (i) six patients developed diabetes, (ii) three developed chronic obstructive pulmonary disease, (iii) three reported post-COVID dementia, and (iv) two reported hypertension. Details of the clinical symptoms observed during COVID-19 infection for Delta and Omicron are given in Table 1.

Lower microbial diversity was observed in patients compared to healthy controls
The total number of paired-end reads generated from all 63 biospecimens sequenced (COVID-19: 43, healthy control: 19, and negative control: 1) was ~84.20 M (average ~1.33 ± 0.4 M). After initial QA/QC and adjustment for the negative control (Supplementary information; Table S1), 44.24 M quality-filtered, non-chimeric, singleton-removed paired-end reads were obtained (maximum: 1.28 M, minimum: 0.29 M, and mean: 0.70 M) for amplicon sequence variant (ASV) binning, rarefaction analysis, and taxonomic classification. The total ASV count without singletons was 12,172. Rarefaction curves (Supplementary information; Fig. S1) of the α-diversity indices (Shannon and observed features) plateaued at ~40,000 reads, well below the minimum per-sample read count of 293,962, indicating that the sequencing depth is sufficient for our data. Intra-individual variability (α-diversity) measured by the Shannon, Chao1, and Simpson indices showed significantly higher (P < 0.01) microbiome diversity in healthy controls compared to COVID-19 patients (Fig. 1a through c).
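To make the three α-diversity indices named above concrete, the following is a minimal sketch of how they can be computed from per-taxon read counts of a single sample. It is not the QIIME2 implementation used in the study; the natural-log base for Shannon, the "1 minus sum of squared proportions" form of Simpson, the bias-corrected Chao1 formula, and the toy count vectors are all assumptions made for illustration.

```python
import math

def shannon(counts):
    """Shannon index H = -sum(p_i * ln p_i) over taxa with non-zero counts.
    Natural log is an assumption; some tools report log base 2."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)

def simpson(counts):
    """Simpson diversity in the form 1 - sum(p_i^2)."""
    total = sum(counts)
    return 1.0 - sum((c / total) ** 2 for c in counts)

def chao1(counts):
    """Bias-corrected Chao1 richness: S_obs + F1*(F1-1) / (2*(F2+1)),
    where F1 and F2 are the numbers of singleton and doubleton taxa."""
    observed = sum(1 for c in counts if c > 0)
    singletons = sum(1 for c in counts if c == 1)
    doubletons = sum(1 for c in counts if c == 2)
    return observed + singletons * (singletons - 1) / (2 * (doubletons + 1))

# Hypothetical samples: an even community and one dominated by a single taxon.
even_sample = [10, 10, 10, 10]
skewed_sample = [37, 1, 1, 1]

for name, sample in (("even", even_sample), ("skewed", skewed_sample)):
    print(name, round(shannon(sample), 3), round(simpson(sample), 3), chao1(sample))
```

A community dominated by one taxon scores lower on Shannon and Simpson than an even community with the same number of taxa, which is the pattern the text describes as lower α-diversity.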
Principal coordinate analysis (PCoA) using β-diversity (Bray-Curtis dissimilarity index) showed that COVID-19 patients and healthy controls formed separate clusters at both the genus and species levels (Fig. 1d and e). We used the PCoA points to statistically compare the PC1, PC2, and PC3 coordinate values between COVID-19 patients and controls at the species and genus levels separately. We observed that COVID-19 patients have significantly higher PC1 and PC2 coordinate values (Wilcoxon P-value < 0.05) compared to controls (Table S2). This shows a higher inter-individual variability across COVID-19 patients, whereas healthy control microbiome composition is mostly similar across individuals at both the genus and species levels.
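As an illustration of the Bray-Curtis dissimilarity underlying the PCoA above, the sketch below computes the index for pairs of abundance profiles and averages it within a group as a simple proxy for inter-individual variability. The helper `mean_pairwise_dissimilarity` and the toy profiles are hypothetical and are not part of the study's pipeline, which used the QIIME2 diversity plugin.

```python
from itertools import combinations

def bray_curtis(u, v):
    """Bray-Curtis dissimilarity: sum|u_i - v_i| / (sum(u) + sum(v))."""
    combined_total = sum(u) + sum(v)
    if combined_total == 0:
        return 0.0
    return sum(abs(a - b) for a, b in zip(u, v)) / combined_total

def mean_pairwise_dissimilarity(profiles):
    """Average Bray-Curtis dissimilarity over all pairs of samples in a group."""
    pairs = list(combinations(profiles, 2))
    return sum(bray_curtis(u, v) for u, v in pairs) / len(pairs)

# Hypothetical relative-abundance profiles (three taxa per sample).
controls = [[0.50, 0.30, 0.20], [0.45, 0.35, 0.20], [0.55, 0.25, 0.20]]
patients = [[0.90, 0.05, 0.05], [0.10, 0.80, 0.10], [0.05, 0.05, 0.90]]

print("controls:", round(mean_pairwise_dissimilarity(controls), 3))
print("patients:", round(mean_pairwise_dissimilarity(patients), 3))
```

In this toy example the patient profiles differ far more from one another than the control profiles do, mirroring the higher inter-individual variability reported for COVID-19 patients.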

URT microbiome in COVID-19 patients is depleted of control-associated bacteria
A total of 34 phyla, 924 genera, and 1,429 species were observed. Among them, those taxa with mean relative abundance >1% and present in at least 50% of individuals in either COVID-19 patients or healthy controls were termed the "core taxa." The "core taxa" consist of 7 phyla, 34 genera, and 42 species (Table 2). These core phyla, genera, and species contributed ~99%, ~85%, and ~70% of the total relative abundance, respectively.
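The core-taxa rule described above can be sketched as a simple filter. This is an illustrative reading, not the authors' code: it assumes that both criteria (mean relative abundance >1% and presence in at least 50% of individuals) are evaluated within the same group, that "present" means a non-zero abundance, and that abundances are expressed in percent. The function name, data layout, and toy values are hypothetical.

```python
def core_taxa(abundance, groups, min_mean=1.0, min_prevalence=0.5):
    """Return taxa meeting both criteria in at least one group.

    abundance: {taxon: {sample_id: relative abundance in percent}}
    groups:    {group_name: [sample_id, ...]}
    """
    core = []
    for taxon, per_sample in abundance.items():
        for sample_ids in groups.values():
            values = [per_sample.get(s, 0.0) for s in sample_ids]
            mean_abundance = sum(values) / len(values)
            prevalence = sum(1 for v in values if v > 0) / len(values)
            if mean_abundance > min_mean and prevalence >= min_prevalence:
                core.append(taxon)
                break  # one qualifying group is enough ("either ... or")
    return core

# Hypothetical toy data; missing entries are treated as zero abundance.
groups = {"patients": ["p1", "p2", "p3", "p4"],
          "controls": ["c1", "c2", "c3", "c4"]}
abundance = {
    "taxon_A": {"p1": 5.0, "p2": 3.0, "p3": 4.0},             # abundant and prevalent in patients
    "taxon_B": {"c1": 0.5, "c2": 0.7, "c3": 0.6, "c4": 0.4},  # prevalent but below 1%
    "taxon_C": {"c1": 12.0},                                  # abundant in a single sample only
}

print(core_taxa(abundance, groups))
```

Here `taxon_C` has a high mean abundance in controls but is found in only one of four individuals, so the combined criterion excludes it, whereas an abundance-only rule would have kept it.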
Linear discriminant analysis (LDA) combined with effect size (LEfSe) analysis revealed key bacterial taxa at the phylum, genus, and species levels, which enabled discrimination (FDR-P Kruskal-Wallis <0.05 and LDA Threshold > 3) between COVID-19 patients and     2). They are mostly reported to be found in the oral and nasal cavities of healthy individuals in previous studies (30-37), which supports our observation that they are depleted in COVID-19 patients and abundant in the control individuals.

Ct value is significantly positively correlated with Streptococcus
COVID-19 viral load as inferred by the Ct value was correlated with the core species to gain information on the relationship between viral load and URT microbiome composition. It was observed that Streptococcus symci and Streptococcus toyakuensis (Supplementary information; Fig. S2) are significantly positively correlated with Ct value (rho ≥ 0.3, P < 0.05), whereas the oral non-commensal Pseudomonas aeruginosa (38) is significantly negatively correlated (rho ≤ −0.4, P < 0.05) with Ct value (Supplementary information; Fig. S2). Streptococcus sp. is commonly found in the nasal cavity (30), and Pseudomonas aeruginosa is an oral pathogen (38,39). This indicates that control-associated bacteria in the URT are associated with higher Ct value and lower viral load, whereas pathogenic P. aeruginosa is associated with an increase in viral load.
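The rank correlation used above can be illustrated with a small, dependency-free sketch of Spearman's rho (Pearson correlation of average ranks, with ties sharing their mean rank). The study computed this in R; the Python functions and the Ct and abundance values below are hypothetical and serve only to show the calculation.

```python
import math

def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks

def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the rank vectors."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in rx))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in ry))
    return covariance / (spread_x * spread_y)

# Hypothetical data: Ct values and the relative abundance (%) of one species.
ct_values = [15.2, 18.9, 21.4, 24.0, 27.3]
commensal_abundance = [1.1, 2.0, 2.4, 3.9, 5.2]
pathogen_abundance = [9.5, 7.1, 6.8, 2.2, 0.4]

print(spearman_rho(ct_values, commensal_abundance))
print(spearman_rho(ct_values, pathogen_abundance))
```

Because a higher Ct value corresponds to a lower viral load, a positive rho with Ct (first call) implies an inverse relation with viral load, and a negative rho (second call) implies the opposite.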

URT microbiome diversity is lower in Delta-infected patients than in Omicron-infected patients and healthy controls
Estimates of intra-individual α-diversity indices show that Delta-infected patients have significantly lower URT microbiome diversity compared to healthy controls and Omicron-infected patients (Fig. 3a through c). In contrast, healthy control individuals showed significantly higher intra-individual diversity in URT microbiome composition compared to the two SARS-CoV-2 variant-infected groups. This hints at a possible dysbiosis with the predominance of a few bacteria in Delta-infected patients compared to healthy controls and Omicron-infected patients.
Inter-individual β-diversity index (Bray-Curtis) showed healthy controls, Delta-, and Omicron-infected patients clustering separately (Fig. 3d and e) at both genus and species levels. Interestingly, healthy controls clustered very closely, indicating similar microbiome composition among them.

Distinct URT microbiome composition in Delta- and Omicron-infected patients
The URT microbiome of Delta-infected patients (n = 24) was compared to that of the Omicron-infected patients (n = 19) to identify variant-specific signatures of the URT microbiome. Principal component analysis (PCA) on the core genera from both groups showed prominent clusters, indicating distinct microbiome composition in the two groups (Fig. 4a).
Interestingly, among the top five species discriminating Omicron samples, Prevotella melaninogenica, Neisseria perflava, and Veillonella nakazawae were also found above to predominate in the healthy control URT microbiome.

Lesser dysbiosis in Omicron than Delta-infected patients with respect to healthy controls
To investigate the extent of dysbiosis in Omicron- and Delta-infected patients compared to healthy controls, we performed PCA on the core species data for Delta (n = 24), Omicron (n = 19), and healthy control (n = 19) samples. The PCA plot (Fig. 6a) showed that Delta patients and healthy controls clustered separately along PC1 (13.36%), while Omicron patients mostly clustered along PC2 (9.12%). Most importantly, the Delta samples were clearly distinct from both the Omicron samples and healthy controls (Fig. 6a), forming a prominent cluster in the PCA plot that suggests a clear dysbiosis in them. We compared the PCoA coordinate median values for the PC1, PC2, and PC3 axes between (i) Delta-infected patients and healthy controls, and (ii) Omicron-infected patients and healthy controls. We observed that Delta patients have significantly higher PC1 and PC2 values (Wilcoxon P-value < 0.05) compared to controls. However, Omicron patients have significantly higher values only along the PC2 axis (Wilcoxon P-value < 0.05) and not along the PC1 axis. This shows that, relative to healthy controls, the dysbiosis in Omicron samples is less pronounced than in Delta samples (Table S3).
To further investigate the key taxa that may be contributing to this difference in dysbiosis, we performed one-way ANOVA and a post hoc test for linear trend on the core species to determine the changes in relative abundance among the three groups, viz., healthy controls, Omicron-, and Delta-infected groups. This revealed nine key species, Streptococcus symci, Veillonella tobetsuensis, Veillonella rogosae, Veillonella nakazawae, Haemophilus parainfluenzae, Fusobacterium pseudoperiodonticum, Prevotella melaninogenica, Neisseria perflava, and Neisseria subflava, that show a trend of higher to lower abundance as control > Omicron > Delta (FDR-P ANOVA <0.05, post hoc test for linear trend P < 0.05) (Fig. 6b; Fig. S3). These nine species showed a steady decrease in abundance from healthy controls to Omicron- and Delta-infected patients. Notably, all nine species were found above to be significantly enriched in healthy controls when compared with the COVID-19 patients.
During the long course of the pandemic, newer variants of SARS-CoV-2 have emerged at different time points, with varying transmissibility and disease severity, thus influencing lung health differentially, which can be deciphered from the composition and diversity of the URT microbiome of these patients. The newer variant Omicron is more transmissible with lower mortality compared to other variants. The symptoms of Omicron infection and host immunological reactions are also milder compared to the older variant Delta (43). However, to the best of our knowledge, there is no study highlighting the difference in microbiome composition and diversity between patients infected by either of the two variants.
Since different SARS-CoV-2 variants are known to affect disease outcome differently, and the URT microbiome has been shown to modulate host response in COVID-19 patients, we hypothesized in the current study that SARS-CoV-2 Delta- and Omicron-infected patients will exhibit variant-specific signatures in their URT microbiome that are also significantly different from that of healthy controls. For this, we initially compared the URT microbiome composition and diversity between COVID-19 patients and healthy controls and further characterized the dysbiosis associated with patients infected with the Delta vs Omicron variants. URT microbiome diversity in COVID-19 patients is significantly lower compared to healthy controls, consistent with five previous studies (9,10,12,21,23) but in contrast to three others (11,16,20), which showed no changes in alpha diversity between COVID-19 patients and healthy controls. This can be attributed to the differences in geographical locations and ethnicity as well as in sequencing approaches like metatranscriptomics, 16S, and shotgun metagenomic sequencing (9-23) undertaken by the respective studies.
To date, several methods based on abundance or occupancy (i.e., detection across samples) have been applied to identify the core microbiome. A number of microbiome studies have defined the core as (i) taxa present in at least 50% of the individuals (44,45) or (ii) taxa having at least 1% mean relative abundance (46) and performed further statistical analyses on them. However, several studies have also considered a combined criterion using microbial abundance and occurrence to be a robust method for core identification, which can explain more than 80% of the genera abundance (47,48). Hence, we have considered both abundance and occupancy for core microbiome identification, as in a previous publication from our lab (49). In this way, we have reduced the chance of including rare taxa that may have high abundance in only a few samples but are absent in the majority of the samples, thereby reducing the chance of getting false positives in the downstream analyses.
We have observed that Streptococcus symci, Prevotella melaninogenica, Neisseria perflava, Neisseria subflava, Bifidobacterium longum, Veillonella tobetsuensis, Veillonella nakazawae, and Corynebacterium accolens can significantly discriminate the URT microbiome of healthy controls from that of COVID-19 patients. These bacteria were previously reported to be commonly found in the oral and nasal cavities of healthy humans (30-35, 50, 51). Previous reports have shown that Neisseria subflava, Neisseria perflava, and Veillonella tobetsuensis are early colonizers (32,33,35) of the oral cavity and that Veillonella tobetsuensis specifically coaggregates with Streptococcus sp. (35). Both Veillonella tobetsuensis and Veillonella nakazawae are known to metabolize lactate produced by other oral bacteria (52). This metabolic activity results in the production of less acidic byproducts, such as acetate, propionate, and carbon dioxide, which help buffer the oral environment. This acid-neutralizing effect contributes to the prevention of dental caries and enamel erosion by maintaining an optimal pH level in the oral cavity (53). Neisseria subflava and Neisseria perflava possess antimicrobial properties that can help protect the oral cavity from harmful pathogens by producing bacteriocins and antimicrobial peptides that inhibit the growth of competing bacteria, thereby limiting the colonization of potentially harmful microbes.
Ct value, or cycle threshold value, is a metric used in quantitative polymerase chain reaction assays. In the context of viral infections, including COVID-19, Ct values are used to estimate the viral load in a patient's sample. A lower Ct value suggests a higher viral load, while a higher Ct value may indicate a lower viral load. We have found that Streptococcus symci and Streptococcus toyakuensis are significantly positively correlated with Ct value. On the contrary, Pseudomonas aeruginosa (11,39), a known nasal and oral pathogen, was found to be negatively correlated with Ct value. A similar observation was also reported for Streptococcus and P. aeruginosa by Rhoades et al. (11). A study by Gale et al. (54) showed a strong association between Pseudomonas aeruginosa and nasopharyngeal inflammation in RNA virus-infected immunocompromised patients with HIV. In these cases, Pseudomonas aeruginosa inversely correlated with CD4 cell count, which is crucial for immune regulation (54). Influenza virus infection has also been associated with increased susceptibility to Pseudomonas aeruginosa infection, contributing to lung damage through heightened MMP-9 expression (55,56). Pseudomonas aeruginosa is also commonly known as a biofilm-forming opportunistic pathogen, and a study from China showed that a specific strain of Pseudomonas aeruginosa with unique epigenetic modifications could exhibit increased biofilm production, thereby augmenting antibiotic resistance and facilitating in vivo colonization in COVID-19 patients (39). Conversely, many Streptococcus species possess adhesive molecules for their attachment to the epithelial layer of the upper respiratory tract (57). This hinders viral attachment to the host's respiratory epithelial layer. They can also regulate oral pH by secreting both H2O2 and alkali molecules, thereby inhibiting foreign pathogen establishment (57). This indicates that an increase in Streptococcus sp. and pathogenic P. aeruginosa in the URT of patients is indicative of reduced and increased viral load, respectively.
We have compared our observations with other URT microbiome studies (number of studies = 7) (11,12,17,19,20,58,59) on COVID-19 patients and healthy controls that performed only 16S rRNA gene sequencing (Supplementary information; Table S4). In those studies, Acinetobacter, Pseudomonas, and Staphylococcus are mostly found in the URT of COVID-19 patients (5/7; ~71% of the studies) (Supplementary information; Table S4), whereas Prevotella, Neisseria, and Streptococcus were commonly found in the URT of healthy controls (4/7; 57% of the studies). This is similar to our observation, where we found Streptococcus, Prevotella, and Neisseria to be discriminating control samples from COVID-19 samples, whereas Acinetobacter and Pseudomonas were found to be discriminating COVID-19 samples from controls (Fig. 2a).
Deeper investigation revealed nine key species, Streptococcus symci, Veillonella tobetsuensis, Veillonella rogosae, Veillonella nakazawae, Haemophilus parainfluenzae, Fusobacterium pseudoperiodonticum, Prevotella melaninogenica, Neisseria perflava, and Neisseria subflava, which showed a gradual decrease in abundance in the order healthy controls > Omicron > Delta-infected patients. All these species were found to be significantly higher in abundance in healthy controls when compared to COVID-19 patients. Our analysis also indicates that, relative to healthy controls, the extent of dysbiosis in the URT of Omicron-infected patients is lower than in Delta-infected patients, with enrichment of a few healthy URT-associated bacteria in the former compared to the latter.
The objective of our prediction model is to assess the predictive power of the microbiome composition for the different COVID-19 subtypes (Delta and Omicron) as well as healthy controls. Because URT microbiome data specific to Delta and Omicron variant-infected patients are lacking in the existing literature, and no external data sets were available that could serve as relevant test data for our prediction model, we strategically partitioned our own data. We divided our data into a training set (60%) and a test set (40%) to assess the model's performance and considered the test data as a new independent set, as done in reference (66). By training the random forest model on the training set and then evaluating it on the test set, we aimed to understand how well the discriminating microbial taxa for the Delta, Omicron, and healthy control groups, learned from a subset of our data, can be used to predict test data. The division of data into random training and test sets helped to evaluate the model's generalization capabilities. We also repeated the process 100 times, allowing us to assess the robustness of the associations and gain insights into the potential diagnostic or biomarker value of these microbial features.
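The evaluation scheme described above (a random 60/40 split repeated 100 times, with accuracy summarized across repeats) can be sketched as follows. To keep the example dependency-free, a nearest-centroid classifier is swapped in for the random forest; it is a stand-in only and is not the model used in the study, which relied on the R packages "caret" and "randomForest". The simple (non-stratified) split, the fixed seeds, and the simulated two-feature data are likewise assumptions for illustration.

```python
import random
import statistics

def fit_centroids(rows, labels):
    """Stand-in classifier (NOT a random forest): per-class feature means."""
    sums, counts = {}, {}
    for row, label in zip(rows, labels):
        acc = sums.setdefault(label, [0.0] * len(row))
        for j, value in enumerate(row):
            acc[j] += value
        counts[label] = counts.get(label, 0) + 1
    return {label: [s / counts[label] for s in acc] for label, acc in sums.items()}

def predict(centroids, row):
    """Assign the class whose centroid is closest in squared Euclidean distance."""
    return min(sorted(centroids),
               key=lambda label: sum((a - b) ** 2 for a, b in zip(row, centroids[label])))

def repeated_holdout(rows, labels, train_fraction=0.6, repeats=100, seed=0):
    """Repeat a random train/test split and return mean and SD of test accuracy."""
    rng = random.Random(seed)
    n_train = round(train_fraction * len(labels))
    accuracies = []
    for _ in range(repeats):
        indices = list(range(len(labels)))
        rng.shuffle(indices)
        train, test = indices[:n_train], indices[n_train:]
        model = fit_centroids([rows[i] for i in train], [labels[i] for i in train])
        hits = sum(predict(model, rows[i]) == labels[i] for i in test)
        accuracies.append(hits / len(test))
    return statistics.mean(accuracies), statistics.stdev(accuracies)

# Simulated, well-separated toy data with the study's group sizes (19/19/24).
data_rng = random.Random(42)
centers = {"Control": (0.0, 0.0), "Omicron": (10.0, 0.0), "Delta": (0.0, 10.0)}
sizes = {"Control": 19, "Omicron": 19, "Delta": 24}
features, labels = [], []
for group, (cx, cy) in centers.items():
    for _ in range(sizes[group]):
        features.append([cx + data_rng.uniform(-1, 1), cy + data_rng.uniform(-1, 1)])
        labels.append(group)

mean_accuracy, sd_accuracy = repeated_holdout(features, labels)
print(f"accuracy: {mean_accuracy:.3f} +/- {sd_accuracy:.3f}")
```

Reporting the mean and standard deviation over repeats, rather than a single split, reduces the dependence of the estimate on one particular partition of a small data set; it does not replace validation on an external cohort.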
In a nutshell, our study provides evidence that the different SARS-CoV-2 variants emerging during the course of the pandemic have shaped the upper respiratory tract microbiome in a variant-specific manner, which might impact overall lung health. This is an important addition to the growing scientific data on the COVID-19 microbiome and will be useful for future studies focusing on host-microbiome interactions in COVID-19 patients. The variant-specific microbiome signatures can also lead to differences in host immunological responses in patients infected with either the Delta or the Omicron variant. Our study also highlights that the URT microbiome of patients infected by the newer variant Omicron is more similar to that of healthy controls than is the microbiome of patients infected by the older variant Delta. This is an interesting observation hitherto unreported during the pandemic from any global population. This may have important implications for studying the nature of the newer variants and might help in understanding the trajectory through which the pandemic is now moving towards an endemic condition.
Studies have shown that nasal probiotic spray can result in effective recovery in sinusitis and respiratory syncytial virus infection (67). Clinical studies have shown persistent P. aeruginosa colonization in gut and oral samples of COVID-19 patients (11,39), supporting our observation, as it is most predominant in Omicron-infected patients. On the other hand, oral Bifidobacterium longum supplementation is suggested for treating dental inflammation and influenza infection (37,68). A functional study has also shown that oral Bifidobacterium longum supplementation can prevent P. aeruginosa-driven infection in mice (69). We have observed a very low abundance (<0.5%) of Bifidobacterium longum in both Delta- and Omicron-infected patients. Hence, our findings can help in further development of nasal probiotic spray for COVID-19 patients, taking into account the different variants that are already prevailing in the population as well as newer emerging variants.
Globally, there are seven URT microbiome studies on COVID-19 samples done by 16S sequencing. Our total sample size (n = 62) is greater than or comparable to that of four such studies (n = 19, n = 48, n = 59, and n = 71) (12,17,20,59) but smaller than that of the remaining three (n = 103, n = 134, and n = 141) (11,19,58) (Table S4). Thus, our sample size, constrained by the logistical difficulties of sample collection during the pandemic period, might be a limitation of this study. However, despite these issues, our study represents a valuable initial investigation into the association between specific SARS-CoV-2 variants and the URT microbiome, and to the best of our knowledge, no other study has explored variant-specific microbiome profiles in COVID-19 patients.

Study participants and sample collection
The Institutional Ethics Committees of College of Medicine and JNM Hospital, Kalyani, and National Institute of Biomedical Genomics, Kalyani approved this study. Written informed consent was obtained from all patients and controls. URT swabs (19) from 5 Delta- and 19 Omicron-infected patients were collected by clinicians of COM&JNM Hospital in Kalyani during December 2021 and January 2022. The variant-level information was obtained by viral RNA sequencing at NIBMG through INSACOG (70). Information on fever, cough, diarrhea, Ct value, and Charlson comorbidity index (71) was collected from all the patients by experienced clinicians. URT swabs of healthy volunteers were obtained from the same geographical region among those who had not tested positive for COVID-19 since the beginning of the pandemic. None of the participants had taken any antibiotics in the month prior to sample collection. Another set of 19 Delta samples was collected from ICMR-Regional Medical Research Centre, Bhubaneswar during the COVID second wave and published recently by members of our group (28). Sequence data files from our previous study were jointly analyzed with the data generated in our current study to obtain taxonomic profiles and diversity estimates using our analysis pipeline. None of the participants had any oral infection or conditions like periodontitis or xerostomia.

Microbiome DNA isolation and 16S rRNA gene sequencing
Microbiome DNA was isolated from the URT swabs collected from patients and healthy controls using the QIAamp BiOstic Bacteremia DNA Kit while maintaining the Bio-Safety (BSL-II) guidelines followed in the COVID laboratory of COM&JNM Hospital. Microbiome DNA was then transferred to NIBMG and stored at −20°C until further processing. Microbiome DNA was subjected to genomic characterization of both culturable and non-culturable microbial taxa by next-generation sequencing of the variable regions (V3-V4) of the 16S rRNA gene using paired-end (2 × 250 bp) sequencing chemistry on the Illumina NovaSeq 6000 platform (Supplementary information).

Microbiome data analysis
Adapter sequences were removed during the demultiplexing of raw files obtained from the sequencer. After obtaining the demultiplexed individual sample files, FASTQC reports were generated to check the quality of the demultiplexed raw reads (http://www.bioinformatics.babraham.ac.uk/projects/fastqc) (72). All the downstream bioinformatics analysis was performed using QIIME2 (v2022.8) (73). The paired-end reads were imported into the QIIME2 environment using a manifest file (73,74). In the first step, denoising of the raw reads was performed using the DADA2 (75) algorithm integrated in QIIME2 (73). The denoising step was performed by (i) removing the primers, (ii) not allowing any ambiguous base (N = 0), (iii) allowing read length > 250 bp, (iv) merging the paired-end reads with > 12 bp overlap, and (v) allowing error rate < 2. After the denoising step, an ASV feature table (comprising sample IDs with quality-filtered and non-chimeric read numbers) and representative sequences were generated. The representative sequences were then aligned with the SILVA-v.138 (76) reference database using QIIME2 to obtain taxonomic classification based on the VSEARCH (v 2.1.3) consensus taxonomy classifier (73,74,77) with default settings. Species identification was done by aligning the representative sequences obtained from ASVs by BLAST (78), and species were assigned only when the query fulfilled a BLAST identity > 99%. Rarefaction plots were generated with the actual number of reads for each sample to confirm whether (i) the number of ASVs and (ii) the estimated α-diversity index (Shannon) were independent of the inter-individual variation in the total number of reads generated for each individual, i.e., reached a plateau even with the minimum number of reads. The ASV feature table was further used for generating a phylogenetic tree and for α- and β-diversity estimation using the QIIME2 diversity plugin. Negative control adjustment was done for BSL-II and the DNA isolation kit by removing from the cases and controls those ASVs that are >1% in abundance in the negative control.
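The negative-control adjustment described above can be sketched as a filter on the ASV feature table. This is an illustrative reading rather than the authors' pipeline: it assumes that "abundance" means the relative abundance (in percent) of an ASV within the negative-control sample, and the function name, data layout, and read counts are hypothetical.

```python
def remove_contaminant_asvs(feature_table, negative_control, threshold_percent=1.0):
    """Drop ASVs whose relative abundance in the negative control exceeds the threshold.

    feature_table:    {asv_id: {sample_id: read_count}} for cases and controls
    negative_control: {asv_id: read_count} for the negative-control sample
    Returns the filtered table and the set of flagged ASV ids.
    """
    control_total = sum(negative_control.values())
    flagged = set()
    if control_total > 0:
        for asv_id, reads in negative_control.items():
            if 100.0 * reads / control_total > threshold_percent:
                flagged.add(asv_id)
    kept = {asv_id: counts for asv_id, counts in feature_table.items()
            if asv_id not in flagged}
    return kept, flagged

# Hypothetical read counts.
negative_control = {"asv_1": 990, "asv_2": 5, "asv_3": 5}
feature_table = {
    "asv_1": {"patient_1": 120, "control_1": 80},    # dominant in the negative control
    "asv_2": {"patient_1": 4000, "control_1": 3500},
    "asv_4": {"patient_1": 900, "control_1": 1500},  # absent from the negative control
}

filtered, flagged = remove_contaminant_asvs(feature_table, negative_control)
print(sorted(flagged), sorted(filtered))
```

ASVs that appear in the negative control only at trace levels (here 0.5%) are retained, which avoids discarding genuine taxa because of low-level cross-talk between samples.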

Statistical analyses
The taxonomic data did not show any significant differences (P-value PERMANOVA = 0.11) between the samples collected in COM&JNMH and ICMR-RMRC, as evident from the PCoA plot generated using the Bray-Curtis dissimilarity index; hence, the samples were pooled together for further statistical analysis (Supplementary information; Fig. S4).
Bacterial phyla, genera, and species with mean relative abundance >1% and present in at least 50% of individuals in either Omicron or Delta patients or healthy controls were termed the "core taxa," and these were carried forward for further analyses. Principal component analysis was performed on core taxa between study groups in R v4.1.1 using the "PCAtools" (81) and "vegan" (80) packages. Correlation analysis between core taxa and Ct value for COVID-19 patients was performed using Spearman's rank correlation test in R v4.1.1 using the "dplyr" and "Hmisc" packages (82,83). Regression analysis was performed using GraphPad Prism 8 (84) and subsequent plots were generated.
Multivariate linear discriminant analysis of effect size (85) was performed to obtain bacterial taxa significantly differentiating (i) COVID-19 patients from healthy controls, and (ii) Delta from Omicron groups. To identify the bacterial species that are significantly different between the Delta and Omicron groups, we performed a t-test for all the core species (n = 42) between the two groups and plotted the fold change in mean relative abundance for each taxon on the X-axis and the P-values on the Y-axis of a volcano plot using GraphPad Prism 8 (84). To investigate whether any trend in the distribution of the relative abundance of the core species existed among the Control, Omicron, and Delta groups, one-way ANOVA and a post hoc (Tukey-Kramer) test for linear trend were performed.
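The volcano plot coordinates can be illustrated with a short sketch that takes per-species P-values (assumed here to come from the t-tests) and group means as inputs. The multiple-testing correction is assumed to be Benjamini-Hochberg and the fold change is assumed to be on a log2 scale, neither of which is stated in the text; the function names, the pseudocount, and the toy values are hypothetical.

```python
import math


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def volcano_point(mean_omicron, mean_delta, adjusted_p, pseudocount=1e-6):
    """Return (log2 fold change, -log10 adjusted P) for one species.

    Positive x means higher mean relative abundance in Omicron than in Delta.
    The pseudocount (an assumption) avoids division by zero.
    """
    log2_fc = math.log2((mean_omicron + pseudocount) / (mean_delta + pseudocount))
    return log2_fc, -math.log10(adjusted_p)


# Hypothetical raw P-values for four species (assumption, not study data).
raw_p = [0.01, 0.04, 0.03, 0.20]
fdr_p = benjamini_hochberg(raw_p)
x, y = volcano_point(mean_omicron=0.08, mean_delta=0.02, adjusted_p=fdr_p[0])
```

With this convention, species enriched in Omicron fall on the right side of the plot and species enriched in Delta fall on the left.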
We hypothesized H0: there is no difference in the mean relative abundance of core species among the three groups (control, Omicron, and Delta), and HA: the mean relative abundance of core species will be highest in controls, followed by Omicron- and then Delta-infected patients. Random forest (RF) analysis was performed to assess the power of the core species to discriminate among the Delta, Omicron, and healthy control groups. RF analysis was carried out with the R packages "caret" (86) (https://topepo.github.io/caret/) and "randomForest" (87) (https://cran.r-project.org/web/packages/randomForest/index.html) (details in supplementary information).
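As an illustration of the data partitioning that precedes RF classification, the sketch below splits sample indices into the 60% training and 40% test sets reported in the FIG 7 caption. Stratifying by group and fixing a random seed are assumptions, since the partitioning details are given only in the supplementary information; the function name and group sizes are hypothetical, and the classifier itself is not shown.

```python
import random


def stratified_split(labels, train_fraction=0.6, seed=0):
    """Split sample indices into training and test sets within each group.

    Returns two sorted lists of indices (train, test).
    """
    rng = random.Random(seed)
    by_group = {}
    for index, label in enumerate(labels):
        by_group.setdefault(label, []).append(index)
    train, test = [], []
    for indices in by_group.values():
        rng.shuffle(indices)
        cut = round(len(indices) * train_fraction)
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return sorted(train), sorted(test)


# Hypothetical group sizes (assumption, not the study's sample counts).
labels = ["Delta"] * 10 + ["Omicron"] * 10 + ["Control"] * 10
train_idx, test_idx = stratified_split(labels)
```

Splitting within each group keeps the three classes represented in the same proportions in both sets, which matters when group sizes are small.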

FIG 3 Difference in α- and β-diversity estimates among Delta-, Omicron-infected patients, and healthy controls. (a-c) Violin plots showing group-wise (Delta, Omicron, and healthy controls) differences in α-diversity estimates (Chao1, Shannon, and Simpson) (P-value < 0.05). (d and e) PCoA plots based on the Bray-Curtis dissimilarity index of the microbial communities of each group (Delta, Omicron, and healthy controls) at the (d) genus and (e) species levels, respectively.

FIG 4 Taxonomic difference between Delta- and Omicron-infected patients. (a) PCA using core species level taxonomic data of samples from patients infected with the Delta and Omicron variants. (b) Volcano plot illustrating the significant association of microbial species with Omicron. The X-axis shows the estimate of the log of fold change (FC) of the mean relative abundance, and the Y-axis shows the −log10 of the FDR P-value. The species positively associated with the Omicron group are on the right side ("∆") and those negatively associated are on the left side ("•").

FIG 5 LEfSe analysis of core genera and species of two groups, i.e., Delta- and Omicron variant-infected patients. The most differentially abundant (a) genera and (b) species identified by LEfSe analysis (LDA score > 3 and P-value < 0.05) are shown. "Green" color denotes taxa discriminating for Omicron-infected patients (n = 19), and "red" color denotes taxa discriminating for Delta-infected patients (n = 24).

FIG 6 Taxonomic difference among healthy controls, Delta-, and Omicron-infected patients. (a) PCA using core species level taxonomic data of Delta samples, Omicron samples, and healthy controls. (b) Box and whisker plots showing the trend of significant decrease in abundance of nine key species from healthy controls to Omicron to Delta variant-infected samples (one-way ANOVA FDR P-value < 0.05, post hoc test for linear trend P < 0.05).

FIG 7 Random forest analysis using core species. (a) Variable importance plot based on the mean decrease accuracy value from the random forest classifier among the three groups (Delta, Omicron, and healthy control). (b) Confusion matrix showing prediction of the three groups based on the initial training data set (60% of the total data) and the test data set (remaining 40% of the data).

TABLE 2 Significantly abundant core genera and their respective mean relative abundance in COVID-19-infected patients, healthy controls, and Delta and Omicron patient groups

TABLE 3 Significantly abundant core species and their respective mean relative abundance in COVID-19-infected patients, healthy controls, and Delta and Omicron patient groups
November/December 2023 Volume 11 Issue 6 10.1128/spectrum.02368-23

a -, not significant.