Oral Microbiota Perturbations Are Linked to High Risk for Rheumatoid Arthritis

Oral microbial dysbiosis is known to increase an individual's susceptibility to rheumatoid arthritis (RA). Individuals at risk of RA may pass through different phases of disease progression. In this study, we aimed to investigate whether and how oral microbial communities are altered before the onset of RA symptoms. Seventy-nine saliva samples were collected from 29 high-risk individuals, who were positive for anti-citrullinated protein antibodies (ACPA) and had no clinical arthritis, 27 RA patients, and 23 healthy controls (HCs). The salivary microbiome was examined using 16S ribosomal RNA gene sequencing. Alpha and beta diversity analyses and linear discriminant analysis were applied to examine the bacterial diversity, community structure, and discriminatory taxa among the three groups, respectively. The correlations between salivary bacteria and autoantibodies were analyzed. In the “pre-clinical” stage, salivary microbial diversity was significantly reduced compared with RA patients and HCs. Compared with HCs, and like RA patients, individuals at high risk for RA showed a reduction in the abundance of the genus Defluviitaleaceae_UCG-011 and the species Neisseria oralis, but an expansion of Prevotella_6. Unexpectedly, the relative abundance of Porphyromonas gingivalis, reported as an opportunistic pathogen in RA development, was significantly decreased in high-risk individuals. Additionally, we identified four genera in the saliva of high-risk individuals that correlated positively with serum ACPA titers, and two further genera that correlated inversely. In summary, we observed a characteristic compositional change of salivary microbes in individuals at high risk for RA, suggesting that oral microbiota dysbiosis occurs in the “pre-clinical” stage of RA and is correlated with systemic autoimmune features.


INTRODUCTION
Rheumatoid arthritis (RA) is a systemic autoimmune inflammatory disease that primarily involves the joints. Over the past years, research focusing on the earliest stage of RA has led to the discovery of RA-related systemic inflammation and autoimmunity in the pre-clinical stage. The presence of circulating autoantibodies, elevated cytokine and chemokine levels, and increased acute phase reactants precede clinical arthritis (Rantapaa-Dahlqvist et al., 2003; Berglin et al., 2004; Nielen et al., 2004; Jorgensen et al., 2008; Sokolove et al., 2012). Prospective studies define ACPA-positive individuals as a population at risk for developing RA, and the risk in subjects with seropositivity and arthralgia can be as high as nearly 30% within 1 year (van de Stadt et al., 2011). In view of potential preventive strategies, the preclinical and earliest stages of RA are likely to represent important therapeutic windows within which disease outcomes can be dramatically modulated. Exploring risk factors and biomarkers for RA, especially in the earliest stage of disease, is therefore an urgent need.
A complex interplay between genetic and environmental factors contributes to RA etiopathogenesis. A long-standing theory of an infectious etiology of RA has recently been re-emphasized, and recent studies have further corroborated it by indicating mucosal origins of the disease. Data from epidemiological and translational studies suggest that environmental exposure and dysbiosis at mucosal sites (lung, gastrointestinal tract, and oral cavity) have causative roles in the development of RA (Scher et al., 2012, 2013, 2016). Of note, Porphyromonas gingivalis, a periodontal pathogen, is capable of producing a peptidylarginine deiminase (PAD) enzyme that citrullinates antigens. The presence of antibodies to P. gingivalis was positively correlated with increased ACPA titers (Mikuls et al., 2012; Lappin et al., 2013). Additionally, a recent study identified another species in the oral sites of RA patients that promotes the production of citrullinated antigens (Konig et al., 2016). This species, Aggregatibacter actinomycetemcomitans, led to dysregulated PAD function and release of hypercitrullinated proteins by inducing neutrophil migration and the formation of neutrophil extracellular traps (NETs) (Hirschfeld et al., 2016; Konig et al., 2016). Animal studies further showed that two periodontal pathogens, P. gingivalis and Prevotella nigrescens, aggravated collagen-induced arthritis in mice (de Aquino et al., 2014). These observations support the theory that oral dysbiosis may be an origin of autoantigen production and a risk factor for RA.
Although the oral microbiome is substantially altered in RA patients (Scher et al., 2012; Zhang et al., 2015), little is known about its status in the initial stages of disease development. Whether this alteration precedes clinically evident arthritis and is associated with disease development, or merely represents a resultant or concomitant phenomenon of the disease, is unclear. As ACPA is highly specific (Schellekens et al., 2000), detectable early, and predictive of rapid progression and erosion in RA (Nielen et al., 2004; Ronnelid et al., 2005; Syversen et al., 2008; Rombouts et al., 2015), individuals with ACPA positivity are at increased risk for RA, especially those with arthralgia (van de Stadt et al., 2011). We therefore sought to test whether the oral microbiome exhibits distinct taxonomic features in seropositive individuals at high risk for RA, which could provide a potential avenue for early detection, intervention, or prevention.

Participants and Study Design
Individuals at high-risk for RA were recruited from West China Hospital, Sichuan University, China. These subjects had a positive serum antibody for ACPA, with or without arthralgia at the time of enrollment. Absence of arthritis was confirmed by physical examination of 44 joints. RA patients were diagnosed according to the American College of Rheumatology (ACR) 2010 classification for RA. Most RA patients were receiving oral disease-modifying anti-rheumatic drugs (DMARDs) and/or corticosteroids at the time of enrollment. Patients receiving biological agents were excluded. Age-, gender-, and ethnicity-matched healthy controls (HCs) with no personal history of inflammatory arthritis were recruited. ACPA-negative profiles for HCs were obtained from the health management center. Subjects from all three study groups were ≥18 years old. Individuals having a history of antibiotics treatment or surgery in the last 3 months, current extreme diet, major organ dysfunction, cancer, other rheumatic or autoimmune diseases including osteoarthritis, systemic lupus erythematosus, Sjögren syndrome, diabetes were excluded. A total of 79 participants who met the inclusion and exclusion criteria were enrolled, including 29 high-risk individuals, 27 RA patients and 23 HCs. The study procedure was approved by the Biomedical Research Ethics Committee, West China Hospital of Sichuan University (ChiCTR1900022605), and written consent was obtained from all participants according to the Declaration of Helsinki. Sociodemographic factors and clinical activity are summarized in Table 1.

Periodontal Health Evaluation and Saliva Collection
Periodontitis was assessed using a self-reported questionnaire covering bleeding on brushing teeth, non-traumatic loose or missing teeth, or periodontal disease diagnosed by a dentist. Individuals reporting any of these issues were recorded as positive. Participants were asked to refrain from eating, drinking, or smoking for 30 min prior to sample collection. For saliva collection, participants were first asked to rinse their mouth with bottled water to remove food debris, keep their lips shut for 3 min, and then spit saliva directly into a 50 ml sterile Falcon tube (Becton). After collection, samples were immediately frozen and stored at −80 °C.

Bacterial DNA Extraction
Microbial DNA was isolated from saliva using the FastDNA® SPIN Kit for Soil and the FastPrep® Instrument (MP Biomedicals, Santa Ana, CA) according to the manufacturer's protocols. The concentration and purity of the final DNA were determined with a NanoDrop 2000 (Thermo Scientific, USA), and quality was checked by 1% agarose gel electrophoresis. The V3-V4 hypervariable regions of the bacterial 16S rRNA gene were amplified with primers 338F (5′-ACTCCTACGGGAGGCAGCAG-3′) and 806R (5′-GGACTACHVGGGTWTCTAAT-3′) on a thermocycler PCR system (GeneAmp 9700, ABI, USA). The resulting PCR products were extracted from a 2% agarose gel, further purified using the AxyPrep DNA Gel Extraction Kit (Axygen Biosciences, USA), and quantified using QuantiFluor™-ST (Promega, USA) according to the manufacturer's protocol.
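The 806R primer above contains degenerate IUPAC codes (H, V, W), so locating it in a read requires matching each code against a set of bases rather than a single base. The following is a minimal sketch of such a check, not the pipeline actually used in the study. The function names are hypothetical, and the default tolerance of two mismatches is taken from the primer criterion stated in the sequence-processing section of this paper.

```python
# IUPAC nucleotide codes mapped to the bases they stand for.
IUPAC_CODES = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "S": "CG", "W": "AT", "K": "GT", "M": "AC",
    "B": "CGT", "D": "AGT", "H": "ACT", "V": "ACG", "N": "ACGT",
}

PRIMER_338F = "ACTCCTACGGGAGGCAGCAG"
PRIMER_806R = "GGACTACHVGGGTWTCTAAT"


def primer_mismatches(primer, read_prefix):
    """Count positions where the read base is not allowed by the primer code."""
    return sum(
        base not in IUPAC_CODES[code]
        for code, base in zip(primer, read_prefix)
    )


def has_primer(read, primer, max_mismatches=2):
    """Return True if the read starts with the primer within the tolerance.

    Hypothetical helper: the tolerance of 2 follows the criterion in the text.
    """
    if len(read) < len(primer):
        return False
    return primer_mismatches(primer, read[:len(primer)]) <= max_mismatches
```

For example, a read beginning with GGACTACAAGGGTATCTAAT matches 806R with no mismatches, because A is permitted by H, V, and W.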

Illumina MiSeq Sequencing and Processing of Sequencing Data
Purified amplicons were pooled in equimolar amounts and paired-end sequenced on an Illumina MiSeq platform (Illumina, San Diego, USA). Raw fastq files were quality-filtered by Trimmomatic and merged by FLASH with the following criteria: (i) the reads were truncated at any site receiving an average quality score <20 over a 50 bp sliding window, (ii) sequences whose overlap was longer than 10 bp were merged according to their overlap, with no more than 2 bp of mismatch, and (iii) sequences of each sample were separated according to barcodes (exact matching) and primers (allowing 2 nucleotide mismatches), and reads containing ambiguous bases were removed. Operational taxonomic units (OTUs) were clustered with a 97% similarity cutoff using UPARSE (version 7.1), which uses a "greedy" algorithm that performs chimera filtering and OTU clustering simultaneously. The taxonomy of each 16S rRNA gene sequence was analyzed by the RDP Classifier algorithm (http://rdp.cme.msu.edu/) against the Silva (SSU132) 16S rRNA database using a confidence threshold of 70%. The fastq files were deposited into the NCBI Sequence Read Archive (SRA) database (Accession Number: PRJNA578951).
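Criterion (i) above can be illustrated with a small sketch. It is an assumption-laden illustration of sliding-window truncation in general, not a reimplementation of Trimmomatic, whose exact cut position and handling of short reads may differ. The function name and the choice to cut at the start of the first failing window are hypothetical.

```python
def truncate_by_sliding_window(quals, window=50, min_mean=20):
    """Return how many leading bases to keep from a read.

    quals is a list of per-base Phred quality scores. The read is cut at the
    start of the first window whose mean quality is below min_mean. Reads
    shorter than the window are evaluated as one single window (assumption).
    """
    n = len(quals)
    if n == 0:
        return 0
    w = min(window, n)
    window_sum = sum(quals[:w])
    for start in range(n - w + 1):
        if start > 0:
            # Slide the window one base to the right.
            window_sum += quals[start + w - 1] - quals[start - 1]
        if window_sum / w < min_mean:
            return start
    return n
```

With the defaults, a read of 60 bases at quality 30 followed by 60 bases at quality 10 is cut once the window contains 26 low-quality bases, since that is the first point at which the window mean drops below 20.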

Statistical Analysis
To determine statistically different bacterial taxa among the three groups, we applied the Kruskal-Wallis H-test, with multiple testing corrected by the Benjamini-Hochberg false discovery rate (FDR) procedure. A post-hoc test was applied to further determine the difference between each pair of groups when the overall test among the three groups was significant. Linear discriminant analysis (LDA) effect size (LEfSe) analysis was applied to detect the most discriminatory taxa among groups, and features with an LDA score above the cut-off of 3.0 were identified. For cross-sectional analyses of baseline characteristics and comparison of diversity indexes among the three groups, differences were evaluated using the one-way ANOVA test, corrected by FDR. The ANOSIM test was applied to the binary Euclidean distance matrix containing all analyzed samples to determine whether the overall structure of the microbiota was significantly different between the groups. Spearman's correlation analyses were used to assess potentially clinically relevant associations on all taxa. The correlation network between the genera was plotted using Cytoscape. Significant correlations with an absolute value of the Spearman correlation coefficient (rho) >0.5 were plotted. Two-tailed P values < 0.05 were considered significant.
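The Benjamini-Hochberg correction mentioned above can be written in a few lines. The sketch below is a generic, standard-library version of the procedure, shown only to make the adjustment concrete; it is not the software used in the study, and the function name is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order.

    Each p-value is scaled by m / rank (rank 1 = smallest p), and a running
    minimum taken from the largest rank downward keeps the adjusted values
    monotone and capped at 1.
    """
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

For the raw p-values 0.01, 0.04, 0.03, and 0.005, the adjusted values are 0.02, 0.04, 0.04, and 0.02, so all four taxa in this toy case would remain below a 0.05 threshold.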

Characteristic of Participants
Detailed demographic characteristics of the participants enrolled in the study are given in Table 1. Age and gender were comparable among the three groups. The mean disease duration in RA patients was 17.9 months (median 12.5 months) and the mean DAS28 was 4.98, reflecting the presence of moderate disease. Nine out of 29 individuals reported having a history of arthralgia in high-risk group, although no arthritis based on physical examination of 44 joints was detected at the time of enrollment. Among RA patients, ACPA and IgM-RF positivity were 93 and 85%, respectively, compared to 100 and 14% in high-risk individuals (ACPA, P = 0.23; IgM-RF, P < 0.0001). Compared to the high-risk individuals, the sera antibodies titers in RA patients were 1.  each sample increased sharply before reaching a plateau, which indicates that the number of bacterial sequences obtained represented the bacterial communities well, as the rarefaction curves tended toward saturation ( Figure 1A). When compared to HCs and RA patients, oral microbial alpha-diversity was significantly reduced in high-risk individuals (designated "Pre"), as shown by the Shannon diversity index (Pre vs. RA), the ace community richness index and Faith's phylodiversity index (Pre vs. HCs) (Figures 1B-D). Subsequently, we analyzed whether the overall structure of the oral microbiota of HCs differed from that of RA and at-risk individuals based on binary euclidean distance. We further applied PCoA analysis to cluster samples along orthogonal axes of maximal variance. As shown in Figure 1E, beta-diversity plots differentiated the oral microbiota of HCs, RA patients and high-risk individuals (ANOSIM test; R = 0.1657, P = 0.001).
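Two of the quantities used in this section, the Shannon diversity index and the binary Euclidean distance between samples, can be sketched directly from OTU count vectors. This is a minimal illustration under stated assumptions: the natural logarithm is assumed for the Shannon index, "binary" is taken to mean presence/absence of each OTU, and the function names are hypothetical rather than taken from the study's software.

```python
import math


def shannon_index(counts):
    """Shannon diversity H = -sum(p * ln p) over OTUs with nonzero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def binary_euclidean(counts_a, counts_b):
    """Euclidean distance between the presence/absence profiles of two samples.

    With 0/1 values, the squared distance equals the number of OTUs that are
    present in exactly one of the two samples.
    """
    differing = sum((a > 0) != (b > 0) for a, b in zip(counts_a, counts_b))
    return math.sqrt(differing)
```

A sample with four equally abundant OTUs has a Shannon index of ln 4 (about 1.39), while a sample dominated by a single OTU has an index of 0, which is the sense in which a lower index reflects reduced diversity.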

Alteration of Specific Taxa Abundance in High-Risk Individuals
To further probe the distinct bacterial taxa among groups, we first analyzed the relative abundance of the most abundant taxa. The pie charts revealed that the dominant phyla across all subjects were Firmicutes, Proteobacteria, Bacteroidetes, Fusobacteria, and Actinobacteria, which together accounted for more than 95% of bacterial sequences (Figure 2A). We then analyzed the differential phyla among the three groups. Notably, the relative abundance of Firmicutes increased gradually from HCs, high-risk individuals to RA patients (Figures 2A,B, P = 0.0421 for RA vs. HCs), while Proteobacteria showed an inverse trend of transition (Figures 2A,B, P = 0.0003199 for RA vs. HCs). The ratio of Firmicutes to Proteobacteria significantly increased from HCs to high-risk individuals to RA patients as evaluated by Chi-square test (Supplementary Figure 1, P = 0.0083). Other taxa including Actinobacteria and Patescibacteria showed a similar trend of enrichment in high-risk individuals and RA patients vs. HCs (Figures 2A,B).
We then applied the Kruskal-Wallis test followed by FDR correction and the LEfSe method to analyze more specific differences in microbiota composition among the three groups. At the genus level, Defluviitaleaceae_UCG-011 was significantly decreased in both RA and high-risk individuals compared to HCs (Figure 2C). Characteristic genus-level changes were found in high-risk individuals, with Rothia increased (Pre vs. RA) and Filifactor decreased (Pre vs. HCs, Figure 2C). The abundance of three genera, Actinomyces, Prevotella_6, and Parvimonas, showed tendencies of gradual change across the stages of disease represented by the three groups, although significant differences were reached only between RA patients and HCs. The genera enriched in RA patients compared to HCs included Actinomyces, Prevotella_6, and Selenomonas_3, whilst Neisseria, Haemophilus, Parvimonas, and Eubacterium_yurii_group were diminished in RA patients (Supplementary Figure 2). These findings were further verified by the LEfSe analysis, as shown in the cladogram, and the discriminatory taxa with an LDA score >3 were plotted for each group (Figures 3A,B).
The titres of antibodies to a periodontopathic species, P. gingivalis, are associated with the presence of RA-related autoantibodies in at-risk individuals (Mikuls et al., 2012;Johansson et al., 2016). A further study indicates P. gingivalis not only induced periodontitis, but also worsened the concurrent T cell-dependent arthritis in mice (de Aquino et al., 2014). We, therefore, assumed an elevated abundance of P. gingivalis in the saliva of high-risk individuals and RA patients. Contrary to our initial hypothesis, there was no significant difference in its relative abundance in RA patients compared to HCs. Moreover, the relative abundance of P. gingivalis was significantly decreased in high-risk individuals (Pre vs. HCs, Figure 3C). We also found two uncultured species within the Rothia genus and Porphyromonas genus more abundant in high-risk individuals than in RA patients. Intriguingly, among other differentially abundant species in RA patients (Supplementary Figure 3), an unclassified species annotated within the Prevotella_6 genus was elevated in the saliva of RA patients (RA vs. HCs, P = 0.009). The relative abundance of Neisseria oralis was decreased in both RA patients and high-risk individuals ( Figure 3C).

Systemic Autoimmune Signature in High-Risk Individuals and RA Patients Is Associated With Characteristic Saliva Taxa
To further investigate whether the observed changes in saliva microbiota were associated with the autoimmune characteristics in high-risk individuals and with other disease parameters in RA, we analyzed the correlations of the relative abundance of bacterial genera with (1) serum concentrations of ACPA and RF in high-risk and RA individuals, (2) disease activity parameters including CRP, ESR, and DAS28, and (3) course of the disease in RA patients.
FIGURE 3 | LEfSe analysis revealed the specific taxa changes in high-risk individuals (Pre) and RA patients. LEfSe analysis was applied to identify differentially abundant taxa, which are highlighted on the phylogenetic tree in cladogram format (A) and for which the LDA scores of more than 3 are shown (B). (C) Species abundance changes unique to at-risk individuals, and those consistent with changes in RA patients. LDA, linear discriminant analysis; LEfSe, the LDA effect size. *p < 0.05; **p < 0.01; and ***p < 0.001.
The study revealed that in high-risk individuals, serum ACPA concentration was positively correlated with the relative abundance of Eubacterium nodatum_group, Peptostreptococcus, Tannerella, norank_o__Absconditabacteriales_SR1, while conversely associated with Haemophilus and Neisseria (Figure 4A), both of which were significantly decreased in RA patients (Supplementary Figure 2). A negative correlation with Pseudomonas was found for RF in high-risk individuals ( Figure 4A).
In RA patients, however, no genus was correlated with the disease course, which indicates a relatively stable saliva bacterial community in the time spans (Figure 4B). Contrary to the study in high-risk individuals, only norank_o__Absconditabacteriales_SR1 was positively associated with serum ACPA concentration in RA patients. Notably, the concentration of RF was significantly associated with the relative abundance of Abiotrophia, Corynebacterium, Fretibacterium, and Prevotella_1. Clinical disease activity parameter DAS28 positively correlated with Ruminococcaceae_UCG-014, which was also associated with CRP and ESR. In addition, Lactobacillus and norank_f__Saccharimonadaceae, both increased in RA patients (Supplementary Figure 2), positively correlated with ESR ( Figure 4B). Finally, CRP value at the time of enrollment was also positively associated with Lactobacillus and negatively with Oribacterium.

Examination of Interactions Among Differentially Abundant Microbes
A correlation network was constructed to investigate the co-abundance and co-exclusion interactions between the differentially abundant microbes. The correlated genera were from eight phyla as indicated in Figure 5A. Most of the correlations within the community were positive, with only a few negative correlations. Gemella genus was negatively associated with Megasphaera and Prevotella_6. Prevotella_6, on the other side, was also negatively associated with Porphyromonas. Additionally, significant negative correlation was found between Atopobium and Neisseria.
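The edge selection behind such a network can be sketched as follows: compute Spearman's rho for every pair of genera and keep pairs whose absolute rho exceeds 0.5, the threshold given in the Statistical Analysis section. This is a simplified standard-library illustration, not the study's workflow. It omits the significance filter on P values that the paper also applies, assumes each genus varies across samples, and uses hypothetical function names and toy genus labels.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in range(pos, end + 1):
            ranks[order[k]] = mean_rank
        pos = end + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho as the Pearson correlation of the rank vectors.

    Assumes neither vector is constant (otherwise the variance is zero).
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / math.sqrt(var_x * var_y)


def network_edges(abundance, threshold=0.5):
    """Return (genus1, genus2, rho) for pairs with |rho| above the threshold.

    abundance maps a genus name to its relative abundances across samples.
    """
    genera = sorted(abundance)
    edges = []
    for i, g1 in enumerate(genera):
        for g2 in genera[i + 1:]:
            rho = spearman_rho(abundance[g1], abundance[g2])
            if abs(rho) > threshold:
                edges.append((g1, g2, rho))
    return edges
```

A positive rho in the returned list corresponds to a co-abundance edge and a negative rho to a co-exclusion edge; the resulting list could then be exported to a network tool such as Cytoscape for plotting.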

Models of Saliva Bacterial Biomarkers Profile and Predicted Function in RA Patients
We next used the machine learning random forests algorithm to construct a prediction model (Breiman, 2001). A panel of 11 genera was selected based on the model (Figures 5B,C). The efficacy of these differentially expressed bacteria in discriminating between RA patients and HCs was calculated using a receiver operator characteristic (ROC) curve. The area under the ROC curve (AUC) was 80.0%, and the 95% confidence interval (CI) was 67-93% ( Figure 5D).
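The AUC reported above has a simple interpretation: it is the probability that a randomly chosen RA patient receives a higher model score than a randomly chosen healthy control, with ties counted as one half. The sketch below computes the AUC from that definition. It is a generic illustration with hypothetical names and toy inputs, not the random forest or validation procedure used in the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels uses 1 for cases (e.g., RA) and 0 for controls (e.g., HCs);
    scores are the classifier outputs, higher meaning more case-like.
    """
    pos = [s for label, s in zip(labels, scores) if label == 1]
    neg = [s for label, s in zip(labels, scores) if label == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))
```

On this scale, 0.5 corresponds to a classifier that ranks cases and controls no better than chance, and 1.0 to perfect separation.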
We then applied PICRUSt (Langille et al., 2013) to infer the functional content of the microbiota. The associations between differentially abundant taxa and predicted KEGG (Kyoto Encyclopedia of Genes and Genomes) functional pathways are illustrated in Supplementary Figure 4A. Specifically, an increase in OTUs predicted to contribute to the bacterial toxins pathway, the carbohydrate digestion and absorption pathway, and the starch and sucrose metabolism pathway was observed in RA patients compared to HCs. By comparison, OTUs involved in fatty acid biosynthesis, glutathione metabolism, glycan biosynthesis and metabolism, inorganic ion transport and metabolism, and the phosphatidylinositol signaling system were decreased in RA patients (Supplementary Figure 4B).

DISCUSSION
Our study identified, for the first time, compositional alterations of oral microbiota in individuals at high-risk for RA, who have developed systemic autoimmunity associated with RA. Importantly, several saliva taxa changes were associated with serum autoantibody levels or clinical disease parameters in high-risk individuals and/or RA patients. A bacterial biomarker panel was constructed to discriminate RA patients from healthy individuals. Predicted functions involved in the bacterial toxins pathway and in carbohydrate digestion and absorption were over-represented, while pathways involved in fatty acid biosynthesis, glycan biosynthesis and metabolism, and phosphatidylinositol signaling were under-represented in RA patients. These findings support the hypothesis that microbiome changes occurring in mucosal sites such as the oral cavity might contribute to disease pathogenesis in the initial stages of RA.
We found that the microbial diversity in the oral cavity was comparable between HCs and RA patients. This is in line with a prior study that reported similar microbial richness and diversity in subgingival samples from new-onset RA patients (NORA) and HCs (Scher et al., 2012). Similar results were also reported in stool samples from RA patients and HCs (Zhang et al., 2015). Notably, we found that in "pre-clinical" high-risk individuals, microbial diversity and richness were significantly reduced compared to RA patients and HCs. The overall composition of saliva microbial communities also differed among the three groups, suggesting that oral microbiota perturbations already exist in the "pre-clinical" stage of RA.
Although some genera were significantly altered in RA patients (discussed later), only a few genera characterized the oral microbiome of high-risk individuals, including Rothia and Filifactor. Of note, some species of the genus Rothia have been identified as opportunistic pathogens and can cause bacteremia, endocarditis, joint infections, and pneumonia (Boudewijns et al., 2003; Verrall et al., 2010; Ramanan et al., 2014; de Steenhuijsen Piters et al., 2016). We found that the Rothia genus, and an uncultured species annotated within the genus, were enriched in high-risk individuals compared to RA patients. Interestingly, the Filifactor genus (including Filifactor alocis and an unclassified species) was diminished in high-risk individuals. In fact, F. alocis has been associated with periodontitis and periodontal biofilm formation. Similarly, even though suspicions exist that the PAD-producing P. gingivalis might be involved in RA pathogenesis (Mikuls et al., 2012; de Aquino et al., 2014), our study found comparable levels of P. gingivalis in RA patients and HCs, and no association of this species with autoantibody titer was observed. This finding is in agreement with recent studies that did not find an association between P. gingivalis or its PAD and RA (Scher et al., 2012; Konig et al., 2015). Interestingly, we found a decreased level of P. gingivalis in high-risk individuals, even though periodontitis prevalence was similar to that in HCs. The unique oral microbial features in high-risk individuals reflect the characteristics of the pre-clinical stage.
FIGURE 5 | Saliva bacterial biomarkers characterize RA patients. (A) Plots of co-abundance and co-exclusion association networks between differentially abundant genera. Each node represents one genus. The node size is proportional to the mean relative abundance of the genus in all samples. Node color indicates the phylum it belongs to. Lines between nodes show positive correlations (solid orange lines) or negative correlations (dashed green lines). Line width is proportional to the value of the Spearman correlation coefficient and reflects the magnitude of association. (B-D) A random forest model was applied to identify bacterial biomarkers for RA patients. Ranked lists of genera in order of random forests reporting feature importance scores were obtained (B) and an AUC-validation method was used to determine the optimal genera set (C). The ROC curve built on the selected panel of 11 genera yielded an AUC of 0.8 (D).
As expected, only a minority of taxa changes characterized the high-risk stage, while most of the significant differences were found only between diagnosed RA patients and HCs. Importantly, some taxa exhibited a tendency of gradual change across the developmental stages of the disease, including three genera, Actinomyces, Prevotella_6, and Parvimonas, and an unclassified species within Prevotella_6. Indeed, the enrichment of Actinomyces and Prevotella has been reported in RA patients' oral samples (Scher et al., 2012; Zhang et al., 2015). Recently, much attention has been paid to the potential pathogenic role of Prevotella spp., especially the intestinal species Prevotella copri, in RA. The presence of P. copri in fecal samples was strongly correlated with disease in NORA (Scher et al., 2013). A recent study further highlighted the association of Prevotella spp. with RA pathogenesis by revealing its enrichment in fecal samples of "pre-clinical" RA individuals (Alpizar-Rodriguez et al., 2019). Our study identified an increase of Prevotella_6 in saliva samples with disease progression, which further indicates that Prevotella, not only in the gut but also at other mucosal sites, may be involved in RA pathogenesis. Additionally, consistent with a previous study in DMARD-naïve RA patients (Zhang et al., 2015), both Haemophilus spp. and Neisseria spp. were depleted in our group of RA patients. Intriguingly, although not significantly reduced compared to HCs, the relative abundance of both genera negatively correlated with the level of serum ACPA in high-risk individuals. Further, in line with the study by Zhang et al. (2015), Lactobacillus spp. was elevated in the saliva of our RA patients, and its level was positively correlated with acute inflammation markers such as CRP and ESR.
Given these tantalizing clues indicating a contribution of microbial dysbiosis at various mucosal sites to RA pathogenesis, it is likely that complex microbiome networks at multiple mucosal sites act together in this process.
However, our study has limitations. The oral cavity is home to one of the most diverse microbial communities of the human body, and flowing saliva, which contains the "whole-mouth" bacterial community, is the most commonly studied sample type (Mascitti et al., 2019). However, bacteria colonize all sites within the oral cavity, forming different habitats with non-overlapping microbial populations. Regional variations, such as the bacterial profiles in the supragingival plaque of tooth surfaces and in subgingival plaque, were not investigated. Samples from multiple oral sites may help build a panoramic view of the association of the oral microbiome with the disease (Mascitti et al., 2019). Additionally, the relatively small number of participants limits the statistical power of the study, given that samples from individuals in pre-clinical stages are difficult to acquire. A larger study including more participants would help draw a more concrete conclusion. Another limitation is that most of the established RA patients involved were under treatment, which may affect their salivary microbiome. Even though our main study focus was on the high-risk individuals, who were treatment-naïve, a treatment-naïve group of RA patients would also help improve the study. Finally, the assessment of periodontitis was based on self-reported questionnaires regarding periodontitis symptoms and medical history. Periodontitis status might be underestimated or overestimated in an individual. However, as the questionnaire was applied to all individuals independently, a presumably equal readout would be expected. Further, our study did not find an association between the abundance of specific periodontitis pathogens and autoantibody titer in either high-risk individuals or RA patients. The seemingly paradoxical finding of a lower abundance of the periodontitis-associated pathogen P. gingivalis in high-risk individuals complicates what is known about the relationship between periodontitis and RA pathogenesis. Further studies involving a larger number of pre-clinical individuals and graded assessment of periodontitis severity would help better clarify this association.

CONCLUSIONS
In summary, we demonstrate that the pre-clinical stage of RA is characterized by oral dysbiosis, some features of which are correlated with autoantibody titer. Common and unique microbial features exist in the high-risk stage compared to established RA. Our findings support the mucosal origin hypothesis in the development of RA. Further mechanistic insights into possible causation, through well-designed prospective human studies and in vivo experiments in animal models, are warranted.

DATA AVAILABILITY STATEMENT
The fastq files were deposited into the NCBI Sequence Read Archive (SRA) database (Accession Number: PRJNA578951).

ETHICS STATEMENT
The studies involving human participants were reviewed and approved by Biomedical Research Ethics Committee, West China Hospital of Sichuan University. The patients/participants provided their written informed consent to participate in this study.

AUTHOR CONTRIBUTIONS
YLuo and YLiu designed and supervised the study, reviewed and edited the manuscript. YT, LZ, PQ, HZ, QZ, and YZ collected the samples and sociodemographic, and pathological data. YT, YLi, and LS curated the data. YT, LZ, and YLuo conducted analyses and wrote the first draft of the manuscript. All authors contributed to manuscript revision, read, and approved the submitted version.