Differential Diagnosis of Osteoarthritis and Rheumatoid Arthritis by Bioinformatics Analysis

Background: Osteoarthritis (OA) and rheumatoid arthritis (RA) are the most common joint diseases. The aim of this study is to identify important genes associated with OA and RA, clarify their underlying mechanisms, and define differences between OA and RA. Methods: Gene expression profiles of GSE55235 were obtained from the Gene Expression Omnibus database. Differentially expressed genes (DEGs) between 1) OA tissues and normal tissues and 2) RA tissues and normal tissues were identified using the GEO2R tool, Venn diagram software, and volcano plots. Next, we used the Database for Annotation, Visualization and Integrated Discovery (DAVID) to analyze Kyoto Encyclopedia of Genes and Genomes pathways and gene ontology. The protein-protein interaction (PPI) network of these DEGs was visualized in Cytoscape with the Search Tool for the Retrieval of Interacting Genes. The entire PPI network was analyzed with the Molecular Complex Detection plug-in to obtain central nodes, which were then re-analyzed via DAVID. We next took the intersection of the analysis results of GSE55457 and GSE55235 to verify the results. Finally, potential biomarkers were evaluated by receiver operating characteristic analysis. Results: Twelve genes were significantly enriched in the TNF signaling pathway in OA, and eleven genes were significantly enriched in the chemokine signaling pathway in RA. Receiver operating characteristic curves suggested that detection of JUN in OA and of CCL5, CXCL9, CXCL10, CXCL11, and CXCL13 in RA exhibited high diagnostic performance. Conclusions: We identified six significant DEGs using integrated bioinformatics methods, which could be potential targets for the differential diagnosis of OA and RA.


Background
Osteoarthritis (OA) and rheumatoid arthritis (RA) are common joint diseases worldwide. Although they differ in pathogenesis, they have similar etiology, similar clinical manifestations, and some of the same cellular and molecular bases [1][2]. RA is a multi-system chronic inflammatory disease, mainly characterized by persistent and symmetrical joint pain, swelling, and morning stiffness. Basic pathological changes include synovial inflammation, pannus formation, and destruction of articular cartilage and bone. Joint deformity and loss of function resulting from "swan neck" and "buttonhole" deformities of the fingers in middle- and late-stage patients, joint stiffness, and joint subluxation have made this disease one of the leading causes of disability and loss of working capacity [3][4][5]. According to epidemiological statistics, RA affects 0.5-1% of the global adult population [6], while symptomatic OA affects about 10% of the global population over the age of 60 [7]. The prevalence of OA is significantly higher in women than in men, and onset is trending toward younger ages [8]. OA is characterized by joint pain, stiffness, and swelling caused by synovitis, reduction of articular cartilage tissue, thickening of subchondral bone, and formation of osteophytes [9]. The synovial membrane not only secretes synovial fluid to reduce friction on the joint surface but also provides the necessary nutrients for the articular cartilage, which helps maintain the stability and flexibility of the joint. However, when the synovium is stimulated by trauma, infection, or strain, it can develop synovitis and secrete inflammatory mediators, which cause degradation of the cartilage matrix and accelerate the pathological process of OA. At present, the etiology of OA is not yet clear. According to existing research, the occurrence of OA is related to age, gender, family history, obesity, trauma, increased joint load, and other factors [10].
The differential diagnosis of RA and OA is challenging because disease progression usually precedes clinical manifestations; in addition, the symptoms of OA and RA overlap in advanced cases. Therefore, identifying the differentially expressed genes of OA and RA is helpful for the accurate diagnosis, treatment, and prognosis of patients. While some genes and diagnostic markers have been identified in both RA and OA, they are insufficient to fully explain the mechanisms underlying the initiation and progression of these common joint diseases [11,12].
In recent years, high-throughput microarray platforms have been widely used in the molecular diagnosis, classification, and prognosis assessment of diseases [13]. Microarrays are an important method for high-throughput study of disease gene expression profiles and can detect a large amount of genetic information in a short period of time. As research proceeds, large amounts of genetic information are uploaded to public databases. Most of these data have not been used effectively because of differences in research purpose and tissue specificity [14]. Therefore, building on previous research results, using bioinformatics to re-select and re-analyze gene chip data can make up for these deficiencies.
In this study, we first selected GSE55235 from the Gene Expression Omnibus (GEO). Second, we applied the GEO2R online tool, Venn diagram software, and volcano plots to obtain the common differentially expressed genes (DEGs). Third, the Database for Annotation, Visualization and Integrated Discovery (DAVID) was used to analyze these DEGs, including molecular function (MF), cellular component (CC), biological process (BP), and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment. Fourth, we established a protein-protein interaction (PPI) network and then applied Molecular Complex Detection (MCODE) for additional DEG analysis to identify core genes. We then re-analyzed the KEGG pathway enrichment of these genes. To further test the reliability of these results, GSE55457, which comes from the same platform as GSE55235, was used for verification. Like GSE55235, GSE55457 contains OA tissue, RA tissue, and normal tissue. Finally, potential biomarkers were evaluated by receiver operating characteristic analysis. In conclusion, our bioinformatics study provides biomarkers that may be effective targets for the differential diagnosis of OA and RA patients.

Methods
Microarray data information
NCBI-GEO is a free public database of microarray gene expression profiles, from which we obtained the gene expression profiles of GSE55235 and GSE55457, covering normal, OA, and RA tissue samples. The microarray data of GSE55235 and GSE55457 were both generated on the GPL96 platform ([HG-U133A] Affymetrix Human Genome U133A Array). GSE55235 consists of synovial tissue from 10 healthy joints, 10 OA joints, and 10 RA joints. GSE55457 likewise consists of synovial tissue from 10 healthy joints, 10 OA joints, and 10 RA joints.

Data processing of DEGs
DEGs between OA and normal tissues, and between RA and normal tissues, were identified via the GEO2R online tool [15] with |logFC| > 2 and adjusted P value < 0.05. Then, the raw data in TXT format were checked with online Venn software to detect common DEGs. A DEG with logFC < 0 was considered a down-regulated gene, while a DEG with logFC > 0 was considered an up-regulated gene.
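The thresholding and overlap steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the row layout (gene symbol, logFC, adjusted P value), the function names `select_degs` and `common_degs`, and the gene names and values are hypothetical and are not taken from GSE55235 or from the GEO2R export format.

```python
from typing import Dict, List, Set, Tuple

# Hypothetical GEO2R-style rows: (gene symbol, logFC, adjusted P value).
Row = Tuple[str, float, float]


def select_degs(rows: List[Row], lfc_cutoff: float = 2.0,
                p_cutoff: float = 0.05) -> Dict[str, str]:
    """Return {gene: 'up' | 'down'} for rows passing both thresholds."""
    degs: Dict[str, str] = {}
    for gene, log_fc, adj_p in rows:
        if abs(log_fc) > lfc_cutoff and adj_p < p_cutoff:
            # logFC > 0 is treated as up-regulated, logFC < 0 as down-regulated.
            degs[gene] = "up" if log_fc > 0 else "down"
    return degs


def common_degs(a: Dict[str, str], b: Dict[str, str]) -> Set[str]:
    """Genes present in both DEG lists (the overlap region of a Venn diagram)."""
    return set(a) & set(b)


# Made-up values for illustration only.
oa_rows = [("GENE_A", 2.6, 0.001), ("GENE_B", -3.1, 0.020), ("GENE_C", 1.2, 0.001)]
ra_rows = [("GENE_A", 3.4, 0.004), ("GENE_B", -2.2, 0.300), ("GENE_D", -2.9, 0.010)]

oa_degs = select_degs(oa_rows)
ra_degs = select_degs(ra_rows)
shared = common_degs(oa_degs, ra_degs)
```

In this toy input, GENE_C fails the fold-change cutoff and GENE_B fails the P value cutoff in the RA comparison, so only GENE_A ends up in the shared set.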
Gene ontology and pathway enrichment analysis
Gene ontology (GO) analysis is a common method of annotating genes and their RNA or protein products to determine the biological properties of high-throughput transcriptome or genomic data [16]. KEGG is a collection of databases dealing with genomes, diseases, biological pathways, drugs, and chemical materials [17]. DAVID is an online bioinformatics tool that aims to identify the functions of a large number of genes or proteins. We used DAVID to analyze DEG enrichment in the BP, MF, and CC categories and in KEGG pathways [18] (P < 0.05).
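The idea behind an enrichment P value can be illustrated with a plain hypergeometric over-representation test. This is a generic sketch, not DAVID's implementation: DAVID reports its own variant of this kind of test, so its P values may differ from the ones computed here. The function name and the example counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(hits: int, list_size: int,
                           term_size: int, background: int) -> float:
    """P(X >= hits) when list_size genes are drawn without replacement
    from a background that contains term_size genes annotated to the term."""
    upper = min(list_size, term_size)
    total = comb(background, list_size)
    tail = sum(
        comb(term_size, k) * comb(background - term_size, list_size - k)
        for k in range(hits, upper + 1)
    )
    return tail / total


# Toy numbers: 20 background genes, 5 annotated to a term,
# a DEG list of 4 genes of which 2 carry the annotation.
p_value = hypergeom_enrichment_p(hits=2, list_size=4, term_size=5, background=20)
```

A term would then be called enriched when this P value falls below the chosen cutoff (0.05 in this study), usually after some form of multiple-testing correction.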
PPI network and module analysis
PPI information was evaluated with an online tool, STRING (Search Tool for the Retrieval of Interacting Genes) [19]. Then, the STRING app in Cytoscape [20] was applied to examine the potential correlation between these DEGs (maximum number of interactors = 0 and confidence score ≥ 0.4). In addition, the MCODE app in Cytoscape was used to identify modules of the PPI network (degree cutoff = 2, max. depth = 100, k-core = 2, and node score cutoff = 0.2).
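Two of the parameters above, the confidence score cutoff and the k-core setting, can be illustrated with a small pure-Python sketch. It is not the MCODE algorithm (which also scores nodes by local density and grows clusters from seed nodes); it only shows edge filtering at a confidence threshold followed by k-core pruning. The edge list, protein names, and function names are hypothetical, and scores are assumed to be scaled to the range 0 to 1.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

# (protein A, protein B, combined confidence score in [0, 1])
Edge = Tuple[str, str, float]


def build_graph(edges: Iterable[Edge], min_score: float = 0.4) -> Dict[str, Set[str]]:
    """Keep only interactions at or above the confidence cutoff."""
    graph: Dict[str, Set[str]] = defaultdict(set)
    for a, b, score in edges:
        if score >= min_score and a != b:
            graph[a].add(b)
            graph[b].add(a)
    return dict(graph)


def k_core(graph: Dict[str, Set[str]], k: int = 2) -> Set[str]:
    """Repeatedly remove nodes with fewer than k remaining neighbours."""
    alive = set(graph)
    changed = True
    while changed:
        changed = False
        for node in list(alive):
            if len(graph[node] & alive) < k:
                alive.discard(node)
                changed = True
    return alive


# Made-up interactions for illustration only.
toy_edges = [("A", "B", 0.9), ("B", "C", 0.8), ("A", "C", 0.7),
             ("C", "D", 0.6), ("D", "E", 0.3)]
ppi = build_graph(toy_edges)
core_nodes = k_core(ppi, k=2)
```

Here the D-E interaction is dropped by the score cutoff, D is then pruned because it has a single neighbour, and the triangle A, B, C survives as the 2-core.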

Identification of candidate genes
We used GSE55457 to further test the reliability of the above candidate genes by taking the intersection of the GSE55235 and GSE55457 core genes. Finally, potential biomarkers were evaluated by receiver operating characteristic (ROC) curves with R software (v3.4.0; http://bioconductor.org/biocLite.R). The results were visualized, and the area under the ROC curve (AUC) of each core gene was calculated. A gene with AUC > 0.5 was considered to have good diagnostic value.
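As a rough illustration of the validation and ROC steps, the sketch below intersects two core-gene sets and computes an AUC directly from expression values. The study itself used R; this Python version is an assumption-laden stand-in, and the gene names, expression values, and function name `roc_auc` are hypothetical. The AUC is computed through its rank interpretation: the probability that a randomly chosen disease sample has a higher value than a randomly chosen control, with ties counted as one half. Under this interpretation an AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation.

```python
from typing import Sequence


def roc_auc(case_values: Sequence[float], control_values: Sequence[float]) -> float:
    """AUC as P(case > control), counting ties as 0.5."""
    if not case_values or not control_values:
        raise ValueError("both groups need at least one sample")
    wins = 0.0
    for c in case_values:
        for n in control_values:
            if c > n:
                wins += 1.0
            elif c == n:
                wins += 0.5
    return wins / (len(case_values) * len(control_values))


# Hypothetical core-gene sets from the discovery and validation datasets.
core_discovery = {"GENE_A", "GENE_B", "GENE_C"}
core_validation = {"GENE_B", "GENE_C", "GENE_D"}
validated = core_discovery & core_validation

# Made-up expression values for one candidate gene.
disease = [7.9, 8.4, 6.1, 9.0]
normal = [6.0, 6.5, 5.8, 7.9]
auc = roc_auc(disease, normal)
```

For a marker that is down-regulated in disease, the same calculation yields a value below 0.5, so in practice the direction of change has to be taken into account when reading the AUC.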

Identification of DEGs
There were 10 normal tissues, 10 OA tissues, and 10 RA tissues in the present study. Using the GEO2R online tool, we extracted 321 and 538 DEGs from OA and RA, respectively (Fig. 1). A total of 162 common DEGs were detected, including 73 down-regulated genes (logFC < 0) and 89 up-regulated genes (logFC > 0) (Table 1).
3) For MF, both DEG sets were enriched in immunoglobulin receptor binding; OA DEGs were enriched in antigen binding, transcription factor activity, and RNA polymerase II core promoter proximal region sequence-specific binding, while RA DEGs were enriched in chemokine activity and immunoglobulin receptor binding (Fig. 3).
KEGG analysis demonstrated that OA and RA DEGs were both enriched in the rheumatoid arthritis and osteoclast differentiation pathways. OA DEGs were particularly enriched in the TNF signaling pathway, while RA DEGs were particularly enriched in leishmaniasis (P < 0.05) (Fig. 4).

Discussion
To identify more useful biomarkers for the differential diagnosis of OA and RA, this study applied bioinformatic methods to the GSE55235 profile dataset. Using GEO2R and Venn software, we found a total of 162 common DEGs, including 89 up-regulated and 73 down-regulated DEGs. Then, gene ontology and pathway enrichment analysis using DAVID showed that 1) for BP, OA DEGs were enriched in cellular response to fibroblast growth factor stimulus, while RA DEGs were enriched in response to cAMP; 2) for CC, OA DEGs were enriched in extracellular space, while RA DEGs were enriched in external side of plasma membrane; and 3) for MF, OA DEGs were enriched in antigen binding, transcription factor activity, and RNA polymerase II core promoter proximal region sequence-specific binding, while RA DEGs were enriched in chemokine activity and immunoglobulin receptor binding. In the pathway analysis, OA DEGs were particularly enriched in the TNF signaling pathway, while RA DEGs were particularly enriched in leishmaniasis (P < 0.05). Next, a PPI network of OA DEGs with 190 nodes and 882 edges was constructed via the STRING online database and Cytoscape software, and 23 vital genes were screened from this network by Cytoscape MCODE analysis. Similarly, a PPI network of RA DEGs with 319 nodes and 2194 edges was constructed, and 17 vital genes were screened. Re-analysis with DAVID showed that twelve genes were significantly enriched in the TNF signaling pathway in OA and eleven genes were significantly enriched in the chemokine signaling pathway in RA. To further test the reliability of these results, GSE55457 was used for verification, and the intersection of the analysis results of GSE55457 and GSE55235 was obtained.
Finally, potential biomarkers were evaluated by receiver operating characteristic analyses, which suggested that the detection of JUN in OA and CCL5, CXCL9, CXCL10, CXCL11 and CXCL13 in RA exhibited a high diagnostic performance.
The JUN family members JunB, c-Jun, and JunD constitute important subunits of the transcriptional activator protein AP-1 (activator protein 1) and are widely involved in cell proliferation, differentiation, apoptosis, and tumor formation [21]. Among the families of proto-oncogenes, there is a class of genes that can be directly induced and activated by a second messenger; these are called immediate early genes (IEGs). This group of genes is mainly related to biochemical changes in the cell, but it is also related to specific changes in the cell or acts as one of their mediators, so it is sometimes called the third messenger [22]. c-Jun is an important member of the IEG family and an important indicator of subchondral bone remodeling. Immunohistochemical studies reveal that c-Jun proteins are upregulated in chondrocytes from the articular cartilage of OA patients [23]. Additionally, studies have shown that c-Jun regulates chondrocyte apoptosis during OA [24].
Chemokines (CK) are a class of chemotactic cytokines that mediate the entry of inflammatory cells such as neutrophils and monocytes into the inflamed synovium. According to the positions of their conserved cysteine residues, chemokines can be divided into four classes, namely CXC, CC, C, and CX3C. In the process of mediating inflammation, CXC and CC chemokines play a particularly important role [25]. The results of our bioinformatics analysis showed that CCL5, CXCL9, CXCL10, CXCL11, and CXCL13 differed significantly.
The serum level of CCL5 is increased in RA patients, and CCL5 is significantly correlated with indicators of disease activity in rheumatoid arthritis, such as joint swelling index, blood sedimentation, and C-reactive protein [26]. In contrast, the content of CCL5 in the synovial fluid of OA patients did not significantly increase, even though a large number of inflammatory cells were present [26]. The level of CCL5 produced by synovial inflammatory cells in vitro was not significantly different from that produced by peripheral blood leukocytes. Current research suggests that when stimulated with a combination of IFN-gamma and TNF-alpha, synovial tissue cells and synovial fibroblast cell lines (fourth or fifth passage) were able to secrete large amounts of CXCL9, CXCL10, CXCL11, and CXCL13 [27,28]. In addition, the expression levels of the chemokines CCL5, CXCL9, and CXCL10 in RA synovial membrane were significantly higher than those in OA synovial membrane [29]. Increased CXCL13 levels were associated with RA disease activity. Serum CXCL13 can be used as a reference index to evaluate disease severity in RA patients and is of high value in the diagnosis of RA [29].
Chemokines and chemokine receptors play an important role in the inflammatory process and are also widely involved in the occurrence of some inflammatory diseases. Therefore, the use of chemokines and chemokine receptors as differential diagnostic targets may provide new ideas and methods for the clinical diagnosis of OA and RA.

Conclusions
Taken together, our bioinformatics analysis identified six genes (JUN, CCL5, CXCL9, CXCL10, CXCL11, and CXCL13) that are differentially expressed between OA and normal tissues or between RA and normal tissues on the basis of microarray datasets. The results suggest that these six genes could play key roles in the progression of OA or RA.
Therefore, these genes are expected to be new molecular targets for the differential diagnosis of OA and RA. However, there are few reports comparing these six genes between OA and RA, so these predictions should be verified by a series of experiments in the future. Moreover, this work has theoretical significance and broad application prospects for finding new drug targets, screening new drugs, and clinical application in the treatment of OA and RA.

Abbreviations

Availability of data and materials
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Competing interests
The authors declare that they have no competing interests.

Funding
The present study was funded by the Middle-Aged Youth Talent
The common DEGs between OA and normal tissues, and between RA and normal tissues.