Deciphering Expression Proling and Functional Signatures of Individualized Obesity-Associated Genes in Ossication of Ligamentum Flavum Through RNA-Sequence Data Mining

Background: Obese individuals predispose to ossication of ligamentum avum (OLF), whereas the underlying connections between obesity phenotype and OLF pathomechanism are not fully understood, especially during early life. This study aimed to explore obesity-associated genes and their functional signatures in OLF. Methods: Gene microarray expression data related to OLF were downloaded from the GSE106253 dataset in the Gene Expression Omnibus (GEO) database. The potential obesity-related differentially expressed genes (ORDEGs) in OLF were screened. Then, gene-ontology (GO) enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis were applied for these genes. Furthermore, protein-protein interactions (PPI) were used to identify hub ORDEGs, and Metascape was used to further verify the key signaling pathways and immune-related function signatures of hub ORDEGs. Finally, correlation analysis of hub ORDEGs and identied OLF-related inltrating immune cells (OIICs) was constructed to understand the possible mechanical link among obesity, immune response and OLF. Results: OLF-related differentially expressed genes and 2051 obesity-related genes from four databases were intersected to obtain 99 ORDEGs, including 54 upregulated and 55 downregulated genes. GO and KEGG analysis revealed that these genes were mainly involved in metabolism, inammation and immune-related biological functions and pathways. A PPI network was established to determine 14 hub genes (AKT1, CCL2, CCL5, CXCL2, ICAM1, IL10, MYC, PTGS2, SAA1, SOCS1, SOCS3, STAT3, TNFRSF1B and VEGFA). The co-expression network demonstrated that this module was associated with cellular response to biotic stimulus, regulation of inammatory response, regulation of tyrosine phosphorylation of STAT protein. Furthermore, Metascape functional annotations showed that hub genes were mainly involved in receptor signaling pathway via JAK-STAT, response to TNF and regulation of defense response, and their representative enriched pathways were TNF, adipocytokine and JAK-STAT signaling pathways. Subgroup analysis indicated that T cell activation might be potential immune function processes involved, and correlation analysis revealed that cDCs, memory B-cells and preadipocytes were highly correlated inltrating immune cells. Conclusions: Our study deciphered individualized obesity-associated gene signature for the rst time, which may facilitate exploring the underlying cellular and molecular pathogenesis and novel therapeutic targets of obesity-related early-onset OLF. The function analysis of hub genes. (A-B) The functions of hub genes were mainly enriched in receptor signaling pathway via JAK-STAT, response to tumor necrosis factor and regulation of defense response. (C-E) The most signicant pathway and related genes. The results show that these hub genes are mainly involved in JAK-STAT signaling pathway, cytokine-cytokine receptor interaction, adipocytokine signaling pathway, and chemokine signaling pathway. (F-H) The most signicant immune response and related genes. The results show that these hub genes are mainly involved in T cell activation, cellular response to interferon-gamma, lymphocyte migration and mononuclear cell differentiation.


Introduction
Ossi cation of ligamentum avum (OLF) is the major pathogeny of severe thoracic myelopathy characterized by abnormally progressive heterotopic ossi cation of the intraspinal ligament with unelucidated pathogenesis. 1 Thoracic OLF (TOLF) prevails among East Asian population with the reported prevalence of 3.8-63.9%, which may contribute to deteriorating neurological dysfunction. 2 Although surgical intervention could block this process, it was still accompanied by multiple complications and high surgical risks. 3,4 Therefore, it is absolutely imperative to elucidate the molecular mechanism in occurrence and development of OLF, thereby probing reliable biomarkers and novel therapeutic modalities.
To date, genetic, stress, ageing, in ammatory and metabolic factors have been found to be associated with the TOLF, whose etiological heterogeneity may indicate distinctive pathological phenotypes. [5][6][7][8][9] Except for susceptible middle-aged and elderly people, anecdotal observations indicated that a clinical subset of younger patients with obesity also suffered from severe or even diffuse TOLF. Additionally, concrete evidence has identi ed obesity as an independent causal factor for the development of TOLF in nonelderly adults. 10 Besides the systemic in ammatory state of obesity, our previous research has proven that leptin, an adipocyte-derived product encoded by Ob gene, could induce signi cant osteogenic differentiation in ossi ed ligamentum avum cells (OLFCs), but not in normal LF cells (NLFCs), which implied that OLFCs are considerably more sensitive to leptin stimulation. 11 Therefore, we speculated the obesity-associated factors and mechanisms would participate in the progression of OLF. However, the underlying biochemical links between obesity phenotype and OLF pathogenesis have not been systematically investigated.
Multiple research strategies have been adopted to excavate the molecular mechanisms. Recently, various high-throughput sequencing technologies have gained extensive attention and provided compelling insights into OLF pathogenesis, including transcriptomics, proteomics, metabolomics and epigenetics. 9,12−14 In addition, bioinformatics analysis of gene expression pro les or other sequencing data has identi ed several featured genes and related biological functions of OLF. 15,16 For instance, Wu et.al constructed key ceRNA mechanism networks and functional pathways associated with OLF using gene expression pro les. 15 Our recent work identi ed distinct immune-related genes and in ltrating immunocyte patterns in OLF based on transcriptome data and experiment validation, which exploited a new research direction. 16 Therefore, adopting bioinformatics algorithms for screening differentially expressed obesity-related genes and analyzing their functional signatures based on high-throughput sequencing data derived from OLF samples could help lift the veil on the underlying relationship between obesity and OLF.
In the present study, the mRNA microarray dataset GSE106253 from the Gene Expression Omnibus (GEO) was reinterpreted from other perspectives to screen the obesity-related differentially expressed genes in OLF (ORDEGs). Then, the protein-protein interaction (PPI) was constructed to further determined hub ORDEGs, and correlation analysis, co-expression analysis, gene-ontology (GO) enrichment analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis were applied for their functional annotations. Finally, the correlation analysis between differential in ltrating immune cells and hub ORDEGs was performed to preliminarily analyze possible immune-related functions of these genes.

Material And Methods
Collection of OLF-Related Microarray Data and Human Obesity-Related Genes Dataset According to the inclusion criteria: 1) organism: Homo sapiens; 2) expression pro ling by microarray; 3) samples: OLF ligament tissues and normal ligament tissues, an eligible high-throughput RNA-sequencing data (GSE106253) was downloaded from Gene expression Omnibus (GEO), which contains the mRNA information of ligamentum avum tissue from 4 TOLF patients and 4 healthy individuals. Meanwhile, all obesity-related gene lists in Homo sapiens were obtained from Integratomics TIME, GWAS, T-HOD and KEGG PATHWAY databases.
Identi cation of OLF and Obesity-Related Differentially Expressed Genes (ORDEGs) The raw data from GSE106253 dataset was reviewed for background correction and data normalization through the affy package of R Software. Based on the predetermined statistical threshold of |fold change| > 1 and adjusted p-values < 0.05, differentially expressed genes (DEGs) between OLF samples and healthy controls were screened out through GEO2R, an interactive online tool. Furthermore, these DEGs and ORGs were intersected to obtain the OLF-related and obesity-related DEGs (ORDEGs). Clustering heatmap and circular heatmap were performed to describe the expression of DEGs and ORDEGs, respectively.

Analyses of Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG)
GO analysis was performed to annotate ORDEGs, and to illustrate their functions in biology process (BP), cell component (CC), and molecular function (MF). KEGG analysis was conducted to investigate which cellular pathways may participate in the functional alterations of ORDEGs, and to understand high-level and biological functions from large-scale molecular datasets. The clusterPro ler package was used for the GO and KEGG analysis. The cutoff values for the GO and KEGG analyses were set at p < 0.05.

Construction of Protein-Protein interaction (PPI) Network and Hub ORDEGs Identi cation
Search Tool for the Retrieval of Interacting Genes (STRING; http://string-db.org; version 11.5), an online database for predicting protein interactions, was applied to construct the PPI network of ORDEGs when interactions score > 0.7 were taken as statistically signi cant. Cytoscape (version 3.8.1) was used to visualize molecular interaction networks and important nodes of protein interactions within the network were further identi ed by ranking the scores of each node. Considering most networks were scale-free, the hub genes with degree ≥ 30 were selected. Pearson correlation analysis was performed among hub ORDEGs. Furthermore, a network of genes and their co-expression genes was analyzed via GeneMANIA4.

Functional and Correlation Analysis of Hub ORDEGs and OLF-Related In ltrating Immune Cells (OIICs)
Metascape was used to further verify the function enrichment of hub ORDEGs with P-value < 0.05 as the cutoff. Hub genes pathway analysis was performed and visualized by ClueGO (version 2.5.8) and CluePedia (version 1.5.8). P-value < 0.01 was considered to be statistically signi cant. In addition, considering the obesity status is closely related to in ammation and immune response, the immune-related function annotation of hub ORDEGs were analyzed separately using ImmuneSystemProcess in ClueGO (version 2.5.8). Finally, we conducted a correlation analysis of hub ORDEGs and OIICs to better explain the role of hub ORDEGs since 14 differential OIICs with potential effect on the development of OLF (such as with cDCs, NK CD56 bright cells, and preadipocytes) has been identi ed by our recent study. 16 Statistical Analysis and Visualization All statistical analyses and visualization were performed using R software (version 3.6.3), SPSS software and GraphPad Prism 7 software. Nonparametric test was used to examine the statistical relationship between two non-normally distributed data, and parametric test was used to examine the statistical relationship between two normal distributed data. Gene expression levels between two groups were analyzed by Student's t-test. The expression of hub ORDEGs was compared for correlation analysis using the Pearson test. All data were expressed as mean ± standard deviation. P < 0.05 was considered as statistically signi cant.

Identi cation of ORGs in OLF
The detailed work ow diagram of this study is depicted in Figure 1. After the raw data was processed by R software for background correction and data normalization (Figure 2A), a total of 920 DEGs, consisting of 532 up-regulated genes and 388 down-regulated genes were identi ed between OLF samples and normal controls ( Figure 2B). The clustering heatmap showed that top forty DEGs can clearly distinguish OLF tissues from normal tissues ( Figure 2C). After deleting overlapping genes, a total of 2051 obesityrelated gene in Homo sapiens were obtained from above four databases ( Figure 2D). After intersection of these 920 DEGs and 2051 ORGs, 99 ORDEGs in OLF were ultimately identi ed ( Figure 2E). The twodimensional PCA depicted a signi cant difference in these genes to allow further analysis ( Figure 2F), including 54 up-regulated genes and 55 down-regulated genes ( Figure 2G). The expression pro le of top 70 ORDEGs was intuitively visualized through a circular heatmap ( Figure 2H).

Analysis of Go Enrichment Functions
The biological functions of up-regulated and down-regulated genes were analyzed separately ( Figure 3A, 3H). GO analysis identi ed that the up-regulated ORDEGs were signi cantly enriched in BP, including hormone secretion, hormone transport and peptide hormone secretion ( Figure 3B, 3C), whereas downregulated ORDEGs were mainly enriched in regulation of lipid metabolic process, response to lipopolysaccharide and response to molecule of bacterial origin ( Figure 3I, 3J). In terms of CC, upregulated genes were mainly involved in cation-transporting ATPase complex, ATPase dependent transmembrane transport complex and mediator complex ( Figure 3D, 3E) while down-regulated ORDEGs were largely enriched in phosphatidylinositol 3-kinase complex, membrane raft and membrane microdomain ( Figure 3K, 3L). Moreover, MF demonstrated that up-regulated genes were mainly related to neuropeptide hormone activity, receptor ligand activity and signaling receptor activator activity ( Figure 3F, 3G), and down-regulated were mainly enriched in G protein-coupled receptor binding, phosphatase binding and nuclear receptor activity ( Figure 3M, 3N). These genes could be related to multiple biological pathways orchestrating OLF pathogenesis.

Analysis of KEGG Enrichment Pathways
Analysis of lists of ORDEGs in terms of enriched biological pathways was also conducted separately for upregulated and downregulated genes. According to the enrichment analysis results of biological pathways, upregulated ORDEGs were mainly involved in collecting duct acid secretion, oxidative phosphorylation, adipocytokine signaling pathway, gastric acid secretion and cytokine-cytokine receptor interaction ( Figure 4A, 4B, 4C). The top ve KEGG terms among down-regulated ORDEGs were primarily associated with adipocytokine signaling pathway, mTOR signaling pathway, PPAR signaling pathway, insulin signaling pathway and JAK-STAT signaling pathway ( Figure 4D, 4E, 4F). Interestingly, in upregulated and down-regulated ORDEGs, there is a common pathway, adipocytokine signaling pathway, which is implicated in multiple biological reactions ( Figure 4G). Moreover, we also identi ed another two remarkable obesity-related crosstalk pathways, mTOR signaling pathway ( Figure 4H) and JAK-STAT signaling pathway ( Figure 4I). Dysregulation of module cluster genes might therefore regulate OLF development through acting on these potential pathways, which were potential signatures for OLF.

PPI Network Construction and Hub Genes Selection
To systematically analyze the relationships between the common ORDEGs, we constructed a PPI network using the STRING database after removing unconnected nodes ( Figure 5A). Cytoscape visualized the PPI network of ORDEGs, which consisted of 81 nodes and 313 edges ( Figure 5B). A total of 14 hub nodes (degree ≥ 30), including AKT1, CCL2, CCL5, CXCL2, ICAM1, IL10, MYC, PTGS2, SAA1, SOCS1, SOCS3, STAT3, TNFRSF1B, VEGFA, were considered hub genes in the OLF DEG list ( Figure 5C). Table 1 shows the detailed information and molecular functions of the key genes. In addition, the result of independence testing analysis suggested that all hub genes were signi cantly decreased in OLF samples except for CCL5 ( Figure 5D). Correlation analysis among 14 genes was further conducted to investigate their whole interrelations ( Figure 5E). The results showed that SOCS3 and CCL2 had the highest positive correlation with a spearman's correlation coe cient of 0.99 ( Figure 5F). STAT3 was positively correlated with SOCS3 (r = 0.98, Figure 5G) and VEGFA (r = 0.98, Figure 5H). VEGFA was also positively correlated with CXCL2 (r = 0.98, Figure 5I).

Analysis of the Functional Characteristics of Hub Genes
The analysis results from the GeneMANIA database showed that 14 hub genes and their co-expressed genes constitute a complex PPI network with co-expression of 65.62%, genetic interactions of 15.03%, physical interactions of 8.28%, shared protein domains of 3.99%, pathway of 3.86% and co-localization of 3.21%, whose functions were mainly associated with cellular response to biotic stimulus, regulation of in ammatory response, regulation of tyrosine phosphorylation of STAT protein ( Figure 5J). Furthermore, Metascape functional annotation results revealed that hub genes were mainly enriched in response to lipopolysaccharide, positive regulation of cell adhesion, cellular response to biotic stimulus, T cell activation, receptor signaling pathway via JAK-STAT and receptor signaling pathway via STAT ( Figure 6A, 6B). Meanwhile, ClueGO revealed that the most involved pathways were JAK-STAT signaling pathway, cytokine-cytokine receptor interaction, adipocytokine signaling pathway, and chemokine signaling pathway ( Figure 6C, 6D, 6E). By this token, immune-related or in ammation-related biological responses and pathways were strongly linked to these hub genes in OLF. Therefore, further subgroup analysis was conducted to separately investigate possible immune functions of these genes, and T cell activation (55.56%), cellular response to interferon-gamma (22.22%), lymphocyte migration (11.11%) and mononuclear cell differentiation (11.11%) were identi ed as the potential immune responses involved ( Figure 6F, 6G, 6H) .

Correlation Analysis of Hub Genes and In ltrating Immune Cells
The above results demonstrated that the hub genes were also highly enriched in immune-related or in ammation-related responses and pathways. Our previous study had identi ed 14 types of OLF-related in ltrating immune cells (OIICs) using ssGSEA and xCell algorithm based on the gene expression matrix from the GSE106253 ( Figure 7A). To explore the underlying mechanisms associated with these potential biomarkers, we estimated the correlations between these genes and in ltration of immune cell types in OLF samples. According to r > 0.90 and p < 0.001, 27 ORDEG-OIICs correlation pairs were screened, and cDCs were signi cantly associated with the most ORDEGs ( Figure 7B

Discussion
As a multifactorial disease, obesity has been considered as an important risk factor associated with the pathogenesis of OLF. 10,17 However, obesity-mediated genetic mechanism underlying the progression of OLF has not been fully understood. In this study, GSE106253 was selected as expression pro ling by high-throughput sequencing dataset in our analysis. First, we screened signi cantly ORDEGs by intersecting OLF-related gene expression pro les and obesity-related gene lists, and their global biological functions and signal transduction pathways were analyzed. Then, fourteen hub genes were identi ed by constructing a protein-protein interaction network among them, and these hub genes were found to get involved in a variety of biological processes, molecular functions, signaling pathways and immune responses through correlation analysis, co-expression analysis, Metascape functional annotations, and subgroup analysis of immune function. Furthermore, we found that there was a signi cant correlation between the hub genes and the distinct immune cell types according to the established in ltrating immune cell patterns in OLF by our previous ndings. Collectively, the current study preliminarily investigated the expression pro le of obesity-related genes in OLF and comprehensively elaborated their functional characteristics based on RNA-sequence data for the rst time, which provided novel insights into understanding the pathogenesis s and treatment strategies of obesity-related OLF.
The GO and pathway enrichment analysis was of great importance for appreciating the whole biological functions and molecular mechanisms of these ORDEGs pro les in OLF. GO analysis showed that the all ORDEGs were mainly enriched in metabolic responses and obesity status. Besides, the upregulated and down-regulated ORDEGs were found to be enriched in a common KEGG pathway, adipocytokine signaling pathway, mainly including TNF-αsignaling, leptin signaling and adiponectin signaling, which is implicated in several obesity-related biological reactions. Our previous study has proven that leptin/LepR signaling could promote osteogenesis differentiation through JAK2/STAT3 pathway and indicated the intrinsic interaction between obesity and OLF 11. Moreover, our proteomics coupling with experimental veri cation also elucidated the de nite role and mechanisms of TNF-αin development of OLF. 13,18 Besides, adiponectin signaling has been also widely involved in bone and cartilage metabolism, but contradictory results regarding its effect on bone formation and turnover were reported. 19 Numerous studies have demonstrated a pro-osteogenic potential for adiponectin in vivo in vitro through several downstream pathways. 20-23 However, Abbott et al showed that adiponectin may stimulate cellular plasticity of osteoblasts towards adipocytes. 24 Duan et al showed that AdipoRon signi cantly alleviates the calci cation of OA chondrocytes via AMPK-mTOR signaling. 25 Generally speaking, circulating adiponectin levels are decreased in obese individuals, 26 thus it's an interesting issue whether and how adiponectin in uence the development and progression of OLF. Moreover, besides JAK/STAT signaling pathway, we also identi ed a potential functional pathway, mTOR signaling pathway. On one hand, numerous studies have demonstrated that the mTOR activation was implicated in metabolic diseases, such as obesity and diabetes .27,28 On the other hand, increasing mTOR signaling could promote the chondrogenesis, osteogenesis, and heterotopic ossi cation. 29,30 Therefore, it is speculated that mTOR signaling crosstalk may be a potential molecular mechanism linking obesity and OLF, which is worth further study in the future.
Constructing a PPI network through grouping and organizing all the genes encoding proteins is a reliable measure to screen hub genes and regulatory modules in the exploration of disease mechanism. By this means, 14 hub genes were obtained, including CCL2, SAA1, SOCS3, CXCL2, IL10, PTGS2, STAT3, SOCS1, TNFRSF1B, AKT1, ICAM1, MYC, VEGFA and CCL5. First, several cytokines or chemokines associated with in ammation and immunity were observed including CCL2, CCL5, CXCL2, IL10, PTGS2 and TNFRSF1B. For instance, Luis et al revealed that CCL2 and CCL5 participated in the immunomodulation of osteoblast differentiation during M1/M2 transition. 31 Aimalie et al found marrow adipocyte-derived CXCL2 in obese population contribute to osteolysis. 32 Generally, obesity have been shown to be associated with chronic low-grade systemic in ammation and the unspeci c activation of immune system, 33 which, in turn, are believed to be associated with the onset and development of OLF. In addition, STAT3 has been veri ed to play an important role in the osteogenesis differentiation of ligamentum avum cell. 11 Importantly, SOCS family proteins form part of a classical negative feedback system that regulates cytokine signal transduction. 34 SOCS1 and SOCS3 is involved in negative regulation of cytokines that signal through the JAK/STAT3 pathway, thus they may become potential targets for progression of OLF. 35 AKT1 can regulate many processes including metabolism, cell proliferation, growth and angiogenesis, and has been demonstrated to participate in terminal stages of endochondral bone formation that is pathological nature of OLF. 36 ICAM1 has been considered as an important adhesion molecule involved in Bone Homeostasis. 37 Tanaka et al found that ICAM-1+ osteoblasts can bias bone turnover to bone resorption, 38 which prompted some novel thought about whether ICAM-1 may be potential therapeutic target for OLF. Zhang et al proposed that SAA1 was linked with progression of obesity, 39 but its role in OLF requires further veri cation. Yang et al indicated that angiogenesis was responsible for development of OLF. 40 VEGFA and MYC, as important participator and regulator in angiogenesis process, might be associated with advancement of OLF. 41,42 Based on the Metascape database, we found that the hub gene is mainly involved in biological processes such as receptor signaling pathway via JAK-STAT, response to tumor necrosis factor and regulation of defense response. ClueGO results also revealed that the most involved pathways were TNF signaling pathway, adipocytokine signaling pathway and JAK-STAT signaling pathway. These results indicated that these hub genes played an important role in in ammatory and immune response associated with obesity status in OLF. In order to further explore the role of possible immune function of these genes in OLF, we separately evaluate their involved immune response, including T cell activation, cellular response to interferon-gamma, lymphocyte migration and mononuclear cell differentiation. A previous research found differential responses of peripheral lymphocytes among patients with the continuous-type ossi cation of posterior longitudinal ligament (OPLL), those with the segmental-type OPLL and healthy volunteers. 43 Furthermore, transcriptomic data from calci c aortic valve disease (CAVD) samples showed increased granzyme, perforin, CD8, and interferon-gamma, supporting the presence and increased activity of T lymphocytes. 44 Moreover, interferon-gamma could speci cally restrain macrophage capacity for calcium reabsorption and osteoclast activity in CAVD. 44 These ndings in other types of heterotopic ossifying disease can be referenced for future research of OLF-related immunologic mechanisms. Our recent study has identi ed 14 types of differential OIICs, 16 so correlation analysis between hub genes and OIICs were conducted. For instance, SOCS3 and STAT3 were positively correlated with cDCs and IL10 and CCL2 were positively correlated with memory B-cells while VEGFA and CXCL2 were negatively correlated with NK CD56 bright cells. However, there is insu cient literature evidence to support these results, thus the speci c mechanism of these correlations in OLF requires further experimental evidence.
The highlight of this study is to explore the expression pro les of individualized obesity-associated genes and their comprehensive function annotation in OLF for the rst time. These hub genes and biological possess crosstalk may be critical for uncovering the pathogenesis of obesity-related OLF. However, there are still some limitations in this study. Due to the limited clinical samples, we are regrettably unable to verify these hub genes at this stage. In our future research, these identi ed target genes will be veri ed by RT-qPCR, western blotting or immunochemistry when clinical specimens are su cient and ethical approval is granted. In addition, the potential mechanisms of these genes and immune cells need to be further explored through in vitro and in vivo experiments. Importantly, we have applied scienti c bioinformatics algorithm and integrated statistical methods to decipher these results within the data and show our new ndings. These results, to some extent, could be enlightening for the subsequent mechanism studies.

Conclusion
Taken together, ninety-nine DEGs and fourteen hub genes associated with obesity were rst identi ed through bioinformatics analysis for high throughput sequencing data between OLF samples and control samples. The biological functions and pathways of the identi ed genes provide a more detailed molecular mechanism for understanding the obesity phenotype and OLF pathogenesis. By combining a reliable deconvolution algorithm with largescale genomic data, we found signi cant correlation between hub genes and OIICs in OLF. Our investigations may provide new insights into the speci c molecular mechanisms, diagnostic biomarkers and targeted treatment of OLF. The data analyzed in this article comes from Gene Expression Omnibus (GEO) database (http://www.ncbi.nlm.nih.gov/geo). The accession number can be found in the Materials and Methods section of the article.

Competing interests
The authors declare that they have no competing interests.  Abbreviations: ORDEGs, obesity-related differentially expressed genes. Figure 1 Page 18/25

Figures
Flow chart of the whole analysis process.