Identification of Potential Biomarkers and Immune Features of Sepsis Using Bioinformatics Analysis

Sepsis remains a major global concern and is associated with high mortality and morbidity despite improvements in its management. Markers currently in use have shortcomings such as a lack of specificity and failures in the early detection of sepsis. In this study, we aimed to identify key genes involved in the molecular mechanisms of sepsis and search for potential new biomarkers and treatment targets for sepsis using bioinformatics analyses. Three datasets (GSE95233, GSE57065, and GSE28750) associated with sepsis were downloaded from the public functional genomics data repository Gene Expression Omnibus. Differentially expressed genes (DEGs) were identified using R packages (Affy and limma). Functional enrichment of the DEGs was analyzed with the DAVID database. Protein-protein interaction networks were derived using the STRING database and visualized using Cytoscape software. Potential biomarker genes were analyzed using receiver operating characteristic (ROC) curves in the R package (pROC). The three datasets included 156 whole blood RNA samples from 89 sepsis patients and 67 healthy controls. Between the two groups, 568 DEGs were identified, among which 315 were upregulated and 253 were downregulated in the septic group. These genes were enriched for pathways mainly involved in the innate immune response, T-cell biology, antigen presentation, and natural killer cell function. ROC analyses identified nine genes—LRG1, ELANE, TP53, LCK, TBX21, ZAP70, CD247, ITK, and FYN—as potential new biomarkers for sepsis. Real-time PCR confirmed that the expression of seven of these genes was in accordance with the microarray results. This study revealed imbalanced immune responses at the transcriptomic level during early sepsis and identified nine genes as potential biomarkers for sepsis.


Introduction
Sepsis is defined as a life-threatening organ dysfunction caused by a dysregulated host response to infection. Despite advances in critical care management over the past few years, sepsis is still associated with high mortality and morbidity worldwide [1]. It has been reported that sepsis causes 30 million episodes and 6 million deaths per year globally. However, according to the WHO, these data omit cases in low- and middle-income countries, which means that the true burden of sepsis is likely far greater. Therefore, the early diagnosis of sepsis is necessary to provide timely treatment. Markers currently in use, for example, CRP, PCT, and IL-6, have intrinsic shortcomings such as a lack of specificity and failures in the early detection of sepsis [2]. Many researchers are committed to exploring new biomarkers for sepsis. For example, studies have found that serum levels of presepsin, soluble urokinase plasminogen activator receptor, and soluble triggering receptor expressed on myeloid cells 1, as well as the expression of CD64, are upregulated among sepsis patients. Newly identified classes of biomarkers such as microRNAs, long noncoding RNAs, and the human microbiome are also attracting interest [3]. Although the number of candidate biomarkers has grown, these efforts have not yet yielded satisfactory results, and the candidates require further validation.
It is not surprising that a large proportion of sepsis biomarkers still focuses on the inflammatory part of this condition. Sepsis is characterized by disrupted inflammatory responses. It has been proposed that following a major inflammatory insult, there are simultaneous inflammatory and immunosuppressive responses [4]. Pattern recognition receptors such as Toll-like receptors (TLRs) recognize pathogenic factors and elicit inflammatory responses against them, for example, by triggering leukocyte and complement activation [5]. Concurrent immune cell function impairment (e.g., neutrophil defects) and T-cell apoptosis also occur, leading to immune suppression in patients with sepsis [6]. This could trigger secondary infections and undermine the immune system. However, the complex inflammatory responses during sepsis have not been fully elucidated.
Bioinformatics analysis offers an ideal way to screen large gene expression datasets to comprehensively understand the mechanisms underlying sepsis. In this study, we integrated three datasets and used a bioinformatics analysis approach to detect key genes and potential new biomarkers involved in sepsis. Molecular mechanisms underlying the inflammatory responses during sepsis were also explored to search for possible new treatment targets for sepsis.

2.1. Data Sources. The three gene expression datasets analyzed in this study were downloaded from the Gene Expression Omnibus (GEO) database (https://www.ncbi.nlm.nih.gov/geo/) and used to identify DEGs. GSE95233 [7], GSE57065 [8], and GSE28750 [9] were taken as representative datasets of patients with sepsis. For each dataset, data from day 1 of sample collection were used to analyze the gene expression based on the GPL570 platform (HG-U133_Plus_2). All data were freely available online, and this study did not involve any human or animal experiments.

Identification of DEGs.
Background expression value correction and data normalization of the raw data were carried out using the Affy package in R (Affy, version 1.64.0). Subsequently, the Linear Models for Microarray Analysis R package (limma; version 3.42.2) was applied for differential expression analysis. Volcano plots were generated using Bioconductor (http://bioconductor.org/biocLite.R). DEGs were identified as those with a t-test value of P < 0.05 and |logFC| > 1.5.
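The thresholding step described above can be illustrated with a minimal sketch. It is written in Python for readability and is not the limma workflow used in this study; the record type `GeneStat`, the function `classify_degs`, and the gene names and values are hypothetical and serve only to show how the two cutoffs (P < 0.05 and |logFC| > 1.5) combine.

```python
from typing import List, NamedTuple, Tuple


class GeneStat(NamedTuple):
    """Per-gene summary statistics (hypothetical record type)."""
    gene: str
    log_fc: float   # log fold change, sepsis versus control
    p_value: float  # P value from the differential expression test


def classify_degs(
    stats: List[GeneStat],
    p_cutoff: float = 0.05,
    lfc_cutoff: float = 1.5,
) -> Tuple[List[str], List[str]]:
    """Split genes passing both cutoffs into up- and downregulated lists."""
    up: List[str] = []
    down: List[str] = []
    for s in stats:
        # A gene is a DEG only if it is both significant and changed enough.
        if s.p_value < p_cutoff and abs(s.log_fc) > lfc_cutoff:
            (up if s.log_fc > 0 else down).append(s.gene)
    return up, down


# Illustrative values only; these are not results from the datasets.
example = [
    GeneStat("GENE_A", 2.1, 0.001),
    GeneStat("GENE_B", -1.8, 0.010),
    GeneStat("GENE_C", 1.2, 0.0001),  # significant, but fold change too small
    GeneStat("GENE_D", -2.5, 0.200),  # large fold change, but not significant
]
up, down = classify_degs(example)
print("upregulated:", up)
print("downregulated:", down)
```

Requiring both criteria excludes genes with a statistically significant but small change as well as genes with a large but unreliable change.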

Functional and Pathway Enrichment Analyses. Gene Ontology (GO) analyses were used to explore the functional roles of gene sets, while KEGG analyses were used to classify the pathways in which such genes might function. For comprehensive functional annotation, GO and KEGG analyses of the identified DEGs were conducted using the DAVID tool (https://david.ncifcrf.gov/). A false discovery rate (FDR) < 0.05 in both GO and KEGG analyses was set as the threshold for significant enrichment.
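To make the FDR threshold concrete, the following sketch adjusts a list of enrichment P values with the Benjamini-Hochberg procedure. This is an assumption made for illustration: the text does not state which FDR procedure was applied, and the function name `benjamini_hochberg` and the input values are hypothetical.

```python
from typing import List


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted P values in the input order."""
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value to the smallest so that the adjusted
    # values never increase as the raw P value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Illustrative P values for four hypothetical enrichment terms.
raw = [0.01, 0.04, 0.03, 0.20]
adj = benjamini_hochberg(raw)
significant = [i for i, q in enumerate(adj) if q < 0.05]
print(adj)
print("terms passing FDR < 0.05:", significant)
```

In this example, three terms have a raw P value below 0.05, but only the first remains below 0.05 after adjustment, which is why an FDR cutoff is stricter than a raw P value cutoff when many terms are tested.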

Quantitative Real-Time PCR. RNA was extracted from whole blood using TRIzol reagent (12183-555, Invitrogen) following the manufacturer's instructions. cDNA was synthesized with a SuperScript™ III First-Strand Synthesis SuperMix for qRT-PCR (11752-050, Invitrogen). Power SYBR® Green PCR Master Mix (4367659, Applied Biosystems) was used for qRT-PCR to analyze mRNA expression. GAPDH was used as an internal control, and the relative mRNA expression levels were calculated using the 2^(-ΔΔCT) method. The primer pairs used in the experiments are listed in Data S1.
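The 2^(-ΔΔCT) calculation can be written out as a short worked example. The function name `relative_expression` and the CT values below are hypothetical; the reference gene corresponds to GAPDH in this study.

```python
def relative_expression(
    ct_target_sample: float,
    ct_ref_sample: float,
    ct_target_control: float,
    ct_ref_control: float,
) -> float:
    """Relative expression of a target gene by the 2^(-ddCT) method."""
    # Normalize the target gene to the reference gene within each group.
    delta_sample = ct_target_sample - ct_ref_sample
    delta_control = ct_target_control - ct_ref_control
    # Compare the normalized sample value with the normalized control value.
    delta_delta = delta_sample - delta_control
    return 2.0 ** (-delta_delta)


# Illustrative CT values only (not measurements from this study):
# target CT 26 and reference CT 18 in the sepsis sample,
# target CT 24 and reference CT 18 in the healthy control.
fold = relative_expression(26.0, 18.0, 24.0, 18.0)
print(fold)
```

Here ΔCT is 8 in the sample and 6 in the control, so ΔΔCT is 2 and the relative expression is 2^(-2) = 0.25; that is, the target transcript is about fourfold lower in the sample than in the control.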

Results
3.1. Identification of DEGs. Datasets GSE57065, GSE95233, and GSE28750 were downloaded from the GEO database and analyzed using R packages (Affy, limma, and ggplot2). Volcano plots were generated to visualize fold changes of the DEGs. Of the 568 DEGs identified, 315 were upregulated and 253 were downregulated in the sepsis group (Figure 1).

Functional Enrichment of DEGs.
Enrichment analysis techniques extract biological information from a set of genes or proteins. To identify key genes related to sepsis, gene functions were annotated using the DAVID online software database. GO annotation analysis showed enrichment of DEGs involved in inflammatory responses such as the innate immune response, T-cell receptor pathway, and antigen processing and presentation (Figure 2(a)). KEGG pathway analysis showed that the genes involved in sepsis were associated with different infections such as influenza, tuberculosis, and HTLV-1, further linking inflammation with sepsis (Figure 2(b)). Table 1(a) being in cluster A and genes in Table 1(b) in cluster B. These findings suggested that these genes play important roles in sepsis. Tables 2(a) and 2(b) show the functional annotation of the two clusters. Cluster A was enriched in genes involved in T-cell biology, antigen presentation, and natural killer (NK) cell function. Intriguingly, most genes in cluster A were downregulated, suggesting a possible immunosuppression process early in sepsis. In contrast, cluster B mainly contained genes related to innate immune responses such as neutrophil-mediated immunity and phagocytosis. These genes are closely related and cooperate with each other to respond to different types of insults. Overall, these 40 genes comprised the hub genes during sepsis. While some of these genes are well-characterized key elements in sepsis, others might represent new potential biomarkers for sepsis. Nine genes (LRG1, ELANE, TP53, LCK, TBX21, ZAP70, CD247, ITK, and FYN) were chosen for further investigation of their roles in and potential use as biomarkers for sepsis.

ROC Curve.
To identify new potential biomarkers for sepsis, ROC curves of data derived from healthy controls and patients with sepsis from datasets GSE57065, GSE95233, and GSE28750 were analyzed using the R package pROC (Figure 5). ROC curves were generated, and the area under the curve (AUC) was used to compare the different genes. This analysis indicated that the nine selected genes had diagnostic value for sepsis. Thus, we chose these genes as candidates for further analysis and validation.
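As an illustration of what the area under the ROC curve measures, the sketch below computes it directly as the probability that a randomly chosen sepsis sample has a higher expression value than a randomly chosen control, with ties counted as one half. This pairwise formulation is equivalent to the area under the empirical ROC curve, but it is a stand-in written in Python and not the pROC implementation; the function names and expression values are hypothetical.

```python
from typing import Sequence


def roc_auc(case_values: Sequence[float], control_values: Sequence[float]) -> float:
    """Area under the ROC curve via pairwise comparison of cases and controls."""
    if not case_values or not control_values:
        raise ValueError("both groups must contain at least one value")
    wins = 0.0
    for c in case_values:
        for h in control_values:
            if c > h:
                wins += 1.0   # case ranked above control
            elif c == h:
                wins += 0.5   # ties count as half
    return wins / (len(case_values) * len(control_values))


def diagnostic_auc(case_values: Sequence[float], control_values: Sequence[float]) -> float:
    """Direction-agnostic AUC, useful for genes that are downregulated in cases."""
    auc = roc_auc(case_values, control_values)
    return max(auc, 1.0 - auc)


# Illustrative expression values only (not data from the GEO datasets).
sepsis = [5.1, 6.3, 7.0]
healthy = [4.0, 5.5, 4.8]
auc = roc_auc(sepsis, healthy)
print(round(auc, 3))
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation. For a gene that is downregulated in sepsis, the raw value falls below 0.5, so the direction-agnostic form reports the equivalent discriminative value.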

Validation of Selected Genes at the Transcriptional Level.
The expression of nine key genes was compared between patients with sepsis (n = 5) and healthy controls (n = 5) using quantitative real-time PCR. The results showed that the expression of seven of these genes was consistent with the trend observed in the microarray analysis, whereas two genes, LRG1 and TP53, showed no significant difference in expression (Figure 6).

Discussion
A microarray study is an ideal way to comprehensively investigate sepsis. In this study, three gene datasets were integrated to search for potential biomarkers and explore molecular mechanisms of sepsis. Although sepsis is an inflammatory disease, it has recently been established that both proinflammatory and anti-inflammatory responses occur early during sepsis [10]. In our study, genes associated with both innate and adaptive immunity had altered expression patterns in patients with sepsis from the time of diagnosis. KEGG pathway enrichment analysis showed that different antigenic constituents from bacteria, viruses, and other insults can cause sepsis. Upon receptor contact with their cognate ligands, proinflammatory intermediates are recruited and intracellular signaling pathways such as NF-κB transduction are activated. The activation of NF-κB induces the expression of early activation genes such as IL1B, IFNG, and IL-6 to combat the insult. However, excessive cytokine release leads to an increase in the release of circulating immune cells [5].
We found that most genes in cluster B encoded proteins that participate in innate immune responses. All of these genes were upregulated, indicating their roles in innate immune responses. One of these genes is associated with OLFM4+ neutrophils, a subset of human neutrophils. Elevated levels of OLFM4+ neutrophils in patients with sepsis are associated with worse outcomes [11]. CHI3L1 is produced by several cells including macrophages and neutrophils. A recent study pointed out that the downregulation of CHI3L1 alleviates skeletal muscle stem cell injury, suggesting its therapeutic potential for sepsis [12]. In our study, we found that the genes in cluster B interact with one another to mediate inflammatory responses. Targeting these genes and corresponding pathways involved in innate immune responses might be one strategy to reduce inflammation and associated pathology during sepsis, although much work is still required, considering the previous unsatisfactory trials involving immune activation genes.
Previously, it was thought that the host immune response to sepsis is characterized by an initial hyperinflammatory response, followed by an immunosuppressive phase as the disease progresses. However, recent studies have shown that both proinflammatory and anti-inflammatory responses occur early and simultaneously in sepsis [13]. In our study, genes in cluster A were found to be mainly involved in T-cell biology, antigen processing and presentation, and NK cell function. Interestingly, all genes in cluster A were downregulated, suggesting immunosuppression during sepsis. In cluster A, genes encoding antigen presentation-related molecules, including HLA-DRA, HLA-DRB1, HLA-DPA1, HLA-DPB1, and CCR7, exhibited decreased expression. Studies have also shown that the number of dendritic cells (DCs), the major group associated with antigen presentation, is decreased in patients with sepsis. In addition, the surviving DCs also exhibit lower expression of HLA-DR. Moreover, endotoxin-tolerant macrophages express relatively low levels of HLA-DR on their surface, resulting in a lack of antigen presentation [14]. Similarly, NK cells are also underrepresented in patients with sepsis, and the remaining NK cells display defective cytotoxic functions [15,16]. In our study, we found that genes involved in NK cell-mediated cytotoxicity (such as KLRB1, KLRD1, SH2D1A, and PRF1) showed decreased expression. In cluster A, LCK, ZAP70, CD2, CD247, CD27, CD28, CD3E, CD3G, CD4, CD8A, CD8B, ITK, TRAT1, TBX21, FYN, and IL-7R, which are related to T-cell biology, were represented, suggesting an important role for T-cell immunity during sepsis. Recently, the immunosuppressive phase has become a focus during sepsis treatment. It is encouraging that IL-7, a growth factor that stimulates the proliferation and maturation of many cell types, could continuously boost the absolute T lymphocyte counts of sepsis patients in a phase II clinical trial [17]. Our results also indicated that immunostimulatory therapy targeting immunosuppression-related genes and pathways could be a promising way to treat sepsis.
It should be noted that some DEGs were not shared among all sepsis networks but have been shown in recent years to play indispensable roles in sepsis. For example, RETN encodes resistin, which is strikingly elevated in patients with sepsis and is associated with sepsis severity and outcomes [18]. Silswal et al. showed that resistin activated monocytes and macrophages and induced the release of proinflammatory cytokines [19]. Further, recent experimental data have highlighted resistin as a potential therapeutic target in sepsis [20]. These previous findings suggest a paradoxical role for resistin in sepsis. TCN1 encodes a member of the vitamin B12-binding protein family, transcobalamin I, which is elevated during infectious conditions. As a cobalamin transport protein, transcobalamin may contribute to the resolution of inflammation when elevated [21]. FOLR3 and GGH are associated with the folate pathway. These reports suggest that vitamin B12 and folate metabolism may also constitute an important part of sepsis. Nutritional therapy, including vitamin B12 and folate, may affect the pathogenesis of sepsis, which requires further research [22]. In this study, we also identified some genes for which the functions in sepsis have not been completely characterized, suggesting their potential as biomarkers for this disease. It should be noted that the expression levels of most of these genes demonstrated by real-time PCR corresponded to the patterns observed by microarray analyses, while two genes (LRG1 and TP53) showed no significant difference. The inconsistency could be attributed to the different detection methods, sample size, patient heterogeneity, and course of the disease. Given the small sample size of the PCR validation, this result is not sufficiently powered to change the conclusions about the nine critical genes selected by bioinformatics analysis.
One of these genes, LRG1, encodes a highly conserved member of the leucine-rich repeat family of proteins, which has been reported to play a role in the inflammatory response. LRG1 is expressed by neutrophils and macrophages [23,24]. Some studies found that circulating LRG1 mRNA and plasma LRG1 protein levels might together be helpful for diagnosing simple and complicated acute appendicitis in patients with acute abdominal pain [25]. However, the role of LRG1 in sepsis remains unclear to the best of our knowledge. Our microarray analyses showed that LRG1 mRNA levels are higher in patients with sepsis. Further study is needed to validate these results and investigate the roles of this marker in sepsis.
Another gene, ELANE, encodes neutrophil elastase (NE), a serine protease secreted by neutrophils into the extracellular milieu during the inflammatory response [26]. NE also participates in the formation of neutrophil extracellular traps [27]. A previous study found that NE is positively correlated with the severity of sepsis and organ dysfunction, suggesting its potential as a biomarker for sepsis [28]. This finding is consistent with our results.
LCK and FYN belong to the Src family of protein tyrosine kinases. LCK phosphorylates downstream signaling proteins, resulting in changes in the expression of genes that are essential for T-cell maturation and activation. FYN is involved in signal transduction pathways during the development and activation of T lymphocytes and NK cells under physiological conditions [29]. FYN and LCK are also essential for platelet production and activation [30]. However, the roles of LCK and FYN in sepsis have not been entirely elucidated. In our study, the expression of LCK and FYN was decreased in patients with sepsis, indicating possible suppression of T-cells and NK cells, as well as a role for these proteins in platelet functions during sepsis, which warrants further exploration. Our ROC curve analysis also showed that both genes have diagnostic value for sepsis.
ZAP70, a member of the Syk protein kinase family, is enriched in the TCR signaling pathway. This protein functions in the initial step of TCR-mediated signal transduction in combination with the Src family kinases LCK and FYN. Functional deletion of ZAP70 can lead to selective T-cell defects characterized by the selective absence of CD8-positive T-cells. Gomez-Rodriguez et al. [31] found that the downregulation of ZAP70 accelerates neonatal sepsis disease progression. The role of ZAP70 in adult sepsis warrants further investigation.

The gene product of CD247 (also known as CD3ζ) is T-cell receptor zeta, which is part of the T-cell receptor-CD3 complex. CD247 plays essential roles in coupling antigen recognition to several intracellular signal transduction pathways. CD3ζ chain expression is consistently reduced in T-cells from both the spleen and lymph nodes in sepsis [32]. Moreover, the downregulation of CD247 was reported to be accompanied by decreased expression of other T-cell-associated signal transduction molecules such as ZAP70, as well as T-cell apoptosis, in line with our data.
TP53 is a well-characterized gene for which the gene product induces cell cycle arrest, apoptosis, senescence, DNA repair, and metabolic changes [32]. TP53 may contribute to apoptosis in a tissue-dependent manner. A recent study revealed that p53 expression in T lymphocytes during sepsis could be responsible for enhancing both apoptosis and immune dysfunction in T-cells [33]. In our microarray analyses, TP53 expression was found to be downregulated in patients with sepsis, which is in accordance with the expression of genes involved in the T-cell signaling pathway, suggesting their possible interaction during sepsis.
TBX21, also known as T-bet, is the master regulator of effector T-cell activation. Many studies have shown that TBX21 controls the expression of IFNG, a hallmark Th1 cytokine, suggesting a role for this protein in initiating Th1 lineage development [34]. Recently, studies have also discovered T-bet expression in B-cells, CD8+ T-cells, and T-reg cells, suggesting variable functions under different circumstances [35][36][37]. The decrease in TBX21 expression might influence the expression of related immune cells, a notion that warrants further study. Finally, ITK, which belongs to the Tec tyrosine kinase family, is involved in multiple aspects of T-cell development and functions such as T-cell activation and T-helper cell differentiation [31]. ITK is also known to be involved in the development of Th17 cells by regulating various transcription factors. For example, in ALI mice, ITK regulates the balance between inflammatory Th17 cells and anti-inflammatory T-reg cells [38]. However, the role of ITK in T-cell functions in patients with sepsis has not been fully evaluated. Our study showed that ITK was downregulated in patients, although further research is needed to explore the underlying mechanism.

Conclusions
Using a bioinformatics analysis of three gene datasets (GSE95233, GSE57065, and GSE28750), we identified the immune characteristics of sepsis. We found that DEGs in patients were enriched for pathways mainly involved in the innate immune response, T-cell biology, antigen presentation, and NK cell function. Focusing on the key genes and corresponding pathways involved in sepsis could provide new insights for sepsis treatment. Nine genes including LRG1, ELANE, TP53, LCK, TBX21, ZAP70, CD247, ITK, and FYN were also identified as potential new biomarkers for sepsis. The expression levels of most of these genes demonstrated by real-time PCR corresponded to the patterns observed by microarray analyses. Further investigations are needed to validate these preliminary findings.

Data Availability
The data used to support the findings of this study are available from the corresponding authors upon request.

Conflicts of Interest
The authors declare that there is no conflict of interest.

Figure 6: Expression of nine selected key genes was compared between healthy controls and sepsis patients by quantitative real-time PCR. Differences between two groups were analyzed using the t-test. *P < 0.05, **P < 0.01.

Mediators of Inflammation