Prognostic signature of ovarian cancer based on 14 tumor microenvironment genes

Ovarian cancer is one of the lethal gynecological in women. Tumor microenvironment (TME) is emerging as a pivotal biomarker for patients’ therapeutic sensitivity and prognosis. In this study, we proposed to explore the prognostic role of TME-related genes in ovarian cancer. The data of whole genome expression proles and detailed clinicopathological information of three cohorts of ovarian cancer patients from the Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO). Univariate Cox regression analysis was used to screen TME-related genes with signicantly prognostic value based on TCGA cohort. LASSO Cox regression analysis was adapted to the construction of prognostic model. Ovarian cancer cohorts from GEO were used as validation set for verifying the reliability of the prognostic model. Relative inltrating proportion of 22 immune cells were estimated through CIBERSORT software.


Abstract Background
Ovarian cancer is one of the lethal gynecological in women. Tumor microenvironment (TME) is emerging as a pivotal biomarker for patients' therapeutic sensitivity and prognosis. In this study, we proposed to explore the prognostic role of TME-related genes in ovarian cancer.

Methods
The data of whole genome expression pro les and detailed clinicopathological information of three cohorts of ovarian cancer patients from the Cancer Genome Atlas (TCGA) and Gene Expression Omnibus (GEO). Univariate Cox regression analysis was used to screen TME-related genes with signi cantly prognostic value based on TCGA cohort. LASSO Cox regression analysis was adapted to the construction of prognostic model. Ovarian cancer cohorts from GEO were used as validation set for verifying the reliability of the prognostic model. Relative in ltrating proportion of 22 immune cells were estimated through CIBERSORT software.

Results
This study identi ed a total of 14 TME-related genes that nally incorporated into the prognostic model.
The risk score that calculated through the prognostic model was proved as an independent prognostic signature in ovarian cancer. Nomogram that contains TNM stage and risk score could reliably predict the long-term overall survival probability. Additionally, risk score was signi cantly associated with the relative in ltrating proportion of several immune cells in ovarian cancer and mRNA levels of some immune checkpoint genes.

Conclusions
This study constructed a prognostic model for ovarian cancer, which was closely associated with the prognosis and immune status. This should provide novel clue for prognosis study in ovarian cancer.

Background
Ovarian cancer is the most lethal gynecological malignancy [1], following cervical cancer and uterine body cancer, the mortality of the disease accounts for up to 5% in female cancer-related deaths in the worldwide [2]. Malignant ovarian lesions include primary lesions and secondary lesions. In the primary lesions, epithelial ovarian carcinoma is the most common among malignant ovarian tumors (70% of all ovarian malignancies); also including stromal tumors of the ovary, germ-cell tumors, sex-cord stromal tumors, and other more rare types [3]. Unfortunately, early ovarian cancer causes minimal, nonspeci c, or no symptoms due to its anatomy features [4]. Therefore, the most of cases are diagnosed in an advanced stage. [5,6]. Epithelial ovarian carcinoma has been shown to have considerable complexity and heterogeneity in biology, drug response, and survival time, representing a major obstacle for its precision medicine practice.
Recent studies have shown that the tumor microenvironment (TME), composed of a variety of immune cells and stromal cells, is crucial in the occurrence and development of cancer and the response of cells to chemotherapy [7]. At the same time, the rise of immunotherapy, including immune checkpoint inhibitors, shows that the assessment of TME heterogeneity and the remodeling of the immune microenvironment will have broad application in and profound impact on the future cancer treatment [8]. Recently, some studies have shown prognostic markers in tumor microenvironment for tumor prognosis.
The study describes the comprehensive features of gastric cancer TME-related genes characteristics and provide new strategies for cancer treatment [9]. Another present study found that many elements of TME other than tumor epithelial cells in uence the progression of non-small cell lung cancer [10]. Consequently, new treatment strategies and paradigms are of great need for these patients. Unfortunately, there are currently no approved immune therapies for ovarian cancer. Therefore, mining TME-related genes in the immune microenvironment signi cantly changed the treatment prospects of ovarian cancer. TME-related genes may denote a solution to overwhelmed drug resistance and improve the results of the ovarian cancer patients [11].
In this study, we analyzed the expression of 760 tumor microenvironment-related genes in ovarian cancer patients. Meanwhile, we constructed a risk score model by hub TME-related genes to predict the prognosis of ovarian cancer and veri ed evaluation e ciency of the model in the multiple level, which provides new ideas for improving the prognosis of ovarian cancer.

Study population and TME-related genes
In the present study, 386 ovarian cancer patients' mRNA expression pro le data and corresponding clinical information were retrieved from The Cancer Genome Atlas (TCGA, https://tcgadata.nci.nih.gov/tcga/). Data set with incomplete survival information were excluded. 375 of 386 patients have complete survival information and were used in the following analysis. The detailed clinical information of these 375 patients is shown in Table 1. In addition, the data sets GSE26193 and GSE63885 were retrieved from the Gene Expression Omnibus (GEO, https://www.ncbi.nlm.nih.gov/geo/). GSE26193 contains information of 107 ovaries cancer patients with complete survival information, and GSE63885 contains information of 101 ovarian cancer patients, 78 of which have complete survival information. MRNA levels of ovarian cancer samples in the two datasets of the GEO database were quanti ed by the Affymetrix Human Genome U133 Plus 2.0 Array platform. The 760 TME-related genes were obtained based on xCell (https://xcell.ucsf.edu/) and listed in Table S1.

Construction Of Prognostic Signature
Based on the expression levels of 760 TME-related genes, univariate Cox regression analysis was performed, and genes with signi cant correlation with the prognosis of ovarian cancer were identi ed with P < 0.05 considered as statistically signi cant. Then the LASSO Cox regression analysis was performed using R language glmnet package to further optimize the TME-related genes that associated with the prognosis of ovarian cancer [12]. Risk score for each sample was calculated with the optimized genes base the following formula: In this formula, Coef i is the risk coe cient of each factor calculated by the LASSO-Cox model, X i is the expression value of each factor, and in this study refers to the expression level of mRNA of TME-related genes. Then, R package survival, survminer and two-sided log-rank test were applied to determine the optimal cutoff value of risk score for stratifying patients. According to the cutoff value, patients were attributed into Low-Risk group and High-Risk group.

Survival Analysis
Kaplan-Meier analysis was performed to estimate the overall survival rate of different groups using the R language survival package and survminer package. Log-rank test was performed to determine the signi cance of the difference in survival rate between the different groups. Multivariate Cox regression model was used to analyze whether Risk Score can predict the survival of ovarian cancer patients independently of other factors.

Calculation Of Immune Cell In ltration
We used the software CIBERSORT [13]to calculate the relative in ltration of 22 different kinds of immune cells in each cancer sample. CIBERSORT software will characterize the composition of immune in ltrating cells base on a deconvolution algorithm and 547 pre-set barcode genes based on the gene expression matrix. The sum of the proportion of all estimated immune cell types in each sample is equal to 1.

Establishment Of Nomogram Prognosis Predictive Model
Nomogram is widely used to predict the prognosis of cancer. In order to predict the 1-year, 3-year and 5year survival probabilities of patients, the nomogram was established based on all the independent prognostic factors determined by multivariate Cox regression by using the R language rms package. A calibration curve of the nomogram was drawing to compare the predicted overall survival probability of nomogram and the actual overall survival probability.

Statistical Analysis
Kaplan-Meier method was used to estimate the overall survival rate of different groups. Log-rank was used to test the signi cance of the difference in survival rate between different groups. Wilcoxon signedrank test method was used to compare the differences in immune cell in ltration in different groups, with p < 0.05 as the threshold of statistical signi cance. R software (v3.5.2) was used in performing statistical analysis in the current study.

Results
Construction and veri cation of prognostic models Univariate Cox regression analysis was performed using the TCGA data set samples with 760 TMErelated gene expression values as continuous variables, and the Hazard ratio (HR) of each gene was calculated. 48 genes were identi ed as prognostically signi cant genes with P value < 0.05 as the threshold. The protective genes with HR value less than 1 were conducive to prognosis, and those with HR value greater than 1 were hazardous genes with unfavorable prognosis (Fig. 1A). LASSO Cox regression analysis was performed on the 48 identi ed genes. According to the lambda value corresponding to the number of different genes in the LASSO Cox analysis, we determined to use 14 genes in the following analysis to achieve the smallest lambda value (Fig. 1B). . Base on this formula, the risk score of each patient in the TCGA data set and meta-GEO veri cation set (the combination of two sets of GEO data) was calculated. And according to the calculated optimal cut-off point (0.00515), these patients are divided into a high-risk group and a low-risk group. Survival analysis found that in the TCGA and meta-GEO datasets, high-risk ovarian cancer samples have poorer survival comparing to the low-risk samples ( Fig. 1C and Fig. 1D). In addition, the expression levels of the 14 genes in the model between the high and low risk groups of the TCGA data set were signi cantly different (Fig. 1E). In conclusion, the risk score calculated by the 14 genes, ELN, FBLN1, ANGPT2, FBL, COMP, PPL, PLD2, PDLIM4, EMP1, MYC, ITGAM, FRAT2, WDR45 and ADSL can serve as a model to predict the death risk of patients with ovarian cancer.
Risk Score is an independent prognostic marker for ovarian cancer Age, TNM Stage and Risk Score were adopted for multivariate Cox regression analysis to determine whether the risk score is an independent prognostic indicator ( Fig. 2A). It was found that Risk Score and Stage were still signi cantly correlated with overall survival. Samples with a high-risk score were at greater risk of death (HR = 6.33, 95% CI: 3.50-11.45, P < 0.001).
In order to further explore the prognostic value of Risk Score in ovarian cancer samples with different clinicopathological factors (including age, TNM Stage), we regrouped ovarian cancer patients according to these factors and conducted Kaplan-Meier survival analysis. It was found that, the overall survival rates of samples in the high-risk group were signi cantly lower than those in the low-risk group in early stage (Stage I + Stage II) and advanced stage (Stage III + Stage IV) samples ( Fig. 2B and 2C); also, in samples ≤ 59 years old and samples > 59 years old ( Fig. 2D and Fig. 2E). These results indicate that the Risk Score can be used as an independent indicator to predict the prognosis of ovarian cancer patients.

Nomogram model could reliably predict the long-term survival of ovarian cancer patients
Both Stage and Risk Score, these two independent prognostic factors, was used to construct the nomogram model (Fig. 3A). Three lines were draw up to determine the points obtained from each factor in the nomogram for each patient. The sum of these points is located on the "Total Points" axis, and a line from the Total Points axis was drawn to determine the probability of survival of ovarian cancer patients for 1, 3, and 5 years. The corrected curve in the calibration chart is close to the ideal curve (a 45degree line that passes through the origin of the coordinate axis and has a slope of 1), indicating that the predicted overall survival probability agrees well with the actual result ( Fig. 3B and 3D).

Immune Landscape In Low-risk And High-risk Ovarian Cancer Patients
We used the CIBERSORT method combined with the LM22 feature matrix to estimate the difference in immune in ltration of 22 kinds of immune cells between high-risk and low-risk ovarian cancer patients. Results of immune cell in ltration in 375 ovarian cancer patients from TCGA were summarized (Fig. 4A). The change in the proportion of tumor in ltrating immune cells in different patients may represent the inherent characteristics of individual differences. There are signi cant differences in the relative proportion of in ltration of three types of immune cells such as M1 macrophages between the high-risk and low-risk groups (Fig. 4B), and these three signi cantly different immune cells have a correlation with Risk Score (Fig. 4C). Through PCA analysis, it was found that based on the three types of immune cells with signi cant differences, samples within low-risk and high-risk groups could be well separated (Fig. 4D).
The expression of immune checkpoints has become a biomarker for ovarian cancer patients to choose personalized immunotherapy [14]. The correlation between the patient's risk score and the key immune checkpoints (CTLA4, PD1, IDO1, TDO2, LAG3, TIGIT) were analyzed and found that the risk score was signi cantly related to most of them (Fig. 5A). Meanwhile, the expressions of CTLA4 and IDO1 in these 6 immune checkpoints were signi cantly different between ovarian cancer patients within high-risk and low-risk groups (Fig. 5B). The result implicated that the poor prognosis of patients with high-risk ovarian cancer may be partially contributed by the immunosuppressive microenvironment aberration.

Discussion
Ovarian cancer has the highest mortality rate among all gynecological malignancies. Nearly 80% of the patients are diagnosed after the disease has become symptomatic and has progressed to a late stage [15]. The focus of cancer treatment, especially cancer immunotherapeutic strategies, has shifted from killing the cancer cell by conventional cytotoxic chemotherapy towards re-building the immune system to eliminate cancer cells and preventing the recurrence and relapse [16]. Here is a strong call for the methods to identify and develop clinically valuable gene signatures for OC prognosis, especially when based on comprehensive and unbiased whole-genome data. In this study, we rst identi ed 760 TMErelated genes were screened and eventually in the TCGA data set samples. By using the LASSO Cox analysis calculated, we nally determined to select 14 genes (ELN, FBLN1, ANGPT2, FBL, COMP, PPL,  PLD2, PDLIM4, EMP1, MYC, ITGAM, FRAT2, WDR45 and ADSL) prognostic signature and scoring system showing strong discriminative power to separate patients with good or poor survival (Fig. 1). Among these, many are already reported contributing to ovarian cancers' pathophysiological features as a single gene. EMP1 promotes the proliferation and invasion of ovarian cancer cells by activating the MAPK pathway [17]. C-myc, a common over-activated oncogene by ampli cation, has been found in more than 70% advanced ovarian cancer patients [18,19]. Integrin adhesion signaling has been reported playing a key role in macrophage activation, and Itgam knock out mice, tumor growth, and immunosuppressive cytokine mRNA levels are enhanced [20,21].
To discover a novel panel of prognostic of TEM-related genes, is the rst phase in developing a reliable predictive model to predict the cancer patient outcomes in a clinical practice. After that with the prognostic model we generated in our study, risk factors can cooperate the weight of all 14 key genes was constructed in order to further explore the prognostic value of risk score. In order to further explore the prognostic value of risk score, we regrouped ovarian cancer patients with different clinicopathological factors (age, TNM Stage) and conducted Kaplan-Meier survival analysis. The results showed that the ovarian cancer of patients with a high-risk score have poorer survival comparing with low risk. This risk score can be used as an independent model of prognosis (Fig. 2). In addition, we constructed a nomogram that integrated clinicopathological stage features with risk score to predict the 1, 3, and 5-year survival probability of ovarian cancer patients. The nomogram ndings showed that predicted overall survival probability agrees well with the actual result (Fig. 3). The key TEM-related genes showed the best accuracy in predicting the survival of ovarian cancer patients at any time point and would therefore be helpful for the precise treatment of ovarian cancer patients.
We applied settled algorithm CIBERSORT to estimate the fractions of immune cells. In the present studies, M1 macrophages, M2 macrophages and follicular helper T cells were associated with risk score. In contrast, naive B cells, CD8 T cells, CD4 memory activated T cells and eosinophils didn't show the correlation with risk score, which were the protective factors of patients. Immune cells are very complex and heterogeneous in vivo, different subsets of T cells play different roles in the immune response [22]. And Naive B cells is one of the mechanisms of autoimmune tolerance [23]. In order to express of immune checkpoints to help with personalized immunotherapy. We also proved that these risk scores are related to all key immune checkpoint genes named CTLA4, PD1, IDO1, TDO2, LAG3, TIGIT (Fig. 5). Lastly, CTLA4 and IDO1 expression was found to be highly associated with the risk score. The patients with low risk scores had increased expression levels of CTLA4 and IDO1 indicating a potentially high response rate to CTLA4 and IDO1. Though the early study showed CTLA4 blockade has shown minimal activity in ovarian cancer models [24], recent studies reported CTLA4 blockade can boost the expansion of tumor-reactive CD8 + tumor-in ltrating lymphocytes in ovarian cancer [25]. In the present date, tailor targeted therapy against IDO1 may enhance the effectiveness of treatment. IDO1 inhibitors have been designed, screened, and tested in preclinical models of disease, but still requires veri cation in clinical trials and across cancer diagnoses.

Conclusions
In the present work, we are concerned with the genetic characteristics of the microenvironment. Our systematic analysis and assessment demonstrated that 14 TME related genes might play a pivotal role in the prognosis of ovarian cancer patients. Risk score may help to outline the prognosis of patients with ovarian cancer. Targeting on TME that proves to be successful in the preclinical phase, an enormous challenge still lies ahead for translating these strategies to clinical practice.

Declarations
Ethics approval and consent to participate Not applicable.

Consent for publication
All the co-authors agree to the publication of the article.

Availability of data and material
Our data and related clinical information were retrieved from The Cancer Genome Atlas (TCGA, https://tcga-data.nci.nih.gov/tcga/) and the Gene Expression Omnibus (GEO, https://www.ncbi.nlm.nih.gov/geo/).   The X axis represents the predicted survival rate indicated by the nomogram, and the Y axis represents the actual survival rate.