Identification and validation of AKR1C1/2/3 as hepatocellular carcinoma risk modules via an integrated strategy

The human aldo-keto reductase 1 (AKR1) C family comprises four enzymes, AKR1C1– AKR1C4. Lots of studies have investigated the function of AKR1Cs in tumors, however little is known in hepatocellular carcinoma (HCC). explore and of in Meanwhile, data of 134 from used The results revealed that AKR1Cs expression was negatively correlated with the infiltration level of CD4+ T cells. Overexpression of AKR1C1/2/3 was significantly associated with tumor stage and pathological grade. Moreover, higher mRNA expression of AKR1C1/2/3 was related with shorter overall survival (OS), progression-free survival (PFS) and relapse-free survival (RFS). Multivariate Cox regression analysis showed that AKR1C1/2/3 could be significant risk factors for HCC patients. Additionally, genetic alterations of AKR1Cs can significantly affect patient OS and PFS, and expression of AKR1Cs was linked to functional networks involving oxidation-reduction process, cellular hormone metabolic process and organic hydroxy compound metabolic process, as well as retinol metabolism, steroid hormone biosynthesis, metabolic pathway and fatty acid degradation pathways. In conclusion, we successfully elaborated the relationship between AKR1Cs expression and immune infiltrations, and identified AKR1C1/2/3 could be novel prognostic biomarkers for HCC patients.

Background Hepatocellular carcinoma (HCC) is the 6th common tumor and the most frequent primary liver cancer. Due to its poor prognosis, HCC has always been a topic of great concern [1].
Although research is endless, exact mechanism of HCC occurrence and metastasis remains unclear, exploring and discovering new prognostic biomarkers are needed and can provide theoretical basis for conquering this disease in the future.
AKR1C family, consisting of AKR1C1-AKR1C4, is a critical subgroup of AKR family. AKR1Cs mainly focus on catalyzing NADPH dependent reductions and have been involved in apoptosis, intermediary metabolism, and malignant transformation [2]. Mounting evidence has revealed roles of AKR1C family members in a variety of solid tumors [3][4][5][6], especially in cancer occurrence, progression and anticancer treatment [7]. No enough evidence about comprehensive roles of the AKR1C family can be found in HCC. Hence, systematic inquiry of AKR1Cs is needed and has great benefits to reveal the mechanisms in HCC development.
In this study, we systematically explored the expression and function of AKR1C family members in HCC using bioinformatics data mining. We identified the expression levels of AKR1C1/2/3 were differentially overexpressed in cancerous tissues comparing to normal tissues. Moreover, AKR1C1/2/3 can jointly predict the prognosis in HCC patients, and low expressions were favorable. Besides, genetic alteration of AKR1Cs and the neighboring genes were analyzed in HCC to reveal that these genes could affect the occurrence and development of tumors through common carcinogenic pathways.
Methods CCLE Different tumor cell line expression profiling assays were downloaded from CCLE database (https://portals.broadinstitute.org/ccle/home), usage of this database can facilitate the identification of genetic, lineage and predictors of drug sensitivity [8].
ONCOMINE ONCOMINE database (www.oncomine.org) is the most commonly used database for carcinogenic gene mining [9]. The mRNA levels of distinct AKR1C members between different cancer tissues and corresponding adjacent normal samples were determined through this platform. The filter conditions were set as follows: the fold change was defined as 1.5 and p ˂ 0.01 were considered significant.

UALCAN
UALCAN is an online analysis tool for public data, whose resource mainly comes from level 3 RNA-seq and clinical data of 31 cancer types from The Cancer Genome Atlas (TCGA) database [10]. In present study, mRNA expression of AKR1C members and their association with clinicopathologic parameters were analyzed using this tool, UALCAN can be publicly available at http://ualcan.path.uab.edu.

Timer
The Timer database (https://cistrome.shinyapps.io/timer/) is a tool which can be used for analysis between gene expression and immune infiltration across different cancer types [11]. Herein, expression of four AKR1C members was analyzed in HCC patients, and the immune infiltrates consisted of B cell, CD8 + T cell, CD4 + T cell, macrophage, neutrophil and dendritic cell. GEPIA GEPIA (http://gepia.cancer-pku.cn/) is an online integration web tool developed by Peking University, it concludes RNA-seq data of 9,736 tumor samples and 8,587 normal samples from TCGA and GTEX [12]. In this study, we analyzed the associations of AKR1C members and gene markers of T cell exhaustion using this server.

STRING
The protein-protein interaction (PPI) network was constructed using STRING online database (https://string-db.org/) [13]. The association of AKR1C members was calculated and score values for evaluating the extent of relevance could be obtained.

Kaplan-meier Plotter
Kaplan-Meier plotter (http://kmplot.com/analysis/) is a widely used online survival analysis tool. In this study, prognostic values of AKR1C members in HCC samples were evaluated by analyzing OS, RFS, PFS and disease-specific survival (DSS).

cBioPortal and g:Profiler
The cBio Cancer Genomics Portal (http://cbioportal.org) is mainly used for exploration of multidimensional cancer genomics data sets [14]. In this study, genetic mutations of AKR1Cs and impacts on OS and DFS in HCC patients were analyzed. Neighboring genes were subsequently used to perform gene ontology (GO) and KEGG analyses by g:Profiler (http://biit.cs.ut.ee/gprofiler/) [15]. GO annotation has three parts: cellular component (CC), biological process (BP) and molecular function (MF).

Linkedomics
The LinkedOmics database (http://www.linkedomics.org/ login.php) [16] was used to analyze the correlation between differentially expressed genes and AKR1C members in the TCGA HCC cohort (n = 371), and to assess the correlation of genes by Pearson's coefficient. Similarly, the Web-based Gene SeT AnaLysis Toolkit (WebGestalt) [17] was then used to perform GO and KEGG analyses of these related genes.

Statistical Analyses
Univariate and multivariate analyses were performed using Cox proportional hazards model. All analyses were performed using SPSS version 22.0 (IBM, United States). A 2tailed p value < 0.05 was considered statistically significant.

Expression profiles of AKR1Cs in HCC
To investigate values of AKR1Cs in HCC patients, we first analyzed expression of AKR1Cs by CCLE, ONCOMINE and UALCAN databases. CCLE analysis demonstrated that mRNA expression of four members in HCC cell lines all ranked forefront levels among the common cancer cell lines, moreover AKR1C1 and AKR1C4 ranked the top level (Fig. 1a).
Analysis by ONCOMINE database showed that AKR1C3 mRNA level was significantly higher in cancerous tissue than in normal samples, however AKR1C4 mRNA expression was lower in HCC tissues (Additional file 1: Figure S1a). Meta-analysis of multiple datasets indicated significantly higher levels of AKR1C1/2/3 in HCC than in normal tissues (Additional file 1: Figure S1b). In Roessler Liver 2 dataset [18], AKR1C3 over-expression was found in HCC tissues compared with normal tissues with a fold change of 3.438 (p = 3.86E-71) (Additional fle 2: Figure S2a), Wurmbach Liver dataset [19] also showed AKR1C3 was overexpressed in HCC (Additional file 2: Figure S2b). AKR1C2 was over-expressed with a fold change of 1.543 (p = 4.53E-4) in Wurmbach Liver dataset (Additional file 2: Figure S2c).
To further explore mRNA expression levels of AKR1Cs in HCC, we used UALCAN database, whose data resource was totally different from ONCOMINE, to conduct this analysis. As shown in Fig. 1b, mRNA expressions of AKR1C1/2/3 were proved to be significantly higher in tumor tissues than in normal samples (all p < 0.05).

Correlation of AKR1Cs mRNA level with clinical indicators in HCC patients
Combined with above results, we concluded that AKR1C1/2/3 was significantly highexpressed in HCC tissues. Then we investigated the relationship between respective expression level and clinical indicators in HCC patients. Sub-group analysis of multiple clinic pathological features of HCC samples in UALCAN database consistently showed higher transcription level of AKR1C1/2/3 in HCC patients than in healthy human based on gender, age and tumor stage (Fig. 1c, d, e). Further analyzing the results of mRNA expression with tumor grade, we found that transcription levels of AKR1C1/2/3 were apparently correlated with tumor grade in HCC patients, patients with higher tumor grade tend to have higher expression of AKR1C1/2/3 (Fig. 1f).

Relationship of expression of AKR1Cs and immune infiltration and T cell exhaustion in HCC
The correlation between AKR1Cs expression and immune invasion in HCC was evaluated using Timer database in this study. The results showed that the infiltration levels of CD4 + T cells were negatively correlated with the expression of all AKR1C members (Fig. 2, all p < 0.05), and levels of infiltrated macrophages were also negatively correlated with AKR1C1/2/4 expression (Fig. 2).
Next, we used the GEPIA database to identify the relationship between AKR1Cs expression and T cell exhaustion markers including PD-1, LAG3, GZMB and CTLA4 in tumor and normal tissues. Our results showed that there was a negative correlation of AKR1C2/4 and PD-1 in tumor tissues (p < 0.05, Additional file 3: Table S1), but no significant correlations between remaining members and other markers.
Integration analysis of multiple members showed that high combinatory mRNA expression of AKR1C1/2/3 (HR = 1.88, 95% CI: 1.32-2.67, and p = 4E-04) or all members (HR = 1.86, 95% CI: 1.3-2.65, and p = 0.00053) was significantly associated with poor OS in HCC patients (Fig. 3a). Similar results were observed when analyzing PFS, RFS and DSS ( Fig. 3b, c, d). Taken together, these results indicated that high mRNA expression of AKR1C1/2/3 was significantly correlated with poor prognosis in HCC patients and AKR1C1/2/3 could be viewed as significant risk indicators.  Genomic alterations and biological interaction network of AKR1Cs in HCC Next, we used cBioPortal, which was based on sequencing data from the TCGA database to explore genetic alterations of AKR1Cs and their associations with prognostic indicators in HCC patients. As shown in Fig. 4a, AKR1Cs were altered in 70 of 370 (19%) patients and AKR1C2 ranked the highest level (the mutation rate was 11%). Among these alterations, 10 2 cases had deep deletion (0.54%) (Additional file 4: Table S2). To further investigate the roles of genomic alterations of AKR1Cs in HCC, we conducted survival analysis and the results showed that patients with AKR1Cs alterations had poor OS and PFS (Fig. 4b, c, all p < 0.05), indicating that genomic alterations of AKR1Cs had significant impacts on patient prognosis in HCC.  We then planned to study the biological interaction network of AKR1Cs in HCC using the Network tool in cBioPortal, and Fig. 4d showed the constructed network of neighboring genes associating with AKR1Cs mutations. The most frequent alterations were RDH10 (21.1%) and ADH5 (6.8%) among these neighboring genes (Additional file 5: Table S3).
Functions of AKR1Cs and the frequently altered neighboring genes were enriched by g:Profiler website (Additional file 6: Table S4). Results of enriched GO terms indicated that these genes encode proteins localized mainly in endoplasmic reticulum membrane, nuclear outer membrane and endoplasmic reticulum part (Fig. 4e). These proteins are primarily involved in oxidation-reduction process, cellular hormone metabolic process and organic hydroxy compound metabolic process (Fig. 4f), while they were also associated with oxidoreductase activity and steroid dehydrogenase activity (Fig. 4g). In KEGG analysis, it was suggested that retinol metabolism, steroid hormone biosynthesis, metabolic pathway and fatty acid degradation pathway (Fig. 4h) all showed enrichments, which indicated the possible pathways of AKR1Cs mutation in HCC.

Construction of PPI network and enrichment analysis of AKR1Cs functional networks in HCC
PPI network was constructed using STRING database, and the results indicated that there was a close positive correlation between any two of AKR1C1/2/3 members. However, AKR1C4 is much less relevant with remaining members (Fig. 4i-l and Additional file 7: Table S5).
The LinkedOmics database, containing mRNA sequencing data of 371 HCC patients in the TCGA was used to analyze co-expression genes correlated with AKR1Cs in HCC. As shown  (Fig. 5a).
The same method is used to analyze AKR1C2/3/4, and the results were shown in Fig. 5b, c, d, Additional file 8: Table S6 and Additional file 9: Table S7. All these results reflected that AKR1C1/2/3 had a strong positive correlation with each other, indicating the 3 members may play a synergistic role in function regulation. We then used gene set enrichment analysis (GSEA) to perform GO term and KEGG analyses and found that significant genes differentially expressed in correlation with AKR1C1 were located mainly in the mitochondria, ribosome and respiratory chain, while they were mainly involved in coenzyme metabolic process, cellular amino acid metabolic process and fatty acid metabolic process. At the same time, they play important roles in the structural constituent of ribosome, oxidoreductase activity and electron transfer activity (Fig. 6a).
KEGG pathway analysis showed they were mainly enriched in ribosome, oxidative phosphorylation and carbon metabolism pathways (Fig. 6a). Similarly, we did similar analysis for AKR1C2/3/4, and results were shown in Fig. 6b, c, d and Additional file 10: Table S8.

Discussion
In HCC, roles of AKR1Cs alteration still remain largely unclear although a small number of studies indicated some members play an important role in the progression of liver cancer [20,21]. Considered the deficiency of related researches, this study focuses on the expression levels, immune infiltrations, mutation status and prognostic values of different AKR1C members in HCC patients. Our results indicated that AKR1C1/2/3 members were distinctively high-expressed in HCC comparing to normal controls, and correlation analysis showed that there was a good positive correlation among the three members, which implied that there was a potential synergistic effect among the 3 members in HCC. And this was consistent with previous results of studies [6,22]. Besides, AKR1C1/2/3 jointly acting as significant risk factors on prognosis was also validated in HCC patients.
High expression of AKR1C1 was observed in many solid tumors and increasing studies have used AKR1C1 as a therapeutic target for treatment of cancers [3,23]. Our study firstly showed that AKR1C1 was also high-expressed in HCC tissues and high mRNA expression of AKR1C1 predicted poor OS and PFS. All the results indicated that AKR1C1 actively participated in the occurrence of tumors and can be regarded as a new target in future treatment of HCC. Meanwhile, our study also showed high mRNA expression of AKR1C2 predicted poor survival in HCC patients, which was in line with previous studies [20,21]. However in breast cancer (BC), AKR1C2 expression in tumor cells suggested a positive correlation with DFS and OS, this opposite conclusion may be attributed to the characteristics of breast cancer itself [24]. Additionally, significant up-regulation of AKR1C3 was observed in acute lymphoblastic leukemia [25] and prostate cancer, promoting the development of prostate cancer and reduce the effect of chemotherapy [26]. These features were in line with the effect of AKR1C3 in current study, namely AKR1C3 positively correlated with tumor grade, and can be viewed as a new biological indicator for prognosis of HCC patients. Finally, AKR1C4 seems to play a conflicting role comparing with other members, moreover AKR1C4 and AKR1C3 seem to function mutually exclusive. This is consistent with our results that the coefficient of person-correlation is only 0.469, which is much lower than that with any other members (Additional file 8: Table   S6 and Additional file 9: Table S7), indicating the least relevance between AKR1C4 and AKR1C3. Perhaps because of its uniqueness, the mRNA expression of AKR1C4 in HCC is lower than normal controls, and patient prognosis with high expression of AKR1C4 is exclusively favorable in HCC.
Tumor immune cell infiltration refers to the migration of immune cells from blood to surrounding tissue. The infiltration of immune cells in tumors is closely related to patient clinical outcomes. Tumor infiltrating lymphocytes are independent predictors for patient survival in many cancers [27,28]. As a key point of the anticancer effects, however, it remains much challenging to confirm the relationship of immune cell infiltrates with tumor cells and body immune system. A large number of studies elucidating inner mechanisms are needed for improving patient prognosis in cancer treatments. In the present study, although we have initially obtained the relationship between AKR1Cs expression and immune cells & some immune markers, it remains a lot to be explored which AKR1C member can specifically affect liver cancer growth through which immune cell.
Our research has many shortcomings although we systematically revealed for the first time the roles of AKR1C family members in HCC. First, most of the conclusions are mined from online public databases, a larger number of clinical samples and experimental studies are still needed to confirm the roles of these members in HCC. Second, the internal mechanism of AKR1Cs' role in HCC and how different members of the family related to each other still need to be explored in the future.

Conclusions
Collectively, these bioinformatic and statistical findings provided multilevel evidence for the importance of AKR1C family members in hepatocarcinogenesis and its potential as a marker in HCC. Future work will focus on utilizing clinical parameters along with the biomarker to improve its performance in HCC.