The prognostic value of CYP2C subfamily genes in hepatocellular carcinoma

Abstract Cytochrome P450 2C (CYP2C) subfamily members (CYP2C8, CYP2C9, CYP2C18, and CYP2C19) are known to participate in clinical drug metabolism. However, the association between CYP2C subfamily members and hepatocellular carcinoma (HCC) remains unclear. This study investigated the prognostic value of CYP2C subfamily gene expression levels in HCC. Data of 360 HCC patients in The Cancer Genome Atlas (TCGA) database and 231 in the Gene Expression Omnibus (GEO) database were analyzed. Kaplan–Meier analysis and a Cox regression model were used to assess overall survival and recurrence-free survival, with median survival times, hazard ratios (HR), and 95% confidence intervals (CI) calculated. In the TCGA database, low expression of CYP2C8, CYP2C9, and CYP2C19 in tumor tissue was associated with a short median survival time (all crude P = 0.001; adjusted P = 0.004, P = 0.047, and P = 0.020, respectively). In the TCGA database, joint effects analysis of the combinations of CYP2C8 and CYP2C9, CYP2C8 and CYP2C19, and CYP2C9 and CYP2C19 revealed that high expression of both genes (group 4, group IV, and group d) was associated with a reduced risk of death as compared to low expression of both genes (group 1, group I, and group a) (adjusted P = 0.005, P = 0.013, and P = 0.016, respectively). In the TCGA database, joint effects analysis of CYP2C8, CYP2C9, and CYP2C19 showed that the risk of death from HCC was lower for groups C and D than for group A (adjusted P = 0.012 and P = 0.008, respectively). CYP2C8, CYP2C9, and CYP2C19 gene expression levels are potential prognostic markers of HCC following hepatectomy.


Introduction
In men, liver cancer is the fifth most commonly diagnosed cancer and the second most frequent cause of cancer-related death worldwide; in women, it is the seventh most commonly diagnosed cancer and the sixth leading cause of cancer-related death [1]. There were about 4,292,000 newly diagnosed cases and 2,814,000 deaths from cancer in China in 2015 [2]. Hepatocellular carcinoma (HCC), the major histological type, accounts for most (70-85%) cases of primary liver cancer worldwide [3]. Etiologically, infection with hepatitis B virus (HBV) or hepatitis C virus, aflatoxin exposure, obesity, diabetes, nonalcoholic steatohepatitis, alcohol ingestion, hemochromatosis, and other metabolic diseases are the primary risk factors for HCC [4]. Despite advances in several treatment strategies, such as liver resection, liver transplantation, percutaneous ethanol injection, transcatheter arterial chemoembolization, transarterial radiation, microwave ablation, and systemic therapy, the prognosis of HCC remains unsatisfactory because of late-stage diagnosis [5], which has resulted in a reported 5-year survival rate of only 7% [6]. Thus, the identification of molecular biomarkers for the early diagnosis of HCC is crucial to provide more effective therapies and improve patient prognosis.
The cytochrome P450 family 2 (CYP2) of the CYP superfamily includes many subfamilies, such as CYP2A, CYP2B, CYP2C, CYP2D, CYP2E, and CYP2F. The human CYP2C subfamily consists of four members (CYP2C8, CYP2C9, CYP2C18, and CYP2C19) that are localized in a single gene locus on chromosome 10 [7,8]. Members of the CYP2C subfamily are known to be involved in the metabolism of roughly 20% of clinically used drugs, such as the anticancer drug paclitaxel [9], the antidiabetic agent tolbutamide [8], and proton pump inhibitors [10], as well as various endogenous and exogenous substances [11]. In addition, CYP2C8 is reportedly associated with an increased risk of essential hypertension and coronary artery disease in Bulgarians [12] and has also been associated with anemia [13], breast cancer [14], and vascular inflammatory diseases [15]. Moreover, CYP2C9 is reportedly associated with the risk of colorectal cancer [16], whereas CYP2C18 was found to make no contribution to cancer risk [11]. CYP2C19 has been associated with peptic ulcer disease [17], colorectal adenoma recurrence [18], breast cancer [19], and cardiovascular diseases [20]. However, little is known about the associations of the expression levels of these four genes with the risk of HCC. Thus, the aim of this study was to identify relationships between CYP2C expression levels and HCC prognosis.

Patient data
First, the Metabolic gEne RApid Visualizer database (http://merav.wi.mit.edu/) was accessed on September 10, 2017 to determine whether any of the four members of the CYP2C subfamily are differentially expressed between normal liver tissues and primary liver tumors. Then, the GTEx Portal (https://gtexportal.org/home/) was accessed on September 10, 2017 to obtain gene expression levels of the CYP2C subfamily in different tissues [21]. Moreover, the Search Tool for the Retrieval of Interacting Genes/Proteins (STRING) database was accessed on September 10, 2017 to construct protein-protein interaction networks between CYP2C subfamily members and other proteins.

Functional enrichment analysis of the CYP2C subfamily
The Database for Annotation, Visualization, and Integrated Discovery (DAVID) v.6.7 (https://david-d.ncifcrf.gov/) was accessed on September 15, 2017 [25,26] for enrichment analysis, gene ontology (GO) functional analysis, and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis. GO analysis comprises three categories of terms: biological processes (BP), cellular components (CC), and molecular functions (MF). In the KEGG analysis, pathways linking CYP2C to other subfamilies were drawn.

Survival analysis
From the TCGA database, 360 HCC patients were divided into two groups of 180 patients each (low and high expression) at a 50% cutoff value. The median survival time (MST) was applied to estimate patient prognosis and TNM stage in a Cox regression model adjusted for patient age and sex. To keep the two databases comparable, the same 50% cutoff was used for the GEO database. In the GEO database, overall survival (OS) and recurrence-free survival (RFS) were applied to evaluate patient prognosis. In addition, the Cox regression model was adjusted for age, sex, alanine aminotransferase level, nodal status, HBV status, primary tumor size, alpha-fetoprotein (AFP) level, cirrhosis status, and Barcelona Clinic Liver Cancer (BCLC) stage.
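The 50% cutoff and the MST estimate can be illustrated with a short Python sketch. This is not the authors' workflow (the analyses in this study were run in SPSS and GraphPad Prism); the function names `split_by_median` and `km_median_survival` and all values are hypothetical. Two assumptions are made for illustration only: patients whose expression equals the median fall into the low group, and MST is taken as the first time at which the Kaplan-Meier survival estimate drops to 50% or below.

```python
from fractions import Fraction
from statistics import median


def split_by_median(expression):
    """Label each patient 'low' or 'high' relative to the cohort median.

    Assumption: values equal to the median are labelled 'low'.
    """
    cutoff = median(expression)
    return ["high" if value > cutoff else "low" for value in expression]


def km_median_survival(times, events):
    """Kaplan-Meier median survival time.

    times  : follow-up time per patient (e.g., days)
    events : 1 if the patient died at that time, 0 if censored
    Returns the first event time at which the survival estimate is
    <= 1/2, or None if the estimate never falls that far.
    """
    survival = Fraction(1)  # exact arithmetic avoids rounding at 1/2
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        survival *= Fraction(at_risk - deaths, at_risk)
        if survival <= Fraction(1, 2):
            return t
    return None


# Hypothetical toy cohort of eight patients (illustration only).
expression = [2.1, 8.4, 3.3, 9.0, 1.7, 7.5, 2.9, 8.8]
days = [300, 2400, 500, 2600, 250, 1500, 900, 2000]
died = [1, 0, 1, 0, 1, 1, 0, 0]

labels = split_by_median(expression)
for group in ("low", "high"):
    t = [d for d, g in zip(days, labels) if g == group]
    e = [x for x, g in zip(died, labels) if g == group]
    print(group, "MST:", km_median_survival(t, e))
```

A return value of `None` corresponds to a group whose median survival is not reached during follow-up.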

Statistical analysis
The Pearson correlation coefficient was used to assess correlations among the CYP2C8, CYP2C9, CYP2C18, and CYP2C19 genes. Correlation plots were drawn with R v.3.2.0 (https://www.r-project.org/). Interactions among these four genes and other genes, as well as among the four encoded proteins and other proteins, were drawn with Cytoscape v.3.5.1, an open source software platform for visualizing complex networks (http://www.cytoscape.org/). MST and probability (P) values were calculated by Kaplan-Meier survival analysis and the log-rank test. Univariate and multivariate survival analyses were performed using the Cox proportional hazards regression model. Scatter diagrams and survival curves were constructed using GraphPad Prism v.7 software (GraphPad Software, Inc., La Jolla, CA). All statistical analyses were performed using SPSS v.16 software (SPSS, Inc., Chicago, IL, USA). A P < 0.05 was considered statistically significant.
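For readers unfamiliar with the log-rank test, the following Python sketch shows the standard two-group calculation. It is an illustration only and does not reproduce the SPSS procedure used in this study; the function name `logrank_test` and the toy data are hypothetical. The P value is obtained from the chi-square distribution with one degree of freedom, and the sketch assumes both groups contain patients at risk at the first event time.

```python
import math


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, P value)."""
    # Pool both groups; the third field marks group membership (0 = A, 1 = B).
    pooled = [(t, e, 0) for t, e in zip(times_a, events_a)]
    pooled += [(t, e, 1) for t, e in zip(times_b, events_b)]

    observed_a = expected_a = variance = 0.0
    for t in sorted({t for t, e, _ in pooled if e == 1}):
        n = sum(1 for u, _, _ in pooled if u >= t)                # at risk, both groups
        n_a = sum(1 for u, _, g in pooled if u >= t and g == 0)   # at risk, group A
        d = sum(1 for u, e, _ in pooled if u == t and e == 1)     # deaths at time t
        d_a = sum(1 for u, e, g in pooled if u == t and e == 1 and g == 0)
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:  # hypergeometric variance is undefined for a single patient
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)

    chi2 = (observed_a - expected_a) ** 2 / variance
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical survival times in days; every patient has an observed death.
low_expression = [100, 200, 300, 400, 500]
high_expression = [1000, 1100, 1200, 1300, 1400]
chi2, p = logrank_test(low_expression, [1] * 5, high_expression, [1] * 5)
print(f"chi-square = {chi2:.2f}, P = {p:.4f}")
```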

Basic patient data
Detailed characteristics of the 360 patients in the TCGA database are shown in Table 1. TNM stage was significantly associated with MST (P < 0.001), whereas sex, age, BMI, and race were not (all P > 0.05).

Analysis of CYP2C subfamily gene expression levels in tumor and nontumor tissues
Expression levels of CYP2C8, CYP2C9, CYP2C18, and CYP2C19 in different organs are shown in the supplementary material. Box diagrams of the gene expression levels of CYP2C8, CYP2C9, CYP2C18, and CYP2C19 were downloaded from an online website (Fig. 1A-D, respectively). The expression levels of these genes were high in normal liver tissues, but low in primary liver tumors. Scatter diagrams of these four genes from the GEO database showed that expression of each gene differed significantly between tumor and nontumor tissues (all P < 0.0001, Fig. 1E).

Analysis of the GO and KEGG pathways of the CYP2C subfamily
The biological functions (BP, CC, and MF) of CYP2C8, CYP2C9, CYP2C18, and CYP2C19 were evaluated using GO analysis, which showed that all four were involved in drug metabolism and oxidation-reduction reactions. Detailed outcomes are shown in Figure 1F. In the KEGG pathway analysis, DAVID determined associations between CYP2C subfamily members and other genes. Benzo[a]pyrene can be metabolized by CYP2C subfamily members and finally transformed into DNA adducts, including (+)-trans-benzo[a]pyrene-7,8-dihydrodiol-9,10-oxide (BPDE)-N2-dG, which are known to induce cancers of the skin, lung, and stomach (Fig. 2).
Role of CYP2C in Hepatocellular Carcinoma X. Wang et al.

Correlation analysis of the expression levels among CYP2C subfamily members
The Pearson correlation coefficients of the four CYP2C members were calculated. In the TCGA database, each of these four genes was positively and significantly correlated with the other three members (all P < 0.05) (Fig. 3A). In the GEO database, all four genes were positively and statistically significantly correlated with the other three genes as well (all P < 0.05) (Fig. 3B).
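The pairwise calculation can be sketched in Python as follows. The expression values are hypothetical and serve only to show the six gene pairs being compared; `pearson_r` is an illustrative helper, not part of the R workflow used in this study, and the sketch reports only the coefficient, not its P value.

```python
import math
from itertools import combinations


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / math.sqrt(var_x * var_y)


# Hypothetical expression values for five samples (illustration only).
expression = {
    "CYP2C8": [5.1, 6.3, 7.8, 4.2, 8.9],
    "CYP2C9": [6.0, 7.1, 8.5, 5.2, 9.4],
    "CYP2C18": [2.2, 2.9, 3.1, 1.8, 3.6],
    "CYP2C19": [3.4, 4.0, 5.2, 2.9, 5.8],
}

# Four genes give six unordered pairs.
for gene_a, gene_b in combinations(expression, 2):
    r = pearson_r(expression[gene_a], expression[gene_b])
    print(f"{gene_a} vs {gene_b}: r = {r:.3f}")
```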
Survival curves of CYP2C8, CYP2C9, CYP2C18, and CYP2C19 based on data retrieved from the TCGA database are presented in Figure 4A-D. CYP2C8, CYP2C9, and CYP2C19 were significantly associated with survival (P = 0.001, <0.001, and <0.001, respectively). However, survival curves of these genes based on data retrieved from the GEO database, as presented in Figure 4A-H, showed that none was significantly associated with OS or RFS (all P > 0.05). In addition, scatter diagrams of the expression levels of these genes, based on data retrieved from both databases, are presented in Figure 4E and F.

Joint effects analysis of CYP2C subfamily members
Joint effects analysis of the CYP2C8 and CYP2C9 combination showed that MST was shortest in group 1 (931 days; adjusted P = 0.031) and longest in group 4 (2456 days; adjusted P = 0.005). Meanwhile, analysis of the CYP2C8 and CYP2C19 combination showed that MST was shortest in group I (899 days; adjusted P = 0.005) and longest in group IV (2456 days; adjusted P = 0.013), and that of the CYP2C9 and CYP2C19 combination showed the shortest MST in group a (1005 days; adjusted P = 0.097) and the longest in group d (2456 days; adjusted P = 0.016). Detailed joint effects analysis results are shown in Table 5 and associated survival curves are shown in Figure 5A-C.
Finally, joint effects analysis of the CYP2C8, CYP2C9, and CYP2C19 combination showed that MST was shortest in group A (827 days; adjusted P = 0.017) and longest in group C (3125 days; adjusted P = 0.012). Surprisingly, MST could not be determined for group D, which had the most favorable expression profile, possibly due to the influence of other potential factors (Table 6). Survival curves of the above analysis are presented in Figure 6D.
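A grouping scheme of this kind can be sketched in Python. The exact group definitions are given in Tables 5 and 6 and are not reproduced here, so the rule below is an assumption made for illustration: each patient is assigned to group A, B, C, or D according to how many of the three genes are expressed above the cohort median (0, 1, 2, or 3). The function name `joint_groups` and the expression values are hypothetical.

```python
from statistics import median

# Assumed mapping: number of high-expression genes -> group label.
GROUP_LABELS = "ABCD"


def joint_groups(expression_by_gene):
    """Assign each patient a group from the count of high-expression genes.

    expression_by_gene maps a gene name to a list of per-patient values;
    all lists must follow the same patient order. 'High' means strictly
    above that gene's cohort median (an assumption of this sketch).
    """
    genes = list(expression_by_gene)
    n_patients = len(expression_by_gene[genes[0]])
    cutoffs = {gene: median(expression_by_gene[gene]) for gene in genes}
    groups = []
    for i in range(n_patients):
        n_high = sum(expression_by_gene[g][i] > cutoffs[g] for g in genes)
        groups.append(GROUP_LABELS[n_high])
    return groups


# Hypothetical expression values for six patients (illustration only).
toy_expression = {
    "CYP2C8": [1, 2, 3, 7, 8, 9],
    "CYP2C9": [1, 8, 2, 7, 9, 3],
    "CYP2C19": [2, 1, 9, 3, 8, 7],
}
print(joint_groups(toy_expression))
```

Each resulting group can then be compared against group A in a survival model, as in the joint effects analysis described above.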

Discussion
In this study, the associations between CYP2C subfamily genes and HCC were investigated in both the TCGA and GEO databases. The results showed that low gene expression levels of CYP2C8, CYP2C9, and CYP2C19 in the TCGA database were associated with poor prognosis of HCC. Moreover, in the combination analysis of these three genes in the TCGA database, the groups with the most unfavorable prognostic factors had the poorest prognosis. Thus, gene expression levels of CYP2C8, CYP2C9, and CYP2C19 in the TCGA database, both alone and in combination, may serve as potential biomarkers of HCC.
CYP2C subfamily members participate in the metabolism of many endogenous and exogenous substances. It is estimated that approximately 30% of all drugs are metabolized by CYP2C8, CYP2C9, CYP2C18, and CYP2C19 [27]. Moreover, CYP2C9, CYP2C19, and CYP2C8 contribute to 17%, 10%, and 6% of drug biotransformations, respectively [28]. Specifically, CYP2C8 is reported to metabolize analgesics [29] as well as antidiabetics and cholesterol-lowering drugs [30], while CYP2C9 was found to metabolize analgesics [31] and neurological drugs [32], and CYP2C19 has been linked to the metabolism of antidepressants and antipsychotics [33], as well as drugs for treatment of respiratory diseases and allergies [34]. Of the four members, CYP2C18 has been the least studied. Beyond drug metabolism, members of the CYP2C subfamily have also been explored in many diseases, including several cancers. Specifically, genetic variants of CYP2C8 have been associated with an increased risk of myocardial infarction [35], paclitaxel-induced neuropathy [36], bisphosphonate-related osteonecrosis of the jaw in multiple myeloma [37], and esophageal squamous cell carcinoma [38]. A CYP2C9 gene polymorphism has been associated with increased susceptibility to colorectal cancer and adenoma [39], increased progression of nonalcoholic fatty liver disease [40], and excessive anticoagulation and bleeding risk in patients taking warfarin [41]. Also, mutant alleles of CYP2C18 have been linked to CYP2C19 in a Japanese population [42]. Genetic polymorphisms of CYP2C19 were found to be associated with a greater risk of HCC in Japanese cirrhotic patients with HCV infection [43], as well as a significant risk of triple-negative breast cancer [44] and, in combination analysis with smoking, lung cancer in a Chinese population [45].
CYP2C subfamily members are highly expressed in normal liver tissue and mainly metabolize endogenous and exogenous substances as well as clinical drugs. A previous study reported that CYP2C subfamily members in human hepatocytes were affected by different inflammatory stimuli, including bacterial lipopolysaccharide, interleukin 6, tumor necrosis factor-α, interferon γ, transforming growth factor β, and interleukin 1β. Of the four members, CYP2C8 was downregulated by each of these stimuli, whereas CYP2C9 and CYP2C19, which had almost identical response patterns, showed cytokine-specific responses. However, CYP2C18 was not affected by any treatment [46]. Moreover, CYP2C subfamily members are involved in the metabolic pathways of arachidonic acid, linoleic acid, and retinol, as well as in the pathways of drug metabolism by cytochrome P450, serotonergic synapses, and chemical carcinogenesis.
In the chemical carcinogenesis pathway, benzo[a]pyrene can be metabolized by CYP2C8, CYP2C9, CYP2C18, and CYP2C19, and then finally transformed into the DNA adduct (+)-trans-BPDE-N2-dG, which has been shown to promote cancers of the skin, lung, and stomach. In addition, the CYP2C8, CYP2C9, CYP2C18, and CYP2C19 genes are linked to CYP1A2 in physical interactions, coexpression, shared protein domains, co-localization, other various pathways, and even predicted relationships. At the protein-protein interaction level, CYP2C8, CYP2C9, CYP2C18, and CYP2C19 were related to CYP1A1 and CYP1A2 in coexpression, protein homology, text mining, predicted gene neighborhood interactions, predicted gene fusion interactions, predicted gene coexpression interactions, and other known interactions, as noted in curated databases and as determined experimentally.

These results further confirmed that CYP2C subfamily members exhibit many interactions with CYP1A1 and CYP1A2. CYP1A1 is known to participate in the metabolism of Sudan I to 8-(phenylazo)guanine in DNA, 1,2-naphthoquinone, 3′,4′-diOH-Sudan I, and 4′,6′-diOH-Sudan I, as well as DNA, RNA, and protein adducts. Among them, 8-(phenylazo)guanine in DNA and the DNA, RNA, and protein adducts can result in cancers of the liver and bladder. Meanwhile, CYP1A2 can metabolize IQ and MeIQx, which are finally converted into DNA adducts (dG-C8-MeIQx, dG-N-MeIQx). These DNA adducts can lead to tumorigenesis in the liver, lung, colon, and breast. In view of these results, CYP2C8, CYP2C9, CYP2C18, and CYP2C19 may be associated with the occurrence of HCC. Therefore, CYP2C8, CYP2C9, CYP2C18, and CYP2C19 may serve as potential diagnostic and prognostic serum biomarkers for HCC.
It is well known that serum AFP is the most widely used biomarker for early diagnosis and monitoring of HCC recurrence [47]. However, the prognostic value of AFP remains controversial. Several studies refuted the prognostic value of AFP in single, small HCC, and even for the prediction of HCC recurrence [48,49]. Several reports put its sensitivity at less than 70% at a cutoff value of 20 ng/mL [50,51].
Many novel serum biomarkers of HCC have been identified in recent years, including osteopontin [52], UQCRH [53], CXCL1 [54], integrator complex subunit 6 [55], PIVKA-II [56], TIP 30 [57], cavin-2 [58], and annexin A2 [59], among others. Although a variety of potential serum biomarkers were put forward by different research centers, clinical applications have been limited because of the highly heterogeneous nature of HCC. In the present population, CYP2C subfamily gene expression levels were associated with HCC prognosis. Thus, we postulate that the CYP2C subfamily members may serve as potential serum biomarkers for the early diagnosis of HCC.
However, there were some limitations in this study. First, larger population studies are required to increase the credibility of these conclusions. Second, other potential influencing factors regarding tumor evolution and prognosis, such as drinking status, smoking status, cirrhosis status, Child-Pugh score, tumor number, primary tumor size, pathological differentiation diagnosis, tumor capsule status, and vascular invasion, should be included in the analysis to better evaluate the relationships between CYP2C subfamily members and HCC prognosis. Third, more commonly used indicators, such as disease-free survival, should be considered to estimate HCC prognosis. Fourth, further well-designed studies concentrating on functional validation are warranted with a greater number of research centers and more racially diverse countries. Fifth, other significant drug-metabolizing CYPs, including CYP1A2, CYP2A6, CYP2B6, CYP2D6, CYP2E1, and CYP3A4/5, will be explored for HCC in our future studies.
To summarize, the results of this study indicate that CYP2C8, CYP2C9, and CYP2C19 represent potential serum biomarkers for the early diagnosis of HCC, and combination analysis showed significant interactions that were better prognostic indicators of HCC. However, because of the incomplete clinical data and small sample size in this study, further research is necessary to validate these findings.