A genetic variant in pseudogene E2F3P1 contributes to prognosis of hepatocellular carcinoma

ABSTRACT Certain pseudogenes may regulate their protein-coding cousins by competing for miRNAs and play an active biological role in cancer. However, few studies have focused on the association of genetic variations in pseudogenes with cancer prognosis. We selected six potentially functional single nucleotide polymorphisms (SNPs) in cancer-related pseudogenes, and performed a case-only study to assess the association between those SNPs and the prognosis of hepatocellular carcinoma (HCC) in 331 HBV-positive HCC patients without surgical treatment. Log-rank test and Cox proportional hazard models were used for survival analysis. We found that the A allele of rs9909601 in E2F3P1 was significantly associated with a better prognosis compared with the G allele [adjusted hazard ratio (HR) = 0.69, 95% confidence interval (CI) = 0.56–0.86, P = 0.001]. Additionally, this protective effect was more pronounced in patients who received neither chemotherapy nor transcatheter hepatic arterial chemoembolization (TACE) treatment. Interestingly, we also detected a statistically significant multiplicative interaction between genotypes of rs9909601 and chemotherapy or TACE status on HCC survival (P for multiplicative interaction < 0.001). These findings indicate that rs9909601 in the pseudogene E2F3P1 may be a genetic marker for HCC prognosis in Chinese patients.


INTRODUCTION
Liver cancer is the fifth most common cancer worldwide and the second most frequent cause of cancer mortality, with over 748,300 new cases every year, half of which are in China [1,2]. Hepatocellular carcinoma (HCC) is the most common type of liver cancer [3]. Although surgical resection and liver transplantation are regarded as the best treatments for a curative prognosis of early-stage HCC [4], about 85% of patients are not suitable for surgery due to locally advanced tumor or distant metastasis [5]. The discovery and application of biomarkers that complement traditional cancer staging can improve patient care. Thus, substantial efforts have been made to identify biomarkers as prognostic factors for improving therapeutic effect and prognosis prediction.
Pseudogenes are structurally similar to genes that encode functional proteins, but are unable to encode fully functional proteins in most cases. Thus, pseudogenes have long been considered nonfunctional sequences of genomic DNA. However, emerging evidence suggests that pseudogenes may harbor the potential to regulate the expression of their ancestral protein-coding genes by serving as a source of small interfering RNAs (siRNAs), antisense transcripts, microRNA (miRNA) binding sites, or competing mRNAs [6][7][8]. Furthermore, pseudogenes regulate tumor suppressors and oncogenes by acting as microRNA decoys [9][10][11][12][13][14]. To date, several studies have reported associations between pseudogene expression and the risk of multiple cancers. Ishiguro et al. reported that two pseudogenes (NANOG1 and NANOGP8) were differentially expressed in colon cancer cells, and their expression might contribute to the proliferation of colon cancer cells [15]. Similar results were found for the association of POU5F1P1 expression with prostatic carcinoma [16], PTENP1 expression with melanoma [17], and NANOGP8 expression with gastric cancer [18]. However, little is known about pseudogenes and cancer prognosis.
Functional polymorphisms in pseudogenes, such as single nucleotide polymorphisms (SNPs) influencing miRNA binding, may affect the expression or function of the proteins [19,20]. Thus, we speculate that potentially functional polymorphisms in pseudogenes may affect the expression of pseudogenes or their parental protein-coding genes by influencing miRNA binding affinity, and thus play a role in the development and progression of human cancer. In this study, we examined the associations between six genetic variants of pseudogenes and the prognosis of 331 Chinese patients with intermediate or advanced HCC.

Subjects
The protocol was approved by the local institutional review board at the authors' affiliated institution. Written informed consent was obtained from every subject. The enrollment of subjects was described previously [21,22]. To construct a relatively homogeneous population, our current study was restricted to HCC patients without surgery in intermediate stage (B) or advanced stage (C) according to the Barcelona Clinic Liver Cancer (BCLC) staging system [23,24]. We recruited 414 intermediate or advanced HCC patients from Nantong Tumor Hospital and the First Affiliated Hospital of Nanjing Medical University, Nanjing, Jiangsu, China. All patients were followed up prospectively every three months from the time of enrollment by personal or family contacts until death or last time of follow-up. As a result, a total of 331 intermediate or advanced HCC patients who had completed follow-ups and clinical information were enrolled in our study with a response rate of 80.0%. The maximum follow-up time (MFT) for the 331 patients involved in the present study was 60.7 months (last follow-up in January 2013) and the median survival time (MST) was 14.5 months.

Selection and genotyping of SNPs
We included eight key cancer-related pseudogenes from a review conducted by Poliseno et al. [8] (Supplemental Table 1 available online). By blasting the identical sequence between pseudogenes and their parental genes, we found 31 common SNPs located in pseudogenes, with at least 50 bp of flanking regions identical to their parental genes. Then, we searched miRBase (http://www.mirbase.org/) and Patrocles (http://www.patrocles.org/) to find whether the allelic change of the 31 SNPs may influence miRNA binding. Eventually, we selected six potentially functional SNPs that might affect miRNA binding (rs11682718 in DNMT3AP1, rs1838149 and rs9909601 in E2F3P1, rs2004079 in NANOGP8, rs9889937 in FOXO3B, and rs6913881 in KRASP1).
Genomic DNA was extracted from a leukocyte pellet by traditional proteinase K digestion, phenol-chloroform extraction and ethanol precipitation. All SNPs were genotyped using the TaqMan allelic discrimination assay on a 7900 system (Applied Biosystems, Carlsbad, CA, USA). The primers and probes for the six SNPs are shown in Supplemental Table 2 (available online). Two blank (water) controls were included in each 384-well plate for quality control, and more than 5% of samples were randomly selected and repeated, yielding a 100% concordance. The success rates of genotyping for the six SNPs were all above 95%.

Statistical analysis
Mean survival time was presented when the MST could not be calculated. Kaplan-Meier method and log-rank test were performed to compare the survival time in different subgroups categorized by patient characteristics, clinical features and genotypes. Univariate and multivariable Cox proportional hazard regression analyses were performed to estimate the crude or adjusted hazard ratio (HR) and their 95% confidence intervals (CI), with adjustment for age, gender, smoking status, drinking status, BCLC stage, and chemotherapy or transcatheter hepatic arterial chemoembolization (TACE) status. Cox stepwise regression model was also conducted to determine predictive factors of HCC prognosis, with a significance level of 0.050 for entering and 0.051 for removing the respective explanatory variables. The Chi-square-based Q test was applied to test the heterogeneity of associations between subgroups. Analyses were carried out using Statistical Analysis System software (version 9.1.3; SAS Institute, Cary, NC, USA). All tests were two-sided and the criterion of statistical significance was set at P < 0.05.
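The Kaplan-Meier estimate and the MST described above can be illustrated with a minimal sketch. It is not the SAS procedure used in the study; the function names `kaplan_meier` and `median_survival` and the toy follow-up data are hypothetical and chosen only for illustration. It also shows why an MST may be undefined: if the estimated survival curve never falls to 0.5, no median exists and a mean would have to be reported instead.

```python
from collections import Counter

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    times  -- follow-up time per patient (e.g. months)
    events -- 1 if death was observed, 0 if the patient was censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)   # deaths and censorings both leave the risk set
    at_risk = len(times)
    surv = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            surv *= 1 - d / at_risk   # product-limit step
            curve.append((t, surv))
        at_risk -= leaving[t]
    return curve

def median_survival(curve):
    """First time at which S(t) <= 0.5, or None if the curve never gets there."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None

# Hypothetical toy data, NOT taken from the study.
times = [2, 3, 3, 5, 8, 8, 12, 15]
events = [1, 1, 0, 1, 1, 0, 1, 0]
curve = kaplan_meier(times, events)
mst = median_survival(curve)
```

In this sketch, patients censored at a given time are counted as still at risk for deaths occurring at that same time, which is the usual convention.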

Demographic and baseline characteristics of the study subjects
The demographic characteristics and clinical information of the 331 HCC patients in stage B or C included in the study were described previously [22]. In total, 258 of them died from HCC, and two died from other causes during up to 60.7 months of follow-up. For disease-specific survival analysis, the latter were considered as censored data in the analyses. Drinking status and chemotherapy or TACE status were significantly associated with survival time (log-rank P = 0.006 and P < 0.001, respectively). Notably, compared to those who received neither chemotherapy nor TACE therapy (MST = 3.4 months), patients with chemotherapy or TACE therapy (MST = 16.8 months) had a significantly decreased risk of death, lower by 61% (HR = 0.39; 95% CI = 0.29-0.51).

HCC survival
Kaplan-Meier method and log-rank test were performed to examine the associations of the six SNPs with HCC survival in different genetic models (additive model, dominant model and recessive model). As shown in Table 1, survival time differed significantly between genotypes of rs9909601 located in E2F3P1 (log-rank test: P = 0.007 in the dominant model and P = 0.026 in the additive model). Patients carrying rs9909601 AG/AA genotypes survived significantly longer (MST = 15.8 months) than those carrying the rs9909601 GG genotype (MST = 12.6 months; Fig. 1A). Furthermore, multivariable Cox regression analysis showed that rs9909601 remained a significant prognostic marker for HCC.

Stepwise Cox regression analysis on HCC survival
Then, we performed stepwise Cox proportional hazard analysis to estimate the effects of demographic characteristics, clinical features, and E2F3P1 rs9909601 on HCC survival. As shown in Table 3, four variables (age, drinking status, chemotherapy or TACE status, and E2F3P1 rs9909601) were selected into the final regression model. Furthermore, when gender, smoking status and BCLC stage were included in the final model, E2F3P1 rs9909601 still remained an independent protective factor for HCC survival (HR = 0.56, 95% CI = 0.43-0.72, P < 0.001).

Figure legend (fragment): ... receiving chemotherapy or TACE therapy; 1, those with common genotype (GG) and receiving chemotherapy or TACE therapy; 2, those with variant genotypes (AG/AA) and without chemotherapy and TACE therapy; 3, those with GG genotype and without chemotherapy and TACE therapy.

Stratification and interaction analysis
The associations between E2F3P1 rs9909601 and HCC survival were further investigated by stratification of age, gender, smoking status, drinking status, BCLC stage, and chemotherapy or TACE status. As shown in Table 4, we found that the protective effect of rs9909601 variant genotypes was more prominent in patients without chemotherapy and TACE (adjusted HR = 0.45, 95% CI = 0.28-0.73) than in those with chemotherapy or TACE therapy (adjusted HR = 0.85, 95% CI = 0.62-1.15, P = 0.029 for heterogeneity test). Therefore, a gene-chemotherapy or TACE status interaction analysis was carried out, and a statistically significant multiplicative interaction was observed (P for multiplicative interaction < 0.001, Fig. 1B). Compared to subjects with AG/AA genotypes and with chemotherapy or TACE therapy, patients with GG genotype but without chemotherapy or TACE therapy had a significantly increased mortality risk (adjusted HR = 14.98, 95% CI = 9.20-24.37, P < 0.001) (Table 5).
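The heterogeneity test between the two treatment strata can be approximately re-derived from the published summary figures alone. The sketch below is an illustration, not the authors' SAS code: it assumes the reported intervals are symmetric Wald intervals on the log scale, recovers each standard error from the rounded 95% CI, and applies an inverse-variance Q statistic with one degree of freedom. The helper names `log_hr_and_se` and `q_test_two_groups` are hypothetical.

```python
import math

Z_95 = 1.959964  # normal quantile for a two-sided 95% interval

def log_hr_and_se(hr, ci_low, ci_high):
    """Recover log(HR) and its standard error from a reported 95% CI."""
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * Z_95)
    return math.log(hr), se

def q_test_two_groups(group_a, group_b):
    """Inverse-variance Q statistic and p-value for two subgroups (1 df)."""
    estimates = [group_a, group_b]
    weights = [1 / se ** 2 for _, se in estimates]
    pooled = sum(w * b for w, (b, _) in zip(weights, estimates)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, (b, _) in zip(weights, estimates))
    # Chi-square survival function with 1 df: P(X > q) = erfc(sqrt(q / 2))
    p = math.erfc(math.sqrt(q / 2))
    return q, p

# Stratum-specific adjusted HRs as reported in the text (Table 4).
no_treatment = log_hr_and_se(0.45, 0.28, 0.73)   # without chemotherapy and TACE
treated = log_hr_and_se(0.85, 0.62, 1.15)        # with chemotherapy or TACE

q, p = q_test_two_groups(no_treatment, treated)
print(f"Q = {q:.2f}, P for heterogeneity = {p:.3f}")
```

Under these assumptions the result comes out close to the reported P = 0.029 for heterogeneity; small differences would be expected because the inputs are rounded to two decimals.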

DISCUSSION
In the present study, we investigated the effects of six common SNPs in cancer-related pseudogenes on the survival of advanced HCC patients and demonstrated that E2F3P1 rs9909601 may be an independent biomarker to predict the survival of advanced HCC patients. To the best of our knowledge, this is the first report to evaluate the role of genetic variations in pseudogenes in HCC survival. E2F, a transcription factor, was reported to regulate the expression of multiple genes that are important in cell proliferation [25]. Specifically, it plays a critical role in the control of the cell cycle [26,27]. Recent findings suggested that the expression of E2F3 was regulated by several miRNAs [28] and played a major role in modifying cellular proliferation rate, which directly or indirectly affected the clinical outcome of many types of tumors, including bladder cancer [29], prostate cancer [30], ovarian cancer [31] and breast cancer [32]. E2F3P1, located on chromosome 17, is a pseudogene with a sequence similar to that of E2F3 on chromosome 6. Although they are on different chromosomes, E2F3P1 may regulate E2F3 expression by competing for miRNAs, and their expression levels are positively correlated [33]. Lees et al. have reported that genetic variations in E2F3P1 may influence miRNA binding and thus may interrupt subsequent cellular activity, including proliferation and apoptosis [34]. Thus, it is biologically plausible that genetic variations in pseudogenes contribute to cancer risk or prognosis, given that miRNAs may provide a link between pseudogenes and their parent genes.
By using two web-based prediction tools (miRBase and Patrocles), we found that the wild-type G allele of E2F3P1 rs9909601 was more inclined to bind miR-24, miR-149, and miR-892b than the variant A allele. Both miR-24 and miR-149 have been investigated substantially in cancers. For instance, Han et al. identified that the overexpression of miR-24 was associated with the non-recurrence of hepatocellular carcinoma following liver transplantation, which contributed to a better prognosis of HCC [35]. In addition, several studies have indicated that miR-149 is down-regulated in a variety of carcinomas, such as head and neck squamous cell carcinoma [36], colorectal cancer [37] and astrocytoma [38], and that this down-regulation is associated with worse patient survival. Thus, E2F3P1 rs9909601 identified in our study may alter miRNA-mediated regulation of gene expression by influencing the binding affinity of several specific miRNAs, and hence play a role in the progression of HCC. Moreover, our analyses indicated a significant interaction between variant genotypes of rs9909601 and therapy status, which provided evidence that the effect of genetic variants on HCC prognosis could be modified by clinical factors.
In conclusion, rs9909601 in E2F3P1 may be a useful biomarker for the prognosis of HCC. However, further studies with larger sample sizes and functional analyses are warranted to verify our findings.