Introduction

Carcinogenic process can be driven by the accumulation of epigenetic alterations as well as genetic alterations1. Epigenetics comprise heritable modifications to the chromatin that affect gene expression without altering DNA coding sequence, including DNA methylation, microRNA regulation, and histone modifications2. These epigenetic changes and their roles in human cancer have been actively studied for the past decades2. The knowledge of various epigenetic events has led to deeper understanding of the pathogenesis of cancer, and provided a clue to the discovery of prognostic biomarkers and novel therapeutic targets1. Histones are the central component of the nucleosomes, the fundamental building blocks of chromatin. The histone tails are subject to extensive posttranslational modifications such as methylation and acetylation among others, which can contribute to chromatin compaction, nucleosome dynamics, and transcriptional processes3. Dysregulation of these processes may lead to aberrant gene expression, which is frequently observed in human cancers4. Many studies have reported the role of genetic variations in the regulation of epigenome5,6,7,8, and their association with the risk and clinical outcomes of human cancers9,10,11,12.

Lung cancer is one of the most commonly diagnosed malignancies and is the leading cause of cancer-related death worldwide, with an average 5-year survival rate of 19%13. Lung adenocarcinoma comprises approximately 50% of all lung cancer14. Advanced understanding in molecular pathogenesis have enabled the identification of driver mutations and the development of personalized treatment options with clinically meaningful outcomes in metastatic lung adenocarcinoma15,16. In early-stage lung adenocarcinoma, many patients experience recurrence and even death after complete resection although surgery is the best treatment modality for cure. Even patients with the same stage are regarded to be at a variable risk of recurrence. However, reliable prognostic biomarkers are still not available. Therefore, deeper understanding in the molecular pathogenesis is essential to promote the development of prognostic biomarkers for building effective post-surgical strategies such as adjuvant therapy and follow-up to improve prognosis in early-stage NSCLC.

In this study, we hypothesized that potentially functional single nucleotide polymorphisms (SNPs) in histone modification regions may modulate the pathogenesis of lung cancer by regulating the expression of target genes, and consequently clinical outcomes of lung cancer. To test this hypothesis, we aimed to investigate the association between functional variants within or adjacent to genes with high expression in histone modification regions and the prognosis after surgery in lung adenocarcinoma.

Results

Patient characteristics and clinical outcomes

The baseline clinical and pathologic characteristics of patients in the discovery and validation cohorts and their association with OS and DFS are shown in Table 1. Median duration of follow-up was 30.4 (range, 3.6–67.1) months for the discovery cohort, and 27.4 (range, 1.0–66.7) months for the validation cohort. Pathologic stage was significantly associated with OS and DFS in both cohorts (log-rank P [PL–R] for OS = 0.01 and 3 × 10–6, and PL–R for DFS = 4 × 10–5 and 3 × 10–10 in the discovery and the validation cohorts, respectively). In the discovery cohort, age was significantly associated with OS (PL–R = 0.02), and adjuvant therapy was associated with DFS (PL–R = 0.04). In the validation cohort, sex and smoking status were significantly associated with OS and DFS (PL–R for OS = 0.04 and 0.01; and PL–R for DFS = 0.03 and 0.01, respectively).

Table 1 Univariate analysis for overall survival and disease-free survival by clinicopathologic features in the discovery and validation cohorts.

Identification of potentially functional SNPs using ChIP-seq and RNA-seq

The regulatory variants in histone modification regions may affect the expression of genes. To identify potentially functional variants in histone modification regions, we performed ChIP-seq with antibodies against two posttranslational modifications of histone H3 (H3K4me3 and H3K27ac) and RNA-seq using H2087 cell lines. Based on the ChIP-seq data for two histone marks, 31,582 SNPs located within H3K4me3 peaks and 34,591 within H3K27ac peaks were retrieved. And then, based on the NCBI SNP database (https://snpinfo.niehs.nih.gov/), 1654 SNPs within H3K4me3 peaks and 1229 within H3K27ac peaks, which were potentially functional and had minor allele frequency ≥ 0.1, were collected. Using RNA-seq we chose genes with high expression level (FPKM ≥ 100), and then 383 SNPs within or closest to the genes were extracted, of which 320 were identified as tag SNPs. Finally, the 279 SNPs were selected because 41 located within the overlap between the H3K4me3 peaks and H3K27ac peaks were duplicated.

Associations between the SNPs and survival outcomes

Among the 279 SNPs genotyped, 184 were analyzed for the association study after excluding 10 with genotyping failure, 85 with deviation from the HWE (P < 0.05) or call rate < 90%. In the discovery set, 41 SNPs were significantly associated with clinical outcomes (Supplementary Table S1). Among 41 SNPs, two (CAPN1 rs17583C>T and LINC00959 rs4751162A>G) were found to be significantly associated with survival outcomes in a validation set (Supplementary Table S2). In a combined analysis that adjusted for age, gender, smoking status, pathologic stage, and adjuvant therapy, CAPN1 rs17583C>T was significantly associated with better OS and DFS (adjusted HR [aHR] = 0.49, 95% CI 0.32–0.75, P = 0.001, under a dominant model; aHR = 0.43, 95% CI 0.23–0.79, P = 0.007, under a recessive model, respectively), and LINC00959 rs4751162A>G exhibited worse DFS (aHR = 1.53, 95% CI 1.11–2.09, P = 0.008, under a dominant model) (Fig. 1 and Table 2).

Figure 1
figure 1

Overall survival (A, B) and disease-free survival (C, D) according to CAPN1 rs17583C>T genotypes, and disease-free survival (E, F) according to LINC00959 rs4751162A>G genotypes. The data were analyzed using SAS. P values in the multivariate Cox proportional hazard model.

Table 2 Overall and disease-free survival according to genotypes of two polymorphisms in the combined cohorts.

Effect of rs17583 C>T and rs4751162 A>G on promoter activity and mRNA expression

The rs17583C>T is located in the promoter region of CAPN1, and the rs4751162A>G is located in the intron of LINC00959. The rs4751162 resides in an enhancer which is 26 kb apart from GLRX3 and expected to regulate its expression. To verify the functional relevance of the two genetic variants, we investigated whether rs17583C>T and rs4751162A>G regulate the promoter activity of the CAPN1 and GLRX3 gene, respectively. The promoter assays showed that the CAPN1 rs17583 T allele had significantly lower promoter activity than the C allele (Fig. 2A, P = 0.008). The LINC00959 rs4751162 G allele had significantly higher GLRX3 promoter activity than the A allele (Fig. 2B, P = 0.05). Quantitative RT-PCR showed that the relative CAPN1 expression was significantly higher in tumor tissue than in normal lung tissue (Fig. 3A, P = 0.003). In tumor tissues, CAPN1 expression was significantly lower in CT + TT than in CC genotype (Fig. 3B, P = 0.01). The expression of GLRX3 was also significantly higher in tumor tissue than in normal tissue (Fig. 3C, P = 0.0003). GLRX3 expression was not significantly different among genotypes (Fig. 3D). There was no significant association between mRNA expression level and the survival outcomes (data not shown).

Figure 2
figure 2

The transcription activity according to alleles of CAPN1 rs17583C>T and LINC00959 rs4751162A>G measured by luciferase reporter assay. The H1703 cells were transfected with pGL3-CAPN1_C and pGL3-CAPN1_T for rs17583 (A), and pGL3-GLRX3pro, pGL3-GLRX3pro_A, or pGL3-GLRX3pro_G for rs4751162 (B), respectively. The data were analyzed using Excel. Each bar represents mean ± SEM of luciferase activity normalized to Renilla luciferase activity. Experiments were performed in triplicate. P value, Student's t-test.

Figure 3
figure 3

The mRNA expression levels of CAPN1 and GLRX3 in tumor and corresponding non-malignant lung tissues (A, n = 73 and C, n = 55), and CAPN1 and GLRX3 mRNA expression according to CAPN1 rs17583C>T (32CC, 29CT, and 8TT) and LINC00959 rs4751162A>G (32AA, 15AG, and 1GG) genotypes (B, D). The data were analyzed using SPSS. The P-value was calculated using Student’s t-test.

ChIP-qPCR assays to confirm histone modification and transcription factor binding in two SNP-containing regions

To confirm whether the two genetic variants are located in functional promoter or enhancer with active histones, we performed ChIP-qPCR assays using antibodies against H3K4me3, H3K27ac, H3K9ac, and H3Kme1. The rs17583-containing region was associated with strong enrichment of H3K4me3 and H3K9ac which mark active promoters (Fig. 4A), and the rs4751162-containing region showed strong enrichment of H3K27ac and H3Kme1 which mark active enhancers (Fig. 4B). Next, motif analyses were performed to predict transcription factors which bind to the SNP regions. As a result of aligning with known motifs using TOMTOM tool in the MEME Suite web server17, the rs17583 and rs4751162 are located in the binding motifs of YY1 and TFAP4, respectively (Supplementary Fig. S1). We performed ChIP assays with antibodies to YY1, TFAP4 or with IgG as a control. The rs17583 and rs4751162 regions showed significant enrichment of YY1 and TFAP4, respectively (Fig. 4C,D). The YY1 was predicted to bind to rs17583 C allele more efficiently than to T allele, and TFAP4 was predicted to bind to rs4751162 G allele more efficiently than to A allele, suggesting the alleles with increased gene expression had preferred transcription factor binding (Supplementary Fig. S1).

Figure 4
figure 4

ChIP-qPCR analysis of H3K4me3 and H3K9ac (A) and YY1 (C) binding at CAPN1 rs17583, and H3K27ac and H3K4me1 (B) and TFAP4 (D) binding at LINC00959 rs4751162. The data were analyzed using Excel. Each bar represents mean ± SEM from three independent experiments carried out in triplicate. P values by Student’s t-test.

Discussion

In this study, we investigated the association between genetic variation in histone modification regions and the prognosis in lung adenocarcinoma. We found that two genetic variants, CAPN1 rs17583C>T and LINC00959 rs4751162A>G, in the histone modification regions were associated with the survival outcomes of patients with lung adenocarcinoma who underwent curative surgery. Functional analyses showed significant difference in the promoter activity and expression level of CAPN1 and GLRX3 genes according to genotypes, supporting the association between the variants and the survival outcomes. The motif analyses and ChIP-qPCR confirmed that the variants are located in the active promoter/enhancer regions where transcription factor binding occurs, elucidating epigenetic regulatory mechanisms of gene expression by the variants. These results suggested a potential role of the two genetic variants in the pathogenesis of lung adenocarcinoma, and their usefulness as prognostic biomarkers.

Calpains (CAPNs) are a family of calcium-dependent cysteine proteinases involved in a variety of cellular processes including remodeling of cytoskeletal/membrane attachments, signal transduction, and apoptosis18. CAPN1 and CAPN2 are the two isoforms of the ubiquitous calpain, which are heterodimers of a large catalytic subunit encoded by CAPN1 and CAPN2, respectively, and a regulatory subunit (CAPN4) encoded by CAPNS119. CAPNs could cause limited cleavage or functional modulation of various substrates that act as metastatic mediators, and several studies have suggested that it plays a significant role in tumor migration and invasion20. Only a few studies have addressed the role of CAPNs in lung cancer. It has been reported that CAPN2 might promote lung cancer progression by activating EGFR/pAKT signaling pathway, and also contribute to the resistance to paclitaxel or EGFR-TKI21,22. Another study suggested that overexpression of CAPN4 was an independent prognostic factor in patients with NSCLC, and could enhance the invasive potential of lung cancer cells by upregulating the expression of matrix metalloproteinase 223. In the present study, the CAPN1 rs17583C>T was associated with significantly better clinical outcomes. The CAPN1 expression was significantly higher in lung adenocarcinoma than normal lung. Functional analyses showed that rs17583C-to-T change was associated with decreased CAPN1 mRNA expression in clinical samples and significantly decreased CAPN1 promoter activity in in vitro promoter assays. The ChIP-qPCR confirmed that CAPN1 rs17583C>T is located in an active promoter with YY1 binding. These results suggested a tumor promoting function of CAPN1 in lung adenocarcinoma, and provided evidence of the functional relationship between the genetic variant and the better clinical outcomes.

GLRX3 is an antioxidant enzyme, one of the intracellular redox-regulating molecules that contribute to maintaining cellular redox homeostasis, and is known to play an important role in cellular signal transduction in response to stress signals by reactive oxygen species24. Although the complicated roles of GLRX3 in cancer remain poorly understood, overexpression of GLRX3 was ascertained in several types of malignancy, such as nasopharyngeal carcinoma (NPC), oral squamous cell carcinoma (OSCC), colon cancer, and lung cancer25,26,27. Studies on NPC and OSCC have reported in common that knockdown of GLRX3 inhibited cell proliferation and decreased the migration and invasion capacity of cancer cells by suppressing the epithelial-mesenchymal transition, indicating the essential roles of GLRX3 in cancer progression25,26. In agreement with these results, our unpublished data indicated that knockdown of GLRX3 inhibited proliferation, migration, and invasion capacity of lung cancer cells. In this study, LINC00959 rs4751162A>G was associated with worse DFS. GLRX3 expression was significantly higher in tumor tissues than in normal lung, and rs4751162 G allele correlated with a significantly higher promoter activity of GLRX3 than the A allele. Motif analyses predicted that the rs4751162 is located in the binding motif of TFAP4, and the ChIP-qPCR confirmed the SNP was located in an active enhancer with TFAP4 binding. These results suggested a potentially oncogenic role of GLRX3. However, further investigation is needed to understand the role of CAPN1 and GLRX3 in the pathogenesis of lung adenocarcinoma and the biological mechanisms of the association between the genetic variants and clinical outcomes.

In this study, only two out of 41 SNPs were replicated in the validation cohorts. Difference in clinical characteristics of the patients between the discovery and validation cohorts, which could be a possible cause of replication failure, is a limitation of the retrospective multicenter studies. However, all the clinical variables were adjusted for in the multivariate analyses. More importantly, because the discovery study with a relatively small sample size was an exploratory study for which type II errors should be considered, we did not perform multiple testing corrections for the associations, which may have resulted in false positive associations leading to the replication failure of the associations in the validation study. In addition, the modest sample size of the validation cohort may not have optimal statistical power for replicating the associations. In this study, the observed P values did not reach a more stringent level of statistical significance to avoid false positive associations arising from multiple comparisons. Future studies with larger number of patients are required to validate our results. Nevertheless, the design of two-stage independent cohorts for the discovery and validation sets is a major strength of this study, which could reduce false-positive findings from the genetic association study28,29. The two SNPs were significantly associated with clinical outcomes in both independent cohorts, and the association had similar effect size with the same direction. In addition, the association showed higher level of significance in the combined analysis including larger population, supporting the credibility of the association. Functional analyses further supported the plausibility of our findings.

In summary, the current study demonstrated that the two genetic variants, CAPN1 rs17583C>T and LINC00959 rs4751162A>G, was associated with survival outcomes of patients with lung adenocarcinoma after surgical resection. Our results suggest that CAPN1 and GLRX3 may play important roles in the pathogenesis of lung adenocarcinoma, and that the variants may be useful in predicting the prognosis of lung adenocarcinoma after surgery, thereby helping to refine therapeutic decisions for better clinical outcomes in NSCLC. Future studies are warranted to validate our results in a larger population with diverse ethnicity.

Methods

Study population

A total of 404 patients with available genomic DNA samples, who were diagnosed with pathologic stages I, II, or IIIA (micro-invasive N2) NSCLC after curative surgical resection, were enrolled in this study. Discovery cohort comprised 166 patients whose diagnosis was made at Kyungpook National University Hospital (KNUH) between September 2001 and August 2009, and validation cohort consisted of 238 patients diagnosed with NSCLC at Seoul National University Bundang Hospital (SNUBH) between June 2005 and May 2012. All patients in this study were of Korean ethnicity. Genomic DNA samples extracted from peripheral blood lymphocytes of the patients were provided by the National Biobank of Korea, KNUH, which is supported by the Ministry of Health, Welfare and Family Affairs. Written informed consent was obtained from all patients before surgery. This study was approved by the Institutional Review Boards of the Kyungpook National University Chilgok Hospital and Seoul National University Bundang Hospital (Approval No. KNUCH 2017-07-012). All experiments were performed in accordance with relevant guidelines and regulations.

Cell culture and antibodies

H2087 cells were obtained from the American Type Culture Collection (ATCC) and H1703 cells were purchased from Korean Cell Line Bank (KCLB), Seoul, Korea. Antibodies used in this study include anti-Histone H3 antibodies (ab8580, ab4441, ab4729 and ab8895), and anti-GLRX3 antibody (ab226396) from Abcam (Cambridge, UK), anti-YY1 antibody (46395) from Cell Signaling Technology (Danvers, MA, USA), and anti-TFAP4 antibody (sc-166216X) from Santa Cruz Biotechnology (Dallas, TX, USA). Cells were cultured at 37 °C in a humidified atmosphere with 5% CO2 in Corning RPMI medium (Corning Inc., Corning, NY, USA) supplemented with 10% Corning Fetal Bovine Serum (Corning Inc., Corning, NY, USA), and 100 U/ml penicillin and 100 mg/ml streptomycin.

Chromatin immunoprecipitation (ChIP)-sequencing

ChIP assays were performed using the Pierce Magnetic ChIP kit (Thermo Fisher Scientific, Waltham, MA, USA), according to the manufacturer’s protocol. H2087 cells were crosslinked with 1% formaldehyde for 10 min, and the crosslinking was inactivated by 0.125 M glycine for 5 min at room temperature. Cells were washed with cold 1xPBS twice. The cells were lysed, sonicated to shear DNA. To immunoprecipitate protein/chromatin complexes, the diluted supernatants were incubated with 10 μg of H3K4me3 or H3K27ac antibody overnight, and then incubated for 2 h after adding 50 μl of agarose/protein A or G beads. Ten percent of the diluted supernatants were saved as “input” for normalization. Several washing steps were followed by protein digestion using proteinase K. Reverse crosslinking was carried out at 65 °C. DNA was subsequently purified. ChIPSeq library preparation was performed with TruSeq ChIP Library Preparation Kit (Illumina, San Diego, CA, USA). Sequencing was performed on an Illumina HiSeq4000. Sequence reads for each sample were aligned to the human genome using Bowtie30. The reference genome sequence of Homo sapiens (hg19) and annotation data were downloaded from the UCSC table browser (http://genome.uscs.edu). Peaks were called in the aligned sequence data using a model-based analysis of ChIP-seq (MACS2 version 2.1.0) (https://bioweb.pasteur.fr/packages/pack@macs@2.1.0)31. ChIPseeker (version 1.6.6) (http://www.bioconductor.org/packages/release/bioc/html/ChIPseeker.html)32, a bioconductor package within the statistical programming environment R to facilitate batch annotation of enriched peaks identified from ChIP-seq data, was used to identify nearby genes and transcripts from the peaks obtained from MACS2.

RNA-sequencing

Total RNAs from H2087 cells were isolated using TRIzol (Invitrogen, Carlsbad, CA, USA). Sequencing was performed on an Illumina HiSeq4000 and aligned the processed reads to the Homo sapiens (hg19) using HISAT v2.0.533. Transcript assembly and abundance estimation was performed using StringTie v1.3.3b34,35. It provides the relative abundance estimates as FPKM values (Fragments Per Kilobase of exon per Million fragments mapped) of transcript and gene expressed in each sample. FPKM values have already been normalized with respect to library size.

SNP selection and genotyping

We conducted an integrated analysis of ChIP-seq and RNA-seq for the SNP selection. As a result of the ChIP-seq using H2087 cells, SNPs within H3K4me3 and H3K27ac peak regions were selected. Next, using the FuncPred utility for functional SNP prediction in the SNPinfo web server (https://snpinfo.niehs.nih.gov/), potentially functional variants with minor allele frequency ≥ 0.1 based on the HapMap JPT data were collected after excluding those in linkage disequilibrium (r2 ≥ 0.8). And then, using RNA-seq we chose genes with high expression level (FPKM ≥ 100), and then SNPs within or closest to the genes were extracted. Genotyping was performed using iPLEX Assay and MassARRAY System (Agena Bioscience, San Diego, CA, USA). Approximately 5% of the samples were randomly selected to be genotyped again by a different investigator, by a restriction fragment length polymorphism assay, and the results were 100% concordant.

Promoter-luciferase constructs and luciferase assay

We evaluated the effect of the rs17583C>T or rs4751162A>G on the activity of the promoter of CAPN1 or GLRX3 genes by luciferase reporter assay. The rs17583C>T of CAPN1 gene is located in the region of H3K4me3 peak, which marks active promoters, in CAPN1 gene promoter. The 378 bp fragment including rs17583C>T was synthesized by polymerase chain reaction from human genomic DNA and cloned into XhoI/HindIII site of the pGL3-basic vector (Promega, Madison, WI, USA). The correct sequences of all clones were verified by DNA sequencing. The rs4751162A>G is located in the region of H3K27ac peak, which is an activation mark of enhancers, in the intron region of LINC00959 gene. The SNP is expected to regulate expression of GLRX3 gene because it resides 26 kb downstream of GLRX3 gene, although they both are on the chromosome 10. The promoter region of GLRX3 (− 980 to + 38 bp, the transcriptional start site is designated as + 1) was synthesized by polymerase chain reaction from human genomic DNA and cloned into XhoI/NcoI site of the pGL3-promoter vector (Promega, Madison, WI, USA) to generate pGL3-GLRX3pro. Two fragments including rs4751162A or rs4751162G allele of rs4751162A>G were amplified from genomic DNA sample and the 283 bp products were cloned into BamHI/SalI site of the pGL3-GLRX3pro, respectively, to generate pGL3-GLRX3pro_A and pGL3-GLRX3pro_G. The cloning PCR primers were listed Supplementary Table S3. All constructs were verified by direct sequencing before use. The H1703 cells were transfected with 200 ng of each plasmid DNA (pGL3-CAPN1_C and pGL3-CAPN1_T for rs17583, and pGL3-GLRX3pro, pGL3-GLRX3pro_A, or pGL3-GLRX3pro_G for rs4751162) and 2 ng of pRL-SV40 Vector (Promega, Madison, WI, USA) using Effectene transfection reagent (Qiagen, Hilden, Germany) according to manufacturer’s protocol. The cells were collected 48 h after transfection. Luciferase activity was measured using the Dual-Luciferase Reporter Assay System (Promega, Madison, WI, USA). Firefly luciferase activity measurements were normalized with respect to pRL-SV40 Renilla luciferase activity to correct for variations in transfection efficiency. All experiments were performed in triplicate.

RNA preparation and quantitative reverse transcription-PCR (qRT-PCR)

CAPN1 and GLRX3 mRNA expression was examined by qRT-PCR. Total RNAs from tumors and paired non-malignant lung tissues (n = 73) were isolated using TRIzol (Invitrogen, Carlsbad, CA, USA). Real-time PCR was performed using a LightCycler 480 (Roche Applied GLRX3 expression Science, Mannheim, Germany) with QuantiFast SYBR Green PCR Master Mix (Qiagen, Hilden, Germany). The real-time PCR primers for CAPN1, GLRX3 and β-actin genes were listed in Supplementary Table S3. Each sample was run in duplicate. Relative target gene mRNA expression was normalized to that of β-actin expression and then evaluated using the 2−ΔΔCt method36.

ChIP-quantitative PCR (qPCR) assay

Chromatin from H2087 cells was immunoprecipitated with the Pierce Magnetic ChIP kit (Thermo Fisher Scientific, Waltham, MA, USA), using 10 μg anti-H3K4me3, anti-H3K9ac, anti-H3K27ac, anti-H3Kme1, anti-YY1, and anti-TFAP4 antibodies and 2 μg normal rabbit IgG antibodies per immunoprecipitation reaction. Immunoprecipitated chromatin was subjected to real-time qPCR using SYBR Green PCR Master Mix (Qiagen, Hilden, Germany). The ChIP-qPCR primers were listed Supplementary Table S3. The qPCR was performed as follows: 95 ℃ 10 min, 45 cycles of 95 ℃ 15 s, 60 ℃ 1 min. ChIP-qPCR enrichment analysis were performed by Comparative Ct method. Each samples were normalized to the input and the fold difference between sample and IgG was calculated using 2(− ΔΔCt)37.

Enriched motif analysis

We identified the transcription factor binding motif enriched in the regions containing rs17583C>T or rs4751162 A>G. Motifs were analyzed using TOMTOM, a motif database scanning algorithm, of the MEME Suite web server17 for comparison against Human and Mouse38 and the SwissRegulon databases of known transcription factor motifs39.

Statistical analyses

Hardy–Weinberg equilibrium was evaluated by a goodness-of-fit χ2 test with 1 degree of freedom. Overall survival (OS) was measured from the date of surgery to the date of death or the last follow-up. Disease-free survival (DFS) was calculated from the date of surgery until first evidence of disease recurrence or last date of follow up for patients who were free of disease. Estimated survival rate was calculated using the Kaplan–Meier method. Log-rank test was used to compare the difference in OS and DFS across different genotypes. Multivariate Cox proportional hazards models were used to estimate the hazard ratio (HR) and 95% confidence intervals (CI) after adjusting for age (< 64 years vs ≥ 64 years), gender (male vs female), smoking status (never vs ever), pathological stage (I vs II–IIIA), and adjuvant therapy (yes vs no). Statistical analyses were carried out using Statistical Analysis System for Windows, version 9.4 (SAS Institute, Cary, NC, USA), Statistical Package for the Social Sciences (SPSS) 25.0 (IBM Corp., Armonk, NY, USA), and Microsoft Excel (Microsoft Corp., Redmond, WA, USA).