Analysis of the role of rs2031920 and rs3813867 polymorphisms within the cytochrome P450 2E1 gene in the risk of squamous cell carcinoma

To explore the genetic effect of the rs2031920 and rs3813867 polymorphisms within the cytochrome P450 2E1 (CYP2E1) gene on the risk of squamous cell carcinoma (SCC), a meta-analysis was performed. Eligible case–control studies were obtained by database searching and screening, and the statistical analysis was performed with STATA 12.0 software. A total of 32 case–control studies with 7435 cases and 10,466 controls were ultimately included in our meta-analysis. With regard to the rs2031920 C/T polymorphism, in comparison to controls, a reduced risk of esophageal squamous cell carcinoma (ESCC) was detected for the models of allele T vs. allele C [P = 0.025, odds ratio (OR) = 0.67], carrier T vs. carrier C (P = 0.014, OR = 0.70), TT vs. CC (P = 0.029, OR = 0.65), CT vs. CC (P = 0.040, OR = 0.56), and CT + TT vs. CC (P = 0.035, OR = 0.58). Similarly, a decreased SCC risk was observed for the rs3813867 G/C polymorphism in the allele, carrier, homozygote, dominant, and recessive models of the overall SCC meta-analysis and the “ESCC” subgroup analysis (all P < 0.05, OR < 1) and in all genetic models of the “Asian” and “population-based control (PB)” subgroup analyses (all P < 0.05, OR < 1). Additionally, for the rs2031920/rs3813867 haplotype, a decreased SCC risk was also detected in the overall SCC meta-analysis under the allele, carrier, homozygote and dominant models (all P < 0.05, OR < 1) and in the “PB” subgroup analysis under the allele, carrier, and dominant models (all P < 0.05, OR < 1). Our meta-analysis supports the “T” allele carrier of the CYP2E1 rs2031920 C/T polymorphism and the “C” allele carrier of the rs3813867 G/C polymorphism as protective factors against ESCC, especially in Asian populations.


Background
The cytochrome P450 2E1 (CYP2E1) gene in Homo sapiens is located on chromosome 10 and encodes the membrane-bound CYP2E1 protein, an important member of the human cytochrome P450 system [1]. The cytochrome P450 system works as a series of phase I enzymes participating in a group of biological events, such as drug metabolism, oxidative reactions, or the detoxification of endogenous and exogenous substances [2,3]. Polymorphic variants in the functional genes of the cytochrome P450 system are associated with the pathogenesis of several clinical cancers [2,3]. For example, rs2031920 C/T with an RsaI restriction enzyme site and rs3813867 G/C with a PstI restriction enzyme site are two common single nucleotide polymorphisms (SNPs) within the 5′-flanking region of the CYP2E1 gene [4][5][6]. Three genotypes (c1/c1, c1/c2, and c2/c2) are generated, and rs2031920 and rs3813867 are in close linkage disequilibrium [4][5][6]. Furthermore, CYP2E1 polymorphisms were reported to be linked to several cancers, such as nasopharyngeal carcinoma [7], urinary cancers [6] and head and neck carcinoma [5], particularly in Asian populations.
Squamous cell carcinoma (SCC) is the most common histological type of several clinical cancers, such as head and neck cancer, esophageal cancer, skin cancer, lung cancer, and cervical cancer [8,9]. The exact pathogenesis of SCC remains unclear. Living habits (e.g., smoking, drinking), viral infection [e.g., human papillomavirus (HPV)], immune system, and polymorphic variants with many genes may be related to the risk of different SCC diseases [10][11][12]. Previously, we conducted an updated meta-analysis to explore the impact of MDM2 (MDM2 Proto-Oncogene) polymorphisms on SCC susceptibility and found that the GG genotype of MDM2 rs2279744 polymorphism may be associated with an increased risk of esophageal SCC in Asian populations [8].
Published studies have reached different conclusions regarding the role of the rs2031920 and rs3813867 polymorphisms within the CYP2E1 gene in the risk of SCC. Given the lack of specific published meta-analyses, we investigated the role of these two polymorphisms in the susceptibility to SCC. We included a total of 32 case-control studies in our meta-analysis, which followed the preferred reporting items for systematic reviews and meta-analyses (PRISMA) [13]. The retrieved studies were reviewed and screened with the following exclusion criteria: (1) data based on animal experiments; (2) case reports, cohort studies or meeting abstracts; (3) no SNP data; (4) meta-analyses or reviews; (5) no SCC or CYP2E1 data; (6) duplicate studies; (7) no pathological typing data; (8) no genotype data. Genotype frequency data for both cases and controls had to be provided in the selected studies.

Characteristics and quality assessment
Based on the eligible articles, the authors extracted and summarized the usable information, including the first author's name, year, country, race, SNP, genotype frequency, SCC type, control source, genotyping assay, and HWE (Hardy-Weinberg equilibrium), in a table. The Newcastle-Ottawa Scale (NOS) system was also used to assess the methodological quality of individual studies. Only the studies with NOS score > 5 were ultimately included in our meta-analysis.
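The HWE status recorded for each study can be illustrated with a short sketch. This is a minimal example, assuming the commonly used chi-square goodness-of-fit test with one degree of freedom applied to the control genotype counts; the text does not state which HWE test was used, and the function name `hwe_p_value` is hypothetical.

```python
import math

def hwe_p_value(n_major_hom, n_het, n_minor_hom):
    """P value of a 1-df chi-square goodness-of-fit test for
    Hardy-Weinberg equilibrium (assumed test, not confirmed by the text)."""
    n = n_major_hom + n_het + n_minor_hom
    if n == 0:
        raise ValueError("no genotypes supplied")
    p = (2 * n_major_hom + n_het) / (2 * n)  # major allele frequency
    q = 1.0 - p                              # minor allele frequency
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_major_hom, n_het, n_minor_hom)
    # Skip cells with zero expected count (monomorphic sample).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2.0))
```

Under this sketch, a control group with a P value above 0.05 would be labeled "Y" and one at or below 0.05 would be labeled "N".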

Heterogeneity and association test
STATA software (Stata Corporation, College Station, TX, USA) was used for our heterogeneity and association tests. In the case of heterogeneity, a P value of Cochran's Q statistic < 0.05 or an I² value > 50% was considered to represent high heterogeneity among studies, which led to the use of a random effects model (DerSimonian and Laird method). Otherwise, the fixed effects model (Mantel-Haenszel statistics) was used. In the association test, the odds ratio (OR), 95% confidence interval (CI) and P value were computed to assess the association strength in the allele, carrier, homozygote, heterozygote, dominant, and recessive models. In addition, based on the factors of race, SCC type, control source and HWE, a series of subgroup analyses were performed as well.
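The model-selection rule described above can be sketched in Python. This is an illustration rather than the STATA procedure used in the study: the fixed effects estimate here uses inverse-variance weights instead of Mantel-Haenszel statistics, a 0.5 continuity correction for zero cells is assumed, and all function names are hypothetical.

```python
import math

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable with integer df >= 1."""
    if x <= 0:
        return 1.0
    z = x / 2.0
    if df % 2 == 0:
        a, sf = 1.0, math.exp(-z)              # Q(1, z)
    else:
        a, sf = 0.5, math.erfc(math.sqrt(z))   # Q(1/2, z)
    while a < df / 2.0:
        # Recurrence Q(a + 1, z) = Q(a, z) + z^a * exp(-z) / Gamma(a + 1)
        sf += z ** a * math.exp(-z) / math.gamma(a + 1.0)
        a += 1.0
    return sf

def log_odds_ratio(a, b, c, d):
    """Log OR and variance; a, b = exposed/unexposed cases, c, d = controls."""
    if min(a, b, c, d) == 0:
        a, b, c, d = (v + 0.5 for v in (a, b, c, d))  # assumed correction
    return math.log((a * d) / (b * c)), 1 / a + 1 / b + 1 / c + 1 / d

def pool_odds_ratios(tables):
    """Pool 2x2 tables; switch to DerSimonian-Laird when heterogeneity is high."""
    if len(tables) < 2:
        raise ValueError("need at least two studies")
    effects = [log_odds_ratio(*t) for t in tables]
    y = [e for e, _ in effects]
    v = [var for _, var in effects]
    w = [1.0 / var for var in v]
    fixed = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, y))  # Cochran's Q
    df = len(tables) - 1
    p_het = chi2_sf(q, df)
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    model = "fixed"
    if p_het < 0.05 or i2 > 50.0:
        model = "random"
        c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
        tau2 = max(0.0, (q - df) / c)          # between-study variance
        w = [1.0 / (var + tau2) for var in v]
    pooled = sum(wi * yi for wi, yi in zip(w, y)) / sum(w)
    se = math.sqrt(1.0 / sum(w))
    return {
        "model": model, "q": q, "p_het": p_het, "i2": i2,
        "or": math.exp(pooled),
        "ci": (math.exp(pooled - 1.96 * se), math.exp(pooled + 1.96 * se)),
        "p": math.erfc(abs(pooled / se) / math.sqrt(2.0)),  # two-sided z test
    }
```

Each tuple passed to `pool_odds_ratios` is one study's 2x2 table for a single genetic model, so the function would be called once per model and once per subgroup.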

Publication bias and sensitivity analysis
Begg's test and Egger's test were used to assess the potential publication bias. A P value larger than 0.05 indicated the absence of potential publication bias. In addition, sensitivity analysis was used to evaluate the data stability and possible sources of heterogeneity.
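Egger's test can be sketched as a linear regression of the standardized effect on precision, where an intercept far from zero suggests funnel-plot asymmetry. The following is a minimal illustration with a hypothetical function name; it returns the intercept and its t statistic, and the P value would come from a t distribution with k − 2 degrees of freedom, which is left out here because the standard library has no t distribution.

```python
import math

def egger_regression(effects, std_errors):
    """Regress effect/SE on 1/SE; return intercept, slope and t statistic."""
    k = len(effects)
    if k < 3 or k != len(std_errors):
        raise ValueError("need at least three studies with matching lengths")
    x = [1.0 / se for se in std_errors]                 # precision
    y = [e / se for e, se in zip(effects, std_errors)]  # standardized effect
    x_mean = sum(x) / k
    y_mean = sum(y) / k
    sxx = sum((xi - x_mean) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all standard errors are identical")
    sxy = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    # Residual variance with k - 2 degrees of freedom.
    sse = sum((yi - intercept - slope * xi) ** 2 for xi, yi in zip(x, y))
    se_intercept = math.sqrt(sse / (k - 2) * (1.0 / k + x_mean ** 2 / sxx))
    t_stat = intercept / se_intercept if se_intercept > 0 else math.nan
    return {"intercept": intercept, "slope": slope,
            "se_intercept": se_intercept, "t": t_stat, "df": k - 2}
```

Here `effects` would be the per-study log odds ratios and `std_errors` their standard errors.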

Process for identifying eligible studies
After our initial database retrieval, a total of 393 records [PubMed (n = 89), Web of Science (n = 161), Cochrane (n = 1), Scopus (n = 116) and CNKI (n = 26)] were obtained, as presented in Fig. 1. Then, 113 duplicate records were excluded. Based on the exclusion criteria, 223 records were removed. Moreover, the lack of confirmed pathological typing data or genotype frequency distribution resulted in the exclusion of another 25 articles. Finally, our meta-analysis involved a total of 32 articles containing 7435 cases and 10,466 controls. The characteristics of each study are presented in Table 1.
No study had poor quality; the NOS score of all studies was greater than five (Table 1).
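The selection flow can be checked with simple arithmetic. The sketch below uses only the counts reported in the text; the variable names are illustrative.

```python
# Records retrieved per database, as reported in the text.
records = {"PubMed": 89, "Web of Science": 161, "Cochrane": 1,
           "Scopus": 116, "CNKI": 26}

retrieved = sum(records.values())          # initial retrieval
after_duplicates = retrieved - 113         # duplicates removed
after_criteria = after_duplicates - 223    # removed by exclusion criteria
included = after_criteria - 25             # lacking typing or genotype data

print(retrieved, after_duplicates, after_criteria, included)  # 393 280 57 32
```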

The rs2031920 polymorphism
A meta-analysis of rs2031920 and SCC risk was conducted on the allele model (allele T vs. allele C), carrier model (carrier T vs. carrier C), homozygote model (TT vs. CC), heterozygote model (CT vs. CC), dominant model (CT + TT vs. CC), and recessive model (TT vs. CC + CT). As shown in Table 2, 18 case-control studies were enrolled for the allele, carrier, and heterozygote models, 15 case-control studies were enrolled for the homozygote model, 21 case-control studies were enrolled for the dominant model, and 16 case-control studies were enrolled for the recessive model. The pooled results suggested that there was no statistically significant difference in the overall SCC risk between the case and control groups under any model (Table 2, P value of association test > 0.05). Moreover, we conducted a statistical analysis of the subgroups of race (Asian/Caucasian), SCC type (HNSCC/ESCC/LSCC), control source (PB/HB), and HWE (Y/N). As shown in Table 2, in comparison with controls, a reduced ESCC risk was observed in the models of allele T vs. allele C (P = 0.025, OR = 0.67), carrier T vs. carrier C (P = 0.014, OR = 0.70), TT vs. CC (P = 0.029, OR = 0.65), CT vs. CC (P = 0.040, OR = 0.56), and CT + TT vs. CC (P = 0.035, OR = 0.58). Figure 2a shows forest plot data in subgroup analysis by SCC type under the allele model. The "T" allele carrier of the rs2031920 polymorphism within the CYP2E1 gene seems to be linked to a reduced ESCC risk.
Abbreviations: OR, odds ratio; CI, confidence interval; HNSCC, head and neck squamous cell carcinoma; ESCC, esophageal squamous cell carcinoma; LSCC, lung squamous cell carcinoma; PB, population-based control; HB, hospital-based control; Y, P value of Hardy-Weinberg equilibrium > 0.05; N, P value of Hardy-Weinberg equilibrium < 0.05
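The six genetic contrasts listed for rs2031920 can be built from the genotype counts of a single study, as sketched below. The carrier model is assumed here to compare T carriers (CT + TT) with C carriers (CC + CT), which is one common reading of "carrier T vs. carrier C"; the function names and the example counts are hypothetical.

```python
def model_tables(case, control):
    """Return {model: (a, b, c, d)} where a, b are exposed/reference counts
    in cases and c, d are the same counts in controls."""
    def split(g):
        cc, ct, tt = g["CC"], g["CT"], g["TT"]
        return {
            "allele": (2 * tt + ct, 2 * cc + ct),  # allele T vs. allele C
            "carrier": (ct + tt, cc + ct),         # assumed carrier definition
            "homozygote": (tt, cc),                # TT vs. CC
            "heterozygote": (ct, cc),              # CT vs. CC
            "dominant": (ct + tt, cc),             # CT + TT vs. CC
            "recessive": (tt, cc + ct),            # TT vs. CC + CT
        }
    cases, controls = split(case), split(control)
    return {m: cases[m] + controls[m] for m in cases}

def odds_ratio(a, b, c, d):
    """Crude odds ratio of a 2x2 table (no zero-cell handling)."""
    return (a * d) / (b * c)

# Made-up genotype counts, for illustration only.
tables = model_tables({"CC": 50, "CT": 40, "TT": 10},
                      {"CC": 40, "CT": 40, "TT": 20})
```

An OR below 1 in any of these tables corresponds to the "reduced risk" direction reported for the T allele.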

The rs3813867 polymorphism
We also conducted the overall and subgroup meta-analysis of rs3813867 and SCC risk under the allele (10 case-control studies), carrier (10 case-control studies), homozygote (6 case-control studies), heterozygote (10 case-control studies), dominant (11 case-control studies), and recessive (6 case-control studies) models. Positive results regarding the association between CYP2E1 rs3813867 and SCC risk were detected in the overall SCC meta-analysis and the subgroup analyses of "ESCC" and "Y" (P value of Hardy-Weinberg equilibrium > 0.05) under all genetic models (Table 3, all P < 0.05, OR < 1) except the heterozygote model (P = 0.150). A decreased SCC risk was also detected in the subgroup analyses of "Asian" and "PB" under all genetic models (Table 3, all P < 0.05, OR < 1). Figure 3a shows the forest plot data of subgroup analysis by SCC type under the allele model. The "C" allele carrier of the CYP2E1 rs3813867 polymorphism may be associated with a decreased risk of SCC, especially ESCC in Asian populations.

The rs2031920/rs3813867 haplotype
The results of the overall and subgroup meta-analysis of the rs2031920/rs3813867 haplotype and SCC risk under the allele (five case-control studies), carrier (five studies), homozygote (three studies), heterozygote (five studies), dominant (seven studies), and recessive (three studies) models are shown in Table 4. We observed a decreased SCC risk in the overall SCC meta-analysis under the allele, carrier, homozygote, and dominant models (Table 4, all P < 0.05, OR < 1), and in the subgroup analysis of "PB" under the allele, carrier, and dominant models (all P < 0.05, OR < 1). These results suggested a potential link between the c1/c2 or c2/c2 genotype of the rs2031920/rs3813867 haplotype and SCC risk, which still requires more case-control studies.

Heterogeneity evaluation
When assessing the heterogeneity level, the fixed effects model was used for the TT vs. CC model of rs2031920 due to the lack of high heterogeneity (Table 5, I² = 38.3%, P value of heterogeneity = 0.066); the random effects model was used for the others. The fixed effects model was also used for the allele, carrier, homozygote and recessive models of rs3813867 (Table 5, all I² < 50.0%, P value of heterogeneity > 0.05) and for the allele, carrier, homozygote, dominant, and recessive models of the rs2031920/rs3813867 haplotype (Table 5, all I² < 50.0%, P value of heterogeneity > 0.05).

Publication bias and sensitivity analysis
Begg's and Egger's tests did not provide evidence of obvious publication bias in the above comparisons (Table 5, all P values of Begg's test and Egger's test > 0.05) apart from the CT + TT vs. CC model of rs2031920 (P value of Egger's test = 0.037). Figures 2b and 3b show the Egger's publication bias plots of rs2031920 and rs3813867 under the allele model, respectively. Additionally, the sensitivity analysis yielded a relatively stable conclusion (Fig. 2c for allele model of rs2031920; Fig. 3c for allele model of rs3813867; data for others not shown).

Discussion
CYP2E1 rs2031920 was related to the risk of ESCC in a high-incidence region (Kashmir, India) [15]. Nevertheless, negative results were also reported in one study from South Africa [29] and in a Huai'an population from China [34]. A meta-analysis can help resolve such conflicting findings. We did not find published meta-analyses specifically addressing the genetic relationship between the CYP2E1 rs2031920 and rs3813867 SNPs and ESCC risk. In this study, we provide evidence that the "T" allele carrier of the rs2031920 polymorphism and the "C" allele carrier of the CYP2E1 rs3813867 polymorphism may be associated with a decreased risk of ESCC, especially in Asian populations because most of the included case-control studies were from China or India. Tang et al. [46] selected 21 case-control studies for a meta-analysis in 2010 and investigated the potential effect of CYP2E1 rs2031920 and rs3813867 on the risk of head and neck cancer; they found that the homozygote genotype of CYP2E1 rs2031920/rs3813867 may be linked to the risk of head and neck cancer, especially in Asian populations. Zhuo et al. [5] performed another meta-analysis containing 43 case-control studies in 2016 and reported a positive association between CYP2E1 rs2031920/rs3813867 and head and neck cancer risk under the homozygote model. However, a subgroup analysis based on HNSCC was not performed in the two meta-analyses. In our meta-analysis, we failed to observe the statistical relationship between CYP2E1 rs2031920   [18] selected 17 case-control studies with 2639 cases and 3450 controls for a meta-analysis of the association between CYP2E1 rs3813867 and the risk of lung cancer in the Chinese population in 2014, and showed a potential link between the "C" allele carriers of CYP2E1 rs3813867 and a decreased risk of lung cancer. In our meta-analysis, very limited data were included after our strict selection; thus, no statistical evidence regarding the role of CYP2E1 rs3813867 in LSCC risk was provided.
However, we enrolled five case-control studies [26-28, 33, 39] in our subgroup analysis of "LSCC" for CYP2E1 rs2031920 and found no statistically significant association, which was partly in line with the previous data from LSCC subgroup analysis [47].
Close linkage disequilibrium between rs2031920 and rs3813867 within the CYP2E1 gene was reported [4][5][6]. For example, the same genotype frequency distribution was observed for both SNPs in the case and control groups of south Indians [14]. However, other reports showed different genotype frequency distributions for the two SNPs in the case and control groups [29,45]. For example, in the Taihang regions of China, the genotype frequency of rs2031920 differs from that of rs3813867 in both the case and control groups [45]. In addition, most case-control studies only measured a single SNP. Thus, we performed separate meta-analyses of rs2031920 and rs3813867; then, we analyzed the role of the rs2031920/rs3813867 haplotype based on the available data. We also conducted an overall and subgroup meta-analysis with four factors (race, SCC type, control source and HWE) under the allele, carrier, heterozygote and dominant models.
To enroll as many eligible case-control studies as possible, a search of five independent online databases (PubMed, Web of Science, Cochrane, Scopus and CNKI) was performed using the overall SCC terms and specific terms, such as ESCC, HNSCC, LSCC and SSCC. Based on our strict criteria, we removed the articles that contained unconfirmed pathological typing information or failed to provide a genotype frequency distribution in both the case and control groups. We observed the absence of large publication bias and the stability of data through Begg's/Egger's tests and sensitivity analyses.
Despite this, the small sample size may still have affected our statistical power. Only one case-control study [38] was included in the "cervical SCC" subgroup analysis of rs2031920 under the allele, carrier, homozygote, heterozygote, and recessive models. Only one case-control study [18] was enrolled in the "lung SCC" subgroup analysis of rs3813867 under all genetic models. Only two studies [25,36] were enrolled in the "ESCC" subgroup analysis of the rs2031920/rs3813867 haplotype.
In this study, we focused on the genetic role of two polymorphisms within the CYP2E1 gene in our meta-analysis, and we still cannot rule out the potential genetic effect of other CYP2E1 polymorphisms (e.g., rs6413432 T/A) and the variant combination between CYP2E1 and other related genes (e.g., MDM2).
For rs3813867, we did not observe obvious heterogeneity in the allele, carrier, homozygote and recessive models; the heterozygote model was the exception. Reduced heterogeneity levels were also observed in the ESCC subgroup analysis compared to the overall analysis. For example, in the allele model, a relatively high heterogeneity level in the overall meta-analysis (P value of heterogeneity = 0.054, I² = 46.1%) changed to a lower heterogeneity level in the ESCC subgroup (P value of heterogeneity = 0.517, I² = 0.0%). A slight reduction was also observed for the heterozygote model (P value of heterogeneity from 0.026 to 0.101, I² value from 52.4 to 51.9%), even though significant between-study heterogeneity existed in the ESCC subgroup. We thus performed another meta-analysis, which only enrolled the available case-control studies of ESCC, and similar results were obtained (data not shown).
In addition, we observed remarkable heterogeneity for the allele, carrier, heterozygote, dominant and recessive models of rs2031920. Although a stable result was detected in the sensitivity analysis, no decreased heterogeneity level was observed in the ESCC subgroup compared with the overall meta-analysis. This suggested that mixed factors contributed to the heterogeneity within the ESCC subgroup. We tried to analyze the clinical characteristics, such as gender, age or concomitant pathologies, within the enrolled case-control studies. However, only six eligible case-control studies were included in the ESCC subgroup, and the adjustment data were too limited for categorization. A larger sample size is required to conduct a more in-depth analysis.

Conclusions
In conclusion, our meta-analysis data demonstrated that the CYP2E1 rs2031920 and rs3813867 polymorphisms may be associated with the risk of ESCC. However, this conclusion should be confirmed with more extractable case-control studies.