Significant Role of Estrogen and Progesterone Receptor Sequence Variants in Gallbladder Cancer Predisposition: A Multi-Analytical Strategy

Background Carcinoma of the gallbladder (GBC) is an aggressive malignancy. The higher incidence of gallbladder cancer in women has been partly attributed to hormonal factors. The present study was therefore designed to explore the role of genetic variants in the estrogen (ESR1, ESR2) and progesterone (PGR) receptor genes in conferring risk of gallbladder cancer. Materials and Methods The present case-control study recruited a total of 860 subjects, including 410 GBC patients, 230 gallstone patients and 220 controls. We examined the associations of 6 selected polymorphisms in three genes, ESR1 (rs2234693, rs9340799, rs1801132), ESR2 (rs1271572, rs1256049) and PGR (rs1042838), with GBC risk. Genotyping for all the polymorphisms was done using PCR-RFLP. Multifactor dimensionality reduction (MDR) and classification and regression tree (CART) approaches were combined with logistic regression to discover high-order gene-gene interactions in the hormonal pathway. Results On comparing the genotype frequency distributions in gallstone and GBC patients with that of healthy subjects, the homozygous variant genotype of the ESR1-397TT (rs2234693) polymorphism showed significant risk for developing gallstones [odds ratio (OR) = 2.9] and GBC [OR = 1.8]. Detailed haplotype analysis suggested that the ESR1 T(rs2234693) G(rs9340799) C(rs1801132) haplotype was significantly associated with risk for both gallstones [OR = 2.2] and GBC [OR = 3.0]. However, the variant-containing genotypes (DI+II) of PGR (rs1042838) showed low risk in both GBC [OR = 0.4] and gallstone patients [OR = 0.4]. In the MDR analysis, ESR1 IVS1-397C>T, ESR1 IVS1-351A>G, and ESR2 -789 A>C yielded the highest testing accuracy of 0.634. These results were further supported by the CART analysis, which revealed that individuals with the combined genotypes of ESR1-397 CT or TT, ESR1-351 AG or GG and ESR2 -789 AA had the highest risk for GBC [OR = 3.9].
Conclusion Using multi-analytical approaches, our study showed an important role of the ESR1 IVS1-397C>T, ESR1 IVS1-351A>G, and ESR2 -789 A>C variants in GBC susceptibility, and the risk appears to be mediated through a gallstone-dependent pathway.


Introduction
Carcinoma of the gallbladder is a highly fatal disease with late diagnosis, limited treatment options and poor prognosis [1]. It is the most common malignant lesion of the biliary tract and the sixth most common among malignant neoplasms of the digestive tract [1,2,3]. A study from the Utah Cancer Registry (UCR) and the Swedish Family-Cancer Database reported familial clustering of GBC [4]. Compelling evidence also exists for the role of a family history of gallstones in gallbladder cancer etiology [5,6]. Moreover, the incidence of gallbladder carcinoma varies with sex and ethnicity, and the highest frequencies are reported in women among Native Americans and in South America and North India [7].
A multifaceted interplay between hormones, metabolic alterations, infections, and even anatomical anomalies has been implicated in the etiology of gallbladder carcinoma [1]. Moreover, numerous epidemiological studies have shown a strong association of GBC with cholesterol gallstone disease [8] and with many of its risk factors, such as obesity, high carbohydrate intake, and female sex [9]. In postmenopausal women, hormone replacement therapy significantly increases the risk of gallbladder diseases [10,11], suggesting a noteworthy role of sex hormones in the etiology of GBC [12,13,14,15,16]. Also, women are two to six times more frequently affected than men. Altogether, this evidence has raised the possibility that sex steroids (estrogen and progesterone) could play a key pathophysiological role in the development of gallbladder carcinoma. Estrogen and progesterone act on target tissues by binding to their respective receptors. The estrogen receptor (ESR1, ESR2) and progesterone receptor (PGR) genes are located on chromosomes 6q25.1, 14q21-22 and 11q22, respectively, and their expression has been detected in normal human gallbladder mucosa [15] and in GBC [16]. Allelic variants of estrogen receptor genes have been shown to be associated with susceptibility to or progression of various disorders such as myocardial infarction [17], cholesterol gallstones and biliary tract diseases [18]. Moreover, functional assays suggest that the ESR1 IVS1-397C>T polymorphism affects a binding site for the myb family of transcription factors [19,20], and the polymorphism has also been studied in various breast cancer association studies [17,21,22]. In contrast, progesterone plays an important role in regulating the level of estrogen and providing protection against several cancers [23,24].
Given the potential hormonal role in gallbladder diseases, and the previously explored role of ESR/PGR polymorphisms in female-related cancers, we hypothesized that genetic variants in the ESR1, ESR2 and PGR genes may have a significant impact on the risk of gallbladder cancer. Therefore, in the present study, we investigated a panel of 6 well-studied polymorphisms in the ESR1, ESR2 and PGR genes in a case-control design involving 410 GBC patients, 230 gallstone patients and 220 cancer/gallstone-free controls from North India. In addition to logistic regression (LR), two non-parametric approaches, multifactor dimensionality reduction (MDR) and classification and regression tree (CART) analysis, were applied to explore high-order gene-gene interactions in modulating the risk of gallbladder cancer. To the best of our knowledge, this is the first report to investigate the role of genetic variants in hormonal receptor genes using a multi-analytic approach to define individual risk profiles for gallbladder cancer and gallstone disease.

Population Characteristics
The demographic characteristics of GBC and gallstone patients and their age- and gender-matched controls are presented in Table 1. The mean ages of the three groups were comparable and showed no statistically significant differences. More than 90% of the GBC patients were in advanced stages of cancer (stages III and IV), and gallstones were present in 50.5% of GBC patients. About 31% of the GBC patients reported tobacco use in some form (smoking, chewing, or both). During data collection it was observed that the majority of the female patients were housewives and that male patients were not engaged in any hazardous occupations. All cancer and gallstone patients were incident cases, and none of the controls had a family history of cancer.

ESR1 and PGR Polymorphisms and Modulation of Risk in the Presence of Gallstones
Since gallstones are present in more than 50% of GBC patients, the cancer cases were divided into two groups on the basis of the presence or absence of accompanying gallstones and compared independently with controls (Table 3). GBC patients with accompanying gallstones were found to have a higher risk of developing the disease with ESR1 IVS1-397 CT+TT (rs2234693) genotypes (p = 0.002; OR = 1.6; Table 3). In contrast, ESR1 Ex4-122C>G (rs1801132) was not significantly associated with risk in either subgroup. For PGR ins/del (rs1042838), a protective effect was observed in GBC patients irrespective of their gallstone status (Table 3). However, on comparing the GBC patients having gallstones with gallstone patients (no cancer), none of the studied polymorphisms of ESR1, ESR2 and PGR showed an association, at either the genotypic or the allelic level (data not shown).

Linkage Disequilibrium and Haplotypes Analysis of ESR1 in Case and Control Groups
On LD analysis, ESR1 rs2234693 and rs9340799 were found to be in linkage disequilibrium (D' = 0.575). Haplotypes were constructed for the three polymorphisms in the ESR1 gene: IVS1-397C>T (rs2234693), IVS1-351A>G (rs9340799) and Ex4-122C>G (rs1801132). The haplotype comprising the wild-type alleles was taken as the reference, and differences in haplotype frequencies between patients and controls were tested using the chi-square test.
Haplotype analysis of the three studied ESR1 polymorphisms revealed that the frequency of the T(rs2234693) G(rs9340799) C(rs1801132) haplotype was significantly higher in both GBC patients (27.5% vs. 13.7%) and gallstone patients (25.1% vs. 13.7%) than in controls, conferring high risk for GBC (p < 0.0001; OR = 3.0; Table 4) and gallstone disease (p = 0.0012; OR = 2.2; Table 4). Global haplotype analysis indicated a statistically significant difference between GBC cases and controls in the distribution of the ESR1 haplotypes (p < 0.001). However, none of the ESR2 haplotypes was found to be associated with GBC or gallstone risk.
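For readers unfamiliar with the D' statistic quoted in the LD analysis, the following is a minimal sketch of Lewontin's normalized disequilibrium coefficient for two biallelic loci. It is an illustration only and is not the routine used by the haplotype software in this study; the function name `d_prime` and the input frequencies in the usage line are hypothetical.

```python
def d_prime(p_ab, p_a, p_b):
    """Lewontin's D' for two biallelic loci.

    p_ab : frequency of the haplotype carrying allele A (locus 1) and B (locus 2)
    p_a  : frequency of allele A at locus 1
    p_b  : frequency of allele B at locus 2
    """
    # Raw disequilibrium: observed haplotype frequency minus the
    # frequency expected if the two loci were independent.
    d = p_ab - p_a * p_b
    # The largest |D| attainable depends on the sign of D.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    if d_max == 0:
        return 0.0  # a monomorphic locus carries no LD information
    return abs(d) / d_max


# Hypothetical frequencies, chosen only to exercise the function.
example = d_prime(p_ab=0.4, p_a=0.5, p_b=0.5)
```

A value of 1 indicates that at least one of the four possible haplotypes is absent, while 0 indicates independence between the loci.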

Gene-gene Interaction
As there may be significant interactions between PGR, ESR1 and ESR2, an overall gene-gene interaction analysis was performed. The results shown in Table 5 revealed significant interactions among specific variants of the three genes, with an overall interaction p value < 0.0001.

Association of High-order Interactions with GBC Risk by MDR Analysis
Our earlier results have shown the involvement of high-order gene-gene interactions in DNA repair and inflammatory pathways in GBC susceptibility [26]. Therefore, we looked for such interactions among the genetic variants of hormonal receptor genes using MDR and CART analysis. Table 6 shows the best interaction models from the MDR analysis. The best one-factor model for predicting GBC risk was the ESR1 IVS1-397C>T SNP (testing accuracy = 0.519, CVC = 10/10, permutation p = 0.025). The best two-factor model, ESR1 IVS1-351A>G and ESR2 -789 A>C, had an improved testing accuracy of 0.564 (permutation p < 0.001); however, the CVC decreased (6/10). The best interaction model was the three-factor model including the ESR1 IVS1-397C>T, ESR1 IVS1-351A>G and ESR2 -789 A>C SNPs, which yielded the highest testing accuracy of 0.634 and the maximal CVC of 10/10 (permutation p < 0.001). The four-factor model consisting of ESR1 IVS1-397C>T, IVS1-351A>G, Ex4-122C>G and ESR2 -789 A>C also improved testing accuracy compared with the one-factor model (CVC = 10/10, permutation p < 0.001). The three SNPs identified in the best interaction model, ESR1 IVS1-397C>T, IVS1-351A>G and ESR2 -789 A>C, were combined and dichotomized according to the MDR software. Individuals in the combined risk stratum had a 4.0-fold increased risk for GBC (p < 0.001). Furthermore, the combined effect of ESR1 IVS1-397C>T, IVS1-351A>G and ESR2 -789 A>C was evaluated by logistic regression analysis (Table 7), with ESR1 IVS1-397TT, IVS1-351GG and ESR2 -789 AA as the risk genotypes. Subjects were categorized into four groups based on the number of risk genotypes they carried, and those without any risk genotype were designated as the reference group. The p values for individuals carrying one and two risk genotypes were 0.46 and 0.015, respectively. However, the p value and OR for three risk genotypes could not be ascertained because this combination was absent in the controls (Table 7). These results suggest a significant gene dosage effect of ESR1 IVS1-397C>T, IVS1-351A>G and ESR2 -789 A>C.

Association of High-order Interactions with GBC Risk by CART Analysis
The final tree generated by the CART analysis is summarized in Table 8. Consistent with the best one-factor MDR model, the initial split of the root node on the decision tree was ESR1 IVS1-397C>T, suggesting that this SNP is the strongest risk factor for GBC among the polymorphisms examined. Further inspection of the tree structure revealed distinct interaction patterns between individuals carrying the ESR1 IVS1-397 CT or TT genotypes and those with the ESR2 -789 AC or CC genotypes. Individuals carrying the ESR1 IVS1-397 CC, ESR1 Ex4-122 CC and ESR2 -789 AC or AA genotypes had the lowest case rate (33.3%) and were taken as the reference. Using this terminal node comprising the ESR1 Ex4-122 CC genotype carriers as the reference, individuals carrying the ESR1 -397 CT or TT, ESR1 -351 AG or GG and ESR2 -789 AC genotypes exhibited a significantly higher risk for GBC (adjusted OR 3.6; 95% CI, 1.7-9.1), whereas individuals with the combined genotypes of ESR1 -397 CT or TT, ESR1 -351 AG or GG and ESR2 -789 AA had the highest risk for GBC (adjusted OR 3.9; 95% CI, 2.0-9.8) (Table 8). Thus, combining the single-locus, CART and MDR analyses, we found that single genetic variants in either ESR1 or ESR2 may not be responsible for conferring high risk of disease; rather, higher-order gene-gene interactions are likely to be involved in genetic susceptibility to GBC.

In-silico Analysis of Genetic Variants on Gene Activity
As the SNPs are located in non-coding sequences, it was plausible that the SNPs may have influence on transcription of the gene. In-silico analysis using FAST-SNP and F-SNP showed   (Table 9).

Discussion
Genetic differences in sex hormone genes may affect their respective activities, thereby causing inter-individual variation in the propensity to GBC. Also, the strong female predominance has raised the likelihood that estrogens may play a key pathophysiological role in the progression of gallbladder cancer [27]. In addition, expression and functional studies have shown direct interactions between ESR and PGR receptor domains [28,29]. There is evidence that patients with cholelithiasis but without a family history of gallstones had a 21-fold risk of GBC, while those with both gallstones and a positive family history had a 57-fold higher risk [6].
In the present study, we applied a multi-analytic strategy combining LR, MDR and CART approaches to systematically examine the associations between GBC risk and a panel of genetic polymorphisms involved in hormonal pathway.
In the single-locus analysis, the ESR1 IVS1-397C>T (rs2234693) polymorphism showed a significant association with GBC risk. Our results from the LR, MDR and CART analyses also consistently suggested that the ESR1 IVS1-397C>T polymorphism is the most important single susceptibility factor for GBC development. Moreover, gene-gene interaction analysis showed significant interactions between these hormonal variants. In addition, the multi-analytic strategies revealed higher-order gene-gene interactions among the ESR1 IVS1-397C>T, IVS1-351A>G and ESR2 -789 A>C polymorphisms in GBC risk.
On detailed haplotype analysis, we found that the ESR1 haplotype IVS1-397T, IVS1-351G, Ex4-122C conferred increased risk for both GBC and gallstones, indicating that this ESR1 haplotype is a risk factor. Carriers of the haplotype had a 3.04 times increased risk of gallbladder carcinoma compared with non-carriers, whereas the same haplotype conferred a 2.2 times increased risk for gallstone disease. In contrast, Alu insertion (35) and reported an association with ESR1 (rs1801132) and ESR2 (rs1255953) variants in GBC risk, but we did not observe any significant association with these exonic polymorphisms. Instead, we found associations with the intronic (ESR1) and promoter (ESR2) polymorphisms. The reason behind this discrepancy could be population variation: the allelic frequencies of the ESR1 and ESR2 polymorphisms were not comparable between the two studies. Also, the total number of GBC cases enrolled by Park et al. was smaller than in the present study, which raises the chance of a type II (β) error in their study. It is also possible that the SNPs of ESR1 and ESR2 do not confer direct effects on GBC susceptibility and that their effects are mediated through linkage to some key functional polymorphisms.
The association between ESR1 polymorphisms and the risk of GBC/gallstones is biologically plausible. Animal studies have shown that ESRs are present in the hepato-pancreatic-biliary tree [30,31,32], including bile duct epithelial cells and the gallbladder, suggesting that estrogens may play a role in gallbladder diseases. In addition, immunohistochemical and quantitative RT-PCR studies have revealed that the expression level of the ESR1 gene is approximately 50-fold higher than that of ESR2 [33]. In animal models, gallstone formation promoted by 17beta-estradiol involves upregulation of hepatic expression of ERalpha but not ERbeta, and the lithogenic actions of estrogen can be blocked completely by the antiestrogenic agent ICI 182,780 [34]. These studies show that ESR1 is a key player, and the findings may offer a new approach to treating gallstones and gallbladder cancer by inhibiting hepatic ER activity with a liver-specific, ERalpha-selective antagonist. Some studies have highlighted a significant role of ESR2 rs1271572 in the risk of ovarian cancer [35,36]. Moreover, a study by the MARIE-GENICA Consortium suggested that higher risk was observed in subjects having combined genotypes of both the ESR1 and ESR2 genes, which modified the risk associated with estrogen monotherapy in breast cancer [37]. Similarly, in our study, gene-gene interaction and multi-analytical approaches revealed that GBC risk was higher when the variant genotypes of ESR1 and the wild-type ESR2 -789 A>C genotype were present in combination.
It may also be mentioned that genetic variants in several other genes of estrogen biosynthesis, transport and metabolism have shown significant associations with both gallstone disease and biliary tract cancer, possibly by modulating hormone metabolism [18]. The IVS1-397C>T and IVS1-351A>G polymorphisms have been an important area of research in diseases such as osteoporosis [22,38,39], cardiovascular disease [40] and cancer [41]. A number of hypotheses for the functional significance of these polymorphisms have been reported in the literature. Given their location, 397 and 351 base pairs upstream from the start of exon 2, possible functional mechanisms include altered ESR1 expression through differential binding of transcription factors and effects on alternative splicing. Our in-silico studies further support the influence of genetic variants of the estrogen receptor on gene transcription and splicing mechanisms. Thus, estrogen could enhance cholesterol cholelithogenesis by augmenting the functions of estrogen receptors in the liver and gallbladder [42]. Considering the importance of hormonal receptors in gallbladder function, the hypothesized variants may result in cholesterol gallstones followed by chronic inflammation and further intense metaplasia, ultimately progressing to GBC, which in turn may be categorized as the gallstone-dependent pathway of estrogen receptors. For the progesterone receptor polymorphism, very little is known about the functional mechanism by which the Alu insertion in the progesterone receptor protects individuals from gallbladder carcinoma and gallstones. The PROGINS allele codes for a PGR that carries a 306 bp Alu insertion in intron G. PGR ins/del is in perfect linkage disequilibrium (D' = 1.0) with the V660L polymorphism (rs1042838) [43]. The insert-carrying allele (PGR I) exhibits higher mRNA stability and is translated into a more stable and transcriptionally active protein [44].
The PROGINS insertion allele has been reported as inversely correlated with risk of breast cancer [45,46,47] and endometriosis [48] in some populations [49,50,51]. It is believed that increased PGR may inhibit the mitogenic activity of insulin-like growth factors (IGFs), possibly through the regulation of Insulin-like growth factor-binding protein 1 (IBP-1) and thus influence cancer risk [52,53,54].

Study Limitations
Although the sample size in the present study is sufficient to yield 80% power, it is limited for subgroup analyses. Therefore, the findings require confirmation in larger cohorts. Because this is an association study, we cannot rule out the presence of possible linkage disequilibrium with other neighboring genes that might explain the significant association with gallbladder cancer phenotypes or adverse prognosis.

Ethics Statement
The study protocol was approved by the institutional ethical committee of Sanjay Gandhi Post Graduate Institute of Medical Sciences (SGPGIMS), and the authors followed the norms of the World Medical Association Declaration of Helsinki. All participants provided written informed consent for the study.

Study Population
The present case-control study recruited a total of 860 subjects, including 410 GBC patients (among them 230 previously reported GBC cases [26]), 230 gallstone patients (GS) and 220 healthy subjects. All subjects were unrelated and of North Indian ethnicity. Patients were consecutively diagnosed between June 2006 and September 2011 at the Dept. of Gastro-surgery, Sanjay Gandhi Post Graduate Institute of Medical Sciences and the Dept. of Surgical Oncology, CSMMU Lucknow, India. GBC was defined as a tumor arising in the innermost (mucosal) layer and spreading through the visceral peritoneum (the tissue that covers the gallbladder) and/or to the liver and/or to one nearby organ (such as the stomach, small intestine, colon, pancreas, or bile ducts outside the liver), and to nearby lymph nodes. Cancer diagnosis for all cases was confirmed by fine needle aspiration cytology (FNAC) and histopathology, yielding a response rate of 94%. Staging of cancer was documented according to the AJCC/UICC staging system [55]. In gallstone disease, symptomatic gallstones were detected by transabdominal ultrasonography. The healthy controls were recruited from unrelated individuals in the general population who were free of any malignancy. Individuals with silent gallstones detected by ultrasonography were excluded from the controls. The controls were frequency matched to patients for age (±5 years) and sex. At recruitment, informed consent was obtained from each subject, and information on demographic characteristics such as sex, age and smoking habit was collected by questionnaire.

DNA Samples and Genotyping
On the basis of previous functional and epidemiological studies [17], we selected a total of 6 literature-defined functional polymorphisms in three important genes encoding the estrogen and progesterone receptors. Candidate polymorphisms were chosen based on the following: (a) an allele frequency of over five percent in published literature or databases [56], (b) validated allelic substitutions, and/or (c) functional changes linked with the allelic substitution reported in the literature. These included three single nucleotide polymorphisms (SNPs) in the estrogen receptor 1 gene (ESR1: IVS1-397C>T rs2234693, IVS1-351A>G rs9340799 and Ex4-122C>G rs1801132), two SNPs in the estrogen receptor 2 gene (ESR2: -789 A>C rs1271572, 1082 G>A rs1256049) and one polymorphism in the progesterone receptor gene (PGR: ins/del rs1042838). The blood sample and the clinical details were collected from each participant at recruitment. Genomic DNA was extracted from leukocytes in 5 ml of peripheral blood according to the standard salting-out method [57]. The polymorphisms were genotyped using PCR or the PCR-restriction fragment length polymorphism method as described earlier by Lai et al. [58] and Rowe et al. [59]. The digested PCR fragments were separated on polyacrylamide gel, stained with ethidium bromide and visualized with an ultraviolet imaging system (Bio-Rad Model). Genotyping was performed without knowledge of case or control status. A masked, random 10% sample of cases and controls was tested twice by different laboratory personnel, and the reproducibility was 100%.

Statistical Analysis
Descriptive statistics were presented as mean and standard deviation [SD] for continuous measures, while absolute values and percentages were used for categorical measures. The chi-square goodness-of-fit test was used to test for deviation from Hardy-Weinberg equilibrium in controls. Differences in genotype and allele frequencies between study groups were assessed by the chi-square test. Unconditional multivariate LR was used to estimate odds ratios [ORs] and their 95% confidence intervals [CIs], adjusting for the confounding factors age and sex. A two-tailed p-value of less than 0.05 was considered statistically significant. All statistical analyses were performed using SPSS software version 16.0 (SPSS, Chicago, IL, USA). Haplotype analysis was performed using SNPstat (www.snpstats.in) [60].
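The two basic calculations named above can be sketched as follows. This is an illustrative example under stated assumptions, not the SPSS procedure used in the study: the odds ratio shown is the crude (unadjusted) estimate from a 2x2 table with a Woolf-type confidence interval, whereas the study reports ORs adjusted for age and sex by logistic regression. The function names and all counts are hypothetical.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-square goodness-of-fit test (1 df) for Hardy-Weinberg equilibrium.

    Takes observed genotype counts and returns (chi2, p_value).
    Assumes both alleles are observed, so no expected count is zero.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1 - p
    expected = (p * p * n, 2 * p * q * n, q * q * n)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Upper-tail probability of a chi-square variable with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


def crude_odds_ratio(a, b, c, d, z=1.96):
    """Unadjusted OR with a Woolf (log-based) 95% CI.

    a: cases with the risk genotype      b: cases without it
    c: controls with the risk genotype   d: controls without it
    All four counts must be non-zero.
    """
    odds_ratio = (a * d) / (b * c)
    se_log = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log)
    upper = math.exp(math.log(odds_ratio) + z * se_log)
    return odds_ratio, lower, upper


# Hypothetical counts, for illustration only.
chi2, p_hwe = hwe_chi_square(25, 50, 25)
or_value, ci_low, ci_high = crude_odds_ratio(20, 10, 10, 20)
```

The zero-count requirement in `crude_odds_ratio` mirrors the situation reported for Table 7, where an OR could not be estimated because one genotype combination was absent in the controls.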
The multifactor dimensionality reduction (MDR) method is a nonparametric, genetic model-free method for overcoming some of the limitations of logistic regression (i.e. sample size limitations) in the detection and characterization of gene-gene interactions [61]. In MDR, multilocus genotypes are pooled into high-risk and low-risk groups, effectively reducing the genotype predictors from n dimensions to one dimension (i.e. constructive induction). The new one-dimensional multilocus genotype variable is evaluated for its ability to classify and predict disease status through cross-validation and permutation testing. The MDR software (version 2.0 beta8) was applied to identify high-order gene-gene interactions associated with GBC risk. In our study, the best candidate interaction model was selected across all multilocus models as the one that maximized testing accuracy and cross-validation consistency (CVC). Furthermore, validation of models as effective predictors of disease status was derived empirically from 1000 permutations, which accounted for multiple comparison testing as long as the entire model fitting procedure was repeated for each randomized dataset to provide an opportunity to identify false positives. The MDR permutation results were considered statistically significant at the 0.05 level. All the variables identified in the best model were combined and dichotomized according to the MDR software, and their ORs and 95% CIs in relation to GBC risk were calculated. Finally, the combined effect of the variables in the best model by the number of risk genotypes was evaluated using logistic regression analysis.
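The pooling step at the heart of MDR can be illustrated with a short sketch. It is not the MDR software itself: it omits cross-validation and permutation testing and evaluates accuracy on the same data used for pooling. The threshold (the overall case:control ratio) is an assumption commonly used as a default, and the function names and toy genotypes are hypothetical.

```python
from collections import Counter


def mdr_pool(genotypes, labels, threshold=None):
    """Label each multilocus genotype cell as high risk (1) or low risk (0).

    genotypes : list of tuples, one multilocus genotype per subject
    labels    : list of 1 (case) / 0 (control), same order as genotypes
    threshold : case:control ratio above which a cell is high risk;
                defaults to the overall ratio in the data (an assumption)
    """
    n_cases = sum(labels)
    n_controls = len(labels) - n_cases
    if threshold is None:
        threshold = n_cases / n_controls
    cases = Counter(g for g, y in zip(genotypes, labels) if y == 1)
    controls = Counter(g for g, y in zip(genotypes, labels) if y == 0)
    # Compare cases against threshold * controls to avoid dividing by zero
    # in cells that contain no controls.
    return {g: int(cases[g] > threshold * controls[g]) for g in set(genotypes)}


def balanced_accuracy(risk, genotypes, labels):
    """Mean of sensitivity and specificity of the pooled one-dimensional variable."""
    pairs = list(zip(genotypes, labels))
    tp = sum(1 for g, y in pairs if y == 1 and risk.get(g, 0) == 1)
    tn = sum(1 for g, y in pairs if y == 0 and risk.get(g, 0) == 0)
    n_cases = sum(labels)
    n_controls = len(labels) - n_cases
    return 0.5 * (tp / n_cases + tn / n_controls)


# Toy two-locus data, for illustration only.
toy_genotypes = [("CT", "AA")] * 3 + [("CC", "AC")] + [("CT", "AA")] + [("CC", "AC")] * 3
toy_labels = [1, 1, 1, 1, 0, 0, 0, 0]
toy_risk = mdr_pool(toy_genotypes, toy_labels)
toy_accuracy = balanced_accuracy(toy_risk, toy_genotypes, toy_labels)
```

In a full analysis the pooling would be learned on the training folds only and the accuracy computed on the held-out fold, which is what the testing accuracy and CVC figures summarize.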
Classification and regression tree (CART) analysis was performed using the SPSS ver. 16 software to build a decision tree via recursive partitioning [26]. For the analysis, the decision tree was created by repeatedly splitting a node into two child nodes, beginning with the root node that contains the total sample. Before growing the tree, we chose the Gini criterion as the measure of goodness of split, by which splits are found that maximize the homogeneity of child nodes with respect to the value of the target variable. After the tree was grown to its full depth, a pruning procedure was performed to avoid overfitting the model. Finally, the risk associated with the various genotype combinations was evaluated using logistic regression analysis. The ORs and 95% CIs were adjusted for age and sex, with the node having the lowest percentage of cases treated as the reference.
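The Gini criterion used to choose splits can be made concrete with a small sketch. It shows only how a single candidate split is scored for a binary outcome; tree growing and pruning as performed by the SPSS software are not reproduced. The function names and the toy genotype data are hypothetical.

```python
def gini(labels):
    """Gini impurity of a node with binary labels (1 = case, 0 = control)."""
    n = len(labels)
    if n == 0:
        return 0.0
    p = sum(labels) / n
    # For two classes, 1 - p^2 - (1 - p)^2 simplifies to 2p(1 - p).
    return 2 * p * (1 - p)


def split_gain(genotypes, labels, left_set):
    """Reduction in Gini impurity when subjects whose genotype is in
    left_set go to one child node and all others go to the other."""
    left = [y for g, y in zip(genotypes, labels) if g in left_set]
    right = [y for g, y in zip(genotypes, labels) if g not in left_set]
    n = len(labels)
    weighted_children = (len(left) * gini(left) + len(right) * gini(right)) / n
    return gini(labels) - weighted_children


# Toy single-SNP data, for illustration only.
toy_genotypes = ["CC", "CC", "CT", "TT"]
toy_labels = [0, 0, 1, 1]
gain_carriers = split_gain(toy_genotypes, toy_labels, {"CT", "TT"})
gain_other = split_gain(toy_genotypes, toy_labels, {"CC", "CT"})
```

The split with the larger gain produces the more homogeneous child nodes and would be preferred at that node.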

In Silico Analysis
The putative functional effects of variants in both coding and non-coding regions of the ESR1 and ESR2 genes were determined using the online web servers FASTSNP (http://fastsnp.ibms.sinica.edu.tw) and F-SNP (http://compbio.cs.queensu.ca/F-SNP/) [62,63]; for coding regions, the effect on protein structure was considered. The following features were used to identify the effect of SNPs in non-coding regions: transcription factor binding sites (TFBS), intron/exon border consensus sequences (splice sites), exonic splicing enhancers (ESEs), and triplex-forming oligonucleotide (TFO) target sequences.