CD44 Gene Polymorphisms in Breast Cancer Risk and Prognosis: A Study in North Indian Population

Background: Cell surface biomarker CD44 plays an important role in breast cancer cell growth, differentiation, invasion, angiogenesis and tumour metastasis. We therefore aimed to investigate the role of CD44 gene polymorphisms in breast cancer risk and prognosis in a North Indian population.

Materials & Methods: A total of 258 breast cancer patients and 241 healthy controls were included in the case-control study for risk prediction. The 114 patients who received neo-adjuvant chemotherapy were included in the evaluation of breast cancer prognosis, with response assessed according to RECIST. We genotyped a tagging SNP (rs353639) in the CD44 gene, selected from the HapMap Gujarati Indians in Houston (GIH) population, along with a SNP (rs13347) previously reported as significant in a Chinese population, using TaqMan allelic discrimination assays. Statistical analysis was done using SPSS software, version 17. In-silico analysis for prediction of functional effects was done using F-SNP and FAST-SNP.

Results: Neither of the two CD44 genetic variants was significantly associated with breast cancer risk. On univariate analysis with clinicopathological characteristics and treatment response, we found a significant association of the variant genotype (CT+TT) of the rs13347 polymorphism with earlier age of onset (P = 0.029, OR = 0.037). However, significance was lost in multivariate analysis. For the rs353639 polymorphism, a significant association was seen with clinical tumour size, both at the genotypic (AC+CC) (P = 0.039, OR = 3.02) and the allelic (C) (P = 0.042, OR = 2.87) levels. On multivariate analysis, the association of the variant genotype (P = 0.017, OR = 4.29) and allele (P = 0.025, OR = 3.34) of rs353639 with clinical tumour size became stronger. In-silico analysis using F-SNP showed altered transcriptional regulation for the rs353639 polymorphism.

Conclusions: These findings suggest that CD44 rs353639 genetic variants may have a significant effect on breast cancer prognosis. However, neither polymorphism (rs13347 or rs353639) had an effect on breast cancer susceptibility.


Introduction
Breast cancer is the commonest cancer worldwide and is second only to cervical cancer as a cause of cancer mortality in women [1]. A large number of environmental and genetic factors are known to play an important role in breast cancer development and prognosis. In recent years, several breast cancer susceptibility genes have been identified, with BRCA1 and BRCA2 being the major genes, related to 15% of hereditary breast cancer cases [2,3]. Thus, further studies are needed to identify other genes that affect breast cancer risk and prognosis and are likely to play a major role in risk prediction.
Breast cancers contain a few distinct cells called breast cancer-initiating cells (BCICs), which are characterized by the expression of CIC biomarkers [4]. CD44 is one such biomarker. The CD44 gene is located on chromosome 11p13 [5]. The encoded protein is a cell surface glycoprotein involved in a number of biological processes, including lymphocyte migration, extravasation, homing, activation and apoptosis [6,7,8,9,10,11,12,13]. Many studies have also described its role in tumor metastasis [14,15]. It is also a receptor for hyaluronic acid. Recent studies have shown that CD44 and its interaction with hyaluronan regulate breast cancer cell proliferation, migration and invasion. In addition, genetic variants of CD44 were found to be associated with breast cancer patient survival, risk prediction and prognosis [16,17,18].
SNP rs13347 was previously reported to be significantly associated with breast cancer risk and prognosis in a Chinese population [16]. However, in view of the limited number of studies [16,18,19], we aimed to determine the association of this previously reported SNP (rs13347), together with a taggerSNP (rs353639) in the CD44 gene selected from the HapMap GIH population, with breast cancer risk and prognosis in a North Indian population.

Ethics Statement
The study, including the consent process, was approved by the ethics committee of Sanjay Gandhi Post Graduate Institute of Medical Sciences (SGPGIMS), Lucknow, India, and the authors followed the norms of the World Medical Association Declaration of Helsinki. Written informed consent was taken from each subject.

Study Population
The present study consisted of 258 histopathologically confirmed breast cancer patients from the north Indian population. Patients who had completed their treatment as planned between April 2010 and October 2012 were enrolled from the outpatient department (OPD) of Endocrine & Breast Surgery, and Radiotherapy, SGPGIMS, Lucknow. The patients were subjected to detailed demographic, clinical and pathological investigations. Staging of cancer was documented according to the AJCC-TNM classification system [20]. During the same period, 241 age- and ethnicity-matched healthy controls, unrelated to the patients and to each other, were recruited from volunteers who came to the hospital for routine checkups. Selection criteria for controls included no evidence of any personal history of cancer or other malignant conditions.
Out of 258 patients, neo-adjuvant chemotherapy (NACT) was administered to 114 patients with locally advanced or large operable breast cancer (LABC or LOBC). Therefore, the case-control study for evaluating breast cancer risk was done in 258 patients, while 114 patients were included in the case-only study for evaluation of response to NACT. Pathological response to NACT was recorded according to the Response Evaluation Criteria in Solid Tumors (RECIST criteria) [21], and patients were categorized as pathologic complete responders (≥30% tumour regression) and non pathologic responders (<30% tumour regression).

TagSNPs Selection
TagSNPs were selected using Haploview software 4.2 (Mark Daly's lab, Broad Institute, Cambridge, MA, USA) [22], based on the GIH population data of HapMap (HapMap Data Rel 27 Phase II+III, Feb 09, on NCBI B36 assembly, dbSNP b126). TagSNPs that captured all the known common SNPs (with minor allele frequencies of >0.1) in the CD44 gene, with a pairwise correlation r² >0.8, were selected.
TaggerSNP rs353639 was found to represent the known SNPs in haplotype blocks 3 and 4 in the CD44 gene of the GIH population (Figure S1). It was also found to represent haplotype block 5 in the Caucasian (CEU in HapMap) population (Figure S1). The SNP previously reported as significant in the Chinese population (rs13347) also represents haplotype block 9 (Figure 1).
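The pairwise r² criterion used for tagging can be illustrated with a minimal Python sketch. This is not Haploview's implementation; it only shows the standard r² definition for two biallelic SNPs computed from phased haplotype counts. The function names and the example counts are hypothetical and do not come from the study data.

```python
def pairwise_r2(n_ab, n_aB, n_Ab, n_AB):
    """r^2 between two biallelic SNPs from phased haplotype counts.

    First letter = allele at SNP 1, second letter = allele at SNP 2.
    All counts here are illustrative, not HapMap GIH data.
    """
    total = n_ab + n_aB + n_Ab + n_AB
    p_ab = n_ab / total                  # frequency of haplotype a-b
    p_a = (n_ab + n_aB) / total          # frequency of allele a at SNP 1
    p_b = (n_ab + n_Ab) / total          # frequency of allele b at SNP 2
    d = p_ab - p_a * p_b                 # linkage disequilibrium coefficient D
    denom = p_a * (1 - p_a) * p_b * (1 - p_b)
    if denom == 0:
        raise ValueError("r^2 is undefined when a SNP is monomorphic")
    return d * d / denom


def is_captured(r2, threshold=0.8):
    """True if a SNP is captured by a tag at the stated r^2 threshold."""
    return r2 > threshold


# Hypothetical haplotype counts for a tag SNP and a neighbouring SNP
r2 = pairwise_r2(40, 10, 10, 40)
print(round(r2, 2), is_captured(r2))
```

With these made-up counts the pair falls well below the 0.8 threshold, so the second SNP would need its own tag.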

Genotyping
Genomic DNA was extracted from venous blood using the standard salting-out method [23]. The quality and quantity of DNA were checked using a Nanodrop spectrophotometer (Thermo Fisher Scientific/Nanodrop Products, Wilmington, Delaware, USA). Genotyping of both SNPs (rs13347 and rs353639) was carried out using the TaqMan allelic discrimination assay. Primers and probes were supplied as pre-designed assays by Applied Biosystems (Foster City, CA, USA). Genotyping was performed on an ABI 7500 Real Time PCR system using 96-well plates. All plates included negative controls (wells containing no

Statistical Analysis
The effective sample size for the case-control study was calculated using Quanto software, version 1.1, with the power set at 80% [24]. Descriptive statistics of patients and controls were presented as means and standard deviations (SDs) for continuous measures, while frequencies and percentages were used for categorical measures. Deviation of genotype frequencies from Hardy-Weinberg equilibrium was tested using a chi-squared test. The relationship of genetic variants with clinicopathological features and breast cancer treatment response was examined in univariate analysis using Fisher's exact test. Multivariate analysis was performed using binary logistic regression. Clinicopathological features included in the analysis were age, tumor size, HER2 status, hormone receptor status, histology, grade and nodal status. All statistical analysis was done using SPSS statistical analysis software, version 17.0 (SPSS, Chicago, IL, USA). Association was expressed as odds ratios (OR) with 95% confidence intervals (CI). The association was considered significant when the P-value was <0.05.
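As an illustration of the Hardy-Weinberg check described above, the following Python sketch computes the chi-squared goodness-of-fit statistic for a biallelic SNP. It is a sketch under stated assumptions, not the SPSS procedure used in the study: the function name and genotype counts are hypothetical, and the P-value uses the closed form of the chi-squared survival function for one degree of freedom.

```python
import math


def hwe_chi_square(n_aa, n_ab, n_bb):
    """Chi-squared test for deviation from Hardy-Weinberg equilibrium.

    n_aa, n_ab, n_bb: observed genotype counts (hypothetical in this sketch).
    Returns (chi2, p_value) with 1 degree of freedom.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)      # frequency of allele a
    q = 1 - p                            # frequency of allele b
    if p == 0 or q == 0:
        raise ValueError("test is undefined for a monomorphic SNP")
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 df, P(X > chi2) = erfc(sqrt(chi2 / 2))
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Hypothetical control genotype counts
chi2, p_value = hwe_chi_square(150, 80, 11)
print(f"chi2 = {chi2:.3f}, P = {p_value:.3f}")
```

A P-value above 0.05 would be read as no evidence of deviation from equilibrium in the control group.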

Population Characteristics
The detailed demographic, clinical and pathological characteristics of the study subjects are shown in Table 1.

CD44 gene Polymorphisms with Breast Cancer Risk
The observed genotype frequencies of the two polymorphisms in healthy controls were in accordance with Hardy-Weinberg equilibrium. Table 2 shows the risk of breast cancer associated with each of the SNPs studied in the CD44 gene. No significant differences were observed in the frequency distribution of the rs13347 and rs353639 polymorphisms between breast cancer patients and healthy controls, at either the genotypic or the allelic level. When the study subjects were stratified by menstrual status, we again found no significant correlation of the genotypes or alleles of either polymorphism with breast cancer risk.
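The odds ratios and 95% confidence intervals reported for such case-control comparisons can be sketched in Python as follows. This is an illustrative calculation only: it assumes the Woolf (log) method for the interval, which may differ from the SPSS output used in the study, and the 2x2 counts are hypothetical rather than taken from Table 2.

```python
import math


def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Woolf (log-based) confidence interval.

    a: cases carrying the variant      b: cases without the variant
    c: controls carrying the variant   d: controls without the variant
    z = 1.96 gives an approximate 95% interval.
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive for the Woolf method")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical carrier counts in cases and controls
or_value, lower, upper = odds_ratio_ci(20, 80, 10, 90)
print(f"OR = {or_value:.2f}, 95% CI = {lower:.2f}-{upper:.2f}")
```

An interval that includes 1 corresponds to a non-significant association at the 0.05 level, which is the pattern described for both SNPs in the risk analysis.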

CD44 gene polymorphisms with Breast Cancer Prognosis
Univariate analysis by Fisher's exact test and multivariate analysis by logistic regression were employed to correlate the genotypes of the two CD44 polymorphisms with the clinicopathological features and with pathologic response to NACT.
In univariate analysis, we found a significant correlation of the variant genotype (CT+TT) of the rs13347 polymorphism with earlier age of onset (P = 0.029, OR = 0.037) (Table 3). However, we could not find any association at the allelic level (Table 3). The data also revealed a significant association of the rs353639 polymorphism with clinical tumour size, both at the genotypic (AC+CC) (P = 0.039, OR = 3.02) and the allelic (P = 0.042, OR = 2.87) levels (Table 4).
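For readers unfamiliar with the univariate test used here, a minimal Python sketch of a two-sided Fisher's exact test on a 2x2 table is given below. It is a generic illustration using only the standard library, not the SPSS routine used in the study, and the example table is hypothetical. The two-sided P-value sums the probabilities of all tables with the same margins that are no more likely than the observed one.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test P-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    n = row1 + row2
    denom = comb(n, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # Small relative tolerance guards against floating-point ties
    total = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(1.0, total)


# Hypothetical table: rows = variant / wild-type genotype,
# columns = large / small clinical tumour size
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"P = {p_value:.4f}")
```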
Next, we performed a multivariate analysis using logistic regression to evaluate the correlations of both CD44 gene polymorphisms with the clinicopathological features and breast cancer treatment response to NACT. For the rs13347 polymorphism, the significant association of genotypes with earlier age of onset was lost on multivariate analysis (Table 5). No significant association of the alleles with any of the clinicopathological features or with treatment response was seen (Table 5). For the rs353639 polymorphism, logistic regression showed a stronger association of both the genotype (P = 0.017, OR = 4.29) and the allele (P = 0.025, OR = 3.34) with clinical tumour size than was seen in univariate analysis (Table 6). However, neither polymorphism was significantly associated with treatment response to NACT.

In Silico Analysis of CD44 gene Polymorphisms
The SNPs selected for the present study, rs13347 and rs353639, are located in the 3' UTR and an intronic region (non-coding sequences) of the CD44 gene, respectively. It was therefore possible that these SNPs affect the transcription of the gene. In-silico analysis using F-SNP showed a change in transcriptional regulation for both of the selected SNPs (Table 7).

Discussion
CD44 is a transmembrane glycoprotein involved in many functions such as cell proliferation, angiogenesis, invasion and metastasis [27]. The CD44 gene is composed of 20 exons [5] in two groups. One group consists of exons 1-5 and 16-20, which are expressed together, whereas the other group consists of exons 6-15. These 10 exons are alternatively spliced and can be included between exons 5 and 16 of the first group. The multi-functional nature of CD44 depends on the binding of its ligand, hyaluronic acid [28]. There are two binding domains for hyaluronan, in exon 2 and exon 5 [29]. Interaction of CD44 with hyaluronan is involved in the regulation of breast cancer through cell-cell adhesion and inhibited invasion [30]. However, altered binding of hyaluronan to CD44 can activate cell growth, survival, invasion and metastasis in breast cancer [31,32] as well as in other cancers [33,34]. On the basis of the above studies, CD44 appears to have a significant role in cancer development and prognosis. Therefore, the present study was carried out to evaluate the role of CD44 gene polymorphisms in north Indian breast cancer patients.
Two polymorphisms within the CD44 gene were studied to investigate the association of genetic variants with breast cancer risk prediction and prognosis. A taggerSNP approach was used to select the SNP (rs353639) that represents known SNPs in the CD44 gene of the GIH population. The SNP previously reported as significant in Chinese breast cancer patients (rs13347) was also selected, to replicate those results in our population. CD44 gene polymorphisms have not been widely studied, with only a few reports in breast cancer worldwide [16,17,18,19]. To date, there is no report on the influence of the rs353639 polymorphism except a genome-wide association report for subclinical atherosclerosis in the NHLBI's Framingham Heart Study [35].
In this study, we sought to determine genetic variants of CD44 that may confer an individual's risk of breast cancer in 258 patients and 241 healthy controls. However, we did not observe any significant differences in the frequency distribution of the CD44 rs13347 genetic variants between breast cancer patients and healthy controls. No significant association was found after stratification of our subjects by menstrual status either. Our results are not in agreement with the single reported study on the rs13347 polymorphism and susceptibility to breast cancer [16]. That study, by Jiang et al., evaluated variation in the rs13347 polymorphism in 1,853 breast cancer patients and 1,992 healthy control subjects of a Chinese population, and the variant genotype (CT+TT) conferred 1.72 times increased susceptibility to breast cancer. They also performed a reporter assay to validate their findings and found variant genotype (CT+TT) carriers to have more CD44 expression than wild type (CC) carriers. The variation in results may be due to differences in ethnicity, and it is possible that another linked SNP of CD44 confers risk in our population.
Therefore, one taggerSNP (rs353639) in the CD44 gene of the GIH population was selected to evaluate its effect on breast cancer risk. However, we still found no significant association of the rs353639 polymorphism with susceptibility to breast cancer. We also did not find any association on sub-group analysis based on menstrual status. No other study on the role of the rs353639 polymorphism in cancer risk has been reported. The discrepancy in the effect of CD44 polymorphisms on breast cancer susceptibility may therefore arise because these polymorphisms have an indirect role in breast cancer susceptibility, with their effects possibly mediated through linkage to other key functional polymorphisms, especially in the exon 2 and exon 5 regions of CD44 involved in hyaluronan binding.
A study by Zhou et al. reported a unique SNP, CD44 Ex2+14 A>G, in the intron 1 region and found that patients with the variant genotype had breast cancer at earlier ages, larger tumor burden, more regional lymph node metastasis and higher cancer recurrence [19]. Another study published by the same author in 2012 identified 4 SNPs in exon 2 through sequencing and found a significant association of the CD44 polymorphisms in the exon 2 coding sequence with higher probability and higher cumulative risk for breast cancer [36]. A recent study also showed a significant increase in CD44 expression in breast cancer when compared to normal breast epithelium [37]. Thus, given these conflicting findings, there is a need to replicate the above findings in larger samples of different ethnicities to reach a definitive conclusion.
We also investigated the association of genetic variants of the CD44 gene with clinicopathological features and pathologic response to NACT in a cohort of 114 patients. For the rs13347 polymorphism, a significant association of the variant genotype (CT+TT) with earlier age of onset was found in univariate analysis but was lost on applying multivariate logistic regression. This highlights that genetic variants do not act alone in breast cancer development and prognosis. It is also necessary to evaluate the role of confounding factors such as age, clinical stage, pathological lymph node status, grade, and hormone and HER2/neu receptor status along with these known variants. Similarly, no significant difference in the genotype or allele of this polymorphism was seen with pathologic response to NACT. In contrast, Yao et al. (2009) observed higher expression of CD44 before chemotherapy in drug-resistant cell lines than in drug-sensitive ones [38]. A study by Zhou et al. reported the variant genotype of the CD44 rs4756195 polymorphism to be associated with response to anthracycline-based chemotherapy in patients with breast cancer [18]. For the GIH tagger SNP rs353639, we found the variant genotype (AC+CC) and allele (C) to be significantly associated with increased clinical tumour size. These results were consistent even after multivariate analysis. Our findings are in agreement with the concept that genetic variations in the CD44 gene may alter the binding of its ligand, hyaluronan, leading to increased breast cancer cell growth and differentiation. This result was supported by bioinformatic analysis for prediction of functional effects of the CD44 rs353639 polymorphism. Possible functional mechanisms include altered CD44 expression through differential binding of transcription factors.
As with the rs13347 polymorphism, no association of genetic variants of the CD44 rs353639 polymorphism was seen with pathologic response to NACT. Therefore, the influence of CD44 gene polymorphisms on breast cancer treatment response to NACT is still not established.

Conclusions
Our study indicates that neither polymorphism in the CD44 gene may have any effect on breast cancer risk prediction in the north Indian population. However, the association of rs353639 with clinical tumour size suggests that these polymorphisms may have implications for breast cancer prognosis. To the best of our knowledge, the present study is the first to report a taggerSNP-based selection of CD44 gene polymorphisms in relation to breast cancer risk and prognosis in North Indian subjects. However, the study requires confirmation in larger population-based cohorts. Furthermore, our findings need to be validated in breast cancer patients of different ethnicities, together with a functional gene expression assay.