Simple tandem repeat (TTTA)n polymorphism in CYP19 (aromatase) gene and breast cancer risk in Nigerian women

Background: Breast cancer is the most common cancer and the leading cause of cancer-related deaths in women worldwide. The incidence of the disease is increasing globally, and this increase is occurring at a faster rate in population groups that hitherto enjoyed low incidence. This study was designed to evaluate the role of a simple tandem repeat polymorphism (STRP) in the aromatase (CYP19) gene in breast cancer susceptibility in Nigerian women, a population of indigenous sub-Saharan African ancestry.
Methods: A case-control study recruiting 250 women with breast cancer and 250 women without the disease from four University Teaching Hospitals in Southern Nigeria was carried out between September 2002 and April 2004. Participants were recruited from the surgical outpatient clinics and surgical wards of the Nigerian institutions. A polymerase chain reaction (PCR)-based assay was employed for genotyping, and product sizes were detected with an ABI 3730 DNA Analyzer.
Results: Conditional logistic regression analysis revealed that harboring the putative high-risk genotypes conferred a 29% increased risk of breast cancer when all women in the study were considered (odds ratio [OR] = 1.29, 95% confidence interval [CI] 0.83–2.00), although this association was not statistically significant. Subgroup analysis based on menopausal status showed similar results among premenopausal women (OR = 1.35, 95% CI 0.76–2.41) and postmenopausal women (OR = 1.27, 95% CI 0.64–2.49). The data also demonstrated marked differences in the distribution of (TTTA)n repeats in Nigerian women compared with other populations.
Conclusion: This study has shown that harboring 10 or more (TTTA)n repeats of the CYP19 gene is associated with a modest, though not statistically significant, increase in the risk of breast cancer in Nigerian women.


Background
Three developments have increased research interest in the mechanisms underlying aromatase action in breast carcinogenesis: an aging world population with a rising global burden of breast cancer [1]; emerging evidence that postmenopausal obesity and weight gain over the adult years are positively associated with postmenopausal breast cancer [2]; and the finding that breast cancer patients may have an inherently higher aromatase expression in breast adipose tissue than healthy women [3].
The CYP19 gene encodes aromatase, which is responsible for the rate-limiting step in the metabolism of C19 steroids to estrogens and is expressed in most breast carcinomas [4]. Five different polymorphisms in the CYP19 gene have been described in relation to breast cancer risk: the highly polymorphic tetranucleotide repeats in intron 5 [5][6][7][8][9], a C-T substitution in the 3' noncoding region of exon 10 [10], a cytosine-to-thymine substitution in codon 264 of exon 7 (resulting in an arginine-to-cysteine amino acid substitution) [11], a silent G-A polymorphism at codon 80 in exon 3 [6], and a rare 3 base pair change within the promoter region of exon 1.4 [9]. Although the tetranucleotide repeat polymorphism is the most widely studied of these variants, reports of its relationship to breast cancer susceptibility are inconsistent across populations [5][6][7][8][9]. There is evidence suggesting that part of the population difference in breast cancer risk may be related to the wide racial variation in the distribution of these polymorphisms. This study was designed to evaluate the role of the tetranucleotide repeat polymorphism in breast cancer risk in an indigenous African population in Southern Nigeria.

Study subjects
A total of 500 study participants comprising 250 women with breast cancer and 250 age-matched control women without the disease were recruited from four University Teaching Hospitals in Southern Nigeria. Written informed consent was obtained from women willing to participate in the study before recruitment began. At the end of the questionnaire session, anthropometric measurements including height, weight, and waist and hip circumferences were taken.

Biological samples
Biological samples comprising 40 milliliters of blood, drawn into two 15-milliliter plain vacutainer tubes and one 10-milliliter K3-EDTA tube, were collected at the end of the questionnaire session. Samples were centrifuged within 10 h of collection. Buffy coat was separated from plasma and red cells in the K3-EDTA vacutainer tubes and carefully aliquoted into 3 ml tubes, while clots separated from serum in the plain vacutainer tubes were transferred into two 20 ml plain tubes. Samples were stored at -20°C at each of the Nigerian study sites until transferred in ice packs to the Nigerian coordinating center at the University of Benin Teaching Hospital, where they were stored at -20°C until transferred to the University of Pittsburgh in dry ice packs. Samples were then stored at -80°C at the University of Pittsburgh until DNA extraction.

DNA extraction
DNA was extracted from buffy coats using the QIAamp Mini Kit protocol (Qiagen, Chatsworth, CA) or, for participants in whom buffy coat was unavailable, from blood clots using the QIAamp Midi Kit protocol (Qiagen, Chatsworth, CA). The DNA was stored at 4°C until amplified by polymerase chain reaction (PCR) and used for genotyping on the ABI 3730 DNA Analyzer (Applied Biosystems, Foster City, CA).

PCR and genotyping
The polymorphic fragment was amplified by PCR using the primers 5'-CAA CTC GAC CCT TCT TTA TG-3' (forward) and 5'-GTT TGA CTC CGT GTG TTT GA-3' (reverse). The forward primer was 5'-labeled with a fluorescent dye (FAM, 5-carboxyfluorescein) for automated fragment analysis. A 50 µl reaction mixture containing 2 µl of genomic DNA, 8 µl of deoxynucleotide triphosphates, 0.5 µl each of forward and reverse primers, 5 µl of 10× buffer, 1.5 µl of MgCl2 and 2 µl of Taq polymerase was placed in a thermal cycler (MJ Dyad PCR machine). After denaturing for 5 min at 95°C, the DNA was amplified for 35 cycles of 95°C for 30 s, 55°C for 30 s and 72°C for 45 s, followed by a 5 min extension at 72°C. A positive control containing genomic DNA and a negative control containing everything except DNA were included in the PCR experiment. Five µl of each PCR product, including the controls, were run on a 1% agarose gel to confirm that a fragment of the expected size had been generated.
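The reaction set-up and cycling program described above can be checked with a little arithmetic. The sketch below is illustrative only: the component volumes and step times are those listed in the text, but treating the unlisted remainder of the 50 µl as water is an assumption, and the run time ignores temperature ramping.

```python
# Component volumes (µl) for one 50 µl reaction, as listed in the text.
REACTION_VOLUME_UL = 50.0
COMPONENTS_UL = {
    "genomic DNA": 2.0,
    "dNTPs": 8.0,
    "forward primer": 0.5,
    "reverse primer": 0.5,
    "10x buffer": 5.0,
    "MgCl2": 1.5,
    "Taq polymerase": 2.0,
}

# Cycling program: (step name, seconds). Ramp times are not included.
INITIAL_DENATURATION_S = 5 * 60
CYCLE_STEPS_S = [("denature 95C", 30), ("anneal 55C", 30), ("extend 72C", 45)]
N_CYCLES = 35
FINAL_EXTENSION_S = 5 * 60


def remaining_volume_ul(total_ul, components):
    """Volume not accounted for by the listed components.

    Assumption: this remainder is made up with water; the text does not say so.
    """
    return total_ul - sum(components.values())


def hold_time_minutes():
    """Total programmed hold time in minutes, excluding ramping between steps."""
    per_cycle = sum(seconds for _, seconds in CYCLE_STEPS_S)
    total_s = INITIAL_DENATURATION_S + N_CYCLES * per_cycle + FINAL_EXTENSION_S
    return total_s / 60.0


listed_ul = sum(COMPONENTS_UL.values())
print(f"listed components: {listed_ul} µl")
print(f"remainder (assumed water): {remaining_volume_ul(REACTION_VOLUME_UL, COMPONENTS_UL)} µl")
print(f"programmed hold time: {hold_time_minutes()} min")
```

The listed components account for 19.5 µl, so roughly 30.5 µl of the reaction volume is unspecified in the text.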
The product sizes were then detected with an ABI 3730 DNA Analyzer (Applied Biosystems, Foster City, CA) at the Genomics and Proteomics Core Laboratory at the University of Pittsburgh. Allele calling was carried out using Genotyper software (Applied Biosystems, Foster City, CA).

Statistical analysis
Data analysis was carried out using Statistical Analysis Software (SAS), Version 8.0. Association between demographic, reproductive and anthropometric variables and breast cancer risk was first assessed using conditional logistic regression. Each matched case was paired with the corresponding control to enable differences between the cases and controls to be calculated. First each variable was assessed alone in a univariate model. The strength of significant variables was further assessed by building multivariate models.
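For intuition about what conditional logistic regression estimates in a 1:1 matched design, the special case of a single binary exposure has a closed form: only pairs in which case and control differ in exposure contribute, and the odds ratio is the ratio of the two kinds of discordant pairs. The sketch below illustrates that special case under stated assumptions. It is not the SAS procedure used in the study, the function name is hypothetical, and the counts are invented for illustration.

```python
import math


def matched_pair_odds_ratio(case_only_exposed, control_only_exposed, z_crit=1.96):
    """Conditional odds ratio and Wald interval for 1:1 matched pairs.

    case_only_exposed:    pairs in which only the case carries the exposure
    control_only_exposed: pairs in which only the control carries the exposure
    Concordant pairs carry no information about the odds ratio and are ignored.
    """
    b, c = case_only_exposed, control_only_exposed
    if b <= 0 or c <= 0:
        raise ValueError("both discordant-pair counts must be positive")
    odds_ratio = b / c
    se_log_or = math.sqrt(1.0 / b + 1.0 / c)  # standard error on the log scale
    lower = math.exp(math.log(odds_ratio) - z_crit * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z_crit * se_log_or)
    return odds_ratio, lower, upper


# Hypothetical discordant-pair counts, not taken from the study.
estimate, ci_lower, ci_upper = matched_pair_odds_ratio(30, 20)
print(f"OR = {estimate:.2f}, 95% CI {ci_lower:.2f}-{ci_upper:.2f}")
```

Because the standard error depends only on the discordant-pair counts, a rare exposure produces a very wide interval even when the point estimate is large.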
Conditional logistic regression was also used to assess the relationship between the distribution of the genotypes of the CYP19 tetranucleotide (TTTA)n repeats and breast cancer risk. Based on information available in the literature, the CYP19 genotypes were categorized into two groups for the purpose of conditional logistic regression. There is evidence suggesting that an increased risk of breast cancer is associated with 10 or more (TTTA)n repeats. Genotypes containing at least one allele with 10 or more repeats were therefore categorized as high-risk genotypes, while genotypes in which both alleles had fewer than 10 repeats were categorized as low-risk genotypes. Because of the small number of individuals carrying high-risk genotypes, all genotypes in the high-risk category were collapsed into one group in the logistic regression model. A multivariate conditional logistic regression model containing the significant predictors of breast cancer risk in the descriptive analysis and the CYP19 genotype data was then developed to assess gene-environment interactions.
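The two-group coding described above can be written as a small function. This is a minimal sketch: the 10-repeat threshold comes from the text, while the function and variable names are hypothetical.

```python
HIGH_RISK_MIN_REPEATS = 10  # threshold stated in the text


def classify_genotype(allele_1_repeats, allele_2_repeats,
                      threshold=HIGH_RISK_MIN_REPEATS):
    """Return 'high' if at least one allele has `threshold` or more repeats,
    otherwise 'low' (both alleles shorter than the threshold)."""
    if max(allele_1_repeats, allele_2_repeats) >= threshold:
        return "high"
    return "low"


def count_long_alleles(allele_1_repeats, allele_2_repeats,
                       threshold=HIGH_RISK_MIN_REPEATS):
    """Number of long alleles (0, 1 or 2); 2 means homozygous for long alleles."""
    return sum(r >= threshold for r in (allele_1_repeats, allele_2_repeats))


# Example genotypes expressed as repeat counts per allele (illustrative values).
for genotype in [(7, 8), (8, 10), (11, 12)]:
    print(genotype, classify_genotype(*genotype), count_long_alleles(*genotype))
```

Collapsing heterozygous and homozygous carriers into one "high" group, as the text describes, trades resolution for statistical stability when carriers are few.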

Association of demographic/reproductive characteristics with breast cancer risk
Conditional logistic regression analysis was used to examine the association between demographic/reproductive variables of study participants and breast cancer risk. In the final multiple conditional logistic regression model, only family history of breast cancer (OR = 8.07, 95% CI 1.003-64.95), waist/hip ratio (WHR) (OR = 1.98, 95% CI 1.27-3.10), age at first full term pregnancy (OR = 1.31, 95% CI 1.01-1.71), and higher level of education (OR = 1.33, 95% CI 1.05-1.74) were found to be significant predictors of breast cancer risk.

Distribution of CYP19 (TTTA)n repeat alleles
All women
Seven different alleles (designated A1 to A7, as shown in Table 1) of the tetranucleotide (TTTA)n repeat polymorphism of the CYP19 gene were identified in 228 cases and 228 control subjects in whom polymerase chain reaction (PCR) amplification and genotyping were successful. The most common alleles were A1 (356 base pairs [bp], 7 repeats) and A2 (360 bp, 8 repeats). The A1 allele was slightly more common in the cases (allele frequency 0.3385) than in the control subjects (allele frequency 0.3026), while allele A2 was less common in the cases (allele frequency 0.4978) than in the control subjects (allele frequency 0.5504), as shown in Table 1. The distribution of these alleles was not significantly modified by menopausal status.
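The relationship between fragment size and repeat number, and the way allele frequencies are computed from genotypes, can be sketched as follows. The anchor points (356 bp for 7 repeats, 360 bp for 8 repeats) come from the text. Extending the same 4 bp step to the remaining alleles is an assumption, and the genotypes in the example are made up.

```python
from collections import Counter

REFERENCE_SIZE_BP = 356  # fragment size of allele A1, from the text
REFERENCE_REPEATS = 7    # repeat count of allele A1, from the text
REPEAT_UNIT_BP = 4       # one TTTA unit


def repeats_from_size(size_bp):
    """Convert a fragment size in bp to a (TTTA)n repeat count.

    Assumption: every allele differs from A1 by a whole number of 4 bp units.
    """
    offset = size_bp - REFERENCE_SIZE_BP
    if offset % REPEAT_UNIT_BP != 0:
        raise ValueError(f"{size_bp} bp is not a whole number of repeats from A1")
    return REFERENCE_REPEATS + offset // REPEAT_UNIT_BP


def allele_frequencies(genotypes):
    """Allele frequencies from diploid genotypes given as (allele, allele) pairs.

    Each subject contributes two alleles, so the denominator is 2 x subjects.
    """
    counts = Counter(allele for genotype in genotypes for allele in genotype)
    total = sum(counts.values())
    return {allele: n / total for allele, n in counts.items()}


# Made-up genotypes in repeat counts, for illustration only.
example = [(7, 8), (8, 8)]
print(repeats_from_size(360))       # 8
print(allele_frequencies(example))  # {7: 0.25, 8: 0.75}
```

With 228 successfully genotyped subjects in a group, the denominator for each allele frequency would be 456 chromosomes.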

Distribution of CYP19 (TTTA)n repeat polymorphism genotypes
All women
The distribution of the various genotypes of the CYP19 (TTTA)n repeat polymorphism is shown in Table 2. Homozygosity for the A1 (7 repeats) allele was more common in the cases (27 [ Table 3.

Postmenopausal women
Of 108 postmenopausal breast cancer cases and 108 age-matched control subjects in the study, PCR amplification and genotyping were successful in 102 cases and 99 controls. As shown in Table 4

Association of CYP19 (TTTA)n genotypes and breast cancer risk
Conditional logistic regression was used to evaluate the association between the distribution of the CYP19 (TTTA)n genotypes and breast cancer risk. For the purpose of this analysis, the genotypes were categorized into two broad groups: genotypes with at least one allele containing 10 or more (TTTA)n repeats were classified as high-risk genotypes, while those in which both alleles contained fewer than ten repeats were considered low-risk genotypes. This classification was adopted based on evidence in the literature suggesting that an increased risk of breast cancer is associated with 10 or more (TTTA)n repeats.

Discussion
This study has demonstrated marked variation in the distribution of the tetranucleotide (TTTA)n repeats of the CYP19 (aromatase) gene among Nigerian women compared with other population groups. The (TTTA)n alleles ranged from seven to 14 repeats, with the seven- and eight-repeat alleles predominating: the seven-repeat allele had frequencies of 0.3385 in cases and 0.3026 in controls, and the eight-repeat allele had frequencies of 0.4978 in cases and 0.5504 in controls. Among some Caucasian population groups, the seven-repeat allele appears to be the most common, with an allele frequency of 0.484 in the Nordic population [5] and 0.50 in non-Latina Whites in Southern California [9]. In the British population, the six-repeat allele appears to be the most common (allele frequency of 0.379) [8], with a seven-repeat allele frequency of 0.021. A much higher frequency of 0.76 was reported for the seven-repeat allele among African American women in Southern California [9]. The highest frequency of the seven-repeat allele (0.85) was reported among Japanese Americans in Southern California and Hawaii [9].
Our data suggest that harboring at least one allele with 10 or more repeats was associated with a 29% increased risk of breast cancer (OR = 1.29, 95% CI 0.83-2.00), and this estimate was similar in premenopausal (OR = 1.35, 95% CI 0.76-2.41) and postmenopausal women (OR = 1.27, 95% CI 0.64-2.49). It is important to note that four of the six study participants who were homozygous for the putative high-risk alleles were cases. Although not statistically significant, probably because of the small number of participants with these homozygous genotypes, there was a 4-fold increased risk of breast cancer in women who were homozygous for these alleles (OR = 4.00, 95% CI 0.45-35.79). Controlling for other significant reproductive and lifestyle factors identified in our study subjects did not alter the risk profile. Our analysis was based on the hypothesis that harboring longer repeat lengths of the CYP19 tetranucleotide repeat polymorphism increases the risk of breast cancer. This hypothesis rests on evidence from previous epidemiological studies demonstrating increased risk of breast cancer with longer repeat lengths of the tetranucleotide repeats.
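To see how far the reported intervals are from conventional significance, an approximate standard error, z statistic and two-sided p-value can be back-calculated from an odds ratio and its 95% confidence interval. This is a rough sketch that assumes a Wald interval symmetric on the log scale; it uses only the figures quoted in the text, and the derived p-values are approximations rather than results reported by the study.

```python
import math


def wald_summary_from_ci(odds_ratio, lower, upper, z_crit=1.96):
    """Approximate log-scale SE, z statistic and two-sided p-value.

    Assumes the interval was computed as exp(log(OR) +/- z_crit * SE).
    """
    se_log_or = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(odds_ratio) / se_log_or
    p_two_sided = math.erfc(abs(z) / math.sqrt(2.0))  # normal approximation
    return se_log_or, z, p_two_sided


# Estimates quoted in the text: (label, OR, lower, upper).
reported = [
    ("all women", 1.29, 0.83, 2.00),
    ("premenopausal", 1.35, 0.76, 2.41),
    ("postmenopausal", 1.27, 0.64, 2.49),
    ("homozygous long alleles", 4.00, 0.45, 35.79),
]

for label, or_value, lo, hi in reported:
    se, z, p = wald_summary_from_ci(or_value, lo, hi)
    print(f"{label}: SE(log OR) = {se:.3f}, z = {z:.2f}, p = {p:.2f}")
```

Under this approximation the overall estimate corresponds to a z statistic of about 1.1, well short of the 1.96 needed for significance at the 5% level, which is consistent with the confidence interval spanning 1.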
Our finding of increased breast cancer risk in association with longer (TTTA)n tetranucleotide repeats has been noted in other populations. Kristensen et al [5] studied the association between the tetranucleotide simple tandem repeat polymorphism and breast cancer risk in a case-control study (sporadic cases: n = 182; familial cases: n = 185; controls: n = 252) among Norwegian and Swedish individuals and observed a positive association between carriage of the 12-repeat allele, (TTTA)12, of CYP19 and risk of breast cancer compared with repeat lengths less than 12 (OR = 2.42, 95% CI 1.03-5.80). These authors also observed a higher frequency of the 12-repeat allele among estrogen- and progesterone-receptor-positive cases compared with receptor-negative cases [5]. Using data from the Nurses' Health Study, Haiman et al [7] examined the relationship between the CYP19 tetranucleotide repeats and breast cancer risk among US Caucasian women; the authors found that harboring the 10-repeat allele (not identified in the Nordic population by Kristensen et al [5]) conferred a 3-fold increased risk of breast cancer (OR = 3.08, 95% CI 1.35-7.01). These investigators also reported a non-significant 49% increased risk for individuals harboring the 12-repeat allele (OR = 1.49, 95% CI 0.86-2.56). In a study among Caucasian women in the US, Siegelmann-Danieli and Buetow [6] observed a lower frequency of the 12-repeat allele and a higher frequency of the 7-repeat allele among cases (n = 348) than controls (n = 145).
Other studies have not provided support for the association between CYP19 repeat alleles and breast cancer risk.
Differences in study size and ethnicity may account for the discrepancy between study results. The studies of Siegelmann-Danieli and Buetow [6] and Probst-Hensch et al [9] had fewer than 200 Caucasian cases and/or controls, which may result in unreliable allele frequency estimates for rare alleles. The positive association reported by Kristensen et al [5] was observed among Caucasian women of Scandinavian ancestry. High-risk CYP19 alleles may be linked to functional genetic variations among specific Caucasian ethnic groups. Consequently, disproportionate frequencies of different Caucasian ancestral groups between cases and controls may influence associations between CYP19 alleles and disease risk.
Because of the weak associations detected in most studies of single nucleotide polymorphisms and breast cancer risk, haplotype-based studies have been proposed as a more powerful and comprehensive approach to identifying causal genetic variation underlying breast cancer and other complex diseases. Applying this approach to the Multiethnic Cohort (MEC) Study, which recruited African-American, Hawaiian, Japanese, Latina and white women, Haiman et al [12] used twenty-five haplotype-tagging SNPs (htSNPs) to predict the common haplotypes within the CYP19 gene with high probability and genotyped these htSNPs in a breast cancer case-control study nested in the MEC Study (cases, n = 1355; controls, n = 2580). These investigators observed significant haplotype effects in block 2 [P = 0.01; haplotypes 2b (OR = 1.23; 95% CI, 1.07-1.40), 2d (OR = 1.28; 95% CI, 1.01-1.62)]. They also found a common long-range haplotype comprising block-specific haplotypes 2b and 3c to be associated with increased risk of breast cancer (haplotype 2b-3c: OR = 1.31; 95% CI, 1.11-1.54). Although this methodology does not require the causal variant to be identified and tested directly, it has the potential to highlight physical regions that harbor putative disease-associated variants for further re-sequencing to uncover ethnic-specific disease-associated variants.

Conclusion
Despite the inconsistencies in the epidemiological literature and the inconclusiveness of current experimental observations on the role of aromatase in breast carcinogenesis, a combination of experimental [13][14][15][16][17], laboratory [4][18][19][20][21][22] and clinical data [3][23][24][25][26][27] strongly suggests a role for aromatase in the etio-pathological network of breast carcinogenesis. Our finding of a modest increased risk of breast cancer among women harboring 10 or more (TTTA)n repeat alleles of the CYP19 (aromatase) gene is in keeping with this proposition. As mentioned previously, some of the current epidemiological studies lack adequate sample size to detect the low risk conferred by low-penetrance alleles such as the STRP considered in this study, and this may have affected our results. We hope to further investigate the association between this polymorphism and other variants in the CYP19 gene by recruiting more study participants. There is emerging consensus that future molecular epidemiological studies should focus more on polygenic models employing haplotype-based techniques to properly assess the contributions to disease risk of various low-penetrance genes acting in specific metabolic pathways, such as the estrogen biosynthesis and metabolism pathway. Such studies will require large sample sizes and the collaborative efforts of multi-center investigators. In addition, the combination of high-throughput genotyping techniques with emerging technologies, including microarray and proteomic technology, will enhance our ability to understand the genetic factors involved in breast carcinogenesis.
