Mitochondrial DNA Haplogroups and Breast Cancer Risk Factors in the Avon Longitudinal Study of Parents and Children (ALSPAC)

The relationship between mitochondrial DNA (mtDNA) and breast cancer has been frequently examined, particularly in European populations. However, studies reporting associations between mtDNA haplogroups and breast cancer risk have had several shortcomings, including small sample sizes, failure to account for population stratification and inadequate statistical testing. In this study we investigated the association of mtDNA haplogroups of European origin with several breast cancer risk factors in mothers and children of the Avon Longitudinal Study of Parents and Children (ALSPAC), a birth cohort that enrolled over 14,000 pregnant women in the Southwest region of the UK. Risk factor data were obtained from questionnaires, clinic visits and blood measurements. Information on over 40 independent breast cancer risk factor-related variables was available for up to 7781 mothers and children with mtDNA haplogroup data in ALSPAC. Linear and logistic regression models adjusted for age, sex and population stratification principal components were evaluated. After correction for multiple testing we found no evidence of association of European mtDNA haplogroups with any of the breast cancer risk factors analysed. Mitochondrial DNA haplogroups are unlikely to underlie susceptibility to breast cancer that occurs via the risk factors examined in this study of a population of European ancestry.


Introduction
The relationship between mitochondrial DNA (mtDNA) and breast cancer has been frequently explored. Carcinogenesis has been associated with oxidative stress, with mitochondria acting as a major source of reactive oxygen species (ROS) [1]. Additionally, the lack of protective histones and a limited capacity for DNA repair [2,3] make the mitochondrial genome particularly susceptible to damage by ROS, which in turn could affect the mitochondrial role in energy metabolism, apoptosis and aging [4]. Most publications have focused on somatic mutations in mtDNA.

Breast Cancer Risk Factors
The selection of breast cancer risk factors was based on worldwide research data published by the World Cancer Research Fund [32] and Cancer Research UK [33].
Lifestyle factors showing robust evidence for increasing the risk of breast cancer have been identified in premenopausal women, postmenopausal women, or both. Risk factors for postmenopausal breast cancer are overweight or obesity and a greater weight gain in adulthood. Having a greater birth weight is a risk factor for premenopausal breast cancer. Alcohol intake and a greater linear growth (marked by adult attained height) have been associated with both pre- and postmenopausal breast cancer. Conversely, factors that reduce the risk of breast cancer include increased physical activity and breastfeeding, whereas being overweight or obese acts as a protective factor in premenopausal women.
Other established risk factors are older age, White ethnicity, family history of breast cancer, a prior diagnosis of cancer, early menarche, late natural menopause, nulliparity, first pregnancy after the age of 30, hormone replacement therapy, use of oral contraceptives, high bone mineral density, diabetes mellitus and exposure to radiation such as X-rays.
Possible breast cancer risk factors include smoking and night shift work. In contrast, a healthy diet and regularly taking aspirin or other non-steroidal anti-inflammatory drugs are considered protective factors.
We also examined two exposures that may affect breast cancer risk by altering hormone concentrations, namely having had reproductive surgery (hysterectomy and/or oophorectomy) and using non-oral hormonal contraceptives, as well as serum levels of sex hormone binding globulin (SHBG), testosterone, androstenedione, dehydroepiandrosterone-sulfate (DHEAS), insulin-like growth factor I (IGF-I), insulin-like growth factor II (IGF-II) and insulin-like growth factor binding protein 3 (IGFBP-3).
More detailed information on the association of the risk factors considered with breast cancer can be found in the literature [34][35][36][37][38].
Data on most risk and protective factors were available in ALSPAC from mothers, children or both.

Variables Analyzed
Details on the continuous and categorical variables examined in mothers and children are given in Tables S1 and S2 showing the complete set of ALSPAC participants of European ancestry. Data were obtained from questionnaires answered by the subject or by the subject's mother during childhood, as well as from clinic visits at different time points. We limited the analysis to variables measured approximately every two years in the children.
In some cases, usually when numbers were low, we created new variables that reflected whether the individual had ever experienced the activity or shown the trait of interest by combining the different instances where it was assessed. We did this for smoking, having diabetes, having had cancer, having undergone a hysterectomy and/or an oophorectomy, doing night shift work, taking oral contraceptives, using non-oral hormonal contraceptives, receiving hormone replacement therapy, taking aspirin, being a biological parent, undertaking physical activity and getting X-rays. The variable age of menarche in the children was determined by integrating all the information provided by the mother and the child [39].
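The derivation of an "ever" variable from repeated assessments can be illustrated with a minimal sketch. The function name `ever_reported` and the handling of missing answers are assumptions made for illustration; the text does not state how missing assessments were treated.

```python
from typing import Optional, Sequence


def ever_reported(assessments: Sequence[Optional[bool]]) -> Optional[bool]:
    """Collapse repeated yes/no assessments into a single 'ever' indicator.

    None marks a missing assessment. Two choices here are assumptions of
    this sketch, not rules taken from the study: missing assessments are
    ignored, and the result is None when every assessment is missing.
    """
    observed = [a for a in assessments if a is not None]
    if not observed:
        return None  # no information at any time point
    return any(observed)  # True if the trait was reported at least once


# Hypothetical participant: smoking reported at the third of three time points.
print(ever_reported([False, None, True]))
```

Note that such a variable records only whether the trait occurred, not how often, which is the limitation acknowledged in the Discussion.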

Mitochondrial DNA Genotyping
Genotyping methods for mitochondrial and nuclear DNA polymorphisms in ALSPAC have been previously described [40]. Haplogroup assignment was performed using HaploGrep [41]. Mitochondrial DNA haplogroups of mothers were inferred from those of their children in view of the maternal inheritance of mtDNA.
Samples with a quality score of more than 80% were included in the analysis (238 samples were excluded, Table S3). Additionally, we ran the analysis using a quality score cut-off point of >90%.
We grouped the clades as follows: European = H (H + V + subclade R0), J, K, T, U, other European (I + W + X + subclades N1, R1, R3); South Asian = M + subclade R5; East/Southeast Asian = A + C + D; and African = L. The 'other European' group consisted of haplogroups that were present in less than 3% of the sample.
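The grouping above can be written down as a lookup table. The following is only a sketch: the function name `analysis_group` and the prefix-matching rule are assumptions, and the labels returned by HaploGrep may need more careful handling than shown here.

```python
from typing import Optional

# Subclades named explicitly in the grouping; checked before single-letter clades.
SUBCLADE_GROUPS = {
    "R0": "H",
    "N1": "other European",
    "R1": "other European",
    "R3": "other European",
    "R5": "South Asian",
}

# Major clades and the analysis group each one falls into.
CLADE_GROUPS = {
    "H": "H", "V": "H",
    "J": "J", "K": "K", "T": "T", "U": "U",
    "I": "other European", "W": "other European", "X": "other European",
    "M": "South Asian",
    "A": "East/Southeast Asian", "C": "East/Southeast Asian",
    "D": "East/Southeast Asian",
    "L": "African",
}

EUROPEAN_GROUPS = {"H", "J", "K", "T", "U", "other European"}


def analysis_group(haplogroup: str) -> Optional[str]:
    """Map a haplogroup label to its analysis group, or None if unlisted.

    Matching on the label prefix is an assumption of this sketch. A subclade
    prefix is rejected when followed by another digit, so 'R11' is not
    mistaken for 'R1'.
    """
    label = haplogroup.strip()
    for prefix, group in SUBCLADE_GROUPS.items():
        next_char = label[len(prefix):len(prefix) + 1]
        if label.startswith(prefix) and not next_char.isdigit():
            return group
    return CLADE_GROUPS.get(label[:1])


# Hypothetical labels, for illustration only.
for label in ["H1a", "R0a", "N1b", "R5a", "L3e"]:
    group = analysis_group(label)
    print(label, group, group in EUROPEAN_GROUPS)
```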
We restricted our analysis to all individuals who were of European genomic ancestry (as detected by a multidimensional scaling analysis seeded with Haplotype Map 2 (HapMap2) individuals) or self-identified as White (if data on genomic ancestry were missing) and carried a European mtDNA haplogroup. There was information on 7781 mtDNA haplogroups in our working dataset.

Statistical Analysis
We used linear regression to investigate the association of continuous variables with mtDNA haplogroups and chi-squared tests to examine differences in categorical variables. Adjustment for confounders was carried out using linear and logistic regression models with continuous and categorical variables, respectively. When categorical variables exhibited more than two ordered categories ordinal logistic regression was run, also adjusted for confounders. Confounders introduced in the models were age, sex, gestational age and the top 10 principal components accounting for population stratification, where appropriate.
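As an illustration of the unadjusted test for categorical variables, the sketch below computes the Pearson chi-squared statistic and its degrees of freedom for a haplogroup-by-trait contingency table. It is not the Stata code used in the study, the counts are invented, and the p-value lookup is left out because it would require a statistics library.

```python
from typing import List, Tuple


def chi_squared(table: List[List[float]]) -> Tuple[float, int]:
    """Return the Pearson chi-squared statistic and degrees of freedom.

    Rows are haplogroup categories and columns are trait categories.
    Every row and column total must be non-zero.
    """
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Expected count under independence of haplogroup and trait.
            expected = row_totals[i] * col_totals[j] / total
            statistic += (observed - expected) ** 2 / expected
    dof = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, dof


# Invented counts: six haplogroup categories by a binary trait (no / yes).
counts = [[400, 100], [90, 30], [80, 20], [95, 25], [120, 30], [60, 15]]
statistic, dof = chi_squared(counts)
print(round(statistic, 3), dof)
```

With six haplogroup categories and a binary trait the test has five degrees of freedom; the adjusted analyses described above replace this test with regression models that include the confounders.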
We checked that the residuals of the linear regression of each continuous variable on mtDNA haplogroups were normally distributed. Only residuals for DHEAS levels in children showed a markedly non-normal distribution, and therefore the variable was natural log-transformed [42].
Similarly to Howe et al. [40], we used pairwise correlation to determine the number of independent variables to account for when applying the Bonferroni correction for multiple testing [43]. Polychoric correlation was used with binary and ordinal variables. Variables showing a correlation coefficient of 0.8 and above were considered non-independent (data not shown). There were 43 independent variables (out of 59) in the mothers and 48 independent variables (out of 86) in the children; therefore, the multiple testing adjusted p-value cut-off was 0.001 in both cases (0.05/43 ≈ 0.0012 and 0.05/48 ≈ 0.0010).
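The arithmetic behind the cut-off can be checked with a short sketch. The function `bonferroni_threshold` simply divides the significance level by the number of independent variables. The function `count_independent` is a hypothetical greedy rule for counting variables from a correlation matrix; the text does not describe the exact counting procedure, so both it and the use of absolute correlations should be read as assumptions.

```python
from typing import List


def bonferroni_threshold(alpha: float, n_independent: int) -> float:
    """Per-test significance threshold for n_independent tests."""
    if n_independent < 1:
        raise ValueError("n_independent must be at least 1")
    return alpha / n_independent


def count_independent(corr: List[List[float]], cutoff: float = 0.8) -> int:
    """Greedy count of variables not correlated at or above the cutoff.

    Assumption of this sketch: a variable is kept only if its absolute
    correlation with every previously kept variable is below the cutoff.
    """
    kept: List[int] = []
    for i in range(len(corr)):
        if all(abs(corr[i][j]) < cutoff for j in kept):
            kept.append(i)
    return len(kept)


# Counts of independent variables reported in the text.
print(round(bonferroni_threshold(0.05, 43), 4))  # mothers
print(round(bonferroni_threshold(0.05, 48), 4))  # children

# Invented 3-variable correlation matrix: the first two are non-independent.
example = [[1.0, 0.9, 0.1], [0.9, 1.0, 0.2], [0.1, 0.2, 1.0]]
print(count_independent(example))
```

Both thresholds round to 0.001 at three decimal places, which matches the cut-off used for mothers and children.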
We tested whether there was residual population stratification due to mtDNA clustering within our working dataset of European/White individuals by plotting the top two principal components by mitochondrial lineage as reported by Erzurumluoglu et al. for Y chromosome haplogroups in ALSPAC [44].
All analyses were performed with the statistical package Stata 14 (StataCorp, College Station, TX, USA).
Statistical power was calculated with mitPower using binary variables, as this approach has not yet been developed for continuous or ordinal variables (http://bioinformatics.cesga.es/mitpower/) [45].

Results
Mitochondrial DNA haplogroups found in mothers and children of ALSPAC that were used in this study are shown in Table 1. For a more detailed haplogroup report see Table S3. The most frequent haplogroup was HV, representing almost 50% of the sample (49.3%), in agreement with the probable haplogroup composition of the UK estimated in a recent mtDNA study [46].
It is interesting to note, although not completely unexpected [47], that despite selecting individuals of European genomic ancestry and White ethnicity around 1% of them carried non-European mtDNA haplogroups (i.e., A, C, D, L, M, N) (Table S3), and were therefore excluded from the analysis. This can also be observed in individuals with an mtDNA quality score over 90% (Table S3). We uncovered no evidence of residual population stratification in the group of participants who carried a European mtDNA haplogroup (Figures S1 and S2). No differences were identified in the distribution of mtDNA haplogroups by sex of the child (p = 0.15).
In the unadjusted analyses sample sizes ranged from 143 to 7629 in the mothers, and from 142 to 7373 in the children, whereas sample sizes for the adjusted analyses were between 100 and 4863, and between 137 and 6838, respectively.
Given the sample sizes available for the binary variables, the haplogroup frequencies and the number of haplogroups included in the analysis, we had 80% power to detect odds ratios (OR) of 1.2 (ever smoked) to 1.7 (being a biological parent) at an α level of 0.05 if considering HV as the risk haplogroup. Under the same conditions, at the significance level corrected for multiple testing (p ≤ 0.001), those ORs become ~1.3 to 2.1. This calculation excludes the variable 'having diabetes' in children as there were only 26 subjects with the disease and mtDNA data in ALSPAC.

Association between Mitochondrial DNA Haplogroups and Breast Cancer Risk Factors in ALSPAC Mothers
In the unadjusted analysis 10 nominal associations of mtDNA haplogroups with breast cancer risk factors were found (p ≤ 0.05). Most of these associations involved body composition variables such as body mass index (BMI), weight, height and bone mineral density (Table S4). Seven associations were apparent after correction for age (or gestational age in the case of IGF and sex hormone measurements in pregnancy) and the top 10 principal components (Table 2). However, no strong evidence of association was present after applying a Bonferroni correction for multiple testing.

Association between Mitochondrial DNA Haplogroups and Breast Cancer Risk Factors in ALSPAC Children
Likewise, among the children no associations between mtDNA haplogroups and breast cancer risk factors were uncovered after multiple testing correction. All three associations showing a p ≤ 0.05 in unadjusted models were related to BMI and height (Table S5) and two of these were also detected after controlling for age, sex and the top 10 principal components (Table 3).
Similar results overall were obtained when using the more stringent quality score threshold of >90% (Table S6).

Discussion
In this study we have not found evidence that major mtDNA haplogroups underlie differences in breast cancer risk factor distribution. This finding lends some support to previous research showing that mtDNA lineages are not associated with breast cancer risk [23,48], although it is still possible that mtDNA variation directly affects cancer development without going through any of the risk factors investigated here. However, we did not observe any associations with cancer-specific traits available in the cohort, such as any cancer diagnosis in the mother or a breast cancer diagnosis in her biological mother, nor with having had a mammogram. Conversely, if mtDNA variation plays a role in breast cancer onset via any of the tested exposures, it might represent only a small increase in risk. Nevertheless, large scale case-control studies are needed to properly examine the association of mtDNA haplogroups with breast cancer.
We tried to reduce sources of bias such as small sample sizes and population stratification by using the ALSPAC cohort, where we had over 7700 individuals with mtDNA haplogroup data who were of European genomic ancestry or self-identified as being of White ethnicity. In addition, we ran regression models adjusted for principal components that reflect population structure in the South West region of the UK. Because ALSPAC has collected such a comprehensive set of phenotypes we were also able to examine most of the established and possible breast cancer risk factors.
One limitation of our study is that a handful of the traits examined, in particular the hormone measures, had a low number of observations, which decreased our confidence in those results. The minimum difference detectable with 80% power was an OR of 1.2 for the binary variables, as estimated using mitPower. In addition, some of the derived variables grouped all instances of a phenotype together, which may have prevented us from noticing an effect that depended on the frequency of such a phenotype.
Pre-Bonferroni correction, we detected associations of mtDNA haplogroups with body composition variables (mainly BMI, weight, height), which were seen in mothers as well as children and at various instances across the lifetime. A few earlier studies have shown an association of mtDNA haplogroups with obesity and obesity-related traits [17,49,50], while others reported no evidence of association [51,52]. In our analysis these associations have not survived multiple testing correction, so it is possible that they have arisen by chance or are the result of persisting population stratification (as different subsamples of mothers and children responded to questionnaires and were involved in the clinics), given that height and BMI are considerably structured across Europe [53]. On the other hand, we did not find any discernible stratification beyond what was accounted for by the use of genome-wide principal components, in agreement with a recent study that examined Y chromosome haplogroups in ALSPAC and showed that clustering by male lineage did not affect the association between autosomal single nucleotide polymorphisms (SNPs) and BMI [44].
Whilst we did not have a replication cohort to confirm our findings, the analysis was run in two groups of individuals, albeit related, each with a different set of phenotypes. Further investigation is needed to strengthen the results presented here; however, these could prove useful to generate hypotheses for future, more powerful studies.

Conclusions
Well-established and possible breast cancer risk factors were not found to be associated with mtDNA haplogroups in ALSPAC, a cohort of predominantly European ancestry. This study can serve as the basis for a further detailed analysis of the influence of mtDNA variation on nutritional, anthropometric and lifestyle exposures that underlie the susceptibility to breast cancer and other cancer types.
Supplementary Materials: The following are available online at http://www.mdpi.com/2073-4425/9/8/395/s1, Table S1: Breast cancer risk factors in ALSPAC mothers of European ancestry, Table S2: Breast cancer risk factors in ALSPAC children of European ancestry, Table S3: Distribution of major mitochondrial DNA haplogroups and subgroups in ALSPAC mothers and children of European descent based on HaploGrep quality scores, Table S4: Breast cancer risk factors and major mtDNA haplogroups in ALSPAC mothers of European ancestry, unadjusted analysis, Table S5: Breast cancer risk factors and major mtDNA haplogroups in ALSPAC children of European ancestry, unadjusted analysis, Table S6: p-values obtained in the adjusted analysis using a quality score threshold of >90% compared to >80%, Figure S1: Top two principal components (PCs) by mtDNA haplogroup in ALSPAC mothers, Figure S2: Top two principal components (PCs) by mtDNA haplogroup in ALSPAC children.
Author Contributions: C.B. and S.R. conceived the study. A.M.E. generated the mtDNA haplogroup data. V.R. and C.B. analyzed the data. C.B. wrote the paper. All authors interpreted the data, read and approved the final manuscript.