Relationship of established risk factors with breast cancer subtypes

Abstract
Background Breast cancer is a heterogeneous disease, divided into subtypes based on the expression of estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2). Subtypes have different biology and prognosis, with accumulating evidence of different risk factors. The purpose of this study was to compare breast cancer risk factors across tumor subtypes in a large, diverse mammography population.
Methods Women aged 40–84 without a history of breast cancer who had a screening mammogram at one of three United States health systems from 2006 to 2015 were included. Risk factor questionnaires were completed at the mammogram visit and supplemented by electronic health records. Invasive tumor subtype was defined by immunohistochemistry as ER/PR+HER2−, ER/PR+HER2+, ER/PR−HER2+, or triple-negative breast cancer (TNBC). Cox proportional hazards models were run for each subtype. Associations of race, reproductive history, prior breast problems, family history, breast density, and body mass index (BMI) were assessed. The association of tumor subtypes with screen detection and interval cancer was assessed using logistic regression among invasive cases.
Results The study population included 198,278 women with a median of 6.5 years of follow-up (IQR 4.2–9.0 years). There were 4002 invasive cancers, including 3077 (77%) ER/PR+HER2−, 300 (8%) TNBC, 342 (9%) ER/PR+HER2+, and 126 (3%) ER/PR−HER2+ subtype. In multivariate models, Black women had 2.7 times the risk of TNBC compared with white women (HR = 2.67, 95% CI 1.99–3.58). Breast density was associated with increased risk of all subtypes. BMI was more strongly associated with ER/PR+HER2− and HER2+ subtypes among postmenopausal women than premenopausal women. Breast density was more strongly associated with ER/PR+HER2− and TNBC among premenopausal than postmenopausal women. TNBC was more likely to be interval cancer than other subtypes.
Conclusions These results have implications for risk assessment and understanding of the etiology of breast cancer subtypes. More research is needed to determine what factors explain the higher risk of TNBC for Black women.


| INTRODUCTION
Breast cancer subtypes are typically classified based on immunohistochemistry according to the expression of estrogen receptor (ER), progesterone receptor (PR), and human epidermal growth factor receptor 2 (HER2). Tumors that are ER and/or PR positive and HER2 negative (ER/PR+HER2−) are the most common, accounting for approximately 73% of breast cancers. 1 Treatments targeting ER, PR, and HER2 pathways have improved breast cancer outcomes. However, drastic survival differences still exist by tumor subtype, with 5-year survival near 95% for ER/PR+HER2− tumors but just over 75% for triple-negative breast cancers (TNBC), for which limited targeted therapies exist. 2,3 A better understanding of the etiology of each subtype could identify pathways that could be targeted by treatment and preventive interventions.
The existing literature suggests that breast cancer subtypes have unique etiologies. 4 The canonical breast cancer risk factors used in risk prediction models, such as family history, breast biopsy, hyperplasia, and reproductive risk factors, largely reflect the risk of ER/PR+HER2− breast cancer. Prior studies have produced inconsistent results with respect to the associations of body mass index (BMI) with the risk of breast cancer subtypes. 4 Furthermore, studies have differed in their assessment of interactions of BMI and breast density and the interactions of these factors with menopause status, which may explain some of the inconsistency across studies.
The purpose of the study was to compare breast cancer risk factors across breast cancer subtypes, with a specific focus on assessing associations of BMI and breast density and the interactions of these factors with menopause status.

| Study population
The study population included women aged 40–84 who had a screening mammogram at one of three United States health systems from 2006 to 2015. Women were excluded from the analysis if they had prior breast cancer, breast implants, or a known BRCA1/2 mutation. We additionally excluded women with less than 6 months of follow-up, including women diagnosed with breast cancer within 6 months of mammography, to maintain temporality between risk factor ascertainment and cancer diagnosis. Finally, women who died but did not have a known date of death or date of last contact were excluded from analyses. Details of the study populations and exclusions are provided in Figure 1. This study was approved by the University of Pennsylvania Institutional Review Board.

| Risk factors
We used a tiered strategy for the assessment of risk factors, with the primary source being a risk factor questionnaire completed by the patient at the time of their mammogram. All risk factors were self-reported, with the exception of age and breast density. Missing information was then supplemented by electronic health records (EHR) when appropriate. If self-reported BMI information was missing, EHR weight and height and/or BMI were used if the body measurement occurred within 1 year prior to or within 6 months after the screening mammogram (N = 3160, 1.6%). Imputation was used to estimate height and then calculate BMI for an additional 18,163 patients (9.2%) who did not have self-reported or EHR height data but did have weight data, using the median value for that site. Implausible BMI values (those <12 or >82) were considered missing. 5 Missing information on prior atypical hyperplasia was also extracted from EHR (N = 136). Missing information on BI-RADS (Breast Imaging Reporting and Data System) breast density was extracted from radiology reports using natural language processing as described previously (N = 8855). 6 Since BI-RADS density category titles changed during the course of the study, the BI-RADS 4th edition 7 density categories (1, 2, 3, or 4) were translated to the corresponding BI-RADS 5th edition categories (A, B, C, and D). 8 Missing prior biopsy information was obtained from linkage to pathology records (N = 92). Menopause status was defined based on age and self-reported menstruation status; patients were considered postmenopausal if they had stopped menstruating or were over 55 years of age, and premenopausal otherwise. 9
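The derivation rules in this paragraph can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the study's actual code: the function names, the metric units (kilograms and meters), and the handling of a missing menstruation answer are hypothetical choices made for the example.

```python
from typing import Optional, Union

# BI-RADS 4th edition numeric density categories mapped to 5th edition letters.
BIRADS_4TH_TO_5TH = {1: "A", 2: "B", 3: "C", 4: "D"}


def bmi_or_missing(weight_kg: Optional[float], height_m: Optional[float]) -> Optional[float]:
    """Return BMI in kg/m^2, or None if inputs are missing or the BMI is implausible."""
    if weight_kg is None or height_m is None or height_m <= 0:
        return None
    bmi = weight_kg / height_m ** 2
    # The text treats BMI values below 12 or above 82 as missing.
    if bmi < 12 or bmi > 82:
        return None
    return bmi


def density_category(value: Union[int, str]) -> str:
    """Translate a 4th edition category (1-4) to the 5th edition letter (A-D)."""
    if isinstance(value, int):
        return BIRADS_4TH_TO_5TH[value]
    letter = value.strip().upper()
    if letter not in BIRADS_4TH_TO_5TH.values():
        raise ValueError(f"unknown BI-RADS density category: {value!r}")
    return letter


def menopause_status(age_years: float, stopped_menstruating: Optional[bool]) -> str:
    """Postmenopausal if menstruation has stopped or age is over 55, else premenopausal.

    Assumption: a missing menstruation answer (None) is decided by age alone.
    """
    if stopped_menstruating or age_years > 55:
        return "postmenopausal"
    return "premenopausal"
```

For example, `menopause_status(50, False)` yields `"premenopausal"`, while a 50-year-old who has stopped menstruating is classified as postmenopausal.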

| Cancer outcomes
Breast cancer diagnoses through December 31, 2018 were identified from health system cancer registries as well as the Massachusetts, Pennsylvania, New Jersey, and Delaware state cancer registries. Invasive cancers were characterized based on the expression of ER, PR, and HER2 from immunohistochemistry as reported to the cancer registries. In addition, HER2 expression was manually abstracted from EHR for cases diagnosed prior to 2010. Tumor subtypes were defined as ER and/or PR positive and HER2 negative (ER/PR+HER2−), ER and/or PR positive and HER2 positive (ER/PR+HER2+), ER and PR negative and HER2 positive (ER/PR−HER2+), or ER, PR, and HER2 negative (TNBC). Additionally, for cases diagnosed through 2016, we categorized whether invasive breast cancers were screen-detected or not screen-detected.
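As a rough illustration, the four-way subtype definition can be written as a small function. The function name, the boolean encoding of receptor status, and the treatment of unknown receptor values are assumptions made for this sketch; the text does not describe how incomplete receptor data were handled.

```python
from typing import Optional


def tumor_subtype(er: Optional[bool], pr: Optional[bool], her2: Optional[bool]) -> Optional[str]:
    """Classify an invasive tumor from ER, PR, and HER2 status (True = positive).

    Returns None when the subtype cannot be determined (an assumption of this sketch).
    """
    if her2 is None:
        return None
    if er is True or pr is True:
        # "ER and/or PR positive": one positive receptor is enough.
        hormone_positive = True
    elif er is False and pr is False:
        hormone_positive = False
    else:
        # One receptor negative and the other unknown: treated as undetermined here.
        return None
    if hormone_positive:
        return "ER/PR+HER2+" if her2 else "ER/PR+HER2-"
    return "ER/PR-HER2+" if her2 else "TNBC"
```

Under this encoding, a tumor that is ER negative, PR negative, and HER2 negative maps to `"TNBC"`, and any tumor with unknown HER2 status is left unclassified.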
We defined cases as screen-detected if there was a positive mammogram (BI-RADS 0, 3, 4, 5) within 1 year prior to the cancer diagnosis date. We further defined cases as interval cancers if there was a negative mammogram (BI-RADS 1, 2) within 1 year prior to cancer diagnosis, consistent with established definitions. 10 Cancers without a mammogram within 1 year prior to diagnosis were not coded as screen-detected or interval (N = 902, 37% of invasive cancers).
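The detection-mode rules above can be sketched as follows. Several details are assumptions of this example rather than statements from the study: the 1-year window is taken as 365 days, a positive mammogram takes precedence when both positive and negative exams fall inside the window, and the function and variable names are hypothetical.

```python
from datetime import date, timedelta
from typing import Iterable, Optional, Tuple

POSITIVE_BIRADS = {0, 3, 4, 5}   # assessments treated as a positive mammogram
NEGATIVE_BIRADS = {1, 2}         # assessments treated as a negative mammogram
WINDOW = timedelta(days=365)     # assumption: "1 year" taken as 365 days


def detection_mode(diagnosis: date, mammograms: Iterable[Tuple[date, int]]) -> Optional[str]:
    """Return 'screen-detected', 'interval', or None when no exam falls in the window."""
    in_window = [
        birads
        for exam_date, birads in mammograms
        if timedelta(0) <= diagnosis - exam_date <= WINDOW
    ]
    # Assumption: a positive exam in the window outranks a negative one.
    if any(b in POSITIVE_BIRADS for b in in_window):
        return "screen-detected"
    if any(b in NEGATIVE_BIRADS for b in in_window):
        return "interval"
    return None
```

A case returning `None` corresponds to the cancers that were not coded as either screen-detected or interval.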

| Statistical analysis
Cox proportional hazards modeling was used to estimate the hazard ratios (HRs) for breast cancer subtypes for each risk factor. We ran separate models with each tumor subtype as the outcome, using a time origin of 6 months after the mammogram date, with censoring at the diagnosis of ductal carcinoma in situ (DCIS) or another tumor subtype, death, date of last contact for patients with a missing date of death, or December 31, 2018 for patients not known to have died. We tested the interactions of BMI with menopause status, breast density with menopause status, and BMI with breast density for each tumor subtype, based on interactions reported in prior studies. 11,12 We also tested the interaction of breast density with race/ethnicity. When testing interactions, breast density was grouped into two categories: non-dense for those with a density of BI-RADS A or B, and dense for those with a density of BI-RADS C or D. Additionally, we examined associations of the number of births with tumor subtypes among the subgroup of parous women. We also performed site-stratified Cox models, but since results were similar, unstratified models are presented. Missing data were treated as an additional category in modeling, but those estimates are not reported here. Sensitivity analyses were performed after multiple imputation using chained equations (MICE) to evaluate the effect of missing data on our results. 13 Finally, we performed logistic regression among cancer cases to estimate the odds of the cancer being screen-detected or not, and the odds of the cancer being an interval cancer or not, according to breast cancer subtype, adjusted for age, race, atypical hyperplasia, family history, breast density, BMI, and menopause status, factors that have been previously associated with interval cancer risk. 5,[14][15][16] An alpha level of 0.05 was considered statistically significant.

| RESULTS
Table 1 displays the characteristics of the study population. Together, the study population included 198,278 women with a median of 6.5 years of follow-up (IQR 4.2-9.0 years). Participants had a mean age of 54.3 years at the time of screening. About 11% of participants had ever had a breast biopsy, and 0.9% had previously had atypical hyperplasia. Table S1. The associations of known breast cancer risk factors with breast cancer subtypes were assessed using multivariable models ( Table S2. In addition, models estimated using multiple imputation yielded similar results and are displayed in Table S5.
As expected based on the previous literature, 11 there was a significant interaction of menopause status with BMI (p < 0.001). Overweight and obesity were more strongly associated with ER/PR+HER2− breast cancer among postmenopausal women than premenopausal women (Table 3; postmenopausal HR for BMI over 30 kg/m² = 1.69, 95% CI 1.50-1.91). Interactions were not statistically significant for ER/PR+HER2+ or ER/PR−HER2+ breast cancer. Associations of BMI with TNBC were of similar magnitude to those seen in other subtypes but were only statistically significant for postmenopausal overweight women (HR = 1.49, 95% CI 1.02-2.01). We also observed a significant interaction between menopause status and breast density for both ER/PR+HER2− (p < 0.001) and TNBC (p = 0.019), with a stronger association among premenopausal than postmenopausal women (TNBC premenopausal HR for dense breasts = 2.84, 95% CI 1.61-5.04). There was no significant interaction between menopause status and dense breasts for combined HER2+ subtypes (Table S3). There were no statistically significant interactions between BMI and breast density or between race/ethnicity and breast density for any breast cancer subtype (data not shown).
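For readers less familiar with interaction terms in Cox models, the structure being tested can be written generically as below. This is a textbook form offered as a sketch, not the authors' exact specification: the symbols, the use of a single binary obesity indicator, and the covariate vector z are assumptions introduced for illustration.

```latex
% Generic Cox model with a BMI-by-menopause interaction (illustrative notation):
%   Obese = 1 if BMI > 30 kg/m^2, Post = 1 if postmenopausal, z = other covariates
h(t \mid x) = h_0(t)\,\exp\bigl(\beta_1\,\mathrm{Obese} + \beta_2\,\mathrm{Post}
              + \beta_3\,(\mathrm{Obese}\times\mathrm{Post}) + \gamma^{\top} z\bigr)

% Stratum-specific hazard ratios for obesity implied by this form:
\mathrm{HR}_{\text{pre}} = e^{\beta_1}, \qquad
\mathrm{HR}_{\text{post}} = e^{\beta_1 + \beta_3}
```

In this form, the interaction p-value tests whether β₃ differs from zero, that is, whether the hazard ratio for obesity differs between premenopausal and postmenopausal women.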
Among parous women, a greater number of births was associated with reduced risk of ER/PR+HER2− breast cancer (Table 4; HR = 0.95, 95% CI 0.92-0.99). There was no significant association of the number of births with TNBC, ER/PR−HER2+, or all HER2+ cancers (Table S4). There was no association of the number of births as a continuous variable with ER/PR+HER2+ cancers; however, women with two births had a significantly higher risk than women with one birth (HR = 1.39, 95% CI 1.02-1.90), whereas women with three or more births had no significant difference in risk compared with women with one birth (HR = 0.92, 95% CI 0.64-1.32).
Table 5 displays the associations of cancer subtypes with screen detection and interval cancers. Compared with ER/PR+HER2− cancers, TNBCs had 33% lower odds of being screen-detected (OR = 0.67, 95% CI 0.50-0.88) and more than twice the odds of being interval cancers (OR = 2.26, 95% CI 1.60-3.20).
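The arithmetic behind phrases such as "33% lower" can be checked with a short sketch. The helper names are hypothetical. The second function relies on the standard property that a Wald confidence interval for a ratio measure is symmetric on the log scale, so the geometric mean of its bounds should land near the point estimate, up to rounding in the reported values.

```python
import math


def percent_change_in_odds(odds_ratio: float) -> float:
    """Percent change in odds relative to the reference group (negative = lower odds)."""
    return (odds_ratio - 1.0) * 100.0


def log_scale_midpoint(lower: float, upper: float) -> float:
    """Geometric mean of the confidence limits, i.e. the midpoint on the log scale."""
    return math.exp((math.log(lower) + math.log(upper)) / 2.0)


# Values reported in the text for TNBC versus ER/PR+HER2- cancers.
screen_detected_or = 0.67
interval_or = 2.26

lower_odds_pct = -percent_change_in_odds(screen_detected_or)   # about 33
interval_midpoint = log_scale_midpoint(1.60, 3.20)              # close to 2.26
```

Note that an odds ratio describes a change in odds, not directly in probability; the two are close only when the outcome is uncommon.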

| CONCLUSIONS
Our results highlight both similarities and differences in risk factors across breast cancer subtypes. Higher breast density was associated with increased risk of all four tumor subtypes, with a stronger association among premenopausal women for ER/PR+HER2− and TNBC. In contrast, the relationship with other risk factors varied across subtypes with distinct sets of risk factors for TNBC (age, race, BMI, and density) and ER/PR+HER2+ (prior biopsy, atypical hyperplasia, BMI, density), ER/PR−HER2+ (family history and density) and ER/PR+HER2− (age, race, prior biopsy, atypical hyperplasia, age at first birth, age at menarche, family history, BMI, and density). Additionally, we found that TNBCs were less likely to be screen-detected and more likely than other subtypes to be diagnosed as interval cancers. These results have implications both for risk assessment and understanding of the etiology of breast cancer subtypes. Our results are consistent with a recent large pooled analysis of six cohorts or case-control studies that found that breast density was associated with increased risk of all intrinsic molecular subtypes. 17 This analysis also observed a significant interaction between percent mammographic density and age for Luminal A cancers, with breast density having a stronger association in younger women. This study observed a similar trend among TNBC that did not reach statistical significance. Furthermore, they found no significant association of breast density with BMI. 17 Other, smaller studies have yielded inconsistent associations of breast density with breast cancer subtypes. [18][19][20][21][22][23][24][25][26] Our finding of the interaction of menopause status and BI-RADS breast density is clinically relevant, as breast density has increasingly been used to identify women who may benefit from supplemental screening, given that mammography is less sensitive among women with dense breasts. 
There is controversy about the risk-to-benefit ratio of supplemental screening for all women with dense breasts, given that nearly half of the screening-eligible population has heterogeneously or extremely dense breasts. However, if young women with dense breasts are at particularly high risk for TNBC, which has a poor prognosis, supplemental screening may be warranted. Our results are based on a small number of cases among young women, so future studies are needed to validate the large HR that we observed with respect to TNBC in premenopausal women.
While it is well known that Black women have a higher risk of TNBC, it is striking that Black women had a nearly threefold increased risk even with comprehensive adjustment for breast cancer risk factors in a screened population, a magnitude that has been observed in previous studies which adjusted for fewer risk factors. [27][28][29] The HR for race was nearly identical prior to multivariable adjustment, suggesting that differences in known risk factors do not explain this disparity. We observed no statistically significant association between age at first birth and risk of TNBC, in contrast to the protective effect for ER/PR+HER2−. This is consistent with three prior studies which also found no significant association of age at first birth with TNBC, 20,30,31 but contrasts with one prior study which found that older age at first birth was associated with fewer cases of TNBC. 32 We did not see a significantly increased risk of TNBC among women with greater parity, as has been reported in prior studies. 4,[33][34][35][36] Unfortunately, we lacked data on breastfeeding history, which has been shown to be particularly protective against TNBC among women with high parity. 35
Consistent with existing data and previous studies, [37][38][39] Black women had a lower risk of ER/PR+HER2− breast cancer compared to White women, as expected based on subtype-specific incidence rates, 37 though it is noteworthy that this was true even after adjustment for breast cancer risk factors. We observed that older age was associated with an increased risk of TNBC. This may seem inconsistent with the prior literature reporting younger age to be associated with increased risk of TNBC. 27,[39][40][41][42][43] For example, a large registry-based study of patients in New Jersey showed that among cancer cases, the OR for TNBC was 1.77 for women aged 20-39, but only 1.10 for women aged 40-49, compared with women aged 50-64. 39 However, these studies were case-only analyses, whereas our study compares women diagnosed with TNBC to women not diagnosed with cancer. While patients with TNBC may be younger than patients diagnosed with other tumor subtypes, TNBC incidence increases with age. Based on SEER estimates, the TNBC incidence rate is 4.0 per 100,000 for women aged 20-39 years compared with 38.9 per 100,000 for women aged 65 and older. 37 Therefore, our results are not inconsistent with prior data.
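The distinction between case-only and cohort comparisons can be made concrete with a small calculation. The two TNBC incidence rates below are the SEER figures quoted in the text; the all-subtype incidence rates are hypothetical placeholders chosen only to illustrate the mechanism and should not be read as real data.

```python
def rate_ratio(rate_exposed: float, rate_reference: float) -> float:
    """Ratio of two incidence rates expressed in the same units."""
    return rate_exposed / rate_reference


def subtype_share(subtype_rate: float, all_cancer_rate: float) -> float:
    """Fraction of all breast cancers in an age group that are of the given subtype."""
    return subtype_rate / all_cancer_rate


# TNBC incidence per 100,000 women, as quoted from SEER in the text.
tnbc_rate_20_39 = 4.0
tnbc_rate_65_plus = 38.9

# HYPOTHETICAL all-subtype incidence per 100,000 women, for illustration only.
all_rate_20_39 = 25.0
all_rate_65_plus = 420.0

# Cohort-style comparison: incidence of TNBC is far higher in older women.
tnbc_rate_ratio = rate_ratio(tnbc_rate_65_plus, tnbc_rate_20_39)   # about 9.7

# Case-only comparison: TNBC can still make up a larger share of cases in younger women.
share_young = subtype_share(tnbc_rate_20_39, all_rate_20_39)
share_old = subtype_share(tnbc_rate_65_plus, all_rate_65_plus)
```

With these placeholder totals, TNBC incidence is nearly ten times higher in the older group, yet TNBC accounts for a larger fraction of cases in the younger group, so the two kinds of findings can coexist.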
We found that prior biopsy and atypical hyperplasia were strongly associated with ER/PR+ cancers irrespective of HER2 status but were not associated with TNBC, recognizing that the HR for the association with ER/PR−HER2+ was relatively large but not statistically significant. Prior biopsy and atypical hyperplasia likely reflect changes in the breast that suggest higher subsequent risk of hormone receptor-positive and HER2-positive tumors, but these changes do not appear to correlate with TNBC. This finding further points to unique etiologic mechanisms for TNBC.
As expected based on prior studies, 11 the effect of BMI on ER/PR+HER2− breast cancer differed between premenopausal and postmenopausal women, with a greater effect in postmenopausal women. A similar relationship was seen for HER2+ cancers, although the interaction term was not statistically significant (p = 0.07) for ER/PR−HER2+. Most prior studies have found no association between BMI and HER2+ cancers, 32,44-47 although one study reported a higher risk of HER2+ cancers in overweight women. 42 Although meta-analyses have found a higher risk of ER− cancers and TNBC among premenopausal obese women, 12,32,45,48-50 BMI was not significantly associated with TNBC in either premenopausal or postmenopausal women in this analysis. A prior analysis of Black women found a positive association between obesity and TNBC in premenopausal women but a negative association in postmenopausal women, raising the possibility that the relationship between obesity, menopausal status, and TNBC may also vary by race. 51 Additional studies will be needed to further investigate racial differences in the association of obesity and menopausal status with TNBC.
Our finding that interval cancers are more likely to be triple negative is consistent with existing literature. A population-based study in Ireland found that triple-negative tumors were over three times more likely to be interval cancers than screen-detected. 52 Similarly, a Canadian population-based study showed that interval cancers were nearly three times more likely to be ER negative than screen-detected cancers, though this study lacked data on HER2 status. 53 One limitation that should be noted is that we lacked information on mammography screening at outside facilities, and therefore we may have underestimated the numbers of screen-detected and interval cancers. Patients without a mammogram within 1 year prior to their cancer diagnosis were not coded as screen-detected or interval; these patients represented 37% of invasive tumors.
The strengths of our study include the prospective design among a large population of women undergoing mammography at three large centers and the assessment of established breast cancer risk factors along with BMI and breast density, allowing us to assess interactions among risk factors. Additionally, the study includes a substantial number of Black women, who are at high risk of dying of cancer but have been underrepresented in research studies to date. The limitations of our study include missing data on some risk factors, an inherent problem in studies using data collected for clinical purposes. However, given the prospective design, we do not expect that missing data would be differential by breast cancer diagnosis. We lacked data on the use of hormone replacement therapy (HRT), which is strongly associated with both breast density and breast cancer risk. 54,55 However, given that the current use of HRT is most strongly associated with risk of ER/PR+HER2− breast cancer and that the prevalence of current HRT use is small, 56 we do not expect that adjustment for HRT use would greatly affect our results. 20,57,58 Finally, despite the large study sample, the numbers of TNBC and HER2+ cases were limited.
Our results add to the literature describing differences in risk factors across breast cancer subtypes. We found that breast density may be a particularly strong risk factor for TNBC among premenopausal women, and that the other risk factors evaluated in this study do not explain racial differences in TNBC between Black and white women. These results highlight the urgency of exploring novel risk factors, such as genetics, epigenetics, biomarkers, and environmental exposures to understand the risk for less common but aggressive triple-negative and ER/PR−HER2+ breast cancer subtypes, as existing risk factors appear largely irrelevant to risk of these tumors.