In modern times, how important are breast cancer stage, grade and receptor subtype for survival: a population-based cohort study

In breast cancer, immunohistochemistry (IHC) subtypes, together with grade and stage, are well-known independent predictors of breast cancer death. Given the immense changes in breast cancer treatment and survival over time, we used recent population-based data to test the combined influence of IHC subtypes, grade and stage on breast cancer death. We identified 24,137 women aged 20 to 74 who were diagnosed with invasive breast cancer between 2005 and 2015 in the database of the Cancer Registry of Norway. Kaplan-Meier curves, mortality rates and adjusted hazard ratios for breast cancer death were estimated by IHC subtypes, grade, tumour size and nodal status during 13 years of follow-up. Within all IHC subtypes, grade, tumour size and nodal status were independent predictors of breast cancer death. When combining all prognostic factors, the risk of death was 20- to 40-fold higher in the worst groups compared to the group with the smallest size, low grade and ER+PR+HER2− status. Among node-negative ER+HER2− tumours, larger size conferred a significantly increased breast cancer mortality. ER+PR−HER2− tumours of high grade and advanced stage showed particularly high breast cancer mortality, similar to TNBC. When examining early versus late mortality, grade, size and nodal status explained most of the late (> 5 years) mortality among ER+ subtypes. There is a wide range of risks of dying from breast cancer, also across small breast tumours of low/intermediate grade, and among node-negative tumours. Thus, even with modern breast cancer treatment, stage, grade and molecular subtype (reflected by IHC subtypes) matter for prognosis.


Introduction
The prognosis of invasive breast cancer is strongly determined by tumour size (T), nodal spread (N) and distant metastases (M) at the time of diagnosis [1][2][3][4]. In addition, routine immunohistochemistry (IHC) tumour markers, i.e. estrogen receptor (ER), progesterone receptor (PR) and human epidermal growth factor receptor 2 (HER2), as well as grade and Ki67, are independent predictors of breast cancer death and have therefore together with TNM been guiding treatment decisions in the past decades [5][6][7].
In the early 2000s, the five intrinsic molecular subtypes of breast cancer (luminal A, luminal B, basal-like, Erb-B2/HER2-enriched, and normal breast-like) were first described, separated by gene expression analysis and with different biological properties and outcomes [8,9]. Today, breast carcinomas can be classified into four of these types by a commercial 50-gene molecular signature [10]. This molecular classification is however recent and not yet widely used in clinical practice. Therefore, clinicians and researchers have used subtypes defined by IHC markers as proxies for the molecular subtypes [11].
Although the prognostic value of established IHC markers, TNM and grade is undisputed, they are commonly not assessed in exhaustive combinations, even in large studies [12][13][14][15][16]. The large registry-based epidemiological studies that have divided IHC subtypes into more detail by grade and TNM have still not fully examined the potential in the detailed stratification that is possible [17][18][19].
With the introduction of new treatments, it is important to continuously re-evaluate the role of classical markers for prognosis in large population-based materials to confirm and, if necessary, update the tumour/patient stratification. Such updated results will provide crucial guidance to the molecular scientists in their efforts to identify new markers for patient subgroups with insufficient prognostic characterization.
Using nationwide cancer registry data, we investigated the combined contribution of routine clinicopathologic markers (ER/PR/HER2 status, grade, tumour size and nodal status) to breast cancer-specific death up to 13 years after diagnosis. We also investigated the combined influence of these factors on early (< 5 years) and late (> 5 years) breast cancer death, in order to identify subgroups with particularly high mortality in different risk windows.

Study population
We identified a cohort of women diagnosed with invasive breast cancer from the Norwegian Breast Cancer Registry at the Cancer Registry of Norway (CRN) [20,21]. The CRN has recorded new cancer cases in Norway since 1953. The registry database is considered 98.8% complete, and for breast cancer, 99.3% of cases were morphologically verified [22]. The current study included women aged 20-74 years with a primary invasive breast cancer (ICD 10 = C50) diagnosed between January 2005 and December 2015, and with no prior history of invasive carcinoma recorded in the CRN, including n = 24,386 women. Using ICD-O-3 morphology codes [23], we excluded tumours which were not morphologically verified (n = 35), not confirmed as primary (n = 36), or non-epithelial tumours or Paget's disease (n = 154). By routine linkage to Norwegian population registries, the CRN also has information on vital status, date and cause of death and date of emigration. Women with unclear residency status at the time of cancer diagnosis were excluded (n = 15). After these exclusions, the cohort comprised n = 24,146 women.
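The exclusion steps above can be checked with simple arithmetic. The following is a minimal Python sketch using the counts given in the text; the dictionary labels and the helper name `apply_exclusions` are illustrative and are not registry variable names.

```python
# Counts are taken from the text; labels are illustrative, not CRN variable names.
initial = 24_386
exclusions = {
    "not morphologically verified": 35,
    "not confirmed as primary": 36,
    "non-epithelial tumour or Paget's disease": 154,
    "unclear residency status at diagnosis": 15,
}


def apply_exclusions(start, steps):
    """Subtract each exclusion in order; return (reason, remaining) after each step."""
    remaining = start
    trail = []
    for reason, n_excluded in steps.items():
        remaining -= n_excluded
        trail.append((reason, remaining))
    return trail


flow = apply_exclusions(initial, exclusions)
cohort_size = flow[-1][1]
print(cohort_size)  # 24146
```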

Histological grade and Ki67
Histological grade information was available from the ICD-O-3 code and categorized as low (I), intermediate (II) and high (III) according to the Elston-Ellis modification of the Scarff-Bloom-Richardson grading system [24]. Women with anaplastic carcinoma (n = 9) were excluded, leaving n = 24,137 women for the analysis (Additional file 1). Ki67 has been recorded routinely since 2011 (as percentage of Ki67-positive tumour cells within hot-spots) and was categorized as low (< 15.0%), intermediate (15.0-30.0%) or high (> 30.0%) according to cutoffs in the Norwegian treatment guidelines [25,26].
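The Ki67 cutoffs described above can be written as a small function. This is a sketch only: the function name `ki67_category` is hypothetical, and the handling of the boundary values (15.0% and 30.0% both fall in the intermediate category) follows the ranges as stated in the text.

```python
def ki67_category(percent_positive):
    """Map a Ki67 percentage (0-100) to low / intermediate / high.

    Cutoffs follow the text: low < 15.0%, intermediate 15.0-30.0%, high > 30.0%.
    """
    if not 0.0 <= percent_positive <= 100.0:
        raise ValueError("Ki67 must be a percentage between 0 and 100")
    if percent_positive < 15.0:
        return "low"
    if percent_positive <= 30.0:
        return "intermediate"
    return "high"


# Example values (hypothetical measurements)
categories = [ki67_category(v) for v in (5.0, 15.0, 30.0, 42.5)]
```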

Treatment
Information on the type of surgery (mastectomy, breast conserving surgery or no surgery) was available for the full study period. Information on adjuvant treatment (chemotherapy (CT), radiotherapy (RT), endocrine therapy (ET)) was only available for 57% of the women over the study period. For RT, the recorded treatment corresponded to "given" treatment. For CT and ET, the recorded treatment corresponded to "planned" treatment.

Statistical methods
Time to breast cancer death was defined from date of breast cancer diagnosis until date of breast cancer death or censoring by death from other causes, emigration or end of follow-up in December 2017, whichever came first. The maximum follow-up was 13 years. Breast cancer-specific survival was estimated with the Kaplan-Meier method, compared with logrank tests, and age-standardized according to the internal age distribution in the sample. Breast cancer mortality rates were analysed with flexible parametric survival models (FPM) [29,30] estimating hazard ratios (HRs) with 95% confidence intervals (CI) for combinations of IHC subtype, grade and pTN status. We estimated separate HRs during 0-5 and 5-13 years of follow-up. Models were adjusted for age and year of diagnosis and type of surgery. We did not adjust for Ki67 since it was only available from 2011. In a sensitivity analysis, we adjusted for adjuvant therapy. The baseline hazard of the FPM was estimated with a spline using 5 degrees of freedom. Time-varying effects (non-proportional hazards) were estimated with 3 degrees of freedom, and yielded smooth hazard rates shown graphically. Likelihood ratio tests assessed the interaction between IHC subtype and variables. Only women with complete information on all covariates in the adjusted models were included in the regression analyses. Frequencies of IHC subtypes by clinicopathologic characteristics were compared by Pearson chi-square tests. All tests were 2-sided and the significance level was 5%. Analyses were performed in Stata version 15.1 [31].

Among ER+HER2− subtypes, more than 2/3 of tumours were small (≤ 20 mm) and had no nodal spread (Table 1). Lymph node involvement was most common among ER+HER2+ and HER2-positive subtypes. ER+HER2+ and HER2-positive subtypes had the highest proportions of small tumours with nodal spread (pT1-2pN+) (range 37 to 42%), but the differences in pTN status across IHC subtypes were largely explained by grade (Additional file 7). TNM stage was most advanced among ER+HER2+ and HER2-positive subtypes overall (Table 1) and within all levels of grade (Additional file 8). PR− status conferred consistently more advanced pTN status and TNM stage compared to PR+ (Additional file 6).
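The Kaplan-Meier estimate of breast cancer-specific survival named under Statistical methods can be illustrated with a short Python sketch. It is not the Stata code used in the study: the function name `kaplan_meier` and the toy data are assumptions made for illustration, with deaths from other causes, emigration and end of follow-up all treated as censoring (event = 0).

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct event time.

    times  : follow-up time for each woman (e.g. years since diagnosis)
    events : 1 = breast cancer death, 0 = censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e == 1)
    leaving = Counter(times)          # everyone leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = deaths.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving this event time
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]         # deaths and censored observations both exit
    return curve


# Toy data (hypothetical): five women, three breast cancer deaths
toy_curve = kaplan_meier([1, 2, 2, 3, 4], [1, 0, 1, 0, 1])
print([(t, round(s, 3)) for t, s in toy_curve])  # [(1, 0.8), (2, 0.6), (4, 0.0)]
```

Censored observations at a tied time are kept in the risk set for the deaths at that time, which is the usual convention.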

BC death by IHC subtype, grade and pTN status
To assess the independent contribution of each factor to breast cancer death, we stratified by all three variables IHC subtype, grade and pTN status in models adjusted for age, year and surgery type (Table 2). Among ER+ subtypes, an increasing grade was associated with increased mortality in all subtypes and levels of pTN status. Larger tumour size and positive nodal status were consistently associated with increased mortality in all ER+ subtypes and levels of grade, and larger size was associated with increased mortality also among node-negative tumours (p value for three-way interaction 0.4333). Among small tumours (≤ 20 mm) with no nodal spread, ER+PR−HER2− subtype of grade III was associated with a particularly high mortality (HR = 8.5, 4.0-18.2) and of similar magnitude to TNBC grade III tumours (HR = 9.2, 5.1-16.5). Women with larger tumours and any nodal spread (pT3-4pN0/+) had the highest mortality although numbers were low for this group. Among ER− subtypes, high-grade tumours were associated with higher mortality than intermediate-grade tumours for pT1pN0 tumours, while for other pTN status the mortality rates were similarly elevated for intermediate- and high-grade tumours.

BC death by time-since-diagnosis
In a second step, we assessed breast cancer death by subtype and grade at early (0-5 years) and late (5-13 years) follow-up. Among ER+HER2− subtypes, survival was markedly poorer for grade III compared to grade II tumours (Fig. 1a, b), and the mortality rate remained elevated up to 13 years after diagnosis (Fig. 1g, h). Adjusted hazard ratios confirmed the association (Fig. 1m, n). Compared to ER+PR+HER2− tumours of grade I, tumours of grade II or III were associated with increased mortality both early and late, with strongest associations for grade III (early: HR = 3.9 (95% CI 2.9-5.3), late: HR = 2.9 (2.0-4.1), Fig. 1m, Additional file 9). Stronger associations with grade were observed for ER+PR−HER2− subtype (Fig. 1b, h, n), and weaker associations for ER+PR+HER2+ and ER+PR−HER2+ subtypes (Fig. 1c, d, i, j, o, p). HER2-positive subtype of grade II or III was associated with increased early but not with late mortality (Fig. 1e, k, q). The highest early mortality was observed for the TNBC subtype of grade II or III, which had eight and ten times the mortality rate of the reference group, respectively (Fig. 1f, l, r). In all these comparisons, it is important to recall that the reference group (women with ER+PR+HER2− grade I tumours (Fig. 1g)) had an increasing mortality over follow-up (HR = 1.9, 1.4-2.7 comparing mortality at 10 vs. 1 year after diagnosis) and that ER+ high-grade tumours accounted for the largest numbers of deaths.
Breast cancer survival and early and late mortality rates by IHC subtype and pTN status are presented in Fig. 2. Breast cancer survival was significantly worse with increasing tumour size and with nodal spread within all IHC subtypes, with the possible exception of ER+PR−HER2+ (Fig. 2a-f). Among ER+ subtypes, larger size and nodal spread were associated with increased mortality throughout 13 years of follow-up (Fig. 2g-j). In the adjusted analysis, in particular for ER+PR+HER2− and ER+PR−HER2− subtypes, both larger size (pT2pN0) and nodal spread (pT1-2pN+) were associated with increased early and late mortality (Fig. 2m, n). Among HER2-positive and TNBC subtypes, size and nodal spread were mainly associated with early mortality (TNBC pT1-2pN+ early: HR = 12.9 (8.8-18.9), late: HR = 1.6 (0.8-3.0)) (Fig. 2r). Again, it is important to recall that the mortality in the comparison group (ER+PR+HER2− pT1pN0) increased over follow-up (Fig. 2g).
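The distinction between early (0-5 years) and late (5-13 years) mortality can be made concrete by splitting each woman's follow-up at 5 years and computing a crude rate per window. The sketch below is a simplified illustration and not the flexible parametric model used in the study; the function names, the toy cohort and the convention that a death at exactly 5 years counts as early are all assumptions.

```python
def split_follow_up(years, died, cut=5.0):
    """Split one woman's follow-up at `cut` years.

    Returns {"early": (person_years, deaths), "late": (person_years, deaths)}.
    A death at exactly `cut` years is counted in the early window (assumption).
    """
    early_time = min(years, cut)
    late_time = max(years - cut, 0.0)
    died_early = int(died and years <= cut)
    died_late = int(died and years > cut)
    return {"early": (early_time, died_early), "late": (late_time, died_late)}


def window_rates(cohort, cut=5.0, per=1000.0):
    """Crude breast cancer mortality rate per `per` person-years in each window."""
    totals = {"early": [0.0, 0], "late": [0.0, 0]}
    for years, died in cohort:
        for window, (person_years, deaths) in split_follow_up(years, died, cut).items():
            totals[window][0] += person_years
            totals[window][1] += deaths
    return {
        window: per * deaths / person_years if person_years > 0 else float("nan")
        for window, (person_years, deaths) in totals.items()
    }


# Toy cohort (hypothetical): (years of follow-up, died of breast cancer)
toy_cohort = [(2.0, True), (5.0, False), (8.0, True), (13.0, False)]
rates = window_rates(toy_cohort)
```

In this toy cohort only two women contribute person-time to the late window, which mirrors the point that late estimates rest on the subset of patients with long follow-up.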

Sensitivity analyses
In a subset of patients with available treatment information, the associations were essentially unchanged after adjustment for adjuvant treatment, although the models had to be simplified because of sparser data (Additional file 10). For comparison to other studies, we estimated hazard ratios of IHC subtype without stratification by grade, but with adjustment for grade (Additional file 11). To account for age differences in the prevalence of IHC subtypes, we also present age-standardized Kaplan-Meier curves (Additional files 12 and 13) indicating that age confounding was small in Figs. 1 and 2.
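Age standardization to an internal reference, as used for the Kaplan-Meier curves, amounts to a weighted average of age-specific survival with weights taken from the age distribution of the whole sample. The Python sketch below shows the idea at a single time point; the age bands, survival values and counts are hypothetical and are not results from the study.

```python
def age_standardized_survival(stratum_survival, reference_counts):
    """Directly standardized survival at one time point.

    stratum_survival : {age band: survival proportion within one subgroup}
    reference_counts : {age band: number of women in the whole sample}
    """
    if set(stratum_survival) != set(reference_counts):
        raise ValueError("age bands must match between subgroup and reference")
    total = sum(reference_counts.values())
    return sum(
        stratum_survival[band] * reference_counts[band] / total
        for band in stratum_survival
    )


# Hypothetical example: one subgroup's survival by age band, weighted by the
# age distribution of the full cohort
subgroup_survival = {"20-49": 0.90, "50-69": 0.95, "70-74": 0.85}
cohort_age_counts = {"20-49": 200, "50-69": 600, "70-74": 200}
standardized = age_standardized_survival(subgroup_survival, cohort_age_counts)
```

Because every subgroup is weighted by the same reference distribution, differences between standardized curves cannot be attributed to differing age structures.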

Discussion
Our study is the first registry-based study of breast cancer death that combines IHC subtype with grade, tumour size and nodal status into high-resolution detailed patient strata. A main finding was that IHC subtype, grade and pTN status were independent prognostic factors for breast cancer death. The largest population-based study ever to assess breast cancer death by IHC subtype utilized SEER registry data [16]. However, the authors did not separate groups by grade or N status, nor did they separate the ER+PR+HER2− group from the smaller ER+PR−HER2− group. A thorough study from the California Cancer Registry [17] stratified survival by IHC markers and AJCC stage, yet was restricted to 5 years of follow-up and only separated the ER+HER2− group by grade. Of particular importance was the finding that when combining all parameters (and thus defining 48 subgroups of patients), a huge diversity in prognosis was found between the patient groups with a 20- to 40-fold higher rate of breast cancer death in the groups of worst prognosis compared to the group with best prognosis. Further, the consistent finding of a poorer prognosis with increasing tumour size within all levels of IHC subtype and grade, also among low-grade ER+HER2− tumours, suggests that late diagnosis will compromise survival regardless of IHC subtype and grade. Still, the finding that among small tumours (≤ 20 mm) with no nodal spread, ER+PR−HER2− subtype of grade III was associated with the same high mortality as TNBC grade III tumours strengthens the importance of subgroup definition also in small tumours.
Breast cancer patients can suffer relapse and death due to their disease even decades after diagnosis. When analysing the subgroups with regard to timing of death, we found that high grade and nodal spread represented the strongest predictors of late breast cancer death (> 5 years after diagnosis) in ER+HER2− subtypes, and less so in ER+HER2+ subtypes. In contrast, among women with ER− subtypes, the breast cancer mortality was substantially higher close to diagnosis (< 5 years) and of similar magnitude for intermediate- and high-grade tumours. Among ER+ subtypes, there was an indication that PR− status may be associated with worse early mortality (< 5 years), and PR− status consistently conferred a higher grade and Ki67 expression. Adverse effects of PR− on mortality among ER+ subtypes have also been reported in smaller samples [32], but the difference in early vs. late mortality for this group has not been described before.
It has been suggested that low- and high-grade ER+ tumours constitute two independent pathobiological entities with their own characteristics and that intermediate grade is a poorly classified mix of those two underlying types [33][34][35][36]. Molecular subtyping complemented with DNA copy number analysis (CNA) suggests that up to eleven subtypes can be identified [11,37]. Some of these molecular subgroups (within ER+HER2− patients) show an increased risk for late relapse [38]. We found that ER+HER2− tumours of intermediate grade have a survival in-between those of low and high grade, while for ER+HER2+ tumours there was no difference in survival between intermediate and high grade.
Fig. 2 Breast cancer-specific survival proportions, hazard rates and adjusted hazard ratios by IHC subtype and pTN. Legend: Survival proportions (panels a-f), hazard rates (panels g-l) and adjusted hazard ratios (panels m-r) including n = 16,809 women with known information on ER, PR, HER2, grade, TNM stage and surgery type, and restricted to T1-2, N any, M0. Numbers at risk at start: ER+PR+HER2− n = 6229 (pT1pN0), n = 1362 (pT2pN0), n = 3125 (pT1-2pN+); ER+PR−HER2− n = 1309 (pT1pN0), n = 361 (pT2pN0), n = 694 (pT1-2pN+); ER+PR+HER2+ n = 437 (pT1pN0), n = 171 (pT2pN0), n = 380 (pT1-2pN+); ER+PR−HER2+ n = 225 (pT1pN0), n = 92 (pT2pN0), n = 230 (pT1-2pN+); HER2pos n = 250 (pT1pN0), n = 131 (pT2pN0), n = 310 (pT1-2pN+); TNBC n = 638 (pT1pN0), n = 413 (pT2pN0), n = 452 (pT1-2pN+). Logrank tests of survival differences by grade: ER+PR+HER2− p < 0.001, ER+PR−HER2− p < 0.001, ER+PR+HER2+ p = 0.0001, ER+PR−HER2+ p = 0.6100, HER2pos p = 0.0099, TNBC p < 0.001. Hazard rate curves only plotted until the last event in each group. Hazard ratios adjusted for age and year at diagnosis, subtype × pTN interaction, grade, surgery type and follow-up. N = 16,809. Estimates of HRs are given in Additional file 9.
Breast cancer is recognized as several biologically and clinically distinct subtypes, and in particular ER+ breast cancer is considered to be a spectrum of diseases by international guidelines [5,7]. According to the ASCO guidelines, the ER+HER2− subgroup needs stratification into low, intermediate and high risk of relapse to guide adjuvant treatment decisions [39,40]. Many countries, including Norway, have used the proliferation marker Ki67 for this purpose. However, in our study, the differences in Ki67 level across IHC subtypes were mostly explained by grade. Importantly, among the clinically challenging ER+HER2− intermediate-grade tumours, Ki67 expression did not discriminate between the tumours. It is known internationally that Ki67 measurements display a large variation both with regard to semiquantitative estimation and cutoff level across laboratories [41,42]. As some previous studies have questioned the prognostic value of PR status [43], it is of particular interest that our combined analysis shows that subsets of patients with ER+PR−HER2− status have a significantly worse prognosis compared to the ER+PR+HER2− patients. Similar findings of the possible importance of PR were observed in the Californian study [17] and in a Swedish study [44].
Our findings highlight the importance of molecular testing in specific subgroups and the need to integrate molecular subtype with pTN status to predict late risk of recurrence and death. One recommendation for the ER+HER2− group is to implement multimarker molecular-based risk scores [45][46][47]. These new multigene signatures are expensive and will only be beneficial to implement for subsets of patients. Thus, there will also in the future be a need for evaluating clinically available tumour markers in large patient datasets and monitoring their impact on patient survival. Such analyses can be used as a benchmark for future molecular studies by representing all patients in the population with sufficient numbers in subgroups. This may also help molecular scientists identify which subgroups of patients should be sampled for genetic studies.
It is not surprising that adjustment for treatment did not change the findings of our study, since IHC subtype, grade and pTN status determine the treatment choice. Endocrine treatment, trastuzumab and chemotherapy were routinely used in Norway according to treatment guidelines during the study period [26]. This is one of the largest population-based studies to date evaluating the combined effect of IHC subtype, histological grade, Ki67, tumour size and nodal spread on breast cancer death. The nationwide cancer registry data ensured essentially complete case ascertainment and follow-up for death and migration via routine population registers including all patients presenting at the clinics [22]. The information on subtype was prospectively collected and coded at the CRN throughout the study period [20]. Norway has national treatment guidelines for breast cancer applied at all cancer hospitals [25]. Using thorough adjustments and appropriate modelling of time enabled precise estimation of effects over follow-up. Grade-mix in the reference group of ER+HER2− may be a problem in previous population-based analyses comparing IHC subtypes without adjustment for grade, which would lead to underestimation of associations. Our supplemental results further highlight the importance of stratifying IHC subtype (ER/PR/HER2) by grade rather than adjusting for grade.
Despite a large nationwide cohort of recently diagnosed patients, the assessment of some combinations of tumour characteristics was not possible due to small numbers. In particular, we could not include Ki67 in the survival analysis. It should be noted that estimates for early (0-5 years) and late (5-13 years) follow-up represent average effects in those time windows and that the window 5-13 years, in particular, includes patients with differential follow-up since only a fraction of patients were followed for a full 13 years. No adjustment for socioeconomic status was possible in our dataset; however, the tax-funded healthcare system in Norway is characterized by equal access to diagnostics and treatment across the population, in addition to a national screening programme for women aged 50-69; thus, socioeconomic differences are unlikely to substantially influence the observed associations. We did not have information to adjust for screening; however, age-standardized results indicated no or small influence of age (as proxy for screening) on overall findings, though the majority of cases were post-menopausal.

Conclusion
These results show that tumour size and nodal status, as well as IHC subtype and grade, remain important independent predictors of breast cancer death in patients under modern treatment regimens and therefore must be assessed jointly for their impact on prognosis. In addition, these population-based findings highlight that patients with ER+ node-negative tumours of low or intermediate grade are also in need of new multigene molecular signatures for better prognostic stratification. These findings show the importance of high-quality registry data for evaluating the clinical impact of new multigene signatures, which will be particularly important in the next decades as many countries now include multigene molecular analysis for treatment decisions.