Age-Dependent Heterogeneity of Lymph Node Metastases and Survival Identified by Analysis of a National Breast Cancer Registry

Background: For several cancers, including those of the breast, young age at diagnosis is associated with an adverse prognosis. Although this effect is often attributed to heritable mutations such as BRCA1/2, the relationship between pathologic features, young age of onset, and prognosis for breast cancer remains unclear. In the present study, we highlight links between age of onset and lymph node metastasis (NM) in US women with breast cancer. Methods: Case listings from Surveillance, Epidemiology, and End Result (SEER) 18 registry data for women with breast cancer, which include information on race, were used. NM and its associated outcomes were evaluated for a subset of women with receptor subtype information and then compared against a larger, pre-subtype validation set of data from the same registry. Age of diagnosis was a 5-category variable; under 40 years, 40–49 years, 50–59 years, 60–69 years and 70+ years. Univariate and adjusted multivariate survival models were applied to both sets of data. Results: As determined with adjusted logistic regression models, women under 40 years old at diagnosis had 1.55 times the odds of NM as women 60–69 years of age. The odds of NM for (HR = hormone receptor) HR+/HER2+, HR-/HER2+, and triple-negative breast cancer subtypes were significantly lower than those for HR+/HER2-. In subtype-stratified adjusted models, age of diagnosis had a consistent trend of decreasing odds of NM by age category, most noticeable for HR+ subtypes of luminal A and B. Univariate 5-year survival by age was worst for women under 40 years, with NM attributable for 49% of the hazard of death from cancer in adjusted multivariate models. Conclusions: Lymph node metastasis is age-dependent, yet not all molecular subtypes are clearly affected by this relationship. For <40-yr-old women, NM is a major cause for shorter survival. When stratified by subtype, the strongest associations were in HR+ groups, suggesting a possible hormonal connection between young age of breast cancer onset and NM.


Background
In 2020, an estimated 276,480 new cases of breast cancer were diagnosed in the United States (https://seer.cancer.gov/statfacts/html/breast.html). For women with breast cancer, the infiltration of tumor cells into surrounding lymph nodes is associated with a poor prognosis. Lymph node metastasis (NM), a means for the regional and distant spread of tumor cells, has a considerable influence upon treatment options and patient survival. Approximately 60% of all newly diagnosed cases of breast cancer are localized (non-metastatic). However, one third of the patients with localized cancers will eventually develop metastatic disease [1]. Of all new cases of breast cancer, another third have regional NM at the time of diagnosis [1]. Lymph nodes usually represent the first site of metastasis of breast cancer, and they initiate the process of metastasis of the disease.
Breast cancer is distinctive among highly prevalent cancers in that women with a young age of onset often have a more aggressive form of the disease. Although younger women are eligible for more intensive therapy, they nevertheless have, relative to older patients, worse survival and a higher recurrence rate [2,3]. Although links between molecular/receptor subtypes and disease progression (metastasis in particular) are commonplace [4][5][6], there has been little research into the durability of these relationships across age groups. Estrogen receptor-positive status has an unclear prognostic influence across ages [7][8][9].
However, ERBB2/HER2 receptor status seems to be more frequent and is associated with a lower survival of younger women [10]. A few negatively associated prognostic variables for early onset breast cancer are identified [11][12][13], yet the connection between patient characteristics and metastasis has not been fully assessed. In the present study, we used nation-wide data from Surveillance Epidemiology and End Result (SEER) cancer registries to gain a better understanding of the relationship between lymph node metastasis, receptor subtype, and age-dependent patient characteristics of breast cancer.

Study population data
The present analysis involved data from the SEER 18 SEER*Stat8.3.8 database case listings. The data in SEER 18 represents 27.8% of the total US population and 18 cancer registries [14]. After excluding patients with non-ductal or lobular tumor histology, male gender, missing information on lymph node metastasis, and follow up of less than 6 months, our final sample consisted of 717,331 women diagnosed with breast cancer from 1975-2017.
Since SEER did not start recording information regarding the HER2 receptor subtype until 2010, we used a subset of the overall data that offered complete receptor subtype analysis. The receptor subtype set of data is made up of 223,986 cases of breast cancer with first and only primary tumor for patients diagnosed for the ~ 7 year period of 2010-2017.

Study design
The present study used both case-control and follow-up designs. Cases were defined as breast cancer patients with NM at the time of diagnosis. Controls were patients with no NM at diagnosis. Data for all eligible patients were used, and no matching of controls to cases was done. A response variable of lymph node metastasis was defined as a binary outcome using AJCC 7 th edition tumor, nodes, and metastases (TNM) staging [15].
This large data set with NM information is consistent with AJCC 6 th edition TNM staging [16]. All nodes (N) values of N0 (including N0 (i-) and N0 (mol-)) were considered NM negative, and any other N values (N1-N4) were considered positive. Included were patient demographic measures of age and ethnicity as well as tumor differentiation and staging information. Breast cancer subtype was based on receptor status.
The "HR" abbreviation for hormone receptor represents both estrogen (ER) and progesterone (PR) receptor status. HER2 indicates human epidermal growth factor 2 receptor status. Borderline information on HER2 status was excluded from analysis of SEER*Stat data queries [17]. Age of onset was considered as both a continuous and a categorical variable. Categories of age were based on preceding literature [8,18]: under 40 years, 40-49 years, 50-59 years, 60-69 years, and 70+ years. Additional survival analysis was accomplished with time to death from cancer as an outcome (Tables 1a and 1b).

Statistical analyses
In depicting the univariate relationship between NM, receptor subtype, and individual covariates, we applied chi-square tests for categorical, and t-tests for continuous p-values. Logistic regression models were constructed for each variable to obtain univariate odds of NM and 95% confidence intervals. To examine whether the effect of age upon odds of NM was modified by receptor subtype, we performed 4 separate, adjusted logistic regression analyses stratified by subtype. Logistic regression modeling was also used to adjust for potential confounding variables, both for full data set and in receptor-based subtype subset analyses. Kaplan-Meier log-rank tests and Cox proportional hazard models were used to estimate the effect of age upon survival and NM.
We measured 5-year survival for the similar age categories used in the analyses of Alteri et al. [19]. For 5 separate age category-stratified survival models, we calculated the proportion of hazard of death that was attributable to NM (attributable fraction analysis) by comparing a counterfactual survival function (excluding NM from baseline) to a factual survival function, including NM [20,21]. In Cox models, variables adjusted for were NM, race, size of tumor, and receptor subtype. All statistical analyses utilized R version 3.6.2 (2019-12-12) [22].

Overall study population
Associations with NM were consistent across overall (1975-2017) and subset (2010-2017) analyses and showed that, at diagnosis, 32-36% of women had nodal metastasis of breast cancer. In follow up, women with NM made up a larger proportion of breast cancer deaths, were more African American, had tumors of larger size at diagnosis, and had more distant metastases at diagnosis (Table 1a and 1b).
Additional information unique to the test data set also showed higher grade/more poorly differentiated tumors for NM cases. Women with nodal metastases were also younger than controls (non-NM), with an average age at diagnosis of 57 years (61 years for controls). Lastly, a larger proportion of women had HR+/HER2+, triple-negative, and HR-/HER2+ subtype cancers than controls. See Tables 1a and 1b for a description of patient variables across nodal metastasis outcomes. In a subset analysis (data not shown) of women under 40 years in 1975-2017 data, no effect modification by age was found for the association between race, grade, and NM.

Odds of lymph nodal metastasis
The adjusted odds of NM for common variables in both datasets were available for race, age, tumor size, and tumor grade. The relationship between age and the adjusted odds of NM had similar estimates for both sets of data. Among all age groups, women under 40 had the highest adjusted odds of NM (1.55 in subset and 1.74 in overall data).
Both sets of data confirmed tumor size as being strongly associated with NM. Relative to Whites, African American women had higher adjusted odds of NM (1.13 in subset and 1.23 in overall data). Higher tumor grades were also positively associated with NM, with the effect increasing parallel to the loss of differentiation. See Tables 2a and 2b for details of univariate and adjusted odds of NM.

Receptor subtype
In receptor-stratified data, adjusted estimates of NM by subtype showed that TNBC cancers had lower odds of nodal metastasis relative to the HER2-/HR+ receptor subtype (OR 0.74, 95% CI 0.70-0.78) (Table 2b). There was, however, no significant increase in the odds of having NM for either HER2+ subtypes of HER2+/ER+ or HER2+/ER- (Table 2a). There was a young-to-old gradient in the odds of NM, which is most apparent in for HR+ subtypes of tumors, HER2-/ER+ and lHER2+/ER+ (Figure 1). In HR-/HER2+subtypes, age at diagnosis had a similar trend in odds estimates per age group, but showed no significant difference between ages <40-59. In addition, triple-negative subtypes showed a slightly higher association with NM for those younger than 60 but with no age gradient ( Figure 1).

Survival analysis
For both subset and overall data, Kaplan-Meier analyses consistently showed women under 40 as sharing the worst survival outcomes with women aged 70 years and above. Although women 70 years and older had low survival, women under 40 years old began follow-up with survival similarly high with other age categories (40-69 years), but there was a decline starting at ~15 months which surpassed 70-year-olds at ~22 months. Women under 40 years of age had the lowest probability of 5-year survival at 0.87, with age groups from 40-69 having similar 5-year survivals at approximately 0.91-0.93, with the value for the oldest group of women slightly decreasing to 0.89. As determined with adjusted Cox analyses, in both sets of data, women younger than 40 also had a higher fraction of hazard of death from cancer (~50% vs. ~37% at start) due to NM than women 70 years and above ( Figure 2).

Discussion
In adult females younger than 40, there is considerable debate about whether breast cancer should be considered as a distinct disease. Women under 40 have consistently poor survival outcomes across various study populations [12,13,[23][24][25][26] yet the reasons for this are not fully explained. The present study showed that the odds of NM at diagnosis have a gradient relationship to patient age of onset. This age-decreasing slope is most prominent for hormone receptor-positive subtypes. Findings from survival analysis confirmed that women under 40 have a hazard of death from breast cancer equal to or greater than the next-poorest group, women 70 years or older. Furthermore, the hazard of death attributable to NM was higher in women under 40 years, suggesting that NM has a stronger causal role in breast cancer mortality for younger women as compared to late age of diagnosis survival factors.
Previous studies have suggested that a higher incidence of the HR-/HER2+ subtype for young women with breast cancer partially explains their more aggressive disease [23][24][25]. Indeed, although our analysis showed a higher proportion of women under 40 years old having the HR-/HER2+ subtype, there was no significant difference in the odds of NM and age among the HR-/HER2+, and HR+ subtypes. However, a study, using a subset of the SEER data shows that, regardless of patients age, the HR+/HER2-breast cancer subtype has a higher rate of lymph node involvement at diagnosis than the triple-negative subtype [11]. From this study, NM at diagnosis does not appear to be related to HR-/ HER2+ aggressiveness in young age of diagnosis. Furthermore, this observation may be of importance in considering whether prior research examining the link between the HR-/ HER2+ subtype and nodal metastasis was adequately controlled for age at diagnosis.
The level of generalizability of the SEER 2010-2017 receptor-only subset of data may be questioned as a liability for analysis. We evaluated how representative subset data was using the complete, 42-year data. For all the variables common between datasets, both measures of association and measures of effect were consistent across data. Furthermore, we examined whether there was a time trend in proportion of NM positive patients at diagnosis in relation to age group. Interestingly, we found that, for women <40 years, the proportion of NM at diagnosis vs. non-NM remained stable across time at approximately 50%. In women aged 70+, there was a marked split of NM proportion from 50% starting in 1980, decreasing to ~25% by 2008 (Figure 3).
The concordance of IHC-based receptor subtype with intrinsic molecular subtype has been shown to vary by subtype. Luminal B (HR+/HER2+) is the worst performing subtype in measures of both concordance and accuracy across several studies [27,28], suggesting that the relationship between young age of onset, NM, and interplay between estrogen/ progesterone hormone receptors and HER2 should be studied further with a thorough consideration of molecular subtype classification. In contrast, previous research has shown the concordance and accuracy between both luminal A (HR+/HER2-) and triple-negative (HR-/HER2-) to be the highest among subtypes [29], suggesting that the results found in our study for these subtypes are likely not due to misclassification.
Lastly, there is also a question of consistency in measurement of variables over time. Perhaps the phenomenon of aggressive disease for younger women with breast cancer is influenced by changes in screening and diagnostic practices. From the standpoint of the association between age of diagnosis and NM, we included year of diagnosis as a predictor variable for both sets used for analysis. There was no confounding effect of the year of diagnosis upon the association between NM and age of diagnosis.
Involvement of lymph nodes is a key component in decisions for postoperative therapy, particularly radiation therapy, because clinicians evaluate need for lymph node radiation to minimize toxicity of treatment. Therefore, our results showing that the higher incidence of NM in young (<40 years) HR+ breast cancer groups (HR+/HER2-and HR+/HER2+) is clinically relevant. This relationship between NM and age of diagnosis is clear in women with luminal tumors, and may explain observations from previous studies linking the luminal subtype, young age of diagnosis, and poor prognosis [8]. Thus, our findings may aid in identifying aggressive disease in young women with luminal disease.

Conclusions
Our results suggest that a predisposition towards more severe breast cancer for women with younger age of diagnosis is driven in part by NM. This relationship between NM and age of diagnosis for women with luminal tumors is strong and may be related to a poor prognosis. These findings also have implications in identifying high-risk, young HR+ groups (HR+/HER2-and HR+/HER2+) of breast cancer patients for aggressive therapy.

Funding
The funding sources for the research are reported in the Acknowledgement section below. The NCI training grant to Dr. Michael Behring is provided to support his salary, and the NCI/NIH grant to Dr. Manne has covered a portion of his effort on this study, particularly to analyze the data based on the race/ethnicity of patients. However, the funding bodies did have any role in the design of the study or in collection, analysis, and interpretation of data or in writing the manuscript.