Risk factors and survival outcomes in patients with breast cancer and lung metastasis: a population‐based study

Abstract The risk factors for morbidity and mortality in patients with breast cancer lung metastases (BCLM) remain poorly defined. The aim of this study was to assess the incidence and survival of BCLM and the associated risk factors. Patients with BCLM were identified from the Surveillance, Epidemiology, and End Results (SEER) database. Multivariate logistic regression analysis was used to determine the risk factors for BCLM. Factors associated with death were analyzed using Cox regression and Fine and Gray's competing risk model. Of the 11,568 patients with stage IV breast cancer, 4213 (36.4%) had BCLM and 1214 (10.5%) had metastases confined to the lungs. The median survival time for patients with BCLM was 21 months, and 15.5% of the patients were alive for more than 3 years. The tumor subtype distribution was 45.3% HR+/HER2−, 12.2% HR+/HER2+, 7.8% HR−/HER2+, and 15.0% triple‐negative subtype. Compared with patients without BCLM, those with BCLM were more likely to be older, male, or black, and to have a higher tumor grade or an HR−/HER2+, HR+/HER2+, or triple‐negative subtype at diagnosis. Survival analysis showed that older age, black race, HR−/HER2+ subtype, triple‐negative subtype, and higher grade were independent risk factors for poorer survival in patients with BCLM, while HR+/HER2+ subtype, insured status, and married status were associated with better prognosis. In conclusion, the incidence and prognosis of BCLM varied by tumor subtype, age, and race. Elderly patients with HER2‐positive or triple‐negative tumors were more likely to have BCLM.


Introduction
A better understanding of the incidence and survival of BCLM and the related risk factors can help identify high-risk patients, so that early intervention may reduce the occurrence of BCLM and improve prognosis. The aim of this study was to assess the incidence and survival of BCLM and its associated risk factors.

Patients
We extracted data from 18 registries released in 2016 (the latest follow-up information available) within the Surveillance, Epidemiology, and End Results (SEER) database, which contains limited medical information for about 30% of the total American population [12]. Using SEER*Stat (version 8.3.4; National Cancer Institute, Bethesda, MD, USA), we obtained a cohort of 247,364 patients aged 18 years or above who were diagnosed with primary, histologically confirmed malignant breast cancer from January 1st, 2010 to December 31st, 2013. Those with carcinoma in situ were excluded from this cohort. We then generated a final incidence cohort of 240,808 patients with known lung metastasis status at diagnosis (Yes or No). Within the case listing, 4213 patients had lung metastases when first diagnosed with breast cancer. Subsequently, we excluded patients who were diagnosed by autopsy or death certificate and those whose recorded survival was 0 months, leaving a survival cohort of 3772 patients with active follow-up for survival analysis. Before initiating this study, we signed and submitted a data agreement form to the SEER research team and were granted official permission to access the SEER database. The study was approved by the Institutional Review Board of Sun Yat-sen University Cancer Center, Guangzhou, Guangdong Province, People's Republic of China. The review board waived the requirement for informed consent because patient data in SEER are de-identified.

Stratification
Incidence proportion was defined as the number of patients with lung metastases divided by the number of patients with breast cancer. We calculated the absolute number and incidence proportion of patients with lung metastases confirmed at breast cancer diagnosis, both among the entire cohort and among the subgroup with metastatic disease, after stratification by breast cancer molecular subtype: hormone receptor (HR)-positive/human epidermal growth factor receptor 2 (HER2)-positive (HR+/HER2+), HR−/HER2+, HR+/HER2−, and triple-negative (HR-negative/HER2-negative). Patients were also stratified by age, race, sex, marital status, number of metastatic sites outside of the lungs, pathological grade, and other variables. Race/ethnicity was categorized as white, black, Asian or Pacific Islander, and American Indian/Alaska Native, in accordance with the database record.
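As a worked illustration of the definition above, the proportions can be reproduced from counts quoted in this paper (4213 patients with lung metastases, 1214 with metastases confined to the lungs, 240,808 patients in the incidence cohort, and 11,568 patients with distant metastases). This is only a minimal sketch; the function and constant names are illustrative and are not part of SEER*Stat.

```python
def incidence_proportion(cases: int, population: int) -> float:
    """Return cases / population as a fraction between 0 and 1."""
    if population <= 0:
        raise ValueError("population must be positive")
    return cases / population


# Counts quoted in this paper (names are illustrative).
LUNG_METS = 4213          # lung metastases at breast cancer diagnosis
LUNG_ONLY = 1214          # metastases confined to the lungs
ENTIRE_COHORT = 240_808   # incidence cohort with known lung status
METASTATIC = 11_568       # patients with distant metastases at diagnosis

whole = incidence_proportion(LUNG_METS, ENTIRE_COHORT)
subgroup = incidence_proportion(LUNG_METS, METASTATIC)
confined = incidence_proportion(LUNG_ONLY, METASTATIC)

print(f"entire cohort:       {whole:.2%}")     # prints 1.75%
print(f"metastatic subgroup: {subgroup:.1%}")  # prints 36.4%
print(f"confined to lungs:   {confined:.1%}")  # prints 10.5%
```

The same function applies to any stratum (for example a single molecular subtype) once the stratum-specific numerator and denominator are taken from Table 1.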

Statistical analysis
To assess the association between each variable and lung metastasis status, we used a multivariate logistic regression model to calculate odds ratios (ORs) within the subgroups, adjusted for all variables that might affect prognosis. The extent of metastatic disease was characterized by the presence or absence of brain, liver, and bone metastases, as recorded in the SEER database.
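To make the odds ratio concrete, the sketch below computes an unadjusted OR with a Wald 95% confidence interval from a single 2x2 table. This is a deliberate simplification and not the adjusted multivariate model described above; the counts are hypothetical and are not taken from SEER or from this study, and the function name is illustrative.

```python
import math


def odds_ratio_wald(a: int, b: int, c: int, d: int, z: float = 1.96):
    """Unadjusted odds ratio and Wald confidence interval from a 2x2 table.

    a: exposed with outcome      b: exposed without outcome
    c: unexposed with outcome    d: unexposed without outcome
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    log_or = math.log((a * d) / (b * c))
    # Standard error of log(OR) for a 2x2 table.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    return (
        math.exp(log_or),
        math.exp(log_or - z * se),
        math.exp(log_or + z * se),
    )


# Hypothetical counts, NOT from SEER or this paper.
or_, lower, upper = odds_ratio_wald(a=30, b=70, c=20, d=80)
print(f"OR = {or_:.2f} (95% CI: {lower:.2f}-{upper:.2f})")
```

In a multivariate logistic regression, each adjusted OR is obtained in the same way from the fitted coefficient and its standard error, with the other covariates held fixed.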
Survival was defined as the time from the initial breast cancer diagnosis to death. We used the Kaplan-Meier method to compute survival estimates and generate survival curves for each subtype and for the overall cohort. A Cox proportional hazards regression analysis was conducted to assess the association of the same variables described herein with the hazard ratio (HR) of death in patients with lung metastases. Fine and Gray's semiparametric competing risk model was used to examine the subdistribution hazards.
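The Kaplan-Meier estimate and the median survival derived from it can be sketched in a few lines. The data below are hypothetical toy values, not SEER records, and the function names are illustrative; the study itself used SAS for this step.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    times:  follow-up in months
    events: 1 if death was observed, 0 if the patient was censored
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        at_risk -= removed
    return curve


def median_survival(curve):
    """First time at which S(t) drops to 0.5 or below, else None."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical toy data (months, death indicator).
times = [2, 4, 4, 6, 9, 12, 15, 18, 24, 30]
events = [1, 1, 0, 1, 1, 0, 1, 1, 0, 1]
curve = kaplan_meier(times, events)
print(median_survival(curve))  # prints 15
```

Censored patients leave the risk set without lowering the curve, which is why the estimate differs from a simple proportion of deaths.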
We calculated 95% confidence intervals (95% CI) for all estimates (ORs and HRs) across strata. A P value of 0.05 or less was considered statistically significant. All P values were two-tailed. Statistical analysis was performed using SPSS (IBM SPSS Statistics 21, IBM Corporation, Armonk, USA), except for the Kaplan-Meier analysis, which was performed with SAS 9.4 (SAS Institute, Cary, NC), and the analysis of breast cancer-specific mortality with Fine and Gray's semiparametric model, which was performed with the cmprsk package in R (version 3.4.1, R Foundation).
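The Fine and Gray model is a regression on the subdistribution hazard, which is tied to the cumulative incidence of one cause of death in the presence of others. The sketch below is not that regression and does not use the cmprsk package; it is a simpler nonparametric cumulative incidence estimate on hypothetical data, shown only to illustrate how deaths from other causes are handled as competing events rather than as censoring. The function name and cause codes are assumptions made for the example.

```python
def cumulative_incidence(times, causes, cause=1):
    """Nonparametric cumulative incidence for one cause of death.

    causes: 0 = censored, 1 = breast cancer death, 2 = other-cause death
    Returns [(t, CIF(t))] at each time with a death from `cause`.
    """
    data = sorted(zip(times, causes))
    at_risk = len(data)
    surv = 1.0  # all-cause survival just before the current time
    cif = 0.0
    out = []
    i = 0
    while i < len(data):
        t = data[i][0]
        d_cause = 0
        d_any = 0
        removed = 0
        while i < len(data) and data[i][0] == t:
            c = data[i][1]
            d_any += int(c != 0)
            d_cause += int(c == cause)
            removed += 1
            i += 1
        if d_cause:
            cif += surv * d_cause / at_risk
            out.append((t, cif))
        # Any death, whatever the cause, lowers all-cause survival.
        surv *= 1 - d_any / at_risk
        at_risk -= removed
    return out


# Hypothetical toy data, NOT from SEER.
times = [2, 4, 6, 9, 12, 15]
causes = [1, 2, 1, 0, 1, 2]
cif_bc = cumulative_incidence(times, causes, cause=1)
cif_other = cumulative_incidence(times, causes, cause=2)
print(cif_bc[-1], cif_other[-1])
```

In this toy example the two cause-specific cumulative incidences sum to one at the last time point because every patient still at risk eventually dies; treating other-cause deaths as censored in a Kaplan-Meier analysis would not have this property.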

Results
The absolute number and incidence proportion of our cohort according to molecular subtype appear in Table 1. Among the 240,808 patients diagnosed with malignant breast cancer between 2010 and 2013, the proportions of HR+/HER2−, HR+/HER2+, HR−/HER2+, triple-negative, and unknown subtypes were 67.7%, 9.3%, 4.1%, 10.7%, and 8.3%, respectively. Among the 11,568 patients diagnosed with metastatic disease at distant sites and analyzed for incidence, 51.8%, 13.2%, 6.9%, 11.1%, and 17.0% had HR+/HER2−, HR+/HER2+, HR−/HER2+, triple-negative, and unknown subtypes, respectively. A total of 4213 patients with lung metastases were identified, accounting for 1.8% of the entire study cohort and 36.4% of the subgroup with distant metastases. Of these, 1214 had metastases confined to the lungs (i.e., metastatic disease only in the lungs). Patients with the HR−/HER2+ subtype had the highest incidence proportion (3.4% of the entire study population, 42.4% of the metastatic subgroup) and [...]; P < 0.001) were more likely to be diagnosed with lung metastases at initial diagnosis. Interestingly, married (vs. unmarried; OR, 0.85; 95% CI: 0.78-0.92; P < 0.001) and insured (vs. uninsured; OR, 0.64; 95% CI: 0.54-0.77; P < 0.001) status were associated with lower odds of lung metastases at diagnosis. The results among the entire study population reflected a similar trend. Significant results appear in Table 2.

Survival
In the survival cohort of 3772 patients diagnosed with lung metastases, the median survival stratified by subtype is provided in Table 1. The median survival of the cohort was 21 months, and the median survival of patients with metastases confined to the lungs was 25 months. Patients with the HR+/HER2+ subtype had the longest median survival (31 months) and those with the triple-negative subtype the shortest (11 months). Figure 1 shows overall survival (Fig. 1A) and survival stratified by subtype (Fig. 1B).
The hazard ratios for all-cause mortality according to all variables in the multivariate Cox regression model appear in Table 3. [...] 25; P = 0.001) were significantly associated with increased all-cause mortality. Married status (vs. unmarried; hazard ratio, 0.79; 95% CI: 0.72-0.86; P < 0.001) and HR+/HER2+ subtype (vs. HR+/HER2− subtype; hazard ratio, 0.82; 95% CI: 0.70-0.94; P = 0.001) were significantly associated with decreased all-cause mortality. However, insured status was not associated with mortality in this model. Breast cancer-specific mortality of patients with lung metastases at initial diagnosis also appears in Table 3. Median survival by subtype, stratified by the extent of metastatic sites, is provided in Table 4. Survival was better among those with fewer distant metastatic sites. In general, patients with lung metastases at diagnosis experienced significantly shorter survival than patients who presented with no baseline lung involvement (Table 4).
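Ratio estimates such as these are usually reported with confidence intervals that are symmetric on the log scale. Assuming a Wald (normal-approximation) interval, which the source does not state explicitly, the standard error and an approximate P value can be recovered from a reported interval. The sketch below applies this to the married vs. unmarried hazard ratio quoted above (0.79; 95% CI: 0.72-0.86); the function names are illustrative.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal


def ratio_ci(beta: float, se: float, z: float = Z_95):
    """Exponentiate beta +/- z * se to get a ratio and its CI."""
    return (
        math.exp(beta),
        math.exp(beta - z * se),
        math.exp(beta + z * se),
    )


def se_from_ci(lower: float, upper: float, z: float = Z_95) -> float:
    """Recover the log-scale standard error from a reported CI."""
    return (math.log(upper) - math.log(lower)) / (2 * z)


def two_sided_p(beta: float, se: float) -> float:
    """Two-sided normal-approximation P value for H0: beta = 0."""
    return math.erfc(abs(beta / se) / math.sqrt(2))


# Reported above: hazard ratio 0.79 (95% CI: 0.72-0.86), P < 0.001.
beta = math.log(0.79)
se = se_from_ci(0.72, 0.86)
estimate, lower, upper = ratio_ci(beta, se)
p_value = two_sided_p(beta, se)
print(f"{estimate:.2f} ({lower:.2f}-{upper:.2f}), P = {p_value:.1e}")
```

Because the published figures are rounded to two decimals, the recovered interval and P value are approximate; here they agree with the reported interval to within rounding and with the reported P < 0.001.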

Discussion
To our knowledge, this work represents the first comprehensive analysis of the incidence and prognosis of patients with lung metastases at the initial diagnosis of breast cancer. In this study, we identified 4213 cases of lung metastases among patients with newly diagnosed breast cancer, accounting for 1.8% of all patients with breast cancer and 36.4% of the subgroup with metastatic disease. Among them, HER2-enriched and triple-negative tumors had a higher percentage of lung metastases. In addition, the median survival of patients with lung metastases was very heterogeneous across subtypes, ranging from 11.0 months for the triple-negative subtype to 31.0 months for the HR+/HER2+ subtype.
To date, relatively few studies have examined the association between breast cancer subtypes and lung metastases. Kennecke et al. [1] reported that the cumulative rates of lung metastases in the HR+/HER2−, HR+/HER2+, HR−/HER2+, and triple-negative subtypes were 9.1%, 17.7%, 24.1%, and 15.7%, respectively, after long-term follow-up of 3726 patients with early breast cancer (diagnosed from 1986 to 1992). Soni et al. [13] observed that the frequency of lung metastases in each breast cancer subtype was 17%, 14%, 25%, and 31%, respectively, in a cohort of 531 consecutive patients with advanced breast cancer. Sihto et al. [14] reported that 234 cases of distant metastases occurred among 2032 patients with breast cancer after a follow-up of 2.7 years. Their results indicated that the incidence of lung metastasis as the first distant metastasis was 8.5%, 16.3%, 22.9%, and 20.8% in the luminal A, luminal B, HER2-enriched, and basal-like subtypes, respectively [14]. One advantage of these studies was that they provided information on the cumulative incidence of lung metastases during the natural course of the disease. In contrast, our study focused on patients presenting with lung metastases at the initial diagnosis of breast cancer. Therefore, the effect of tumor subtype on lung metastases in our cohort would not be affected by previous local and systemic treatments.
In the present study, the median survival of patients with lung metastases was 21 months, while those with metastases confined to the lungs had a median survival of 25 months. In a retrospective analysis from M. D. Anderson Cancer Center, the median OS was 22.5 months for breast cancer patients with metastases confined to the lungs who were treated with systemic chemotherapy [15]. However, patients with metastases confined to the lungs who underwent pulmonary metastasectomy had a median OS of 35-75.6 months and a 5-year overall survival rate of 38% to 54% [5,16-19]. We did not have information on lung surgery and systemic therapy for lung metastases in this cohort, so we were unable to analyze the differences in survival due to treatment. Our study also showed important differences in OS according to tumor subtype. Patients with the HR+/HER2+ subtype had the longest OS, and their risk of death was significantly lower than that of HR+/HER2− patients. In contrast, patients with the triple-negative subtype had the worst prognosis. Our findings were similar to previous reports on the effect of tumor subtype on OS of patients with breast cancer [13,20-22]. Several studies showed that triple-negative breast cancer was associated with poor prognosis [23,24]. Our findings confirmed and extended these reports. The prognosis of patients with lung metastases differed considerably across tumor subtypes, confirming that breast cancer is a heterogeneous disease, even among patients who all have lung metastases.
In addition to the association between the prevalence of lung metastases and tumor subtype, several important correlations between lung metastases and the demographics of breast cancer patients were noteworthy. Although young women were more likely to develop more aggressive breast cancer subtypes and more advanced disease [25-27], our study revealed a higher incidence of lung metastases in older patients. In addition, the percentage of lung metastases in male patients was significantly higher than that in females, although the number of male breast cancer patients in the cohort was less than 1% of the number of women. Furthermore, black patients (vs. white; OR, 1.38; 95% CI, 1.25-1.51; P < 0.001) had a significantly greater likelihood of lung metastases at the time of diagnosis, but this association was not found in the subset with distant metastatic disease. Perhaps most interesting was that the incidence and prognosis of lung metastases were associated with marital status and insurance status at the initial diagnosis of breast cancer, independent of known clinical prognostic variables such as tumor subtype, age at diagnosis, and tumor grade. This confirmed and enriched previous studies reporting that the risk of cancer metastases and cancer-related death in unmarried or uninsured patients was significantly higher than in married or insured patients [28,29]. This result emphasized the potentially significant impact of social support on breast cancer detection and survival.
Our research has some limitations. Firstly, all data were collected by the SEER program, which relies on routine collection of cancer registry data, so the incidence of lung metastases might be underestimated. Secondly, as the SEER database did not capture lung metastases that developed during disease progression, our study was unable to include them. Thirdly, we were unable to analyze the effect of treatment for lung metastases (such as lung surgery and endocrine therapy) on the prognosis of patients because these data were not available in the public SEER data set. Fourthly, while we adjusted for confounding factors such as age, insurance, and marital status, we were unable to adjust for sociodemographic status at the patient level. Fifthly, there is no information on the number of lung lesions or the bulk of disease (e.g., a lung full of tumor vs. a tiny lung metastasis), which are important prognostic factors for patients with lung metastases. Sixthly, other organs or tissues that are not captured by the SEER registry (e.g., the adrenal glands) may have been involved. Finally, the recorded cause of death cannot be attributed specifically to lung metastases as opposed to other metastases.
Our research also had several advantages. The study was based on recent population-based tumor registration, which supports the generalizability of the results. The sample size was large enough to provide sufficient statistical power to explore the incidence and prognosis of lung metastases. Owing to the extensive information collected by the SEER program, we were able to explore not only OS but also cancer-specific survival. Finally, our study differed from other reports of lung metastases arising from recurrence or progression of early breast cancer and was not affected by previous local and systemic treatment (which might have a potential impact on the occurrence and treatment of lung metastases), thus providing important clinical information on the prognosis and risk stratification of synchronous lung metastases.
Our study provides important information on the incidence and prognosis of lung metastases in breast cancer, which is critical for designing studies to test interventions that may improve survival. In addition, the frequency of lung metastases identified in this study can be used to estimate the burden of disease, and the risk factors identified here can be used for risk-based screening to maximize early detection of lung metastases and achieve optimal cost-effectiveness.