A prognostic nomogram based on risk assessment for invasive micropapillary carcinoma of the breast after surgery

Abstract
Purpose: Invasive micropapillary carcinoma (IMPC) is a rare subtype of breast cancer. This study aimed to develop a predictive nomogram for IMPC prognosis.
Methods: A total of 1855 IMPC patients diagnosed after surgery between 2004 and 2014 were identified from the Surveillance, Epidemiology and End Results (SEER) database to build and validate the nomogram. The nomogram was created based on univariate and multivariate Cox proportional hazards regression analysis. Receiver operating characteristic (ROC) curves were used to assess the accuracy of the prognostic model. Decision curve analysis (DCA) was performed to evaluate the clinical usefulness of the model, while calibration curves were used to validate the consistency of its predictions.
Results: Cox regression analysis indicated that age ≥62 years at diagnosis, negative ER status, and advanced tumor stage were independent adverse factors for overall survival (OS), whereas patients who were married, were white or of other races, or received chemotherapy or radiotherapy had a better postoperative prognosis. The nomogram predicted OS with a high concordance index (C-index) in both internal and external validation (0.756 and 0.742, respectively). The areas under the ROC curve (AUCs) of the training group were 0.787, 0.774 and 0.764 for 3, 5 and 10 years, respectively, while those of the validation group were 0.756, 0.766 and 0.762, respectively. Both DCA and the calibration curves demonstrated the good performance of the model.
Conclusions: A nomogram for patients with IMPC of the breast after surgery was developed to estimate 3-, 5- and 10-year OS based on independent risk factors. This model has good accuracy and consistency in predicting prognosis and has clinical application value.


| INTRODUCTION
Breast cancer (BC) is the most common malignant tumor worldwide and seriously endangers women's lives and health. 1 Invasive micropapillary carcinoma (IMPC) is a rare type of invasive carcinoma of the breast, accounting for only 0.9%-2% of cases. 2,3 IMPC is defined in the WHO classification of breast tumors as an invasive carcinoma with small clusters of tumor cells arranged in a mesenchymal lumen resembling a vasculature. Consisting of clusters of mulberry-like or glandular ductal or alveolar-like carcinoma cells, IMPC exhibits reversed cell polarity. 2,4 IMPC often coexists with invasive ductal carcinoma (IDC) and can usually be differentiated by EMA staining. 2 The Surveillance, Epidemiology and End Results (SEER) database was established by the National Cancer Institute to provide reliable and valuable information on cancer statistics. 5 A nomogram provides a simple, visual representation of prognostic risk factors that can guide clinical research. IMPC is characterized biologically by high rates of lymph node metastasis, recurrence and distant metastasis. 6 In the past, this carcinoma was considered a breast cancer with poor prognosis, but the survival rate of IMPC has increased significantly in recent years. 2,7-9 Wu et al. analyzed 881 IMPC patients from SEER and determined that IMPC had good breast cancer-specific survival (BCSS) and OS. 10 Recently, Meng et al. constructed a nomogram from 388 cases of IMPC and determined that age, lymph node metastasis, hormone receptor status, adjuvant radiotherapy and other factors may affect locoregional recurrence (LRR) after mastectomy. 7 Surgery improves the prognosis and quality of life of BC patients, but no prognostic prediction model exists for patients with IMPC of the breast after surgery. In this study, we aimed to develop a nomogram to identify factors associated with improved survival in patients with IMPC.

| Selection of patients
Patients with IMPC who underwent surgery were identified using SEER*Stat (version 8.3.9) from 18 population-based cancer registries. Patients were eligible for enrollment according to the following inclusion criteria: (1) histology ICD-O-3 (8507), (2) surgery performed, (3) patients with primary site, (4) known ER, PR status and adjusted AJCC 6th stage. The exclusion criteria were as follows: (1) missing information on age, race, grade or marital status and (2) unknown T, N, M classification and breast subtype.
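The selection rules above can be sketched as a simple record filter. This is only an illustration under stated assumptions: the `Record` fields and the value encodings are hypothetical and do not correspond to actual SEER*Stat export column names, and breast subtype is left out of the check because the Results report patients with unknown HER2 status.

```python
from dataclasses import dataclass
from typing import Optional

IMPC_HISTOLOGY = "8507"  # ICD-O-3 histology code given in the text


@dataclass
class Record:
    # Hypothetical field names; real SEER*Stat exports are labeled differently.
    histology: str
    had_surgery: bool
    primary_site: Optional[str]
    er_status: Optional[str]       # None means unknown
    pr_status: Optional[str]
    ajcc6_stage: Optional[str]
    age: Optional[int]
    race: Optional[str]
    grade: Optional[str]
    marital_status: Optional[str]
    t_stage: Optional[str]
    n_stage: Optional[str]
    m_stage: Optional[str]


def is_eligible(r: Record) -> bool:
    """Apply the inclusion criteria first, then the exclusion criteria."""
    included = (
        r.histology == IMPC_HISTOLOGY
        and r.had_surgery
        and None not in (r.primary_site, r.er_status, r.pr_status, r.ajcc6_stage)
    )
    # Exclude records with missing demographic fields or unknown T, N, M.
    has_missing = None in (
        r.age, r.race, r.grade, r.marital_status,
        r.t_stage, r.n_stage, r.m_stage,
    )
    return included and not has_missing
```

Keeping inclusion and exclusion as two separate expressions mirrors the two lists in the text, so each criterion can be audited on its own.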

| Variable declaration
Patient characteristics included basic information, histological type, grade, breast subtype, primary site, tumor size, positive regional nodes and treatments. Age was dichotomized at 62 years, while tumor size was reclassified as ≤20, 20-50 and >50 mm. Surgery information was categorized as breast-conserving surgery (BCS) or mastectomy. Primary sites of tumors were divided into central portion of breast or nipple, lower-inner/lower-outer/upper-inner/upper-outer quadrant of breast and others. The subtypes of tumors were classified as HR+/HER2− (luminal A), HR+/HER2+ (luminal B), HR−/HER2+ (HER2 enriched) and HR−/HER2− (triple negative).
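As a minimal sketch of the recoding described above, the three helpers below map raw values to the categories used in the analysis. The function names and label strings are illustrative assumptions. Placing a tumor of exactly 20 mm in the lowest band and one of exactly 50 mm in the middle band follows the "≤20" and ">50" boundaries in the text.

```python
from typing import Optional


def age_group(age: int) -> str:
    """Dichotomize age at the 62-year cut-off."""
    return ">=62" if age >= 62 else "<62"


def size_group(size_mm: float) -> str:
    """Assign tumor size to one of the three bands used in the text."""
    if size_mm <= 20:
        return "<=20 mm"
    if size_mm <= 50:
        return "20-50 mm"
    return ">50 mm"


def subtype(hr_positive: bool, her2_positive: Optional[bool]) -> str:
    """Combine hormone receptor and HER2 status into a breast subtype."""
    if her2_positive is None:
        return "unknown"  # e.g. HER2 status not recorded
    if hr_positive:
        return "luminal B" if her2_positive else "luminal A"
    return "HER2 enriched" if her2_positive else "triple negative"
```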

| Statistical analysis
The baseline characteristics of IMPC patients after surgery were first described statistically. OS was defined as the time from the date of diagnosis to the date of death from any cause or the date of the last follow-up visit. The data were divided into training and validation sets in a 7:3 ratio using the "caret" package in R version 4.1.1. Survival analysis was performed using SPSS version 25.0. Kaplan-Meier survival curves were constructed for each variable and compared with the log-rank test. Variables with p < 0.05 in univariate analysis were included in Cox proportional hazards regression models to identify risk factors associated with IMPC prognosis.
Nomograms were built based on the multivariate analysis using the "rms" package. The performance of the nomogram was measured by the C-index to judge the accuracy of the prediction results. The total score of each patient in the validation set was calculated from the nomogram and included as a new factor in the Cox regression model. In addition, the area under the ROC curve (AUC) was calculated to assess the performance of the prognostic model, while the "stdca" function was used in decision curve analysis (DCA) to determine the suitability of the model. Moreover, calibration curves were plotted to compare predicted survival with actual survival determined using Kaplan-Meier analysis.
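To make the survival quantities concrete, here is a minimal product-limit (Kaplan-Meier) estimator written in plain Python. It is an illustrative sketch rather than the SPSS or R routine used in the study, and the function name and input layout are assumptions.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one death.

    times  : follow-up time per patient, in any consistent unit
    events : 1 if death was observed at that time, 0 if censored
    """
    deaths = Counter(t for t, e in zip(times, events) if e)
    exits = Counter(times)      # everyone leaving the risk set at time t
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = deaths.get(t, 0)
        if d:
            # Patients censored at t still count as at risk at t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve
```

For example, with follow-up times `[1, 2, 3, 4, 5]` and event flags `[1, 0, 1, 0, 1]`, the estimate drops to 0.8 at time 1, then to 0.8 × 2/3 at time 3, because the patient censored at time 2 has left the risk set by then.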
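Two of the metrics named above can be illustrated with short functions. The first is Harrell's C-index computed over comparable pairs. The second is the net benefit at a single threshold probability, the quantity plotted in a decision curve. Both are simplified sketches with hypothetical names: the net-benefit version assumes a fully observed binary outcome, whereas a survival DCA must additionally account for censoring.

```python
def c_index(times, events, risk):
    """Harrell's C: share of comparable pairs that the risk score orders correctly."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i is observed to die
            # before patient j's follow-up ends.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1
                elif risk[i] == risk[j]:
                    concordant += 0.5  # ties in risk count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


def net_benefit(outcomes, predicted, threshold):
    """Net benefit of treating everyone whose predicted risk >= threshold."""
    n = len(outcomes)
    flagged = [(y, p) for y, p in zip(outcomes, predicted) if p >= threshold]
    true_pos = sum(1 for y, _ in flagged if y)
    false_pos = len(flagged) - true_pos
    # False positives are weighted by the odds of the threshold probability.
    return true_pos / n - false_pos / n * threshold / (1 - threshold)
```

A C-index of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is the scale on which the reported values of 0.756 and 0.742 should be read.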

KEYWORDS
breast cancer, invasive micropapillary carcinoma, nomogram, overall survival, surgery

| RESULTS

| Clinicopathological characteristics of the patients
A total of 1855 patients diagnosed with IMPC after surgery were included in this study. Patients were randomly divided in a 7:3 ratio into two sets: a training set (n = 1300) for nomogram building, and a validation set (n = 555) for model validation. Next, the clinical and pathological characteristics of the patients in the training set were described in detail. The median age at primary diagnosis for the entire population was 62 (22-96) years. The majority of patients were white (78.2%), and 98.5% were female. The breast cancer subtype was HR+/HER2− (luminal A) in 52.7% of patients, HR+/HER2+ (luminal B) in 11.2%, HR−/HER2+ (HER2 enriched) in 2.8% and HR−/HER2− (triple negative) in 2.7%. Because the SEER database has recorded HER2 status only since 2010, HER2 status was unknown for patients diagnosed before that date (30.6%). A total of 41.4% of screened patients after surgery were stage I, 37.3% were stage II, 20% were stage III, and 1.3% were stage IV. A total of 52.4% of the patients received breast-conserving surgery (BCS), and the rest underwent mastectomy. The most common tumor location was the upper-outer quadrant (29.5%), and most tumors were ≤20 mm (58.4%). The proportions of patients who underwent chemotherapy and radiotherapy were 48.9% and 55.1%, respectively. The demographic and clinical characteristics of the study participants based on dataset classification are shown in Table 1.

| Prognostic factors
As shown in Table 2, Cox regression analysis was performed on the training set. Factors that were statistically significant in the univariate analysis were subjected to multicollinearity diagnostics, and strong collinearity was found between T, N, M stage and clinical stage of the tumor; therefore, we did not include T stage, M stage or lymph node status in the multivariate Cox regression model. Ultimately, age ≥ 62 (p < 0.001), negative ER (p = 0.004), stage III (p = 0.044), and stage IV (p < 0.001) were related to a significantly increased risk of death in IMPC patients after surgery. In contrast, being married (p < 0.001), white or other race (p = 0.002), chemotherapy (p < 0.001) and radiotherapy (p = 0.002) were associated with a significant reduction in risk. Kaplan-Meier analysis with the log-rank test was performed for the above factors using the "survival" package of R software, and the same statistical results were obtained (Figure 1). The study also found no significant difference in survival time between patients treated with the two surgical modalities. These results identified factors that may predict the prognosis of IMPC patients after surgery.
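For readers less familiar with how these risk statements are read off a Cox model, the standard textbook form of the model is given below. The notation is generic and not taken from the original article: $h_0(t)$ is the baseline hazard, $x_i$ are the covariates, and $\beta_i$ are the fitted coefficients.

```latex
h(t \mid x) = h_0(t)\,\exp\!\left(\sum_{i=1}^{p} \beta_i x_i\right),
\qquad
\mathrm{HR}_i = \exp(\beta_i)
```

A hazard ratio $\mathrm{HR}_i > 1$ indicates an increased hazard of death for that factor (as reported for age ≥ 62, negative ER and stage III-IV), while $\mathrm{HR}_i < 1$ indicates a reduced hazard (as reported for chemotherapy and radiotherapy).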

| Development and validation of nomogram
To demonstrate the interrelationship between the variables, we constructed a nomogram predicting OS by integrating the independent predictors. The nomogram showed that stage was the main factor affecting prognosis, followed by race, age, ER status, marital status, chemotherapy, and radiotherapy (Figure 2). The prognostic models for the two groups were examined by plotting receiver operating characteristic (ROC) curves (

| Risk assessment
Based on the postoperative OS risk, we divided the patients into three groups using the "coxph" function, including a low-risk group and a high-risk group. Kaplan-Meier survival curves plotted for each group showed statistically significant differences in OS between risk levels in both the training and validation sets (p < 0.001) (Figure 6A,B). These results demonstrated the strong predictive value of this risk grouping system for the postoperative prognosis of IMPC patients, further supporting the applicability of this prognostic model.
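Risk grouping of this kind amounts to comparing each patient's total score with one or more cut points. The sketch below shows the idea under stated assumptions: the cut-point value and the labels are placeholders because this section does not report them, and the function name is hypothetical.

```python
from bisect import bisect_right


def risk_group(total_points, cut_points, labels):
    """Map a total risk score to a group label.

    cut_points must be sorted in ascending order, and labels must contain
    exactly one more entry than cut_points. A score equal to a cut point
    is assigned to the higher-risk group.
    """
    if len(labels) != len(cut_points) + 1:
        raise ValueError("need exactly one more label than cut points")
    return labels[bisect_right(cut_points, total_points)]


# Placeholder cut point for illustration only; not a value from the study.
EXAMPLE_CUTS = [100.0]
EXAMPLE_LABELS = ["low risk", "high risk"]
```

With these placeholder values, `risk_group(85.0, EXAMPLE_CUTS, EXAMPLE_LABELS)` returns `"low risk"`. Adding a second cut point and a third label extends the same function to three groups.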

| DISCUSSION
BC is the most common form of cancer and the leading cause of cancer deaths in women worldwide. 11 The combination of various treatment modalities, such as chemotherapy, hormone therapy, targeted therapy and immunotherapy, can effectively control disease progression and improve patients' quality of life. 12 Many patients with IMPC present with clinically positive lymph nodes, which is often attributed to the strong lymphovascular tropism of the tumor. 13 The underlying biology of the micropapillary histological pattern appears to favor lymphatic spread of the tumor. 13 Verras et al. suggested that although there are no specific guidelines, breast surgeons should be aware that IMPC may require more extensive marginal excision. 13 Although IMPC has a high propensity for lymph node metastasis, various studies have shown that its overall prognosis is similar to that of IDC. 9,14-16 Comparison of IMPC and IDC using propensity score matching (PSM) to remove confounding factors revealed no significant differences in OS and disease-free survival (DFS) between the two groups. 16 The BCSS and OS of IMPC were even superior to those of IDC in patients with AJCC stage II-III disease and histological grade II-III tumors. 17 Chen et al. proposed that the disease-specific survival (DSS) and overall prognosis of IMPC are similar to those of IDC and that patients with ER-negative tumors or ≥4 positive lymph nodes have the worst prognosis. 14 In addition to positive ER and fewer positive lymph nodes, Lewis et al. further demonstrated that age <65 years and receipt of radiotherapy were also protective factors. 4 Ye et al. analyzed 1407 IMPC patients from the SEER database and found that larger tumors, younger age, black race, and lack of hormone receptor expression were significantly associated with regional lymph node involvement. 18 Surgery can effectively alleviate the progression of IMPC. However, there is no prognostic prediction model for patients with IMPC after surgery.
Nomograms are conducive to the promotion of personalized medicine and have been proposed as a means to improve disease prediction. 19-21 In this study, we analyzed 1855 IMPC breast cancer patients from the SEER database after surgery and identified age, race, marital status, stage, ER status, radiotherapy and chemotherapy as factors affecting prognosis. These characteristics were then used to build a nomogram for predicting 3-, 5-, and 10-year OS. Unfavorable prognostic factors for IMPC patients after surgery included age ≥62, black race, stage III-IV and negative ER status. Receiving radiotherapy or chemotherapy improved patient prognosis, while there was no difference in OS between BCS and mastectomy. The AUC values of the training and validation sets, as well as the C-index, demonstrated the strong accuracy of the model in predicting 3-, 5-, and 10-year OS. The calibration curves demonstrated the consistency of the model's predictions.
This is a retrospective study and inevitably has some limitations. Information on the HER2 status of tumors prior to 2010 was not available for 31.2% of the total population, so the prognostic impact of HER2 status may have been overlooked. Although M stage was statistically significant in the univariate analysis, we did not include it in the multivariate analysis because the small number of patients with distant metastases could bias the results. Overall, our model has good clinical applicability in predicting the postoperative prognosis of IMPC patients.

FUNDING INFORMATION
This work was supported by the National Natural Science Foundation of China (81860465 and 8216110588).