Abstract

Objective. To assess the diagnostic performance of common clinical tumor markers, alone and in combination, in distinguishing nonmetastatic breast cancer from benign breast tumors, and to establish through this diagnostic process a predictive model with better diagnostic ability for nonmetastatic breast cancer. Methods. A total of 222 patients with nonmetastatic breast cancer and 265 patients with benign breast disease were enrolled in this study. CEA, Ca 15-3, Ca 125, Ca 72-4, CYFRA 21-1, FERR, AFP, and NSE were measured by an electrochemiluminescent immunoenzymometric assay on the Elecsys system. The diagnostic workflow had four key steps: feature selection, algorithm selection, parameter optimization, and validation of the optimal algorithm and markers on outer test data. Results. CEA, Ca 15-3, CYFRA 21-1, AFP, and FERR were selected using the t-test in the inner development set. The optimal algorithm among logistic regression, decision tree, support vector machine, random forest, and gradient boosting machine was selected by 10-fold cross-validation; random forest and logistic regression were the better classifiers. The outer test data were then used to validate the selected markers and classifiers. Random forest with CEA, Ca 15-3, CYFRA 21-1, AFP, and FERR was the optimal combination for distinguishing breast cancer from benign breast disease. The AUC was 0.888, the cut-off point was 0.484, and sensitivity and specificity were 78.9% and 90.1%, respectively. Conclusions. No single marker among these eight was good at distinguishing nonmetastatic breast cancer from benign tumors. However, a diagnostic analysis workflow was established to develop a predictive model with better diagnostic capability for nonmetastatic breast cancer. This workflow is also applicable to the optimization of other disease markers and diagnostic models. The predictive model showed good diagnostic performance and could be gradually incorporated as a support method for the diagnosis of nonmetastatic breast cancer.

1. Introduction

Breast cancer is by far the most frequently diagnosed cancer among women, accounting for 11.6% of all new cancer cases and 6.6% of all cancer deaths worldwide [1]. There were an estimated 2.0 million new cases (24.2% of all cancers in women) and 0.6 million cancer deaths (15.0% of all cancer deaths in women) in 2018 [1]. Early diagnosis plays an important role in optimizing treatment and reducing the mortality of breast cancer patients [2]. Screening for early breast cancer forms part of state programmes of routine annual or biannual ultrasonography or mammography for women within a certain age range [3]; in China, this range is 40 to 70 years. At present, mammography is the most common screening method for the detection of breast cancer. However, the results are not particularly satisfactory because of the high false-positive and false-negative rates [4]. Ultrasonography is also used for the early diagnosis of breast cancer in China. Unfortunately, approximately 20% of breast cancer patients cannot be diagnosed in this way [5]. Therefore, a complementary instrument is required to improve the early diagnosis of breast cancer.

The common tumor markers in clinical use are each dominant for a particular tumor, such as carcinoembryonic antigen (CEA) for colorectal cancer, alpha-fetoprotein (AFP) for hepatocellular carcinoma, and Ca 125 for ovarian cancer [6]. However, they cannot be used as effective indicators for the diagnosis of breast cancer. There is no clinical guideline, or even consensus among experts, on the use of biomarkers for the early diagnosis of breast cancer. CEA and Ca 15-3 are recommended only for therapeutic monitoring of breast cancer and early detection of recurrent disease, but not for breast cancer detection, because of their low sensitivity [7–9]. However, many cancers may be detected not by their dominant markers but by the elevation of tumor markers that are not recommended for monitoring their tumor activity [10]. Screening with multiple tumor markers also allows cancers to be detected in the absence of their dominant markers [10].

Multivariate statistics combined with machine learning have been widely reported as a means of clinical data analysis, especially in the field of breast imaging [11]. However, studies of molecular markers for distinguishing benign from malignant breast disease have so far only selected potential universal markers; none has reported the optimization of indicators and classifiers. In this study, we compared the levels of a marker panel (CEA, Ca 15-3, Ca 125, Ca 72-4, cytokeratin 19 fragment (CYFRA 21-1), ferritin (FERR), AFP, and neuron-specific enolase (NSE)) in breast cancer patients with those in benign controls and sought an effective marker combination with better diagnostic capability for nonmetastatic breast cancer.

2. Materials and Methods

2.1. Patients

The development set included 111 breast cancer and 132 benign samples, which were obtained from the First Hospital of Tsinghua University. The study was performed according to the standards of the Institutional Ethical Committee and the Helsinki Declaration and was approved by the clinical ethics committee of the Tsinghua University. The patients with breast cancer were selected according to the following criteria: (1) all patients with breast cancer were diagnosed by pathology; (2) all patients were female; (3) all patients had no distant metastases; (4) no patients received prior neoadjuvant oncological treatment; and (5) no patients were previously diagnosed with any other tumor. The patients with benign breast diseases were selected according to the following criteria: (1) all patients with benign breast diseases were diagnosed by pathology; (2) all patients were female; (3) no patients were previously diagnosed with any other tumor.

According to these criteria, we collected a validation set, including 111 breast cancer and 133 benign samples from the First Hospital of Tsinghua University, as the outer test group. All of these cancer and benign samples were approximately age-matched and were included according to the above criteria. All patients underwent surgical resection of the tumors. The clinicopathological characteristics and tumor stage were assessed based on the histopathological results.

2.2. Marker Analysis

Tumor marker measurements were performed strictly according to the manufacturer's instructions and quality control was ensured. Before surgery and after overnight fasting, 10 ml of venous blood was collected into a vessel tube containing heparin as an anticoagulant from each subject and was subsequently centrifuged (1500 × g for 15 min) to collect clear serum. The sera were then transferred into sterile vials and immediately stored at −80°C until further analysis. Subsequently, CEA, Ca 15-3, Ca 125, Ca 72-4, CYFRA 21-1, FERR, AFP, and NSE were assessed by an electrochemiluminescent immunoenzymometric assay (Roche Diagnostics, Germany) on the Elecsys system.

2.3. Marker and Model Optimization

We aimed to build a binary classifier that can distinguish between the nonmetastatic breast cancer and benign breast tumors accurately. The workflow is described in Figure 1.

Our diagnostic workflow has four key steps: feature selection, algorithm selection, parameter optimization, and outer validation of the optimal algorithm and markers. All analyses were performed using the rpart, randomForest, e1071, and gbm packages of R software (http://www.r-project.org).

For the main modeling methods, logistic regression used the “glm” function in the “stats” package, with the family parameter set to binomial (logit link) and default values for the remaining parameters. The decision tree used the “rpart” function in the “rpart” package, with the method parameter set to “class” and default values for the remaining parameters. Random forest used the “randomForest” function in the “randomForest” package, with the “mtry” parameter set to “2”, the “ntree” parameter set to “500”, the proximity parameter set to “T”, and the importance parameter set to “T”. The SVM model used the “svm” function in the “e1071” package with its default parameters.

Student's t-test was carried out on the internal training data to identify indicators whose levels differed between breast cancer and benign breast diseases; these were taken as potential markers.
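The feature-selection step can be illustrated with a minimal sketch of the two-sample Student's t statistic with pooled variance. It is written in Python for illustration only and is not the R code used in the study; the function name `student_t` and the marker values are hypothetical. The p-value, which requires the t distribution, is omitted, and the statistic is instead compared with the standard two-sided 5% critical value for 10 degrees of freedom (2.228).

```python
import math
from statistics import mean, variance

def student_t(a, b):
    """Two-sample Student's t statistic (pooled variance); returns (t, df)."""
    na, nb = len(a), len(b)
    # Pooled variance assumes both groups share the same population variance.
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    standard_error = math.sqrt(pooled * (1 / na + 1 / nb))
    return (mean(a) - mean(b)) / standard_error, na + nb - 2

# Hypothetical marker values for illustration; NOT data from the study.
cancer = [3.1, 4.0, 2.8, 5.2, 3.9, 4.4]
benign = [1.9, 2.2, 2.5, 1.7, 2.9, 2.1]

t_stat, df = student_t(cancer, benign)
# With df = 10, |t| > 2.228 corresponds to a two-sided p < 0.05.
is_candidate_marker = abs(t_stat) > 2.228
```

A marker would be carried forward to the modeling step only when the statistic exceeds the critical value for the chosen significance level.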

Ten-fold cross-validation was carried out, with random seeds set to ensure the repeatability of the results. The optimal algorithm was selected from logistic regression, decision tree, support vector machine, random forest, and gradient boosting machine by evaluating the performance of the trained models. The sensitivity, specificity, accuracy, and area under the curve (AUC) were determined. The inner training data were randomly divided into 10 subsets of equal size; in each round, a single subset was retained as the validation data for evaluating the model, and the remaining 9 subsets were used for training. All values were calculated as the mean over the 10 rounds.
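The splitting scheme described above can be sketched as follows. This is an illustrative Python sketch, not the authors' R implementation; the function names and the seed value 42 are assumptions. The sample count of 243 is the size of the development set (111 + 132); because 243 is not divisible by 10, the folds are near-equal (24 or 25 samples) rather than exactly equal.

```python
import random

def k_fold_indices(n_samples, k=10, seed=42):
    """Yield (train, valid) index lists for k-fold cross-validation."""
    indices = list(range(n_samples))
    # A fixed seed makes the random partition repeatable across runs.
    random.Random(seed).shuffle(indices)
    folds = [indices[i::k] for i in range(k)]  # k near-equal subsets
    for i, valid in enumerate(folds):
        # The other k - 1 folds together form the training set.
        train = [j for f, fold in enumerate(folds) if f != i for j in fold]
        yield train, valid

def mean_over_folds(values):
    """Average a per-fold metric (e.g., AUC) over all folds."""
    return sum(values) / len(values)

splits = list(k_fold_indices(243))
fold_sizes = [len(valid) for _, valid in splits]
```

Each sample appears in exactly one validation fold, so every metric reported as a cross-validated mean is based on predictions for samples the model did not see during that round of training.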

Finally, the outer test data were used as external validation to verify the optimal algorithm and markers.

2.4. Statistical Analysis

Results are expressed as the mean ± SD for continuous variables and as the number (percent) for categorical variables. All statistical analyses were conducted using R software version 2.9.1. The differences in tumor markers between breast cancer and benign breast diseases were compared: when the data followed a normal distribution, the t-test was used; otherwise, the Wilcoxon rank-sum test was used.

3. Results

3.1. Patient Characteristics

A total of 222 patients with nonmetastatic breast cancer and 265 patients with benign breast disease were enrolled in our study. All subjects were Han Chinese women. The basic clinical and biological characteristics of the nonmetastatic breast cancer patients in the development set and validation set are summarized in Table 1. The mean age of patients with benign breast disease was 42.6 ± 12.6 years in the development set and 42.7 ± 12.4 years in the validation set. There was no difference in clinicopathological characteristics between the two sets.

3.2. Blood Biomarkers Analysis

The levels of serum CEA, Ca 15-3, Ca 125, Ca 72-4, CYFRA 21-1, FERR, AFP, and NSE in all patients were analyzed. Univariate statistical analysis in R was performed to test the statistical significance of the differences in tumor biomarkers between breast cancer patients and benign breast disease patients. Five tumor biomarkers differed significantly and were selected (Table 2). These five biomarkers, CEA, Ca 15-3, CYFRA 21-1, AFP, and FERR, showed increased levels in breast cancer patients compared with benign breast disease patients (Figure 2).

3.3. Relationships between Serum Biomarkers and Clinical Characteristics of Breast Cancer Patients

The levels of these eight tumor biomarkers in the 222 breast cancer patients of the development and validation sets were analyzed across clinicopathological characteristics to investigate the relationship between the biomarkers and the clinical characteristics of the patients. We performed a matrix correlation analysis of tumor biomarkers and clinicopathological characteristics of patients with breast cancer (Figure 3(a)). The changes in the levels of these eight biomarkers were not correlated with histology or molecular subtype. However, CEA and Ca 15-3 levels differed significantly between Tis-T1 and T2-3 tumors (Figure 3(b)). For the clinical staging of breast cancer, Ca 15-3 levels were also higher in stage III than in stage I and in stages 0-II (Figure 3(c)). For the molecular marker, CYFRA 21-1 levels were higher in patients with Ki-67 expression of ≥14% (Figure 3(d)). Furthermore, NSE levels were lower in grade III patients than in grade II patients, and FERR levels were lower in grade II patients than in grade I patients (Figure 3(e)). Considering lymph node staging, Ca 15-3 levels were higher in N2 patients than in N0 and N1 patients (Figure 3(f)), whereas NSE levels were lower in N3 patients than in N0, N1, and N2 patients.

3.4. Differential Diagnostic Value of Biomarkers

The capacity of these five tumor markers to differentiate breast cancer patients from patients with benign breast disease was assessed with ROC analysis. CEA (AUC 0.716) and CYFRA 21-1 (AUC 0.761) showed good diagnostic performance. Sensitivity and specificity were 64.0% and 66.9% for CEA and 64.0% and 80.5% for CYFRA 21-1, respectively (Figure 4; Table 3).
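The quantities reported by the ROC analysis can be illustrated with a minimal Python sketch. The AUC is computed as the probability that a randomly chosen cancer sample has a higher marker value than a randomly chosen benign sample, with ties counted as one half. The choice of cut-off by Youden's J is an assumption for illustration, since the text does not state how cut-off points were chosen; the function names and marker values are hypothetical.

```python
def auc(pos, neg):
    """AUC as P(positive value > negative value), ties counted as 0.5."""
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def sens_spec(pos, neg, cutoff):
    """Sensitivity and specificity when values >= cutoff are called positive."""
    true_pos = sum(v >= cutoff for v in pos)
    true_neg = sum(v < cutoff for v in neg)
    return true_pos / len(pos), true_neg / len(neg)

def youden_cutoff(pos, neg):
    """Cut-off maximizing sensitivity + specificity - 1 (assumed criterion)."""
    candidates = sorted(set(pos) | set(neg))
    return max(candidates, key=lambda c: sum(sens_spec(pos, neg, c)) - 1)

# Hypothetical values for illustration; NOT data from the study.
cancer = [0.9, 0.8, 0.6, 0.4]
benign = [0.7, 0.3, 0.2, 0.1]

area = auc(cancer, benign)                         # 14 of 16 pairs -> 0.875
best = youden_cutoff(cancer, benign)               # 0.4
sensitivity, specificity = sens_spec(cancer, benign, best)  # 1.0, 0.75
```

Because sensitivity and specificity depend on the chosen cut-off while the AUC does not, two markers with similar AUC values can still report quite different sensitivity and specificity pairs.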

3.5. Establishment and Validation of a Predictive Model

Multivariate statistical analysis was used for further research. We chose logistic regression, decision tree, random forest, support vector machine, and gradient boosting machine as candidate algorithms. Accuracy, sensitivity, specificity, and AUC were calculated for each model through 10-fold cross-validation. Based on the results of the 10 validation rounds, logistic regression and random forest had similar classification performance, both showing high AUC and accuracy (Table 4).

We then used the outer validation data to perform the external test. Through ROC comparison, we found that random forest showed the best diagnostic performance, with an AUC of 0.888, sensitivity of 78.9%, and specificity of 90.1% (Figure 5(a)), compared with an AUC of 0.777 for the logistic regression model (Figure 5(b)). In addition, variable importance in the random forest model was evaluated from two perspectives: Mean Decrease Accuracy and Mean Decrease Gini. The results showed that CYFRA 21-1, CEA, and Ca 15-3 were the three most important variables in the model (Figure 6).
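The idea behind Mean Decrease Accuracy can be sketched as a permutation test: the values of one variable are shuffled across samples, and the resulting loss of accuracy is taken as that variable's importance. The Python sketch below is a simplified illustration under stated assumptions and is not the randomForest implementation, which, to our understanding, performs the permutation on the out-of-bag samples of each tree. The toy classifier, data, and function names are hypothetical.

```python
import random

def accuracy(predict, rows, labels):
    """Fraction of rows whose predicted label matches the true label."""
    return sum(predict(r) == y for r, y in zip(rows, labels)) / len(rows)

def permutation_importance(predict, rows, labels, column, seed=0):
    """Accuracy lost when the values of one column are shuffled across rows."""
    baseline = accuracy(predict, rows, labels)
    shuffled = [r[column] for r in rows]
    random.Random(seed).shuffle(shuffled)
    # Rebuild each row with only the chosen column replaced.
    permuted = [r[:column] + [v] + r[column + 1:] for r, v in zip(rows, shuffled)]
    return baseline - accuracy(predict, permuted, labels)

# Hypothetical two-marker data and a toy rule that uses only marker 0.
rows = [[0.9, 1.0], [0.8, 0.1], [0.2, 0.7], [0.1, 0.3]]
labels = [1, 1, 0, 0]

def toy_rule(row):
    return int(row[0] > 0.5)

importance_marker0 = permutation_importance(toy_rule, rows, labels, column=0)
importance_marker1 = permutation_importance(toy_rule, rows, labels, column=1)
```

Because the toy rule ignores marker 1, shuffling it cannot change any prediction and its importance is exactly zero, whereas shuffling marker 0 can only leave accuracy unchanged or reduce it.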

4. Discussion

In the present analysis, we investigated a panel of different markers to define which marker or combination of markers can be used to distinguish nonmetastatic breast cancer from benign breast lumps and to develop a workflow with better diagnostic capability for nonmetastatic breast cancer.

A total of eight clinically used markers, namely CEA, Ca 15-3, Ca 125, Ca 72-4, CYFRA 21-1, FERR, AFP, and NSE, were measured in all patients. Among them, five markers (CEA, Ca 15-3, CYFRA 21-1, FERR, and AFP) were found to differ significantly between breast cancer and benign tumors.

The fold change was calculated as the average value in breast cancer divided by the average value in benign breast disease. A fold change larger than 1 indicates a higher level of the biomarker in breast cancer than in benign breast disease, while a fold change lower than 1 indicates a lower level.
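As a worked illustration of this definition, the following Python sketch computes the fold change for one marker. The function name and the values are hypothetical and are not data from the study.

```python
from statistics import mean

def fold_change(cancer_values, benign_values):
    """Mean level in breast cancer divided by mean level in benign disease."""
    return mean(cancer_values) / mean(benign_values)

# Hypothetical marker levels for illustration; NOT data from the study.
cancer = [4.0, 6.0]   # mean 5.0
benign = [2.0, 3.0]   # mean 2.5

fc = fold_change(cancer, benign)   # 5.0 / 2.5 = 2.0
higher_in_cancer = fc > 1
```

A ratio of means is sensitive to extreme values, so for strongly skewed marker distributions a ratio of medians could be considered as an alternative; the text describes the mean-based version.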

In the present analysis, Ca 15-3, FERR, and AFP showed increased levels in breast cancer patients compared with the benign breast disease controls. Ca 15-3, a variant of mammary epithelial surface glycoprotein and an antigen related to breast cancer, is used for therapeutic monitoring of breast cancer and early detection of recurrent disease [6–8]. Choi et al. [12] found that the levels of Ca 15-3 were higher in breast cancer patients than in patients with benign breast disease, using an antibody-lectin sandwich assay that appeared to discriminate nonmetastatic breast cancer from benign breast disease efficiently. In our study, the Ca 15-3 level was also elevated in breast cancer patients. However, Ca 15-3 showed poor diagnostic ability, which might be due to the different detection methods. In addition, the serum level of Ca 15-3 was associated with host tumor burden, such as larger tumor size, more lymph node metastases, and advanced stage. Therefore, high preoperative serum levels of Ca 15-3 may indicate a poor outcome.

Ferritin is currently used to monitor the presence of malignant disease and is regarded as a predictor of lymph node involvement in patients with breast cancer [13, 14]. Orlandi et al. [14] found that breast cancer patients had significantly higher ferritin levels than benign breast disease controls, which is consistent with our results. Several studies indicate that plasma ferritin is produced and secreted by macrophages, hepatocytes, and cancer cells [15], which may explain the high levels of ferritin in breast cancer.

AFP, used as a liver cancer biomarker for over 30 years, may also be elevated to varying degrees in patients with gastric cancer, pancreatic cancer, or lung cancer [16–19]. He et al. [20] found that the median value of AFP in 17 kinds of diseases, including breast cancer, was higher than that in healthy controls. Few studies have examined the difference in AFP levels between benign and malignant breast disease. Our results showed that AFP was elevated in breast cancer patients compared with the benign breast disease controls. Taken together, these findings indicate that both the source and the regulation of serum AFP levels are much more complicated than previously thought.

CEA and CYFRA 21-1 also showed increased levels in breast cancer patients compared with the benign breast disease controls. Concerning the differentiation between the two groups, CEA (AUC 0.716) and CYFRA 21-1 (AUC 0.761) showed good diagnostic performance. In addition, CEA and CYFRA 21-1 were directly associated with larger tumor size and high Ki-67 index, respectively, in our study. Since tumor size and Ki-67 level are positively correlated with host tumor burden and the degree of malignancy of breast cancer, respectively [21, 22], elevated preoperative levels of serum CEA and CYFRA 21-1 could be related to a poor outcome. CEA, a widely used tumor marker for examination and prediction in many cancers [23], and CYFRA 21-1, an excellent tumor marker in lung cancer [24, 25], were found to be elevated in breast cancer patients in several studies [26–30], which is consistent with our findings. However, when we performed cross-validation within the test group, we found that the AUC values of CEA and CYFRA 21-1 were not stable. This means that none of these five markers alone diagnoses breast cancer well.

One relevant outcome of the present work is the design of a final predictive model. Some reports have used combinations of molecular markers to identify breast cancer [26, 30, 31]. Bayo et al. developed a predictive model using NSE, Ca 15-3, NGAL, EGFR, and 8-OHdG for early breast cancer diagnosis, with an AUC of 0.918 [26]. Liu et al. used a panel of PD-1, IL-10, IL-2Rα, and Ca 15-3 for early-stage breast cancer diagnosis; this panel had an AUC of 0.811 [31]. These studies combined newly discovered molecular markers with classic tumor markers to improve the diagnosis rate of breast cancer. However, the diagnostic value had not been verified, and new molecular markers have a longer turnaround time and face many uncertainties between discovery and clinical application. In our study, the markers we selected were all tumor markers commonly used in clinical practice, and the model we established still had stable and good diagnostic performance after validation. Since these markers are widely used in various medical institutions, this diagnostic tool could be easily promoted and applied.

Some studies [8, 32] reported that simultaneous use of CEA and Ca 15-3 allowed the early diagnosis of metastasis in up to 60–80% of patients with breast cancer. Moreover, CEA and Ca 15-3 have been shown to detect 40–60% of breast cancer recurrences before clinical or radiological evidence of disease. Our study contains data only on early breast cancer, and patients with advanced breast cancer should be added in the future. In addition, CEA and Ca 15-3 have been able to predict the recurrence of breast cancer [33]. We believe that the combined application of these five markers can better predict the recurrence of breast cancer, which will also be the direction of our future work.

At the beginning of the modeling, the mean AUC of the logistic regression model was found to be similar to that of the random forest model in the internal 10-fold cross-validation. This result indicated that the generalized linear regression method might have internal stability similar to that of the nonparametric method. However, in the external validation, the random forest algorithm performed better and showed stronger external generalization ability, which is more in line with clinical application. The relationship between outcome variables and multiple indicators often cannot be parameterized by simple linear means. The same is true for the markers of breast cancer in our study, because the projection of the variables to a high-dimensional space in the modeling is not completely linear. Therefore, the nonparametric method can better fit the relationship between outcome variables and multiple indicators, and the importance of variables in the random forest model was also more reliable (Figure 5). In future studies, a larger population is required to obtain more general results and to prove that the workflow is widely applicable and robust in different cohorts.

Our research had the following limitations. First, our study population consisted only of Chinese women. In future studies, the scope could be broadened to include other ethnicities. Second, although the sample size was relatively large, the study population was selected from one hospital, and further validation should be carried out in other research institutions.

5. Conclusions

In summary, CEA, Ca 15-3, CYFRA 21-1, FERR, and AFP were found to be elevated in nonmetastatic breast cancer patients compared with the benign breast disease controls in our study. However, none of these five markers alone was good at distinguishing breast cancer from benign tumors. A diagnostic analysis workflow was established to develop a predictive model with better diagnostic capability for nonmetastatic breast cancer. This workflow is also applicable to the optimization of other disease markers and diagnostic models. The predictive model showed good diagnostic performance, with an AUC of 0.888, sensitivity of 78.9%, and specificity of 90.1%, and it could be gradually incorporated as a support method for the diagnosis of nonmetastatic breast cancer.

Data Availability

The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Conflicts of Interest

The authors declare that there are no conflicts of interest regarding the publication of this paper.

Authors’ Contributions

Nan Jiang, Tian Tian, and Xianyang Chen contributed equally to the work.

Acknowledgments

The authors thank Dr. Rui Jia and Dr. Teng Xue for their help in statistical analysis.