Establishment of a prognostic risk prediction model for non-small cell lung cancer patients with brain metastases: a retrospective study

Background Patients with non-small cell lung cancer (NSCLC) who develop brain metastases (BM) have a poor prognosis. This study aimed to construct a clinical prediction model to determine the overall survival (OS) of NSCLC patients with BM. Methods A total of 300 NSCLC patients with BM at the Yunnan Cancer Centre were retrospectively analysed. The prediction model was constructed using least absolute shrinkage and selection operator-Cox regression. The bootstrap sampling method was employed for internal validation. The performance of our prediction model was compared with that of recursive partitioning analysis (RPA), graded prognostic assessment (GPA), the update of the graded prognostic assessment for lung cancer using molecular markers (Lung-molGPA), the basic score for BM (BSBM), and tumour-lymph node-metastasis (TNM) staging. Results A prediction model comprising 15 predictors was constructed. The area under the curve (AUC) values for the 1-year, 3-year, and 5-year time-dependent receiver operating characteristic (ROC) curves were 0.746 (0.678–0.814), 0.819 (0.761–0.877), and 0.865 (0.774–0.957), respectively. The bootstrap-corrected AUC value and Brier score for the prediction model were 0.811 (0.638–0.950) and 0.123 (0.066–0.188), respectively. The time-dependent C-index indicated that our model exhibited significantly greater discrimination than RPA, GPA, Lung-molGPA, BSBM, and TNM staging. Similarly, the decision curve analysis demonstrated that our model displayed the widest range of thresholds and yielded the highest net benefit. Furthermore, the net reclassification improvement and integrated discrimination improvement analyses confirmed the enhanced predictive power of our prediction model. Finally, the risk subgroups identified by our prognostic model exhibited superior differentiation of patients' OS. Conclusion The clinical prediction model constructed in this study shows promise in predicting OS for NSCLC patients with BM. Its predictive performance is superior to that of RPA, GPA, Lung-molGPA, BSBM, and TNM staging.


INTRODUCTION
or challenging to quantify evaluation metrics. Furthermore, only a few researchers have developed prognostic prediction models for BM in patients with NSCLC. For instance, Li et al. (2022) introduced a novel prognostic model based on clinical features and inflammation markers to improve prognostic accuracy for NSCLC patients with BM compared with the adjusted prognostic analysis, RPA, and GPA. Additionally, Zhang et al. (2020) examined the feasibility of employing computed tomography imaging radiomics to predict the survival of NSCLC patients with BM undergoing whole-brain radiotherapy.
However, these studies have certain limitations, including non-rigorous methods for selecting predictors and the limited clinical applicability of the developed models. Therefore, it is crucial to identify clinically meaningful and cost-effective prognostic factors that are readily available at the time of BM onset. To bridge this knowledge gap, this study aimed to establish a novel prognostic model based on clinicopathological characteristics, serological indicators, and treatment information using least absolute shrinkage and selection operator (LASSO)-Cox regression analysis, thereby reflecting the prognosis of NSCLC patients diagnosed with BM more precisely. Our model could assist clinicians in formulating reasonable treatment plans.

MATERIALS & METHODS
This clinical prediction model was constructed according to the transparent reporting of a multivariable prediction model for individual prognosis or diagnosis (TRIPOD) checklist (Collins et al., 2015).
This research adheres to the Declaration of Helsinki. The Ethics Committee of the Third Affiliated Hospital of Kunming Medical University approved this study (review number: KYLX2022221). Given the retrospective design of the study and the challenges associated with assessing certain patients, the Ethics Committee granted a waiver of informed consent for a subset of the patients.

Study population and follow-up
This retrospective study included 300 patients with NSCLC who were diagnosed with BM between January 2006 and May 2020 at Yunnan Cancer Hospital, Third Affiliated Hospital of Kunming Medical University. The inclusion criteria for this study were as follows: (1) NSCLC confirmed via pathological examination; (2) magnetic resonance imaging-confirmed BM; (3) availability of patient demographic characteristics, clinicopathological features, serological indicators, and treatment information; (4) absence of other concurrent cancers. The survival duration of the patients was determined by reviewing the medical records and conducting telephone follow-up. The OS was defined as the interval from the initial diagnosis until death from any cause or the date of the last follow-up visit (Pilz, Manegold & Schmid-Bindert, 2012).

Data collection
Baseline clinical data were obtained from medical records at the time of initial diagnosis of BM. The collected data encompassed general conditions (age, sex, body mass index (BMI), smoking history, and Karnofsky performance status (KPS)), tumour markers (carcinoembryonic antigen (CEA), neuron-specific enolase (NSE), cytokeratin 19 fragment (CYFRA21), and squamous cell carcinoma antigen (SCCA)), serological indicators (albumin (ALB), lactate dehydrogenase, and alkaline phosphatase (ALP)), serum inflammatory indicators (neutrophils, platelets, lymphocytes, monocytes, platelet/lymphocyte ratio (PLR), neutrophil/lymphocyte ratio (NLR), systemic immune-inflammation index (SII) = platelet × neutrophil/lymphocyte, advanced lung cancer inflammation index (ALI) = BMI × ALB/NLR, and prognostic nutritional index (PNI) = ALB + 5 × lymphocyte), distant metastases (number of BM, lung metastasis, intrathoracic metastasis (malignant pleural effusion, pericardial effusion, or pleural metastasis), liver metastasis, bone metastasis, adrenal metastasis, and metastases to other sites), signs and symptoms of BM (intracranial hypertension, focal signs and symptoms, epilepsy, and decreased cognitive function), type of pathology, pathological stage (tumour stage, lymph node stage (N_stage), metastasis stage (M_stage), and TNM stage), epidermal growth factor receptor (EGFR) gene mutation status, treatment status (surgery for primary lung cancer foci, radiotherapy of primary lung cancer, radiotherapy for BM lesions (whole-brain radiation therapy or stereotactic radiation therapy), surgical treatment of metastatic brain lesions, chemotherapy, or EGFR-tyrosine kinase inhibitor (TKI) treatment), and classification information of the RPA, GPA, Lung-molGPA, and BSBM models. The above predictors were complete and comprised objective data. All predictors were assessed independently of each other, without any knowledge of the clinical outcome.
All continuous predictors were kept continuous and were not categorised. The categories of all categorical predictors were predetermined before model construction. The continuous and categorical variables are presented in Table S1. The sample size of this study met the criterion of having events per variable (EPV) of >10 (Peduzzi et al., 1996; Austin, Allignol & Fine, 2017; Moons et al., 2019).
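The composite inflammatory and nutritional indices defined above follow directly from the routine blood counts. The sketch below is only an illustration of those formulas: the `BloodPanel` container, the function name, the measurement units, and the example values are assumptions introduced here, not part of the original analysis.

```python
from dataclasses import dataclass


@dataclass
class BloodPanel:
    """Hypothetical container for one patient's baseline values."""
    neutrophils: float  # 10^9/L (unit is an assumption)
    lymphocytes: float  # 10^9/L (unit is an assumption)
    platelets: float    # 10^9/L (unit is an assumption)
    albumin: float      # g/L (unit is an assumption)
    bmi: float          # kg/m^2


def inflammatory_indices(p: BloodPanel) -> dict:
    """Compute NLR, PLR, SII, ALI, and PNI with the formulas given in the text."""
    if p.lymphocytes <= 0 or p.neutrophils <= 0:
        raise ValueError("lymphocyte and neutrophil counts must be positive")
    nlr = p.neutrophils / p.lymphocytes
    plr = p.platelets / p.lymphocytes
    return {
        "NLR": nlr,
        "PLR": plr,
        "SII": p.platelets * p.neutrophils / p.lymphocytes,
        "ALI": p.bmi * p.albumin / nlr,
        "PNI": p.albumin + 5 * p.lymphocytes,
    }


# Made-up values, for illustration only.
example = BloodPanel(neutrophils=4.0, lymphocytes=2.0, platelets=250.0,
                     albumin=40.0, bmi=22.0)
indices = inflammatory_indices(example)
```

Because every index divides by the lymphocyte or neutrophil count, the sketch rejects non-positive counts rather than returning an undefined ratio.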

Model construction and evaluation
The final predictors were selected via 10-fold cross-validation of LASSO-Cox regression, whereby the λ-value associated with the minimum cross-validated error (lambda.min) was chosen. Subsequently, the final predictors were incorporated into a multivariate Cox regression analysis, and the risk score for each patient was calculated using the "predict()" function. Finally, a prognostic model was constructed.
The discriminatory ability of the model was assessed using the area under the receiver operating characteristic (ROC) curve (AUC). Furthermore, the calibration curve was plotted and the Brier score was calculated to measure the calibration of the model. Internal validation was conducted using the bootstrap method (1,000 resamples). The discriminative ability and clinical utility of the novel prognostic model were compared with those of RPA, GPA, Lung-molGPA, BSBM, and TNM staging using the time-dependent C-index and decision curve analysis (DCA). A larger AUC value indicated better discrimination (Carrington et al., 2023). DCA describes the trade-off between benefits and harms for each model across a range of cut points (threshold probabilities) (Van Calster et al., 2018). Furthermore, the integrated discrimination improvement (IDI) and net reclassification improvement (NRI) were employed to assess the discrimination and reclassification performance of our novel prediction model compared with RPA, GPA, Lung-molGPA, BSBM, and TNM staging. A nomogram was developed based on the selected predictors to facilitate individual survival prediction in NSCLC patients with BM. Subsequently, patients were classified into low-risk, intermediate-risk, and high-risk groups based on the risk score of the new prediction model. The differences in OS among these three subgroups were analysed using the Kaplan-Meier method. All statistical analyses were performed using R software (version 4.2.1; R Core Team, 2022), and statistical significance was set at P ≤ 0.05.
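For orientation, the two-step procedure can be written in its general textbook form as follows. The notation (covariate vector $x_i$, event indicator $\delta_i$, risk set $R(t_i)$, selected subset $S$) is introduced here for illustration only, and software implementations may scale the likelihood term or centre the linear predictor differently.

```latex
% Step 1: LASSO-penalised Cox partial likelihood (general form)
\hat{\beta}(\lambda) = \arg\max_{\beta}
  \left\{ \sum_{i:\,\delta_i = 1}
    \left[ x_i^{\top}\beta - \log \sum_{j \in R(t_i)} \exp\!\left(x_j^{\top}\beta\right) \right]
    - \lambda \sum_{k=1}^{p} \lvert \beta_k \rvert \right\}

% Step 2: refit an unpenalised Cox model on the selected predictors
% S = \{k : \hat{\beta}_k(\lambda) \neq 0\}, giving coefficients \tilde{\beta}_S,
% and take the linear predictor as the risk score
\mathrm{RiskScore}_i = x_{i,S}^{\top}\,\tilde{\beta}_S
```

A larger λ shrinks more coefficients to exactly zero, which is why the choice of λ by cross-validation determines how many predictors enter the second step.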
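The net benefit plotted in a decision curve can be illustrated with a deliberately simplified sketch. It assumes a binary outcome at a fixed time horizon with no censoring, which is not how survival DCA handles follow-up; the function names and the data are hypothetical and serve only to show the formula net benefit = TP/n − FP/n × pt/(1 − pt).

```python
def net_benefit(risks, events, threshold):
    """Net benefit of intervening on patients whose predicted risk >= threshold.

    risks: predicted probabilities of the event by the horizon.
    events: 1 if the event occurred by the horizon, else 0 (no censoring assumed).
    """
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(risks)
    true_pos = sum(1 for r, e in zip(risks, events) if r >= threshold and e == 1)
    false_pos = sum(1 for r, e in zip(risks, events) if r >= threshold and e == 0)
    # False positives are weighted by the odds of the threshold probability.
    return true_pos / n - false_pos / n * threshold / (1 - threshold)


def treat_all_net_benefit(events, threshold):
    """Reference strategy: intervene on every patient."""
    prevalence = sum(events) / len(events)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Made-up predictions and outcomes, for illustration only.
risks = [0.9, 0.8, 0.6, 0.4, 0.3, 0.1]
events = [1, 1, 0, 1, 0, 0]
nb_model = net_benefit(risks, events, 0.5)
nb_all = treat_all_net_benefit(events, 0.5)
```

A model is useful at a given threshold only if its net benefit exceeds both the treat-all reference and zero (the treat-none reference).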

RESULTS
Patient characteristics
A total of 300 NSCLC patients with BM were included in this study, all of whom had complete baseline clinical and laboratory data. The clinicopathological characteristics and laboratory results of these patients are summarised in Table S1. The mean age of the patients was 55.4 years (range 31–83 years). Among the patients, 185 were males and 115 were females. The median follow-up duration was 13.9 months (range 0.1–173.83 months). The last follow-up was performed on 16 June 2021. The OS rates for these patients at 1, 3, and 5 years were 75%, 49%, and 40.3%, respectively.
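OS rates at fixed time points such as these are conventionally read from a Kaplan-Meier curve, which accounts for patients censored at their last follow-up. The following is a minimal sketch of that estimator with made-up follow-up data; the function names are hypothetical and this is not the code used in the study.

```python
def kaplan_meier(times, events):
    """Return [(time, survival)] at each distinct death time.

    times: follow-up in months; events: 1 = death, 0 = censored.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving  # deaths and censored patients both leave the risk set
    return curve


def survival_at(curve, t):
    """Step-function lookup of the survival probability at time t."""
    s = 1.0
    for time, surv in curve:
        if time > t:
            break
        s = surv
    return s


# Made-up follow-up data (months), for illustration only.
times = [6, 10, 12, 20, 30, 40]
events = [1, 0, 1, 1, 0, 1]
curve = kaplan_meier(times, events)
os_12_months = survival_at(curve, 12)
```

Censored patients contribute to the risk set up to their last follow-up but never trigger a drop in the curve, which is what distinguishes this estimate from a simple proportion of survivors.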

Construction of the prognostic models
First, LASSO-Cox regression analysis was used to identify the optimal predictors and construct the model. Cross-validation was used to select lambda.min (λ = 0.054), the λ-value with the lowest cross-validated error (Fig. 1). The predictors with non-zero coefficients at this λ-value were retained as prognostic factors for OS. The final model comprised 15 predictors, namely age, KPS, NSE, PLR, lymphocyte count, ALP, smoking history, intrathoracic metastasis, metastases to other sites, N_stage, M_stage, surgery for primary lung cancer foci, chemotherapy, EGFR mutation, and TKI treatment. The EPV was 12.4. The regression coefficients of these predictors were used to construct the prognostic model and to calculate its risk score. In the risk score formula, continuous variables were entered at their original numerical values, while categorical variables were represented by the codes presented in Table S1.
We conducted Cox proportional hazards regression analysis using the 15 predictors identified through LASSO regression (Fig. 2). The time-dependent ROC curves suggested AUC values of 0.746 (0.678–0.814), 0.819 (0.761–0.877), and 0.865 (0.774–0.957) at 1, 3, and 5 years, respectively.

Evaluation of the performance between our novel prognostic model, RPA, GPA, Lung-molGPA, BSBM, and TNM staging
The time-dependent C-index was used to evaluate the accuracy of our model's predictions. Our model demonstrated superior discriminatory ability compared with RPA, GPA, Lung-molGPA, BSBM, and TNM staging, and these results were consistent with our bootstrap validation (Fig. 4). Furthermore, DCA was employed to evaluate the clinical applicability of our predictive model. Notably, the applicable threshold range varied among the six DCA curves. Among the six curves, our model had the widest threshold range, approximately 0.4–0.8. Moreover, within most of the threshold ranges, our model yielded the highest net benefit (Fig. 5). These findings indicate that our model offers the greatest clinical utility among the six.
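The idea behind a concordance index can be shown with a minimal sketch of Harrell's C for right-censored data. This is a simplification: the analysis above used a time-dependent C-index, which evaluates concordance at successive time points, and the function name and data below are hypothetical.

```python
def harrell_c_index(times, events, risk_scores):
    """Fraction of usable patient pairs in which the patient who died earlier
    has the higher risk score (tied scores count as 0.5)."""
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is usable when patient i has an observed death
            # strictly before patient j's follow-up time.
            if events[i] == 1 and times[i] < times[j]:
                usable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if usable == 0:
        raise ValueError("no comparable pairs")
    return concordant / usable


# Made-up follow-up times (months), event indicators, and risk scores.
times = [5, 8, 12, 20, 30]
events = [1, 1, 0, 1, 0]
scores = [2.1, 0.4, 1.5, 0.9, 0.2]
c_index = harrell_c_index(times, events, scores)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk, so comparing two scoring systems on the same patients amounts to comparing how often each ranks such pairs correctly.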
Lastly, the IDI and NRI metrics were incorporated to assess the performance of our model. The IDI analysis allowed us to measure the extent to which our new model improved its predictive power compared with the existing model, and the NRI analysis was

Constructing a nomogram for predicting OS
A nomogram was constructed to visualise our model. This provided a convenient, personalised tool to predict the 1-, 3-, and 5-year OS in NSCLC patients with BM. Each predictor is associated with a score, and the scores of all predictors are summed to obtain a total score, from which the OS at 1, 3, and 5 years can be read (Fig. 6).
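The point scale of a nomogram is commonly a rescaling of each predictor's contribution to the linear predictor, so that the most influential predictor spans 0 to 100 points. The sketch below illustrates that rescaling under stated assumptions: the coefficients, predictor ranges, and patient values are hypothetical and are not the fitted values of the model described here.

```python
def nomogram_points(coefs, ranges, patient):
    """Map each predictor value to points, scaled so that the predictor with
    the largest possible contribution spans the full 0-100 range."""
    # Largest change in the linear predictor that each predictor can cause.
    spans = {k: abs(coefs[k]) * (ranges[k][1] - ranges[k][0]) for k in coefs}
    max_span = max(spans.values())
    points = {}
    for k, beta in coefs.items():
        low, high = ranges[k]
        # Measure from the end of the range with the lowest risk contribution.
        reference = low if beta >= 0 else high
        contribution = beta * (patient[k] - reference)
        points[k] = 100 * contribution / max_span
    return points


# Hypothetical log-hazard coefficients and observed ranges (not from the study).
coefs = {"age": 0.02, "KPS": -0.03, "chemotherapy": -0.5}
ranges = {"age": (30, 80), "KPS": (40, 100), "chemotherapy": (0, 1)}
patient = {"age": 55, "KPS": 70, "chemotherapy": 1}

points = nomogram_points(coefs, ranges, patient)
total_points = sum(points.values())
```

In this convention higher total points mean higher predicted risk; a complete nomogram adds a final axis that converts the total into 1-, 3-, and 5-year survival probabilities using the baseline survival of the fitted Cox model.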

Risk stratification based on our model
The risk score of each patient was calculated, following which the patients were classified into three groups, namely the low-risk (n = 75), intermediate-risk (n = 150), and high-risk (n = 75) groups, based on the tertiles of their risk score, to assess whether our model could accurately stratify patient risk. The OS was significantly lower (P < 0.001) in the high-risk group.
The Lung-molGPA and BSBM models, along with our model, help distinguish the OS of patients. However, risk groupings derived from the GPA and RPA models do not fully differentiate OS among patients. Specifically, according to the GPA model, there was no statistically significant difference in OS between patients in the "GPA 1.5–2.5" group and the "GPA 3" group (P = 0.275). Similarly, based on the RPA model, there was no statistically significant difference in OS between patients in the "class II" group and the "class III" group (P = 0.122).
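Assigning patients to risk groups from a continuous risk score amounts to cutting the score at chosen quantiles. In the sketch below the cut points are parameters; the defaults (25th and 75th percentiles) are an assumption, chosen because they reproduce group sizes of 75, 150, and 75 for 300 patients, and equal-thirds cut points can be passed instead. Function names and scores are hypothetical.

```python
def quantile(sorted_values, q):
    """Linear-interpolation quantile of an already sorted list."""
    pos = q * (len(sorted_values) - 1)
    lower = int(pos)
    upper = min(lower + 1, len(sorted_values) - 1)
    frac = pos - lower
    return sorted_values[lower] * (1 - frac) + sorted_values[upper] * frac


def assign_risk_groups(scores, low_q=0.25, high_q=0.75):
    """Label each score as low, intermediate, or high risk.

    low_q and high_q are assumed cut points, not values stated in the study.
    """
    ordered = sorted(scores)
    low_cut = quantile(ordered, low_q)
    high_cut = quantile(ordered, high_q)
    groups = []
    for s in scores:
        if s <= low_cut:
            groups.append("low")
        elif s <= high_cut:
            groups.append("intermediate")
        else:
            groups.append("high")
    return groups


# Made-up risk scores for 300 patients, for illustration only.
scores = list(range(300))
groups = assign_risk_groups(scores)
group_sizes = {g: groups.count(g) for g in ("low", "intermediate", "high")}
```

The resulting labels can then be used as the grouping variable for Kaplan-Meier curves and a log-rank comparison of OS between groups.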
While the dichotomous risk grouping based on TNM staging can differentiate patients' OS, it lacks the level of detail provided by our model, which comprises three subgroups. These results demonstrate the superior performance of our prognostic model in distinguishing the prognosis of NSCLC patients with BM (Figs. 7, 8).
The final model comprised 15 predictors: age, KPS, NSE, PLR, lymphocyte count, ALP, smoking history, intrathoracic metastasis, metastases to other sites, N_stage, M_stage, surgery for primary lung cancer foci, chemotherapy, EGFR mutation, and TKI treatment. The prediction model exhibited good discriminative ability, calibration, and clinical utility. In addition, it outperformed conventional prognostic tools, namely GPA, RPA, Lung-molGPA, BSBM, and TNM staging. Furthermore, the patients were categorised based on their risk scores into low-, intermediate-, and high-risk groups.
(Rodrigus, de Brouwer & Raaymakers, 2001; Sanchez de Cos et al., 2009; Fuchs et al., 2021; Junger et al., 2021; Yu et al., 2021; Jacot et al., 2001; Cho et al., 2021). In our model, five individual prognostic factors, namely NSE, PLR, ALP, intrathoracic metastasis, and targeted therapy, were identified. Notably, these factors were not considered in previously published clinical prediction models for NSCLC with BM. For instance, Jacot et al. (2001) demonstrated that high serum NSE levels were associated with a worse prognosis in NSCLC patients with BM, suggesting a correlation between elevated NSE and the extent of tumour-induced damage to normal brain tissue. PLR serves as an index of inflammation, and a study by Cho et al. (2021) reported that an increase of 10 in PLR was associated with a 1.3% increase in the risk of death in NSCLC patients with BM. These findings might be attributed to the association between inflammation and cancer progression, wherein elevated platelet levels result in the production of inflammatory cytokines and chemokines, thereby facilitating tumour progression (Lim et al., 2019).
Additionally, lymphocytes play a crucial role in antitumour immunity, and a decrease in lymphocyte count indicates an impaired cell-mediated immune response and compromised antitumour immunity (Jiang et al., 2019). Jacot et al. (2001) observed that NSCLC patients with BM and elevated ALP levels had shorter survival. Moreover, the use of TKIs targeting driver mutations in NSCLC, such as EGFR-TKIs and anaplastic lymphoma receptor tyrosine kinase gene (ALK)-TKIs, has significantly improved the prognosis of NSCLC patients with BM who possess the corresponding gene mutations (Rotow & Bivona, 2017; Planchard et al., 2018). Based on the above evidence, the predictors incorporated into our model are valid and plausible. However, the prognostic role of intrathoracic metastasis remains controversial. While the study by Hirashima et al. (2014) found intrathoracic metastasis to be a significant favourable prognostic factor for NSCLC patients with distant metastases, our study arrived at the opposite conclusion, suggesting that intrathoracic metastasis is an unfavourable prognostic factor in NSCLC patients with BM. Further research is warranted to validate these findings and resolve this discrepancy.

DISCUSSION
Few studies have developed predictive models for assessing the prognostic risk of NSCLC patients with BM, and these studies have certain limitations. For instance, Wang et al. (2021) constructed clinical prediction models using univariate analysis to screen predictors (Zhang et al., 2020; Wang et al., 2021; Huang et al., 2020). However, according to the Prediction model Risk Of Bias ASsessment Tool (PROBAST) guidelines, bias can arise when univariate analysis leads to the exclusion of certain variables from the model, as some predictors only demonstrate significance when adjusted for other factors simultaneously during analysis (Moons et al., 2019). In contrast, our model is based on LASSO regression, which effectively filters predictors. This method is a robust shrinkage algorithm that selects relevant and interpretable predictors from a large pool of variables while accounting for possible multicollinearity. Moreover, it helps avoid model overfitting (Balachandran et al., 2015; McEligot et al., 2020). Another study by Zhang et al. (2020) constructed a predictive model based on a predictor known as the computed tomography radiomics score (Rad-score). However, the clinical applicability of their model is limited since the Rad-score is not readily available during hospitalisation, as it is not a routine examination item (Zhang et al., 2020). In contrast, our model incorporates readily available predictors, enhancing its clinical applicability. Similarly, Li et al. (2022) established a nomogram combining patient clinicopathological factors and serological inflammatory markers (PLR, NLR, SII, PNI, and ALI) to predict survival in NSCLC patients with BM. Our study not only includes these indices but also incorporates a range of lung cancer-related tumour markers, such as CEA, NSE, CYFRA21, and SCCA, which have a potential impact on the prognosis of NSCLC patients with BM.
In summary, our LASSO-Cox regression analysis revealed that our prediction model provides a good fit for predicting OS in NSCLC patients with BM. The calibration plots demonstrated good calibration, while the time-dependent C-index analysis confirmed the model's strong prognostic accuracy compared with RPA, GPA, Lung-molGPA, BSBM, and TNM staging. Additionally, DCA revealed that our model yields the highest overall net benefit. Moreover, the IDI and NRI results showed significant improvements in predictive power and reclassification ratio when compared with RPA, GPA, Lung-molGPA, BSBM, and TNM staging. Notably, our model effectively stratified NSCLC patients with BM into low-, intermediate-, and high-risk subgroups, with the high-risk group exhibiting the worst survival outcomes. In conclusion, our clinical prediction model offers numerous advantages, including cost-effectiveness, broad applicability, simplicity of use, accessibility, and high accuracy. It holds great potential for predicting prognosis and aiding treatment decisions in NSCLC patients with BM.
Our study has certain limitations. First, the retrospective nature of our study introduces the possibility of selection bias, information bias, and confounding bias. Second, the data were obtained from a single hospital, and the sample size was relatively small. Therefore, future studies incorporating larger, multicentre cohorts are warranted to validate our findings. Furthermore, the inclusion of 15 predictive factors in our model might limit its clinical applicability due to the complexity associated with a large number of predictors. Finally, while our model incorporates easily accessible predictors, it is important to recognise that the specificity of the prognostic model could be improved by including NSCLC-related immunohistochemical markers or other relevant genetic mutations, such as programmed death-ligand 1, cytotoxic T-lymphocyte-associated antigen 4, ALK rearrangement, and ROS1 rearrangement (Ahmadzada et al., 2018).

CONCLUSIONS
The clinical prediction model we constructed holds the potential for predicting OS in NSCLC patients with BM, outperforming established models such as RPA, GPA, Lung-molGPA, BSBM, and TNM staging.