Survival prediction model for right-censored data based on improved composite quantile regression neural network

With the development of the field of survival analysis, statistical inference for right-censored data is of great importance for the study of medical diagnosis. In this study, a survival prediction model for right-censored data based on an improved composite quantile regression neural network framework, called rcICQRNN, is proposed. It incorporates composite quantile regression into the loss function of a multi-hidden-layer feedforward neural network and combines it with an inverse probability weighting method for survival prediction. The hyperparameters of the neural network are tuned using the whale optimization algorithm (WOA), integer encoding and One-Hot encoding are used to encode the categorical features, and a binary WOA (BWOA) variable selection method for high-dimensional data is proposed. The rcICQRNN algorithm was tested on a simulated dataset and two real breast cancer datasets, and the performance of the model was evaluated by three evaluation metrics. The results show that the rcICQRNN-5 model is more suitable for analyzing the simulated dataset. The WOA-rcICQRNN-30 model with One-Hot encoding is more applicable to the NKI70 data. For the METABRIC dataset, the model results are optimal for $k = 15$ after feature selection. Finally, the method was applied to cross-dataset validation. On the whole, the C-index results using One-Hot encoded data are more stable, making the proposed rcICQRNN prediction model flexible enough to assist in medical decision making. It has practical applications in areas such as biomedicine, actuarial science, and financial economics.


Introduction
Prognostic modeling in medical survival analysis is used to analyze and understand disease processes, to predict the outcomes of new patients from available data, and to explore the interactions between disease factors. Statistical methods and machine learning-based methods are two common approaches to constructing prognostic models [1]. Both can predict survival time and estimate the probability of survival. However, the former focuses on characterizing the statistical properties of time-to-event distributions and parameter estimates by estimating survival curves, while the latter mainly predicts the occurrence of events at specific times by combining traditional survival analysis methods with various machine learning techniques. The Kaplan-Meier estimator is the nonparametric maximum likelihood estimator of the survival function under incomplete data, which provides a solid theoretical foundation for dealing with right-censored data [2]. Sparse estimation of censored median regression models is an important issue in the analysis of high-dimensional survival data. A sparse and robust median estimate can be obtained by minimizing the least absolute deviation loss weighted by the inverse censoring probability under an adaptive lasso penalty [3]. Giussani and Bonetti explored multivariate survival techniques for the analysis of bivariate right-censored data and investigated a new parametric bivariate frailty model [4]. To account for the correlation between two survival times, the Marshall-Olkin bivariate exponential distribution (MOBVE) was used to model their joint distribution. Yu investigated maximum likelihood estimation and goodness-of-fit diagnostics associated with uniform distributions and applied them to a cancer research dataset [5].
When investigating regression models, different loss functions are usually considered. The most commonly used one is the squared loss, i.e., least squares estimation. However, it is sensitive to heavy-tailed distributions and outliers and is therefore not robust. The quantile regression model [6] can not only handle heteroskedasticity and outliers, but also allows confidence intervals for parameters to be constructed by empirical likelihood inference without estimating the asymptotic variance. Zou and Yuan proposed the first composite quantile regression method, which considers the loss at multiple quantile levels simultaneously and greatly strengthens the theoretical properties and stability of the estimation results [7]. In recent years, much work has built on the composite quantile regression method. Shim et al. proposed a new nonparametric regression method, composite support vector quantile regression (CSVQR) [8]. Bang et al. proposed weighted composite quantile regression (WCQR) [9], and in the same year Bang proposed composite kernel quantile regression (CKQR) [10]. Xu et al. proposed a composite quantile regression neural network model (CQRNN) [11]. In 2021, a weighted composite quantile regression method for linear models [12] was applied to right-censored data with censoring indicators missing at random, and an adaptive penalty procedure was used to address variable selection in the model.
Neural networks handle censored data in survival analysis well and can identify extremely complex interactions in the data. Therefore, they can be used to improve traditional survival analysis techniques [13]. Katzman et al. introduced DeepSurv, a state-of-the-art Cox proportional hazards deep neural network survival method for modeling the interaction between patient covariates and treatment effects in order to provide personalized treatment recommendations [14]. Wang et al. proposed SurvNet, a novel multi-task neural network [15]. Anika and Olivier constructed a multimodal neural network model to develop an unsupervised encoder for personalized treatment of cancer patients [16]. Currently, there are an increasing number of models for cancer survival prediction [17][18][19][20], and developing a model that is both accurate and interpretable remains a challenge.
For datasets with censored observations, the choice of estimation method is essential because the data are incomplete, and quantile-based methods are well suited to this setting. In this paper, we use a composite quantile regression approach to deal with censored data. Moreover, many deep learning algorithms have been adapted to handle censored data and other challenging problems that arise in real data. In this study, a survival prediction model based on an improved composite quantile regression neural network framework is proposed. It extends the deep neural network framework to composite quantile regression, combined with an inverse probability weighting method for survival prediction. The whale optimization algorithm (WOA) is used for hyperparameter tuning, and the binary whale optimization algorithm (BWOA) is applied for variable selection on high-dimensional covariate data. Through extensive tests on simulated datasets and three real datasets (NKI70, METABRIC, and TCC), the results show that the proposed rcICQRNN method has high prediction accuracy for right-censored data.
The purpose of this study is to help healthcare professionals make sound decisions by establishing a more accurate method of predicting the survival of cancer patients. Cancer patients can benefit from accurate prognosis prediction, and physicians can benefit by making timely judgments and suggesting appropriate treatment options. For healthcare professionals, cancer prognosis analysis is an important part of their work, especially when dealing with patients with short survival times, and it needs to be supported by more robust models. Deep learning can greatly facilitate cancer treatment and has the potential to bring prognostic predictions closer to the true outcomes. This paper employs a state-of-the-art neural network architecture optimized for cancer survival analysis and validated on multiple publicly accessible datasets, and also shows that the method can make predictions on datasets from other domains. The application of composite quantile regression captures the overall picture of the survival distribution, enhances the robustness, fit, predictive power, and nonlinear modeling capacity of neural networks, and bridges the gap between composite quantile regression and neural network methods in the field of survival analysis. Compared with state-of-the-art models, the proposed method enhances the predictive power of survival analysis and facilitates the development of deep learning techniques for personalized clinical decision making.

NKI70 dataset
In this paper, the proposed rcICQRNN method and DeepQuantreg [21] are compared on two real breast cancer datasets. The first is the NKI70 dataset from the R package penalized. This dataset is described in detail at https://rdrr.io/cran/penalized/man/nki70.html. The NKI70 dataset contains 144 patients with lymph node positive breast cancer. It records metastasis-free survival, 5 clinical risk factors, and gene expression measurements for 70 genes that an earlier study found to be predictive of metastasis-free survival. Among these patients, 96 were censored, giving a censoring ratio of approximately 67%. Four variables in the dataset are categorical: tumor diameter (two levels), number of affected lymph nodes (two levels), estrogen receptor status (two levels), and tumor grade (three ordered levels). We used integer encoding and One-Hot encoding to encode the categorical features; after One-Hot encoding, the number of covariates was 80. We subsequently compared the performance of integer encoding and One-Hot encoding in rcICQRNN for survival prediction on the breast cancer data.
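As a rough illustration of the two encodings compared here, the sketch below encodes one ordered three-level feature both ways with NumPy. The level names, the sample values, and the variable names are hypothetical and are not taken from the NKI70 data.

```python
import numpy as np

# Hypothetical three-level tumor grade for six patients (illustrative only).
grade = np.array(["Poorly diff", "Well diff", "Intermediate",
                  "Well diff", "Poorly diff", "Intermediate"])

# Integer encoding: each level becomes one integer code, in a fixed order.
levels = np.array(["Well diff", "Intermediate", "Poorly diff"])
integer_codes = np.array([int(np.where(levels == g)[0][0]) for g in grade])

# One-Hot encoding: one 0/1 indicator column per level.
one_hot = np.eye(len(levels), dtype=int)[integer_codes]
```

Integer encoding keeps one column per feature but imposes a numeric spacing between levels, whereas One-Hot encoding adds one column per level and makes no such assumption, which is why the covariate count grows after One-Hot encoding.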

METABRIC dataset
Accurate estimation of prognosis and survival is the most important part of the clinical decision-making process for cancer patients. Breast cancer patients with similar disease stage and similar clinical features can have different treatment responses and overall survival because cancer is associated with genetic abnormalities. The Molecular Taxonomy of Breast Cancer International Consortium (METABRIC) database is a Canada-UK project containing targeted sequencing data from 1980 primary breast cancer samples. Clinical and genomic data can be downloaded from cBioPortal. For this study, the METABRIC dataset was obtained from the Kaggle website at https://www.kaggle.com/raghadalharbi/breast-cancer-gene-expression-profiles-metabric/download. The METABRIC dataset includes 31 clinical attributes, mRNA level z-scores for 331 genes, and 175 gene mutations for 1904 breast cancer patients. After processing missing values, the data included 1902 patients with 518 features. Again, integer encoding and One-Hot encoding were applied to the 18 categorical features. Because of the large number of variables in the dataset, direct survival prediction gives poor results. Therefore, a variable selection method, called BWOA, is proposed in this paper. The number of variables after integer encoding was 433 and the number of variables after One-Hot encoding was 241, both with a censoring ratio of 66%.

Telco Customer Churn dataset (TCC)
To validate the performance of the proposed method on a dataset from a different domain, this paper analyzes the characteristics that influence telecom customer churn and predicts the probability of customer churn. The data were obtained from 7034 telecommunication customers. The dataset consists of 50% female customers and 50% male customers, has 18 characteristics, and has a censoring ratio of 73%. It is available at https://www.kaggle.com/datasets/blastchar/telco-customer-churn. 'gender', 'Partner', and 15 other non-numeric features in the dataset were converted to numeric form.
The selected features corresponding to each dataset are given in Table 1.

rcICQRNN method
Taylor and Cannon considered a quantile regression neural network (QRNN) method by combining the excellent properties of neural networks and quantile regression (QR) [22,23]. In this section, the censored composite quantile regression neural network (rcICQRNN) model is proposed by adding a deep neural network and an inverse probability weighting structure to the composite quantile regression method. For a random variable $Y$ with distribution function $F$, the $\tau$-th quantile of $Y$ is defined as:

$$Q_Y(\tau) = \inf\{y : F(y) \ge \tau\}, \quad 0 < \tau < 1.$$

$Q_Y(\tau)$ is a function of $\tau$ and completely describes the distribution of the random variable $Y$. Given the covariates $X = (x_1, \cdots, x_p)^T$, the conditional quantile of $Y$ can be defined as:

$$Q_Y(\tau \mid X) = \inf\{y : F(y \mid X) \ge \tau\}.$$

For a positive response such as a failure time $T$, the log-linear quantile regression model is defined as follows:

$$Q_{\log T}(\tau \mid X) = X^T \beta(\tau),$$

where $\beta(\tau) = (\beta_1(\tau), \cdots, \beta_p(\tau))^T$ is the regression coefficient. Quantile regression is based on the check function $\rho_\tau(u) = u\,(\tau - I(u < 0))$. Since the original check function is not differentiable at the origin, it is not differentiable everywhere. Therefore, the Huber function [24] is used to smooth the check function in this paper:

$$h(u) = \begin{cases} \dfrac{u^2}{2\varepsilon}, & 0 \le |u| \le \varepsilon, \\ |u| - \dfrac{\varepsilon}{2}, & |u| > \varepsilon, \end{cases} \qquad \rho_\tau^{(a)}(u) = \left|\tau - I(u < 0)\right| h(u),$$

where $I(\cdot)$ is the indicator function. In traditional quantile regression, the model is trained separately for each quantile, and the training objective is to minimize the average loss for the quantile $\tau$:

$$\min_{\beta} \frac{1}{n} \sum_{i=1}^{n} \rho_\tau\left(y_i - X_i^T \beta(\tau)\right).$$

Composite quantile regression extends this to several levels of $\tau$ by considering multiple quantile regression models simultaneously. In general, take equidistant levels $\tau_q = q/(k+1)$, $q = 1, 2, \cdots, k$, and define the following objective function:

$$\min_{b_1, \cdots, b_k, \beta} \sum_{q=1}^{k} \sum_{i=1}^{n} \rho_{\tau_q}\left(y_i - b_q - X_i^T \beta\right).$$

Suppose a deep feedforward network with two hidden layers is considered. The overall structure is as follows. The input layer consists of the input variables $x_j$, $j = 1, \cdots, p$. In the middle are two hidden layers, consisting of $M_1$ and $M_2$ neural units, respectively. The final layer is the output layer. $(W^{(l)}, b^{(l)})$, $l = 1, 2$, are the weights and biases of the hidden layers, and $(w^{(o)}, b^{(o)})$ are the weight and bias of the output layer. Then, the output of the first hidden layer is:

$$h^{(1)} = g_1\left(W^{(1)} X + b^{(1)}\right).$$

The output of the second hidden layer is:

$$h^{(2)} = g_2\left(W^{(2)} h^{(1)} + b^{(2)}\right),$$

where $g_1$ and $g_2$ are the activation functions of the hidden layers. Common activation functions include sigmoid, relu, etc. [21]. Finally, the output layer of the network is a single node with a linear activation function. The estimate of the conditional quantile of the target is given by Eq (9).

$$f(X; \theta) = w^{(o)T} h^{(2)} + b^{(o)}. \tag{9}$$

Therefore, the objective of CQRNN is to minimize the empirical loss function in Eq (10),

$$E(\theta) = \frac{1}{kn} \sum_{q=1}^{k} \sum_{i=1}^{n} \rho_{\tau_q}^{(a)}\left(y_i - f(X_i; \theta)\right), \tag{10}$$

where $\theta$ includes all coefficients, i.e., the weights and biases to be trained and estimated. Now consider right-censored data, where the main interest is the time of death of the target subject or of another event of interest that occurs over a period of time. The Cox proportional hazards model is a widely used statistical method for modeling hazard rates. To overcome the complexity of interpreting hazard ratios, the composite quantile regression neural network is extended to censored data, for which the interpretation is more intuitive. First, $T$ is defined as the failure time, $C$ is the maximum follow-up study time, and $X$ is the $p$-dimensional predictor that may affect the failure time. The observed data are $(Y_i, \delta_i, X_i)$, $i = 1, \cdots, n$, with $Y_i = \min(T_i, C_i)$ and $\delta_i = I(T_i \le C_i)$.

Combined with the characteristics of right-censored data and assuming conditional independence between $T$ and $C$ given $X$, define a weight function:

$$w_i = \frac{\delta_i}{\hat{G}(Y_i)}, \tag{12}$$

where $\hat{G}(\cdot)$ is the Kaplan-Meier estimate of the censoring distribution based on the observed data $(Y_i, 1 - \delta_i)$. This weight function can be shown to be equivalent to a jump in the Kaplan-Meier estimator of the distribution function of the log failure time [21,25]. Similar to ordinary censored quantile regression, the loss function of rcICQRNN in this study is shown in Eq (13).

$$L(\theta) = \frac{1}{kn} \sum_{q=1}^{k} \sum_{i=1}^{n} w_i\, \rho_{\tau_q}^{(a)}\left(\log Y_i - f(X_i; \theta)\right). \tag{13}$$
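The inverse censoring probability weights and the weighted composite loss can be sketched in NumPy as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the Kaplan-Meier estimate of the censoring distribution is evaluated just before each observed time, the quantile levels are taken as q/(k+1), and the Huber smoothing constant `eps` is an arbitrary choice.

```python
import numpy as np

def censoring_survival_left(y, delta):
    """Kaplan-Meier estimate of the censoring survival function just before y_i.

    Censoring is treated as the event, so the indicator used is 1 - delta.
    Evaluating at the left limit is an assumption made in this sketch.
    """
    y = np.asarray(y, dtype=float)
    censored = 1 - np.asarray(delta, dtype=int)
    surv = 1.0
    g_left = {}
    for t in np.unique(y):                          # sorted distinct times
        g_left[t] = surv                            # value just before t
        at_risk = np.sum(y >= t)
        n_cens = np.sum((y == t) & (censored == 1))
        surv *= 1.0 - n_cens / at_risk
    return np.array([g_left[t] for t in y])

def ipc_weights(y, delta):
    """Weights delta_i / G(y_i): censored observations get weight zero."""
    return np.asarray(delta, dtype=float) / censoring_survival_left(y, delta)

def huber(u, eps=1e-3):
    """Huber function used to smooth the check function near the origin."""
    a = np.abs(u)
    return np.where(a <= eps, u ** 2 / (2 * eps), a - eps / 2)

def smooth_check(u, tau, eps=1e-3):
    """Smoothed check loss: tau * h(u) for u >= 0, (1 - tau) * h(u) otherwise."""
    return np.where(u >= 0, tau, 1 - tau) * huber(u, eps)

def weighted_composite_quantile_loss(log_y, pred, w, k=5, eps=1e-3):
    """Average of the weighted smoothed check loss over k quantile levels."""
    taus = np.arange(1, k + 1) / (k + 1)            # assumed equidistant levels
    resid = np.asarray(log_y, dtype=float) - np.asarray(pred, dtype=float)
    return float(np.mean([np.mean(w * smooth_check(resid, t, eps)) for t in taus]))

# Tiny usage example with made-up data: three events and two censored cases.
y = np.array([2.0, 3.0, 5.0, 7.0, 8.0])
delta = np.array([1, 0, 1, 0, 1])
w = ipc_weights(y, delta)
```

Uncensored observations that occur after many censored ones receive larger weights, which compensates for the information lost to censoring.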
In summary, the inverse censoring probability weighting method is combined with the composite quantile regression neural network method to account for the characteristics of right-censored data, and the neural networks used include single-hidden-layer feedforward networks and deep feedforward networks. The hyperparameters of the neural network are tuned with the whale optimization algorithm described in the next section. In addition, to prevent overfitting, a dropout layer is added to the neural network.
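To make the network structure concrete, the following NumPy sketch runs a forward pass through a feedforward network with two sigmoid hidden layers, dropout, and a single linear output node. It is only an illustration: the parameter names, the initialization scale, and the placement of dropout after each hidden layer are assumptions, and training is omitted.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def init_params(p, m1, m2, rng):
    """Small random weights for a p -> m1 -> m2 -> 1 network (assumed scale)."""
    return {
        "W1": rng.normal(0.0, 0.1, (p, m1)), "b1": np.zeros(m1),
        "W2": rng.normal(0.0, 0.1, (m1, m2)), "b2": np.zeros(m2),
        "w_out": rng.normal(0.0, 0.1, (m2, 1)), "b_out": np.zeros(1),
    }

def forward(X, params, dropout=0.0, rng=None):
    """Return one prediction per row of X; dropout is applied only if rng is given."""
    h1 = sigmoid(X @ params["W1"] + params["b1"])
    if dropout > 0 and rng is not None:
        # Inverted dropout: zero some units and rescale the rest.
        h1 = h1 * (rng.random(h1.shape) >= dropout) / (1.0 - dropout)
    h2 = sigmoid(h1 @ params["W2"] + params["b2"])
    if dropout > 0 and rng is not None:
        h2 = h2 * (rng.random(h2.shape) >= dropout) / (1.0 - dropout)
    # Single output node with linear activation.
    return (h2 @ params["w_out"] + params["b_out"]).ravel()

rng = np.random.default_rng(0)
params = init_params(p=10, m1=50, m2=50, rng=rng)
X_demo = rng.uniform(size=(8, 10))
pred_eval = forward(X_demo, params)                       # no dropout at evaluation
pred_train = forward(X_demo, params, dropout=0.2, rng=rng)
```

At evaluation time dropout is switched off, so the same input always gives the same prediction.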

Whale optimization algorithm (WOA)
The whale optimization algorithm is a swarm intelligence optimization method proposed by Mirjalili and Lewis at Griffith University, Australia, in 2016 [26]. The idea is derived from the distinctive hunting behavior of humpback whales, and the algorithm searches for an optimum by imitating how the whales encircle their prey and attack it with a bubble net. The algorithm consists of three search mechanisms: the shrinking encircling mechanism and the spiral mechanism implement local search, and a random search strategy implements global search. Its most important feature is that it simulates the hunting behavior of humpback whales by moving toward either a random individual or the best individual, and simulates the bubble-net attack with a spiral.
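The three mechanisms can be sketched as a small continuous optimizer. The version below follows the commonly published WOA update rules with a scalar coefficient per whale and a spiral constant of 1. The function name, population size, iteration count, and the toy objective are assumptions for illustration and are not the settings used in this paper.

```python
import numpy as np

def woa_minimize(objective, lb, ub, n_whales=20, n_iter=100, seed=0):
    """Minimal whale optimization sketch for box-constrained minimization."""
    rng = np.random.default_rng(seed)
    lb, ub = np.asarray(lb, dtype=float), np.asarray(ub, dtype=float)
    X = lb + (ub - lb) * rng.random((n_whales, lb.size))   # initial population
    fitness = np.array([objective(x) for x in X])
    best, best_fit = X[fitness.argmin()].copy(), fitness.min()
    for t in range(n_iter):
        a = 2.0 * (1.0 - t / n_iter)          # decreases linearly from 2 to 0
        for i in range(n_whales):
            A = 2.0 * a * rng.random() - a
            C = 2.0 * rng.random()
            if rng.random() < 0.5:
                # |A| < 1: encircle the best whale; otherwise explore around a random whale.
                ref = best if abs(A) < 1 else X[rng.integers(n_whales)]
                X[i] = ref - A * np.abs(C * ref - X[i])
            else:
                # Spiral (bubble-net) move around the best whale.
                l = rng.uniform(-1.0, 1.0)
                X[i] = np.abs(best - X[i]) * np.exp(l) * np.cos(2 * np.pi * l) + best
            X[i] = np.clip(X[i], lb, ub)      # keep the whale inside the bounds
            f = objective(X[i])
            if f < best_fit:
                best, best_fit = X[i].copy(), f
    return best, best_fit

# Toy usage: minimize a two-dimensional sphere function.
sphere = lambda x: float(np.sum(x ** 2))
best, best_fit = woa_minimize(sphere, [-5, -5], [5, 5], n_whales=20, n_iter=50, seed=0)
```

For hyperparameter tuning, `objective` would map a candidate setting to a validation loss, which makes each evaluation far more expensive than in this toy example.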
In recent years, machine learning methods have developed rapidly and have advantages over traditional statistical methods in some respects. The integration of the two is increasingly being studied and is a current trend in statistical research. Therefore, this paper integrates the characteristics of both kinds of methods: it combines composite quantile regression with neural networks and uses WOA to optimize the parameters, yielding a new method with higher overall utility.

Binary WOA (BWOA)
In this paper, a feature selection method based on an optimization algorithm is proposed, inspired by the feature selection strategy in [27]. Feature selection is the process of selecting the smallest subset of the most informative features for survival prediction. A binary version of the WOA is used to perform this task by constructing the input feature matrix most relevant to the model.
The original WOA is a continuous algorithm and cannot solve binary problems directly. To address this, we propose a binary version of WOA. In BWOA, the upper bound ($ub$) is set to 1, the lower bound ($lb$) to 0, and the threshold to 0.5. Each component of a whale individual is initialized with a random value in $[lb, ub]$, which is set to 1 if it is greater than 0.5 and to 0 if it is less than or equal to 0.5. In the discrete binary space, a position update means a change between "0" and "1". This study sets the following conversion function to update the whale position in binary space.
Note that once the updated individual position is obtained, it is checked against the upper and lower bounds. If it is less than the lower bound, it is set equal to the lower bound, and if it is greater than the upper bound, it is set equal to the upper bound.
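The initialization and bound handling described above can be sketched as below. The paper's own conversion function is not reproduced here. As a stand-in, this sketch clips an updated position to the bounds and thresholds it at 0.5, which is an assumption, and all function names are hypothetical.

```python
import numpy as np

def init_binary_whales(n_whales, n_features, rng, lb=0.0, ub=1.0, threshold=0.5):
    """Draw uniform values in [lb, ub] and map them to 1 if above the threshold, else 0."""
    x = lb + (ub - lb) * rng.random((n_whales, n_features))
    return (x > threshold).astype(int)

def binarize(position, lb=0.0, ub=1.0, threshold=0.5):
    """Clip an updated continuous position to [lb, ub], then threshold (assumed rule)."""
    clipped = np.clip(position, lb, ub)
    return (clipped > threshold).astype(int)

def select_columns(X, mask):
    """Keep only the feature columns whose mask entry is 1."""
    return X[:, np.asarray(mask, dtype=bool)]

rng = np.random.default_rng(0)
whales = init_binary_whales(n_whales=5, n_features=8, rng=rng)
mask = binarize(np.array([-0.3, 0.5, 0.51, 1.7]))
X_small = np.arange(12).reshape(3, 4)
X_selected = select_columns(X_small, mask)
```

Each binary vector then defines a candidate feature subset, whose fitness would be the prediction error of a model trained on the selected columns.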

Algorithm implementation
The rcICQRNN algorithm requires defining the weight function and the performance metric functions, and then splitting the input data. The main function takes train, test, layer, node, n_epoch, bsize, activation, optimizer, and dropout as parameters. In this study, the results were compared for 1, 2, and 3 hidden layers and for $k$ of 5, 10, 15, 19, 30, and 50. In the base model, node is set to 50, n_epoch is set to 200, bsize is set to 64, activation is set to sigmoid, optimizer is set to Adam, and dropout is set to 0.2. The newly defined function is used as the loss function of the rcICQRNN algorithm to construct the prediction model, and the outputs are the evaluation metrics MMSE, C-index, and QL.
The operating system used in this paper is Windows 10. The software is Python 3.8.8, run in Jupyter Notebook, with scikit-learn 0.24.1. The processor is an Intel(R) Core(TM) i7-10700 CPU @ 2.90 GHz and the RAM is 16.0 GB.
Due to the specific nature of right-censored data, this paper proposes a survival analysis method that extends neural networks to composite quantile regression by integrating the loss functions of neural networks and composite quantile regression. To optimize the hyperparameters of the survival prediction model, the WOA algorithm was used; the model was then fitted to the training set with the best parameters and evaluated on the test set. The prediction results of the proposed model and of the quantile regression neural network method at different quantiles are compared, and the performance of the model is evaluated comprehensively through the evaluation metrics to find the best prediction model. The general framework of this paper is shown in Figure 1.

Performance metrics
When evaluating a prediction model, the main measure is the difference between the predicted value and the observed value. In this paper, three metrics were used to evaluate the survival prediction models: the concordance index (C-index) [28], the modified mean squared error (MMSE) [21], and the quantile loss (QL) [21].
In the field of survival analysis, one of the most commonly used metrics to assess model prediction accuracy is the C-index, which extends the area under the receiver operating characteristic (ROC) curve to censored time-to-event data. The C-index represents the fraction of comparable patient pairs whose predictions agree with their outcomes, i.e., a pair is concordant if the patient who lives longer also has the greater predicted survival time, as shown in Eq (15).

$$C = \frac{\sum_{i,j} I(Y_i > Y_j)\, I(\eta_j > \eta_i)\, \delta_j}{\sum_{i,j} I(Y_i > Y_j)\, \delta_j}, \tag{15}$$

where $i$ and $j$ index the patient pairs in the sample, $Y_i$ is the observed time of patient $i$, $\delta_j$ is the event indicator of patient $j$, and $I(\cdot)$ denotes the indicator function: if $Y_i > Y_j$, then $I(Y_i > Y_j) = 1$, otherwise 0. $\eta_i$ and $\eta_j$ are the predicted risks of patients $i$ and $j$. A patient with greater predicted risk should have a shorter survival time, so a concordant pair indicates agreement between the model's risk prediction and the true survival outcome. The C-index measures the overall performance of the survival prediction model and takes values in the interval [0,1]. The larger the C-index, the higher the prediction accuracy.
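A direct pairwise computation of the C-index can be sketched as follows. The sketch takes predicted survival times rather than risks, so a pair counts as concordant when the longer-lived patient also has the larger prediction. Counting tied predictions as one half and the function name itself are assumptions of this illustration.

```python
import numpy as np

def concordance_index(time, event, pred_time):
    """Fraction of comparable pairs ordered correctly by the predictions.

    A pair (i, j) is comparable when patient j had an observed event
    and patient i was still under observation after that time.
    """
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=int)
    pred = np.asarray(pred_time, dtype=float)
    concordant, comparable = 0.0, 0
    for j in range(len(time)):
        if event[j] != 1:
            continue                      # censored cases cannot anchor a pair
        for i in range(len(time)):
            if time[i] > time[j]:
                comparable += 1
                if pred[i] > pred[j]:
                    concordant += 1.0
                elif pred[i] == pred[j]:
                    concordant += 0.5     # tie handling is an assumption
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Made-up example: four patients, the third one censored.
obs_time = [1.0, 2.0, 3.0, 4.0]
obs_event = [1, 1, 0, 1]
c_perfect = concordance_index(obs_time, obs_event, [1.0, 2.0, 3.0, 4.0])
c_reversed = concordance_index(obs_time, obs_event, [4.0, 3.0, 2.0, 1.0])
```

The double loop is quadratic in the sample size, which is acceptable for datasets of the sizes used here but would need vectorizing for much larger samples.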
The MMSE is defined as the mean of the sum of squares of the residuals between the observed event times and the times predicted by the quantile estimates in the censored condition. Since no true event times are observed for the censored sample, the MMSE is calculated only at the observed event times. The MMSE can be defined as Eq (16).
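The verbal definition above can be sketched as a mean squared residual restricted to the uncensored observations. Whether the times are compared on the original or the logarithmic scale is not stated in this passage, so the sketch leaves the scale to the caller; the function name is hypothetical.

```python
import numpy as np

def mmse(obs_time, event, pred_time):
    """Mean squared residual over observations with an observed event only."""
    obs = np.asarray(obs_time, dtype=float)
    pred = np.asarray(pred_time, dtype=float)
    observed = np.asarray(event, dtype=int) == 1   # censored cases are skipped
    return float(np.mean((obs[observed] - pred[observed]) ** 2))

# Made-up example: the censored second case does not affect the result.
mmse_demo = mmse([1.0, 2.0, 3.0], [1, 0, 1], [1.5, 10.0, 2.0])
```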
QL is the quantile loss, another effective way to assess the difference between the predicted and true quantiles. QL is defined as Eq (17).
where is a constant, slightly smaller than the true follow-up time.

Results and discussions
In this study, the above method was applied to multiple datasets to predict survival time. To guard against overfitting, each dataset was divided into two parts: 75% for model training and 25% for evaluating the prediction accuracy of rcICQRNN-k, Integer-rcICQRNN-k, One-Hot-rcICQRNN-k, GS-rcICQRNN-k, WOA-rcICQRNN-k, QRNN, and the Cox proportional hazards model.

Predictive performance on simulated datasets
In this study, numerical simulation experiments are used to evaluate the performance of the proposed estimation method on a simulated dataset. The proposed inverse probability weighted composite quantile regression neural network model is then applied to the real data. An observational dataset with 10 covariates was generated, with 4000 training samples and 1000 test samples. Each covariate was drawn from a uniform distribution, and patients' survival times were generated by an exponential Cox model [29], where the log-risk function was set to a Gaussian function [14]. Table 2 shows the prediction results of 21 survival prediction models, each run once, using different numbers of hidden layers. The best results for C-index, MMSE, and QL are shown in bold. rcICQRNN-5 denotes the composite quantile regression neural network with $k = 5$; the other models are named in the same way for different values of $k$, and the same naming is used for the real datasets in the next section. Scatter plots of actual versus predicted survival times are in the supplemental material. Among all models with a single hidden layer, the QRNN model has a slightly better C-index and QL, 0.6489 and 0.2871, respectively; the C-index of rcICQRNN-30 is only 0.0004 lower. The MMSE, however, is optimal for the composite quantile regression neural network model with $k = 30$. Among the models combining composite quantile regression with feedforward neural networks with two hidden layers, both rcICQRNN-5 and rcICQRNN-50 achieve the best C-index, 0.0003 higher than the next best model, rcICQRNN-10, and 0.0051 higher than the single-hidden-layer model. The MMSE and QL of the rcICQRNN-10 model are 2.0158 and 0.2774, respectively, which are better than those of the other models. All models with three hidden layers perform worse than the models with two hidden layers. In general, the rcICQRNN-5 model with two hidden layers is more suitable for analyzing the simulated dataset.
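A data-generating process of this kind can be sketched as below. Only the number of covariates, the uniform covariates, the exponential Cox model, the Gaussian log-risk, and the 4000/1000 split come from the text. The covariate range, the Gaussian constants `lam_max` and `r`, the use of only the first two covariates in the log-risk, the baseline hazard, and the administrative censoring rule are assumptions, loosely modeled on common practice for such simulations.

```python
import numpy as np

def simulate_gaussian_cox(n, p=10, lam_max=5.0, r=0.5, baseline=1.0,
                          censor_quantile=0.9, seed=0):
    """Simulate right-censored survival data with a Gaussian log-risk (sketch)."""
    rng = np.random.default_rng(seed)
    X = rng.uniform(-1.0, 1.0, size=(n, p))
    # Gaussian log-risk driven by the first two covariates (assumed form).
    log_risk = np.log(lam_max) * np.exp(-(X[:, 0] ** 2 + X[:, 1] ** 2) / (2 * r ** 2))
    # Exponential survival times with hazard baseline * exp(log_risk).
    T = rng.exponential(scale=1.0 / (baseline * np.exp(log_risk)))
    # Administrative censoring at a fixed time (assumed rule).
    c = np.quantile(T, censor_quantile)
    Y = np.minimum(T, c)
    delta = (T <= c).astype(int)
    return X, Y, delta

X_sim, Y_sim, delta_sim = simulate_gaussian_cox(5000, seed=1)
X_train, Y_train, d_train = X_sim[:4000], Y_sim[:4000], delta_sim[:4000]
X_test, Y_test, d_test = X_sim[4000:], Y_sim[4000:], delta_sim[4000:]
```

Changing `censor_quantile` controls the censoring ratio, which makes it easy to check how sensitive a model is to heavier censoring.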

Predictive performance on real datasets
Breast cancer is the deadliest type of cancer in women worldwide, and there is evidence that 50-60% of breast cancer cases are detected at a late stage. The survival rate of breast cancer patients is therefore low, and early diagnosis and care would improve it [30]. In the present study, two real breast cancer datasets were selected and the proposed model was applied to the prediction of survival time. We ran each method on each dataset 50 times and took the average results. In this section, the grid search method (GS) is the one used by Jia et al. [21], where the individual parameters are tuned on the training set using 5-fold cross validation. The GS optimized parameters take discrete values in the search range, and the WOA optimized parameters take continuous values in the search range.
In Figure 2, we plot the Kaplan-Meier survival curves for the overall data, which depict patient survival over time. The average age at diagnosis or disease examination in the NKI70 dataset was 44 years, and the average survival time of patients was 7.35 months. The average age at diagnosis or disease examination in the METABRIC dataset was 61 years, and the average survival time of patients was 126.55 months. In the TCC dataset, 2757 clients had a duration of use of more than 40 months.
Figure 3 shows histograms of the performance results of the different models on the NKI70 dataset. Parameter results for the GS and WOA optimized models after integer encoding and One-Hot encoding are shown in the Supplementary Material. The three plots on the left side of Figure 3 show the results for integer encoding, and those on the right side show the results for One-Hot encoding. The results show that the models perform better with One-Hot encoding than with integer encoding. The performance with WOA optimized parameters is significantly better than with GS optimized parameters. Within the proposed rcICQRNN model, the network with two hidden layers outperforms the networks with one and three hidden layers on the other evaluation metrics. With integer encoding, the performance of the WOA optimized parameters is significantly better than that of the GS optimized parameters when layer = 1; the average C-index of the WOA-rcICQRNN-30 model is 0.8194, which is 0. The span of the C-index results of the single-hidden-layer model is larger on the One-Hot encoded dataset, and the standard deviation is larger. Although the difference in the C-index results between the different models was small, the superiority of the WOA-rcICQRNN-30 model could still be seen, further supporting our method as the best-performing breast cancer survival prediction model among those compared.
The survival prediction results for the METABRIC dataset before feature selection are shown in Table 3; the values in parentheses are the standard deviations over 50 runs. The results show that the C-index of the different models is around 0.6 and the MMSE is around 0.8, so the accuracy of the prediction models is low and the error is large. Therefore, considering the dimensionality of the dataset, the BWOA algorithm was used for variable selection, and variables with little effect on survival time were eliminated. We found a significant increase in accuracy after variable selection. The same GS and WOA algorithms were used to optimize the parameters so that the model achieves better predictive power. The previous dataset showed that One-Hot encoding gives better results than integer encoding. Although some of the models in Table 3 perform better after integer encoding, the performance of each model after feature selection still shows that One-Hot encoding is optimal. Therefore, Table 4 shows only the performance metrics of each model with One-Hot encoding. The performance of each model after feature selection and the results of each model after integer encoding optimization are in the supplementary material.
In Table 4   Parameter results for the integer encoding and One-Hot encoding datasets after optimization of parameters by GS and WOA algorithms are in the supplementary material. It can be seen that the Cindex performance of the model for the feature-selected dataset is much improved after the parameters are optimized using the intelligent optimization algorithm. The Cindex results for One-Hot encoding data are more stable at this time. Except for the single-hidden layer and double-hidden layer models in integer encoding, all other models optimized by WOA can show better prediction accuracy, making the proposed rcICQRNN more suitable for predicting right-censored data.
The survival prediction results for the TCC dataset are shown in Table 5. In this section, the results after integer encoding and One-Hot encoding are described further. The results show that, for each number of layers, One-Hot encoding helps to improve the prediction accuracy. When layer = 3, the rcICQRNN_19 model has a C-index of 0.9813, an MMSE of 0.1265, and a QL of 0.0732 on the TCC dataset, which is better than the integer encoding model. The QRNN technique, however, produced the weakest results. The results were compared with those of QRNN and the Cox proportional hazards model (in terms of C-index and MMSE) on the same dataset. A summary is presented in Table 6. It can be observed that the method proposed in this study has higher precision and accuracy than the other techniques applied in the listed studies.

Conclusions
Machine learning has achieved great success in predictive modeling of survival data containing right-censored observations, owing to its ability to model nonlinear relationships [1]. Currently, many approaches use the patient's survival status as the output of a neural network and analyze the relationship between features and survival time in order to improve predictive power [31,32]. The traditional accelerated failure time (AFT) model, which directly relates the logarithm of the time-to-event to covariates or predictor variables, has become a popular alternative. However, AFT models are usually parametric because the logarithm of the time-to-event is assumed to be linearly associated with the predictors under a specific error distribution. The quantile-based approach, by contrast, is nonparametric and is defined from the cumulative distribution function or survival function. It allows the entire spectrum of quantiles to be estimated from covariate values, which provides the overall shape of the survival distribution while still allowing statistical inference for a specific percentile of interest if necessary. It therefore offers some advantages over AFT-based and hazard-based models. Combined with the ability of neural networks to capture nonlinearities, more accurate predictions can be achieved.
This study focuses on right-censored data, with failure time as the response variable and covariates affecting the failure time, and studies the prediction of failure time in depth in order to estimate the nonparametric effects of covariates on the response variable. A hybrid algorithm of composite quantile regression and a deep neural network is applied to the censored data to model the quantile function of failure time and to investigate the covariate effects at different quantiles. A survival prediction model based on an improved composite quantile regression neural network framework combined with an inverse probability weighting method is proposed. WOA was used for hyperparameter tuning, and BWOA was applied for variable selection on high-dimensional covariate data. The survival prediction models were evaluated using three metrics: C-index, MMSE, and QL. For the categorical variables in the real datasets, integer encoding and One-Hot encoding were used. Through extensive tests on simulated datasets and three real datasets (NKI70, METABRIC, and TCC), the results show that the proposed rcICQRNN method has high prediction accuracy for right-censored data. On the simulated dataset, the rcICQRNN-5 model with two hidden layers achieves the best C-index, 0.0051 higher than the single-hidden-layer model. Overall, the two-hidden-layer rcICQRNN-5 model is more suitable for analyzing the simulated dataset. The WOA-rcICQRNN-30 model with One-Hot encoding is more suitable for the NKI70 data; with three hidden layers, it has a C-index of 0.8864, an MMSE of 0.4583, and a QL of 0.2534. The WOA-rcICQRNN-15 model is better suited to the METABRIC dataset.
The C-index of the model on the feature-selected dataset improved considerably after the parameters were optimized with an intelligent optimization algorithm, and the C-index results for the One-Hot encoded data were more stable. This makes the proposed rcICQRNN well suited to predicting right-censored data and provides useful insights for better diagnosis and for implementing specific treatments for breast cancer patients. This study combines composite quantile regression and neural network techniques to provide a new framework that enhances cancer prognostic survival analysis and prediction by fusing multidimensional features. The advantages of the proposed rcICQRNN model over previous methods are as follows. First, it is demonstrated theoretically and numerically that the rcICQRNN model combines the advantages of composite quantile regression and feedforward networks. It can flexibly exploit the properties of neural networks to explore nonlinear relationships between variables, and it can also use composite quantile regression to improve estimation efficiency and predictive ability. Second, it integrates multiple regression quantiles, which in practical applications gives it advantages in the robustness and accuracy of its predictions. The superiority of the rcICQRNN model is illustrated by simulation studies and three practical applications. In practice, 1930 seems to be a favorable choice for the rcICQRNN model. Despite the contributions of this study, some limitations remain to be addressed in future research. We acknowledge that rcICQRNN models, while successfully extracting potentially informative features from the data without manual feature engineering, do so at the cost of transparency and interpretability.
In the future, attention mechanisms could be incorporated into this model, not only to rank the importance of the input features but also to show how feature importance changes over time.
Another direction for future work is to combine composite quantile regression neural networks with different penalty functions in order to develop an accurate and reliable prediction model that handles high-dimensional genomic features in survival analysis more effectively. Such a model would help physicians understand the impact of patients' genetic characteristics on breast cancer and predict patient survival more effectively.