Survival Analysis for Multimode Ablation Using Self-Adapted Deep Learning Network Based on Multisource Features

Novel multimode thermal therapy by freezing before radio-frequency heating has achieved a desirable therapeutic effect in liver cancer. Compared with surgical resection, ablation treatment has a relatively high risk of tumor recurrence. To monitor tumor progression after ablation, we developed a novel survival analysis framework for survival prediction and efficacy assessment. We extracted preoperative and postoperative MRI radiomics features and vision transformer-based deep learning features. We also combined the immune features extracted from peripheral blood immune responses using flow cytometry and routine blood tests before and after treatment. We selected features using random survival forest and improved the deep Cox mixture (DCM) for survival analysis. To properly accommodate multitype input features, we proposed a self-adapted fully connected layer for locally and globally representing features. We evaluated the method using our clinical dataset. Of note, the immune features rank the highest feature importance and contribute significantly to the prediction accuracy. The results showed a promising $C^{td}$-index of 0.885 ± 0.040 and an integrated Brier score of 0.041 ± 0.014, which outperformed state-of-the-art method combinations of survival prediction. For each patient, individual survival probability was accurately predicted over time, which provided clinicians with trustable prognosis suggestions.


Ziqi Zhao, Wentao Li, Ping Liu, Aili Zhang, Jianqi Sun, and Lisa X. Xu

I. INTRODUCTION
Liver cancer's morbidity and mortality are chronically high in East Asia, especially in China. A population study predicts that by 2030, more than 23 million people in the world will be affected by cancer [1]. With the development of modern medical imaging and surgical navigation technology, minimally invasive thermophysical ablation of tumors, including cryoablation [2], radiofrequency ablation [3], microwave ablation [2], laser ablation [4], and high-intensity focused ultrasound (HIFU) ablation [5], has received more attention. Currently, ablation therapy is internationally recognized as the first-line treatment for small hepatocellular carcinoma (<3 cm) [6]. In addition, a large-scale solution for stereotactic ablation has also been proposed for larger tumors (<8 cm) [7]. Notably, we developed multimode thermal therapy [8]. Through rapid liquid nitrogen cooling before accurate radio-frequency heating, the microcirculation system and the tumor tissue in situ are destroyed, and tumor antigens are released to the greatest extent to stimulate a strong and effective immune response in the body [9].
Due to insufficient treatment, there is a high risk of tumor recurrence after ablation, and serious complications from overtreatment can occur. Therefore, it is essential to evaluate the surgical efficacy and postoperative tumor progression. Calorie-based methods [10], [11] are commonly used for post-ablation assessments. Nevertheless, those methods only focus on the relationship between heat dose and coagulative necrosis, ignoring notable apoptosis, tumor microcirculation, and immune responses following heat stimulation. Since no ex vivo anatomical tissue is available for analysis after ablation, image-based assessment is widely used for postoperative evaluation. Passera et al. [12] first attempted to assess the accuracy of radiofrequency ablation by obtaining quantitative metrics. Makino et al. [13] simultaneously superimposed preoperative CT/MRI with intraoperative US images and grouped them based on the degree of ablation at the tumor margin. In addition, Shyn et al. [14] believed it was feasible to use PET/CT to evaluate radiofrequency ablation procedures. However, image-based methods focus more on morphological and topological differences between ablated and nonablated regions, and frequent imaging follow-ups are costly to patients in time and money.
Radiomics is a noninvasive biomarker-based prognostic evaluation method that extracts high-throughput features, including morphology, gray-level, and texture changes of the region of interest, from radiological images [15]. Unlike the image-based evaluation method, it objectively and comprehensively evaluates these changes based on quantitative indicators to reflect the correlation between the tissue characteristics of the region of interest and clinical decision-making. Recently, many studies have applied radiomics to survival analysis [16], [17], protein representation [18], and molecular feature classification [19]. Although hundreds of feature representations can be obtained by applying different parameter settings and image filters, low-order image features cannot fully express the heterogeneity of the region of interest. In addition, because radiomics features are designed based on medical prior knowledge, the extracted features are strongly subjective, which limits their use in fitting prediction models and quantifying clinical decisions.
Recently, deep learning has shown remarkable strength in computer vision. High-order features extracted from deep learning networks contain more abstract information from medical images and fit more complex predictive models than radiomics features. Convolutional neural networks (CNNs) [20] are among the most representative deep learning architectures. Nevertheless, due to the small sample size of medical images and the scarcity of annotations, some researchers [17], [21], [22] employed transfer learning, extracting deep learning features from CNN models pre-trained on large-scale natural image datasets. As the Transformer [23] has been studied in depth in computer vision, its results in downstream image tasks have even exceeded those of CNN-based models [24]. Vision Transformer (ViT) [25] is a representative network that applies the Transformer to computer vision. ViT captures long-distance pixel relationships instead of relying on the local receptive field of the convolutional operation, and it outperforms CNNs in many medical image tasks. In particular, ViT-based methods have been used to classify glioma subtypes [26], lung cancer [27], and breast cancer [28], [29]. Yap et al. [30] showed that ViT-based models provided better results than CNNs in classifying tumors into benign, malignant, and normal categories. We therefore used a pre-trained ViT rather than CNN-based models for deep learning feature extraction. To the best of our knowledge, we are the first to employ ViT to extract features for survival prediction. However, the neural network is a "black box" lacking interpretability. Therefore, we utilized score class activation mapping (Score-CAM) [31] to visualize the specific region of the tumor site attended to by the deep learning features extracted from ViT. Score-CAM uses the model's global confidence score for each feature map as its linear weight, and it performed better than gradient-based CAM methods in our experiment.
Mo et al. [32] used the pretreatment neutrophil-to-lymphocyte ratio as the input feature of a predictive model. Li et al. [33] combined the median albumin level, median total bilirubin level, and median α-fetoprotein level with demographic and clinical features. Tong et al. [34] and Wang et al. [18] used CD8 and PD-L1 expression as clinical factors, respectively. To explore the mechanism of changes in immunological markers in our multimodal thermal therapy, we evaluated the peripheral blood immune responses and routine blood tests before the treatment and again on days 3, 30, 90, 180, and 360 after the treatment. Few studies have used complete immunological information as input features for tumor progression prediction. Our subsequent experiments show that immune features play a crucial role in predicting tumor progression.
The Cox proportional hazards regression model (CPH) [35], a typical survival prediction method, models the distribution of time-to-event, but it makes very strong assumptions about the underlying stochastic process and about the relationship between the covariates and the parameters of that process. Kraisangka et al. [36] integrated Bayesian networks based on CPH to improve the prediction ability of the model. Kraisangka et al. [37] added a structural sparsity constraint to alleviate the problem of data scarcity. Random survival forest (RSF) [38] is also a widely used data-driven model that combines random forest [39] and traditional survival analysis. RSF does not need to meet specific premise assumptions and has no specific requirements for the original data. Compared with traditional survival prediction models, deep learning-based models can fit more complex nonlinear relationships between features and outputs and predict the survival probability more precisely. The Faraggi-Simon network [40] and DeepSurv (DS) [41] replaced the linear operation with a nonlinear operation using neural networks. However, these methods are still constrained by the same strong assumptions as CPH, which limits their prediction ability. Recent studies such as DeepHit (DH) [42] and deep survival machines (DSM) [43] were developed to learn fully parametric models, showing an outstanding prediction ability, particularly in the competing risks scenario. Deep Cox mixture (DCM) [44] assumes latent groups and makes the proportional hazards assumption only within each group of a mixture model. In this paper, we developed a novel self-adapted fully connected architecture for feature representation based on DCM.
In our previous work [45], we extracted features from preoperative MRI, the intraoperative thermal dose map, and postoperative MRI with a regular RSF-DNN method. However, that work only used radiomics features and lacked high-order feature representation. Its survival model employed a relatively simple one-layer feed-forward neural network, which could not fully mine the association between input features and tumor progression.
In this paper, we make the following three contributions. First, we used the ViT model to extract deep learning features that were more expressive of tumor regions than those of the CNN-based models, and we added a large number of time-varying immune features to the prediction of tumor progression. We found that the immune features accounted for 50.5% of the feature importance in survival prediction, which indicated their importance in survival analysis. Second, we developed a self-adapted fully connected layer for locally and globally representing features. We combined this structure into the DCM model and obtained outstanding prediction accuracy. Third, we compared the combinations of four feature selection methods and seven prediction models to choose the best tumor progression prediction framework. Our proposed RSF_SA-DCM outperformed the previous work with a 3% prediction accuracy improvement.

II. MATERIALS AND METHODS
After data collection, we preprocessed the preoperative and postoperative MRIs with grayscale threshold adjustment, registration, and segmentation of the tumor regions. Feature extraction, feature selection, and survival prediction were then conducted to predict tumor progression. The flowchart for building the survival prediction model is presented in Fig. 1.

A. Data Acquisition
The Ethics Committee of the Fudan University Shanghai Cancer Center (No. 1604159-3-1605&1604159-3-1606) approved the clinical study on July 6th, 2016. The enrolled patients had histologically confirmed colorectal cancer liver metastasis (CRCLM), had previously undergone radical resection of the primary lesion, and had no local recurrence or extrahepatic metastasis [9]. Patients with severe disorders or other uncontrolled diseases were excluded, as were pregnant or lactating women [46]. The overall median progression-free survival (PFS) of the 17 patients in the group was 471 days. Eleven patients experienced local tumor progression with a median tumor-free time of 279 days, 4 patients remained progression-free with a median survival time of 1221 days, and 2 patients exhibited right censoring. In addition, 14 patients had a history of chemotherapy. Preoperative MRI was taken within a week before treatment, postoperative MRI was taken at one month, and follow-up was performed every three months after multimodal ablation surgery to assess tumor progression. The postoperative MR tumor region presented a smoother shape, showing a more uniform gray distribution and different radiomics values. We evaluated the routine blood tests and peripheral blood immune responses before treatment and on days 3, 30, 90, 180, and 360 after treatment. Table I shows the clinical and demographic information of the patients.

B. Data Preprocessing and Tumor Segmentation
We preprocessed the preoperative and postoperative MRIs, including image resampling, grayscale threshold adjustment, and image registration. We resampled the images to 1.188 mm × 1.188 mm × 3 mm voxels with linear interpolation. Moreover, we adjusted the window width and window level for all MRIs and restricted the grayscale range to 0 to 240.
Due to small movements in the patient's body position relative to the instrument during MRI scanning, we applied registration with the Elastix toolkit [47], using the preoperative MRI as the reference image. Then, guided by a senior radiologist, the three-dimensional tumor subregions, including preoperative tumors and postoperative ablation zones, were segmented manually by two readers to evaluate inter-reader reproducibility. We used the software ITK-SNAP [48] for tumor segmentation slice by slice. For the different slices of the same patient, we added appropriate perturbations to the immune features, keeping the features at a consistent level but not numerically identical. We also filled in a small number of missing immune features with mean values.

C. Feature Extraction
1) Radiomics Features: We extracted radiomics features from 2D slices. Preoperative and postoperative ROIs were the targets of extraction. The postoperative tumor ablation area was used as the ROI because we believed that early postoperative MRI can be used to evaluate the recovery of liver tissue. A total of 892 radiomics features were extracted from each case for both preoperative MRI and postoperative MRI. We further employed wavelet and gradient filters to obtain rich texture features. The radiomics features were chosen to describe three main image properties: shape-based features, intensity-based features, and texture features (gray level co-occurrence matrix (GLCM), gray level run length matrix (GLRLM), gray level size zone matrix (GLSZM), and gray level dependence matrix (GLDM)). Notably, clinical features included age, sex, tumor number, and history of chemotherapy, and we classified them into preoperative feature groups.
2) Deep Learning Features: For deep learning feature extraction, we tried a variety of state-of-the-art network structures for adequate comparison. Three types of deep learning methods were utilized, including CNN, MedicalNet [49], and ViT.
Seven CNN algorithms were applied to preoperative and postoperative MRIs, including VGG16 [50], VGG19 [50], ResNet50 [51], DenseNet201 [52], InceptionV3 [53], EfficientNetB7 [54], and InceptionResNetV2 [55]. These seven CNN models were pre-trained on the large-scale, well-labeled natural image dataset ImageNet [56]. Of note, we fine-tuned each network on our dataset for the task of classifying slices as belonging to long or short progression-free survival (PFS), divided by the median time. With the masked delineation of regions of interest (ROIs) for the primary tumor, we replicated the 1-channel 2-D slice three times [57] and stacked the copies along the channel dimension. Using a bounding box based on the tumor label boundary, we resized the maximum tumor region to 224 px × 224 px for the input layer of the CNN models. To ensure that the tumor region was covered entirely, we reserved 5 pixels on each side of the tumor label boundary in the horizontal and vertical directions, that is, 10 pixels in each direction. The resized channel-stacked images were fed into the candidate CNNs for feature extraction. We removed the last fully connected layer and employed global max pooling (GMP) instead, as in [17]. GMP had better object localization ability and performed better than global average pooling (GAP) in our experiment. The comparison of GMP and GAP is shown in the supplementary material.
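The cropping step described above can be illustrated with a minimal, dependency-free sketch. It assumes that each slice and its tumor mask are plain nested lists of equal size; the function names `bbox_with_margin` and `crop_and_stack` are hypothetical, and the final resize to 224 px × 224 px is left out because the interpolation used is not stated in the text.

```python
def bbox_with_margin(mask, margin=5):
    """Bounding box of the nonzero mask region, padded by `margin` pixels
    on every side and clipped to the image. Returns (top, bottom, left,
    right) with `bottom` and `right` exclusive."""
    height, width = len(mask), len(mask[0])
    rows = [r for r in range(height) if any(mask[r])]
    cols = [c for c in range(width) if any(mask[r][c] for r in range(height))]
    if not rows:
        raise ValueError("mask contains no tumor pixels")
    top = max(rows[0] - margin, 0)
    bottom = min(rows[-1] + 1 + margin, height)
    left = max(cols[0] - margin, 0)
    right = min(cols[-1] + 1 + margin, width)
    return top, bottom, left, right


def crop_and_stack(image, mask, margin=5):
    """Crop a 1-channel slice to the padded tumor box and replicate it
    three times along a new leading channel axis (3 x H x W)."""
    top, bottom, left, right = bbox_with_margin(mask, margin)
    crop = [row[left:right] for row in image[top:bottom]]
    # Independent copies, so editing one channel does not alter the others.
    return [[row[:] for row in crop] for _ in range(3)]
```

With the default margin, a tumor spanning 4 × 4 pixels yields a 14 × 14 crop, matching the "10 pixels in each direction" described in the text, unless the box is clipped at the image border.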
ViT is a network structure that has recently achieved outstanding results in various fields of computer vision. It surpasses CNN-based methods in tasks such as image classification and image recognition. We used the same data processing strategy as for the CNN methods and fine-tuned ViT on the same task. We utilized the output of the last transformer encoder as the extracted features. The network settings are the same as in Dosovitskiy's work [25]. Table II shows the performance of the three kinds of models in the classification task on the preoperative MRI training cohort. The predictive performance on the preoperative MRI testing cohort and the postoperative MRI training and testing cohorts is shown in the supplementary material. We chose the best-performing model, ViT-L/16-imagenet1k, as the deep learning feature extractor, which yielded 1024 features each for preoperative and postoperative MRIs.
3) Immune Features: Peripheral blood immune response assessments and routine blood tests were performed before treatment and on days 3, 30, 90, 180, and 360 after treatment. The lymphocyte and myeloid cell counts, neutrophil/lymphocyte ratio (NLR), and monocyte/lymphocyte ratio (MLR) were assessed by routine blood tests [9]. Because some patients had only short PFS and some immune measurements were missing for external reasons, we selected the immunological data collected before, 3 days after, and one month after multimodal ablation. Each patient has 162 immune features.

D. Feature Selection
High-throughput features are often accompanied by complex linear and nonlinear relationships, which can lead to overfitting of the trained model. An effective feature selection process can not only avoid overfitting but also improve the generalization and interpretability of the survival prediction model. The feature selection process was divided into three steps. First, we computed the intraclass correlation coefficient (ICC) from two groups of radiomics features that were extracted from the two readers' tumor ROI segmentations. Features with ICCs >0.8 were considered to have good reproducibility. Second, we excluded features with a strong linear correlation (>0.8), retaining from each correlated pair the feature with the higher ICC. The partial feature correlation matrices of the different types of features are shown in the supplementary material; radiomics features clearly show a high linear correlation, while deep learning features display a lower linear correlation. Notably, the above two steps were applied to radiomics, deep learning, and immune features. Third, we combined the selected features and put them into the RSF model, which consisted of 800 decision trees. Hyperparameters, including the maximum depth of the trees, the minimum number of nodes, and the maximum-features selection strategy, were evaluated by ParameterGrid [58]. We ultimately selected the top 2% of the features as the input to the prediction model.
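The first two filtering steps and the final top-2% cut can be sketched as follows. This is an illustrative sketch under stated assumptions, not the authors' implementation: features are held in a dictionary of value lists, the ICC values and the RSF importances are assumed to be computed elsewhere, the correlation threshold is applied to the absolute Pearson coefficient, and the names `prefilter` and `top_fraction` are hypothetical.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)


def prefilter(features, icc, icc_min=0.8, corr_max=0.8):
    """features: name -> list of values; icc: name -> ICC between readers."""
    # Step 1: keep only reproducible features.
    candidates = [name for name in features if icc[name] > icc_min]
    # Step 2: visit features by descending ICC, so that within a strongly
    # correlated pair the feature with the higher ICC is the one retained.
    candidates.sort(key=lambda name: icc[name], reverse=True)
    kept = []
    for name in candidates:
        if all(abs(pearson(features[name], features[k])) <= corr_max
               for k in kept):
            kept.append(name)
    return kept


def top_fraction(importance, fraction=0.02):
    """Step 3: keep the top `fraction` of features by (RSF) importance."""
    k = max(1, round(len(importance) * fraction))
    return sorted(importance, key=importance.get, reverse=True)[:k]
```

The greedy ordering by ICC is one simple way to realize "reserve the feature with the higher ICC"; other tie-breaking schemes would also be consistent with the text.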

E. Survival Prediction
For the most important component, the tumor progression prediction model, we used the state-of-the-art survival regression model DCM as the baseline. DCM is not restricted by the strong assumption of proportional hazards and learns mixtures of Cox models to model individual survival distributions. The model assumes latent groups and makes the proportional hazards assumption only within each group of the mixture. On this basis, we developed a self-adapted fully connected layer for feature representation. Given the distribution of our dataset, we utilized a grouping strategy on the first fully connected layer. For the different types of features, including radiomics features, deep learning features, and immune features, we first performed the fully connected calculation on the features within each group and then, after concatenating the outputs of these group-wise layers, performed a global fully connected calculation. We believed that this approach allowed features to be fully represented while reducing the number of training parameters. Fig. 2 provides a schematic description of our proposed SA-DCM. We used the same loss function as DCM for training. The feature representation procedure consists of two fully connected layers, each of which contains 150 nodes. It is worth noting that the first layer was divided into three groups of 50 nodes each. Furthermore, we set a finite mixture of K = 3 Cox models for modeling an individual's survival function. We used ReLU6 [59] as the activation function. During the training phase, we used He initialization [60] and the Adam optimizer [61]. In addition, the model's learning rate was set to 1 × 10^{-4}, and the model was trained for 800 epochs.
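The grouped-then-global layout described above can be sketched without any deep learning framework, which also makes the parameter saving easy to check by hand. This is only an illustration under stated assumptions: the group sizes (8, 7, 7) for the 22 selected features are hypothetical, biases start at zero, the class names `Linear` and `SelfAdaptedFC` are invented for this sketch, and the Cox mixture head and training loop of DCM are not shown.

```python
import random


def relu6(values):
    """ReLU6 activation: clip each value to the range [0, 6]."""
    return [min(max(v, 0.0), 6.0) for v in values]


class Linear:
    """Dense layer with He-style Gaussian weights and zero biases."""

    def __init__(self, n_in, n_out, rng):
        std = (2.0 / n_in) ** 0.5
        self.w = [[rng.gauss(0.0, std) for _ in range(n_in)]
                  for _ in range(n_out)]
        self.b = [0.0] * n_out

    def __call__(self, x):
        return [sum(wi * xi for wi, xi in zip(row, x)) + b
                for row, b in zip(self.w, self.b)]

    def n_params(self):
        return len(self.w) * len(self.w[0]) + len(self.b)


class SelfAdaptedFC:
    """One local dense block per feature group, then one global dense layer."""

    def __init__(self, group_sizes, local_nodes=50, global_nodes=150, seed=0):
        rng = random.Random(seed)
        self.group_sizes = list(group_sizes)
        self.local = [Linear(n, local_nodes, rng) for n in self.group_sizes]
        self.glob = Linear(local_nodes * len(self.group_sizes),
                           global_nodes, rng)

    def __call__(self, x):
        if len(x) != sum(self.group_sizes):
            raise ValueError("input length does not match the group sizes")
        hidden, start = [], 0
        for size, layer in zip(self.group_sizes, self.local):
            hidden += relu6(layer(x[start:start + size]))  # local step
            start += size
        return relu6(self.glob(hidden))  # global step on the concatenation

    def n_params(self):
        return sum(l.n_params() for l in self.local) + self.glob.n_params()
```

Under these assumed group sizes, the grouped first layer has 22 × 50 + 150 = 1250 parameters, whereas a dense first layer of 150 nodes on all 22 inputs would have 22 × 150 + 150 = 3450; the second layer (150 × 150 + 150 = 22650) is the same in both cases.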

III. EXPERIMENTS AND RESULTS
We first describe the data splitting strategy for our dataset and then evaluate the ability of our proposed model to predict tumor progression. Ablation experiments comparing different prediction frameworks and input combinations were implemented. Additionally, we quantified the model's clinical performance. Regarding the improved network architecture, we demonstrated the superiority of self-adapted fully connected layers in survival prediction tasks. Finally, we examined the interpretability of the selected features.

A. Experimental Setting
1) Data Splitting: To make full use of our rare and precious dataset, an experienced radiologist evaluated and recalibrated the follow-up tumor progression of each slice using the preoperative and postoperative MRIs. We split the 17 patients' preoperative and postoperative MRIs with tumors into 120 slices in the cohort. Considering right censoring, our dataset is a set of tuples $\mathcal{D} = \{(x_i, t_i, \delta_i)\}_{i=1}^{N}$, where $x_i$, $t_i$, and $\delta_i$ represent the features, observed time, and event indicator, respectively. Based on the particularity of our dataset, we used the strategy in [44], which kept the patients with the longest and shortest PFS in the training set. For the remaining patients, we separated the dataset by 10-fold cross-validation.
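A minimal sketch of this splitting strategy is given below. It assumes that the split is made at the patient level (so that slices from one patient never appear in both training and test sets), which the text does not state explicitly; the function name `split_folds` and the patient identifiers are hypothetical.

```python
import random


def split_folds(pfs_days, n_folds=10, seed=0):
    """pfs_days: patient id -> PFS in days.
    Returns a list of (train_ids, test_ids) pairs."""
    ordered = sorted(pfs_days, key=pfs_days.get)
    if len(ordered) < 3:
        raise ValueError("need at least three patients")
    # The patients with the shortest and longest PFS always stay in training.
    anchors = [ordered[0], ordered[-1]]
    rest = ordered[1:-1]
    random.Random(seed).shuffle(rest)
    # Deal the remaining patients round-robin into n_folds test folds.
    folds = [rest[i::n_folds] for i in range(n_folds)]
    splits = []
    for test_ids in folds:
        if not test_ids:
            continue  # fewer remaining patients than folds
        train_ids = anchors + [p for p in rest if p not in test_ids]
        splits.append((sorted(train_ids), sorted(test_ids)))
    return splits
```

With 17 patients, 15 remain after fixing the two anchors, so five folds test on two patients and five folds test on one.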
2) Implementation Details: We conducted our novel multimode ablation treatment at the Shanghai Cancer Center of Fudan University. During ablation surgery, we performed a two-stage procedure, first freezing and then heating. We extracted radiomics features using the Python package PyRadiomics [62]. The deep learning features were extracted by ViT. Regarding feature selection and performance metrics, we applied the Python toolkit scikit-survival [63] to fit the random survival forest model and to calculate the time-dependent concordance index and integrated Brier score. The log-rank tests, linear CPH regression, and Kaplan-Meier estimations were conducted with the Python package lifelines [64]. The self-adapted deep Cox mixtures model for tumor progression prediction was implemented in PyTorch on top of the Python package auton-survival [65]. Experiments were conducted on an NVIDIA RTX 3090 GPU and an Intel i9-10900K CPU @ 3.7 GHz.
3) Performance Metrics: We evaluated the predictive ability of our proposed model with two metrics, the time-dependent concordance index ($C^{td}$-index) [66] and the integrated Brier score (iBS) [67]. The concordance index (C-index) [68] measures the consistency between the actual recurrence order and the predicted recurrence order over all pairs of individuals, that is, the probability that a patient with a higher predicted risk has an event before a patient with a lower risk. However, the ordinary C-index is only calculated at the initial time of observation and therefore does not reflect possible changes in risk over time [42]. The $C^{td}$-index in (1) takes time into account. The higher the $C^{td}$-index, the better the model's ability to predict risk.
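Assuming the standard definition of the time-dependent concordance index from [66], and using the dataset notation $(x_i, t_i, \delta_i)$ of Section III-A, (1) can be written in the following form. This is a reconstruction under that assumption rather than a transcription of the original equation:

```latex
C^{td} = P\left( \hat{F}(t_i \mid x_i) > \hat{F}(t_i \mid x_j) \;\middle|\; \delta_i = 1,\; t_i < t_j \right) \tag{1}
```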
Here, $\hat{F}(t \mid x)$ is the cumulative distribution function predicted by the survival analysis model at the truncation time $t$.
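The pairwise comparison behind the time-dependent concordance index can be illustrated with a short sketch. It is an unweighted version that assumes no inverse-probability-of-censoring correction and counts tied predictions as half-concordant; both choices are assumptions of this sketch, and library implementations such as the one in scikit-survival may differ. The callback `cdf(i, t)`, returning the predicted probability that subject `i` has progressed by time `t`, is a hypothetical interface.

```python
def ctd_index(times, events, cdf):
    """Unweighted time-dependent concordance.

    times[i]  : observed time of subject i
    events[i] : 1 if progression was observed, 0 if censored
    cdf(i, t) : predicted P(T_i <= t) for subject i
    """
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is anchored only on an observed event
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                # Both predictions are evaluated at subject i's event time.
                f_i, f_j = cdf(i, times[i]), cdf(j, times[i])
                if f_i > f_j:
                    concordant += 1.0
                elif f_i == f_j:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A model that always assigns the earlier-progressing subject the higher predicted probability at that subject's event time scores 1.0, and a model that does the opposite scores 0.0.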
The error of the predicted probability can be evaluated using the Brier score, which assesses the overall performance of the model. The lower the Brier score for a group of predicted values, the better the prediction calibration. The iBS in (2) represents the average error of the predicted probability over the time interval.
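Assuming the standard censoring-weighted Brier score of [67], (2) can be written in the following form, with $N$ the number of samples. This is a reconstruction under that assumption rather than a transcription of the original equation:

```latex
\mathrm{BS}(t) = \frac{1}{N} \sum_{i=1}^{N} \left[
  \frac{\hat{S}(t, x_i)^{2} \, \mathbb{1}(t_i \le t,\ \delta_i = 1)}{\hat{G}(t_i)}
  + \frac{\bigl(1 - \hat{S}(t, x_i)\bigr)^{2} \, \mathbb{1}(t_i > t)}{\hat{G}(t)}
\right],
\qquad
\mathrm{iBS} = \frac{1}{\tau} \int_{0}^{\tau} \mathrm{BS}(t) \, dt \tag{2}
```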
Here, $\hat{G}(t)$ is the Kaplan-Meier estimate of the survival function of the censoring distribution, $\hat{S}(t, x_i)$ is the predicted probability of progression-free survival, $\tau$ is the longest follow-up time in the dataset, and $\mathbb{1}(\cdot)$ is the indicator function.

4) Performance of Clinical Validation: In addition to quantitatively evaluating the model's predictive ability, we also need to verify the performance of our model in clinical applications. We divided the results into two parts to explore the overall survival probability and the individual survival probability. After model prediction, each patient received predicted risk scores. We selected the highest risk score of each patient as the final risk score and used the median risk score to stratify patients into high- and low-risk groups for recurrence. The overall survival probability was evaluated by Kaplan-Meier curves, which demonstrated the stratification power of the model's risk prediction. Within the risk groups, we randomly selected multiple groups of individuals from the high- and low-risk groups to verify their recurrence risk over time. The predicted recurrence risk trend and the actual recurrence time were compared to further validate the role of the model in the assessment of clinical prognosis. Clinicians can refer to the individual survival curve to make timely and effective interventions in the prognosis of patients.
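The risk stratification and the Kaplan-Meier estimate described above can be sketched as follows. The paper used the lifelines package for this step; the dependency-free version below is only an illustration, the function names are hypothetical, and assigning patients whose score equals the median to the low-risk group is an assumption of this sketch.

```python
from statistics import median


def stratify_by_median(slice_risks):
    """slice_risks: patient id -> list of per-slice predicted risk scores.
    Each patient's final score is the highest slice score."""
    patient_risk = {p: max(scores) for p, scores in slice_risks.items()}
    cut = median(patient_risk.values())
    high = sorted(p for p, r in patient_risk.items() if r > cut)
    low = sorted(p for p, r in patient_risk.items() if r <= cut)
    return high, low


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct observed event time."""
    curve, surv = [], 1.0
    for t in sorted({u for u, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        failed = sum(1 for u, e in zip(times, events) if u == t and e)
        surv *= 1.0 - failed / at_risk  # product-limit update
        curve.append((t, surv))
    return curve
```

Calling `kaplan_meier` separately on the high-risk and low-risk groups gives the two curves whose separation is then assessed with a log-rank test.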

B. Model Effectiveness of Tumor Progression Prediction
1) Comparison With Other Prediction Frameworks: We compared other combinations of feature selection methods and predictive models with our proposed RSF_SA-DCM method in Table III. We applied four different feature selectors: the filter-based method minimum redundancy maximum relevance (mRMR) [69] and the embedded methods least absolute shrinkage and selection operator (LASSO) [70], Group LASSO [71], and RSF. It is worth noting that the ICC and correlation computations were conducted before all feature selection methods. LASSO and Group LASSO selected 15 and 14 features, respectively. mRMR fed the remaining 20 features into the predictive model, while RSF selected a 22-dimensional vector. To obtain the best prediction effect, we tested a variety of predictive models and combined these models with the four feature selection methods. The deep learning models included DeepSurv, DeepHit, deep survival machines (DSM), deep Cox mixtures (DCM), and our proposed SA-DCM. The comparative experiments used the combined features as the input of the feature selectors and implemented the data splitting strategy in Section III-A. It can be observed that the models with feature selection outperformed those without, with a 2%-9% $C^{td}$-index improvement for RSF and DCM. The nonlinear models (RSF and deep learning-based models) obtained a much better $C^{td}$-index than the linear model (CPH), and the comparison between Group-LASSO_CPH and RSF_SA-DCM showed a large gap of 9.2% in the $C^{td}$-index, indicating the importance of nonlinear model fitting for tumor progression prediction. It is worth noting that the deep learning predictors significantly outperformed traditional methods: the RSF_SA-DCM $C^{td}$-index of 0.885 ± 0.040 was a marked improvement over the RSF_RSF $C^{td}$-index of 0.821 ± 0.064. In our experiments, we employed several strong deep learning models, among which DCM generally predicted better than the other methods. Our proposed SA-DCM outperformed the other combinations, with an improvement of approximately 0.2%. The above results emphasize the powerful prognostic ability of our proposed model and show that our method effectively mined useful information from a large number of redundant features.

TABLE III COMPARISON WITH STATE-OF-THE-ART PREDICTION FRAMEWORK
TABLE IV COMPARISON WITH DIFFERENT INPUT COMBINATIONS
2) Comparison With Different Input Combinations: Using our proposed RSF_SA-DCM method as the baseline, we explored the effect of different input combinations, as shown in Table IV. We extracted deep learning and radiomics features from both preoperative and postoperative MRIs, and further discussion is provided in Section IV. Compared with individual features as input, combined features had stronger predictive power. Immune features and deep learning features alone achieved $C^{td}$-indices of 0.811 ± 0.020 and 0.795 ± 0.014, respectively. The combined features DL+Rad, DL+Imm, and Rad+Imm achieved $C^{td}$-indices of 0.808 ± 0.067, 0.881 ± 0.036, and 0.863 ± 0.020, respectively. Note that the combination of all three types of features showed the best performance, with a $C^{td}$-index of 0.885 ± 0.040. Interestingly, when we only input immune features, the result was better than that of the DL+Rad combination. Furthermore, the inputs that included immune features outperformed the single-type inputs without immune features, with a difference of approximately 10% in accuracy. This demonstrated that immune features played a vital role in tumor progression prediction. DL+Rad+Imm showed a 0.4% improvement over the combination without radiomics features, indicating that multimodality features established stronger associations with our multimodal ablation prognosis. Moreover, Fig. 3 shows the iBS of the different input combinations. DL+Rad+Imm outperformed the other combinations.

TABLE V COMPARISON WITH OUR PREVIOUS WORK
3) Comparison With Our Previous Work: In our previous work [44], we extracted features from preoperative MRI, intraoperative thermal dose maps, and postoperative MRI with a regular RSF-DNN method. We used the same multimodal ablation procedure and data preprocessing methods; therefore, the two methods can be compared. As the intraoperative thermal dose maps were simulated manually by software, we dropped them and added deep learning and immune features that had not been used before. In addition, we implemented the self-adapted DCM to obtain a more powerful predictive ability. Table V shows the difference in the C td -index. Compared with the previous method, our method improved by 3%. The comparison further indicated that our method provided a more accurate prognosis prediction and could guide clinicians to conduct a timely intervention. This structure not only extracted the correlation between different feature groups but also strengthened the relationship between the features in the same group, further captured the nonlinear relationship between the features and the network output, and more effectively enabled the neural network to predict tumor progression proficiently.

C. Effectiveness of the Self-Adapted Fully Connected Layer
To verify the performance of our proposed architecture, we used different feature combinations and compared them with the baseline network DCM. DL+Rad, DL+Imm, Rad+Imm, and DL+Rad+Imm were employed as the inputs, and 18, 17, 20, and 22 features, respectively, were selected after RSF feature selection. For each group of the remaining features, we set a fully connected layer of 50 nodes for three groups and 75 nodes for two groups, concatenated the outputs of these layers, and obtained the final feature representation through a fully connected layer of 150 nodes. To keep the two models comparable, we added to the original DCM an additional fully connected layer whose number of nodes matched that of the group layers in SA-DCM. Table VI and Fig. 4 show that, under the same input features and feature selection method, our proposed SA-DCM outperformed DCM with a 0.2%-1% improvement in the C td -index. Furthermore, Fig. 5 shows the runtime advantage of SA-DCM. We used the same number of fully connected layers and nodes in each deep learning network and trained each network for 800 epochs. We also employed the same data-splitting strategy as in Section III-A and implemented 10-fold cross-validation. The results demonstrated that although CPH and RSF had the shortest training time, SA-DCM outperformed the state-of-the-art deep learning predictive model, indicating the performance and efficiency advantages of our algorithm.
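The layer sizes described above can be illustrated with a small forward-pass sketch in plain Python. It only mirrors the stated structure (one fully connected layer per feature group, concatenation, then a global fully connected layer of 150 nodes); the ReLU activation, the random initialization, and all function names are assumptions made for illustration and do not reproduce our trained network.

```python
import random


def init_layer(n_in, n_out, rng):
    """Random weights and zero biases for one fully connected layer."""
    weights = [[rng.gauss(0.0, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out


def dense(x, weights, bias):
    """Fully connected layer followed by ReLU (the activation is an assumption)."""
    return [
        max(0.0, sum(w * v for w, v in zip(row, x)) + b)
        for row, b in zip(weights, bias)
    ]


def build_sa_block(group_sizes, local_nodes=50, global_nodes=150, seed=0):
    """One local layer per feature group plus one global layer on the concatenation."""
    rng = random.Random(seed)
    local_layers = [init_layer(n, local_nodes, rng) for n in group_sizes]
    global_layer = init_layer(local_nodes * len(group_sizes), global_nodes, rng)
    return local_layers, global_layer


def sa_forward(groups, block):
    """Represent each group locally, concatenate, then represent globally."""
    local_layers, (global_w, global_b) = block
    concatenated = []
    for x, (w, b) in zip(groups, local_layers):
        concatenated.extend(dense(x, w, b))
    return dense(concatenated, global_w, global_b)


# Three groups with 11 immune, 7 deep learning, and 4 radiomics features (22 in total).
block3 = build_sa_block([11, 7, 4], local_nodes=50)
out3 = sa_forward([[0.1] * 11, [0.2] * 7, [0.3] * 4], block3)

# Two groups use 75 nodes each, so the concatenation still has 150 entries.
block2 = build_sa_block([11, 7], local_nodes=75)
out2 = sa_forward([[0.1] * 11, [0.2] * 7], block2)
```

With either configuration the concatenated local representation has 150 entries, so the global layer sees an input of the same width regardless of how many feature groups are used.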

D. Clinical Validation
1) Overall Survival Probability: The output of our predictive model contained a risk score for each test case. For each patient, we selected the highest risk score from the slices. We used the median risk score to divide the patients into high- and low-risk groups. Fig. 6 shows the Kaplan-Meier curves for the subgroups stratified by median risk score. The curves of the high-risk group and the low-risk group did not intersect during the whole follow-up process, and there was a significant difference (p = 0.00057). The median survival time of the low-risk group was approximately 650 days longer than that of the high-risk group, indicating that the risk score identified a clinically relevant long-term risk of recurrence without the need for lengthy follow-up. This also demonstrated the ability of the model to qualitatively assess the degree of risk through risk scoring.
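The stratification steps above (highest slice score per patient, median split, Kaplan-Meier estimate per group) can be sketched as follows. This is a minimal illustration under stated assumptions: patients whose score equals the median are assigned to the low-risk group, the log-rank test behind the reported p-value is not included, and the function names and toy values are hypothetical.

```python
from statistics import median


def patient_risk(slice_scores):
    """Map each patient to the highest risk score among that patient's slices."""
    return {pid: max(scores) for pid, scores in slice_scores.items()}


def split_by_median(risk):
    """Split patients at the median risk score (ties go to the low-risk group)."""
    cut = median(risk.values())
    high = sorted(pid for pid, r in risk.items() if r > cut)
    low = sorted(pid for pid, r in risk.items() if r <= cut)
    return high, low


def kaplan_meier(times, events):
    """Product-limit estimate: list of (time, survival) at each distinct event time."""
    curve = []
    survival = 1.0
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        failed = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1.0 - failed / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical toy data: per-slice risk scores for four patients.
toy_scores = {"a": [0.1, 0.9], "b": [0.2, 0.3], "c": [0.5], "d": [0.7, 0.6]}
toy_high, toy_low = split_by_median(patient_risk(toy_scores))

# Hypothetical follow-up times with one censored patient (event = 0).
toy_curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 1])
```

In practice the Kaplan-Meier estimate would be computed separately for the high-risk and low-risk groups and the two curves compared, as in Fig. 6.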
Authorized licensed use limited to the terms of the applicable license agreement with IEEE. Restrictions apply.
2) Individual Survival Probability: We selected random patients from the high-risk and low-risk recurrence groups defined above and plotted their individual survival probability. In Fig. 7, the dotted line marks the PFS of each patient, and the solid line indicates the predicted survival probability over time. Yellow and purple lines represent the individuals in the high- and low-risk groups, respectively. Notably, the individual survival probability in the high-risk group dropped dramatically in the early period, while the decline in the low-risk group was more gradual over time. Moreover, the survival probability descended rapidly around the actual termination of follow-up, demonstrating that our proposed method had strong predictive power and that the results offered an effective reference for postoperative intervention. The individual survival probability of the remaining patients is shown in the supplementary material.
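The way an individual survival curve arises from a mixture of cluster-specific survival functions, as described for SA-DCM, can be illustrated with the short sketch below. It assumes a softmax gate for the mixture probabilities and a Cox-style proportional-hazards adjustment of each cluster's baseline survival; these choices, the function names, and the toy values are assumptions for illustration, not a reproduction of our implementation.

```python
import math


def softmax(logits):
    """Numerically stable softmax giving the mixture probabilities P(Z = k | x)."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def mixture_survival(gate_logits, log_hazard_ratios, baseline_survival):
    """Individual survival curve as the probability-weighted mean over clusters.

    gate_logits[k]          -- gating output for cluster k (assumed softmax gate)
    log_hazard_ratios[k]    -- Cox log hazard ratio of this patient in cluster k
    baseline_survival[k][j] -- baseline survival of cluster k at time grid point j
    """
    probs = softmax(gate_logits)
    n_times = len(baseline_survival[0])
    curve = []
    for j in range(n_times):
        value = 0.0
        for p, log_hr, base in zip(probs, log_hazard_ratios, baseline_survival):
            # Proportional hazards: S_k(t | x) = S_0k(t) ** exp(log hazard ratio)
            value += p * base[j] ** math.exp(log_hr)
        curve.append(value)
    return curve


# Hypothetical patient with two equally likely clusters and neutral hazard ratios.
toy_baselines = [[1.0, 0.8, 0.5], [1.0, 0.6, 0.3]]
toy_patient_curve = mixture_survival([0.0, 0.0], [0.0, 0.0], toy_baselines)

# A higher log hazard ratio lowers the predicted survival at every later time point.
toy_high_risk_curve = mixture_survival([0.0, 0.0], [1.0, 1.0], toy_baselines)
```

Because each cluster curve is non-increasing and the mixture weights are non-negative, the resulting individual curve is also non-increasing over the time grid.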

E. Model Interpretability
1) Visualization of Deep Learning Features: It is worth noting that radiomics assesses the size, shape, and texture features of the tumor region. Immune features emphasize the relevant information at the molecular level over time. They both have strong interpretability, but deep learning features are extracted from a neural network that is a "black box" and lacks interpretability. To illustrate the relevance of deep learning features for tumor potential risk prediction, we generated Score-CAM to visualize the feature maps. The comparison of different CAM methods is shown in the supplementary material. We evaluated the interpretability of different types of deep learning methods, including MedicalNet (3D-ResNet50 based), EfficientNetB7, and ViT-L/16 (ViT-L/16-imagenet1k). In Fig. 8, through the Score-CAM visualization of the deep learning model, the center of the tumor region was highlighted as a significant area for long or short survival time classification. Note that the cases in the first two columns are preoperative MRIs, and the last column is a case of postoperative MRI. The darker the red color, the greater the influence of the region on the network output. Unlike the other two networks, which produced feature maps with more scattered attention regions, ViT-L/16 focused more intensively on the tumor site, which meant that it could obtain features that had a greater impact on the network outcome. Therefore, we chose ViT-L/16 as the deep learning feature extractor. The above experiments verified the interpretability of deep learning features and made their impact on the results more persuasive.
2) Feature Selection and Feature Interpretability: Our proposed method used RSF for feature selection and preserved 22 features for tumor progression prediction. Feature importance rankings and proportions are shown in Fig. 9, with 11 immune features, 7 deep learning features, and 4 radiomics features. Immune features had the highest share of importance among all features, accounting for approximately 50.5%. The other two feature types accounted for 49.5%, indicating that immune features played an irreplaceable role in our task. For the radiomics features, three preoperative features and one postoperative feature were selected, with wavelet and gradient filters. All radiomics features were texture features, and pre_gradient_glrlm_RunVariance was the most important, with a 6.2% contribution. Concerning immune features, patients' immune biomarkers before or 3 days and one month after treatment were included for prognosis. We were surprised to find that the monitoring of postoperative immune markers in patients was a necessary condition for long-term prognosis.
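The proportions by feature type can be obtained by summing the per-feature importances within each group and normalizing by the total. The sketch below shows this arithmetic only; apart from pre_gradient_glrlm_RunVariance, the feature names are placeholders, and all importance values are hypothetical and are not the values behind Fig. 9.

```python
def group_proportions(importances):
    """importances: list of (feature_name, group, importance) tuples.
    Returns the share of total importance contributed by each group."""
    totals = {}
    for _name, group, value in importances:
        totals[group] = totals.get(group, 0.0) + value
    grand_total = sum(totals.values())
    return {group: value / grand_total for group, value in totals.items()}


# Hypothetical importance values for illustration only.
toy_importances = [
    ("immune_marker_1", "immune", 2.0),
    ("immune_marker_2", "immune", 1.0),
    ("vit_feature_1", "deep_learning", 2.0),
    ("pre_gradient_glrlm_RunVariance", "radiomics", 1.0),
]
toy_shares = group_proportions(toy_importances)
```

The shares sum to one by construction, so a statement such as "immune features account for about half of the total importance" can be read directly from the returned dictionary.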

IV. DISCUSSION
In this study, deep learning features and immune features were extracted and combined with our radiomics model. Unlike other studies [16], [18] that only used radiomics and demographic features to predict tumor recurrence and progression, we employed a ViT network pre-trained on natural image datasets to extract deep learning features. In addition, the immunological indicators of patients were monitored before or 3 days and a month after treatment. In our previous work, simulated intraoperative thermal dose maps were used to extract intra-features. Although the combination with intra-features showed higher feature importance and better performance metrics, the thermal dose maps were based on a simulation that inverts the steady-state temperature field distribution of radiofrequency therapy. In the future, when magnetic resonance [72], ultrasonic [73], and other methods of temperature measurement become more real-time and accurate, intra-features can be considered again. The neural network obtains complex feature representations by learning the nonlinear association between features and outputs and extracts features through a large number of operations. We employed CNN-based architectures, ResNet-based MedicalNet architectures that were pre-trained on medical image datasets, and ViT-based models to identify the best classifier for distinguishing long from short PFS. As the procedure of extracting deep learning features was difficult to interpret, we employed feature maps to highlight the subregions that were significant for feature generation, which showed that deep learning methods indeed had the potential to identify tumor image patterns. In particular, in Fig. 8 and Table II, the ViT-L/16 architecture showed the best classification performance on the evaluation indicators and the most concentrated highlighting of the tumor region.
In the above experimental studies, deep learning and radiomics features were extracted from preoperative and postoperative MRIs. We performed ablation experiments using preoperative MRI and postoperative MRI alone and in combination. Table VII shows that the individual features yielded a lower C td -index than the combination of both deep learning and radiomics features. Interestingly, the model with preoperative input features outperformed the model with postoperative input features. This might be due to the lack of texture information in the ablation area of postoperative MRI compared with the textured tumor area of preoperative MRI. Few studies have evaluated complete immunological markers in patients after ablation surgery. We explored the variety of immunological markers before and after multimodal ablation surgery and put them into the proposed predictive framework as immune features. Although the collection of postoperative information, including immune features for different periods and postoperative MRI radiomics features, was difficult, the combination of these features provided promising performance. Our previous work used a simple feed-forward neural network with only one fully connected layer for prediction. Although considerable results were obtained, there was still room for improvement. We verified state-of-the-art predictive models based on deep learning and improved the model with the best predictive capability, DCM. Since the features we extracted were divided into three groups, we hypothesized that a model with stronger predictive ability could be obtained by first extracting the nonlinear relationship within each group of features and then globally extracting the correlation between all the features and the output. The results showed that our hypothesis was valid and that the addition of an adaptive grouped fully connected layer outperformed the fitting ability of DCM.
Despite the encouraging results, our method still has limitations. First, our research was based on a retrospective study with a small sample size; however, comprehensive preoperative and postoperative imaging information and immunological information were invaluable. In the future, a larger cohort of patients enrolled from multiple centers will enhance the generalization ability of the predictive model. Second, we used the ViT model pre-trained on a natural image dataset and fine-tuned it on our dataset for classifying long or short PFS. Due to the limited dataset, we could not train the model from scratch and relied on transfer learning instead. A dedicated model with sufficient data for accurate tumor progression prediction will be trained in our future work. In addition, our intraoperative CT data had metal artifacts, so the extraction of radiomics features of the tumor area was meaningless. For this reason, we only employed MR image features in this experiment. In subsequent experiments, we will extract multiple sequences of CT images as completely as possible, including preoperative CT images, and calculate CT features through radiomics.

V. CONCLUSION
In conclusion, we combined deep learning and immune features with radiomics features and extracted deep learning features using the recently popular ViT structure. The combination of features offered a substantial improvement in prognosis prediction for multimodal tumor ablation surgery. Through experiments, we found that immune features played an important part in predicting tumor progression, which meant that postoperative monitoring of immune indicators was essential. We demonstrated the benefits of our RSF_SA-DCM prediction framework by comparing its performance with other combinations of feature selectors and state-of-the-art survival prediction models; it achieved the highest C td -index of 0.885±0.040. Moreover, DCM with the proposed self-adapted architecture, which enhanced the intergroup association, outperformed the network without it. Compared with our previous work, our RSF_SA-DCM framework had a 3% improvement in the C td -index, indicating that the feature combination and modified predictive model had a stronger survival analysis ability. Through the predicted individual survival probability, clinicians can monitor the patient's postoperative status promptly and make necessary interventions at the right time point. In our future work, we will expand our multimodal dataset to fit a more accurate predictive model and provide clinicians with more reliable auxiliary judgments.

Fig. 1. Flowchart of our proposed survival analysis method for the multimode ablation treatment. Radiomics, deep learning, and immune features were extracted and selected by random survival forest. SA-DCM is used for tumor progression prediction. The features x are input to the self-adapted fully connected layer for different types of features in the second layer with different gray levels. The dark gray module in the third layer represents the global fully connected layer. The linear functions f() and g() act on the output feature representation x. Overall survival probability and individual survival probability were obtained after the framework.

Fig. 2. Proposed SA-DCM pipeline. The input features x consist of radiomics, deep learning, and immune features. For each group of features, we used the self-adapted fully connected layer and concatenated the outputs for the global fully connected layer. Gradient gray modules represent the global fully connected layer, to distinguish it from the self-adapted fully connected layer. The linear functions f() and g() act on the output feature representation x and yield the survival function S_{Z_k}(t) for each cluster Z ∈ {1, 2, ..., K} and the mixture probabilities P(Z|X). The cluster-specific individual survival functions are combined with the mixture probabilities, and the mean value is our final individual survival prediction.

Fig. 4. Effectiveness comparison of self-adapted fully connected layer between DCM and SA-DCM grouped by different input combinations.

Fig. 5. Running times of different survival prediction models, including CPH, RSF, and deep learning-based methods. SA-DCM is faster than other deep learning-based approaches.

Fig. 6. Kaplan-Meier curve of overall survival probability grouped into high- and low-risk tumor recurrence by risk score. The p-value between the two groups is 0.00057. The red line shows the high-risk tumor recurrence group, and the orange line shows the low-risk group.

Fig. 7. Individual survival probability line charts of two groups of high- and low-risk tumor recurrence patients. Yellow lines indicate high-risk patients. Purple lines indicate low-risk patients.

Fig. 8. Score-CAM visualization of deep learning features extracted from MedicalNet (3D-ResNet50), EfficientNetB7, and ViT-L/16. Case 1 and Case 2 are samples of preoperative MRIs. Case 3 is a sample of postoperative MRI. The first row shows the original tumor images. The second, third, and fourth rows visualize the attention regions of the deep learning networks for classification of long/short OS.

Fig. 9. Feature importance ranking (left) and proportion (right) from the RSF feature selection method. Blue indicates immune features, yellow indicates radiomics features, and pink indicates deep learning features.

TABLE I MULTIMODE ABLATION PATIENT DEMOGRAPHICS

TABLE II PREDICTIVE PERFORMANCE OF DIFFERENT TYPES OF MODELS IN IDENTIFICATION OF LONG/SHORT OS ON PATIENTS IN THE PREOPERATIVE MRI TRAINING COHORT WITH 10-FOLD CROSS VALIDATION

TABLE VI EFFECTIVENESS OF SELF-ADAPTED FULLY CONNECTED LAYER