Feature Analysis of Prognostic Factors for the Radiation Toxicity Prediction of Lung Cancer Using Explainable Outcome Representation Method

Background Using a black box model alone for radiation toxicity prediction in patients with lung cancer has limitations in explaining the causality of the prediction results. Therefore, the feature importance of predictors was analyzed using explainable artificial intelligence. The clinical prognosis was predicted through SHAP (Shapley additive explanations) analysis, using pneumonia, interstitial lung disease, chronic obstructive pulmonary disease, concurrent chemoradiation therapy, age, and dosimetric factors [lung volume receiving ≥20 Gy (V20), mean lung dose (MLD)] as prognostic factors in 110 lung cancer patients who received radiation therapy. The model was built with a random forest regressor and analyzed with a tree explainer, and the SHAP analysis was used to examine the features of prognostic factors affecting radiation side effects and to derive their mutual impact. V20 and V30 were analyzed as high-risk factors in predicting radiation side effects. The accuracy, sensitivity, and specificity of the model system were 0.88, 0.79, and 0.78, respectively. Through this study, MLD and V30 were analyzed as important predictors of side effects, and the importance of each factor was quantified by its SHAP value. To predict radiation pneumonia using this method, a personalized analysis was conducted to identify the factors that influenced each patient. A comparison with the existing black box method confirmed that increasing the explainability can reinforce an in-depth analysis of radiation side effect prediction.


Introduction
Lung cancer radiotherapy is given after a curative surgical operation for non-small cell lung cancer in stages 1 to 3, or alongside chemotherapy as a curative treatment for small cell lung cancer at the limited stage. In recurrent cancer or metastasis, lung cancer radiation therapy is used as a palliative therapy to relieve symptoms. In general, radiation therapy can cause symptoms such as dermatitis, alopecia in and around the irradiated area, fatigue, and a loss of appetite. Radiation therapy for lung cancer likewise has side effects, the characteristic ones being radiation esophagitis and radiation pneumonitis. Radiation pneumonitis is an inflammation of the irradiated lung, which presents with symptoms such as dry cough or shortness of breath. It sometimes leads to lung damage and fibrosis, resulting in persistent symptoms of severe breathing difficulties. The side effects for lung cancer patients are graded from grade 0 to grade 5, in which mild radiation pneumonitis (RP) corresponds to grades 1 or 2, and severe RP corresponds to grades 3, 4, or 5 (modified RTOG/EORTC pulmonary toxicity grading scale) (1)(2)(3).
The outcome of radiation therapy is predicted as part of a personalized treatment plan prepared before and after radiation therapy for cancer patients, which includes information regarding specific doses and the number of sessions, and the treatment results are followed up after the patient's treatment. By predicting the patient's prognosis, the suitability of the treatment plan to be implemented is evaluated. For this purpose, treatment plans are assessed using various indicators, such as the pathological characteristics of the tumor, metastasis of the cancer, the dose to organs at risk (OAR) relative to the prescribed dose, tumor target homogeneity, treatment response, toxicity, and survival rate. In particular, radiation side effects are one of the important outcome indicators alongside the response and survival rate. Such side effects are classified as acute or chronic and are subdivided into side effect grades for patient evaluation.
Which factors influence a patient's radiotherapy outcome (such as side effects, responsiveness, and survival rate), and to what degree and in what way they do so, remain among the most controversial research areas (4)(5)(6)(7). Studies have predicted the outcome by training artificial intelligence models on data sets of patients who received radiation therapy, using as predictors the dosimetric and non-dosimetric factors that are considered major factors in the radiation treatment planning stage (Table 1). From the standpoint of establishing a radiation treatment plan, it is difficult to use predictors other than physical characteristics, dose, and volume factors based on the patient's imaging (computed tomography [CT] and magnetic resonance imaging [MRI]). Therefore, factors that reflect the patient's physiological and pathological characteristics are also being added. However, machine learning prediction models are a type of black box model, and existing studies have limitations in proving that particular predictors contributed to the results, regardless of which factors were used (8,9).
Likewise, a weight analysis of predictors based on the correlation between variables has limitations in providing an in-depth interpretation of the prediction results and of the artificial intelligence model used. In other words, a more reliable prediction model can be constructed if the basis for the factors affecting a given prediction result can be presented schematically and intuitively. To this end, studies using explainable artificial intelligence are being conducted (8,9). Explainable artificial intelligence increases the reliability of a model by inferring the factors that contribute to the prediction result, in addition to providing predictability and prediction accuracy. For example, in predicting a flu diagnosis from the predictors headache, sneeze, weight, and no fatigue, the headache and sneeze make a positive contribution to the prediction, whereas the weight and no fatigue make a negative contribution. Such methods provide a more accurate explanation than the interpretation of an existing AI model, which can only state that all four predictors contributed to the outcome (8).
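The flu example can be written as an additive attribution, which is the general form shared by SHAP-style explanations. In this sketch the symbols are assumptions introduced for illustration: f(x) is the model output for one patient, \phi_0 is a base value (the expected output without any predictor information), and \phi_i is the signed contribution of predictor i among M = 4 predictors.

```latex
f(x) = \phi_0 + \sum_{i=1}^{M} \phi_i ,
\qquad
\phi_{\text{headache}} > 0,\quad
\phi_{\text{sneeze}} > 0,\quad
\phi_{\text{weight}} < 0,\quad
\phi_{\text{no fatigue}} < 0 .
```

Read this way, an explanation reports not only that the four predictors were used, but also the direction and size of each one's contribution to the individual prediction.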
In this study, analyses were performed using an explainable artificial intelligence model to predict the radiation side effects and provide a basis for the prediction. By analyzing the specific factors that affect the prediction of side effects for a specific patient, this study intends to demonstrate higher reliability of the prediction results compared to the existing model.

Materials and Methods
To predict radiation side effects and analyze the features of predictors, cancer registry data of 110 patients, outlining the results of radiation pneumonia follow-up after radiotherapy for lung cancer, were used (IRB No. ED17317, Korea University Anam Hospital, South Korea). The median follow-up time for the patient group was 37 months, and the characteristics of the patients are shown in Table 2. The patients had received radiation therapy for lung cancer; the average total dose was 63 Gy, the average fractional dose was 3.3 Gy, and the average number of fractions was 28.67 (Table 1). Based on the data set, a random forest regressor model was constructed, and the features of the predictors were analyzed through a tree explainer. In addition, to apply an explainable analysis method for the prognosis, SHAP (Shapley additive explanations) analysis was performed to analyze the importance of each predictor for the outcome. The dependence and effect of each predictor were analyzed and the result was schematized (Fig. 1). Features (predictors) that affect radiation side effects in patients who received radiation therapy for lung cancer were selected (2, 3, 10-17). The choice of specific predictors may be controversial, but our research team extracted common or specific factors that are clinically noteworthy, and these prognostic factors were used as inputs for the learning model. For these inputs, the contribution of each predictor to the predictive model was calculated through SHAP analysis, which divides the prediction among the features of the prognostic factors so that the contributions of all predictors sum to the total effect on the outcome (Equation 1). Therefore, the relationship between each predictor and the predicted side effects, and the degree of influence of each factor, were expressed as SHAP values (18). A random forest is an ensemble model, a learning method that can be applied to solve classification or regression problems (19). 
It consists of a combination of tree predictors in which each tree depends on the values of a random vector sampled independently and with the same distribution for all trees (Fig. 2). To predict radiation side effects, 16 input values, including age, were used to create a predictive model for the side effect grade, which is the output value. Several predictive model trees served as the estimators, and the results extracted from the trees were combined. The training and test sets were separated, and the predicted values of the model were evaluated using the mean squared error (MSE) (Equation 2):

\mathrm{MSE} = \frac{1}{N} \sum_{i=1}^{N} (f_i - y_i)^2 \qquad (2)

In this equation, N is the number of data points, f_i is the output value from the model, and y_i is the actual value for data point i.
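The SHAP decomposition described above can be illustrated with a minimal sketch that computes exact Shapley values by enumerating every coalition of features. This is not the tree explainer used in the study; it is a brute-force illustration that is only feasible for a handful of features. The toy risk model, its coefficients, and the baseline and patient values are all hypothetical and chosen only to make the arithmetic easy to follow.

```python
from itertools import combinations
from math import factorial


def shapley_values(value_fn, n_features):
    """Exact Shapley values by enumerating all coalitions of the other features."""
    players = list(range(n_features))
    phi = [0.0] * n_features
    for i in players:
        others = [p for p in players if p != i]
        for size in range(len(others) + 1):
            # Standard Shapley weight |S|! (n - |S| - 1)! / n!
            weight = (factorial(size) * factorial(n_features - size - 1)
                      / factorial(n_features))
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature i when added to coalition s
                phi[i] += weight * (value_fn(s | {i}) - value_fn(s))
    return phi


# Hypothetical inputs: [MLD in Gy, V30 in %, binary risk flag]
BASELINE = [10.0, 5.0, 0.0]   # assumed reference values (e.g. a cohort average)
PATIENT = [20.0, 15.0, 1.0]   # assumed values for one patient


def toy_model(x):
    """Hypothetical risk score with one interaction term (not the study's model)."""
    mld, v30, flag = x
    return 0.05 * mld + 0.02 * v30 + 0.01 * mld * flag


def value_fn(coalition):
    """Model output when only the features in `coalition` take the patient's values."""
    x = [PATIENT[j] if j in coalition else BASELINE[j] for j in range(3)]
    return toy_model(x)


phi = shapley_values(value_fn, 3)
base_value = value_fn(frozenset())
prediction = value_fn(frozenset({0, 1, 2}))

for name, contribution in zip(["MLD", "V30", "risk flag"], phi):
    print(f"{name}: {contribution:+.3f}")
print(f"base value + contributions = {base_value + sum(phi):.3f}")
print(f"model prediction           = {prediction:.3f}")
```

The contributions sum to the difference between the patient's prediction and the base value, which is the additive property that lets a per-patient prediction be split among the predictors. Tree explainers obtain the same kind of values for tree ensembles without enumerating every coalition.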

Results
The following are the results of the feature contribution analysis using the predictors for the 110 patients who were assessed for radiation pneumonia after radiation therapy and follow-up sessions (Fig. 3). Among the predictors exerting an influence on radiation pneumonia, the following had the greatest relative influence, in order: mean lung dose (MLD, 26.94%), lung volume receiving ≥30 Gy (V30, 16.94%), pathology (9.31%), tumor location (8.17%), forced expiratory volume in one second (FEV1, 8.15%), and V20 (6.29%) (Fig. 3.A). When MLD>17 Gy and MLD<22 Gy (Frequency>42), the influence was also evident for grade 1 and 2 mild RP (RTOG MLD<20 Gy) (Fig. 3.B). In addition, 39 patients with squamous cell carcinoma were determined through pathology to have radiation side effects corresponding to grades 1, 2, and 3, and were also identified as patients who were affected by this factor (Fig. 3.D). FEV1 was identified as a factor that affected 18 patients with FEV1 < 2 liters (Fig. 3.E). However, V20 and V30 did not have a significant impact because the patients were treated within the range of the RTOG dose limiting factor (V20<35 Gy) (Fig. 3.C, G). The SHAP values for the predictors influencing the prediction of radiation side effects were represented by a summary plot (Fig. 4). The summary plot allows the feature importance and the feature effects to be identified at the same time. Each point in the summary plot depicts the Shapley value (observed value) for a feature, and the SHAP value for each predictor is indicated. The color indicates the value of the feature from low to high, and the distribution of the Shapley values per feature can be seen as the overlapping points stacked in the y-axis direction. In addition, the features are sorted according to their importance. 
The analysis revealed that the MLD value was associated with the predicted risk of radiation side effects, and that the grade risk of radiation side effects increased with higher MLD (Fig. 4.A). In the case of V30, however, 73 patients were treated with V30<3%. Therefore, the analysis reflected V30 only at low values, and V30 thus did not contribute to an increase in the risk. However, it was analyzed to have a high risk in patients with adenocarcinoma and squamous cell carcinoma. The summary plot indicates the relationship between the feature value and the effect on prediction, but the SHAP dependence plot must be checked to confirm the exact form of the relationship. Although MLD and pathology contributed to the prediction result by showing dependence on the SHAP value (Fig. 4.B), other factors were shown to exert influence independently (Fig. 4.C, D). Therefore, through this SHAP analysis, it was possible to confirm the causality between each predictor and the predictive model. The contributions of the predictors to the predicted results of radiation side effects are shown in Fig. 5. Patient A, aged 74, was followed up with zero radiation side effects, which was the ground truth, and the predicted value was 0.1. Although the tumor location (right) and age>70 years were risk factors in this instance, the analysis showed no radiation side effects owing to the low volume dose factors of V30=0.65% and MLD=8.52 Gy (Fig. 5.A). In other words, V30 and MLD had a negative impact on the prediction. In contrast, the follow-up results for patient D showed mild RP as the ground truth, and the predicted result was 1.6, as expected. MLD>20 Gy, pathology, V20, and V30 were analyzed as high-risk factors (Fig. 5.D). Unlike cases A, B, and C, whose SHAP scores were less than 1, case D yielded a result close to grade 2, which was close to the patient's follow-up results analyzed retrospectively. 
The accuracy of the system model was 0.88, and the MSE was calculated as 0.08 (Fig. 6). The sensitivity and specificity of the developed model system were 0.79 and 0.78, respectively.
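The metrics reported above can be computed as in the following minimal sketch. It assumes that accuracy, sensitivity, and specificity are evaluated on binary labels (for example, RP present versus absent after thresholding the predicted grade), which the text does not specify, and all labels and values below are hypothetical toy data, not the study's data.

```python
def binary_metrics(y_true, y_pred):
    """Accuracy, sensitivity, and specificity from binary labels (1 = side effect)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate
    }


def mean_squared_error(y_true, y_pred):
    """Average squared difference between model outputs and actual values."""
    return sum((f - y) ** 2 for f, y in zip(y_pred, y_true)) / len(y_true)


# Hypothetical toy labels
labels_true = [1, 1, 1, 0, 0, 0, 0, 1]
labels_pred = [1, 1, 0, 0, 0, 1, 0, 1]
metrics = binary_metrics(labels_true, labels_pred)

# Hypothetical toy grades: actual grade versus regressor output
grades_true = [0.0, 1.0, 2.0]
grades_pred = [0.1, 1.6, 2.0]
mse = mean_squared_error(grades_true, grades_pred)

print(metrics)
print(f"MSE = {mse:.4f}")
```

Sensitivity and specificity are reported separately from accuracy because, with imbalanced side effect grades, a model can reach high accuracy while missing many of the patients who develop RP.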

Discussion
Numerous artificial intelligence prediction models are being used to predict radiation side effects; this study adopted an analysis method that provides the basis for prediction results with a higher reliability. As a result, the characteristics of predictors contributing to prediction results were visualized, and the results were scored to make them objective. In addition, the predictors that contributed most significantly to the prediction results were identified by analyzing the features and dependence of each predictor. In this respect, this study differs from previous studies, which evaluated the performance of various deep learning or machine learning models in order to increase the accuracy of prediction. However, the contribution of each predictor to the prediction of radiation side effects was different for specific patients, which made it challenging to establish an absolute dependence (Fig. 4, 5). This is because the degree of influence of the factors used for prediction differs for each patient. It also signifies that the importance of each factor may vary with different data. As mentioned above, the contribution of each factor to the prediction results varies according to the predictors used. This can be addressed by grouping the predictors under specific conditions. In this study, the predictors were extracted from various literature and were used as input values of the model. However, because diagnosis and treatment in the existing clinical environment are based on empirical knowledge, using the predictors mentioned in this study may lead to some differences in predicting the side effects of radiation pneumonia.
We used SHAP as a way to interpret the black box model in this study. The first advantage of SHAP is that it can represent the factors contributing to a specific predictive outcome for each patient (Fig. 5). In other words, even if the integrated contributing factors are extracted as shown in Fig. 3, SHAP intuitively shows that the degree of influence of the contributing factors is different for each patient. Second, because the result is displayed quantitatively, the user interpreting the result can see how it was derived. A drawback, however, is that the result cannot be interpreted as the absolute factor that affected the patient. In fact, the authors could qualitatively agree with the contributions of the predictors that cause radiation side effects, but their agreement with the SHAP analysis results was not as high as the prediction accuracy (87.54%). Local interpretable model-agnostic explanations (LIME) is also used in artificial intelligence research for explaining machine learning models (20,21). LIME is suitable for predictive analysis as it can be used for any black box model, whether it is a deep neural network or an SVM. Additionally, LIME is one of the few interpretation technologies that works with tabular, text, and image data. However, the definition of the data of interest and of the neighboring data corresponding to the boundary value is very ambiguous, which means that a different kernel configuration must be attempted for most applications, and this can cause problems in interpretation accuracy.
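The sensitivity to kernel configuration can be seen in a minimal sketch of the locality weighting used by LIME-style methods, in which neighboring samples are weighted by their distance to the case being explained. The exponential kernel form, the distances, and the kernel widths below are assumptions chosen for illustration and are not taken from the study.

```python
import math


def exponential_kernel(distance, kernel_width):
    """Locality weight of a neighbor: 1 at distance 0, decaying with distance."""
    return math.sqrt(math.exp(-(distance ** 2) / kernel_width ** 2))


# Hypothetical distances from the explained patient to four perturbed samples
distances = [0.0, 0.5, 1.0, 2.0]

# Two hypothetical kernel widths: a narrow and a wide neighborhood
narrow = [exponential_kernel(d, kernel_width=0.5) for d in distances]
wide = [exponential_kernel(d, kernel_width=2.0) for d in distances]

for d, w_narrow, w_wide in zip(distances, narrow, wide):
    print(f"distance {d:.1f}: narrow weight {w_narrow:.3f}, wide weight {w_wide:.3f}")
```

With the narrow kernel, the sample at distance 2.0 is effectively ignored, whereas the wide kernel still gives it substantial weight. The local surrogate model is fitted to differently weighted data in the two cases, so the resulting explanation can change with the kernel width alone.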

Conclusion
Through this study, MLD and V30 were analyzed as predictors having an important influence on the prediction of side effects, and the features of each factor were analyzed by the importance of the SHAP value. In order to predict radiation pneumonia using the above-mentioned method, personalized analyses were conducted to identify the factors that influenced each patient. In comparison with the black box model, this process was able to enhance the explainability, and therefore, confirm that the in-depth analysis of the predictors of radiation side effects can be reinforced.