Preliminary Report on Computed Tomography Radiomics Features as Biomarkers to Immunotherapy Selection in Lung Adenocarcinoma Patients

Simple Summary The objective of the study was to assess the radiomics features obtained by computed tomography (CT) examination as biomarkers in order to select patients with lung adenocarcinoma who would benefit from immunotherapy. We demonstrated that specific radiomic features could be used to select patients with lung adenocarcinoma who would benefit from immunotherapy by predicting OS or PFS time. Abstract Purpose: To assess the efficacy of radiomics features obtained by computed tomography (CT) examination as biomarkers in order to select patients with lung adenocarcinoma who would benefit from immunotherapy. Methods: Seventy-four patients (median age 63 years, range 42–86 years) with histologically confirmed lung cancer who underwent immunotherapy as first- or second-line therapy and who had baseline CT studies were enrolled in this approved retrospective study. As a control group, we selected 50 patients (median age 66 years, range 36–86 years) from 2005 to 2013 with histologically confirmed lung adenocarcinoma who underwent chemotherapy alone or in combination with targeted therapy. A total of 573 radiomic metrics were extracted: 14 features based on Hounsfield unit values specific for lung CT images; 66 first-order profile features based on intensity values; 43 second-order profile features based on lesion shape; 393 third-order profile features; and 57 features with higher-order profiles. Univariate and multivariate statistical analysis with pattern recognition approaches and the least absolute shrinkage and selection operator (LASSO) method were used to assess the capability of extracted radiomics features to predict overall survival (OS) and progression free survival (PFS) time. Results: A total of 38 patients (median age 61; range 41–78 years) with confirmed lung adenocarcinoma and subjected to immunotherapy satisfied inclusion criteria, and 50 patients in a control group were included in the analysis The shift in the center of mass of the lesion due to image intensity was significant both to predict OS in patients subjected to immunotherapy and to predict PFS in patients subjected to immunotherapy and in patients in the control group. With univariate analysis, low diagnostic accuracy was reached to stratify patients based on OS and PFS time. Regarding multivariate analysis, considering the robust (two morphological features, three textural features and three higher-order statistical metrics) application of the LASSO approach and all patients, a support vector machine reached the best results for stratifying patients based on OS (area under curve (AUC) of 0.89 and accuracy of 81.6%). Alternatively, considering the robust predictors (six textural features and one higher-order statistical metric) and application of the LASSO approach including all patients, a decision tree reached the best results for stratifying patients based on PFS time (AUC of 0.96 and accuracy of 94.7%). Conclusions: Specific radiomic features could be used to select patients with lung adenocarcinoma who would benefit from immunotherapy because a subset of imaging radiomic features useful to predict OS or PFS time were different between the control group and the immunotherapy group.


Introduction
For men, lung cancer is the leading cause of morbidity and mortality among oncological diseases; for women, on the other hand, it is third in incidence and second in mortality [1,2]. As first-line therapy in advanced non-small cell lung cancer (NSCLC), immunotherapy is used both as a single treatment and as a treatment in combination with chemotherapy.
The efficacy of immunotherapy in NSCLC and its pathophysiology has made evident over time the new cellular mechanisms associated with the response to treatment and to intrinsic resistance [3]. Moreover, bioinformatics analyses are becoming increasingly sophisticated, allowing the analysis and integration of complex clinical and biological data to further understand the biology of cancer, notably of lung carcinoma [3][4][5][6].
It is necessary to consider that even in the context of recent clinical scientific progress, the belief in clinical evidence of new robust biomarkers that predict response, resistance and/or toxicity to treatment in clinical care practice remains idealistic. Consequently, there is an urgent need to develop efficient biomarkers that can select patients who would benefit from immunotherapy, thereby providing the appropriate treatment and avoiding toxicity [3,6].
The use of radiomics, as amply demonstrated by some studies, has been fundamental for predicting TNM and histological grade, response to therapy and survival in numerous oncological diseases [12][13][14].
It is inevitable to consider that by associating radiomic parameters with useful clinical and laboratory data, accurate and robust evidence-based clinical decision support systems (CDSS) could be established [15][16][17].
The radiogenomic approach (constituted by the combination of genomic data and radiomic metrics) [18,19] would allow the achievement of the most considerable level of precision medicine [20,21].
The primary endpoint of this study was to assess the efficacy of radiomics features obtained by computed tomography (CT) examination as biomarkers that could select patients with lung adenocarcinoma who would benefit from immunotherapy.

Patient Selection
The Local Ethics Committee of the National Cancer Institute of Naples, involving the National Cancer Institute of Naples Pascale Foundation and the Careggi University Hospital of Florence, with internal resolution no. 15 of 4 March 2019, approved a spontaneous multicenter retrospective study.
For the study, 74 patients (mean age 63 years, range 42-86 years) with histologically confirmed lung cancer who underwent immunotherapy (programmed cell death protein 1 (PD-1) and programmed death-ligand 1 (PD-L1) inhibitors) as first-or second-line therapy and a baseline CT study.
Because the study was performed in accordance with relevant guidelines and regulations, informed consent was not required by the Local Ethics Committee of the National Cancer Institute of Naples due to the retrospective nature of the study.
Inclusion and exclusion criteria are provided in Table 1.
As a control group, we selected 50 patients (median age 66 years, range 36-86 years) from 2005 to 2013 with histologically confirmed lung adenocarcinoma who underwent chemotherapy alone or combined with targeted therapy other than immunotherapy and who were subjected to a baseline CT study, including a CT venous phase protocol.

Inclusion Criteria Exclusion Criteria
Lung adenocarcinoma histologically confirmed Baseline CT study is not accessible Lung nodule size > 10 mm Tumor histology other than adenocarcinoma Immunotherapy ((PD-1)/programmed death-ligand 1 (PD-L1) inhibitors) as first-or second-line therapy CT examination within 1 month of immunotherapy CT protocol included venous phase (70-90 s post-contrast agent injection)

CT Protocol
Thanks to the use of 4 different scanners: General Electric Healthcare CT tomographs with 64 detectors (1 Optima 660 and 1 Discovery 750 HD, General Electric Healthcare, Milwaukee, Wisconsin, USA), 1 Philips CT scanner with 128 detectors (ICT SP 128 slice, Philips, Amsterdam, The Netherlands) and 1 Siemens CT scanner with 64 sections (Siemens Somatom Flash, Erlangen, Germany) it was possible to acquire computed tomography. Parameters of the CT scan data were already reported [22].

Radiological Assessment
Several radiologists with different levels of experience in reading and interpreting chest CT (low experience 5 years, average experience 5-15 years, and high experience ≥15 years) performed the radiological evaluations.
By selecting a single target lesion for each patient, the most visible lesion with the largest diameter was then analyzed.
Radiologists performed CT assessment using dedicated CT post-processing workstations and the HealthMyne ® software platform (www.healthmyne.com, accessed on 16 January 2020, HealthMyne, Madison, WI, USA). To reduce recall bias, all 3 readers maintained a gap of more than 2 weeks between the 2 sessions.

CT Post-Processing with Radiomic Precision Metrics (RPM™) Tool
In this study we used the HealthMyne ® platform for lesion segmentation and for radiomic features extraction from the delineated volumes of interest (VOIs). By means of the RPM™ algorithms, it was possible to semi-automatically recognize and segment the volume of the target lesions identified by the radiologist and automatically extract a wide range of quantitative data. The user initialized the lesion segmentation by drawing a long axis on a plane of the multiplanar reconstruction (MPR) ( Figure 1A). A 2D segmentation updated in real-time with interactive feedback of the lesion boundary [23,24] and 2D segmentations on the other MPR planes were immediately proposed. When the contour on a MPR plane was unsatisfactory, the user could update the segmentation by either drawing long axes on the other MPR views or using a 2D brush tool ( Figure 1B). When the segmentation was satisfactory, the user could confirm to initiate the 3D segmentation computation. Based on these initial user interactions, the RPM™ algorithms combined statistical sampling methods together with deep learning strategies in order to delineate the target volume and provide an automatic 3D segmentation ( Figure 1C). The 3D segmentation occurred quickly (approximate time = 1-2 s), and could be reviewed by scrolling through slices on the MPR views. Interactive editing tools including 2D and 3D brushes could be used to reduce/enlarge or add details to the proposed volume segmentation. As the 3D segmentation was confirmed by the user, the measure of the long and short lesion axes was automatically determined by leveraging the volume delineation ( Figure 2). the RPM™ algorithms, it was possible to semi-automatically recognize and segment the volume of the target lesions identified by the radiologist and automatically extract a wide range of quantitative data. The user initialized the lesion segmentation by drawing a long axis on a plane of the multiplanar reconstruction (MPR) ( Figure 1A). A 2D segmentation updated in real-time with interactive feedback of the lesion boundary [23,24] and 2D segmentations on the other MPR planes were immediately proposed. When the contour on a MPR plane was unsatisfactory, the user could update the segmentation by either drawing long axes on the other MPR views or using a 2D brush tool ( Figure 1B). When the segmentation was satisfactory, the user could confirm to initiate the 3D segmentation computation. Based on these initial user interactions, the RPM™ algorithms combined statistical sampling methods together with deep learning strategies in order to delineate the target volume and provide an automatic 3D segmentation ( Figure 1C). The 3D segmentation occurred quickly (approximate time = 1-2 s), and could be reviewed by scrolling through slices on the MPR views. Interactive editing tools including 2D and 3D brushes could be used to reduce/enlarge or add details to the proposed volume segmentation. As the 3D segmentation was confirmed by the user, the measure of the long and short lesion axes was automatically determined by leveraging the volume delineation ( Figure 2).   On the left, the CT image with the lesion segmentation (light blue contour) and the longest diameters measured on the lesion volume. The blue lines represent the longest long axes and the green lines represent the longest short axes on the axial direction. On the right, it is possible to observe the 3D rendering of the lesion volume and its location inside the automatic lung segmentation.
A total of 573 radiomic metrics were extracted from the delineated VOIs as previously reported in [24]: 14 features based on Hounsfield unit (HU) values specific for lung CT images; 66 first-order profile features based on intensity values (statistical distribution of image value); 43 second-order profile features based on lesion shape (geometric analysis of shape, volume, curvature and volumetric length); 393 third-order profile features, i.e., texture features, with IBSI-consistent implementation [25] of the grey-level co-occurrence matrix (GLCM), the grey-level distance zone matrix (GLDZM), the grey-level run length matrix (GLRLM), the grey-level size zone matrix (GLSZM), the neighboring grey-level dependence matrix (NGLDM), the neighboring grey-tone difference matrix (NGTDM) and the different features' aggregation methods, as well as 57 features with higher-order profiles (statistical metrics after transformations and wavelet analysis).

Univariate Analysis
Overall survival (OS) was defined as the time between the date of first dose of therapy and the date of death or date of last clinical follow-up. Similarly, progression-free survival (PFS) was measured from the date of first dose of therapy to the time of tumor progression, recurrence, death or the time the patient was last known to be alive. The estimate of overall survival and progression-free survival was calculated with Kaplan-Meier analysis.
For each metric, median and range values were calculated. The calculation of inter-observer variability between readers by intraclass correlation coefficient (ICC) and the evaluation of unstable features were performed.
Cox proportional hazard models were used for exploring univariate associations between OS and each stable imaging feature (identified as ICC value ≥0.8) and between PFS A total of 573 radiomic metrics were extracted from the delineated VOIs as previously reported in [24]: 14 features based on Hounsfield unit (HU) values specific for lung CT images; 66 first-order profile features based on intensity values (statistical distribution of image value); 43 second-order profile features based on lesion shape (geometric analysis of shape, volume, curvature and volumetric length); 393 third-order profile features, i.e., texture features, with IBSI-consistent implementation [25] of the grey-level co-occurrence matrix (GLCM), the grey-level distance zone matrix (GLDZM), the grey-level run length matrix (GLRLM), the grey-level size zone matrix (GLSZM), the neighboring grey-level dependence matrix (NGLDM), the neighboring grey-tone difference matrix (NGTDM) and the different features' aggregation methods, as well as 57 features with higher-order profiles (statistical metrics after transformations and wavelet analysis).

Univariate Analysis
Overall survival (OS) was defined as the time between the date of first dose of therapy and the date of death or date of last clinical follow-up. Similarly, progression-free survival (PFS) was measured from the date of first dose of therapy to the time of tumor progression, recurrence, death or the time the patient was last known to be alive. The estimate of overall survival and progression-free survival was calculated with Kaplan-Meier analysis.
For each metric, median and range values were calculated. The calculation of inter-observer variability between readers by intraclass correlation coefficient (ICC) and the evaluation of unstable features were performed.
Cox proportional hazard models were used for exploring univariate associations between OS and each stable imaging feature (identified as ICC value ≥0.8) and between PFS and each stable imaging feature (identified as ICC value ≥0.8). The evaluation between the survival rate and the variables was done using a technique called Cox regression analysis.
The risk measure provided for each variable was the risk ratio (RR): a RR of 1 means that the risk is the same for each participant; a RR >1 indicates higher risk; a RR <1 indicates lower risk.
A Wilcoxon-Mann-Whitney U test was performed to identify differences among imaging radiomic metrics of two groups (immunotherapy group and control group). A non-parametric Kruskal-Wallis test was performed to identify the significant features for stratifying the patients into two groups based on median cutoff of survival time (i.e., OS = 32 months and PFS = 10 months) corresponding to short (i.e., <median survival time) or long survival time (i.e., ≥median survival time).
Receiver operating characteristic (ROC) analysis was performed. The Youden index was used to individuate the optimal cutoff value for each feature and area under the ROC curve (AUC), sensitivity (SENS), specificity (SPEC), positive predictive value (PPV), negative predictive value (NPV) and accuracy (ACC) were obtained, considering the optimal cutoff value.
The statistical analyses were performed using the Statistics Toolbox of MATLAB R2007a (MathWorks, Natick, MA, USA).

Multivariate Analysis
For multivariate analysis, we considered all stable significant features of univariate analyses as inputs for a classifier model. Pattern recognition methods (linear discrimination analysis (LDA), support vector machine (SVM), k-nearest neighbor (KNN), artificial neural network (ANN) and decision tree (DT)) were considered to assess the survival prediction ability [26]. The best model was chosen considering the highest area under the ROC curve and highest accuracy. Moreover, the analysis was made before and after a feature selection method: the robust features were selected by the least absolute shrinkage and selection operator (LASSO) method [27,28]. In the LASSO method, 10-fold cross-validation was used to select the optimal regularization parameter alpha, considering that the average of each patient's mean square error was the smallest. With the optimal alpha, features with a nonzero coefficient in LASSO were reserved. Feature selection was carried out considering the λ value with the minimum mean square error (minMSE) [29,30].
A 10-k-fold cross validation approach was used to individuate the best classifier on the training set; therefore, median and 95% confidence interval values of AUC, accuracy, sensitivity, and specificity were calculated.
Multivariate analysis was performed using the statistics and Machine Learning Toolbox of MATLAB R2007a (MathWorks, Natick, MA, USA).

Results
Thirty-eight patients (median age 61; range 41-78 years) with confirmed lung adenocarcinoma and subjected to immunotherapy satisfied the inclusion criteria. We excluded: (a) 19 patients since the histological diagnosis was other than adenocarcinoma, (b) 17 patients since the baseline CT studies were not performed with contrast media.

Univariate Analysis Results
The Kruskal-Wallis test did not detect statistically significant differences in OS ( Figure 3a) and PFS values (Figure 4a) among the two groups (immunotherapy group and control group), demonstrating the homogeneity among the two patient groups. Kaplan-Meier curves of OS and PFS are shown, respectively, in Figure 3b,c for the immunotherapy and control group and in Figure 4b,c for the immunotherapy and control group. The median value of OS for the immunotherapy group was equal to 32 months (range 2-72 months), while the median value of OS in the control group was 28 months (range 6-162 months).
The median value of PFS for the immunotherapy group was equal to 12 months (range 1-60 months), while median value of PFS in the control group was 10 months (range 3-162 months).
1-60 months), while median value of PFS in the control group was 10 months (range 3-162 months).  Stable features (intraclass correlation coefficient value ≥ 0.8) were 121 among 573 calculated (see Appendix A for the description of each stable feature): 5 lung CT features, 26 morphological features, 1 feature based on intensity values, 76 texture features and 13 higher-order statistical features. The median value of intraclass correlation coefficients for stable features was 0.9 (range 0.85-0.96). Median size of lesions was 3 cm, with range of 1.0-12 cm. The size of the lesion did not affect the stable metrics (p value > 0.05 at the Wilcoxon-Mann-Whitney U test performed between the groups obtained by dividing patients with lesions < 3 cm and patients with lesions ≥ 3 cm).
Using Cox proportional hazard models, we found significant radiomic features to predict OS and PFS time in both groups (see Tables 2 and 3): exclusively textual features including higher-order statistical metrics were significant in the Cox proportional hazard model. No metrics in the control group had a risk ratio > 1 to predict OS, while only one textural metric in immunotherapy group had a risk ratio > 1 to predict OS (Table 2): the grey-level nonuniformity as volume, with full merging by grey-level size zone matrix (GLSZM_IBSI_GL_NONUNIF_3D_HU GLSZM).
Several radiomic textural metrics in the immunotherapy group had a risk ratio >1 to predict PFS, while only one textural metric in the control group had a risk ratio >1 to predict PFS (Table 3): the NGLDM GL nonuniformity by slice, with merging by slice by neighboring grey-level dependence matrix (NGLDM_IBSI_GLNONUNIF_2DV_HU).    Stable features (intraclass correlation coefficient value ≥ 0.8) were 121 among 573 calculated (see Appendix A for the description of each stable feature): 5 lung CT features, 26 morphological features, 1 feature based on intensity values, 76 texture features and 13 higher-order statistical features. The median value of intraclass correlation coefficients for stable features was 0.9 (range 0.85-0.96). Median size of lesions was 3 cm, with range of 1.0-12 cm. The size of the lesion did not affect the stable metrics (p value > 0.05 at the Wilcoxon-Mann-Whitney U test performed between the groups obtained by dividing patients with lesions < 3 cm and patients with lesions ≥ 3 cm).
Using Cox proportional hazard models, we found significant radiomic features to predict OS and PFS time in both groups (see Tables 2 and 3): exclusively textual features including higher-order statistical metrics were significant in the Cox proportional hazard model. No metrics in the control group had a risk ratio > 1 to predict OS, while only one textural metric in immunotherapy group had a risk ratio > 1 to predict OS (Table 2): the grey-level nonuniformity as volume, with full merging by grey-level size zone matrix (GLSZM_IBSI_GL_NONUNIF_3D_HU GLSZM).
Several radiomic textural metrics in the immunotherapy group had a risk ratio >1 to predict PFS, while only one textural metric in the control group had a risk ratio >1 to predict PFS (Table 3): the NGLDM GL nonuniformity by slice, with merging by slice by neighboring grey-level dependence matrix (NGLDM_IBSI_GLNONUNIF_2DV_HU). Stable features (intraclass correlation coefficient value ≥ 0.8) were 121 among 573 calculated (see Appendix A for the description of each stable feature): 5 lung CT features, 26 morphological features, 1 feature based on intensity values, 76 texture features and 13 higher-order statistical features. The median value of intraclass correlation coefficients for stable features was 0.9 (range 0.85-0.96). Median size of lesions was 3 cm, with range of 1.0-12 cm. The size of the lesion did not affect the stable metrics (p value > 0.05 at the Wilcoxon-Mann-Whitney U test performed between the groups obtained by dividing patients with lesions < 3 cm and patients with lesions ≥ 3 cm).
Using Cox proportional hazard models, we found significant radiomic features to predict OS and PFS time in both groups (see Tables 2 and 3): exclusively textual features including higher-order statistical metrics were significant in the Cox proportional hazard model. No metrics in the control group had a risk ratio > 1 to predict OS, while only one textural metric in immunotherapy group had a risk ratio > 1 to predict OS (Table 2): the grey-level nonuniformity as volume, with full merging by grey-level size zone matrix (GLSZM_IBSI_GL_NONUNIF_3D_HU GLSZM).
Several radiomic textural metrics in the immunotherapy group had a risk ratio >1 to predict PFS, while only one textural metric in the control group had a risk ratio >1 to predict PFS (Table 3): the NGLDM GL nonuniformity by slice, with merging by slice by neighboring grey-level dependence matrix (NGLDM_IBSI_GLNONUNIF_2DV_HU). With regard to ROC analysis, we considered only the most important features. Table 4 reports the subset of significant features from Kruskal-Wallis tests for stratifying the patients into two groups based on median cutoff of OS time (short and long OS time). A total of 19 features were significant (3 morphological features, 1 feature based on intensity value, 12 textural metrics and 3 higher-order statistical metrics) to predict overall survival time. Among these 19, the best feature for stratifying the patients with short or long OS time was a higher-order statistical metric: the mean value of 2D Laplacian of Gaussian transformed voxels at 2.5 mm of smoothing (LOG_2D_MEAN_2_5MM_HU) with an AUC of 66.0%, a sensitivity of 69.0% and a specificity of 65.0%. Table 5 reports the subset of significant features from Kruskal-Wallis tests for stratifying the patients into two groups based on median cutoff of PFS time (short and long PFS time). A total of 104 features (5 lung CT features, 23 morphological features, 1 feature based on intensity value, 64 textural features and 11 higher-order statistical metrics) were significant in predicting PFS time. Among these 104, the best feature for stratifying the patients based on PFS time was a textural feature: the average energy of gray-level co-occurrence matrix (GLCM_ENERGY) with an AUC of 70.0%, a sensitivity of 73.0% and a specificity of 64.0%.
The shift in the center of mass of the lesion due to image intensity (SHIFT_CENTE-R_OF_MASS_MM) was significant for predicting OS in patients subjected to immunotherapy and also for predicting PFS in both groups (patients subjected to immunotherapy and patients in the control group).

Multivariate Analysis Results
Regarding multivariate analyses, only the most useful results considering the purposes of this study are reported. Using all stable significant features, no tested classifier reached higher accuracy than a single radiomics feature for stratifying patients based on OS and PFS time (short or long survival time).
Considering the robust predictors by the LASSO approach and all patients, an SVM ( Figure 5) reached the best results for stratifying patients based on OS time, with an AUC of 0.93 (0.85-0.96 95% confidence interval (CI)), an accuracy of 84.1% (80-86% 95% CI), a sensitivity of 74.4% (69-78% 95% CI) and a specificity of 93.3% (88-95% 95% CI). The robust predictors as input to the SVM totaled seven, including two morphological features, two textural features and three higher-order statistical metrics: greatest planar axis; volume fraction of the approximate enclosing ellipsoid occupied by the ROI (VOLUME_DENSITY_AEE); two features by GLCM cluster prominence for grey-leveled image from IBSI by slice (GLCM_IBSI_CLUSTERPROMINENCE_2DS_HU and complexity from averaging metrics by neighborhood gray-tone difference matrix (NGTDM_COMPLEXITY_2DF_HU)); median value of voxels under wavelet transforms with filters HHL (WAVELET_HHL_MEDIAN_HU); minimum value of voxels under wavelet transforms with filters (HHL WAVELET_HHL_MI-N_HU); and the mean value of 2D Laplacian of Gaussian transformed voxels at 2.5 mm of smoothing (LOG_2D_MEAN_2_5MM_HU). The SVM classifier in the subset of patients treated with immunotherapy reached an AUC of 0.89, an accuracy of 81.6%, a sensitivity of 82.4% and a specificity of 81.0%. Table 4. Median and range values for textural features significant for stratifying the patients into two groups based on median cutoff of OS time (short and long time). Diagnostic performance is also reported for each significant feature, considering the optimal cutoff value obtained by ROC analysis.      Conversely, considering the robust predictors by the LASSO approach and all patients, a decision tree ( Figure 6) reached the best results for stratifying patients based on PFS time with an AUC of 0.96 (0.895-1.0 95% confidence interval (CI)), an accuracy of 93.2% (88-96% 95% CI), a sensitivity of 91.1% (87-94% 95% CI) and a specificity of 5.3% (90-99% 95% CI). The robust predictors as inputs of SVM totaled seven (six textural features and one higher-order statistical metric): Grey-level variance by neighborhood grey-level difference matrix (NGLDM_IBSI_DE-P_VARIANCE_2DF_HU); average correlations of GLCM (GLCM_IBSI_CORRELLATION_2D-S_HU); grey-level non-uniformity in three dimensions (GLDZM_IBSI_ZONE_DISTANCE_NO-NUNIFORMITY_3D_HU); entropy value in three dimensions by neighborhood grey-level difference matrix (NGLDM_IBSI_DEP_ENTROPY_3D_HU); high grey-level run emphasis in three dimensions by neighborhood grey-level difference matrix (NGLDM_IBSI_HIGH_D- Conversely, considering the robust predictors by the LASSO approach and all patients, a decision tree ( Figure 6) reached the best results for stratifying patients based on PFS time with an AUC of 0.96 (0.895-1.0 95% confidence interval (CI)), an accuracy of 93.2% (88-96% 95% CI), a sensitivity of 91.1% (87-94% 95% CI) and a specificity of 5.3% (90-99% 95% CI). The robust predictors as inputs of SVM totaled seven (six textural features and one higher-order statistical metric): Conversely, considering the robust predictors by the LASSO approach and all patients, a decision tree ( Figure 6) reached the best results for stratifying patients based on PFS time with an AUC of 0.96 (0.895-1.0 95% confidence interval (CI)), an accuracy of 93.2% (88-96% 95% CI), a sensitivity of 91.1% (87-94% 95% CI) and a specificity of 5.3% (90-99% 95% CI). The robust predictors as inputs of SVM totaled seven (six textural features and Grey-level variance by neighborhood grey-level difference matrix (NGLDM_IBSI_DE-P_VARIANCE_2DF_HU); average correlations of GLCM (GLCM_IBSI_CORRELLATION_2D-S_HU); grey-level non-uniformity in three dimensions (GLDZM_IBSI_ZONE_DISTANCE_NO-NUNIFORMITY_3D_HU); entropy value in three dimensions by neighborhood grey-level difference matrix (NGLDM_IBSI_DEP_ENTROPY_3D_HU); high grey-level run emphasis in three dimensions by neighborhood grey-level difference matrix (NGLDM_IBSI_HIGH_D- Grey-level variance by neighborhood grey-level difference matrix (NGLDM_IBSI_DE-P_VARIANCE_2DF_HU); average correlations of GLCM (GLCM_IBSI_CORRELLATION_2D-S_HU); grey-level non-uniformity in three dimensions (GLDZM_IBSI_ZONE_DISTANCE_NO-NUNIFORMITY_3D_HU); entropy value in three dimensions by neighborhood grey-level difference matrix (NGLDM_IBSI_DEP_ENTROPY_3D_HU); high grey-level run emphasis in three dimensions by neighborhood grey-level difference matrix (NGLDM_IBSI_HIGH_D-EP_LOW_GL_EMPH_3D_HU); strength value by slice with full merging by neighborhood grey-level difference matrix (NGTDM_STRENGTH_2DV_HU); the mean value of 2D Laplacian of Gaussian transformed voxels at 2.5 mm of smoothing (LOG_2D_MEAN_2_5MM_HU). The decision tree classifier in the subset of patients treated with immunotherapy reached an AUC of 0.96, an accuracy of 94.7%, a sensitivity of 87.5% and a specificity of 100.0%.

Discussion
Immunotherapy, capable of stimulating the cellular immune response against cancer, uses immune checkpoint blockade (ICI), a treatment paradigm in advanced cancer treatment.
The two main groups of agents used almost exclusively in tumors [31,32] are programmed cell death protein 1 (PD-1) and programmed death-ligand 1 (PD-L1) inhibitors; some authors analyzed the results of these ligands in small cell lung cancer [33][34][35]. Despite this, resistance to primary therapy does not allow all patients to benefit from the treatment regimen.
Considering this, it is therefore necessary to identify biomarkers that allow the appropriate selection of patients and the right stratification. Radiomics can therefore effectively support precision medicine decisions by identifying imaging biomarkers. Indeed, radiomics consists in the extraction of many quantitative characteristics through medical images [36]. This quantitative analysis, considering the heterogeneity of the macroscopic features based on the image [37], can identify the overall tumor.
The division into segments is not an immediate step in the whole of the radiomic process because the subsequent extraction of the characteristics is obtained from the segmented VOI. Lately, plot analysis has broadened its application to medical applications [37]. The quantification of grayscale patterns and pixel interrelationships that provide a measure of heterogeneity is what is called texture analysis.
In this study we evaluated 573 radiomic features; among them 121 were stable: 5 lung CT features, 26 morphological features, 1 feature based on intensity values, 76 texture features and 13 higher-order statistical features. Considering Cox proportional hazard models, textual features including higher-order statistic metrics were exclusively significant.
Considering Kruskal-Wallis tests, 19 radiomic features (3 morphological features, 1 feature based on intensity value, 12 textural metrics and 3 higher-order statistical metrics) were significant for predicting overall survival time. The best feature for stratifying the patients with short or long OS time was a higher-order statistical metric: the mean value of 2D Laplacian of Gaussian transformed voxels at 2.5 mm of smoothing (LOG_2D_MEAN_2_5MM_HU), with an AUC of 66.0%, a sensitivity of 69.0% and a specificity of 65.0%.
Considering Kruskal-Wallis tests, 108 radiomic features (5 lung CT features, 23 morphological features, 1 feature based on intensity value, 64 textural features and 11 higherorder statistical metrics) were significant for predicting PFS time. The best feature for stratifying the patients based on PFS time was a textural feature: GLCM ENERGY, with an AUC of 70.0%, a sensitivity of 73.0% and a specificity of 64.0%.
However, the subset of imaging radiomic features for predicting OS or PFS time was different in the control group and immunotherapy group; this demonstrated that specific radiomic features could be used to select patients with lung adenocarcinoma who would benefit from immunotherapy.
Exclusively, the shift in the center of mass of the lesion due to image intensity (SHIFT_CENTER_OF_MASS_MM) was significant both for predicting OS in patients subjected to immunotherapy and for predicting PFS in patients subjected to immunotherapy and in patients in the control group.
However, in univariate analysis, low diagnostic accuracy was reached for stratifying patients based on OS and PFS time.
A multivariate analysis using all stable significant features found that no tested classifier reached higher accuracy than a single radiomic feature for stratifying patients based on OS and PFS time (short or long survival time). Conversely, considering the robust predictors by the LASSO approach and all patients, an SVM reached the best results for stratifying patients based on OS. The SVM classifier in the subset of patients treated with immunotherapy reached an AUC of 0.89, an accuracy of 81.6%, a sensitivity of 82.4% and a specificity of 81.0%. However, considering the robust predictors of the LASSO approach and all patients, a decision tree reached the best results for stratifying patients based on PFS time. The decision tree classifier in the subset of patients treated with immunotherapy reached an AUC of 0.96, an accuracy of 94.7%, a sensitivity of 87.5% and a specificity of 100.0%.
The relationship between radiomics and immunotherapeutic response was demonstrated by numerous experts.
Radiomic signatures could be considered critically important inputs as a biomarkers for immune profiles and immune checkpoint inhibition response, according to a multicenter retrospective study on advanced cancers that considered all advanced cancers, including lung cancer [38].
Consequently, in a sample of 200 advanced NSCLC patients who received single anti-PD-1/PD-L1, Yang et al. assessed 1633 CT scans and 3414 blood samples, including serial radiomics, laboratory data and baseline clinical data, to build deep learning models useful for the selection and identification of responders and non-responders to immunotherapy. They found that a deep learning-based prediction model showed a good performance in distinguishing responders from non-responders to anti-PD-1/PD-L1 therapy [39].
In patients treated with the anti-PD-1 antibody, by combining PD-L1ES with a clinical model that was constructed using age, sex, smoking history and family history of malignant tumors, the reaction to immunotherapy could be anticipated in a manner more accurate than using PD-L1ES or the clinical model alone as predictors [40].
Accordingly, Tian et al. conducted analyses on PD-L1 expression in 939 consecutive stage IIIB-IV NSCLC patients with baseline CT images and found that deep learning on computed tomography images could predict a high expression of PD-L1 (PD-L1 ≥50%), with an AUC of 0.78.
The present study has several limitations: the small population size considered, the retrospective nature of the study and the awareness that CT images were collected by different centers and thus were usually obtained using different protocols. The radiomic model can be affected by these differences; radiomics data were not correlated and combined with clinical information.

Conclusions
With the contribution of medical images usually acquired in clinical practice, radiomics can be a useful support for precision medicine.
We demonstrated that specific radiomic features extracted by CT could be used to select patients with lung adenocarcinoma who would benefit from immunotherapy; in fact, the subset of radiomic features able to predict OS or PFS time was different in the control group and immunotherapy group.