A CT-based radiomics nomogram for predicting prognosis of coronavirus disease 2019 (COVID-19)

Objectives: To evaluate the value of radiomics derived from CT images for predicting prognosis in patients with COVID-19.
Methods: A total of 40 patients with COVID-19 were enrolled in the study. Baseline clinical data, CT images, and laboratory testing results were collected from all patients. ROIs in which the ground-glass opacity (GGO) decreased in density and extent were assigned to the absorption group, and ROIs that progressed to consolidation were assigned to the progression (consolidation) group. A total of 180 ROIs from the absorption group (n = 118) and the consolidation group (n = 62) were randomly divided into a training set (n = 145) and a validation set (n = 35) (8:2). Radiomics features were extracted from CT images, and radiomics-based models were built with three classifiers. A radiomics score (Rad-score) was calculated as a linear combination of the selected features. The Rad-score and clinical factors were incorporated into the radiomics nomogram. The prediction performance of the clinical factors model and the radiomics nomogram for prognosis was estimated.
Results: A total of 15 radiomics features with their respective coefficients were selected. The AUC values of the radiomics models (kNN, SVM, and LR) were 0.88, 0.88, and 0.84, respectively, showing good performance. The C-index of the clinical factors model was 0.82 [95% CI (0.75–0.88)] in the training set and 0.77 [95% CI (0.59–0.90)] in the validation set. The radiomics nomogram performed better: the C-index was 0.91 [95% CI (0.85–0.95)] in the training set and 0.85 [95% CI (0.69–0.95)] in the validation set. In the training set, the C-index of the radiomics nomogram was significantly higher than that of the clinical factors model (p = 0.0021). Decision curve analysis showed that the radiomics nomogram outperformed the clinical model in terms of clinical usefulness.
Conclusions: The radiomics nomogram based on CT images showed favorable prediction performance in the prognosis of COVID-19. The radiomics nomogram could be used as a potential biomarker for more accurate categorization of patients into different stages for clinical decision-making.
Advances in knowledge: Radiomics features based on chest CT images help clinicians to categorize patients with COVID-19 into different stages. A radiomics nomogram based on CT images has favorable predictive performance in the prognosis of COVID-19. Radiomics acts as a potential modality to supplement conventional medical examinations.



without consolidation in posterior and peripheral lungs. These features are similar to those observed in other coronavirus infections. [4][5][6] However, some CT imaging findings were less typical. 7,8 Chung et al 6 reported that GGO might appear in most patients and is thus considered the earliest radiographically visible CT manifestation in some patients. A previous study showed that lung involvement gradually progressed to consolidation up to 2 weeks after the onset of initial symptoms. 9 The preliminary study by Fang et al 5 showed that GGO could be observed as a sign of consolidation absorption. Consolidation has been considered an indication of disease progression that serves as an alert in the management of patients. 10 Early CT-based prediction of the whole disease course of COVID-19 could prompt early clinical diagnosis, speed up treatment and isolation, and provide evidence for evaluating the effect of comprehensive therapy.
The purpose of this study was to predict the prognosis of COVID-19 and to help physicians categorize patients into different stages more accurately, so as to support better decision-making for patients, such as the need for ICU admission, length of hospital stay, and the requirement for oxygen.

METHODS AND MATERIALS
Patients and clinical factors
The institutional review board approved the study, and patient informed consent was waived for this retrospective analysis.
A total of 40 patients who tested positive for COVID-19 between 21 January and 26 March 2020 were enrolled in the study. COVID-19 was confirmed by laboratory testing of respiratory secretions obtained by bronchoalveolar lavage, endotracheal aspirate, nasopharyngeal swab, or oropharyngeal swab. Inclusion criteria were: (1) patients with a positive new coronavirus nucleic acid antibody admitted to our hospital; (2) patients who underwent CT after admission; and (3) CT images that confirmed the presence of pneumonia. Exclusion criteria were: (1) patients who did not have GGO in the first CT examination; (2) patients without significant changes of lesions in the repeat CT after 5-7 days; and (3) lesions with consolidation on the initial CT. Baseline clinical data were collected and included age, sex, severity, symptoms, travel and exposure history, as well as laboratory testing results. Univariate analysis was used to identify the correlation between clinical factors, radiomics features, and radiological progression. A multiple logistic regression analysis was applied to develop the clinical factors model, using the significant variables from the univariate analysis as inputs. Correlation coefficients (r) with their 95% confidence intervals (CIs) were calculated for each independent factor.

CT protocol
Non-enhanced chest CT scans were acquired in a single inspiratory phase on a commercial multidetector CT scanner (256-section Philips Brilliance iCT). CT images were acquired during a single breath-hold. The CT protocol was as follows: volume scan, tube voltage of 120 kVp with automatic tube current modulation, and slice thickness and interval of 5 mm. The same CT protocol and scanner were used for both CT scans. The fourth-generation iterative reconstruction (IR) algorithm (iDose4) and a sharp kernel were applied for the thin-slice (1 mm) reconstruction.
Figure 1 presents the radiomics workflow, including (1) ROI segmentation, (2) radiomics feature extraction, (3) radiomics feature selection, (4) prediction model development (in the training set), and (5) prediction performance assessment.

ROIs segmentation
All the images were exported from the PACS system and imported into a radiomics cloud platform V.3.1.0 (http://radcloud.cn/, Huiying Medical Technology Co., Ltd, Beijing, China). The lesions were manually delineated on the reconstructed images with a slice thickness of 1 mm by two independent radiologists (XLW and DL, with approximately 18 and 15 years of experience in thoracic radiology, respectively). CT manifestations of COVID-19 may vary among patients and stages; thus, we took each lesion as a unit for radiomics analysis and segmented the regions of interest (ROIs). As a prerequisite, all ROIs were GGO on the initial CT; patients then underwent a second CT examination (the interval between the two CT scans was 5-7 days). Three experienced radiologists (XLW, LPS, and QY), blinded to the clinical information, independently evaluated the progression of COVID-19 by comparing the two CT examinations.
An example of the manual segmentation is shown in Figure 2. GGO was defined as a hazy increase in lung attenuation with no obscuration of the underlying vessels, which was also a manifestation in the absorption period of COVID-19. 8 Consolidation was considered an indication of disease progression. After CT re-examination, ROIs were divided into two groups based on radiological progression: the absorption group and the consolidation (progression) group. ROIs in the absorption group showed GGO that decreased in density and extent, and ROIs in the progression group progressed to consolidation. Inter- and intraclass correlation coefficients (ICCs) were used to assess the interobserver reliability and intraobserver reproducibility of feature extraction. Agreement of the feature extraction was considered good if the ICC value was >0.75.
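The ICC calculation mentioned above can be illustrated with a minimal sketch. The text does not state which ICC form was used, so the two-way random-effects, absolute-agreement, single-measurement form, ICC(2,1), is assumed here; the function name `icc_2_1` and the input layout are hypothetical.

```python
from statistics import mean


def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    `ratings` is a list of rows, one per ROI; each row holds the value of the
    same radiomics feature obtained by each rater (or repeated segmentation).
    The choice of ICC(2,1) is an assumption, not taken from the paper.
    """
    n = len(ratings)       # number of ROIs
    k = len(ratings[0])    # number of raters or repeats
    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]

    # Two-way ANOVA sums of squares (one observation per cell)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_err = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))

    return (ms_rows - ms_err) / (
        ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n
    )
```

Each feature would be passed through such a function once per comparison (between raters, or between repeated delineations by the same rater), and the resulting value compared with the 0.75 cut-off.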

Radiomics feature extraction and selection
The radiomics features were divided into four groups: (a) first-order statistics, including 126 descriptors that quantitatively delineated the distribution of voxel intensities within ROIs through commonly used and basic metrics; (b) shape features, composed of three-dimensional (3-D) features that reflected the shape and size of the region; (c) textural features, which were calculated from the Gray Level Run-Length Matrix (GLRLM), Gray Level Co-occurrence Matrix (GLCM), Gray Level Size Zone Matrix (GLSZM), and Gray Level Dependence Matrix (GLDM); and (d) filter and wavelet features, which included the intensity and texture features derived from filter and wavelet transformations of the original images, processed using filters such as wavelet-LLL, wavelet-LHL, wavelet-HLL, wavelet-LLH, logarithm, square, square root, original, and exponential. 11,12 In the present study, three methods were used to progressively reduce the redundant features. Firstly, the variance threshold method was applied to remove features with a variance smaller than 0.8. Secondly, the Select K Best method used p values to analyze the correlation between the features and the classification results, and features with p < 0.05 were retained. Finally, the least absolute shrinkage and selection operator (LASSO) model was used to reduce the dimensionality of the features and identify the most significant ones. 13,14 For the LASSO model, the error value of cross-validation was 10, and the maximum number of iterations was 2000. The corresponding parameter settings followed previous studies. [15][16][17]

Classification analysis and radiomics signature
Classification analysis was performed to distinguish absorption from consolidation using texture features derived from CT images.
The classifiers, which were constructed by supervised learning, learned from a set of given samples possessing the selected features so as to correctly classify new objects and predict the data with respect to prognosis. 18 In this study, the radiomics-based models were built with three classifiers: k-Nearest Neighbor (kNN), Support Vector Machine (SVM), and Logistic Regression (LR). The decision tree algorithm first creates readable rules and decisions by using an inductive algorithm; then, this decision is used to analyze new data. 19 The performance of the radiomics-based classifiers in predicting the progression of COVID-19 was assessed using the receiver operating characteristic (ROC) curve, the area under the curve (AUC), sensitivity, specificity, and accuracy in the training and validation sets. The selected features were used to build the radiomics signature, and a radiomics score (Rad-score) was calculated as a linear combination of the selected features weighted by the corresponding LASSO coefficients. 20

Development of a radiomics nomogram and performance assessment
The significant clinical factors and the Rad-score were used to develop a radiomics nomogram. A calibration plot was used to assess the calibration and goodness-of-fit of the nomogram. The prediction performance of the clinical factors model and the radiomics nomogram for prognosis was estimated based on the C-index in both the training and validation sets. Decision curve analysis (DCA) was conducted to assess the net benefits for a range of threshold probabilities in the training set.
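The net benefit that decision curve analysis plots against the threshold probability can be sketched as follows. This uses the commonly cited definition, net benefit = TP/n - (FP/n) × pt/(1 - pt); the function names, the label coding (1 = consolidation, 0 = absorption), and the inputs are illustrative assumptions and are not taken from the authors' R scripts.

```python
def net_benefit(probs, labels, pt):
    """Net benefit of acting on every case whose predicted risk is >= pt.

    probs  : predicted probabilities of progression (hypothetical model output)
    labels : observed outcomes, 1 = consolidation, 0 = absorption (assumed coding)
    pt     : threshold probability, strictly between 0 and 1
    """
    n = len(labels)
    tp = sum(1 for p, y in zip(probs, labels) if p >= pt and y == 1)
    fp = sum(1 for p, y in zip(probs, labels) if p >= pt and y == 0)
    # False positives are weighted by the odds of the threshold probability
    return tp / n - (fp / n) * pt / (1 - pt)


def net_benefit_treat_all(labels, pt):
    """Reference strategy: treat every case as if it will progress."""
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * pt / (1 - pt)


def decision_curve(probs, labels, thresholds):
    """Return (threshold, model net benefit, treat-all net benefit) triples."""
    return [
        (pt, net_benefit(probs, labels, pt), net_benefit_treat_all(labels, pt))
        for pt in thresholds
    ]
```

A model is considered clinically useful over the range of thresholds where its curve lies above both the treat-all curve and the treat-none line (net benefit of zero).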

Statistical analysis
Statistical analyses were performed using SPSS v.24.0 (SPSS Inc., Chicago, IL, USA) and R statistical software v.3.3.4 (https://www.r-project.org). Between-group comparisons of the clinical factors were conducted with the chi-squared test or Fisher exact test for categorical variables and the Mann-Whitney U-test for continuous variables. The ROCs of the two models were compared using the DeLong test. The prediction performance of the models was assessed in the validation set using the same thresholds determined in the training set. The ROC curves were plotted using the "pROC" package. Nomogram development was conducted using the "rms" package. The DCA was performed using the "dca.R" package.

RESULTS
Clinical information and clinical factors model
Forty patients were included in our study (age: 47.6 ± 14); 25 (62.5%) were males, and 18 (45%) were previously exposed to COVID-19. The detailed clinical data of patients are summarized in Table 1. Table 2 shows the relationship between the clinical factors, including T lymphocyte subgroups, and the radiological progression of patients with COVID-19. There were significant differences in age, neutrophil count, NK cells %, NK cell count, CD3+ T%, CD19+ %, and CD4+ T% between the absorption group and the consolidation group (p < 0.05), whereas CD19+ count, CD3+CD8+ T%, CD3+ T count, C-reactive protein (CRP), sex, the CD4+/CD8+ ratio, CD4+ T count, and CD8+ T count were not significantly different (p > 0.05). Table 3 shows the p values of clinical information corresponding to ROIs in the training and validation sets.
Radiomics feature and prediction performance of radiomics-based models
A total of 180 ROIs were analyzed; 1409 image features were extracted from the CT images. ROIs from the absorption group (n = 118) and the consolidation group (n = 62) were randomly divided into a training set (n = 145) and a validation set (n = 35). For the selection of radiomics features, significant features were selected by the LASSO regression model and a forward selection approach (Figure 3). The best-performing LASSO regression used a penalty parameter α = 1.17, at which the mean square error was minimized. Finally, 15 radiomics features with their respective coefficients were selected (Table 4).
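For orientation, a LASSO fit of this kind is commonly written as the following penalized least-squares problem. This is a generic formulation given as an assumption; the exact objective and scaling used by the radiomics platform are not stated in the text. Here $y_i$ is the outcome label of ROI $i$, $x_i$ its feature vector, $n$ the number of training ROIs, and $\alpha$ the penalty parameter.

```latex
\hat{\beta}_0, \hat{\beta} \;=\; \arg\min_{\beta_0,\,\beta}\;
\frac{1}{2n}\sum_{i=1}^{n}\bigl(y_i - \beta_0 - x_i^{\top}\beta\bigr)^2
\;+\; \alpha \sum_{j} \lvert \beta_j \rvert
```

Larger values of $\alpha$ shrink more coefficients exactly to zero, so the features that keep a non-zero $\hat{\beta}_j$ at the cross-validated $\alpha$ form the selected set.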
ROC analysis was used to evaluate the prediction performance of the radiomics-based models (Figure 4). With regard to accuracy, the kNN model demonstrated the best performance among the three models. The AUC values of the kNN, SVM, and LR models, obtained by tenfold cross-validation, were 0.88, 0.88, and 0.84, respectively, indicating good performance.
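The metrics used to compare the classifiers (AUC, sensitivity, specificity, and accuracy) can be computed from predicted scores and true labels as in the minimal sketch below. The label coding (1 = consolidation, 0 = absorption), the function names, and the toy inputs used with them are illustrative assumptions, not study data.

```python
def auc(scores, labels):
    """AUC as the probability that a positive case scores higher than a
    negative one (ties count one half); equivalent to the Mann-Whitney U."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def confusion_metrics(scores, labels, threshold):
    """Sensitivity, specificity, and accuracy at a fixed score threshold."""
    pred = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for p, y in zip(pred, labels) if p == 1 and y == 1)
    tn = sum(1 for p, y in zip(pred, labels) if p == 0 and y == 0)
    fp = sum(1 for p, y in zip(pred, labels) if p == 1 and y == 0)
    fn = sum(1 for p, y in zip(pred, labels) if p == 0 and y == 1)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(labels),
    }
```

For a binary outcome such as absorption versus consolidation, the C-index reported for the nomogram coincides with this AUC. Applying the threshold chosen in the training set to the validation set, as described in the statistical analysis, corresponds to reusing the same `threshold` argument for both sets.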

Radiomics signature construction
The radiomics signature was built from 15 features. The Rad-score was calculated as the linear combination of these features weighted by their respective coefficients (Table 4). The Rad-score differed significantly between the absorption and consolidation groups (r = 0.5022, 95% CI: 0.3838-0.6044, p < 0.0001).
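In general form, a linear Rad-score of this kind can be written as below, where $x_i$ is the value of the $i$-th selected feature and $\beta_i$ its LASSO coefficient. The intercept $\beta_0$ is an assumption (the text does not say whether one was included), and the numerical coefficients are those of Table 4, which are not reproduced here.

```latex
\text{Rad-score} \;=\; \beta_0 + \sum_{i=1}^{15} \beta_i \, x_i
```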

Radiomics nomogram construction and prediction performance assessment
The Rad-score, age, neutrophil count, NK%, and CD3% were incorporated into the radiomics nomogram (Figure 5a). Calibration curves for the radiomics nomogram in the training set, the validation set, and the whole cohort are shown in Figure 5b-d. ROC analysis was used to evaluate the prediction performance of the clinical factors model and the radiomics nomogram. When the clinical factors were used for predicting prognosis, the C-index was 0.82 [95% CI (0.75-0.88); sensitivity: 0.61; specificity: 0.92] in the training set and 0.77 [95% CI (0.59-0.90); sensitivity: 0.83; specificity: 0.64] in the validation set. In comparison, the radiomics nomogram showed better prediction performance. The C-index was 0.91 [95% CI (0.85-0.95); sensitivity: 0.83; specificity: 0.84] in the training set and 0.85 [95% CI (0.69-0.95); sensitivity: 0.61; specificity: 1.0] in the validation set. Figure 6 shows the ROC curves of the clinical factors model and the radiomics nomogram in the training and validation sets, respectively. In the training set, the C-index of the radiomics nomogram was significantly higher than that of the clinical factors model (p = 0.0021). The calibration curves showed good calibration in both the training set and the validation set. The results of DCA in the training set are shown in Figure 7. The radiomics nomogram showed the highest net benefit among the three models.

DISCUSSION
The purpose of this study was to evaluate the value of radiomics derived from CT images for predicting prognosis in patients with COVID-19. The results showed that the radiomics nomogram integrating the Rad-score and clinical factors has good predictive value for radiological progression.
Laboratory testing has a vital role in diagnosing and managing human pathologies, including COVID-19. 21 Reverse transcription-polymerase chain reaction (RT-PCR) on respiratory tract specimens is considered the gold standard for the etiological diagnosis of COVID-19 infection. 22,23 Still, according to recent reports, the diagnostic accuracy of RT-PCR testing for COVID-19 might be lower than optimal. 24 Ai et al 25 examined 1014 patients suspected of having COVID-19 who underwent RT-PCR testing and chest CT. They found that the diagnostic accuracy of CT was higher than that of RT-PCR (88% of patients had positive chest CT findings while only 59% were positive on RT-PCR). Moreover, a recent study reported that combining real-time RT-PCR with clinical symptoms, epidemiological evidence, and CT manifestations facilitates the diagnosis of COVID-19. 26 Guan et al 27 investigated the clinical characteristics of patients with COVID-19 in China and reported that the majority of cases on admission presented with lymphocytopenia, thrombocytopenia, and leukopenia. Most of the patients had elevated levels of CRP. Fever and cough were the dominant symptoms, while gastrointestinal symptoms were uncommon. 28,29 A series of clinical factors were examined in this study. We found that patients in the consolidation group were older, had elevated neutrophil counts, and showed changes in T lymphocyte subsets compared with the absorption group, which was consistent with previous studies. In general, during the early phase of COVID-19 infection, diagnosis and evaluation were complicated by the diversity of symptoms and imaging findings and by the severity of disease at the time of presentation.
Chest CT is of great significance in diagnosis, monitoring progression, and evaluating curative effect in the clinic; yet, the role of CT in COVID-19 diagnosis and evaluation of the patient's condition is still controversial. So far, only a few studies have reported in detail the CT features commonly found in COVID-19. 3,6,30 GGO and consolidation are the two main manifestations of COVID-19 lesions on chest CT. Li et al 31 reported single or multiple irregular lesions of GGO and/or consolidation in 49 of the 51 cases who underwent chest CT (96.1%). In this study, the CT findings were GGO in the early stage; in some patients the lesions were absorbed, and the density of the lesions gradually decreased. In contrast, in some patients with disease progression, GGO turned into consolidation, and the density of the lesions increased. In this regard, CT may help predict which patients are likely to progress, allowing early intervention (e.g. micronutrients, antiviral treatment, and immunotherapy) to partly reduce the incidence of severe COVID-19 and improve the prognosis of patients. However, chest CT is still limited in identifying specific viruses. The CT features of COVID-19 overlap with the features of diseases caused by viruses from a similar family, such as MERS-CoV or SARS-CoV. Moreover, these findings were qualitative, which limited their accuracy. Therefore, new ways of evaluating CT features to predict disease progression and guide clinical therapy are urgently needed.
Recently, radiomics has been shown to be a potential imaging modality for identifying biological characteristics of diseases beyond visual assessment on CT images. A previous study applied radiomics-based predictive models using random forest (RF) and kNN classifiers to identify glucocorticoid-sensitive connective tissue disease-related interstitial lung disease, obtaining an AUC of 0.66 with the RF model and 0.61 with the kNN model. 32 Zhang et al 33 incorporated CT radiomics and PET metabolic parameters to build a classification model (using an SVM method) for distinguishing benign and malignant lung lesions. Their model showed substantial diagnostic capacity.
To our knowledge, this is the first study to report on CT-based radiomics in prognosis prediction of COVID-19. We applied three classifiers (kNN, SVM, and LR) to develop radiomics-based models to predict absorption and consolidation of lesions. Our results revealed that all three models had good prediction performance (accuracy >0.70, AUC >0.80). Adequate analysis of clinical factors and imaging findings is helpful for accurate diagnosis and management of patients with COVID-19. The Rad-score was then combined with independent clinical factors to build a radiomics nomogram, which achieved favorable efficacy in predicting radiological progression of COVID-19 lesions. Additionally, the nomogram with the Rad-score had a higher C-index and net benefit than the clinical factors model, suggesting the additional value of CT texture features in differentiating the absorption group from the consolidation group. Radiomics, as a computer-assisted technique, helps to identify microscopic features associated with the biology of disease progression. Assessment based on clinical factors alone cannot fully evaluate the course of pneumonia; integrating the radiomics features and clinical factors within a combined nomogram could achieve earlier detection of higher-risk patients. Although chest radiographs are mainly used for COVID-19 management worldwide for several reasons (radiation dose, patient transport, efficiency, availability, etc), 34,35 the prediction performance based on radiomics features can hardly be achieved with chest radiographs because of the limited number of images, which leads to a loss of image information.
This study has a few limitations. First, it is a retrospective cohort study; thus, potential selection bias might influence the repeatability and stability of the results. Patients who were already in the hospital and who underwent CT had either more serious or rather atypical clinical conditions. Second, a limited number of patients were included in the study, which may influence the generalizability of the final conclusion. Third, consolidation lesions were simultaneously present in a few patients, but they were relatively rare in the early stage of the disease and had a small overall effect on the results. Finally, the present study did not discuss fibrosis. Still, some radiologists have suggested that fibrosis might indicate a poor outcome of COVID-19, reporting that it may subsequently progress to the peak stage or result in pulmonary interstitial fibrosis disease. 9,36 Future studies should expand the sample size and include a fibrosis group for prognostic evaluation.

CONCLUSION
The radiomics nomogram based on CT images has favorable prediction performance in the prognosis of COVID-19. The radiomics nomogram, as a quantitative and noninvasive modality, could act as a potential biomarker to supplement conventional imaging and laboratory examinations and help physicians categorize patients into different stages more accurately for clinical decision-making.