The diagnostic value of CT-based radiomics nomogram for solitary indeterminate smoothly marginated solid pulmonary nodules

Objectives This study aimed to explore the value of a computed tomography (CT)-based radiomics nomogram in differentiating benign from malignant solitary indeterminate smoothly marginated solid pulmonary nodules (SMSPNs).
Methods This study retrospectively reviewed 205 cases with solitary indeterminate SMSPNs on CT, including 112 cases of benign nodules and 93 cases of malignant nodules. The cases were divided into training (n=143) and validation (n=62) cohorts according to the CT scanner used. Radiomics features of the nodules were extracted from the lung window CT images. The variance threshold method, SelectKBest, and the least absolute shrinkage and selection operator were used to select the key radiomics features to construct the rad-score. Through multivariate logistic regression analysis, a nomogram was built by combining the rad-score, clinical factors, and CT features. The nomogram performance was evaluated by the area under the receiver operating characteristic curve (AUC).
Results A total of 19 radiomics features were selected to construct the rad-score, and the nomogram was constructed from the rad-score, one clinical factor (history of malignant tumor), and three CT features (calcification, pleural retraction, and lobulation). The nomogram performed better in nodule diagnosis than the radiomics model, the clinical model, and experienced radiologists who specialized in thoracic radiology. The AUC values of the nomogram were 0.942 in the training cohort and 0.933 in the validation cohort. The calibration curve and decision curve showed that the nomogram demonstrated good consistency and clinical applicability.
Conclusion The CT-based radiomics nomogram achieved high efficiency in the preoperative diagnosis of solitary indeterminate SMSPNs, and it is of great significance in guiding clinical decision-making.


Introduction
Pulmonary nodules are common in clinical practice, and correctly differentiating benign from malignant nodules is critical in guiding treatment planning. Pulmonary nodules can be classified into subsolid and solid nodules based on computed tomography (CT) density (1-4). Solid nodules have proven more difficult to diagnose correctly than subsolid nodules. Studies on patients undergoing surgical resection have shown that over 95% of subsolid nodules are malignant, whereas the malignancy rate of solid nodules ranges from 51% to 67% (4-6). In addition, solid lung cancers have a higher degree of malignancy than subsolid ones and are therefore not suitable for long-term follow-up (1, 2, 7). Therefore, the accurate and timely diagnosis of solid nodules must be improved.
The spiculation sign is defined as the presence of strands radiating from the margin of the nodule into the lung parenchyma without reaching the surface of the pleura (8). It is a well-known sign associated with malignancy (3-5, 8, 9). Studies have found that benign solid nodules usually present with a well-defined smooth margin, whereas most malignant solid nodules show spiculated and ill-defined margins (3, 10-12). In clinical practice, solitary smoothly marginated solid pulmonary nodules (SMSPNs) tend to be diagnosed as benign. However, we found that many malignant nodules present with similar characteristics. For SMSPNs, the presence of fat and/or a benign calcification pattern (central, diffuse, laminated, or "popcornlike") within the nodule is a highly specific indicator of benignancy (10, 11). In addition, the risk of lung cancer in smoothly marginated triangular, lentiform, oval, or semicircular juxtapleural nodules is extremely low (10, 12-14). Outside these situations, differentiating benign from malignant SMSPNs by CT features is usually difficult. Thus, new methods are needed to improve the diagnostic ability for indeterminate SMSPNs.
Radiomics has been a research hotspot in recent years. It can extract a large number of high-dimensional, quantifiable features from conventional CT images and combine them with clinical factors and conventional CT features to establish prediction models via artificial intelligence methods, thereby enabling disease diagnosis and the prediction of lymph node metastasis and prognosis (5, 6, 9, 15-17). We hypothesized that adding radiomics features extracted from within the nodules can improve the diagnostic ability for indeterminate SMSPNs. Therefore, we constructed a nomogram that combined CT-based radiomics features, clinical factors, and conventional CT features for the diagnosis of benign and malignant solitary indeterminate SMSPNs.

Patients
Patients with lung nodules who underwent surgical resection at our institution between February 2019 and March 2022 were retrospectively analyzed. Our Institutional Ethics Committee approved this study. Because this was a retrospective study, the requirement for written informed consent was waived. The inclusion criteria were as follows: nodule size ≤ 3 cm, solid nodule on CT, lesion located away from the segmental bronchi, and nodule with a well-defined smooth edge. The exclusion criteria were as follows: nodule containing fat, nodule containing benign calcification (central, diffuse, laminated, or "popcornlike"), juxtapleural nodule, and lesion treated with chemotherapy or radiotherapy prior to surgery.

CT feature interpretation
Two radiologists who specialized in thoracic radiology (one with 13 years and one with 15 years of work experience) reviewed all the CT images independently. They were blinded to the histopathological data and resolved disagreements by consensus.
The intraclass correlation coefficient (ICC) and Kappa statistics were used to assess interobserver agreement in CT feature interpretation. The ICC was used for quantitative data, while the Kappa statistic was used for categorical data. Finally, the same two radiologists classified all nodules as benign or malignant based on preoperative clinical factors and CT images.
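As an illustration of the agreement statistic used for categorical CT features, the following minimal sketch computes Cohen's kappa for two readers. The text does not state which Kappa variant was used, so the unweighted Cohen's form is an assumption, and the readings shown are hypothetical values, not study data.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Unweighted Cohen's kappa for two raters labelling the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("ratings must be non-empty and of equal length")
    n = len(rater_a)
    # Observed agreement: share of items on which both raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement from each rater's marginal label frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    categories = set(freq_a) | set(freq_b)
    expected = sum(freq_a[c] * freq_b[c] for c in categories) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical readings of one binary CT sign (1 = present, 0 = absent).
reader_1 = [1, 1, 0, 0, 1, 0, 1, 0, 0, 1]
reader_2 = [1, 1, 0, 0, 1, 0, 0, 0, 0, 1]
kappa = cohen_kappa(reader_1, reader_2)
print(f"kappa = {kappa:.3f}")
```

In this toy example the readers agree on 9 of 10 nodules, the chance agreement is 0.5, and kappa is therefore (0.9 - 0.5) / (1 - 0.5) = 0.8.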

Radiomics feature extraction
The latest guidelines of the Image Biomarker Standardization Initiative (IBSI) were followed in the analysis of radiomics features (18). Specifically, feature extraction was performed on the RadCloud platform (version 7.2, Huiying Medical Technology Co., Ltd, Beijing, China, https://mics.huiyihuiying.com) (19). The platform is based on the IBSI-compliant PyRadiomics (20). The region of interest (ROI) was delineated on axial thin-section lung window images by a semiautomatic delineation method.
Gray level discretization was performed with a fixed bin width of 25, and voxels were resampled to a uniform size of 1 mm × 1 mm × 1 mm using the PyRadiomics software package. For each ROI, 1688 radiomics features were extracted. For each three-dimensional segmentation, a total of 107 radiomics features were extracted from the original image, including 14 three-dimensional shape features, 18 first-order statistics features, 14 Gray Level Dependence Matrix features, 16 Gray Level Size Zone Matrix features, 16 Gray Level Run Length Matrix features, 24 Gray Level Co-occurrence Matrix features, and 5 Neighboring Gray Tone Difference Matrix features. Texture features were then derived from filtered images, including Square, Square Root, Wavelet, Exponential, Logarithm, Gradient, two-dimensional local binary pattern, and three-dimensional local binary pattern.
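To make the fixed-bin-width discretization concrete, the sketch below applies the commonly documented scheme, bin index = floor(x / W) - floor(min(x) / W) + 1, with W = 25. This is a plain-Python illustration under that assumption, not the RadCloud or PyRadiomics implementation, and the intensity values are hypothetical.

```python
import math


def discretize_fixed_bin_width(intensities, bin_width=25):
    """Map raw intensities to 1-based gray-level bins of constant width."""
    if not intensities:
        raise ValueError("intensities must not be empty")
    # Bin edges are aligned to multiples of bin_width; the lowest
    # occupied bin is renumbered to 1.
    offset = math.floor(min(intensities) / bin_width)
    return [math.floor(v / bin_width) - offset + 1 for v in intensities]


# Hypothetical voxel intensities (HU) inside one ROI.
roi_hu = [-120, -101, -100, -76, -75, 0, 24, 25, 60]
bins = discretize_fixed_bin_width(roi_hu, bin_width=25)
print(bins)
```

With a fixed bin width, the number of gray levels depends on the intensity range of each ROI, whereas the meaning of one bin (25 intensity units) stays the same across nodules.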

Radiomics feature selection and Radscore construction
First, 50 patients were randomly selected to assess the intra- and interobserver reproducibility of the features. The ROIs on the images of these 50 patients were delineated separately by two radiologists, and the features were extracted. After 1 month, the ROIs of the same 50 patients were delineated again by the junior radiologist, and the features were extracted. Intra- and interobserver reproducibility was tested by ICC, and features with an ICC value greater than 0.75 were retained for further analysis. Subsequently, the ROIs of the remaining patients were delineated by the junior radiologist. The variance threshold method, SelectKBest, and the least absolute shrinkage and selection operator (LASSO) were used to select the key radiomics features to construct the rad-score.
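The reproducibility filter described above can be sketched as follows. The text does not specify which ICC form was used, so the two-way, single-measurement consistency form, often written ICC(3,1), is an assumption here; the feature names and values are hypothetical.

```python
def icc_3_1(ratings):
    """ICC(3,1): two-way model, consistency, single measurement.

    `ratings` is a list of rows, one per subject, each holding the
    values obtained from every delineation (rater or session).
    """
    n, k = len(ratings), len(ratings[0])
    if n < 2 or k < 2:
        raise ValueError("need at least 2 subjects and 2 delineations")
    grand = sum(sum(row) for row in ratings) / (n * k)
    row_means = [sum(row) / k for row in ratings]
    col_means = [sum(row[j] for row in ratings) / n for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in ratings for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (ms_rows + (k - 1) * ms_error)


def reproducible_features(feature_ratings, threshold=0.75):
    """Keep the names of features whose ICC exceeds the threshold."""
    return [name for name, ratings in feature_ratings.items()
            if icc_3_1(ratings) > threshold]


# Hypothetical values of two features from two delineations of 4 nodules.
feature_ratings = {
    "stable_feature": [[1.0, 1.1], [2.0, 2.1], [3.0, 2.9], [4.0, 4.2]],
    "unstable_feature": [[1.0, 3.0], [2.0, 1.0], [3.0, 4.0], [4.0, 2.0]],
}
kept = reproducible_features(feature_ratings, threshold=0.75)
print(kept)
```

In practice the same filter would be applied twice, once to the two radiologists' delineations (interobserver) and once to the junior radiologist's repeated delineations (intraobserver).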

Nomogram building and validation
After univariate and multivariate logistic regression analyses, a nomogram was constructed by combining the rad-score with the clinical factors and CT features that differed significantly (P < 0.05) between benign and malignant nodules in the training cohort. The nomogram performance was assessed by the area under the receiver operating characteristic curve (AUC), sensitivity, specificity, accuracy, the calibration curve, and clinical decision curve analysis. In addition, the diagnostic ability of the nomogram was compared with that of the radiomics model (rad-score), the clinical model (constructed from CT features and clinical factors), and the radiologists' diagnosis.
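The performance measures listed above can be illustrated with a short sketch. AUC is computed here through its rank interpretation (the probability that a malignant nodule receives a higher score than a benign one, with ties counted as one half); the 0.5 cutoff, the labels, and the scores are hypothetical and are not taken from the study.

```python
def roc_auc(labels, scores):
    """AUC as the share of (malignant, benign) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0
               for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def classification_metrics(labels, scores, cutoff=0.5):
    """Sensitivity, specificity, and accuracy at a probability cutoff."""
    predicted = [1 if s >= cutoff else 0 for s in scores]
    tp = sum(p == 1 and y == 1 for p, y in zip(predicted, labels))
    tn = sum(p == 0 and y == 0 for p, y in zip(predicted, labels))
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    return {
        "sensitivity": tp / n_pos,
        "specificity": tn / n_neg,
        "accuracy": (tp + tn) / len(labels),
    }


# Hypothetical predicted probabilities (1 = malignant, 0 = benign).
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
auc = roc_auc(labels, scores)
metrics = classification_metrics(labels, scores, cutoff=0.5)
print(round(auc, 3), metrics)
```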

Statistical analysis
SPSS software (version 23.0) and R software (version 3.4.2) were used for statistical analysis. The chi-square (χ²) test or Fisher's exact test was used for categorical variables, while the Mann-Whitney U test or two-sample t-test was used for continuous variables. The AUC values of the models were compared via the DeLong test. A P value < 0.05 indicated statistical significance.
Among the 205 patients, 143 (scanned on CT scanners 1, 2, and 3) were assigned to the training cohort, and 62 (scanned on CT scanners 4 and 5) were assigned to the validation cohort. The patients' clinical factors and CT features are summarized in Table 1.
The consistency in interpretation of CT features among radiologists demonstrated good interobserver agreement, with ICC values ranging from 0.913 to 0.953, and Kappa values ranging from 0.798 to 0.962.

Radiomics feature selection and Radscore construction
A total of 1688 radiomics features were extracted in this study, and 1565 features remained after excluding those with ICC values less than 0.75 in intra- and interobserver reproducibility. Using the variance threshold method with the threshold set to 0.8, 1234 features were selected. A total of 232 features were obtained via SelectKBest with P value < 0.05. Finally, 19 features were retained by the LASSO algorithm with five-fold cross-validation (Figure 2). The rad-score was constructed from these 19 key radiomics features.
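A rad-score built from LASSO-selected features is conventionally a linear combination of the feature values weighted by their non-zero coefficients, plus an intercept. The sketch below assumes that form; the feature names, coefficients, and intercept are hypothetical placeholders, since the actual 19 coefficients are reported only in the corresponding figure.

```python
def rad_score(features, coefficients, intercept=0.0):
    """Linear rad-score: intercept + sum(coefficient * feature value)."""
    missing = [name for name in coefficients if name not in features]
    if missing:
        raise KeyError(f"missing feature values: {missing}")
    return intercept + sum(coefficients[name] * features[name]
                           for name in coefficients)


# Hypothetical coefficients for two (of nominally 19) selected features.
example_coefficients = {
    "wavelet_glcm_feature": 0.5,
    "logarithm_firstorder_feature": -0.25,
}
# Hypothetical standardized feature values for one nodule.
example_features = {
    "wavelet_glcm_feature": 2.0,
    "logarithm_firstorder_feature": 4.0,
}
score = rad_score(example_features, example_coefficients, intercept=0.1)
print(score)
```

Features removed by LASSO have a coefficient of zero and therefore do not appear in the sum, which is why only the retained features contribute to the score.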

Nomogram building
After univariate and multivariate logistic regression analyses, the rad-score, one clinical factor (history of malignant tumor), and three CT features (namely, calcification, pleural retraction, and lobulation) were identified as independent factors for the diagnosis of indeterminate SMSPNs (Table 2). The nomogram was constructed using the above selected factors (Figure 3). The P value of the Hosmer-Lemeshow test was 0.706, which indicated the good fit of the model.
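A nomogram of this kind is a graphical form of a multivariate logistic regression model, so the predicted probability of malignancy can be sketched as the logistic function of a weighted sum of the five predictors. The coefficient values below are hypothetical and chosen only so that their signs match the reported directions of effect (calcification favoring benignity, the other factors favoring malignancy); they are not the fitted values from Table 2.

```python
import math

# Hypothetical coefficients; NOT the fitted values of the study.
HYPOTHETICAL_COEFFICIENTS = {
    "intercept": -1.0,
    "rad_score": 2.0,
    "history_of_malignant_tumor": 1.5,
    "calcification": -1.8,
    "pleural_retraction": 1.2,
    "lobulation": 0.6,
}


def malignancy_probability(predictors, coefficients):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum(b_i * x_i))))."""
    linear = coefficients["intercept"] + sum(
        coefficients[name] * value for name, value in predictors.items()
    )
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical nodule: lobulated, no calcification, prior malignancy.
nodule = {
    "rad_score": 0.4,
    "history_of_malignant_tumor": 1,
    "calcification": 0,
    "pleural_retraction": 0,
    "lobulation": 1,
}
p_without_calcification = malignancy_probability(
    nodule, HYPOTHETICAL_COEFFICIENTS)
p_with_calcification = malignancy_probability(
    {**nodule, "calcification": 1}, HYPOTHETICAL_COEFFICIENTS)
print(round(p_without_calcification, 3), round(p_with_calcification, 3))
```

On the printed nomogram, each predictor's contribution is rescaled to points; summing the points and reading the probability axis is equivalent to evaluating this formula.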

Nomogram performance
FIGURE 1
The flow chart of the study.

The AUC values diagnosed by the radiologists were lower than those of the nomogram in the training and validation cohorts (Figure 4). The accuracy of the nomogram for nodule diagnosis was higher than that of the radiomics model, clinical model, and radiologists' diagnosis in the training and validation cohorts (Table 3). The calibration curve (Figure 5) and decision curve (Figure 6) showed that the nomogram demonstrated good consistency and clinical applicability.
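Decision curve analysis summarizes clinical usefulness through the net benefit at a chosen threshold probability pt, conventionally defined as TP/n - (FP/n) × pt/(1 - pt). The sketch below computes this quantity for a model and for the "treat all" reference strategy; the labels and probabilities are hypothetical, not study data.

```python
def net_benefit(labels, probabilities, threshold):
    """Net benefit of treating patients with probability >= threshold."""
    if not 0.0 < threshold < 1.0:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    tp = sum(p >= threshold and y == 1
             for y, p in zip(labels, probabilities))
    fp = sum(p >= threshold and y == 0
             for y, p in zip(labels, probabilities))
    # False positives are weighted by the odds of the threshold.
    return tp / n - (fp / n) * threshold / (1.0 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Reference strategy: every nodule is treated as malignant."""
    return net_benefit(labels, [1.0] * len(labels), threshold)


# Hypothetical predictions (1 = malignant, 0 = benign).
labels = [1, 1, 1, 0, 0, 0]
probabilities = [0.9, 0.7, 0.4, 0.6, 0.3, 0.2]
nb_model = net_benefit(labels, probabilities, threshold=0.5)
nb_all = net_benefit_treat_all(labels, threshold=0.5)
print(round(nb_model, 3), round(nb_all, 3))
```

A decision curve plots these values over a range of thresholds; a model is considered useful where its curve lies above both the "treat all" curve and the "treat none" line at zero.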

Discussion
Solitary SMSPNs are common in clinical practice, and many such nodules are indeterminate by conventional CT features. In this study, we developed a CT-based nomogram incorporating the rad-score, clinical factors, and conventional CT features for the diagnosis of such nodules. The AUC values of the nomogram in the training and validation cohorts were 0.942 and 0.933, respectively. These values showed that the nomogram performed better than the radiomics model, the clinical model, and experienced radiologists who specialized in thoracic radiology. The calibration curve and decision curve showed that the nomogram demonstrated good consistency and clinical applicability.
Our study showed that a history of malignant tumor was an independent risk factor for malignancy in indeterminate SMSPNs (P < 0.05). Among the 33 cases with a history of malignant tumor, malignant nodules were seen in 28 (84.8%) cases and benign nodules in 5 (15.2%) cases. Metastasis was the most common pathological type, accounting for 24 cases. This is related to the propensity of malignant tumors to metastasize to the lung, and lung metastases typically exhibit a smooth margin on CT (21, 22). Therefore, for indeterminate SMSPNs, a history of malignant tumor strongly indicates a high likelihood of malignancy, especially metastasis, even if the nodule is solitary. Although gender and smoking history differed significantly between benign and malignant nodules in univariate analysis, multivariate logistic regression analysis showed that neither was an independent factor.
The calcification pattern in the nodule is helpful for nodule diagnosis (5, 10, 11). Benign calcification comprises four patterns: central, diffuse, laminated, and "popcornlike." The first three patterns are typically seen in chronic inflammatory nodules, whereas "popcornlike" calcifications are characteristic of hamartoma. Other calcification patterns include eccentric, punctate, stippled, and amorphous, which can be seen in both benign and malignant nodules. Although this study excluded nodules with benign calcification patterns, we found that the presence of other calcification patterns within indeterminate SMSPNs was still an independent predictor of benign nodules (P < 0.05). In this study, a total of 28 nodules had calcification, of which 24 were benign and only 4 were malignant. The most common pathological type of calcified nodule was hamartoma, accounting for 19 cases. The other calcified nodules included 3 inflammatory nodules, 2 squamous cell carcinomas, 2 adenocarcinomas, and 2 sclerosing pneumocytomas. Calcification was most commonly seen in hamartoma for the following reasons. First, hamartoma is mainly composed of cartilage, which exhibits variable degrees of calcification and ossification (23). Second, hamartoma was the most common pathological type of solitary indeterminate SMSPNs in this study, found in 61 cases.

Figure: Nineteen key radiomics features and corresponding coefficients.
Research has found that malignant nodules are more prone to pleural retraction than benign nodules (5, 6, 24). In this study, pleural retraction was found to be an independent risk factor for malignancy in indeterminate SMSPNs (P < 0.05). Pleural retraction was identified in 16 nodules, among which the most common pathological type was adenocarcinoma (14 cases); the others were 1 inflammatory nodule and 1 hamartoma. The pleural retraction observed in adenocarcinoma is most likely related to the epithelial-mesenchymal transition. Some studies have found that, among lung cancers, the epithelial-mesenchymal transition is most likely to occur in adenocarcinoma, and that solid adenocarcinomas are more likely than subsolid adenocarcinomas to undergo it; the transition can generate a contractile force that pulls on the pleura and causes pleural retraction (25, 26). Notably, all the nodules in this study were smoothly marginated without spiculation, and the pleural retraction was very mild or appeared only as slight tension on the interlobar fissure pleura. Therefore, thin-section images combined with 3D reconstruction are required for careful observation of pleural retraction.
Lobulation was defined as an abrupt bulging of the contour of the lesion. Many studies have found that the lobulation sign is more commonly seen in malignant nodules than in benign nodules (3, 8, 17, 24). Consistent with previous findings, our study found the lobulation sign in 71.3% of malignant nodules versus 53.2% of benign nodules (P < 0.05), and the lobulation sign was an independent risk factor for malignancy in SMSPNs (P < 0.05). It should be noted that a high proportion of hamartomas (67.2%) showed the lobulation sign in this study. Hamartoma is mainly composed of cartilage, which is arranged in lobules separated by cleft-like branching channels and cystic spaces lined by respiratory epithelium, [...] CT (23, 27, 28).
In this study, univariate analysis showed that mediastinal lymph node enlargement and the bronchial truncation sign were significantly more frequent in malignant nodules than in benign nodules (P < 0.05). However, multivariate logistic regression analysis showed that these factors were not independent factors for the diagnosis of indeterminate SMSPNs.
In addition to a detailed study of clinical factors and conventional CT signs, this study also investigated radiomics features extracted from the CT image of the nodule. A total of 1688 radiomics features were extracted, and 19 key radiomics features were retained after feature selection. Among the 19 features, 18 were high-order statistical features. In addition, there were 14 texture features and 5 first-order features, without shape-based features. These results indicated that the intensity information and the relationships between pixels captured by higher-order radiomics features within the nodules were meaningful for the diagnosis of indeterminate SMSPNs. The nomogram that incorporated the rad-score, clinical factors, and CT features achieved better diagnostic efficiency than the radiomics model, the clinical model, and experienced radiologists who specialized in thoracic radiology, which confirmed our hypothesis that adding radiomics features can improve the diagnostic ability for indeterminate SMSPNs.
At present, most research on the diagnosis of pulmonary nodules has focused on differentiating between benign and malignant nodules, with some studies on specific pathological subtypes, such as adenocarcinoma versus tuberculosis, or lung cancer versus organizing pneumonia (5, 6, 9, 15, 17). With the exception of juxtapleural nodules and nodules with benign calcification or fat, most solitary SMSPNs are difficult to diagnose by conventional CT features (10-14). Among the 205 indeterminate SMSPNs in this study, 111 (54.1%) were benign and 94 (45.9%) were malignant. A high proportion of patients with benign nodules underwent unnecessary surgery because of the low confidence in the diagnosis of such nodules by conventional CT. In clinical practice, solitary SMSPNs on CT tend to be diagnosed as benign; however, 45.9% of such nodules were malignant in this study. Misdiagnosis of malignant nodules as benign may result in uncontrolled tumor progression and poor prognosis. Therefore, a systematic study was carried out to differentiate benign from malignant indeterminate SMSPNs. We found that a history of malignant tumor, calcification, pleural retraction, and lobulation were independent factors for the diagnosis of indeterminate SMSPNs. In addition, we identified 19 key CT-based radiomics features as independent predictors for the diagnosis of indeterminate SMSPNs. Finally, a nomogram was constructed, and it achieved high efficiency in the preoperative diagnosis of indeterminate SMSPNs, which is of great significance in guiding clinical decision-making.
This study had some potential limitations. First, this work was a retrospective study, and some selection bias may exist. Second, this work was a single-center study and lacked external validation. There were 5 CT scanners in our hospital, and the training and validation cohorts were grouped according to the CT scanner used. The nomogram achieved good results in both the training and validation cohorts, suggesting that the model had good generalization capability. However, further validation of the nomogram is still needed through larger, prospective studies with more diverse datasets to assess its generalization capability.
In conclusion, this study developed a nomogram incorporating the rad-score, clinical factors, and CT features for the diagnosis of solitary indeterminate SMSPNs. The diagnostic ability of the nomogram was better than that of the radiomics model, the clinical model, and experienced radiologists who specialized in thoracic radiology. The nomogram provided a new method for the preoperative diagnosis of solitary indeterminate SMSPNs.

TABLE 1
The patients' characteristics and conventional CT features of the training and validation cohorts.

TABLE 2
The independent clinical factors and conventional CT features for the diagnosis of indeterminate smoothly marginated solid pulmonary nodules.

TABLE 3
Predictive performances of radiomics nomogram, radiomics model, clinical model, and radiologist's judgment in the training and validation cohorts.