Development and Validation of a Radiomics Model Based on 3-Dimensional Endoanal Rectal Ultrasound of Rectal Cancer for Predicting Lymph Node Metastasis

Background: To develop a radiomics model for predicting lymph node metastasis status in rectal cancer patients based on 3-dimensional endoanal rectal ultrasound images. Methods: This study retrospectively included 79 patients (41 lymph node metastasis positive and 38 lymph node metastasis negative) diagnosed with rectal cancer in our hospital from January 2018 to February 2022. The tumor region of interest was first delineated by radiologists, and radiomics features were extracted from it. Radiomics features were then selected by independent samples t-test, correlation coefficient analysis between features, and least absolute shrinkage and selection operator regression. Finally, a multilayer neural network model was developed using the selected radiomics features, and nested cross-validation was performed on it. The models were validated by assessing their diagnostic performance and comparing the areas under the curve and the precision-recall curves in the test set. Results: The area under the curve of the radiologists was 0.662 and the F1 score was 0.632. Thirty-four radiomics features were significantly associated with lymph node metastasis (P < .05), and 10 features were finally selected for developing multilayer neural network models. The areas under the curve of the multilayer neural network models were 0.787, 0.761, and 0.853, and the mean area under the curve was 0.800. The F1 scores of the multilayer neural network models were 0.738, 0.740, and 0.818, and the mean F1 score was 0.771. Conclusions: Radiomics models based on 3-dimensional endoanal rectal ultrasound can be used to identify lymph node metastasis status in rectal cancer patients with good diagnostic performance.


INTRODUCTION
Colorectal cancer is the third most common cancer and the fifth leading cause of cancer-related death in China. 1 Rectal cancer (RC) accounts for 30%-40% of colorectal cancers. The status of lymph node metastasis (LNM) in patients with RC has a vital influence on local recurrence, overall survival, and whether patients need to undergo neoadjuvant radiotherapy (NAT) or chemotherapy. 2 However, accurately predicting the LNM status of RC patients before surgery remains challenging with current medical technology. Meta-analyses have shown that even when endoanal rectal ultrasound (ERUS), computed tomography (CT), and magnetic resonance imaging (MRI) are used in combination, clinical lymph node staging remains uncertain. 3 Radiomics is a relatively recent approach that addresses clinical problems by extracting features from medical images and building machine learning models. Since it was proposed in 2012, 4 it has played an increasingly important role in cancer research, where it can improve the accuracy of cancer diagnosis, prognostic assessment, and metastasis prediction. 5 Machine learning models developed from radiomics features have been widely accepted as reliable tools for predicting clinical events and have successfully assisted in the diagnosis of several malignant tumors, preoperative prediction of LNM status, and prediction of radiotherapy and chemotherapy effects. 6-10 The multilayer neural network (MLP) is a machine learning model that belongs to the family of neural networks and is a nonlinear statistical classifier. 11,12 It has been widely used in radiomics research. Studies have shown that MLP performs well in the diagnosis of breast tumors, bladder tumors, and gastric tumors. 13-15 It has even shown good results in predicting the incidence of COVID-19. 16
Three-dimensional endoanal rectal ultrasound (3D-ERUS) is a new technology that automatically acquires volume data of the tissues around the rectum and provides more information than traditional 2-dimensional rectal ultrasound (2D-ERUS). It shows higher accuracy than 2D-ERUS or CT in predicting RC staging and LNM status. 17 To our knowledge, no study has evaluated whether MLP models based on 3D-ERUS radiomics features can improve prediction of LNM status in RC. Therefore, this study aimed to establish a radiomics model for preoperative prediction of LNM status in RC patients.

MATERIALS AND METHODS
This study involving human participants was reviewed and approved by the Second Affiliated Hospital of Guangzhou Medical University. Written informed consent for participation was not required for this study in accordance with the national legislation and the institutional requirements.

Study Design
The overall design of our study is illustrated in Figure 1, including patient recruitment, tumor segmentation and feature extraction, model development with nested cross-validation, and evaluation of model performance.

Patients
From January 2018 to February 2022, a total of 109 RC patients confirmed by histopathology were retrieved using the Picture Archiving and Communication System workstation in our institution. Our inclusion criteria were as follows: (1) 3D-ERUS was performed in the patient prior to surgery; (2) the diagnosis of rectal cancer was confirmed through histopathological analysis; (3) the time interval between 3D-ERUS and radical RC resection was <1 month. Exclusion criteria were: (1) the mass was a high RC that could not be examined by 3D-ERUS (n = 14); (2) the mass was too large to be fully included in the ultrasound scan, so the mass could not be completely displayed (n = 7); (3) NAT was performed before 3D-ERUS (n = 9). A total of 79 patients diagnosed with RC were included in this retrospective study, including 41 (51.9%) males and 38 (48.1%) females. Based on histopathological examination results, these patients were divided into an LNM-positive group (stage N1-2) and an LNM-negative group (stage N0).

Three-Dimensional Endoanal Rectal Ultrasound
The ultrasound equipment used was a BK Pro Focus 2202 ultrasound system equipped with the 8820 3D intra-anal probe. The patient received a cleansing enema 2 hours before 3D-ERUS. During the ultrasound examination, 50 mL of warm couplant was injected into the patient's rectum through the anus, then the probe was inserted and the tumor was placed in the center of the image, and finally the automatic 3D scanning procedure was started. All 3D-ERUS images were acquired with the following parameters: MI 0.86 <1.90, TIS 0.1 <4.0, Res/Hz 2/38 Hz, B Gain 58%, DynRange 71 dB, Harmonic off, persist 1, Edge 3, Noise Reject 15, ACI On, ETC 3.

Radiologists
Three-dimensional endoanal rectal ultrasound images of RC were reviewed by 2 experienced radiologists blinded to pathological information to assess LNM status, as shown in Figure 2. Any disagreement was resolved by consultation.

Image Segmentation
Regions of interest were manually drawn on the largest transverse section of the tumor by 2 experienced radiologists using the ITK-SNAP 3.8 software (https://www.itksnap.org), as shown in Figure 1B. Each disagreement was resolved through discussion. Radiologists were blinded to clinical information.

Radiomics Feature Extraction
After segmentation of the 3D-ERUS images was complete, radiomics analysis was performed with a Python program. 18 A total of 1694 radiomics features in 8 categories were extracted for each patient, including (a) first-order statistics, (b) shape-based (3D), (c) shape-based (2D), (d) gray-level co-occurrence matrix, (e) gray-level run length matrix, (f) gray-level size zone matrix, (g) neighboring gray tone difference matrix, and (h) gray-level dependence matrix. Receiver operating characteristic curve and precision-recall curve (P-R curve) analyses were used to evaluate model performance.

Feature Normalization
After feature extraction, the value ranges of the features differ considerably, with some ranging from 0 to 10 and others from 0 to 1000. However, a feature with larger values is not necessarily more valuable to the model. Thus, we normalized the values of the different features so that all values fall into the same numeric interval. 19

Radiomics Feature Selection
Because the dimensionality of the radiomics features (n = 1694) was very high compared to the sample size of the study cohort (n = 79), feature selection was necessary to avoid overfitting. To reduce dimensionality, we designed a 3-step feature screening procedure. First, an independent samples t-test was used to remove features that did not differ significantly between the LNM-positive and LNM-negative groups. Second, correlation analysis was performed between features, and highly correlated features (correlation coefficient above 0.80) were removed. Finally, the least absolute shrinkage and selection operator (LASSO) method was applied for feature screening.
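The normalization and correlation-filter steps described above can be sketched as follows. This is a minimal illustration in plain Python, not the study's code. It assumes min-max scaling to [0, 1] (the text does not name the normalization method), Pearson correlation, and a greedy rule that keeps the first feature of any pair whose absolute correlation exceeds 0.80. The t-test and LASSO steps are omitted, and all function names and toy values are hypothetical.

```python
import math

def min_max_scale(column):
    """Rescale one feature column to the interval [0, 1] (assumed method)."""
    lo, hi = min(column), max(column)
    if hi == lo:                      # constant feature: nothing to rescale
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:            # correlation is undefined for constants
        return 0.0
    return cov / (sx * sy)

def drop_correlated(features, threshold=0.80):
    """Keep a feature only if |r| <= threshold against every feature kept so far."""
    kept = []
    for name, column in features.items():
        if all(abs(pearson(column, features[k])) <= threshold for k in kept):
            kept.append(name)
    return kept

# Toy data: three hypothetical features measured on five patients.
raw = {
    "f_a": [2.0, 4.0, 6.0, 8.0, 10.0],
    "f_b": [210.0, 395.0, 610.0, 790.0, 1000.0],  # nearly proportional to f_a
    "f_c": [5.0, 1.0, 4.0, 2.0, 3.0],
}
scaled = {name: min_max_scale(col) for name, col in raw.items()}
selected = drop_correlated(scaled)
print(selected)  # f_b is dropped because it is almost collinear with f_a
```

In this sketch the order of the features decides which member of a correlated pair survives; a real pipeline might instead keep the member with the smaller t-test P value.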

Model Development
We developed an MLP as a preliminary deep learning model. Grid-search cross-validation (Grid-search CV) was used for hyperparameter selection during model building. 18 To keep the model from becoming too complex, we limited the number of hidden layers of the MLP to fewer than 5 in the Grid-search CV.
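As an illustration of how a search grid can respect the limit on hidden layers, the sketch below enumerates candidate configurations with 1 to 4 hidden layers. Only the constraint of fewer than 5 hidden layers comes from the text; the layer widths, learning rates, equal-width layers, and function names are hypothetical choices made for the example.

```python
from itertools import product

def hidden_layer_candidates(widths, max_layers=4):
    """List layer-size tuples with 1..max_layers hidden layers of equal width."""
    return [(w,) * depth for depth in range(1, max_layers + 1) for w in widths]

def build_grid(widths, learning_rates, max_layers=4):
    """Cartesian product of architecture and learning-rate candidates."""
    layers = hidden_layer_candidates(widths, max_layers)
    return [
        {"hidden_layers": h, "learning_rate": lr}
        for h, lr in product(layers, learning_rates)
    ]

# Hypothetical search space: 2 widths x 4 depths x 2 learning rates.
grid = build_grid(widths=[8, 16], learning_rates=[0.001, 0.01])
print(len(grid))   # 16 candidate configurations
print(grid[0])     # {'hidden_layers': (8,), 'learning_rate': 0.001}
```

Each candidate would then be scored by cross-validation on the training data, and the best-scoring configuration retained.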

Nested Cross-Validation
Because of the small sample size of this study, the estimated model performance may deviate from the actual performance. Therefore, this study used 3*3 nested cross-validation to validate the model results. 20
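The structure of a 3*3 nested cross-validation can be sketched as below: an outer loop holds out one of 3 folds as the test set, and an inner loop splits the remaining data into 3 folds for training and validation (for example, for hyperparameter selection). This is a simplified, unstratified sketch using only the Python standard library; the function names and the random seeds are assumptions, and the study's own splitting procedure may differ.

```python
import random

def k_fold_indices(indices, k, seed):
    """Shuffle indices and split them into k near-equal folds."""
    shuffled = list(indices)
    random.Random(seed).shuffle(shuffled)
    return [shuffled[i::k] for i in range(k)]

def nested_cv_splits(n_samples, outer_k=3, inner_k=3, seed=0):
    """Yield (outer_test, inner_splits); inner_splits is a list of
    (inner_train, inner_val) pairs drawn only from the outer training data."""
    outer_folds = k_fold_indices(range(n_samples), outer_k, seed)
    for i, outer_test in enumerate(outer_folds):
        outer_train = [idx for j, fold in enumerate(outer_folds) if j != i for idx in fold]
        inner_folds = k_fold_indices(outer_train, inner_k, seed + 1 + i)
        inner_splits = []
        for m, inner_val in enumerate(inner_folds):
            inner_train = [idx for q, fold in enumerate(inner_folds) if q != m for idx in fold]
            inner_splits.append((inner_train, inner_val))
        yield outer_test, inner_splits

# 79 samples, matching the cohort size reported in this study.
splits = list(nested_cv_splits(79))
for outer_test, inner_splits in splits:
    print(len(outer_test), [len(val) for _, val in inner_splits])
```

Because the outer test fold never enters the inner loop, the performance measured on it is not influenced by hyperparameter selection.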

RESULTS

Clinical Characteristics
In this study, 79 patients with RC were finally included, and they were divided into an LNM-positive group (41 cases) and an LNM-negative group (38 cases) according to the pathologically diagnosed LNM status. The clinical data of the 2 groups are shown in Table 1. Among them, the lymph node status reported by the radiologists and the pathological T stage were significantly different (P < .05).

Radiologists
Among the 79 patients, the radiologists correctly diagnosed 23 of the 41 lymph node-positive patients and 29 of the 38 lymph node-negative patients. The AUC was 0.662, the F1-score was 0.632, the specificity was 0.763, the sensitivity was 0.561, and the accuracy was 65.8%.
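The reported sensitivity, specificity, accuracy, and AUC can be reproduced from a 2 × 2 confusion matrix. The sketch below takes 23 of the 41 LNM-positive patients and 29 of the 38 LNM-negative patients as correctly classified, which is what the reported sensitivity (0.561) and specificity (0.763) imply. For a single binary reading, the area under the ROC curve reduces to the mean of sensitivity and specificity; the function and variable names are hypothetical.

```python
def binary_metrics(tp, fn, tn, fp):
    """Diagnostic metrics for one binary reading (a single operating point)."""
    sensitivity = tp / (tp + fn)          # true-positive rate
    specificity = tn / (tn + fp)          # true-negative rate
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    # With one threshold the ROC curve is (0,0) -> (1-spec, sens) -> (1,1),
    # so its trapezoidal area is the mean of sensitivity and specificity.
    single_point_auc = (sensitivity + specificity) / 2
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "auc": single_point_auc,
    }

# 41 LNM-positive patients: 23 correct, 18 missed.
# 38 LNM-negative patients: 29 correct, 9 false positives.
radiologist = binary_metrics(tp=23, fn=18, tn=29, fp=9)
for name, value in radiologist.items():
    print(f"{name}: {value:.3f}")
# sensitivity: 0.561, specificity: 0.763, accuracy: 0.658, auc: 0.662
```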

Radiomics Feature Extraction and Selection
In the 3D-ERUS images, we extracted a total of 1694 radiomics features in 8 categories. Our results showed that 34 radiomics features were significantly associated with LNM status (P < .05) (Figure 3A and B). After the 3-step procedure, 10 features were finally selected for developing the radiomics model (Figure 3B and D).

Diagnostic Performance of Radiomics Models
The 3*3 nested cross-validation results are shown in Table 2. The AUCs of the 3 MLP models were 0.780, 0.761, and 0.853, and the average AUC was 0.798. The F1 scores of the 3 MLP models were 0.738, 0.740, and 0.818, and the average F1-score was 0.771 (Figure 4A and B). The nested cross-validation results show that the diagnostic performance of the MLP models was higher than that of the radiologists.
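For readers unfamiliar with the two metrics, the sketch below shows one way to compute an AUC and an F1 score from predicted probabilities. The AUC is computed as the fraction of positive-negative pairs in which the positive case receives the higher score (ties count as one half), which is equivalent to the area under the ROC curve. The labels and scores are toy values, not study data, and the 0.5 decision threshold for F1 is an assumption.

```python
def roc_auc(labels, scores):
    """AUC as the probability that a positive case outscores a negative case."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def f1_score(labels, scores, threshold=0.5):
    """F1 = 2TP / (2TP + FP + FN) after thresholding the scores."""
    preds = [1 if s >= threshold else 0 for s in scores]
    tp = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 1)
    fp = sum(1 for y, p in zip(labels, preds) if y == 0 and p == 1)
    fn = sum(1 for y, p in zip(labels, preds) if y == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0

# Toy example: 3 positive and 3 negative cases with hypothetical model scores.
labels = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2]
auc = roc_auc(labels, scores)
f1 = f1_score(labels, scores)
print(f"AUC = {auc:.3f}, F1 = {f1:.3f}")  # AUC = 0.889, F1 = 0.667
```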

DISCUSSION
This study aimed to develop a 3D-ERUS image-based radiomics model to predict LNM status in RC patients. In 3*3 nested cross-validation, the mean AUC and mean F1 score of the MLP models were higher than those of the radiologists, showing good performance in predicting LNM status in RC patients.
The LNM status in RC patients has a vital influence on local recurrence, overall survival, and whether patients need to undergo NAT. The LNM status of RC patients is a reference indicator for deciding whether to perform NAT. For patients with advanced RC, surgical resection after NAT treatment could reduce the risk of local recurrence by 50%-61% compared to surgery alone. 22 Therefore, accurate preoperative assessment of LNM status is crucial for optimizing treatment regimens and prognostic prediction. However, it remains a challenge to assess LNM status before surgery.
In this study, radiologists performed poorly in predicting LNM status in RC patients using 3D-ERUS images (AUC = 0.662), which was consistent with previous studies. 3 Such inaccurate assessment may mislead clinicians and lead to inappropriate treatment choices.
Radiomics can extract information from images to detect differences that cannot be detected by visual inspection. 4,5 Previous studies have predicted the LNM status of RC patients based on the imaging characteristics of 2D-ERUS, CT, and MRI. 8 11,12 The MLP models have been effectively applied to the diagnosis of liver cancer and breast cancer, to predict LNM status in breast cancer patients, and to assess the risk of cardiovascular disease in patients. 35-37 It is worth noting that adding hidden layers also increases the complexity of the model. Studies have shown that an overly complex model can easily lead to overfitting and reduce the performance of the model on unseen data. To avoid this situation, this study used grid-search CV to optimize the model hyperparameters while limiting the number of hidden layers in the model.
To assess how the model would perform in clinical use, the model was validated. Because of the small sample size of this study, the model is prone to overfitting. In layman's terms, overfitting means that an artificial intelligence (AI) model learns in a way that is only applicable to the training samples and no longer generalizes to the entire population. 29,38 An overfitted model has extremely high performance on the training set and much lower performance on the validation and test sets. Therefore, this study used 3*3 nested cross-validation to verify the model results. The 3*3 nested cross-validation results show that all 3 MLP models performed well on both the validation set and the test set. The diagnostic performance on the test set and validation set was lower than that on the training set, but the gap was within an acceptable range. It is worth noting that for each model, the validation and test sets consist of data held out from the training of that model, which indicates that the model has a low degree of overfitting. Based on the mean AUC and mean F1-score, the MLP models showed good diagnostic performance in multiple validations, and all AUCs exceeded that of the radiologists, showing good reliability and repeatability.
Nevertheless, our study has several limitations. First, our relatively small sample size may cause unstable results, even with nested cross-validation. Second, this study lacks multi-institution validation of the radiomics features. Finally, instead of a 3D analysis of the entire lesion volume, we performed a 2D analysis of the region of interest in the largest cross-sectional slice of the lesion. This method is less labor intensive but less sensitive to intratumoral changes.

CONCLUSION
In conclusion, the radiomics features based on 3D-ERUS images are of great value for identifying the LNM status of RC patients, and the diagnostic performance of the MLP models constructed from them is better than that of radiologists. Multicenter retrospective validation and prospective randomized clinical trials should be performed in subsequent studies to obtain high-level evidence for the clinical application of this radiomics model.