Distinguishing Lymphomatous and Cancerous Lymph Nodes in 18F-Fluorodeoxyglucose Positron Emission Tomography/Computed Tomography by Radiomics Analysis

Background. The National Comprehensive Cancer Network guidelines recommend excisional biopsies for the diagnosis of lymphomas. However, resection biopsies in all patients who are suspected of having malignant lymph nodes may cause unnecessary injury and increase medical costs. We investigated the usefulness of 18F-fluorodeoxyglucose positron emission tomography/computed tomography (18F-FDG PET/CT)-based radiomics analysis for differentiating between lymphomatous lymph nodes (LLNs) and cancerous lymph nodes (CLNs). Methods. Using texture analysis, radiomic parameters were extracted with the LIFEx package from the 18F-FDG PET/CT images of 492 lymph nodes (373 LLNs and 119 CLNs). Predictive models were generated by binary logistic regression from the six parameters with the largest area under the receiver operating characteristic curve (AUC) in PET or CT images in the training set (70% of the data). These models were applied to the test set to calculate predictive variables, including the combination of PET and CT predictive variables (PREcombination). The AUC, sensitivity, specificity, and accuracy were used to compare the differentiating ability of the predictive variables. Results. With the pathological diagnosis of the patient's primary tumor as the reference, the AUC, sensitivity, specificity, and accuracy of PREcombination in differentiating between LLNs and CLNs were 0.95, 91.67%, 94.29%, and 92.96%, respectively. Moreover, PREcombination could effectively distinguish LLNs caused by different lymphoma subtypes (Hodgkin's lymphoma and non-Hodgkin's lymphoma) from CLNs, with AUCs of 0.85 and 0.90, sensitivities of 77.78% and 77.14%, specificities of 97.22% and 88.89%, and accuracies of 90.74% and 83.10%, respectively. Conclusions. Radiomics analysis of 18F-FDG PET/CT images may provide a noninvasive, effective method to distinguish LLNs from CLNs and inform the choice between fine-needle aspiration and excisional biopsy for sampling suspected lymphomatous lymph nodes.


Introduction
Lymph node enlargement has several causes, including invasion by lymphomas and metastatic cancers (not including lymphoma), which result in lymphomatous lymph nodes (LLNs) and cancerous (not including lymphomatous) lymph nodes (CLNs), respectively [1,2]. Traditional imaging techniques, such as ultrasonography (US), computed tomography (CT), and magnetic resonance imaging (MRI), often rely on the size, location, and shape to determine the nature of a swollen lymph node [3]. However, these features alone are insufficient for a reliable and effective judgment [4]. Therefore, newer imaging techniques, such as contrast-enhanced ultrasonography (CEUS) and positron emission tomography/computed tomography (PET/CT), are currently used to assess the nature of lymph nodes. A previous study demonstrated that LLNs often exhibit a rapid, well-distributed hyperenhancement pattern on CEUS, which may help to distinguish between lymphomatous and benign lymph nodes [5]. Several studies have reported that PET/CT can help distinguish between benign and malignant lymph nodes and provide useful evidence for medical decisions [6][7][8].
The choice between excisional biopsy and needle biopsy for a suspicious malignant lymph node has always been difficult for clinicians. Pathological examination is the gold standard procedure for the diagnosis of malignant lymph nodes. Several medical institutions use fine-needle aspiration (FNA) for preliminary screening of enlarged lymph nodes [9]. FNA is a highly accurate and feasible procedure to diagnose most tumors [10][11][12]. However, it is difficult to accurately diagnose lymphomas by FNA [13]. Even in patients who have undergone FNA, a lymph node resection may nevertheless be required for further examination [14,15], resulting in the unnecessary burden of multiple invasive procedures. Excisional or incisional biopsy is clearly recommended by the National Comprehensive Cancer Network guidelines. Completely resected lymph nodes can provide adequate tissue for histological, immunological, and molecular biological assessments and can enable the accurate diagnosis of lymphoma and the differentiation of its various subtypes [16,17]. Because of the complexity of diagnosis, lymphomas may be misdiagnosed as benign conditions, such as reactive immunoblastic proliferation [18], autoimmune lymphoproliferative syndrome [19], and lymph-node infarction [20], or as other tumors. There are vast differences in the treatment of and evaluation methods for different tumors. Therefore, an effective method for differentiating LLNs from CLNs without additional investigation, time, or cost is an urgent unmet need in clinical practice.
Using texture analysis (TA), radiomics can be utilized to analyze the voxel gray levels, as well as the distribution and relationship of pixels in US, CT, MRI, and PET/CT images to obtain radiomic features, thereby providing an objective and quantitative assessment of tumor heterogeneity [21,22]. This mode of analysis has been applied to a variety of imaging techniques to distinguish between benign and malignant lesions, such as US imaging of the thyroid [23] and liver [24], CT imaging of the lungs [25] and kidneys [26], and MRI of the breasts [27]. Moreover, it has been used to evaluate the prognosis of esophageal [28], lung [29], and hypopharyngeal [30] cancers. The results of the abovementioned studies have been mainly limited to the differentiation between benign and malignant lesions; studies reporting on the differentiation of tumor types are rare [31][32][33]. We previously identified the benefits of applying radiomics analysis to PET/CT images to distinguish between renal cell carcinoma and renal lymphoma [34] and between breast carcinoma and breast lymphoma [33]. In this study, we focused on the usefulness of 18F-fluorodeoxyglucose (FDG) PET/CT-based radiomics analysis for distinguishing between LLN and CLN, and, moreover, used this method to classify the different subtypes of LLN in patients with lymphomas. We believe that the results of this study can help clinicians decide the optimal biopsy method (FNA or excision) for sampling diseased lymph nodes and thereby reduce the rates of misdiagnosis and unnecessary application of other invasive procedures.

General Demographic Data.
We evaluated all patients who underwent 18F-FDG PET/CT at our hospital between October 2013 and June 2018. The inclusion criteria were as follows: (1) presence of any solid tumor or lymphoma confirmed by pathological diagnosis, (2) lymph nodes that were invaded by lymphoma or cancer, as determined by experienced radiologists and oncologists based on patient imaging characteristics (such as FDG uptake and lesion morphology) and clinical information (e.g., treatment status and symptoms), and (3) no history of any systemic treatment before undergoing 18F-FDG PET/CT. The exclusion criteria were as follows: (1) unknown or uncertain pathological diagnosis and (2) combined diagnosis of lymphoma and cancer. Due to the retrospective nature of this study, which spanned a long period of time, informed consent was not sought from the subjects. This study was approved by the Institutional Review Board of our Hospital (Nos. 2019310 and 2019410).
Of note, the diagnosis of lymph node tumor invasion in patients was a clinical diagnosis rather than a pathological diagnosis. The diagnoses were mainly made using the patient's imaging report based on the level of FDG uptake and the morphology of the lymph nodes (such as the presence of fusion, significantly increased volume, irregular edges, and irregular density). All imaging reports were issued by a junior nuclear medicine physician of the Department of Nuclear Medicine in West China Hospital and reviewed by at least one senior nuclear medicine physician. Furthermore, the oncologist also consulted the patient's case data, such as other imaging examinations, signs and symptoms, changes in the lymph node after treatment, and pathological diagnosis. The lymph node was included in the analysis only when the imaging report demonstrated that the node was invaded and other evidence was not contradictory. The lymph nodes were divided into LLN, LLN caused by Hodgkin's lymphoma (HLLN), LLN caused by non-Hodgkin's lymphoma (NHLLN), and CLN groups according to this diagnosis.

Image Acquisition and Clinical Data Collection.
All patients underwent an 18F-FDG PET/CT examination using a Gemini GXL PET/CT scanner (with a 16-slice CT, Philips Medical Systems, Cleveland, Ohio, USA). All patients were instructed to fast for more than 6 hours before the examination, and blood glucose was measured before the examination to ensure that the blood glucose levels were below 8.0 mmol/L. An initial low-dose CT (120 kV, 40 mAs, 5 mm slice thickness) was performed approximately 1 hour after the intravenous injection of 5.18 MBq/kg (1.4 × 10^−4 Ci/kg) of 18F-FDG, followed immediately by a whole-body PET scan (head to extremities). Subsequently, PET/CT images (attenuation corrected, based on CT) were generated and interpreted by experienced radiologists. The PET and CT images were reconstructed based on the European Association of Nuclear Medicine Research Ltd (EARL) guidelines (voxel size 4 × 4 × 4 mm for PET and 1.2 × 1.2 × 5 mm for CT). The data on participants' age, weight, and gender were also recorded.

Texture Analysis.
Texture analysis (TA) is a mathematical description of the gray level intensity and distribution of pixels or voxels in images [35]. It can be used to describe various parameters through statistics-based [36], model-based [37], or transform-based approaches [38] to express heterogeneity within lesions [39]. We used the LIFEx software (version 3.74, IMIV, CEA, Inserm, CNRS, Univ. Paris-Sud, Université Paris Saclay, CEA-SHFJ, Orsay, France) to perform TA of the PET and CT images. Three-dimensional volumes of interest (VOIs) for each malignant lymph node (no limit to the maximum number in a single patient) were delineated manually in every slice by a well-trained radiologist (2 years' work experience). VOIs on PET and CT images were exactly consistent with each other. VOIs smaller than 64 pixels were excluded from the analysis. During delineation, the tumor in each slice was carefully assigned to the VOI, and the surrounding tissues were excluded. Intensity discretization was performed automatically by the software: for PET images, 64 gray-level bins were used with absolute scale bounds between 0 and 20; for CT images, 400 gray-level bins were used with absolute scale bounds between −1000 and 3000 Hounsfield units (HU) [33]. Thereafter, extraction-based algorithms (fixed thresholding at 40% of maximum intensity cutoff) [39] were used to extract the radiomic parameters for both the PET and CT slices, including standardized uptake values (SUV) or CT-value parameters, PET parameters, and radiological parameters. The radiological parameters were divided into six groups: shape, gray-level zone-length matrix (GLZLM), gray-level run-length matrix (GLRLM), neighborhood gray-level different matrix (NGLDM), gray-level co-occurrence matrix (GLCM), and histogram (HISTO).
During the extraction of radiomic parameters, the researcher was blinded to the patient's clinical information. Resampling voxel size was set at 4 × 4 × 4 mm (PET) and 1.2 × 1.2 × 5 mm (CT). We also measured the short/long diameter of the selected lymph nodes. Forty-seven parameters were extracted from each of the PET and CT images (94 in total). The image processing and radiomics workflow in this study followed the image biomarker standardization initiative (IBSI) guidelines.
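The intensity discretization described above (a fixed number of bins between absolute bounds) can be illustrated with a minimal sketch. This is not the LIFEx implementation, whose exact rounding rule is not given here; the function name `discretize_fixed_bins` and the sample intensities are hypothetical, and values outside the bounds are assumed to be clipped.

```python
import numpy as np

def discretize_fixed_bins(values, lower, upper, n_bins):
    """Map intensities to gray levels 1..n_bins over absolute bounds [lower, upper].

    Assumption: out-of-range intensities are clipped to the bounds.
    """
    values = np.asarray(values, dtype=float)
    clipped = np.clip(values, lower, upper)
    bin_width = (upper - lower) / n_bins
    levels = np.floor((clipped - lower) / bin_width).astype(int) + 1
    # An intensity exactly at the upper bound belongs to the last bin.
    return np.minimum(levels, n_bins)

# PET setting from the text: 64 bins between 0 and 20 (hypothetical SUVs).
pet_levels = discretize_fixed_bins([0.0, 2.5, 19.9, 25.0], 0.0, 20.0, 64)
# CT setting from the text: 400 bins between -1000 and 3000 HU (hypothetical HUs).
ct_levels = discretize_fixed_bins([-1000.0, 40.0, 3000.0], -1000.0, 3000.0, 400)
```

With these settings the bin width is 0.3125 for PET and 10 HU for CT, so a 10 HU change in CT value moves a voxel by one gray level.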

Statistical Analyses.
Data were randomly divided into a training set and a test set (ratio 7 : 3), with the ratio of lymph node types kept constant between the two sets. If the numbers of lymph nodes in the two types being compared were extremely unbalanced (a ratio of less than 1 : 3), the sample sizes were equalized by oversampling in the training set: samples in the smaller group were randomly selected and replicated until the amount of data equaled that in the other group. In the training set, we used the area under the receiver operating characteristic (ROC) curve (AUC) to identify and compare the screening effectiveness of the radiomic parameters. Optimal radiomic parameters (the six parameters with the largest AUC in PET or CT images) were used in the modeling by binary logistic regression; based on these models, new predictive variables were created, including the CT predictive variable (PREct), the PET predictive variable (PREpet), and the combination of PET and CT predictive variables (PREcombination). We chose six parameters because, according to the preanalysis, the six parameters with the largest AUC were relatively stable, and this number of parameters was sufficient to achieve a good AUC, sensitivity, and specificity in both the training and test sets. Then, we predicted the type of malignant lymph nodes in the test set by using the predictive models generated in the training set and compared the predictions with the pathological results of the primary tumor. In addition, we performed the abovementioned analysis with the maximum standardized uptake value (SUVmax) as a separate prediction parameter. ROC curves and AUCs were used to compare and analyze the new predictive variables in the training and test sets. The best cutoff score corresponded to the top-left point on the ROC curves in the training set [40]. The AUCs of these predictive variables in the test set were compared, and the result was verified by the z test [41].
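The oversampling, AUC, and top-left cutoff steps described above can be sketched as follows. This is an illustrative reading of the text rather than the authors' code: the function names, the class coding (1 for the class treated as positive), the random seed, and the example scores are all assumptions, and both classes are assumed to be present.

```python
import random

def oversample_minority(samples, labels, min_ratio=1 / 3, seed=0):
    """Replicate random minority samples until both classes are equal in size.

    Applied only when minority/majority < min_ratio (the 1 : 3 rule in the text).
    """
    rng = random.Random(seed)
    groups = {0: [], 1: []}
    for sample, label in zip(samples, labels):
        groups[label].append(sample)
    minority, majority = sorted(groups, key=lambda k: len(groups[k]))
    if len(groups[minority]) / len(groups[majority]) >= min_ratio:
        return list(samples), list(labels)
    n_extra = len(groups[majority]) - len(groups[minority])
    extra = [rng.choice(groups[minority]) for _ in range(n_extra)]
    return list(samples) + extra, list(labels) + [minority] * n_extra

def auc(scores, labels):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def top_left_cutoff(scores, labels):
    """Cutoff whose ROC point lies closest to (FPR = 0, TPR = 1)."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best = None
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= t)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < t)
        sens, spec = tp / n_pos, tn / n_neg
        dist = (1 - sens) ** 2 + (1 - spec) ** 2
        if best is None or dist < best[0]:
            best = (dist, t, sens, spec)
    return best[1], best[2], best[3]

# Hypothetical scores for nine nodes (not study data).
scores = [0.05, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8]
labels = [0, 0, 0, 0, 1, 0, 1, 1, 1]
example_auc = auc(scores, labels)
example_cutoff = top_left_cutoff(scores, labels)
```

Ranking the radiomic parameters then amounts to calling `auc` once per parameter on the training set and keeping the six largest values.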

Patient Characteristics and Imaging Parameters.
A total of 492 lymph nodes from 324 eligible patients were screened (Table 1). The SUVmax, mean CT value, long/short diameter of the lymph nodes, and patient characteristics for each group in the training set and test set are presented in Table 2.

ROC Analysis of Potentially Optimal Radiomic Parameters.
In the LLN vs CLN, HLLN vs CLN, and NHLLN vs CLN groups, the six radiomic parameters with the largest AUC did not differ appreciably, with CTvalue_min_CT and NGLDM_Coarseness_PET being the parameters with the largest AUC in all three groups. In the HLLN vs NHLLN group, the six radiomic parameters with the largest AUC differed from those in the previous three groups. In other words, the radiomic parameters suited for distinguishing between LLN and CLN differ from those suited for distinguishing HLLN from NHLLN (Table 3).

Regression Coefficient and Radiomic Predictive Variables.
The six radiomic parameters with the largest AUC in each group were used for modeling by binary logistic regression (Table 4). Based on these models, we derived new predictive variables for each lymph node (PREct, PREpet, and PREcom, i.e., PREcombination) and evaluated their cutoff values (Table 5). Two examples are presented in Figure 2 and Table 6 to describe how the abovementioned predictive models are applied to the test set to derive predictive variables and then predict the type of malignant lymph node.
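For readers unfamiliar with how a binary logistic regression model yields a predictive variable, the general form can be written as below. This is the standard logistic model, shown under the assumption that each predictive variable is the fitted probability; the symbols are introduced here for illustration, with x_i the six selected radiomic parameters, β_0 and β_i the intercept and regression coefficients (Table 4), and c the cutoff value (Table 5). Which lymph node type is coded as 1 is not stated in the text.

```latex
\mathrm{PRE} = \frac{1}{1 + \exp\!\left[-\left(\beta_0 + \sum_{i=1}^{6} \beta_i x_i\right)\right]},
\qquad
\text{predicted class} =
\begin{cases}
1, & \mathrm{PRE} \ge c \\
0, & \mathrm{PRE} < c
\end{cases}
```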

ROC Test of Predictive Variables.
The methods described in the above examples were applied to all lymph nodes in the training and test sets to derive the corresponding predictive variables (PREct, PREpet, and PREcom) and prediction results. ROC analysis was used to test the predictive performance of these predictive variables. In the training set, for distinguishing between LLN and CLN, PREcombination yielded an AUC of 0.96, sensitivity of 92.86%, specificity of 93.98%, and accuracy of 92.81% (the cutoff point was 0.39); in the test set, the AUC was 0.95, while sensitivity, specificity, and accuracy were 91.67%, 94.29%, and 92.96%, respectively. SUVmax was not a good predictor in any group, with an AUC < 0.75 for each group. Other outcomes for the differentiation between HLLN and NHLLN, as well as for the differentiation of HLLN or NHLLN from CLN, are presented in Table 5. The ROC curves for all predictive variables for all test groups are shown in Figure 3.
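As a reminder of how the reported sensitivity, specificity, and accuracy follow from a predictive variable and its cutoff, a minimal sketch is given below. The function name, the class coding (1 for the class treated as positive), and the six example nodes are hypothetical; only the cutoff of 0.39 is taken from the text.

```python
def diagnostic_metrics(scores, labels, cutoff):
    """Sensitivity, specificity, and accuracy when score >= cutoff predicts class 1."""
    tp = fp = tn = fn = 0
    for score, label in zip(scores, labels):
        predicted = 1 if score >= cutoff else 0
        if predicted == 1 and label == 1:
            tp += 1
        elif predicted == 1 and label == 0:
            fp += 1
        elif predicted == 0 and label == 0:
            tn += 1
        else:
            fn += 1
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
    }

# Hypothetical predictive variables for six nodes, evaluated at the 0.39 cutoff.
metrics = diagnostic_metrics(
    scores=[0.90, 0.50, 0.20, 0.10, 0.45, 0.30],
    labels=[1, 1, 1, 0, 0, 0],
    cutoff=0.39,
)
```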

Comparison of the Diagnostic Ability among the Three Radiomic Predictive Variables and SUVmax.
PREcom had the highest AUC in each group, with some differences being statistically significant. The difference in AUC value between PREct and PREpet was not statistically significant in most groups. The AUC value of SUVmax was significantly lower than that of the other predictive variables in each group (Table 7, Figure 3).
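The text cites a z test for comparing AUCs without giving its form. One common choice, sketched here purely as an assumption, uses the Hanley and McNeil standard error and treats the two AUCs as independent. Because the predictive variables in this study were computed on the same lymph nodes, a correlation term would normally be subtracted in the denominator, so this simplified version is likely conservative. The function names and group sizes are hypothetical; 0.95 is the PREcombination test-set AUC from the text, and 0.75 stands in for the upper bound reported for SUVmax.

```python
import math

def hanley_mcneil_se(auc, n_pos, n_neg):
    """Standard error of an AUC under the Hanley and McNeil approximation."""
    q1 = auc / (2 - auc)
    q2 = 2 * auc ** 2 / (1 + auc)
    variance = (
        auc * (1 - auc)
        + (n_pos - 1) * (q1 - auc ** 2)
        + (n_neg - 1) * (q2 - auc ** 2)
    ) / (n_pos * n_neg)
    return math.sqrt(variance)

def compare_aucs(auc_a, auc_b, n_pos, n_neg):
    """Two-sided z test for two AUCs, assuming they are independent (simplification)."""
    se_a = hanley_mcneil_se(auc_a, n_pos, n_neg)
    se_b = hanley_mcneil_se(auc_b, n_pos, n_neg)
    z = (auc_a - auc_b) / math.sqrt(se_a ** 2 + se_b ** 2)
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Group sizes (40 and 30) are hypothetical, not the study's test-set counts.
z, p_value = compare_aucs(0.95, 0.75, n_pos=40, n_neg=30)
```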

Discussion
This study is the first to report the use of radiomics for the analysis of PET/CT images of malignant lymph nodes to construct a predictive radiomics model that distinguishes between LLNs and CLNs. Furthermore, we found that this approach could be used to distinguish between HLLN and CLN, NHLLN and CLN, and HLLN and NHLLN. We believe that radiomics-based 18F-FDG PET/CT is a promising tool for distinguishing between LLN and CLN, since it provides a useful reference for further clinical analysis of suspected malignant lymph nodes. Our results provide a new and reliable way to distinguish between LLN and CLN. Usually, radiologists examine images visually and make a diagnosis based on their knowledge and experience [42]. However, important information may be missed by visual examination [43], rendering a quantifiable, objective method necessary. Lymphomas and other types of cancer show significant differences in biological behavior and spatial structure [44,45], which may be the primary cause of the heterogeneity that enables their differentiation by radiomics. Our study confirmed that radiomics analysis of CT and PET images was effective in distinguishing between LLNs and CLNs. Therefore, the clinical application of these findings, combined with the knowledge and experience of radiologists, could be instrumental in judging the nature of lymph nodes and provide valuable guidance for choosing the appropriate biopsy method.
Radiomics functions differently for different imaging techniques. Previous studies have found that the AUC, sensitivity, and specificity of PREct, PREpet, and PREcom were significantly different while distinguishing breast cancer from breast lymphoma and renal cell carcinoma from renal lymphoma [33,34]. In this study, PREcom demonstrated the highest differentiating ability in every scenario. PREpet was narrowly inferior to PREct and PREcom. The close similarity between the differentiating ability of PREct and PREcom was unexpected, although it is similar to the findings of our previous research [33]. We speculate that there are two reasons why CT metrics demonstrate a high diagnostic performance and play a decisive role in PREcombination, which is completely different from the clinical scenario [46]. First, PET images may act as a guide to depict areas that should be taken into consideration, so that radiomic parameters from CT images can be extracted in these areas. Second, high differentiating ability may be achieved by CT image-based radiomics analysis itself, because several studies have shown that texture analysis based on CT images alone can identify whether lymph nodes are invaded by tumors [47][48][49]. We believe that PET/CT-based PREcom is highly effective in distinguishing between LLNs and CLNs, and that this high efficiency results from the combination of the high AUC of CT metrics and the localization effect of PET metrics.
We also found that the differentiating ability of radiomics varied among different types of malignant lymph nodes. For distinguishing LLNs (and LLNs of different subtypes) from CLNs, our methods can achieve a high AUC, sensitivity, and specificity. This may be attributed to the high heterogeneity between CLN and LLN. On the other hand, while distinguishing between HLLN and NHLLN, the AUC, sensitivity, specificity, and accuracy were all lower than those for separating CLN and LLN. This could be related to the lower heterogeneity between lymphoma subtypes than that between lymphomas and cancer.
Table note: The cutoff point in the training set is consistent with that in the verification set. Abbreviations: AUC, area under the ROC curve; ROC, receiver operating characteristics; SUV, standardized uptake value; LLN, lymphomatous lymph nodes; CLN, cancerous lymph nodes; HLLN, LLN caused by Hodgkin's lymphoma; NHLLN, LLN caused by non-Hodgkin's lymphoma; PREct, CT predictive variables; PREpet, PET predictive variables; PREcombination, the combination of PET and CT predictive variables.
Contrast Media & Molecular Imaging
SUVmax is often used as an indicator to distinguish between benign and malignant lesions. In some cases, even higher specificity and sensitivity can be achieved [50]. However, in our study, SUVmax was not suitable as an indicator for distinguishing different types of malignant lymph nodes. In all groups, the AUC, specificity, sensitivity, and accuracy of SUVmax were significantly lower than those of PREct, PREpet, and PREcom, and the absolute values were also low. We believe this is because SUV values are high in most malignant diseases, leading to insufficient discrimination between different types of malignant diseases. The current trends in texture research involve the use of machine learning/deep learning to avoid the tedious process of manual operation and the accompanying uncertainty. Texture analysis based on dual-energy CT, full-field digital mammography, dual time 18F-FDG PET/CT, and biparametric MRI can identify benign and malignant diseases with high efficiency (AUC ranging from 0.84 to 0.96 depending on the disease and analysis method) in studies using machine learning/deep learning [51][52][53][54]. The AUC in these studies did not significantly differ from that in our study, but the abovementioned studies focused on distinguishing between benign and malignant tumors.
At present, the final diagnosis of tumors is determined by biopsy. FNA is acceptable or recommended for most neoplastic lymph nodes, but as noted earlier, FNA is not sufficient to diagnose lymphoma (according to the National Comprehensive Cancer Network Guidelines). The application of FNA in a patient with lymphoma may render an accurate diagnosis difficult, and a repeat lymph node resection may be indicated in such a patient in order to obtain sufficient biopsy material [14]. However, blind biopsy of a swollen lymph node may cause unnecessary damage and increase medical costs. Although biopsy is the gold standard technique for the diagnosis of all malignant diseases, it has certain drawbacks. Biopsy is usually invasive, nonrepeatable, and time-consuming and can only be performed for a single lesion [55]. Most patients undergo imaging tests (PET, CT, MRI, etc.) to determine the location or extent of a lesion. Radiomics extracts more information (often missed on visual examination) from existing data to yield valuable diagnostic reference information, without increasing the medical costs. Moreover, the method devised herein has acceptable differentiating reliability even if a patient only undergoes CT and not PET/CT. Therefore, radiomics can be an accurate screening method without the requirement of additional resources.
This method can provide a reliable reference for clinicians to determine the optimal biopsy method for sampling tumors and to avoid misdiagnosis or unnecessary damage. The present method is efficient and noninvasive and does not require additional testing for distinguishing between CLN and LLN. Future research should be directed towards the application of PET/CT in the differential diagnosis of lymphomas. Some studies on the radiomics analysis of US, CT, and MRI have reported good discrimination of papillary thyroid microcarcinoma [23], primary lung cancer [25], renal cell carcinoma [26], and prostate cancer [56] from benign lesions. Similarly, PET/CT-based radiomics analysis has been used to distinguish gliomas [57], thyroid cancer [23], and lung cancer [58] from benign lesions. A study using TA in combination with machine learning to distinguish the nature of neck lymph nodes also achieved very good results: an accuracy of up to 93% and 80% for distinguishing lymphoma and inflammatory nodes from normal nodes, respectively, and of 92% for distinguishing benign and malignant lymph nodes [53]. However, the abovementioned studies focused on the differentiation between benign and malignant lesions; reports on the use of radiomics for differentiating between different types of malignant lesions are rare [31][32][33]. Moreover, some earlier studies have reported the feasibility of radiomics-based TA of PET/CT imaging for distinguishing between renal lymphoma and renal cell carcinoma and between breast lymphoma and breast carcinoma [33,34]. In fact, the differential uptake of 18F-FDG by the lesion alone can be a good indicator for the distinction of benign and malignant lesions [59,60]. However, the differentiation of different malignant tumors exclusively based on the quantity of 18F-FDG uptake is difficult, which suggests that radiomics is a better method for the differentiation of malignant tumors. Our research focused on the differentiation between LLN and CLN. Compared with earlier studies, the sample size of the present study was larger, and the results obtained were more significant, which is of great practical and clinical significance.
Figure 2: Two examples of how predictive models work (next to Table 6). Steps shown for each sample: original PET/CT image; PET/CT image with volumes of interest (marked in purple); VOI delineated in every slice; parameters extracted with the LIFEx software (40% thresholding); results shown in Table 6 (a) and (b). Abbreviations: VOI, volume of interest.
Table 6: Two examples of how predictive models work (continued from Figure 2).
The present study has a few limitations. First, this was a retrospective, single-center study, which may limit the generalization of the results. Second, the inclusion and exclusion criteria employed in this study resulted in the accrual of a small CLN sample, which was not subdivided further. Third, cases of diffuse large B-cell lymphoma accounted for a large number of the LLN samples, which may have led to potential bias in the comparison of LLN caused by NHL and CLN.
Fourth, the CT images were obtained from PET/CT scans for the radiomics study, which may have affected the quality of the CT images, in turn reducing the predictive performance of PREct. Fifth, we used a wide range (between −1000 and 3000 HU) for intensity discretization of CT images, based on our previous study findings [33]. This is outside the general HU range of lymph nodes [61] and may have an impact on TA in CT. However, given the relatively good results of both the previous and current studies, we believe that these effects are likely to be minor. Finally, as described earlier, the diagnosis of lymph node invasion by tumors in most patients was based on imaging reports and clinical data. PET/CT performs very well in diagnosing whether lymph nodes have been invaded by multiple types of tumor (including lymphoma), especially in terms of specificity (92.06%-100%) [62][63][64][65], which ensures a low chance of including nontumorous lymph nodes, and collecting and investigating clinical data can further reduce the inclusion of such nodes. Nevertheless, because the present study is retrospective, it was not possible to perform a biopsy on every suspicious lymph node. This makes the inclusion of a very small number of nonneoplastic lymph nodes inevitable. However, the relatively large sample size of this study may partially alleviate the impact of this situation. In addition, the incidence of multiple or secondary cancers is relatively low, and such cancers mostly result from the side effects of subsequent cytotoxic treatments or radiotherapy [66]; our included patients did not undergo systemic treatment (including chemotherapy and radiotherapy) before undergoing PET/CT. Therefore, we believe that this issue would have minimal impact on our results.

Conclusions
Radiomics based on 18F-FDG PET/CT images may provide an effective noninvasive modality for distinguishing between LLN and CLN and may even be applicable for the differentiation of LLN caused by different lymphomas. This modality can help clinicians decide on the method of biopsy and avoid misdiagnosis or unnecessary procedures. However, multicenter studies with large samples are required to validate these preliminary results.

Data Availability
The data used to support the findings of this study are available from the corresponding author upon request.

Disclosure
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors.

Conflicts of Interest
The authors declare that they have no conflicts of interest.