Prognostic role of preoperative fluorine-18 fluorodeoxyglucose-positron emission tomography/computed tomography with an image-based harmonization technique: A multicenter retrospective study

Objectives: Despite the prognostic impact of preoperative fluorine-18 fluorodeoxyglucose-positron emission tomography/computed tomography examination, fluorine-18 fluorodeoxyglucose-positron emission tomography/computed tomography–based prognosis prediction has not been used clinically because of the disparity in data between institutions. By applying an image-based harmonization approach, we evaluated the prognostic roles of fluorine-18 fluorodeoxyglucose-positron emission tomography/computed tomography parameters in clinical stage I non–small cell lung cancer.
Methods: We retrospectively examined 495 patients with clinical stage I non–small cell lung cancer who underwent fluorine-18 fluorodeoxyglucose-positron emission tomography/computed tomography examinations before pulmonary resection between 2013 and 2014 at 4 institutions. Three different harmonization techniques were applied, and the image-based harmonization, which showed the best-fit results, was used in the further analyses to evaluate the prognostic roles of fluorine-18 fluorodeoxyglucose-positron emission tomography/computed tomography parameters.
Results: Cutoff values of the image-based harmonized fluorine-18 fluorodeoxyglucose-positron emission tomography/computed tomography parameters (maximum standardized uptake value, metabolic tumor volume, and total lesion glycolysis) were determined using receiver operating characteristic curves that distinguish pathologic high invasiveness of tumors. Among these parameters, only the maximum standardized uptake value was an independent prognostic factor for recurrence-free and overall survival in univariate and multivariate analyses. A high image-based maximum standardized uptake value was associated with squamous histology or lung adenocarcinomas with higher pathologic grades. In subgroup analyses defined by ground-glass opacity status and histology or by clinical stages, the prognostic impact of the image-based maximum standardized uptake value was always the highest compared with other fluorine-18 fluorodeoxyglucose-positron emission tomography/computed tomography parameters.
Conclusions: The image-based fluorine-18 fluorodeoxyglucose-positron emission tomography/computed tomography harmonization was the best fit, and the image-based maximum standardized uptake value was the most important prognostic marker in all patients and in subgroups defined by ground-glass opacity status and histology in surgically resected clinical stage I non–small cell lung cancers.

An image-based PET/CT harmonization appears superior to a conventional method.

CENTRAL MESSAGE
An image-based harmonization was the best fit compared with others, and SUVmax adjusted with image-based harmonization was superior to a conventional harmonization method in all cohort groups.
Popularization of thin-section computed tomography (TS-CT) has accelerated the detection of small-sized lung cancers. 1,2 Surgical resection is the standard of care for such early-stage non-small cell lung cancers (NSCLCs). However, there remains an appreciable risk of postsurgical recurrence even for patients with clinical stage I NSCLCs.
Because these early-stage NSCLCs are heterogeneous regarding prognosis, numerous studies have tried to predict the prognosis of clinical stage I NSCLCs using parameters that can be obtained preoperatively. Fluorine-18 fluorodeoxyglucose-positron emission tomography/computed tomography (FDG-PET/CT) is an almost mandatory clinical examination before pulmonary resection for patients with early-stage NSCLC to detect lymph node or distant metastases at a higher sensitivity and specificity compared with computed tomography alone or other radiological examinations. [3][4][5] Additionally, FDG-PET/CT provides quantitative values that reflect the glucose uptake of the tumor, metabolic activity, and proliferative potential of cancer cells in the tumors. Using some of these quantitative values, such as maximum standardized uptake value (SUVmax), metabolic tumor volume (MTV), and total lesion glycolysis (TLG), many studies have reported that these parameters are significant prognostic markers in surgically resected patients with early-stage NSCLC. [6][7][8][9][10][11][12] However, the quantitative values of these FDG-PET/CT parameters differ between institutions, which hampers clinical application of the FDG-PET/CT data as a tool to predict patients' outcomes.
In early-stage NSCLCs, harmonization of FDG-PET/CT data has been attempted using mathematical-based harmonization methods 8,12 that use an equation generated using an anthropomorphic body phantom that conformed to National Electrical Manufacturers Association standards 13 to reduce inter- and intra-scanner variability. Nakayama and colleagues 8 performed harmonization by adjusting the solid component sizes of the tumors because the deviation from the true standardized uptake value (SUV) depends on the solid component size of the tumors. In contrast, Okada and colleagues 12 reported a harmonization method that calibrated SUVs by dividing the actual SUV by the SUVmean measured in the phantom background. However, these mathematical-based methods are considered inaccurate. Nakayama and colleagues' method 8 would be inadequate for smaller tumors, including those with ground-glass opacity (GGO) with a small solid component, and Okada and colleagues' simplified method 12 would be inadequate because the differences in SUVs between institutions are nonlinear. Therefore, image-based methods are becoming mainstream for FDG-PET/CT harmonization in other types of malignancies. [14][15][16][17][18] In this study, we performed a multicenter retrospective study to evaluate the difference in SUVs between the 2 mathematical-based methods and an image-based

Thin-Section Computed Tomography Evaluation
For all patients, preoperative TS-CT images were independently reviewed by 2 investigators, and patients were classified into part-solid or solid groups based on the presence of a GGO component, as described in previous reports. [22][23][24][25][26][27] Computed tomography images were evaluated on a monitor with a window level of −600 to −700 Hounsfield units and a window width of 1500 to 2000 Hounsfield units. Solid components were defined as areas of increased opacification that completely obscured the underlying vascular structures on TS-CT images. GGO components were defined as areas of increased hazy density that did not obscure the underlying vascular structures. 28,29

Fluorine-18 Fluorodeoxyglucose-Positron Emission Tomography/Computed Tomography Examination and Harmonization Techniques
The participating institutions used different FDG-PET/CT scanner systems: Biograph Duo (Siemens Healthcare), Discovery 600 (GE Healthcare), Gemini TF (Philips Medical Systems), and Gemini GXL (Philips Medical Systems). Before the examination, patients fasted for at least 5 hours, and blood glucose was measured immediately before injection of FDG at 3.0 to 4.0 MBq/kg of body weight. None of the patients had a blood glucose level greater than 200 mg/dL. Approximately 60 minutes after the injection, static emission images were obtained, during which the patients were allowed to breathe normally. Experienced physicians (K.K. and H.K.), who were board certified in both diagnostic radiology and nuclear medicine and who were blinded to the other imaging results and to the clinical and histopathologic data, retrospectively reviewed all of the FDG-PET/CT images. Regarding the techniques for FDG-PET/CT harmonization, 3 different methods were compared: 2 mathematical-based methods 8,12 and 1 image-based method.
Image-based harmonization was performed using RAVAT (Nihon Medi-Physics Co, Ltd, Tokyo, Japan), which is a commercially available software package that harmonizes SUVs obtained with different PET/CT systems in a range advocated by the Japanese Society of Nuclear Medicine, using phantom data. 16,30 Stand-alone RAVAT software can quantify PET images, and the software is typically used to adjust spatial resolution to harmonize PET images using a 3-dimensional Gaussian filter.
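The effect of Gaussian smoothing on SUVmax can be illustrated with a minimal sketch. It is a generic 1-dimensional illustration, not the RAVAT algorithm, whose implementation and filter settings are not described here. The function names, the 6-mm full width at half maximum (FWHM), the 2-mm voxel spacing, and the activity profiles are all assumptions chosen for illustration. Only the FWHM-to-sigma conversion is a standard identity.

```python
import math


def fwhm_to_sigma(fwhm_mm):
    """Convert a Gaussian FWHM to its standard deviation (standard identity)."""
    return fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def gaussian_kernel(sigma_mm, spacing_mm):
    """Normalized 1-D Gaussian kernel truncated at about 3 sigma."""
    radius = max(1, int(math.ceil(3.0 * sigma_mm / spacing_mm)))
    weights = [math.exp(-0.5 * (i * spacing_mm / sigma_mm) ** 2)
               for i in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def smooth_profile(profile, fwhm_mm, spacing_mm):
    """Smooth a 1-D SUV profile; edge voxels are repeated at the borders."""
    kernel = gaussian_kernel(fwhm_to_sigma(fwhm_mm), spacing_mm)
    radius = len(kernel) // 2
    n = len(profile)
    smoothed = []
    for i in range(n):
        acc = 0.0
        for k, w in enumerate(kernel):
            j = min(max(i + k - radius, 0), n - 1)  # clamp index at the edges
            acc += w * profile[j]
        smoothed.append(acc)
    return smoothed


# Hypothetical profiles: background SUV 1.0, lesion SUV 8.0, 2-mm voxels.
small = [1.0] * 10 + [8.0] * 2 + [1.0] * 10    # 4-mm lesion
large = [1.0] * 10 + [8.0] * 10 + [1.0] * 10   # 20-mm lesion
small_s = smooth_profile(small, fwhm_mm=6.0, spacing_mm=2.0)
large_s = smooth_profile(large, fwhm_mm=6.0, spacing_mm=2.0)
print(max(small_s), max(large_s))
```

In this toy case the peak of the small lesion drops markedly while the peak of the large lesion is essentially unchanged, which shows why resolution differences between scanners affect the SUVmax of small tumors the most. A real harmonization filter operates in 3 dimensions with scanner-specific settings derived from phantom data.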

Fluorine-18 Fluorodeoxyglucose-Positron Emission Tomography/Computed Tomography Parameters
SUVmax was defined as the maximum SUV within the target volume of the primary tumor, and the SUV was determined using the following formula: concentration of radioactivity in the volume of interest (MBq/mL) × total body weight (kg) / injected radioactivity (MBq). The SUVmean was calculated as the summed SUV of all voxels in the target volume divided by the number of voxels within the target volume of the primary tumor. MTV was measured automatically inside the primary tumor volume of interest, with the margin threshold set at 40% of SUVmax. TLG was calculated as SUVmean × MTV, considering both metabolic activity and tumor burden. Image-based SUVmax (iSUVmax), image-based MTV (iMTV), and image-based TLG (iTLG) were defined as the SUVmax, MTV, and TLG values calculated using the image-based harmonized FDG-PET/CT method in individual patients. Receiver operating characteristic (ROC) curves were used to identify optimal iSUVmax, iMTV, and iTLG cutoff values for predicting high pathologic invasiveness in all patients and in each subgroup (Figure E1).
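The parameter definitions above can be sketched in a few lines of Python. This is a minimal illustration and not the software used in the study. The function names and the example voxel values are hypothetical, and converting body weight to grams so that the SUV is expressed in g/mL is a conventional assumption rather than a detail stated in the text.

```python
def suv(activity_mbq_per_ml, body_weight_kg, injected_mbq):
    """Body-weight SUV; weight is converted to grams (assumed convention)."""
    return activity_mbq_per_ml * (body_weight_kg * 1000.0) / injected_mbq


def pet_parameters(voxel_suvs, voxel_volume_ml, threshold_fraction=0.40):
    """Return (SUVmax, SUVmean, MTV in mL, TLG) for one volume of interest.

    Voxels at or above threshold_fraction * SUVmax form the target volume.
    """
    suv_max = max(voxel_suvs)
    cutoff = threshold_fraction * suv_max
    inside = [v for v in voxel_suvs if v >= cutoff]
    suv_mean = sum(inside) / len(inside)
    mtv = len(inside) * voxel_volume_ml
    tlg = suv_mean * mtv
    return suv_max, suv_mean, mtv, tlg


# Hypothetical volume of interest: 5 voxels of 0.5 mL each.
suv_max, suv_mean, mtv, tlg = pet_parameters([0.5, 1.0, 2.0, 4.0, 5.0], 0.5)
print(suv_max, suv_mean, mtv, tlg)
```

With these toy values the 40% threshold is 2.0, so 3 voxels are retained, giving an MTV of 1.5 mL and a TLG of 5.5. Whether voxels exactly at the threshold are included is an assumption of this sketch.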

Pathologic Evaluation
Pathologic diagnoses were made by expert pathologists (T.Ka., T.Y., and S.H.) in accordance with the World Health Organization classification. Lung adenocarcinoma was classified as adenocarcinoma in situ, minimally invasive adenocarcinoma, and invasive adenocarcinoma, which was further divided into lepidic predominant, acinar predominant, papillary predominant, micropapillary predominant, solid predominant, or invasive mucinous adenocarcinoma. 19 As previously reported, the predominant pattern was defined as the pattern with the largest percentage throughout the tissue sample. Invasive adenocarcinomas were further classified into 3 groups: grade 1, lepidic predominant; grade 2, acinar or papillary predominant; and grade 3, solid or micropapillary predominant, in accordance with the predominant pattern-based grading system. 19

Statistical Analyses
Statistical analyses were performed using JMP software, version 15.0 (SAS Institute Inc). Only simple statistical analyses were performed in this study, and following the recent guidelines, 31,32 these analyses were performed by well-educated and experienced researchers. Continuous variables were compared using the Mann-Whitney U test, whereas categorical variables were compared using the chi-square test. ROC curves for iSUVmax, iMTV, and iTLG to predict lymphatic/vascular invasion, pleural invasion, or nodal involvement (high pathologic invasiveness) were generated to determine the cutoff values that yielded optimal sensitivity and specificity in accordance with a previously reported method. 12 Recurrence-free survival (RFS) was defined as the interval from the day of surgery to the first event (relapse or death from any cause). For patients who did not experience disease recurrence, RFS was censored at the last visit. Overall survival (OS) was defined as the interval from the day of surgery to death from any cause. OS was censored at the last visit. RFS and OS were analyzed using the Kaplan-Meier method, and statistical differences in RFS or OS between groups were compared using the log-rank test. Univariate and multivariate Cox proportional hazard regression analyses were performed to assess the prognostic impact of the clinical parameters on RFS and OS.
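For readers unfamiliar with the Kaplan-Meier method, the product-limit estimate can be sketched as follows. This is a minimal illustration under stated assumptions, not the JMP procedure used in the study. The function name and the follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times  : follow-up time for each patient
    events : 1 if the event (relapse or death) occurred, 0 if censored
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_leaving  # events and censored patients leave the risk set
    return curve


# Hypothetical follow-up in years; 0 marks a patient censored at the last visit.
curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 0, 1])
print(curve)
```

Censored patients contribute to the number at risk until their last visit but never reduce the survival estimate themselves, which matches the censoring rules for RFS and OS described above.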

Comparison of an Image-based Versus Mathematical-based Harmonization
We started our analysis by comparing the image-based harmonization approach with 2 previously reported mathematical-based approaches (ie, Okada and colleagues' method 12 and Nakayama and colleagues' method 8 ). We observed that harmonized SUVmax values for 5 different FDG-PET/CT scanners did not fall within the Japanese Society of Nuclear Medicine reference range for a National Electrical Manufacturers Association body phantom if harmonized using mathematical-based approaches (Figure 1, A-C). However, we found that the image-based approach provided the best-fit results (Figure 1, D). Therefore, we conclude that the image-based harmonization is preferable to mathematical-based harmonization, and we used the image-based method in the subsequent analyses.

Correlation Between Maximum Image-Based Standardized Uptake Value, Image-Based Metabolic Tumor Volume, and Image-Based Total Lesion Glycolysis and Clinicopathologic Findings
Cutoff values for iSUVmax, iMTV, and iTLG were determined using ROC curves that distinguish tumors with high pathologic invasiveness (Figure E1). Among the 3 FDG-PET/CT parameters, iSUVmax had the highest sensitivity and specificity (area under the curve [AUC] = 0.811) in all patients compared with iMTV (AUC = 0.562) and iTLG (AUC = 0.740).
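A cutoff and an AUC of this kind can be derived as in the following minimal sketch. The exact optimality criterion used in the study is not restated here, so maximizing the Youden index (sensitivity + specificity − 1) is an assumption of this example. The function names and the toy data are hypothetical.

```python
def roc_points(scores, labels):
    """(threshold, sensitivity, specificity) for the rule score >= threshold."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = []
    for thr in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= thr)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < thr)
        points.append((thr, tp / n_pos, tn / n_neg))
    return points


def youden_cutoff(scores, labels):
    """Threshold maximizing sensitivity + specificity - 1 (assumed criterion)."""
    return max(roc_points(scores, labels), key=lambda p: p[1] + p[2] - 1.0)[0]


def auc(scores, labels):
    """Probability that a positive case scores higher than a negative case."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical SUVmax values; label 1 marks high pathologic invasiveness.
scores = [1.0, 1.5, 2.0, 2.5, 3.0, 4.0]
labels = [0, 0, 0, 1, 0, 1]
print(youden_cutoff(scores, labels), auc(scores, labels))
```

In this toy data set the selected cutoff is 2.5 and the AUC is 0.875. An AUC near 0.5, such as the value reported for iMTV, indicates discrimination close to chance.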
Next, we examined the correlations between these FDG-PET/CT parameters and clinicopathologic findings in all patients. A high iSUVmax was significantly correlated with male sex, smoking, pure-solid tumors, large tumor size, nonadenocarcinoma histology, pathologic lymph node metastasis, lymphatic invasion, vascular invasion, and pleural invasion (Table 1). The correlation between iMTV or iTLG and clinicopathologic characteristics is summarized in Table 1.

Correlation Between Maximum Image-Based Standardized Uptake Value, Image-Based Metabolic Tumor Volume, and Image-Based Total Lesion Glycolysis Values and Pathologic Grading
We also analyzed the correlations between the FDG-PET/CT parameters and the histological findings. As shown in Table 2, most patients with squamous cell carcinoma (94%) were classified into the high iSUVmax group, whereas half of the patients with lung adenocarcinoma were classified into this group. Among lung adenocarcinomas, the percentages of patients who were classified into the high iSUVmax group (2.3) were 14%, 57%, and 83% for predominant pattern grade 1, grade 2, and grade 3 tumors, respectively. Such hierarchy was not evident in iMTV (66%, 79%, and 73%, for predominant pattern grade 1, grade 2, and grade 3, respectively) or in iTLG (28%, 53%, and 63%, respectively). These results suggest that a high iSUVmax value was the most important predictor of lung adenocarcinomas with higher pathologic grade.

Correlation Between Maximum Image-Based Standardized Uptake Value, Image-Based Metabolic Tumor Volume, and Image-Based Total Lesion Glycolysis Values and Prognosis
In the analyses of RFS and OS for the entire cohort, iSUVmax and iTLG values separated patients' outcomes significantly (Figure 2). In the multivariate analysis, we found that iSUVmax (hazard ratio [HR], 3.02, P < .001 for RFS and HR, 3.66, P = .003 for OS), but not iTLG, was a significant prognostic factor (Table 3).
In the subgroup analysis focusing on pure-solid lung adenocarcinomas (Figure 3), a high iSUVmax value was again the only significant predictive factor of both poor RFS (P < .001) and OS (P = .015). In multivariate analysis, iSUVmax was a significant prognostic factor for RFS (HR, 3.18, P = .001) and OS (HR, 2.72, P = .017) (Table 4). In the other subgroups, such as the part-solid adenocarcinoma and nonadenocarcinoma groups, high iSUVmax was consistently a significant poor prognostic factor for both RFS and OS (Figures E2 and E3). In further subgroup analysis, we observed that iSUVmax was a significant prognostic factor in patients with part-solid adenocarcinoma with a consolidation/tumor ratio (C/T ratio) greater than 0.5, whereas those with a C/T ratio of 0.5 or less usually had low iSUVmax and had excellent survival outcomes (Figure E4). These results suggest that iSUVmax is an important prognostic factor to predict poor RFS and OS in patients with clinical stage I NSCLC and in subgroups defined by GGO status (excluding those with C/T ratio ≤0.5) or histology.
In addition, iSUVmax was a significant prognostic factor in patients with clinical stage IA and IB NSCLC. It is of note that patients with clinical stage IB disease and high iSUVmax had the worst RFS and OS (5-year RFS: high iSUVmax group, 39%; low iSUVmax group, 68%, P = .008; 5-year OS: high iSUVmax group, 60%; low iSUVmax group, 94%, P = .001) (Figure 4). Adjuvant chemotherapy (tegafur/uracil, or a platinum doublet if pathologic nodal involvement was found) was administered to these patients according to the Japanese guideline and the patients' general condition. We observed that high iSUVmax was associated with poor prognosis in patients with clinical stage IB disease irrespective of the administration of adjuvant chemotherapy (Figure E5).
We also evaluated the ability of iSUVmax as a predictor of poor prognosis compared with a conventional mathematical-based SUVmax (mSUVmax). 12 Among 244 patients in the low-risk group judged by the mSUVmax, 33 were reclassified into the high-risk group by iSUVmax. As shown in Figure E6, these patients showed poorer RFS and OS compared with patients who were low risk by both mSUVmax and iSUVmax. In contrast, only 3 patients were low risk by iSUVmax but high risk by mSUVmax.
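The reclassification described above amounts to a 2 × 2 cross-tabulation of risk groups under the two harmonization methods. The following minimal sketch shows the bookkeeping. The cutoff values, the patient values, and the convention that a value at or above the cutoff counts as high risk are hypothetical assumptions, not figures from the study.

```python
from collections import Counter


def risk_group(value, cutoff):
    """Label a patient as high or low risk (>= cutoff assumed to be high)."""
    return "high" if value >= cutoff else "low"


def reclassification_table(m_values, i_values, m_cutoff, i_cutoff):
    """Count patients by (mSUVmax group, iSUVmax group)."""
    return Counter(
        (risk_group(m, m_cutoff), risk_group(i, i_cutoff))
        for m, i in zip(m_values, i_values)
    )


# Hypothetical paired values for 4 patients and hypothetical cutoffs.
table = reclassification_table(
    m_values=[1.0, 1.2, 3.0, 0.8],
    i_values=[1.5, 3.1, 3.5, 0.9],
    m_cutoff=2.0,
    i_cutoff=2.5,
)
print(table[("low", "high")], table[("high", "low")])
```

The cell `("low", "high")` corresponds to patients who are low risk by mSUVmax but high risk by iSUVmax, the group whose outcomes are compared in Figure E6.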

DISCUSSION
This clinical study used an image-based harmonization method for FDG-PET/CT parameters to evaluate prognostic factors in patients who underwent pulmonary resection for clinical stage I NSCLCs at multiple institutions (with different FDG-PET/CT machines). We observed that the image-based harmonization method was superior to previously reported mathematical methods. In addition, we observed that iSUVmax was an important FDG-PET/CT parameter as a prognostic marker for RFS and OS as well as a predictor of histological grade among patients with lung adenocarcinoma. Last, we found that iSUVmax can identify patients with poor prognosis among those judged to be at low risk by a mathematical method (Figure 5).
The importance of SUVmax as a prognostic marker in clinical stage I diseases was reported in a previous study, although the main results of the study were the usefulness of MTV and TLG in the total cohort (clinical stage I-II NSCLCs) in multivariate analysis. 6 Therefore, we consider that our result is consistent with the previous one, because our cohort enrolled only patients with stage I disease. In small-sized NSCLCs, it is hypothesized that the simple SUVmax, rather than the factors that include volume elements, would be more useful as a predictor of pathologic invasiveness, pathologic grade, and prognosis. Furthermore, it is of note that subgroup analysis in our study, based on histology and the GGO status, showed that iSUVmax was consistently better than iMTV and iTLG to predict RFS and OS. Recent studies of surgically resected patients with stage I NSCLC have reported that the prognosis of patients who have part-solid tumors is significantly better than that of patients with pure-solid tumors, even if the solid components of both tumors have the same diameter. [22][23][24][25][26][27]33 This phenomenon was also confirmed in our study (Table 3).
In addition, we found that the iSUVmax was a prognostic factor irrespective of the GGO status among patients with lung adenocarcinoma.
Recently, perioperative treatment strategies using immune checkpoint inhibitors have joined the list of the standard of care in patients with NSCLC with clinical stage II and III diseases. 34,35 However, in this study, we showed that some patients with clinical stage I diseases, such as those with clinical stage IB with high iSUVmax, had a worse prognosis (Figure 4). We hope that this image-based harmonization for FDG-PET/CT (if validated with prospective clinical trials) may provide a simple yet reliable way to identify high-risk patients with clinical stage I NSCLC who may benefit from future clinical trials of neoadjuvant and adjuvant therapies in this patient subgroup.

Study Limitations
This study has some limitations. One is the retrospective design with a relatively small cohort of patients. Furthermore, the cohort was a heterogeneous population in terms of variable follow-up imaging. Because not all patients at the participating institutions with clinical stage I NSCLC underwent FDG-PET/CT imaging (at least during the study period), selection bias may exist. Prospective validation studies will be needed before more widespread use of this promising harmonization method can be recommended.

What is the best harmonization technique? Three different harmonization techniques were applied in this study. We found that a novel image-based harmonization showed the best-fit result (D) over the mathematical-based ones (B and C). What is the best prognostic PET/CT parameter? In subgroup analyses defined by ground-glass opacity (GGO) status and histology, the prognostic impact of SUVmax was always the highest compared with other FDG-PET/CT parameters. (Panels: pre-harmonization; subgroup analyses based on GGO status. Horizontal axis: years since operation.)

CONCLUSIONS
Our results suggest that the novel image-based harmonization method used in this study was superior to mathematical-based harmonization methods, and among the FDG-PET/CT parameters, iSUVmax was the most important marker to predict malignant potential as well as RFS and OS after pulmonary resection in patients with clinical stage I NSCLC.

RFS and OS curves focusing on patients with part-solid tumors by C/T ratio according to iSUVmax. A and B, RFS and OS for the patients with C/T ratio 0.5 or less. C and D, RFS and OS for the patients with C/T ratio greater than 0.5. ROC curves were also used to identify optimal iSUVmax cutoff values for predicting high pathologic invasiveness. The cutoff value for iSUVmax was set at 0.52 and 2.3 for the patients with C/T ratio 0.5 or less and with C/T ratio greater than 0.5, respectively. The 95% CIs are shown as shaded areas. C/T ratio, Consolidation/tumor ratio; RFS, recurrence-free survival; iSUVmax, image-based maximum standardized uptake value; OS, overall survival.