The effect of post-injection 18F-FDG PET scanning time on texture analysis of peripheral nerve sheath tumours in neurofibromatosis-1

Background: Texture features are being increasingly evaluated in 18F-fluorodeoxyglucose positron emission tomography (18F-FDG PET) as adjunctive imaging biomarkers in a number of different cancers. Whilst studies have reported repeatability between scans, there have been no studies that have specifically investigated the effect that the time of acquisition post-injection of 18F-FDG has on texture features. The aim of this study was to investigate if texture features change between scans performed at different time points post-injection.
Results: Fifty-four patients (30 male, 24 female, mean age 35.1 years) with neurofibromatosis-1 and suspected malignant transformation of a neurofibroma underwent 18F-FDG PET/computed tomography (CT) scans at 101.5 ± 15.0 and 251.7 ± 18.4 min post-injection of 350 MBq 18F-FDG to a standard clinical protocol. Following tumour segmentation on both early and late scans, first- (n = 37), second- (n = 25) and high-order (n = 31) statistical features, as well as fractal texture features (n = 6), were calculated and a comparison was made between the early and late scans for each feature. Of the 54 tumours, 30 were benign and 24 malignant on histological analysis or on clinical follow-up for at least 5 years. Overall, 25/37 first-order, 9/25 second-order, 13/31 high-order and 3/6 fractal features changed significantly (p < 0.05) between early and late scans. The corresponding proportions for the 30 benign tumours alone were 22/37, 7/25, 8/31 and 2/6 and for the 24 malignant tumours, 11/37, 6/25, 8/31 and 0/6, respectively.
Conclusions: Several texture features change with time post-injection of 18F-FDG. Thus, when comparing texture features in intra- and inter-patient studies, it is essential that scans are obtained at a consistent time post-injection of 18F-FDG.


Background
It is assumed that medical images contain additional data that are not apparent to the human eye. The field of radiomics aims to extract this information through (semi-)automated analysis, without the need to change image acquisition protocols [1,2].
Texture analysis, measuring image heterogeneity, is an example of additional information that is contained within medical images. Although first-order statistics, based on global measures from voxel intensity histograms, are not a true measure of image texture, they are often reported due to their clinical relevance in a number of studies. Second-order statistics consider the relationship between pairs of voxels providing a measure of local texture features. High-order statistics consider the relationship between three or more voxels and provide a measure of both local and regional texture features. Fractal analysis is a further method that can be used to quantify texture information on the basis of repeating geometric patterns (self-similarity) and roughness [3][4][5].
Whilst a number of factors have been described that influence the measurement, accuracy and reproducibility of texture features [4,6,7], to our knowledge, there have been no published data on how texture features change with time post-injection of 18F-FDG in soft tissue tumours. As there is a growing interest in using texture features in the clinical environment, these data are essential to inform the design of clinical and research protocols, so that intra- and inter-patient scans can be compared reliably and multi-centre trials can be carried out.
Standardised uptake value (SUV) parameters have been shown to change with time post-injection of 18F-FDG [8][9][10][11], and we hypothesised that texture parameters may also change with time. Therefore, the aim of this study was to investigate the effect of time post-injection of 18F-FDG on the measurement of texture features in a cohort of patients with neurofibromatosis-1 (NF1) in whom malignant transformation of neurofibromas to malignant peripheral nerve sheath tumours (MPNST) was suspected clinically and in whom 18F-FDG PET data had been acquired at two separate time points post-injection. As this was a purely technical study, we did not aim to assess the ability of texture features to discriminate benign tumours from MPNSTs, an analysis that will be the subject of a separate study.

Patients
This retrospective study included 54 patients with NF1 attending our national neurofibromatosis service (30 male, 24 female, mean age 35.1 years). All patients had symptomatic plexiform neurofibromas clinically suspected of malignant transformation and were referred for further investigation with 18F-FDG PET/CT. An institutional review board waiver was obtained for this retrospective analysis. All patients either had histological confirmation of the tumours or were followed clinically for at least 5 years.

18F-FDG PET/CT scan acquisition and analysis
Adult patients were injected with 350 (±10%) MBq of 18F-FDG, and in children, the injected activity was scaled by body weight (weight/70 × 350 MBq). All patients had blood-glucose levels below 10 mmol/l at the time of injection. Imaging was performed at two time points post-injection of 18F-FDG: an initial early acquisition at 101.5 ± 15.0 min and a later acquisition at 251.7 ± 18.4 min, as per the standard clinical protocol of our department for characterisation of masses in patients with NF1 [8]. The early scan was from the cerebellum to mid-thigh with additional images acquired if the tumour was below the mid-thigh or above the cerebellum. The late scan involved acquiring a local view of the symptomatic tumour only. The acquisition time for both the early and late scans was 5 min per bed position.
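The weight-based scaling of injected activity described above can be written as a short function. This is a minimal sketch: the function name and the `is_child` flag are hypothetical, and weight is assumed to be in kilograms (the source gives only "weight/70 × 350 MBq").

```python
def injected_activity_mbq(weight_kg, is_child,
                          adult_activity_mbq=350.0, reference_weight_kg=70.0):
    """Nominal injected 18F-FDG activity in MBq.

    Adults receive the fixed nominal activity; for children the activity
    is scaled as weight/70 x 350 MBq. Units of kg are an assumption.
    """
    if weight_kg <= 0:
        raise ValueError("weight must be positive")
    if not is_child:
        return adult_activity_mbq
    return weight_kg / reference_weight_kg * adult_activity_mbq


# Example: a 35 kg child receives half the adult nominal activity.
child_activity = injected_activity_mbq(35.0, is_child=True)
```

The ±10% tolerance quoted for adults is not modelled here; the function returns the nominal value only.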
Scans were performed on one of two scanners (Discovery VCT or DST, GE Healthcare, Chicago, USA) which are cross-calibrated to within 3% [12]. All images, from both scanners, were reconstructed with a voxel size of 4.7 mm and slice thickness of 3.27 mm using the ordered-subset expectation maximisation algorithm (2 iterations, 20 subsets). They were subsequently postfiltered using a 3D Gaussian kernel with a full-width at half maximum of 6 mm. Low-dose CT was acquired at 120 kVp and 65 mAs for the purposes of anatomical localisation and attenuation correction without administration of oral or intravenous contrast agent.
The reconstructed PET datasets were imported into in-house texture analysis software implemented in MATLAB (Release 2013b, The MathWorks, Inc., Natick, MA, USA). Many of the tumours, particularly those that were classified as benign, showed only very low-grade 18F-FDG uptake, and it was therefore not possible to implement automated segmentation (e.g., a threshold defined by a percentage of maximum standardised uptake value (SUVmax) or a fuzzy locally adapted Bayesian (FLAB) method), nor was it possible to reliably define the region of interest (ROI) by hand on the PET scan. All tumours, at both time points, were therefore segmented manually on the CT images, where the tumour edges were easily defined. ROI definition was carried out by an experienced operator trained in both radiology and nuclear medicine. The ROIs drawn on the CT scans were automatically mapped onto the PET scan (Fig. 1).
Statistical and textural features that were calculated from tumour volumes of interest included 37 first-order, 25 second-order, 31 high-order and 6 fractal features, as listed in Table 1. First-order ROI features were decay corrected from the time of injection. Voxel values within the tumour volume of interest (VOI) were resampled to yield 64 discrete, equally spaced bins. Seven 3D direction vectors and 2 distances were considered, resulting in 14 matrices. Two distances were used to capture relationships between voxels at larger separations, and the number of directions was limited to 7 to optimise computational time. The texture descriptors were calculated separately from each matrix and the resulting values were then averaged. Fractal features were computed using a differential box-counting method.
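The discretisation and matrix-averaging steps described above can be illustrated with a short sketch. It is not the authors' MATLAB implementation. The seven direction vectors, the two distances (1 and 2 voxels), the symmetric co-occurrence counting and the two example descriptors (energy and contrast) are all assumptions chosen for illustration, since the source does not list them.

```python
from itertools import product

N_BINS = 64
# Assumed for illustration: the source does not list its seven directions.
DIRECTIONS = [(1, 0, 0), (0, 1, 0), (0, 0, 1),
              (1, 1, 0), (1, 0, 1), (0, 1, 1), (1, 1, 1)]
DISTANCES = [1, 2]  # assumed: the source says only that two distances were used


def discretise(voi, n_bins=N_BINS):
    """Resample VOI values into n_bins equally spaced grey levels (0..n_bins-1).

    voi maps (z, y, x) voxel coordinates to intensity values.
    """
    lo, hi = min(voi.values()), max(voi.values())
    if hi == lo:
        return {pos: 0 for pos in voi}
    width = (hi - lo) / n_bins
    return {pos: min(int((v - lo) / width), n_bins - 1) for pos, v in voi.items()}


def glcm(levels, direction, distance):
    """Symmetric, normalised co-occurrence matrix for one direction and distance."""
    dz, dy, dx = (distance * step for step in direction)
    counts = {}
    for (z, y, x), i in levels.items():
        j = levels.get((z + dz, y + dy, x + dx))
        if j is None:  # neighbour falls outside the VOI
            continue
        counts[(i, j)] = counts.get((i, j), 0) + 1
        counts[(j, i)] = counts.get((j, i), 0) + 1
    total = sum(counts.values())
    return {pair: c / total for pair, c in counts.items()}


def energy(p):
    """Sum of squared co-occurrence probabilities (higher = more uniform)."""
    return sum(v * v for v in p.values())


def contrast(p):
    """Probability-weighted squared grey-level difference between voxel pairs."""
    return sum((i - j) ** 2 * v for (i, j), v in p.items())


def averaged_features(voi):
    """Compute each descriptor per matrix, then average over the matrices."""
    levels = discretise(voi)
    matrices = [glcm(levels, d, r) for d, r in product(DIRECTIONS, DISTANCES)]
    matrices = [m for m in matrices if m]  # drop offsets with no voxel pairs
    if not matrices:
        raise ValueError("VOI too small for any of the offsets")
    n = len(matrices)
    return {"energy": sum(energy(m) for m in matrices) / n,
            "contrast": sum(contrast(m) for m in matrices) / n}
```

Seven directions combined with two distances give the 14 matrices mentioned in the text. Representing the VOI as a coordinate-to-value mapping keeps irregular tumour shapes simple, because neighbours outside the VOI are skipped.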
To assess inter-observer variability, a random subset of 16 patients had VOIs defined on early and late scans by a separate operator blinded to the initial observer measurements and clinical data.

Statistical methods
All statistical analyses were performed using IBM® SPSS predictive analytics software, v22.0.0.0. As data were not normally distributed, non-parametric tests were performed. For each texture feature, the values obtained at the initial and later time points were compared using the related-samples Wilcoxon signed-rank test, and correlations were assessed with Spearman's rank correlation. Comparisons were performed for all 54 tumours and for the subsets of the 30 benign and the 24 malignant tumours. A significance level of p < 0.05 was used. Inter-observer variation was assessed with intraclass correlation coefficients (ICC).
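For readers unfamiliar with the two paired tests named above, the following sketch shows how they work on one feature measured at the early and late time points. It is a plain-Python illustration using the normal approximation with tie correction for the Wilcoxon test. It is not a reproduction of the SPSS procedure used in the study, and the function names and example values are hypothetical.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their mean rank."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        mean_rank = (pos + end) / 2 + 1
        for k in order[pos:end + 1]:
            ranks[k] = mean_rank
        pos = end + 1
    return ranks


def wilcoxon_signed_rank(early, late):
    """Return (W+, z, two-sided p) using the normal approximation."""
    diffs = [b - a for a, b in zip(early, late) if b != a]  # zero differences dropped
    n = len(diffs)
    if n == 0:
        raise ValueError("all paired differences are zero")
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    tie_sizes = {}
    for r in ranks:
        tie_sizes[r] = tie_sizes.get(r, 0) + 1
    tie_term = sum(t ** 3 - t for t in tie_sizes.values()) / 48
    variance = n * (n + 1) * (2 * n + 1) / 24 - tie_term
    z = (w_plus - n * (n + 1) / 4) / math.sqrt(variance)
    p = math.erfc(abs(z) / math.sqrt(2))
    return w_plus, z, p


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank-transformed data."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


# Hypothetical feature values for ten tumours (not data from the study).
early = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
late = [e + 0.5 * (i + 1) for i, e in enumerate(early)]
w_plus, z, p = wilcoxon_signed_rank(early, late)
rho = spearman(early, late)
```

In this made-up example every late value exceeds its early counterpart, so the signed-rank test reports a significant shift even though the rank correlation is perfect. A feature can therefore be highly correlated between time points and still change systematically.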

Results
Good inter-observer agreement was found for measurement of all parameters with mean ICC scores for individual patient scans of 0.93 and 0.96 for early and late scanning time points, respectively, and there was no significant difference between the segmented tumour volumes obtained from the early and late scans (median 35.6 vs 34.9 cm³, respectively; p = 0.069). High correlation was observed between early and late scan data for most texture features (mean r value = 0.66 ± 0.21; Table 1).
Overall, more texture features decreased (27/39 benign; 13/25 malignant) than increased (12/39 benign; 12/25 malignant) with time. This pattern was true for first-order, second-order and high-order features in benign tumours and for second-order and high-order features in malignant tumours. However, in malignant tumours, more first-order features increased (n = 9) than decreased (n = 2) (Table 2).

Discussion
Previous researchers have demonstrated changes in measured SUV parameters with time post-injection of 18F-FDG and have suggested that these changes may even be of benefit for differentiating benign and malignant lesions [8][9][10][11]. However, to our knowledge, this is the first study that has investigated how global first-order and loco-regional texture features change with time post-injection of 18F-FDG.
Our study has demonstrated that a substantial number of statistical first-, second- and high-order features and model-based fractal features change with time post-injection of 18F-FDG in benign and malignant PNSTs. These findings suggest that neither global nor loco-regional uptake of 18F-FDG has stabilised in benign or malignant tumours by 101.5 ± 15.0 min after injection. We observed the expected high correlation between early and late scan texture features; the differences would therefore be unlikely to impact significantly on the ability to discriminate between benign and malignant tumours, but would be of greater importance in studies where serial texture features are calculated as response measures.
The finding that a greater proportion of the global first-order features changed than second-order, high-order or model-based texture features suggests that global changes predominate over regional or local changes in 18F-FDG distribution. However, a significant proportion of second-order, high-order and fractal texture features also changed, showing that regional and local redistribution of 18F-FDG also occurs with time.
For first-order features, as expected, SUVmax increased with time for malignant but not benign tumours, as previously described for a number of malignant tumours [8][9][10][11].
Overall, SUVmean decreased with time, predominantly due to a decrease in uptake in benign tumours. First-order entropy and standard deviation, reflecting the global tumour randomness and distribution of voxel intensities, increased with time across the whole group of tumours but not in either the benign or the malignant group alone.
For second-order features, which reflect the relationships between pairs of voxel intensities and their spatial distribution, 8 of the 9 features (out of 25) that changed showed a reduction. GLCM energy, a measure of uniformity, increased, and therefore overall the changes in these local texture features implied a reduction in heterogeneity with time. Second-order features showed a decrease in heterogeneity in both benign and malignant tumours, suggesting that there is a change in relative 18F-FDG distribution in the tumours between the two time points, causing a change in local tumour texture features.
With high-order texture features, there were increases and decreases in a number of local and regional features in both benign and malignant tumours but with no consistent pattern. This suggests that considering texture features as showing heterogeneity or homogeneity in a binary manner is probably an oversimplification of what each feature represents mathematically. Similarly, a consistent pattern of change was not seen with model-based fractal features.
A number of technical factors are known to affect the measurement and reproducibility of texture features, including matrix size, reconstruction parameters, bin width and tumour volume [4,6,7]. Our findings additionally demonstrate the importance of quoting post-injection 18F-FDG scanning times when discussing texture features and the importance of consistent post-injection 18F-FDG scanning times when comparing global and texture features of patient tumours in inter- and intra-patient longitudinal studies.
There are some limitations to our study. We only considered PNSTs in NF1 patients and, as such, it is not possible to generalise these findings to other tumour types; future research should investigate how texture features change over time in other cancers. Scans were acquired at 101.5 ± 15.0 and 251.7 ± 18.4 min post-injection in this study as per the clinical protocol in our department [8]. Therefore, we cannot comment on the detail of the kinetics of change between these time points or on the magnitude of change compared to scans acquired at 60 min post-injection, which is a more commonly used clinical protocol elsewhere. Whilst two scanners were used to acquire the data in this study, the acquisition and reconstruction parameters were identical and quantitative differences between the scanners were minimal [12], so it is unlikely that this introduced a significant bias in the results. All the tumours included in this project were segmented manually on the CT component of the PET/CT scan, and therefore ROIs are subject to more variability than with semi-automated methods such as threshold-based segmentation or FLAB. This was unavoidable as low-grade activity in many of the tumours meant that automated methods and direct ROI placement on the PET images proved impossible. However, even on non-contrast-enhanced CT scans as used in our study, the edges of benign and malignant neurofibromas are usually well demarcated (Fig. 1), which facilitated ROI definition; VOI definition proved straightforward, with good inter-observer reproducibility. This method also has the advantage of minimising differences in segmentation volumes due to changes in 18F-FDG distribution, ensuring the whole tumour is included at both time points. The ROIs that were drawn on the CT scan were mapped onto the PET scan. Although all scans were checked qualitatively by an experienced operator to ensure there was no mis-registration of the ROIs, we otherwise made the assumption of accurate co-registration with no patient movement between CT and PET acquisitions.
As the later scans had fewer counts following radioactive decay of 18F-FDG, we cannot exclude image noise as an element that may have contributed to differences in texture features. However, more texture features reduced with time (i.e., became more homogeneous) and so it is unlikely that this is a dominant effect. Lastly, the literature suggests that many texture features may be redundant due to collinearity between features and that only a small number of features should be used based on [4,6]. However, as an initial study of the phenomenon of change in texture feature quantification with time, we preferred to report on multiple features with and without known collinearity to document these findings as broadly as possible.
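The reduction in counts between the two acquisitions that is mentioned above can be estimated from physical decay alone. The sketch below assumes a fluorine-18 half-life of about 109.8 min, a standard physical constant that is not quoted in the source. It uses the mean scan times from the protocol and ignores biological uptake and washout. The noise factor assumes simple Poisson counting statistics, which is only a rough guide for reconstructed and filtered images.

```python
import math

F18_HALF_LIFE_MIN = 109.77  # physical half-life of fluorine-18 (assumed constant)


def remaining_fraction(elapsed_min, half_life_min=F18_HALF_LIFE_MIN):
    """Fraction of activity left after elapsed_min of physical decay."""
    return 0.5 ** (elapsed_min / half_life_min)


EARLY_MIN, LATE_MIN = 101.5, 251.7  # mean post-injection scan times from the protocol

# Activity at the late scan relative to the early scan (decay only).
late_vs_early = remaining_fraction(LATE_MIN - EARLY_MIN)

# Rough relative increase in statistical noise for equal acquisition times,
# under the simplifying assumption that noise scales as 1/sqrt(counts).
noise_factor = 1.0 / math.sqrt(late_vs_early)
```

With these inputs the late scan retains a little under 40% of the early-scan activity from decay alone, for the same 5 min per bed position.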

Conclusions
This study has demonstrated that many 18F-FDG PET texture features differ significantly between early and late scan acquisition time points. As such, it is important to scan patients at consistent times when measuring texture features in longitudinal patient studies, especially in multi-centre patient trials.