Repeatability of the Maximum Standard Uptake Value (SUVmax) in FDG PET

Objective: SUVmax is often calculated at FDG PET examinations, in systematic studies as well as in clinical practice. Since SUVmax represents a very small portion of a lesion, its statistical reliability may be questioned. This was studied by assessing the repeatability of SUVmax between two FDG PET acquisitions obtained immediately after each other in patients with chest lesions.
Methods: In 100 clinical patients with a known chest lesion, two identical 3 min PET acquisitions (PET1 and PET2, respectively) were initiated within 224±31 sec of each other. The difference in lesion SUVmax between the two PET scans (ΔSUVmax) was calculated and the uncertainty expressed as the coefficient of variation, CV (%). The correlation between ΔSUVmax and each of four parameters was also assessed: the lowest SUVmax from PET1 or PET2, the approximate metabolic lesion volume, the time from FDG injection to PET1, and the time between PET1 and PET2.
Results: In 56 patients SUVmax increased at the second acquisition and in 44 patients it decreased. Mean SUVmax was 7.8±6.1 and 7.8±6.2 for PET1 and PET2, respectively. The mean percentage difference was 0.9±7.8. The difference was not significant (p=0.20). The CV gave an uncertainty of 4.3% between the two measurements, which is a strong indicator of equivalence. There was no correlation between ΔSUVmax and any of the four assessed parameters. The difference between the acquisitions, 0.9%, was much lower than in the three previously published similar, but more restricted, studies, where the difference was 2.5-8.2%.
Conclusion: From camera and computational perspectives, SUVmax is a stable parameter.
Conflict of interest: None declared.


Introduction
In medical imaging there is a tendency to accord numbers special value, irrespective of how robust they are (1). Positron Emission Tomography (PET) represents such a quantitative process, as the raw signal can be transformed into absolute concentrations of the radiotracer after a series of transformations and corrections. Hence, PET is often used for quantification of different molecular processes. This is usually done by calculating the Standardized Uptake Value (SUV). In a given image volume, the SUV is defined as the tracer concentration, normalised to the administered activity and, e.g., the body weight. The mean or the maximum SUV (SUVmean and SUVmax, respectively) within a defined Volume-Of-Interest (VOI) are commonly used figures for describing the tracer uptake.
SUV-based PET image quantifications are hampered by technical, physical and biological processes. Despite these well-known limitations, SUV calculations are commonly used in clinical examinations as well as in scientific studies. In this respect, SUV quantifications are often ascribed a value which is overrated in relation to their actual precision (1,2,3,4). SUVmax is regarded as a figure of merit for prognosis and therapy evaluation (5,6,7,8). While SUVmean depends heavily on the definition of the VOI, SUVmax does not, and is therefore almost reader independent. Furthermore, SUVmax is less affected by partial volume effects, while it is, on the other hand, affected by image noise (9,10). The noise may be considerable, as the administered activity and the scanning time are restricted in clinical practice. In addition, SUVmax represents a very small portion of the VOI, so it may be questioned how well it reflects the biology of an entire lesion (11). In the current report, the statistical reliability of SUVmax has been studied by assessing its repeatability in the uptake of a chest lesion between two identical acquisitions obtained immediately after each other in clinical patients.
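The definition above can be illustrated with a minimal sketch of the body-weight-normalised SUV. The function name, the units and the example values are assumptions made for illustration only; the sketch also assumes that 1 g of tissue corresponds to 1 mL and that the administered activity has already been decay-corrected to the scan time.

```python
def suv_body_weight(concentration_kbq_per_ml, injected_mbq, body_weight_kg):
    """Body-weight-normalised SUV (hypothetical helper, not from the study).

    Assumes 1 g of tissue ~ 1 mL and a decay-corrected injected activity.
    """
    injected_kbq = injected_mbq * 1000.0      # MBq -> kBq
    body_weight_g = body_weight_kg * 1000.0   # kg -> g
    # SUV = tissue concentration / (administered activity per gram body weight)
    return concentration_kbq_per_ml / (injected_kbq / body_weight_g)


# Illustrative values only: 4 MBq/kg given to a 70 kg patient (280 MBq),
# and a voxel concentration of 31.2 kBq/mL.
example_suv = suv_body_weight(31.2, 280.0, 70.0)
print(f"SUV = {example_suv:.1f}")
```

SUVmax is then simply the largest such voxel value inside the VOI, which is why it does not depend on the exact VOI outline as long as the hottest voxel is enclosed.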

Patients
The study is based on 100 clinical patients (mean age 64 years; 53 males and 47 females) with suspected tumours of the chest, referred for a clinical PET/CT examination with [18F]-2-fluoro-2-deoxy-D-glucose (FDG). The Regional Research Ethical Committee approved the study.
Examination
Approximately one hour after i.v. administration of 4 MBq/kg body weight of FDG, the examination was initiated. This was accomplished using a Biograph 64 True Point (Siemens Medical Solutions, Erlangen, Germany) with an axial PET Field-Of-View (FOV) of 21.5 cm. First, a low-dose CT without contrast medium, from the middle of the skull to the proximal thigh, was performed for photon attenuation and scatter correction of the PET images. Directly following this, the clinical PET examination was performed with a 3 min acquisition time for each FOV position and normal tidal breathing. Immediately after this, the two additional, study-specific, identical PET acquisitions (PET1 and PET2, respectively) were acquired sequentially, each with an acquisition time of 3 min and one single FOV including the known lesion. The two acquisitions started a mean of 224±31 sec apart. Thereafter, a full-dose CT, with or without administration of i.v. contrast medium, was performed during breath-holding at a mean inspiratory level. The patient did not change position relative to the camera during the series of acquisitions.
All PET images were reconstructed using the manufacturer's ordered subset expectation maximization (OSEM) algorithm with 4 iterations, an image matrix size of 168×168 pixels, a nominal slice thickness of 5 mm and a voxel volume of 0.08 cm³. Corrections for photon attenuation, random coincidences and photon scatter were made.
Evaluation
Only patients showing FDG uptake in the lesion were included. In some patients several lesions were identified, but only one lesion per patient was studied. The lesions assessed were selected so as to include a wide spread of lesion sizes. There were 3 lesions of the chest wall, 16 lesions of the mediastinum/lung hili, and 81 pulmonary lesions. Of these, 70 were suspected lung/oesophageal tumours, 16 were metastases or lymphoma/leukaemia, and 14 were considered to have benign explanations (inflammation/infection, radiation pneumonitis, Wegener's granulomatosis or Hampton's hump).
Evaluation of PET1 and PET2 was made using commercial software allowing simultaneous assessment of both studies at an identical position. SUVmax was calculated by placing a VOI enclosing the uptake with some margin and using this same VOI for evaluation of both examinations.
To estimate the metabolic volume of the lesions, the FDG uptake was approximated as an ellipsoid/sphere and its volume calculated from three manually placed orthogonal diameters, measured with a precision of ½ cm. The distribution of the approximate volumes is shown in Figure 1. The mean volume was 50 cm³, and the median volume was 7 cm³.
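The ellipsoid approximation can be sketched as follows. The function names and the example diameters are hypothetical; the formula is the standard volume of an ellipsoid expressed in its three orthogonal diameters, V = π/6 · d1 · d2 · d3, and the rounding helper mimics the ½ cm measurement precision stated above.

```python
import math


def round_to_half_cm(diameter_cm):
    """Round a diameter to the nearest 0.5 cm (the stated measurement precision)."""
    return round(diameter_cm * 2.0) / 2.0


def ellipsoid_volume_cm3(d1_cm, d2_cm, d3_cm):
    """Volume of an ellipsoid from three orthogonal diameters, in cm^3."""
    return math.pi / 6.0 * d1_cm * d2_cm * d3_cm


# Illustrative diameters only, not patient data.
diameters = [round_to_half_cm(d) for d in (2.4, 2.6, 1.9)]
volume = ellipsoid_volume_cm3(*diameters)
print(f"diameters = {diameters} cm, volume = {volume:.1f} cm^3")
```

With equal diameters the expression reduces to the volume of a sphere, so the same helper covers both cases mentioned in the text.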

Data Analysis and Statistics
The difference between the SUVmax of the two PET scans (ΔSUVmax) was calculated as: [equation missing in source]. Since the Jarque-Bera test showed that ΔSUVmax was not normally distributed, the Wilcoxon matched-pairs signed-ranks test was used to test whether the two measurements were significantly different. The uncertainty of ΔSUVmax (the measurement error) was evaluated according to Dahlberg's formula and presented as the coefficient of variation, CV (%) (12).
Correlations were assessed between ΔSUVmax and four parameters: the lowest (minimum) of the measured SUVmax from either PET1 or PET2, the approximate metabolic lesion volume, the time from FDG injection to PET1, and the time between PET1 and PET2. Analysis showed that this could be done by calculating the Pearson correlation coefficient, which measures the strength of a linear correlation. Three of the parameters had a skewed distribution (skewness >1): the lowest of the measured SUVmax, the approximate lesion volume and the time from FDG injection to PET1. Thus, a reciprocal transformation was applied to these data prior to the analysis. A rule of thumb is that a Pearson correlation coefficient of 0-0.25 indicates little or no relationship (13). The coefficient of determination (R²) was also calculated. This shows the proportion of the total variation explained by the variable studied.
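As an illustration of the measurement-error estimate, the sketch below applies Dahlberg's formula, sqrt(Σd²/(2n)), to paired measurements and expresses it relative to the grand mean as a CV (%). How the CV was normalised in the study is not stated in the text, so dividing by the grand mean is an assumption; the function names and the data are hypothetical.

```python
import math


def dahlberg_error(first, second):
    """Dahlberg's measurement error: sqrt(sum(d_i^2) / (2n)) for paired data."""
    if len(first) != len(second) or not first:
        raise ValueError("need two non-empty sequences of equal length")
    n = len(first)
    sum_sq_diff = sum((a - b) ** 2 for a, b in zip(first, second))
    return math.sqrt(sum_sq_diff / (2 * n))


def dahlberg_cv_percent(first, second):
    """Dahlberg error relative to the grand mean, in percent (assumed normalisation)."""
    grand_mean = (sum(first) + sum(second)) / (2 * len(first))
    return 100.0 * dahlberg_error(first, second) / grand_mean


# Illustrative paired SUVmax values only, not patient data.
pet1 = [3.1, 7.9, 12.4, 5.6]
pet2 = [3.0, 8.3, 12.1, 5.8]
print(f"Dahlberg error = {dahlberg_error(pet1, pet2):.3f}")
print(f"CV = {dahlberg_cv_percent(pet1, pet2):.1f} %")
```

The normality check and the paired test described above are available in common statistics packages and are deliberately left out of this sketch.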
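The correlation analysis can be sketched in plain Python as below: a skewness check, a reciprocal transformation for skewed parameters, and the Pearson coefficient with its R². The helper names, the use of the population moment definition of skewness and the example data are assumptions for illustration; the study's own software and exact skewness estimator are not specified.

```python
import math


def skewness(values):
    """Moment-based sample skewness, m3 / m2^1.5 (assumed estimator)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def reciprocal(values):
    """Reciprocal transformation, 1/x, for strictly positive data."""
    return [1.0 / v for v in values]


def pearson_r(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Illustrative data only: lesion volumes (cm^3) and percentage differences.
volumes = [1.0, 2.0, 4.0, 7.0, 50.0, 300.0]
delta = [2.1, -4.0, 0.5, 6.3, -1.2, 0.8]
x = reciprocal(volumes) if skewness(volumes) > 1 else volumes
r = pearson_r(x, delta)
print(f"r = {r:.2f}, R^2 = {r * r:.2f}")
```

For a single explanatory variable, R² is simply the square of the Pearson coefficient, which is why a coefficient in the 0-0.25 range corresponds to at most about 6% of the variation being explained.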

Results
In 56 patients SUVmax increased at PET2 and in 44 patients it decreased. Mean SUVmax was 7.8±6.1 and 7.8±6.2 for PET1 and PET2, respectively. The mean percentage difference was 0.9±7.8. The difference was not statistically significant (p=0.20). The distribution of ΔSUVmax is shown in Figure 2. The corresponding CV gave a relative uncertainty of 4.3% between the two measurements. In the patient with the lowest number of counts, the voxel containing SUVmax held 28218 counts after the various transformations and corrections, which corresponds to a CV of 0.6% (assuming Poisson counting statistics). There was no correlation between ΔSUVmax and any of the studied parameters: the lowest of the measured SUVmax from either PET1 or PET2, the approximate metabolic lesion volume, the time from FDG injection to PET1, or the time between PET1 and PET2 (Table 1).
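The counting-statistics figure quoted above follows from the Poisson relation in which the standard deviation of N counts is sqrt(N), so the relative uncertainty is 1/sqrt(N). A minimal sketch (the function name is an illustrative assumption):

```python
import math


def poisson_cv_percent(counts):
    """Relative Poisson uncertainty, 100 / sqrt(N), in percent."""
    if counts <= 0:
        raise ValueError("counts must be positive")
    return 100.0 / math.sqrt(counts)


# Lowest SUVmax voxel count reported in the study.
lowest_counts_cv = poisson_cv_percent(28218)
print(f"CV = {lowest_counts_cv:.1f} %")
```

This treats the corrected voxel value as a raw Poisson count, as the text does; reconstruction and corrections change the noise properties, so the figure is best read as an order-of-magnitude estimate.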

Discussion
The reproducibility of FDG-PET examinations, i.e. the variation between two different examinations carried out under standardised conditions, has been studied by several authors and subjected to a meta-analysis (14). In contrast to this, the repeatability of SUVmax, i.e. the variability between two identical consecutive PET scans of the same patient, was evaluated in the current study. The aim was to test this in a clinical setting using regularly applied acquisition and reconstruction parameters, thereby also including the effects of true image noise. The latter is important, as the accuracy of SUVmax is limited by its sensitivity to noise (9,10).
As the lesions were located in the chest, the tracer uptake is affected by respiratory movements. This should not influence the final results, since the movements should have the same impact on both acquisitions. The metabolic (PET) volume of the lesion is the relevant measure when assessing a possible influence of size, which is why the anatomical (CT) volume was not assessed. The latter would have been much more precise, but of limited value, as the metabolic and anatomical volumes do not always correspond to each other (15). Edge definitions in nuclear medicine examinations, however made, are not precise, so the volume assessments are subject to uncertainties, which represents a limitation of the study. The data presented are dominated by small lesions, which merely reflects the clinical situation. There was no correlation between the lesion size and ΔSUVmax, although one might have been expected.
Stable conditions between PET1 and PET2 are a prerequisite for the study. These could be disturbed by FDG incorporation continuing beyond 60 min, since uptake has been shown to peak later in tumours (16,17). As there was no significant difference in SUVmax between PET1 and PET2, the (short) interval between the two acquisitions allows the analyses to be made. This is further supported by the lack of correlation between ΔSUVmax and the time from FDG injection to PET1, as well as the lack of correlation between ΔSUVmax and the time between PET1 and PET2.
The CV corresponding to the lowest number of counts in the study, 0.6%, is much lower than any other uncertainty of the study. Together with the lack of correlation with the lowest of the measured SUVmax, this indicates that our findings are hardly influenced by an insufficient number of counts.
There are three previous similar studies (18,19,20). They are not as extensive as the current report and do not address the possible effects of other factors on repeatability, as studied here. In the previous studies, the variation of SUVmax is much larger than in the current study. Here, there was a difference of 0.9% between the two acquisitions, whereas in the previous studies this figure varied between 2.5% and 8.2% (Table 2). The activity administered at our examinations, 4 MBq/kg body weight, is lower than in the previous studies, so this does not explain our lower value; it should rather have the opposite effect. All previous studies are based on a small number of observations, 8-20 individuals, while we studied a much larger cohort to cover differences between patients and lesion sizes. In one of the previous studies, the two comparative acquisitions were initiated as early as 35 min after administration of the radiotracer, so a still strongly increasing tracer uptake may explain the higher uptake at the second acquisition (20). Another study was restricted to FDG uptake in normal liver tissue (18). In contrast to this, the current study is based on pathological lesions located in the chest, which, compared to normal liver uptake, are subject to less influence from the surrounding activity, thereby reducing any errors caused by partial volume effects. The divergence from the previous studies cannot, however, be completely explained, other than by the current study being based on a larger number of observations and being very strictly controlled.
The calculated coefficient of variation (CV) of 4.3% between the examinations is very small. Results of <5% are considered a strong indicator of equivalence (12). Consequently, the observed relative measurement error is most likely due only to quantum statistical fluctuations in the disintegration and detection of the positrons. Odd SUVmax results therefore cannot be blamed on random errors in image reconstruction or hardware; the value is stable with regard to those factors.

Conclusion
From camera and computational perspectives, SUVmax is a stable parameter. Any fluctuation can be explained by physiological variations in the radiopharmaceutical uptake, which is what SUVmax is ultimately meant to describe.
Acknowledgment: The authors wish to thank Elisabeth Berg, B.Sc., Karolinska Institutet, for the professional statistical analysis.
Conflicts of interest: None.