Positron emission tomography PET/CT harmonisation study of different clinical PET/CT scanners using commercially available software

Objectives: Harmonisation is the process whereby standardised uptake values from different scanners can be made comparable. This PET/CT pilot study aimed to evaluate the effectiveness of harmonisation of a modern scanner with image reconstruction incorporating resolution recovery (RR) with another vendor older scanner operated in two-dimensional (2D) mode, and for both against a European standard (EARL). The vendor-proprietary software EQ•PET was used, which achieves harmonisation with a Gaussian smoothing. A substudy investigated effect of RR on harmonisation. Methods: Phantom studies on each scanner were performed to optimise the smoothing parameters required to achieve successful harmonisation. 80 patients were retrospectively selected; half were imaged on each scanner. As proof of principle, a cohort of 10 patients was selected from the modern scanner subjects to study the effects of RR on harmonisation. Results: Before harmonisation, the modern scanner without RR adhered to EARL specification. Using the phantom data, filters were derived for optimal harmonisation between scanners and with and without RR as applicable, to the EARL standard. The 80-patient cohort did not reveal any statistically significant differences. In the 10-patient cohort SUVmax for RR > no RR irrespective of harmonisation but differences lacked statistical significance (one-way ANOVA F(3.36) = 0.37, p = 0.78). Bland-Altman analysis showed that harmonisation reduced the SUVmax ratio between RR and no RR to 1.07 (95% CI 0.96–1.18) with no outliers. Conclusions: EQ•PET successfully enabled harmonisation between modern and older scanners and against the EARL standard. Harmonisation reduces SUVmax and dependence on the use of RR in the modern scanner. Advances in knowledge: EQ•PET is feasible to harmonise different PET/CT scanners and reduces the effect of RR on SUVmax.


inTrOducTiOn
The recent rate of advance of PET/CT equipment technology has led to the everyday use worldwide of many variants of detector and software. 1,2 Some earlier PET/CT scanners (e.g., bismuth germanate (BGO) crystal equipment) allow scanning with or without collimation, that is, in 2D or 3D acquisition modes 3 and these are more prevalent in emerging markets due to healthcare cost factors. However, advanced image reconstruction algorithms including resolution recovery (RR) with point spread function modelling or Bayesian reconstruction 4 are also in widespread clinical use today.
It is frequently necessary to compare standardised uptake values (SUV) of a lesion between two scans, for example in assessing response to therapy using PERCIST. 5 Ideally, standardised protocols would be used to ensure comparability between the SUVs; however, it is frequently necessary to compare SUVs acquired in 2D and 3D, 6 or reconstructed with different algorithms. Harmonisation is the process by which it is assured that SUVs acquired using different scanners or protocols can be compared; if harmonisation is not performed there is a risk of misclassification of lesion response.
https:// doi. org / 10. 1259/ bjro. 20190035 Objectives: Harmonisation is the process whereby standardised uptake values from different scanners can be made comparable. This PET/CT pilot study aimed to evaluate the effectiveness of harmonisation of a modern scanner with image reconstruction incorporating resolution recovery (RR) with another vendor older scanner operated in two-dimensional (2D) mode, and for both against a European standard (EARL). The vendorproprietary software EQ•PET was used, which achieves harmonisation with a Gaussian smoothing. A substudy investigated effect of RR on harmonisation. Methods: Phantom studies on each scanner were performed to optimise the smoothing parameters required to achieve successful harmonisation. 80 patients were retrospectively selected; half were imaged on each scanner. As proof of principle, a cohort of 10 patients was selected from the modern scanner subjects to study the effects of RR on harmonisation.
results: Before harmonisation, the modern scanner without RR adhered to EARL specification. Using the phantom data, filters were derived for optimal harmonisation between scanners and with and without RR as applicable, to the EARL standard. The 80-patient cohort did not reveal any statistically significant differences. In the 10-patient cohort SUVmax for RR > no RR irrespective of harmonisation but differences lacked statistical significance (one-way ANOVA F(3.36) = 0.37, p = 0.78). Bland-Altman analysis showed that harmonisation reduced the SUVmax ratio between RR and no RR to 1.07 (95% CI 0.96-1.18) with no outliers. conclusions: EQ•PET successfully enabled harmonisation between modern and older scanners and against the EARL standard. Harmonisation reduces SUVmax and dependence on the use of RR in the modern scanner. advances in knowledge: EQ•PET is feasible to harmonise different PET/CT scanners and reduces the effect of RR on SUVmax.
SUV variations can arise from factors like patient preparation, scan protocol, data acquisition/processing and image reconstruction/analysis. 7 Such variations are problematic both in clinical work and in multicentre clinical trials, where multiple imaging systems are used to recruit sufficient patients for statistical power. The process of deriving comparable SUVs from different equipment and protocols is termed harmonisation. Controversy exists in comparing clinical scans acquired with different advanced reconstruction software [8][9][10][11] that has stimulated debate regarding the use of harmonisation schemes. 12 Attempts to minimise SUV inconsistencies through harmonisation strategies have been recommended by various internationally recognised programmes, for example European Association of Nuclear Medicine (EANM) Research Ltd (EARL) accreditation 13,14 and the Radiological Society of North America's (RSNA) North American Quantitative Imaging Biomarkers Alliance (QIBA). Guidelines generally recommend scanning a fillable phantom consisting of hot spheres in a colder background, such as that designed by the Association of Electrical Equipment and Medical Imaging Manufacturers (NEMA) and the International Electrotechnical Commission (IEC). EARL recommends a value and permissible range for the recovery coefficient (RC) for each sphere diameter in the phantom. 15 Such approaches have proved helpful for achieving harmonisation in clinical trials.
One challenge with such methodology is the potential generation of a separate image set optimised for harmonisation, distinct from the other datasets usually acquired at a centre. This can have consequences for scan labelling, identifying which scan to report, disk storage/data backup and manpower resources. A solution to the challenge of multiple dataset production and associated management has been proposed as the "EQ•PET" proprietary harmonisation software developed by Siemens (syngo.MM Oncology, Siemens Medical Solutions USA Inc., Knoxville, TN, United States) 16 .
EQ•PET allows SUV to be harmonised to baseline scans obtained with less spatial resolution (generally on older equipment or software). EQ•PET applies a 3D Gaussian filter to images, using a filter kernel width ("EQ•PET factor") derived specifically for the particular scanner and protocol in use, that optimally harmonises that scanner and protocol to an agreed standard (e.g., EARL). The EQ•PET factor is usually derived using the NEMA / IEC phantom and is entered directly into the vendor's software.
Smoothing is performed in the background as the original image is displayed, and the harmonised SUV is displayed without additional image creation. Detectability and harmonisation are thus achieved simultaneously. The EQ·PET has been shown to product similar results as EARL, 17 but without requiring a separate reconstruction.
In this pilot study, we aimed to compare the effects of harmonisation of images from two scanners of different generations: Siemens Biograph-mCT with lutetium oxyorthosilicate (LSO) crystals, including the increased axial field of view (Siemens "TrueV") which operates without septa in "3D mode" ("scanner A"), and General Electric (GE) (GE Healthcare, Milwaukee, WI, USA) Discovery VCT with BGO crystals ("scanner B"), operated with septa extended between imaging planes in "2D mode". Harmonisation was performed both between each scanner to the EARL standard, and also between the two scanners, harmonising the newer mCT to the older VCT. A NEMA phantom was used to derive EQ•PET filter parameters. The SUVs measured on clinical images from both scanners were then compared, with and without harmonisation.

MeThOds and MaTerials
This study does not constitute clinical research according to the National Health Service England and was therefore conducted as an audit without the need for patient consent or ethical approval.

Phantom data
The NEMA phantom was prepared without the lung insert as scatter correction was not a prime concern. Spheres and background were filled with F-18 FDG solution concentrations of 45 kBq/ml and 5 kBq/ml respectively, at the time of the first imaging, giving a ratio of 9:1 between these phantom compartments.
The phantom was scanned on both scanners using standard clinical imaging protocols. The scanner A protocol consisted of a 3D PET acquisition, with 3 min/bed, reconstructed using ordered subsets expectation maximisation using time-of-flight information (OSEM + TOF), both with and without RR, with two iterations, 21 subsets, a 200×200 matrix and a Gaussian 5 mm post-filter (referred to hereafter as OSEM + TOF+ RR and OSEM + TOF). The scanner B protocol consisted of a 2D PET acquisition with 4 min/bed, reconstructed using OSEM VuePoint with two iterations, 28 subsets, a 128×128 matrix and a 5.14 mm post-filter. Standard clinical low-dose CT protocols were used for attenuation correction on both scanners. The phantom was scanned six times, three times alternately on each scanner, starting with scanner B. Both scanners had passed manufacturer and nationally approved QA regimes and were serviced at recommended periods. 18 The following harmonisations were attempted using EQ•PET filtering: Clinical data 40 patients with suspected cancer in the head and neck (n = 10), lung (n = 10), oesophagus (n = 5), colorectum (n = 8) or lymphatic system (n = 7) were consecutively selected from the scanner A database to investigate the influence of different clinical acquisition parameters on the ability of EQ•PET to harmonise suspected lesion SUVmax. Similarly a second independent group of 40 patients with the same number of clinical indications were also consecutively selected from the scanner B database over the same time period. All 80 patients were imaged using routine protocols over a period of approximately 7 months. Pre-harmonisation phantom RCs for both scanners, with and without RR, compared against the EARL specification are shown in Figure 1. Only the scanner A without RR plot adhered to EARL limits.
Respective phantom RCs harmonised to the EARL criteria are displayed in Figure 2. All three plots lie within the EARL recommended range, indicating that harmonisation was possible in all cases for all spheres.
In the case of scanner A harmonised to scanner B (Figure 3), it is seen that scanner A data can be harmonised to lie within a range based on scanner B data, whose width is (arbitrarily) proportionately that of the EARL recommendations. It is noted in the case of the 28 mm sphere the scanner A harmonised RC falls just outside the adopted lower limit by a small percentage difference of 0.58%.

clinical data
40 different patients were imaged on each scanner. For each scanner, demographic parameters were investigated, some known to influence SUVmax: no significant differences were found between paired patient age (p = 0.89), weight (p = 0.96), injected activity (p = 0.72) or blood glucose (p = 0.34). There is therefore a low probability of bias in results presented here arising from variations in patient demographics between scanners.
The clinical effect of scanner A RR with or without harmonisation with respect to EARL or scanner B can be seen in the box and whisker plot ( Figure 4) depicting min -[mean ± sd]max clinical SUVmax distributions from 80 suspected oncology patients. This plot also depicts the same information for scanner B with or without harmonisation with respect to EARL specifications. One-way ANOVA revealed no statistically significant  For this subcohort Figure 6 plots the scanner A data SUVmax with (x-axis) and without (y-axis) RR reconstruction, and both with (square) or without (triangle) harmonisation to the EARL standard for comparison. A good correlation coefficient is achieved both with (r 2 = 0.99) or without (r 2 = 0.97) harmonisation. The correlation coefficient must be interpreted bearing in mind that reconstructions with and without RR derive from the same raw data; note although that the level of correlation remains high when harmonisation is applied. The equality line shows that SUVmax with RR is greater than that without RR, irrespective of harmonisation status; however, the slope for harmonised scans (0.89) is greater than that for non-harmonised scans (0.76), indicating that with harmonisation there is less effect on SUVmax when RR is included in the reconstruction. Figure 7 depicts the SUVmax relationship between RR and no RR ratio, without EARL harmonisation, using Bland-Altman analysis. The mean ratio was 1.19 (95% CI 0.95-1.42). No outliers were observed beyond the 2xSD CI about the mean. Figure 8 shows the SUVmax relationship between RR and no RR, this time with EARL harmonisation. The mean ratio was 1.07 (95% CI 0.96-1.18). The lower ratio in suggests that harmonisation diminishes the effect of RR on the SUVmax. Again no outliers were observed beyond the 2xSD standard CI about the mean. In both cases, a low (<0.2) Pearson's correlation coefficient indicates that there is unlikely to be any correlation with SUVmax.

discussiOn
Harmonising scanners in PET/CT centres with different imaging equipment may be of value when scanners have not or cannot be harmonised to recommended guidelines like EARL, but pass all required QA regimes for safe use 19 and have clinician-approved image quality. During scanner downtime, consistency in patient SUV may be preserved by scanning on different equipment and harmonising SUVs. In addition, harmonisation using modern equipment can play a role immediately following replacement of older technology, where one may wish to ensure consistency of SUV between scanners. EQ•PET harmonisation assists without additional dataset creation in such cases.
We successfully harmonised scanner A data with and without RR to EARL and scanner B data to EARL at a centre that had PET/ CT national certification but had not applied for EARL accreditation, Figure 2. In principle, using an acceptable RC range derived from EARL, we also harmonised scanner A RR data to scanner B, although there are some concerns about this expressed in other work. 20 We found that datasets reconstructed with RR yielded higher mean SUVmax values than datasets reconstructed without RR, but that this difference was less when harmonisation was applied. However, no statistically significant differences in SUVmax were observed between scanner A RR, scanner A RR harmonised to EARL, scanner A RR harmonised to scanner B, scanner B or scanner B harmonised to EARL scanned datasets.
It should be noted that the phantom used for these results did not have the lung insert in place, and this might affect the relationship of 2D to 3D results because the scatter conditions are significantly different between the two modalities. This was not investigated in this work and is a limitation of the study.
SUVmax comparison of clinical datasets from scanner A, with or without RR, and with or without harmonisation was investigated in a subgroup of 10 patients ( Figure 5). Mean SUVmax was greater when RR was used, irrespective of harmonisation, but harmonisation to EARL resulted in a lower mean and sd. This again suggests less influence of reconstruction algorithm when harmonisation was applied, although no statistically significant differences were observed between scanner A RR, scanner A RR harmonised to EARL, scanner A without RR or scanner A without RR harmonised to EARL scanned datasets.
The findings in Figure 6 confirm those in Figure 5. RR increases SUVmax in a subset of 10 suspected oncology patients scanned on scanner A, irrespective of harmonisation to EARL, but this effect is reduced by harmonisation. ( Figures 7 and 8) demonstrate the same thing: in the subset of 10 patients, the mean ratio of SUVmax with and without RR is closer to unity (10.6% lower) when harmonisation is used. The confidence limit is also smaller (72.5% lower) with harmonisation.
The EANM/EARL and EQ•PET harmonisation process applied in this pilot study has limitations. Internationally not all PET/ CT scanning centres fully conform to EANM-endorsed imaging guidelines but will maintain national QA standards for routine and research scans. Here EARL RCs were based on an average of three scans. Furthermore, RR reconstruction may be differently characterised in simple geometry like the NEMA phantom rather than clinically with variable lesion sizes, 21 heterogeneous lesions and variable background activity distributions including respiratory motion effects. Other filtering techniques and measurement metrics for example SUVpeak, tumour lesion glycolysis (TLG) or metabolic tumour volume (MTV) may provide additional information in this setting and be less sensitive to reconstruction algorithms, 22 and it is a specific limitation of the study that only SUVmax is considered. This paper does not consider the frequency with which EQ•PET filter parameters should be checked; however, it is reasonable to assume that this should be done after significant modification to the data acquisition, reconstruction or image processing. 20 There is growing awareness of PET harmonisation techniques, particularly involving modern technology operated under EANM guidelines against the EARL standard. A recent harmonisation study 23 used EQ•PET on images from a single scanner, reconstructed using RR and 3D OSEM, of 50 NSCLC patients. Bland-Altman analysis showed that the impact on SUV of the choice of reconstruction algorithm could be reduced without adversely affecting throughput of reconstruction and image interpretation. In another study, 24 86 non-small cell lung cancer (NSCLC) patients were imaged for treatment response assessment using three 3D scanners from the same vendor. The authors showed that choice of reconstruction algorithm introduced variability in PERCIST (PET response criteria in solid tumours) response classification, which was minimised by the use of harmonisation to EARL with EQ•PET using lean body mass-based SUVpeak.
In a further NSCLC study 25 describing different generation scanners, different reconstruction parameters were used to confirm successful EARL harmonisation via Bland-Altman  analysis. In a different study, 26 517 oncology patients with NSCLC, non-Hodgkin lymphoma and metastatic colorectal cancer were scanned on three different 3D scanner models but from the same manufacturer. Bland-Altman results showed EARL harmonisation with EQ•PET was successful for RR, TOF and OSEM algorithms using SUVmax, SUVmean and SUVpeak.
Another review 27 endorsed PET/CT EARL harmonisation methodology using EQ•PET in the oncology therapy setting whilst validating accreditation methodology in 51 participating centres.
In a new publication, 20 four modern TOF-enabled and RR-enabled 3D PET/CT scanners from three different vendors were used to demonstrate that harmonisation of phantom studies to EARL specifications could be achieved. However, different RCs were advocated and a modification to multicentre accreditation was recommended as a consequence of advanced reconstruction algorithms.
A recent forward-facing review 28 investigated adapting harmonisation to other radiopharmaceuticals for example Ga-68-PSMA in prostate cancer for lymph node detection. Literature also exists relating to advanced image analysis techniques like textural analysis 29 with PET/CT 30 which require consistency between SUV acquired under different conditions. In a single 3D scanner study 31 with 71 lesions in 60 NSCLC patients comparable SUV and heterogeneity information was achieved using RR and harmonising with Gaussian smoothing, and then compared to simpler OSEM reconstructed scans.
The majority of clinical studies using EQ•PET have been based on NSCLC, with Siemens modern 3D scanners using LSObased crystal detector technology. There is little reported about harmonisation between truly different generations of PET/CT scanner, especially across different vendors.
The retrospective cohort of 80 patients in this study demonstrated the clinical feasibility of harmonisation between two scanners and against the EARL standard. Future studies are suggested with larger patient numbers combined with varied scanners from different vendors for further analysis of the clinical consequences of harmonisation. We encourage the adoption of the EQ•PET concept to promote harmonisation without creation of multiple datasets.
cOnclusiOn As a PET/CT centre approved for national research studies, we have shown in a pilot study that different PET/CT scanners can be harmonised using EQ•PET to current EARL specifications.
Our results are in accordance with others. No statistically significant differences among 80 suspected oncology patient lesion SUVmax with or without harmonisation between scanners was observed. Standard clinical protocols with or without RR reconstruction had lower SUVmax mean and sd following respective harmonisation. A subset of 10 patients reconstructed with or without RR and with or without harmonisation revealed smaller mean and sd in harmonised scans, but again differences were not statistically significant. Harmonisation with EQ•PET is shown to reduce dependence on reconstruction algorithm in both older and modern scanners.