Interobserver and Intraobserver Reproducibility with Volume Dynamic Contrast Enhanced Computed Tomography (DCE-CT) in Gastroesophageal Junction Cancer

The purpose of this study was to assess inter- and intra-observer reproducibility of three different analytic methods for evaluating quantitative dynamic contrast-enhanced computed tomography (DCE-CT) measures in gastroesophageal junction cancer. Twenty-five DCE-CT studies of gastroesophageal junction cancer were selected from a previous longitudinal study. Three radiologists independently reviewed all scans, and one repeated the analysis eight months later for intraobserver analysis. Review of the scans consisted of three analysis methods: (I) four fixed, small regions of interest (2-dimensional (2D) fixed ROIs) placed in the tumor periphery, (II) a 2-dimensional region of interest (2D-ROI) drawn along the tumor border at the tumor center, and (III) a 3-dimensional volume of interest (3D-VOI) containing the entire tumor volume. Arterial flow, blood volume and permeability (Ktrans) were recorded for each observation. Inter- and intra-observer variability were assessed by Intraclass Correlation Coefficient (ICC) and Bland-Altman statistics. Interobserver ICC was excellent for arterial flow (0.88), for blood volume (0.89) and for permeability (0.91) with 3D-VOI analysis. The 95% limits of agreement were narrower for 3D analysis compared to 2D analysis. Three-dimensional volume DCE-CT analysis of gastroesophageal junction cancer provides higher inter- and intra-observer reproducibility with narrower limits of agreement between readers compared to 2D analysis.


Introduction
Dynamic contrast-enhanced computed tomography (DCE-CT) is a functional imaging technique to measure tumor vascularization in vivo [1]. Tumor angiogenesis is essential in tumor growth, and DCE-CT could serve as a functional diagnostic tool to supplement morphologic imaging in terms of tumor characterization and response evaluation [2][3][4][5][6].
DCE-CT analysis is based on serial image acquisitions during contrast administration. Using different kinetic models, these acquisitions can be used to estimate the exchange of contrast between the intravascular space and the interstitial space in terms of blood flow, blood volume and permeability. Dedicated software generates quantitative maps with perfusion parameters after user definition of the arterial input [7]. Implementation of a DCE-CT scan protocol is relatively straightforward, and the wide availability of scanners compared to other functional imaging modalities makes it an interesting approach in medical imaging. Multidetector CT scanners can cover up to 16 cm in the z-axis with fixed table position, thereby enabling perfusion studies of entire organs or tumors without loss of temporal uniformity [8].
Reproducibility has previously been assessed in DCE-CT [9][10][11][12] with varying results, which could be attributed to tumor heterogeneity. Volume perfusion enables whole tumor coverage, and could thereby limit the variability caused by heterogeneity.
The aim of this study was to assess the inter- and intra-observer agreement in the reading of perfusion studies of gastroesophageal junction cancer with three methods: (I) average perfusion measures derived from four separate, fixed-size regions of interest in the tumor periphery (2-dimensional (2D) fixed ROIs), (II) a 2-dimensional free-hand drawn region of interest (2D-ROI) along the tumor border at the tumor center, and (III) a 3-dimensional volume of interest (3D-VOI) covering the entire tumor volume. Our hypothesis was that 3D-VOI analysis would improve the inter- and intra-observer agreement on reading blood flow, blood volume and permeability compared to 2D measures.

Patients
Twenty-five consecutive abdominal DCE-CT scans of gastroesophageal junction cancer were selected from a previous longitudinal study investigating perfusion changes in gastroesophageal junction cancer and gastric cancer during pre-operative chemotherapy [13]. The original study population also included five patients with primary gastric cancer. These cases were excluded from this study because they were anatomically different and hence would not fit the three analysis methods described below. The study population consisted of 23 males and two females, with a mean age of 65 years. All patients had biopsy-confirmed adenocarcinoma of the gastroesophageal junction (GEJ), and were eligible for pre-operative chemotherapy and surgery. All scans for this study were performed prior to chemotherapy. The research protocol was approved by the Committees on Biomedical Research for the Capital Region of Denmark (protocol number H-1-2010-132). All patients gave oral and written informed consent according to the Helsinki II Declaration.

Dynamic Contrast Enhanced CT Analysis
The scanning parameters, contrast media and patient preparation details are listed in List 1. Three radiologists independently analyzed each perfusion scan as described in the following section. Reader 1 (Eva Fallentin) was a Consultant Radiologist with 20 years of experience in abdominal radiology, and two years of experience in DCE-CT. Reader 2 (Thomas Axelsen) was a Consultant Radiologist with seven years of experience in abdominal radiology and one year of experience in DCE-CT. Reader 3 (Martin Lundsgaard Hansen) was a Resident Radiologist with two years of experience in DCE-CT. Readers 1 and 2 assessed inter-observer reproducibility, and Reader 3 assessed intraobserver reproducibility.
For data analysis, the input artery was selected by placing a 100 mm² circular ROI in the center of the abdominal aorta, and a second ROI was placed in the tumor. Parametric perfusion maps were generated for arterial flow (maximum slope method, single compartment), blood volume and permeability (Ktrans) (Patlak method, double compartment). Each perfusion scan was analyzed with three different approaches by the two readers. Reader 3 re-evaluated the perfusion scans ten months later for intra-observer agreement. The different analysis methods are also illustrated in Figure 1.
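The two kinetic models named above can be sketched in a few lines. The following is a minimal illustration on synthetic, baseline-subtracted enhancement curves, not the vendor software used in the study. The function names, the synthetic arterial curve and the parameter values are assumptions made for the example. In the maximum slope method, flow is estimated as the steepest rise of the tissue curve divided by the peak arterial enhancement. In the Patlak method, the ratio of tissue to arterial enhancement is plotted against the running arterial integral divided by the arterial enhancement. The slope of that line estimates Ktrans and the intercept estimates the fractional blood volume.

```python
import math


def cumulative_trapezoid(times, values):
    """Running integral of values over times (trapezoid rule), starting at 0."""
    total = [0.0]
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        total.append(total[-1] + 0.5 * dt * (values[i] + values[i - 1]))
    return total


def linear_fit(x, y):
    """Ordinary least-squares fit; returns (slope, intercept)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def patlak(times, arterial, tissue, min_arterial=1.0):
    """Patlak plot: returns (k_trans per second, fractional blood volume)."""
    integral = cumulative_trapezoid(times, arterial)
    xs, ys = [], []
    for ca, ct, area in zip(arterial, tissue, integral):
        if ca > min_arterial:  # skip samples where dividing by ca is unstable
            xs.append(area / ca)
            ys.append(ct / ca)
    return linear_fit(xs, ys)


def max_slope_flow(times, arterial, tissue):
    """Maximum slope method: steepest tissue rise / peak arterial enhancement (per second)."""
    slopes = [
        (tissue[i] - tissue[i - 1]) / (times[i] - times[i - 1])
        for i in range(1, len(times))
    ]
    return max(slopes) / max(arterial)


def synthetic_curves(k_trans, blood_volume):
    """Hypothetical curves: gamma-like arterial input and a Patlak-model tissue curve."""
    times = [2.0 * i for i in range(31)]  # 0 to 60 s, one sample every 2 s
    arterial = [300.0 * (t / 10.0) * math.exp(1.0 - t / 10.0) for t in times]
    integral = cumulative_trapezoid(times, arterial)
    tissue = [k_trans * a + blood_volume * ca for a, ca in zip(integral, arterial)]
    return times, arterial, tissue


if __name__ == "__main__":
    t, ca, ct = synthetic_curves(k_trans=0.002, blood_volume=0.05)
    k, vb = patlak(t, ca, ct)
    # Multiply per-second rates by 60 * 100 to express them per minute per 100 mL.
    print("Ktrans (mL/min/100 mL):", k * 6000.0)
    print("Blood volume (mL/100 mL):", vb * 100.0)
    print("Flow (mL/min/100 mL):", max_slope_flow(t, ca, ct) * 6000.0)
```

Because the synthetic tissue curve is generated from the Patlak model itself, the fit recovers the input values. Real data are noisy and would additionally need motion correction and a choice of which time points to include in the fit.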

Scan protocol and parameters
• 320-detector row CT scanner (Aquilion ONE, Toshiba Medical Systems, Ohtawara, Japan)
• z-axis coverage 12-16 cm
• 100 kV and 100 mA
• 0.5 s/rotation time and a fixed …

Method (I): 2D Fixed ROIs in the Tumor Periphery at the Center Level of Tumor
The center level of the tumor was selected on a conventional CT scan, with the largest cross-sectional area taken as the most representative level of the tumor. Small ROIs (25-30 mm²) were placed in the tumor periphery at 12, 3, 6, and 9 o'clock. The image number (tumor level) and perfusion parameters from each ROI were recorded. Average arterial flow, blood volume and permeability were calculated from the four ROIs.
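The averaging step of the fixed-ROI method can be written out as a short sketch. The dictionary layout, the function name and the numbers below are hypothetical and serve only to illustrate the calculation: one value per perfusion parameter for each of the four clock positions, reduced to a single mean per parameter.

```python
from statistics import mean

CLOCK_POSITIONS = ("12", "3", "6", "9")


def average_fixed_rois(roi_values):
    """Average each perfusion parameter over the four clock-position ROIs."""
    missing = [p for p in CLOCK_POSITIONS if p not in roi_values]
    if missing:
        raise ValueError(f"missing ROI positions: {missing}")
    parameters = roi_values[CLOCK_POSITIONS[0]].keys()
    return {
        name: mean(roi_values[p][name] for p in CLOCK_POSITIONS)
        for name in parameters
    }


if __name__ == "__main__":
    # Hypothetical readings for one tumor, not data from the study.
    example = {
        "12": {"arterial_flow": 40.0, "blood_volume": 8.0, "permeability": 20.0},
        "3": {"arterial_flow": 60.0, "blood_volume": 10.0, "permeability": 30.0},
        "6": {"arterial_flow": 50.0, "blood_volume": 9.0, "permeability": 25.0},
        "9": {"arterial_flow": 70.0, "blood_volume": 13.0, "permeability": 45.0},
    }
    print(average_fixed_rois(example))
```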

Method (II): 2D Region of Interest (2D-ROI) around Tumor Border at the Center Level of Tumor
A representative level of the tumor was selected and a free-hand ROI was drawn around the tumor border, excluding the lumen and extratumoral tissue. The image number (tumor level) was noted, and the area of the ROI and the perfusion parameters (arterial flow, blood volume and permeability) were measured.

Method (III): 3D Volume of Interest (3D-VOI) Encompassing Entire Tumor Volume
Using a sculpt tool, a volume covering several images was defined as the tumor volume. The lumen and extratumoral tissue were excluded, as in the two methods above. The first and last image numbers were noted, and the tumor volume and the perfusion parameters (arterial flow, blood volume and permeability) were measured. List 2 summarizes the data recorded from the study.

Statistics
Perfusion variables derived from the analysis were evaluated for normal distribution using the Kolmogorov-Smirnov test. A paired t-test was used to compare tumor length and tumor volume between the readers. Intra- and interobserver reliability was calculated using Bland-Altman statistics with 95% limits of agreement [14] in Prism (Version 6, GraphPad Software, La Jolla, CA, USA). Correlation between observations was calculated using two-way random, single measure (absolute agreement) Intraclass Correlation Coefficients (ICCs, model 2,1) for analysis methods (I), (II) and (III) using SPSS for Mac (version 20, IBM, Armonk, NY, USA). An ICC value below 0.40 was considered poor reliability, values between 0.41 and 0.59 fair, values between 0.60 and 0.74 good, and values between 0.75 and 1.00 excellent (15). p values below 0.05 were considered statistically significant.
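As an illustration of the two agreement statistics described above, the following sketch computes Bland-Altman 95% limits of agreement for paired readings and a two-way random, single measure, absolute agreement ICC (model 2,1) from its ANOVA mean squares. It is a plain Python approximation of what Prism and SPSS report, not the authors' actual workflow. The function names and the example readings are assumptions, and the handling of values that fall exactly on a category boundary (for example 0.40) is also an assumption, since the text leaves those boundaries open.

```python
from statistics import mean, stdev


def bland_altman(reader_a, reader_b):
    """Return (bias, lower limit, upper limit) with 95% limits of agreement."""
    differences = [a - b for a, b in zip(reader_a, reader_b)]
    bias = mean(differences)
    spread = 1.96 * stdev(differences)  # sample standard deviation
    return bias, bias - spread, bias + spread


def icc_2_1(ratings):
    """ICC(2,1): two-way random effects, absolute agreement, single measure.

    ratings is a list of rows, one row per subject, one column per reader.
    """
    n = len(ratings)
    k = len(ratings[0])
    grand = mean(value for row in ratings for value in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]

    ss_total = sum((value - grand) ** 2 for row in ratings for value in row)
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )


def icc_category(icc):
    """Map an ICC to the reliability categories used in this study.

    Boundary handling is an assumption: values are binned at 0.40, 0.60 and 0.75.
    """
    if icc < 0.40:
        return "poor"
    if icc < 0.60:
        return "fair"
    if icc < 0.75:
        return "good"
    return "excellent"


if __name__ == "__main__":
    # Hypothetical arterial flow readings from two readers, not study data.
    reader_1 = [52.0, 61.0, 47.0, 70.0, 58.0]
    reader_2 = [50.0, 65.0, 45.0, 74.0, 55.0]
    print(bland_altman(reader_1, reader_2))
    value = icc_2_1([list(pair) for pair in zip(reader_1, reader_2)])
    print(value, icc_category(value))
```

The ICC function accepts any number of readers per subject, so the same code covers both the interobserver comparison and the repeated readings of a single reader.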

Results and Discussion
A total of 100 DCE-CT readings (25 readings from Reader 1, 25 readings from Reader 2 and 50 readings (2 × 25) from Reader 3) were available for analysis. Average reading time for Reader 3 was 11 min, which included loading, setting parameters for analysis and drawing ROIs according to the three described methods.

Bland-Altman Limits of Agreement and Intraclass Correlation Coefficient (ICC)
Tables 1 and 2 summarize the Bland-Altman limits of agreement and ICC for arterial flow, blood volume and permeability measured with the three methods of analysis: (I) 2D fixed ROIs, (II) 2D-ROI and (III) 3D-VOI. Interobserver ICC was excellent for arterial flow (0.93), blood volume (0.94) and permeability (0.86) with 3D-VOI analysis. For 2D analysis (methods I and II), ICC was fair to good (0.49-0.71) for all perfusion parameters. Intra-observer ICC (Reader 3) for 3D-VOI was excellent for arterial flow (0.96) and blood volume (0.83), and good for permeability (0.60). The span of the limits of agreement was narrower for all parameters in both inter- and intra-observer comparisons with 3D-VOI analysis (III) compared to 2D analysis. Inter-observer comparison between the specialists (Readers 1 and 2) and the resident (Reader 3) showed similar trends, with narrower limits of agreement with 3D-VOI analysis (data not shown).

[Table: CT perfusion parameter and method; 95% limits of agreement; intraobserver ICC. Arterial flow is given in mL·min⁻¹·100 g⁻¹. Table body not recoverable.]

Tumor Definition and Delineation
There was no significant difference between Readers 1 and 2 in the selection of tumor center (p = 0.2), the area of the center tumor (p = 0.07) or the tumor volume (p = 0.5).

Discussion
Reproducible data are essential for a new imaging modality to achieve acceptance in a clinical setting. The aim of this study was to examine reproducibility in reading DCE-CT studies of gastroesophageal cancer with three different approaches, and we showed that the limits of agreement were narrower for volumetric analysis compared to 2D ROI analysis, although all the methods had relatively wide limits of agreement in both inter- and intra-observer reproducibility. To our knowledge, this is the first reproducibility study on volumetric DCE-CT in gastroesophageal cancer.
Physiological measures derived from DCE-CT analysis are the sum of true tissue perfusion, physiological variation, methodological variability in image acquisition and the variability introduced at the level of analysis [12]. This study addresses the variability introduced by the reader during analysis at the workstation. Our data show no statistically significant difference in ICC between 2D and 3D analysis, but narrower limits of agreement with 3D analysis. Interestingly, agreement was higher between the two specialists (interobserver) than between the resident's two readings (intraobserver). One explanation could be greater routine and better agreement between specialists, as reading and interpreting gastroesophageal cancer on CT scans is difficult and generally regarded as a specialist task. Ng et al. [11] showed that for lung tumors, the reproducibility increased with larger z-axis coverage (10 to 40 mm), which enabled measurement of whole-tumor perfusion values. Chalian et al. [15] examined the interobserver variability between ROI analysis and VOI analysis of hepatic metastases of colorectal cancer, and found similarly narrower limits of agreement in favor of volume analysis. On the other hand, Goh et al. [10] examined 10 patients with colorectal cancers and found the same variability with coverage varying from 5 to 20 mm. Intratumoral heterogeneity can cause higher variability in measurements in a single plane. Volumetric analysis, on the other hand, enables the reader to better exclude non-viable tissue or large vascular structures. It has been shown that automated software for setting parameters for the kinetic models could result in better agreement between readers [16], and these results may point out the direction for future analysis software.
The relatively large limits of agreement demonstrated in our study challenge the use of DCE-CT in a clinical setting. DCE-CT has been applied in studies to measure therapy-induced changes in the tumors' vascularity. For hepatocellular carcinomas [17], it has been shown that decreases of more than 35% in blood flow, 43% in blood volume and 93% in permeability were beyond variability in analysis, and could therefore serve as limits for response assessment. The limits of agreement in our interobserver analysis correspond to a decrease of 20% in arterial flow, 39% in blood volume and 36% in permeability. In clinical practice, biological variation in repeated scans will also play a role, making even larger limits necessary. Harders et al. [18] examined 59 patients with a single DCE-CT for suspected lung malignancy and found that DCE-CT could not discriminate between benign and malignant lesions, partly because of the wide limits of agreement. 3D analysis is time consuming compared to 2D analysis and is typically done on a dedicated workstation, which does not favor its use in a clinical setting.
Research within the field of DCE-CT has been challenged by the lack of standards, making inter-study comparison difficult. In 2012, Miles et al. published the Experimental Cancer Medicine Centre (ECMC) Network consensus document [19], and proposed recommendations for both scan protocols and for reporting data derived from DCE-CT. The required scan length is primarily dependent on the selected analysis model, and Miles et al. [19] recommend a total scan duration of 60-75 s when using Patlak analysis. Our protocol with a scan duration of 55-60 s was just short of this recommendation. Another fundamental requirement for DCE-CT analysis is maintaining temporal alignment of the targeted tissue for correct perfusion calculations. Tumors of the gastrointestinal tract are affected by motion from both peristalsis and respiratory movement of the diaphragm and anterior abdominal wall. Without measures to reduce and compensate for motion, the tumor borders can become blurred, impairing perfusion parameters [7]. It has been shown that a free breathing protocol requires less post-processing compared to a breath-hold scan protocol [20]. We placed an abdominal strap around the patient to limit movement artefacts from anterior wall movement [21,22], and it was our experience that this measure also helped remind the patient to breathe shallowly. We used hyoscine butylbromide as an anti-peristaltic drug, which is generally recommended in DCE-CT imaging of the bowel or pelvis to limit movement [2,7,10]. Lastly, we applied an advanced semi-automated 3-dimensional non-rigid motion correction model, which is recommended in volumetric DCE-CT [19]. Motion correction software is typically vendor specific and is implemented in various forms, making universal recommendations difficult.
Many tumors exhibit heterogeneous perfusion [23]. We observed a substantial difference between each of the four ROIs used in method I, demonstrating this intratumoral heterogeneity. By averaging these four ROIs, the limits of agreement came close to the results derived from 2D-ROI analysis. Furthermore, 3D-VOI analysis (method III) averages the intratumoral heterogeneity in all planes and resulted in higher ICC and narrower limits of agreement, suggesting that volumetric perfusion analysis is more suitable for clinical use. On the other hand, average measurements do not reflect the distribution of perfusion inside the tumor volume [24]. Tumors respond differently to treatment with chemotherapy, radiotherapy, embolization or cryoablation, and, in some instances, it could be more attractive to report residual vascular "hot spots" [25] instead of perfusion averages. Analytic methods such as histogram analysis [26] or texture analysis [27] could also be a future direction for optimizing the reading of perfusion analysis.
Our study only demonstrates the variability in data analysis, and does not address the importance of repeatability in DCE-CT. The variability in data analysis was caused both by the setting of analysis parameters and by the actual delineation of the tumor. The variability of setting the analysis parameters could have been further investigated by letting the two readers analyze the data with the same initial setting of arterial input and Patlak phase.
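The use of such percentages as response limits can be made concrete with a small sketch. It takes the interobserver percentages quoted above (20%, 39% and 36%) as thresholds and checks whether an observed decrease between two scans exceeds them. The function names, the dictionary keys and the example values are hypothetical, and, as noted above, biological variation would call for wider limits in practice.

```python
# Interobserver limits quoted in the text, expressed as percent decrease.
INTEROBSERVER_LIMITS_PERCENT = {
    "arterial_flow": 20.0,
    "blood_volume": 39.0,
    "permeability": 36.0,
}


def percent_decrease(baseline, follow_up):
    """Relative decrease from baseline to follow-up, in percent."""
    if baseline <= 0:
        raise ValueError("baseline must be positive")
    return 100.0 * (baseline - follow_up) / baseline


def beyond_reader_variability(parameter, baseline, follow_up):
    """True if the decrease is larger than the interobserver limit for the parameter."""
    limit = INTEROBSERVER_LIMITS_PERCENT[parameter]
    return percent_decrease(baseline, follow_up) > limit


if __name__ == "__main__":
    # Hypothetical values before and after therapy, not study data.
    print(beyond_reader_variability("arterial_flow", 100.0, 70.0))
    print(beyond_reader_variability("blood_volume", 100.0, 70.0))
```

In this example a 30% drop would count as a change in arterial flow but not in blood volume, which illustrates why the width of the limits of agreement matters for response assessment.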

Conclusions
In conclusion, 3D volumetric analysis of DCE-CT studies has narrower limits of agreement in gastroesophageal junction cancer compared to 2D analysis. Tumors at the gastroesophageal junction are difficult to delineate, and there is a difference between readers in defining the cranial and caudal borders of the tumor, although this did not affect the final tumor volume. Future studies are needed to address whether average tumor perfusion measures and more complex analytic methods such as texture analysis are suited for reporting clinically relevant values from DCE-CT studies.