On the total system error of a robotic radiosurgery system: phantom measurements, clinical evaluation and long-term analysis

The total system error (TSE) of a CyberKnife® system was measured using two phantom-based methods and one patient-based method. The standard radiochromic film (RCF) end-to-end (E2E) test using an anthropomorphic head and neck phantom and isocentric treatment delivery was used with the 6Dskull, Fiducial and Xsight® spine (XST) tracking methods. More than 200 RCF-based E2E results covering the period from installation in 2006 until 2017 were analyzed with respect to tracking method, system hardware and software versions, secondary collimation system, and years since installation. An independent polymer gel E2E method was also applied, involving a 3D printed head phantom and multiple spherical target volumes widely distributed within the brain. Finally, the TSE was assessed by comparing the delineated target in the planning computed tomography images of a patient treated for a thalamic functional target with the radiation-induced lesion defined on the six-month follow-up magnetic resonance (MR) images. Statistical analysis of the RCF-based TSE results showed mean  ±  standard deviation values of 0.40  ±  0.18 mm, 0.40  ±  0.19 mm, and 0.55  ±  0.20 mm for the 6Dskull, Fiducial, and XST tracking methods, respectively. Polymer gel TSE values smaller than 0.66 mm were found for seven targets distributed within the brain, showing that the targeting accuracy of the system is sustained even for targets situated up to 80 mm away from the center of the skull. An average clinical TSE value of 0.87  ±  0.25 mm was also measured using the FSE T2 and FLAIR post-treatment MR image data. Analysis of the long-term RCF-based E2E tests showed no changes of TSE over time. This study is the first to report long-term (>10 years) analysis of TSE, TSE measurement for targets positioned at large distances from the virtual machine isocenter, or a clinical assessment of TSE for the CyberKnife system. All of these measurements demonstrate TSE consistently  <  1 mm.

furthermore, the apparent locations (defined on the treatment planning image series) of a set of fiducials residing on appropriate devices mounted on the frame (e.g. a localization box) are used to register the image modality coordinate system to the stereotactic space (Heller et al 2008). In frameless radiosurgery, on the other hand, the stereotactic space is defined on the treatment planning CT imaging data, using patient anatomical features or structures and/or fiducials properly implanted to the anatomical region intended for treatment. Precise dose delivery is then facilitated by the in-room image guidance subsystem, which enables accurate localization of the target within the stereotactic space via a sequence of radiographs acquired during treatment, with selectable temporal resolution (Heller et al 2008).
The total treatment delivery uncertainty of an SRS system is composed of uncertainties associated with the imaging equipment used for planning (CT, MR, etc), including uncertainties associated with image fusion, uncertainties in the volume definition and dose calculation within the treatment planning system, the mechanical uncertainties of the treatment delivery machine, and uncertainties relating to the image guidance subsystem, when applied (ICRU 2014). The latter comprise the combined effect of mechanical uncertainties of the imaging or detector subsystems, the finite resolution of the digital detectors and the uncertainty ascribed to the procedure of co-registering the images or radiographs acquired during treatment delivery, and the digitally reconstructed radiographs (DRRs). In a comprehensive quality assurance program, the overall uncertainty should be characterized into individual uncertainty contributors (Mackie and Palta 2011). Assessment of the overall uncertainty is also commonly performed using end-to-end (E2E) tests tailored to the specifications and methods of each radiation treatment modality. The development of appropriate E2E test methods, enabling accurate and precise assessment and characterization of the total treatment delivery uncertainty, is therefore of crucial importance (ICRU 2014).
In this work, the total geometric treatment uncertainty, also referred to as total system error (TSE), of a CyberKnife ® System (Accuray Inc., Sunnyvale, CA, USA) is assessed by three independent methods. First we present results of the standard E2E tests proposed by the manufacturer, using radiochromic films (RCFs), as obtained in our clinic over a period of more than 10 years. The RCF results are presented and compared with regard to tracking method and the years since the system's installation. Temporal analysis of CyberKnife TSE results is valuable, since the long-term stability of uncertainty contributions such as the mechanical accuracy of the robotic manipulator has not been previously investigated. However, the standard E2E test method is limited in that it measures TSE only for a target volume close to the virtual machine isocenter, whereas treatments often involve targets several cm from this position. In addition, this same method is used as the final stage of geometric system calibration, so that any systematic uncertainty inherent in the test method would not be detected by routine E2E tests. Therefore, we also present TSE results measured using polymer gel dosimetry combined with a 3D printed head phantom derived from a patient CT and containing multiple targets, to emulate a scenario closer to clinical practice. This method is fully independent of the vendor-recommended E2E test and enables TSE measurement for targets located anywhere within the phantom. However, both of these methods involve a phantom simulation of the patient anatomy used for tracking, and unlike with a real patient there is no intra-fraction target motion. Finally, therefore, the TSE of the system was estimated using the imaging data of a patient treated for a thalamic functional target, by comparing the delineated target in the treatment planning CT images to the radiation-induced lesion defined on the six-month follow-up MR images.

Materials and methods
2.1. The CyberKnife system
CyberKnife (CK) is a dedicated treatment system for stereotactic radiosurgery and whole body stereotactic radiation therapy (Adler et al 1999). Detailed descriptions of its features and clinical applications can be found elsewhere (e.g. Kilby et al (2010)). In brief, CK consists of a compact lightweight linear accelerator (linac) that uses an X-band cavity magnetron and a side-coupled accelerating waveguide to produce a 6 MV x-ray treatment beam with a dose rate of 1000 cGy min⁻¹ at an 800 mm source-to-axis distance. The linac is mounted on a six-joint robotic manipulator (Kuka Roboter GmbH, Augsburg, Germany) capable of positioning the x-ray target at any point within a spherical shell of 600-1000 mm from a virtual machine isocenter with a nominal mechanical precision better than 0.12 mm (Kilby et al 2010). Secondary collimation is performed using either 12 fixed circular collimators, a variable circular aperture collimator (Iris™, Accuray Inc.) (Echner et al 2009, Kilby et al 2010) or, most recently, the InCise™ micro multileaf collimator (Asmerom et al 2016). The patient is placed in the treatment position using either a six-joint robotic treatment couch (RoboCouch®, Accuray Inc.) or a standard couch capable of performing all six motions except yaw (i.e. rotations around the central antero-posterior axis of the patient) (Kilby et al 2010). Dose delivery accuracy is ensured by an image guidance subsystem consisting of two kV x-ray tubes and two digital flat panel detectors. The x-ray tubes are installed on the ceiling of the treatment room and tilted at 45° so that the central axes of the generated beams are orthogonal to each other. The flat panel x-ray detectors, mounted horizontally on the treatment room floor, consist of a cesium-iodide scintillator deposited directly on amorphous silicon photodiodes and generate high-resolution digital images (1024 × 1024 pixels of 0.4 mm in size with 16-bit resolution).
Both the robotic manipulator and image guidance systems are spatially calibrated relative to a unique point in the treatment room space. In CK terminology, this point is referred to as the 'virtual isocenter' and is physically represented by a small crystal ('isocrystal') situated on the top of a floor-mounted post, the so-called 'isopost' (Antypas and Pantelis 2008, Dieterich et al 2011). Mechanical calibration of the robotic manipulator is performed using a laser beam aligned with the treatment beam. During calibration the robot is instructed to scan through all the nodes comprising each treatment path; at each node, the position and direction of the laser beam producing the maximum signal on the isocrystal is recorded (Antypas and Pantelis 2008). Mechanical calibration of the image guidance system is performed by ensuring that the central x-ray of each tube depicts the isocrystal at the central pixel of the corresponding flat panel (Antypas and Pantelis 2008).
It should be noted, however, that a residual systematic offset is possible between the image guidance and manipulator spatial coordinate systems; for example, owing to discrepancies between the laser and treatment beam axes. The translational component of this offset is taken into account by a final spatial correction, referred to as 'DeltaMan'. This correction is included in the calibration files stored on the treatment delivery computer of a CK system and is applied when the robotic manipulator and the image guidance subsystems are simultaneously employed, such as during treatment delivery and dosimetric QA measurements. The DeltaMan correction is derived from a sequence of E2E tests using RCFs, at the system's acceptance testing and after software upgrades or hardware changes (Dieterich et al 2011).
During treatment, the image guidance system acquires x-ray radiographs of the treated anatomical region (referred to as 'live images' in the treatment delivery software) and compares them, using dedicated image registration methods, with corresponding pre-calculated DRRs generated from the planning CT series. For intracranial lesions the 6Dskull tracking method is used (Fu and Kuduvalli 2008), which utilizes a rigid two-dimensional (2D) to 3D image registration algorithm based on the spatial distribution of the high contrast bone information depicted in the live images and the corresponding DRRs. For spinal lesions, the Xsight™ spine tracking (XST) method is used. Similar to 6Dskull, the XST method uses the high contrast bone information of the treated vertebra plus the two adjacent vertebrae. In this method, a non-rigid registration algorithm is used to account for local deformations of the vertebrae, e.g. due to the spine flexing differently in the planning CT versus at time of treatment. For soft tissue lesions, a set of fiducial markers (typically gold seeds) is implanted inside or near the treated lesions and the Fiducial tracking method is used (Hatipoglu et al 2007). The calculated deviations of the target/patient coordinates relative to the corresponding values in the treatment planning CT are corrected automatically by adjusting the position and direction of each treatment beam, using the robotic treatment manipulator (Kilby et al 2010). During treatment, these 6D (three translational and three rotational) corrections are calculated continually by repeating the live x-ray image acquisitions at a user-defined frequency, typically every 30-60 s. In this way, intra-fraction target motion is automatically detected and corrected during every treatment. (Note that a separate system, not described here, is used to track the higher frequency motion caused by respiration.)

Total system error
While the accuracy of each component in the treatment process that contributes to the total system accuracy can be tested independently (Suh et al 2007, Wong et al 2007, Fu and Kuduvalli 2008, Fürweger et al 2011), it is more meaningful to measure the TSE by registering the planned dose distribution to that delivered (Thomas et al 2013, Pavoni et al 2017). For this purpose, E2E phantom-based procedures integrating all components of the therapeutic procedure have been proposed and are widely used for estimating the TSE of a radiosurgery system (ICRU 2014). This work reports TSE results for a CK system obtained using RCFs (Chang et al 2003, Yu et al 2004, Muacevic et al 2006, Antypas and Pantelis 2008, Ho et al 2009) and 3D polymer gels (Moutsatsos et al 2010, Liu et al 2016) combined with a 3D printed head phantom. The RCF results have been derived from more than 10 years of measurements, following the E2E test proposed by the manufacturer, which corresponds to isocentric treatment delivery with a single target volume near the virtual isocenter. Polymer gel measurements were performed to enable emulation of more realistic intracranial clinical applications, delivering non-isocentric conformal beams to multiple target volumes distributed within a head phantom. Besides phantom measurements, the TSE was estimated by exploiting the clinical data of a patient treated with functional radiosurgery (Massager et al 2007).

Radiochromic film measurements
The standard anthropomorphic head and neck phantom provided by the vendor was used for RCF-based TSE measurements (Kilby et al 2010, Dieterich et al 2011, Pantelis and Niroomand-Rad 2018). This phantom resembles the x-ray attenuation properties and radiographic appearance of the corresponding human cranial and spinal anatomy, and contains a Ball Cube in which a pair of orthogonal RCFs can be placed (see figure 1). This cube contains a spherical structure with higher x-ray attenuation than the surrounding cube, which can be visualized in CT and provides a target volume of interest for the test. A large Ball Cube situated close to the center of the head phantom, or a smaller one residing in the cervical spine, are provided by the vendor to test 6Dskull and XST tracking methods, respectively. Fiducial tracking verification is facilitated using the larger Ball Cube, in which five fiducial markers are implanted. The phantom was CT scanned employing the imaging protocol used in clinical practice for CK intracranial treatments (1 mm slice thickness, 300 mm field of view, 512 × 512 matrix) and the images were imported into the MultiPlan™ treatment planning system (TPS), where the central radiopaque sphere was delineated. After selecting the tracking method (i.e. 6Dskull, Fiducial, or XST) a pseudo-isocentric conformal treatment plan was prepared such that the 70% isodose line conformed to the spherical ball-target contour (see figure 1). A dedicated TPS tool was used to fine-tune the pseudo-isocenter coordinates so that the center of the planned dose distribution coincided with the center of the delineated sphere. Field sizes of 30 mm, 25 mm and 15 mm (defined by fixed or Iris collimators) were used for the 6Dskull, Fiducial and XST tracking methods, respectively. Initially, MD-55 films (Ashland Advanced Materials, Bridgewater, NJ, USA) were used for these tests, and a dose of 2400 cGy in one fraction was prescribed to the 70% isodose.
From 2009 the films were replaced by EBT2 and then EBT3 (Ashland), and the prescription was reduced to 420 cGy. Following treatment delivery, the irradiated film pieces were scanned in an Epson Expression 1680 Pro transparency scanner at 300 dpi resolution. Exposed film analysis was performed using the software provided by the vendor, including alignment of the film pieces and calculation of the optical densities and corresponding relative dosimetry values, as shown in figure 1. The TSE of the system was assessed by comparing the center of mass (CoM) coordinates of the area encompassed by the 70% isodose line measured on each RCF to the known coordinates of the geometrical center of the radiopaque sphere inside the Ball Cube. Due to the orientation of the films inside each Ball Cube, one measurement of the TSE was performed for each test along the left-right and superior-inferior directions, and two measurements for the anterior-posterior direction, which were compared for consistency and averaged.
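The film analysis described above reduces to two steps: locating the centroid of the area enclosed by the 70% isodose line on each film, and combining the per-axis offsets into a single TSE value. The following is a minimal sketch of those steps and not the vendor software. It assumes that the 70% level is taken relative to the maximum film dose, that the centroid is unweighted, and that the TSE is the length of the three-axis offset vector; the function names and the synthetic Gaussian film are illustrative only.

```python
import numpy as np

PIXEL_MM = 25.4 / 300.0  # pixel pitch of a 300 dpi scan, in mm

def isodose_centroid_mm(dose, level=0.7, pixel_mm=PIXEL_MM):
    """Unweighted centroid (row, col), in mm, of the area at or above
    `level` times the maximum dose (normalization is an assumption)."""
    dose = np.asarray(dose, dtype=float)
    rows, cols = np.nonzero(dose >= level * dose.max())
    return np.array([rows.mean(), cols.mean()]) * pixel_mm

def tse_from_films(lr_mm, si_mm, ap_film_a_mm, ap_film_b_mm):
    """Combine per-axis offsets into one TSE value. The anterior-posterior
    offset is measured on both films and averaged, as in the text."""
    ap_mm = 0.5 * (ap_film_a_mm + ap_film_b_mm)
    return float(np.sqrt(lr_mm**2 + si_mm**2 + ap_mm**2))

# Synthetic film: Gaussian dose blob centered 10 px and -5 px away from
# the known sphere center at pixel (100, 100).
rr, cc = np.mgrid[0:201, 0:201]
film = np.exp(-((rr - 110) ** 2 + (cc - 95) ** 2) / (2 * 30.0**2))
offset_mm = isodose_centroid_mm(film) - np.array([100, 100]) * PIXEL_MM
print(offset_mm)
```

In practice the two in-plane offsets of each film map onto two anatomical axes, which is why the anterior-posterior direction is sampled twice per test.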

Polymer gel measurements
An E2E procedure based on polymer gel 3D dosimetry was also developed to assess the TSE of the CK system. For this test, the services of RTsafe (RTsafe P.C., Athens, Greece) were used for 3D printing of a hollow head phantom with a radiographically bone-mimicking 3D printed material based on a patient CT scan. The phantom was filled with VIPAR normoxic polymer gel formulation covering the total brain volume (see figure 2) (Papoutsaki et al 2013) while the remaining volume (e.g. the nasal and oral cavity) was filled with a non-dosimetric gel. After creating a thermoplastic mask precisely fitting its surface, the phantom was CT scanned following the imaging protocol used for intracranial CK treatments in our clinic (120 kVp, 1 mm slice thickness, 512 × 512 matrix and 300 mm FOV). In the resulting images the mean Hounsfield Unit (HU) of the skull and brain were found to be equal to 1061 and 23, respectively. Using the calibration curve of the employed CT scanner, these HU numbers correspond to mean densities of 1.78 g cm⁻³ and 1.04 g cm⁻³, which differ by less than 2% from the nominal density of the cortical bone and brain, respectively. The CT series was imported into the MultiPlan™ v.4.5 TPS, where seven spherical targets were delineated with a spatial distribution extending from the center to the periphery of the brain in all three dimensions (figure 2). Five targets had a volume of 0.6 cm³ each, while the other two had a larger volume of 3.9 cm³ each. A multi pseudo-isocentric conformal treatment plan was prepared, using the 12.5 mm fixed collimator for the smaller targets and the 20 mm fixed collimator for the two larger targets. Tuning structures were used to maximize spatial dose gradients and conform the dose distribution to each target contour. The plan consisted of 181 beams (129 with the 12.5 mm collimator and 52 with the 20 mm collimator) and 96 nodes.
A dose of 1200 cGy was prescribed to an 80% isodose line encompassing the smaller targets, while a dose of 1000 cGy was prescribed to the two larger targets (see figure 2(b)). The total number of Monitor Units was 15 061.3, with MU per beam ranging from 15 MU to 116 MU. Treatment delivery was performed by aligning the phantom on the treatment couch (see figure 2(c)) and employing the 6Dskull tracking method, with imaging parameters of 115 kV and 10 mAs set on both x-ray tubes. These kV and mAs values are typical for real patient intracranial treatment and were also found to be adequate for tracking the phantom's high-density region mimicking the human skull (see figure 2(d)).
After irradiation, the phantom was left in a cool, dark place for one day to allow for polymerization growth and stabilization. Imaging was performed with a 1.5 T AVANTO (Siemens Healthcare GmbH, Erlangen, Germany) MR scanner, using the head Matrix coil and the pulse sequence commonly employed by RTsafe for 1.5 T Siemens scanners (a 2D, 4-Echo (TE = 58, 563, 1070 and 1580 ms, TR = 2050 ms) T2 Haste MR sequence). A T2 map was calculated for each reconstructed slice, using a mono-exponential fitting routine of TE versus signal intensity on a pixel by pixel basis. The resulting T2 image series (2 mm slice thickness and 1.09 × 1.09 mm² in-plane resolution) was imported into the MultiPlan TPS and fused to the treatment planning CT images. The fused T2 images were then exported in the CT resolution and used to reconstruct a 3D matrix mapping the relaxation rate, R2 (=1/T2), of each imaging voxel. Using the dose response data for the employed gel formulation (Papoutsaki et al 2013), a corresponding 3D dose matrix was calculated, which was afterwards split into seven sub-volume matrices, one for each irradiated target. The CoM coordinates of the radiation-induced gel polymerization corresponding to each target were estimated by the average CoM coordinates of successive isodose surfaces resulting from thresholding the measured dose distribution at increasing threshold values lying between the 50% and 70% relative dose levels with respect to the dose prescribed at each target (Moutsatsos et al 2013). The range of dose threshold levels was selected to maximize the accuracy and precision of the measurements, in view of the increased spatial dose gradient associated with the 50%-70% relative dose interval. A total of five isodose surfaces were evaluated, taking into account the spatial resolution of the MR images used for gel readout. The same procedure was applied to the planned dose distribution exported in RTDOSE format using the CT resolution.
The geometric distance between the CoMs of the planned dose distribution and the corresponding gel polymerization for each target was used to measure the TSE of the system.
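The multi-threshold CoM estimate can be sketched as follows. This is an illustration under stated assumptions, not the analysis code used in the study: the five threshold levels are taken to be evenly spaced between 50% and 70% of the prescribed dose (the text gives only the range and the count), each CoM is computed as the unweighted centroid of the voxels at or above the threshold, and all names and the synthetic dose grids are hypothetical.

```python
import numpy as np

def mean_isodose_com(dose, prescribed, voxel_mm, levels=None):
    """Average, over several thresholds, of the centroid (mm) of the
    voxels at or above each threshold x prescribed dose."""
    if levels is None:
        levels = np.linspace(0.5, 0.7, 5)  # assumed even spacing
    coms = []
    for level in levels:
        idx = np.argwhere(dose >= level * prescribed)  # voxel indices
        coms.append(idx.mean(axis=0) * voxel_mm)
    return np.mean(coms, axis=0)

def gel_tse(planned, measured, prescribed, voxel_mm):
    """Per-axis CoM differences (measured minus planned) and their
    Euclidean norm, taken here as the TSE for one target."""
    delta = (mean_isodose_com(measured, prescribed, voxel_mm)
             - mean_isodose_com(planned, prescribed, voxel_mm))
    return delta, float(np.linalg.norm(delta))

# Synthetic check: identical Gaussian dose blobs, the measured one
# shifted by (+1, 0, -1) voxels on a 1 mm grid.
def blob(center, sigma=5.0, n=41):
    z, y, x = np.mgrid[0:n, 0:n, 0:n]
    r2 = (z - center[0])**2 + (y - center[1])**2 + (x - center[2])**2
    return np.exp(-r2 / (2 * sigma**2))

voxel_mm = np.array([1.0, 1.0, 1.0])
planned = blob((20, 20, 20))
measured = blob((21, 20, 19))
delta, tse = gel_tse(planned, measured, prescribed=0.8, voxel_mm=voxel_mm)
print(delta, tse)
```

Averaging over several thresholds in the steep-gradient region makes the estimate less sensitive to noise in any single isodose surface, which is the rationale given in the text for the 50%-70% interval.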

Patient based measurements
Besides phantom assessments, the TSE of an SRS system should also be evaluated using clinical data when available. This allows the TSE estimation to include system capabilities that are crucial for the clinical outcome but difficult to measure by phantom-based procedures, such as patient movement detection and correction during treatment delivery. In this study, the TSE of the CK system was estimated by exploiting the imaging data of a patient treated in our department for medial thalamotomy (Stancanello et al 2009). Prior to treatment, the nucleus ventralis intermedius (VIM) of the thalamus was identified with the aid of the Talairach and Tournoux (TT) and the Montreal Neurologic Institute (MNI) electronic atlases, which were fused to the patient MR images. A single fraction, using 207 non-coplanar beams collimated using the 5 mm and 7.5 mm fixed collimators and delivering a dose of 12 000 cGy at the 77% isodose level, was prescribed to the functional target. On the six-month follow-up 1.5 T MR images, a radiation-induced lesion at the thalamus was well-visualized and was used to quantify the TSE. For this purpose, the follow-up axial FSE T2 (TE = 91 ms, TR = 3000 ms, 5 mm slice thickness, 0.47 mm in-plane resolution) and coronal FLAIR (TE = 83 ms, TR = 8000 ms, 5 mm slice thickness, 0.94 mm in-plane resolution) MR image series were fused to the treatment planning CT images and the radiation-induced lesion was delineated on both of them. The lesion contours, along with the fused MR image series and the RTDOSE matrix mapping the planned dose distribution, were exported using the DICOM RT protocol.
In each exported MR image stack, the CoM coordinates of the binary 3D object defined by the voxels whose centers lie within the lesion contour were calculated. Similarly, the CoM coordinates of the binary object defined by thresholding the 3D RTDOSE matrix at the prescription dose level were also calculated. The TSE was then assessed by the average geometric discrepancy between the CoM of the radiation-induced lesion and the CoM of the prescription isodose surface (Massager et al 2007).
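The step of turning a delineated contour into a binary object, by keeping the voxels whose centers lie inside it, can be illustrated with a simple even-odd ray-casting test. The sketch below is an assumption about how such a rasterization might be done and does not reproduce the software used in the study; the contour format (one polygon per slice, in mm), the function names and the square test contour are all hypothetical.

```python
import numpy as np

def inside_polygon(points, polygon):
    """Even-odd ray casting: True for each (x, y) point inside the polygon."""
    x, y = points[:, 0], points[:, 1]
    inside = np.zeros(len(points), dtype=bool)
    n = len(polygon)
    for i in range(n):
        (x1, y1), (x2, y2) = polygon[i], polygon[(i + 1) % n]
        crosses = (y1 > y) != (y2 > y)  # edge straddles the point's y level
        with np.errstate(divide="ignore", invalid="ignore"):
            x_edge = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
        inside ^= crosses & (x < x_edge)
    return inside

def lesion_com(contours, pixel_mm, shape):
    """CoM (x, y, z) in mm of the voxel centers enclosed by per-slice
    contours, given as {z_mm: (N, 2) polygon vertices in mm}."""
    ny, nx = shape
    xs, ys = np.meshgrid((np.arange(nx) + 0.5) * pixel_mm,
                         (np.arange(ny) + 0.5) * pixel_mm)
    centers = np.column_stack([xs.ravel(), ys.ravel()])
    pts = []
    for z_mm, polygon in contours.items():
        sel = centers[inside_polygon(centers, np.asarray(polygon, float))]
        pts.append(np.column_stack([sel, np.full(len(sel), z_mm)]))
    return np.vstack(pts).mean(axis=0)

def clinical_tse(lesion_coms, dose_com):
    """Mean distance between each image series' lesion CoM and the CoM
    of the prescription isodose volume."""
    return float(np.mean([np.linalg.norm(np.asarray(c) - dose_com)
                          for c in lesion_coms]))

# Hypothetical single-slice square contour, 10 mm on a side, at z = 5 mm.
square = [[0.0, 0.0], [10.0, 0.0], [10.0, 10.0], [0.0, 10.0]]
com = lesion_com({5.0: square}, pixel_mm=0.5, shape=(40, 40))
print(com)
```

With 5 mm slices, as in the follow-up MR series, the through-plane CoM coordinate is sampled far more coarsely than the in-plane ones, so the slice direction is likely to dominate the uncertainty of such an estimate.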

Long term measurements of total system error
According to the vendor and user community suggestions, RCF-based E2E TSE tests for the CK system should be performed monthly for an intracranial and an extracranial tracking method (Dieterich et al 2011). The CK in our clinic was installed in March 2006 and it was the first CK G4 system version installed in Europe. A total of 204 RCF-based E2E tests have been performed since installation, using the same robotic manipulator. Several hardware and software upgrades have been made to the system during this period of time. Hardware changes have included replacements of the x-ray tubes of the image guidance system, an upgrade of the linac from a 600 MU min⁻¹ dose rate model to an 800 MU min⁻¹ version, both with and without additional head shielding (Chuang et al 2008), and installation of the Iris variable aperture collimator (Echner et al 2009). Software changes involved upgrades of the treatment delivery and tracking algorithms to all versions, starting with v.6 and up to v.9.6. While all tested versions involved the same 6Dskull tracking algorithm (Fu and Kuduvalli 2008), Fiducial tracking changed to a Hidden Markov Model based tracking algorithm (Kuduvalli 2006, Hatipoglu et al 2007). Several upgrades to the Spine tracking algorithm were also included in the aforementioned versions (Fürweger et al 2011). The MD55 RCFs initially used to perform E2E tests, using a prescription dose of 2400 cGy, were replaced by EBT films with an improved Ball Cube and a prescription dose of 420 cGy (Ho et al 2009), in an effort to reduce irradiation time and to improve the consistency of film to Ball Cube alignment. After every hardware replacement and software upgrade, the system was recalibrated. Therefore, the E2E TSE results were separated into nine groups containing the same system calibration files, and results were compared with regard to tracking method, software version and collimator type.
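The grouping of E2E results by calibration period and tracking method amounts to a simple grouped summary. The sketch below shows one way to organize such a log; the record layout, the field names and the TSE values are placeholders invented for illustration and are not data from this study.

```python
from collections import defaultdict
from statistics import mean, stdev

def summarize_by_group(records):
    """Group TSE values by (calibration period, tracking method) and
    return {group: (mean, sample s.d., count)}."""
    groups = defaultdict(list)
    for rec in records:
        groups[(rec["calibration"], rec["tracking"])].append(rec["tse_mm"])
    return {
        key: (mean(vals), stdev(vals) if len(vals) > 1 else 0.0, len(vals))
        for key, vals in groups.items()
    }

# Placeholder records (hypothetical values, not measurements).
log = [
    {"calibration": 1, "tracking": "6Dskull", "tse_mm": 0.3},
    {"calibration": 1, "tracking": "6Dskull", "tse_mm": 0.5},
    {"calibration": 1, "tracking": "XST", "tse_mm": 0.6},
    {"calibration": 2, "tracking": "6Dskull", "tse_mm": 0.4},
]
summary = summarize_by_group(log)
for key, (m, s, n) in sorted(summary.items()):
    print(key, f"{m:.2f} ± {s:.2f} mm (n={n})")
```

Keeping the calibration period in the grouping key reflects the reasoning in the text: each recalibration can change the systematic offsets, so results are only directly comparable within a period of unchanged calibration files.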

CyberKnife targeting error
3.1.1. Radiochromic film results
RCF-TSE results for the CK system installed in our clinic are given in figures 3(a), (b) and (d) in the form of probability distribution plots for the 6Dskull, Fiducial and XST tracking methods, respectively. A general inspection of the plots shows that TSE is consistently better than 1 mm, regardless of the tracking method. The TSE 6Dskull and TSE Fiducial probability distributions exhibit an asymmetric shape with (mean ± s.d.) TSE values of (0.40 ± 0.18) mm and (0.40 ± 0.19) mm for the 6Dskull and Fiducial tracking methods, respectively, while that for TSE XST approximates a symmetric normal distribution with a mean ± s.d. of (0.55 ± 0.20) mm.
In figure 3(c), the measured TSE results for the XST and Fiducial tracking methods are compared to corresponding 6Dskull values. Each point on this plot corresponds to TSE measurements for 6Dskull and one other tracking method acquired over a period of less than two months, during which both the CK system configuration and the calibration files were unchanged. Assuming an ideal condition in which the three image guidance methods exhibit the same registration accuracy, and in the absence of any systematic uncertainties, the plotted TSE results should lie along the y = x line also drawn on figure 3(c). While the plotted TSE values scatter around the ideal y = x line due to experimental uncertainties, a slight trend is observed of smaller TSE Fiducial values and greater TSE XST values, compared to 6Dskull. This trend was further quantified by fitting a linear trend line of the form y = a · x to each data set. Slopes (a) of (0.85 ± 0.18) and (1.18 ± 0.23) were found for the Fiducial and XST versus 6Dskull TSE results, respectively. Both slopes, however, agree within one sigma with the ideal value of unity.
Figure 4 presents a screenshot from the MultiPlan TPS showing the T2 images of the irradiated gel phantom, fused to the corresponding planning CT images, upon which the seven target contours are superimposed. The observed close agreement between the delineated targets and the corresponding radiation-induced polymerization distribution offers a qualitative assessment of the CK TSE using the 6Dskull tracking method. To quantify these TSE results, the Euclidean distance between the CoMs of the prescribed isodose surface and the radiation-induced polymerization, along with the corresponding differences along the x, y, and z axes (i.e. in the left-right, posterior-anterior and superior-inferior directions, respectively) were calculated and are presented in table 1 for each target.
The distance between the planned location of each target and the imaging center, CTcenter (i.e. the point within the CT image volume that will be registered with the virtual isocenter of the system, which is close to the center of the brain), is also tabulated to aid discussion of the results. According to table 1, the planned dose versus the delivered dose CoM differences range from 0.13 mm to 0.38 mm for the x axis, −0.55 mm to 0.16 mm for the y axis, and −0.43 mm to 0.34 mm for the z axis. These coordinate differences correspond to TSE results ranging from 0.26 mm to 0.66 mm, with the greatest value observed for the target planned at the left temporal lobe at ~50 mm distance from the CTcenter.
Figure 5 presents the central axial, sagittal, and coronal slices of the CT-fused MR image series acquired for the six-month follow-up of the patient treated for neuropathic pain. As shown, the prescription isodose line is in close agreement with the contour of the radiation-induced lesion delineated in each MR image series in all three spatial planes. Following the methodology described in section 2.2.3, the system's TSE was estimated to be 0.69 mm and 1.04 mm based on the FSE T2 and FLAIR MR image data respectively, resulting in an average clinical TSE value of 0.87 ± 0.25 mm.
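Two of the summary statistics reported in this section are straightforward to reproduce: the through-origin trend-line fit of the form y = a · x used for figure 3(c), and the mean ± s.d. of the two clinical TSE estimates. The sketch below is illustrative only. It assumes an ordinary least-squares fit with the slope uncertainty taken as its standard error and the s.d. taken as the sample standard deviation; the text does not state which estimators were actually used, and the function names are hypothetical.

```python
import numpy as np

def fit_through_origin(x, y):
    """Least-squares slope of y = a * x and its standard error."""
    x, y = np.asarray(x, float), np.asarray(y, float)
    sxx = np.sum(x * x)
    a = np.sum(x * y) / sxx
    resid = y - a * x
    # One fitted parameter, hence n - 1 degrees of freedom.
    se = np.sqrt(np.sum(resid**2) / (len(x) - 1) / sxx)
    return float(a), float(se)

def within_one_sigma_of_unity(a, se):
    """True if the slope is consistent with the ideal y = x line."""
    return abs(a - 1.0) <= se

# Slopes reported in the text for Fiducial and XST versus 6Dskull.
print(within_one_sigma_of_unity(0.85, 0.18))
print(within_one_sigma_of_unity(1.18, 0.23))

# Clinical TSE estimates from the FSE T2 and FLAIR series (mm).
clinical = np.array([0.69, 1.04])
clinical_mean = clinical.mean()
clinical_sd = clinical.std(ddof=1)  # sample standard deviation
print(clinical_mean, clinical_sd)
```

With only two values, the sample standard deviation equals the absolute difference divided by √2, which gives approximately 0.25 mm, consistent with the reported clinical value.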

Long-term stability of the targeting system error
Long-term TSE results derived from RCF measurements for the 6Dskull, Fiducial and XST tracking methods, over more than a decade, are presented in figure 6 in the form of boxplots. Each boxplot corresponds to a time interval of unchanged system configuration and calibration files. A general inspection of the figure 6 data shows that TSE is less than 1 mm for all tracking methods over the entire period, while XST is associated with relatively higher TSE values than the other two tracking modes. It is noted that the TSE results refer to measurements using both the fixed and Iris collimation systems, except for those obtained prior to September 2009, when only the fixed collimators were available. In September 2009 the system was updated to version 9 and was equipped with the Iris collimator, while the linac was upgraded to the 800 MU min⁻¹ model. At this same time the Ball Cube 2 was purchased. Relative to Ball Cube 1, this device provides more accurate and reproducible film registration to the ball cube, resulting in improved reproducibility of TSE 6Dskull and TSE Fiducial measurements using EBT-2 or -3 films (Ho et al 2009). All TSE XST measurements refer to the same mini Ball Cube, which does not have the advantages of Ball Cube 2. The mean TSE along with its spatial components were calculated for each tracking method, CK system version, and radiochromic film type. Results are presented in table 2 and indicate that there is no significant dependence of the mean measured TSE values on the tracking method, film type, or CK software version. Additionally, it is worth noting that a statistical comparison of TSE results obtained independently with the fixed and Iris collimators did not reveal any systematic differences.

Discussion
Table 1. Euclidean distances from CT center (virtual isocenter) to the centers of mass of the seven spherical targets and corresponding relative spatial differences of the centers of mass of the radiation-induced polymerized and planned dose distribution volumes along the three main axes and in radial distance.
The total system error of a CK system was estimated using E2E tests integrating all components of the therapeutic procedure, from planning image acquisition to treatment delivery. The vast majority of the E2E tests were performed following the standard quality assurance procedures performed in our clinic, which are based on the head and neck phantom provided by the vendor, including Ball Cubes to hold two orthogonal RCFs (Antypas and Pantelis 2008, Dieterich et al 2011). Statistical analysis of the E2E tests showed mean ± s.d. TSE values of 0.40 ± 0.18 mm, 0.40 ± 0.19 mm and 0.55 ± 0.20 mm for the 6Dskull, Fiducial, and XST tracking methods, respectively. These lie well within the range of corresponding values published in the literature for the CK system, summarized by Kilby et al (2010). Furthermore, analysis of the TSE results obtained using RCFs for the different tracking methods revealed a slight trend of smaller TSE Fiducial and greater TSE XST mean measured values compared to 6Dskull; however, this trend was not statistically significant. Fu and Kuduvalli (2008) evaluated the accuracy of the 6Dskull algorithm by moving an anthropomorphic head and neck phantom to predefined positions inside the imaging field of view with the aid of the robotic manipulator. The fiducial-based registration served as a gold standard in their assessments. The authors reported that 'the registration errors were very small in the fiducial-method and a little larger in the intensity-based method', which is in agreement with the results of this work.
It is worth pointing out, however, that in clinical practice TSE Fiducial depends on the number of fiducials used (Murphy 2002) and is prone to sources of systematic uncertainty such as fiducial migration within the patient body (Yu et al 2014). In contrast, the 6Dskull tracking method is characterized by its robustness and stability. The relatively larger values observed for TSE XST compared to both TSE 6Dskull and TSE Fiducial could also be attributed to the lower registration accuracy of the XST tracking method compared to the 6Dskull and Fiducial methods. Using the robotic manipulator to move the standard head and neck phantom to predefined positions, Fürweger et al (2011) reported a mean deviation of 0.2 mm from the nominal translational offset and a maximum root mean square (RMS) error of 0.4 mm for the XST registration algorithm. By design (i.e. due to the use of a rigid phantom), however, these figures neglect the registration error component arising from deformations. Using a more realistic approach, Ho et al (2007) reported a registration error of (0.61 ± 0.27) mm for the XST algorithm by (meta-) analyzing the registration data of 11 patients who underwent CK treatment for spinal lesions using implanted fiducial markers.
The RCF-based TSE measurements correspond to isocentric treatment delivery geometry, with the targets situated relatively close to the center of the stereotactic space defined in the planning CT series. Also, the plans used for RCF measurements are prepared using the 30 mm, 25 mm and 15 mm field sizes for the 6Dskull, Fiducial and XST tracking methods, respectively. In most clinical applications, however, non-isocentric treatment delivery techniques are used, with targets at distances of up to several cm from the virtual isocenter. In such cases, radiation beams are delivered along arbitrary directions defined during treatment planning, based on the node positions of the employed path and heuristically determined points on the surface of the target (Kilby et al 2010). Furthermore, in cases where steep dose gradients are necessary, smaller collimator sizes are used than those tested by the RCF-based E2E procedures. In view of the above, E2E tests closer to clinical practice, incorporating image guidance, are suggested (Dieterich et al 2011).
To address these issues, an E2E procedure based on polymer gels was used to evaluate the TSE of the CK system installed in our department. Using a 3D printed head phantom, rendering real patient CT data, TSE 6Dskull measurements were performed by filling the phantom with the VIPAR polymer gel formulation and exploiting the gel's ability to record 3D dose distributions according to the methodology described in section 2.2.2. For an irradiation plan mimicking a clinical application of seven head metastases extending from the center to the periphery of the brain, the polymer gel based TSE 6Dskull results are presented in table 1 and, qualitatively, in figure 4. The polymer gel results confirmed the TSE 6Dskull obtained with EBT films and Ball Cube 2 for lesions lying close to the isocenter, while also showing that sub-millimeter delivery accuracy is maintained for target locations situated up to 80 mm away from the isocenter. Antypas and Pantelis (2008) have previously attempted to assess any correlation of TSE 6Dskull with measurement location within the stereotactic space. Moving the standard head and neck phantom, the authors achieved TSE 6Dskull measurements at distances of only up to 8 mm from the isocenter, since the CK system disables treatment delivery by default for detected translations greater than 10 mm. At these distances, Antypas and Pantelis reported targeting accuracy results in agreement with those obtained by the standard RCF TSE 6Dskull measurements referring to the isocenter. This study provides the first TSE results which extend this target-isocenter offset beyond 10 mm.
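The quantities reported per target in table 1 (offsets along the three main axes and the radial distance between the centers of mass of the planned and polymerized volumes) can be illustrated with a minimal sketch. The function names, the voxel weighting and the coordinates used here are hypothetical; the sketch assumes that both volumes are already expressed in the same coordinate system, in mm, and it is not the analysis code used in this work.

```python
import math


def center_of_mass(points, weights):
    """Weighted centroid of 3D points given as (x, y, z) tuples in mm."""
    total = sum(weights)
    return tuple(
        sum(w * p[i] for p, w in zip(points, weights)) / total
        for i in range(3)
    )


def targeting_error(planned_com, measured_com):
    """Per-axis offsets (measured minus planned) and radial distance, in mm."""
    dx, dy, dz = (m - p for p, m in zip(planned_com, measured_com))
    return dx, dy, dz, math.sqrt(dx ** 2 + dy ** 2 + dz ** 2)


# Hypothetical centers of mass in mm (illustrative values only).
planned = (10.0, -20.0, 30.0)
measured = (10.3, -20.4, 30.0)

dx, dy, dz, radial = targeting_error(planned, measured)
print(f"dx={dx:+.2f} dy={dy:+.2f} dz={dz:+.2f} radial={radial:.2f} mm")
```

The distance of each target from the virtual isocenter follows from the same function by passing the isocenter coordinates as `planned_com`.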
In addition to providing TSE measurements at multiple and distant points within the stereotactic space, the polymer gel based E2E procedure also accounts, in contrast to film measurements, for the uncertainty associated with the spatial registration of MR to CT images, commonly performed in stereotactic applications. It should be noted, however, that polymer gel TSE results incorporate an uncertainty component stemming from the spatial distortions inherent to the MR images used for the gel readout (Moutsatsos et al 2010, Weygand et al 2016). MR distortions due to gradient non-linearities are sequence-independent (Baldwin et al 2009) and were minimized by enabling the distortion correction algorithm provided by Siemens prior to image acquisition. Using a grid-based phantom (Pappas et al 2017), residual sequence-independent distortions of up to 0.15 mm were measured for the specific MR scanner employed in this work. Sequence-dependent spatial distortions, however, stemming mainly from static field inhomogeneities, chemical shift and susceptibility effects, affect the acquired MR images in both the frequency encoding (AP) and, to a lesser degree, slice reconstruction (IS) directions (Watanabe et al 2002, Baldwin et al 2009). Applying the reversed read gradient technique (Chang and Fitzpatrick 1992), sequence-dependent spatial distortions of up to 0.6 mm were measured within a sphere of 80 mm radius centered at the scanner's isocenter, for a 3D MR sequence commonly used in clinical head and neck applications. Given that the RF receiver bandwidth of 185.28 kHz (i.e. 579 Hz/pixel × 320 pixels in the frequency encoding direction) used in the gel phantom MR image acquisition is much greater than that of 53.76 kHz (i.e. 210 Hz/pixel × 256 pixels in the frequency encoding direction) used in the clinical sequence employed for our distortion measurements, sequence-dependent MRI distortions are expected to affect polymer gel TSE results by no more than 0.2 mm.
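The bandwidth arithmetic in the preceding paragraph can be checked with a short sketch. The scaling step rests on an assumption not stated explicitly in the text: that, for the same field inhomogeneity and the same pixel size along the frequency encoding direction, the sequence-dependent shift is inversely proportional to the bandwidth per pixel. Under that assumption, the 0.6 mm measured at 210 Hz/pixel scales to roughly 0.2 mm at 579 Hz/pixel. The function names are hypothetical.

```python
def receiver_bandwidth_khz(hz_per_pixel, n_pixels):
    """Total receiver bandwidth in kHz from bandwidth per pixel and matrix size."""
    return hz_per_pixel * n_pixels / 1000.0


def scaled_distortion_mm(measured_mm, measured_hz_per_pixel, target_hz_per_pixel):
    """Rescale a sequence-dependent distortion to another bandwidth per pixel.

    Assumption: identical field inhomogeneity and identical pixel size in
    the frequency encoding direction, so the shift in mm scales as
    1 / (bandwidth per pixel).
    """
    return measured_mm * measured_hz_per_pixel / target_hz_per_pixel


gel_bw = receiver_bandwidth_khz(579, 320)       # gel readout sequence
clinical_bw = receiver_bandwidth_khz(210, 256)  # clinical sequence
estimate = scaled_distortion_mm(0.6, 210, 579)

print(f"gel: {gel_bw:.2f} kHz, clinical: {clinical_bw:.2f} kHz")
print(f"estimated distortion in gel images: {estimate:.2f} mm")
```

If the pixel sizes of the two sequences differ, the estimate would additionally scale with the ratio of the pixel sizes.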
Phantom-based estimation of total system error has the disadvantage of being unable to include the capability of the system to correct for patient movements during treatment delivery. This was addressed in this work by studying the imaging data of a functional radiosurgery patient treated for neuropathic pain. Comparison of the position of the radiation-induced lesion observed in the six-month follow-up FSE T2 and FLAIR MR images with the prescription isodose surface showed an average TSE of 0.87 ± 0.25 mm. This value is comparable to the phantom-based measurements, considering the increased slice thickness of the diagnostic MR images used for TSE estimation. The geometric distortions of the MR images are expected to be small since the lesion is close to the scanner isocenter and corrections were applied for gradient nonlinearities. It should be pointed out that the CK system uses a temporal scheme based on the elapsed time since the previous image acquisition to acquire each new x-ray image pair and refresh the 6D beam alignment corrections. Statistical analysis of patient treatment log files has revealed that an image age of ~1.5 min ensures a targeting accuracy of better than 1 mm for central nervous system clinical applications (Hoogeman et al 2008, Murphy 2009, Fürweger et al 2010). In this work the aforementioned temporal scheme was used to irradiate the patient over an extended treatment time of ~2 h, due to the use of a large number of very small beams (207 beams, using the 5 mm and 7.5 mm fixed collimators) and the very high dose prescription (12 000 cGy in a single fraction).
The long-term TSE results derived from RCF measurements for the 6Dskull, Fiducial and XST tracking methods presented in figure 6 indicate that TSE is less than 1 mm for all tracking methods, regardless of the system software version and linac/collimation devices, over a period of more than 10 years corresponding to more than 8000 patient treatments and almost 24 000 delivered fractions. This notable finding cannot be taken for granted a priori in head and neck CK applications. Rather, it is the result of a comprehensive and scrupulous quality assurance program followed in our department, including E2E targeting accuracy procedures and calibration adjustments according to the TG-135 suggestions. Finally, it should be noted that larger TSE values (>1 mm) could be associated with extracranial treatment sites. Their evaluation and long-term analysis require E2E procedures analogous to those described herein, but using special phantom design and detector settings. Although of increased importance, such an evaluation lies beyond the scope of this work.

Conclusion
The total system error of a CK system was estimated using E2E tests integrating all components of the therapeutic procedure, from planning image acquisition to treatment delivery. The results of the standard RCF-based E2E tests performed for more than a decade during the quality assurance of the system were analyzed with regard to tracking method, years since installation, system software and hardware versions, and radiochromic film type. A polymer gel based E2E procedure was also employed, using a 3D printed head phantom irradiated with a multi-metastases treatment plan covering the whole head, and MRI for data readout. Besides phantom-based measurements, the targeting accuracy of the system was estimated by analyzing the treatment and follow-up data of a patient treated for a thalamic functional lesion. Statistical analysis of the RCF-based E2E tests showed TSE values (mean ± s.d.) of 0.40 ± 0.18 mm, 0.40 ± 0.19 mm and 0.55 ± 0.20 mm for the 6Dskull, Fiducial and XST tracking methods, respectively. These lie within corresponding ranges published in the literature for the CK system (Kilby et al 2010). Polymer gel TSE 6Dskull values ranging from 0.26 ± 0.08 mm to 0.66 ± 0.07 mm were found, showing that the sub-millimeter targeting accuracy of the CK is sustained even for target locations situated up to 80 mm away from the virtual isocenter. An average clinical TSE value of 0.87 ± 0.25 mm was measured using the FSE T2 and FLAIR post-treatment MR image data of the functional radiosurgery patient. No degradation of the manipulator reproducibility or system targeting accuracy was revealed from the analysis of the long-term RCF-based E2E tests carried out over a period of more than 10 years.