End-to-end test of an online adaptive treatment procedure in MR-guided radiotherapy using a phantom with anthropomorphic structures

Online adaptive treatment procedures in magnetic resonance (MR)-guided radiotherapy (MRgRT) allow compensating for inter-fractional anatomical variations in the patient. Clinical implementation of these procedures, however, requires specific end-to-end tests to validate the treatment chain including imaging, treatment planning, positioning, treatment plan adaption and accurate dose delivery. For this purpose, a new phantom with reproducibly adjustable anthropomorphic structures has been developed. These structures can be filled either with contrast materials providing anthropomorphic image contrast in MR and CT or with polymer dosimetry gel (PG) allowing for 3D dose measurements. To test an adaptive workflow at a 0.35 T MR-Linac, the phantom was employed in two settings simulating inter-fractional anatomical variations within the patient. The settings included two PG-filled structures representing a tumour and an adjacent organ at risk (OAR) as well as five additional structures. After generating a treatment plan, three irradiation experiments were performed: (i) delivering the treatment plan to the phantom in reference setting, (ii) delivering the treatment plan after changing the phantom to a displaced setting without adaption, and (iii) adapting the treatment plan online to the new setting and delivering it to the phantom. PG measurements revealed a homogeneous tumour coverage and OAR sparing for experiment (i) and a significant under-dosage in the PTV (down to 45% of the prescribed dose) and over-dosage in the OAR (up to 180% relative to the planned dose) in experiment (ii). In experiment (iii), a uniform dose in the PTV and a significantly reduced dose in the OAR was obtained, well-comparable to that of experiment (i) where no adaption of the treatment plan was necessary. PG measurements were well comparable with the corresponding treatment plan in all irradiation experiments. 
The developed phantom can be used to perform end-to-end tests of online adaptive treatment procedures at MR-Linac devices before introducing them to patients.

therefore one of the most prominent uncertainties in RT. Conventionally, this is accounted for by restricting the anatomical changes by means of immobilization aids (Verhey et al 1982) and by adding safety margins around the tumour volume (van Herk 2004), which, however, increases the irradiated normal tissue volume.
The development of online adaptive radiotherapy procedures using image guidance has the potential to correct for anatomical changes over the treatment course (Dawson and Sharpe 2006, Martinez et al 2001, Kontaxis et al 2015, Green et al 2019) and can be used to reduce margin sizes, potentially leading to fewer side effects in normal tissue (Kron 2008) as well as the safe application of dose escalation to the tumour (Yan et al 1997). Mostly, image guidance is performed by x-ray imaging using on-board kilovoltage cone beam computed tomography (kV-CBCT) (Jaffray et al 2002). However, kV-CBCT provides only poor soft-tissue contrast and thus poor tumour visibility (Njeh 2008), and its applicability for identifying daily changes of the tumour and OAR is limited. To improve the soft-tissue contrast and to reduce the patient's exposure to ionizing radiation (Chang et al 1987), new hybrid devices have recently been introduced that combine a conventional linear accelerator (Linac) with magnetic resonance (MR) imaging for MR-guided radiotherapy (MRgRT) (Lagendijk et al 2008, Fallone et al 2009, Keall et al 2014, Paganelli et al 2018, Klüter 2019). With these MR-Linac machines, it is now possible to identify anatomical changes of soft tissue structures with much higher precision, and the treatment plan may be adapted to the new anatomical situation based on images acquired online (i.e. while the patient is on the treatment couch).
Due to the complexity of the adaption, end-to-end tests are needed to validate the entire chain of treatment planning, positioning, imaging and image registration, plan adaption and irradiation. Such workflow-specific end-to-end tests evaluate the accumulation of uncertainties throughout the treatment procedure, which may not be detected by component-by-component testing alone (Zakjevskii et al 2016). End-to-end tests are already well established in conventional radiotherapy using phantoms such as the StereoPHAN™ (Sun Nuclear Corp, Melbourne, FL, USA) or the Lucy 3D® QA Phantom (Standard Imaging Inc, Middleton, WI, USA) in stereotactic radiosurgery (Sarkar et al 2016). However, typical phantom inserts are static and visible only in CT, not in MRI. For MR-Linac systems, several MRI-compatible phantoms are already available to perform end-to-end testing in the presence of intra-fractional motion such as breathing, e.g. the QUASAR™ MRI 4D phantom (Modus Medical Devices Inc, London, ON, Canada). To our knowledge, however, no commercially available phantom is yet capable of simulating inter-fractional anatomical changes in a realistic and reproducible manner while being visible in both MRI and CT. While tests of image registration algorithms can be realized with a deformable phantom, the validation of dose delivery requires the use of dosimeters in 1D to 3D. A promising method for 3D dose measurements is the use of polymer gels (PG) (Schreiner 2006, Baldock et al 2010). PGs are based on radiation-sensitive chemicals, which polymerize as a function of the absorbed dose when irradiated. The resulting change in mass density and relaxation rate can be evaluated using either x-ray computed tomography (CT) (Hilts et al 2000) or MR imaging (MRI) (Venning et al 2005). PGs offer a high spatial resolution, enabling measurements of steep dose gradients as they occur, e.g., in intensity-modulated radiation therapy (IMRT) (Sandilos et al 2004, Vergote et al 2004). Moreover, their radiation response is only minimally influenced by magnetic fields (Lee et al 2017), and their radiation absorption properties are equivalent to those of soft tissue (Baldock et al 2010, Schreiner 2015).
In addition, structures mimicking various anthropomorphic imaging contrasts in CT and MRI are required to provide realistic conditions for image registration algorithms and treatment planning. In this study, we developed a new phantom with reproducibly adjustable anthropomorphic structures that can be filled either with PG or with anthropomorphic imaging contrast materials. This phantom was used to perform an end-to-end test of an online adaptive treatment procedure at a 0.35 T MR-Linac (Dempsey 2014, Klüter 2019).

Phantom design
For use in end-to-end tests of online adaptive treatment workflows in MRgRT, a phantom was designed according to the following requirements: (i) the phantom contains adjustable irregular geometric structures that can be reproducibly shifted and rotated. (ii) These structures provide anthropomorphic imaging contrasts in CT as well as in MRI. (iii) 1D, 2D or 3D detectors for dose measurements can be inserted into the structures. While (i) and (ii) are necessary to test the performance of image registration algorithms used for treatment plan adaption, (iii) enables verification of beam guidance and dose delivery.
The 'Anthropomorphic QUality AssuRance phantom to study Interfractional Uncertainties in MRgRT (AQUARIUM)' consists of a polymethyl-methacrylate (PMMA) cylinder (diameter: 25 cm, height: 25 cm, wall thickness: 0.5 cm), which was filled with milliporous water enriched with 3.6 g l⁻¹ sodium chloride (NaCl) and 1.25 g l⁻¹ copper sulphate (CuSO₄) to increase the conductivity (American Association of Physicists in Medicine 2010) and to shorten the T₁ relaxation time. Up to eleven reproducibly shiftable and rotatable hollow PMMA rods fixed to various structures can be inserted into the phantom (figure 1). In addition, thimble ionization chambers can be led through the rods into the structures to perform absolute dose measurements.

Phantom structures
In this study, seven structures were designed and filled either with different anthropomorphic image contrast materials or with PG. Five structures were 3D printed with the Objet500 Connex 3 3D printer (Stratasys, Eden Prairie, USA) using the VeroClear™ printing material, which has been shown to be compatible with PG. For the other two, PG-compatible Barex™ (VELOX GmbH, Hamburg, Germany) containers were used. The following structures were fabricated: (i) CT contrast. To simulate bone, 1250 g l⁻¹ dipotassium phosphate (K₂HPO₄) and 1.6 g l⁻¹ CuSO₄ were dissolved in water (Niebuhr et al 2016) and filled into a 3D printed element (figures 2(a), (f) and (k)). Three layers of gypsum bandages (Cellona, REF 20 110, Lohmann & Rauscher International, Rengsdorf, Germany) were additionally attached to the structure to ensure a high attenuation and were impregnated with a clear lacquer (Lackspray Spezial SaBesto, Würth, Künzelsau, Germany) to protect the gypsum from the surrounding water. An identical 3D printed element was left air-filled to simulate low CT contrast (figures 2(b), (g) and (l)). (ii) MR contrast. Three different MRI contrasts were produced using Ni-DTPA-doped agarose gel (Tofts et al 1993) prepared from a 50 mM Ni-DTPA solution. The gels were filled into two 3D printed spheres with diameters of 20 mm (44.2% Ni-DTPA solution, 1.6% agarose, 54.2% water, resulting in relaxation rates similar to fat) and 25 mm (15.6% Ni-DTPA solution, 3.3% agarose, 81.1% water, resulting in relaxation rates similar to muscle) (figures 2(c), (h) and (m)), and into a Barex™ vial (12.2% Ni-DTPA solution, 1.3% agarose, 86.5% water, resulting in relaxation rates similar to prostate tissue). (iii) PG container. Two PG containers providing water-equivalent contrast were prepared. A 3D printed, irregularly shaped container served as the tumour (figures 2(d), (i) and (n)). Additionally, a PG-filled Barex™ container was used to simulate an organ at risk (OAR) (figures 2(e), (j) and (o)).

Polymer gel dosimetry
For 3D dosimetry, the PAGAT (PolyAcrylamide Gelatin gel fabricated at ATmospheric conditions) PG was used (Venning et al 2005). When irradiated, the gel polymerizes as a function of the absorbed dose, which locally alters the relaxation rate R₂ of the transverse magnetization in MRI (Baldock et al 2010). The PAGAT gel was selected because it shows a small dose rate dependence (De Deene et al 2006) and can be produced in-house at low cost under atmospheric conditions. For conversion of R₂ values to dose, a calibration was performed using

Fabrication
The PG is based on two monomers (2.5% w/w acrylamide and 2.5% w/w N,N′-methylene-bis-acrylamide), which are added as active components to a gelatin matrix (6% w/w gelatin, 300 bloom, Sigma-Aldrich). Due to the high reactivity of the PG with oxygen, the gel was flushed with nitrogen for 1 min to reduce the amount of dissolved oxygen (De Deene et al 2002), and 5 mM bis[tetrakis(hydroxymethyl)phosphonium] chloride (THPC) was added as an antioxidant. To protect the gel from light-induced polymerization (Koeva et al 2009), the gel containers were wrapped in aluminum foil. Afterwards, they were placed in a desiccator, which was flushed with nitrogen for 10 min, and stored in a refrigerator at 4 °C for 20-24 h. The gel containers were then removed 4 h prior to irradiation to adapt to room temperature.

MRI evaluation
48 h after irradiation, the gel containers were evaluated on a diagnostic 3 T Magnetom Prisma fit (Siemens Healthineers, Erlangen, Germany). To keep the temperature constant within ±0.1 °C during the MRI measurement, the containers were placed in a dedicated water-flow phantom. For quantitative R₂ measurement, the phantom was scanned within a 64-channel head/neck coil using a multi spin-echo sequence with 32 equidistant echoes, echo times of TE = 22.5-720.0 ms and an echo spacing of 22.5 ms. The scans were performed with a repetition time of TR = 10 000 ms to avoid influences of T₁ relaxation, a resolution of 1.0 × 1.0 × 1.0 mm³, and a bandwidth of BW = 130 Hz/pixel. Furthermore, an additional high-resolution (0.5 × 0.5 × 0.5 mm³) 3D image was acquired for registration purposes to compare the measured 3D PG dose distribution with the planned dose (see section 2.2.3). This was performed with a standard true fast imaging with steady-state precession (TrueFISP) sequence (Scheffler and Hennig 2003, Chavhan et al 2008) as implemented by the MRI vendor using the parameters TR = 11.68 ms, TE = 5.84 ms, number of averages = 2, and a flip angle of 70°. For this scan, the water flow in the phantom was turned off to avoid flow artifacts.
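The multi spin-echo train described above samples the transverse decay at 32 echo times, from which R₂ can be estimated per voxel. The following is a minimal sketch, assuming a mono-exponential signal model S(TE) = S₀·exp(−R₂·TE) and a log-linear least-squares fit. The fitting method of the evaluation tool used in this study is not specified here, so the function name `fit_r2`, the synthetic data and the choice of fit are illustrative assumptions; noise-floor effects are ignored.

```python
import numpy as np

def fit_r2(signal, echo_times_ms):
    """Log-linear least-squares fit of S(TE) = S0 * exp(-R2 * TE) per voxel.

    signal: array of shape (n_echoes, ...) with strictly positive magnitudes.
    Returns R2 in 1/s with the spatial shape of the input.
    (Hypothetical helper, not the tool used in the study.)
    """
    te_s = np.asarray(echo_times_ms, dtype=float) / 1000.0
    log_s = np.log(np.asarray(signal, dtype=float))
    flat = log_s.reshape(len(te_s), -1)
    # Linear model: ln S = ln S0 - R2 * TE, i.e. columns [1, -TE]
    design = np.stack([np.ones_like(te_s), -te_s], axis=1)
    coeffs, *_ = np.linalg.lstsq(design, flat, rcond=None)
    return coeffs[1].reshape(log_s.shape[1:])

# 32 equidistant echoes with 22.5 ms spacing (22.5 ms to 720 ms, from the text)
echo_times_ms = 22.5 * np.arange(1, 33)

# Synthetic, noise-free 2 x 2 test image with assumed R2 values in 1/s
true_r2 = np.array([[2.0, 3.0], [4.0, 5.0]])
signal = 1000.0 * np.exp(-true_r2[None] * echo_times_ms[:, None, None] / 1000.0)

r2_map = fit_r2(signal, echo_times_ms)
print(np.round(r2_map, 3))
```

On noisy data, a log-linear fit weights late, low-signal echoes too strongly; a weighted or non-linear fit may be preferable in practice.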

Post-processing
The MR data were transferred to a personal computer and processed using an in-house developed Matlab-based (The Mathworks Inc., Natick, USA) PG evaluation tool to calculate the spin-spin relaxation rate R₂ = 1/T₂ pixel-wise and generate R₂ maps. An edge-conserving total variation filter (Rudin et al 1992) was used for noise reduction while conserving steep dose gradients (Mann 2017). Absolute dose maps were generated using the mono-exponential calibration curve, which was previously renormalized according to the high-dose region in the treatment plan. Afterwards, co-registration of the MR images for PG evaluation to the planning MR images of the MR-Linac was performed on the image processing platform MITK (Nolden et al 2013) using a point-based RigidClosedForm3D algorithm with 3rd-order b-spline interpolation, as implemented by the software, and three uniquely defined landmarks on the surface of the gel containers. A 3D γ-map analysis (Low et al 1998) of the TPS-calculated and measured dose distributions was performed in the commercial software VeriSoft (PTW, Freiburg, Germany) using a passing criterion of 3%/3 mm (dose difference with respect to the local dose/distance-to-agreement) and taking only dose levels larger than 10% of the maximum dose into account. The results of the γ-map analysis are presented as passing rates, i.e. the percentage of evaluated voxels that meet the gamma criterion.
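To illustrate how a γ passing rate with local dose normalisation and a low-dose cut-off can be obtained, the following sketch implements a brute-force 3D γ evaluation in the spirit of Low et al (1998). It is not the VeriSoft implementation: the function name `gamma_pass_rate`, the restriction of the search to voxel centres (no sub-voxel interpolation) and the synthetic dose distribution are assumptions made for illustration.

```python
import numpy as np

def gamma_pass_rate(ref, meas, spacing_mm, dose_frac=0.03, dta_mm=3.0, cutoff=0.10):
    """Percentage of evaluated voxels with gamma <= 1 (local dose criterion).

    ref, meas: 3D dose arrays on the same grid; spacing_mm: voxel size per axis.
    Only voxels with ref > cutoff * max(ref) are evaluated.
    Hypothetical helper; searches voxel centres only, without interpolation.
    """
    ref = np.asarray(ref, dtype=float)
    meas = np.asarray(meas, dtype=float)
    mask = ref > cutoff * ref.max()
    reach = [int(np.ceil(dta_mm / s)) for s in spacing_mm]
    # Pad with +inf so that positions outside the grid never yield the minimum
    padded = np.pad(meas, [(r, r) for r in reach], constant_values=np.inf)
    ref_vals = ref[mask]
    tol = dose_frac * ref_vals  # local dose tolerance for each evaluated voxel
    gamma_sq = np.full(ref_vals.shape, np.inf)
    nz, ny, nx = ref.shape
    for dz in range(-reach[0], reach[0] + 1):
        for dy in range(-reach[1], reach[1] + 1):
            for dx in range(-reach[2], reach[2] + 1):
                dist_sq = ((dz * spacing_mm[0]) ** 2 + (dy * spacing_mm[1]) ** 2
                           + (dx * spacing_mm[2]) ** 2)
                if dist_sq > dta_mm ** 2:
                    continue  # such points cannot give gamma <= 1
                shifted = padded[reach[0] + dz:reach[0] + dz + nz,
                                 reach[1] + dy:reach[1] + dy + ny,
                                 reach[2] + dx:reach[2] + dx + nx]
                dose_sq = ((shifted[mask] - ref_vals) / tol) ** 2
                gamma_sq = np.minimum(gamma_sq, dist_sq / dta_mm ** 2 + dose_sq)
    return 100.0 * np.mean(gamma_sq <= 1.0)

# Synthetic Gaussian dose blob on a 1 mm grid (illustrative values only)
z, y, x = np.mgrid[0:12, 0:12, 0:12]
planned = 4.0 * np.exp(-((z - 6) ** 2 + (y - 6) ** 2 + (x - 6) ** 2) / 30.0)

rate_same = gamma_pass_rate(planned, planned, (1.0, 1.0, 1.0))
rate_scaled = gamma_pass_rate(planned, 1.02 * planned, (1.0, 1.0, 1.0))
rate_doubled = gamma_pass_rate(planned, 2.0 * planned, (1.0, 1.0, 1.0))
print(rate_same, rate_scaled, rate_doubled)
```

Limiting the search to the distance-to-agreement radius does not change the pass/fail decision, because any point farther away has γ > 1 by definition.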

General treatment workflow
The online adaptive treatment workflow to be tested in this study is visualized in figure 3. In this possible adaption workflow, CT and MR scans of the patient are performed first (pre-treatment imaging), with the MRI being acquired at the MR-Linac. For treatment planning, the CT is registered to the MRI and an electron density map (pseudo-CT) is created for dose calculation. The structures used for planning are then delineated on the pre-treatment MRI and a treatment plan is calculated based on the generated pseudo-CT. For the actual treatment, the patient is positioned again on the couch of the MR-Linac and an additional MRI (termed online MRI) is acquired at each treatment session (Raaymakers et al 2017). To correct for anatomical changes, the treatment plan is adapted online to the current patient anatomy using the information of the online MRI and the previously generated treatment plan. For this, the pre-treatment MRI is registered deformably to the online MRI and the contours are transferred (Paganelli et al 2018). Using the resulting deformation, the electron density map is deformed accordingly to generate a pseudo-CT of the actual anatomical situation. Changes of the tumour position are first corrected by a setup correction. If there are further clinically relevant anatomical changes, the initial treatment plan is adapted and the new treatment plan is delivered to the patient.

End-to-end test of an online adaptive MRgRT treatment procedure
The AQUARIUM was used to perform an end-to-end test of an online adaptive treatment procedure at a clinical 0.35 T MR-integrated 6 MV flattening filter free linear accelerator (MR-Linac, MRIdian ® Linac, ViewRay, Inc., Oakwood Village, OH, USA). For this, the AQUARIUM was used in two different settings (table 1). As a preparation, a pre-treatment CT and MRI of the phantom in the reference setting were acquired and a treatment plan was generated. After planning, the AQUARIUM was positioned again at the MR-Linac and after an image-based setup correction, it was irradiated with a nominal dose rate of 630 MU/min under three different conditions: (i) the AQUARIUM being in the reference setting, (ii) after changing the AQUARIUM setting to the displaced setting without adapting the treatment plan, and (iii) after changing the AQUARIUM setting to the displaced setting and adapting the treatment plan online to the new phantom setting. For each experiment (i)-(iii), a new set of PG containers for both tumour volume and OAR was used and the AQUARIUM setting was exactly reproduced. The different steps of the treatment workflow including the online adaption are described in the following.

CT imaging
A pre-treatment CT was acquired for treatment planning at a SOMATOM Confidence RT Pro (Siemens Healthineers, Erlangen, Germany) scanner using the following parameters: tube voltage 120 kVp, tube current-time product 216 mAs, slice thickness 1 mm, and an in-plane resolution of 1 × 1 mm².

Treatment planning
For treatment planning, the pre-treatment CT of the AQUARIUM in reference setting was registered to the corresponding pre-treatment MRI. As the configuration of the AQUARIUM was exactly the same in both images, a rigid registration was used to simulate optimal irradiation conditions. An intensity-modulated radiotherapy (IMRT) treatment plan was calculated using the treatment planning system (TPS) of the MR-Linac with a dose calculation grid of 0.2 cm. The treatment plan was optimized to irradiate the PG-filled tumour at the centre of the phantom with nineteen equally spaced beams, prescribing a homogeneous dose of 4 Gy. The PG in the target was delineated as the gross tumour volume (GTV) and a uniform margin of 3 mm was added to define the planning target volume (PTV). The following objectives were used for optimization: V4.00 Gy ≥ 50%, V3.80 Gy > 95% and V4.28 Gy < 1% for the PTV, and V1.00 Gy < 30% and V2.00 Gy < 1.00 cm³ for the OAR (Vx Gy being the volume in % or cm³ receiving more than x Gy). The dose volume parameters achieved for the initial treatment plan with the AQUARIUM in reference position are displayed in table 2. The dose calculation was performed based on the electron density of the pseudo-CT.
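The dose-volume objectives listed above can be checked on a voxelised dose distribution as sketched below. The helper names `volume_above` and `check_objectives`, the masks and the synthetic dose values are hypothetical; only the objective thresholds and the 0.2 cm dose grid are taken from the text. This is not the evaluation performed by the TPS.

```python
import numpy as np

def volume_above(dose_gy, mask, level_gy, voxel_cm3=None):
    """V_x: part of the structure receiving more than level_gy.

    Returns percent of the structure volume, or cm^3 if voxel_cm3 is given.
    """
    inside = dose_gy[mask]
    count = np.count_nonzero(inside > level_gy)
    if voxel_cm3 is None:
        return 100.0 * count / inside.size
    return count * voxel_cm3

def check_objectives(dose_gy, ptv, oar, voxel_cm3):
    """Evaluate the planning objectives quoted in the text."""
    return {
        "PTV V4.00Gy >= 50%": volume_above(dose_gy, ptv, 4.00) >= 50.0,
        "PTV V3.80Gy > 95%": volume_above(dose_gy, ptv, 3.80) > 95.0,
        "PTV V4.28Gy < 1%": volume_above(dose_gy, ptv, 4.28) < 1.0,
        "OAR V1.00Gy < 30%": volume_above(dose_gy, oar, 1.00) < 30.0,
        "OAR V2.00Gy < 1.00 cm3": volume_above(dose_gy, oar, 2.00, voxel_cm3) < 1.0,
    }

# Synthetic example on a 0.2 cm grid (all dose values are made up)
voxel_cm3 = 0.2 ** 3
dose = np.full((20, 20, 20), 0.5)
ptv = np.zeros(dose.shape, dtype=bool)
oar = np.zeros(dose.shape, dtype=bool)
ptv[8:12, 8:12, 8:12] = True
oar[8:12, 8:12, 14:18] = True
dose[ptv] = 4.05               # homogeneous target dose
dose[8:12, 8:12, 14] = 1.2     # OAR layer facing the target

results = check_objectives(dose, ptv, oar, voxel_cm3)
for name, met in results.items():
    print(f"{name}: {'met' if met else 'violated'}")
```

In this synthetic case, one quarter of the OAR voxels exceed 1 Gy, so the V1.00 Gy objective is met with 25%.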

Irradiation workflow
Prior to irradiation, an additional MRI was acquired using the same parameters as for the pre-treatment MRI. Subsequently, irradiations were performed under the three different conditions:
(i) The AQUARIUM in reference setting. The AQUARIUM in the reference setting was aligned at the MR-Linac by means of the laser system and moved to the isocentre position. Then, an online MRI was acquired and a setup correction was derived by rigidly registering the planning MRI to the online MRI. After realizing the setup correction by a couch shift, the PG tumour was irradiated without any adaption of the treatment plan.
(ii) The AQUARIUM in displaced setting without plan adaption. In the second experiment, the PG containers were replaced and the configuration of the AQUARIUM was changed to the displaced setting (table 1). After positioning of the AQUARIUM at the MR-Linac, an online MRI was acquired and the planning MRI was registered to the online MRI using the intensity-based deformable registration algorithm as implemented by the vendor (Bohoudi et al 2017) to transfer the contours of the treatment plan to the actual MRI data set. For this, the advanced registration mode of the system was used with the following parameters: deformation in both ways, tissue stiffness = 1, number of pyramids = 2, downsampling method = minimum, final grid size = 6, max. final iterations = 10, max. intermediate iterations = 8, and contour smoothing = 2. After applying an image-derived couch shift, the treatment plan was delivered without any adaption. Finally, the dose distribution was recalculated on the actual MRI without reoptimization using the respective pseudo-CT.
(iii) The AQUARIUM in displaced setting with plan adaption. In the third experiment, the PG containers were again replaced while keeping the displaced setting of the AQUARIUM (table 1). The deformable registration was performed in the same way as in (ii) and, after applying an image-based couch shift, the treatment plan was adapted to the new configuration of the AQUARIUM using the same optimization objectives as for the initial treatment plan. The adapted treatment plan was then delivered to the phantom.

[Table 1: modified setting of the AQUARIUM relative to the reference setting. The changes are given as longitudinal shifts (Δz) and rotations (Δα) of each rod (see figure 1(b)). Columns: Rod, Structure, Modified setting relative to reference setting.]

Deformable image registration
Qualitative evaluation revealed that the deformable image registration algorithm was able to deform the planning MRI of the AQUARIUM in the reference setting to the online MRI of the AQUARIUM in the displaced setting. All shifted and rotated structures were accurately matched. The corresponding deformation vector field was then applied to the contours and the treatment planning CT to generate the pseudo-CT required for dose calculation on the actual geometry. No artefacts were found in the deformed images. All deformed contours matched the corresponding structures well in the online MRI, in the deformed planning MRI, and in the pseudo-CT.
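Applying a deformation vector field to a contour mask or an electron density map can be sketched as a pull-back warp. The example below assumes a field given in voxel units on the target (online) grid that points to the corresponding source position, and uses nearest-neighbour sampling. The convention and interpolation used by the vendor's algorithm are not described here, so `warp_nearest` and the synthetic shift are illustrative assumptions only.

```python
import numpy as np

def warp_nearest(source, dvf_vox):
    """Pull-back warp: output[p] = source[round(p + dvf[p])], clipped to the grid.

    source: 3D array (e.g. contour mask or electron density map).
    dvf_vox: array of shape (3, nz, ny, nx), displacement in voxels
             (assumed convention: defined on the target grid).
    """
    grid = np.indices(source.shape)
    src = np.rint(grid + dvf_vox).astype(int)
    for axis, size in enumerate(source.shape):
        np.clip(src[axis], 0, size - 1, out=src[axis])
    return source[tuple(src)]

# Synthetic contour: a cube in the reference setting
mask = np.zeros((16, 16, 16), dtype=bool)
mask[4:8, 4:8, 4:8] = True

# Structure shifted by +3 voxels along the first axis in the displaced setting:
# each target voxel looks 3 voxels back in the source image
dvf = np.zeros((3,) + mask.shape)
dvf[0] = -3.0

warped = warp_nearest(mask, dvf)

expected = np.zeros_like(mask)
expected[7:11, 4:8, 4:8] = True
print(np.array_equal(warped, expected), int(warped.sum()))
```

Nearest-neighbour sampling suits binary contour masks; for an electron density map, trilinear interpolation would usually be preferred to avoid stair-step artefacts.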

Treatment plan evaluation
The dose volume parameters of the three treatment plans delivered to the AQUARIUM are listed in table 2, and figure 4 displays the corresponding dose volume histograms (DVH). While all dose objectives were met for the AQUARIUM in reference setting (i), application of the same plan to the displaced setting (ii) led to a clear under-dosage of the PTV (43.80% of the PTV volume receiving more than 3.80 Gy) and an over-dosage of the OAR (40.58% of the OAR volume receiving more than 1.00 Gy). In contrast, when the adapted plan was applied to the AQUARIUM in the displaced setting (iii), the dose distribution was restored and the dose objectives in the PTV were met. In addition, the over-dosage in the OAR was reduced again (32.22% of the OAR volume receiving more than 1.00 Gy).
Figure 5 displays representative dose profiles for the PG-filled tumour and OAR measured in the AQUARIUM in reference setting when delivering the initial treatment plan (i). No significant dose deviation from the prescribed dose was found in the tumour, and the dose volume parameters of the OAR met the objectives used for plan optimization. This is also reflected by the dose calculation, which agrees well with the measurement. Comparing measurement and calculation yields 3D γ passing rates of 96.4% and 93.7%, with only a few voxels showing absolute dose differences of up to 0.25 Gy and 0.12 Gy, for the tumour and the OAR, respectively.
Figure 6 shows representative dose profiles for the PG-filled tumour and OAR measured in the AQUARIUM in displaced setting when delivering the initial treatment plan (ii). Here, a significant under-dosage down to 45% of the planned dose was measured in the PG tumour, while the OAR experienced a large over-dosage of up to 180% of the initially planned dose. These results correspond well with the dose distribution recalculated for the new geometry. Comparing measurement and calculation yields 3D γ passing rates of 96.1% and 94.7%, with only a few voxels showing absolute dose differences of up to 0.30 Gy and 0.18 Gy, in the PG tumour and OAR, respectively.
Figure 7 shows representative dose profiles for the PG-filled tumour and OAR measured in the AQUARIUM in displaced setting when delivering the adapted treatment plan (iii). The online re-optimization of the treatment plan restored the dose distribution. No significant dose deviation from the prescribed dose was found in the tumour, and the dose levels in the OAR were comparable to the case in which the AQUARIUM was irradiated in the reference setting using the initial treatment plan. This is also reflected by the dose calculation, which agrees well with the measurement. Comparing measurement and calculation yields 3D γ passing rates of 93.1% and 94.1%, with only a few voxels showing absolute dose differences of up to 0.25 Gy and 0.12 Gy, in the PG tumour and OAR, respectively.

Discussion
In this study, an end-to-end test of an online adaptive treatment workflow was performed at an MR-Linac using the newly developed AQUARIUM. The new phantom allows simulating the complete workflow including the validation of the implemented image registration algorithms, the online adaption of the treatment plan and the verification of the dose delivery. To our knowledge, existing phantoms used for end-to-end testing in various radiotherapy treatment procedures (e.g. StereoPHAN™ by Sun Nuclear Corp, Lucy 3D® QA Phantom by Standard Imaging Inc, QUASAR™ MRI 4D by Modus Medical Devices Inc., Niebuhr et al 2019) are either static, do not provide anthropomorphic image contrasts in CT and MRI, focus on intra-fractional motion, or are not able to simulate anatomical changes in a highly reproducible way. In contrast to these phantoms, the AQUARIUM is capable of simulating inter-fractional anatomical changes in a realistic and reproducible manner and provides anthropomorphic imaging contrasts in both CT and MRI. In this work, the image registration algorithm was challenged by using adjustable irregular geometric structures with anthropomorphic image contrast. The reproducible setting of the structure configuration is ensured by scales allowing for adjustments of shifts and rotations with an accuracy of better than 1 mm and 2.5°, respectively, if settings defined by the scale marks are used (figure 1(b)). For dose measurements, it is possible to use 3D polymer gels, ionization chambers or thermoluminescence detectors (TLD) that can be inserted into or attached to the tumour or OAR structures.
Experiments in this study were performed at the ViewRay MRIdian® Linac using a magnetic field strength of B₀ = 0.35 T. However, the use of the AQUARIUM is not limited to this device. All phantom materials are also compatible with the higher magnetic field strengths used in other MR-Linac devices. This makes the phantom a versatile tool for comparative end-to-end tests at different MR-Linac centres. However, higher magnetic field strengths might induce additional image artefacts in specific imaging sequences that are not observed at B₀ = 0.35 T. These image artefacts could have an impact on the image registration accuracy and would have to be evaluated when using the AQUARIUM at higher field strengths.
In this work, three irradiation experiments were performed. While experiment (i) acts as a reference measurement under ideal conditions, where no adaption was necessary, (ii) represents a negative control demonstrating the effect on the dose distribution, when geometrical changes are not considered by treatment plan adaptions. Experiment (iii), finally, represents an end-to-end test of the clinically intended adaptive treatment workflow.
In this study, the parameters of the deformable image registration algorithm were optimized to cope with the geometrical changes in the AQUARIUM. To register the pre-treatment CT with the pre-treatment MRI, a rigid registration was chosen, since there were no displacements within the phantom. Hence, the two images were perfectly aligned after registration, allowing for treatment planning under ideal conditions. This was also the case for the registration of the pre-treatment MRI to the online MRI in experiment (i). In the irradiation experiments (ii) and (iii), the pre-treatment and the online MRI were registered deformably and the parameters of the algorithm were optimized until all structures were fully aligned; the treatment planning contours were then deformed using the resulting deformation vector fields. In this study, the choice of the downsampling method had the largest impact on registration quality. Similar to a real patient treatment, where a physician has to check the transferred contours prior to treatment, we also checked the contours for the new phantom geometry. However, the systematic evaluation of the registration algorithm as well as the identification of its limitations was beyond the scope of this study and requires further work. This also includes testing of the deformable registration algorithm for other scenarios such as tumour growth or shrinkage, which may be simulated using the AQUARIUM with differently shaped and sized 3D printed tumour structures. For this purpose, flexible inserts may also be inserted into the AQUARIUM to simulate organ deformations (Niebuhr et al 2019).
Comparing the results of the 3D dose measurements clearly demonstrates the benefit of the online treatment plan adaption for the shifts and rotations of the phantom structures employed in our study: the deteriorated dose distribution within the PTV and the significantly increased dose within the OAR obtained in experiment (ii) were completely restored by the adapted treatment plan in experiment (iii), leading to a uniform dose in the PTV and a significantly reduced dose in the OAR. Both dose distributions agree well with those of the reference experiment (i), where no adaption was necessary. This is also reflected by the dose volume parameters (table 2) and the DVH (figure 4).
For the present study, it was important to capture the dose distribution within the tumour and OAR in 3D. As this is not feasible with point-like detectors such as ionization chambers or TLDs, PG was used. As the dosimetric accuracy of PG is usually lower than that of standard detectors, considerable effort was made to be as accurate as possible. Interestingly, measured and planned dose distributions agreed very well for the standard irradiation as well as in the new geometry in experiments (ii) and (iii). This is demonstrated by the γ passing rates of 96.4%, 96.1%, and 93.1% in the tumour and 93.7%, 94.7%, and 94.1% in the OAR for irradiations (i)-(iii), respectively. Maximum deviations relative to the planned dose were in the order of 0.1-0.3 Gy, which can be considered small for PG measurements (Baldock et al 2010) and are within the overall dose resolution of gels (Baldock et al 2001, 2010). The registration of the measured dose distributions within the PG containers to the phantom images allows for a geometric validation of the planned dose distribution. The good agreement between measured and calculated dose confirms that our PG measurements are reliable and that the combination of the AQUARIUM with PG-filled structures can be used to perform full end-to-end tests of adaptive treatment procedures at MR-Linac devices. Moreover, due to its generality, the method can also be applied to verify image-guided treatment workflows at other modern image-guided radiotherapy devices, such as conventional Linacs, Tomotherapy (Mackie et al 1999), or Cyberknife (Kilby et al 2010) machines.
For the 3D dose evaluation, it should be kept in mind that the calibration curve was normalized based on a reference point within the calculated dose distribution for the AQUARIUM, which is a standard procedure in PG dosimetry. Mann et al (2019) recently suggested a new method, which uses TLDs within the same experiment to normalize the calibration curve. In principle, this method can also be adapted to the present experimental setup by additionally attaching several TLDs around both the OAR and target structures. Since both PG and TLDs measure time-integrated doses (De Deene and Vandecasteele 2013, Murthy 2013), such a combined dosimetric system within the AQUARIUM may also be used to simulate a whole fractionated treatment scheme with anatomical changes between the fractions being corrected by online adaptions of the treatment plan.
As a major drawback, quantitative PAGAT dosimetry requires up to 48 h until the polymerization process stabilizes. It is therefore not feasible to evaluate the PG at the MR-Linac directly after irradiation. However, as recently shown by Dorsch et al (2019), evaluation of geometrical parameters such as isocentre alignment and image distortions directly after irradiation is feasible and yields results comparable to those of films. This would allow the visualization of sharp dose gradients, e.g. an under-dosage within OARs, directly after irradiation as long as no absolute dose levels are required.
In our treatment simulation there was no independent online quality assurance (QA) of the adapted treatment plan. Although this is a required step in patient treatment, the PG measurements in the AQUARIUM confirmed that the re-optimization of the treatment plan and dose delivery was correctly performed. In this study, we developed an end-to-end test for the whole chain of an online adaptive treatment workflow and successfully performed a dosimetric validation.

Conclusion
In this study, a new phantom with adjustable anthropomorphic structures has been developed. The phantom was used to perform an end-to-end test of an online adaptive treatment procedure at a 0.35 T MR-Linac by simulating the complete workflow including the validation of image registration, treatment plan adaption and dose delivery. 3D dosimetry gel measurements confirmed that the adapted treatment plan resulted in dose distributions in the tumour and the OAR that were well-comparable to a static case, where no adaption of the treatment plan was necessary. The developed phantom can be used to perform end-to-end tests of online adaptive treatment procedures at MR-Linac devices before introducing them to patients.