Evaluation of continuous beam rescanning versus pulsed beam in pencil beam scanned proton therapy for lung tumours

The treatment of moving targets with pencil beam scanned proton therapy (PBS-PT) may rely on rescanning strategies to smooth out motion-induced dosimetric disturbances. PBS-PT machines, such as Proteus®Plus (PPlus) and Proteus®One (POne), deliver a continuous or a pulsed beam, respectively. In PPlus, scaled (or no) rescanning can be applied, while POne implies intrinsic ‘rescanning’ due to its pulsed delivery. We investigated the efficacy of these PBS-PT delivery types for the treatment of lung tumours. In general, clinically acceptable plans were achieved, and PPlus and POne showed similar effectiveness.


Introduction
Pencil beam scanned proton therapy (PBS-PT) has shown dosimetric improvements over conventional photon radiotherapy due to an achievable high dose conformity to the target, while reducing the dose to the organs at risk (OARs) (Diwanji et al 2017). The potential clinical benefits of PBS-PT are particularly relevant for the treatment of moving targets, such as thoracic and abdominal cancers, due to the critical OARs surrounding the tumour (lung and oesophagus, heart, and spinal cord) (Chang et al 2014). However, the deployment of PBS-PT to thoracic tumours is still hampered by the high sensitivity of this modality to several uncertainties, resulting in pronounced differences between planned and delivered doses. Uncertainties that can compromise the robustness of PBS-PT treatment plans for moving targets are: machine uncertainties, setup and range errors, anatomic variations throughout the treatment, and interplay effects. Interplay effects occur due to the respiratory-induced motion of the target volume relative to the delivered pencil beams. Consequently, the dose might be delivered inhomogeneously, creating hot or cold spots inside the tumour or delivering more dose to the OARs nearby (Grassberger et al 2013). Rescanning is a motion mitigation technique which has been shown to be effective in reducing interplay effects in PBS-PT (Knopf et al 2011). Using rescanning, the target is irradiated multiple times during each field delivery. By this means, hot or cold spots can be smoothed out and adequate target coverage can be achieved.
The same vendor (IBA, Louvain-la-Neuve, Belgium) provides two different types of PBS-PT machines: Proteus®Plus (PPlus) (Marchand et al 2000, Galkin et al 2014, Saini et al 2016), which uses isochronous cyclotrons to accelerate the protons, and Proteus®One (POne) (Pearson et al 2013, Pidikiti et al 2018), which uses superconducting synchrocyclotrons instead. One of the main differences between these two machines is that the beam pulses of PPlus and POne occur on the nanosecond and millisecond scales, respectively. This is the reason why, on a macro scale, PPlus is considered a continuous beam, while POne is considered a pulsed beam. Compared to a continuous beam, a pulsed beam has increased delivered dose uncertainty per burst. As a result, on POne, each spot is delivered in multiple bursts to ensure total dose accuracy per spot. Part of the dose for a spot is delivered by the initial burst; afterwards, the missing dose is calculated and another burst is delivered, until the prescription for that spot is reached. Typically, this takes about three to four bursts for most pencil beams. The total number of bursts depends, however, on the dose that is delivered per fraction. This delivery behaviour of POne machines can therefore replicate a sort of intrinsic (uncontrolled) rescanning. Conversely, for the PPlus machines, rescanning, if applied, is scaled (controlled) (Zenklusen et al 2010). This type of rescanning can be done using either a layered or a volumetric approach (Bernatowicz et al 2013). As such, adding rescanning in PPlus inevitably increases the treatment time (compared to the same plan delivered without rescanning).
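The burst-wise spot delivery described above can be illustrated with a minimal sketch. It is not the vendor's control logic: the function name `deliver_spot`, the share of the missing dose requested per burst, the per-burst noise level, and the stopping tolerance are all hypothetical values chosen only so that a spot typically completes in three to four bursts.

```python
import random


def deliver_spot(prescribed, burst_share=0.8, rel_noise=0.05,
                 tolerance=0.01, max_bursts=20, rng=None):
    """Deliver one spot burst by burst until the missing dose is within tolerance.

    All parameters are hypothetical: each burst requests `burst_share` of the
    still-missing dose and lands within +/- `rel_noise` of that request.
    """
    rng = rng or random.Random()
    delivered = 0.0
    bursts = []
    while prescribed - delivered > tolerance * prescribed and len(bursts) < max_bursts:
        missing = prescribed - delivered      # recomputed after every burst
        requested = burst_share * missing     # never ask for the full missing dose
        actual = requested * (1.0 + rng.uniform(-rel_noise, rel_noise))
        bursts.append(actual)
        delivered += actual
    return delivered, bursts


# Example: one spot with a prescribed dose of 1.0 (arbitrary units).
delivered, bursts = deliver_spot(prescribed=1.0, rng=random.Random(1))
print(len(bursts), round(delivered, 4))
```

Because each burst lands at a different time, the dose of a single spot is spread over several moments of the breathing cycle, which is the sense in which the pulsed delivery acts as an uncontrolled form of rescanning.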
This study aims to quantify the mitigation capability against interplay effects provided by the delivery architectures of those two IBA systems. Particularly, the question whether conventional scaled rescanning or intrinsic 'rescanning' (given by PPlus and POne, respectively) is more effective for the treatment of moving targets remains unanswered so far. For this purpose, we analysed here different thoracic treatment plan approaches delivered either with PPlus or POne by employing a comprehensive 4D robustness evaluation method (4DREM) (Ribeiro et al 2019). As such, this work ultimately compares PPlus and POne machines in terms of achievable robustness for interplay effects, together with other potential PBS-PT dosimetric impacts.

Patient data, delineations, and target characteristics
Five stage III non-small cell lung cancer (NSCLC) patients, treated in the past with photon therapy, were included in this study. Through a clinical trial approved by the medical ethical committee, all patients provided written informed consent (ClinicalTrials.gov NCT03024138) to acquire a planning four-dimensional computed tomography (4DCT) scan (before start of treatment) and five repeated 4DCTs (in consecutive weeks of the treatment course). The individual 4DCT phases of the patients included in this study were confirmed to be major-artefact-free, with appropriate field-of-view, and without any relevant missing slices. All delineations were performed by treatment planners under supervision of radiation oncologists. Clinical target volumes (CTVs) were delineated on all 4DCT scans (each representing ten breathing phases). Relevant OARs (heart, spinal cord, oesophagus, lungs minus GTV, among others) were defined on the end-of-exhalation planning CT phase (reference CT). Target characteristics (primary tumour location, CTV volume, and CTV motion) were extracted for all five patients (table 1). The CTV motion amplitude per patient was quantified through the deformation vector fields resulting from deformable image registration (DIR) between the end-of-exhalation and end-of-inhalation phases of each 4DCT. As DIR algorithm, the Anatomically Constrained Deformation Algorithm method was used, with the CTVs as controlling regions of interest (Weistrand and Svensson 2015). The reported motion values represent the mean of all the deformation vector lengths within the CTV (Inoue et al 2016). For each patient, this mean was then averaged over all available 4DCTs. CTV volumes were averaged over all end-of-exhalation phases of the patient 4DCTs.
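The motion metric described above (mean deformation vector length inside the CTV, then averaged over the available 4DCTs) can be sketched as follows. This is only an illustration under simplifying assumptions: the deformation vector field is represented as a flat list of (dx, dy, dz) tuples in mm, the CTV as a matching list of booleans, and the function names are hypothetical rather than part of the TPS.

```python
import math


def mean_vector_length_in_ctv(dvf, ctv_mask):
    """Mean length (mm) of the deformation vectors of voxels inside the CTV.

    dvf      : list of (dx, dy, dz) tuples in mm, one per voxel
    ctv_mask : list of booleans, True where the voxel belongs to the CTV
    """
    lengths = [math.sqrt(dx * dx + dy * dy + dz * dz)
               for (dx, dy, dz), inside in zip(dvf, ctv_mask) if inside]
    if not lengths:
        raise ValueError("CTV mask selects no voxels")
    return sum(lengths) / len(lengths)


def patient_motion_amplitude(per_4dct_amplitudes):
    """Average the per-4DCT mean amplitudes over all available 4DCTs."""
    return sum(per_4dct_amplitudes) / len(per_4dct_amplitudes)


# Toy example: three voxels, two of them inside the CTV.
dvf = [(3.0, 4.0, 0.0), (0.0, 0.0, 5.0), (10.0, 0.0, 0.0)]
mask = [True, True, False]
print(mean_vector_length_in_ctv(dvf, mask))
```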

Treatment planning
For all patients, intensity-modulated proton therapy (IMPT) treatment plans using the Monte Carlo dose engine were created in the RayStation 6.99 (RaySearch Laboratories, Stockholm, Sweden) treatment planning system (TPS) (Grassberger et al 2014, Taylor et al 2017). Prescribed dose (for a constant relative biological effectiveness (RBE) of 1.1) was 60.00 Gy RBE (2.40 Gy RBE in 25 fractions) for all patients. Three beam directions were used for all plans. Patient-specific beam arrangement was selected based on tumour location, minimisation of OARs dose, plan robustness, and compliance with planning criteria (table 1). The treatment table was added for all patients in all images with a specific override to water (0.189 g cm−3) applied for dose calculations. Beams travelling through the edges of the treatment table were avoided. The minimax robust optimisation approach (Fredriksson et al 2011) was used, which takes into account multiple scenarios. These scenarios aimed for robustness against range uncertainties of ±3% and setup uncertainties of 6.0 mm (equivalent to the internal CTV to planning target volume margin used for lung cancer patients treated with photon therapy in our clinic). Robust optimisation was used for the minimum dose on the CTV, and the penalty of the worst-case scenario was then minimised (Fredriksson et al 2011). Inoue et al (2016) already showed the impact of this robust optimisation in PBS-PT for stage III NSCLC patients.
To account for changes due to the breathing motion (besides setup and range uncertainties), a 4D robust optimisation approach was used for all plans (Liu et al 2016). The plans were created on the reference CT, and all planning 4DCT phases were included in the optimisation process. The 4D robust optimised plans were then created by optimising the CTV worst-case scenario dose distribution for all planning 4DCT phases. All nominal plans were ensured to have sufficient target coverage (V 95 (CTV) ≥ 98%) and minimised OARs dose. Achieved mean dose to the CTV was aimed to be within ±0.50 Gy RBE from the prescribed dose (maximum of ±1.00 Gy RBE). Subsequent to preliminary robustness evaluation for setup and range errors, all plans were visually inspected in several meetings within a multidisciplinary team of treatment planners, radiation oncologists, and medical physicists. The created plans were revised until clinical acceptance was achieved. A range shifter, placed downstream of the nozzle, slows down the protons, lowering their energy to reduce the range of the proton beam and treat shallower tumours. It was added to the beam to treat the tumours located at approximately 4 cm water equivalent thickness from the patient's surface. With the addition of a range shifter, due to scattering, the beam is broadened and so the delivered spots become larger for increased air gaps. This results in plans which may maintain target coverage robustness, but simultaneously deliver more dose to the OARs (Grassberger et al 2013, Both et al 2014, Liu et al 2018). Therefore, whenever a range shifter was in the beam, the air gap size (distance between the patient surface and the most downstream part of the treatment machine) was minimised to 5 cm in order to reduce the spot size after the range shifter. The air gap for beams delivered without range shifter automatically varied according to a snout position of 42 cm, which is the most retracted position.

PPlus and POne plan delivery
Particularly for POne, it is not possible to predict how the delivery will be split since this is updated in real time between scannings. Therefore, in this study we obtain the time structure for both proton systems retrospectively, on the basis of delivery log files.
To compare the suitability of POne and PPlus delivery techniques for NSCLC, treatment plans were prepared to be delivered in both machine types. After clinical acceptance, the created 4D optimised PPlus and POne treatment plans were delivered in dry runs at the respective proton facilities to obtain machine log files. For the specific PPlus and POne facilities where the log files were acquired, the energy switching times varied between 0.7 and 1.0 s. The spot sizes at the PPlus and POne beam line range from 6.5 mm to 3.0 mm and 7.0 mm to 3.5 mm for proton energies from 70 MeV to 230 MeV in air (sigma at isocentre), respectively.
As recommended by Bernatowicz et al (2013), layered rescanning was applied to the PPlus plans for superior robustness. Five rescans were chosen, following clinical practice. For POne, due to the nature of the pulsed beam, the plans comprise inherent 'rescanning', and about three to four 'rescans' per delivery are implicitly performed. Average field delivery time for each plan per patient was extracted (table 1).

4D robustness evaluation method (4DREM)
A 4DREM, implemented using in-house developed Python scripts and utilising features available in the TPS, was used to evaluate all IMPT plans (Ribeiro et al 2019). This method assesses the robustness of thoracic PBS-PT plans to the combined possible disturbing effects: (1) setup and range errors (simulated considering the fractionation smoothening effect of a treatment course), (2) machine errors, (3) anatomy changes, (4) breathing motion, and (5) interplay effects.
First, the nominal plan was split into sub-plans using a dedicated script (log file interpreter) that retrieves information from the log files (spot position, dose, energy, and the absolute time of delivery). Machine errors, anatomic changes, breathing motion, and interplay effects are simultaneously considered by calculating sub-plans on 4DCT phases, and subsequently accumulating these dose distributions onto the reference CT. Setup errors are simulated by shifting the planning isocentre by a total magnitude of 2 mm (Ribeiro et al 2019), divided into a systematic component and a random component for different fractions (Van Herk et al 2000). The reduction from 6 mm (optimisation setup error) to 2 mm (setup error in the 4DREM) was established internally because the patient inter-fractional variability can be disregarded in the setup error, which is plausible since it is already accounted for through the repeated 4DCTs (Sonke et al 2009, van der Laan et al 2019, Anakotta et al 2020, den Otter et al 2020). Range errors are included by randomly applying a perturbation of 0 or ±3% on the CT densities for different scenarios (Fredriksson et al 2011).
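The sampling of setup and range errors for one simulated treatment course can be sketched as below. This is a hedged illustration, not the 4DREM implementation: the text does not state how the 2 mm total is divided, so the quadrature split with an equal share (`systematic_share=0.5`), the isotropic random directions, and all function names are assumptions. The range error is only returned as a label here, whereas the method applies it to the CT densities.

```python
import math
import random

RANGE_ERRORS_PERCENT = (-3.0, 0.0, 3.0)


def random_direction(rng):
    """Return a random unit vector (isotropic direction, an assumption)."""
    while True:
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        if norm > 1e-12:
            return [c / norm for c in v]


def sample_scenario(rng, n_fractions=8, total_mm=2.0, systematic_share=0.5):
    """Sample isocentre shifts and a range error for one treatment-course scenario.

    Assumption: systematic and random magnitudes add in quadrature to total_mm.
    """
    sys_mm = total_mm * math.sqrt(systematic_share)
    rand_mm = total_mm * math.sqrt(1.0 - systematic_share)
    # One systematic shift for the whole course ...
    systematic = [sys_mm * c for c in random_direction(rng)]
    # ... plus a new random shift for every fraction.
    shifts = []
    for _ in range(n_fractions):
        rnd = [rand_mm * c for c in random_direction(rng)]
        shifts.append([s + r for s, r in zip(systematic, rnd)])
    return {
        "systematic_mm": systematic,
        "fraction_shifts_mm": shifts,
        "range_error_percent": rng.choice(RANGE_ERRORS_PERCENT),
    }


scenario = sample_scenario(random.Random(0))
print(scenario["range_error_percent"], len(scenario["fraction_shifts_mm"]))
```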
In total, with the 4DREM, for each evaluated plan, 14 4D accumulated scenario dose distributions were obtained, each representing a possible treatment course of the nominal plan (Malyapa et al 2016). For each scenario, eight fractions were taken into account (Lin et al 2015). The available 4DCTs were distributed and equally weighted through the eight evaluated fractions. For the first two fractions, 4D dose accumulation of sub-plan doses was performed on the planning 4DCT. For the subsequent two fractions, the first repeated 4DCT was used. For the last four fractions, the remaining repeated 4DCTs were successively selected. The 4DCT starting phase of the delivery was randomly selected.
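The distribution of the six 4DCTs over the eight simulated fractions, together with the random starting phase, can be written down as a short sketch. The labels (`planning`, `repeat_1`, ...) and function names are hypothetical, and drawing a separate starting phase for every fraction is an assumption, since the text only states that the starting phase was randomly selected.

```python
import random


def assign_4dcts_to_fractions(n_repeated=5):
    """Return the 4DCT used for each of the simulated fractions, in order."""
    schedule = ["planning"] * 2 + ["repeat_1"] * 2
    # The remaining repeated 4DCTs are used once each, in succession.
    schedule += [f"repeat_{i}" for i in range(2, n_repeated + 1)]
    return schedule


def sample_starting_phases(schedule, n_phases=10, rng=None):
    """Draw a random starting breathing phase (0..n_phases-1) per fraction."""
    rng = rng or random.Random()
    return [rng.randrange(n_phases) for _ in schedule]


schedule = assign_4dcts_to_fractions()
phases = sample_starting_phases(schedule, rng=random.Random(42))
for fraction, (scan, phase) in enumerate(zip(schedule, phases), start=1):
    print(fraction, scan, phase)
```

With five repeated 4DCTs this yields exactly eight fractions, each carrying the same weight of one eighth in the accumulated scenario dose.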
Plan robustness was then assessed on the reference CT through the obtained scenario doses. The voxel-wise worst-case minimum and maximum, corresponding to the minimum and maximum dose per voxel, respectively, were computed (Harrington et al). The voxel-wise worst-case minimum over all 14 4D accumulated scenarios (4DVwa min ) is used to report on the minimum dose statistics for the target. Conversely, the voxel-wise worst-case maximum over all 14 4D accumulated scenarios (4DVwa max ) is used to report on target maximum doses and OAR DVHs.
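The voxel-wise worst-case distributions can be sketched in a few lines. The example assumes that every scenario dose is available as a flat list of voxel doses on the same reference CT grid; the function name is hypothetical.

```python
def voxelwise_worst_case(scenario_doses):
    """Return (voxel-wise minimum, voxel-wise maximum) over all scenarios.

    scenario_doses : list of scenario dose distributions, each a list of
                     voxel doses (Gy RBE) on the same reference grid.
    """
    if not scenario_doses:
        raise ValueError("at least one scenario dose is required")
    n_voxels = len(scenario_doses[0])
    if any(len(dose) != n_voxels for dose in scenario_doses):
        raise ValueError("all scenario doses must share the same grid")
    # zip(*...) groups the doses of one voxel across all scenarios.
    vw_min = [min(voxel) for voxel in zip(*scenario_doses)]
    vw_max = [max(voxel) for voxel in zip(*scenario_doses)]
    return vw_min, vw_max


# Toy example with three scenarios and four voxels.
scenarios = [
    [60.1, 59.8, 58.9, 2.0],
    [59.7, 60.4, 59.5, 2.6],
    [60.3, 59.9, 57.2, 1.8],
]
vw_min, vw_max = voxelwise_worst_case(scenarios)
print(vw_min)
print(vw_max)
```

Note that neither distribution corresponds to a deliverable treatment course: each voxel takes its value from whichever scenario is worst for that voxel.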

Treatment plan robustness evaluations
Using the sub-plans (derived from the log files) and all six patient 4DCTs, the previously described 4DREM was executed in the TPS for all calculated plans of the five NSCLC patients to evaluate their robustness. The V 95 (CTV) and D 98 (CTV) values were extracted from the 4DVwa min , and the D 2 (CTV) from the 4DVwa max . Additionally, the OAR DVH indices D mean (lungs-GTV), D mean (heart), D 1 (spinal cord), and D mean (oesophagus) (MLD, MHD, D 1 (spine), and MOD, respectively) were averaged over all scenarios resulting from the execution of the 4DREM, and extracted for all plans.
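The target indices named above can be computed from a list of voxel doses as in the sketch below. It assumes equal voxel volumes and the common conventions that V 95 is the percentage of the volume receiving at least 95% of the prescribed dose and that D x is the minimum dose received by the hottest x% of the volume; the TPS may interpolate the DVH differently, and the function names are hypothetical.

```python
import math


def v_percent(doses, prescribed, level=95.0):
    """Percentage of voxels receiving at least `level` % of the prescribed dose."""
    threshold = prescribed * level / 100.0
    return 100.0 * sum(d >= threshold for d in doses) / len(doses)


def d_percent(doses, volume_percent):
    """Minimum dose received by the hottest `volume_percent` % of the voxels."""
    ordered = sorted(doses, reverse=True)
    k = max(1, math.ceil(volume_percent * len(ordered) / 100.0))
    return ordered[k - 1]


def mean_dose(doses):
    """Mean dose, as used for the OAR indices (e.g. MLD, MHD, MOD)."""
    return sum(doses) / len(doses)


# Toy CTV with 100 voxels receiving 1, 2, ..., 100 Gy RBE.
ctv = [float(i) for i in range(1, 101)]
v95 = v_percent(ctv, prescribed=60.0)
d98 = d_percent(ctv, 98.0)
d2 = d_percent(ctv, 2.0)
print(v95, d98, d2, d2 - d98)
```

In the evaluation described in the text, V 95 and D 98 would be read from the voxel-wise minimum distribution and D 2 from the voxel-wise maximum distribution, so the homogeneity value D 2 − D 98 combines both.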
Before comparing the intrinsic 'rescanning' and scaled rescanning (given by POne and PPlus machines respectively) for moving targets, a robustness analysis between different plans within the PPlus was made. Robustness comparisons for different PPlus treatment plans were done to select the maximally robust strategy for PPlus for further comparisons with POne. To investigate the influence of rescanning on 4D PPlus plan robustness, we compared PPlus plans without and with five times layered rescanning (one scan and five rescans respectively) for all five patients.
The efficiencies of POne and PPlus machines for the treatment of moving targets were finally evaluated and compared. We specifically wanted to find out if the scaled rescanning of PPlus as motion mitigation strategy was comparable in terms of robustness and treatment delivery times with the intrinsic 'rescanning' of POne for the patients included in this study.

Results
Concerning delivery times in PPlus machines, as expected, there was a substantial increase (up to 52.7 s per field for patient 1) in the delivery time for all PPlus plans when applying rescanning (table 1). The POne plan delivery times were below delivery times of PPlus plans with 5 rescans, but higher than PPlus plans delivered in 1 scan. The difference in average delivery time per field between POne and PPlus 1 scan plans reached up to 14.0 s (for patient 2).
To illustrate the effect of rescanning on the target coverage robustness for PPlus, and whether this robustness was maintained when constructing POne plans, we compared the 4DVwa min dose distributions that resulted from the PPlus plans with 1 scan, the PPlus plans with 5 rescans, and the POne plans. As can be seen in the sample case in figure 1(a), there were no clear visual differences in dose distribution between the 4DVwa min of the three plans. There was also no general improvement in target coverage with the application of rescanning on the PPlus delivery machine for all the cases analysed (table 1). Conversely, a consistent improvement in D 2 − D 98 (CTV) values was observed with the application of rescanning on the PPlus. For all patients, no clinically relevant differences in 4DVwa min V 95 (CTV) values were observed between the POne plans and the PPlus plans (excluding patient 5, these were consistently ≥99.86%). Target homogeneity was superior for POne in all cases when compared to PPlus 1 scan (on average, a 0.41 Gy RBE improvement in homogeneity was found). However, considering all five cases, homogeneity was neither consistently improved nor consistently worsened for POne relative to PPlus 5 rescans delivery.
For patient 5, the target coverage for all PPlus plans and POne plan was not adequate. One possible explanation for this can be the variability in patient positioning detected in the repeated 4DCTs (relative to the planning 4DCT).
The averaged relevant OAR dose parameters obtained over all scenarios considered within the 4DREM for all PPlus and POne plans were plotted for all patients (figure 1(b)). In general, there were no considerable differences in doses to relevant OARs between the two different machines, nor between PPlus plans with and without rescanning. Additionally, as expected, the OAR dose deviations between different 4DREM scenarios within one plan were patient specific, and particularly more prominent for OARs closer to the CTV. The largest OAR dose SDs between different scenarios were observed in the PPlus 5 rescans plans. These variations reached up to 12.81 ± 0.44 Gy RBE (MLD), 9.52 ± 1.02 Gy RBE (MHD), and 35.90 ± 2.06 Gy RBE (D 1 (spine)) for patient 1, and 10.66 ± 1.00 Gy RBE (MOD) for patient 3.

Discussion
For this project, two different types of PBS-PT machines have been assessed. The intrinsic 'rescanning' of POne has been compared to the scaled rescanning of PPlus for the treatment of moving targets. Essentially, the effectiveness in terms of robustness for target coverage and homogeneity and OAR dose statistics for different perturbation scenarios in PPlus and POne has been investigated.
All created IMPT 4D robust optimised treatment plans, delivered either with PPlus or POne, were evaluated using the 4DREM (Ribeiro et al 2019). As such, possible dosimetric impacts influencing PBS-PT delivery of a moving target (setup and range errors, machine errors, anatomy changes, breathing motion, and interplay effects) are considered. The plan specific disturbing scenarios and resultant 4DVwa min and 4DVwa max dose distributions were carefully analysed for all NSCLC patients included in this study. Some limitations of the 4DREM are the reduced number of fractions assumed for each scenario simulation, the introduction of DIR intrinsic uncertainties in 4D dose accumulations and dose accumulation itself, and the reliability on motion information captured by 4DCT. Robustness evaluation accuracy could be further improved by including more case-specific data points, such as treatment-fraction specific imaging (as provided in the future at our clinic with 4DCBCT (Niepel et al 2019)), or even realistically considering the tumour intra-fractional motion variability (Souris et al 2019). Additionally, the probabilistic sampling inherent to the 4DREM setup error simulations can also cause some minor variations in the obtained results from this method. However, the 4DREM represents a comprehensive approach to inspect the robustness of PBS-PT plans for thoracic indications, by including the combination of numerous substantial uncertainties (Ribeiro et al 2019).
In line with the previous results by Liu et al (2016) and other publications showing differences in favour of 4D robust optimisation planning in relation to 3D robust optimisation (Graeff 2014, Yu et al 2016, Ge et al 2019), 4D optimisation was initially selected for all created plans to get the best out of both PPlus and POne. To encompass the whole averaged breathing cycle, we decided to include all phases of the planning 4DCT. Naturally, this choice increases optimisation time considerably compared to a strategy that would have included only a limited number of phases (such as the extreme end-of-exhalation and end-of-inhalation phases) in the optimisation process.
Since most 4D optimised PPlus plans (except for patient 5) were already robust without rescanning, rescanned PPlus plans did not show better target coverage robustness than their respective non-rescanned plans. However, as expected, rescanning within PPlus improved target homogeneity. Although the impact of rescanning on the dose to the OARs was slightly more evident than without rescanning for the PPlus plans in several 4DREM scenarios, these differences were not clinically meaningful. Only five times layered rescanning was explored here. Results could differ for a different number of rescans. However, previous work has shown that an increased number of rescans does not always result in better target dose (Knopf et al 2011, Bernatowicz et al 2013). Future work comparing PPlus and POne may also include different numbers of rescans for PPlus. The target coverage in the 4DREM failed for all delivered PPlus and POne plans for patient 5. We confirmed a pronounced variation in the position of the patient along one of the beam directions (left-anterior-oblique) throughout the course of treatment, compared to the planning situation. This led to considerable dosimetric impacts within the CTV when performing 4D dose accumulations between planning and repeated 4DCTs in the 4DREM. However, distinct shifts in position such as the ones observed for this patient would most probably be corrected during verification of the patient positioning in a proton clinic treatment workflow.
A drawback of this study is the rather limited number of patients included in the robustness comparisons, and so conclusions were based on general trends. Even though only five NSCLC patients (with limited to moderate CTV motion amplitudes) were presented here, these cases are highly heterogeneous concerning primary tumour location and representative in terms of motion extent of the lung cancer patient population treated with PBS-PT (den Otter et al 2020). Furthermore, all delineations for the extensive 4DCT imaging available were approved by radiation oncologists. Clinically acceptable treatment plans and numerous scenarios were calculated for each of these patients. In total, for each patient, three plans were examined (one 4D PPlus 1 scan, one 4D PPlus 5 rescans, and one 4D POne). Additionally, 42 respective scenario doses for each patient (14 for each plan) were considered. We are confident that the extent of this study is sufficient to determine general trends. However, to perform statistical tests in the future, more NSCLC patients (with different CTV volume and CTV motion characteristics) should be included in the comparisons, and therefore we suggest further investigation on this topic.
The target motion amplitudes reported for these patients were quite limited (as high as 5.7 ± 1.3 mm). However, these were quantified by the mean of all deformation vector lengths from DIR within the whole CTV (CTV of primary tumour plus pathological lymph nodes), averaged over all available 4DCTs. Therefore, there may be regions of the CTV with larger movement, or even weeks of treatment with higher motion amplitudes, and of course these values would change if other quantitative metrics were used.
In our investigation, POne and PPlus plans showed similar target coverage robustness. POne improved the target homogeneity compared to PPlus without rescanning. This inhomogeneity in the PPlus plans could, however, be mitigated by applying rescanning. Comparable doses to OARs between PBS-PT facilities were also achieved. Within the PPlus plans, irradiation times from 1 scan to 5 rescans increased on average by 79%. From PPlus 1 scan to POne, field delivery times also increased, but not as markedly as in the former comparison. Fields in POne took on average 10.1 s (32%) longer to be delivered than in PPlus without any rescanning.
With stereotactic treatment cases, which are not an obvious indication for proton therapy, a more pronounced deterioration in target coverage is expected from PPlus to POne. This conclusion comes from a previous similar analysis that included other lung cancer patients with mainly small targets and relatively large motion amplitudes (Dumont et al 2018). A considerably small target volume can be more sensitive to POne delivery since the dose between different scannings is not evenly distributed. Therefore, scanning delivery to specific breathing phases can largely affect the dose homogeneity, and consequently impair the target coverage. All the remaining conclusions of this previous study concerning the evaluation of PPlus 5 rescans vs. POne are in accordance with the present technical note, for which all patients included comply with the target characteristics that make them suitable for proton treatment.
Overall, based on the results of this work, we conclude that clinically acceptable robust plans can be achieved for moving targets treated with PPlus, as well as with POne technologies. The performances of PPlus and POne machines were comparable. The scaled rescanning of PPlus has similar effectiveness in reducing interplay effects (together with other potential PBS-PT disturbing effects) to the intrinsic 'rescanning' of POne for NSCLC. Attention was also given to the plan optimisation and evaluation of the dose on the OARs, which ultimately indicates equivalent capabilities of PPlus and POne for NSCLC.