Introduction

Central Nervous System (CNS) tumours account for approximately 25% of all childhood neoplasms. Improvements in multimodality treatment regimens, including surgical resection, focal and craniospinal radiotherapy (RT) and chemotherapy, have led to a 5-year overall survival rate of around 75% for this group of tumours in UK children [1]. Conventional RT (photon RT), which uses photon (x-ray) beams to target cancer cells, has made a significant contribution to survival. However, it is associated with long-term adverse effects resulting from damage to adjacent healthy tissue, which can lead to long-term cognitive, developmental and behavioural dysfunction [2,3,4]. These are caused by a combination of the direct and indirect impact of the tumour itself and also patient and treatment related parameters. There has been increasing interest in the potential of proton beam therapy (PBT) to reduce these late adverse events. Compared to photon RT, PBT is associated with smaller volumes of non-target irradiated normal tissue [5,6,7,8,9], largely due to the near complete elimination of exit dose [10]. Based on modelling assumptions from dosimetric studies, PBT has been adopted as the primary RT treatment modality for selected paediatric CNS tumours in several healthcare systems worldwide. In turn, it is assumed that the radiodosimetric advantage of PBT will translate into clinical benefits such as a reduction in neuro-psychological sequelae and a lower incidence of radiotherapy induced second tumours.

The utility of systematic reviews to summarise research evidence in a non-biased, reproducible and transparent way is well established. Our initial scoping review identified three published systematic reviews that had investigated the effectiveness of PBT [11,12,13]. In all three, searches ran only up to 2014, meaning they were all out of date. In addition, one had missing studies [11], one included both adults and children with brain tumours [12] and one included all paediatric cancers, not just brain tumours [13]. With the recent opening of two UK NHS proton facilities, in Manchester at The Christie Hospital and in London at the University College London Hospital (UCLH) [14, 15], it is timely to provide an up-to-date assessment of the evidence base.

The aim of this systematic review was to evaluate the effectiveness of PBT in children and young adults with CNS tumours to assess the potential benefits and harms and identify any research gaps.

Methods

Protocol

Standard systematic review methodology aimed at minimising bias as recommended by the Cochrane Collaboration was employed and reported in accordance with Preferred Reporting Items for Systematic Reviews and Meta-Analysis (PRISMA) guidelines [16]. For more details see the published protocol (PROSPERO CRD42016036802) [17].

Eligibility criteria

Studies were included in the review if they met the following criteria:

Population

Children and young adults (age up to 25 years) with any type of CNS tumour. Studies had to have a minimum sample size of nine patients [18, 19]. Studies with a mix of older adults and children/young adults were included provided that patient baseline data and outcomes were reported separately for children/young adults. Studies reporting a mix of tumour types were initially included. However, disease-specific data within these studies were judged to be at risk of reporting bias, so they were excluded at data extraction where this was suspected.

Intervention

PBT, used alone or as part of a multimodality treatment regimen.

Comparator

For comparative studies, we accepted conventional photon external beam radiation including three-dimensional (3D) conformal techniques or intensity-modulated radiation therapy (IMRT) including arc therapy, stereotactic radiosurgery, or brachytherapy used alone or as part of a multimodality treatment programme.

Study designs/publication type

Published full text studies were included if they were randomised controlled trials (RCTs), non-randomised controlled studies, phase II single arm trials or case series studies.

Search strategy

Searches were undertaken from database inception to May 2021 in twelve bibliographic databases including MEDLINE, EMBASE and the Cochrane Library (search strategy provided in Supplementary Information (SI 1 and SI 2)). No language, publication or study design filters were applied. Reference lists of relevant studies were checked and clinical experts in the field were consulted.

Study selection

Study selection was undertaken independently by multiple reviewers in the author group and disagreements resolved by discussion, with JSW and BP making the final decisions.

Data items and extraction process

Data extraction and risk of bias assessment were undertaken by one reviewer and checked by a second. Data was collected on specially designed pro-forma in Word and included data on patient characteristics, treatment regimens, and outcome measures. Proton radiation dose was measured in SI units of Gray Relative Biological Effectiveness (GyRBE). Missing data was not imputed (SI 3). Risk of bias was assessed using a checklist designed to assess the validity of case series [17, 20], covering the domains of selection, detection and attrition bias. Additional criteria to assess the adequacy of the sample size, methods of analysis, outcome reporting and external validity of the study were also added and reported as a global assessment of the data set—see questions 13–17 of the data extraction sheet (SI 3).

Effect measures

Effect measures were categorised as tumour related or toxicity related. Tumour related included: overall survival (OS), progression-free survival (PFS), event-free survival (EFS), recurrence-free survival (RFS), local and distant failure rates (LFR/DFR), response rates (RR), nodular failure-free survival (NFFS), and cystic failure-free survival (CFFS). Toxicity-related included: short- and long-term adverse events, such as necrosis, endocrine insufficiencies, ototoxicity and health related quality of life (HRQoL).

Synthesis methods

Results were grouped according to tumour type and reported in a standard format across the tumour types, allowing for consistent reporting and for missing data to be identified. The format was as follows: study characteristics, including number of patients, study design, patient characteristics and interventions received, followed by outcomes, grouped as tumour related outcomes and toxicity related outcomes.

Results

Quantity of the research

Thirty-one full-text studies met the inclusion criteria, consisting of one phase II study, 24 retrospective and six prospective case series studies. Twenty-three studies were single arm; the remaining eight were non-randomised comparisons of PBT with photon RT. There were no RCTs (Fig. 1).

Fig. 1 PRISMA diagram showing search process and number of included studies

Conducted in 10 institutions, 27 studies were based in the USA, one in France and two in Switzerland. One study was multinational with data from the USA and Canada [21]. In total, 1731 children participated in the studies, with 1465 children (85%) receiving PBT and 266 (15%) receiving photon RT. The studies were conducted between 1991 and 2018, with the majority of studies conducted between the years 2000 and 2015. The mean sample size was 51 and ranged from 10 to 179. Average follow-up ranged from 0.9 to 7.6 years (Table 1).

Table 1 Baseline characteristics of children and young adults with CNS tumours included in PBT studies

Eleven studies included children with medulloblastoma/primitive neuroectodermal tumours (PNET) (n = 712) [21,22,23,24,25,26,27,28,29,30,31], five ependymoma (n = 398) [32,33,34,35,36], four atypical teratoid/rhabdoid tumour (AT/RT) (n = 72) [37,38,39,40], six craniopharyngioma (n = 272) [41,42,43,44,45,46], three low-grade glioma (LGG) (n = 233) [47,48,49], one germ cell tumour (GCT) (n = 22) [50], and one pineoblastoma (n = 22) [51]. Ninety percent of patients were receiving first-line therapy and 57% were male (Table 1).
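The patient totals reported above can be cross-checked with simple arithmetic. The following is a minimal Python sketch and not part of the review's methods: the dictionary and variable names are illustrative assumptions, and the counts are copied from the text.

```python
# Per-tumour patient counts as reported in the text (names are illustrative).
patients_by_tumour = {
    "medulloblastoma/PNET": 712,
    "ependymoma": 398,
    "AT/RT": 72,
    "craniopharyngioma": 272,
    "LGG": 233,
    "GCT": 22,
    "pineoblastoma": 22,
}

# Patients by radiotherapy modality, as reported in the text.
patients_by_modality = {"PBT": 1465, "photon RT": 266}

total_by_tumour = sum(patients_by_tumour.values())
total_by_modality = sum(patients_by_modality.values())

# Share of each modality, rounded to the nearest whole percent.
modality_share = {
    name: round(100 * n / total_by_modality)
    for name, n in patients_by_modality.items()
}

print(total_by_tumour, total_by_modality, modality_share)
```

Both tallies give the same overall number of participants, and the rounded shares match the percentages quoted for PBT and photon RT.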

Quality of the research

Selection bias and reporting bias were the major methodological limitations, due to studies involving opportunity/convenience samples and the retrospective nature of the data collection. Poor reporting compounded selection bias with few studies reporting eligibility criteria making it difficult to assess representativeness and generalisability. Where studies included patients at different stages in disease progression, most did not report results separately by disease status. Poor reporting also hampered assessments of outcomes, for example, timing of outcome assessments was generally not reported and long-term adverse events were frequently reported in a seemingly arbitrary sub-group of patients. Length of follow-up was long enough for some outcomes to occur (e.g. PFS in AT/RT), but not others (e.g. long-term adverse events, particularly neuro-cognitive outcomes) (SI Fig. 1).

Medulloblastoma

Eleven studies assessed the effects of PBT, reporting data on 712 patients with medulloblastoma/PNET, with 515 receiving PBT and 197 receiving photon RT. In seven studies children were treated with PBT at the Massachusetts General Hospital (MGH). All MGH studies have slightly different study designs and focus, but it should be noted that double counting for common outcomes may have occurred as there is substantial overlap in study dates/periods suggesting a shared cohort of patients particularly between 2002 and 2009 and for OS outcomes.

The 11 studies comprised one single-arm phase II trial [31] and 10 case series studies (three prospective [26, 27, 30] and seven retrospective [21,22,23,24,25, 28, 29]). Five studies compared PBT (n = 179) with photon RT (n = 197) [21,22,23, 28, 30]. The mean sample size was 65. Median follow-up ranged from 0.9 to 7 years. One study had 11 (14%) recurrent patients [21].

Eight studies defined patients according to risk, with 78% (429/551) defined as standard-risk and 21% (115/551) defined as high-risk. One study defined six patients as intermediate-risk (see paper for definitions), accounting for 1% of the total; however, these patients' outcomes are reported as if they were high-risk [31]. Across the studies the youngest patient was 1.9 years [25] and the oldest 21.9 years [22], but the median age within the studies ranged from 2.9 to 10 years. Two studies focused solely on very young children [24, 25] (Table 1).

PBT was given as part of a multimodal treatment regimen consisting of surgical resection prior to radiotherapy and chemotherapy (various protocols). Gross total resection (GTR) was achieved in 86% of PBT patients. The median craniospinal irradiation (CSI) dose for standard-risk patients was 23.4 GyRBE (36.0 GyRBE for high-risk patients) with a median boost dose to the tumour bed of 54 GyRBE, both delivered in fractions of 1.8 GyRBE (Table 1 and SI Table 1).
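To make the fractionation concrete, the quoted doses can be converted into numbers of treatment fractions. The following is a minimal Python sketch, assuming each quoted dose is delivered entirely in equal fractions of 1.8 GyRBE. The review itself does not report fraction counts, and the function name `n_fractions` is hypothetical.

```python
from decimal import Decimal


def n_fractions(total_dose_gyrbe: str, dose_per_fraction_gyrbe: str = "1.8") -> int:
    """Return the number of equal fractions needed to deliver a total dose.

    Doses are passed as strings so that Decimal arithmetic stays exact.
    Raises ValueError if the total is not a whole multiple of the fraction size.
    """
    quotient = Decimal(total_dose_gyrbe) / Decimal(dose_per_fraction_gyrbe)
    if quotient != quotient.to_integral_value():
        raise ValueError("total dose is not a whole number of fractions")
    return int(quotient)


# Doses quoted in the text (GyRBE); the labels are illustrative.
quoted_doses = {
    "CSI, standard-risk": "23.4",
    "CSI, high-risk": "36.0",
    "tumour bed": "54",
}

fractions = {label: n_fractions(dose) for label, dose in quoted_doses.items()}
print(fractions)
```

Decimal arithmetic is used rather than floats because values such as 23.4 and 1.8 are not exactly representable in binary floating point, which can make an exact whole-number check unreliable.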

Tumour related outcomes

Survival was reported in five studies (n = 285) [23,24,25, 29, 31]. OS for all PBT patients ranged from 68 to 89% in newly diagnosed patients, depending on patient and tumour characteristics and follow-up. For example, Yock (2016) reported 7-year OS rates of 81% for 39 standard-risk PBT patients compared with 68% for 20 high-risk PBT patients [31]. Eaton (2016) reported a 6-year OS of 82% for 45 PBT patients compared with 88% for 43 photon RT patients but the comparison was non-significant [23]. In very young children, Grewal reported an OS of 84% at 5 years in 14 PBT patients [24] (Table 2).

Table 2 Summary of results of tumour related outcomes studies on PBT in children and young adults with CNS tumours

Failure rates were given in three studies for PBT patients [24, 25, 29]. At 3.2 years, LFR was 5% and DFR 10% (n = 109), with the spine the most common site for isolated local failure (Table 2).

Toxicity related outcomes

Early to medium term toxicities were reported in two studies [24, 31]. Serious adverse events experienced 90-days post PBT included stroke (grade IV) in one patient and brainstem injury consistent with necrosis (grade III) in another, with no toxicity-related deaths reported [24, 31]. One patient died from viable tumour and necrosis in the brainstem, but it was unclear if the necrosis was related to PBT [24] (Table 3).

Table 3 Adverse events other than endocrinopathies, ototoxicities or neuro-cognitive outcomes

A variety of late effects were reported. Endocrinopathies were reported in four studies (165 patients) [22, 24, 25, 31]. Yock reported at 3, 5 and 7 years post PBT, observing that deficiencies increased over time. By year 7, 61% (36/59) of patients had at least one endocrine deficiency, the most common being growth hormone deficiency (GHD) occurring in 31 patients [31]. Comparing PBT with photon RT, Eaton (2016) found a statistically significant reduction in the incidence of central hypothyroidism (p < 0.001) and sex hormone deficiency (p = 0.013) in PBT patients at 5.8 and 7-years follow-up [22] (Table 4).

Table 4 Summary of results of endocrinopathies in children and young adults treated with PBT for CNS tumours

Ependymoma

Conducted in three institutions, five case series studies (two prospective [32, 34] and three retrospective [33, 35, 36]) assessed the effects of PBT in 398 children with predominantly intracranial ependymoma. One study was comparative and compared PBT with patients who had received photon RT (non-randomised) [36]. The mean sample size was 80 and the median study follow-up was 3.6 years (Table 1).

Eighty-eight percent of patients were receiving first-line therapy while 12% had recurrent local or metastatic disease [33,34,35,36]. Patients ranged from infants to young adults, with median age within the studies ranging from 2.5 to 5.3 years. Patients received PBT as part of a multi-modal treatment regimen, with patients undergoing surgical resection (78% achieving GTR) and chemotherapy (38%) prior to PBT/photon RT. The median dose of PBT was 55.8 GyRBE delivered in fractions of 1.8 GyRBE (Table 1 and SI Table 1).

Tumour related outcomes

Survival was reported in all five studies. In patients treated with PBT, 3-year OS ranged from 90% [34] to 97% [36] in patients receiving first-line therapy, with 3-year PFS ranging from 76% [34, 35] to 82% [36]. In Eaton’s study of 20 patients with recurrent disease, 3-year OS was 79% and PFS was 28% [33]. Comparing PBT with photon RT, Sato found statistically significant differences in favour of PBT for both 3-year PFS (82% versus 60%; p = 0.031) and local RFS (88% versus 65%; p = 0.01), but no statistically significant difference for OS [36]. Ares reported a 5-year OS of 84% in 50 patients treated with pencil beam scanning PBT [32] (Table 2).

Failure rates were reported in all five studies. LFR at 3 years was 15% [34] and 17% [35], with 5-year LFR at 22% [32] and 23% [35]. DFR at 3 years was 15% [34] and 23% [35] and at 5 years 17% [35]. Median time to LFR and DFR was 1.4 years and 1 year, respectively [34]. In a univariate analysis LFR was related to extent of surgery (GTR: 21.6%, subtotal resection (STR): 35.5% (p = 0.003)) [34]. Comparing PBT with photon RT, Sato reported an LFR of 15% and DFR of 2% for PBT assessed at 2.6 years follow-up and an LFR of 47% and DFR of 8% for photon RT assessed at 4.9 years follow-up, but this difference is likely to be due to the differences in follow-up times [36]. In recurrent patients 3-year LFR and DFR were 45% and 67%, respectively, with second failure following first failure patterns [33] (Table 2).

Toxicity related outcomes

Short-term serious adverse events were reported in all five studies (398 patients) [32,33,34,35,36]. There were 14 cases of RT-associated vasculopathy presenting as stroke [34, 36] and radio-necrosis [36], 11 cases of brainstem toxicity including one fatality reported [32, 34, 36] as well as three cavernoma and two cervical subluxations [35] (Table 3).

Various medium-term and late endocrine toxicities were reported. Central hypothyroidism and GHD were the only endocrinopathies reported over three studies, with GHD being the most common [32, 34, 35] (Table 4).

Ototoxicity was reported in three studies [32, 34, 35], but occurred at low levels and appeared to be related to prior cisplatin chemotherapy or in patients with the tumour close to the cochlea [32, 35] (Table 5).

Table 5 Summary of results of ototoxicity in children and young adults treated with PBT for CNS tumours

Neuro-cognitive outcomes were only assessed by MacDonald (2013) who reported small and non-statistically significant increases in both mean Full Scale Intelligence Quotient test (FSIQ) (n = 14) and adaptive skills/functional independence (n = 28) at 2.2 years follow-up compared to baseline [35] (Table 6).

Table 6 Summary of results of neuro-cognitive and adaptive behaviour outcomes

No studies reported quality of life measures.

Atypical teratoid/rhabdoid tumours (AT/RT)

Conducted in separate institutions, four single-arm, retrospective case series studies assessed PBT in 72 children with AT/RT [37,38,39,40]. The mean sample size was 18 and study follow-up ranged from 2.0 to 3.2 years.

All patients were receiving first-line therapy and 28% had confirmed metastatic disease at presentation. Mean age across the studies was 1.7 years. Prior to PBT, 97% of patients underwent surgical resection (47% achieved GTR) followed by induction chemotherapy (92%). The average PBT dose was 50.4 GyRBE in two studies [37, 39] and 54 GyRBE in two studies [38, 40], delivered in fractions of 1.8 GyRBE. Chemotherapy was delivered either concurrently (25%) or post-PBT (67%) (Table 1 and SI Table 1).

Toxicity related outcomes

All four studies reported comprehensive lists of adverse events. Radiation necrosis was reported in six patients all of whom survived [38, 40] (Table 3).

Endocrinopathies and ototoxicity were assessed by De Amorim Bernstein in seven (70%) and ten (100%) patients, respectively. Of the seven patients assessed for endocrinopathies, two (29%) developed hypothyroidism and three (43%) GHD at 2.5 years. One patient developed high-frequency sensorineural hearing loss (SNHL) at 2.3 years follow-up [37] (Tables 4 and 5).

HRQoL was assessed by Weber in 15 children, predominantly less than 2 years of age. Based on parental proxy reports, there was little variation between mean scores for physical, social, emotional and psycho-social functioning at two-months follow-up compared with baseline [40] (SI Table 2).

Tumour related outcomes

Survival was reported in all four studies with variable follow-up schedules possibly impacting estimates. OS ranged from 53% at 2 years [39] to 90% at 2.3 years [37]. PFS ranged from 46% at 2 years [39] to 75% at 1.4 years [38] (Table 2).

Failure rates were reported in three studies (n = 41). LFR ranged from 0 to 20%, and DFR 20% to 27% [37, 38, 40] (Table 2).

Craniopharyngioma

Six studies assessed the effects of PBT in 272 children with craniopharyngioma. Of these, five were single arm retrospective case series [41, 43,44,45,46] and one was an historical control study, comparing PBT with photon RT [42]. The average sample size was 45 and study follow-up ranged from 2.0 to 6.2 years (Table 1).

Fifty-one percent of patients were receiving first-line therapy and 49% had recurrent disease [42,43,44,45,46]. Patient age ranged from 1.3 to 20 years [43,44,45,46]. Prior to radiotherapy, 97% of patients underwent surgical resection (69% STR, 11% GTR) and 20% either had a cyst drainage, fenestration or shunt inserted [41,42,43,44, 46]. The median dose of PBT ranged from 50.4 to 59.4 GyRBE delivered in fractions of 1.8 GyRBE (Table 1 and SI Table 1).

Tumour related outcomes

OS was reported in three studies (n = 149) [42, 43, 45]. Comparing PBT and photon RT, Bishop reported a non-statistically significant difference in 3-year OS between 21 patients who received PBT (OS 94%) and 31 patients who received photon RT (OS 97%) [42]. In 77 patients treated with PBT, 5-year OS was 97.7% [43]. Luu (n = 16) also reported a 5-year OS of 100% for patients who had undergone one surgical resection compared to 60% for those with more than one resection [45]. PFS was not reported (Table 2).

Specific to craniopharyngioma, Bishop reported NFFS and CFFS. No statistically significant differences were found in 3-year NFFS (92% versus 96%; p = 0.54) or 3-year CFFS (67% versus 77%; p = 0.99) between the PBT and photon RT groups [42].

LFR was reported in three studies. Winkfield (n = 24) reported LFR at 0% at 3.4 years [46]. In Luu (n = 16) and Jiminez (n = 77) the 5-year LFR was 6% and 10%, respectively [43, 45]. Median time to failure from PBT completion was 3.6 years (range 1.8–8.4) (Table 2).

Toxicity related outcomes

Bishop reported no significant differences in the incidence of post-RT vasculopathy, visual dysfunction and obesity between PBT and photon RT [42] (Tables 4 and 5). In the Jiminez report one patient had vasculopathy symptoms (1.3%), one patient had a stroke (1.3%) and one had Moyamoya syndrome (1.3%). Jiminez also compared visual outcomes pre- and post-PBT, with 68% experiencing stable vision, 10% worsening, 10% improving and 12% unknown [43] (Table 3).

Endocrinopathies were reported in four studies [42,43,44,45]. Bishop reported no statistically significant difference between PBT and photon RT patients in the incidence of endocrinopathies newly acquired from the start of RT. The most common endocrinopathy was panhypopituitarism occurring in seven (13%) PBT and 17 (33%) photon RT patients (p = 0.162) [42]. Luu reported just one patient (6%) with panhypopituitarism [45], while Laffond reported pituitary dysfunction in 28 patients (96%) and hypothalamic syndrome in 18 PBT patients (62%) between 1.7 and 14 years follow-up [44]. Jiminez measured endocrinopathies pre- and post-PBT and found 49% were stable, 47% worsened and 4% improved [43] (Table 4).

Ototoxicity was comprehensively reported by Bass. Rates were low for clinically significant SNHL in the extended high frequency (EHF) range at 3% [41] (Table 5).

Neurocognitive outcomes were reported by Jiminez [43]. FSIQ, verbal and visual memory scores were stable, while adaptive skills (Scales of Independent Behaviour Revised (SIB-R)) showed a statistically significant decrease in mean follow-up score compared with baseline; however, this was not considered clinically important (Table 6).

HRQoL and executive functioning outcomes were reported by Laffond [44]. HRQoL was assessed via patient and parental proxy reported scores in 22 PBT patients (nine of whom also received photon RT). At 3.4 years follow-up, overall HRQoL was deemed satisfactory, although between 25 and 50% of scores were indicative of low HRQoL for seven of the ten sub-domains. Fifty percent of patients had mild-moderate mood disorders, but no patients experienced severe depression. With respect to executive function, 24–38% of patients experienced problems with flexible thinking (‘shift’), emotional control and working memory (SI Table 2).

Low grade glioma (LGG)

Three non-comparative single centre case series studies (one prospective [49] and two retrospective [47, 48]) assessed the effects of PBT in 233 children with LGG. The two retrospective studies had small sample sizes and both started recruitment in the 1990s, however, the prospective study by Indelicato involved 174 patients and was conducted between 2007 and 2017. Study follow-up ranged from 3.3 to 7.6 years.

Reported in two studies (n = 59), 75% were newly diagnosed while 25% had recurrent disease [47, 48]. No patients had metastatic disease. Mean patient age at time of PBT ranged from 8.7 to 11 years, although most included children from 2 to 21 years. Prior to PBT, a selection of patients underwent surgery (87%) followed by chemotherapy (44%) [47, 49]. One-hundred and seventy patients in the Indelicato series had > 0.5 cm gross disease at time of irradiation, the remaining four patients received RT due to multiple prior recurrences [49]. The average dose of PBT was 54 GyRBE (Table 1 and SI Table 1).

Tumour related outcomes

Survival was reported in all three studies. OS rates of 85%, 92% and 100% were reported at 3.3, 5.0 and 8.0 years follow-up, respectively [47,48,49]. PFS, reported in two studies (n = 206), was 84% and 90% at 5.0 and 6.0 years, respectively [47, 49] (Table 2).

LFR, reported in two studies, was 22% and 15% at 3.3 and 5.0 years, respectively [48, 49]. DFR reported in one study was 0% at 3.3 years [48] (Table 2).

Toxicity related outcomes

Indelicato reported serious PBT-attributable late toxicities in seven patients (4%), most notably brainstem necrosis (treated with steroids), vasculopathy and second malignancy [49] (Table 3).

Across the studies, endocrine abnormalities were reported in 23% of patients assessed, including hypopituitarism [48], growth hormone deficiency [49] and cortisol insufficiency [47] (Table 4).

Reported in one study, there was no significant decline in neuro-cognitive outcomes (FSIQ, verbal comprehension or perceptual reasoning) at 5 years relative to baseline in the 12 patients (38%) assessed [47]. Visual acuity, assessed in 18 patients, was stable/improved relative to baseline in the 15 non-high-risk patients [47]. Ototoxicity was assessed in 174 patients. At 4.4 years, four patients (2%) had grade II partial hearing loss in one ear and one patient had grade III hearing loss with need for amplification [49] (Tables 5 and 6).

For HRQoL, Hug reported that, of 27 patients, none experienced a drop of more than 10% in the Lansky performance scale [48] (SI Table 2).

Germ cell tumours (GCT)

One single-arm retrospective case series by MacDonald reported the effects of PBT in 22 children (mean age 11 years) with newly diagnosed GCT [50]. Fifty-nine percent had germinoma and 41% non-germinomatous germ-cell tumours (NGGCT) (Tables 1 and 2). OS and PFS were 100% and 95%, respectively, at 2.3 years follow-up. No patients experienced a local failure, whilst DFR was 0% and 11% for germinoma and NGGCT patients, respectively (Table 2). Two patients (9%) experienced hypothyroidism and two (9%) required growth hormone replacement at 2.3 years. No patients developed RT-related diabetes insipidus (Table 4).

Pineoblastoma

One study by Farnia reported the effects of PBT in children with pineoblastoma [51]. Undertaken in a single institution between 1982 and 2012, this historical control study included 22 patients under 25 years, of whom 11 received PBT and 11 received photon RT and one gamma knife treatment. Median age was 7.7 years and 14.5 years for PBT and photon RT, respectively (Table 1). Survival and recurrence rates between PBT and photon RT were not statistically different (Table 2). Long-term toxicities, all of which occurred in patients treated with photon RT, included grade 3 cognitive decline (n = 3), grade 3 seizures (n = 1), grade 3 hearing impairment (n = 1) and grade 3 avascular necrosis of the femoral head (n = 1) (Tables 3, 5 and 6).

Discussion

The aim of this systematic review was to investigate if the published clinical evidence supports the assumptions derived from dosimetry studies of PBT compared with photon RT in terms of equivalent survival, improved quality of life and/or reduced long-term treatment sequelae. Furthermore, recommendations for improving the quality and consistency of output data are presented.

In order to minimise bias we have undertaken this systematic review according to Cochrane methodology, which is designed to produce a systematic review that is as free as possible from methodological flaws, is reproducible and transparent. Our scoping search identified three previous systematic reviews, however, all are out of date with searches up to 2014 [11,12,13]. The review by Laprie 2015 [11] was the most closely aligned to our review, with aims to examine PBT and photon RT in children with brain tumours. However, some of the methodology that they have used may have introduced bias, for example they only utilised the database Medline, only sought English language publications, did not have an a priori protocol, did not quality assess the included studies and their searches were up to 2014. Systematic review is a powerful tool, but is by nature a retrospective exercise and governed by the available evidence. In rapidly evolving fields such as PBT it is important that reviews are regularly updated to ensure that they include all of the evidence and are as up-to-date as possible.

Thirty-one full-text published studies involving 1731 children met our inclusion criteria. All but five studies [21, 32, 40, 41, 44] were conducted in the USA. Publication dates ranged from 2002 [48] to 2021 [43]. Studies were undertaken from 1982 [51] to 2018 [21]. Most of the patients were treated between the years 2000 and 2015, so the studies in this review are fairly similar in treatment era and any era differences within this data set may be small. There was one phase II single-arm study, six prospective case series studies (one of which was comparative) and 24 retrospective studies (seven of which were comparative). No RCTs were identified. Largely because of referral patterns in the USA, all the case series used opportunity sampling, i.e. data were based on patients referred to the proton centre routinely rather than as part of a specific PBT clinical trial, and in the retrospective studies data were derived mainly from patient records. Tumour types included: medulloblastoma (11 studies); ependymoma (5 studies); AT/RT (4 studies); craniopharyngioma (6 studies); LGG (3 studies); GCT (1 study) and pineoblastoma (1 study).

The studies were heterogeneous regarding aims and objectives, patient diagnoses, patient populations (some assessed younger patients) and outcomes. For this review we identified nine outcomes of interest. Four measured disease control (OS, PFS/RFS, LFR, DFR), four measured treatment related short- to long-term side effects (adverse events, endocrinopathy, ototoxicity, neurotoxicity), and one measured treatment related HRQoL. Across the studies OS was the most frequently reported outcome, followed by LFR and endocrinopathy. Adverse event reporting was inconsistent across the tumour types, making it impossible to assess the incidence across the dataset. However, some serious adverse events were reported, albeit in very small numbers, such as radio-necrosis, stroke and brainstem toxicity [24, 31,32,33,34,35,36, 38, 40, 45, 49]. The outcomes least reported were HRQoL, neurocognitive outcomes and ototoxicity. HRQoL was reported in just three tumour types (medulloblastoma, AT/RT, craniopharyngioma) and neurotoxicity in four tumour types (medulloblastoma, ependymoma, craniopharyngioma, LGG). Given that a reduction of late effects is the proposed key advantage of using PBT, it is disappointing that few studies reported these outcomes. Some study authors commented on the difficulty in obtaining long-term follow-up data, as many patients had travelled from other hospital facilities to receive PBT and long-term outcomes were either not evaluated at or not reported to the proton centres. The difficulty in acquiring long-term late effects and HRQoL data has been an issue for many paediatric cancer trials, including those which have included RT delay or avoidance. Prospective initiatives such as the USA Pediatric Proton Consortium Registry may yield more useful data in the future [52, 53] but may not be able to solve all these problems [54].

Ependymoma provided the most comprehensive dataset, both in terms of the number of outcomes measured and the proportion of patients in each study evaluated per outcome. The remaining tumour types were either inconsistent in terms of outcomes reported, only included a small percentage of the available patients across the outcomes or as in the case of GCT, pineoblastoma and AT/RT, were extremely limited in the number of patients available, therefore caution must be used in interpreting the results due to lack of power of the dataset.

OS was the most common outcome measure. Generally, for standard paediatric CNS indications, the rates of tumour control and hence cure are expected to be the same for protons as for photons. Most of the patients included in this review were newly diagnosed. Reported OS ranged from 68% to 100% depending on factors such as patient characteristics and follow-up times. However, without a randomised comparator it is not possible to “prove” whether PBT offers better, worse or equivalent disease control compared to photon RT. On the other hand, conducting survival equivalence randomised trials in a variety of different histological types with small patient numbers is probably not achievable. Taking into account the totality of radiobiological data and clinical experience, it is universally accepted that, considering the RBE of PBT, tumour control and hence OS are equivalent.

Our systematic review included eight comparative studies, but these utilised either historical [28, 30, 36, 42, 51] or opportunity controls [21,22,23]. The main problem with the use of historical controls is confounding due to temporal shifts in care [55], particularly in older historical controls [28, 42, 51]. This is particularly pertinent to radiotherapy practice, which has seen a shift from whole brain radiotherapy to more localised treatments that may have affected long-term adverse events and HRQoL. In addition, the multimodality of brain tumour treatment and improvements in delivering photon RT may also have had a substantial impact on disease control in historical comparisons. Temporal shifts may also have improved the accuracy of outcome assessment measures. For example, improvements in imaging may make adverse events such as radio-necrosis easier to identify, so that they appear more common in newer studies; this is a consideration when comparing PBT radio-necrosis event rates with those from historical controls treated with photon RT. In studies using opportunity controls, the main problem is selection bias, where patients not receiving PBT may not have been eligible to receive it and are therefore fundamentally different in terms of prognosis. This is exemplified by Sato, where 93% of patients receiving PBT had had a GTR at surgery compared to 76% of photon RT patients, indicating patients given photon RT were in the higher risk group, potentially biasing survival outcomes in favour of PBT [36].

Retrospective opportunity sampling also limits the type and methods of data collection. Across the studies, measurement and reporting of outcomes (particularly in patients with the same tumour type) were inconsistent, making between-study comparisons difficult. One study which reported outcomes measured both from diagnosis and from completion of PBT demonstrated a marked difference between the two time points, with 2-year OS at 68% when measured from diagnosis and 48% when measured from PBT, a difference of 20 percentage points [39]. By using prospective data collection, researchers can control what data are collected and the methods of collection. Utilising data from clinical trials investigating non-radiotherapy questions, such as the ongoing SIOP (International Society of Paediatric Oncology) Ependymoma II study [56] and the PNET5 study [57], which include patients treated with both PBT and photon RT, can allow better prospective control of data collection. Although non-randomised, data derived from prospective trials also come with associated radiation therapy quality assurance and provide more robust evidence on the relative outcomes, and may help to demonstrate equivalence or otherwise for tumour control and toxicities.

Description of patient populations was also inconsistent within the studies. Seven studies included patient populations comprising both newly diagnosed children receiving first-line therapy and those with recurrent disease, but failed to report patient baseline status or outcomes separately [28, 35, 42, 44,45,46, 48]. We originally planned to include studies with mixed tumour types provided data for individual tumours were reported. Three were identified [58,59,60]. However, after examining these studies we felt that reporting bias could be a factor: results were not reported consistently across the tumour types, raising the possibility that only exceptional results had been reported. We therefore excluded these studies.

For PBT centres publishing work on expanding cohorts, it is important to make clear which data have been previously reported, so that the data are not double counted in systematic reviews. Unique cohort identifiers, similar to the system employed for Randomised Controlled Trials [62], could help with this problem [61]. However, this may cause issues with getting studies published, as many journals follow the Ingelfinger rule, which stipulates that only new, previously unpublished data are published [63, 64]. Journals could help by allowing reports on expanding cohorts and encouraging authors to be transparent. This is particularly pertinent to rare disease research, where there are fewer patients available to study and where there is a tendency for specific specialist treatment centres to be research active and likely to report on expanding cohorts.

The medical literature has seen a great deal of debate on the necessity or ethical justification of conducting RCTs to evaluate PBT in children. Some commentators contend that equipoise does not apply, as the superior dose distributions associated with PBT must translate into improved patient outcomes, and therefore an RCT would be not only unnecessary but unethical [7]. Others argue that it is unethical to use a technology that has had insufficient controlled evaluation of clinically relevant benefit [7, 65]. As well as ethical considerations, differences in the development of radiotherapy treatment compared to drug development also provide challenges in evaluating clinical effectiveness [66, 67]. This may explain why previous paradigm shifts in RT delivery technology, such as IMRT, which has been widely implemented, were supported by relatively few RCTs in adults and none in children. The rarity of paediatric CNS tumours, the severity and delayed nature of many of the late effects, and the willingness of patients and families to undergo randomisation may also render RCTs with late effect endpoints impractical [7, 68]. It is, however, recognised that RCTs between PBT and photon therapy are being conducted or planned in adults with cancer, including the forthcoming APPROACH trial in adult patients with grade 2 and 3 oligodendroglioma, with neurocognitive function as an endpoint.

This review did not identify any published RCTs. We are therefore unable to answer our primary review questions regarding the effectiveness of PBT compared to other radiotherapy treatments, in particular photon RT, and its role in ameliorating long-term adverse events. Given the increasing use of PBT as standard of care for paediatric brain tumours, perhaps it is too late to ask this question. Indeed, in the UK the large majority of children with primary brain tumours receive radiotherapy with PBT as opposed to photon therapy, although this does not apply to many other countries worldwide. We may need to ask how we can maximise the use of PBT both in patients traditionally treated with radiotherapy and in patients thus far excluded, such as younger children. If this were the question, the current body of evidence would again have limitations, particularly given the haphazard nature of the research, with few proton centres reporting their activity. Problems with long-term follow-up of patients and little standardisation of the data collected and reported compound the limitations of the literature. These factors, highlighted in this review, stress the need for consistent and systematically collected data on all patients receiving PBT (both trial and non-trial patients) to monitor the effects of treatment, including short-term side effects such as radio-necrosis and long-term sequelae such as neuro-psychological dysfunction. This is necessary to fully inform clinicians, and thus patients and their families, of the likely treatment outcome. Indeed, such arguments should ideally also apply to children receiving photon radiotherapy, and such data may potentially offer a comparison of outcomes between the two techniques, albeit in a non-randomised setting. Such comparisons could be the subject of future systematic reviews.

Registries may be one model for collecting such data, and they are a growing area, especially with the development of ‘big data’ techniques for analysis [69]. The success of these ventures is reliant upon the accuracy and consistency of the data input, as well as the continued engagement of stakeholders, especially patients, parents and referring teams, and of course sufficient long-term funding. Alongside comprehensive prospective databases, there also needs to be a well thought out publications strategy to avoid data duplication or double counting if separate research teams access a single data source. Although, as discussed above, RCTs that directly compare PBT with photon therapy in children with CNS tumours are unlikely, RCTs are potentially more feasible with respect to important PBT questions such as delivery techniques (e.g. proton arc therapy), dose and volume, and these are to be encouraged.

In conclusion, this review provides a summary of the available data on PBT delivered for a range of CNS tumours arising in children. PBT has been widely implemented in many high-income countries for the treatment of children with cancer, including many with CNS tumours. However, in order for the implementation of PBT to continue to evolve, areas where the quality of data could be improved have been highlighted. This may be useful in the context of health systems where cost or geographic access to PBT are issues. Furthermore, improved outcome data, particularly with respect to late effects, could inform the continued evolution of the standard indications for PBT.