Comparative Analysis of Radiotherapy Linear Accelerator Downtime and Failure Modes in the UK, Nigeria and Botswana

The lack of radiotherapy linear accelerators (linacs) in low- and middle-income countries (LMICs) has been recognised as a major barrier to providing quality cancer care in these regions, together with a shortfall in the number of highly qualified personnel. It is expected that additional challenges will be faced in operating precise, high-technology radiotherapy equipment in these environments, and anecdotal evidence suggests that linacs have greater downtime and higher failure rates of components than their counterparts in high-income countries. To guide future developments, such as the design of a linac tailored for use in LMIC environments, it is important to take a data-driven approach to any re-engineering of the technology. However, no detailed statistical data on linac downtime and failure modes have been previously collected or presented in the literature. This work presents the first known comparative analysis of failure modes and downtime of current generation linacs in radiotherapy centres, with the aim of determining any correlations between linac environment and performance. Logbooks kept by radiotherapy personnel on the operation of their linac were obtained and analysed from centres in Oxford (UK), Abuja, Benin, Enugu, Lagos, Sokoto (Nigeria) and Gaborone (Botswana). By deconstructing the linac into 12 different subsystems, it was found that the vacuum subsystem only failed in the LMIC centres and the failure rate in an LMIC environment was more than twice as large in six of the 12 subsystems compared with the high-income country. Additionally, it was shown that despite accounting for only 3.4% of the total number of faults, linac faults that took more than 1 h to repair accounted for 74.6% of the total downtime. The results of this study inform future attempts to mitigate the problems affecting linacs in LMIC environments.


Introduction
Radiation therapy is a critical component for treating and relieving the symptoms of cancer and is useful in half of all cancer cases [1]. There is, however, a global disparity in the access to radiotherapy; in 2012, over 50% of the approximately 4.0 million cancer patients in low- and middle-income countries (LMICs) who required radiotherapy were unable to access such treatment [2,3]. With many LMICs having inadequate or, in many cases, no radiation therapy centres, it is projected that to meet the LMIC radiotherapy demand over the next two to three decades, there is a need for around 12 600 radiation therapy machines [4].
Radiotherapy can be delivered via a radioactive source, typically cobalt-60, or by accelerating electrons in a linear accelerator (linac), producing X-rays by colliding the electron beam with a tungsten target. Although both technologies are mature and offer a range of benefits and drawbacks as a solution for providing external beam radiotherapy [5], it is argued by Coleman et al. [6] that for reasons of security and safety, radiation delivered using a linac is the most effective solution to the radiotherapy burden in LMICs. Current generation linacs, however, experience significant downtime in LMICs as they face challenges in these environments that they are not designed to manage. Their performance is adversely affected by regular interruptions to the energy supply, a lack of air temperature control in buildings and weak health care systems [7].
Tackling the radiotherapy burden in LMICs is a complex task that requires multidisciplinary collaboration [8–10]. An International Cancer Expert Corps-sponsored workshop held on the CERN campus in 2016 invited experts from fields including oncology and accelerator physics to consider future options, such as innovative technology, for tackling this global problem [11]. The absence of detailed statistical data on linac downtime and failure modes, however, prevents the determination of the exact effect of the LMIC environment and its challenges on the performance of current linac technology.
The aim of this work was to quantitatively determine the effect of environment on linac performance. Failure mode data from 14 current generation linacs in the UK, Nigeria and Botswana were obtained and analysed; this sample offers variations in both socioeconomic and physical environments and so provides an independent variable against which linac performance can be compared. The conclusions from this analysis allow for recommendations towards linac designs that are optimised for performance in challenging environments.

Collection and Sampling of Linac Performance Data
The study used data obtained from 14 current generation linacs: six from Oxford (UK), six from across Nigeria and two from Gaborone (Botswana), as detailed in Table 1. As the linacs studied do not record or log their own performance data for local analysis, in order to analyse the linac failure modes and downtimes, data on machine performance were obtained from notes recorded by radiotherapists, medical physicists and engineers in logbooks at each institution. A typical entry would include the date and time of the fault, details on any interlocks and inhibits observed, how the linac was repaired and the amount of downtime the fault caused. There were, however, variations between the centres in both the level of detail in the description and whether all faults were recorded or only the most severe; for instance, an average of 250 faults were recorded per linac per year in Oxford compared with just over 4 in Sokoto. In the case of incomplete information in the logbook, the most likely scenario was estimated based on other logbook entries and, where possible, this was made more accurate by liaising with the authors of the logbooks. It was assumed that all centres recorded all faults that caused more than 1 h of downtime.
This sample was chosen as there is variation in the environment with which linac downtime and failure modes can be compared, particularly in determining the effect of variation in socioeconomic status [as of 2019, the World Bank classes the UK as a high-income country (HIC), Botswana as an upper middle-income country and Nigeria as a lower middle-income country] and physical factors (for example, the variation in climate and stability of power supply between the countries) on linac performance. There were also strong collaborative links to each of the centres. It should be noted that in this sample, all HIC linacs were from one vendor, whereas all LMIC linacs were from another. Additionally, the data available covered different periods of the lifetime of the linacs and so linacs are not compared throughout the same stage of their life.

Linac Subsystems
In order to compare the failure modes, rates and downtime of the radiotherapy machines between the different centres, the linac was deconstructed into 12 different subsystems and each fault assigned one of seven causes, as detailed in Table 2. Every entry recorded by the radiotherapy centre was assigned to a subsystem and given an overall fault cause. By estimating the downtime for each centre, a failure rate per 1000 hours of uptime was calculated for each subsystem. The downtimes and failure rates of each subsystem were analysed to determine the effect of linac environment on performance.

Analysis of Downtime in Oxford
Linac fault logs are typically concise notes recorded by relevant members of staff. Data are not currently recorded systematically enough to allow for automated analysis. To analyse the large amount of data available (i.e. 11 875 faults recorded across six Oxford linacs over a 7.5-year period), the dataset was sampled in two ways. First, only faults affecting the linac and multileaf collimator (MLC), as detailed in Table 2, were analysed. The omitted systems included additional imaging systems (kV and MV), additional positioning and targeting (patient position management, respiratory gating), other systems (computed tomography scanners) and communication and computing issues beyond those detailed in Table 2 (DICOM). As the provision of these systems differs between environments, their omission from the analysis gives a more direct comparison of linac performance between centres.
The second sampling technique was to only analyse the most severe faults that caused more than 1 h of downtime. The justification is clear when binning the data according to its impact on downtime [12], where A < 5 min, 5 min < B < 60 min, C > 60 min. Table 3 shows that category C faults accounted for 74.6% of all downtime in the case of linac faults and 46.8% of all downtime in the case of MLC faults. As category C faults were the biggest contributors to downtime and reduce the dataset to a size that allows for manual analysis, in this study we focused solely on these faults.
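The binning described above can be sketched in a few lines of Python. This is a minimal illustration and not the authors' analysis code: the function names and the example downtimes are hypothetical, and because the category definitions leave the boundaries open, faults of exactly 5 min or 60 min are assumed here to fall into category B.

```python
from collections import defaultdict


def categorise(downtime_min):
    """Assign a severity category from the downtime in minutes.

    Assumption (not stated in the text): faults of exactly 5 min or
    exactly 60 min are placed in category B.
    """
    if downtime_min < 5:
        return "A"
    if downtime_min <= 60:
        return "B"
    return "C"


def category_shares(downtimes_min):
    """Return {category: (share of fault count, share of total downtime)}."""
    if not downtimes_min or sum(downtimes_min) <= 0:
        raise ValueError("need at least one fault with non-zero downtime")
    counts = defaultdict(int)
    minutes = defaultdict(float)
    for d in downtimes_min:
        cat = categorise(d)
        counts[cat] += 1
        minutes[cat] += d
    n_faults = len(downtimes_min)
    total_minutes = sum(minutes.values())
    return {
        cat: (counts[cat] / n_faults, minutes[cat] / total_minutes)
        for cat in "ABC"
    }


# Hypothetical logbook downtimes in minutes (illustrative values only).
example = [2, 3, 4, 10, 30, 240]
shares = category_shares(example)
print(shares["C"])  # (fraction of faults, fraction of downtime) for category C
```

In this made-up example a single category C fault is one of six entries but carries most of the downtime, which is the pattern Table 3 reports for the Oxford data.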
Trends may exist between the occurrence of a more minor category A or B fault and the probability of a more severe category C fault occurring in the near future and this may be useful in a study on preventative maintenance. However, as category A and B faults were not always recorded in LMIC centres, we are only able to compare category C faults at present.

An Overview of Downtime and Failure Rate Differences
The linac performances were grouped by country, but we refer to the UK as a HIC and Nigeria and Botswana as LMICs.
To determine how the environment a linac operates in affects its performance, we analysed two factors: the downtime and the rate of failure for each subsystem. Downtime is defined as the median downtime of each subsystem in each linac. Where more than one linac existed in a centre, the mean of these values is given.
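As an illustration of this definition, the sketch below takes the median downtime per linac for one subsystem and then averages those medians across the linacs of a centre. The function name, the linac labels and the downtimes are hypothetical and serve only to show the order of the two operations.

```python
from statistics import mean, median


def centre_downtime(downtimes_by_linac):
    """Downtime figure for one subsystem at one centre.

    downtimes_by_linac maps a linac label to the list of downtimes (in hours)
    of that subsystem's faults. The median is taken per linac, then the mean
    of those medians is returned. Linacs with no recorded faults are skipped;
    None is returned if no linac has any fault.
    """
    medians = [median(values) for values in downtimes_by_linac.values() if values]
    return mean(medians) if medians else None


# Hypothetical downtimes in hours for one subsystem at a two-linac centre.
example = {
    "linac_1": [1.5, 2.0, 30.0],  # one long outage
    "linac_2": [1.0, 3.0],
}
print(centre_downtime(example))
```

Taking the median first keeps a single very long outage, such as the 30 h entry above, from dominating the figure for a linac.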
The failure rate must be analysed with respect to the expected uptime of the linac. The uptime assumes a typical number of hours a linac would treat per week: based on discussions with the radiotherapy personnel, this was taken to be 50 h every week for Oxford and 40 h every week for all other centres. A superior measurement would be the failure rate per patients treated, but differing patient loads and ethical considerations of patient records make this impractical at present.
The failure rate per hours of uptime (inversely proportional to the mean time between failure) was calculated by dividing the total number of category C faults by the total uptime of each centre (Table 4). By dividing by the uptime of each centre, the contextual issues that influence linac availability do not affect the failure rate results.
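The calculation can be written out as a short Python sketch. The function names and the example figures are hypothetical; only the assumed weekly treatment hours (50 h for Oxford, 40 h elsewhere) come from the text.

```python
def uptime_hours(weeks_logged, hours_per_week):
    """Expected uptime from the logged period and the assumed weekly hours."""
    return weeks_logged * hours_per_week


def failure_rate_per_1000h(n_faults, uptime_h):
    """Number of category C faults per 1000 h of uptime."""
    if uptime_h <= 0:
        raise ValueError("uptime must be positive")
    return 1000.0 * n_faults / uptime_h


def mtbf_hours(n_faults, uptime_h):
    """Mean time between failures in hours, the reciprocal of the failure rate."""
    if n_faults == 0:
        return float("inf")  # no failure observed in the logged period
    return uptime_h / n_faults


# Hypothetical centre: 100 logged weeks at 40 h per week, 8 category C faults.
uptime = uptime_hours(100, 40)             # 4000 h
print(failure_rate_per_1000h(8, uptime))   # 2.0 faults per 1000 h
print(mtbf_hours(8, uptime))               # 500.0 h
```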
However, contextual issues will necessarily affect downtime results, as centres spend different amounts of time waiting for parts and waiting for engineers, and centres have engineers of varying experience. The ways in which this contextual information affects conclusions are discussed below.
Figure 1 shows the mean downtime of each of the linac subsystems as well as their failure rates per 1000 h of uptime. The mean downtime in each subsystem was comparable between the Botswana and UK centres but significantly larger in Nigeria for most subsystems. This seems to reflect the different service contracts the countries had in place: the Oxford centre had a full parts contract with the manufacturer, the Gaborone centre had a full parts and service contract with the manufacturer, whereas the Nigerian centres did not have either a parts or a service contract. The mean downtime in Nigeria was significantly larger for the diagnostics, RF power and vacuum subsystems. This is perhaps related to the significant cost to replace the ion chamber, thyratron/magnetron and ion pump. Such contextual issues are discussed below.
The failure rate was greater in LMIC environments for all subsystems except for the beam, positioning and gun. The air, cooling and generator subsystem and the vacuum subsystem are discussed in detail in the following sections. Other results include:
Computing: This subsystem failed more than nine times as often in Nigeria and Botswana as in Oxford. In the LMIC environments, computing equipment was not as readily available as in Oxford. This means that faults could be escalated to category C, requiring complex repairs rather than a simpler (but more expensive) replacement.
Couch and external door: The failure rate in the LMIC environments was three times greater than in Oxford. This subsystem is affected by power cuts (and the subsequent surges when the power returns), which blow fuses that are not trivial to find or replace. There were more door issues in LMICs due to their linacs requiring a large mechanical door for shielding and safety purposes, whereas the infrared sensor systems used in Oxford seemed to generate fewer issues.
RF power: This subsystem failed twice as often in Nigeria; however, this result was skewed by an outlier from the Abuja (2017) linac (arising from three failures of the 10 A fuse in the thyratron pulse assembly during the short period for which data were available), which significantly increased the mean downtime. LMIC centres had more RF power faults caused by power issues than HIC centres.
Gantry: There were four times more failures of this subsystem in LMICs. There were only three category C gantry faults in the Oxford data; this small number may be due to more frequent planned maintenance of the gantry system in Oxford, quicker repairing of gantry faults (meaning a higher proportion of faults are category B rather than C) or this could be a vendor difference.
MLC: Although the failure rate was at least four times greater in the LMIC environments, the fact that only the Oxford, Abuja (2017) and Gaborone (2015) linacs had MLCs contributes to the large apparent disparity between the two environments. The data available for the two LMICs were from their date of installation, compared with 4 years after the installation of the linacs in the HIC. As a result, the data in the LMICs may have been skewed by 'early failures'. Furthermore, the data available for the Abuja (2017) and Gaborone (2015) linacs were limited (1902 and 4583 h, respectively) and thus statistical fluctuations had a large impact on the calculated failure rate. To compare the performance of the MLC between environments, more data should be collected.
Diagnostic: The rate of failure was comparable between the environments. The rate of failure of the ion chamber itself was very consistent between the environments; the slight difference was caused by more failures of the board equipment relating to the ion chamber in Gaborone.
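The MLC discussion above notes that with only 1902 h and 4583 h of data, statistical fluctuations have a large impact on the calculated failure rate. A rough way to see this, which is not part of the original analysis, is to treat the fault count as a Poisson variable, whose standard deviation is the square root of the count. The sketch below uses the two uptimes quoted in the text with a hypothetical fault count; the function name is illustrative, and the square-root approximation is itself crude for very small counts.

```python
import math


def rate_with_uncertainty(n_faults, uptime_h):
    """Failure rate per 1000 h and its approximate one-sigma uncertainty.

    Assumption: faults are independent, so the count is Poisson distributed
    and its standard deviation is sqrt(n_faults).
    """
    if uptime_h <= 0:
        raise ValueError("uptime must be positive")
    rate = 1000.0 * n_faults / uptime_h
    sigma = 1000.0 * math.sqrt(n_faults) / uptime_h
    return rate, sigma


# Uptimes are taken from the text; the fault count of 4 is hypothetical.
for uptime_h in (1902, 4583):
    rate, sigma = rate_with_uncertainty(4, uptime_h)
    print(f"{uptime_h} h: {rate:.2f} +/- {sigma:.2f} faults per 1000 h")
```

With a count of four the relative uncertainty is 50% whatever the uptime, which illustrates why more data are needed before MLC failure rates can be compared between environments.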
A few subsystems seemed to fail more frequently in the HIC than the LMICs: beam, positioning and gun subsystems.
Beam: The beam failure rate may have been greater in the HIC because these issues were always recorded by Oxford, whereas they were not necessarily always recorded at other centres due to the nature of the issue.
Positioning: The failure rate was slightly greater in the HIC data and this seemed to be due to Oxford having a greater number of issues with their position read out (PRO), secondary PRO (SPRO) and encoders. This may result from vendor differences or tighter tolerances imposed in the Oxford centre.
Gun: A higher HIC failure rate was probably due to the difference in design between the vendors. In Oxford, the gun had 17 category C faults across the six linacs for issues requiring it to be re-potted or replaced. By contrast, the only comparable issue the gun had in the LMIC datasets was that it required replacing twice on the Gaborone (2001) linac. This highlights the importance of the design of the gun subsystem.

Air, Cooling and Generator Subsystem
As shown in Figure 1, air, cooling and generator faults were at least three times as frequent in the LMIC environments as in Oxford. Figure 2 (upper) shows a breakdown of the failure rate of the different linacs studied. The most prominent failures for this subsystem were mechanical and external failures. The mechanical failures were mostly leaking pipes and low water and gas pressures. All centres had an external chiller, yet it is evident from Figure 2 (lower) that the chillers failed more often in LMIC environments, perhaps due to operating in a hotter, dustier environment. Active maintenance was observed at the Oxford centre with weekly checks and observations carried out on the chiller by local engineers; similar procedures at all centres could improve uptime. Chiller and generator maintenance was often subcontracted out in LMICs and thus the faults were not necessarily recorded in logbooks (which may explain the absence of reported chiller and generator faults in Benin).
The power supply differed between environments. The Benin, Enugu, Lagos and Sokoto centres were solely powered by generators to circumvent the frequent power cuts resulting from the instability of the grid power supply. In Abuja, the grid was used with a dedicated back-up generator. The use of generators created an additional single-point failure mode; if the generator was down (reported issues include running out of fuel and fires), so was the linac.
Generators will probably be necessary in LMIC environments in the future, but their implementation needs careful planning to avoid extra downtime. This is evidenced in a study on radiotherapy in Botswana [13], where there was a clear increase in unplanned downtime resulting from changing from the more stable power supply of South Africa to that of Botswana from 2012 onwards. The implications of power failure on the linac and recommendations for managing this are discussed below.

Vacuum Subsystem
Figure 3 displays the failure rate of the vacuum subsystem and it is this subsystem that had the most striking difference between the HIC and LMIC environments. There were no recorded failures in any of the six HIC linacs, whereas there were recorded faults in all linacs at the LMIC centres. The failure of the vacuum is not a trivial issue: depending on the amount of contamination, the level of vacuum to recover and any damage to pumps, a failure can cause hours to weeks of downtime. This is a clear environmental factor that is not experienced in HICs and affects the performance of the linac.
The vacuum is susceptible to multiple routes of failure as a result of interruptions to the power supply. Irregular power supplies can affect the temperature regulation of the linac, leading to overheating and the vacuum pressure drifting. Power surges cause fuses to fail, affecting many subsystems and components, including the ion pump. Finally, a common and dramatic failure mode is the loss of power to a backing pump, leaving a (poorly maintained) ion pump to support the vacuum. The ion pump fights a losing battle trying to keep the vacuum and eventually overheats and fails, causing a total loss of vacuum to atmosphere. The linac must then be brought back down to vacuum and the (expensive) ion pump must be replaced.

Downtime in Enugu
Contextual factors, including the number and skill of local engineers, ease of access to spare parts and the level of contractual support available all affect the downtime of the linac. This is particularly evident in Figure 4, which visually represents the 54.7% downtime experienced by the Enugu (2007) linac. This figure agrees with the qualitative experiences of downtime discussed by Reichenvater and Matias [14].
The overall downtime is dominated by a few long periods rather than many frequent, small periods. The Enugu machine was initially installed in 2007 but vandalisation (scavenging the system for valuable parts) and a fire delayed the first patient treatments until 2011. After the period of downtime from May 2014, the formation of a private–public partnership in 2017 enabled the centre to start running again. The reasons for the long periods of downtime include:
Waiting for parts. It was reported by multiple LMIC centres without service contracts, including Enugu, that the administrative process of sourcing funds for spare parts can take so long that exchange rate fluctuations make quotes invalid. The whole, lengthy internal process must then be repeated.
Waiting for specialist engineers who can assist with troubleshooting and diagnosing a fault, or performing a complex repair. Local engineers may have difficulty troubleshooting linac failures because they have no experience in linac maintenance. Some centres cannot afford to send them on the vendor recommended training courses for linac engineers so they are trained 'in-house' in the country. They also struggle to interpret the interlocks and inhibits reported by the machine when a fault occurs.
Re-establishing the centre after a long outage. When the machine has been down for a long time, patients are referred for treatment elsewhere. After repair, the centre must go through a lengthy administrative process for operating as a treatment centre again and this is not a trivial issue.

Discussion
The results of this paper are based on an analysis of logbooks and databases kept by radiotherapy personnel. The results obtained show key differences in failure rate between HIC and LMIC environments for the vacuum subsystem and the air, cooling and generator subsystem. Maintaining vacuum during power shortages is critical. In Abuja, the local engineer has built an uninterruptible power supply (UPS) that supports the vacuum during periods of power failure; a similar system could be incorporated into the design of the machine. The linac should also be designed so that it shuts down safely when power is absent, with a passive valve to maintain vacuum and preserve the ion pump. A sealed vacuum unit that requires minimal to no pumping could be an excellent solution to the problem of maintaining the vacuum in periods of power outage. The results also show a reduced failure rate when the generators and chillers are regularly maintained and observed and this is recommended at all centres.
We note that differences in recording practices (severity of faults recorded, detail recorded and length and frequency of periods where no log is kept) mean that there is a systematic difference in the recording of faults. An improvement in logbook keeping practice would improve further studies.

Conclusion
This study presents an analysis and comparison of the performance of linacs between different environments based on logbooks and databases. It is shown that the air, cooling and generator, computing, couch and door, RF power and vacuum subsystems all seem to have significantly different rates of failure between HIC and LMIC environments and the underlying reasons for these different rates are discussed. Furthermore, it is shown that the reliance of LMICs on generators makes faults of the generators themselves a significant cause of failure. The unstable power supplies in LMICs can affect other subsystems, most notably the vacuum. Contextual issues are also discussed, including how waiting for replacement parts, the skill and experience of local engineers and slow internal processes all have a very significant impact on linac downtime. Recommendations are made regarding design adjustments that could improve linac performance in LMICs.