Cost-effectiveness

Introduction: Proton radiotherapy (PT) is a promising but more expensive strategy than photon radiotherapy (XRT) for the treatment of non-small cell lung cancer (NSCLC). PT is probably not cost-effective for all patients. Therefore, patients can be selected using normal tissue complication probability (NTCP) models with predefined criteria. This study aimed to explore the cost-effectiveness of three treatment strategies for patients with stage III NSCLC: 1. photon radiotherapy for all patients (XRT All); 2. PT for all patients (PT All); 3. PT for selected patients (PT Individualized). Methods: A decision-analytical model was constructed to estimate and compare costs and quality-adjusted life years (QALYs) of all strategies. Three radiation-related toxicities were included: dyspnea, dysphagia and cardiotoxicity. Costs and QALYs were incorporated for grade 2 and grade ≥3 toxicities separately. Incremental Cost-Effectiveness Ratios (ICERs) were calculated and compared to a threshold value of €80,000. Additionally, scenario, sensitivity and value of information analyses were performed. Results: PT All yielded the most QALYs, but was also the most expensive. XRT All was the least effective and least expensive strategy, and the most cost-effective strategy. For thresholds higher than €163,467 per QALY gained, PT Individualized was cost-effective. When assuming equal minutes per fraction (15 minutes) for PT and XRT, PT Individualized was considered the most cost-effective strategy (ICER: €76,299). Conclusion: Currently, PT is not cost-effective for all patients, nor for patients selected on the current NTCP models used in the Dutch indication protocol. However, with improved clinical experience, personnel and treatment costs of PT can decrease over time, which could make PT Individualized, with optimal patient selection, a cost-effective strategy.


Introduction
Proton radiotherapy (PT) is a potentially beneficial (e.g. reduced toxicities) but significantly more expensive treatment strategy for patients with stage I-III non-small cell lung cancer (NSCLC) compared to photon-based radiotherapy (XRT). 1,2 Furthermore, treatment capacity is limited. 3 Considering the limited treatment capacity, the costs and the substantial number of patients with NSCLC, great attention should be paid to optimal patient selection, since not all patients will benefit from PT. Hence, using PT instead of XRT without optimal patient selection would lead to unnecessarily high costs and to inefficient and unfair healthcare provision.
Currently, XRT with or without chemotherapy is the main treatment modality for the management of stage III NSCLC. 4 Considering the proximity of the target volume to the lungs, esophagus and the heart, XRT causes dyspnea in 10% of the lung cancer patients and grade ≥3 dysphagia in approximately 5% of the patients. A larger group of patients experiences grade 2 symptoms of dyspnea (18%) and dysphagia (22%). 5 Cardiotoxicity is a less investigated toxicity but has recently been recognized as a matter of concern, also for lung cancer patients. [6][7][8] Cardiac events occur in 33% of the patients within five years after diagnosis. 9 An advantage of PT is its favorable in-depth dose distribution, which reduces treatment toxicity by minimizing the radiation exposure of surrounding normal tissues relative to the target dose. 10 A reduction of toxicities could limit the impact on health-related quality of life and could lower the costs of toxicity management, which could potentially reduce the overall costs for PT. 11 However, despite the clinical benefits, effects are not likely to outweigh the costs for the total lung cancer population receiving radiotherapy. Optimal patient selection is therefore of great importance to use PT in a cost-effective way. 12 At present, patient selection is based on normal tissue complication probability (NTCP) models, which estimate the risk reduction in toxicity of healthy tissue achieved by using PT compared to XRT (∆NTCP). 3 Using predefined ∆NTCP criteria as described, for example, in the Dutch indication protocol, patients can be selected for PT. 13 This "model-based" approach for patient selection as described by the Dutch Healthcare Institute uses fixed thresholds per grade of toxicity and can be incorporated in a decision-analytical model to explore the cost-effectiveness of PT for selected patients, compared to XRT. 3
The cost-effectiveness of PT versus XRT for patients with early stage NSCLC was explored in 2010 using a decision-analytical model. 14 However, no studies have yet been performed on the cost-effectiveness of PT versus XRT for patients with stage III NSCLC using the "model-based" approach to select patients.
Patients with stage III NSCLC are currently being considered as potentially eligible for PT based on planning studies. Hence, a decision-analytical state-transition model was developed to investigate whether the additional effects of PT for all patients or PT for selected patients are worth the extra costs for patients with stage III NSCLC, compared to XRT. With this decision-analytical modeling technique, evidence from various sources for probabilities, costs and utilities can be synthesized in order to inform decisions and to reflect the associated decision uncertainty. 15 Therefore, the aim of the study was to explore the cost-effectiveness of the model-based approach for selecting patients with NSCLC for PT and XRT.
Although Dutch indication protocols are used, PT capacity is limited globally and patient selection remains fundamental. Therefore, this study is likely to be relevant to other countries and can serve as a methodological approach to avoid unnecessarily high costs and inefficient and unfair healthcare provision.

State-transition model approach and structure
A probabilistic decision-analytical state-transition model was used to assess the cost-effectiveness of three treatment strategies for patients with stage III NSCLC: 1. XRT for all patients (XRT All); 2. PT for all patients (PT All); 3. PT for selected patients (PT Individualized). In the third treatment strategy, eligibility of patients was based on the model-based approach. Currently, models for dyspnea 16 , pneumonitis 17 , dysphagia 18 and mortality related to heart dose 19 have been published. For this exploratory analysis, three out of the four models (pneumonitis, dysphagia and mortality related to heart dose) from the Dutch indication protocol were used. 13 Accordingly, patients were eligible for PT according to ∆NTCP criteria and ineligible patients received XRT. Patients are eligible for PT with a clinically relevant ∆NTCP of ≥10% for grade 2 toxicities, ≥5% for grade 3 toxicities and ≥2% for grade 4-5 toxicities. In case of multiple complications, the sum of ∆NTCP should be at least 15%, 7.5% and 3%, respectively. 13 While the Dutch indication criteria are determined by the models to predict pneumonitis, dysphagia and mortality related to heart dose, in this exploratory analysis the model to predict pneumonitis was replaced by the model predicting dyspnea.
The latter model incorporated baseline dyspnea, which is considered important for the face validity of the model.
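The selection rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the implementation used in the study: the function name, the dictionary layout and the reading that ∆NTCP values are summed across complications of the same grade are assumptions, and only the threshold values are taken from the text.

```python
# Thresholds in percentage points, as quoted from the Dutch indication protocol.
SINGLE_THRESHOLD = {"grade2": 10.0, "grade3": 5.0, "grade4_5": 2.0}
SUMMED_THRESHOLD = {"grade2": 15.0, "grade3": 7.5, "grade4_5": 3.0}


def eligible_for_pt(delta_ntcp):
    """Return True if a patient meets the delta-NTCP criteria for PT.

    delta_ntcp maps a grade key to a list of delta-NTCP values (percentage
    points, XRT minus PT), one value per complication of that grade.
    Hypothetical data layout, chosen for illustration only.
    """
    for grade, values in delta_ntcp.items():
        if not values:
            continue
        # Criterion 1: a single complication reaches the per-grade threshold.
        if max(values) >= SINGLE_THRESHOLD[grade]:
            return True
        # Criterion 2 (assumed reading): several complications of the same
        # grade together reach the summed threshold.
        if len(values) > 1 and sum(values) >= SUMMED_THRESHOLD[grade]:
            return True
    return False
```

For example, a patient with two grade 2 complications of 8 percentage points each would qualify through the summed criterion (16 ≥ 15), although neither complication reaches 10 on its own.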
The state-transition model was constructed using cohort simulation and adopted a cycle length of three months. According to the Dutch health economic guidelines, a life-time time horizon, a societal perspective and discount rates of 4.0% for costs and 1.5% for effects were applied. 20 The model was divided into two time frames. Within six months, health states were based on whether patients experienced acute toxicities (dyspnea or dysphagia) with or without disease progression. From six months onwards, patients could be free of toxicities, but could also develop cardiotoxicity, dyspnea or cardiotoxicity concurrently with dyspnea, all with or without disease progression (Figure 1). Disease progression was defined as loco-regional or distant dissemination of cancer cells. "Death" was the absorbing health state in the model, either due to cancer or due to other causes.
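To make the combination of three-month cycles and annual discount rates concrete, the following Python sketch discounts a stream of per-cycle values. It is an illustration under stated assumptions: the function names are hypothetical, cycle 0 is left undiscounted, and no half-cycle correction is applied, none of which is specified in the text.

```python
CYCLE_YEARS = 0.25  # three-month cycle length, as in the model


def discount_factor(annual_rate, cycle):
    """Discount factor for a given cycle index (cycle 0 = start of the model)."""
    years = cycle * CYCLE_YEARS
    return 1.0 / (1.0 + annual_rate) ** years


def discounted_total(values_per_cycle, annual_rate):
    """Sum per-cycle costs or QALYs after discounting each cycle."""
    return sum(
        value * discount_factor(annual_rate, cycle)
        for cycle, value in enumerate(values_per_cycle)
    )
```

Costs would be passed with `annual_rate=0.04` and effects with `annual_rate=0.015`, following the rates stated above.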
Toxicity was graded according to the Common Terminology Criteria for Adverse Events (CTCAE) 3.0. 21 Grade 2 was set as the cut-off point. Costs and resource use for toxicities were incorporated for grade 2 and grade ≥3 separately to reflect toxicity costs and resource use with more granularity. The occurrence ratio of grade 2 and grade ≥3 toxicities, based on transition probabilities, was used to calculate a weighted cost average. Microsoft Office Excel 2016 was used to implement the state-transition model.
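The weighted cost average can be written out as follows. This Python sketch assumes that the weights are the shares of grade 2 and grade ≥3 events among all grade ≥2 events; the function name and the example figures are hypothetical.

```python
def weighted_toxicity_cost(p_grade2, p_grade3plus, cost_grade2, cost_grade3plus):
    """Average cost per toxicity event, weighted by the grade 2 : grade >=3 ratio."""
    total = p_grade2 + p_grade3plus
    if total <= 0:
        raise ValueError("at least one probability must be positive")
    share_grade2 = p_grade2 / total
    return share_grade2 * cost_grade2 + (1.0 - share_grade2) * cost_grade3plus
```

With illustrative probabilities of 0.3 and 0.1 and illustrative costs of €1,000 and €3,000, three quarters of the events are grade 2, so the weighted cost is 0.75 × 1,000 + 0.25 × 3,000 = €1,500.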

Model assumptions
Regarding the model, several assumptions were made:
1. Within the first six months after treatment, patients could only experience one toxicity at a time.
2. The probability of late onset dyspnea and dysphagia after six months was not taken into account. NTCP models only accurately predict dyspnea onset until six months. Acute dysphagia is most common in the first 2-3 weeks after treatment initiation. 22 Based on expert opinion, occurrence of dysphagia after six months is rare.
3. For patients with dyspnea and/or cardiotoxicity at six months, it was assumed that these toxicities were irreversible.
4. After experiencing disease progression, either in the acute or the late time frame, patients entered a progressed disease state reflecting the same toxicities as before disease progression.
Figure 1. Schematic representation of the Markov model structure. The model is divided into an acute (<6 months) and a late (≥6 months) time frame. Additionally, the model consists of a progression free and a progressed disease part. Grading of toxicities was not used in the schematic model structure in order to prevent unnecessary complexity. PF = Progression free; PD = Progressed disease.

Transition probabilities
NTCP models for dyspnea 16 , dysphagia 18 and two-year mortality related to heart dose (used to adjust OS until two years) 19 were used to derive treatment dependent transition probabilities (table 1) of grade 2 and grade ≥3 toxicities using dosimetric parameters and patient characteristics (appendix A). Dosimetric parameters were extracted from results of a multicentric in silico clinical trial (ROCOCO) 23 , which compared XRT and PT for patients with stage I-III NSCLC (72% stage III patients). Patient characteristics (age, gender, smoking, WHO performance status, chemotherapy 24 , GTV 25 and baseline dyspnea 26 ) were extracted from literature. However, these studies only reported mean values. Therefore, patient values were randomly sampled from a Gaussian distribution (patient characteristics) or Gamma distribution (dosimetric parameters and Gross Tumor Volume (GTV)).
Treatment-independent transition probabilities were estimated for cardiac events (cardiotoxicity), as these are expected to have a substantial impact on costs and consequences. These probabilities were derived from time to event data from a study by Degens et al. 10 This study investigated the incidence of cardiac events in patients with stage III NSCLC within five years after completion of conformal radiotherapy. 9 The probability of cardiotoxicity was based on the three most common categories: arrhythmia (43.9%), heart failure (HF) (including valve defects) (33.9%) and ischemic heart disease (IHD) (22.2%). 9 From the time to event dataset, a Kaplan-Meier curve of these merged categories was generated in R Statistical Software (version 3.5.1) to calculate time-dependent transition probabilities (appendix C). The randomized Phase III NVALT-11/DLCRG-02 study was used to extract overall survival (OS) and progression free survival (PFS) rates with a median follow-up time of 51.3 months (95% CI: 47.5-60.2 months). 27 Additionally, Table 1 presents transition probabilities derived from NTCP models. Detailed information about NTCP models and cardiotoxicity can be found in appendix B and C, respectively.
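A common way to turn a Kaplan-Meier curve into time-dependent transition probabilities is to take the conditional event probability per cycle, p(t) = 1 − S(t + Δ)/S(t). The Python sketch below illustrates that conversion; it is not the R code used in the study, and the function name and input layout are assumptions.

```python
def transition_probabilities(survival):
    """Per-cycle event probabilities from event-free survival estimates.

    survival[i] is the Kaplan-Meier estimate at the start of cycle i,
    read off at three-month intervals (survival[0] is normally 1.0).
    """
    probs = []
    for s_now, s_next in zip(survival, survival[1:]):
        if s_now <= 0:
            # Nobody is left at risk, so the value has no effect on the cohort.
            probs.append(0.0)
        else:
            probs.append(1.0 - s_next / s_now)
    return probs
```

For a curve that drops from 1.0 to 0.9 to 0.81, the per-cycle probability is 0.1 in both cycles, because each time 10% of those still event-free experience an event.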

Health related quality of life
Utility scores were used to reflect the impact of toxicity, disease progression and mortality on quality of life. "No toxicity" utilities for both progression free and progressed disease were derived from a study by Ramaekers et al., in which utilities were based on the Dutch EuroQoL-5D-3L. 28 Disutilities were assigned to patients with toxicity. Since cardiotoxicity was based on three major categories, a weighted disutility, consisting of a disutility for arrhythmia (43.9%), HF (33.9%) and myocardial infarction (22.2%), was used to calculate a utility for patients with cardiotoxicity. An accurate disutility for dysphagia could not be found; therefore, the disutility for dyspnea was also assigned to patients with dysphagia. No distinction could be made between grade 2 and grade ≥3 (dis)utilities due to lack of data. Therefore, the same disutility was used for both grade 2 and grade ≥3. Additionally, an age-related disutility, based on gender, was assigned to patients aged 75 or over. (Dis)utilities are reported in table 2.
Depreciation values, annuity factors, personnel costs, overhead percentages and maintenance costs were based on the Dutch costing manual 34 , expert opinion and news articles. 35 Costs were divided into three main categories: building (incl. structure costs, depreciation and interest charges), medical equipment (incl. depreciation and interest charges) and personnel costs of radiation technologists, physicians and physicists (incl. allowances). Based on these categories, total annual costs were defined. Total annual costs included maintenance, overhead and housing costs, which were based on annual depreciation costs and interest charges of both building and medical equipment. Next, total costs per fraction for PT and XRT were calculated based on information on the operation of a facility.
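The step from annual facility costs to costs per fraction can be illustrated with a short Python sketch. It is a simplification under explicit assumptions: capital costs are annualized with the standard annuity formula, throughput is limited only by the available treatment minutes, and all function names and input figures are hypothetical rather than taken from the study.

```python
def annuity_factor(rate, years):
    """Standard annuity factor used to spread an investment over its lifetime."""
    if rate == 0:
        return float(years)
    return (1.0 - (1.0 + rate) ** -years) / rate


def equivalent_annual_cost(investment, rate, years):
    """Annual depreciation plus interest for a building or equipment item."""
    return investment / annuity_factor(rate, years)


def cost_per_fraction(annual_cost, operating_minutes_per_year, minutes_per_fraction):
    """Annual cost divided by the number of fractions that fit in a year."""
    fractions_per_year = operating_minutes_per_year / minutes_per_fraction
    return annual_cost / fractions_per_year
```

Under these assumptions the cost per fraction scales linearly with the minutes per fraction, which is why fraction duration is a natural target for scenario analysis.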

Health state and event resource use and costs
We included related and unrelated healthcare costs, patient and family costs and inter-sectoral costs.
Costs of toxicities were calculated for grade 2 and grade ≥3 separately, except for cardiotoxicity. Since it is difficult to distinguish the origin of dyspnea, which could be either from cardiac or lung diseases, guidelines for treating (radiation-induced) dyspnea are lacking. Therefore, costs of Chronic Obstructive Pulmonary Disease (COPD) were assumed to reflect healthcare utilization for dyspnea. The healthcare costs were derived from a study which included costs of primary, hospital and paramedical care (homecare and rehabilitation), as well as medication. 37 Costs of moderate and severe COPD were assumed to reflect costs of grade 2 and grade ≥3 dyspnea, respectively. For dysphagia, hospitalization, medication use and portion of liquid nutrition were derived from literature. 38 Other resource use was based on expert opinion. According to CTCAE 3.0, hospitalization regarding dysphagia is indicated for grade ≥3. Hence, it was assumed that hospitalization was not applicable to grade 2 dysphagia and therefore not included in the cost calculation. 21 For implementation in the model, costs for grade 2 and grade ≥3 were merged for dyspnea and dysphagia according to the proportion of probabilities. Dysphagia was only included once, as it only occurs during the first cycle.
For healthcare costs of cardiotoxicity, the Practical Application tool to Include Disease Costs (PAID), version 1.1, was used. 39 Based on ICD-9 codes, coronary heart diseases reflected the costs of IHD, and other heart diseases and HF reflected the costs of arrhythmia and HF (including valve defects). 39 A weighted average of cardiotoxicity costs was based on the categories as previously described. Costs of cardiotoxicity were age- and gender-specific, which implies different costs per patient in each cycle.
Follow-up costs were distributed among different cycles in the first five years. The PAID tool was also used to estimate unrelated healthcare costs.
Travel distance for PT was defined using the geographical midpoint of the Netherlands. A weighted average for distance was calculated based on the annual capacity of three PT centers (appendix E).
Distance for XRT was extracted from a report in which frequencies in distance categories were reported. 40 In case of travel by private vehicle, it was assumed that the patient would be accompanied by an informal caregiver. Unit costs were based on the Dutch costing manual. 34 Productivity losses were included and were calculated using the friction cost approach (consistent with the Dutch pharmacoeconomic guideline). Information on age-specific full-time and part-time labor participation was based on data from CBS Statline. 41 The distribution was used to calculate a weighted average of costs for productivity losses (appendix F). These costs were implemented once for each patient at baseline, assuming that all patients will incur productivity losses after diagnosis.
All costs were converted to 2019 price levels and reported in Euros (Table 4).

Sensitivity analysis
Distributions were assigned to each individual input parameter in order to perform a probabilistic sensitivity analysis (PSA) with Monte Carlo simulation (2,000 simulations). 45 The ICER was calculated based on the outputs of the PSA. To illustrate uncertainty surrounding the ICER, a cost-effectiveness plane was created. A cost-effectiveness acceptability curve (CEAC) was created to present the probability of strategies being cost-effective at different ceiling ratios. Additionally, deterministic one-way sensitivity analyses were performed.
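A point on the CEAC can be obtained from PSA output by counting, for a given ceiling ratio, how often each strategy has the highest net monetary benefit (NMB = threshold × QALYs − costs). The following Python sketch shows this calculation; the data layout and function names are assumptions made for illustration, and the study itself was implemented in Excel.

```python
def net_monetary_benefit(cost, qalys, threshold):
    """Net monetary benefit of one strategy in one simulation."""
    return threshold * qalys - cost


def ceac_point(psa_samples, threshold):
    """Probability of each strategy being cost-effective at one threshold.

    psa_samples is a list with one dict per simulation, mapping a strategy
    name to a (cost, qalys) tuple. Hypothetical layout.
    """
    wins = {}
    for sample in psa_samples:
        best = max(
            sample,
            key=lambda name: net_monetary_benefit(*sample[name], threshold),
        )
        wins[best] = wins.get(best, 0) + 1
    n = len(psa_samples)
    return {name: wins.get(name, 0) / n for name in psa_samples[0]}
```

Evaluating `ceac_point` over a range of thresholds and plotting the resulting probabilities per strategy gives the acceptability curve.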

Scenario analysis
Scenario analyses were performed to examine the cost-effectiveness from a healthcare perspective and under possible future or alternative scenarios. Since PT is still in the start-up phase in the Netherlands and fraction duration is expected to decrease over time, an analysis was performed assuming that minutes per fraction of PT and XRT are equal (15 minutes). Additionally, a threshold analysis was conducted in order to identify the maximum fraction duration at which PT Individualized is still cost-effective at the threshold value. In a second scenario analysis, productivity losses were excluded for PT, since toxicity probabilities were lower compared to XRT and less toxicity might reduce productivity losses.

Value of information analysis
The expected value of perfect information (EVPI) represents the risk associated with the decision, i.e. the probability of making a wrong adoption decision (from the CEAC described above) multiplied by the consequences of a wrong adoption decision. In other words, the EVPI provides a maximum value for the amount of resources that should be spent on research to decrease the decision uncertainty. 15 By multiplying the per patient EVPI by the effective population, which reflects the number of patients that are affected by the decision, the population EVPI was calculated. The effective population for the next 10 years of 83,029 patients was based on the incidence of patients with NSCLC in 2019 in the Netherlands.
Additionally, the expected value of partial perfect information (EVPPI) was calculated to determine which parameter(s) contributed the most to decision uncertainty (i.e. the EVPI for specific (groups of) parameters).
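In its standard formulation, the per patient EVPI is the difference between the expected net monetary benefit when the best strategy can be chosen in every simulation and the expected net monetary benefit of the strategy that is best on average. The Python sketch below follows that formulation; whether the study used exactly this calculation is an assumption, and the function names and data layout are hypothetical.

```python
def evpi_per_patient(nmb_samples):
    """EVPI from PSA output.

    nmb_samples is a list with one dict per simulation, mapping a strategy
    name to its net monetary benefit at the chosen threshold.
    """
    n = len(nmb_samples)
    strategies = list(nmb_samples[0])
    # Expected NMB if the best strategy could be picked in every simulation.
    with_perfect_info = sum(max(s.values()) for s in nmb_samples) / n
    # Expected NMB of the single strategy that is best on average.
    with_current_info = max(
        sum(s[name] for s in nmb_samples) / n for name in strategies
    )
    return with_perfect_info - with_current_info


def population_evpi(per_patient_evpi, effective_population):
    """Scale the per patient EVPI to everyone affected by the decision."""
    return per_patient_evpi * effective_population
```

As a check on the scaling step, `population_evpi(84, 83029)` equals 6,974,436, which matches the per patient EVPI of €84 and the population EVPI of roughly €7 million reported in the Results.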

Results
Total expected life-time costs per patient from a societal perspective were estimated to be €79,695 for PT All, €68,904 for PT Individualized and €41,231 for XRT All. PT All yielded the most QALYs (1.951) and life years (LYs; 2.558). XRT All was the least effective (1.769 QALYs) and least expensive strategy, and the most cost-effective strategy.
For thresholds higher than €163,467 per QALY gained, PT Individualized was cost-effective. PT All was cost-effective above a willingness to pay of €301,396 per QALY gained. XRT All had the highest probability of being cost-effective (97%) at a threshold of €80,000 per QALY gained. CEACs and CE planes are presented in appendix G.
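For readers less familiar with the metric, an ICER is the difference in costs between two strategies divided by the difference in QALYs. The Python helper below is a generic sketch with a hypothetical function name, not the calculation behind the thresholds reported above.

```python
def icer(cost_new, qalys_new, cost_ref, qalys_ref):
    """Incremental cost per QALY gained of a new strategy versus a reference."""
    delta_qalys = qalys_new - qalys_ref
    if delta_qalys == 0:
        raise ValueError("ICER is undefined when both strategies yield equal QALYs")
    return (cost_new - cost_ref) / delta_qalys
```

The function could, for instance, be called as `icer(79695, 1.951, 41231, 1.769)` to compare PT All directly with XRT All using the totals above. That pairwise figure is not reported in this study, and it need not coincide with the reported thresholds, which compare each strategy with the next best alternative.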
The scenario analysis using a healthcare perspective resulted in similar outcomes, with XRT All as the most cost-effective treatment strategy until a value of €134,256 per QALY gained, followed by PT Individualized as the next best up to €245,053 per QALY gained. Excluding productivity losses had a minor impact on cost and cost-effectiveness results. However, when considering equal minutes per fraction as a future perspective, XRT All was cost-effective until a value of €76,299 per QALY gained, followed by PT Individualized until a value of €124,719 per QALY gained. Above the latter value, PT All became cost-effective. The probabilities of PT All, XRT All and PT Individualized being cost-effective at the threshold value were 16%, 47% and 38%, respectively.
When increasing the PT fraction duration by one minute, the ICER exceeded the threshold value (€81,665).
Results are reported in table 5. The deterministic one-way sensitivity analyses showed that the cardiac event categories and "no toxicity" utilities were the most influential parameters. The tornado diagram can be found in appendix G.
In the base-case scenario, the estimated EVPI per patient was €84 at a threshold value of €80,000, and the population EVPI for the effective population was €7 million. Further research focusing on NTCP models for mortality based on heart dose and on OS (NTCP+OS: €388,629; OS: €244,312) would be most worthwhile.
The population EVPI for different threshold values is shown in appendix G figure G.5.

Discussion
This study aimed to examine the cost-effectiveness of three treatment strategies for patients with stage III NSCLC: 1. XRT All; 2. PT All; 3. PT Individualized. In the base-case analysis, PT All was cost-effective neither from a societal perspective nor from a healthcare perspective when assuming a threshold value of €80,000. There was a substantial cost difference between PT and XRT, which can be explained by primary treatment costs that are 4.1 times higher for PT compared to XRT. These results are uncertain and conditional on different assumptions. PT Individualized will potentially be cost-effective compared to XRT All when the fraction duration or the number of fractions of PT decreases, and would become more cost-effective if patient selection is further optimized. Therefore, selection of patients for whom PT is expected to be beneficial is crucial at this point to improve the cost-effectiveness of PT.
To the authors' knowledge, the present study is the first study in the Netherlands evaluating the cost- showed that PT was cost-effective relative to conventional radiotherapy from a healthcare perspective.
Differences between the previous and the current model can be explained by the newly performed cost analysis, which updated and expanded the previous analysis of Peeters et al. 33 with additional information.
With the current clinical experience, PT proved to be more expensive than was calculated in 2010.
Additionally, the use of NTCP models allows individual toxicity probabilities to be calculated from planning data and makes it possible to distinguish between grade 2 and grade ≥3 toxicity, which improves the accuracy of predicting the potential benefit and, moreover, the cost difference. The current model-based approach adopted in the Netherlands 10 has resulted in indication protocols that recommend PT when a certain difference in NTCP between PT and XRT (∆NTCP) is expected to be achieved. This ∆NTCP threshold differs according to grade of toxicity but not according to type of toxicity. The current analyses showed that selecting patients based on anticipated benefit is expected to improve cost-effectiveness (compared to providing PT, unselected, to all patients). However, patient selection might be optimized further to increase cost-effectiveness. Optimal patient selection could potentially be achieved by applying different ∆NTCP thresholds according to toxicity type, based on the impact on (cost-)effectiveness, and this could be explored using the model described in this paper. Without optimal patient selection, costs will be unnecessarily high and might impede healthcare provision to people who can
Based on the value of information analysis, it would be most valuable to perform further research on unrelated healthcare costs and NTCP models for mortality related to heart dose. Both are related, since unrelated healthcare costs are based on mortality. When the fraction duration of PT decreases, the ICER may move closer to the threshold value of €80,000. Other or additional parameters may then become more valuable targets for further research. Additionally, in further research it would be worthwhile to include costs of multiple PT centers to get a comprehensive overview of PT costs in general. Furthermore, it would be of great value to incorporate NTCP models for cardiotoxicity, once these become available, to obtain dose-related probabilities.
This study illustrates a methodological approach to assess the cost-effectiveness of PT versus XRT and to support optimization of patient selection for PT. In conclusion, based on this explorative analysis of the model-based approach, PT All is not cost-effective in the current situation compared to XRT All and PT Individualized. With optimal patient selection, PT Individualized can potentially become a cost-effective treatment when minutes per fraction decrease, the number of fractions decreases, or both.