Multiple sclerosis in the real world: A systematic review of fingolimod as a case study.

INTRODUCTION
The aim of our study was to systematically review the growing body of published literature reporting on one specific multiple sclerosis (MS) treatment, fingolimod, in the real world to assess its effectiveness in patients with MS, evaluate methodologies used to investigate MS in clinical practice, and describe the evidence gaps for MS as exemplified by fingolimod.


METHODS
We conducted a PRISMA-compliant systematic review of the literature (cut-off date: 4 March 2016). Published papers reporting real-world data for fingolimod with regard to clinical outcomes, persistence, adherence, healthcare costs, healthcare resource use, treatment patterns, and patient-reported outcomes that met all the eligibility criteria were included for data extraction and quality assessment.


RESULTS AND DISCUSSION
Based on 34 included studies, this analysis found that fingolimod treatment improved outcomes compared to the period before treatment initiation and was more effective than interferons or glatiramer acetate. However, among studies comparing fingolimod with natalizumab, overall trends were inconsistent: some reported natalizumab to be more effective than fingolimod and others reported similar effectiveness for natalizumab and fingolimod. These studies illustrate the challenges of investigating MS in the real world, including the subjectivity in evaluating some clinical outcomes and the heterogeneity of methodologies used and patient populations investigated, which limit comparisons across studies. Gaps in available real-world evidence for MS are also highlighted, including those relating to patient-reported outcomes, combined clinical outcomes (to measure overall treatment effectiveness), and healthcare costs/resource use.


CONCLUSIONS
The included studies provide good evidence of the real-world effectiveness of fingolimod and highlight the diversity of methodologies used to assess treatment benefit in clinical practice. Future studies could address the evidence gaps found in the literature and the challenges associated with researching MS when designing real-world studies, assessing data, and comparing evidence across studies.


Introduction
In multiple sclerosis (MS) research, randomized controlled trials (RCTs) have demonstrated the efficacy of MS treatments according to relapses, disability, and magnetic resonance imaging (MRI) outcomes, and in some cases according to patient-reported outcomes (PROs) [1][2][3]. The protocol-driven approach to data generation in RCTs provides high-quality evidence in a carefully selected group of patients who are treated under ideal, controlled conditions [3,4]. In the real world, however, the population of patients with MS who are eligible to receive disease-modifying therapies (DMTs) is much more heterogeneous than that treated in RCTs [4], and patient medication-taking behavior and patient monitoring are not constrained by protocol [4,5]. These factors can influence clinical outcomes, and as a result, data generated from RCTs may not be generalizable to the clinical practice setting [4,6].
Real-world studies complement RCTs by investigating a more diverse group of patients than those included in clinical trials, providing real-world data (RWD) that can be generalized across the population of patients with MS who are treated and monitored according to standard clinical practice [4]. Real-world studies are often observational in nature, and can collect RWD prospectively during routine patient visits or retrospectively from existing data sources, such as patient registries, medical records, or administrative claims databases [4,7]. Medical records and registries generally report clinical information collected by physicians [7]. Conversely, administrative claims databases, which capture diagnosis codes and payment data for the insured population, may require proxy measures to identify patients with MS and certain clinical outcomes, such as relapses [7][8][9][10]. RWD expand the evidence base for MS, give insight into the short-term and long-term safety and effectiveness
we believe report the results of 34 independent studies, are described in Table A.3.
Over half of studies reporting data for IFN and GA did not present data for the individual treatments [23,30,35,37,43,45,52]; therefore, we have not distinguished between the injectable DMTs throughout this review.

ACCEPTED MANUSCRIPT
In total, 12 studies were identified as being of a high quality [20, 24-27, 29, 35, 37, 38, 42, 43, 52]. This was because they met all of the following criteria: defined the inclusion and exclusion criteria; investigated a representative population (included treatment cohorts of at least 30 patients, collected data from multiple centers); defined outcomes according to objective criteria; had a follow-up period that was regarded by the authors of the present study as being long enough to assess outcomes accurately; and used statistical methodology to reduce bias and confounding (Table A.4). Eighteen studies, which met most but not all the above criteria, were categorized as being of a medium quality [21, 22, 28, 31, 32, 34, 36, 39, 40, 44-51, 53]. The remaining four studies, which did not meet several criteria, were categorized as being of a low quality [23, 30, 33, 41].
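The three-tier categorization described above can be sketched in a few lines of Python. This is only an illustration: the criterion names are hypothetical labels for the five criteria listed in the text, and the cut-off between "medium" and "low" (`max_missed_for_medium`) is an assumption, because the text states only that medium-quality studies met "most but not all" criteria and low-quality studies "did not meet several".

```python
# Hypothetical labels for the five quality criteria listed in the text.
CRITERIA = (
    "eligibility_defined",        # inclusion and exclusion criteria defined
    "representative_population",  # cohorts of >= 30 patients, multiple centers
    "objective_outcomes",         # outcomes defined by objective criteria
    "adequate_follow_up",         # follow-up judged long enough by reviewers
    "bias_reduction_methods",     # statistical methods to reduce bias/confounding
)


def quality_category(met, max_missed_for_medium=2):
    """Return 'high', 'medium', or 'low' for one study.

    met: dict mapping criterion name -> bool (missing keys count as not met).
    max_missed_for_medium: ASSUMED cut-off; the review does not state one.
    """
    missed = [name for name in CRITERIA if not met.get(name, False)]
    if not missed:
        return "high"      # all criteria met
    if len(missed) <= max_missed_for_medium:
        return "medium"    # most, but not all, criteria met
    return "low"           # several criteria not met


# Example: a single-center study that otherwise meets every criterion.
example = {name: True for name in CRITERIA}
example["representative_population"] = False
print(quality_category(example))
```

Making the cut-off an explicit parameter keeps the one judgment call that the text leaves open visible to anyone reusing the scheme.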
Conversely, four studies reported better relapse outcomes with natalizumab treatment than fingolimod [31,34,38,42]; these studies ranged in scale from those that used patient data from one [31] or two [34] clinics to those that used data from 27 MS centers [42] or an international MS registry [38]. Of studies assessing fingolimod as a switch therapy [30,32,36,37,41,44], five studies reported reactivation of relapses after switching to fingolimod [30,32,36,37,41] and one [44]. All three studies that provided data on relapses before natalizumab initiation or in the period between treatments (washout period) reported lower relapse rates during fingolimod treatment than in the pre-natalizumab or washout periods [32,36,37].
Fingolimod treatment versus the period before initiation: Nine studies assessed outcomes after fingolimod initiation [21-23, 28, 42, 44, 46, 48, 49], which included eight single-arm studies [21-23, 28, 44, 48, 49]. In four studies disability outcomes were unchanged [23,28,44,48], and in five studies disability outcomes were improved after fingolimod initiation, as measured by EDSS score [21,22,28,42,49] and as the proportion of patients with no disability progression [28,46]. Two single-arm studies assessed disability outcomes from baseline in cohorts of patients subdivided by baseline DMT: one study reported a significant difference from baseline in patients switching to fingolimod from IFN/GA but not in patients switching to natalizumab [44]; conversely, in the other study, similar EDSS scores were reported before and after fingolimod initiation regardless of whether patients had previously received IFN/GA or natalizumab [23]. However, the lack of methodological description in this second publication meant that it was not possible to explore the lack of consistency between the two studies [23,44].
[35,43]: one study reported better outcomes for patients switching to fingolimod than to another IFN/GA, as measured by the risk of disability progression and disability regression [35], and the other study reported better outcomes for fingolimod for the proportion of patients free from disability progression but not for the risk of disability progression [43]. One study compared fingolimod and IFN/GA in patients discontinuing natalizumab and reported no significant difference in the proportion of patients with disability progression [52].

Fingolimod versus natalizumab:
Six studies compared fingolimod and natalizumab [29,34,36,38,42,47], and one study compared outcomes in patients remaining on natalizumab versus those who switched to fingolimod [41]. Of studies comparing fingolimod and natalizumab, five studies reported similar outcomes for fingolimod and natalizumab for EDSS score, time to disability progression, or proportion of patients free from progression [29,34,36,42,47]. Conversely, the proportion of patients with disability regression was higher for natalizumab than fingolimod (p < 0.001) in the sixth study, an international MS registry study [38]. In a small observational study that investigated fingolimod as a switch therapy, EDSS scores were stable in all patients remaining on natalizumab but increased in over a third of patients switching from natalizumab to fingolimod [41].

Summary of real-world MRI outcomes
Thirteen studies reported MRI outcomes (T2 and gadolinium-enhancing lesions) after initiation of fingolimod (Table A.7) [21-23, 28, 34, 41, 42, 45-49, 51], but none
Fingolimod treatment versus the period before initiation: Eight studies presented MRI data after fingolimod treatment [21-23, 28, 42, 46, 48, 49], which included seven single-arm studies [21-23, 28, 46, 48, 49]. Four of these studies reported improved MRI outcomes after fingolimod treatment [21-23, 42] and one study reported no change [46]. Three of the single-arm studies reported that over half of patients receiving fingolimod had no active lesions or new lesions, but did not provide baseline data from before patients initiated fingolimod [28,48,49]. One study, using patient data from a medical center, compared outcomes in those who remained on fingolimod versus those who discontinued fingolimod and reported better MRI outcomes in those who remained on fingolimod [51].

Fingolimod versus natalizumab:
Three studies compared fingolimod and natalizumab [34,42,47] and two studies assessed fingolimod as a switch therapy after natalizumab discontinuation [41,45]. Of studies comparing fingolimod and natalizumab, two studies reported better MRI outcomes in the natalizumab cohort than the fingolimod cohort [42,47], whereas a third study reported a similar proportion of patients with new lesions in both cohorts [34]. Of the two studies assessing outcomes in patients receiving fingolimod as a switch therapy after discontinuing natalizumab, one study reported that the majority of patients had MRI activity after switching to fingolimod compared with none of the patients who remained on natalizumab [41]. In the other study, similar proportions of patients switching to fingolimod or first-line therapies (including IFN/GA) had radiological reactivation [45], but this was lower than the proportion of patients with radiological reactivation after a switch to no treatment when discontinuing natalizumab [45].
Although reactivation of disease activity may reflect differences in effectiveness between fingolimod and natalizumab, it is more likely that the reactivation of disease activity in the fingolimod cohort is a consequence of natalizumab withdrawal rather than inadequate disease control during subsequent fingolimod treatment.
Fingolimod treatment versus the period before initiation: In the 12 studies assessing persistence with fingolimod following its initiation [21,22,28,32,33,39,40,44,46,48,49,51], two studies reported that the majority of patients at an MS center were still receiving fingolimod at follow-up (follow-up was at 6.8 months in one study; length of follow-up was not defined in the other study) [33,40]. This is consistent with 10 other studies using registry data or patient records in which a small proportion of patients discontinued fingolimod [21,22,28,32,39,44,46,48,50,51].
In another study of administrative claims data, patients discontinuing IFN had higher persistence after a switch to fingolimod than those switching to GA, although the difference was not significant [26].
[24,27,31,34,38,50,53]. Three studies using administrative claims or registry data reported that similar proportions of patients were persistent with fingolimod and natalizumab [24,38,53]. Conversely, four studies, using medical records, registry data, or administrative claims data, reported better outcomes for fingolimod than natalizumab, according to the risk of discontinuation/non-adherence

Summary of real-world combined measures of disease activity
Ten studies included combined measures of disease activity as an outcome of interest [29-31, 43, 45-49, 51].
Fingolimod treatment versus the period before initiation: Three studies assessed the proportion of patients without relapses, disability progression, and MRI activity after fingolimod initiation [46,48,49]. One of these studies reported a significantly higher proportion without disease activity after fingolimod initiation (p < 0.05) [46]. The remaining two studies reported that almost half of patients receiving fingolimod were free from disease activity, but did not provide baseline data for before patients initiated fingolimod [48,49]. Consistent with these studies, one study at a single MS center reported better outcomes for the proportion of relapse- and MRI activity-free patients in those who remained on fingolimod versus those who discontinued fingolimod treatment [51].

Discussion
Real-world studies provide valuable insight into diseases, treatment approaches, treatment effectiveness, and patient experience from the clinical practice setting [4,5]. These studies complement and strengthen the body of evidence from RCTs by providing generalizable RWD from a broad and diverse population of patients who are treated and monitored according to a care-driven approach [4,5]. Real-world
pharmacoeconomic data and data on resource use, which aid real-world decision-making but are not typically assessed in clinical trials [55]. As the body of real-world evidence grows, SRs are a valuable tool to assess the overall benefit profile of an intervention in clinical practice and to evaluate the methodologies being used in real-world studies.

The present SR assessed the published literature on the real-world effectiveness of fingolimod in the treatment of MS and, to the best of our knowledge, it is the only study to have done so. In addition to providing a summary of RWD for fingolimod, this study used fingolimod as a case study to present the range of methodologies being used to evaluate DMTs for MS in the real world, and it also provided insight into the quality of the RWD being generated in these studies. The benefit of reviewing the evidence for fingolimod stems from the fact that it has been extensively investigated in the real world. However, a similar SR approach could be performed as part of real-world studies to investigate other DMTs being used to treat MS in clinical practice. As part of a future study, these RWD could be combined with data from RCTs in network meta-analyses in order to increase the sensitivity of measurements of treatment effect [56].
This SR was based on data from 34 peer-reviewed published papers. RWD for several different effectiveness outcomes were available, representing over 7000 patient-years of data. It is likely that data for fingolimod would be available for a much higher number of patient-years if the 134 congress materials identified as meeting the eligibility criteria in this SR were also taken into consideration. However, owing to limited information presented in abstracts and posters, these were not included. The
[20-22, 24-29, 31, 32, 34-40, 42-53]. This was because they adhered to many, if not all, of the criteria identified as contributing to well-designed studies that have the potential to generate robust data. These quality assessment criteria included defining eligibility criteria, investigating a representative population using data collected from multiple centers, defining outcomes according to objective criteria, having a study follow-up period that was long enough to assess outcomes accurately (as regarded by the authors of the present study), and using statistical methodology to reduce bias and confounding. For studies identified as being of a low quality, results should be interpreted with caution [23,30,33,41].

Among the single-arm studies that assessed outcomes after initiation of fingolimod therapy, or that compared fingolimod with IFN/GA, most reported improved clinical outcomes with fingolimod treatment [20, 25-27, 35, 42-46, 48-52]. The trends reported in this SR for fingolimod versus IFN/GA are consistent with RCT data for fingolimod versus intramuscular IFN beta-1a [15] and with the results of real-world studies that were published after the search date or were available only as congress materials, and were therefore not included in the SR (but have been discussed here).
Duerr et al. and Alsop et al. reported better outcomes during treatment with fingolimod than with IFN/GA using registry data from propensity score-matched patients [57][58][59]. Among the included studies comparing fingolimod with natalizumab, the reported trends were contradictory, with studies reporting outcomes with natalizumab to be either better than or similar to those with fingolimod regarding relapses, disability, and persistence/adherence [24, 27, 29, 31, 32, 34, 36-38, 41, 42, 47, 50, 53]. A recent observational study, which was also published after the search date, compared natalizumab and fingolimod and reported that fingolimod and natalizumab had similar effectiveness for relapse and disability outcomes, according to results from a multicenter, retrospective analysis of adjusted data from propensity score-matched patients [60]. Therefore, the relative effectiveness of fingolimod and natalizumab remains unclear, and further large, long-term studies are required to address this [42].
It is notable that only one study compared outcomes in patients receiving fingolimod and other recently approved DMTs [45]. Lo Re et al. compared fingolimod with first-line DMTs, including oral teriflunomide [45], but because data were presented collectively for all first-line DMTs (IFN/GA/teriflunomide/azathioprine), no conclusions could be made about the comparative effectiveness of fingolimod and teriflunomide [45]. Two posters identified in the SR compared fingolimod with another recently approved oral therapy, dimethyl fumarate, and reported a lower risk of treatment discontinuation with fingolimod but a similar risk of relapse or MRI activity [61,62]; however, the conclusions that can be made for relapse and MRI outcomes are limited by the short follow-up times (3-6 months) [61,62].
Among the included studies, most RWD for fingolimod were collected retrospectively, using patient records from national or international MS registries, hospital databases or MS clinics, or administrative/pharmacy claims databases [20-22, 24-27, 30, 31, 33-37, 39, 40, 44-46, 49]. In retrospective studies, data collection and defining of study objectives are independent processes [4]. Retrospective studies are therefore particularly reliant on the quality and type of data available, and missing or incomplete information can compromise the quality and robustness of results [38].
Only a small proportion of the included published papers reported results of prospective studies [28,32,38,42,48,50,52] for which patients would have been identified and data collected after study objectives had been defined. Prospective studies are likely to be based on a more complete data set than retrospective studies, and they have the potential to collect and report data that are not routinely recorded in clinical practice [4]. The strengths and limitations of retrospective or prospective study designs and their impact on data quality and data generation should be considered when assessing and interpreting data from real-world studies.
Patients with MS were selected in the included studies using broad selection criteria on the basis of prescribed DMTs; some studies had the additional criterion that patients had experienced disease activity with previous DMTs or had received DMTs of interest for a minimum length of time, or the studies had a minimum follow-up period [21, 22, 25, 27, 29, 31, 35, 38, 41, 43-45, 47, 48]. In studies using
To account for differences in characteristics between treatment cohorts, some studies used adjustment or propensity score matching [24,26,35,38,42,43,52,53,63], which can allow for treatment effect to be estimated more accurately [63].
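As a generic illustration of the matching step mentioned above, the following Python sketch pairs patients from two treatment cohorts by greedy 1:1 nearest-neighbor matching on a propensity score. It is not the method of any particular included study. It assumes that each patient's propensity score has already been estimated (for example, from a logistic regression on baseline characteristics), and the function name, patient identifiers, and the caliper of 0.05 are hypothetical choices made for the example.

```python
def match_nearest(treated, controls, caliper=0.05):
    """Greedy 1:1 nearest-neighbor matching on a precomputed propensity score.

    treated, controls: dicts mapping patient id -> propensity score (0 to 1).
    caliper: ASSUMED maximum allowed score difference for a valid pair.
    Returns a list of (treated_id, control_id) pairs; each control is used
    at most once, and treated patients without a close control are dropped.
    """
    available = dict(controls)  # controls not yet matched
    pairs = []
    # Process treated patients in order of increasing score.
    for t_id, t_score in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining control by absolute score difference.
        c_id = min(available, key=lambda k: abs(available[k] - t_score))
        if abs(available[c_id] - t_score) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]
    return pairs


# Hypothetical scores for three patients in each cohort.
cohort_a = {"a1": 0.30, "a2": 0.62, "a3": 0.95}
cohort_b = {"b1": 0.31, "b2": 0.60, "b3": 0.10}
print(match_nearest(cohort_a, cohort_b))
```

Patients left unmatched (here, the one with a score of 0.95) are excluded from the comparison, which is one reason matched cohorts can be smaller and less broadly representative than the source population.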
Identifying factors that can contribute to bias and confounding and selecting appropriate statistical analytic methodology are therefore important considerations in the design of robust real-world studies that have the potential to generate high-quality
A challenge of MS research is the difficulty of assessing some aspects of disease progression, such as relapses and disability, which can be difficult to define or to measure accurately and consistently. For example, there is a degree of subjectivity involved in evaluating relapses, with the severity threshold or range of symptoms required for an event to be recognized varying among physicians or patients.

Disability can also be difficult to define and measure, and changes in EDSS scores can be influenced, particularly in the short term, by the residual effects of relapses [64]. Results for these "soft endpoints" may therefore not be comparable across studies, or indeed between patients within a study, which can have an impact on the estimated treatment effect. Some of the included studies used standardized criteria to define outcomes or required that outcomes were assessed by accredited individuals, which may address this problem to some degree [29,34,37,40,42,43].
In contrast, "hard endpoints" are assessed using objective, stringent criteria (e.g. number of MRI lesions), and are likely to be more comparable across studies.
In studies using administrative/pharmacy claims databases, relapses were evaluated using claims that have been shown to correlate with this event [24-26]. Proxy measures to assess disease activity can lack sensitivity or specificity, and may therefore not give a complete picture of treatment effect. In line with this, the relapse algorithm used in claims databases, which is defined as an inpatient visit with a primary diagnosis code for MS or an outpatient visit with a diagnosis code for MS and corticosteroid use within 7 days of the visit, may detect only relapses that require hospitalization or corticosteroid treatment, not mild relapses that do not impact on daily activities or require treatment [9]. The relapse algorithm has, however, been validated in several studies, and the trends reported for fingolimod from administrative/pharmacy claims are aligned with those from patient records or MS registry databases [7, 24-26]. Furthermore, in matched analyses, any bias would apply to both treatment arms equally [24]. A limitation of administrative claims database studies is that some disease outcomes (e.g. relapse severity, disability, MRI lesion type) and baseline parameters, which may be confounding factors, cannot be assessed in the data source [7,24].
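The claims-based relapse definition quoted above can be expressed as a short Python sketch. It is a simplified illustration only: the `Claim` record and its field names are hypothetical, "within 7 days of the visit" is assumed to mean up to 7 days before or after the visit date, and no attempt is made to merge claims that belong to the same clinical episode, which a validated implementation would need to address.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Claim:
    """Hypothetical, simplified claim record."""
    service_date: date
    setting: str        # "inpatient" or "outpatient"
    ms_diagnosis: bool  # claim carries a diagnosis code for MS
    ms_primary: bool    # MS is the primary diagnosis on the claim


def find_relapses(claims, steroid_dates, window_days=7):
    """Return the dates of claims that count as relapses under the proxy rule.

    A claim counts if it is (a) an inpatient visit with MS as the primary
    diagnosis, or (b) an outpatient visit with an MS diagnosis code and
    corticosteroid use within `window_days` of the visit (ASSUMED to be
    either side of the visit date).
    """
    relapses = []
    for claim in sorted(claims, key=lambda c: c.service_date):
        if claim.setting == "inpatient" and claim.ms_primary:
            relapses.append(claim.service_date)
        elif claim.setting == "outpatient" and claim.ms_diagnosis:
            if any(abs((d - claim.service_date).days) <= window_days
                   for d in steroid_dates):
                relapses.append(claim.service_date)
    return relapses


claims = [
    Claim(date(2015, 1, 10), "inpatient", True, True),    # hospitalization
    Claim(date(2015, 3, 1), "outpatient", True, False),   # steroid 4 days later
    Claim(date(2015, 6, 1), "outpatient", True, False),   # no steroid nearby
]
print(find_relapses(claims, steroid_dates=[date(2015, 3, 5)]))
```

In this example the third visit is not counted, which mirrors the limitation noted in the text: a mild relapse managed without hospitalization or corticosteroids leaves no trace that the rule can detect.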
The studies included in this SR generally highlight the diversity of data sources and methodologies being used to investigate MS and the benefit of DMTs in clinical practice. Such diversity, while generating a wealth of RWD, can lead to heterogeneity across studies in the patient populations being assessed (e.g. differences in disease severity due to variations in eligibility criteria or the treatment label in different countries), or in the approaches used to evaluate treatment effectiveness (e.g. differences in outcome definitions or in follow-up times) [5]. Heterogeneity can impact on outcomes and preclude comparison of data across studies, and may account for the general non-uniformity in the RWD presented in this SR. This does not reduce the value of the RWD being generated, which reflect patient and physician behavior and outcomes in the real world. Instead, when assessing RWD, it is important to be aware of the potential sources of variation across studies and their potential impact on outcomes. Furthermore, consistent trends across studies that vary in their methodological approach can provide more confidence in the comparative effectiveness of DMTs.
German study is investigating PROs and healthcare costs associated with fingolimod, and it is therefore likely that new data will become available to address some of these evidence gaps for fingolimod [67].

Conclusions
The present SR provides a comprehensive insight into the RWD available for fingolimod. It also highlights the diversity of methodologies being adopted to improve our understanding of MS and the impact of DMTs in the real world, as well as the challenges of conducting these studies. Although the included studies provide good evidence of the effectiveness of fingolimod in clinical practice, they also emphasize the gaps in the evidence base for this treatment, which likely reflect a more general deficiency in the field of MS research. Future research should address these gaps in the evidence base in order to provide a more balanced and complete view of treatment benefit and disease progression. Furthermore, the challenges associated with researching MS should be accounted for when designing real-world studies, and reliable methodology should be used in order to generate robust results. Finally, the heterogeneity that is intrinsic to real-world studies, which can impact on outcomes, should be considered when assessing RWD and comparing DMTs in clinical practice, both within and across studies. SRs, as performed for fingolimod in the present study, should be part of the standard protocol when assessing the benefit profile of treatments in real-world studies.

Competing interests
TZ has received speaking honoraria and travel expenses for scientific meetings and has been a steering committee member of clinical trials or participated in advisory

Funding
This work was supported by Novartis Pharma AG, Basel, Switzerland.

Author contributions
All authors were involved in the design of the study. CRM was responsible for the first draft of the protocol, which was critically reviewed, further developed and

Acknowledgements
Not applicable.

References
[39] D. Ontaneda
include seven studies for which the type of observational study was not specified and which contained insufficient methodological information to categorize them as being retrospective or prospective.

Relapse rate
- ARR (over a time interval)
- Proportion of patients with/without a relapse
- Time to first relapse or inflammatory event

Disability progression and improvement
- Proportion of patients with/without disability progression/improvement (3-month and 6-month) based on EDSS score
- Change from baseline (to a set time point) in EDSS score
- Time to 3-month or 6-month confirmed disability progression/improvement

Were treatment cohorts similar at baseline?
Were outcomes assessed using objective criteria?
Were cohorts assessed in an equivalent manner?
Was follow-up long enough to assess outcomes?

Yes
No (difference in EDSS score, relapses, but ARR did not differ significantly)
Yes, but relapses were not defined

Barbin et al., 2016 [42]
Natalizumab (and baseline)
EDSS score: Fingolimod: baseline to year 1, 2.4 to 2.2 (p = 0.0228); baseline to year 2, 2.4 to 2.2 (p = NS)
Proportion with disability progression, fingolimod vs natalizumab: Similar in the fingolimod and natalizumab cohorts at year 1 (p = NS) and year 2 (p = NS) in unadjusted and adjusted analysis (multivariate logistic regression and propensity score weighting)