Why systematic literature reviews in Fabry disease should include all published evidence

Fabry disease is an X-linked inherited, progressive disorder of lipid metabolism resulting from the deficient activity of the enzyme α-galactosidase. Enzyme replacement therapy (ERT) with recombinant agalsidase, with intravenous infusions of either agalsidase beta or agalsidase alfa, is available and clinical experience now exceeds 15 years. There are very few randomised, placebo-controlled clinical trials evaluating the outcomes of ERT. Data are often derived from observational, registry-based studies and case reports. Pooled analysis of data from different sources may be limited by the heterogeneity of the patient populations, outcomes and treatment. Therefore, comprehensive systematic literature reviews of unpooled data are needed to determine the effects of ERT on disease outcomes. A systematic literature search was conducted in the Embase and PubMed (MEDLINE) databases to retrieve original articles that evaluated outcomes of ERT in patients with Fabry disease; the outcome data were analysed unpooled. The literature analysis included the full range of published literature including observational studies and case series/case reports. Considerable heterogeneity was found among the studies, with differences in sample size, statistical methods, ERT regimens and patient demographic and clinical characteristics. We have demonstrated the value of performing an unpooled systematic literature review of all published evidence of ERT outcomes in Fabry disease, highlighting that in a rare genetic disorder like Fabry disease, which is phenotypically diverse, different patient populations can require different disease management and therapeutic goals depending on age, genotype, and disease severity/level of organ involvement. In addition, these findings are valuable to guide the design and reporting of new clinical studies.


Introduction
Fabry disease (OMIM #301500) is an inherited X-linked lysosomal storage disorder caused by pathogenic variants in the GLA gene and the resulting loss of function of the lysosomal enzyme α-galactosidase. The resulting accumulation of glycosphingolipids within tissues progressively affects multiple organ systems. In classic Fabry disease, patients experience neuropathic pain, hypohidrosis, and gastrointestinal discomfort at a young age, followed by early kidney failure, cardiac disease, and stroke (Germain, 2010). Patients with Fabry disease can be treated with enzyme replacement therapy (ERT) with recombinant human α-galactosidase. There are two forms of ERT available: agalsidase alfa (Replagal®, licensed dose 0.2 mg/kg every other week) and agalsidase beta (Fabrazyme®, licensed dose 1.0 mg/kg every other week). Agalsidase alfa and agalsidase beta are both available in many countries; only agalsidase beta is available in the USA (Desnick, 2004; Ortiz et al., 2018).
Fabry disease is a rare disease. With such small patient numbers, it is challenging to design sufficiently powered randomised, placebo-controlled clinical trials. Consequently, publications derived from disease registries and case studies are common. However, Fabry disease is also very heterogeneous, with different clinical manifestations in patients with GLA variants associated with classic versus late-onset disease, in males versus females, and even within females depending on their level of X-chromosome inactivation (Echevarria et al., 2016). Therefore, the interpretation of the published evidence is hindered by this heterogeneity. In other diseases, pooled analyses or meta-analyses of randomised clinical trial data are used. To perform such analyses, the study type, patient population and endpoints need to be comparable across all the studies included. In Fabry disease, pooled analyses of randomised clinical data resulted in very scant conclusions (El Dib and Pastores, 2010; El Dib et al., 2013). When the same authors pooled data from 77 cohort studies they acknowledged the limitations of their approach, considering the variability in patient characteristics (e.g. gender, genotype), baseline disease severity and type of ERT (not always specified) (El Dib et al., 2017).
Therefore, the objective of our analysis was to provide the rationale for performing a systematic literature review of all published literature regarding the effect of ERT in Fabry disease where all the data are unpooled. The comprehensive systematic literature review that was performed provided the foundation for three other publications which report the actual unpooled outcomes in different patient populations (Germain et al., 2019a,b; Spada et al., 2019) and the resulting therapeutic goals development (Wanner et al., 2018). It is hoped that the information gained from this analysis of unpooled data can inform the design of future clinical studies in patients with Fabry disease.

Methods
An international panel of specialists from 10 subspecialties (cardiology, endocrinology, metabolic disease, haematology, internal medicine, genetics, nephrology, neurology, paediatrics and statistics) convened twice to examine the results of the systematic review of literature on ERT in Fabry disease.

Literature searches
The initial literature search was conducted on 9th September 2014, using the Embase® database and the following search terms: ("agalsidase" OR "recombinant alpha galactosidase" OR "recombinant alfa galactosidase" OR "recombinant human alpha galactosidase" OR "recombinant human alfa galactosidase" OR "Fabrazyme" OR "Replagal" OR "enzyme replacement therapy" OR "enzyme replacement therapies") AND ("Fabry disease" OR "Fabry's disease" OR "Anderson Fabry disease" OR "morbus Fabry" OR "angiokeratoma corporis diffusum") NOT "conference abstract". No limits were set on publication dates or publication types for the initial search. The systematic literature search was repeated twice, on 1st September 2015 and 31st January 2017, to identify additional published articles describing treatment outcomes in patients with Fabry disease who were treated with ERT. The pharmacological chaperone migalastat was approved in Europe in 2016 and in the USA in 2018; however, a decision was taken not to include migalastat studies in this systematic literature review because of the limited experience with this new therapy and the scarce published literature at the time of the literature search. Therefore, this systematic literature review only included publications regarding the effect of ERT on patients with Fabry disease.
Although the Embase database should include all articles indexed in the National Center for Biotechnology Information's PubMed® (MEDLINE®), the same search was conducted using the PubMed database to identify any publications that may not have been included in Embase, e.g. publications ahead of print.
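For readers who wish to reproduce or update the search, the Boolean search string can be assembled programmatically from the two term lists. The following Python sketch is illustrative only: the function and variable names are hypothetical, and the exact field tags or syntax required by the Embase and PubMed interfaces are not specified in this article and are not modelled here.

```python
# Illustrative sketch: rebuild the Boolean search string from its term lists.
# Function and variable names are hypothetical, not part of the original methods.

THERAPY_TERMS = [
    "agalsidase",
    "recombinant alpha galactosidase",
    "recombinant alfa galactosidase",
    "recombinant human alpha galactosidase",
    "recombinant human alfa galactosidase",
    "Fabrazyme",
    "Replagal",
    "enzyme replacement therapy",
    "enzyme replacement therapies",
]

DISEASE_TERMS = [
    "Fabry disease",
    "Fabry's disease",
    "Anderson Fabry disease",
    "morbus Fabry",
    "angiokeratoma corporis diffusum",
]


def or_group(terms):
    """Quote each term, join with OR, and wrap the group in parentheses."""
    return "(" + " OR ".join(f'"{term}"' for term in terms) + ")"


def build_query(therapy_terms, disease_terms, excluded="conference abstract"):
    """Combine the therapy and disease groups with AND, then exclude one term."""
    return f'{or_group(therapy_terms)} AND {or_group(disease_terms)} NOT "{excluded}"'


query = build_query(THERAPY_TERMS, DISEASE_TERMS)
print(query)
```

Keeping the terms in lists makes it straightforward to rerun an identical search at a later date, as was done for the 2015 and 2017 updates.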

Literature screening
All articles identified during the literature searches were screened by two independent reviewers to select relevant articles for inclusion. Screening was based on the abstract; if no abstract was available, then the full-text article was obtained for screening. For each publication, the title and abstract were screened, and a decision was made regarding whether to include the article based on the prespecified inclusion/exclusion criteria. All articles that reported outcomes data on patients with Fabry disease treated with either agalsidase beta or agalsidase alfa were eligible for inclusion. As a result, the analysis included reports from randomised clinical trials (RCTs), non-randomised clinical trials, open-label clinical trials, prospective observational studies, retrospective cohort studies, retrospective database studies, registry studies, case series, and case reports. Preclinical studies that evaluated biomarkers were also eligible for inclusion if data were presented in the context of patient outcomes. Narrative review articles, systematic reviews, pooled analyses, and meta-analyses that did not present new data were excluded as were studies that did not report outcomes following ERT, quantifiable endpoints, or outcomes in patients with Fabry disease. Studies that reported treatment outcomes associated with a single dose of ERT, studies published in a non-English language journal, studies on non-human subjects, and preclinical studies that did not present findings in the context of clinical outcomes were also excluded.
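The inclusion/exclusion criteria above can be expressed as a simple checklist applied to each screened article. The Python sketch below is a hypothetical illustration: the `Article` fields and the wording of the reasons are assumptions made for the example and do not reproduce the actual screening form used by the reviewers.

```python
# Illustrative sketch of the screening criteria; field names are hypothetical.
from dataclasses import dataclass
from typing import List


@dataclass
class Article:
    presents_new_data: bool             # False for narrative/systematic reviews, pooled or meta-analyses
    reports_ert_outcomes_in_fabry: bool
    has_quantifiable_endpoints: bool
    single_dose_only: bool
    english_language: bool
    non_human_subjects: bool
    preclinical: bool = False
    linked_to_patient_outcomes: bool = False  # only relevant for preclinical studies


def exclusion_reasons(article: Article) -> List[str]:
    """Return every criterion the article fails; an empty list means eligible."""
    reasons = []
    if not article.presents_new_data:
        reasons.append("no new data")
    if not article.reports_ert_outcomes_in_fabry:
        reasons.append("no ERT outcomes in patients with Fabry disease")
    if not article.has_quantifiable_endpoints:
        reasons.append("no quantifiable endpoints")
    if article.single_dose_only:
        reasons.append("single dose of ERT")
    if not article.english_language:
        reasons.append("non-English language journal")
    if article.non_human_subjects:
        reasons.append("non-human subjects")
    if article.preclinical and not article.linked_to_patient_outcomes:
        reasons.append("preclinical study without clinical context")
    return reasons


case_report = Article(True, True, True, False, True, False)
review = Article(False, True, True, False, True, False)
print(exclusion_reasons(case_report))
print(exclusion_reasons(review))
```

Recording every failed criterion, rather than a single yes/no decision, makes it easier to summarise the reasons for exclusion in a flow diagram.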

Data extraction
Data were extracted from all eligible studies based on the reported study information, the endpoint data, and the robustness of the data. The study information that was recorded included design, patient population, percentage of male and female participants, total number of patients, number of patients lost to follow-up, ERT type [drug name] and dosage, any dose changes or drug switches, patient age at ERT initiation or dose/ERT switch, duration of treatment, genetic variants, disease severity, concomitant medications and α-galactosidase activity. The extracted outcome/endpoint data included plasma and urinary (lyso-)GL-3/Gb3 (abbreviated as GL-3) levels or heart and kidney GL-3 accumulation scores; cardiac echocardiographic, magnetic resonance imaging (MRI) and electrocardiogram measures; glomerular filtration rate (GFR); proteinuria/albuminuria and serum creatinine levels; measures of the autonomic, central and peripheral nervous systems; and other outcomes, such as pain, gastrointestinal outcomes, quality of life and immunogenicity/seroconversion. Data on clinical outcomes were recorded as described in the article and any validated assessment scales, if used, were noted. Outcome measures at baseline and at endpoint were extracted for all treatment outcomes; the change between baseline and endpoint and any statistical evaluations of this change were recorded where possible. Study data were assimilated into different sections and analysed in the following patient subgroups: paediatric population, adult female population, adult male population, and gender-mixed population. Patient populations were classified as 'gender mixed' if they were derived from studies that included both male and female patients but did not specify outcomes separately. It is important to note that the adult female and adult male publication sets are not mutually exclusive since some publications presented ERT data on both adult female and adult male patients.
Data from registry studies and studies investigating variable ERT dosing were assimilated into different sections and analysed separately.
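The assignment of publications to patient subgroups can be summarised as a small decision rule. The following Python sketch is a hypothetical illustration of that rule; the function and argument names are invented for the example, and the handling of paediatric studies (assigned to the paediatric set regardless of sex) is an assumption, as the text does not state how sex was handled within that subgroup.

```python
# Illustrative sketch of the subgroup assignment; names are hypothetical.

def population_sets(paediatric, includes_male, includes_female, outcomes_reported_by_sex):
    """Return the set of patient subgroups a publication contributes to."""
    if paediatric:
        # Assumption: paediatric studies are kept together irrespective of sex.
        return {"paediatric"}
    if includes_male and includes_female and not outcomes_reported_by_sex:
        # Both sexes enrolled, but outcomes not reported separately.
        return {"gender-mixed"}
    sets = set()
    if includes_male:
        sets.add("adult male")
    if includes_female:
        sets.add("adult female")
    return sets


# A publication reporting male and female outcomes separately counts in both
# adult sets, which is why those sets are not mutually exclusive.
print(population_sets(False, True, True, True))
print(population_sets(False, True, True, False))
```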

Level of evidence
A variation of the levels of evidence classification published by the Oxford Centre for Evidence-Based Medicine was used to grade all publications included in the literature analysis (CEBM, 2009). In summary, the levels of evidence were as follows:
• Grade 1a publications described RCTs
• Grade 1b publications described non-randomised clinical trials (non-RCTs)
• Grade 1c publications described single-arm clinical studies
• Grade 2 publications described prospective observational studies (the distinction between a Grade 1c single-arm trial and a Grade 2 study was made based on the mention of a protocol in the Methods section or a description of the study as a trial [i.e. Grade 1c], with prospective observational studies being defined as those in which Fabry patients were not randomised or assigned to a treatment [i.e. Grade 2])
• Grade 3 publications described retrospective observational studies, registry studies, and database analyses
• Grade 4 publications described case series
• Grade 5 publications described case reports.
In the publications that report the outcomes from this literature analysis (Germain et al., 2019a,b; Spada et al., 2019), ERT outcome data from clinical trials (Grade 1a-c) and observational studies (Grade 2-5) were analysed and reported separately, in recognition of the different clinical 'weight' that can be ascribed to the findings from controlled studies compared to observational and registry data.
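The grading scheme, as it is applied in the Results, amounts to a lookup from study design to evidence grade, followed by a split into the clinical trial and observational reporting streams. The Python sketch below is illustrative only; the design labels used as dictionary keys are hypothetical strings chosen for the example, and publications combining a randomised phase with an open-label extension (reported as Grade 1a/1c) are not modelled.

```python
# Illustrative sketch of the evidence grading; design labels are hypothetical.

GRADE_BY_DESIGN = {
    "randomised controlled trial": "1a",
    "non-randomised clinical trial": "1b",
    "single-arm clinical study": "1c",
    "prospective observational study": "2",
    "retrospective observational study": "3",
    "registry study": "3",
    "database analysis": "3",
    "case series": "4",
    "case report": "5",
}


def grade(design):
    """Look up the evidence grade for a study design label (case-insensitive)."""
    return GRADE_BY_DESIGN[design.strip().lower()]


def reporting_stream(evidence_grade):
    """Grade 1a-c data are reported as clinical trials, Grade 2-5 as observational."""
    return "clinical trial" if evidence_grade.startswith("1") else "observational"


for design in ("Randomised controlled trial", "Registry study", "Case report"):
    g = grade(design)
    print(design, "->", "Grade", g, "->", reporting_stream(g))
```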

Risk of bias assessment
All Grade 1-3 publications were assessed for risk of bias based on the Cochrane tool for assessing bias originally developed for randomised trials (Table 1; Higgins et al., 2011). Grade 4 and Grade 5 publications were excluded from the assessment. Grade 1a publications were graded according to a high or low risk of selection, performance, and detection bias. Risk of attrition bias, reporting bias, and other bias possibilities (e.g. low sample sizes, lack of comparator group, imbalance between comparator groups, incomplete reporting of patient characteristics) were assessed in all Grade 1-3 publications. Pharmaceutical sponsorship of a study was also noted. When insufficient data were available to make a judgment, the item was noted as a potential risk.

Summary of screening process
A total of 2655 articles were identified by the literature searches in Embase and PubMed. Following the removal of 977 duplicate records, 1678 articles were screened for inclusion. The identification, screening and reasons for exclusion of publications are summarised in Fig. 1. A total of 269 publications were eligible for inclusion and were used to inform the analysis (Supplementary Table S1).

Publication characteristics
The 269 publications included in the data analysis reported clinical outcomes of agalsidase beta treatment (n = 97), of agalsidase alfa treatment (n = 90), and outcomes of both therapies used interchangeably or alongside each other (n = 59); 23 publications did not specify the ERT type (Fig. 2).
Grade 2 prospective observational studies were the most prevalent type of publication (n = 80): 30 on outcomes with agalsidase beta and 19 on outcomes with agalsidase alfa (Fig. 2A). Grade 5 case reports were the next most common group of publications (n = 65), followed by Grade 3 publications of retrospective observational studies (n = 55). There were 25 Grade 1c publications reporting outcomes of single-arm clinical studies; 9 on agalsidase beta and 14 on agalsidase alfa. A total of 24 publications described Grade 4 case series.
Nine publications were classified as Grade 1a RCTs, including 1 reporting outcomes of agalsidase beta, 7 of agalsidase alfa and 1 that used both drugs. Finally, 11 publications reported outcomes from Grade 1a/1c studies including an open-label extension phase as well as a randomised design (6 on agalsidase beta treatment and 5 on agalsidase alfa). No publication was classified as a Grade 1b non-RCT. It should also be noted that certain studies gave rise to more than 1 publication, with separate articles reporting different clinical outcomes in the same patient population.
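As a consistency check, the publication counts reported above can be tallied in a few lines. The Python sketch below simply re-adds the numbers given in the text; the variable names are hypothetical. The Grade 1-3 subtotal corresponds to the publications that are eligible for the risk of bias assessment, and the Grade 4-5 subtotal to the case series and case reports that are not.

```python
# Illustrative tally of the counts reported in the text; names are hypothetical.

identified, duplicates = 2655, 977
screened = identified - duplicates

by_ert = {
    "agalsidase beta": 97,
    "agalsidase alfa": 90,
    "both therapies": 59,
    "not specified": 23,
}

by_grade = {"1a": 9, "1a/1c": 11, "1c": 25, "2": 80, "3": 55, "4": 24, "5": 65}

total_by_ert = sum(by_ert.values())
total_by_grade = sum(by_grade.values())

# Grades 1-3 enter the risk of bias assessment; Grades 4 and 5 do not.
grade_1_to_3 = sum(n for g, n in by_grade.items() if g[0] in "123")
grade_4_and_5 = by_grade["4"] + by_grade["5"]

print(screened, total_by_ert, total_by_grade, grade_1_to_3, grade_4_and_5)
```

Both breakdowns sum to the 269 included publications, with 180 publications in Grades 1-3 and 89 in Grades 4-5.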

Risk of bias assessment
Of the 269 included publications, 180 articles that were classified as Grade 1-3 were included in the risk of bias assessment; 89 case series or case report articles were classified as Grades 4 and 5, respectively, and thus excluded from this assessment. A summary of the risk of bias assessment of the aggregate publications categorised according to population is provided as a supplemental figure (Figs. S1A-D).
Varying levels of risk for the different types of bias were observed for each patient population. In all populations, the majority of publications were graded to have high risks of 'other' types of bias. In publications with adult male data, risks for selection, performance and detection bias were relatively high; most of the adult male data publications demonstrated low risks for attrition and reporting bias. Compared with adult male publications, publications with adult female data and gender-mixed data demonstrated similar risk patterns, but lower risks of detection bias were observed. Due to the absence of placebo-controlled RCTs in paediatric patients, risk of selection and performance bias could not be assessed. Detection bias in three open-label single-arm clinical studies in paediatric patients was found to be of low to moderate risk. Risk patterns of attrition, reporting and other bias in publications with paediatric data were similar to the other populations.

Patient populations
The most common population sample comprised adult male patients (Fig. 2B); 145 publications providing data on ERT outcomes in adult male patients were identified, including Grade 1a RCT (n = 6, all on agalsidase alfa), Grade 1c single-arm study (n = 12) and Grade 1a/1c publications (n = 4, all on agalsidase alfa). Grade 2 (n = 33) and Grade 3 publications (n = 33) had similar numbers using either agalsidase beta or agalsidase alfa as the investigational drug. The remaining 57 publications were Grade 4 case series (n = 16) and Grade 5 case reports (n = 41).
Eighty-four publications reported ERT outcomes in a gender-mixed population, with the large majority describing Grade 2 (n = 43) and Grade 3 studies (n = 23) (Fig. 2E).

Organ system-specific data
The numbers of Grade 1-5 publications identified for biomarker outcomes or outcomes in different organ systems are summarised in Table 2. The majority of publications with data for biomarkers reported the effect of ERT on plasma GL-3 levels (n = 50), followed by urinary GL-3 (n = 32) and kidney GL-3 accumulation (n = 19). Fewer publications reported on plasma or urinary lyso-GL-3 (n = 13 and n = 1, respectively). With regard to kidney outcomes, most of the publications (n = 93) reported the effect of ERT on GFR or estimated GFR (eGFR), followed by proteinuria (n = 63) and serum creatinine (n = 41). Measures of heart morphology were also commonly reported outcomes: echocardiographic data (n = 102) were reported more often than MRI data (n = 18). Cardiac function outcomes were reported in 13 publications. Most of the publications reporting nervous system outcomes described peripheral nervous system pathology outcomes (n = 75), followed by outcomes in relation to the autonomic nervous system (n = 55) and central nervous system (n = 31).

Study heterogeneity
The heterogeneity of the studies according to duration of treatment and age at treatment initiation is illustrated in Fig. 3. In studies in adult male, adult female and adult gender-mixed populations the age at ERT initiation varied widely (18 to 76 years). Less variation in age at ERT initiation was seen in paediatric populations. Similarly, the study duration of ERT had a wide range, with most studies (79%) reporting treatment for < 5 years, regardless of the age or gender of the population. The heterogeneity of the studies regarding study sample size is summarised in Fig. 4. In each of the paediatric, male and female populations analysed, most studies had a small number of patients (≤20 patients); most of the gender-mixed studies included ≤40 patients. Full details of the studies analysed for the male, female and paediatric populations are provided as supplemental tables in the published outcomes papers for the respective populations (Germain et al., 2019a,b; Spada et al., 2019).

Discussion
Systematic reviews provide a transparent, stepwise approach to literature analysis that is replicable and can be updated. A good-quality systematic review employs clearly stated methods that are predefined in a protocol that formally sets out the required stages and records the data search algorithm, describes rigorous criteria and conduct with regard to the identification and selection of individual studies, and provides justification of the selection criteria. To reduce any selection bias, the study selection process in a systematic review should be performed by experts in the disease area. The systematic review process is designed to analyse published data only and therefore the contribution of any unpublished data cannot be assessed using this approach. Therefore, while experts involved in the systematic literature review process may be aware of potentially important unpublished data this cannot be included in the analysis; instead, the expert panel encourages publication of any relevant unpublished data to ensure the clinical picture is as complete as possible.
Observational studies provide the majority of clinical outcomes data in rare disease settings. The open nature of their design introduces potential bias that may impact the assessment or reporting of clinical events and changes in symptom severity. Nevertheless, observational studies are more feasible than RCTs in rare disorders like Fabry disease, and they can provide a useful measure of background factors such as genetic variants, residual enzyme activity, sex differences, clinical presentation and disease severity. If observational studies are excluded from systematic reviews in Fabry disease, as has been the case in the past (Alegra et al., 2012; El Dib and Pastores, 2010; El Dib et al., 2013; Rombach et al., 2014), there is a risk of losing important clinical insights in relation to the above-mentioned factors.
Registries have been used as source data for many retrospective observational studies on Fabry disease. The value of registry studies is that they include larger population sizes, represent different patient subgroups, and have longer follow-up periods that are vital for the evaluation of therapies administered over a lifetime. However, data available per patient may be incomplete due to inconsistency in clinical monitoring, including the frequency and type of assessments (performed at the discretion of the treating physicians), rendering a study population too small to demonstrate any significant effect.
This literature analysis was conducted in accordance with the recommended systematic approach and included data from the full range of published literature. When the data were collated and analysed, it was clear that there were inconsistencies in the reporting of findings on the impact of ERT on Fabry disease in the published studies. Furthermore, the homogeneity or, more likely, heterogeneity of a study population is a key determinant of data quality and is impacted by attributes such as patient gender, disease stage, genetic variation and age. Indeed, we found studies that reached disparate conclusions regarding the long-term effects of ERT, which may have been due to differences in sample size, statistical methods, type and dose of ERT regimens, and patient genotypes and phenotypes (Anderson et al., 2014;Rombach et al., 2013).
The duration of treatment is important to consider when comparing studies of clinical outcomes. Although the review found few studies on ERT treatment exceeding five years' duration, published data documenting up to 10 years of outcomes follow-up were identified (Fledelius et al., 2015; Germain et al., 2015; Kampmann et al., 2015; Schiffmann et al., 2015), including 10-year follow-up data from patients enrolled in the pivotal phase 3 study of agalsidase beta (Germain et al., 2015). Given that ERT has only been available since 2001 in the EU and 2003 in the USA, the relative paucity of published long-term data is not surprising. However, clinical comparisons of data should keep this factor in mind as the impact of ERT on clinical outcomes may only become apparent in the long term (Germain et al., 2015; Kampmann et al., 2015; Schiffmann et al., 2015).
Fig. 3. Study heterogeneity by sample population as demonstrated by duration of treatment and mean age at treatment initiation per patient population. Publications with (A) adult male data, (B) adult female data, (C) paediatric data, and (D) gender-mixed data. Studies that did not provide data for both measures or that provided only range data for one measure were not included in the figure. The figure shows data from all types of publications: Grades 1-5. ERT, enzyme replacement therapy.
Variations in disease severity in a study (particularly in observational series) and between different study populations also make the comparison of therapeutic interventions difficult. The extent of irreversible organ damage in patients with Fabry disease may differ according to age, timing of diagnosis and treatment initiation. Results from studies that do not report disease stage may be confounded by baseline variability in disease severity. Evidence from open-label studies has shown that baseline patient characteristics, e.g. the degree of renal impairment and the extent of cardiac pathology, significantly impact response to ERT, highlighting the importance of early ERT initiation and the need to stratify patient populations in clinical trials to facilitate the interpretation of ERT outcomes data (Germain, 2007;Germain et al., 2015).
Any analysis of the effects of ERT in patients with Fabry disease must take the impact of sex into account. Female patients may manifest a different clinical phenotype with a later onset of symptoms and slower progression than male patients. The X-chromosome inactivation pattern that is observed in female patients with Fabry disease correlates with the variation in phenotype and disease course (Echevarria et al., 2016). Fabry disease is further characterised by a large number of pathogenic variants in the GLA gene (Germain et al., 1999), including variants associated with the classic presentation of the disease, variants associated with later-onset or atypical disease presentations (Germain, 2001; Germain et al., 1996, 2018), and variants of uncertain significance. Thus, pooling results from hemizygous male patients and heterozygous female patients is another source of bias in study data (Dobrovolny et al., 2005; Echevarria et al., 2016; Germain, 2007). Caution is therefore required when interpreting the results of studies that do not specify mutational status or report clinical outcomes separately for male and female patients.
Confounding factors such as comorbidities and concomitant medications used for the symptomatic treatment of Fabry disease may also impact outcome data. In this literature analysis, two publications reported subgroup analyses of ERT outcomes with reference to concomitant medications (Feriozzi et al., 2012;Tahir et al., 2007). One small study involving six patients with severe Fabry nephropathy used titration of angiotensin-converting enzyme (ACE) inhibitor/angiotensin receptor blocker (ARB) therapy to reduce levels of proteinuria to target levels and found sustained reductions in proteinuria with stabilisation of kidney function (Tahir et al., 2007). The other study involving 208 Fabry patients with milder stage I/II kidney disease, however, mainly used ARBs to control blood pressure but did not generally titrate to reach target proteinuria levels; this study reported evidence of Fabry nephropathy stabilisation only in women (Feriozzi et al., 2012).
Finally, potential issues stem from limiting the analysis to clinical data and excluding in vitro or animal studies. Although this is a necessary and widely used limitation in systematic reviews of clinical settings, the risk is that important insights about pathological mechanisms may be missed. For example, one paper (Choi et al., 2015) provides highly important and novel insights into the pathophysiological relevance of increased lyso-GL-3 levels for the generation of Fabry-associated pain. It also offers valuable information with which to better understand the clinical observations that a reduction in ERT dosage (Linthorst et al., 2011) or an initially low dosage of ERT is associated with increases in lyso-GL-3 (Smid et al., 2011) and deteriorations in kidney function and pain in patients with Fabry disease (Lenders et al., 2015;Weidemann et al., 2014).

Conclusions
A systematic literature review is a valuable analysis tool with which to study rare disease populations. However, systematic literature reviews are often restricted to randomised, placebo-controlled studies, and in rare disease settings the interpretation of pooled data is limited by considerable heterogeneity in study design, sample size, patient demographics and clinical or genetic characteristics. In this systematic literature review we included the full range of published literature, including observational studies, registry studies and case reports, to study real-world clinical outcomes as closely as possible. In contrast with previously published systematic literature reviews, we demonstrated the value of performing an unpooled analysis of all published evidence of ERT outcomes in Fabry disease. The findings in this study suggest that in a phenotypically diverse, inherited disorder like Fabry disease, different patient populations may require tailored disease management and therapeutic goals depending on characteristics such as patient age, mutational status and disease severity. In addition to providing the foundation for four additional publications reporting unpooled treatment outcomes in different subsets of the Fabry population and the subsequent development of therapeutic goals, it is hoped that these findings will be useful in guiding the design of new clinical studies, for example by encouraging stratification of patients within clinical trials based on clinical characteristics such as disease severity and by reporting the genotype of all patients for whom ERT outcome data are reported.
• Marco Spada has received speaker and advisory board honoraria, and travel support, from Sanofi Genzyme and Shire.
• Christoph Wanner has received research support from Sanofi Genzyme; is a consultant for Actelion Pharmaceuticals, Protalix, Boehringer Ingelheim and Sanofi Genzyme; and is a member of the European Advisory Board of the Fabry Registry.