Advantages of routine next‐generation sequencing over standard genetic testing in the amyotrophic lateral sclerosis clinic

Abstract

Background
Next‐generation sequencing has enhanced our understanding of amyotrophic lateral sclerosis (ALS) and its genetic epidemiology. Outside the research setting, testing is often restricted to those who report a family history. The aim of this study was to explore the added benefit of offering routine genetic testing to all patients in a regional ALS centre.

Methods
C9ORF72 expansion testing and exome sequencing were offered to consecutive patients (150 with ALS and 12 with primary lateral sclerosis [PLS]) attending the Oxford Motor Neuron Disease Clinic within a defined time period.

Results
A total of 17 (11.3%) highly penetrant pathogenic variants in C9ORF72, SOD1, TARDBP, FUS and TBK1 were detected, of which 10 were also found through standard clinical genetic testing pathways. The systematic approach resulted in five additional diagnoses of a C9ORF72 expansion (number needed to test [NNT] = 28), and two further missense variants in TARDBP and SOD1 (NNT = 69). Additionally, three patients were found to carry pathogenic risk variants in NEK1, and 13 patients harboured common missense variants in CFAP410 and KIF5A, also associated with an increased risk of ALS. We report two novel non‐coding loss‐of‐function splice variants in TBK1 and OPTN. No relevant variants were found in the PLS patients. Patients were offered double‐blinded participation, but >80% requested disclosure of the results.

Conclusions
This study provides evidence that expanding genetic testing to all patients with a clinical diagnosis of ALS enhances the potential for recruitment to clinical trials, but will have direct resource implications for genetic counselling.


INTRODUCTION
Amyotrophic lateral sclerosis (ALS) is characterised by progressive neurodegeneration of the corticospinal tract and alpha motor neurons in the spinal cord. There is a clinical, histopathological and genetic overlap with frontotemporal dementia (FTD). The mechanisms underlying this neurodegeneration, its cellular specificity, and its relationship to ageing remain incompletely understood [1].
The importance of genetic factors in the pathophysiology of ALS has long been recognised in the clinic [2]. Between 2% and 12% [3] of patients report a relevant family history of the disease, with estimates diverging widely due to differences in study populations, distortions in ascertainment due to referral bias, and lack of consensus definitions of what constitutes familial ALS [4]. Autosomal dominant variants with high penetrance in C9ORF72, SOD1, TARDBP and FUS account for more than half of all cases with a family history but, importantly, are also found in patients without affected family members [5]. Even so, these variants only account for a proportion of the total heritability, which has been estimated at around 50% in twin studies [6] and population studies [7]. Over the past decade, high-throughput sequencing technologies and genome-wide association studies (GWAS) have contributed to the elucidation of the genetic architecture of ALS. This has led to the identification of further rare monogenic causes [8], and also to the discovery of rare risk variants that do not segregate with disease but are found significantly more frequently in ALS patients than in controls, such as those in NEK1 [9], and of common variants that modulate the severity and onset of disease, such as those in UNC13A [10].
Targeted high-throughput exome sequencing, now available in the form of 'virtual sequencing panels', allows simultaneous testing of all known genes associated with ALS at a cost similar to Sanger sequencing of a single gene. Such panels have become a routine tool in clinics specialising in hereditary neurological disorders [11], but their utility in the ALS clinic remains under investigation, and for patients without a family history testing is usually available only on a research basis. Previous studies using ALS genetic panels have reported 'pathogenic' and 'likely pathogenic' variants in 12%-21% of clinic populations overall [12][13][14][15][16][17], with a detection rate ranging from 5% to 13% in patients who do not report a family history.
The aim of this study was to investigate the genetic contribution to ALS in an unselected clinic population in a large UK ALS referral centre and to understand the clinical utility of offering C9ORF72 testing and targeted exome sequencing to all patients, in addition to standard approaches to genetic testing. A systematic analysis of known, previously reported ALS variants was extended by using codon-based analysis to look for neighbouring ALS-associated variants and machine-learning algorithms to look for splice site variation, yielding novel deleterious splice donor variants in TBK1 (c.1340+2T>G) and in OPTN (c.1242+1G>A).
with FTD, or primary lateral sclerosis (PLS) in a regional specialist ALS clinic. We performed a subgroup analysis restricted to patients tested within 2 years from first onset of symptoms, to provide an approximation of an incident population (n = 65;
related SNVs]))) or (SpliceAI_pred_DS_AG >0.5 or SpliceAI_pred_DS_AL >0.5 or SpliceAI_pred_DS_DG >0.5 or SpliceAI_pred_DS_DL >0.5)

METHODS
To check for variants not reported on ClinVar, all variants with ExAC minor allele frequency <5% were checked against a curated list of previously published ALS mutations [24].
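The frequency filter and curated-list lookup described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the study's pipeline: the record layout, the field names (`gene`, `protein_change`, `exac_maf`), the helper names and the contents of `CURATED_ALS_VARIANTS` are hypothetical, and treating variants absent from ExAC as rare is an assumption.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional

MAF_THRESHOLD = 0.05  # ExAC minor allele frequency cut-off (<5%) stated in the text

# Hypothetical stand-in for the curated list of published ALS variants [24].
# The two entries are variants named elsewhere in this paper, used only as examples.
CURATED_ALS_VARIANTS = {("SOD1", "D91A"), ("ANXA11", "G38R")}


@dataclass(frozen=True)
class Variant:
    gene: str
    protein_change: str
    exac_maf: Optional[float]  # None when the variant is absent from ExAC


def is_rare(variant: Variant, threshold: float = MAF_THRESHOLD) -> bool:
    """Return True for variants below the frequency threshold.

    Assumption: a variant absent from ExAC is treated as rare.
    """
    return variant.exac_maf is None or variant.exac_maf < threshold


def previously_published(variants: Iterable[Variant]) -> List[Variant]:
    """Return rare variants that also appear in the curated list."""
    return [
        v for v in variants
        if is_rare(v) and (v.gene, v.protein_change) in CURATED_ALS_VARIANTS
    ]


if __name__ == "__main__":
    # Made-up frequencies, for illustration only.
    calls = [
        Variant("SOD1", "D91A", 0.001),
        Variant("ANXA11", "G38R", None),
    ]
    for hit in previously_published(calls):
        print(hit.gene, hit.protein_change)
```

Keying the lookup on a (gene, protein change) pair keeps the example short; a real pipeline would more likely key on genomic coordinates and alleles.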
Manual ascertainment of the variants with the lowest coverage and quality was performed on the final list, resulting in the rejection of one variant. In silico analysis was performed using MutationTaster [25], PolyPhen2 [26], SIFT [27] and FATHMM [28]. Filtered variants were interpreted using the standards and guidelines for the interpretation of sequence variants of the American College of Medical Genetics and Genomics (ACMG) [29].
Relatedness of samples and ancestry were evaluated using Somalier [30].
Statistical analyses, including Kaplan-Meier survival analysis (census date 1 July 2022), were performed in R (v. 4.1.2), and GraphPad Prism (v. 9.3.1) was used to generate illustrations.

Pathogenic variants
Consent for genetic testing was obtained from 163 subjects, 148 of whom had a diagnosis of ALS, 3 ALS/FTD and 12 PLS. DNA extraction failed in one individual with ALS. The demographic data of the participants are outlined in Table 1. In four patients a C9ORF72 result was reported but no sequencing panel data were generated, and in four patients a sequencing panel result was available, but extraction of sufficient DNA for Southern blotting failed. Some 132/163 patients (81%) expressed a wish to be informed of any genetic results relevant to them arising from the study. All but eight participants were of European ancestry using a computational estimate, and none of the participants were found to be interrelated.
In the 150 patients with ALS, 16 pathogenic and 1 likely pathogenic ALS-causing variants were detected according to ACMG criteria (11.3%), rising to 12/23 (48%) for ALS patients who reported a family history of ALS or dementia in a first-degree relative (Figure 1). In the subset of 65 incident patients with disease onset within 2 years of their genetic test, 8 pathogenic ALS-causing variants were detected (12.3%).
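The detection rates quoted above are simple proportions, and the arithmetic can be checked with a short sketch. The counts (17/150 and 8/65) are taken from the text. The Wilson score interval is not reported in this study and is added here only as an assumption, to illustrate how much uncertainty a proportion from a cohort of this size carries. The function names are hypothetical.

```python
import math
from typing import Tuple


def detection_rate(carriers: int, tested: int) -> float:
    """Proportion of tested patients carrying a reportable variant."""
    if tested <= 0:
        raise ValueError("tested must be positive")
    return carriers / tested


def wilson_interval(carriers: int, tested: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion (z = 1.96 for roughly 95%).

    Not part of the published analysis; shown for illustration only.
    """
    p = detection_rate(carriers, tested)
    denom = 1 + z ** 2 / tested
    centre = (p + z ** 2 / (2 * tested)) / denom
    half = z * math.sqrt(p * (1 - p) / tested + z ** 2 / (4 * tested ** 2)) / denom
    return centre - half, centre + half


if __name__ == "__main__":
    # Counts from the text: all ALS patients, then the incident subset.
    for label, carriers, tested in [("all ALS", 17, 150), ("incident subset", 8, 65)]:
        rate = detection_rate(carriers, tested)
        low, high = wilson_interval(carriers, tested)
        print(f"{label}: {100 * rate:.1f}% ({100 * low:.1f}%-{100 * high:.1f}%)")
```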
The C9ORF72 hexanucleotide expansion was the most common highly penetrant pathogenic variant found in this study, detected in 11/150 (7.3%) patients. Of the 10 patients whose identity is known, four reported a family history of ALS in a first-degree relative, two reported a family history of dementia in a first-degree relative, one reported a first-degree relative with multiple sclerosis, and four patients reported no family history of neurological disease. The age of onset in C9ORF72-related ALS ranged from 42 to 67 years, with limb onset in eight patients, and one each with cognitive or bulbar onset.
In five patients the C9ORF72 expansion was only found through the systematic screening approach, as these patients had not been selected for routine testing.
The analysis of pathogenic and rare variants is summarised in  [32]. As this variant was heterozygous in our patient, its significance is uncertain.

Rare risk variants and variants of unknown significance
The research panel also identified three variants in NEK1, all in patients who did not report a family history of ALS, dementia or other neurological disease, all of whom had slow disease progression, as defined by survival of more than 5 years (Table 2). These variants have previously been shown to be associated with an increased risk of ALS in a large cohort of >1000 familial and >2000 sporadic ALS patients [9], while not showing segregation in pedigrees in another study [33]. Two patients in our study carried the S1036X variant, which has previously been found in 1% of sporadic ALS patients but only 0.2% of controls (odds ratio [OR] 5.9) [9]. The third patient in the current study was found to carry the R261H variant in NEK1, which was previously reported in 1.6% of sporadic ALS patients and 0.7% of controls (OR 2.4) [9].
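For readers less familiar with odds ratios, the relationship between the carrier frequencies and the ORs quoted above can be sketched as follows. This is an approximation only: it uses the rounded percentages given in the text, whereas the published ORs [9] were derived from the original counts, so the values computed here (roughly 5 and 2.3) differ somewhat from the reported 5.9 and 2.4. The function name is hypothetical.

```python
def odds_ratio(freq_cases: float, freq_controls: float) -> float:
    """Odds ratio from carrier frequencies in cases and controls.

    Frequencies are proportions strictly between 0 and 1.
    """
    for f in (freq_cases, freq_controls):
        if not 0 < f < 1:
            raise ValueError("frequencies must lie strictly between 0 and 1")
    odds_cases = freq_cases / (1 - freq_cases)
    odds_controls = freq_controls / (1 - freq_controls)
    return odds_cases / odds_controls


if __name__ == "__main__":
    # Rounded carrier frequencies as quoted in the text (cases, controls).
    quoted = {"NEK1 S1036X": (0.010, 0.002), "NEK1 R261H": (0.016, 0.007)}
    for name, (cases, controls) in quoted.items():
        print(f"{name}: approximate OR {odds_ratio(cases, controls):.1f}")
```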
In addition to the pathogenic variants described above, this study also identified a further four rare variants of unknown significance in panel genes. These included an ANXA11 G38R variant in a patient who did not report a family history of ALS or dementia, which has been previously described in three independent ALS cohorts [34][35][36]. Although segregation has not been shown in pedigrees to date, and its penetrance is unknown, neuropathological inclusions staining positive for Anxa11 protein have been demonstrated with this variant [34]. According to our interpretation of ACMG criteria, evidence for this variant is currently insufficient to categorise it as 'likely pathogenic'. We also identified a heterozygous SOD1 D91A variant which was classified as a variant of unknown significance, as its accepted mode of inheritance is recessive [37].

Common variants associated with ALS
Multiple variants in ALS-associated genes with minor allele frequencies >1% in the general population have been reported at higher frequencies in ALS cohorts than in the general population, frequently co-occurring with other variants in the same individual [24]. We found five such previously reported variants in our study, namely CFAP410 V58L, KIF5A P986L, TBK1 V464A, CCNF V714M and OPTN M98K.
All variants apart from M98K were observed in our cohort at frequencies between two- and four-fold higher than those reported in gnomAD for healthy non-Finnish Europeans. All variants are missense but are predicted to be tolerated using in silico prediction tools. Variants were stratified by the strength of evidence in previous genetic studies (Table 3), and only the two variants with strong evidence for association with ALS from GWAS were reported in the results figure (Figure 1).
in this study (Table S1) matches the number of expected digenic variants in a random draw model with replacement (p > 0.05).

Primary lateral sclerosis
No pathogenic or likely pathogenic ALS-linked variants were detected in the 12 patients who carried a definite diagnosis of PLS, with no evidence of lower motor neuron signs or symptoms more than 4 years from disease onset [38]. One patient with PLS in this cohort had a rare variant of unknown significance in PFN1 that had previously been reported in patients with lower motor neuron predominant ALS, but which was also reported in control subjects. Interestingly, the common variants and OPTN M98K were also found in one PLS patient each (Table 3).

DISCUSSION
The rapid translation of high-throughput sequencing methods into to test = 69), in contrast to a recent UK study in which more pathogenic variants were detected [13]. However, our study confirms that the application of routine genetic testing, at least for C9ORF72 and SOD1, would significantly increase the pool of subjects available for trials of genetic therapies.
The lower frequency of monogenic ALS compared to some recent studies has a number of possible explanations. All patients who were under follow-up in the recruiting clinic during the study period were eligible for participation, possibly enriching for slower progression and atypical ALS with a longer mean survival compared to a pure incident population. A subset analysis of cases tested within 2 years of symptom onset, however, yielded a comparable frequency, which compares well with reported frequencies in a large prospective Italian study restricted to incident cases [14], and other European and Asian population and clinic studies [5], arguing against a strong effect of case selection in our data, especially as PLS cases were analysed separately. Referrals to our regional centre almost exclusively come from a defined geographic area with a systematic referral pattern which reflects ALS as it presents to general neurologists in our region.
An important technical aspect of this study was the use of machine-learning splice prediction algorithms, which allow for the detection of non-coding variants even in exome sequencing data [21]. This approach is particularly valuable given the association of ALS with loss of function and splice-site variants in TBK1, OPTN, KIF5A and NEK1 among others, and enabled us to discover two splice donor variants not previously reported in ClinVar or the existing literature. Codon-centric bioinformatic approaches also helped with identification of variants adjacent to previously reported variation.
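The splice-prediction step can be illustrated with a short sketch of the threshold rule that appears in the variant filter expression, where a variant is retained if any of the four `SpliceAI_pred_DS_*` delta scores (acceptor gain, acceptor loss, donor gain, donor loss) exceeds 0.5. This is a minimal sketch and not the study's pipeline: the dictionary-based annotation record, the function names and the handling of missing values (skipped, including a `"."` placeholder) are assumptions.

```python
from typing import Mapping, Optional

# Field names as they appear in the filter expression quoted in the text.
SPLICEAI_DELTA_FIELDS = (
    "SpliceAI_pred_DS_AG",  # acceptor gain
    "SpliceAI_pred_DS_AL",  # acceptor loss
    "SpliceAI_pred_DS_DG",  # donor gain
    "SpliceAI_pred_DS_DL",  # donor loss
)
DELTA_SCORE_THRESHOLD = 0.5  # cut-off used in the filter expression


def _as_float(value: object) -> Optional[float]:
    """Convert an annotation value to float; None if missing or unparsable."""
    if value is None:
        return None
    try:
        return float(value)
    except (TypeError, ValueError):
        return None


def predicted_splice_altering(
    annotation: Mapping[str, object],
    threshold: float = DELTA_SCORE_THRESHOLD,
) -> bool:
    """True if any SpliceAI delta score is strictly above the threshold."""
    scores = (_as_float(annotation.get(field)) for field in SPLICEAI_DELTA_FIELDS)
    return any(score is not None and score > threshold for score in scores)


if __name__ == "__main__":
    # Made-up scores for a hypothetical splice donor variant.
    example = {"SpliceAI_pred_DS_DL": "0.97", "SpliceAI_pred_DS_AG": "."}
    print(predicted_splice_altering(example))
```

Using a strict inequality mirrors the `>0.5` in the filter expression, so a score of exactly 0.5 does not pass.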
In addition to highly penetrant monogenic variants, we also found multiple variants that are associated with ALS risk. The two NEK1 variants found in three patients in this study have been well characterised previously, with an estimate of relative risk calculated [9]. We also report five common missense variants that have an allele frequency of >1% in the general population, which are present in our cohort at several-fold higher frequency. These are at most weak risk variants, with varying evidence of association with ALS, the strongest being for CFAP410 V58L and KIF5A P986L, both of which are significant GWAS hits, with the potential to be direct effectors of increased ALS risk [39]. The remaining variants are less well characterised beyond overrepresentation in ALS cohorts, and their overall significance remains unknown (Table 3). The example of NEK1, where both loss of function and the relatively common R261H missense variant are associated with ALS, albeit with varying strength [9], indicates that certain missense mutations in TBK1 and OPTN could play a role in ALS, but evidence for this is currently insufficient. Finally, the fact that four of the five common missense variants found in our ALS patients were also detected in our PLS cohort is noteworthy, but given the small sample size, further studies would be required to explore the possibility of convergence of some of these risk variants in PLS and ALS. Due to the limitations of whole exome sequencing and the study design we were unable to assess the prevalence of the intermediate ATXN2 CAG expansion, which has also been associated with ALS risk, in our cohort [40]. Given that there is an ongoing
In this study we chose to only include genes with the strongest evidence for an ALS association, diverging from commonly available ALS gene panels [13,17].
In agreement with the ALSoD database we did not include genes associated with ALS in early studies that have not been replicated, such as SQSTM1, nor did we include genes that are implicated in a common alternative motor system disease, such as SPG11 and ALS2 [23]. We argue that these genes contain no definitely pathogenic variants relevant to ALS and do not have sufficient evidence for an association with ALS, which may explain the slightly lower frequency of variants of unknown significance in our study compared to some previous studies [13,24]. A cohort study cannot provide evidence for the pathogenicity of new variants, with very few exceptions such as inactivating variants in genes with an established loss-of-function pathogenic mechanism. We have therefore adopted a conservative variant filtering strategy, and only reported missense variants with existing evidence for pathogenicity in ALS.
We strongly caution against overreporting of variants of unknown significance in both research studies and clinical practice, given the significant resource implications and risks to personal wellbeing [41].
Genetic reports from ALS exome and genome panels will inevitably increase in complexity. While the number of highly penetrant variants with clearly actionable consequences for genetic counselling and eligibility for antisense trials that can be found with systematic genetic testing has not changed [5], the number of risk variants with evidence of association with ALS is increasing rapidly. Clinicians therefore need to be equipped to provide appropriate counselling about the high likelihood of finding a variant, when the risk of transmission is difficult to quantify, and where there is currently no direct impact on clinical management [42,43]. With the advent of genetic therapies, clinicians will also need to understand the varying levels of evidence for variants in known ALS-associated genes such as SOD1 and be able to counsel patients on the likelihood of therapeutic success, taking into account all available evidence. The heterozygous SOD1 D91A mutation reported in this study is illustrative of this, with a recent neuropathological assessment of the patient showing TDP-43 instead of the expected SOD1 pathology [37].
From a research perspective, understanding the complex genetic architecture of ALS is of paramount importance to our understanding of the pathophysiology of the disease and will be increasingly important in trial stratification. There was a high level of consent to be informed of individual results in this study (81%).
We strongly recommend the approach of offering double-blinded enrolment, with clear safeguards that prevent unwanted disclosure of a blinded genetic result. The specialist ALS clinic is the ideal setting for a patient to have a meaningful discussion about genetic testing and its results, and continuing expansion of research testing is an important part of the pursuit of personalised approaches to therapy in ALS [1,44].

ACKNOWLEDGEMENTS
We are grateful to the patients attending our clinic for their participation in this study. We acknowledge the help of Professor Pietro

CONFLICT OF INTEREST STATEMENT
The authors declare that they have no competing interests.

DATA AVAILABILITY STATEMENT
The data that support the findings of this study are available on request from the corresponding author. The data are not publicly available due to privacy or ethical restrictions.