Burden of Rare Variants in ALS and Axonal Hereditary Neuropathy Genes Influence Survival in ALS: Insights from a Next Generation Sequencing Study of an Italian ALS Cohort

Although the genetic architecture of amyotrophic lateral sclerosis (ALS) is incompletely understood, recent findings suggest a complex model of inheritance in ALS, which is consistent with a multistep pathogenetic process. Therefore, the aim of our work is to further explore the architecture of ALS using targeted next generation sequencing (NGS) analysis, enriched in motor neuron diseases (MND)-associated genes which are also implicated in axonal hereditary motor neuropathy (HMN), in order to investigate if disease expression, including the progression rate, could be influenced by the combination of multiple rare gene variants. We analyzed 29 genes in an Italian cohort of 83 patients with both familial and sporadic ALS. Overall, we detected 43 rare variants in 17 different genes and found that 43.4% of the ALS patients harbored a variant in at least one of the investigated genes. Of note, 27.9% of the variants were identified in other MND- and HMN-associated genes. Moreover, multiple gene variants were identified in 17% of the patients. The burden of rare variants is associated with reduced survival and with the time to reach King stage 4, i.e., the time to reach the need for percutaneous endoscopic gastrostomy (PEG) positioning or non-invasive mechanical ventilation (NIMV) initiation, independently of known negative prognostic factors. Our data contribute to a better understanding of the molecular basis of ALS supporting the hypothesis that rare variant burden could play a role in the multistep model of disease and could exert a negative prognostic effect. Moreover, we further extend the genetic landscape of ALS to other MND-associated genes traditionally implicated in degenerative diseases of peripheral axons, such as HMN and CMT2.

as candidate ALS-associated variants ( Figure 1 and Table 1). Rare variants, as well as clinical and demographic data of individual patients are reported in Table S2. Rare variants identified in a group of 332 non-neurological unrelated Italian patients, for whom NGS exome sequencing data were available from our in-house database, were selected as the control group (Tables S3-S5). Among the detected variants, 44% of the detected variants (n = 19) were previously described in the literature: seven of them were reported in the ALS literature, four in the ALS and non-ALS literature, and eight in the non-ALS literature. The remaining 56% of the variants (n = 24) were not reported in the literature; 18 were annotated only in population databases, of which one only in the Single Nucleotide Polymorphism Database (dbSNP), while six were novel and could be considered as candidate ALS-associated variants ( Figure 1 and Table 1). Rare variants, as well as clinical and demographic data of individual patients are reported in Table S2. Rare variants identified in a group of 332 non-neurological unrelated Italian patients, for whom NGS exome sequencing data were available from our in-house database, were selected as the control group (Tables S3-S5).
a PT-ID, patient identification code; b dbSNP150; c Global MAF, global allele counts were calculated from all subjects in the ExAc database; d MAF population, population allele count refers to European Ancestry subjects from the ExAc database; § , the variants present in BSCL2 gene has been reported the positions in both isoform NM_001122955.3 and NM_001130702.2 in bracket; # , variants predicted as splice site by in silico tools; ACMG, American College of Medical Genetics and Genomics; ACMG classification: 1 = benign, 2 = likely benign, 3 = uncertain significance, 4 = likely pathogenic, 5 = pathogenic; MAF, minor allele frequency; SIFT: T = tolerated and D = deleterious; Polyphen: B = benign, P = possibly damaging, D = damaging; Mutation Taster: D = disease causing, N = polymorphism, A = disease causing automatic; CADD, combined annotation dependent depletion; CADD scores, scaled CADD scores (Phred like) for scoring deleteriousness. Variants in common between ALS patients and controls were excluded (Table S4).

Variants Previously Reported in ALS Literature
Seven patients were harboring the pathogenic C9orf72 repeat expansion. Three previously reported missense variants were classified by the American College of Medical Genetics and Genomics (ACMG) criteria as pathogenic as follows: the SOD1 p.Leu68Pro and p.Gly73Ser, and the TBK1 p.Ile397Thr. Among them, the two SOD1 variants were described in both fALS and sALS cases [12,13] and in an Italian sALS case, respectively [14,15]. The TBK1 p.Ile397Thr was previously described by our group [16]. Three variants, reported in previous NGS studies, were classified as variants of uncertain significance (VUS) as follows: the ALS2 p.Pro372Arg, the OPTN p.Gln314Leu, and the SETX p.Lys218Asn [15,17,18]. Notably, we identified the BSCL2 p.Leu427Pro missense variant, previously reported in ALS literature [18], in three unrelated patients but also in nine controls, hence, this variant was excluded from this study (Tables S3 and S4).

Variants Previously Reported in both ALS and Non-ALS Literature
Four variants, which were identified in seven different ALS patients, have been previously reported in both the ALS and non-ALS literature. We identified the SETX p.Arg20His variant, classified as VUS, in three unrelated patients, presenting with a pyramidal, predominant upper motor neuron (PUMN), and classic phenotype, respectively. This variant has previously been identified in compound heterozygosity in patients with ataxia with oculomotor apraxia type 2 (AOA2) [19] and in patients with CMT2 [20]; however, it has also been previously reported in ALS individuals [8,15,21]. The three variants identified by our screening in the SPG11 gene have previously been reported in both ALS [15,18] and hereditary spastic paraplegia (HSP) cohorts [22]. We also identified the following two variants in ALS patients and control cases, classified by ACMG as likely pathogenic: the FIG4 p.Ile41Thr and the ERBB4 p.His374Gln (Table S4). The FIG4 p.Ile41Thr missense, originally described in compound heterozygous in demyelinating autosomal recessive CMT neuropathy type 4 juvenile (CMT4J) patients [23][24][25][26], was also subsequently reported in ALS patients, with autosomal dominant transmission [8,18,27,28]. The ERBB4 p.His374Gln missense, described in ALS [29] and cancer patients [30], was identified in one sALS and one unrelated fALS patient. These variants were excluded from our results.

Variants Previously Reported in Non-ALS Literature
We identified eight variants that have been previously reported in the non-ALS literature. Among them, seven variants have previously been associated with a neurological phenotype, while three variants have been associated with patients with non-neurological diseases. The SPG11 p.Cys1996Leufs*4 frameshift was classified by ACMG as pathogenic. This variant has been previously reported in compound heterozygosis in one Israeli [31] and one Italian patient [32], both presenting with autosomal recessive (AR) HSP. Similarly, we identified this variant in a classic sALS patient with SPG11 compound heterozygosity. The following six variants were classified as VUS: the MFN2 p.Asn525Ser. the BSCL2 p.Ala282Thr and p.Arg345Trp. the SETX p.Arg1538Trp. and the SQSTM1 p.Leu268Val and p.Pro118Ser. The MFN2 p.Asn525Ser, which was identified in a bulbar sALS patient, has been previously reported in a large NGS CMT study [33]. The SQSTM1 p.Leu268Val has been previously reported in one Italian patient with Alzheimer's disease (AD) [34]; additionally, we identified another SQSTM1 variant (p.Pro118Ser), which had been previously reported in one patient with ALS, a frontotemporal dementia (FTD) patient, and in two control individuals [34,35]. The BSCL2 and SETX VUS variants have been previously reported in B-cell lymphoma [36] and cancer patients, respectively [37]. Finally, the HSPB3 p.Arg116Pro variant has recently been described in myopathy patients [38]. This variant is predicted to be deleterious by in silico programs, since it is located in the highly conserved (alfa) crystalline domain [38].
The following four variants were identified in our in-house controls and therefore were excluded from our results: the ALS2 Gly1069Glu, the MFN2 p.Arg468His, the SPG7 p.Ala510Val, and p.Arg486Gln variants. The ALS2 Gly1069Glu, which was identified in a pyramidal sALS patient, has been reported in the HSP family by a previous NGS study [39]. We identified the MFN2 p.Arg468His variant in a patient with flail arm syndrome, a restricted ALS phenotype characterized by predominant LMN involvement in upper limbs, and pes cavus (i.e., an abnormally high plantar longitudinal arch). This variant, whose pathogenicity has also been supported by functional in vitro studies [40], has been previously reported in patients with axonal CMT (CMT2) [41][42][43], although a role in demyelinating CMT (CMT1) has been more recently suggested [44,45]. The SPG7 p.Ala510Val variant has been previously reported in patients with AR HSP [46,47], whereas the p.Arg486Gln has been previously reported in both AR HSP and progressive external ophthalmoplegia (PEO) [48][49][50]. The SPG7 p.Ala510Val variant is located in a highly conserved region and its pathogenicity has been previously supported in a yeast model [51].

Variants Present Only in Population Databases
Eighteen variants were not reported in the literature but were recorded in the general population as variants at a very low frequency and have not been previously associated with a clinical phenotype. PLEKHG5 p.Gly1017Arg was reported only in dbSNP. We identified five patients harboring three different variants in the BSCL2 gene. Among these variants, the p.Ser230Asn variant, identified in a bulbar sALS patient, was predicted as potentially pathogenic by ACMG classification, while the p.Ser353Thr and the p.Pro428Ser were classified as VUS and were both found in patients harboring more variants of interest. The DCTN1 p.Val496Leu variant, which was located in the dynein binding domain and predicted as pathogenic, was identified in a fALS patient presenting with a classic phenotype and no cognitive impairment. Notably, autosomal dominant (AD) DCTN1 mutations have been reported in patients presenting with a lower motor neuron disease, ALS, and related to the ALS and FDT families [52,53]. While mutations in the HSPB1 gene have been previously associated with HMN, dHMN, and CMT2 [54,55], we detected the HSPB1 p.Ser135Ala in a female patient presenting with a classic ALS phenotype. This variant is located in the protein alpha-crystalline domain and is classified as likely pathogenic by ACMG. The HSPB3 p.Arg116X, predicted to create a premature stop codon, and thus classified as likely pathogenic, was identified in a patient presenting with classic ALS associated with FTD. The HSPB3 p.Gly67Ser was instead classified as VUS. Finally, the SPG11 p.Ser559Thr which was identified in a pyramidal sALS patient and the three DYNC1H1 variants, a gene previously associated with AD CMT2 and spinal muscular atrophy (SMA) phenotypes [56], were all classified as VUS. The PLEKHG5 p.Arg107Cys was identified in two unrelated sALS patients and controls, and therefore was excluded from the study.

Novel Candidate ALS-Associated Variants
Six variants were absent from all public genomic databases, including dbSNP, and can be considered to be novel. Of the two novel DYNC1H1 variants, the p.Lys1395Gln was predicted as likely pathogenic. Interestingly, this variant was detected in a fALS patient harboring multiple gene variants and presenting with an aggressive disease course and 11 months survival. The DYNC1H1 p.Ser1768Ile was instead predicted as VUS. The SPG11 frameshift p.Met1609Serfs*31 was the only one identified in a patient presenting with a flail arm phenotype associated with FTD. The FIG4 p.Pro344Thr variant, identified in a flail arm patient, was located in the protein phosphatase domain and was predicted as likely pathogenic by ACMG classification. Finally, the SETX p.His1951Leu variant was identified in a patient with multiple variants. Further investigations are required to determine if the novel variants have an effect at the protein level, both in the structure or in the functional activity, as well as in in affected individuals and pedigrees.

Co-Occurrence of Variants in ALS Genes Panel and Survival Analysis
In our cohort of ALS patients, 14 patients (17% overall, 24% in fALS, 15% in sALS) harbored multiple rare variants in more than one gene. Eleven of these fourteen patients presented changes in two genes and three patients carried changes in three genes ( Table 2). Thirteen of these fourteen patients (93%) showed at least one variant in an ALS-associated gene; three patients harbored changes in three ALS genes, four patients in two ALS genes, while six patients harbored one gene variant in an ALS gene associated with one variant in a HMN/CMT2 gene. Finally, one patient had two variants in MND-spectrum genes. Interestingly, the two patients with the previously reported mutations in the SOD1 gene presented changes in another ALS gene, while four of the seven patients with the pathogenic C9orf72 repeat expansion were harboring a second gene variant. The C9orf72 repeat expansion was associated with ALS associated genes (DCTN1 and DYNC1H1) and MND-spectrum genes (HSPB1); of note, all these genes were previously associated with a dHMN, HMN, or CMT2 phenotype.
The Kaplan-Meier analysis showed that a higher variant burden is associated with reduced survival in our cohort of ALS patients, log-rank (Mantel-Cox) χ 2 = 10.52, p = 0.005 ( Figure 2A). The ALS patients harboring two or more rare variants had a significantly shorter median survival (28.00 months, 95% CI 16.1-39.9 months) as compared with the ALS patients carrying one rare variant (43.00 months, 95% CI 35.1-50.8) and with the ALS patients harboring no rare variants (63.00 months, 95% CI 41.8-84.1 months). Additionally, we showed that a higher rare variant burden is associated with reduced time to reach King stage 4, log-rank (Mantel-Cox) χ 2 = 8.61, p = 0.01 ( Figure 2B) as compared with the ALS patients harboring one or no rare variants. The ALS patients harboring two or more rare variants have a significantly shorter median time to reach King stage 4 (19.00 months, 95% CI 10.4-27.6 months) as compared with the ALS patients carrying one rare variant (37.00 months, 95% CI 25.2-48.8) and with the ALS patients harboring no rare variants (34.00 months, 95% CI 10.1-57.9 months). These results were confirmed at MAF < 0.001 in the European population of the ExAC database ( Figure S1). The Kaplan-Meier analysis did not show an association between rare variant burden and age at onset.
The Cox multivariate analysis confirmed that the burden of rare variants is independently associated with reduced survival in MND with an increased proportional hazard ratio (HR) for patients harboring more than two rare variants of 5.61 (95% CI 2.38 to 13.22, p ≤ 0.001) (Table 3). Similarly, the Cox regression model confirmed that the burden of rare variants is an independent prognostic factor for the time to reach Kings stage 4, with an increased HR for patients harboring more than two rare variants of 3.09 (95% CI 1.37 to 6.97, p = 0.007). Both results were confirmed at MAF < 0.001. (Table S6). changes in three ALS genes, four patients in two ALS genes, while six patients harbored one gene variant in an ALS gene associated with one variant in a HMN/CMT2 gene. Finally, one patient had two variants in MND-spectrum genes. Interestingly, the two patients with the previously reported mutations in the SOD1 gene presented changes in another ALS gene, while four of the seven patients with the pathogenic C9orf72 repeat expansion were harboring a second gene variant. The C9orf72 repeat expansion was associated with ALS associated genes (DCTN1 and DYNC1H1) and MNDspectrum genes (HSPB1); of note, all these genes were previously associated with a dHMN, HMN, or CMT2 phenotype.
The Kaplan-Meier analysis showed that a higher variant burden is associated with reduced survival in our cohort of ALS patients, log-rank (Mantel-Cox) χ 2 = 10.52, p = 0.005 (Figure 2A). The ALS patients harboring two or more rare variants had a significantly shorter median survival (28.00 months, 95% CI 16.1-39.9 months) as compared with the ALS patients carrying one rare variant (43.00 months, 95% CI 35.1-50.8) and with the ALS patients harboring no rare variants (63.00 months, 95% CI 41.8-84.1 months). Additionally, we showed that a higher rare variant burden is associated with reduced time to reach King stage 4, log-rank (Mantel-Cox) χ 2 = 8.61, p = 0.01 ( Figure 2B) as compared with the ALS patients harboring one or no rare variants. The ALS patients harboring two or more rare variants have a significantly shorter median time to reach King stage 4 (19.00 months, 95% CI 10.4-27.6 months) as compared with the ALS patients carrying one rare variant (37.00 months, 95% CI 25.2-48.8) and with the ALS patients harboring no rare variants (34.00 months, 95% CI 10.1-57.9 months). These results were confirmed at MAF < 0.001 in the European population of the ExAC database ( Figure S1). The Kaplan-Meier analysis did not show an association between rare variant burden and age at onset.
The Cox multivariate analysis confirmed that the burden of rare variants is independently associated with reduced survival in MND with an increased proportional hazard ratio (HR) for patients harboring more than two rare variants of 5.61 (95% CI 2.38 to 13.22, p ≤ 0.001) (Table 3). Similarly, the Cox regression model confirmed that the burden of rare variants is an independent prognostic factor for the time to reach Kings stage 4, with an increased HR for patients harboring more than two rare variants of 3.09 (95% CI 1.37 to 6.97, p = 0.007). Both results were confirmed at MAF < 0.001. (Table S6).

Discussion
In this study, by using next generation sequencing and C9orf72 repeat expansion analysis, we analyzed 29 genes in an Italian cohort of 83 patients with both familial and sporadic ALS. Overall, we detected 43 rare variants in 17 different genes and found that 43.4% of ALS patients harbored a variant in at least one of the investigated genes, with 17% of the patients showing co-occurrence of more than one gene variant. Moreover, our results show that rare variant burden is associated with reduced survival and the time to reach King stage 4, independently of known negative prognostic factors.
Overall, our findings confirm and highlight the importance of genetic susceptibility in both familial ALS, as well as in a major proportion of apparently sporadic ALS cases [15,17,57]. A direct comparison with previous studies is methodologically difficult, since our cohort was clinic-based with no strict inclusion criteria and we could not address whether differences in populations could be involved. However, the high frequency of variants detected in our cohort could probably, in part, be due to the large number of genes we examined and, in part, because some variants were observed only in a heterozygous form but could be associated with ALS as recessive or within the context of a multistep disease model [6]. Moreover, apparently, sALS patients can carry pathogenic variants with low penetrance influencing their classification as fALS versus sALS [8,17].
An interesting finding is that 28% of the variants were identified in CMT2/dSMA genes. These data confirm the emerging genetic pleiotropy of ALS [1,2,9,58]. More specifically, in relation to the specific design of our panel, our data suggest an overlap with other diseases sharing degeneration of motor axons and neurons as a common feature, such as the axonal hereditary neuropathies [9].
Mutations in the MFN2 gene are largely considered to be the primary cause underlying CMT2 [59]. Pes cavus, which is traditionally considered a distinguishing CMT clinical sign, has also been anecdotally reported in ALS patients [60], and is estimated to be as high as 2% of our case series [8]. We identified two MFN2 variants in our cohort. The p.Arg468His variant has been previously reported in CMT2 patients and its pathogenicity has also been supported by functional in vitro studies [33]. However, we excluded this variant since it was identified in our non-neurological control group.
Mutation in DYNC1H1, classified by the ALSoD database as an ALS gene, have been previously associated with autosomal dominant CMT type 2O but also with spinal muscular atrophy with a predominance of lower extremity involvement (SMA-LED), HSP, and hereditary mental retardation with cortical neuronal migration defects [59][60][61]. Among the five variants identified by our screening in DYNC1H1, the p.Lys1395Gln, which was predicted as pathogenic by in silico tools, was interestingly detected in a fALS patient with multiple gene variants and presenting with a classic ALS and aggressive disease.
Mutations in BSCL2, encoding Seipin, are responsible for pleiotropic clinical manifestations ranging from autosomal-recessive congenital generalized lipodystrophy to autosomal-dominant seipin-related motor neuron diseases, such as distal hereditary motor neuropathy type V (dHMN-V), Silver syndrome/SPG17, and CMT2 [62]. Among the variants we found in this gene, three were already reported as VUS, while the other two variants, p.Ser230Asn and p.Ser353Thr, were present in ExAC database at a very low frequency (MAF < 0.00001). Among them, the former was located in the highly conserved functional domain, analogously to the p.Ala282Thr identified by us, while the Ser353 was a cAMP and cGMP-dependent protein kinase phosphorylation site, which allowed us to speculate about the potential pathogenicity of the latter variant [63].
Structurally, small heat shock proteins B (HSPBs) are characterized by a C-terminal α-crystallin domain, which is highly conserved and important for the formation of stable dimers; therefore, mutations involving this sequence are regarded as disease causing [64]. To date, about twenty different mutations have been reported in the HSPB1 gene, associated with HMN, dHMN, dSMA, and CMT2, with both dominant and recessive inheritance [54,55]. We found the novel p.Ser135Ala variant, located in the α-crystalline domain and predicted as damaging by in silico tools. HSPB1 p.Ser135Phe/Cys/Tyr mutations have already been reported in dHMN and CMT2, [54,55] supporting the pathogenicity of this new variant. Intriguingly, our patient presented with flail leg syndrome, a restricted ALS phenotype characterized by predominant LMN involvement in the lower limbs and was also harboring the pathogenic C9orf72 repeat expansion. HSPB3, a paralogue of HSPB1, is expressed in several tissues including adult smooth muscle and brain [65]. Several mutations in this one-exon gene have been previously associated with distal hereditary motor neuropathy 2C (dHMN2C), myopathy, and CMT2 [64]. The p.Arg116Pro mutation which we found in this gene has recently been described in a patient with myopathy and functional studies have provided evidence that HSPB3 mutants could be associated with a gain of toxic function by inhibiting the formation of the HSPB2-HSPB3 complex and increasing the propensity to form nuclear aggregates [38]. Notably, in the same position, we identified the p.Arg116X nonsense variant, never reported before in the literature, in a patient with a classic phenotype. Therefore, this variant could be considered to be likely pathogenic.
Although FIG4 is classified as a major ALS gene, it was originally reported as a causative gene for CMT4J, an AR motor and sensory neuropathy [23,27]. The FIG4 p.Pro344Thr variant, identified in a flail arm patient, is located in the protein phosphatase functional domain and is predicted as deleterious by in silico tools; hence, further studies are needed to understand if this alteration can influence the protein activity. Although the same p.Ile41Thr mutation described first in CMT4J was also subsequently identified in AD fALS and sALS [28], we also found it in two controls, not supporting its pathogenicity.
In the SETX gene, we found nine unrelated patients who were carrying three variants classified as VUS, which have been reported in the ALS and non-ALS literature, including CMT2 and AOA2 [19,20], as well as the novel p.His1951Leu variant which was identified in a patient with multiple gene variants. Interestingly, as already noticed in prior NGS studies, we found an abundance of variants in this gene, which are generally described as rare [8,15].
Among the variants identified in ALS genes, we found an in-frame deletion in the protein glycin-rich domain of the FUS gene, which is absent from all public genomic databases or dbSNP. Similar insertions and deletions of G-residues in this G10-stretch region have already been reported in ALS patients [66], but also in control individuals, as confirmed by the analysis of our control group, suggesting that small G-deletions in this region could be tolerated [67,68].
In spite of the relative abundance of SPG11 variants, this gene is associated with AR HSP with thin corpus callosum and juvenile ALS under recessive disease models [69,70]. We identified the p.Cys1996Leufs*4 frameshift in a classic sALS patient with SPG11 compound heterozygosity; intriguingly, this variant has been previously reported as pathogenic in compound heterozygosity in two different unrelated patients, however, presentin with an HSP phenotype [31,32].
Recent findings have suggested a complex model of inheritance in ALS, which is consistent with a multistep pathogenetic process and could explain phenomena such as reduced penetrance or phenotypic heterogeneity, and clinical presentation and progression, which can be observed even in familial cases and between patients harboring the same gene mutation [6,8]. In our cohort, about 15% of patients were harboring variants in at least two of the investigated genes, confirming the findings of previous studies and supporting the hypothesis that rare variant burden can play a role in the multistep model of disease. Such a model would be inclusive of patients with mutations in major ALS genes [8,15,17,57,[71][72][73], even if for such genes as C9orf72, SOD1, TARDBP and FUS a reduced number of pathogenetic steps has been recently suggested [74]. It is important to note that the percentage of co-occurrence of gene variants estimated by our study is higher as compared with most of the previous studies, including a recent large-scale study, identifying oligogenity in 1% of ALS patients [75]. The reported frequencies of patients with multiple variants in ALS are indeed varied, having been estimated in previous studies as 1.6% [15], 1.3% [76], 1.4% [72], 3.8% [8], 7.9% [73], 18.8% [17], and 19.5% [71]. These differences reflect a number of methodological issues such as the number and selection of sequenced genes (i.e., gene size effect or genetic variability of studied genes), presence of controls, and criteria adopted for filtering and, most importantly, for predicting the pathogenicity of identified variants. Indeed, there is no consensus on the criteria for defining a pathogenic combination of mutations, and hence the appropriate use of the "oligogenic inheritance" term itself is debatable since some variants can be classified as VUS [77] and the co-occurrence of a pathogenic variant with a VUS has been considered to be oligogenic inheritance by some studies [17,71,72], whereas in other studies the term was rigorously restricted to previously reported or potentially pathogenic variants [75]. Therefore, we acknowledge that further mechanistic functional analyses or segregation studies are needed in order to scrutinize the pathogenicity of each of the variants we identified. Indeed, the various in silico models for prediction of pathogenicity are not always consistent (Table 1) [77]; for this reason, we treated all rare variants equally in the survival analysis and did not take into account whether some variants could be more deleterious than others, in line with previous reports [17,71,72]. Notably, we showed that a higher variant burden is associated with reduced survival and time to reach King stage 4, i.e., the time to reach such important disease milestones as the need for percutaneous endoscopic gastrostomy (PEG) positioning or non-invasive mechanical ventilation (NIMV) initiation, independently from known negative prognostic factors [78,79]. This finding supports a model of ALS where the additive or synergistic effects of multiple defective genes influence disease phenotype and, in particular, affects the progression of ALS. Therefore, these data could potentially have important implications not only in clinical practice, but also for patient stratification in clinical trials, using their genetic profile.
Although an association between rare variant burden and the time to reach King stage 4 has never been explored before, our data are in accordance with previous studies, similarly showing a reduced survival in patients with variants in more than one gene [71] and a correlation with the rate of disease progression [80]. Our results did not show a relation between rare variant burden and age of onset, in line with previous reports [68]. However, other papers reported conflicting results, suggesting either earlier [8,73] or later [17] age of onset in patients carrying variants in more than one ALS gene.
Taken together, these data point to an emerging relevance of oligogenic models in ALS with growing implications in ALS molecular diagnosis and counseling, genotype-phenotype correlation studies, as well as the development of novel therapeutic strategies.
Our study has several limitations. First, our sample size is relatively small, and this is not a population-based study and as such it is potentially affected by referral bias. However, we were able to provide detailed phenotyping in order to test potential genotype-phenotype correlations. Secondly, unlike most of previous studies, although we included in our analysis some genes associated with HMN/dSMA/CMT2, based on their association with other hereditary motor syndromes [9], we acknowledge that the gene list is far from being exhaustive, and therefore this is not a truly unbiased survey of the exome or genome. However, our study design allowed us to explore specific hypotheses. We also acknowledge that the panel was designed before the discovery of some more recent ALS genes. The development of NGS technologies has certainly given us the advantage of accelerating the generation of a large amount of sequencing data; however, it has also emphasized the complexity of determining the pathogenicity of variants. This can be even more difficult in complex diseases such as ALS, for which a multistep pathogenetic process has been proposed [6]. Indeed, individual variants alone could be tolerated but when combined with a second variant would exceed the threshold required for neurodegeneration [8], a consideration that led us to not restrict our analysis to previously reported or pathogenic ALS variants, in line with previous studies [17,71,72], even if other studies adopted more stringent criteria [75]. Nonetheless, it could also be anticipated that many of the possible disease variants identified by NGS studies are not associated with ALS [15]. Indeed, it has been shown that individuals from different populations carry different profiles of rare and common variants, and that low-frequency variants show substantial geographic differentiation [81]. Given that variants in a certain region or domain of a gene are associated with ALS as disease causing variants are rare, it could be possible that the lack of study of specific populations could explain the number of variants that have not been previously reported. Moreover, the results obtained could be linked to the size of the gene or the intrinsic variability of the genes tested. In order to control for these biases, we analyzed a large cohort of in-house non-neurological controls, and excluded the variants found in the two groups. However, our analysis of non-neurological controls showed that the gene variability was comparable to that observed in the ALS group and some variants that were associated with ALS by previous studies could also be detected in the controls (Table S5), which is an observation in line with previous reports [75]. In support of our results, however, our survival analysis showed an independent negative effect for patients harboring more than one variant, indirectly corroborating the hypothesis of a potential detrimental effect of the burden of the rare variants that we identified. However, as stated above, we acknowledge that many of the novel and rare variants identified require further studies in larger cohorts of patients to validate a possible contribution to ALS.
In summary, our data contribute to a better understanding of the molecular basis of ALS supporting an oligogenic model of disease and a negative prognostic effect of rare variant burden. Moreover, our results suggest a further extension of the genetic landscape of ALS to other genes traditionally implicated in degenerative diseases of peripheral axons, such as HMN, dSMA, and CMT2. However, our results should be replicated in other independent and larger cohorts of ALS patients.

Subjects
We sequenced a cohort of 83 unrelated Italian ALS patients. The local ethics committee approved the study protocol. No strict inclusion or exclusion criteria were adopted; patients were included upon clinical judgment; individuals with earlier disease onset, fALS, or with no molecular diagnosis at the time of inclusion were prioritized for NGS sequencing by referring clinicians. Most of the included patients had previously been analyzed with Sanger sequencing for SOD1, C9orf72, TARDBP, and FUS variants. Patients with definite, probable, laboratory supported, or possible ALS were included in the study, according to El Escorial revised criteria [82]. Blood samples were obtained for diagnostic purposes and stored in our tissue bank, after informed consent. Progression rate was defined by the slope on the revised ALS functional rating scale (ALSFRS-R). Seventeen patients (20.5%) had fALS.
The NGS targeted regions encompass 455 coding exons each including 20 intronic flanking bases and the 5 and 3 untranslated regions (UTRs). A customized design (TruSeq ® Custom Amplicon, TSCA; Illumina, San Diego, CA, USA) was employed with online probe design performed by the Design Studio (DS) software (https://designstudio.illumina.com/). The regional source of coding exons was extracted from the UCSC database (Genome Browser human GRCh37/hg19). A 98% coverage was predicted for the targeted region, 127,008 bp long, with a total of 968 amplicons. Targeted resequencing was performed using the MiSeq ® sequencing platform according to the manufacturer's procedure (Illumina, San Diego, CA, USA). We analyzed single variants reported in the FASTQ and VCF output file with a commercially available NextGENe ® software (version V4.0.1 SoftGenetics ® , State College, PA, USA) and visualized via Integrative Genome Viewer (IGV) software (http://www.broadinstitute.org/software/ igv/) [84,85]. The C9orf72 repeat expansion was analyzed in all the patients, using both amplicon-length and repeat-primed polymerase chain reactions, as described before [86].

Filters
All the regions with a sequencing depth <30 were considered to be not suitable for analysis, according to the guidelines of the American College of Medical Genetics and Genomics [77]. All variants with coverage depth >30 were filtered based on the following criteria [87]: (i) Qscore > 30; (ii) variants with minor allele frequencies (MAF) > 0.01 identified in the dbSNP150 database (www.ncbi.nlm.nih.gov/ projects/SNP/) or Exome Aggregation Consortium Sequencing Project (ExAC; exac.broadinstitute.org) were filter out, in agreement with the ACMG classification criteria [77], with the exception of variants reported as pathogenic or of uncertain significance [8,18,73,87]; (iii) synonymous changes were excluded except for those located within or near splice sites. The impact of candidate variants was evaluated using prediction tools: Sorting Intolerant from Tolerant software (SIFT; sift.jcvi.org) [88], Polymorphism Phenotyping (PolyPhen-2; genetics.bwh.harvard.edu/pph2/) [89], and Mutation Taster (www.mutationtaster.org/) (Table S8). Clinical significance of reported variants was assessed on the basis of the ACMG guidelines [77]. A cohort of 332 non-neurological unrelated Italian patients, for whom NGS exome sequencing data were available from our in-house database, was selected as the control group. Variants in common between ALS patients and controls were excluded.

Sanger Sequencing Validation
All the identified variants were confirmed by Sanger sequencing. We designed primers using Primer3 (http://biotools.umassmed.edu/bioapps/primer3_www.cgi) to perform a direct sequencing. The PCR products were purified using AMPure (Agencourt-Beckmann Coulter, Inc., Brea, CA, USA), then, sequenced in both directions using a Big Dye Terminator v1.1 Cycle Sequencing Kit (Applied Biosystems Foster City, CA, USA). Sequencing products were purified using a Big Dye X-Terminator Kit (Applied Biosystems Foster City, CA, USA) and run on an ABI 3730 Genetic Analyzer (Applied Biosystems Foster City, CA, USA). Amplicon Sequences were compared with the reference sequence (hg19) using Sequencer 5.0 Software (Gene Codes). Primers are available on request. The validated variants were reported in the results depending on whether they were previously reported in the ALS literature, in the non-ALS literature, in population databases (including those reported only in dbSNP), or never reported (novel variants).

Statistical Analysis
Descriptive statistics are reported as count and percentage, for categorical variables, or mean and standard deviation, for continuous variables. The Chi-squared test, with Bonferroni correction for multiple comparisons, was used to explore differences in gene variant frequencies between ALS patients and non-neurological controls (Table S5). Kaplan-Meier univariate analyses were carried out to determine the effect of the burden of rare variants at MAF < 0.01 [8,18,73,87] on survival (defined as time from symptoms onset to death or tracheostomy), time needed to reach KING stage 4 (defined as the time from symptom onset to PEG or NIMV positioning), and age of onset. Moreover, we confirmed the results of survival analysis at the more stringent cut off of MAF < 0.001 [70]. Log-rank tests were used to test for differences between groups. Subsequently, multivariable analysis with Cox proportional hazards model (stepwise backward) was performed to estimate the proportional hazard ratio (HR) of rare variant burden on survival and on time needed to reach KING stage 4. Cox regressions were adjusted for the following factors known to influence survival in ALS patients: site of symptoms onset, diagnostic delay, presence of dementia, age at the onset, and progression rate, defined by the slope on the revised ALS functional rating scale (ALSFRS-R), i.e., ∆ALSFRS-R = (48 minus ALSFRS-R score )/(date of the ALSFRS-R score minus date of onset) [77,89]. In the survival analysis, we treated all rare variants equally, and did not take into account whether some variants could be predicted to be more deleterious than others, according to previous ALS studies and since reliable models for rare variants interactions are still lacking [17,70,71]. Statistical analyses were performed using IBM Statistical Package for Social Science (SPSS) version 20, with p-value < 0.05.
Supplementary Materials: Supplementary materials can be found at http://www.mdpi.com/1422-0067/21/9/3346/ s1. Table S1. Clinical and demographic characteristics of ALS patients. Table S2. Rare variants identified in this study and clinical features of the index patients. Table S3. Rare variants identified in non-neurological controls (664 Chromosomes). Table S4. Variants identified in this study and presence in controls. Table S5. Overview of variants identified in ALS patients and controls. Table S6. Cox proportional hazards regression multivariate analysis on survival and time to King stage 4 (MAF < 0.001). Table S7. Genes present in the panel. Table S8. Criteria for evaluation and filtering of the identified variants in this study. Figure S1 Funding: This study was partially supported by a grant from the Italian Ministry of Health (#RF-2011-02351193).

Conflicts of Interest:
The authors declare no conflict of interest. The funders had no role in the design of the study; in the collection, analyses, or interpretation of data; in the writing of the manuscript, or in the decision to publish the results.