Introduction

There is a wide range of hereditary and non-hereditary cerebellar ataxias (CA) with variable clinical symptoms, usually characterized by gait and limb ataxia, dysarthria, and oculomotor disorders, such as gaze-evoked nystagmus or saccadic smooth pursuit [1]. Although many genetic causes of CA have been discovered in recent years [2], several cases remain idiopathic. Unknown genetic factors and genetic susceptibility factors have been suggested as contributing to the degeneration of the cerebellum [2, 3].

Although downbeat nystagmus (DBN) is the most common involuntary fixation nystagmus [4], it is a very rare sign. DBN is often associated with other cerebellar ocular signs such as saccadic smooth pursuit or gaze-holding deficits (for reviews, see [5, 6]). The most common symptoms of DBN are unsteadiness of gait (89% in idiopathic DBN vs. 81% in secondary DBN) and oscillopsia (44% vs. 38%) [4]. DBN is most often caused by a bilateral hypofunction of the flocculus or paraflocculus [7, 8] which, due to an impaired function of the Purkinje cells (PCs), causes a disinhibition of superior vestibular nuclei neurons, leading to a slow upward drift of the eyes and a quick downward saccade [9]. Several theories concerning the pathophysiology of DBN have been proposed: (A) an asymmetry of peripheral vestibular input [10], (B) a central imbalance in the vertical vestibulo-ocular system [9, 11], (C) an imbalance of the smooth pursuit system [7, 12], and (D) a mismatch of the coordinate systems of the burst generator and the vertical generator [13].

In many cases, the etiology remains unclear (so-called idiopathic DBN: 38%); in other cases, an underlying structural pathology can be found, such as degenerative disorders of the cerebellum (20%), vascular lesions (9%), and malformations (7%) [4]. DBN is also frequently found in patients with spinocerebellar ataxia type 6 (SCA6) as well as in episodic ataxia type 2 (EA2) [14], whereas DBN was not found in other genetic CA such as SCA1, SCA2, SCA3, and SCA31 [15,16,17]. All in all, although DBN is the most common form of fixation nystagmus and has characteristic clinical features that allow diagnosis, its underlying etiology often remains unclear.

Therefore, we used a different approach and performed a genome-wide association study (GWAS) in patients with idiopathic DBN in order to identify genes which might be associated with the disease and to allow a further evaluation of the pathophysiology and etiology of DBN.

Material and Methods

Patients

Patients of European descent were recruited at the German Center for Vertigo and Balance Disorders and the Department of Neurology at the Ludwig Maximilians University (LMU), Munich, from 2012 to 2017. The detailed medical history, medication, and harmful substance abuse of each participant were assessed using a semi-structured interview. In addition, a detailed family history of first-degree relatives was collected, focusing in particular on neurological genetic disorders and DBN. None of the DBN patients reported cerebellar disorders in the parent generation. All patients underwent an extensive neurological examination and cerebral imaging by MRI or CT. Any evidence of symptomatic DBN led to exclusion from the study (see Exclusion Criteria).

Inclusion Criteria

All patients included showed DBN, which has been defined as downwardly beating fixation nystagmus in primary position with an increase of the intensity of the nystagmus during lateral and downward gaze. The patients included in this study can be divided into two subgroups: (1) idiopathic DBN in association with other cerebellar oculomotor disorders (85 patients); and (2) idiopathic DBN in association with other cerebellar disorders such as cerebellar ataxia or with cerebellar atrophy (21 patients).

Exclusion Criteria

Subjects with any evidence of symptomatic DBN were excluded. This included cerebellar/brainstem infarction or hemorrhage; cerebellar tumor; evident neurodegenerative cerebral disorders/syndromes (e.g., multiple system atrophy); inflammatory, infectious, and immune-mediated cerebellar damage; toxic and nutritional cerebellar damage (e.g., due to alcohol); paraneoplastic cerebellar degeneration; or other infratentorial structural lesions.

Healthy Volunteers (PAGES)

Healthy volunteers of German descent were recruited from the greater Munich area into PAGES (Phenomics and Genomics Sample). PAGES consists of approximately 3000 healthy individuals with a detailed medical, neurological, and psychiatric history of the participants themselves and of their first-degree relatives. Participants were assessed using semi-structured and comprehensive interviews including the Structured Clinical Interview for DSM-IV (SCID I, II) [18, 19] and the Family History Assessment Module [20]. Individuals suffering from psychiatric or neurological diseases as well as subjects with CNS impairment or a self-reported history of DBN were excluded.

Genotyping

Genomic DNA was extracted from whole blood using the QIAamp DNA Maxi Kit (Qiagen), according to the manufacturer’s instructions and dissolved in nuclease-free water. The concentration of genomic DNA was measured using Picogreen (Molecular Probes) and adjusted to 50 ng/μl. Samples were genotyped on different platforms.

Genome-Wide Association Analysis

Genotyping and association analysis were performed as previously described [21].

Overview

Genotypes of the different platforms were imputed in seven batches and combined into one large dataset. Quality control and imputation of batch 1 (Human610-Quad [22], Human660W-Quad [23]), batch 2 (HumanHap300 [24]), and batch 3 (Affymetrix 6.0 [25]) were performed in the framework of a schizophrenia meta-analysis conducted by the Psychiatric Genomics Consortium (PGC) [26]. Batches 4 (HumanHap300 [27]), 5 (Illumina HumanOmniExpress-12 [28]), 6 (Illumina Omni1-Quad [29]), and 7 (HumanOmniExpress-24) were processed following the quality control and imputation protocols used by the PGC. Datasets were combined to obtain a sufficient number of controls. For GWAS analysis, selected patients and controls were extracted from the combined dataset after global quality control [21].

Global Quality Control

PLINK 1.9 [30] was used for global quality control of the genotype data. A pre-QC filtered SNV (single nucleotide variant) set (missingness < 0.05) was used to exclude subjects with mismatches between reported and estimated gender and samples with call rates below a chip-specific threshold. Sample call rate thresholds differed slightly between chips to account for smaller sample sizes (96–99%). After subject removal, SNV quality was assessed, and variations were excluded based on the following criteria: SNV call rate < 99%, deviations from Hardy-Weinberg equilibrium in controls (p ≤ 10⁻⁶) or cases (p ≤ 10⁻¹⁰), and call rate differences ≥ 0.02 between cases and controls. Also, X-chromosomal markers with a haploid heterozygosity rate > 2%, missingness ≥ 0.05, or HWE p ≤ 10⁻⁶ in females were removed. A more stringently quality-controlled (MAF ≥ 0.05, HWE p ≥ 0.05, call rate ≥ 0.99) and LD-pruned autosomal marker set was used for cryptic relatedness, heterozygosity deviation, and population stratification analyses. The marker set was pruned with PLINK’s indep-pairwise command using r² = 0.2, a window size of 1500, and shifting the window 150 SNVs at each step. Additionally, several high LD regions, such as the extended MHC region, were excluded. One subject of each pair with π̂ > 0.1875 was removed; cases were generally preferred over controls. As sample contamination results in an increase of heterozygote calls and thereby an overestimation of cryptic relatedness, individuals showing a heterozygosity deviation with |Fhet| ≥ 0.2 were excluded. In addition, the number of π̂-values > 0.05 per individual and the distribution of π̂ means were used to check for outliers and possible sample contamination. Population stratification analysis was done with EIGENSTRAT [31]. SNV loadings were checked for normality, and the derived principal components were used to identify and remove outlying samples.
Known duplicates on different chips were checked for concordance, keeping the sample of higher quality (i.e., sample call rate and overall chip quality). Both subjects were excluded if the concordance rate was lower than 99%.
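
As an illustration only, the SNV-level exclusion criteria listed above can be written as a simple predicate. This is a minimal sketch and not the pipeline used in the study (which relied on PLINK); the `SnvStats` container and its field names are hypothetical, and only the autosomal thresholds from the text are encoded.

```python
from dataclasses import dataclass


@dataclass
class SnvStats:
    """Per-variant summary statistics (hypothetical container for illustration)."""
    call_rate: float            # overall fraction of samples with a genotype call
    hwe_p_controls: float       # Hardy-Weinberg p value in controls
    hwe_p_cases: float          # Hardy-Weinberg p value in cases
    call_rate_cases: float      # call rate among cases only
    call_rate_controls: float   # call rate among controls only


def passes_snv_qc(s: SnvStats) -> bool:
    """Return True if an autosomal variant survives the filters named in the text."""
    if s.call_rate < 0.99:          # SNV call rate < 99%
        return False
    if s.hwe_p_controls <= 1e-6:    # HWE deviation in controls
        return False
    if s.hwe_p_cases <= 1e-10:      # HWE deviation in cases
        return False
    # call rate difference >= 0.02 between cases and controls
    if abs(s.call_rate_cases - s.call_rate_controls) >= 0.02:
        return False
    return True
```

A variant with an overall call rate of 0.995, unremarkable HWE p values, and near-identical call rates in cases and controls would be retained, whereas a variant with an HWE p value of 10⁻⁷ in controls would be dropped.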

Pre-Phasing and Imputation

Each batch was pre-phased with SHAPEIT [32] and imputed separately on the 1000 Genomes reference panel phase 1 version 3 macGT1 (https://mathgen.stats.ox.ac.uk/impute/data_download_1000G_phase1_integrated.html) in chunks of 3 Mb using IMPUTE2 [33]. Chromosome X imputation was performed separately for males and females. After imputation, the seven batches were combined by retaining markers with INFO values ≥ 0.6 in every batch and the combined set. Additionally, all markers with allele frequency differences > 0.1 between any of the batches were excluded. Checks for cryptic relatedness, heterozygosity deviation, and population stratification on the combined set were performed, as described before, using the same exclusion criteria on a stringent-thresholded marker set (INFO > 0.8, missingness < 1%, MAF > 0.05) for calling best guess genotypes with uncertainty ≤ 0.1.
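
The marker selection applied when combining the batches can be sketched as follows. This is an assumption-laden illustration rather than the original implementation: the function name and the dictionary-based inputs are hypothetical, and the additional INFO check on the combined set mentioned above is omitted for brevity.

```python
def markers_to_keep(info, freq, info_min=0.6, max_freq_diff=0.1):
    """Select markers that pass the cross-batch criteria described in the text.

    info: dict mapping marker ID -> list of per-batch INFO values
    freq: dict mapping marker ID -> list of per-batch allele frequencies
    (both input layouts are assumptions made for this sketch)
    """
    kept = []
    for marker, infos in info.items():
        # INFO must be >= info_min in every batch
        if min(infos) < info_min:
            continue
        # The largest pairwise frequency difference equals max - min
        freqs = freq[marker]
        if max(freqs) - min(freqs) > max_freq_diff:
            continue
        kept.append(marker)
    return kept
```

With two batches, a marker with INFO values of 0.9 and 0.5 would be discarded because one batch falls below 0.6, and a marker with allele frequencies of 0.10 and 0.30 would be discarded because the difference exceeds 0.1.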

GWAS Analysis

The final mega dataset comprised 4575 individuals, including 107 DBN patients and 2618 healthy controls appropriate for association analysis spread across 6 batches (Table 1). The principal components for genome-wide association analysis were derived by EIGENSTRAT [31]; 1 outlying case and 9 controls were removed. Tracy-Widom statistics identified PC1 and PC2 as relevant (Fig. 1). As DBN cases were typed on one platform, controls of the six batches were used for batch effect detection. Any marker showing deviations between any of the batches was excluded (logistic regression corrected for PC1 and PC2, p < 0.001). After exclusion of 116,387 variants, the final dataset comprised 7,759,885 markers.

Table 1 Batch distribution of cases and controls
Fig. 1 Scatterplot of the first two principal components (PC) derived by EIGENSTRAT

Association testing of approximately 8 million variants (INFO ≥ 0.6, MAF ≥ 0.01 in cases and controls) using 106 patients with DBN and 2609 healthy subjects was conducted in PLINK 1.9 [30], applying an additive logistic regression model corrected for age, sex, PC1, and PC2. Quantile-quantile and Manhattan plots are shown (Figs. 2 and 3). The genomic inflation factor was 1, thus showing no sign of global inflation due to batch effects or population stratification. Results were clumped with PLINK to derive LD-independent index SNVs (3000 kb, p1 = 5 × 10⁻⁸, p2 = 1 × 10⁻⁴, r² < 0.1).
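
For readers unfamiliar with the genomic inflation factor, the following minimal sketch shows one common way to compute it from a list of association p values: each p value is converted to a 1-df chi-square statistic, and the median of these statistics is divided by the median of the 1-df chi-square distribution (about 0.4549). The function name is hypothetical, and this is not necessarily how the value reported above was obtained.

```python
from statistics import NormalDist, median

# Median of the chi-square distribution with 1 degree of freedom
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(p_values):
    """Estimate lambda from two-sided p values, each in the interval (0, 1]."""
    nd = NormalDist()
    # A two-sided p value corresponds to |z| = -inv_cdf(p / 2);
    # the 1-df chi-square statistic is z squared.
    chi2 = [nd.inv_cdf(p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN
```

Uniformly distributed p values, as expected under the null hypothesis, yield a value close to 1; systematically smaller p values yield values above 1.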

Fig. 2 Manhattan plot of the genome-wide association analysis of 106 DBN cases and 2609 controls. The x-axis shows the chromosomal position, and the y-axis shows the significance of association (−log10(p)). The red line shows the genome-wide significance level (5 × 10⁻⁸)

Fig. 3 Quantile-quantile plot of GWAS analysis. The area shaded in gray indicates the 95% confidence interval under the null

Results

One hundred and six patients with idiopathic DBN (46 females, 43%) and 2609 healthy controls (1402 females, 54%) were included in the analysis. The mean age of the patients was 70.09 ± 10.1 years; the mean age of the controls was 47.41 ± 16.6 years. Among approximately 8 million variants tested for association, we identified one genome-wide significant SNV (p < 5 × 10⁻⁸) located in the fibroblast growth factor 14 gene (FGF14) mapping to chromosome 13q33.1 (Figs. 2 and 3, Table 2).

Table 2 LD-independent SNV associations for DBN

In addition, 15 LD-independent loci with suggestive evidence of association (p < 1 × 10⁻⁵) were identified. On chromosome 13q33.1 and chromosome 5q14.1, two regions each were merged on account of physically mapping next to each other with a distance below 250 kb, resulting in 14 physically and LD-independent regions (Table 2).
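
The distance-based merging of neighboring loci can be illustrated with a short sketch. The function name and the tuple layout are assumptions, and the coordinates in the usage example are invented for illustration; they are not the study's locus boundaries.

```python
def merge_loci(loci, max_gap=250_000):
    """Merge loci on the same chromosome that lie less than max_gap bp apart.

    loci: list of (chromosome, start, end) tuples (layout assumed for this sketch).
    Returns a list of merged (chromosome, start, end) regions.
    """
    merged = []
    # Sorting groups loci by chromosome and orders them by start position
    for chrom, start, end in sorted(loci):
        if merged and merged[-1][0] == chrom and start - merged[-1][2] < max_gap:
            prev_chrom, prev_start, prev_end = merged[-1]
            merged[-1] = (chrom, prev_start, max(prev_end, end))
        else:
            merged.append((chrom, start, end))
    return merged


# Invented coordinates: two nearby loci on chr13, two nearby and one distant on chr5
example = [
    ("13", 102_400_000, 102_500_000),
    ("13", 102_600_000, 103_100_000),
    ("5", 79_900_000, 79_950_000),
    ("5", 80_000_000, 80_100_000),
    ("5", 120_000_000, 120_050_000),
]
regions = merge_loci(example)
```

In this toy input, the two neighboring loci on each chromosome collapse into one region, while the distant locus on chromosome 5 remains separate.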

The merged region on chromosome 13 contained parts of the FGF14 gene, with the genome-wide associated variant rs72665334 (p = 1.50 × 10⁻⁸) being localized in intron 9 and the suggestive variant chr13_103023008_D (p = 4.52 × 10⁻⁵) in intron 1 of the gene (Fig. 4a). These hits showed isolated signals and should therefore be interpreted with caution.

Fig. 4 Regional association plots for loci associated with DBN. In order to highlight the statistical strength of the association in the context of the surrounding markers, gene annotations and estimated recombination rates (NCBI build 37) of the SNPs in the specific regions are plotted against their corresponding p values (as −log10 values, left-hand y-axis). A purple diamond represents the SNP with the highest association signal in each locus. All other SNPs are represented as single dots, where dot colors indicate the LD with the associated SNP. Color coding represents the r² value, and respective categories are shown on the upper left-hand side. Estimated recombination rates (cM/Mb) are plotted to reflect the local LD structure surrounding the associated SNP and are shown as vertical light blue lines, marked on the right-hand y-axis. Genes in the region are displayed below. The orientation of the genes is indicated by arrows. (a) Regional association plot of the FGF14 variations on chromosome 13q33.1 associated with DBN. The genome-wide associated SNP (rs72665334, purple diamond) is localized in intron 9, the suggestive hit chr13_103023008_D (yellow circle) in intron 1 of the fibroblast growth factor 14 gene (FGF14). LD structure refers to the genome-wide associated variant rs72665334. (b) Regional association plot of the merged region on chromosome 5 containing the overlapping genes DHFR and MSH3 and the respective hits (rs245100, purple; rs33003, yellow). LD structure refers to the associated variant localized in DHFR (rs245100)

The most promising region with suggestive evidence for association is localized on chromosome 5q14.1 and contains two genes with a head-to-head overlap of their respective exons 1: dihydrofolate reductase (DHFR) and mutS (Mutator S) protein homolog 3 (MSH3). Two LD-independent hits are present in this merged region. Variant rs245100 (p = 5.03 × 10⁻⁷) is located in intron 4 of the DHFR gene, and rs33003 (p = 3.87 × 10⁻⁶) is located in intron 23 of the MSH3 gene (Fig. 4b).

Another four regions with suggestive evidence were also located on chromosome 5. Apart from one region on chromosome 5q23.1 in which the index SNV was located approximately 300 kb away from the next protein coding gene (Ferritin Mitochondrial, FTMT), the index SNVs of the remaining three regions were located directly in a protein coding gene: tubulin polymerization promoting protein (TPPP) on 5p15.33, microtubule-associated serine/threonine kinase family member 4 (MAST4) on 5q12.3, and ATPase phospholipid transporting 10B (putative) (ATP10B) on 5q34. As the variation allocated to ATP10B was a single signal, it has to be considered with caution. The same applies to the variations on chromosomes 1p35.2 (long intergenic nonprotein coding RNA 1648, LINC01648), 2p12 (regenerating family member 1 alpha, REG1A), 3p14.2 (synaptoporin, SYNPR), and 4p16.1 (SH3 domain and tetratricopeptide repeats 1, SH3TC1). The remaining suggestive signals are localized near N-acetylated alpha-linked acidic dipeptidase-like 2 (NAALADL2) on chromosome 3q26.31, spermatogenesis-associated 19 (SPATA19) on chromosome 11q25, Down syndrome critical region 4 (DSCR4) on chromosome 21q22.13, and iduronate 2-sulfatase (IDS) on the X-chromosome (Xq28).

Discussion

In this study we used a genome-wide approach to detect common variations associated with DBN. Among the approximately 8 million SNVs tested, only rs72665334 allocated to FGF14 on chromosome 13q33.1 remained genome-wide significant (p < 5 × 10⁻⁸). Additional suggestive associations (p < 1 × 10⁻⁵) were found for 15 LD-independent regions, the most promising of which were localized on chromosome 5q14.1 in a region containing the two overlapping genes DHFR and MSH3.

FGF14, the gene underlying the genome-wide hit, is part of the fibroblast growth factor (FGF) family subgroup of intracellular nonsecretory forms and is involved in the regulation of voltage-gated ion channels in neurons. It is prominently expressed in PCs, particularly in the axon initial segment of these cells [34] modulating the density of sodium [35, 36] and potassium channels [37] and also has a regulatory effect on calcium channels [38]. FGF14 knockdown in mouse cerebellar lysates has been shown to affect multiple kinetic parameters in PCs, which are responsible for sodium channel inactivation and thereby decrease the ability of repetitive firing [36, 39]. Also, a reduction of the spontaneous firing rate following in vivo knockdown of FGF14 in mature cerebellar PCs of wild-type mice as well as a reduced excitability of these cells in FGF14 knockout mice could be observed. Subsequently, these mice showed impairment of motor coordination and balance [40].

An impaired function of PCs due to reduced excitability is also compatible with the mode of action of the potassium channel blocker 4-aminopyridine, the drug of choice for the treatment of DBN [41, 42] which increases the excitability of PCs [43]. The same applies to the 4-aminopyridine treatment of episodic ataxia type 2 (EA2), which is also associated with DBN [44, 45].

In addition to the already known physiological functions, mutations in FGF14 have been shown to cause autosomal dominant spinocerebellar ataxia 27 (SCA27), a rare inherited neurodegenerative disorder leading to cerebellar degeneration and, clinically, to slowly progressive cerebellar ataxia, oculomotor deficits including nystagmus, low performance in education, and mental retardation [46,47,48].

However, as LD-dependent variations in this region are sparse and associations of them are below a suggestive threshold, this result should be considered with caution.

The top hit (rs245100) in the combined region on chromosome 5q14.1, containing the genes DHFR and MSH3, is localized in intron 4 of DHFR, and its association fell just short of the genome-wide significance threshold (p = 5.03 × 10⁻⁷). DHFR is an essential enzyme in folate metabolism, which catalyzes the reduction of dihydrofolate to tetrahydrofolate and additionally, in a less efficient reaction, the reduction of folate to dihydrofolate [49]. Folates play an important role in single carbon metabolism, such as DNA synthesis and methylation, regulation of gene expression, and synthesis of amino acids, nucleic acids, and neurotransmitters [50]. There is also evidence of pro-regenerative effects of folate in the adult CNS [51], as well as on peripheral neurons [52], which depend essentially on DHFR [53]. Blocking DHFR function by administration of the folate analogue and DHFR antagonist methotrexate (MTX) leads to neurotoxic effects in the cerebellum of newborn rats [54] as well as to the degeneration of PCs in guinea pigs, which could be rescued by application of the tetrahydrofolate derivative leucovorin [55]. This rescue effect of folate is also apparent in newborn rats, in which PC atrophy and cerebellar degenerative changes induced by treatment with valproic acid in the mothers could be markedly reduced by an additional treatment with folic acid [56]. Conceivably, an impaired DHFR function may contribute to impaired cerebellar function.

The other gene in this chromosomal region is MSH3, a part of the post-replicative DNA mismatch repair system, which is important for cell cycle regulation, apoptosis, and genome stability [57]. Additionally, MSH3 has an effect on the phenotype of Friedreich’s ataxia [58] and Huntington’s disease [59], both trinucleotide repeat disorders. The process resulting in trinucleotide repeat (TNR) expansions is not entirely understood, but MSH3 seems to provide a mutagenic role for (CTG)n/(CAG)n repeat tracts [60], while knockdown of Msh3 blocks TNR expansion effectively in mice [61]. Also, the TNR expansion disorder SCA6 frequently accompanies DBN [15, 16]. A direct link between the TNR enhancing properties of MSH3 and the occurrence of DBN requires further investigation.

In addition to these most promising associations, another twelve regions with suggestive evidence could be identified. Five of those variants, localized in or near ATP10B, LINC01648, REG1A, SH3TC1, and SYNPR, were single signals. Taking into account their comparatively weak significance as well, these signals are probably false positives.

The remaining 7 variants were localized in genes involved in the cytoskeleton (MAST4, TPPP), iron homeostasis (FTMT), lysosomal degradation of sulfate esters (IDS), glutamate carboxypeptidases (NAALADL2), and fertility according to their sole expression in reproductive organs (DSCR4, SPATA19). Of these, MAST4 [62], TPPP [63], FTMT [64], and IDS [65] have been linked to pathological processes in the brain.

Finally, the main limitation of this study is the small sample size. Since DBN is a very rare phenotype that has received little attention in research so far, this study provides first clues to a possible genetic background. Nevertheless, confirmation of the associated signals in independent samples is necessary.

In conclusion, a genome-wide significantly associated signal points to FGF14 as a factor that might be involved in the pathophysiology of DBN. Given the expression of FGF14 protein in PCs, its involvement in the regulation of neuronal ion channels, and its impact on the excitability of PCs, together with the hypothesis of DBN being a consequence of impaired PC function, these results point to an influence of genetic variations in FGF14 on the susceptibility to this typical cerebellar nystagmus. The second signal suggests an involvement of folate metabolism through the association of a variation in the regulating enzyme, DHFR. Both findings provide promising candidates for DBN and also for cerebellar degeneration as its cause.