C9orf72 hexanucleotide repeat length in older population: normal variation and effects on cognition

The hexanucleotide repeat expansion in C9orf72 is a common cause of amyotrophic lateral sclerosis/frontotemporal dementia and is also rarely found in other psychiatric and neurodegenerative conditions. Alleles with >30 repeats are often considered an expansion, but the pathogenic repeat length threshold is still unclear. It is also unclear whether intermediate repeat length alleles (often defined either as 7-30 or 20-30 repeats) have clinically significant effects. We determined the C9orf72 repeat length distribution in 3142 older Finns (aged 60-104 years). The longest nonexpanded allele was 45 repeats. We found 7-45 repeats in 1036/3142 (33%) individuals, 20-45 repeats in 56/3142 (1.8%), 30-45 repeats in 12/3142 (0.38%), and expansion (>45 repeats) in 6/3142 (0.19%). There was no apparent clustering of neurodegenerative or psychiatric diseases in individuals with 30-45 repeats, indicating that 30-45 repeats are not pathogenic. None of the 6 expansion carriers had a diagnosis of amyotrophic lateral sclerosis/frontotemporal dementia, but 4 had a diagnosis of a neurodegenerative or psychiatric disease. Intermediate length alleles (categorized as 7-45 and 20-45 repeats) were not associated with Alzheimer's disease or cognitive impairment.


Introduction
C9orf72 hexanucleotide repeat expansion is a major cause of sporadic and familial amyotrophic lateral sclerosis (ALS) and frontotemporal dementia (FTD) (DeJesus-Hernandez et al., 2011; Renton et al., 2011), and it is particularly common in Finland (Majounie et al., 2012). Expansions can span up to several thousand repeats, but the minimum length of a pathogenic expansion is not known. Knowledge of the distribution of repeat lengths in the general population could help clarify the threshold for pathogenicity. Perhaps the most widely used threshold for expansion is 30 repeats, defined in one of the original studies describing the C9orf72 repeat (Renton et al., 2011). Determining a threshold for expansion is complicated by the difficulty of reliably measuring repeat lengths, and published thresholds reflect the limitations of the methods used more than any actual change in biological function. Southern blot is considered the gold standard for repeat length estimation, but because it requires good-quality DNA and is costly and laborious, other methods such as repeat-primed PCR (RP-PCR) are often used.
Besides genuine expansions, "intermediate length" alleles (often defined as 7-30 or 20-30 repeats) have also been hypothesized to predispose to neurodegenerative diseases. In a luciferase reporter assay, intermediate alleles showed decreased C9orf72 promoter activity compared with short repeat alleles (Gijselinck et al., 2016). Moreover, adverse effects have been observed in a fly model with 30 repeats, although the relevance of fly physiology to human pathophysiology in this context is uncertain. As with expansions, the threshold for intermediate length repeats varies between studies, but allele length ≥7 has been shown to associate with the C9orf72 founder haplotype in the European population. A recent review (Ng and Tan, 2017) summarized that most studies find no association between intermediate length alleles and a variety of neurodegenerative diseases, but that intermediate length alleles might be associated with psychiatric symptoms. Most studies on intermediate length alleles focus on the ALS/FTD spectrum, and only a few on Alzheimer's disease (AD) (Cacace et al., 2013; Harms et al., 2013; Jiao et al., 2013; Kohli et al., 2013; Xi et al., 2012).
In this study, we present C9orf72 hexanucleotide repeat length distribution in 3142 Finns and study the association of intermediate repeat alleles with AD and cognitive impairment.

Cohorts
We studied the C9orf72 hexanucleotide repeat length in 3161 individuals from 4 population-derived cohorts from the Helsinki region in Southern Finland. These were the Vantaa 85+ study (Tanskanen et al., 2017) (n = 469), Helsinki Birth Cohort Study (Barker et al., 2005; Eriksson et al., 2006; Kajantie et al., 2012; Lahti et al., 2014; Yliharsila et al., 2007) (n = 1651), Helsinki Businessmen Study (Strandberg et al., 2016) (n = 666), and DEBATE study (Uusvaara et al., 2013) (n = 375). We have previously used the same cohorts to study the association between the TYROBP deletion and cognitive impairment (Kaivola et al., 2018). Information on AD, other dementia diagnoses, Mini-Mental State Examination (MMSE) scores, or lack of dementia was obtained from clinical records, death certificates, registry information, or questionnaires. The source of information varied between cohorts. More detailed cohort descriptions and information on the assessment of dementia are provided in the Supplementary Methods.
We studied the effect of intermediate repeat alleles in 3 settings: (1) individuals with AD versus no AD; (2) individuals with cognitive impairment (MMSE score ≤24) versus individuals with no cognitive impairment (MMSE score >24); and (3) individuals with cognitive impairment or any diagnosis of dementia versus nondemented controls. All controls were ≥75 years old. Repeat lengths 7-45 and 20-45 were used as the definitions of intermediate length alleles.
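
The grouping rules in this section can be sketched in code. This is a minimal illustration and not the authors' analysis pipeline: the function names and the representation of a genotype as a pair of repeat counts are assumptions made here, and an individual is treated as a carrier if either allele falls within the chosen range.

```python
# Hypothetical helpers illustrating the grouping rules; names are not from the study.
INTERMEDIATE_RANGES = {"7-45": (7, 45), "20-45": (20, 45)}
MMSE_IMPAIRMENT_CUTOFF = 24  # MMSE <= 24 is treated as cognitive impairment
MIN_CONTROL_AGE = 75         # controls had to be at least 75 years old


def carries_intermediate(genotype, definition):
    """Return True if either allele (repeat count) lies in the chosen range."""
    low, high = INTERMEDIATE_RANGES[definition]
    return any(low <= allele <= high for allele in genotype)


def is_cognitively_impaired(mmse_score):
    """Setting 2: impairment is an MMSE score of 24 or lower."""
    return mmse_score <= MMSE_IMPAIRMENT_CUTOFF


def is_eligible_control(age_years):
    """Controls are restricted to individuals aged 75 years or older."""
    return age_years >= MIN_CONTROL_AGE
```

For example, a genotype of (2, 8) counts as intermediate under the 7-45 definition but not under the 20-45 definition.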

Genetic analyses
We determined repeat lengths with RP-PCR as previously described (Renton et al., 2011) with minor changes to the protocol (Supplementary Methods). Because it may be difficult to distinguish between an expansion and a long repeat in RP-PCR, we confirmed alleles with repeat length ≥20 and all putative expansions with over-the-repeat PCR. Samples with more than 30 repeats and the typical sawtooth pattern in RP-PCR, which only produced the smaller amplicon in over-the-repeat PCR, were categorized as expansions (Supplementary Fig. 1).
The smallest allele is here denoted as 2-3 repeats because 2 and 3 repeat alleles cannot be distinguished by RP-PCR. RP-PCR results may be inaccurate in distinguishing heterozygosity and homozygosity in certain allele combinations (especially 2-3/5 from 5/5 and 5/8 from 8/8). This can slightly increase the proportion of heterozygotes at the expense of homozygotes. Considering this uncertainty, we did not analyze cognitive measures separately in individuals homozygous for the intermediate alleles. There were no samples homozygous for the 20-45 repeat alleles in our cohorts. APOE was genotyped as previously described (Myllykangas et al., 1999).
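
The confirmation and expansion-calling rules described above can be written as a small decision sketch. The function and parameter names are hypothetical, and the inputs (longest RP-PCR repeat count, presence of a sawtooth pattern, whether the longer allele amplified in over-the-repeat PCR) are a simplified encoding assumed here for illustration; the actual calls in the study were made from the laboratory traces.

```python
# Hypothetical encoding of the calling rules; not the authors' software.

def needs_over_the_repeat_pcr(longest_rp_pcr_repeats, putative_expansion):
    """Alleles with >= 20 repeats and all putative expansions were confirmed."""
    return putative_expansion or longest_rp_pcr_repeats >= 20


def is_expansion(longest_rp_pcr_repeats, sawtooth_pattern, long_allele_amplified):
    """Call an expansion when RP-PCR shows > 30 repeats with a sawtooth
    pattern and over-the-repeat PCR yields only the smaller amplicon."""
    return (
        longest_rp_pcr_repeats > 30
        and sawtooth_pattern
        and not long_allele_amplified
    )
```

Under this encoding, a sample whose longer allele still amplifies in over-the-repeat PCR is kept as a long nonexpanded allele rather than called an expansion.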

Statistics
We used logistic regression to test for association between intermediate repeats and outcome in all 3 settings, with APOE ε4 carriership, age, and sex as covariates. We applied Bonferroni correction to account for the multiple test settings and set the threshold for statistical significance at p = 0.05/3 = 0.017. We used the Kruskal-Wallis H test to determine whether repeat length distributions differed between our 4 cohorts. All analyses were conducted with IBM SPSS Statistics v.24 (IBM Corp. Released 2016. IBM SPSS Statistics for Windows, Version 24.0. IBM Corp., Armonk, NY, USA).
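
The Bonferroni threshold is simple arithmetic and can be checked with a short sketch. The helper name is hypothetical, and the p-values used for illustration are the APOE ε4 associations reported in the Discussion, not new results; the analyses themselves were run in SPSS.

```python
# Minimal sketch of the Bonferroni threshold; helper name is illustrative.

def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests


threshold = bonferroni_threshold(0.05, 3)  # three test settings

# APOE e4 p-values as reported in the Discussion (illustration only).
reported_p = {
    "AD": 5.5e-10,
    "cognitive impairment": 0.00078,
    "overall dementia status": 4.0e-7,
}
passes = {setting: p < threshold for setting, p in reported_p.items()}
```

The threshold evaluates to roughly 0.0167, which the text rounds to 0.017.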

Ethics
The study was approved by the Coordinating Ethics Committee of the Helsinki University Central Hospital. The Vantaa 85+ study was also approved by the Ethics Committee of the Health Centre of the City of Vantaa.
Interestingly, repeat lengths ≥20 were more common in Finland than reported elsewhere in previous large studies. To compare the frequency of ≥20 repeat lengths, we searched PubMed for large population-based studies (>1000 individuals) and case-control studies (>1000 controls) with C9orf72 repeat length distribution available. Besides the original articles describing the repeat, we identified 9 articles (Beck et al., 2013, Cacace et al., 2013, Fahey et al., 2014, He et al., 2015, Nuytemans et al., 2013, Rutherford et al., 2012, Theuns et al., 2014), but only 4 had sufficiently detailed information on ≥20 repeats for comparison. In these 4 studies, the prevalence of ≥20 repeats was significantly lower than in Finland (all allelic frequencies <0.52% and all p < 0.019, Fisher's exact test) (Table 1). Of the 9 studies, only 2 found expansion carriers in their control groups, with carrier frequencies of 1/5748 (p = 0.0095 vs. our Finnish sample, Fisher's exact test) (Theuns et al., 2014) and 11/7579 (p = 0.60 vs. our Finnish sample) (Beck et al., 2013).
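
The expansion carrier comparisons above can be reproduced approximately with a two-sided Fisher's exact test. The sketch below is a standard-library implementation written for illustration (the function names are assumptions, and it is not the software used in the study); it takes the Finnish carrier count as 6/3142 and sums the probabilities of all tables that are no more likely than the observed one.

```python
from math import exp, lgamma


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return lgamma(n + 1) - lgamma(k + 1) - lgamma(n - k + 1)


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(x):
        # Hypergeometric probability of x carriers falling in the first row.
        return exp(
            log_comb(col1, x) + log_comb(n - col1, row1 - x) - log_comb(n, row1)
        )

    lowest = max(0, row1 - (n - col1))
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance guards against floating-point ties.
    total = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-7)
    )
    return min(1.0, total)


# Rows: Finnish sample vs. comparison controls; columns: carriers, non-carriers.
p_vs_theuns = fisher_exact_two_sided(6, 3142 - 6, 1, 5748 - 1)
p_vs_beck = fisher_exact_two_sided(6, 3142 - 6, 11, 7579 - 11)
```

The two values can be compared against the reported p = 0.0095 and p = 0.60; small differences may arise from how two-sided p-values are defined in different software.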

Intermediate repeat length alleles, Alzheimer's disease, and cognitive impairment
The number of individuals in each cohort with or without AD/cognitive impairment is shown in Supplementary Table 1.

Neurodegenerative and psychiatric diagnoses
We did not find any evident clustering of neurodegenerative or psychiatric disease in the 12 (0.38%) individuals with 30-45 repeats (Table 2). There were 6 (0.19%) expansion carriers, and none had a diagnosis of ALS or FTD; 2 had died at the ages of 60 and 68 years of coronary artery disease. Of the 4 living individuals (aged 71-79 years), one had been diagnosed with Parkinson disease (G20, ICD-10) and another had a diagnosis of local brain atrophy (G31.0, ICD-10). A third had been diagnosed with "other alcoholic hallucinosis" (291.20, ICD-8) and a fourth with depressive neurosis and other neurosis (300.40 and 300.88, ICD-8). A summary of carriers of 30-45 repeats and expansions is shown in Table 2.

Discussion
The C9orf72 hexanucleotide repeat expansion is a major cause of ALS/FTD; however, which repeat lengths are pathogenic is still unclear. Our data on 3142 older individuals show that 30-45 repeats are found in 0.38% (1 per 262) of individuals, and these individuals did not show any apparent clustering of neurodegenerative or psychiatric disease. Our data suggest that 30-45 repeat alleles should not be considered unequivocally pathogenic and that risk factors other than the C9orf72 expansion should be considered in patients with these alleles. The smallest expansions that segregate with combinations of ALS, FTD-ALS, and "FTLD or dementia" are reportedly ca. 55-100 repeats as measured by Southern blot (Gijselinck et al., 2016). However, an individual with 70 repeats without neurodegeneration at the age of 90 years has been reported (McGoldrick et al., 2018). More data are still needed for a more precise estimate of the pathological threshold, which may actually turn out to be a "gray zone" where disease penetrance depends on modulating cofactors.
Our data are in line with previous studies indicating a lack of major association between intermediate repeats and AD. Somewhat surprisingly, there was a nominally significant association between 7-45 repeat length alleles and better cognition. This association was not statistically significant in any of the individual cohorts and was only seen in the pooled data set. It may be a spurious association, but it nevertheless supports the hypothesis that ≥7 repeat alleles do not predispose to cognitive impairment or dementia. Across our cohorts, there was heterogeneity in sex and age distributions, socioeconomic status, and educational history (one cohort consisted solely of men with high socioeconomic status). In addition, there was no uniform way of defining cognition and dementia status, and this information was collected from several sources. Despite these limitations, we found a clear association between APOE ε4 genotype and AD (p = 5.5 × 10⁻¹⁰, OR 2.69), cognitive impairment (p = 0.00078, OR 1.7), and overall dementia status (p = 4.0 × 10⁻⁷, OR 1.8), which argues against a major bias in our definition of dementia. Furthermore, the Finnish national registers have been found to identify dementia cases accurately (Solomon et al., 2014).
None of the 6 expansion carriers we identified had a diagnosis of ALS or FTD but 4 had a diagnosis of a neurodegenerative or psychiatric disease. Two had died of coronary heart disease before the age of 70 years and others were under 80 years. Therefore, it is possible that some expansion carriers could have developed or may develop ALS/FTD in their later years.
Previous reports have shown that the C9orf72 expansion is an especially common cause of ALS/FTD in Finland (Majounie et al., 2012). In the present study, the expansion frequency in older Finns (median age 77) was only slightly higher than in the UK 1958 birth cohort. However, this UK population was considerably younger (age ca. 54-55 years at publication in 2013) than our Finnish sample. The penetrance of the C9orf72 expansion is known to be strongly age related; nearly complete penetrance was observed by 83 years in a cohort of 1147 ALS and FTD cases. The median age of onset varied somewhat with phenotype but was estimated to be around 58 years (Murphy et al., 2017). Therefore, the expansion frequency in the UK cohort can be expected to decrease markedly over time. In line with age-related penetrance, the expansions were unequally distributed in our cohorts; all Finnish expansion carriers were from the HBCS cohort, whose participants were the youngest of the 4 cohorts (born 1934-1944, with registry information available up to December 31, 2013; Supplementary Methods). If we restrict the analysis to this cohort only, expansion prevalence would be 6/1644 (0.36% or 1 in 274) in Finland.
Interestingly, we found that nonexpanded repeat lengths ≥20 are roughly twice as common in Finland as in other reported populations (Table 1). This finding may be connected with the high frequency of the C9orf72 expansion in Finland and raises the question of whether the ≥20 repeat alleles could be prone to germline instability and expansion in the offspring. In this scenario, the larger intermediate alleles would form a pool from which new expansions can be derived, a mechanism shown in families with sporadic Huntington's disease and in Huntington's disease CAG repeat knock-in mice (Goldberg et al., 1993; Myers et al., 1993; Wheeler et al., 1999). Somatic instability (mosaicism) of the expansion has been previously demonstrated (Dols-Icardo et al., 2014; Fratta et al., 2015), and at least one family has been described in which an unaffected 90-year-old father had a 70-repeat allele in blood (premutation), which was transmitted to 4 offspring, who all had an expansion in blood DNA (McGoldrick et al., 2018). Whether the expansion was present in the father's sperm could not be tested. However, mosaicism with large expansions was demonstrated in other tissues; therefore, it is likely that the expansion was present in sperm. As more than half of the Finnish patients with ALS with the C9orf72 expansion are sporadic (Majounie et al., 2012) (Laaksovirta et al. unpublished data), the spectrum of premutations warrants further research by genotyping the parents of sporadic C9orf72 ALS cases when possible. Another approach would be to genotype paired DNA samples from blood and gametes (Martorell et al., 2004), especially in carriers of the longer alleles.
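
The prevalence figures quoted in this Discussion follow directly from the carrier counts, as the following arithmetic sketch shows. The helper name is hypothetical and the code only restates the counts given in the text (12/3142 for 30-45 repeats, 6/3142 and 6/1644 for expansions).

```python
# Illustrative arithmetic only; the helper name is not from the study.

def prevalence(carriers, total):
    """Return prevalence as a percentage and as a rounded '1 in N' figure."""
    percent = 100 * carriers / total
    one_in = round(total / carriers)
    return percent, one_in


pct_30_45, one_in_30_45 = prevalence(12, 3142)  # 30-45 repeats, all cohorts
pct_exp_all, one_in_exp_all = prevalence(6, 3142)  # expansions, all cohorts
pct_exp_hbcs, one_in_exp_hbcs = prevalence(6, 1644)  # expansions, HBCS only
```

Restricting the denominator to the youngest cohort roughly doubles the estimated expansion prevalence compared with the pooled sample, which is the point made in the paragraph above.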

Disclosure
Bryan J. Traynor and Pentti J. Tienari hold a patent on C9orf72 in the diagnostics and treatment of ALS/FTD. The other authors have no actual or potential conflicts of interest.