Rare variants of the 3’-5’ DNA exonuclease TREX1 in early onset small vessel stroke

Background: Monoallelic and biallelic mutations in the exonuclease TREX1 cause monogenic small vessel diseases (SVD). Given recent evidence for genetic and pathophysiological overlap between monogenic and polygenic forms of SVD, evaluation of TREX1 in small vessel stroke is warranted.

Methods: We sequenced the TREX1 gene in an exploratory cohort of patients with lacunar stroke (Edinburgh Stroke Study, n=290 lacunar stroke cases). We subsequently performed a fully blinded case-control study of early onset MRI-confirmed small vessel stroke within the UK Young Lacunar Stroke Resource (990 cases, 939 controls).

Results: No patients with canonical disease-causing mutations of TREX1 were identified in cases or controls. Analysis of an exploratory cohort identified a potential association between rare variants of TREX1 and patients with lacunar stroke. However, subsequent controlled and blinded evaluation of TREX1 in a larger and MRI-confirmed patient cohort, the UK Young Lacunar Stroke Resource, identified heterozygous rare variants in 2.1% of cases and 2.3% of controls. No association was observed with stroke risk (odds ratio = 0.90; 95% confidence interval, 0.49-1.65; p=0.74). Similarly, no association was seen with rare TREX1 variants with predicted deleterious effects on enzyme function (odds ratio = 1.05; 95% confidence interval, 0.43-2.61; p=0.91).

Conclusions: No patients with early-onset lacunar stroke had genetic evidence of a TREX1-associated monogenic microangiopathy. These results show no evidence of association between rare variants of TREX1 and early onset lacunar stroke. This includes rare variants that significantly affect protein and enzyme function. Routine sequencing of the TREX1 gene in patients with early onset lacunar stroke is therefore unlikely to be of diagnostic utility, in the absence of syndromic features or family history.



Introduction
Cerebral small vessel disease (SVD) causes a quarter of all strokes and is the most common pathology underlying vascular cognitive decline and dementia 1 . The pathophysiological and genetic basis of SVD, and of small vessel lacunar stroke in particular, is poorly understood 2,3 . Rare variants may make a significant contribution to the genetic basis of SVD 3,4 and increasing evidence suggests that monogenic and polygenic forms of SVD share common pathophysiological mechanisms 5 . For example, dominant missense mutations in COL4A1 and COL4A2 cause rare familial forms of cerebral SVD 6 , and common variants in the same genes are associated with sporadic cerebral small vessel disease 3 . Such findings demonstrate that genes causing monogenic microangiopathies may also contain variants conferring risk for common forms of cerebral SVD, such as lacunar stroke.
TREX1 is a human 3'-5' exonuclease that can degrade single-stranded DNA. Two monogenic small vessel diseases are caused by mutations in TREX1 (Figure 1A). Heterozygous frameshift mutations in the C-terminus of TREX1, resulting in enzyme mislocalisation, cause retinal vasculopathy with cerebral leukodystrophy (RVCL), an adult-onset systemic microangiopathy with pronounced brain involvement 7 . Biallelic mutations with loss of enzymatic function can cause Aicardi-Goutières syndrome (AGS), a neonatal onset brain disorder with prominent microangiopathy 8,9 and features of activated innate immunity 10 . Both genetic cerebral microangiopathies are associated with aberrant innate immune pathways, in particular dysregulation of the type I interferon pathway 10,11 . Given the potential for therapeutic modulation of these pathways, evaluation of TREX1 in SVD phenotypes, such as lacunar stroke, is warranted. The identification of patients with early-onset cerebral SVD and heterozygous rare TREX1 variants has led to the hypothesis that such variants might be causally related to early-onset SVD 12 .
Here we evaluate TREX1 in patients with small vessel stroke. We perform an initial exploratory analysis in a relatively small cohort of patients with lacunar stroke, and subsequently perform a case-control study in a larger cohort of patients with early-onset lacunar stroke, where small vessel infarction has been confirmed by MRI.

Sanger sequencing
The entire coding sequence of TREX1 and part of the 5' UTR (-228 bp) and 3' UTR (+57 bp) were amplified by three overlapping amplicons using the following primers:

Unrelated Caucasian controls, free of clinical cerebrovascular disease, were obtained by random sampling from general practice lists from the same geographical location as the patients. Sampling was stratified for age and sex. All patients and controls underwent a standardized clinical assessment and completed a standardized study questionnaire. MRI was not performed in controls.

Variant annotation
Variants were compared with the ExAC database 17 to determine a minor allele frequency (MAF) and/or previously identified disease association (ClinVar, NCBI). Variants were sorted by Combined Annotation Dependent Depletion (CADD) score. CADD provides a scaled C-score: a C-score of 10 means the variant is predicted to be in the top 10% of most deleterious changes in the genome, a score of 20 means it is in the top 1%, and a score of 30 means it is in the top 0.1% 18 .
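The scaled C-score is PHRED-like, i.e. it equals -10·log10 of the fraction of all possible substitutions ranked at least as deleterious as the variant. The mapping can be sketched as follows; the function name is hypothetical and is not part of any CADD software.

```python
def cadd_top_fraction(c_score: float) -> float:
    """Fraction of possible substitutions ranked at least as deleterious
    as a variant with this scaled (PHRED-like) C-score."""
    # Inverse of: c_score = -10 * log10(rank fraction)
    return 10 ** (-c_score / 10)


for score in (10, 20, 30):
    print(f"C-score {score}: top {cadd_top_fraction(score):.1%} of substitutions")
```

Under this scaling, the CADD score thresholds of 10 and 20 used in this study correspond to the top 10% and top 1% of predicted deleterious changes, respectively.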

Structural and functional analyses
3D rendering of the variants in a TREX1 dimer 19 (Protein Data Bank ID: 2OA8, amino acids 5-234) was performed using PyMOL (The PyMOL Molecular Graphics System, Version 1.8, Schrödinger, LLC).

TREX1-EGFP vector construction
Gateway cloning was used to construct mammalian expression vectors. Briefly, the coding sequence of human TREX1 was amplified by PCR to include attB sites and cloned into pDONR221 (Invitrogen) via BP reaction (BP clonase II kit; Invitrogen). pEGFP-TREX1 was constructed by cloning the TREX1 coding sequence into a Gateway converted pEGFP-C2 destination vector (Clontech) via LR reaction (LR clonase II kit; Invitrogen). Minipreparations of plasmid DNA (Qiagen) were performed for verification. Midipreparations of plasmid DNA (ZymoResearch) were performed for mammalian cell transfection.

Site directed mutagenesis
Mutations were introduced into the mammalian expression construct by site-directed mutagenesis, as per manufacturer's instructions (Q5 Site-Directed Mutagenesis Kit, NEB). Mutations were confirmed by Sanger sequencing.

Statistical analyses
Sequencing, analysis and functional work were performed blind to case-control status. Fisher's exact test was used to compare proportions of individuals with rare variants in cases versus controls, unless otherwise stated. Odds ratios were calculated using Cochrane RevMan 5. The Mann-Whitney U test was used to compare CADD scores between groups. Statistical tests were performed in GraphPad Prism 7.
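For readers who wish to re-run the carrier-proportion comparison outside GraphPad Prism, a two-sided Fisher's exact test can be sketched in plain Python. This is an illustrative re-implementation, not the software used in the study; the function name is hypothetical, and the example counts are the high-CADD carrier counts reported in the Results (10 of 990 cases, 9 of 939 controls).

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the 2x2 table [[a, b], [c, d]].

    Rows are groups (cases, controls); columns are carriers, non-carriers.
    """
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(k: int) -> float:
        # Hypergeometric probability of k carriers in row 1, margins fixed
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_obs = prob(a)
    p_value = 0.0
    for k in range(max(0, col1 - row2), min(row1, col1) + 1):
        p_k = prob(k)
        # Sum all tables at least as unlikely as the observed one
        if p_k <= p_obs * (1 + 1e-9):
            p_value += p_k
    return min(1.0, p_value)


# Carriers vs non-carriers: 10/990 cases and 9/939 controls
p = fisher_exact_two_sided(10, 980, 9, 930)
print(f"Fisher's exact test, two-sided p = {p:.3f}")
```

The exact p-value need not match a p-value derived from an odds-ratio z-test, since the two methods rest on different approximations.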

Exploratory cohort: Rare TREX1 variants in the Edinburgh Stroke Study
We first performed an exploratory analysis of patients with lacunar stroke within the Edinburgh Stroke Study ( Figure 1B). This study of >2000 stroke patients includes a subset of 290 patients with a clinical diagnosis of a lacunar stroke. Sanger sequencing of TREX1 in these 290 patients identified no individuals with genetic results consistent with a diagnosis of RVCL (monoallelic C-terminal frameshift mutations) or AGS (biallelic hypomorphic mutations).
However, four patients with rare heterozygous TREX1 variants were identified (MAF<0.05, Table 1). The Edinburgh Stroke Study does not include population-matched controls, which limits interpretation of these data. However, compared to published TREX1 sequencing control data from patients of European ancestry, this is significantly more than would be expected (p = 0.005, Fisher's exact test) 20 . Notably, 3 of these patients developed lacunar stroke under the age of 70 years. Recognising that confirming any potential association between rare TREX1 variants and lacunar stroke would require more stringent testing, we analysed DNA from a larger independent lacunar stroke cohort with early onset disease (<70 years) together with a population-matched control cohort.

Case-control study: Rare TREX1 variants in the UK Young Lacunar Stroke Resource
The UK Young Lacunar Stroke Resource (UKYLSR) is a study of approximately 1,000 patients with MRI-confirmed lacunar stroke under the age of 70, with matched population controls. As such, this study allows more stringent evaluation of the hypothesis that rare TREX1 variants confer risk for lacunar stroke. We performed TREX1 sequencing in cases and controls, including functional annotation and enzymatic assays. We remained blind to case-control status throughout the study.
No individuals in either case or control group had genetic results consistent with a diagnosis of RVCL or AGS.
Variants differ in their capacity to reduce enzymatic function of TREX1. For example, some mutations, such as D18N, can cause complete loss of exonuclease activity 7 . We therefore next considered annotations of these variants which evaluated the potential pathogenicity of a given variant. CADD is a method for integrating diverse functional annotations into a single measure (CADD score, or C-score), which can predict the potential pathogenicity of a variant in silico 18 . When rare variants with low CADD scores (<10) were excluded, functional rare variants were identified at a frequency of 10/990 (1.0%) in cases and 9/939 (0.96%) in controls (OR 1.05; 95% confidence interval, 0.43-2.61; p=0.91; Figure 2B). The CADD scores for rare variants did not differ significantly between groups (Figure 2B; p=0.72, Mann-Whitney U test). The location of variants within TREX1 influences clinical phenotype in monogenic microangiopathic disease (Figure 1A). The variants we identified were distributed throughout the TREX1 gene, and there was no apparent spatial clustering when variants were mapped onto a 3D protein model (Figure 2C, Figure 3A).
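The odds ratio for variants with CADD scores of 10 or more can be checked from the reported counts (10/990 cases, 9/939 controls). The sketch below uses the standard log-odds (Woolf) interval; this is an assumption about the method, since the study's odds ratios were calculated in RevMan, and the function name is hypothetical.

```python
from math import exp, log, sqrt
from statistics import NormalDist


def odds_ratio_ci(case_carriers, n_cases, control_carriers, n_controls, level=0.95):
    """Odds ratio with a Woolf (log-odds) confidence interval for a 2x2 table."""
    a, b = case_carriers, n_cases - case_carriers
    c, d = control_carriers, n_controls - control_carriers
    odds_ratio = (a * d) / (b * c)
    # Standard error of log(OR) from the four cell counts
    se_log_or = sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    z = NormalDist().inv_cdf(0.5 + level / 2)
    lower = exp(log(odds_ratio) - z * se_log_or)
    upper = exp(log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


or_, lo, hi = odds_ratio_ci(10, 990, 9, 939)
print(f"OR {or_:.2f} (95% CI {lo:.2f}-{hi:.2f})")  # OR 1.05 (95% CI 0.43-2.61)
```

With these counts the sketch gives the same odds ratio and interval as reported above. The interval is wide because it is driven almost entirely by the small number of carriers in each group.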

Rare TREX1 variants can decrease exonuclease activity
To confirm that rare variants with high CADD scores exert a deleterious effect on protein function, we evaluated the effect on TREX1 exonuclease activity of a rare variant with a high C-score from each group. We identified a variant from each group with a CADD score >20, which is thus predicted to confer a significant pathogenic effect on the protein. To examine the effect of such amino acid changes on TREX1 function, we reconstituted mouse Trex1 -/- MEFs with the mutated allele, generated by site-directed mutagenesis, and assessed cellular nuclease activity against a ssDNA substrate (Figure 3B). While wildtype TREX1 reconstituted nuclease activity against ssDNA, rare variants from both groups (Case: A139Vfs*21, C-score 34. Control: R114H, C-score 28) led to significant loss of 3'-5' exonuclease activity (Figure 3C).
Together these results show no evidence for an association between rare variants of TREX1 and early onset lacunar stroke, including variants that exert deleterious effects on protein function.

Figure 3. (C) Relative nuclease activity of predicted most severe variants identified from case (A139Vfs*21) and control groups (R114H) of the UK Young Lacunar Stroke Study. Nuclease activity was assayed in total protein supernatants from Trex1 -/- MEFs transfected with TREX1 expression constructs containing variants generated by site directed mutagenesis. Rare variants identified in the Young Lacunar stroke cohorts were compared with a known nuclease-dead variant (D18N). Data shown are the average of two or more independent experiments performed in triplicate ± standard deviation of the independent experiments, relative to WT. ** p<0.01.

Discussion
There is a need to identify aetiological factors in small vessel stroke, in particular molecular pathways that might be amenable to therapeutic intervention 1 . Recent meta-analyses of GWAS have suggested that the "missing heritability" of small vessel stroke may be in part attributed to rare variants 2,21 . One possibility is that mutations in genes that cause monogenic small vessel diseases, such as NOTCH3, HTRA1, COL4A1 and TREX1, might confer risk for sporadic lacunar stroke. This hypothesis is strengthened by the identification of an association between sporadic SVD phenotypes and common variants in COL4A1/2, since mutations in these genes can cause monogenic SVD 3 .
TREX1 is therefore an important candidate gene to evaluate in lacunar stroke. Biallelic and monoallelic mutations in TREX1 can cause two clinically distinct monogenic syndromes characterized by prominent microangiopathy, AGS and RVCL. While the molecular events by which altered TREX1 function causes SVD are unknown, increasing lines of evidence suggest an association with activated innate immunity, including pathways that are potentially amenable to therapeutic intervention 11,22 . As such, detailed evaluation of this gene in sporadic small vessel stroke phenotypes is a priority.
Here we test the hypothesis that rare variants of TREX1 are associated with lacunar stroke, in particular early onset disease. We first examine DNA from an exploratory cohort, recognizing important limitations in the interpretation of genetic data from small uncontrolled studies. Consistent with screening studies of this gene in other early-onset SVD phenotypes 12 , we identified rare variants of TREX1 in about 1.4% of cases (4/290), and observed that 3 out of 4 of these cases were under 70 years of age. Comparison of this proportion to published control data suggested a potential association of early onset lacunar stroke with rare variants of TREX1. However, such analyses present serious methodological limitations, since published control cohorts are not matched for geographical region and age. Therefore, although this type of comparison might be useful in generating preliminary data on which to focus and power more detailed studies, the statistical analysis of this preliminary cohort is prone to bias, confounding and chance 23 .
Therefore, to assess whether our observations in this preliminary cohort represented a real association, we performed a more methodologically rigorous evaluation of TREX1 in the UKYLSR. This differs from the exploratory study in a number of ways, which allow more robust genetic conclusions to be drawn. Firstly, the UKYLSR is a dedicated study of early onset lacunar stroke. Secondly, inclusion in the study requires confirmation of a small vessel stroke by MRI. This is important since a clinical diagnosis of a lacunar syndrome may not necessarily be caused by a small vessel stroke 1 . Thirdly, the study was controlled with an age, sex and geographically matched control population. We remained blinded to case-control status throughout the study, including analyses of the functional consequences of rare variants.
The results of this case-control study showed no evidence that rare variants in TREX1 are associated with small vessel stroke. In the UKYLSR, rare variants in TREX1 occur in about 2% of both cases and controls. Rare variants therefore occur more frequently than previously detected in different control populations, highlighting potential population variation and reinforcing the need for dedicated age and population-matched control cohorts 20 . Our findings emphasize the importance of confirmation cohorts in genetic association studies, however persuasive the prior biological rationale 24 .
These rare variants included those that can directly alter protein structure and function. The distribution of CADD scores, which reflect an in silico evaluation of the potential pathogenicity of variants, was not different between groups. We show that rare variants with high CADD scores, which can affect enzymatic function in vitro, are observed at similar frequencies in both cases and control populations.
Our results are consistent with a recently published next-generation sequencing study comparing approximately 600 lacunar stroke patients with control individuals from the INTERSTROKE cohort 25 . This study showed that rare variants in monogenic stroke genes, including TREX1, were not associated with lacunar stroke phenotypes. A potential limitation of both studies is lack of statistical power, although unbiased publication of such sequencing studies will allow meta-analyses with higher degrees of power to be performed.
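To give a feel for the power limitation, the following sketch approximates the power of a two-sided comparison of carrier proportions under a normal approximation. It is not an analysis performed in this study: the function name is hypothetical, the default cohort sizes and 2.3% control carrier frequency are taken from the figures reported here, and the candidate odds ratios are arbitrary illustrations.

```python
from math import sqrt
from statistics import NormalDist


def approx_power(odds_ratio, control_freq=0.023, n_cases=990,
                 n_controls=939, alpha=0.05):
    """Approximate power of a two-sided two-proportion z-test."""
    # Carrier frequency in cases implied by the assumed odds ratio
    odds = odds_ratio * control_freq / (1 - control_freq)
    case_freq = odds / (1 + odds)
    pooled = (case_freq * n_cases + control_freq * n_controls) / (n_cases + n_controls)
    se_null = sqrt(pooled * (1 - pooled) * (1 / n_cases + 1 / n_controls))
    se_alt = sqrt(case_freq * (1 - case_freq) / n_cases
                  + control_freq * (1 - control_freq) / n_controls)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Probability of exceeding the critical value in the direction of the effect
    return NormalDist().cdf((abs(case_freq - control_freq) - z_crit * se_null) / se_alt)


for assumed_or in (1.5, 2.0, 3.0):
    print(f"assumed OR {assumed_or}: approximate power {approx_power(assumed_or):.2f}")
```

Because carriers are rare, the normal approximation is crude; an exact or simulation-based calculation would be preferable for formal study planning.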
These results have implications relevant for clinical practice. Firstly, none of the 1,280 lacunar stroke patients sequenced here had genetic results consistent with monogenic TREX1-associated genetic microangiopathies. Secondly, the identification of rare heterozygous variants of TREX1 in early onset small vessel stroke, even those that confer substantial functional effects, may not be of clinical relevance, although our analysis does not exclude a weak effect. Taken together, these findings do not support routine testing of TREX1 variants in early onset small vessel stroke, in the absence of syndromic features or a supportive family history. Furthermore, rare TREX1 variants identified in early onset SVD phenotypes through screening 12 or next generation sequencing approaches 25 should be interpreted with caution, given that they are observed in control populations at a frequency of approximately 2%.

Data availability
Raw

This is a study of 290 subjects with lacunar stroke from the Edinburgh Stroke Study and 990 subjects with lacunar stroke plus 939 control subjects from the UK Lacunar Stroke Resource. The authors examined if there was an association between variants in TREX1 and lacunar stroke but could not detect this. This is an important and relevant finding because when taking previous reports into consideration it has been reasonable to suspect a relation between TREX1 variants and lacunar stroke.

Some comments:
Microangiopathy can cause many different phenotypes and a discussion about how closely the different microangiopathies noted in previous studies of TREX1 can be supposed to be related to the phenotype lacunar stroke would be of interest. Maybe the reported microangiopathies should instead be suspected to be related to other phenotypes than lacunar stroke? It would therefore be valuable if the authors, with their acknowledged expertise, in the manuscript could explain more about the possible relationships between microangiopathy and SVD causing lacunar stroke.
For clarity of the manuscript, the authors should consider moving some parts of the Results section to the Methods section. This includes the paragraph beginning with "The UK Young Lacunar Stroke Resource (UKYLSR) is a study of approximately 1,000 patients" and the sentences starting with: "We therefore next considered annotations of these variants which evaluated the potential pathogenicity of a given variant. CADD is a method…"
The authors also used mouse TREX1 for their evaluations. It would be valuable if they could comment in e.g. their Discussion whether any differences between mouse TREX1 and human TREX1 may be of importance.
Suggest that the authors omit the sentence "p = 0.005, Fishers Exact Test compared to published control cohort" from the legend of Table 1. This is already more clearly explained in the text on page 5.
The abbreviation Exo is explained in Figure 1 but not in Figure 2.

Is the study design appropriate and is the work technically sound? Yes
Are sufficient details of methods and analysis provided to allow replication by others?

If applicable, is the statistical analysis and its interpretation appropriate? Yes
Are all the source data underlying the results available to ensure full reproducibility? Yes

Are the conclusions drawn adequately supported by the results? Partly
Competing Interests: No competing interests were disclosed.
Referee Expertise: Complement system - TREX1 deficiency and genetic variants. Not able to assess some of the statistical applications with any expertise.
We have read this submission. We believe that we have an appropriate level of expertise to confirm that it is of an acceptable scientific standard.