Missense variant in TREML2 protects against Alzheimer's disease

TREM and TREM-like receptors are a structurally similar protein family encoded by genes clustered on chromosome 6p21.11. Recent studies have identified a rare coding variant (p.R47H) in TREM2 that confers a high risk for Alzheimer's disease (AD). In addition, common single nucleotide polymorphisms in this genomic region are associated with cerebrospinal fluid biomarkers for AD and a common intergenic variant found near the TREML2 gene has been identified to be protective for AD. However, little is known about the functional variant underlying the latter association or its relationship with the p.R47H. Here, we report comprehensive analyses using whole-exome sequencing data, cerebrospinal fluid biomarker analyses, meta-analyses (16,254 cases and 20,052 controls) and cell-based functional studies to support the role of the TREML2 coding missense variant p.S144G (rs3747742) as a potential driver of the meta-analysis AD-associated genome-wide association studies signal. Additionally, we demonstrate that the protective role of TREML2 in AD is independent of the role of TREM2 gene as a risk factor for AD.

Keywords: TREM2; Genome-wide association studies; Conditional analysis; Endophenotype; Gene; Alzheimer's disease; Association

Introduction
Genome-wide association studies (GWAS) are a powerful approach for identifying novel loci associated with disease status or other complex traits. However, the single nucleotide polymorphisms (SNPs) identified by GWAS are usually not the functional variants driving the association and, in many cases, regional linkage disequilibrium (LD) prevents identification of a single candidate gene in the region. Often, additional studies are required to demonstrate unambiguously that the gene and/or variant implicated in disease risk is functionally related to pathogenesis.
Recently, the International Genomics of Alzheimer's Project (IGAP) identified 11 new loci (p < 10⁻⁸) associated with risk for Alzheimer's disease (AD), and 13 additional suggestive loci (p value between 10⁻⁶ and 10⁻⁸). Among the latter group, there is an intergenic SNP (rs9381040; p < 6.3 × 10⁻⁷) located 5.5 Kb downstream from TREML2 and 24 Kb upstream from TREM2. The TREM and TREM-like receptor genes clustered on chromosome 6p21.1 (Ford and McVicar, 2009) have different patterns of LD among them. This genomic region has previously been implicated in genetic risk for AD (Bertram et al., 2013; Cruchaga et al., 2013; Guerreiro et al., 2013; Jonsson et al., 2012; Reitz and Mayeux, 2013). A low-frequency missense variant in TREM2 (p.R47H, minor allele frequency = 0.003) was reported to substantially increase risk for AD (Guerreiro et al., 2013). SNPs in this region were also found to be associated with a cerebrospinal fluid (CSF) biomarker for AD (phospho-tau 181 levels). Because of the design of the IGAP study (a meta-analysis) and the low frequency of the TREM2 variant, it was not possible to determine whether the GWAS signal at rs9381040 was independent of the TREM2-p.R47H variant. In this study, we used exome-sequencing data to identify the most likely functional variant in TREML2 responsible for the GWAS signal and to determine whether this signal is independent of the TREM2-p.R47H (rs75932628) variant.

Exome sequencing Knight-Alzheimer's Disease Research Center (ADRC)
Enrichment of coding exons and flanking intronic regions was performed using a solution hybrid selection method with the Sure-Select human all exon 50 Mb kit (Agilent Technologies, Santa Clara, CA, USA) following the manufacturer's standard protocol on 46 unrelated AD cases and 39 unrelated controls from the Knight-ADRC. This was performed by the Genome Technology Access Center at Washington University in St Louis (https://gtac.wustl.edu/). The captured DNA was sequenced by paired-end reads on the HiSeq 2000 sequencer (Illumina, San Diego, CA, USA). Raw sequence reads were aligned to the reference genome National Center for Biotechnology Information (NCBI) 36/hg18 by using Novoalign (Novocraft Technologies, Selangor, Malaysia). Base and/or SNP calling was performed using SNP SAMtools (Li et al., 2009). SNP annotation was carried out using version 5.07 of SeattleSeq Annotation server (see URL) (Benitez et al., 2011). On average, 95% of the exome had fold coverage >8.

UK-National Institute on Aging (UK-NIA) Dataset
A description of the UK-NIA dataset can be found in Guerreiro et al. (2013). Briefly, this dataset includes whole-exome sequencing data from 143 AD cases and 183 controls (Table 1).

Alzheimer's disease genetic consortium methods
Data used in the preparation of this article were obtained from the Alzheimer's disease genetic consortium (ADGC). A description of the samples included in the study as well as the methods used can be found in Naj et al. (2011). Imputed data from 10,067 AD cases and 9606 controls from the ADGC were used in this study (Naj et al., 2011). Genome-wide imputation was performed per cohort using MACH software with HapMap phase 2 (release 22) CEPH Utah pedigrees reference haplotypes and genotype data passing quality control as inference. Imputation quality was determined as r² and only SNPs imputed with r² ≥ 0.50 were included in the analysis. A multivariate logistic regression was performed to evaluate the association between genetic markers and risk for late-onset AD (LOAD), adjusting for age, gender, population substructure, and study-specific effects.

For use of Genetic and Environmental Risk for Alzheimer's disease genotype data from "the 610 group"
Data used in the preparation of this article were obtained from the Genetic and Environmental Risk for Alzheimer's disease (GERAD) Consortium. The imputed GERAD sample comprised 3177 AD cases and 974 healthy elderly (age >70) control subjects with available age and gender data. Cases [CERAD]) AD. All elderly controls were screened for dementia using the MMSE or ADAS-cog, were determined to be free from dementia at neuropathological examination or had a Braak score of 2.5 or lower. Genotypes from all cases and control subjects were previously included in the AD GWAS by Harold et al. (2009). Imputation of the dataset was performed using IMPUTE2 and the 1000 Genomes (http://www.1000genomes.org/) Dec2010 reference panel (NCBI build 37.1). The imputed data were then analyzed using logistic regression including covariates for country of origin, gender, age, and 3 principal components obtained with EIGENSTRAT (EIGENSOFT 4.2) (Patterson et al., 2006) software based on individual genotypes for the GERAD study participants.

European Alzheimer's disease initiative consortium
All AD cases were ascertained by neurologists from Bordeaux, Dijon, Lille, Montpellier, Paris, and Rouen, and were identified as French Caucasian (Dreses-Werringloer et al., 2008; Group, 2003). Clinical diagnosis of probable AD was established according to the DSM-III-R and NINCDS-ADRDA criteria. Control subjects were selected from the 3C Study (Group, 2003). This cohort is a population-based, prospective (7-year follow-up) study of the relationship between vascular factors and dementia. It has been carried out in 3 French cities: Bordeaux (southwest France), Montpellier (southeast France), and Dijon (central eastern France). A sample of non-institutionalized subjects over 65 years of age was randomly selected from the electoral rolls of each city. Between January 1999 and March 2001, 9686 subjects meeting the inclusion criteria agreed to participate. After recruitment, 392 subjects withdrew from the study. Thus, 9294 subjects were finally included in the study (2104 in Bordeaux, 4931 in Dijon, and 2259 in Montpellier). Genomic DNA samples of 7200 individuals were transferred to the French Centre National de Génotypage. First-stage samples that passed DNA quality control were genotyped with Illumina Human 610-Quad BeadChips (n = 452). At the end, we removed 308 samples because they were found to be first- or second-degree relatives of other study participants, or were assessed as being of non-European descent based on genetic analysis using methods described in Heath et al. (2008). In this final sample, at 7 years of follow-up, 459 individuals suffered from AD, with 97 prevalent and 362 incident cases. These AD cases were included as cases in the European Alzheimer's disease initiative (EADI) discovery dataset. We retained the other individuals as control subjects (n = 6017). The imputation was performed using 1000 Genomes multi-ethnic data (1000 G phase 1 integrated variant set release v3) as reference panel.
Imputation was performed in 2 steps: pre-phasing with SHAPEIT (v2), followed by imputation with IMPUTE2. SNPs were used in the imputation process if they had a call rate >98%, a Hardy-Weinberg equilibrium (HWE) p value > 10⁻⁶, and a minor allele frequency (MAF) > 1%.
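The pre-imputation SNP filter described above can be illustrated with a minimal sketch. It assumes that the MAF threshold is 1% and that HWE is assessed with a 1-degree-of-freedom chi-square goodness-of-fit test; the test actually used in the EADI pipeline is not stated in the text, and the `SnpCounts` container and function names are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class SnpCounts:
    """Genotype counts for one biallelic SNP (hypothetical container)."""
    name: str
    n_aa: int       # homozygous for the first allele
    n_ab: int       # heterozygous
    n_bb: int       # homozygous for the second allele
    n_missing: int  # samples without a genotype call


def call_rate(s):
    called = s.n_aa + s.n_ab + s.n_bb
    return called / (called + s.n_missing)


def minor_allele_freq(s):
    called = s.n_aa + s.n_ab + s.n_bb
    freq_b = (2 * s.n_bb + s.n_ab) / (2 * called)
    return min(freq_b, 1 - freq_b)


def hwe_p_value(s):
    """Chi-square (1 df) goodness-of-fit p value for HWE; an assumed choice of test."""
    n = s.n_aa + s.n_ab + s.n_bb
    p = (2 * s.n_aa + s.n_ab) / (2 * n)
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (s.n_aa, s.n_ab, s.n_bb)
    if min(expected) == 0:
        return 1.0  # monomorphic SNP: no deviation can be tested
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For 1 df, the chi-square survival function equals erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2))


def passes_qc(s, min_call_rate=0.98, min_hwe_p=1e-6, min_maf=0.01):
    """Apply the three thresholds quoted in the text (MAF of 1% is assumed)."""
    return (call_rate(s) > min_call_rate
            and hwe_p_value(s) > min_hwe_p
            and minor_allele_freq(s) > min_maf)
```

A SNP is kept only if all three conditions hold, so a well-genotyped SNP that is rare or that deviates strongly from HWE is still excluded from the imputation input.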

CSF levels dataset
A description of the CSF dataset used in this study can be found in Cruchaga et al. (2013). The data included 1269 unrelated individuals recruited through the Knight-ADRC at Washington University (n = 501, 73% Clinical Dementia Rating [CDR] = 0), the Alzheimer's Disease Neuroimaging Initiative (ADNI) (n = 394, 27% CDR = 0), a biomarker consortium of Alzheimer disease centers coordinated by the University of Washington (n = 323, 61% CDR = 0), and the University of Pennsylvania (UPenn) (n = 51, 2% CDR = 0). Briefly, CSF tau, phospho-tau 181 (ptau), and amyloid beta (Aβ42) levels were obtained from research participants enrolled in longitudinal studies at the Knight-ADRC, ADNI, University of Washington, and University of Pennsylvania. CSF collection and Aβ42, tau, and ptau181 measurements were performed as described previously (Fagan et al., 2006). The samples were genotyped using Illumina chips. Cases received a diagnosis of dementia of the Alzheimer's type, using criteria equivalent to the National Institute of Neurological and Communicative Disorders and Stroke-Alzheimer's Disease and Related Disorders Association criteria for probable AD (McKhann et al., 1984). Controls received the same assessment as the cases but were nondemented. All individuals were of European descent and written consent was obtained from all participants.

Statistical analyses
We performed multivariate logistic regression to evaluate the association between genetic markers and risk for LOAD, adjusting for age, gender, population substructure, and study-specific effects, using PLINK (http://pngu.mgh.harvard.edu/purcell/plink/). Conditional analysis was performed to identify additional independent signals by conditioning on the top case-control GWAS hits. For the meta-analysis, we first estimated the odds ratios for SNPs in each cohort. These models calculate crude odds ratios and confidence intervals from counts of heterozygous individuals among case patients and control subjects in each study. We then used a fixed-effect model to combine the study-specific odds ratios into a summary measure. No multiple-testing correction was used in our analyses. The heterogeneity of effects was evaluated using the Woolf test for heterogeneity (Woolf, 1955). Meta-analysis was conducted using the META package (http://www.stats.ox.ac.uk/~jsliu/meta.html) in R (version 3.0.1). Association of CSF ptau with the genetic variants was analyzed as described previously (Cruchaga et al., 2010; Kauwe et al., 2011). Briefly, CSF ptau values were log transformed to approximate a normal distribution. Because the CSF levels were measured using different platforms (Innotest plate ELISA vs. AlzBia3 bead-based ELISA), we were not able to combine the raw data. Instead, we subtracted from each log-transformed value the mean of the log-transformed values within its series. No significant differences in the transformed CSF values of the different series were found. We used SAS (version 9.2) to analyze the association of SNPs with CSF biomarker levels. Age, gender, site, and the first 3 principal components were included as covariates. We also performed conditional analyses by including several variants in the model.
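The meta-analysis steps described above (crude per-study odds ratios, fixed-effect pooling, and a heterogeneity statistic) can be sketched as follows. This is an illustrative inverse-variance implementation in Python, not the META package used in the study; the function names and the carrier counts are hypothetical and are not data from this article.

```python
import math


def woolf_log_or(case_carriers, case_noncarriers, ctrl_carriers, ctrl_noncarriers):
    """Crude log odds ratio and its standard error from a 2x2 table."""
    log_or = math.log((case_carriers * ctrl_noncarriers)
                      / (case_noncarriers * ctrl_carriers))
    se = math.sqrt(1 / case_carriers + 1 / case_noncarriers
                   + 1 / ctrl_carriers + 1 / ctrl_noncarriers)
    return log_or, se


def fixed_effect_meta(estimates):
    """Inverse-variance pooling of (log_or, se) pairs from several studies."""
    weights = [1 / se ** 2 for _, se in estimates]
    total = sum(weights)
    pooled = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    pooled_se = math.sqrt(1 / total)
    # Heterogeneity statistic: weighted squared deviations from the pooled
    # estimate, compared with a chi-square distribution on k - 1 df.
    q = sum(w * (b - pooled) ** 2 for w, (b, _) in zip(weights, estimates))
    return {
        "or": math.exp(pooled),
        "ci95": (math.exp(pooled - 1.96 * pooled_se),
                 math.exp(pooled + 1.96 * pooled_se)),
        "q": q,
        "df": len(estimates) - 1,
    }


# Hypothetical carrier counts for three studies (not data from this article):
# (case carriers, case non-carriers, control carriers, control non-carriers)
studies = [(30, 70, 40, 60), (45, 155, 60, 140), (12, 88, 20, 80)]
result = fixed_effect_meta([woolf_log_or(*s) for s in studies])
print(result)
```

Under a fixed-effect model each study is weighted by the inverse of its variance, so large cohorts dominate the summary odds ratio; a heterogeneity statistic that is large relative to its degrees of freedom would argue against pooling in this way.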

Genotyping
rs9381040 and rs3747742 were extracted from the GWAS data and confirmed by direct genotyping. The TREM2-p.R47H variant was genotyped using the KASP genotyping assay (LGC Genomics), as previously described (Cruchaga et al., 2009, 2010, 2012), on 2000 cases and control subjects from the Knight-ADRC.

Cell-based analysis
Primary astrocytes and microglia were prepared from 2 litters (16 pups) of P1 C57BL/6 mice. Individual mice were pooled and 12 replicate co-cultures were plated in 25 cm² flasks. Co-cultures were treated with 0.2 ng/mL of mouse interleukin-1 beta (IL-1β) (R&D 401-ML/CF) for 24 hours. Microglia were detached from the plate by shaking at 125 rpm for 1 hour in a 37 °C incubator. RNA was extracted using the miRNeasy mini kit (Qiagen 217004), according to the manufacturer's instructions. The quantitative polymerase chain reaction assays for mouse Trem2 (ID: Mm04209424), Treml2 (ID: Mm01277362), and Saa3 (ID: Mm00441203) were obtained from Life Technologies (NY, USA).
We also performed a linear regression analysis for rs9381040 and rs3747742 with CSF levels of tau and ptau (n = 1269 individuals). rs9381040 (p = 4.11 × 10⁻⁴, beta = -0.02) and rs3747742 (p = 1.4 × 10⁻⁴, beta = -0.02) both exhibit a strong association with CSF ptau levels. The association of each SNP with CSF ptau is no longer significant when the other SNP is included as a covariate in the conditional analysis. These results confirm via 2 independent datasets that the associations of rs9381040 and rs3747742 with CSF biomarker levels and with AD risk represent the same signal. The TREM2-p.R47H variant was also genotyped in a subset of the CSF samples (n = 835). In these samples, 3 variants, rs9381040 (p = 0.04, beta = -0.02) (Fig. 2, panel A), rs3747742 (p = 0.02, beta = -0.02) (Fig. 2, panel B), and rs75932628 (p = 0.0016, beta = 0.2) (Fig. 2, panel C), demonstrate a nominally significant association with CSF ptau levels. To determine whether the TREML2 signal (rs3747742) is independent of TREM2-p.R47H, we removed all of the p.R47H carriers from the analysis. rs3747742 remained significantly associated with CSF ptau levels (p = 0.03) (Fig. 2, panel D). Furthermore, when TREM2-p.R47H was included in the model as a covariate for the rs3747742 analysis, the association remained significant (p = 0.02), which suggests that the TREM2 and TREML2 signals are independent. Importantly, these associations confirmed the direction of the effect on CSF ptau levels: the minor allele of rs3747742 is associated with lower ptau levels (beta = -0.02) and is predicted to be protective for AD risk (OR = 0.91; CI = 0.86-0.97), while the minor allele of TREM2-p.R47H is associated with an increased risk for AD (OR = 1.91; CI = 1.85-1.97) and higher levels of CSF ptau (beta = 0.2).
In addition, TREM and TREM-like receptors modulate the innate immune response by either amplifying or dampening Toll-like receptor-induced signals, playing critical roles in fine-tuning the inflammatory response (Ford and McVicar, 2009). TREM and TREM-like receptors demonstrate different patterns of expression and are likely to play different roles in the inflammatory response. To further understand the relative expression of TREM2 and TREML2, we analyzed gene expression in primary mouse microglia and astrocytes stimulated by IL-1β. Treatment of microglia with IL-1β repressed expression of TREM2 (Fig. 3, panel A), but increased expression of TREML2 (Fig. 3, panel B). The opposing effects of this inflammatory cytokine on TREM2 and TREML2 expression are consistent with our genetic data and with evidence that TREM2 and/or DAP12 antagonizes inflammatory signaling in microglia while TREML2 is not coupled to DAP12 signaling and plays a proinflammatory role (Ford and McVicar, 2009).

Discussion
In summary, these results demonstrate that the associations of missense variants in TREM2 and TREML2 with AD risk are independent. Moreover, our analyses suggest that the AD-associated GWAS signal is likely driven by the TREML2 coding missense variant p.S144G (rs3747742); it results in a similar odds ratio to rs9381040. We also validated 2 other coding variants, p.V25A and p.S129T, in the TREML2 gene in moderate LD (r² = 0.05 and D′ = 1) with the GWAS SNP, both of which exhibited a higher frequency among control subjects than in AD cases (Table 1). However, for both variants we only obtained data by whole-exome sequencing, which limited our analysis of the role that these variants may play in the association of TREML2 with AD risk. To prove that these additional variants are associated with AD risk, we will need a larger sample size. Additionally, the purpose of this study was to find a functional coding variant in the TREML2 gene that could explain the association for TREML2 found in the recent IGAP meta-analysis. Our data suggest that there is a coding variant in TREML2 that could explain the GWAS signal, but our data cannot rule out the presence of functional variants outside of the coding region.
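The two LD measures quoted above can diverge sharply, which a short worked sketch makes concrete. The code below computes D, D′, and r² from allele and haplotype frequencies using the standard definitions; the frequencies in the usage note are hypothetical values chosen for illustration and are not the observed frequencies of p.V25A, p.S129T, or rs9381040.

```python
def ld_measures(p_a, p_b, p_ab):
    """Return (D, D', r^2) for two biallelic loci.

    p_a, p_b: frequencies of allele A at locus 1 and allele B at locus 2.
    p_ab:     frequency of the haplotype carrying both A and B.
    """
    d = p_ab - p_a * p_b
    # D' rescales |D| by the largest value it could take given the allele frequencies.
    if d >= 0:
        d_max = min(p_a * (1 - p_b), (1 - p_a) * p_b)
    else:
        d_max = min(p_a * p_b, (1 - p_a) * (1 - p_b))
    d_prime = abs(d) / d_max if d_max > 0 else 0.0
    r2 = d ** 2 / (p_a * (1 - p_a) * p_b * (1 - p_b))
    return d, d_prime, r2


# Hypothetical example: a rare allele (2%) found only on haplotypes that
# carry a common allele (30%) at the second locus.
d, d_prime, r2 = ld_measures(p_a=0.02, p_b=0.30, p_ab=0.02)
print(d_prime, r2)
```

With these assumed frequencies D′ is 1 (the rare allele never occurs without the common one) while r² is only about 0.05, because r² also depends on how closely the allele frequencies match. A rare coding variant can therefore sit entirely on the haplotype tagged by a common GWAS SNP and still explain little of that SNP's variance.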
We conclude that at least 2 genes in this gene cluster influence risk for AD: TREM2-p.R47H is associated with increased risk for AD (OR = 1.91; CI = 1.85-1.97) and TREML2-p.S144G is associated with reduced risk for AD (OR = 0.91; CI = 0.86-0.97). The mechanisms by which these variants influence AD risk are not currently understood, but it has been suggested that modulation of microglial activation might influence clearance of Aβ (Benitez et al., 2011). These results underline the importance of the inflammatory response in modulating risk for AD and suggest that other genes in this gene family may also harbor risk alleles for AD.

Disclosure statement
The authors report no conflicts of interest. All participants had agreed by signed informed consent to participate in genetic studies approved by our Institutional Review Board.
The authors thank the ARUK consortium for collection of the samples used, and the patients and families whose participation made this work possible. The authors also thank ARUK and the Big Lottery Fund for financial support of this work, and ARUK for funding the PhD studentship of Jenny Lord. EADI1 was supported by the French National Foundation on Alzheimer's disease and related disorders. Data management involved the Centre National de Génotypage, the Institut Pasteur de Lille, Inserm, FRC (fondation pour la recherche sur le cerveau), and Rotary. This work has been developed and supported by the LABEX (laboratory of excellence program investment for the future) DISTALZ grant (Development of Innovative Strategies for a Transdisciplinary approach to ALZheimer's disease). The Three-City Study was performed as part of a collaboration between the Institut National de la Santé et de la Recherche Médicale (Inserm), the Victor Segalen Bordeaux II University, and Sanofi-Synthélabo. The Fondation pour la Recherche Médicale funded the preparation and initiation of the study. The 3C Study was also funded by the Caisse Nationale Maladie des Travailleurs Salariés, Direction Générale de la Santé, MGEN, Institut de la Longévité, Agence Française de Sécurité Sanitaire des Produits de Santé, the Aquitaine and Bourgogne Regional Councils, Fondation de France, and the joint French Ministry of Research and INSERM "Cohortes et collections de données biologiques" programme. Lille Génopôle received an unconditional grant from Eisai.