Detection of pathogenic variants in Alzheimer’s disease related genes in Bulgarian patients by pooled whole-exome sequencing

Abstract
In an effort to better understand the complex genetic background of Alzheimer’s disease (AD), we performed high-coverage whole-exome sequencing of a DNA pool assembled from 66 Bulgarian AD patients. We focused our analysis on genes shown to be associated with AD in previous studies, i.e. PSEN1, PSEN2, APP, APOE, TREM2, HFE, CLU and CR1. In these genes, we established six pathogenic/likely pathogenic variants in the sequenced pool, three common and three rare. Two of these variants showed a statistically non-significant difference between Bulgarian AD patients and Bulgarian control exomes: the hemochromatosis variant rs1800562 (HFE) and rs7412 (APOE), which, notwithstanding its pathogenicity score, has a putative protective role against AD. Three of the remaining four pathogenic/likely pathogenic variants were estimated to differ significantly in frequency between the analyzed AD patient pool and controls. These are the rs429358 (APOE) polymorphism, a well-established risk factor for Alzheimer’s disease, and the rs28936380 (PSEN2) and rs104894002 (TREM2) variants, also reported to be associated with AD. The present study validates the role of three pathogenic/likely pathogenic variants in AD related genes in the multifaceted genetic etiology of Alzheimer’s disease.


Introduction
Alzheimer's disease (AD) is a complex neurodegenerative disorder and the most common type of dementia. Although it affects more than 44 million people worldwide, it currently lacks blood biomarkers, which prevents its effective treatment [1,2]. AD is one of the human multifactorial diseases with the highest level of heritability (60-80%) and has a very intricate genetic etiology [3,4], yet mutations associated with AD development have so far been identified in only a handful of genes. For example, the ε4 allele of the apolipoprotein E (APOE) gene has repeatedly been established as a major genetic risk factor for AD [5]. The vast majority of AD cases are late-onset (LOAD) and are influenced by a large number of low-penetrance alleles that may either increase or reduce the risk [6,7]. Still, rare mutations in the amyloid protein precursor (APP), presenilin-1 (PSEN1) and presenilin-2 (PSEN2) genes are a known cause of familial, early-onset Alzheimer's disease (EOAD, with onset below the age of 65) [8]. Mutations in the APP, PSEN1 and PSEN2 genes have, however, been demonstrated in only 10-15% of EOAD cases, and new data suggest that the unexplained cases involve non-Mendelian patterns of inheritance with a combination of common and newly discovered rare variants [9,10]. Only a fraction of this genetic variation has been identified to date, and the molecular mechanisms underlying EOAD and their association with clinical biomarkers and neuropathological developments remain unclear [11]. Rare variants in the triggering receptor expressed on myeloid cells 2 (TREM2) gene have been shown to nearly triple an individual's risk of developing Alzheimer's disease [12]. Recent studies have illuminated the mechanisms by which TREM2 and its rare variants affect amyloid and tau pathologies [13]. Loss of iron homeostasis can be central to the pathogenic events in AD, and a number of studies have investigated the frequency of mutations in the hereditary hemochromatosis HFE gene in AD cases [14]. A common HFE variant (H63D), for example, could be a modifier of multiple neurological diseases, and iron-targeting therapies are in development for AD [15]. Studies have also identified variants in CLU, a gene encoding another major brain apolipoprotein, and in the complement receptor 1 (CR1) gene as prominent risk factors for the development of AD [16,17]. Genome-wide association studies of this gene have identified both protective and risk alleles whose expression levels may be associated with late-onset AD [18]. The physiological function of apolipoproteins in the central nervous system is still largely unknown.
A cost- and time-effective method to establish variants in such genes is pooled whole-exome sequencing (WES), in which DNA from several individuals is combined into a single sample before being analyzed. For this study, we carried out high-coverage pooled WES to evaluate the frequency differences in pathogenic/likely pathogenic mutations in AD associated genes, i.e. PSEN1, PSEN2, APP, APOE, TREM2, HFE, CLU and CR1, between Bulgarian AD patients and the allele frequencies given in the online database gnomAD [19], which we used as controls.

Ethics statement
This study was approved by the Ethical Committee of the Medical University of Sofia, Bulgaria (N 1879/17.06.2021). The participants were informed about its aims and methodology, and were requested to sign an informed consent form.

Subjects
All subjects were recruited at the Department of Neurology, University Hospital 'Alexandrovska' (Sofia, Bulgaria). We considered as AD cases all individuals diagnosed with dementia due to probable or possible AD, 66 subjects in total. DNA was extracted from blood samples using phenol-chloroform extraction. The concentration of each DNA sample was measured using a Qubit 2.0 fluorometer, and a single DNA pool was constructed with equimolar amounts of DNA from all 66 samples.
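The equimolar pooling step can be illustrated with a minimal sketch. It assumes that equal DNA mass per sample approximates equal genome copies, and that each sample contributes a fixed target mass; the function name, the 100 ng target and the sample readings are hypothetical and are not taken from the study.

```python
def pooling_volumes(concentrations_ng_per_ul, mass_per_sample_ng=100.0):
    """Return the volume (in µL) of each sample that contains the same DNA mass.

    concentrations_ng_per_ul: mapping of sample ID -> fluorometer reading (ng/µL).
    mass_per_sample_ng: target mass per sample (hypothetical value).
    """
    volumes = {}
    for sample_id, concentration in concentrations_ng_per_ul.items():
        if concentration <= 0:
            raise ValueError(f"non-positive concentration for {sample_id}")
        # volume = mass / concentration
        volumes[sample_id] = mass_per_sample_ng / concentration
    return volumes


# Hypothetical readings for three samples (illustrative only).
example_readings = {"AD01": 50.0, "AD02": 25.0, "AD03": 80.0}
volumes = pooling_volumes(example_readings)
```

More dilute samples contribute proportionally larger volumes, so that every patient is represented by roughly the same number of genome copies in the pool.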

WES
WES was performed with Illumina® SBS technology, and sequencing libraries were generated using the Agilent® SureSelect Human All Exon V6 kit (Agilent Technologies, CA, USA). In total, 115023264 raw reads were sequenced and these were filtered for low quality reads (Q-score ≤ 5) and reads containing adapters so as not to affect the quality of the subsequent analyses. Overall, 97.8% of the remaining 115023264 reads had Phred values larger than Q20. The average read length was ~150 bp and the achieved mean coverage was 250x, ensuring the detection of low frequency alleles. The reads were aligned to the reference genome GRCh37/hg19. The acquired .vcf files were annotated using the web-based platform wANNOVAR [20]. The following filters were subsequently applied: total depth of coverage ≥ 30, mapping quality ≥ 60, supporting base quality ≥ 30, number of reads of minor allele ≥ 2 [21,22]. We chose to focus our analyses on variants in genes that have repeatedly been found to be associated with AD, i.e. PSEN1, PSEN2, APP, APOE, TREM2, HFE, CLU and CR1. Thereafter, the variants were chosen based on their designation as pathogenic/likely pathogenic in the ClinVar [23] and/or VarSome [24] databases. Variant allele frequencies in global and Bulgarian exomes were taken from gnomAD (exomes v.2.1) [19].
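The filtering and selection criteria described above can be sketched as follows. The thresholds and the gene list are those stated in the text; the record layout (dictionary keys such as `depth` or `pathogenic`) and the example records are hypothetical and do not reflect the actual wANNOVAR output format.

```python
AD_GENES = {"PSEN1", "PSEN2", "APP", "APOE", "TREM2", "HFE", "CLU", "CR1"}

# Thresholds as stated in the text.
MIN_DEPTH = 30
MIN_MAPPING_QUALITY = 60
MIN_BASE_QUALITY = 30
MIN_MINOR_ALLELE_READS = 2


def passes_quality_filters(variant):
    """Apply the four read-level quality filters to one variant record."""
    return (
        variant["depth"] >= MIN_DEPTH
        and variant["mapping_quality"] >= MIN_MAPPING_QUALITY
        and variant["base_quality"] >= MIN_BASE_QUALITY
        and variant["minor_allele_reads"] >= MIN_MINOR_ALLELE_READS
    )


def select_candidates(variants):
    """Keep variants in AD genes that pass quality filters and are flagged
    pathogenic/likely pathogenic (flag assumed to come from ClinVar and/or VarSome)."""
    return [
        v for v in variants
        if v["gene"] in AD_GENES and passes_quality_filters(v) and v["pathogenic"]
    ]


# Hypothetical records (illustrative only, not data from the study).
variants = [
    {"id": "v1", "gene": "APOE", "depth": 240, "mapping_quality": 60,
     "base_quality": 35, "minor_allele_reads": 30, "pathogenic": True},
    {"id": "v2", "gene": "APP", "depth": 25, "mapping_quality": 60,
     "base_quality": 35, "minor_allele_reads": 5, "pathogenic": True},   # depth too low
    {"id": "v3", "gene": "TREM2", "depth": 200, "mapping_quality": 60,
     "base_quality": 35, "minor_allele_reads": 1, "pathogenic": True},   # one supporting read
    {"id": "v4", "gene": "BRCA1", "depth": 200, "mapping_quality": 60,
     "base_quality": 35, "minor_allele_reads": 20, "pathogenic": True},  # not an AD gene here
]
selected = select_candidates(variants)
```

In this sketch the minor-allele read threshold is applied per variant; with a pool of 66 diploid individuals, it guards against calling a variant from a single, possibly erroneous, read.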

Results
We established a total of 349737 SNPs in the sequenced DNA pool. Six of them are classified as pathogenic/likely pathogenic variants of the AD associated genes PSEN1, PSEN2, APOE, TREM2 and HFE. Three of these variants are common (with MAF > 0.01 in gnomAD exomes), and their respective incidence and frequency in the analyzed sample and in gnomAD exomes (total and Bulgarian) are given in Table 1. Two of these common variants are in the APOE gene and constitute the APOE ε alleles: rs7412, C > T and rs429358, T > C, which are respectively protective and risk factors for Alzheimer's disease [25,26]. The estimated prevalence of the rs429358 variant was significantly higher in the analyzed AD subject group than in the control Bulgarian exomes, and the prevalence of the rs7412 variant was lower in the analyzed AD subject group, albeit non-significantly. Although haplotypes cannot be inferred from our data, this pattern is in accordance with the supposed interplay between rs429358 and rs7412, with the former being a risk factor and the latter conferring protection in AD genesis [27]. The frequency of the third detected common pathogenic variant, rs1800562 of the hemochromatosis gene (HFE), showed non-significant differences between the analyzed patients and the control Bulgarian exomes.
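As an illustration of how variant frequencies can be read from pooled data, the sketch below assumes that the fraction of reads carrying the alternative allele approximates the allele frequency in the pool, and that the 66 diploid patients contribute 132 chromosomes at autosomal sites. The function names and read counts are hypothetical; only the MAF > 0.01 threshold for common variants is taken from the text.

```python
N_PATIENTS = 66
N_CHROMOSOMES = 2 * N_PATIENTS  # 132 autosomal chromosomes in the pool


def pooled_allele_frequency(alt_reads, total_reads):
    """Estimate the pool allele frequency as the fraction of alternative-allele reads."""
    if total_reads <= 0:
        raise ValueError("total_reads must be positive")
    return alt_reads / total_reads


def estimated_allele_count(alt_reads, total_reads, n_chromosomes=N_CHROMOSOMES):
    """Convert the read fraction into an approximate number of allele copies in the pool."""
    return round(pooled_allele_frequency(alt_reads, total_reads) * n_chromosomes)


def is_common(gnomad_maf, threshold=0.01):
    """Common variants are those with MAF > 0.01 in gnomAD exomes."""
    return gnomad_maf > threshold


# Hypothetical example: 40 alternative reads out of 250 at one site.
freq = pooled_allele_frequency(40, 250)
count = estimated_allele_count(40, 250)
```

Under these assumptions, a mean coverage of 250x spread over 132 chromosomes corresponds to fewer than two reads per chromosome on average, so the estimated count for a singleton allele is subject to considerable sampling noise.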
The three rare pathogenic/likely pathogenic variants found in the AD related genes considered in this study and their frequencies in the analyzed sample and in control exomes are given in Table 2. The rs63750053 variant of the PSEN1 gene is not reported in gnomAD (so we may assume a zero allele count), whereas the PSEN2 rs28936380 and TREM2 rs104894002 variants have not been found in over 2600 Bulgarian control exomes and thus differ significantly in frequency between Bulgarian AD patients and Bulgarian control exomes. Their frequency in the global gnomAD exome set is also less than 1:10 000.
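A comparison of allele frequencies between the patient pool and control exomes can be sketched as a Pearson chi-squared test on a 2x2 table of allele counts. This is a minimal illustration without continuity correction, not necessarily the exact procedure used in the study; the counts in the example are hypothetical (one allele copy among 132 pool chromosomes versus none among 5200 control chromosomes, the latter figure being an assumed round number based on "over 2600" exomes).

```python
import math


def chi_squared_2x2(alt_a, total_a, alt_b, total_b):
    """Pearson chi-squared test (1 degree of freedom, no continuity correction)
    comparing alternative-allele counts in two groups of chromosomes.

    Returns (chi2, p_value)."""
    observed = [[alt_a, total_a - alt_a], [alt_b, total_b - alt_b]]
    n = total_a + total_b
    row_sums = [total_a, total_b]
    col_sums = [alt_a + alt_b, n - alt_a - alt_b]

    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_sums[i] * col_sums[j] / n
            if expected == 0:
                raise ValueError("expected count is zero; the test is undefined")
            chi2 += (observed[i][j] - expected) ** 2 / expected

    # For 1 degree of freedom, the chi-squared survival function is erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical counts (illustrative only).
chi2, p_value = chi_squared_2x2(alt_a=1, total_a=132, alt_b=0, total_b=5200)

# Sanity check: identical frequencies in both groups give chi2 = 0 and p = 1.
chi2_null, p_null = chi_squared_2x2(10, 100, 20, 200)
```

With expected counts this small, the chi-squared approximation is known to be unreliable, and an exact test (for example Fisher's exact test) would usually be the more cautious choice for rare variants.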

Discussion
For this study, we performed high-coverage pooled whole exome sequencing to evaluate the difference of pathogenic/likely pathogenic mutations in AD associated genes between Bulgarian AD patients and a control group.
The rs1800562 G > A variant of the hemochromatosis associated HFE gene is found in a homozygous state in 80-90% of hemochromatosis patients, but is also considered to be involved in altered cholesterol balance, Alzheimer's disease and cutaneous photosensitivity [28,29]. We did not establish significant differences in the prevalence of this variant between AD cases and the control group, which is in line with the findings of other studies [30].
The variant rs63750053 in the PSEN1 gene has been found in AD patients of different ethnic backgrounds, associated mainly with early-onset and familial forms of the disease [31,32]. This variant is not reported in gnomAD, indicating a very low frequency in the general population, which in turn is consistent with its pathogenicity. It is involved in the disturbance of the endoproteolytic processing of presenilin 1 that is necessary for the activation of γ-secretase [33].
The estimated frequency of the PSEN2 rs28936380 variant in the AD pool was significantly higher than that in the Bulgarian gnomAD exomes. The C > G transversion at the same nucleotide position (p.Thr122Arg) is causative for familial forms of Alzheimer's disease [34,35], but we were unable to find publications about the phenotypic role of the C > T substitution at this site. Further analyses of the role of the rs28936380 C > T substitution in the pathogenesis of AD are thus warranted.
In the homozygous state, the rare variant rs104894002 in the TREM2 gene causes the rare Nasu-Hakola disease (polycystic lipomembranous osteodysplasia with sclerosing leukoencephalopathy), characterized by early-onset dementia and multifocal bone cysts [36]. Although it is a causative mutation for a Mendelian disorder with dementia, the association between rs104894002 and Alzheimer's disease is controversial [37]. Nevertheless, based on its high impact in homozygous patients and the intermediate inheritance [38], rs104894002 might be a causative genetic factor for the development of Alzheimer's disease in Bulgarian patients.
Intriguingly, we did not establish pathogenic variants in one of the well-established causative genes for Alzheimer's disease, APP, or in the immune related CLU and CR1 genes. This does not exclude the existence of AD causing variants in these genes; rather, it suggests that larger samples need to be analyzed to uncover as yet unidentified causative variants and to determine the interconnection of genes in complex molecular pathways.

Conclusions
High-coverage pooled whole exome sequencing is an efficient approach to assess the prevalence of exomic variants, and in this study we evaluated the difference in prevalence of pathogenic/likely pathogenic variants in AD associated genes between Bulgarian AD patients and Bulgarian control exomes. Six such pathogenic/likely pathogenic variants were established in five AD associated genes, i.e. PSEN1, PSEN2, APOE, TREM2 and HFE. Two of these variants, rs7412 in the APOE gene and rs1800562 in the HFE gene, did not differ significantly in frequency between Bulgarian AD patients and Bulgarian control exomes. The association between the variant rs1800562 and AD is currently only circumstantial, and rs7412 is considered to have a protective effect against the development of Alzheimer's disease. Three of the remaining four variants, rs429358 in the APOE gene, rs28936380 in the PSEN2 gene and rs104894002 in the TREM2 gene, were significantly more prevalent in Bulgarian AD patients than in control exomes, and all have a well-established association with AD development. The present pooled WES analysis validates the role of three pathogenic/likely pathogenic variants in AD associated genes as constituents of the complex genetic background of AD. The incidence and specific impact of these variants should be further investigated in individual DNA samples.

Table 1.
Frequencies of the common pathogenic/likely pathogenic variants in AD related genes.

Table 2.
Frequencies of the rare pathogenic/likely pathogenic variants in AD related genes. Chi-squared test: for allele frequencies between Bulgarian AD patients and gnomAD Bulgarian exomes.