Assessment of ANG variants in Parkinson's disease

Genetic risk factors are occasionally shared between different neurodegenerative diseases. Previous studies have linked ANG, a gene encoding angiogenin, to both Parkinson's disease (PD) and amyotrophic lateral sclerosis (ALS). Functional studies suggest ANG plays a neuroprotective role in both PD and ALS by reducing cell death. We further explored the genetic association between ANG and PD by analyzing genotype data from the International Parkinson's Disease Genomics Consortium (IPDGC) (14,671 cases and 17,667 controls) and whole genome sequencing (WGS) data from the Accelerating Medicines Partnership - Parkinson's disease initiative (AMP-PD, https://amp-pd.org/) (1,647 cases and 1,050 controls). Our analysis did not replicate the findings of previous studies and found no significant association between ANG variants and PD risk.

disease (Riboldi and Di Fonzo, 2019). Therefore, the interrogation of genes common to multiple neurodegenerative disorders is a logical next step in the identification of novel PD risk variants.
One such candidate is ANG, a gene thought to confer a large risk for both ALS and PD (Rayaprolu et al., 2012; van Es et al., 2011). However, studies in Asian populations have suggested there is no link between ANG variants and PD (Chen et al., 2014; Liu et al., 2013). ANG encodes angiogenin, a small protein that plays a role in angiogenesis, the formation of new blood vessels. Angiogenin and its related pathway are thought to play a role in cancer and placental development (Amankwah et al., 2012; Pavlov et al., 2014). An in vitro study has shown that angiogenin has a neuroprotective effect on motor neurons (Subramanian et al., 2008). ALS-associated ANG variants are suggested to potentiate neuronal death through inhibition of the PI3K-Akt pathway (Kieran et al., 2008). A PD mouse model has also shown that angiogenin has a neuroprotective effect on dopaminergic neurons (Steidinger et al., 2011). This neuroprotective effect is suggested to be lost when ANG is mutated, decreasing the viability of motor neurons (Wu et al., 2007). These findings are relevant because PD is characterized by the loss of dopaminergic neurons and ALS by the loss of motor neurons. Interestingly, angiogenin levels have been found to be elevated in the blood serum of ALS patients, but not in PD patients (van Es et al., 2014). This suggests that angiogenin may play a larger role elsewhere, such as in the basal ganglia, a brain structure often associated with PD. Structural work has shown that ten ANG coding variants are associated with a decrease in angiogenin activity, and one coding variant, p.Arg145Cys, is associated with an increase in activity (Bradshaw et al., 2017).
To date, ANG variants have not been associated with either ALS or PD through genome-wide association studies (GWAS) (Nalls et al., 2019; Nicolas et al., 2018), despite previous studies suggesting ANG is associated with risk for these diseases (Rayaprolu et al., 2012; van Es et al., 2011). Here we scrutinize ANG variants in two large PD datasets to assess whether they contribute to PD risk in individuals of European ancestry.

Methods:
We mined whole-genome sequencing (WGS) data from the Accelerating Medicines Partnership - Parkinson's disease initiative (AMP-PD, https://amp-pd.org/), which included 1,647 cases and 1,050 healthy controls from cohorts including the Fox Investigation for New Discovery of Biomarkers (BioFIND), the Parkinson's Progression Markers Initiative (PPMI), the Harvard Biomarker Study (HBS), and the Parkinson's Disease Biomarkers Program (PDBP). We also examined ANG variants in genotype data from the International Parkinson's Disease Genomics Consortium (IPDGC), which included 14,671 cases and 17,667 healthy controls. Variants from both datasets were annotated using ANNOVAR (Wang et al., 2010). Variant frequencies in non-Finnish European populations were obtained from the hg38 gnomAD v3.0 dataset (Karczewski et al., 2020). PLINK 1.9 was used to perform Fisher's exact test to identify significant variants (Purcell et al., 2007). Rare variant burden tests were performed using RVTESTS (Zhan et al., 2016). We further analyzed existing summary statistics, including the latest GWAS meta-analyses for PD risk and age of onset (Blauwendraat et al., 2019; Nalls et al., 2019), and additionally assessed public summary statistics from the most recent ALS GWAS (Nicolas et al., 2018). Variants identified by amino acid change in previous studies, including Van Es et al. and Rayaprolu et al., did not initially match any variants identified in AMP-PD data due to differences in nomenclature. To resolve this, we mapped each variant amino acid change to the angiogenin protein sequence. This sequence was obtained from Ensembl using the ANG-201 ENST00000336811.10 transcript (Yates et al., 2020). We found that the reported amino acid changes from Van Es et al. and Rayaprolu et al. were offset by 25 or 24 amino acids due to differences in how the signal peptide sequence was numbered, and we accounted for this in our analysis. The code we used for analysis is available on the IPDGC GitHub (https://github.com/ipdgc/IPDGC-Trainees/blob/master/ANG.md).
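The numbering reconciliation described above can be illustrated with a minimal Python sketch. It assumes that a variant reported in mature-protein numbering is converted to precursor numbering by adding a fixed offset (24 or 25, as stated above), and that a conversion is accepted only when the reference residue matches the precursor sequence at the shifted position. The function names and the toy sequence are hypothetical; this is not the code from the IPDGC repository, and the toy sequence is not the angiogenin sequence.

```python
import re

# Synthetic precursor for illustration only; NOT the real angiogenin sequence.
# It consists of a 24-residue placeholder "signal peptide" plus a toy mature chain.
TOY_SIGNAL_LENGTH = 24
TOY_PRECURSOR = "M" + "A" * (TOY_SIGNAL_LENGTH - 1) + "ACDEFGHIKLMNPQRSTVWY"


def parse_change(change):
    """Split a change such as 'K9I' or 'p.K9I' into (ref, position, alt)."""
    match = re.fullmatch(r"(?:p\.)?([A-Z])(\d+)([A-Z])", change)
    if match is None:
        raise ValueError(f"unrecognised amino acid change: {change!r}")
    ref, pos, alt = match.groups()
    return ref, int(pos), alt


def to_precursor_numbering(change, precursor, offsets=(24, 25)):
    """Return the change in precursor numbering, or None if no offset fits.

    Each candidate offset is tried in turn; an offset is accepted only if
    the reference residue is found at the shifted (1-based) position.
    """
    ref, pos, alt = parse_change(change)
    for offset in offsets:
        shifted = pos + offset
        if 1 <= shifted <= len(precursor) and precursor[shifted - 1] == ref:
            return f"p.{ref}{shifted}{alt}"
    return None


print(to_precursor_numbering("K9I", TOY_PRECURSOR))
print(to_precursor_numbering("I7V", TOY_PRECURSOR))
```

Checking the reference residue before accepting an offset is what lets the two offsets coexist: a label that does not line up under 24 is retried under 25, and a label that fits neither is left unmatched for manual review. By the same arithmetic, precursor position 41 corresponds to mature position 17 under an offset of 24.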

Results:
We identified a total of 168 ANG variants in the AMP-PD WGS data. Nine of these variants were coding: two were synonymous and seven were nonsynonymous. The top variant after Fisher's exact test (p=0.017) was not significant after Bonferroni correction for multiple tests (threshold p=0.05/168=2.98E-4) (Supplementary Table 1). We compared the nine identified ANG coding variants to variants from two other studies (Rayaprolu et al., 2012; van Es et al., 2011) (Supplementary Table 2). All nonsynonymous variants were rare (minor allele frequency, MAF<0.01). Allele frequencies did not differ significantly from gnomAD non-Finnish European allele frequencies, although most variants were too rare to reliably test individually. After excluding the two synonymous variants, rs11701 (p.G110=) and rs2228653 (p.T121=), from these nine, we observed a frequency of 0.39% in PD cases and 0.48% in controls. Van Es et al. also removed two common variants, rs121909536 (p.K41I) and rs121909541 (p.I70V), from their analysis. After removing these variants from our data, the frequencies were 0.15% in PD cases and 0.19% in controls. This is in contrast with the previously reported 0.45% in PD cases and 0.04% in controls (van Es et al., 2011). Burden tests using variants with minor allele frequency less than 0.03 gave no significant results when using all variants (N variants=72; CMC p=0.493, Fp p=0.509, MB p=0.880, Skat p=0.454, SkatO p=0.523, Zeggini p=0.395). Likewise, there were no significant results when the same tests were run on only coding variants (N variants=9; CMC p=0.866, Fp p=0.510, MB p=0.820, Skat p=0.436, SkatO p=0.556, Zeggini p=0.868). Twenty-six ANG variants were found using the IPDGC imputed genotype data, all of which were non-coding (Supplementary Table 1). None of these variants had a minor allele frequency less than 0.03, so the threshold was increased to 0.05 for burden tests. Only two variants were included at this threshold, which also gave no significant results (N variants=2; CMC p=0.893, Fp p=0.960, MB p=0.948, Skat p=0.980, SkatO p=1, Zeggini p=0.842). No significant association between ANG variants and PD risk (Figure 1A) or age at onset (Supplementary Figure 1) was found in data from the latest PD risk GWAS or the PD age of onset GWAS (Blauwendraat et al., 2019; Nalls et al., 2019). Additionally, no GWAS signal of interest was identified in the most recent ALS GWAS (Figure 1B) (Nicolas et al., 2018).
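The single-variant testing logic reported above (Fisher's exact test per variant, followed by a Bonferroni threshold of 0.05 divided by the number of variants tested) can be sketched in Python. The study itself used PLINK 1.9; the function below is a plain standard-library re-implementation for a 2x2 allele-count table, written only for illustration, and it does not reproduce any counts from the AMP-PD data.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows: cases / controls; columns: minor / major allele counts.
    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x minor alleles among cases.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo = max(0, col1 - row2)
    hi = min(row1, col1)
    # A small relative tolerance guards against floating-point ties.
    p = sum(prob(x) for x in range(lo, hi + 1) if prob(x) <= p_obs * (1 + 1e-7))
    return min(p, 1.0)


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


threshold = bonferroni_threshold(0.05, 168)  # 168 variants, as in the text
p_top = 0.017                                # top p-value reported in the text
print(f"threshold={threshold:.3g}, significant={p_top < threshold}")
```

Because the Bonferroni threshold scales with the number of variants tested, a nominal p-value of 0.017 falls well short of significance once all 168 variants are accounted for.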
Figure 1. Locus zoom plots for ANG and PD risk and ALS risk. The -log10(p-value) of variants on or near ANG is shown on the y-axis, and the base-pair position of each variant is on the x-axis. P-values are taken from the PD risk GWAS (Figure 1A) and the ALS risk GWAS (Figure 1B). Variants are colored by their linkage disequilibrium (R²) with respect to the variant with the lowest p-value on the plot. Recombination rates are shown in blue (Pruim et al., 2010).

Discussion:
Rare coding variants in ANG have been reported to be associated with PD (van Es et al., 2011). Here, our goal was to further explore the role of ANG in PD by analyzing large datasets from IPDGC and AMP-PD. Our study shows no significant enrichment of single ANG variants in PD cases or controls in either dataset. Rare variant burden tests also gave no significant results for ANG. Our analysis provides no evidence to support the hypothesis that genetic variation in ANG plays a role in PD risk or age at onset. The nine coding ANG variants we identified were from AMP-PD WGS data. This dataset included fewer samples (1,647 PD; 1,050 controls) than the Van Es et al. study (6,471 ALS; 3,146 PD; 7,668 controls), which identified a total of 29 unique ANG coding variants. The frequency of ANG coding variants detected in the AMP-PD data was 0.15% in PD cases and 0.19% in controls, which differs from the previously reported 0.45% in PD cases and 0.04% in controls (van Es et al., 2011). Using the Genetic Association Study Power Calculator, we calculated that a genotype relative risk of 3.7 would be detectable at a statistical power of 0.8 (Supplementary Figure 2) (Johnson and Abecasis, n.d.). The detectable genotype relative risk increased to 6.7 at a statistical power of 0.95. This is comparable to the PD odds ratio of 6.7 from previous studies, suggesting we have the statistical power required to replicate these findings (van Es et al., 2011). However, unlike in the previous report, the cumulative frequency of ANG variants identified in AMP-PD data did not differ significantly between cases and controls. A larger sample size may be needed to identify the coding variants not observed here, so that the role of ANG in PD can be assessed on an even larger scale. Overall, despite some potentially interesting functional experiments supporting the neuroprotective effect of angiogenin, we cannot replicate the genetic association between ANG coding variants and PD. Therefore, we cannot conclude that ANG variants play a role in PD, which is in line with previous studies done in Asian populations (Chen et al., 2014; Liu et al., 2013).
This version posted October 27, 2020. The copyright holder for this preprint (which was not certified by peer review) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. This article is a US Government work. It is not subject to copyright under 17 USC 105 and is also made available for use under a CC0 license.