Study populations.
The BiDirect Study was initiated in 2009 as a prospective, observational study integrating three cohorts: 1) community-dwelling adults (control cohort), 2) patients with an acute depressive episode (depression cohort), and 3) patients who had recently suffered an acute myocardial infarction (MI cohort). The study, whose principal goal is the exploration of the bidirectional relationship between depression and subclinical arteriosclerosis, recruited participants in the district of Münster, Germany, and carried out extensive phenotyping and follow-up of all cohorts in parallel. The study design and methods have been previously described in detail [8]. Here, we included 1,899 BiDirect participants (977 males, 922 females; mean age: 52.1 ± 7.9 years) from the control (763), depression (851) and MI (285) cohorts.
The Austrian Stroke Prevention Family Study (ASPS-Fam) cohort represents an extension of the prospective, population-based Austrian Stroke Prevention Study (ASPS) on the effects of vascular risk factors in normal aging. ASPS was established in 1991 in the city of Graz, Austria [9]. For ASPS-Fam, first-degree relatives of ASPS participants were invited to join the study. The study's composition and inclusion criteria have been described elsewhere [10,11]. Here, we included 287 ASPS-Fam participants (115 males, 172 females; mean age: 64.3 ± 10.6 years).
Basic descriptive information for the BiDirect and ASPS-Fam cohorts is shown in Table 1. All participants of the BiDirect and ASPS-Fam cohorts provided written informed consent. Methods were carried out in accordance with the ethical standards laid down in the updated version of the 1964 Declaration of Helsinki. The BiDirect Study was approved by the ethics committee of the University of Münster and the Westphalian Chamber of Physicians in Münster, North-Rhine-Westphalia, Germany. The ASPS-Fam protocol was approved by the ethics committee of the Medical University of Graz, Austria.
Serum measurements of NfL.
Quantification of sNfL in BiDirect and ASPS-Fam was conducted at the University Hospital Basel, Switzerland, using the single molecule array (Simoa®) HDX analyzer (Quanterix, Lexington, MA, USA). In BiDirect participants, measurements of sNfL were obtained from non-fasting blood samples collected at the first visit, using the Simoa® NF-light Advantage Kit. In ASPS-Fam participants, sNfL measurement has been previously described [4]. The sNfL values obtained at initial assessment were log2-transformed and used for all analyses herein reported.
Because sNfL concentrations are known to increase with age [4], we tested for age-adjusted sex- and cohort-dependent sNfL differences in BiDirect using analysis of covariance (ANCOVA). We also tested for correlations, using Pearson's method, between sNfL and markers of inflammation, renal and liver function, lipids, hormones and brain volumes derived from magnetic resonance imaging (MRI) data (106 clinical variables in total). P values <0.05 were considered statistically significant. Here, age refers to the age at participant recruitment, when baseline phenotyping (s0) took place. Clinical variables from up to three subsequent follow-up visits were labeled as time points s2, s4 and s6.
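As a minimal illustration of the correlation step, the sketch below log2-transforms a set of concentrations and computes Pearson's r against one clinical marker. The values, the variable names and the helper `pearson_r` are hypothetical and are not taken from the study, whose analysis software is not specified here.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx = sum(x) / n
    my = sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical sNfL concentrations and a hypothetical clinical marker.
snfl_raw = [4.0, 8.0, 16.0, 32.0]
marker = [1.0, 2.0, 3.0, 4.0]

# log2-transform before any analysis, as described for sNfL above.
log2_snfl = [math.log2(v) for v in snfl_raw]
r = pearson_r(log2_snfl, marker)
```

In this toy example the transformed values are exactly linear in the marker, so r equals 1; with real data the transform mainly reduces the right skew of the raw concentrations.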
Genotype data.
For BiDirect genotypes, genomic DNA was isolated from EDTA whole blood samples using standard DNA extraction kits and procedures at the University of Münster. Genome-wide genotyping was performed with the Infinium PsychArray BeadChip v1 (Illumina) at Life&Brain GmbH (Bonn, Germany). Basic quality control (QC) was employed to remove samples and variants with high rates of missing data. This included removal of individuals with more than 2% missing genotypes, cryptic relatedness (PI_HAT ≥1/16), and genetic outliers (distance in the first two multidimensional scaling components >5 standard deviations from the mean), as well as the removal of variants with more than 2% missing calls and minor allele frequency (MAF) <1%. Genotype imputation was performed with SHAPEIT (pre-phasing) [12] and IMPUTE2 [13] using the 1000 Genomes Project, phase 3, European population reference panel (from here on, 1KG Reference Panel). Imputed variants were filtered for the INFO metric (≥0.8), MAF≥0.01 and Hardy-Weinberg equilibrium (HWE p≥1×10⁻⁶). Individuals were further removed from the sample based on missing phenotypic data (age and baseline sNfL measurement). The final BiDirect GWAS dataset consisted of 5,597,244 genetic variants and 1,899 individuals.
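The variant-level filters can be sketched as follows. This is an illustrative stand-in for the filters normally applied by dedicated genetics software, not the pipeline used in the study. The function name `variant_qc` and the genotype encoding (alternate-allele counts 0, 1, 2, with None for a missing call) are assumptions, and the 2% threshold is read here as a maximum missing rate.

```python
def variant_qc(genotypes, max_missing=0.02, min_maf=0.01):
    """Return (passes, missing_rate, maf) for a single variant.

    genotypes: alternate-allele counts (0, 1 or 2) per individual,
               with None marking a missing call (assumed encoding).
    """
    n = len(genotypes)
    called = [g for g in genotypes if g is not None]
    missing_rate = (n - len(called)) / n
    if not called:
        return False, missing_rate, 0.0
    alt_freq = sum(called) / (2 * len(called))
    maf = min(alt_freq, 1 - alt_freq)  # frequency of the rarer allele
    passes = missing_rate <= max_missing and maf >= min_maf
    return passes, missing_rate, maf

# Hypothetical variant: 100 individuals, no missing calls, 3 alternate alleles.
ok, miss, maf = variant_qc([0] * 98 + [1, 2])
```

The same pattern extends to sample-level filtering by computing the missing rate across variants for each individual instead of across individuals for each variant.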
For ASPS-Fam genotypes, genome-wide genotyping was performed with the Genome-Wide Human SNP Array 6.0 (Affymetrix). During the initial QC, variants with MAF<0.05, HWE p<5×10⁻⁶ or a low call rate (more than 2% missing calls) were excluded. Individuals with sex mismatch, cryptic relatedness, a low sample call rate (more than 2% missing genotypes) or other detected failures were removed. Genotype imputation was performed using the Michigan Imputation Server [14] and the 1KG Reference Panel.
Of note, genetic variants herein comprise single nucleotide polymorphisms (SNPs), as well as small insertions/deletions (indels) present in the datasets.
Screening for genetic associations with sNfL.
We conducted a discovery GWAS in the BiDirect dataset under an additive regression model, adjusting for age, sex, cohort and the first 10 principal components. A secondary GWAS in the smaller ASPS-Fam dataset was performed independently at the Medical University of Graz, using age and sex as covariates. After harmonization of summary statistics from both studies, we performed a weighted meta-analysis of all overlapping variants with Rsq≥0.8 and MAF≥0.01 using Plink 1.9 [15]. Variants with high heterogeneity between studies (I²>40% and Cochran's Q p<0.1) were subsequently excluded.
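For orientation, a two-study meta-analysis with heterogeneity statistics can be sketched as below. The sketch assumes fixed-effect, inverse-variance weighting, which is one common choice and is not confirmed here as the exact scheme used. The function names and input values are hypothetical, and the Q p-value formula is valid only for two studies (1 degree of freedom).

```python
import math

def meta_two_studies(beta1, se1, beta2, se2):
    """Fixed-effect, inverse-variance-weighted meta-analysis of two estimates.

    Returns (pooled beta, pooled SE, two-sided p, Cochran's Q p-value, I^2 in %).
    """
    w1, w2 = 1 / se1 ** 2, 1 / se2 ** 2
    beta = (w1 * beta1 + w2 * beta2) / (w1 + w2)
    se = math.sqrt(1 / (w1 + w2))
    p = math.erfc(abs(beta / se) / math.sqrt(2))   # two-sided normal p-value
    q = w1 * (beta1 - beta) ** 2 + w2 * (beta2 - beta) ** 2
    q_p = math.erfc(math.sqrt(q / 2))              # chi-square tail, 1 df
    i2 = max(0.0, (q - 1) / q) * 100 if q > 0 else 0.0  # df = 2 - 1 = 1
    return beta, se, p, q_p, i2

def is_heterogeneous(q_p, i2, max_i2=40.0, q_alpha=0.1):
    """Flag a variant when both heterogeneity criteria are met."""
    return i2 > max_i2 and q_p < q_alpha

# Hypothetical per-study effect estimates for one variant.
beta, se, p, q_p, i2 = meta_two_studies(0.10, 0.02, 0.10, 0.04)
```

With identical effect estimates in both studies, Q is essentially zero and the variant is retained; strongly discordant estimates drive I² toward 100% and the variant is flagged.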
Definition of genomic loci for sNfL.
For the discovery GWAS and the meta-analysis, we carried out downstream analyses on the FUMA GWAS platform [16]: we defined genomic loci at the suggestive threshold of significance for genome-wide studies (p<1×10⁻⁵), obtained variant annotations and identified the level of support for each signal. Linkage disequilibrium (LD) was defined by r²≥0.6 within a window of 500 kb, according to the 1KG Reference Panel. LD blocks were then formed with variants below the suggestive threshold as lead variants, each block containing all nominally significant (p<0.05) variants in the dataset that were in LD with the corresponding lead variant. Positional (gene) mapping was performed using a maximum distance of 1 kb for the categories protein-coding, long non-coding RNA (lncRNA), non-coding RNA (ncRNA) and processed transcripts. Expression quantitative trait loci (eQTLs) were mapped using the BRAINEAC and GTEx v8 Brain databases. Only SNP-gene pairs with false discovery rate (FDR) <0.05 were annotated.
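The block-forming logic described above can be approximated by a simple greedy procedure. The sketch below is a simplification for illustration and does not reproduce FUMA's implementation: the function name, the input structures and the reading of the 500 kb window as a maximum distance from the lead variant are all assumptions.

```python
def build_ld_blocks(variants, r2, lead_p=1e-5, member_p=0.05,
                    min_r2=0.6, window_bp=500_000):
    """Greedy LD-block formation around suggestive lead variants.

    variants: dict id -> (chromosome, position in bp, p-value)
    r2:       dict frozenset({id_a, id_b}) -> r^2 from a reference panel
    Returns a dict lead_id -> sorted list of member ids (lead excluded).
    """
    assigned = set()
    blocks = {}
    # Visit variants from most to least significant.
    for lead in sorted(variants, key=lambda v: variants[v][2]):
        chrom, pos, p = variants[lead]
        if p >= lead_p:
            break  # remaining variants cannot be leads
        if lead in assigned:
            continue  # already part of a stronger signal's block
        assigned.add(lead)
        members = []
        for vid, (c, bp, pv) in variants.items():
            if vid in assigned or c != chrom or pv >= member_p:
                continue
            if abs(bp - pos) > window_bp:
                continue
            if r2.get(frozenset((lead, vid)), 0.0) >= min_r2:
                members.append(vid)
                assigned.add(vid)
        blocks[lead] = sorted(members)
    return blocks

# Hypothetical summary statistics and pairwise LD values.
variants = {
    "rs1": (1, 1_000_000, 1e-7),
    "rs2": (1, 1_100_000, 1e-6),
    "rs3": (1, 1_200_000, 0.01),
    "rs4": (1, 5_000_000, 0.001),
    "rs5": (2, 1_000_000, 2e-6),
}
r2 = {
    frozenset(("rs1", "rs2")): 0.9,
    frozenset(("rs1", "rs3")): 0.7,
    frozenset(("rs1", "rs4")): 0.8,
}
blocks = build_ld_blocks(variants, r2)
```

Here rs2 and rs3 join the block led by rs1, rs4 is excluded because it lies outside the window despite its high r², and rs5 forms a block of its own on another chromosome.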
Functional implications of suggested candidate genes.
To explore the biological relevance of our findings, we created a protein-protein interaction (PPI) network using our suggested meta-analysis candidate genes as input. The network was generated with the Gene Set analysis tool of the ReactomeFIViz app for Cytoscape v.3.7.1 [17,18]. Linker proteins and functional interaction (FI) annotations (version 2018) were incorporated into the network. In addition, we performed clustering of nodes, as well as enrichment analyses of pathways and gene ontology cellular components (GO_CC) for each network cluster. Gene sets with FDR<0.05 were considered significantly enriched.
SNP heritability (h²SNP).
We calculated the proportion of variance in sNfL concentrations explained by the variants in our discovery GWAS in BiDirect using the GREML-LDMS (LD- and MAF-stratified GREML) method implemented in GCTA [19,20]. For all autosomal variants with MAF≥0.01 in the imputed dataset, we calculated the 200 kb segment-based LD scores, stratified variants according to the LD scores of individual SNPs, computed one genetic relationship matrix for each quartile of the stratified variants, and performed a restricted maximum likelihood analysis using these four matrices. The variance explained was adjusted for the same covariates as the GWAS. SNP heritability from our meta-analysis summary statistics was calculated using LDSC software [21] with LD scores pre-computed in 1KG Reference Panel data, as suggested by the LDSC authors.
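The stratification step that precedes the construction of the four genetic relationship matrices can be illustrated as follows. This is a sketch of the binning logic only and is not GCTA itself. The function name, the input dictionary and the choice of the "inclusive" quantile method are assumptions.

```python
import statistics

def stratify_by_ld_quartile(ld_scores):
    """Split variants into four bins by LD-score quartile.

    ld_scores: dict variant_id -> LD score of that variant
    Returns four lists of ids, from lowest to highest LD score.
    """
    q1, q2, q3 = statistics.quantiles(
        ld_scores.values(), n=4, method="inclusive"
    )
    bins = [[], [], [], []]
    for vid, score in ld_scores.items():
        if score <= q1:
            bins[0].append(vid)
        elif score <= q2:
            bins[1].append(vid)
        elif score <= q3:
            bins[2].append(vid)
        else:
            bins[3].append(vid)
    return bins

# Hypothetical LD scores for eight variants.
ld_scores = {f"v{i}": float(i) for i in range(1, 9)}
bins = stratify_by_ld_quartile(ld_scores)
```

Each of the four resulting variant lists would then be used to build one genetic relationship matrix, so that the variance components are estimated separately for regions of low and high LD.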
Screening for associations with clinical variables.
For the lead variant of each locus resulting from our meta-analysis, we performed genotype-specific comparisons in BiDirect participants using an ANCOVA model adjusted for age. Moreover, for all variants within meta-analysis loci, we tested for associations with the same set of clinical variables used in the correlation analyses. These association tests were performed in the same manner as for baseline sNfL. For these analyses, p values <0.05 were considered suggestive of statistical associations.