Plasma proteins predict conversion to dementia from prodromal disease

The study aimed to validate previously discovered plasma biomarkers associated with AD, using a design based on imaging measures as surrogates for disease severity, and to assess their prognostic value in predicting conversion to dementia.


Introduction
Alzheimer's disease (AD) is the most common neurodegenerative disorder of the aging population, usually affecting people over the age of 65 years and resulting in progressive cognitive and functional decline. Detecting AD at the earliest possible stage is vital to enable trials of disease-modifying agents, and considerable efforts are being invested in the identification and replication of biomarkers for this purpose.
Such biomarkers currently include measures of tau and amyloid beta (Aβ) in cerebrospinal fluid (CSF), brain atrophy using magnetic resonance imaging (MRI), and measures of Aβ pathological load using positron emission tomography (PET). All these approaches are promising, although molecular imaging is currently a costly procedure available in relatively few centers and lumbar puncture is moderately invasive. Furthermore, repeated measures are problematic in both cases.
Blood (plasma), on the other hand, is a more accessible biofluid suitable for repeated sampling. This has led many groups, including ours, to investigate the potential of a diagnostic signal in blood. Using a case-control study design with a gel-based approach (two-dimensional gel electrophoresis and liquid chromatography tandem mass spectrometry), two proteins (complement factor H [CFH] and alpha-2-macroglobulin [A2M]) were observed as potential markers of AD [1], both of which were subsequently replicated by independent groups [2,3]. In the same study we observed changes in serum amyloid P (SAP), complement C4 (CC4), and ceruloplasmin, all of which have been implicated in AD pathogenesis [4][5][6]. However, case-control studies are problematic when there is a long prodromal disease phase. In such instances a large proportion of apparently normal controls already harbor the disease processes and hence may already have a peripheral biomarker disease signature. To overcome the limitations of the case-control design, we searched for proteins associated with surrogates of disease severity (hippocampal atrophy and clinical progression) and identified Clusterin as a marker associated with both of these surrogate measures [7]. Building on this "endophenotype" discovery approach, we subsequently found transthyretin (TTR) and apolipoprotein A1 (ApoA1) to be associated with faster-declining AD subjects, and increased plasma apolipoprotein E (ApoE) levels to be related to increased Aβ burden in the brain [8,9].
These studies, and those from other groups, have identified a set of proteins that might act as biomarkers relevant to AD. However, such findings require replication in large studies, ideally using samples drawn from more than one cohort source and using a platform that enables multiplexing. We therefore developed multiplex panels using our discovery proteins together with additional putative candidate biomarkers that have been implicated in AD and neurodegeneration (Supplementary Table S1).
The aims of the current study were (1) to validate a set of blood-based biomarkers in a large multicenter cohort with specified a priori outcome variables of the disease endophenotype measure of atrophy on MRI and of clinical severity and (2) to determine the accuracy of a multiplexed panel of disease relevant biomarkers in predicting conversion of mild cognitive impairment (MCI) to dementia in a defined time period.

Subjects and clinical classification
Plasma samples from AD subjects, MCI subjects, and elderly nondemented controls were selected from three independent studies: AddNeuroMed (ANM), a multicenter European study [10]; Kings Health Partners-Dementia Case Register (KHP-DCR), a UK clinic- and population-based study; and Genetics AD Association (GenADA), a multisite case-control longitudinal study based in Canada [11]. The diagnosis of probable AD was made according to the Diagnostic and Statistical Manual of Mental Disorders, fourth edition, and the National Institute of Neurological and Communicative Disorders and Stroke-Alzheimer's Disease and Related Disorders Association criteria. MCI was defined according to Petersen criteria [12]. Standardized clinical assessment included the Mini-Mental State Examination (MMSE) for cognition and, for global levels of severity, the Clinical Dementia Rating (ANM and KHP-DCR only). The human biological samples were sourced ethically and their research use was in accord with the terms of the informed consents.
In total we examined plasma samples from 1148 subjects: 476 with AD, 220 with MCI, and 452 elderly controls with no dementia (Table 1). The APOE single nucleotide polymorphisms (SNPs) rs429358 and rs7412 were genotyped using Taqman SNP genotyping assays (determined by allelic discrimination assays based on fluorogenic 5′ nuclease activity) and the allele inferred.
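How an APOE genotype can be inferred from the two SNPs is sketched below. This is an illustration only, not the study's calling procedure, which the text does not describe. It assumes genotypes are reported on the strand where rs429358 C marks the ε4 allele and rs7412 T marks the ε2 allele, and it follows the usual convention of reading the double heterozygote as ε2/ε4. The function name is hypothetical.

```python
def infer_apoe_genotype(rs429358, rs7412):
    """Infer the APOE genotype from unphased calls at rs429358 and rs7412.

    Assumed allele definitions (strand orientation is an assumption):
      e2 = rs429358 T + rs7412 T
      e3 = rs429358 T + rs7412 C
      e4 = rs429358 C + rs7412 C
    """
    calls = (rs429358.upper(), rs7412.upper())
    for call in calls:
        if len(call) != 2 or any(base not in "CT" for base in call):
            raise ValueError("each genotype must be two bases, each C or T")
    n_e4 = calls[0].count("C")   # every C at rs429358 implies one e4 allele
    n_e2 = calls[1].count("T")   # every T at rs7412 implies one e2 allele
    n_e3 = 2 - n_e4 - n_e2       # whatever remains is e3
    if n_e3 < 0:
        # Would require the very rare e1 haplotype (rs429358 C + rs7412 T).
        raise ValueError("combination not explained by e2/e3/e4 alleles")
    return "/".join(["e2"] * n_e2 + ["e3"] * n_e3 + ["e4"] * n_e4)
```

For example, a subject heterozygous at rs429358 and homozygous C at rs7412 is returned as "e3/e4", that is, a carrier of one ε4 allele.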

Cognitive decline
Cognitive decline, as determined by the slope of change in cognition, was calculated for a subset of AD subjects (n = 342) who had a minimum of three separate MMSE assessments. The rate of cognitive decline was calculated separately for ANM because it had a different follow-up interval (every 3 months for 1 year) from KHP-DCR and GenADA, which were followed up yearly for a period of at least 3 years. Linear mixed-effects models were generated using the package "nlme" in R. We estimated the rate of change using a multilevel linear model with random intercepts and random slopes, adjusted for subject- and center-level clustering. Covariates including age at baseline, gender, APOE ε4 allele presence, and years of education were investigated for their effect on the rate of decline. Age at baseline and years of education had a significant effect on the rate (P < .05) and thus were included as fixed effects in the final model. The slope coefficient obtained from the final model for each sample was then used as the rate of cognitive change, defined as the change in MMSE score per year.
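To make the outcome variable concrete, the sketch below computes a per-subject rate of MMSE change as an ordinary least-squares slope. This is a deliberately simpler stand-in for the mixed-effects model described above: it has no random effects, no covariates, and no center-level clustering, so its estimates would not match those of the study. The function name and the example scores are hypothetical.

```python
def mmse_slope_per_year(times_years, scores):
    """Ordinary least-squares slope of MMSE score against time in years.

    Simplified stand-in for the per-subject slope of a mixed-effects model.
    """
    n = len(times_years)
    if n != len(scores) or n < 3:
        raise ValueError("need at least three paired (time, score) assessments")
    mean_t = sum(times_years) / n
    mean_s = sum(scores) / n
    sxx = sum((t - mean_t) ** 2 for t in times_years)
    if sxx == 0:
        raise ValueError("assessment times must not all be identical")
    sxy = sum((t - mean_t) * (s - mean_s) for t, s in zip(times_years, scores))
    return sxy / sxx


# Hypothetical subject assessed yearly for three years after baseline.
example_slope = mmse_slope_per_year([0.0, 1.0, 2.0, 3.0], [24, 22, 19, 17])
```

A negative slope corresponds to decline; the hypothetical subject above loses 2.4 MMSE points per year. A mixed model would additionally shrink such individual slopes toward the group mean, which matters most for subjects with few assessments.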

Magnetic resonance imaging
High-resolution sagittal 3D T1-weighted magnetization-prepared rapid gradient-echo (MPRAGE) volume (voxel size 1.1 × 1.1 × 1.2 mm³) and axial proton density/T2-weighted fast spin echo images were acquired on 1.5 T MRI scanners for 476 of the subjects (179 control, 123 MCI, and 174 AD) as previously reported [13]. The MPRAGE volume was acquired using a custom pulse sequence specifically designed for the Alzheimer's Disease Neuroimaging Initiative (ADNI) study to ensure compatibility across scanners [14]. Full brain and skull coverage were required for all MR images according to previously published quality control criteria [13,15]. Image analysis was carried out using the FreeSurfer image analysis pipeline (version 5.1.0) to produce regional cortical thickness and subcortical volumetric measures as previously described [16,17]. This segmentation approach has been previously used for analysis in imaging proteomic studies [18] and AD biomarker discovery [16]. All volumetric measures from each subject were normalized by the subject's intracranial volume, whereas cortical thickness measures were used in their raw form [19]. Measures of hippocampal volume, entorhinal cortex volume, and ventricular volume were chosen as MRI endophenotypes of AD. For the evaluation of hippocampal atrophy, the MRI data were stratified into high and low atrophy for the MCI group based on their median volumetric measures.
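The normalization and median stratification described above can be illustrated with a short sketch. It assumes that a smaller normalized hippocampal volume means greater atrophy and that subjects exactly at the median fall in the low-atrophy group; neither detail is stated in the text. All names and values are hypothetical.

```python
from statistics import median


def normalise_by_icv(volume_mm3, icv_mm3):
    """Express a regional volume as a fraction of intracranial volume (ICV)."""
    if icv_mm3 <= 0:
        raise ValueError("intracranial volume must be positive")
    return volume_mm3 / icv_mm3


def median_split(normalised_volumes):
    """Label subjects 'high' atrophy if below the group median, else 'low'.

    Assumptions: smaller normalised volume = more atrophy; ties at the
    median are placed in the 'low' atrophy group.
    """
    cut = median(normalised_volumes.values())
    return {subject: ("high" if value < cut else "low")
            for subject, value in normalised_volumes.items()}


# Hypothetical MCI subjects: (hippocampal volume, intracranial volume) in mm^3.
raw = {"s1": (3000.0, 1.5e6), "s2": (3500.0, 1.4e6),
       "s3": (2800.0, 1.6e6), "s4": (3900.0, 1.5e6)}
normalised = {sid: normalise_by_icv(vol, icv) for sid, (vol, icv) in raw.items()}
atrophy_group = median_split(normalised)
```

Normalizing first matters: subject s3 has the smallest raw volume and also the largest head size, so the split is made on the ratio rather than on the raw measure.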

Immunoassay-Luminex measurement
All candidate proteins were measured using multiplex bead assays (Luminex xMAP) (Supplementary S1) incorporated in 7 MILLIPLEX MAP panels (Supplementary S2 and S3) run on the Luminex 200 instrument according to the manufacturer's instructions.

Data preprocessing
Before statistical analysis, we examined the performance of each assay using quality control (QC) checks as outlined in the Supplementary Material. Median fluorescent intensity (MFI) was measured using xPONENT 3.1 (Luminex Corporation) and exported into SigmaPlot (Systat Software; version 12) for estimation of protein concentrations using a five-parameter logistic fit. Briefly, all analytes that passed QC checks based on four criteria (standard curve linearity, intra-assay coefficient of variation [CV], interassay CV for a reference sample, and percentage of missing data; Supplementary Material S4) were taken forward for further analysis.
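The sketch below shows how a concentration can be read back from an MFI value once a five-parameter logistic (5PL) standard curve has been fitted. It uses one common parameterization, MFI = d + (a - d) / (1 + (x/c)^b)^g, which may differ from the one used in the study's software. The curve fitting itself is not shown, and the parameter values are hypothetical.

```python
def five_pl(conc, a, b, c, d, g):
    """Five-parameter logistic response (e.g. MFI) at a given concentration.

    a: response at zero concentration, d: response at infinite concentration,
    c: mid-range concentration scale, b: slope factor, g: asymmetry factor.
    """
    return d + (a - d) / (1.0 + (conc / c) ** b) ** g


def inverse_five_pl(response, a, b, c, d, g):
    """Concentration that produces `response` on the same 5PL curve."""
    if response == d:
        raise ValueError("response equals the upper asymptote")
    ratio = (a - d) / (response - d)
    if ratio < 1.0:
        raise ValueError("response lies outside the range of the standard curve")
    return c * (ratio ** (1.0 / g) - 1.0) ** (1.0 / b)


# Hypothetical fitted parameters for one analyte.
PARAMS = {"a": 50.0, "b": 1.2, "c": 100.0, "d": 20000.0, "g": 0.8}
mfi = five_pl(25.0, **PARAMS)
recovered_conc = inverse_five_pl(mfi, **PARAMS)
```

Responses that fall outside the two asymptotes cannot be mapped to a concentration and raise an error here; in practice such samples would be flagged as out of range, which feeds into the missing-data QC criterion.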

Univariate analysis
Univariate statistical analysis was performed in SPSS 20 (IBM). All raw MFI measures were log10-transformed to achieve a normal distribution. Covariates including age, gender, plasma storage duration (days), and center were investigated. We found that most proteins were significantly affected by these covariates, and values were therefore adjusted using a generalized linear regression model (GLM). All subsequent analysis was performed on the GLM-adjusted data. Partial correlation analysis (adjusting for APOE genotype) was performed to examine associations with either structural MRI brain imaging or cognitive assessments. Correlations were performed separately within diagnostic groups due to the discrete nature of the clinical scores across all groups. The proteins were also analyzed individually for their association with disease status (AD vs. control) via analysis of covariance (adjusting for APOE genotype). Multiple linear regressions were performed to test how combinations of proteins could predict hippocampal volume.
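A partial correlation with a single adjusting variable can be written in closed form from three Pearson correlations. The sketch below illustrates this for a protein level x, an MRI or cognitive measure y, and an adjusting variable z. Coding APOE genotype as a single number (for example ε4 allele count) is an assumption made here for illustration; the text does not state how genotype entered the SPSS analysis.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def partial_correlation(x, y, z):
    """Correlation of x and y after removing the linear effect of z from both."""
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    return (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz ** 2) * (1 - r_yz ** 2))


# Hypothetical data in which x and y are related only through z.
z = [1, 1, -1, -1]
x = [2, 0, 0, -2]
y = [2, 0, -2, 0]
raw_r = pearson(x, y)                     # 0.5
adjusted_r = partial_correlation(x, y, z)  # approximately 0
```

In the hypothetical data the raw correlation of 0.5 disappears once z is accounted for, which is the situation the APOE adjustment is meant to guard against.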

Classification analysis
Class prediction and attribute selection were performed using WEKA (University of Waikato). The Naive Bayes Simple algorithm was used with default settings unless stated otherwise. The data set was randomly split into 75% training and 25% test sets for the MCI-converter (MCIc) and MCI-nonconverter (MCInc) groups. Attribute selection was performed on the training data using the Classifier Subset Evaluator with the best-first search method. Five iterations of attribute selection were performed and proteins were ranked by the number of times they were selected. Proteins selected three or more times were taken forward as predictor variables (Table 3). Any class imbalance was overcome by applying the Synthetic Minority Oversampling Technique in WEKA.
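For readers unfamiliar with the classifier, the following is a minimal Gaussian naive Bayes sketch in which each numeric attribute is modelled by a per-class normal distribution. It is not WEKA's implementation; attribute selection, the 75%/25% split, and oversampling are not shown; and the small variance floor is an assumption added for numerical safety. The data are hypothetical.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(rows, labels):
    """Estimate the class prior and per-feature (mean, variance) for each class."""
    grouped = defaultdict(list)
    for row, label in zip(rows, labels):
        grouped[label].append(row)
    model = {}
    for label, members in grouped.items():
        n = len(members)
        stats = []
        for column in zip(*members):
            mean = sum(column) / n
            var = sum((v - mean) ** 2 for v in column) / n + 1e-9  # variance floor
            stats.append((mean, var))
        model[label] = (n / len(rows), stats)
    return model


def predict(model, row):
    """Return the class with the highest log posterior for one sample."""
    best_label, best_score = None, -math.inf
    for label, (prior, stats) in model.items():
        score = math.log(prior)
        for value, (mean, var) in zip(row, stats):
            score += -0.5 * math.log(2 * math.pi * var) - (value - mean) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical two-protein training data.
train_rows = [[1.0, 1.2], [0.8, 1.0], [1.2, 0.9], [5.0, 5.1], [4.8, 5.2], [5.2, 4.9]]
train_labels = ["MCIc", "MCIc", "MCIc", "MCInc", "MCInc", "MCInc"]
nb_model = fit_gaussian_nb(train_rows, train_labels)
```

The "naive" part is the sum over features in `predict`: each protein contributes independently to the log posterior, which keeps the model small enough to fit on a few hundred subjects.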

Cut-off point analysis
Untransformed protein concentrations on the full data set (n = 169, MCIc, and n = 51, MCInc) were binarised at different cut-off points using the upper and lower quartile ranges and the percentile rank. A minimum of three cut-off concentrations were tested per protein. Logistic regression analysis was performed on individual cut-off concentrations, which were selected based on their accuracy of predicting conversion.
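The cut-off search can be sketched as follows. As a simplification, the sketch scores each candidate cut-off by plain classification accuracy rather than by fitting a logistic regression on the binarised values, and it uses the three quartile boundaries as candidates. The function names and data are hypothetical.

```python
from statistics import quantiles


def candidate_cutoffs(concentrations):
    """Lower quartile, median, and upper quartile as three candidate cut-offs."""
    return quantiles(concentrations, n=4)


def cutoff_accuracy(concentrations, converted, cutoff, above_predicts_conversion=True):
    """Fraction of subjects whose binarised value matches their outcome."""
    correct = 0
    for value, outcome in zip(concentrations, converted):
        if above_predicts_conversion:
            predicted = value > cutoff
        else:
            predicted = value < cutoff
        correct += predicted == outcome
    return correct / len(concentrations)


def best_cutoff(concentrations, converted, above_predicts_conversion=True):
    """Candidate cut-off with the highest accuracy for predicting conversion."""
    return max(
        candidate_cutoffs(concentrations),
        key=lambda c: cutoff_accuracy(concentrations, converted, c,
                                      above_predicts_conversion),
    )


# Hypothetical protein concentrations and conversion outcomes.
levels = [10, 20, 30, 40, 50, 60, 70, 80]
outcome = [False, False, False, False, False, False, True, True]
chosen = best_cutoff(levels, outcome)
```

The `above_predicts_conversion` flag reflects that some proteins are informative when high and others when low, so the direction has to be chosen per protein.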

Study participants
The demographic and clinical characteristics of participants from the three cohorts are presented in Table 1. The AD group was marginally but significantly older than controls (AD: mean 77 years; controls: 75 years; P = .01). The frequency of the APOE ε4 allele was higher in the MCI and AD groups than in controls.

Plasma proteins and brain atrophy
Of the 26 proteins measured, only two were found to be significantly different between AD and controls (ApoE: F = 6.5, P < .001; CFH: F = 6.1, P < .001). However, using partial correlation and adjusting for APOE, we identified a number of plasma proteins that were significantly associated with atrophy using MRI measures of one or more of the following brain regions: hippocampus, entorhinal cortex, ventricles, and whole-brain volume in the disease groups (Table 2A and B). Controlling for multiple testing, only Clusterin (MCI group: P < .001) and ApoE (AD group: P = .0014) remained significant.
We then set out to identify proteins that collectively would predict disease progression, as represented by the surrogate of hippocampal atrophy, in a predisease group of MCI subjects. In summary, we found an association of Clusterin with greater atrophy, and a trend toward reduced RANTES, NSE, and TTR levels in the MCI group. In the AD group, A1AT, NSE, ApoC3, ApoA1, ApoE, and BDNF plasma levels were increased in subjects with increased atrophy.

Plasma proteins, clinical cognition, and cognitive decline
We examined the relationship between these proteins and disease severity as measured by cognition at the time of sampling and by the rate of change in cognition. In the MCI group at the point of sampling, both ApoE and C-reactive protein (CRP) negatively correlated with MMSE (ApoE: r = -0.15, P = .001; CRP: r = -0.186, P = .007).
Furthermore, we assessed the association of the proteins with longitudinal prospective MMSE change in the AD group. Three proteins, NCAM, soluble receptor for advanced glycation end products (sRAGE), and intercellular adhesion molecule (ICAM), were significantly associated with the rate of change in cognition; NCAM and sRAGE were both negatively correlated (NCAM: r = -0.129, P = .0018; sRAGE: r = -0.125, P = .029), whereas ICAM was positively correlated (ICAM: r = 0.108, P = .047).

Protein biomarkers to predict disease conversion: MCI to AD
In summary we confirm that a number of proteins, previously identified as putative markers of AD, correlated with disease severity, measured by MRI or severity of cognitive impairment not only in disease but in the predisease state of MCI. We therefore reason that if these proteins are reflecting pathological load, they may also be markers predictive of conversion from predisease states such as MCI to clinical dementia.
To test this, we used a machine learning approach (Naive Bayes Simple) with feature selection on a training data set (Figure 1) and then applied this to a test set. The average time of conversion of MCI to AD was approximately 1 year (375 days, SD = 23 days). Ten proteins (TTR, Clusterin, cystatin C, A1AcidG, ICAM1, CC4, pigment epithelium-derived factor [PEDF], A1AT, RANTES, ApoC3) plus APOE genotype had the greatest predictive power (Table 3). The receiver operating characteristic (ROC) curve for the independent test set is shown in Figure 2A. The ROC area under the curve (Table 4A) of the test set was 0.78 (protein only) and 0.84 (protein + APOE genotype). To test the accuracy, we investigated three different sensitivity cut-off points at 30%, 50%, and 85%. The optimal accuracy was observed at the 85% sensitivity cut-off, with the test achieving an accuracy of 87% with a specificity of 88%.
Table 3 abbreviations: NSE, neuron-specific enolase; SAP, serum amyloid P; CFH, complement factor H; CRP, C-reactive protein.
Table 3 note: Ranked according to the number of times a protein was observed in the feature selection. Proteins highlighted in bold were taken forward as the predictors for MCI conversion.
We then investigated whether combining structural MRI data with these 10 proteins observed in the MCI conversion data would improve classification accuracy. MRI brain measures for a subset of subjects were combined with the protein data and the Naive Bayes algorithm was applied. In this smaller data set the proteins alone performed very well when tested at the three different sensitivity cut-offs (cut-off: accuracy; 30%: 83.3%, 50%: 80.6%, 85%: 69.4%). The addition of MRI data only marginally improved the accuracy at two cut-off points (cut-off: accuracy; 30%: 86%, 50%: 83%) but reduced it at the 85% sensitivity cut-off to 64% (Figure 2B and Table 4B).
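The performance measures reported in this section (ROC area under the curve, sensitivity, specificity, and accuracy) can be computed from classifier scores as in the sketch below. The scores and labels are hypothetical and do not reproduce the study's results. The AUC is computed by its rank interpretation, the probability that a randomly chosen converter scores higher than a randomly chosen nonconverter.

```python
def roc_auc(scores, labels):
    """AUC as the chance that a positive outscores a negative (ties count half)."""
    pos = [s for s, y in zip(scores, labels) if y]
    neg = [s for s, y in zip(scores, labels) if not y]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def confusion_metrics(scores, labels, threshold):
    """Sensitivity, specificity, and accuracy when score >= threshold means 'converter'."""
    tp = sum(1 for s, y in zip(scores, labels) if y and s >= threshold)
    fn = sum(1 for s, y in zip(scores, labels) if y and s < threshold)
    tn = sum(1 for s, y in zip(scores, labels) if not y and s < threshold)
    fp = sum(1 for s, y in zip(scores, labels) if not y and s >= threshold)
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(labels),
    }


# Hypothetical posterior probabilities of conversion and true outcomes.
scores = [0.9, 0.8, 0.4, 0.7, 0.3, 0.2, 0.1, 0.6]
labels = [True, True, True, False, False, False, False, False]
auc = roc_auc(scores, labels)
metrics = confusion_metrics(scores, labels, threshold=0.5)
```

Moving the threshold trades sensitivity against specificity along the ROC curve, which is why accuracy is reported at several sensitivity cut-offs rather than at a single operating point.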

Concentration cut-off points for proteins predicting MCI to AD
Individual protein cut-off values were derived for the 10 proteins identified by feature selection in the MCI conversion model. Values predictive of conversion to AD were: ApoC3 < 105.5 μg/ml, TTR < 222 μg/ml, A1AT < 9.5 μg/ml, PEDF > 10.7 μg/ml, CC4 > 78.5 μg/ml, ICAM-1 < 99.72 ng/ml, RANTES < 33.8 ng/ml, A1AcidG > 768.3 μg/ml, Cystatin C < 3.21 μg/ml, and Clusterin > 402 μg/ml. Logistic regression was applied to test the 10 protein cut-off concentrations and APOE genotype; the overall model accuracy was 94.9%, with a sensitivity of 73.6% and a specificity of 94.9% when using the full data set.

Discussion
Previous studies by our group using data-driven pan-proteomic approaches have identified a number of proteins as diagnostic markers [1], progression markers [7,20], and markers of disease severity [18]. The advent of high-throughput multiplex platforms facilitates the replication of such findings and raises the potential of high-throughput multiplexed markers for use in clinical practice and in clinical trials [21][22][23]. In this study we have used a multiplex antibody capture platform to determine whether our putative biomarkers are associated with early disease stages and might have value as prognostic markers. Using MRI as a surrogate of disease pathology, we identified a number of proteins that were associated with atrophy either early in the disease process (MCI) or in established dementia.
This approach of using MRI as a proxy for in vivo pathology has previously been shown to be useful in biomarker discovery, including in our study identifying Clusterin as a putative marker of disease [7]. In this study we identified RANTES, NSE, and transthyretin, in addition to Clusterin, to be associated with cortical atrophy in the MCI group, with Clusterin showing the strongest correlation with all brain regions assessed. The other proteins have previously been implicated in AD. RANTES, also known as chemokine ligand 5, is a protein known to have an active role in recruiting leukocytes into inflammatory sites. We find a negative association between RANTES and ventricular volume, suggesting a decreased level with increased disease-related pathology, the opposite of previous reports in neurodegeneration [24][25][26]. One possible explanation is that, because we observe the RANTES association with atrophy only in MCI and not in AD, a decrease early in the disease process is followed by an increase. Similar findings have been previously reported for other proteins [27], and we also observe a similar relationship with NSE, the second protein we observe in association with brain atrophy. This protein is thought to be an indicator of acute neuronal damage [28,29] and has been associated with AD in some but not all previous studies [30,31]. We find a positive association between NSE and volume of hippocampus and whole brain in MCI subjects, but in the AD group we find a positive association instead with ventricular volume. This inverse relationship with atrophy in predisease, followed by a positive correlation with atrophy in disease, suggests to us that, like RANTES, NSE might be decreased in early disease stages (i.e., MCI) with a rebound elevation in established AD.
In established AD we observe a different set of proteins associated with disease severity as measured by atrophy on MRI, in line with this concept of disease-phase-specific biomarkers. A number of these belong to the group of apolipoproteins (ApoE, ApoC3, and ApoA1). We found that these proteins were negatively associated with hippocampal, entorhinal cortical, and whole-brain volumes. The roles of apolipoproteins in neurodegenerative disorders have been studied extensively since the discovery that APOE was a major susceptibility gene for AD [32,33]. In the peripheral system, ApoE serves in the transport of triglycerides, phospholipids, and cholesterol into cells [34]. The literature on ApoE is conflicting, with some groups reporting lower ApoE in AD [35,36] and others showing increased levels [37,38]. ApoE plasma measurements derived from this study have been recently published and are in agreement with the findings from the North American Alzheimer's Disease Neuroimaging Initiative (ADNI), which reports an APOE genotype effect [39].
Our present findings suggest that we have identified a panel of plasma biomarkers, associated with neuroimaging measures of disease, which may serve as readily accessible markers of early disease severity. Moreover, we identify a set of 10 protein biomarkers that can prospectively predict disease conversion from MCI to AD within a year of blood sampling. These results are supported by other evidence that plasma proteins can have a role in early disease detection, with inflammatory proteins in particular being identified as possible predictors of conversion from MCI [23,40]. It is important to note that when attempting to compare such biomarker studies, the lack of standardized reagents, particularly antibodies, may result in different outcomes reflecting technical differences between analytical platforms more than disease biology. Therefore our ability to replicate these proteins using an orthogonal approach (mass spectrometry in discovery, multiplexed immune capture in replication) makes these findings particularly powerful and robust. Moreover, combining MRI with protein measures did not improve predictive power, in contrast to previous studies where CSF marker performance was improved in combination with MRI [41]. Although this study builds on previous discovery-led and replicated findings, further replication will be needed. Ideally such replication should be in large, longitudinal, population-based cohorts. Such a study would be able to address potential confounds of the data reported in this study, including site-specific effects and representativeness of the cohorts. Further studies will also be needed to address specificity. The markers we have identified are often altered in other disease areas (inflammation, cardiovascular, respiratory, dental, and others), and it will be important to distinguish the relative overlap and confounding by these diseases.
However, although the individual proteins in the panel we have identified are often altered in other disease states, these diseases are all different and therefore the panel itself may show specificity even if the individual proteins do not. This remains to be determined. It also remains to be seen whether the panel we have identified is specific to AD or shows biomarker utility in relation to other dementia syndromes. Although we used an assessment protocol that we have previously shown is highly accurate in distinguishing AD from other dementias based on post-mortem confirmation, it will be interesting in due course to correlate the behavior of our panel with specific markers of dementia pathology such as biochemical or imaging measures of Aβ and tau.
In summary, using a multiplexing approach we have validated a plasma protein panel as a marker reflecting disease severity and for predicting disease progression within three large multicenter cohorts. Such a marker set may have considerable value in triaging patients with early memory disorders to other, more invasive approaches such as molecular markers in CSF and PET imaging, in clinical trials, and possibly in clinical practice.

We are grateful to all the participants from the AddNeuroMed, GenADA, and KHP-DCR studies. We would like to thank Megan Pritchard and the research team for the IoP sample collection and assessments. We would also like to thank Shaikh Masood and Lawrence Rentoul for their technical support.

1. Systematic review: We searched PubMed up to February 2014 using the keywords Alzheimer's disease (AD), plasma, prediction, pathology, mild cognitive impairment, and MCI to AD conversion.
2. Interpretation: This is the largest (n = 1148) multicenter plasma validation study based on previous discovery candidate biomarkers. In addition, we identified markers that are strongly associated with disease endophenotype measures based on magnetic resonance imaging and clinical severity. Moreover, these biomarkers can prospectively predict disease conversion from MCI to AD with an accuracy of 87%, exceeding that of any previously reported plasma biomarkers. Our findings suggest the potential role of these biomarkers detected in plasma as indirect indicators of AD pathology, and their utility as predictors for future disease conversion.
3. Future directions: To validate the clinical utility of the current study results, an independent study of equal or greater size is required to test the accuracy of this panel of biomarkers.