Computational Causal Modeling of the Dynamic Biomarker Cascade in Alzheimer's Disease

Background Alzheimer's disease (AD) is a major public health concern, and there is an urgent need to better understand its complex biology and develop effective therapies. AD progression can be tracked in patients through validated imaging and spinal fluid biomarkers of pathology and neuronal loss. We still, however, lack a coherent quantitative model that explains how these biomarkers interact and evolve over time. Such a model could potentially help identify the major drivers of disease in individual patients and simulate response to therapy prior to entry in clinical trials. A current theory of AD biomarker progression, known as the dynamic biomarker cascade model, hypothesizes AD biomarkers evolve in a sequential but temporally overlapping manner. A computational model incorporating assumptions about the underlying biology of this theory and its variations would be useful to test and refine its accuracy with longitudinal biomarker data from clinical trials. Methods We implemented a causal model to simulate time-dependent biomarker data under the descriptive assumptions of the dynamic biomarker cascade theory. We modeled pathologic biomarkers (beta-amyloid and tau), neuronal loss biomarkers, and cognitive impairment as nonlinear first-order ordinary differential equations (ODEs) to include amyloid-dependent and nondependent neurodegenerative cascades. We tested the feasibility of the model by adjusting its parameters to simulate three specific natural history scenarios in early-onset autosomal dominant AD and late-onset AD and determine whether computed biomarker trajectories agreed with current assumptions of AD biomarker progression. We also simulated the effects of antiamyloid therapy in late-onset AD. Results The computational model of early-onset AD demonstrated the initial appearance of amyloid, followed by biomarkers of tau and neurodegeneration and the onset of cognitive decline based on cognitive reserve, as predicted by the prior literature. 
Similarly, the late-onset AD computational models demonstrated the first appearance of amyloid or nonamyloid-related tauopathy, depending on the magnitude of comorbid pathology, and also closely matched the biomarker cascades predicted by the prior literature. Forward simulation of antiamyloid therapy in symptomatic late-onset AD failed to demonstrate any slowing in progression of cognitive decline, consistent with prior failed clinical trials in symptomatic patients. Conclusions We have developed and computationally implemented a mathematical causal model of the dynamic biomarker cascade theory in AD. We demonstrate the feasibility of this model by simulating biomarker evolution and cognitive decline in early- and late-onset natural history scenarios, as well as in a treatment scenario targeted at core AD pathology. Models resulting from this causal approach can be further developed and refined using patient data from longitudinal biomarker studies and may in the future play a key role in personalizing approaches to treatment.


Introduction
Alzheimer's disease (AD), one of the leading public health priorities in the U.S., is projected to affect over 15 million people by 2050. The high failure rate of clinical drug trials over the past decade is in large part rooted in an incomplete understanding of its complex causal mechanisms [1]. Genetic pathway analyses implicate over 1000 different molecular species and over 30 metabolic pathways in the pathophysiology of AD, including amyloid and tau proteinopathies, inflammation, microglial activation, alterations in signaling pathways, cholesterol metabolism, and cholinergic function [2]. It is therefore likely that AD is not a single disease, but a common end-stage pathway resulting from multiple interacting etiologies. Effective treatment will likely require a personalized medicine approach to track disease progression, determine the major pathophysiologic drivers, and tailor an appropriate therapy. AD progression can be tracked in patients, from presymptomatic to late-stage disease, through several validated biomarkers.
These are biomarkers of AD core pathology (cerebrospinal fluid and PET scan markers of beta-amyloid and tau proteins) and biomarkers of neuronal loss (FDG-PET and volumetric MR imaging). Data from the Alzheimer's Disease Neuroimaging Initiative (ADNI) and other naturalistic studies have led to a hypothetical model of disease progression known as the dynamic AD biomarker cascade theory [3], which hypothesizes that AD biomarkers evolve in a sequential but temporally overlapping manner. According to the hypothesis, amyloid pathology is an early event, leading to tau pathology, followed by neuronal loss and cognitive decline. Additional refinements of the model have been proposed to make it more generalizable to community-based aging populations, including the addition of suspected nonamyloid pathology (SNAP) (e.g., cerebrovascular disease, age-related changes, and non-AD tauopathies), as well as the concept of cognitive reserve (e.g., protective factors such as genetics or education), both of which could influence the variability in the onset of biomarkers and cognitive decline [4]. Although this theoretical model has been operationalized into a categorical scheme for classifying patients, the system remains descriptive and makes no assumptions regarding putative causal relationships among biomarkers [5]. Understanding how these biomarkers interact, evolve over time, and result in cognitive expression of disease will be essential to harness them in a personalized medicine approach to AD diagnosis and treatment. Given the complexity of AD, a rigorous mathematical and computational modeling approach, such as that offered by systems biology, will be a critical component. The tools of systems biology may be used to incorporate clinical biomarkers of disease progression into a computational model to determine the major pathoetiologic drivers of disease in individual patients and help simulate the effects of potential interventions.
One modeling approach, known as causal modeling, refers to an explicitly formulated mathematical description of the biological phenomena of interest, based on existing knowledge, in terms of cause-and-effect relationships. This is in contrast to correlative models, which merely describe statistical associations between variables without regard to the mechanism driving the phenomena under investigation. Our goal was to construct and test the feasibility of an initial computational causal model (CCM) of AD biomarker progression, based on the updated dynamic AD biomarker cascade theory [4,6].
This would enable the theory to be tested rigorously with existing data and further refined as new data become available.

Methods
For the construction of the causal model, assumptions about biomarker relationships and temporal course were drawn from the prior literature [3,4,6]. We tested whether the CCM would lead to the predicted biomarker trajectories described in the literature and whether it would predict failed outcome of antiamyloid therapy started late in the disease course [7]. Figure 1 shows the variables and their relationships in the computational model.

Computational Model Construction.
We implemented the above causal model, using the ordinary differential equation (ODE) toolbox in MATLAB (MathWorks®, Natick, MA), as a system of nonlinear first-order ODEs that includes amyloid-dependent and nondependent neurodegenerative cascades. The amyloid-dependent cascade is initiated by amyloid beta, Aβ, and mediated via phosphorylated tau, τ_ρ. The nonamyloid-dependent cascades are initiated by comorbidities, e.g., aging and/or suspected non-Alzheimer pathology (SNAP), either directly or indirectly via nonamyloid-dependent tauopathy, τ_0. Initiation of cognitive decline, C, is directly determined by neurodegeneration, comorbidities, genetic factors, and cognitive reserve. In the model equations, Aβ represents the amyloid pathology; τ_ρ represents the amyloid-related tau pathology (p-tau); τ represents the total tau pathology, defined as the sum of τ_ρ and τ_0, where τ_0 represents the age-related and/or SNAP-related tauopathy; N represents the neuronal dysfunction/loss; and C represents the cognitive impairment. τ_0, rather than τ_ρ, was explicitly modeled because it can be directly measured via assay. λ denotes the various rate constants. λ_Aβ, λ_τρ, λ_N, and λ_C reflect the logistic growth rates of the various biomarker cascades. The remaining rate constants reflect linear growth rates of the biomarkers and determine the influence of various factors on the time of onset of the subsequent biomarker cascades. λ_CN, for example, is a rate constant that reflects the influence of neurodegeneration on cognitive decline, which is modified by cognitive reserve (for example, education level). This, along with comorbid pathologies and genetic risk alleles, determines the age of onset of the cognitive decline cascade. δ_Aβ represents the degradation rate constant for Aβ and, in this model, mediates the effects of antiamyloid therapy.
A_o represents amyloidopathy, A_Rx(t) represents the time-dependent function for antiamyloid therapy, AS represents aging and/or SNAP, R represents cognitive reserve, and ε represents the ApoE allele status. The descriptions of all variables and parameters are listed in Tables 1 and 2, respectively. The additional assumptions of the model are as follows.
(1) Biomarker cascade growth is implemented via a logistic growth model with carrying capacity K. K is adjusted using a least squares minimization procedure so that all biomarkers reach the maximal level of 1 at the age of 100 years. This is done to give the biomarker curves a sigmoidal shape with a progressively steeper slope in the right-hand tail for later-changing biomarkers, as described in the hypothetical model.
(2) At time t = 0, Aβ is set to a very small number. This is done to initiate the amyloid cascade sometime during the lifespan, even in the absence of amyloidopathy. For simplicity, amyloidopathy, A_o, is set to zero in all models, and a slightly larger initial value of Aβ is used for the early-onset model, whereas a slightly smaller value is used in the late-onset models. All other biomarker initial values are set to zero, except for total tau in the late-onset models, which is set to the minimum biomarker level on the graphs.
(3) The minimal biomarker level on the graphs is set to 0.05 to allow for different onset delays for the sigmoidal-shaped biomarker curves that depend upon both biology and biomarker sensitivity. The minimal detection level is set to 0.15.
(4) Amyloidopathy, SNAP, aging, and ApoE status are constants across the age span that add linearly to the growth rate of the biomarkers and cause earlier initiation of the amyloid, tau, neurodegenerative, and/or cognitive decline cascades.
(5) Cognitive reserve is a constant that modifies the effect of neuronal degeneration on the onset of cognitive decline. A lower value is used in the low-risk group, and a higher value in the high-risk group.
(6) Antiamyloid therapy, once initiated, is assumed to be maintained throughout the lifespan, and A_Rx(t) is simulated as a Heaviside step function, H[n], using the half-maximum convention, where n represents the age of initiation of therapy. In the case of antiamyloid therapy, the carrying capacity, K, was determined for all biomarkers under the no-therapy condition, in a natural history context, with δ_Aβ set to zero. δ_Aβ was then changed to a positive number to simulate the effects of amyloid degradation, with fixed K values based on natural history. This was done to ensure that the evolution of biomarkers in the pretherapy interval was in no way influenced by the administration of therapy later in the course of the disease.
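Assumption (6) can be illustrated with a minimal Python sketch of a Heaviside step under the half-maximum convention. The function names `heaviside_half` and `a_rx`, and the choice to evaluate the step at the difference between current age and the age of therapy initiation, are assumptions made for illustration; the authors' MATLAB implementation is not reproduced here.

```python
def heaviside_half(x: float) -> float:
    """Heaviside step with the half-maximum convention: H(0) = 0.5."""
    if x < 0:
        return 0.0
    if x > 0:
        return 1.0
    return 0.5


def a_rx(age: float, start_age: float) -> float:
    """Hypothetical therapy indicator: 0 before start_age, 1 after it.

    Assumes the step is evaluated at (age - start_age), so that therapy,
    once started, stays on for the rest of the lifespan.
    """
    return heaviside_half(age - start_age)


if __name__ == "__main__":
    # Therapy assumed to start at age 70 (illustrative value only).
    for age in (60.0, 70.0, 80.0):
        print(age, a_rx(age, start_age=70.0))
```

In a model of this kind, such an indicator would multiply the degradation term, so the pretherapy trajectory is identical to the natural history trajectory.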
To determine the feasibility of the CCM, we parameterized and tested four versions: (1) early-onset autosomal dominant AD, (2) late-onset amyloid-first AD, (3) late-onset tau-first AD, and (4) antiamyloid therapy in late-onset amyloid-first AD. In the first three scenarios, the goal was to determine whether manipulating the CCM parameters in a physiologically meaningful manner could reproduce biomarker trajectories that closely match those visually depicted in the literature. In the fourth scenario, we determined whether the model would predict the outcome of recently failed clinical trials of antiamyloid therapy administered in symptomatic late-onset AD [7].

Figure 1: Variables and their relationships in the computational model [4,6]. Blue circles represent biomarker quantities. Aβ represents the amyloid pathology. Its initial value determines when during the lifespan the amyloid cascade begins. τ_ρ represents the amyloid-related tau pathology (p-tau). τ_0 represents the age-related and/or suspected non-Alzheimer pathology (SNAP)-related tauopathy. N represents the neuronal dysfunction/loss. C represents the cognitive impairment. λ values are the growth rate constants, and δ is a degradation/clearance rate constant. Amyloidopathy, aging, SNAP, ApoE status, and cognitive reserve are the constants that modify the onset of the growth cascades. Antiamyloid therapy is a function of time.

Results

Figure 2 illustrates that the cascade of early-onset familial AD derived from the prior literature [6] (Figure 2(a)) closely matches the output generated by our CCM (Figure 2(b)). Specifically, both demonstrate the initial appearance of amyloid, followed by tau and neurodegeneration, and then the onset of cognitive decline. The figure also shows how cognitive reserve could modify the cascade. Although tau in these figures represents total tau, in this scenario it is dominated by p-tau (amyloid-related tau). The model parameters are shown in Table 2.
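The sequential ordering described above (an upstream marker first, a downstream marker later) can be illustrated with a toy two-variable cascade. This is a sketch under stated assumptions, not the authors' published system: the equations, the rate constants `R_A`, `R_T`, and `K_AT`, the initial value `A0`, and the fixed carrying capacity of 1 are all hypothetical. Only the 0.15 detection level is taken from the text.

```python
# Toy cascade (hypothetical, for illustration only):
#   dA/dt   = R_A * A * (1 - A)                    upstream marker, logistic growth
#   dtau/dt = (R_T * tau + K_AT * A) * (1 - tau)   downstream marker, seeded by A
R_A = 0.2      # assumed logistic growth rate of A (per year)
R_T = 0.2      # assumed logistic growth rate of tau (per year)
K_AT = 0.01    # assumed linear influence of A on tau
A0 = 1e-4      # assumed small initial value that seeds the cascade
DETECT = 0.15  # minimal detection level stated in the text
DT = 0.05      # integration step (years)


def deriv(state):
    a, tau = state
    return (R_A * a * (1.0 - a), (R_T * tau + K_AT * a) * (1.0 - tau))


def rk4_step(state, dt):
    """One classical fourth-order Runge-Kutta step."""
    def shift(s, k, h):
        return tuple(si + h * ki for si, ki in zip(s, k))
    k1 = deriv(state)
    k2 = deriv(shift(state, k1, dt / 2))
    k3 = deriv(shift(state, k2, dt / 2))
    k4 = deriv(shift(state, k3, dt))
    return tuple(
        s + dt / 6 * (a + 2 * b + 2 * c + d)
        for s, a, b, c, d in zip(state, k1, k2, k3, k4)
    )


def detection_ages(max_age=100.0):
    """Return the first grid age at which each marker reaches DETECT."""
    state = (A0, 0.0)  # tau starts at zero and is driven only by A
    onset = [None, None]
    n_steps = int(round(max_age / DT))
    for i in range(1, n_steps + 1):
        state = rk4_step(state, DT)
        for j, value in enumerate(state):
            if onset[j] is None and value >= DETECT:
                onset[j] = i * DT
    return tuple(onset)


if __name__ == "__main__":
    onset_a, onset_tau = detection_ages()
    print(f"A detectable at ~{onset_a:.1f} y, tau at ~{onset_tau:.1f} y")
```

With these assumed values the upstream marker crosses the detection level before the downstream one, because the downstream marker starts at zero and can only grow once the upstream marker has seeded it.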

Computational Model of Late-Onset Amyloid-First AD.
The late-onset AD CCM (Figure 3) output shows that amyloid appears first, followed by total tau and neurodegeneration. In this CCM, the arrival of amyloid is delayed compared to that in early-onset AD but reaches the detection threshold prior to total tau. The CCM trajectories visually matched those predicted in the literature [6]. The model parameters are shown in Table 2.

Computational Model of Late-Onset Tau-First AD.
In this CCM (Figure 4), the arrival of total tau precedes that of amyloid and initiates neurodegeneration, whereas the subsequent appearance of amyloid accelerates this process. Our CCM mimics a condition described in the literature as suspected nonamyloid pathology (SNAP) in some ways (absence of initial amyloid) but illustrates a mixed pathology concept, where amyloid and amyloid-related tau contribute to cognitive decline at later stages. The model parameters are shown in Table 2.

Figure 5 depicts the outcome of our CCM of antiamyloid therapy when given to amyloid-first late-onset AD dementia patients after symptom onset. This model output shows no benefit on the onset or slope of cognitive decline, despite the amyloid level dropping substantially from its peak. Tau levels drop marginally. The model mimics the results of recent failed antiamyloid therapy trials in probable AD dementia [7]. In this model, antiamyloid therapy would have to be given before a hypothetical tipping point to show benefits on cognition. The model parameters of Figure 5(b) are shown in Table 2.
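The qualitative behavior described above, a large drop in the upstream marker with little downstream effect when therapy starts late, can be reproduced in a toy cascade. This sketch illustrates the idea under assumed equations and parameters and is not the authors' model: the two-variable system, the rate constants, the degradation constant `DELTA`, and the therapy start ages are all hypothetical. The therapy switch uses the half-maximum Heaviside convention described in the Methods.

```python
R_A, R_T, K_AT = 0.2, 0.2, 0.01  # assumed growth and coupling rates (per year)
DELTA = 0.3                      # assumed degradation rate while therapy is on
A0 = 1e-4                        # assumed initial upstream value
DT = 0.05                        # integration step (years)


def heaviside_half(x):
    """Heaviside step with H(0) = 0.5 (half-maximum convention)."""
    return 0.0 if x < 0 else (1.0 if x > 0 else 0.5)


def deriv(age, state, start_age):
    a, tau = state
    on = 0.0 if start_age is None else heaviside_half(age - start_age)
    da = R_A * a * (1.0 - a) - DELTA * on * a     # growth minus therapy-driven clearance
    dtau = (R_T * tau + K_AT * a) * (1.0 - tau)   # downstream marker also grows on its own
    return (da, dtau)


def shift(state, k, h):
    """Return state + h * k, element by element."""
    return tuple(s + h * ki for s, ki in zip(state, k))


def simulate(end_age, start_age=None):
    """Integrate with classical RK4 from age 0; return (A, tau) at end_age."""
    state = (A0, 0.0)
    for i in range(int(round(end_age / DT))):
        age = i * DT
        k1 = deriv(age, state, start_age)
        k2 = deriv(age + DT / 2, shift(state, k1, DT / 2), start_age)
        k3 = deriv(age + DT / 2, shift(state, k2, DT / 2), start_age)
        k4 = deriv(age + DT, shift(state, k3, DT), start_age)
        state = tuple(
            s + DT / 6 * (a + 2 * b + 2 * c + d)
            for s, a, b, c, d in zip(state, k1, k2, k3, k4)
        )
    return state


if __name__ == "__main__":
    for label, start in (("none", None), ("early, 20 y", 20.0), ("late, 70 y", 70.0)):
        _, tau60 = simulate(60.0, start)
        a90, tau90 = simulate(90.0, start)
        print(f"therapy {label}: A(90)={a90:.3f} tau(60)={tau60:.3f} tau(90)={tau90:.3f}")
```

In this toy system, late therapy lowers the upstream marker substantially but leaves the downstream marker essentially unchanged, because the downstream marker has become self-sustaining by then. Starting earlier lowers the downstream marker at a given age, which is one simple way to picture a tipping point.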

Discussion
We have implemented a CCM that incorporates the three clinically available categories of biomarkers used to track AD progression: amyloidopathy, tauopathy, and neurodegeneration. The model effectively simulates the temporal evolution of the biomarkers and their relation to cognitive decline as described in the previous literature [3,4,6], taking into account late versus early onset, the influence of aging and co-occurring non-AD-related brain pathology common in the elderly, and the concept of cognitive resilience to AD pathologic changes. In addition, we simulate the effects of a disease-modifying therapy given late in the disease course, after patients become symptomatic. This CCM was developed both as a means to test existing theories and as a new resource for the field that can be refined as our knowledge advances. The hypothetical model of the AD pathological cascade, originally published in 2010 [3] and updated in 2013 [4], is based largely on cross-sectional biomarker data due to limited individual longitudinal biomarker data. It postulates a temporal evolution of markers of amyloid pathology, tau pathology, and neurodegeneration, represented as sequential plots of biomarker abnormality over time, leading to cognitive impairment. Three different pathological and neuronal loss scenarios were considered: early-onset familial AD, late-onset amyloid-first AD, and late-onset tau-first AD [6]. We created and parameterized a CCM, based on assumptions of underlying biology inherent in the AD pathological cascade, to successfully simulate these three natural history scenarios. Our CCM of early-onset autosomal dominant AD closely matched the temporal order and shape of the biomarker trajectories in the literature schematic [3] and is also supported by empirical data from the longitudinal Dominantly Inherited Alzheimer Network (DIAN) study [8]. Our CCM of the late-onset amyloid-first and tau-first models of AD also closely simulated the curves postulated in the literature [4]. Lastly, our simulation of antiamyloid therapy in symptomatic late-onset amyloid-first AD mimicked the negative findings from several failed clinical trials of antiamyloid therapies [7]. Of note, most of the disease-modifying treatment trials in preclinical or mild AD, recently completed and ongoing, target beta-amyloid or tau pathologies.

Figure 2: Model of early-onset, autosomal dominant AD. The red, blue, and yellow curves represent the evolution of amyloid, tau, and neuronal biomarker levels, respectively, over the course of the disease. The green curves represent cognition in two hypothetical high- and low-risk groups based on low and high cognitive reserve. Our CCM-generated curves (b) closely match the schematic model curves (a) (adapted from [6] with permission) from the prior literature.

Figure 3: Model of late-onset, amyloid-first AD. The red, blue, and yellow curves represent the evolution of amyloid, tau, and neuronal biomarker levels, respectively, over the course of the disease. In (a), the blue and yellow lines are combined into a single purple line, per the original theory in which tau was considered a neurodegenerative marker. The green curves represent cognition in two hypothetical high- and low-risk groups, based on cognitive reserve. Our CCM-generated curves (b) closely match the pattern hypothesized in the literature ((a) is adapted from [6] with permission).
There are some limitations to our work. First, the hypothetical model [6] on which we built our CCM may not be entirely accurate. Although some aspects of the AD pathological cascade model, for example, the ordering and shape of the biomarker curves, have been validated using data-driven approaches [9,10], the model continues to evolve as more natural history and clinical trial data become available. Second, we did not parameterize our models using actual patient biomarker data. Rather, our goal in this work was to construct and test the feasibility of a CCM that would best match the hypothetical biomarker trajectories proposed in the literature [6]. Subsequent goals will include adjusting parameters based on longitudinal data and iteratively refining the model itself as new knowledge becomes available. Third, our CCM is a simplified causal model of biomarkers interacting with each other, an abstraction that does not model the actual underlying cellular and molecular processes. Prior CCM efforts in AD have modeled the disease at a molecular and cellular level [11,12] as well as at a whole-brain, systems level using MRI and EEG data [13,14]. There have been few prior CCM applications that have specifically focused on clinical AD biomarkers [13,15], and only one has incorporated all three clinically available categories of biomarkers: amyloidopathy, tauopathy, and neurodegeneration [16]. Augmenting these efforts and overcoming the above limitations will require large real-world datasets of individual longitudinal biomarker trajectories across the cognitive continuum as well as integration of genomic, cellular, and biomarker knowledge. Such efforts are underway on an international scale [17-20].

Figure 4: Model of late-onset tau-first AD. The red, blue, and yellow curves represent the evolution of amyloid, tau, and neuronal biomarker levels, respectively, over the course of the disease. In (a) (adapted from [6] with permission), the blue and yellow lines are combined into a single purple line, per the original theory in which tau was considered a neurodegenerative marker. The green curves represent cognition in two hypothetical high- and low-risk groups, based on cognitive reserve. The CCM (b) closely matches the trajectories proposed in the literature (a).

Figure 5: CCM simulation of the effect of antiamyloid therapy administered in AD after symptom onset. The red curve shows a marked decline in brain amyloid levels, the blue line shows a small decrease in tau levels, and the green lines show no significant effect on the onset or rate of cognitive decline, consistent with the many failed trials.
A key strength of the CCM approach is that it allows for testing underlying causal assumptions in an integrated fashion, unlike other published correlative mathematical models of clinical biomarkers [9,10,21-27] that treat them independently or fit the data without considering its underlying causal structure. For example, several studies validated the temporal ordering of biomarkers without attempting to explain the underlying disease mechanism by which this temporal ordering arises [9,10,26,27]. Causal models allow for testing the effects of nonlinear interactions among multiple AD biomarkers and comorbid conditions that cannot be deduced by intuition alone, as well as for predicting response to single and combination therapies. A CCM can be implemented in a "forward" manner to simulate new data or in a "backward" manner, using a Bayesian inversion procedure, to infer the causal architecture of the system based on existing data. This approach has been applied extensively to reconstruct mechanistic models of brain function and disease, including AD, from electrophysiologic and imaging data [13,15,28]. Unlike descriptive models of disease, which can become increasingly difficult to validate, particularly as datasets of biomarker trajectories become larger and more complex, CCMs can be easily scaled up to increasing degrees of complexity. It is our hope that once a CCM resource for clinical AD biomarkers is created, it will be parameterized based on data from patient studies, expanded, and iteratively refined over time. Ultimately, such a model would create a global resource for the field to translate existing knowledge, personalize care, and accelerate drug discovery for this devastating disorder [29].
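The "forward" and "backward" uses of a causal model mentioned above can be sketched on a single logistic curve. This is a deliberately simplified illustration, not the Bayesian inversion schemes cited in the text: the closed-form logistic trajectory, the flat prior over a grid of candidate growth rates, the Gaussian noise level `SIGMA`, and the deterministic pseudo-noise are all assumptions.

```python
import math

X0 = 1e-4      # assumed initial biomarker value
SIGMA = 0.02   # assumed measurement noise (standard deviation)


def forward(rate, ages):
    """Forward use: closed-form logistic trajectory with carrying capacity 1."""
    return [1.0 / (1.0 + (1.0 / X0 - 1.0) * math.exp(-rate * t)) for t in ages]


def posterior(observed, ages, candidate_rates):
    """Backward use: grid posterior over the growth rate.

    Assumes a flat prior over candidate_rates and independent Gaussian noise.
    """
    log_post = []
    for rate in candidate_rates:
        predicted = forward(rate, ages)
        sse = sum((o - p) ** 2 for o, p in zip(observed, predicted))
        log_post.append(-sse / (2.0 * SIGMA ** 2))
    peak = max(log_post)
    weights = [math.exp(lp - peak) for lp in log_post]  # subtract peak for stability
    total = sum(weights)
    return [w / total for w in weights]


ages = [float(a) for a in range(0, 101, 5)]
true_rate = 0.2
# Deterministic pseudo-noise (alternating +/-0.01) keeps the example reproducible.
observed = [x + (0.01 if i % 2 == 0 else -0.01)
            for i, x in enumerate(forward(true_rate, ages))]
rates = [0.10 + 0.01 * k for k in range(21)]  # candidate rates 0.10 to 0.30
post = posterior(observed, ages, rates)
best_rate = rates[post.index(max(post))]

if __name__ == "__main__":
    print(f"posterior mode at rate ~ {best_rate:.2f}")
```

The same pattern (simulate forward for each candidate, score against data, normalize) extends in principle to several coupled biomarkers, although a grid search becomes impractical as the number of parameters grows.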
Data Availability
The data presented in this manuscript are based on simulations and therefore can be reproduced, given the equations, parameters, and descriptions in this article.

Conflicts of Interest
JRP and PMD have received grants from, and have served as advisors to, several companies in the Alzheimer's disease field. PMD owns stock in companies whose products are not discussed here. He is also coinventor of patents related to the Alzheimer's disease field. The other authors declare no relevant conflicts of interest.