
CSF tau and the CSF tau/ABeta ratio for the diagnosis of Alzheimer's disease dementia and other dementias in people with mild cognitive impairment (MCI)

This is not the most recent version


Abstract

This is a protocol for a Cochrane Review (Diagnostic test accuracy). The objectives are as follows:

To determine the diagnostic accuracy of 1) CSF tau, 2) CSF p‐tau, 3) the CSF tau/ABeta ratio and 4) the CSF p‐tau/ABeta ratio index tests at various thresholds for detecting participants with mild cognitive impairment (MCI) at baseline who would clinically convert to Alzheimer’s disease or other forms of dementia at follow‐up.

Background

Dementia is a progressive syndrome of global cognitive impairment with resultant functional decline.  In the United Kingdom (UK), it affects 5% of the population over 65 and 25% of those over 85 (Knapp 2007).  Worldwide, there were estimated to be 36 million people living with dementia in 2010 (Wilmo 2010), and this will increase to over 115 million by 2050.  The greatest increases in prevalence will be seen in the developing regions.  By 2040, China and its western‐Pacific neighbours are predicted to have 26 million people living with dementia (Ferri 2005). 

Dementia encompasses a group of neurodegenerative disorders that are characterised by progressive loss of cognitive function and ability to perform activities of daily living, that can be accompanied by neuropsychiatric symptoms and challenging behaviours of varying type and severity.  The underlying pathology is usually degenerative and subtypes of dementia include Alzheimer’s disease dementia, vascular dementia, dementia with Lewy bodies, and frontotemporal dementia.  There may be considerable overlap in the clinical and pathological presentations (MRC CFAS 2001), and there is often co‐existence of Alzheimer’s disease dementia, vascular dementia and other causes of neuronal atrophy (Matthews 2009; Savva 2009). 

Alzheimer’s disease dementia is an incurable, progressive, neurodegenerative condition which accounts for over 50% of dementias, afflicting 5% of men and 6% of women over the age of 60 worldwide (World Health Organization 2010).  Its prevalence increases exponentially with age, with Alzheimer’s dementia affecting less than 1% of people aged from 60 to 64 years, but 24% to 33% of those over the age of 85 (Ferri 2005).

There have been over a dozen different definitions used to describe cognitive impairment that is somehow qualitatively different from so‐called ‘normal’ ageing. The first complaints in patients on the Alzheimer’s disease spectrum are often cognitive problems, such as problems with planning and judgement as well as the more characteristic memory complaints. This may lead to a diagnosis of mild cognitive impairment (MCI) if formal testing reveals objective evidence of cognitive impairment. It has not previously been mandated which psychometric tests should be used to define cognitive impairment objectively. However, the objectivity of the cognitive impairment is critical as it differentiates this population from a group with subjective cognitive impairment, which is more likely to have a non‐neurodegenerative aetiology. MCI is a heterogeneous condition, the diagnosis of which holds very little prognostic significance. There are four outcomes for those within an MCI population: progression to Alzheimer’s disease dementia, progression to another dementia, maintaining stable MCI and recovery. Currently, sixteen different classifications are used to define MCI (Matthews 2008). In this protocol MCI refers to this extended definition of MCI or to the clinical criteria defined by the Petersen criteria or revised Petersen criteria (Petersen 1999; Petersen 2004; Winbald 2004) or to the Clinical Dementia Rating scale (CDR = 0.5) (Morris 1993).

Studies (Petersen 1999; Bruscoli 2004; Mattson 2009; Petersen 2009) indicate that an annual average of 5% to 15% of MCI patients progress to Alzheimer’s disease dementia. The rate depends on clinical profile, setting and investigation for vascular disease. At present there is no clinical method to determine accurately which patients with MCI will develop Alzheimer’s disease dementia or other dementia subtypes.

Research suggests that measurable change in positron emission tomography (PET), magnetic resonance imaging (MRI) and cerebrospinal fluid (CSF) biomarkers occurs years in advance of the onset of clinical symptoms (Beckett 2010). In this protocol we aim to assess the ability of:

  1. CSF tau,

  2. CSF phosphorylated tau (p‐tau),

  3. The CSF tau/ABeta ratio, and

  4. The CSF p‐tau/ABeta ratio,

to enable the detection of Alzheimer’s dementia and other dementia subtypes in patients with MCI. These biomarkers have been chosen as they are considered to be the most closely linked to the core pathology of Alzheimer's disease, namely the aggregation and fibrillisation of amyloid into plaques and the hyperphosphorylation of tau. Consequently, these biomarkers have been proposed as important in new criteria for Alzheimer's dementia that incorporate biomarker abnormalities.

Target condition being diagnosed

In this review there are two target conditions: i) Alzheimer's disease dementia and ii) other forms of dementia, which will be assessed at follow‐up.

We will be comparing the index test results obtained at baseline with the results of the reference standard obtained at follow‐up (delayed verification of diagnosis). 

Index test(s)

This review is part of a suite of reviews for assessing the accuracy of PET, MRI and other index tests (please see the Alternative test(s) section) in identifying those patients without clinical onset of dementia, who would develop Alzheimer's dementia or other forms of dementia during follow‐up. We would consider the following.

Tau and phosphorylated tau (p‐tau) CSF biomarker tests

Tau is a microtubule‐associated protein located primarily in neuronal axons. There are six different human isoforms, each of which has multiple phosphorylation sites. Physiologically tau interacts with tubulin and plays an important role in the organisation and stabilisation of microtubules. Independent of phosphorylation status, slightly increased levels of CSF total tau (t‐tau) have been associated with ageing, vascular dementia, multiple sclerosis, AIDS dementia, head injury and tauopathy; significant increases with Creutzfeldt‐Jakob disease and meningoencephalitis; and a threefold increase has been seen in Alzheimer’s disease compared to normal controls (Shoji 2002). Systematic review of CSF biomarkers for Alzheimer’s disease in 2001, analysing 41 studies of CSF t‐tau, demonstrated a specificity of 90% and sensitivity of 81% in diagnosing the condition (Blennow 2003).

The p‐tau protein also has a number of potential phosphorylation sites (Billingsley 1997) and abnormal hyperphosphorylation has been shown to be associated with microtubule disruption and the formation of neurofibrillary tangles, dystrophic neurites surrounded by neuritic plaques, and neuropil threads, major components of Alzheimer’s disease pathophysiology (see Mandelkow 1998). Systematic review in 2001 of 11 studies of CSF p‐tau in Alzheimer’s disease indicated a diagnostic specificity and sensitivity of 92% and 80% respectively (Blennow 2003).

There is great interest in the use of biomarkers and imaging techniques for predicting progression from MCI to Alzheimer’s disease dementia and other dementia subtypes. The international consortium study, the Alzheimer’s Disease Neuroimaging Initiative (ADNI), performed between 2004 and 2009, has so far been a key cohort study for predicting the progression from MCI to Alzheimer’s disease using biomarkers. It demonstrated a sensitivity and specificity of 70% and 92% for CSF t‐tau and of 68% and 73% for CSF pTau181, respectively (Petersen 2010).

Tau/ABeta ratio and p‐tau/ABeta ratio CSF biomarker tests

ABeta is produced mainly by neurons, secreted into the CSF and then cleared through the blood‐brain barrier and degraded by the reticuloendothelial system. ABeta levels are thus regulated in strict equilibrium between the brain, CSF and blood (Shoji 1992), but in Alzheimer’s disease patients ABeta42 forms insoluble amyloid and accumulates as intra‐cerebral fibrils, resulting in decreased levels of CSF ABeta42 (Shoji 2001).

ABeta in CSF has only modest potential as a test for delayed verification of Alzheimer’s disease (Ritchie 2013), with meta‐analysis of studies being hampered by poor methodological quality (Noel‐Storr 2013) and multiple thresholds being reported between studies (Ritchie 2011).

In 2001, the American Academy of Neurology produced practical guidelines for dementia, including three Class II or III reports in a systematic review of a combination study of ABeta42 and t‐tau CSF levels. The sensitivity and specificity for diagnosis of Alzheimer’s disease were 85% and 87% (Knopman 2001), supported by the 2001 systematic review revealing 83% to 100% sensitivity and 85% to 95% specificity for the CSF ABeta42 and total tau combination assay (Blennow 2003).

Again, the ADNI cohort study demonstrated that the total tau/ABeta42 ratio could be used to predict conversion from MCI to Alzheimer’s disease dementia, revealing a sensitivity of 86% and specificity of 85% (Petersen 2010).

Clinical pathway

Dementia develops over a trajectory of several years.  There is a presumed period when people are asymptomatic, and when pathology is accumulating.  Individuals or their relatives may then notice subtle impairments of recent memory.  Gradually, more cognitive domains become involved, and difficulty planning complex tasks becomes increasingly apparent.  In the UK, people usually present to their general practitioner, who may administer the index tests, and may refer the person to a hospital memory clinic.  However many people with dementia do not present until much later in the disorder and will follow a different pathway to diagnosis, for example being identified during an admission to general hospital for a physical illness.  Thus the pathway influences the accuracy of the diagnostic test.  The accuracy of the test will vary with the experience of the administrator, and the accuracy of the subsequent diagnosis will vary with the history of referrals to the particular healthcare setting. Diagnostic assessment pathways may vary in other countries and diagnoses may be made by a variety of specialists including neurologists and geriatricians. 

Role of index test(s)

The sampling of CSF and assay for levels of tau and ABeta could have a role when applied in specialist clinics. Due to the costs, risks and complexity of the testing, they will not be applied in a primary care setting. The role of this index test is as an add‐on biomarker test, which has been proposed in new research diagnostic criteria to complement clinical examination and cognitive tests. It is these clinical and cognitive tests which have defined the MCI population under study in this review.

Alternative test(s)

We are not including alternative tests in this review because there are currently no standard practice tests available for the diagnosis of dementia. 

The Cochrane Dementia and Cognitive Improvement Group (CDCIG) is in the process of conducting a series of diagnostic test accuracy reviews of the biomarkers and scales listed below. Although we are conducting reviews on individual tests compared to a reference standard, we plan to compare our results in an overview.

  • FDG‐PET (positron emission tomography with 18F‐fluorodeoxyglucose)

  • 11C‐PIB‐PET (positron emission tomography with Pittsburgh Compound‐B)

  • sMRI (structural magnetic resonance imaging)

  • Neuropsychological tests (MMSE; MiniCOG; MoCA)

  • Informant interviews (IQCODE; AD8)

  • APOE e4

  • FP‐CIT SPECT (fluoropropyl‐carbomethoxy‐iodophenyl‐tropane single‐photon emission computed tomography)

Rationale

The new diagnostic criteria for Alzheimer’s disease and MCI due to Alzheimer’s disease (Dubois 2010; Albert 2011; McKhann 2011; Sperling 2011) incorporate add‐on biomarkers based on imaging or CSF measures. These add‐on tests to core clinical criteria might increase the sensitivity or specificity of a testing strategy. It is crucial that each of these biomarkers is assessed for its diagnostic accuracy before it is adopted as a routine add‐on test in clinical practice. It is worth noting that in all these criteria, a single abnormality in any of the proposed tests is considered diagnostic of prodromal Alzheimer’s dementia.

CSF biomarkers might improve diagnoses and thereby treatments and patient outcomes.  At the present time there is no 'cure' for dementia, but there are some treatments which can slow cognitive and functional decline, or reduce the associated behavioural and psychiatric symptoms of dementia (Birks 2006; McShane 2006).  Furthermore, if Alzheimer’s disease can be diagnosed at an earlier, pre‐dementia stage, this could present interventions with a critical window for enhanced likelihood of effect as well as help people with dementia, their families and potential carers make timely plans for the future.  Coupled with appropriate contingency planning, proper recognition of the disease may also help to prevent inappropriate and potentially harmful admissions to hospital or institutional care (Bourne 2007).  In addition, the accurate early identification of dementia may improve opportunities for the use of newly evolving interventions designed to delay or prevent progression to more debilitating stages of dementia. 

Objectives

To determine the diagnostic accuracy of 1) CSF tau, 2) CSF p‐tau, 3) the CSF tau/ABeta ratio and 4) the CSF p‐tau/ABeta ratio index tests at various thresholds for detecting participants with mild cognitive impairment (MCI) at baseline who would clinically convert to Alzheimer’s disease or other forms of dementia at follow‐up.

Secondary objectives

To investigate the extent and sources of heterogeneity in the included studies of test accuracy.

We expect that heterogeneity will be likely and that it will be an important component of the review. The potential sources of heterogeneity, which will be used as a framework for the investigation of heterogeneity, include target population, index test, target disorder and study quality.

Methods

Criteria for considering studies for this review

Types of studies

We will consider longitudinal cohort studies in which index test results are obtained at baseline and the reference standard results at follow‐up (see Index tests; Reference standards). These studies necessarily employ delayed verification of conversion to dementia and are sometimes labelled as ‘delayed verification cross‐sectional studies’ (Bossuyt 2008; Knottnerus 2002). 

We will include case control studies if they incorporate a delayed verification design.  We believe this can only occur in the context of a cohort study, so these studies are invariably diagnostic nested case‐control studies.

Participants

Participants recruited and clinically classified as having mild cognitive impairment (MCI) at baseline will be eligible for inclusion in this review. The diagnosis of MCI will be established using the Petersen criteria or revised Petersen criteria (Petersen 1999; Petersen 2004; Winbald 2004) and/or the Matthews criteria (Matthews 2008) and/or CDR = 0.5 (Morris 1993). These criteria include: subjective complaints; a decline in memory objectively verified by neuropsychological testing in combination with a history from the patient; a decline in other cognitive domains; no or minimal impairment of activities of daily living; not meeting the criteria for dementia. Therefore, the eligible participants will have had a number of tests, e.g. neuropsychological tests for cognitive deficit and checklists for activities of daily living, before study entry. Participants will be defined as amnestic single domain, amnestic multiple domain, non‐amnestic single domain, non‐amnestic multiple domain or non‐specified MCI participants.

We will include participants from secondary and tertiary settings. Although demographic and clinical characteristics of MCI as well as sources of recruitment might differ in those settings, we have decided not to limit our review by setting; instead, we will look for variation within and between settings, and will examine the potential influence of the setting on diagnostic performance of the index test in the analyses.

We will exclude those studies that include patients with MCI possibly caused by: i) current or past alcohol or drug abuse; ii) central nervous system (CNS) trauma (e.g. subdural haematoma), tumour or infection; iii) other neurological conditions, e.g. Parkinson’s or Huntington’s diseases.

Detail of the causes of study drop‐outs is crucial and if such data are missing the reliability of the conclusions must be questioned.

Index tests

  1. CSF tau

  2. CSF p‐tau

  3. CSF tau/ABeta ratio

  4. CSF p‐tau/ABeta ratio

There are currently no generally accepted standards for CSF tau and CSF p‐tau positivity thresholds, and it is therefore not possible to pre‐specify a test positivity threshold.

Criteria for CSF tau and CSF p‐tau positivity: we will use the criteria which were applied in each included primary study to classify participants as either test positive or test negative.

Measure of index test: tau and p‐tau level in CSF.

We will not include a comparator test because there are currently no standard practice tests available for the diagnosis of dementia. We will compare the index tests with a reference standard.

Target conditions

There are two target conditions in this review:

  1. Alzheimer’s disease dementia (conversion from MCI to Alzheimer’s disease dementia)

  2. Any other forms of dementia (conversion from MCI to any other forms of dementia)

Reference standards

For the purpose of this review, several definitions of Alzheimer’s disease dementia are acceptable.  Included studies may apply probable or possible NINCDS‐ADRDA criteria (McKhann 1984). The Diagnostic and Statistical Manual of Mental Disorders (DSM) and International Classification of Diseases (ICD) definitions for Alzheimer’s disease dementia will also be acceptable. It should be noted that different iterations of these standards may not be directly comparable over time (e.g. DSM‐IIIR versus DSM‐IV or ICD9 versus ICD10). Moreover, the validity of the diagnoses may vary with the degree or manner in which the criteria have been operationalised (e.g. individual clinician versus algorithm versus consensus determination).  We will consider all these issues in interpreting the results, using sensitivity analyses as appropriate.

Similarly, differing clinical definitions of other dementias are acceptable. For Lewy body dementia the reference standard is the McKeith criteria (McKeith 1996; McKeith 2005). For frontotemporal dementia the reference standard is the Lund criteria (LMG 1994), Neary 1998, Boxer 2005, DSM and ICD. For vascular dementia the reference standard is the NINDS‐AIREN criteria (Roman 1993), DSM and ICD.

The time interval over which progression from MCI to Alzheimer’s disease dementia or other forms of dementia happened is also important. The minimum period of delay in the verification of the diagnosis (i.e. the time between the assessment at which a diagnosis of MCI is made and the assessment at which the diagnosis of dementia is made) is one year. Where a mean duration is specified, we will exclude the study if the mean minus one standard deviation is less than one year, which will ensure that no more than 16% of participants were followed up for less than one year if the follow‐up period is normally distributed. If our assumptions regarding distribution are not met then we can develop new methods for standardising the follow‐up period using, for example, quartiles.
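
The follow‐up rule above can be illustrated with a minimal sketch. It assumes that each study reports a mean and standard deviation of follow‐up in years; the function names are hypothetical and are not part of the protocol. Under a normal distribution, the share of participants more than one standard deviation below the mean is about 16%, which is the figure quoted above.

```python
from statistics import NormalDist

# Minimum delay in verification stated in the protocol (years).
MIN_FOLLOW_UP_YEARS = 1.0


def meets_follow_up_rule(mean_years: float, sd_years: float) -> bool:
    """Return True when mean minus one SD is at least the minimum follow-up."""
    return mean_years - sd_years >= MIN_FOLLOW_UP_YEARS


def share_below_minimum(mean_years: float, sd_years: float) -> float:
    """Proportion followed for less than the minimum, ASSUMING follow-up is
    normally distributed (the protocol notes this may not hold)."""
    return NormalDist(mu=mean_years, sigma=sd_years).cdf(MIN_FOLLOW_UP_YEARS)


if __name__ == "__main__":
    # Illustrative values only, not data from any study.
    print(meets_follow_up_rule(2.0, 1.0))
    print(round(share_below_minimum(2.0, 1.0), 3))
```

A study sitting exactly at the boundary (mean minus one standard deviation equal to one year) is retained here, since the protocol excludes only studies where that value is less than one year.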

If possible, we will segment analyses into separate follow‐up mean periods for the delay in verification: one year to less than two years; two to less than four years; and more than four years.  In this eventuality we will clearly note where the same included studies contribute to the analysis for more than one reference standard.

Search methods for identification of studies

We will search a variety of information sources to ensure all relevant studies are included. The Trials Search Coordinator of the Cochrane Dementia and Cognitive Improvement Review Group will devise search strategies for electronic database searching.

Electronic searches

We will search:

  • MEDLINE (OvidSP),

  • EMBASE (OvidSP),

  • Science Citation Index (ISI Web of Knowledge),

  • PsycINFO (OvidSP), and

  • LILACS (Bireme).

See Appendix 1 for a proposed draft strategy to be run in MEDLINE (OvidSP). We will design similarly structured search strategies using search terms and syntax appropriate for each database listed above. We will request a search of the Cochrane Register of Diagnostic Test Accuracy Studies (Cochrane Renal Group).

We will also search for relevant systematic reviews and meta‐analyses.

We will make no restriction based on language of study. We will use translation services as necessary. We will not use search filters (collections of terms aimed at reducing the number needed to screen) as an overall limiter because those published have not proved sensitive enough (Whiting 2011a).

Initial searches will be performed by a single researcher with extensive experience of conducting systematic reviews.

Searching other resources

Grey literature: chosen electronic databases will include assessments of conference proceedings.

Handsearching: we will not perform handsearching. At present there is little published evidence of the benefits of handsearching for reports of DTA studies (Glanville 2010).

Reference lists: we will scan the reference lists of all eligible studies and reviews in the field for further possible titles, and repeat the process until no new titles are found (Greenhalgh 2005).

Correspondence: we will contact research groups who have published or are conducting work on diagnostic tests for dementia. Groups to contact will be informed by the initial results of our literature search.

Data collection and analysis

Selection of studies

Two researchers (EL and AN‐S) will screen all titles and abstracts generated by the electronic database searches for relevance. 

Two researchers (EL and AN‐S) will independently review the remaining abstracts of selected titles and select all potentially‐eligible studies for full text review.  Two researchers (NS and EL) will then independently further assess full manuscripts against the inclusion criteria (see Criteria for considering studies for this review).  Where necessary, a third arbitrator (CWR) will resolve disagreements that the two researchers cannot resolve through discussion.

Where a study may include useable data but these are not presented in the published manuscript, we will contact the authors directly to request further information.  If the same data set is presented in more than one paper we will include only the primary paper.

We will detail the numbers of studies selected at each point in a PRISMA flow diagram.

Data extraction and management

We will extract the data on study characteristics into the Excel‐based template developed by the Diagnostic Test Accuracy Unit in Birmingham (released in July 2012), tailored to the needs of this review. The template includes:

Bibliographic details of primary paper:

  • Author, title of study, year and journal

Basic clinical and demographic details:

  • Number of subjects

  • Mild cognitive impairment (MCI) clinical criteria

  • Age

  • Gender

  • Setting

  • Participant recruitment

  • Sampling procedures

Details of the index test:

  • Method of the CSF tau and CSF tau/ABeta ratio test administration, including who administered the test

  • Thresholds used to define positive and negative tests results

  • Collection handling

Details of the reference standard:

  • Definition of Alzheimer's disease dementia and other dementias used by the reference standard

  • Duration of follow‐up from the time of index test application to diagnosis of Alzheimer’s disease dementia or other dementias by the reference standard: 1 year to < 2 years; 2 to < 4 years; and > 4 years. If participants have been followed for varied amounts of time we will record a mean follow‐up period for each included study. If possible, we will group those data into minimum, maximum and median follow‐up periods; these may then become the subject of subgroup analyses.

  • Prevalence or proportion of the population developing Alzheimer's disease dementia and other dementias, with severity, if described.

The results of the 2x2 tables cross‐tabulating index test results with the results of the reference standard/s

Table 1: Conversion from MCI to Alzheimer’s disease dementia

Index test information | Reference standard: ADD present | Reference standard: ADD absent
Index test positive | Index test + who convert to ADD (TP) | Index test + who remain MCI (FP) & Index test + who convert to non‐ADD (FP)
Index test negative | Index test ‐ who convert to ADD (FN) | Index test ‐ who remain MCI (TN) & Index test ‐ who convert to non‐ADD (TN)

Table 2: Conversion from MCI to non‐Alzheimer’s disease dementia

Index test information | Reference standard: non‐ADD present | Reference standard: non‐ADD absent
Index test positive | Index test + who convert to non‐ADD (TP) | Index test + who remain MCI (FP) & Index test + who convert to ADD (FP)
Index test negative | Index test ‐ who convert to non‐ADD (FN) | Index test ‐ who remain MCI (TN) & Index test ‐ who convert to ADD (TN)

Table 3: Conversion from MCI to any form of dementia

Index test information | Reference standard: any form of dementia present | Reference standard: dementia absent
Index test positive | Index test + who convert to any form of dementia (TP) | Index test + who remain MCI (FP)
Index test negative | Index test ‐ who convert to any form of dementia (FN) | Index test ‐ who remain MCI (TN)
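
The three cross‐tabulations above share the same underlying counts, so they can be derived from one record per study. The sketch below is illustrative only: the `TwoByTwo` class, the `build_tables` function and the outcome keys ("add", "non_add", "mci") are hypothetical names, and the counts in the usage example are invented.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    tp: int
    fp: int
    fn: int
    tn: int


def build_tables(pos: dict, neg: dict) -> dict:
    """Derive the three 2x2 tables from counts by follow-up outcome.

    `pos` and `neg` map an outcome to the number of index-test-positive and
    index-test-negative participants respectively. Outcome keys (assumed
    naming): "add" (converted to ADD), "non_add" (converted to another
    dementia), "mci" (remained MCI).
    """
    return {
        # Table 1: target condition is ADD; non-ADD converters count as absent.
        "add": TwoByTwo(tp=pos["add"], fp=pos["mci"] + pos["non_add"],
                        fn=neg["add"], tn=neg["mci"] + neg["non_add"]),
        # Table 2: target condition is non-ADD; ADD converters count as absent.
        "non_add": TwoByTwo(tp=pos["non_add"], fp=pos["mci"] + pos["add"],
                            fn=neg["non_add"], tn=neg["mci"] + neg["add"]),
        # Table 3: target condition is any dementia.
        "any": TwoByTwo(tp=pos["add"] + pos["non_add"], fp=pos["mci"],
                        fn=neg["add"] + neg["non_add"], tn=neg["mci"]),
    }


if __name__ == "__main__":
    # Invented counts for illustration only.
    tables = build_tables(pos={"add": 30, "non_add": 5, "mci": 15},
                          neg={"add": 10, "non_add": 8, "mci": 32})
    for name, table in tables.items():
        print(name, table)
```

Keeping the outcome counts separate until this step makes it explicit that the same participants contribute to all three tables, with only the definition of the target condition changing.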

The numbers lost to follow‐up

Assessment of methodological quality

We will assess methodological quality of each study using the QUADAS‐2 tool (Whiting 2011b) as recommended by the Cochrane Collaboration.  The tool is made up of four domains: Patient selection; Index test; Reference standard; Patient flow.  Each domain is assessed in terms of risk of bias, with the first three domains also considered in terms of applicability (Appendix 2).  The components of each of these domains and a rubric which details how judgments concerning risk of bias are made are detailed in Appendix 3.  Certain key areas important to quality assessment are participant selection, blinding and missing data.

We will pilot a QUADAS‐2 assessment on two papers. If agreement is poor, we will refine the signalling questions. We will not use QUADAS‐2 data to form a summary quality score. We will produce a narrative summary describing numbers of studies that we considered contained high/low/unclear risk of bias as well as concerns regarding applicability.

Statistical analysis and data synthesis

We will apply the DTA framework for the analysis of a single test and extract the data from a study into a 2x2 table, showing the binary test results cross‐classified with the binary reference standard and ignoring any censoring that might have occurred.  We acknowledge that such a reduction in the data may represent a significant oversimplification. We will therefore adopt an intention‐to‐diagnose (ITD) approach as well. If possible, we will present what the result would be if all dropouts would have developed dementia, and if all dropouts would not have developed dementia.  We may also need to assume that the proportion of positive and negative test results is the same in the unknown as the known participants in order to do this. We will examine the effects of imputation and data from censored participants in a sensitivity analysis, and censoring in the discussion of results.
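
The two extreme scenarios described above can be sketched as follows. This is an illustration under stated assumptions, not the review's analysis code: the function names are hypothetical, the counts are invented, and rounding the split of dropouts with unknown index test results to whole participants is a simplification introduced here.

```python
def impute_dropouts(tp, fp, fn, tn, dropouts_pos, dropouts_neg, all_convert):
    """Return (tp, fp, fn, tn) after assigning dropouts an assumed outcome.

    all_convert=True  : every dropout is assumed to have developed dementia.
    all_convert=False : every dropout is assumed not to have developed dementia.
    """
    if all_convert:
        return tp + dropouts_pos, fp, fn + dropouts_neg, tn
    return tp, fp + dropouts_pos, fn, tn + dropouts_neg


def split_dropouts(tp, fp, fn, tn, n_dropouts):
    """Split dropouts with unknown index test results into (positive, negative),
    assuming the same proportion of positives as in participants with known
    results (the assumption mentioned in the protocol)."""
    share_pos = (tp + fp) / (tp + fp + fn + tn)
    pos = round(n_dropouts * share_pos)  # simplification: whole participants
    return pos, n_dropouts - pos


if __name__ == "__main__":
    # Invented counts for illustration only.
    observed = (30, 20, 10, 40)
    pos, neg = split_dropouts(*observed, n_dropouts=10)
    print(impute_dropouts(*observed, pos, neg, all_convert=True))
    print(impute_dropouts(*observed, pos, neg, all_convert=False))
```

Comparing the accuracy estimates from the two returned tables with the complete‐case estimate gives a simple bound on how much the dropouts could change the result.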

We will use data from the 2x2 tables abstracted from the included studies (TP, FN, FP, TN) and entered into RevMan to calculate the sensitivities, specificities and their 95% confidence intervals. We will also present individual study results graphically by plotting estimates of sensitivities and specificities in both a forest plot and a receiver operating characteristic (ROC) space. If more than one threshold is published in primary studies we will report accuracy estimates for all thresholds.
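
As an illustration of the per‐study calculation, the sketch below computes sensitivity and specificity with 95% confidence intervals from one 2x2 table. The Wilson score interval is an assumption made for this example; the protocol does not state which interval method is used, and RevMan's own output should be treated as authoritative. Function names and counts are hypothetical.

```python
from math import sqrt


def wilson_interval(successes: int, total: int, z: float = 1.959964):
    """Wilson score interval for a proportion (z = 1.96 gives roughly 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z ** 2 / total
    centre = (p + z ** 2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z ** 2 / (4 * total ** 2)) / denom
    return centre - half, centre + half


def accuracy(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Sensitivity and specificity, each as (estimate, (lower, upper))."""
    return {
        "sensitivity": (tp / (tp + fn), wilson_interval(tp, tp + fn)),
        "specificity": (tn / (tn + fp), wilson_interval(tn, tn + fp)),
    }


if __name__ == "__main__":
    # Invented counts for illustration only.
    for name, (estimate, (low, high)) in accuracy(30, 20, 10, 40).items():
        print(f"{name}: {estimate:.2f} ({low:.2f} to {high:.2f})")
```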

If there are sufficient and adequate data we will meta‐analyse the pairs of sensitivity and specificity. We will use the hierarchical summary ROC curve (HSROC) method proposed by Rutter and Gatsonis (Rutter 2001). We will conduct these analyses in SAS software. Particularly if there are common thresholds across included studies we might also consider the bivariate random effects approach (Macaskill 2010).

If studies report multiple thresholds we will include the most frequently used cut‐off, across all included studies, in meta‐analysis. We recognise the limitation of this data‐driven approach (Leeflang 2006) but there are no standard thresholds used in practice. We will acknowledge and consider this further in the ‘Discussion’ section of our review.
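
Selecting the most frequently used cut‐off could be done along the following lines. This is a sketch under assumptions: the study labels and threshold values are invented, thresholds are assumed to be reported in the same units, and the protocol does not say how ties would be resolved (here the first threshold encountered wins).

```python
from collections import Counter


def most_frequent_threshold(thresholds_by_study: dict) -> float:
    """Return the threshold reported by the largest number of studies.

    Each study counts a given threshold once, however often it reports it.
    """
    counts = Counter(
        threshold
        for thresholds in thresholds_by_study.values()
        for threshold in set(thresholds)
    )
    return counts.most_common(1)[0][0]


if __name__ == "__main__":
    # Invented study labels and values for illustration only.
    reported = {"Study A": [350, 400], "Study B": [400], "Study C": [400, 450]}
    print(most_frequent_threshold(reported))
```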

We will explore the implications of any credible summary accuracy estimates emerging by considering the numbers of false positives and false negatives in populations with different prevalence of dementia subtypes, and by presenting the results as natural frequencies and using alternative metrics such as likelihood ratios and predictive values.
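
The step from summary sensitivity and specificity to natural frequencies, predictive values and likelihood ratios is simple arithmetic, sketched below. The function name, the cohort size of 1000 and the example inputs are assumptions chosen for illustration and are not estimates from any study.

```python
def natural_frequencies(sens: float, spec: float, prevalence: float,
                        cohort: int = 1000) -> dict:
    """Expected counts and derived metrics in a hypothetical cohort."""
    with_condition = cohort * prevalence
    without_condition = cohort - with_condition
    tp = sens * with_condition
    fn = with_condition - tp
    tn = spec * without_condition
    fp = without_condition - tn
    return {
        "tp": tp, "fp": fp, "fn": fn, "tn": tn,
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        # A perfect specificity or zero specificity makes a ratio undefined;
        # infinity is returned as a convention chosen for this sketch.
        "lr_pos": sens / (1 - spec) if spec < 1 else float("inf"),
        "lr_neg": (1 - sens) / spec if spec > 0 else float("inf"),
    }


if __name__ == "__main__":
    # Illustrative inputs only: the same accuracy at two prevalences.
    for prevalence in (0.10, 0.30):
        result = natural_frequencies(0.80, 0.90, prevalence)
        print(prevalence, {k: round(v, 2) for k, v in result.items()})
```

Running the same sensitivity and specificity through several prevalences shows how the number of false positives and the predictive values shift with the setting, while the likelihood ratios stay fixed.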

Investigations of heterogeneity

The following factors could be relevant in clinical practice as they relate to the interpretation of the test result. It is crucial to understand the potential sources of heterogeneity that can be referenced within the clinical setting. These include patient factors such as age, illness severity and genetic risk, as well as how the clinical population to which the test is applied has been defined, given that slightly different clinical criteria exist and may be used. They may also include differing assay methods for CSF tau and ABeta. All these factors may influence the accuracy of the test as it is applied in practice.

The framework for the investigation of possible sources of heterogeneity includes the following factors:


Target population

  • Spectrum of patients (mean age, gender, Mini‐Mental State Examination, ApoE status). Concerning age, any studies that include 30% of participants below the age of 65 will be examined separately.

  • Clinical criteria of MCI at baseline: e.g. Petersen criteria versus CDR = 0.5 versus different MCI classification (Matthews 2008)

  • Clinical settings: e.g. secondary care versus tertiary care

Index test

  • Thresholds

  • Technical features: e.g. ELISA vs Innogenetics kit

  • Operator characteristics: e.g. training of assessors

Target disorder

  • Reference standard/s used: e.g. NINCDS‐ADRDA versus DSM versus ICD10 for Alzheimer's disease dementia

  • Operationalisation of criteria used for the definition of a dementia syndrome: e.g. individual clinician/algorithm/consensus group.

Study quality

  • Types of studies: longitudinal cohort studies or diagnostic nested case‐control studies

  • Blinding: knowledge of prior clinical information may inflate the estimated accuracy of the index test.

  • Duration of follow‐up: 1 year to < 2 years versus 2 years to < 4 years versus > 4 years

  • Loss to follow‐up: we will consider separately those studies that have more than 20% attrition
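
The numeric cut‐offs in the framework above can be expressed as simple study‐level flags. The sketch below is an interpretation, not the protocol's definition: it assumes "30% of participants below the age of 65" means at least 30%, and it places a mean follow‐up of exactly four years in the top band, which the protocol leaves unspecified. All names and counts are hypothetical.

```python
def follow_up_band(mean_years: float) -> str:
    """Assign a study to a follow-up band by its mean follow-up in years."""
    if mean_years < 1:
        return "below minimum"
    if mean_years < 2:
        return "1 to <2 years"
    if mean_years < 4:
        return "2 to <4 years"
    # Assumption: the protocol says "> 4 years"; exactly 4 is placed here.
    return "4 years or more"


def subgroup_flags(n_enrolled: int, n_lost: int, n_under_65: int,
                   mean_follow_up_years: float) -> dict:
    """Flags used to group studies when exploring heterogeneity."""
    return {
        # More than 20% attrition: considered separately.
        "high_attrition": n_lost / n_enrolled > 0.20,
        # Assumption: "30% below age 65" is read as at least 30%.
        "younger_cohort": n_under_65 / n_enrolled >= 0.30,
        "follow_up_band": follow_up_band(mean_follow_up_years),
    }


if __name__ == "__main__":
    # Invented counts for illustration only.
    print(subgroup_flags(n_enrolled=100, n_lost=25, n_under_65=10,
                         mean_follow_up_years=3.0))
```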

To investigate the effects of the sources of heterogeneity, we will perform a descriptive analysis by visual examination of the forest plot of sensitivity and specificity and the ROC plot. If there are sufficient included studies, subgroup analyses will be performed in the RevMan software.

Sensitivity analyses

If not already explored as part of the investigation of heterogeneity above, we will perform sensitivity analyses for the above covariates, if appropriate. For example in order to investigate the influence of study quality on overall diagnostic accuracy of the CSF biomarkers, we will omit studies at high risk of bias (see Appendix 2).

We will also perform a sensitivity analysis with and without the intention‐to‐diagnose (ITD) approach. Where data are available we will undertake an analysis of reported intermediate points in the development of dementia.

In addition, we will evaluate the effects of studies with data‐driven threshold selection on the overall diagnostic accuracy of CSF tau and CSF tau/ABeta ratio tests.

Assessment of reporting bias

We will not investigate reporting bias because of current uncertainty about how it operates in test accuracy studies and the interpretation of existing analytical tools such as funnel plots.  We may investigate the effect of the presence of potential conflicts of interest as part of any investigation of heterogeneity.