Validity of a novel computerized cognitive battery for mild cognitive impairment

Background: The NeuroTrax Mindstreams computerized cognitive assessment system was designed for widespread clinical and research use in detecting mild cognitive impairment (MCI). However, the capability of Mindstreams tests to discriminate elderly with MCI from those who are cognitively healthy has yet to be evaluated. Moreover, the comparability between these tests and traditional neuropsychological tests in detecting MCI has not been examined.
Methods: A 2-center study was designed to assess discriminant validity of tests in the Mindstreams Mild Impairment Battery. Participants were 30 individuals diagnosed with MCI, 29 with mild Alzheimer's disease (AD), and 39 healthy elderly. Testing was with the Mindstreams battery and traditional neuropsychological tests. Receiver operating characteristic (ROC) analysis was used to examine the ability of Mindstreams and traditional measures to discriminate those with MCI from cognitively healthy elderly. Between-group comparisons were made (Mann-Whitney U test) between MCI and healthy elderly and between MCI and mild AD groups.
Results: Mindstreams outcome parameters across multiple cognitive domains significantly discriminated among MCI and healthy elderly with considerable effect sizes (p < 0.05). Measures of memory, executive function, visual spatial skills, and verbal fluency discriminated best, and discriminability was at least comparable to that of traditional neuropsychological tests in these domains.
Conclusions: Mindstreams tests are effective in detecting MCI, providing a comprehensive profile of cognitive function. Further, the enhanced precision and ease of use of these computerized tests make the NeuroTrax system a valuable clinical tool in the identification of elderly at high risk for dementia.


Background
Mild cognitive impairment (MCI) is the term applied to a condition in which elderly individuals who have a subjective cognitive complaint have objective memory impairment in the absence of functional disability [1][2][3]. Its importance arises from the observation that it often constitutes the clinical state between normal cognition and dementia in the elderly [4]. Approximately 12-15% of MCI subjects per year convert to clinical dementia with functional disability [4,5]. For this reason, much interest has centered on the development of standardized techniques for quantification of cognitive deficits in MCI and potential therapeutic interventions for treatment of these high-risk individuals [6].
While memory impairment is the hallmark of MCI [3], multiple cognitive domains are compromised in the majority of MCI individuals [7][8][9], as in those with mild Alzheimer's disease (AD). Therefore, sufficiently broad and sensitive instruments are key in the effective diagnosis of these two groups. While traditional neuropsychological tests have been shown to discriminate among individuals with MCI and cognitively healthy elderly [10][11][12], there is no standard, comprehensive neuropsychological battery that is suitable for the screening and follow-up of mild impairments in routine clinical care. Moreover, while paper-based neuropsychological batteries have been applied in research settings, they are generally impractical for widespread clinical use due to high cost and extended administration time.
Computerized cognitive testing has the potential to address the limitations posed by traditional paper-based measures. Technical innovations for accurate measurement of reaction time as well as frequency of errors enhance overall sensitivity, and on-line adjustment of level of difficulty may minimize ceiling or floor effects. Standardized batteries with alternate forms allow for accurate follow-up of patients over time. A computerized testing session can be of shorter duration and is less expensive than a paper-based session. Further, computerized testing can be made widely available via the Internet and easily administered so that high quality testing is on hand to supplement clinical evaluation in routine patient care settings. Indeed, computerized tests have been developed to assist researchers and psychologists in screening for cognitive dysfunction [13,14]. We chose to evaluate Mindstreams™ (NeuroTrax Corp, NY), a new commercially available computerized testing system for comprehensive clinical assessment of cognitive impairment, designed primarily for use in the elderly. Specifically, we examine the ability of Mindstreams tests to discriminate individuals with MCI from cognitively healthy elderly. The present study is the first to assess discriminant validity of the computerized tests as compared with that of traditional neuropsychological tests.

Participants
Participants were 98 elderly individuals assessed at two tertiary care memory clinics (Bloomfield Centre for Research in Aging, McGill-Jewish General Hospital, Montreal, Canada; Memory Disorders Clinic, Shaare Zedek Medical Center, Jerusalem, Israel). Diagnoses were made by consensus of evaluation teams led by dementia experts at each of the sites; participants were classified as having mild cognitive impairment (MCI), as having mild Alzheimer's disease (AD), or as cognitively healthy. Diagnosis of MCI followed Petersen et al. [15] and included the following features: (1) a complaint of defective memory; (2) normal activities of daily living; (3) a memory deficit documented on mental status evaluation and supported by abnormalities on neuropsychological testing; and (4) absence of dementia. These criteria define the subtype of MCI known as 'MCI-amnestic' [3]. Diagnosis of mild AD was according to the Diagnostic and Statistical Manual, 4th ed. (DSM IV). Healthy elderly had no cognitive complaints and were volunteers for research testing. Each diagnostic group was taken to be representative of a distinct population defined by the criteria outlined above. Ethics Committee approval in compliance with the Declaration of Helsinki was obtained at both testing sites, and informed consent was obtained from all participants.

Mindstreams Computerized Cognitive Testing
A detailed treatment of the NeuroTrax system, including the computerized tests, data processing, and usability considerations, appears in a supplementary document (Additional File 1). In brief, Mindstreams consists of custom software that resides on the local testing computer and serves as a platform for interactive cognitive tests that produce precise accuracy and reaction time (millisecond timescale) data. Tests are adaptive, in that the level of difficulty is adjusted according to performance. This feature increases sensitivity and minimizes the prevalence of ceiling effects. Feedback is provided in the practice sessions that precede each test, but not during the actual tests. Web-based administrative features allow for secure entry and storage of patient demographic data. Once tests are run on the local computer, data are automatically uploaded to a central server, where calculation of outcome parameters from raw single-trial data and report generation occur.
The Mindstreams Mild Impairment Battery (administration time: 45 minutes) samples a wide range of cognitive domains, including memory (verbal and non-verbal), executive function, visual spatial skills, verbal fluency, attention, information processing, and motor skills (see Table 2). The tests that comprise this battery were designed for use with the elderly. All responses were made with the mouse or with the number pad on the keyboard (intuitively similar to the telephone keypad). Participants were familiarized with these input devices at the beginning of the battery, and practice sessions prior to the individual tests prepared them for the specific types of responses required for each test. Outcome parameters varied with each test, as in Table 2. Given the speed-accuracy tradeoff (e.g., [18]), a performance index, defined as [accuracy/reaction time]*100, was computed for timed Mindstreams tests in an attempt to capture performance both in terms of accuracy and reaction time (RT). Tests were run in the same fixed order for all participants.
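The performance index described above can be illustrated with a minimal sketch. The function name, the units (accuracy as percent correct, reaction time in milliseconds), and the example values are assumptions made for illustration; they are not taken from the NeuroTrax software.

```python
def performance_index(accuracy_pct: float, reaction_time_ms: float) -> float:
    """Return (accuracy / reaction time) * 100, the index defined in the text.

    Assumed units: accuracy in percent correct, reaction time in milliseconds.
    """
    if reaction_time_ms <= 0:
        raise ValueError("reaction time must be positive")
    return 100.0 * accuracy_pct / reaction_time_ms


# Hypothetical participants: same accuracy, different speed.
fast = performance_index(95.0, 500.0)
slow = performance_index(95.0, 1000.0)

# Hypothetical participant who is fast but inaccurate.
fast_inaccurate = performance_index(60.0, 500.0)

print(fast, slow, fast_inaccurate)
```

Because the index rises with accuracy and falls with reaction time, a participant who trades accuracy for speed (or the reverse) is not automatically credited with better performance.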
Following are brief descriptions of the Mindstreams tests included in the Mild Impairment Battery:

Verbal Memory
Ten pairs of words are presented, followed by a recognition test in which one member (the target) of a previously presented pair appears together with a list of four candidates for the other member of the pair. Participants must indicate which word of the four alternatives was paired with the target when presented previously. Four consecutive repetitions of the recognition test are administered during the 'learning' phase. An additional recognition test is administered following a delay of approximately 10 minutes.

Non-Verbal Memory
Eight pictures of simple geometric objects are presented, followed by a recognition test in which four versions of each object are presented, each oriented in a different direction. Participants are required to remember the orientations of the originally presented objects. Four consecutive repetitions of the recognition test are administered during the 'learning' phase.

Go-NoGo test
A series of large colored stimuli are presented at pseudorandom intervals. Participants are instructed to respond as quickly as possible by pressing a mouse button if the color of the stimulus is any color except red, for which no response is to be made.

Problem Solving
Pictorial puzzles of gradually increasing difficulty are presented. Each puzzle consists of a 2 × 2 array containing three black-and-white geometric forms with a certain spatial relationship among them and a missing form. Participants must choose the best fit for the fourth (missing) form from among six possible alternatives.

Mindstreams Stroop test
The Stroop is a well-established test of response inhibition [19]. The Mindstreams Stroop test consists of three phases. Participants are presented with a pair of large colored squares, one on the left and the other on the right side of the screen. In each phase, participants are instructed to choose as quickly as possible which of the two squares is a particular color by pressing either the left or right mouse button, depending upon which of the two squares is the correct color. First, participants are presented with a general word in colored letters. In the next phase (termed the Choice Reaction Time test), participants are presented with a word that names a color in white letters. In the final phase (the Stroop phase), participants are presented with a word that names a color, but the letters of the word are in a color other than that named by the word. The instructions for the final phase are to choose the color of the letters, and not the color named by the word.

Verbal Function
Pictures of common objects of low and high familiarity are presented. Participants are instructed to select the name of the picture from four choices. In a related test, participants are instructed to select the word that best rhymes with the name of the picture.

Visual Spatial Imagery
Computer-generated scenes containing a red pillar are presented. Participants are instructed to imagine viewing the scene from the vantage point of the red pillar. Four alternative views of the scene are presented as choices.

Staged Information Processing test
This test comprises three levels of information processing load: single digits, two-digit arithmetic problems (e.g., 5-1), and three-digit arithmetic problems (e.g., 3+2-1). For each of the three levels, stimuli are presented at three different fixed rates, incrementally increasing as testing continues. Participants are instructed to respond as quickly as possible by pressing the left mouse button if the digit or result is less than or equal to 4 and the right mouse button if it is greater than 4.
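The response rule for this test can be sketched as follows. The function name and the example stimuli (beyond the two arithmetic examples quoted above) are illustrative assumptions, not part of the Mindstreams software.

```python
def expected_button(result: int) -> str:
    """Correct response under the stated rule: left if result <= 4, else right."""
    return "left" if result <= 4 else "right"


# One illustrative stimulus per load level, stored with its evaluated result.
stimuli = {
    "3": 3,              # low load: single digit
    "5-1": 5 - 1,        # medium load: two-digit arithmetic problem
    "3+2-1": 3 + 2 - 1,  # high load: three-digit arithmetic problem
    "9-2": 9 - 2,        # hypothetical medium-load item with a result above 4
}

for text, result in stimuli.items():
    print(f"{text!r} -> {expected_button(result)}")
```

The decision rule is identical at every load level; only the amount of processing needed to reach the result changes, which is what allows load to be varied independently of the motor response.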

Finger Tapping
Participants are instructed to tap on the mouse button for 12 seconds with their dominant hand. This task is repeated twice.

Catch Game
The Catch game is a novel motor screen that assesses cognitive domains distinct from those in other Mindstreams tests. Participants must "catch" a rectangular white object falling vertically from the top of the screen before it reaches the bottom of the screen. Mouse button presses move a rectangular green "paddle" horizontally so that it can be positioned directly in the path of the falling object.
The test requires hand-eye coordination, scanning and rapid responses.

Data Analysis
Mindstreams data were uploaded to the NeuroTrax central server, where automatic data processing occurred, during which aggregate outcome parameters were computed from the raw single-trial data (Additional File 1). Outcome parameters were calculated using custom software that was blind to diagnosis or testing site, and results were relayed to each of the sites for review and analysis. Outcome parameters were computed for each test only when performance on the preceding practice session exceeded a predetermined minimum accuracy. The actual test was not given when practice session performance was below this cutpoint. Given this source of 'missing' data, a minimum of 13 data points was deemed acceptable for inclusion of a group in statistical analyses.
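The inclusion rule described above can be sketched as a simple filter. The threshold of 13 data points comes from the text; the data layout (one list of scores per group, with `None` marking a test that was not given after a failed practice session), the function names, and the toy values are assumptions for illustration.

```python
from typing import Dict, List, Optional

MIN_DATA_POINTS = 13  # minimum per group, as stated in the text


def valid_scores(scores: List[Optional[float]]) -> List[float]:
    """Drop entries marked missing (None), e.g. after a failed practice session."""
    return [s for s in scores if s is not None]


def groups_for_analysis(
    by_group: Dict[str, List[Optional[float]]],
    minimum: int = MIN_DATA_POINTS,
) -> Dict[str, List[float]]:
    """Keep only groups that retain at least `minimum` valid data points."""
    cleaned = {name: valid_scores(scores) for name, scores in by_group.items()}
    return {name: scores for name, scores in cleaned.items() if len(scores) >= minimum}


# Toy data (not study data): the third group has too few valid scores.
toy = {
    "healthy": [80.0] * 20,
    "MCI": [70.0] * 13 + [None] * 5,
    "mild AD": [55.0] * 8 + [None] * 12,
}
included = groups_for_analysis(toy)
print(sorted(included))
```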
All statistics were computed with SPSS statistical software (SPSS, Chicago, IL). Two-tailed statistics were used throughout, and p < 0.05 was considered significant. Receiver operating characteristic (ROC) analysis was used to evaluate the ability of Mindstreams outcome parameters and traditional neuropsychological tests to discriminate participants with MCI from cognitively healthy elderly. Area under the curve (AUC), an index of effect size, was the primary result of the ROC analysis. For each measure, the AUC indicated the probability that a randomly selected individual with MCI would perform more poorly than a randomly selected cognitively healthy individual. An AUC of 0.50 indicated no better than chance discriminability, and an AUC of 1.00 indicated perfect discriminability. If the 95% confidence interval around an AUC included 0.50, the measure was unable to discriminate among MCI and healthy elderly at a significance level of p < 0.05. Separate between-group comparisons were made on Mindstreams outcome parameters between MCI and cognitively healthy and between MCI and mild AD. Given heterogeneous variances across these pairs of groups for numerous outcome parameters (Brown-Forsythe test, p < 0.05), the non-parametric Mann-Whitney U was used to make the comparisons.
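The interpretation of the AUC given above can be made concrete with a small sketch. It computes the AUC directly as the proportion of MCI/healthy pairs in which the MCI score is lower (ties counted as one half), which equals the Mann-Whitney U statistic divided by the number of pairs. The confidence interval uses the Hanley-McNeil standard error as an assumed approximation; it is not necessarily the method SPSS applied in this study. All function names and scores are hypothetical.

```python
import math
from typing import Sequence, Tuple


def auc_lower_is_impaired(mci: Sequence[float], healthy: Sequence[float]) -> float:
    """P(random MCI score < random healthy score); ties count one half."""
    wins = 0.0
    for m in mci:
        for h in healthy:
            if m < h:
                wins += 1.0
            elif m == h:
                wins += 0.5
    return wins / (len(mci) * len(healthy))


def hanley_mcneil_ci(auc: float, n_mci: int, n_healthy: int,
                     z: float = 1.96) -> Tuple[float, float]:
    """Approximate 95% CI for an AUC (Hanley-McNeil standard error; an assumption)."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc * auc / (1.0 + auc)
    var = (auc * (1.0 - auc)
           + (n_mci - 1) * (q1 - auc * auc)
           + (n_healthy - 1) * (q2 - auc * auc)) / (n_mci * n_healthy)
    se = math.sqrt(max(var, 0.0))
    return auc - z * se, auc + z * se


# Toy accuracy scores (not study data).
mci_scores = [55, 60, 62, 70, 71]
healthy_scores = [65, 72, 80, 85, 90, 91]

auc = auc_lower_is_impaired(mci_scores, healthy_scores)
low, high = hanley_mcneil_ci(auc, len(mci_scores), len(healthy_scores))
discriminates = low > 0.50  # interval excludes chance level
print(round(auc, 3), round(low, 3), round(high, 3), discriminates)
```

For measures on which higher values indicate poorer performance (for example, reaction time), the comparison direction would be reversed. The normal-approximation interval is not truncated to the [0, 1] range in this sketch.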

Discriminant Validity of Mindstreams Outcome Parameters: Effect Sizes
Results of an ROC analysis measuring the ability of Mindstreams outcome parameters to discriminate MCI from cognitively healthy elderly are presented in Table 2. Medium-load (AUC = 0.783) and high-load (AUC = 0.688) information processing outcome parameters discriminated significantly, but the low-load parameter did not. None of the motor skills outcome parameters discriminated significantly. Table 3 presents ROC analysis results for the subset of MCI and healthy elderly participants who received a battery of standardized neuropsychological tests in addition to Mindstreams testing. For each cognitive domain, the ability of Mindstreams outcome parameters to discriminate MCI from cognitively healthy elderly was compared with that of paper-based neuropsychological tests designed to tap the same domain. Outcome parameters measuring attention, information processing, and motor skills were excluded from this analysis for lack of corresponding traditional tests in these cognitive domains.

Effect Sizes of Mindstreams Outcome Parameters Relative to Traditional Neuropsychological Tests
As above (Table 2), the Mindstreams memory outcome parameter that best discriminated MCI from cognitively healthy elderly was accuracy across the 'learning' phase of the Verbal Memory test (AUC = 0.894; Table 3); the best traditional memory test was the WMS-III Logical Memory II subtest (AUC = 0.885). Also as above, the Mindstreams executive function outcome parameter that discriminated best was the Go-NoGo performance index (AUC = 0.840); the best traditional executive function test was the WAIS-III Digit Symbol subtest (AUC = 0.729; Figure 1A). The Mindstreams visual spatial outcome parameter discriminated significantly (AUC = 0.778), but the corresponding WAIS-III Block Design subtest did not (Figure 1B). The Mindstreams verbal outcome parameter that discriminated best was accuracy on the naming portion of the Verbal Function test (AUC = 0.837); the best traditional verbal test was the COWA FS test (AUC = 0.768).

Discriminant Validity of Mindstreams Outcome Parameters: Group Differences
Mindstreams outcome parameters discriminated MCI from cognitively healthy and from mild AD (e.g., Figure 2). Descriptive statistics for outcome parameters in each cognitive domain are presented in Table 4, subdivided by diagnostic group. The results of two Mann-Whitney U tests are shown for each parameter, one comparing MCI and healthy elderly participants and the other comparing MCI and mild AD participants. Results for the MCI/healthy elderly comparison are similar to those in Table 2.
For cognitive domains with sufficient data for conclusive results, significant differences between MCI and mild AD participants were found for memory, visual spatial, and verbal outcome parameters. Results were mixed for attention outcome parameters, such that timed Go-NoGo parameters did not significantly discriminate among MCI and mild AD, but the performance index from the Choice Reaction Time test did.

Discussion
Cognitive assessment is essential to the effective care and treatment of the elderly. Given that the number of elderly is predicted to increase steeply as the baby boomer generation ages [20], there is an urgent need for standardized cognitive assessment tools that deliver high quality information and are practical for routine clinical use. Traditional paper-based neuropsychological testing is seriously limited by the formidable cost in time and money. We therefore evaluated a novel computerized cognitive testing system, the Mindstreams system (NeuroTrax Corp., NY), which was designed for widespread clinical application in the detection of MCI and mild dementia.
The present study evaluated the discriminant validity of Mindstreams tests in distinguishing individuals with MCI from healthy elderly. Outcome parameters across multiple cognitive domains significantly discriminated among MCI and healthy elderly with considerable effect sizes (Table 2). Particularly strong results were obtained for outcome parameters assessing memory, executive function, visual spatial skills, and verbal function. Further, effect sizes of the computerized tests in these domains were at least comparable to neuropsychological tests designed to assess the same domains (Table 3; Figure 1). Results were mixed for Mindstreams attention and information processing outcome parameters, and those assessing motor skills did not discriminate among MCI and healthy elderly (Table 2).

Figure 1. Discriminant Validity of Executive Function and Visual Spatial Tests.
The current findings are consistent with those of studies designed to identify traditional neuropsychological tests that predict conversion to dementia. Many such studies have found standard tests of verbal and non-verbal memory and executive function to be excellent predictors [10,11,21-23]. Others have found verbal fluency to be a good predictor [24,25], and a recent report by Mapstone et al. [12] suggests that visual spatial impairment may also predict conversion to dementia. Hence every cognitive domain with strong discriminant validity for Mindstreams outcome parameters in MCI has been associated with prediction of conversion to dementia in studies of traditional tests.
Table note: Mindstreams outcome parameters with insufficient data (<13 data points per group) due to failed practice sessions (see Methods) were excluded. AUC = area under the curve; SE = standard error; CI = confidence interval; RT = reaction time; WMS-III = Wechsler Memory Scale, 3rd edition; RAVLT = Rey Auditory Verbal Learning Test, Version 1; WAIS-III = Wechsler Adult Intelligence Scale, 3rd edition; COWA = Controlled Oral Word Association.
Computerized tests other than Mindstreams have been employed to discriminate MCI from cognitively healthy elderly. Indeed, the paired associates learning (PAL) test of the Cambridge Neuropsychological Test Automated Battery (CANTAB) has been shown to be sensitive to cognitive decline [26]. While demonstrating the general utility of computerized cognitive testing, the CANTAB-PAL is limited in scope, difficult to use, and requires specialized equipment. A brief set of three tests developed by CogState Ltd. and administered serially four times in 3 hours has recently been shown to discriminate among MCI and cognitively healthy elderly on the basis of learning performance [7]. However, the CogState tests fail to provide a comprehensive cognitive profile, consisting exclusively of reaction time tests. Finally, MicroCog [27], a multi-domain computerized battery, showed good discriminability among participants with mild dementia and cognitively healthy elderly in an initial validity study [28]. However, MicroCog has not been widely used clinically, likely because it tests only selected cognitive domains and must be administered by a trained psychologist [29].
It is important to note that the results reported in the present study are preliminary. Population-based studies with longitudinal follow-up, pathological confirmation of diagnosis, and comparison with a wider array of traditional tests are required to fully establish the validity of the Mindstreams tests in MCI detection. Further, given the between-group differences in age and years of education in the present study, future studies must collect normative data on Mindstreams tests so that performance can be standardized according to age and years of education. Given the between-group difference in computer experience in the present study, subsequent studies will collect more detailed information on participants' facility with the computer in general and with each of the Mindstreams tests in particular. However, the absence of between-group differences on Mindstreams motor skills tests in the current study, those most dependent upon facility with the computer, suggests that differential computer experience did not confound the results. Finally, future work might incorporate test data in the event of a failed practice session. As such data were labeled 'missing' in the present study, the reported results likely underestimate the true discriminant validity of the Mindstreams tests.
An important limitation imposed upon the present study and all studies of MCI arises from lack of consensus regarding the clinical definition of MCI [30,31]. Our MCI participants were selected according to the standard definition in the field [4], but these criteria for 'MCI-amnestic' [3] require only memory impairment. Consistent with the present results, individuals classified as 'MCI-amnestic' are often impaired in other cognitive domains [7][8][9]. A more clinically valid classification of this pre-dementia state may be Aging-Associated Cognitive Decline (AACD; [32]), which has clearly defined diagnostic criteria and requires impairment in multiple cognitive domains [30,31]. Indeed, AACD has recently been validated as a predictor of conversion to dementia [33,34].
Figure 2. Discriminant Validity of Non-Verbal Memory Test. During the 'learning phase' of the Mindstreams Non-Verbal Memory test, four consecutive repetitions of a recognition test were administered. In (A), mean accuracy is shown across repetitions for cognitively healthy elderly (diamonds), mild cognitive impairment (MCI; squares), and mild Alzheimer's disease (AD; triangles) participants. (B) depicts mean performance (+ standard error) for each of the diagnostic groups on the final repetition trial. MCI participants are discriminable from healthy elderly and from those with mild AD (see Table 4).
Computerized testing has been criticized relative to paper-based testing in terms of technical limitations and appropriateness for clinical use [35]. Perhaps the most pervasive technical limitation is measurement error that varies depending upon the refresh rate of the monitor, the sampling rate of the input device, operating system activities, and the data acquisition software. Mindstreams, which runs under Microsoft Windows, utilizes the DirectX library to reduce imprecision due to operating system activities and data acquisition software to sub-millisecond levels. The remaining sources of error are hardware-dependent and typically result in imprecision of less than 20 milliseconds, still far better than human measurement error. Computerized assessment has also been criticized on the grounds that testing is not customizable for the individual participant. While paper-based tests are indeed more flexible, the inherent lack of uniformity confounds the valid comparison of test results across participants. Further, Mindstreams testing batteries can be customized to suit specific clinical needs. Batteries can be constructed to include only relevant tests, and stimulus presentation parameters can be altered as appropriate for a particular clinical population.
Table note: Summary statistics presented as mean (standard deviation). Accuracies are given as percent correct. Timed outcome parameters are given in milliseconds. '---' indicates insufficient data (<13 data points per group) due to failed practice sessions (see Methods). MCI = mild cognitive impairment; AD = Alzheimer's disease; RT = reaction time.
We found Mindstreams tests straightforward to administer and easy for even the mild AD participants to learn. Administration time for the comprehensive testing battery used in this study (45 minutes) was appropriate, and participants were pleased with the positive feedback that the system provided throughout the session. The automatic uploading and scoring of the data streamlined the entire data collection process, and, in our view, these features may lead to widespread adoption of computerized cognitive testing.
The present study is evaluative in that it serves to guide future studies in determining the optimal set of Mindstreams tests and outcome parameters for differentiating among various patient groups. For example, not all information processing outcome parameters discriminated equally among MCI and cognitively healthy elderly (Table 2). It appears that the level of difficulty associated with the 2-digit arithmetic (i.e., medium load) portion of the Information Processing test discriminated best, while that associated with the single digit (i.e., low load) portion of the test was ineffectual in discriminating. This suggests that level of difficulty is an important consideration in selecting the Mindstreams parameters that best discriminate among groups. Similarly, the mixed pattern of results for attention outcome parameters (i.e., Choice Reaction Time did not discriminate, but Go-NoGo timed outcome parameters did discriminate; Table 2) can be accounted for by inter-task differences in level of difficulty. These observations may guide both clinical research on existing Mindstreams tests and future test development.

Conclusions
The present preliminary study demonstrates the ability of Mindstreams computerized cognitive tests to discriminate individuals with MCI from cognitively healthy elderly. Mindstreams measures of memory, executive function, visual spatial skills, and verbal fluency discriminated best, and discriminability was at least comparable to that of traditional neuropsychological tests in these domains. Our findings and experience with the NeuroTrax system underscore the utility of this novel clinical tool in the diagnosis of MCI and mild dementia in circumstances where full neuropsychological evaluation is unavailable or impractical. Guided by the present results, further work is necessary to examine the suitability of Mindstreams tests for additional clinical and non-clinical validation cohorts and for longitudinal use.