Internet-based cognitive assessment tool: Sensitivity and validity of a new online cognition screening tool for patients with bipolar disorder

Background: The International Society for Bipolar Disorders Targeting Cognition Task Force recommends the Screen for Cognitive Impairment in Psychiatry (SCIP) to screen for cognitive impairment in bipolar disorder. However, SCIP must be administered by a healthcare professional, which is often impossible due to time and resource constraints. Web-based


Background
Cognitive impairment is a core feature of affective disorders, including unipolar depression (UD) and bipolar disorder (BD) (Bourne et al., 2013; Burdick et al., 2014; Rock et al., 2014). These impairments are evident across several cognitive domains, including memory, attention, and executive functions, and often persist during periods of remission (Bora et al., 2013; Cullen et al., 2016; Demmo et al., 2017; Rock et al., 2014). Cognitive impairments contribute to poor quality of life and impede patients' functional recovery, including work capacity, which poses the largest socio-economic burden of these disorders (McIntyre et al., 2013; Olesen et al., 2012). Notwithstanding the importance of cognitive functions for patients' daily functioning and societal costs, cognition is often not assessed in the clinic. This is due to a lack of consensus on how to assess cognition and to restricted time and resources for such assessments (Miskowiak et al., 2018; Miskowiak et al., 2017). While some clinicians assess cognition by simply asking patients whether they experience any cognitive challenges, this may not provide accurate insights into patients' cognitive functions because of a generally poor correlation between subjective and objective measures of cognition (Demant et al., 2015). This means that it is not necessarily the patients with the most subjective cognitive difficulties who display the largest performance decline on objective cognitive tests, and vice versa. Indeed, various factors influence this discrepancy, including the severity of mood symptoms, illness duration, premorbid intelligence, and age (Miskowiak et al., 2016; Petersen et al., 2019). The International Society for Bipolar Disorders (ISBD) Targeting Cognition Task Force therefore recommends clinical implementation of the validated and sensitive cognition screening tool, the Screen for Cognitive Impairment in Psychiatry (SCIP), which assesses several cognitive domains despite its brevity (<20 minutes). However, an impediment to its clinical implementation is that it must be administered by a clinician, which requires training and resources in terms of time and space. This highlights a need for the development of web-based, patient self-administered cognition screening tools that offer brief, valid and reliable remote testing of patients' objective cognitive functions.
Web-based remote testing options exist for a few cognition test batteries, including CNS Vital Signs (CNSVS; Gualtieri and Johnson, 2006) and CogState (Davis et al., 2017), which are validated in neuropsychiatric populations, including dementia, attention deficit hyperactivity disorder (ADHD) and UD. However, these tools may be suboptimal for detection of more subtle cognitive impairments in affective disorders (Davis et al., 2017) and depend on relatively expensive software (Parsons, 2016). In contrast, the THINC-integrated tool (THINC-it; McIntyre et al., 2017) and MyCognition Quotient (MyCQ; Domen et al., 2019) are web-based tools that have been made freely available and thus offer more accessible cognition assessments to patients. While MyCQ was developed to be used transdiagnostically, THINC-it is the first web-based cognition screening tool designed for patients with affective disorders. THINC-it is sensitive to cognitive impairment in patients with UD in an acute illness phase (McIntyre et al., 2017) and has high reliability in healthy controls (HC) (Harrison et al., 2018). However, only two of four subtests of THINC-it showed acceptable concurrent validity (Harrison et al., 2018). Another limitation of THINC-it, and of MyCQ, is that they lack an assessment of verbal learning and memory, which is likely due to difficulties with online speech recognition (Hafiz et al., 2019). Nevertheless, verbal learning and memory assessments should ideally be integrated into future web-based screening tools because impairments in this domain are common even during remitted states of affective disorders (Arts et al., 2008; Bora et al., 2008) and contribute to poor occupational and daily functioning (Bonnín et al., 2010; López-Villarreal et al., 2020; Tse et al., 2014).
We designed and piloted a web-based, self-administered objective screening tool for remote testing: the Internet-based Cognitive Assessment Tool (ICAT). The ICAT was designed to resemble the SCIP assessment approach (Miskowiak et al., 2018) and includes real-time assessment of verbal learning and recall using Automatic Speech Recognition (ASR) technology. Version 1 of ICAT showed adequate feasibility and validity in a pilot study of n=19 healthy volunteers (Hafiz et al., 2019), and based on the results and insights derived from this study, an improved version of ICAT was designed and implemented. The aims of the present study were to (i) investigate the sensitivity of ICAT to cognitive impairments in partially or fully remitted patients with BD, and (ii) establish the concurrent validity of the ICAT as compared to the SCIP. In addition, we assessed the association between ICAT performance, psychosocial function and subjective cognition, and estimated tentative cut-off scores for cognitive impairment for ICAT. Finally, the usability of the ICAT was examined. We hypothesized (i) that patients with BD in full or partial remission would display impaired performance as assessed by ICAT compared to healthy controls (HC), and (ii) that cognitive performance as assessed by ICAT would correlate with cognitive performance as assessed with the corresponding paper-and-pencil-based SCIP test.

Participants and recruitment
Patients with BD and HC were recruited for the study between May 2019 and March 2020. Patients were recruited through an ongoing longitudinal cohort study of patients with newly diagnosed BD, the Bipolar Illness Onset (BIO) study (Kessing et al., 2017), and were referred by specialized psychiatrists at the outpatient Copenhagen Affective Disorder Clinic, Psychiatric Centre Copenhagen. Patients had an ICD-10 diagnosis of BD and were in full or partial remission upon inclusion, as reflected by scores ≤ 14 on the Hamilton Depression Rating Scale 17-items (HDRS-17; Hamilton, 1960) and on the Young Mania Rating Scale (YMRS; Young et al., 1978). Diagnostic assessment was carried out using the diagnostic interview Schedules for Clinical Assessment in Neuropsychiatry (SCAN; Wing et al., 1990). The HC were recruited from the Blood Bank at Copenhagen University Hospital. All participants were 18-60 years of age and had Danish as their mother tongue. Exclusion criteria included any diagnosed comorbid neurological disorder, severe somatic illness, or current substance abuse disorder. Patients were further excluded if they had a daily use of benzodiazepines >22.5 mg oxazepam or >7.5 mg diazepam, or if they had received electroconvulsive therapy within the past three months. HC were excluded if they had a personal or first-degree family history of psychiatric illness. Written informed consent was obtained prior to study enrolment. The local ethics committee and data protection agency in the Capital Region of Denmark approved the study (protocol numbers: H-7-2014-007 and RHP-2015-023).

Procedure
Participants attended the Psychiatric Centre Copenhagen on two occasions. On the first visit, participants were assessed for subjective cognitive functions and functional capacity (details below). On the second visit, participants were assessed with the paper-and-pencil-based screening SCIP test and subsequently completed the ICAT test on a 13-inch laptop. For this second assessment, participants had been instructed to avoid caffeine intake. The ICAT was set up to be fully self-administered and included a short introduction video with instructions for the test. To investigate the accuracy of the ICAT automatic speech recognition (ASR) component, participants' verbal responses were also recorded by the assessor. Finally, participants provided feedback on the system usability of the ICAT. Premorbid verbal intelligence was estimated with the Danish Adult Reading Test (DART; Crawford et al., 1987). All participants were rated on depressive and manic symptoms using the HDRS-17 and YMRS, respectively.

Internet-based Cognitive Assessment Tool
The Internet-based Cognitive Assessment Tool (ICAT) is a web-based cognitive test battery designed to resemble the cognitive tasks of the Screen for Cognitive Impairment in Psychiatry (SCIP; Purdon, 2005). The ICAT is fully self-administered and requires no involvement by an assessor. Participants are introduced to the application through a homepage and fill out an informed consent form compliant with European data protection law (the General Data Protection Regulation, GDPR). Since the ICAT uses Automatic Speech Recognition (ASR) to assess verbal recall, a technical setup page ensures that the microphone and speaker are correctly configured. A detailed description of the ICAT design process, system descriptions, and the use of ASR in cognitive assessment is reported in Hafiz et al. (2019).
Four of the five cognitive tests in ICAT were designed based on the subtests of SCIP. A detailed description of the ICAT tests and their paper-and-pencil counterparts can be found in the supplementary material, Table S1. For each test in ICAT, the participant is provided with a detailed guide consisting of pre-recorded video and audio instructions designed to closely resemble the face-to-face instruction normally issued by a human assessor. An assessment in ICAT takes approximately 35 minutes. The longer duration compared with the SCIP is due to the introductory video, the auditory and written instructions before each test, and the longer duration of the adapted LNS compared with the SCIP verbal fluency test. The ICAT comprises five subtests: (i) List Learning (LL), (ii) Consonant Repetition (CR), (iii) adapted WAIS Letter-Number Sequencing (LNS), (iv) Delayed List Learning (DLL), and (v) Visuomotor Tracking (VMT) (Fig. 1). Together, the ICAT tests probe verbal learning (LL), two aspects of working memory: 'transient online storage and retrieval' and 'executive function working memory' (CR and adapted LNS), delayed verbal memory (DLL) and psychomotor speed (VMT), respectively. The adapted LNS test replaced the SCIP Verbal Fluency subtest, since a digital format of verbal fluency could not be adequately implemented due to insufficient accuracy of ASR for rapidly uttered words with sometimes only slight pronunciation differences (Hafiz et al., 2019).

Revision of tests for ICAT version 2
The ICAT was optimized from its original version published in Hafiz et al. (2019) through improvement of the DLL and CR tests. In our feasibility study, we had found no association between the ICAT DLL and SCIP Verbal Learning Test (VLT) - Delayed (Hafiz et al., 2019). This was due to (i) participants uttering words too quickly, too quietly, or too far away from the microphone and (ii) low accuracy of the ASR for certain words. To mitigate these limitations of the automatic speech recognition technology, updated video and audio instructions were included which emphasized that the test participant should speak clearly and separate words with small pauses in between. To further improve the precision of the speech recognition, we replaced the list of words provided in the SCIP manual (Purdon, 2005) with a new list that yielded a higher accuracy in the speech recognition component. The new words for the revised ICAT word list were chosen from the Rey Auditory Verbal Learning Test (RAVLT) and had an approximately equal frequency in the Danish language compared to the old ICAT list to ensure a similar difficulty level. Another benefit of this approach was that the revised word list had no overlap with words on the SCIP form 3 administered in the study.
The ICAT CR test originally included a drag-and-drop number sorting module (instead of having the participant count backwards as in the SCIP) as a distraction before recalling the letter sequence. However, no association was observed between the ICAT CR test and the SCIP Working Memory Test (WMT) (Hafiz et al., 2019), possibly due to the lower cognitive load of the drag-and-drop number sorting distractor task. We therefore redesigned the distraction task to include a speech interface, with participants counting backwards out loud as in the SCIP WMT. To ensure that participants did in fact count backwards, they were told by the system that their responses were recorded (even though they were not). Further, during this "recording" a sound wave was shown to illustrate that sound was being recorded (Fig. 1.B).

Assessment of functional capacity and subjective cognitive complaints
Participants' functioning was assessed with the Functioning Assessment Short Test (FAST; Rosa et al., 2007), a brief observer-based rating scale that patients complete with a trained clinician. The FAST is designed to measure the main functional impairments typically experienced by patients with BD and is recommended as a functional capacity assessment tool by the ISBD Targeting Cognition Task Force (Miskowiak et al., 2017). Participants also filled out the Cognitive Complaints in Bipolar Disorder Rating Assessment (COBRA; Rosa et al., 2013), a brief questionnaire assessing subjective cognitive difficulties in daily life situations.

Assessment of usability
System usability of the revised ICAT was assessed with the Post-Study System Usability Questionnaire (PSSUQ), which assesses satisfaction with computer application usability along four dimensions: overall usability, system usefulness, information quality, and interface quality (Lewis, 1995). Upon completing the questionnaire, qualitative feedback was collected from patients, who were asked to comment further on their experience of using ICAT.

Statistical analyses
Statistical analyses were carried out using IBM SPSS Statistics 25 for Windows (Field, 2013). Statistical significance for all analyses was set to an alpha level of p<.05 (two-tailed). The two groups were compared using independent samples t-tests for normally distributed data. If the assumption of normality was violated, non-parametric Mann-Whitney U tests were carried out instead. Groups were compared on demographic (age, years of education) and clinical variables (subsyndromal depressive and manic symptoms) as well as on premorbid intellectual ability, estimated from participants' error score on the DART using the formula IQ estimate = 128 - (0.83 × DART error score), as proposed by Nelson and Willison (1991). A χ2 test was applied to investigate any potential differences in gender distribution between the two groups.
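The premorbid IQ estimate above is simple arithmetic, and a minimal sketch may make it concrete. The function name and the example error score are hypothetical and are not taken from the study data.

```python
def estimate_premorbid_iq(dart_error_score: float) -> float:
    """Estimate premorbid IQ as 128 - (0.83 * DART error score)."""
    if dart_error_score < 0:
        raise ValueError("DART error score cannot be negative")
    return 128 - 0.83 * dart_error_score


if __name__ == "__main__":
    # Hypothetical participant with 10 reading errors on the DART.
    print(round(estimate_premorbid_iq(10), 1))
```

Fewer reading errors yield a higher estimate, with 128 as the ceiling for an error-free performance.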
The sensitivity of the ICAT and SCIP tools for assessment of cognitive impairment was assessed for each subtest, as well as for the total score of each instrument. For independent samples t-tests, Cohen's d was calculated as a measure of effect size. For Mann-Whitney U tests, r was calculated as an appropriate effect size using the formula r = Z / √N (Fritz et al., 2011). Significant between-group differences were followed up with post-hoc ANCOVAs with ICAT or SCIP score (total or subtest scores) as the dependent variable, group (BD vs. HC) as the independent variable, and any demographic or clinical variables on which the two groups differed as covariates.
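The two effect sizes named above can be sketched as follows. The text gives r = Z / √N but does not state which variant of Cohen's d was used, so the pooled-standard-deviation form below is an assumption, and the function names and example scores are hypothetical.

```python
import math
from statistics import mean, stdev


def cohens_d(group_a, group_b):
    """Cohen's d using the pooled standard deviation (assumed variant)."""
    n_a, n_b = len(group_a), len(group_b)
    pooled_variance = (
        (n_a - 1) * stdev(group_a) ** 2 + (n_b - 1) * stdev(group_b) ** 2
    ) / (n_a + n_b - 2)
    return (mean(group_a) - mean(group_b)) / math.sqrt(pooled_variance)


def r_from_z(z, n_total):
    """Effect size r = Z / sqrt(N) for a Mann-Whitney U test."""
    return z / math.sqrt(n_total)


if __name__ == "__main__":
    # Hypothetical scores, for illustration only.
    print(cohens_d([2, 4, 6], [1, 3, 5]))
    print(r_from_z(2.0, 100))
```

Here N in `r_from_z` is the total number of observations across both groups, which is the usual reading of the formula.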
The concurrent validity of the ICAT compared to SCIP was investigated with correlation analyses using Pearson's r or Spearman's ρ for normally and non-normally distributed data, respectively. Associations were investigated between the total scores of ICAT and SCIP, as well as between performance on each of the corresponding subtests. Performance on the ICAT adapted WAIS LNS subtest was also correlated with the original paper-and-pencil version. For the ICAT LL and DLL subtests, words recalled by the ICAT ASR component were correlated with manual transcripts recorded by the assessor. Finally, objective neuropsychological performance was correlated with observer-based measures of functional capacity and subjective cognitive impairments, respectively. Significant associations were followed up by partial correlations controlling for any demographic or clinical variables on which the groups differed.
Tentative cut-off scores for cognitive impairments were estimated for ICAT based on the standard deviation (SD) of the HCs. The tentative cut-off scores were established at performance levels ≥1 SD and ≥0.5 SD below the normative mean for the subtests and the total score, respectively, as recommended by the ISBD Targeting Cognition Task Force for clinically relevant thresholds for cognitive impairments in BD (Miskowiak et al., 2018).
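A minimal sketch of this threshold rule, assuming the sample standard deviation (n - 1 denominator) of the HC scores is used. The function names and HC scores are hypothetical, and any rounding of thresholds to whole test points is left out.

```python
from statistics import mean, stdev


def tentative_cutoff(hc_scores, sd_below):
    """Return the score lying `sd_below` SDs under the healthy-control mean."""
    return mean(hc_scores) - sd_below * stdev(hc_scores)


def is_impaired(score, cutoff):
    """Flag scores at or below the threshold (i.e. >= sd_below SDs under the mean)."""
    return score <= cutoff


if __name__ == "__main__":
    hc_scores = [60, 70, 80]  # hypothetical HC scores
    subtest_cutoff = tentative_cutoff(hc_scores, sd_below=1.0)
    total_cutoff = tentative_cutoff(hc_scores, sd_below=0.5)
    print(subtest_cutoff, total_cutoff, is_impaired(58, subtest_cutoff))
```

The same helper serves both thresholds: `sd_below=1.0` for subtests and `sd_below=0.5` for the total score.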
Word error rate (WER) was used as the primary performance measure of the ICAT ASR component using the formula WER = (S + D + I) / N, where N is the total number of words, D is the number of deletions, S is the number of substitutions, and I is the number of insertions. WER was calculated manually by comparing ASR transcripts to verbal responses recorded during the ICAT LL and DLL subtests.
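Although WER was calculated manually in this study, the formula can be illustrated with an automated stand-in: a word-level edit distance gives the minimum S + D + I needed to turn the reference into the ASR output. The sketch assumes that N is the number of words in the assessor's reference transcript; the function name and example word lists are hypothetical ('dør' and 'kylling' appear on the ICAT list, the other words are invented).

```python
def word_error_rate(reference, hypothesis):
    """WER = (S + D + I) / N, with N = number of reference words (assumed)."""
    n, m = len(reference), len(hypothesis)
    if n == 0:
        raise ValueError("reference transcript must contain at least one word")
    # dist[i][j]: minimum number of substitutions, deletions and insertions
    # needed to turn reference[:i] into hypothesis[:j].
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i  # delete all i reference words
    for j in range(m + 1):
        dist[0][j] = j  # insert all j hypothesis words
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            substitution_cost = 0 if reference[i - 1] == hypothesis[j - 1] else 1
            dist[i][j] = min(
                dist[i - 1][j] + 1,                      # deletion
                dist[i][j - 1] + 1,                      # insertion
                dist[i - 1][j - 1] + substitution_cost,  # match or substitution
            )
    return dist[n][m] / n


if __name__ == "__main__":
    spoken = ["dør", "kylling", "hus", "bil"]  # assessor's transcript
    recognized = ["dør", "hus", "bil"]         # ASR output missing one word
    print(word_error_rate(spoken, recognized))
```

Because insertions count as errors, WER computed this way can exceed 1 when the ASR output contains many spurious words.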

Demographics and clinical variables
Demographic and clinical data are presented in Table 1. We included n=35 participants with BD (26 females; age, mean ± standard deviation [SD]: 31.7±9.0) and n=35 HC (21 females; age, mean ± SD: 28.4±6.0). Comparisons between BD and HC revealed that the two groups were comparable for age, gender, education, and premorbid IQ (ps ≥ .092). Patients with BD were in full or partial remission, but they nevertheless displayed more subsyndromal symptoms of depression (U=148.0, p<.001, r=0.68) and mania (U=444.0, p=.018, r=0.28) compared to HCs.
Across the entire sample, verbal responses recorded by the ICAT ASR component were strongly correlated with verbal responses recorded manually by the assessor for both the LL (r(65)=.91, p<.001) and DLL (r(64)=.96, p<.001) subtests (Table 4). Participants' recall and the recognition accuracy of the ASR component are presented in Fig. 3. Overall, 1,640 Danish words were received by the ASR component during the ICAT LL and DLL tests, with an average word error rate (WER) of 8.8%. For the individual Danish words included on the list, eight out of 10 words had an accuracy >90%, whereas two words ('dør' and 'kylling') had slightly lower but still acceptable ASR accuracy of 85% and 88%, respectively (Fig. 3). Across the LL and DLL, the percentage of words correctly identified by the ASR was 92%.

Association between ICAT, functioning and cognitive complaints
Across the entire sample, there was a moderate correlation between lower ICAT total scores and more functional impairments (r(59)=-.43, p<.001). This correlation was also significant within the patient group (r(30)=-.32, p=.04). In contrast, SCIP total scores were not significantly associated with functional impairments across the entire sample or within the patient group (ps ≥ .10).

Tentative cut-off scores for cognitive impairment
Based on the performance of the HCs, we propose a cut-off of <68 points for the ICAT total score, using a threshold for cognitive impairment on this global cognition measure of 0.5 SD under the HC mean, in line with the ISBD Targeting Cognition Task Force recommendations (Miskowiak et al., 2018). We propose the following cut-offs for the individual ICAT subtests: <18 for the LL subtest, <20 for the CR subtest, <10 for the adapted WAIS LNS subtest, <4 for the DLL subtest, and <7 for the VMT subtest, in line with the recommended threshold for cognitive impairment of ≥1 SD under the HC mean (Miskowiak et al., 2018).

Usability of the ICAT
A total of 67 (96%) participants completed the PSSUQ usability questionnaire. They reported high usability of the ICAT system as measured by their perceived overall usability (M=4.1, SD=0.5 of max 5 on the PSSUQ). In the qualitative feedback, seven participants reported that it was more challenging to memorize words being read aloud by the ICAT application during the LL subtest compared to the face-to-face format of the SCIP VLT-I subtest. Additionally, four participants reported that the ICAT VMT subtest was more difficult to complete on a keyboard than the paper-and-pencil format of the SCIP PST subtest.

Discussion
We investigated the sensitivity and validity of a novel web-based and self-administered cognitive screening tool (the revised ICAT) in fully or partially remitted patients with BD and HC. Consistent with our first hypothesis, patients with BD displayed impaired cognitive performance compared to HCs, as measured by the ICAT total score comprising all five subtests as well as on three out of five subtests tapping into verbal learning and memory, working memory, and psychomotor speed, respectively. This suggests that ICAT is sensitive to cognitive impairments in BD. In line with our second hypothesis, performance on ICAT was strongly correlated with performance on the validated objective screening tool SCIP, indicating that ICAT provides a valid assessment of cognitive functioning. Finally, participants reported high levels of satisfaction with the ICAT system, thus indicating good feasibility and user friendliness.
The strong correlation between the ICAT and SCIP total scores, and between ICAT subtests and their paper-and-pencil counterparts, indicates high concurrent validity of ICAT. This reflects an improvement from ICAT version 1, which showed only moderate correlations with the SCIP in a sample of healthy participants (Hafiz et al., 2019). The ICAT also seems to have higher concurrent validity than the other web-based cognition screening tool designed specifically for affective disorders, the THINC-it, for which total scores showed a moderate correlation (r=0.4) with standardized neuropsychological tests and only two of four subtests correlated significantly with neuropsychological tests (Harrison et al., 2018). One explanation is that THINC-it employs a gamified test format that may, in contrast with neuropsychological tests, also tap into reward processing that is closely related to depressive symptoms (Nusslock and Alloy, 2017). The ICAT employs a less gamified design and has greater resemblance to traditional neuropsychological tests, and its results may thus be less influenced by mood symptoms. Another advantage of ICAT compared to other web-based cognitive tests (Domen et al., 2019; McIntyre et al., 2017) is its automatic assessment of verbal recall based on speech recognition. This is an important feature because verbal memory impairments are often pronounced in affective disorders and associated with functional disability (Arts et al., 2008; Bonnín et al., 2010; Tse et al., 2014). Indeed, the ICAT List Learning and Memory tasks, which apply the ASR technology, displayed high concurrent validity and strong correlations with manual transcripts. Notably, participants recalled fewer words from the ICAT than from the SCIP word lists. This could be due to their greater difficulty with remembering words read aloud from a computer than by the research assistant, from whom they could read mouth movements to assist their learning. It is also possible that their poorer retention of the ICAT words was influenced by their awareness of having to pronounce the words clearly, which could have diminished their attention to memorizing the words.
The finding of a moderate association between psychosocial functioning, measured by the FAST, and the ICAT processing speed test is in accordance with previous reports of associations between this cognitive domain and functioning in BD (Mur et al., 2009). In contrast, the observed significant, albeit only moderate, association between ICAT performance and subjective cognitive difficulties according to the COBRA diverges from previous findings of a poor relationship between objectively and subjectively measured cognitive difficulties (Demant et al., 2015; Van der Elst et al., 2008). A possible explanation is that the self-administered format of ICAT more closely resembles the cognitive challenges that patients must tackle by themselves in their daily life, since completing the ICAT tests involves minimal guidance and expectations from a neuropsychologist. Indeed, guidance and expectations during face-to-face assessment might facilitate motivational processes, which can have a positive confounding influence on test effort during assessment (Greher and Wodushek, 2017).
The finding that patients showed only a trend towards impaired performance on the SCIP diverges from previous studies that have demonstrated good sensitivity of the SCIP to cognitive impairments in BD (Jensen et al., 2015; Rojo et al., 2010). Notably, 63% of patients in the present study were newly diagnosed with BD (< 3 years) upon study inclusion. Moreover, in the current study, the total illness duration and mean age were lower than in other validation studies of the SCIP in remitted BD (mean illness duration 9 years and mean age 32 years in the present study vs. 12 years and 41 years in previous studies) (Rojo et al., 2010). Importantly, longer illness duration has been associated with more cognitive impairments in BD (Cardoso et al., 2015). It is therefore conceivable that the limited sensitivity of the SCIP and of some of the ICAT subtests was due to the inclusion of recently diagnosed and hence less impaired patients. Indeed, when applying a SCIP total score cut-off point for cognitive impairments in BD of <70, as suggested by Jensen et al. (2015), we identified 16% of our patients (vs. 6% of HCs) as cognitively impaired, compared to 30-59% in previous validation studies of the SCIP (Jensen et al., 2015; Rojo et al., 2010). Given this, it was remarkable that ICAT was sensitive to cognitive impairment in patients on the total score and on three out of five subtests. Following adjustment for subsyndromal mood symptoms, patients' verbal learning and psychomotor impairments as measured by ICAT became non-significant, whereas the significant group difference remained on the ICAT working memory test. This finding is consistent with meta-analytic evidence for ameliorated verbal memory and psychomotor speed during remission, whereas executive function deficits are more independent of clinical mood states (Kurtz and Gerraty, 2009).
There are numerous implications of a valid web-based cognitive screening tool. First, assessment of cognition is not systematically implemented in the clinical management of BD, which is partly due to a lack of consensus on which cognition screening tools to use, as well as limited resources of health care providers (Miskowiak et al., 2018). Even brief, easily administered tools such as the SCIP require training, time, and the facilities and resources to conduct face-to-face assessment. A valid web-based tool such as ICAT can therefore have important implications for assessing objective cognitive functioning in daily settings by enabling remote, self-administered assessment. This enables cost-efficient cognitive screening, providing cognition scores which can be uploaded directly to electronic medical records accessible to clinical staff in hospitals or practitioners in primary care. Second, ICAT can help monitor patients' cognitive status over time to help detect a potential cognitive decline or improvements following changes in treatment and lifestyle. Third, in research trials investigating pro-cognitive treatments, the possibility to efficiently pre-screen eligible participants for objective cognitive impairments across large geographical distances would represent an immense methodological advancement in recruitment (Miskowiak et al., 2017). Finally, ICAT could enable unprecedented large-scale, longitudinal studies of cognitive functioning in patients with affective disorders in nationwide register-based studies including thousands of participants.
The present study has several limitations. First, the sample size, with 35 patients and 35 healthy controls, was relatively small for this ICAT validation study. Post hoc power calculations showed that the power was high for estimating the sensitivity to cognitive impairment in BD with the ICAT total scores, but only moderate for three (LL, LNS and VMT) and low for two ICAT subtests (CR and DLL). Further, the suggested cut-off scores can only be considered tentative, since optimal cut-off scores for cognitive impairments must be determined in a larger sample by means of receiver operating characteristic (ROC) analysis. Second, it was not possible to counterbalance the order of assessment, and all participants therefore completed the SCIP first, which may have introduced a learning effect, such that performance on the ICAT tests was possibly better than it would have been had it not been preceded by the SCIP tests. However, speaking against this, participants displayed more deficits on the ICAT compared to the SCIP. Third, although the brevity of the ICAT tests represents a clear advantage, the tool does not measure all aspects of cognitive functioning, such as problem solving or shifting aspects of executive functions. Moreover, ICAT does not address deficits within affective cognition, although such deficits have been linked to functional impairments in BD (Miskowiak et al., 2019). Nevertheless, ICAT is meant for screening purposes only and cannot replace a complete neuropsychological examination. Fourth, the ICAT ASR component displayed slight inconsistencies for the List Learning scores, which may be unavoidable due to the early stage of ASR technology (König et al., 2018; Pakhomov et al., 2015). Fifth, ICAT testing was conducted in a clinical setting rather than in patients' homes and may thus not be representative of scores obtained in home settings with different physical (and social) contexts. Future studies are thus required to establish whether the ICAT has adequate validity and reliability in a remote administration setting. Finally, a limitation of the ICAT is that a lack of computer skills would impact participants' ICAT scores. However, participants in the present study did not experience difficulties using the computer. This was perhaps because of their relatively young age (years, mean±SD: 30±8) and because of the specific instructions in the introductory ICAT video clip and in the auditory and written instructions before each ICAT test regarding which key(s) to press (Hafiz et al., 2019). Nevertheless, potential technical difficulty is an important aspect, which we will assess with a brief online questionnaire in our planned next study, in which participants will complete the ICAT in their home settings.
In conclusion, ICAT showed adequate concurrent validity, suggesting that this tool can be used for remote, self-administered assessments of objective cognitive functioning in fully or partially remitted patients with BD. The ICAT was even more sensitive to cognitive impairments than the SCIP and may be more sensitive in capturing perceived real-life functional and cognitive difficulties. The use of real-time ASR for assessment of verbal memory, as well as the close resemblance to a standard neuropsychological screening tool, represent important advantages of the ICAT system as compared to existing web-based cognitive screening tools. Based on the insights derived from this study, a slight optimization of the ICAT is now possible, as well as the development of a parallel version for repeated testing. Future larger studies are now warranted to investigate the test-retest reliability of the ICAT and the psychometric properties of a parallel ICAT test, identify optimal cut-off scores for cognitive impairments using receiver operating characteristic (ROC) analysis, and address the important question of how the tool fares outside of a clinical setting (i.e., with remote testing) as well as in other patient groups that may benefit from screening for cognitive impairments.

Declaration of competing interest
KWM reports having received consultancy fees from Lundbeck and Janssen-Cilag in the past 3 years. LVK has been a consultant for Lundbeck for the past 3 years. The remaining authors report no conflicts of interest.
Fig. 1. A. The speech interface of the ICAT List Learning and Delayed List Learning subtests. B. The interface of the ICAT Consonant Repetition subtest. C. A practice sequence in the ICAT adapted WAIS Letter/number-sequencing subtest. D. The interface of the ICAT Visuomotor Tracking subtest.
Note. BD: Bipolar Disorder. HC: Healthy Control. SD: Standard Deviation. HDRS-17: 17-item Hamilton Depression Rating Scale. YMRS: Young Mania Rating Scale. * = p < .05 (two-tailed), ** = p < .01 (two-tailed). a Missing data for n = 2 BD patients and n = 2 HCs. b Missing data for n = 2 BD patients. c Duration since diagnosis and illness duration were defined as the time from the first manic, hypomanic, or mixed episode to the time of diagnosis or assessment, respectively.

a Missing data for n = 2 BD patients and n = 1 HC. b Missing data for n = 1 HC. c Missing data for n = 1 HC. d Missing data for n = 2 BD patients and n = 2 HC. e Missing data for n = 1 BD patient and n = 1 HC. f Missing data for n = 4 BD patients and n = 3 HC. g Missing data for n = 1 BD patient and n = 3 HC. h Missing data for n = 1 BD patient.

mood symptoms (ps ≥ .28).

Fig. 2. A. Performance on the ICAT total score in the HC and BD groups. Error bars represent standard errors of the mean (SEM). B. Performance on the five ICAT subtests (A) and SCIP subtests (B) in the HC and BD groups. Error bars represent SEM. C. Scatterplot of the correlation between the ICAT and SCIP total scores across the entire sample (r = .72). Patients in the BD group are represented by red dots. Participants in the HC group are represented by blue dots.

Fig. 3. Total number of recalls versus the recognition accuracy of the Automatic Speech Recognition component for the Danish words in task 1 (List Learning) and task 4 (Delayed List Learning).

and WMT tests (U=448.5, p=.08, r=.21). No group differences were observed for the remaining SCIP subtests (ps ≥ .145; Table 2 and Fig. 2.B). The group differences remained non-significant when adjusting for subsyndromal

Table 1
Demographic and clinical characteristics of the BD and HC groups.

Table 2
Scores on the Internet-Based Cognitive Assessment Tool (ICAT), corresponding subtests on the Screen for Cognitive Impairment in Psychiatry -Danish (SCIP-D), functional capacity and subjective cognitive functioning between the HC and BD groups.

Table 3
Number and percentages of patients and HC achieving a maximum score on the SCIP and ICAT subtests with fixed score range.

Table 4
ICAT correlations with SCIP-D, manually recorded measures, functional capacity, and subjective cognitive impairment across the entire sample (n = 70).
Note. LL: List Learning. CR: Consonant Repetition. WAIS IV LNS: Wechsler Adult Intelligence Scale III: Letter-Number Sequencing. DLL: Delayed List Learning. VMT: Visuomotor Tracking. ASR: Automatic Speech Recognition. VLT-I: Verbal Learning Task - Immediate. WMT: Working Memory Task. VFT: Verbal Fluency Task. VLT-D: Verbal Learning Task - Delayed. PST: Psychomotor Speed Task. FAST: Functional Assessment Short Test. COBRA: Cognitive Complaints in Bipolar Disorder Rating Assessment.