Cognitive Functioning in Clinically Stable Patients with Bipolar Disorder I and II

Objectives: Bipolar disorder is accompanied by cognitive impairments, which persist during euthymic phases. The purpose of the present study was to identify those neuropsychological tests that most reliably tell euthymic bipolar patients and controls apart, and to clarify the extent to which these cognitive impairments are clinically significant as judged from neuropsychological norms.
Methods: Patients with bipolar disorder (type I: n = 64; type II: n = 44) and controls (n = 86) were examined with a comprehensive neuropsychological test battery yielding 47 measures of executive functioning, speed, memory, and verbal skills. Multivariate analysis was used to build a model of cognitive performance with the ability to expose underlying trends in data and to reveal cognitive differences between patients and controls.
Results: Patients with bipolar disorder and controls were partially separated by one predictive component of cognitive performance. Additionally, the relative relevance of each cognitive measure for this separation was determined. Cognitive tests measuring set shifting, inhibition, fluency, and searching (e.g., Trail Making Test, Color-Word) had the strongest discriminating ability and most reliably detected cognitive impairments in the patient group.
Conclusions: Both bipolar disorder type I and type II were associated with cognitive impairment that for a sizeable minority is significant in a clinical neuropsychological sense. We demonstrate a combination of neuropsychological tests that reliably detects cognitive impairment in bipolar disorder.


Introduction
Even though recurrent episodes of depression and mania are the hallmarks of bipolar disorder, an association between bipolar disorder and cognitive impairment has also repeatedly been described, foremost regarding executive function, attention, processing speed, and verbal and episodic memory [1][2][3]. Yet, it is undecided whether cognitive impairment is a general trait of bipolar disorder. Bipolar patients with a history of psychotic symptoms have been suggested to be more cognitively impaired than non-psychotic patients [4]. Also, earlier studies disagree as to whether bipolar disorder I and II differ with respect to cognitive impairment. Some studies report no differences between these subtypes [5,6], whereas others suggest that patients with bipolar disorder type I perform more poorly than patients with bipolar disorder type II [7][8][9].
A recent meta-analysis of cognition in bipolar disorders showed a large degree of inconsistency across studies and that case-control differences were smaller than previously thought [10]. Even though cognitive impairment during euthymic periods is thought to contribute to the bipolar patient's difficulties in everyday occupational and social functioning [11], the clinical significance of case-control differences in cognitive function remains to be determined.
The magnitude of these case-control differences across studies assessing cognitive functioning has guided the choice of cognitive tasks recommended by the International Society for Bipolar Disorders [12]. However, studies of cognitive functioning have an innate difficulty concerning the choice of appropriate statistical method. While not yet on par with 'omics' datasets, modern neuropsychological test batteries yield dozens of inter-correlated measures, which demand efficient statistical tools to extract robust trends and to avoid false positives. Here, we therefore used orthogonal partial least-squares to latent structures (OPLS) [13] in order to characterize cognition in bipolar disorder. OPLS in its discriminant analysis form (OPLS-DA) splits the systematic variation in a dataset (47 neuropsychological test results for each participant in the present case) into two parts. One is predictive of class membership (i.e., bipolar I, bipolar II, healthy controls in the present case) and the other is uncorrelated or orthogonal to the classes. This partitioning greatly facilitates model interpretation and identifies the combination of neuropsychological variables that separate pre-defined groups.
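To make the partitioning concrete, the following is a minimal sketch of a single-response OPLS step with one predictive and one orthogonal component, written in plain Python. It is an illustration only and not the SIMCA implementation used in the study: the toy matrix `X`, the two-class label vector `y` (the study has three classes), and all function names are assumptions introduced here. The sketch mean-centers the data, finds the direction in test space that covaries with class membership, and then isolates and removes variation that is orthogonal to the classes.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def center(values):
    m = sum(values) / len(values)
    return [v - m for v in values]

def xt_times(X, v):
    """Return X^T v for a row-major matrix X (rows = participants)."""
    return [dot([row[j] for row in X], v) for j in range(len(X[0]))]

def x_times(X, w):
    """Return X w."""
    return [dot(row, w) for row in X]

def unit(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def opls_one_component(X, y):
    # Mean-center every test measure and the class vector.
    cols = [center([row[j] for row in X]) for j in range(len(X[0]))]
    Xc = [list(row) for row in zip(*cols)]
    yc = center(y)
    # Predictive weights: direction that covaries with class membership.
    w = unit(xt_times(Xc, yc))
    t = x_times(Xc, w)
    p = [v / dot(t, t) for v in xt_times(Xc, t)]
    # Orthogonal weights: the part of the loading p not aligned with w.
    # (Fails with a zero norm if the data hold no orthogonal variation.)
    wp = dot(w, p)
    w_o = unit([pj - wp * wj for pj, wj in zip(p, w)])
    t_o = x_times(Xc, w_o)
    p_o = [v / dot(t_o, t_o) for v in xt_times(Xc, t_o)]
    # Remove the class-orthogonal variation, then recompute predictive scores.
    X_f = [[x - ti * pj for x, pj in zip(row, p_o)] for row, ti in zip(Xc, t_o)]
    t_pred = x_times(X_f, w)
    return t_pred, t_o, yc

# Hypothetical toy data: 6 participants, 3 test measures; 1 = control, 0 = patient.
X = [
    [4.0, 2.0, 7.0],
    [5.0, 3.0, 6.0],
    [6.0, 2.5, 8.0],
    [3.0, 3.5, 7.5],
    [2.0, 2.0, 6.5],
    [3.5, 3.0, 8.5],
]
y = [1, 1, 1, 0, 0, 0]

t_pred, t_o, yc = opls_one_component(X, y)
print("predictive scores:", [round(v, 3) for v in t_pred])
print("orthogonal scores . y:", dot(t_o, yc))
```

By construction the orthogonal scores are uncorrelated with the centered class vector, so the printed dot product is zero up to rounding error, while the predictive scores carry the group separation.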
The aims of this study were (i) to identify those neuropsychological tests that most reliably tell euthymic bipolar patients and healthy controls apart; (ii) to clarify the extent to which these cognitive impairments are clinically significant in a neuropsychological sense as opposed to merely statistically different from a healthy control group; (iii) to elucidate whether patients with bipolar disorder I and II are cognitively dissimilar; (iv) to elucidate whether the degree of cognitive impairment has bearing on psychosocial and clinical variables. We studied these questions in a clinical cohort of euthymic bipolar patients and matched healthy controls who had completed a comprehensive neuropsychological test battery. Some parts of the neurocognitive data from patients and controls enrolled in this study have been used in a previous study that used univariate analysis to compare cases and controls [6]. Here, we used the OPLS-DA procedure, which we hypothesized would uncover information on neurocognitive performance that is unavailable when using univariate analyses.
the Affective Disorder Evaluation (ADE), which is a semi-structured interview that includes adapted versions of the mood and psychosis modules of the Structured Clinical Interview for DSM-IV (SCID), and was developed for the Systematic Treatment Enhancement Program of Bipolar Disorder (STEP-BD) project [17]. These diagnoses were confirmed by a consensus panel of experienced clinicians whereby a best estimate diagnosis was reached. Thus 64 participants met the criteria for bipolar I disorder and 44 for bipolar II disorder. Co-morbid psychiatric disorders were screened for by using the Mini International Neuropsychiatric Interview (M.I.N.I.) [18]. To screen for alcohol and substance abuse, the self-report questionnaires Alcohol Use Disorders Identification Test (AUDIT) [19] and Drug Use Disorders Identification Test (DUDIT) [20] were used. In addition to information regarding age, sex, and level of education, records were available concerning age at first symptom, age at first psychosis (if any), lifetime history of psychosis, number of affective episodes, electroconvulsive treatments, number of sick-leave days in the last year, and primary income source. Data on medication were collected at the baseline diagnostic assessment and the somatic examination. Drug information was therefore harvested from the date closest in time to the date of testing. The severity of bipolar disorder was rated using the Clinical Global Impression (CGI) [21] rating scale. Overall functioning was assessed with GAF [22].
The age- and sex-matched healthy, population-based control subjects (n = 86) were randomly selected by Statistics Sweden (SCB). The screening for past or present psychiatric disorders was performed using M.I.N.I. [18]. The exclusion criteria for healthy controls have been described in detail previously [25].

Neuropsychological test procedure
Participants were assessed with 21 tests, most of which are described in detail by Lezak et al. [26], tapping key aspects of cognition, including executive function and attention, processing speed, memory, and verbal skills. The battery usually required two sessions with patients, whereas the controls were assessed during a single session. The following tests were used.
The Claeson-Dahl Verbal Learning and Retention Test is a word list learning task that presents 10 words for a maximum of ten learning trials. Measures of importance are the learning score, the retention score, and the recognition score.
Five stand-alone tests from the Delis-Kaplan Executive Function System (D-KEFS): the Color-Word Interference Test (condition 1: Color Naming, condition 2: Word Reading, condition 3: Inhibition, condition 4: Inhibition/Switching), the Design Fluency Test (condition 1: Filled Dots, condition 2: Empty Dots Only, condition 3: Switching), the Tower Test (total achievement score and total rule violations), the Trail Making Test

Statistical procedures
For each cognitive measure, skewness and kurtosis were determined and, when appropriate, data were transformed according to the ladder of powers. The direction of the results was adjusted such that high scores represented good performance. Following unit variance scaling and mean-centering, data were modeled by means of OPLS-DA, implemented by SIMCA-P 13.0 software (Umetrics AB, Umeå, Sweden). The OPLS-DA procedure identifies correlation patterns that discriminate between pre-defined groups and assesses the relative importance of each test variable for the discrimination [13]. Furthermore, conventional PLS calculations were performed, which apply to the two-block regression problem. Psychosocial and clinical variables (e.g., MADRS scores, number of depressions) were used to construct models of cognitive performance (47 neuropsychological measures) in patients.

Results
Table 1 shows demographic and clinical characteristics for the bipolar disorder group and the control group. Premorbid IQ as assessed by years of education was not significantly different between the bipolar disorder group and the control group (F(2,193) = 1.02, p > 0.05, partial eta squared = 0.01). As expected, patients with bipolar I disorder showed a higher occurrence of prior psychosis and treatment with antipsychotics than bipolar II patients.
Table 1 (excerpt). Demographic and clinical characteristics.
In work (%): 58 67
History of psychosis (%): 73 7
Age at first psychosis: 27 11 26 9
Antipsychotic medication (%): 32 11
Lithium (%): 68 48
Anticonvulsants (%): 32 32
ECT (%): 30 9
Comorbid ADHD (%): 12 26
Comorbid anxiety disorder (%): 28 27
Alcohol abuse (%): 24 23
Substance abuse (%): 9 2 5
The controls (n = 86) were matched for age and sex (X% female). No differences were found regarding education level between the bipolar disorder groups and the control group.
The OPLS-DA procedure yielded a model that was significant by cross-validation. The predictive component accounted for 13% of the neuropsychological variation, with a prediction ability of 0.16, according to cross-validation. This indicates that the case and control groups were partially separated on the basis of their cognitive performance, the overall pattern being that the two bipolar groups performed similarly to each other and somewhat more poorly than controls (Fig. 1).
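The preprocessing named under Statistical procedures (adjusting direction so that high scores represent good performance, then mean-centering and unit variance scaling) can be sketched as follows. This is a hedged illustration using only the Python standard library: the measure names, the toy values, and the choice to reverse direction by simple negation are assumptions made here, since the source does not state how the direction was adjusted, and the ladder-of-powers transformation is omitted.

```python
import statistics

def autoscale(columns, lower_is_better=()):
    """Flip, mean-center, and scale each measure to unit variance.

    columns: dict mapping a measure name to a list of raw scores.
    lower_is_better: names of measures where a low raw score is good
    (e.g. completion times); these are negated first (an assumption).
    """
    scaled = {}
    for name, values in columns.items():
        if name in lower_is_better:
            values = [-v for v in values]
        mean = statistics.fmean(values)
        sd = statistics.stdev(values)  # sample standard deviation
        scaled[name] = [(v - mean) / sd for v in values]
    return scaled

# Hypothetical raw scores for four participants.
raw = {
    "tmt4_seconds": [60.0, 75.0, 90.0, 120.0],       # lower is better
    "symbol_search_correct": [30.0, 35.0, 28.0, 40.0],  # higher is better
}

scaled = autoscale(raw, lower_is_better={"tmt4_seconds"})
for name, values in scaled.items():
    print(name, [round(v, 2) for v in values])
```

After this step every measure has mean 0 and standard deviation 1, so no single test dominates the model merely because of its measurement unit, and the fastest Trail Making time receives the highest scaled score.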
The order of presentation of the individual neuropsychological measures in Table 2 reflects how well they contributed to the separation between the groups (i.e., the size of the OPLS-DA loadings). The tests with the strongest class discriminating ability were Trail Making Test 2 (number sequencing), Trail Making Test 3 (letter sequencing), Trail Making Test 4 (number-letter switching), Symbol Search, Verbal Fluency 2 (category fluency), Verbal Fluency 3 (category switching), Color-Word 3 (inhibition) and Color-Word 4 (inhibition/switching). Table 2 also presents the mean and 95% confidence interval (CI) for each measure. There was little CI overlap between bipolar patients and healthy controls on the top-loading measures, suggesting reliable case-control differences. By contrast, the CIs of the two patient groups (bipolar I and II) intersected in the vast majority of cases. Table 2 also shows that the proportion of patients whose performance fell 1.25 standard deviations below controls was sizeable on the top-loading tests. For instance, 48% of patients with bipolar I disorder and 43% of patients with bipolar II disorder fulfilled this criterion for impairment on Trail Making Test 4.
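The impairment criterion can be illustrated with a short calculation. The sketch below assumes that the cut-off is the control group's sample mean minus 1.25 control standard deviations, that scores are oriented so that higher is better, and that a score strictly below the cut-off counts as impaired; the function name and the toy scores are hypothetical and do not come from the study data.

```python
import statistics

def proportion_impaired(patient_scores, control_scores, cutoff_sd=1.25):
    """Share of patients scoring more than cutoff_sd control SDs below the control mean."""
    control_mean = statistics.fmean(control_scores)
    control_sd = statistics.stdev(control_scores)  # sample standard deviation
    threshold = control_mean - cutoff_sd * control_sd
    impaired = sum(1 for score in patient_scores if score < threshold)
    return impaired / len(patient_scores)

# Hypothetical scores on one test (higher = better performance).
controls = [50.0, 55.0, 45.0, 60.0, 40.0, 50.0]
patients = [38.0, 52.0, 40.0, 47.0, 35.0]

share = proportion_impaired(patients, controls)
print(f"{share:.0%} of patients fall below the cut-off")
```

With these toy numbers the control mean is 50 and the control SD is about 7.07, giving a cut-off near 41.2, so three of the five hypothetical patients meet the criterion.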
A series of PLS models was created using patient data in order to assess the extent to which clinical and psychosocial variables could be understood in terms of cognitive performance. Regressing MADRS and YMRS scores separately against cognition yielded non-significant models, suggesting that the neuropsychological performance was not related to mood within the present scale interval (i.e., MADRS/YMRS scores below 14). Likewise, non-significant models were obtained when cognitive performance was regressed against GAF and CGI scores, the number of affective episodes, ADHD co-morbidity, a history of psychosis, or number of sick-leave days. Non-significant models were also obtained when regressing ongoing drug treatment (antidepressants, antipsychotic medication expressed as CPZ equivalents, lithium, and anticonvulsants) against cognitive performance in patients.

Discussion
Neuropsychological test batteries generate large datasets with many variables that are challenging to summarize and evaluate using traditional statistical tools, which are primarily designed to analyze 'long-and-lean' data tables [13]. Some of the between-subject variation in performance on a broad cognitive battery is shared by all tests; the performance on a single test is therefore related to the latent mental ability required to perform all tests [27]. Statistical tools developed to deal with large datasets with inter-correlated measures in other research areas may therefore be useful for the study of cognition in mental illness. Here, we employed OPLS-DA to sift the results from 21 neuropsychological tests, yielding 47 performance measures, in patients with bipolar disorder and healthy controls, the aim being to identify specific tasks that detect cognitive weaknesses in the patient group.
As applied here, OPLS-DA partitioned the variation in neuropsychological performance into two components, one being associated with group membership and the other irrelevant. When examining the explained variance in more detail it was apparent that only measures from nine tests defined the class separation: Trail Making Test, Verbal/Design Fluency, Color-Word, Tower Test, Symbol Search, Block Design, Letter-Number Sequencing, and Symbol Coding. The remaining 12 tests, including measures of semantic knowledge, verbal learning and some tests of working memory/attention, did not tell the groups apart.
The combination of nine identified tests may hence be considered for identifying cognitive impairment in bipolar patients in research, and possibly in a clinical setting as well. The set of tests identified here is, with some exceptions, the same as, or similar to, the set recommended for inclusion in a battery for cognition in bipolar disorder by the International Society for Bipolar Disorders (ISBD) on the basis of meta-analyses [12]. For instance, the usefulness of the Trail Making Test A, Letter-Number Sequencing, Category Fluency and Digit Symbol Coding was confirmed in our study. Moreover, our analysis revealed the importance of TMT 4 (similar to TMT B) and the Color-Word Interference Test, which is in line with the results of a recent evaluation of the clinical efficacy of the tests recommended by ISBD [28].
However, some of the tests proposed by Yatham et al. [12] received relatively low loadings in the present study and did not differentiate between patients and controls. These included tests for attention/vigilance (Continuous Performance Test), verbal learning and memory (Claeson-Dahl Verbal Learning and Retention Test), and visual learning (most measures of the Rey Complex Figure Test). Our analysis suggests that some of the tests proposed by Yatham et al. [12] might be redundant, being unable to differentiate between patients and controls. It could also be that patients with problems in these domains constitute a cognitive subgroup that was insufficiently represented to reach statistical significance in the present sample.
It should be noted that the tests used in the present study are diagnostically non-specific. For example, deficits on the Trail Making Test have been observed in conditions as disparate as posttraumatic stress disorder [29] and myotonic dystrophy type 1 [30]. Nevertheless, the tests do go some way in directing researchers' attention to particular cognitive domains such as, in the present case, searching, fluency, switching and inhibition, i.e., processes with a distinct 'prefrontal' flavour, which could be specified further using more fine-grained cognitive instruments.
As to the question whether the cognitive impairment in bipolar disorder is clinically significant, inspection of the group means and their CIs for the nine discriminatory tests confirmed their usefulness for identifying impairments in the patient groups. For a substantial minority of the patients the impairments approached clinical significance (as defined by a performance 1.25 SD below controls) on certain tests (e.g., for >40% on Trail Making Test 4 and Block Design tests). Such findings may be rooted in the fact that bipolar disorder is associated with palpable changes in brain structure, such as ventricular enlargement and shrinkage of medial temporal lobe areas [31]. A reduction in temporal lobe gray matter has been associated with decline in intellectual function and with numbers of mood episodes in patients with bipolar disorder [32]. However, we emphasize that the OPLS-DA-aided group separation was partial and incomplete, with non-trivial overlap in the overall cognitive performance between patients and controls (see Fig. 1). Hence, the observed group differences could not be used for predicting diagnoses on an individual level.
We investigated whether the subtypes of bipolar disorder differ in neuropsychological terms but found that patients with bipolar I disorder were indistinguishable from patients with bipolar II disorder on 16 of the 18 cognitive measures (89%) with discriminant loadings 0.15. This corresponds to results from our own group, in which measures of verbal and visual memory and executive function were used to test this hypothesis with univariate statistics in the same cohort, and to the findings of a previous study by Dittman et al. [5], but contrasts with results showing that bipolar I patients have more severe cognitive dysfunction than patients with bipolar II [7][8][9]. By and large, the cognitive impairments seen in patients with bipolar I and bipolar II disorder are hard to differentiate and appear to have much in common [5,6].
Bipolar disorder can take on psychotic manifestations requiring antipsychotic medication, and is commonly accompanied by excessive alcohol/drug use [33] and psychosocial and occupational problems [34]. Difficulties of this nature were also observed in the present patient group (see Table 1) and might conceivably have contributed to the observed cognitive deficits [35,36]. Our attempts to model these relationships showed that medication alone could not explain the variance in cognitive performance. This was somewhat surprising given that our own group has shown that treatment with antipsychotics was associated with worse performance on the time-to-draw parameter of the Rey Complex Figure Test, on the number sequencing, letter sequencing and number-letter switching conditions of the Trail Making Test, and on all trials of the Verbal Fluency Test [6]. An explanation might be that the multivariate modelling creates a new summarizing variable, which captures a latent structure in cognitive performance and is somewhat different from the original variables. However, the results of the current study are in line with a re-analysis of earlier studies suggesting that most neuropsychological tests do not exhibit any significant association with ongoing pharmacological treatment [10].
As to the other clinical and psychosocial background factors (viz. co-morbidity with ADHD, substance/alcohol abuse, GAF scores, number of affective episodes or sick-leave days), the associations were not stable enough to reach statistical significance. The modest associations with everyday occupational and social factors may suggest that the tests used here are of questionable ecological validity [37]. Moreover, a meta-analysis investigating the association between cognitive ability and everyday functioning in bipolar disorder showed results similar to those seen in schizophrenia, with small differences across cognitive domains [38]. The strength of association differed to a greater extent with respect to the functional measurement approach. Nevertheless, the lack of associations in the current study is consistent with a recent re-analysis of studies in the field showing that the majority of cognitive measures were not associated with measures of illness severity [10].
By including patients scoring between 0 and 13 on the MADRS/YMRS scales, the criterion for defining euthymia in the present study was more liberal than in earlier studies [1]. This raises the possibility that residual mood symptoms significantly affected performance in the cognitive tasks [39]. However, the shared variance between the mood scale values and overall cognitive performance was not sufficiently stable to generate a valid statistical model. In line with previous studies [2,40,41], we conclude that mood has limited value in explaining the cognitive impairments in euthymic bipolar patients.
A strength of the current study is that the patients are representative of bipolar patients who receive psychiatric care. At the time of enrollment, virtually all patients with bipolar disorder in the Northern Stockholm catchment area were referred to the Affective unit for work-up and treatment. Moreover, the diagnoses were made with a best-estimate procedure by experienced clinicians specialized in bipolar disorder. The neuropsychological test battery was administered under standardized conditions. Controls were randomly selected from the population in the same catchment area and matched for sex and age rather than being university students or health care workers. A limitation to consider, however, is that information on drug use was not collected on the day of testing, but from either the baseline examination or the day of blood sampling. On the other hand, enrolled patients were in a stable euthymic mood and the overwhelming majority of patients would not have added or discontinued medication between these occasions.