Delirium in hospitalized elderly patients is associated with longer hospital stays, increased likelihood of institutionalization and re-hospitalization, increased mortality rates, and cognitive and functional decline.1–3 However, delirium is often under-recognized by physicians and other hospital staff.4–6 Elderly hospitalized patients are at high risk for delirium, with incidence outside the intensive care unit ranging from 14 % to 56 %.1,2,7 Furthermore, the prevalence of delirium superimposed on dementia ranges from 22 % to 89 %,8 creating additional challenges for the recognition of delirium. Given the high rates of delirium in the aging hospitalized population, improving its detection with a brief and reliable tool is essential for enabling proper treatment strategies and mitigating its negative consequences.9,10

To help physicians detect delirium, in 1990 Inouye and colleagues developed the Confusion Assessment Method (CAM),11 and a 2010 review demonstrated that this diagnostic algorithm remains the best screening tool for delirium.12 The CAM algorithm assesses the presence or absence of four diagnostic features of delirium: 1) acute change and fluctuating course, 2) inattention, 3) disorganized thinking, and 4) altered level of consciousness. A diagnosis of delirium requires the presence of features 1 and 2 and either 3 or 4.11 The initial validation studies, and subsequent prospective studies performed by the CAM developers and other experienced research groups, operationalized the CAM algorithm after formal cognitive testing, yielding sensitivity of 94 % and specificity of 89 % relative to a reference standard.11–13 However, studies that have operationalized the CAM using observations from clinical care have yielded sensitivity as low as 30 %.14–16
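The CAM decision rule described above reduces to a single Boolean expression. The following Python sketch is illustrative only: the function and argument names are assumptions made for this example and are not part of the published instrument.

```python
def cam_positive(acute_or_fluctuating: bool,
                 inattention: bool,
                 disorganized_thinking: bool,
                 altered_consciousness: bool) -> bool:
    """Return True when the CAM algorithm is met.

    Requires features 1 AND 2, plus either feature 3 OR feature 4.
    """
    return (acute_or_fluctuating and inattention
            and (disorganized_thinking or altered_consciousness))


# Features 1, 2 and 4 present, feature 3 absent: the algorithm is met.
print(cam_positive(True, True, False, True))   # True
# Feature 2 (inattention) absent: not met, whatever else is present.
print(cam_positive(True, False, True, True))   # False
```

Because features 1 and 2 are both mandatory, an instrument that under-detects either one cannot return a positive result, regardless of how well it captures features 3 and 4.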

To improve the diagnostic accuracy and reliability of the CAM, brief standardized assessments have been developed that map specific test items to CAM features. The Confusion Assessment Method for the Intensive Care Unit (CAM-ICU) is one such assessment, and was developed specifically for non-verbal patients in the intensive care unit.17 The CAM-ICU has been validated for the detection of delirium in mechanically ventilated patients, a population in which it has shown sensitivity of 96 % and specificity of 98 %.17,18 Due to its brevity and simplicity, the CAM-ICU is increasingly being used for detecting delirium in verbal patients outside the ICU setting.

The newly developed 3-minute diagnostic assessment for delirium using the Confusion Assessment Method (3D-CAM), like the CAM-ICU, was developed from the CAM.11 Unlike the CAM-ICU, the 3D-CAM was developed using items specifically designed for a general medicine setting, in which most patients can undergo verbal assessment. The 3D-CAM has also been validated, and has shown 95 % sensitivity and 94 % specificity relative to a reference standard.19

The primary aim of this comparative effectiveness study was to compare the diagnostic test characteristics of the CAM-ICU to the 3D-CAM in relation to a reference standard in hospitalized elderly medicine patients outside the ICU setting. We hypothesized that the 3D-CAM would demonstrate better sensitivity for delirium in this patient population. Secondary aims were to understand why these two assessments might differ in their test characteristics by performing subgroup analyses comparing the test characteristics in patients with normal baseline cognition or mild cognitive impairment (MCI) with those of patients with dementia, and between patients with mild versus moderate to severe delirium.

METHODS

Study Population

Experienced clinicians (clinical psychologists and advanced practice nurses) screened for eligibility from a list of patients admitted to a large teaching hospital in Boston, MA. Inclusion criteria were as follows: 1) age ≥ 75 years and admitted to general or geriatric medicine inpatient services (designed to enroll a purposefully challenging population with a high prevalence of dementia), 2) able to communicate effectively in English, 3) without imminently terminal conditions, 4) expected hospital stay of ≥ 2 days, and 5) not a previous study participant. After approval was obtained from the attending physician, each eligible patient was approached for informed consent, which was obtained for all participants, either from the patient or a designated surrogate decision-maker if the patient lacked capacity. The study was approved by the institutional review board.

Reference Standard Assessment

The reference standard delirium diagnosis was based on an extensive (45-min) face-to-face patient interview, medical record review, and input from the patient’s nurse and available family members, all performed by experienced clinicians. This assessment included 1) the reason for hospital admission and hospital course; 2) family, social, and functional history; 3) Montreal Cognitive Assessment (MoCA), a validated 30-item cognitive assessment that takes approximately 20 min to administer;20 4) assessment for depression, including the Geriatric Depression Scale-Short Form (GDS-SF);21 and 5) medical record review, including a list of psychoactive medications being administered and a list of comorbidities quantified using the Charlson index.22 If the patient screened below the threshold for cognitive impairment on the MoCA (≤23), the clinical assessor conducted a proxy interview to assist in determining the patient’s baseline mental status. This included a validated proxy-based screening questionnaire for the presence of baseline dementia, the Alzheimer’s Disease-8 (AD-8).23 The final delirium diagnoses were adjudicated by a study panel, including the clinical assessor, study principal investigator (ERM), and co-investigator (MO). The presence or absence of delirium was determined based on DSM-IV criteria24 (DSM-5 was not yet published at the time of the study). The panel also adjudicated the presence or absence of cognitive impairment at baseline, including dementia or mild cognitive impairment (MCI),25,26 and the presence and severity of depression. A blinded geropsychiatrist subsequently re-adjudicated 20 cases (10 randomly selected with delirium and 10 without delirium) to verify the panel adjudication process. There was perfect agreement between the panel and the geropsychiatrist.

Brief Delirium Identification Assessments

The reference standard assessment was always administered first. Afterwards, the 3D-CAM and CAM-ICU were administered in random order by two additional raters who were blinded to the results of the other assessments. The screening assessments were completed by bachelor's-prepared trained research assistants who each received one-on-one training by an expert before the start of the study. All three assessments were completed within a 2-h period (Fig. 1).

Figure 1

Study Recruitment and Design. Here the overall study flow is depicted. Eligible patients aged 75 or older admitted to the general medicine service of a large teaching hospital were approached for consent, and 201 patients were enrolled. Of these, 101 participated in this study comparing the 3D-CAM and CAM-ICU. All patients underwent a reference standard delirium assessment by an experienced clinician. Within 2 h of this assessment, they underwent two additional brief assessments, the 3D-CAM and the CAM-ICU, administered in random order, which were performed by trained research assistants blinded to the reference standard assessment and to each other's results.

CAM-ICU

The CAM-ICU assesses the four CAM diagnostic features of delirium using questions that do not require verbal responses. Feature 1 is assessed using the Richmond Agitation and Sedation Scale (RASS).27 If there is a fluctuation on the sedation scale or evidence of acute change, the feature would be considered present. Feature 2 is assessed using the Attention Screening Examination (ASE), involving both the Vigilance A task and picture recognition.28 Feature 3 is assessed based on the patient’s ability to correctly answer two yes-or-no questions and to follow basic commands. Feature 4 is assessed using the RASS, with values other than 0 considered abnormal. After the presence or absence of each CAM feature is determined, the CAM diagnostic algorithm is used to determine the presence or absence of delirium.17,18
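The two RASS-based features can be sketched in a few lines of Python. This is a hedged illustration, not the CAM-ICU scoring manual: the function names are hypothetical, and treating "fluctuation" as more than one distinct RASS value within the observation window is an assumption made for this example.

```python
from typing import Sequence


def camicu_feature1(rass_scores: Sequence[int], acute_change_evidence: bool) -> bool:
    """Feature 1: fluctuation on the sedation scale OR evidence of acute change.

    Assumption: fluctuation means the recorded RASS values are not all equal.
    """
    fluctuated = len(set(rass_scores)) > 1
    return fluctuated or acute_change_evidence


def camicu_feature4(current_rass: int) -> bool:
    """Feature 4: any current RASS value other than 0 is considered abnormal."""
    return current_rass != 0


# Stable RASS of 0 with no evidence of acute change: neither feature is present.
print(camicu_feature1([0, 0, 0], False), camicu_feature4(0))   # False False
# RASS moved between 0 and -1, currently -1: both features are present.
print(camicu_feature1([0, -1], False), camicu_feature4(-1))    # True True
```

Features 2 and 3 are left out because the text does not give their scoring thresholds.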

3D-CAM

The 3D-CAM operationalizes the four CAM diagnostic features by asking direct questions of the participant (including cognitive testing and patient symptom probes), and then prompts the interviewer to rate ten observational items, including observations of level of consciousness.19 Specific questions are linked to specific CAM features. Feature 1 is assessed through several patient symptom probes (e.g. “Have you felt confused today?”). Feature 2 is assessed using digit-span tasks of three and four digits backwards, and days of the week and months of the year backwards. Feature 3 is assessed using three orientation items and interviewer observations, and feature 4 is assessed by interviewer observation. The presence of any “positive” response (incorrect/no response on a cognitive test, a positive report of a symptom by the patient, or a positive observation by the interviewer) results in the feature being considered present. Similar to the CAM-ICU, once the presence or absence of each CAM feature is determined, the CAM diagnostic algorithm is used to determine the presence or absence of delirium. More details about the 3D-CAM are provided elsewhere.19
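The "any positive response" rule and the subsequent CAM algorithm can be sketched as follows. The item lists in the example are invented placeholders (they are not the actual 3D-CAM items or item counts), and all function names are assumptions made for illustration.

```python
from typing import Dict, List


def feature_present(item_results: List[bool]) -> bool:
    """A feature is present if ANY item linked to it is positive.

    A positive item is an incorrect/no response on a cognitive test, a symptom
    reported by the patient, or a positive interviewer observation.
    """
    return any(item_results)


def three_d_cam_positive(items_by_feature: Dict[int, List[bool]]) -> bool:
    """Score each CAM feature from its items, then apply the CAM algorithm."""
    present = {f: feature_present(items_by_feature.get(f, [])) for f in (1, 2, 3, 4)}
    return present[1] and present[2] and (present[3] or present[4])


# Hypothetical responses: one positive symptom probe (feature 1) and one failed
# attention item (feature 2), but nothing positive for features 3 or 4.
responses = {
    1: [True, False],
    2: [False, True, False, False],
    3: [False, False, False],
    4: [False],
}
print(three_d_cam_positive(responses))   # False

# A single positive level-of-consciousness observation completes the algorithm.
responses[4] = [True]
print(three_d_cam_positive(responses))   # True
```

Scoring a feature as present on any single positive item favors sensitivity over specificity, which is consistent with the instrument's role as a case identification tool.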

Statistical Analyses

Diagnostic test characteristics (sensitivity and specificity) were calculated separately for the 3D-CAM and CAM-ICU in relation to the clinical reference standard. Subset analyses were performed to determine diagnostic test characteristics of these two assessments, stratified by the patient’s baseline cognition (normal/MCI vs. dementia). Finally, to better understand the differences in 3D-CAM and CAM-ICU performance, delirium cases (based on the reference standard) were stratified by severity (mild vs. moderate/severe), and the sensitivity of both instruments was calculated in these two strata. All data analyses were performed using SAS statistical software, version 9.3 (SAS Institute, Inc., Cary, NC, USA).
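As a minimal sketch of the calculation described above (written in Python rather than the SAS used in the study), sensitivity and specificity can be computed from paired per-patient results. The function name and the toy data are assumptions for illustration and are not study data.

```python
from typing import Sequence, Tuple


def sensitivity_specificity(index_test: Sequence[bool],
                            reference: Sequence[bool]) -> Tuple[float, float]:
    """Compute sensitivity and specificity of an index test against a reference.

    Both sequences hold one Boolean per patient (True = delirium present).
    """
    if len(index_test) != len(reference):
        raise ValueError("index_test and reference must have the same length")
    pairs = list(zip(index_test, reference))
    tp = sum(1 for t, r in pairs if t and r)          # true positives
    fn = sum(1 for t, r in pairs if not t and r)      # false negatives
    tn = sum(1 for t, r in pairs if not t and not r)  # true negatives
    fp = sum(1 for t, r in pairs if t and not r)      # false positives
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one reference-positive and one reference-negative patient")
    return tp / (tp + fn), tn / (tn + fp)


# Toy data only: 2 reference-positive and 3 reference-negative patients.
reference = [True, True, False, False, False]
index_test = [True, False, False, False, True]
sens, spec = sensitivity_specificity(index_test, reference)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")  # sensitivity=0.50, specificity=0.67
```

The stratified analyses apply the same calculation within each subgroup, for example after filtering patients by baseline cognition.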

RESULTS

Patient Characteristics

A total of 101 patients met the inclusion criteria and provided informed consent. Their mean age [standard deviation (SD)] was 84 (5.5) years, and 61 % were women (Table 1). In addition to very old age, this population had a high comorbidity burden, with 52 % having a Charlson score of 3 or higher. There was also a high burden of cognitive and depressive symptoms. Based on the reference standard assessment, 26 % of the population had dementia and 24 % had depression (Table 1).

Table 1 Population Characteristics, Comorbidities, Baseline Cognitive Status, Delirium Rates, and Duration of Assessments

Time Required for the Brief Delirium Identification Assessments

Across the 101 participants, the 3D-CAM was completed in a median of 3 min (interquartile range 3–5 min), while the CAM-ICU was completed in a median of 4 min (interquartile range 3–5 min). Evaluations of patients with delirium and/or dementia took longer (median 5 min for both instruments) than those of patients with neither of these conditions (median 3 min for both instruments).

Delirium Prevalence and Diagnostic Test Characteristics

Among participants, 19 patients (19 %) were diagnosed as having delirium based on the reference standard clinical assessment. The 3D-CAM identified 24 patients (24 %) with delirium, while the CAM-ICU identified 10 (10 %) (Table 1).

The diagnostic test characteristics of the 3D-CAM and CAM-ICU instruments were determined using the reference standard delirium diagnosis. The sensitivity [95 % confidence interval] of the 3D-CAM was 95 % [74 %, 100 %], which was substantially higher than that of the CAM-ICU, at 53 % [29 %, 76 %]. However, the specificity of the 3D-CAM was slightly lower than that of the CAM-ICU (93 % [85 %, 97 %] vs. 100 % [96 %, 100 %], respectively) (Table 2). Notably, of the six 3D-CAM false-positive diagnoses, four were adjudicated to have subsyndromal delirium by the reference standard.
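The reported percentages can be checked against the counts given in the text. The Python sketch below derives the 2×2 counts from the figures above and computes exact binomial (Clopper-Pearson) intervals. The text does not state which interval method was used, so the exact method is an assumption, and the helper names are hypothetical.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))


def _solve_increasing(f, target: float, iters: int = 60) -> float:
    """Bisection on [0, 1] for an increasing function f with f(p) = target."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_ci(x: int, n: int, alpha: float = 0.05):
    """Clopper-Pearson interval for x successes out of n (assumed method)."""
    # Lower bound: p such that P(X >= x | p) = alpha / 2.
    lower = 0.0 if x == 0 else _solve_increasing(
        lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2)
    # Upper bound: p such that P(X <= x | p) = alpha / 2.
    upper = 1.0 if x == n else _solve_increasing(
        lambda p: 1 - binom_cdf(x, n, p), 1 - alpha / 2)
    return lower, upper


# Counts derived from the text: 101 patients, 19 with reference-standard delirium.
n_delirium, n_no_delirium = 19, 101 - 19
tp_3d = 24 - 6      # 24 3D-CAM positives minus 6 false positives
tn_3d = n_no_delirium - 6
tp_icu = 10         # 100 % specificity implies all 10 CAM-ICU positives are true
tn_icu = n_no_delirium

for label, x, n in [("3D-CAM sensitivity", tp_3d, n_delirium),
                    ("3D-CAM specificity", tn_3d, n_no_delirium),
                    ("CAM-ICU sensitivity", tp_icu, n_delirium),
                    ("CAM-ICU specificity", tn_icu, n_no_delirium)]:
    low, high = exact_ci(x, n)
    print(f"{label}: {x}/{n} = {x / n:.1%}, 95% CI [{low:.1%}, {high:.1%}]")
```

Under these assumptions, the point estimates are 18/19, 76/82, 10/19, and 82/82, which round to the reported 95 %, 93 %, 53 %, and 100 %.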

Table 2 Diagnostic Test Characteristics of the 3D-CAM and CAM-ICU Compared to the Clinical Reference Standard Delirium Assessment

Subgroup Analysis: Baseline Cognition

Subgroup analyses based on baseline cognition (normal/MCI vs dementia) are summarized in Table 3. The 3D-CAM performed nearly perfectly in patients with normal baseline cognition or MCI, with sensitivity of 100 % and specificity of 96 %. In patients with dementia, the sensitivity and specificity of the 3D-CAM declined modestly, to 92 % and 77 %, respectively. In contrast, the CAM-ICU had substantially lower sensitivity in both patients with normal baseline cognition or MCI (33 %) and patients with dementia (62 %). Specificity for CAM-ICU was 100 % in both groups.

Table 3 Comparison of Test Characteristics for 3D-CAM and CAM-ICU Stratified by Baseline Cognition (Normal/MCI vs. Dementia)

Subgroup Analysis: Delirium Severity

To better understand the reason behind the differences in sensitivity between the 3D-CAM and CAM-ICU, we stratified the 19 cases of reference standard-identified delirium based on severity (Table 4). We found that the primary difference in sensitivity between the 3D-CAM and CAM-ICU was in cases of mild delirium. In these patients, the 3D-CAM had sensitivity of 100 %, while the CAM-ICU had sensitivity of 30 %. The two instruments had similar sensitivity in cases of moderate/severe delirium.

Table 4 Comparison of Sensitivity of the 3D-CAM and CAM-ICU in Mild vs. Moderate/Severe Delirium Cases Based on the Reference Standard

DISCUSSION

We evaluated the diagnostic accuracy of two brief structured instruments for delirium identification that operationalize the CAM algorithm, the well-established CAM-ICU,17,18 and the recently validated 3D-CAM19 in a group of 101 older hospitalized general medicine patients. We found the 3D-CAM had substantially better sensitivity with only slightly worse specificity than the CAM-ICU. These findings were consistent in patients both with and without baseline dementia. The sensitivity of the CAM-ICU was particularly poor in patients with mild delirium. The length of time required to administer the assessments was similar between the two instruments. Because of its brevity and high sensitivity, with good specificity, the 3D-CAM may be a superior brief screening instrument for delirium detection among hospitalized general medicine patients.

In performing a reference standard assessment, maximizing both sensitivity and specificity is critical, since accuracy is paramount. However, in a brief case identification tool, maximizing sensitivity takes priority, because the consequences of false negatives (missing cases of delirium) outweigh the consequences of false positives (unnecessary clinical evaluation). Furthermore, we found that two-thirds of the 3D-CAM “false positives,” in fact, were in patients with subsyndromal delirium, which itself confers negative prognostic consequences.29 In this study, both the 3D-CAM and CAM-ICU demonstrated high specificity, but sensitivity in the CAM-ICU was inadequate for case identification, missing nearly half of the reference standard delirium cases.

Recent studies have demonstrated that the CAM-ICU has low sensitivity in verbal patients. In postoperative and oncology patients, the CAM-ICU demonstrated sensitivity as low as 18 % and 28 %, respectively.30,31 Other studies performed in more acute populations have yielded better results. Among verbal patients in the ICU, the CAM-ICU achieved sensitivity of 73 %,32 while in older patients presenting to the emergency department, the CAM-ICU yielded sensitivity of 68–72 %.33 A systematic review with critically ill patients found pooled sensitivity for the CAM-ICU of 75.5 %.34

Our subgroup analysis stratifying delirium cases by severity sheds light on these discrepancies. We found that the 3D-CAM and CAM-ICU had similar sensitivity in patients with moderate or severe delirium, but that the 3D-CAM performed much better in patients with mild delirium. Thus, the CAM-ICU appears well-titrated to pick up cases of delirium in the ICU, which tend to be more severe, while the 3D-CAM is superior in detecting the milder cases of delirium found on the wards.

Another aim of the current study was to determine whether these brief standardized screening tools would be useful for detecting delirium even when there was underlying dementia. We found that the test characteristics of the instruments in both the normal baseline cognition/MCI and dementia groups mirrored the overall sample. The 3D-CAM was much more sensitive than the CAM-ICU in both cognitive strata, while the CAM-ICU was somewhat more specific than the 3D-CAM, particularly in the dementia group. Reduced specificity in patients with dementia would be expected, since dementia makes delirium assessment more challenging, and may lead to more false positives.8

Overall, we found that the 3D-CAM and CAM-ICU required a similar length of time to administer, with delirious and/or demented patients taking about 2 min longer than normal patients (median of 5 vs. 3 min). Administration times for the CAM-ICU were longer than those reported in previous studies.18 This discrepancy can be explained by the fact that in the ICU, many patients have a profoundly abnormal level of consciousness (RASS score of −4 or −5), which terminates the CAM-ICU assessment before proceeding to the additional questions. In the general medicine population, it is rare for someone to score a −4 or −5 on the RASS, and thus our median time is more reflective of the time required to complete the entire interview.

To effectively integrate the 3D-CAM into clinical care, physicians or nurses will need to be trained in its use. A recent study found that implementation of a delirium case identification tool into daily nursing practice is achievable in the ICU setting.35 The CAM algorithm without formal cognitive testing has been integrated into clinical care in some settings, but has shown poor sensitivity compared to a reference standard.14,15 The 3D-CAM integrates cognitive testing, and is simpler and more algorithmic, which will allow for ease of training.

Strengths of our study include a design in which all delirium assessments were administered closely in time, while the results of each test were blinded from the other assessors. Furthermore, our study employed a clinical reference standard to compare all instruments. Lastly, the study focuses on a challenging population that is very old (mean age of 84 years) and has a high prevalence of dementia (26 %), allowing us to evaluate the performance of the two case identification tools (3D-CAM and CAM-ICU) for a large fraction of older patients commonly admitted to the general medicine service.

Our study also has potential limitations. First, due to the cross-sectional design, it does not address repeated administration of the 3D-CAM and CAM-ICU for case identification purposes, nor do we have an estimate of the cumulative incidence of delirium in our study population. Second, although we used well-trained interviewers, we did not assess inter-rater reliability. Third, the study focused only on elderly general medicine patients and was conducted at a single hospital. Also, our study population was largely white. Thus, our findings should be confirmed in other populations, including non-ICU surgical patients and those with greater diversity. Finally, our sample size was relatively small, which led to wide confidence intervals for some of our estimates of sensitivity and specificity, particularly in the stratified analyses.

In conclusion, in our study of very old, non-critically ill general medicine patients with a high prevalence of underlying dementia, the 3D-CAM performed better than the CAM-ICU as a case identification assessment for delirium. These findings do not diminish the utility of the CAM-ICU as a case identification tool in the ICU. However, our findings suggest that generalizing this instrument to general medicine patients where the spectrum of delirium is different may be suboptimal. Further research will focus on translating the 3D-CAM into the routine care of older general medicine patients, testing how well it performs when administered by clinicians, and whether improved detection of delirium can result in improved outcomes for vulnerable hospitalized elders.