INTRODUCTION

Screening for early detection and treatment of mental and substance use disorders in primary care settings can improve quality of life, help contain health care costs,1 and reduce complications from co-occurring behavioral health and medical comorbidities.2 Many national family medicine, internal medicine, pediatric, and obstetric organizations, as well as the U.S. Preventive Services Task Force (USPSTF),3 have released general recommendations for behavioral health disorder screening in primary care settings.4 The USPSTF recommends that adults be screened for depression, alcohol abuse, and drug abuse, and that primary care physicians (PCPs) ensure there is appropriate diagnostic follow-up available from behavioral health clinicians. However, rates of screening in community-based physician practices for common behavioral health conditions, such as depression, are low.5 There may be many reasons for these low rates, including behavioral health financing challenges and lack of adequate behavioral health infrastructure to ensure referral and diagnostic follow-up.

Value-based payment models support a more holistic approach to providing health care.6 Policy reforms have directly led to the promotion of standardized screening for behavioral health conditions in primary care settings. New payment models, such as the Medicare Shared Savings Program, require behavioral health screening to receive shared savings, thus providing incentives for organizations to invest in the necessary infrastructure changes. Recent reforms support USPSTF-recommended screening that promotes early identification and treatment; integrated care through accountable care organizations (ACOs), medical and health homes; payment reform; parity; and the move to electronic medical records. The Centers for Medicare & Medicaid Services (CMS) now requires that ACOs measure 12-month depression remission rates to meet quality performance standards for shared savings.7 Parallel trends in the provision and financing of clinical care have led to increased interest in identifying mental and substance use disorders in primary care practices. Although most practices currently use informal screening methods,1 it is more effective to identify behavioral health disorders using structured, validated instruments.3 , 4 , 8

Primary care practices need tools that are valid, reliable, brief, easy to administer, free, and easily accessible.9 Determining the number and type of conditions for screening may require data about the behavioral health needs of the patient population that the practice serves.10 Other considerations in selecting appropriate tools include clinical time constraints,10 , 11 workflow, and whether the instrument is provider-administered or self-administered. Tool selection also requires information about psychometric properties (i.e., validity and reliability). Lastly, screening alone does not improve outcomes; education, training, and clinical processes that promote early and effective treatment are also needed,12 along with resources for required diagnostic follow-up,10 , 12 as noted by the USPSTF.3

A wide range of screening tools are available for use in bundled screening, a process for simultaneously assessing multiple behavioral health disorders. Bundled screening can improve recognition of behavioral health problems that may not be immediately apparent during a clinical visit. There are two approaches to bundled screening: (1) administering a multiple-disorder tool designed to screen for more than one behavioral health condition together, or (2) administering several short, independent, single-disorder tools at the same time.

The aim of this review is to identify publicly available, psychometrically tested, short single-disorder and multiple-disorder tools that are appropriate for screening adults for behavioral health conditions most commonly encountered in primary care settings. The tools identified in this review are primarily for PCPs who (1) want to screen the patient population due to specific diagnostic risk factors, and (2) have or are considering the supportive behavioral health infrastructure as recommended by the USPSTF. This review also provides other information to help PCPs select tools that meet their needs, including information about applicability of the tools in primary care settings and information about their implementation across diverse populations.

METHODS

Search Strategy

We followed the Institute of Medicine (IOM) guidelines for conducting a systematic literature review.13 We identified relevant literature addressing behavioral health screening tools through a search of the PubMed, PsycINFO, Applied Social Sciences Index and Abstracts (ASSIA), Cumulative Index to Nursing and Allied Health Literature (CINAHL), and Health and Psychosocial Instruments (HaPI) databases. We selected abstracts that contained the following keywords in the article’s title or abstract: “primary care” and combinations of “screening,” “screening tools,” “instruments,” “assessment,” “alcohol,” “behavioral health disorder,” “behavioral medicine,” “anxiety,” “depression,” “emotional health,” “mental health,” “mental illness,” “mental disorders,” “substance use,” “substance abuse,” “substance-related use disorders,” and “suicide.” Filters were used to limit articles to those published in English from 2000 through 2015. The last search date was May 4, 2017.

Article Selection

We selected abstracts that assessed publicly available, nonproprietary tools that screen for anxiety, depression, and substance use disorders. Inclusion criteria were as follows: the instrument (1) had undergone psychometric testing, (2) targeted adults over age 18, and (3) had been studied in English in North America or western Europe. We excluded articles that were general overviews of behavioral health screening processes or studies of Screening, Brief Intervention, and Referral to Treatment intervention programs. Screening tools were also excluded if they (1) were global functioning or quality-of-life scales (without reference to specific behavioral health conditions); (2) were initially developed for use in research and not for use in clinical settings; (3) measured only cognitive impairments such as dementia or Alzheimer’s; (4) screened for conditions less likely to be treated exclusively in primary care, such as eating disorders and severe mental disorders including bipolar disorder or schizophrenia; or (5) were specifically developed for older patients outside general primary care clinics (e.g., inpatient settings, nursing homes). Additional articles not identified during the original literature search were hand-selected from references (see Fig. 1). Screening tools were classified as either (1) multiple-disorder tools (either subscales from longer multiple-disorder tools or individual tools that assessed more than one mental or substance use disorder in a single instrument) or (2) short, single-disorder tools that assessed only one mental disorder or one substance use disorder using five or fewer items.

Figure 1 Flow chart for article selection. *Reasons for exclusion include: proprietary or not publicly available (n = 28); not targeted BH condition (n = 11); developed for research (n = 3); not tested in English (n = 2); not tested in PC setting (n = 1); not a screener (n = 1); elderly focus or cognitive impairment (n = 3); global functioning (n = 1). BH, behavioral health; PC, primary care.

Data Abstraction

The primary criterion for judging the usefulness of screeners in this review was their psychometric properties. We abstracted information about each tool’s characteristics and utility for primary care practices, including the behavioral health condition(s) that the tool assessed and its sensitivity, specificity, and cut points for determining positive screens. We also recorded the population in which, and the reference measure against which, each tool’s psychometrics were calculated. For tools originally developed in primary care settings, we abstracted the psychometric information from the initial validity study. For tools originally developed in other medical settings, we abstracted information from the first study identified during the literature review in which the tool was tested in a primary care setting.

Our assessment of the psychometrics was based on each scale’s criterion validity, that is, the accuracy with which a scale determines the presence or absence of a disorder. We evaluated the strength of a tool’s validation using three criteria: (1) whether a strong gold standard (e.g., a clinical interview) was used, (2) whether the scale’s sensitivity and specificity had been tested in primary care settings, and (3) whether the scale’s sensitivity and specificity both exceeded 75% (considered good or excellent according to the generally accepted rule of thumb). Sensitivity is the proportion of individuals with the condition who are correctly identified as having it (true positives). Specificity is the proportion of individuals without the condition who are correctly identified as not having it (true negatives). Sensitivity and specificity vary according to the cut point used for the scale, the population being assessed, the setting, and the experience of the assessors.14 The optimal cut point of a scale is the threshold that best balances the two, jointly maximizing identification of those who have the disorder (sensitivity) and exclusion of those who do not (specificity).
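The definitions above can be made concrete with a short computational sketch. It is an illustration only and was not part of the review's methods: the scores and gold-standard labels are made up, the function names are hypothetical, and the rule used to pick a cut point (maximizing sensitivity plus specificity, often called Youden's index) is an assumed way of operationalizing "best balance".

```python
from typing import Iterable, Sequence, Tuple


def sensitivity_specificity(
    scores: Sequence[int], has_disorder: Sequence[bool], cut_point: int
) -> Tuple[float, float]:
    """Return (sensitivity, specificity), treating score >= cut_point as a positive screen."""
    tp = fp = tn = fn = 0
    for score, truth in zip(scores, has_disorder):
        positive = score >= cut_point
        if positive and truth:
            tp += 1  # has the disorder, screened positive
        elif positive and not truth:
            fp += 1  # no disorder, screened positive
        elif not positive and truth:
            fn += 1  # has the disorder, screened negative
        else:
            tn += 1  # no disorder, screened negative
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one case with and one without the disorder")
    return tp / (tp + fn), tn / (tn + fp)


def best_cut_point(
    scores: Sequence[int], has_disorder: Sequence[bool], candidates: Iterable[int]
) -> int:
    """Pick the candidate cut point with the largest sensitivity + specificity (assumed rule)."""
    def total(cut: int) -> float:
        sens, spec = sensitivity_specificity(scores, has_disorder, cut)
        return sens + spec
    return max(candidates, key=total)


# Made-up screener scores and gold-standard diagnoses for eight hypothetical patients.
scores = [0, 1, 2, 2, 3, 4, 5, 6]
has_disorder = [False, False, False, True, False, True, True, True]

sens, spec = sensitivity_specificity(scores, has_disorder, cut_point=3)
chosen = best_cut_point(scores, has_disorder, candidates=range(1, 7))
```

In this toy sample, lowering the cut point raises sensitivity at the expense of specificity and raising it does the opposite, which is why published sensitivity and specificity values are always tied to a stated cut point.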

To evaluate each tool’s utility in primary care settings, we assessed how long it took to administer the tool and whether it was a self-administered or provider-administered instrument. If information about time to completion was not cited in the articles, we used the number of items as a proxy by applying the following criteria: (1) 1–4 items represented ultra-short screening tools that took less than 2 min, (2) 5–14 items represented short 2–5-min screening tools, and (3) 15 or more items represented standard screening tools that took 5 min or longer.15
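The item-count proxy described above amounts to a simple lookup, sketched below. The function name and the label strings are hypothetical; the thresholds are the ones stated in the text.

```python
def administration_category(num_items: int) -> str:
    """Map a tool's item count to the time-to-complete category used as a proxy."""
    if num_items < 1:
        raise ValueError("a screening tool must have at least one item")
    if num_items <= 4:
        return "ultra-short (less than 2 min)"
    if num_items <= 14:
        return "short (2-5 min)"
    return "standard (5 min or longer)"


# Item counts for tools named in this review.
examples = {"PHQ-2": 2, "PHQ-9": 9, "PHQ-15": 15}
categories = {tool: administration_category(n) for tool, n in examples.items()}
```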

RESULTS

We identified 24 tools that screen for behavioral health disorders in the primary care setting—13 short instruments with five or fewer items, and 11 longer instruments. Of these 24 tools, eight were subscales or portions of subscales originally developed as part of the longer Patient Health Questionnaire (PHQ) and Patient Stress Questionnaire (PSQ) instruments.

Tools Derived from the PHQ and PSQ

Table 1 lists the eight PHQ and PSQ subscales that have been refined into tools that can be administered and scored separately. Because these subscales were developed together and tested as a unit, no overlap exists among their items; primary care practices can select and combine these tools to fit their needs without affecting the psychometric properties of the individual scales due to content overlap, although there may be context effects for the responses depending on which subscale is administered first.

Table 1 Screening Tools Derived From the Patient Health Questionnaire and Patient Stress Questionnaire – Appropriate for Screening for Multiple Mental and Substance Use Disorders

The PHQ screeners assess multiple mental and substance use disorders, such as depression in the nine-item PHQ (PHQ-9),19,20,21 somatoform disorders in the 15-item PHQ (PHQ-15),16,17 and anxiety disorders in the seven-item Generalized Anxiety Disorder (GAD-7) instrument.27 The PHQ also screens for alcohol use and eating disorders, but these scales are not promoted by the distributor for individual administration.41 The PSQ, however, includes the Alcohol Use Disorders Identification Test (AUDIT-10),34 a well-validated screening tool for assessing problems with alcohol use. The PHQ-9, PHQ-15, GAD-7, and AUDIT-10 are appropriate for administering either separately or together.

Initial testing of the psychometrics for the PHQ-9, PHQ-15, and GAD-7, which was assessed in relationship to clinical interviews, demonstrated good to excellent sensitivity and specificity across most relevant DSM-5 disorders. However, there were a few exceptions. The GAD-7 has only fair sensitivity for panic and social phobia and low sensitivity for post-traumatic stress disorder, meaning that some patients with these conditions will be missed. Also, the PHQ-15 has only fair specificity, meaning that some patients who should have screened out may screen in.

Three of these four tools (the PHQ-9, GAD-7, and AUDIT-10) have been adapted from the parent instruments into ultra-short screening tools: the PHQ-4,30 which includes two-item screeners for depression (PHQ-2)20,21,32 and anxiety (GAD-2),33 and the AUDIT-C, a three-item screener for alcohol problems.39 The PHQ-2, PHQ-4, and GAD-2 have sound psychometrics, but their sensitivity is lower than their specificity.

Although tools based on the PHQ and PSQ are limited in their ability to address substance use disorders (as the AUDIT only assesses for alcohol use disorders), they have strong psychometrics and applicability in primary care settings. A screening protocol for depression might first involve administering the PHQ-2, and then giving individuals who screen positive the more comprehensive PHQ-9 to confirm the positive screen, although studies have found that this often does not happen.42 A review of ultra-brief screens indicates that the PHQ-2 is as effective as more extensive instruments, and its brevity, sensitivity, and simple administration make it a suitable screening tool for depression, although only as a rule-out and only when resources for follow-up are available.9
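The stepped depression protocol described above can be sketched as a small decision function. This is a hypothetical illustration, not a clinical algorithm from this review: the function name and result strings are invented, and the cut points in the example (3 for the PHQ-2, 10 for the PHQ-9) are commonly cited values that are assumptions here rather than thresholds reported in this article. Practices should use the cut points validated for their own population.

```python
from typing import Optional


def two_step_depression_screen(
    phq2_score: int,
    phq9_score: Optional[int],
    phq2_cut: int,
    phq9_cut: int,
) -> str:
    """Apply a brief first-stage screen, then a longer second-stage screen if needed."""
    if phq2_score < phq2_cut:
        # Negative on the ultra-short screen: no second stage.
        return "negative on PHQ-2"
    if phq9_score is None:
        # Positive first stage without the second stage: the gap the text warns about.
        return "PHQ-2 positive: administer PHQ-9"
    if phq9_score >= phq9_cut:
        return "PHQ-9 positive: arrange diagnostic follow-up"
    return "PHQ-9 negative"


# Assumed, commonly cited cut points (not taken from this review).
ASSUMED_PHQ2_CUT = 3
ASSUMED_PHQ9_CUT = 10

pending = two_step_depression_screen(4, None, ASSUMED_PHQ2_CUT, ASSUMED_PHQ9_CUT)
confirmed = two_step_depression_screen(4, 14, ASSUMED_PHQ2_CUT, ASSUMED_PHQ9_CUT)
```

Returning an explicit "administer PHQ-9" state, rather than treating a missing second stage as negative, makes skipped confirmatory screens visible in a practice's workflow.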

Additional Multiple-Disorder Screening Tools

We identified seven additional tools that screen for multiple behavioral health disorders (Table 2). Two assess mental disorders, and five assess substance use disorders.

Table 2 Multiple-Disorder Screening Tools Assessing Either Mental Disorders or Substance Use Disorders in a Single Instrument – Not Derived From the Patient Health Questionnaire and Patient Stress Questionnaire

The Hospital Anxiety and Depression Scale (HADS)43 and the Web-Based Depression and Anxiety Test (WB-DAT)45 screen for multiple mental disorders in a single tool, while identifying the specific disorder for follow-up. Both screen for depression and anxiety, which is efficient given how frequently these disorders co-occur, but both tools have mixed sensitivity and specificity. In addition, neither has been extensively tested in primary care settings. Evidence suggests that the HADS, which was developed for use with patients in hospital-based settings, does not perform as well in primary care settings.44 The WB-DAT is designed for web-based administration, so patients can be screened on a computer before seeing their provider.

Five tools screen for multiple substance use disorders in a single instrument. One tool, the Kreek-McHugh-Schluger-Kellogg (KMSK) scale,46 measures only the extent of the substance use. The other four tools also assess problems related to substance use: the Simple Screening Instrument for Substance Abuse Potential (SISAP),48,49 the Drug Abuse Screening Test (DAST-10),50,51 the Tobacco, Alcohol, Prescription Medication and Other Substance Use (TAPS) tool,53 and the Alcohol, Smoking, and Substance Involvement Screening Test (ASSIST).55 Of these tools, the DAST-10 and ASSIST had the strongest psychometrics and demonstrated widespread testing in primary care settings.

There was a trade-off between the length of these four screeners and their ability to determine the nature of the problem. For example, the 10-item DAST-10 only measures the consequences of drug use in general. In contrast, the ASSIST asks eight questions about 10 individual categories of drugs, which allows the clinician to identify specific substances causing impairment. However, the ASSIST is often difficult to incorporate into primary care settings because of the complexity of the scoring system.58 , 59

Mode of administration is an important consideration. Self-administered tools can encourage more honest disclosure about illegal drug use.60 Only the DAST-10 was originally designed for self-administration, although a self-administered version of the SISAP is currently available.

Additional Ultra-Short Screening Tools

We identified nine ultra-short screeners (5 or fewer items) that were not associated with the PHQ family of tools (Table 3). Three addressed mental disorders, and six addressed substance use disorders. Theoretically, these ultra-short tools can be combined to screen for more than one disorder.

Table 3 Ultra-Brief Single-Disorder Screening Tools

The three ultra-short mental disorder tools that screen for depression and anxiety were the Mental Health Inventory-5 (MHI-5),61 World Health Organization-Five Well-Being Index (WHO-5),63 and Brief Case-Find for Depression.65 The Brief Case-Find for Depression was originally developed for use in a medically ill patient population. In general, these measures do not have strong sensitivity and specificity, and none were superior to the PHQ-4 discussed above.

The CAGE,66 which only screens for alcohol-related disorders, has strong sensitivity and specificity but lower sensitivity among women at the traditional cut point of 2.67 The CAGE-AID70 , 71 screens for both alcohol and drugs, with wide ranges in sensitivity and specificity rates across cut points and populations.

One important step in choosing appropriate screening tools involves evaluating differences among instruments that screen for similar conditions. The CAGE and AUDIT-C both screen for alcohol problems, but the AUDIT-C screens for less severe alcohol problems such as at-risk, harmful, and hazardous drinking, whereas the CAGE screens for lifetime and current alcohol abuse or dependence.8 Tools that assess the same condition also may differ according to population. Lower thresholds are needed when screening women with the AUDIT-C,39 given that women metabolize alcohol differently from men.77

We identified four ultra-brief substance use disorder tools that assess alcohol and drug use with one to a few items (the Single Question Screening Test for Drug Use, the Single Alcohol Screening Question, the Two-Item Conjoint Screen, and the Fast Alcohol Screening Test).72,73,74,75,78 In general, we found strong psychometrics for these short screens. A challenge is that conjoint screeners do not distinguish between the specific type or severity of the alcohol or drug use disorder.70,78

DISCUSSION

This review synthesized the scientific literature and highlighted the psychometric properties of publicly available screening instruments for the most common behavioral health disorders seen in primary care. It identified 24 screening instruments, of which 13 were short screening tools with five or fewer items. The selection of tools by PCPs is a multifactorial process, and no one screening instrument fits all practices. PCPs must evaluate the psychometrics in conjunction with multiple factors, such as the most prevalent behavioral conditions, staffing resources (e.g., clinician time), reimbursement, quality measurement, availability of follow-up from behavioral health clinicians, and PCPs’ familiarity with behavioral health conditions. Primary care practices can use this review to inform their selection, taking all these factors into account.

Factors to Consider in Screening Tool Selection – Measurement

The issue of longer, multiple-disorder tools versus shorter, single-disorder scales for mental health screening is still actively debated.79 The primary disadvantage of multiple-disorder tools is that they can be long and difficult to implement. However, multiple-disorder tools (e.g., WB-DAT, DAST-10, ASSIST) may uncover behavioral health conditions and information about psychological functioning not detected by single-disorder tools.33 , 80

Shorter, single-disorder screening tools provide practices with the flexibility to assess conditions most often encountered and to combine screeners into a brief protocol tailored to specific needs. But single-disorder screening instruments—whether brief or comprehensive—that assess conditions in isolation may result in missed diagnoses of other disorders because of high rates of co-occurring behavioral health conditions.81 When combining multiple single-disorder tools, the ordering of the administration of the tools may also influence patient responses (context effects) and may affect the tools’ psychometric properties if they are not administered via the parent instrument (e.g., PHQ or PSQ).

Our review has several limitations. Although the inclusion criteria were sensitive enough to identify well-known screening tools such as the PHQ, our focus on screeners for common behavioral health disorders, and the fact that new screeners are constantly being developed, refined, and tested, means that we may have missed some tools. In addition, we did not include information from every published study about each tool. Hence, information about the tools’ psychometrics with particular subgroups may have been missed. Lastly, our review weighed breadth over depth. We did not assess the rigor of the psychometric testing for each study or address fine-tuned distinctions between screeners.

Factors to Consider in Screening Tool Selection – Considerations for Primary Care Practices

Selection of a screening tool requires consideration of the population the clinic serves. Primary care practices with multiple co-occurring behavioral health conditions in their patient population may wish to consider the screening tools derived from the PHQ and PSQ. Practices can combine these tools to assess for the most common conditions. The PHQ-9 is one of the few tools endorsed by the National Quality Forum82 for behavioral health screening. Its administration is reimbursed by Medicare and Medicaid, and some commercial insurance, though practices must always emphasize the need for diagnostic follow-up.83 Practices that serve a high proportion of seriously ill patients on an outpatient basis might consider tools that were originally developed for patients with co-occurring physical health disorders (e.g., HADS).

Selection of the type of screener (e.g., shorter vs. longer) requires weighing issues of time along with characteristics of the patient population. A screener with a web-based mode of administration such as the WB-DAT may be ideal for fast-paced clinics with staff shortages, but not for those serving patients with low literacy. Short conjoint screeners that do not distinguish between the specific type or severity of the alcohol or drug use disorder (e.g., CAGE-AID, Two-Item Conjoint Screen [TICS]) may be ideal for detecting substance use disorders in a general population clinic, where most patients will not screen positive. However, in clinics with a high proportion of patients with polysubstance use, PCPs may prefer to select screeners that separately assess for alcohol and specific drugs (e.g., ASSIST), to avoid administering multiple layers of screening before referral for diagnostic assessment.70,78

The psychometrics of a screening tool must be considered within the context of the patient population and follow-up resources available. Ultra-short screeners with strong specificity tend to function best in ruling out disorders.15 PCPs can be confident that patients who score negative on these screeners are true negatives and do not need follow-up. However, screeners with low specificity will yield a higher number of false positives, or may not provide enough information about specific disorders. Clinics that treat a generally healthy population, and where PCPs have sufficient time to follow up a highly sensitive test with a second, highly specific test that identifies false positives as disease-negative, may want to implement this type of protocol.

On the other hand, high sensitivity of a screen increases the likelihood of people with the disorder being correctly identified (true positives). Screens with high sensitivity are a good fit for practices with higher behavioral health needs in general (who may have many individuals with subthreshold disorders), and with the resources to rapidly follow up with a diagnostic assessment without a second round of screening.
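A short calculation can illustrate why the same screener behaves differently across patient populations. The sketch below is an illustration under stated assumptions rather than an analysis from this review: it uses a hypothetical screener with 80% sensitivity and 90% specificity, two made-up prevalence values, and the standard positive and negative predictive values (terms not used elsewhere in this article) to express how trustworthy a positive or negative screen is.

```python
from typing import Tuple


def predictive_values(
    sensitivity: float, specificity: float, prevalence: float
) -> Tuple[float, float]:
    """Return (positive predictive value, negative predictive value)."""
    for name, value in (("sensitivity", sensitivity), ("specificity", specificity)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    if not 0.0 < prevalence < 1.0:
        raise ValueError("prevalence must be strictly between 0 and 1")

    # Expected share of the screened population in each cell of the 2x2 table.
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)

    positives = true_pos + false_pos
    negatives = true_neg + false_neg
    ppv = true_pos / positives if positives > 0 else float("nan")
    npv = true_neg / negatives if negatives > 0 else float("nan")
    return ppv, npv


# Hypothetical screener in two hypothetical clinics.
low_prev = predictive_values(0.80, 0.90, prevalence=0.10)   # generally healthy population
high_prev = predictive_values(0.80, 0.90, prevalence=0.40)  # higher behavioral health needs
```

Under these assumed numbers, roughly 47% of positive screens in the low-prevalence clinic are true positives, while roughly 98% of negative screens are true negatives. In the higher-prevalence clinic, positive screens become more trustworthy and negative screens less so, which is consistent with matching the screening protocol to the population served and the follow-up resources available.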

It is important that providers have a plan for patients who screen positive.84 This follow-up could consist of providing patients with education and treatment within the primary care practice or referring them to a specialty provider. Practices with ready access to behavioral health clinicians to perform a diagnostic assessment might prefer to use tools that broadly assess multiple problems. Other practices might prefer to screen only for specific conditions or to build screening protocols slowly as their referral network expands.

In conclusion, many primary care practices may have physical health screening protocols in place, and need only to incorporate behavioral health tools into already-existing protocols. Some changes in primary care practice (e.g., training protocols, workflow adjustments) may still be necessary to implement an enhanced screening protocol that incorporates tools identified here. Effective recognition and management of behavioral health conditions is essential to success as the health care system evolves from paying for volume to paying for value. Future research will be needed on how to implement screening processes successfully and on which payment mechanisms work best to sustain these processes.