Psychometric evaluation of the NORC diagnostic screen for gambling problems (NODS) for the assessment of DSM-5 gambling disorder.

The National Opinion Research Center (NORC) Diagnostic Screen for Gambling Problems (NODS) is one of the most used outcome measures in gambling intervention trials. However, a screen based on DSM-5 gambling disorder criteria has yet to be developed or validated since the DSM-5 release in 2013. This omission is possibly because the criteria for gambling disorder only underwent minor changes from DSM-IV to DSM-5: the diagnostic threshold was reduced from 5 to 4 criteria, and the illegal activity criterion was removed. Validation of a measure that captures these changes is still warranted. The current study examined the psychometric properties of an online self-report past-year adaptation of the NODS based on DSM-5 diagnostic criteria for gambling disorder (i.e., NODS-GD). A diverse sample of participants (N = 959) was crowdsourced via Amazon's TurkPrime. Internal consistency and one-week test-retest reliability were good. High correlations (r = 0.74-0.77) with other measures of gambling problem severity were observed in addition to moderate correlations (r = 0.21-0.36) with related but distinct constructs (e.g., gambling expenditures, time spent gambling, other addictive behaviors). All nine of the DSM-5 criteria loaded positively on one principal component, which accounted for 40% of the variance. Classification accuracy (i.e., sensitivity, specificity, predictive power) was generally very good with respect to the PGSI and ICD-10 diagnostic criteria. Future studies are encouraged to establish a gold standard self-report measure of gambling problems and develop agreed-upon recommendations for the use and interpretation of crowdsourced addiction data.


Main Text
Humans have been gambling for millennia, from six-sided dice used five thousand years ago to contemporary online casinos. Nearly every culture has had a relationship with gambling, although societal acceptance has varied across time and context (Raylu & Oei, 2004). In the United States, gambling represents an occasional pastime for most of the adult population; however, past-year prevalence of disordered gambling (0.8%) remains high (Welte et al., 2015). Given its simultaneous popularity and potential to cause significant problems, it is critical that assessment instruments accurately identify and classify leisure, at-risk, problem, and disordered gambling.
A screening instrument based on the original DSM-III criteria for pathological gambling became widely used in both clinical and community samples. More recently, the Problem Gambling Severity Index (PGSI; Ferris & Wynne, 2001) has gained prominence. The PGSI, which is not based on DSM diagnostic criteria, contains nine self-report items that measure gambling problem severity. Total scores are used to identify low-, moderate-, and high-risk gambling activity over the past year. The PGSI has demonstrated high correlations with other measures of gambling problem severity (r = .83), as well as good internal consistency (α = .86), excellent specificity (1.0), and adequate sensitivity (.83; Ferris & Wynne, 2001; Holtgraves, 2009).
The PGSI includes items that survey a range of problematic gambling characteristics, but it neither offers comprehensive coverage of current DSM criteria nor yields a diagnosis. To fill this gap, the National Opinion Research Center (NORC) Diagnostic Screen for Gambling Problems (NODS; Gerstein et al., 1999) was developed. The NODS is a 17-item diagnostic interview based on DSM-IV criteria that assesses gambling problems over the past year. The NODS correlates highly with other measures of gambling problem severity (r = .86) and moderately with log-transformed monthly gambling expenditures and number of days gambled (r = .50); it has also demonstrated fair internal consistency (α = .78; Hodgins, 2004).
In 2013, the DSM-5 update included changes to the classification and diagnosis of gambling problems (APA, 2013). Renamed gambling disorder (GD) to reduce stigma, it was reclassified as an addictive disorder given its significant overlap with substance use disorders (Petry et al., 2013). Additionally, the illegal acts criterion was removed to reflect that it was often the last criterion endorsed by those diagnosed with DSM-IV pathological gambling; the diagnostic threshold was also reduced from 5 to 4 given the removal of a criterion. A DSM-5 diagnosis is typically derived by administering the NODS and excluding the illegal acts question; however, no measure has been developed or validated to reflect the 2013 DSM changes.
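The change in scoring rule described above can be illustrated with a minimal Python sketch. It is not the NODS scoring algorithm: the criterion labels are hypothetical shorthand, and the mapping from NODS questions to criteria is not reproduced. It only contrasts the DSM-IV rule (at least 5 of 10 criteria, including illegal acts) with the DSM-5 rule (at least 4 of the remaining 9).

```python
# Criterion labels are hypothetical shorthand, not the wording of the NODS items.
DSM5_CRITERIA = (
    "preoccupation", "tolerance", "loss_of_control", "withdrawal", "escape",
    "chasing", "lying", "jeopardized_relationships", "bailout",
)
ILLEGAL_ACTS = "illegal_acts"  # counted under DSM-IV, removed in DSM-5


def criteria_met(endorsed, criteria):
    """Count how many of the listed criteria appear in the endorsed set."""
    return sum(1 for criterion in criteria if criterion in endorsed)


def meets_dsm5(endorsed):
    """DSM-5 gambling disorder: at least 4 of the 9 remaining criteria."""
    return criteria_met(endorsed, DSM5_CRITERIA) >= 4


def meets_dsm4(endorsed):
    """DSM-IV pathological gambling: at least 5 of 10 criteria."""
    return criteria_met(endorsed, DSM5_CRITERIA + (ILLEGAL_ACTS,)) >= 5


# A respondent endorsing exactly four criteria, none of them illegal acts.
respondent = {"preoccupation", "chasing", "lying", "escape"}
print(meets_dsm5(respondent), meets_dsm4(respondent))
```

In this invented example the respondent meets the DSM-5 threshold but not the DSM-IV threshold, which is the kind of case the lowered cut-off reclassifies.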
The purpose of the current study was to validate an online self-report past-year version of the NODS to assess DSM-5 gambling disorder via examination of psychometric properties (i.e., internal consistency, test-retest reliability, convergent and divergent validity, factor structure, item response patterns, sensitivity, specificity, and classification accuracy). Additionally, this version of the NODS was evaluated for how well it identifies ICD-10 pathological gambling.

Participants
Participants were recruited via Amazon's TurkPrime (Litman et al., 2017). TurkPrime is a virtual crowdsourcing platform that allows researchers to invite users to complete brief tasks (called Human Intelligence Tasks [HITs]) in exchange for financial compensation. To meet eligibility requirements, participants had to: a) be located in the United States; b) be 18 years of age or older; c) have gambled at least once in the past year; and d) demonstrate a HIT approval rate of 25% or greater (i.e., successful completion of at least 25% of attempted HITs). It bears noting that TurkPrime samples generally report higher rates of gambling problems compared to the general population (Schluter, Kim, & Hodgins, 2018).
Remuneration amounts on TurkPrime are typically based on the anticipated length of time it will take to complete a task. Best practices recommend a compensation rate of at least ten cents per minute (Chandler & Shapiro, 2016). For the current study, the anticipated completion time was ten minutes in total per worker; thus, participants were initially compensated a total of US$1.00. Mean survey completion times were greater than expected, which prompted the authors to increase worker remuneration to US$2.00. This study, including the methods, design, and modification, was approved by the University of Calgary Conjoint Faculties Research Ethics Board (REB20-1012).

Measures
The PGSI and NODS were administered in self-report format using a past-year reporting window. A cut score of 8 or greater on the PGSI (i.e., problem gambling) was used to categorize participants; classifications were used as references for the measurement of NODS classification accuracy. Note that there is no gold standard measurement of gambling problems; the PGSI was selected for the current study on the basis of its strong psychometric properties and frequent use in epidemiological research.
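As an illustration of the PGSI cut score, the sketch below sums nine item scores and applies the threshold of 8. Two details are assumptions not stated in the text: each item is taken to be coded 0 to 3, and the bands below the cut score follow the conventional PGSI groupings (0, 1-2, 3-7). Function names are hypothetical.

```python
PGSI_ITEM_COUNT = 9
PROBLEM_GAMBLING_CUT = 8  # cut score stated in the text


def pgsi_total(item_scores):
    """Sum nine item scores, each assumed to be coded 0-3."""
    if len(item_scores) != PGSI_ITEM_COUNT:
        raise ValueError("expected nine PGSI item scores")
    if any(score not in (0, 1, 2, 3) for score in item_scores):
        raise ValueError("each item score must be 0, 1, 2, or 3")
    return sum(item_scores)


def pgsi_category(total):
    """Map a total score to a band (bands below 8 are an assumption)."""
    if total >= PROBLEM_GAMBLING_CUT:
        return "problem gambling"
    if total >= 3:
        return "moderate risk"
    if total >= 1:
        return "low risk"
    return "non-problem"


def is_reference_positive(item_scores):
    """True when the respondent falls at or above the cut score of 8."""
    return pgsi_total(item_scores) >= PROBLEM_GAMBLING_CUT
```

Only the binary split at 8 is used as the reference classification in this study; the finer bands are shown for context.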
The Screener for Substance and Behavioural Addictions (SSBA) is a self-report measure that asks the same four questions as they apply to respondents' engagement with ten addictions over the past year, including gambling 2020). Response options span from 0 (none of the time) to 4 (all of the time), and total scores range from 0 to 16 for each of the ten addiction scales. Each scale has demonstrated high internal consistency (α = .89-.94) and moderate to high correlations with other measures of the same constructs.
The Composite International Diagnostic Interview gambling module (CIDI-GM) contains 17 yes/no questions that correspond to the four diagnostic criteria for pathological gambling in the International Statistical Classification of Diseases and Related Health Problems (ICD), tenth revision (World Health Organization [WHO], 2004). Participants who met all four criteria were classified as meeting ICD-10 criteria for pathological gambling; this classification served as the reference for measuring how accurately the NODS identifies ICD-10 pathological gambling. The CIDI-GM was administered in online self-report format along with the other measures.
The Patient Health Questionnaire-9 (PHQ-9; Kroenke et al., 2001) was included to assess discriminant validity. This self-report instrument contains nine questions that assess depressive symptoms over the past two weeks. Response options range from 0 (not at all) to 3 (nearly every day) and yield a total score between 0 and 27. The PHQ-9 has shown good internal consistency (α = .89; Kroenke et al., 2001).
Participants who endorsed thoughts that they would be better off dead or of hurting themselves were directed to an automated response at the end of the survey that encouraged them to consult a resource (e.g., family physician) or contact a crisis helpline via the phone numbers provided.

Procedure
Advertisements were displayed in TurkPrime to individuals who met eligibility criteria, including IP addresses located in the United States. Eligible respondents were redirected to Qualtrics to complete the first part of the survey.
Part one: Screening. A virtual private network (VPN) block and reCAPTCHA system were implemented within part one to prevent the enrollment of ineligible and fake participants, respectively. Several demographic survey questions were then asked to crosscheck with TurkPrime filters and gather descriptive information from the sample. Participants were also asked to estimate the average number of hours they had gambled per month and per gambling session over the last three months, as well as the average net number of dollars they won or lost per month and per gambling session. These three-month retrospective self-report questions were adapted from the Gambling Participation Instrument (GPI; Williams et al., 2017).
Best practices recommend dual screening of participants recruited from platforms such as TurkPrime (Kim & Hodgins, 2017; Schluter, Kim, & Hodgins, 2018). To that end, two randomly selected PGSI questions were presented prior to the demographic questions, in addition to the full PGSI at the end of part one. Only individuals who met eligibility criteria and had matching PGSI responses were permitted to continue with the main survey (part two) immediately. Regardless, all participants who completed the first part were automatically compensated US$0.60.
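The dual-screening check can be sketched as a simple consistency test. The text does not specify how "matching" was operationalized, so exact equality between the two pre-screen answers and the corresponding full-PGSI answers is assumed here; the data layout and function name are hypothetical.

```python
def passes_dual_screen(prescreen, full_pgsi):
    """Return True when every pre-screen answer matches the full PGSI.

    prescreen: dict mapping a PGSI item index (0-8) to the earlier response.
    full_pgsi: list of the nine responses given at the end of part one.
    Exact equality is an assumption about how "matching" was defined.
    """
    return all(full_pgsi[index] == answer for index, answer in prescreen.items())


full = [0, 1, 0, 2, 0, 0, 1, 0, 0]
consistent = {3: 2, 6: 1}    # both answers repeat the later responses
inconsistent = {3: 2, 6: 0}  # item 6 differs between administrations
print(passes_dual_screen(consistent, full), passes_dual_screen(inconsistent, full))
```

A stricter or more lenient rule (for example, allowing a one-point difference) would only change the comparison inside `all(...)`.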
Part two: Main survey. The main survey consisted of the NODS, CIDI-GM, SSBA, and PHQ-9. Participants who completed part two were manually compensated a bonus of US$0.60 and invited to complete the one-week follow-up.
Part three: One-week follow-up. The one-week follow-up comprised the readministered NODS in addition to questions asking if participants were engaged and honest in their survey responses. Participants who completed part three were automatically credited with an additional US$0.80.

Data Analysis
Data analysis covered three domains: reliability, validity, and classification accuracy. Test-retest reliability was assessed with the intraclass correlation coefficient (ICC) using a two-way mixed model to measure absolute agreement. Internal consistency was assessed with Cronbach's alpha. Correlational analyses were used to evaluate convergent and divergent validity by calculating Pearson's correlation coefficients between the NODS-5 and measures of gambling problem severity, monthly gambling expenditures, hours per month spent gambling, depressive symptoms, and severity of concurrent addiction symptoms. The construct validity assessment entailed a principal components analysis using principal component extraction with promax rotation. Item response patterns on the PGSI and SSBA gambling scale were grouped by DSM-5 diagnostic severity (i.e., number of criteria met). Crosstabs were used to assess classification accuracy based on the PGSI and ICD-10 diagnostic criteria for pathological gambling. Sensitivity, specificity, positive predictive power, and negative predictive power were calculated from these crosstabs. Finally, percent agreement with the NODS-5 was calculated for the PGSI and CIDI-GM. All analyses were conducted with and without data from participants who did not endorse honest and attentive responses to survey questions; however, this did not significantly alter any results.
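The two reliability statistics named above can be written out in a few lines. The following Python sketch is illustrative only; the authors report using R, and this is not their code. It assumes complete data (no missing responses) and computes the single-measure form of the absolute-agreement ICC, since the text does not say whether single or average measures were reported. The absolute-agreement formula is the same whether the two-way model is treated as mixed or random.

```python
from statistics import mean, variance


def cronbach_alpha(rows):
    """Cronbach's alpha for rows of item scores (one row per respondent)."""
    k = len(rows[0])
    item_variances = [variance([row[j] for row in rows]) for j in range(k)]
    total_variance = variance([sum(row) for row in rows])
    if k < 2 or total_variance == 0:
        raise ValueError("need at least two items and non-constant totals")
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


def icc_absolute_agreement(rows):
    """Single-measure, absolute-agreement ICC from a two-way layout.

    rows: one row per participant, one column per administration
    (e.g., baseline and one-week follow-up scores).
    """
    n, k = len(rows), len(rows[0])
    grand = mean(x for row in rows for x in row)
    row_means = [mean(row) for row in rows]
    col_means = [mean(row[j] for row in rows) for j in range(k)]
    # Two-way ANOVA sums of squares without replication.
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in rows for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))
    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + (k / n) * (ms_cols - ms_error)
    )
```

Because absolute agreement penalizes systematic shifts between administrations, scores that rise by a constant amount at follow-up yield an ICC below 1 even though participants keep the same rank order.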

Results
All statistical analyses were conducted with R and RStudio (R Core Team, 2020). In total, 959 participants with a mean age of 39.4 (Mdn = 37, SD = 12.1) were recruited via TurkPrime from February to June 2021. Sample characteristics are provided in Tables 1 and 2. The one-week follow-up rate (50.8%) was much lower than would normally be expected from crowdsourcing platforms (Kim & Hodgins, 2017) despite attempts to increase retention (e.g., increased compensation, reduced HIT approval rate requirement), so follow-up invitations were discontinued for the final 315 participants.
Validity
Table 3 displays the Pearson correlations between the NODS-5 and validation measures. Convergent validity was strong, evidenced by high correlations with other direct measures of gambling severity (r = .74-.77). As expected, moderate correlations were observed between the NODS-5 and measures of depression, other addictions, gambling expenditures, and hours spent gambling (r = .21-.36), which are indicative of divergent validity. Item response patterns for direct measures are provided in Table 4 and broken down by NODS-5 classification.

---TABLE 3---
---TABLE 4---
The nine DSM-5 diagnostic criteria, based on the sixteen NODS-5 questions, were subjected to a principal components analysis to examine factor structure and establish construct validity. Results demonstrated a robust principal component that accounted for 40% of the variance and positively correlated with all nine criteria, which is indicative of a unitary construct. Two components had eigenvalues greater than one and together accounted for 51% of the variance. Promax rotated factor loadings for these two components are provided in Table 5. Five criteria loaded on Component A, which reflects cognitive-affective aspects of problematic gambling. In contrast, the four criteria that loaded on Component B reflect behavioural aspects.

Classification Accuracy
Classification accuracy of the NODS-5 based on DSM-5 and ICD-10 diagnostic criteria was evaluated with the PGSI and CIDI-GM categorizations, respectively (see Table 6). The NODS-5 classified more individuals as engaging in disordered gambling (34%) compared to the PGSI (25%) and fewer compared to the CIDI-GM (39%). Percent agreement with the NODS-5 was 83% for the PGSI and 78% for the CIDI-GM. Sensitivity, specificity, and positive and negative predictive power of the NODS-5 are provided in Table 7.

---TABLE 6---
---TABLE 7---
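For readers unfamiliar with how these indices follow from a crosstab, the sketch below derives them from the four cells of a 2 × 2 table, treating the reference measure (PGSI or CIDI-GM) as the criterion and the NODS-5 as the test. The counts in the example are invented for illustration and are not the study's data; the function name is hypothetical.

```python
def classification_metrics(tp, fp, fn, tn):
    """Accuracy indices from a 2 x 2 crosstab of test versus reference.

    tp: test positive, reference positive
    fp: test positive, reference negative
    fn: test negative, reference positive
    tn: test negative, reference negative
    """
    total = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),          # reference positives detected
        "specificity": tn / (tn + fp),          # reference negatives ruled out
        "positive_predictive_power": tp / (tp + fp),
        "negative_predictive_power": tn / (tn + fn),
        "percent_agreement": (tp + tn) / total,
    }


# Invented counts, for illustration only.
example = classification_metrics(tp=40, fp=10, fn=5, tn=45)
for name, value in example.items():
    print(f"{name}: {value:.2f}")
```

Sensitivity and specificity depend only on the reference-positive and reference-negative groups respectively, whereas the predictive powers also vary with how common the condition is in the sample.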

Discussion
The results of this study support the use of a DSM-5 self-report adaptation of the NODS in a community sample of crowdsourced individuals who gamble. Internal consistency and one-week test-retest reliability were both good. High correlations were observed between the NODS-5 and other direct measures of gambling problem severity (i.e., PGSI, SSBA gambling scale); moderate correlations were observed between the NODS-5 and other validation measures (i.e., gambling expenditures, time spent gambling, depressive symptoms, SSBA non-gambling scales). While one might predict that gambling hours and expenditures would be more highly correlated with the NODS-5, moderate correlations are to be expected given the different timeframes (past year versus past three months) and distinct constructs (gambling behaviours versus diagnostic problem severity).
All nine of the DSM-5 criteria loaded positively on the principal component, which accounted for 40% of the variance; this suggests that the NODS-5 measures a single construct. Two components with eigenvalues greater than one roughly represented cognitive and behavioural symptoms of GD, which is consistent with a cognitive-behavioural theoretical formulation of gambling problems.
Classification accuracy of the NODS-5 was generally very good with reference to the PGSI and CIDI-GM. The NODS-5 classified more individuals as disordered compared to the PGSI and fewer compared to the CIDI-GM. Importantly, the lack of a gold standard measurement of gambling problem severity obfuscates possible conclusions. Our results should be interpreted considering this caveat.
Those familiar with TurkPrime will recognize our HIT approval rate cut-off (25%) as low, which can raise concerns about the quality of data collected. Indeed, best practices recommend excluding participants with HIT approval rates lower than 95% (Kim & Hodgins, 2017; Schluter, Kim, & Hodgins, 2018) because a low approval rate implies the associated TurkPrime worker has provided unusable data to a large proportion of prior HITs. Unfortunately, users with extremely high approval rates are much less common; the use of a low approval rate was necessary to ensure timely recruitment of participants. Despite these considerations, our adherence to other best practices (e.g., remuneration, multi-step surveys, validation procedures, dual screening) appears to have mitigated potential consequences of a low approval rate cut-off. For example, filtering out participants who admitted to responding dishonestly or inattentively did not significantly alter any findings. Finally, researchers have demonstrated that addiction populations on TurkPrime generally provide high quality data (Kim & Hodgins, 2017).

Conclusion
The NODS-5 represents a reliable and valid measure of gambling problem severity, based on DSM-5 diagnostic criteria, when administered in online self-report format to a diverse crowdsourced sample of community gamblers. Future validation studies are encouraged to establish a gold standard measurement of gambling problem severity for the psychometric evaluation of classification accuracy.