Factor Structure of the Cannabis Use Disorders Identification Test Revised (CUDIT-R) for Men and Women

The Cannabis Use Disorders Identification Test Revised (CUDIT-R) is an 8-item screening instrument designed to identify problematic cannabis use over the past 6 months. The purpose of the present study was to investigate the factor structure of the CUDIT-R separately for male and female college students. Participants included 1,390 male and female college students recruited from three state universities (61% female; Age: M = 19.8, SD = 1.3). We conducted exploratory and confirmatory factor analyses followed by tests of measurement invariance, including configural, metric, and scalar invariance, across men and women. Results confirmed a one-factor structure for the CUDIT-R. The number of factors and item loadings were invariant between men and women. However, the intercept was non-invariant for an item asking about frequency of cannabis use, indicating that endorsement of this item varied between men and women. Follow-up validation tests indicated that using a sum score for analyses is appropriate despite this non-invariance. However, more research is needed to determine whether the cut-off scores of the CUDIT-R should be reevaluated by gender.

Men and women differ in rates of cannabis use and progression to cannabis use disorder (CUD). Lifetime cannabis use is reported by about 53.1% of men and 43.7% of women (Substance Abuse and Mental Health Services Administration, 2018). Recent research indicates that men also have higher rates of CUD than women (3.5% versus 1.7%, respectively; Hasin et al., 2016). Nonetheless, women appear to have a faster trajectory from first cannabis use to CUD relative to men (Hernandez-Avila, Rounsaville, & Kranzler, 2004; Khan, 2013). The reason for this telescoping effect is not well known; however, women demonstrate greater subjective intoxication to cannabis than men, which may contribute to maintained use (Cooper & Haney, 2014).
The validity of the CUDIT-R across gender has not, to our knowledge, been investigated. Thus, current research utilizing the CUDIT-R relies on the assumption that this measure assesses the same construct in men and women on a common metric. Without examining measurement invariance, we cannot determine whether differences between men and women in CUDIT-R scores represent true differences in problematic use or are merely artifacts of other processes, such as differing interpretation of the questions. The purpose of the present study was to explore the psychometric qualities of the CUDIT-R separately for young adult men and women.

METHOD

Participants and Procedures
As part of a larger study, students ages 18-24 from three state universities were randomly chosen by each school's registrar and sent email invitations to participate in an online screening survey. Eligible participants were then invited to participate in a 30- to 40-minute online baseline survey asking questions related to alcohol and cannabis use. Eligibility criteria included being a past-year alcohol and cannabis user between ages 18 and 24 and a full-time student invited to participate at one of the three universities. Those who responded to the screening surveys were fairly representative of those invited based on demographic information provided by the registrars. (For greater detail on recruitment and participation, see White et al., 2019.) Participants were 1,390 eligible students from the three universities. Gender identity of the sample was 61% women, 38.1% men, and 0.9% transgender, genderqueer, or gender nonbinary, with a mean age of 19.8 (SD = 1.3). The sample was 63.8% non-Hispanic White, 2.7% non-Hispanic Black, 12.5% Asian, 12.2% Hispanic/Latinx, 0.2% Native Hawaiian or other Pacific Islander, 0.1% American Indian or Alaskan Native, 0.8% other not listed, and 7.7% more than one race/ethnicity.

Measures
The CUDIT-R (Adamson et al., 2010) contains items designed to assess criteria related to cannabis abuse and dependence during the past 6 months and has been validated to screen for DSM-5 criteria for CUD (Schultz et al., 2019). See Table 1 for questions and item-level means. Item one had response options of 0 = "Never," 1 = "Monthly or less," 2 = "2-4 times a month," 3 = "2-3 times a week," and 4 = "4 or more times a week." Item eight used response options 0 = "Never," 2 = "Yes, but not in the past 6 months," and 4 = "Yes, during the past 6 months." The remaining six items used a five-point Likert-type scale ranging from 0 = "Never" to 4 = "Daily or almost daily." Total scores range from 0 to 32, with scores of 8 or more indicating hazardous cannabis use and scores of 12 or more indicating possible CUD (Adamson et al., 2010). In addition, demographic information including gender identity was collected. Response options included Male, Female, Trans male/Trans man, Trans female/Trans woman, Genderqueer/Gender non-conforming, and different identity (check all that apply). For the analyses, we included participants who identified as either men or women. Thus, those who identified as transgender men were combined with those who identified as men (total n = 547) and those who identified as transgender women were combined with women (total n = 831). We excluded participants who identified as genderqueer or gender nonbinary if they did not also identify as a man or woman, as this group was too small to analyze (n = 9). Further, we excluded participants who identified as both a man and a woman (n = 3). The final sample size for the EFA was 488 (men n = 199, women n = 289) and the final sample for the CFA was 890 (men n = 348, women n = 542).
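The scoring rules described above can be illustrated with a short Python sketch. This is only an illustration under stated assumptions: the function and constant names are hypothetical, responses are assumed to be already coded to the numeric values listed above, and the category labels are shorthand for the two cut-offs attributed to Adamson et al. (2010).

```python
# Minimal CUDIT-R scoring sketch; all names here are hypothetical.
HAZARDOUS_CUTOFF = 8      # scores of 8 or more: hazardous cannabis use
POSSIBLE_CUD_CUTOFF = 12  # scores of 12 or more: possible CUD

ITEM8_VALUES = {0, 2, 4}        # item eight uses a three-option response scale
OTHER_VALUES = {0, 1, 2, 3, 4}  # items one to seven are coded 0 through 4


def score_cudit_r(responses):
    """Return (sum_score, category) for eight coded item responses."""
    if len(responses) != 8:
        raise ValueError("CUDIT-R has exactly 8 items")
    for position, value in enumerate(responses, start=1):
        allowed = ITEM8_VALUES if position == 8 else OTHER_VALUES
        if value not in allowed:
            raise ValueError(f"item {position}: invalid response {value!r}")
    total = sum(responses)
    if total >= POSSIBLE_CUD_CUTOFF:
        category = "possible CUD"
    elif total >= HAZARDOUS_CUTOFF:
        category = "hazardous use"
    else:
        category = "below hazardous-use cut-off"
    return total, category
```

For example, a respondent coded [4, 1, 0, 0, 1, 0, 0, 2] would receive a sum score of 8 and fall in the hazardous-use range under these cut-offs.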
Students were asked whether they had experienced 28 different negative consequences in the past 3 months "due to their marijuana use" (see Table 2). We summed these dichotomous items (yes/no) to create a score for the total number of consequences experienced. The consequence items were from the 21-item Brief Marijuana Consequences Questionnaire (MACQ; Simons, Dvorak, Merrill, & Read, 2012) and the 24-item Brief Young Adult Alcohol Consequence Questionnaire (BYAACQ; Kahler, Strong, & Read, 2005); collapsing the two scales yielded 28 unique items. Both scales have been used reliably with college students (Kahler et al., 2005; Simons et al., 2012).

Data Analysis
A random subset of 488 participants was first used for exploratory factor analysis (EFA), which was further split by gender. We conducted the EFA on the CUDIT-R items using R version 3.1.4 (R Core Team, 2017), with the number of factors to extract determined by parallel analysis (Hayton, Allen, & Scarpello, 2004). The suitability of the data for factor analysis was assessed using Bartlett's test of sphericity and the Kaiser-Meyer-Olkin (KMO) measure of sampling adequacy (Bartlett, 1950; Kaiser, 1970). A significant Bartlett's test (p < .05) and a KMO index of at least 0.50 indicated the data were suitable for factor analysis (Williams, Onsman, & Brown, 2010). The remaining 890 participants were used to conduct the CFA and tests of measurement invariance. Table 1 shows descriptive statistics for individual CUDIT-R items. The CFA was completed using lavaan (Rosseel, 2012) for R version 3.1.4 (R Core Team, 2017). Missing data were accounted for using Diagonally Weighted Least Squares (DWLS), which results in less biased factor loadings for ordinal data (Li, 2016). A Comparative Fit Index (CFI) and Tucker-Lewis Index (TLI) ≥ .95, a Root Mean Square Error of Approximation (RMSEA) ≤ .06, and a standardized root mean squared residual (SRMR) ≤ .08 were used as indicators of good model fit (Hu & Bentler, 1999; Yu, 2002). Modification indices were evaluated to determine whether residuals of items should be correlated based on overlapping constructs (Sörbom, 1989).
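The fit criteria above amount to a simple set of threshold checks. The following Python sketch is illustrative only: the helper name and dictionary layout are hypothetical, and the example values are the fit indices reported in the Results for the one-factor model without correlated residuals.

```python
# Cut-offs as stated in the text (Hu & Bentler, 1999; Yu, 2002).
FIT_CRITERIA = {
    "CFI": lambda value: value >= 0.95,
    "TLI": lambda value: value >= 0.95,
    "RMSEA": lambda value: value <= 0.06,
    "SRMR": lambda value: value <= 0.08,
}


def check_fit(indices):
    """Map each supplied fit index to True (meets its cut-off) or False."""
    return {name: FIT_CRITERIA[name](value) for name, value in indices.items()}


# Values reported in the Results for the initial one-factor model.
initial_model = {"CFI": 0.964, "TLI": 0.950, "RMSEA": 0.064, "SRMR": 0.034}
result = check_fit(initial_model)
```

Under these cut-offs only the RMSEA of the initial model falls outside the good-fit range, which is consistent with its description as showing poor to adequate fit.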
Next, measurement invariance was tested by sequentially constraining parameters across genders. Configural invariance of the CUDIT-R was evaluated by first fitting separate confirmatory models in men and women. A test of configural invariance examines whether the basic organization of the constructs (i.e., latent factors) is supported across genders. Once configural invariance is established, the next step is metric invariance, or invariance of the item loadings. When factor loadings are invariant across groups, this indicates that each item contributes to the latent construct to a similar degree. If metric invariance holds across groups, scalar invariance is tested. Scalar invariance is the equivalence of item intercepts. If all previous levels of invariance are supported, strict invariance is tested. Strict, or residual, invariance tests whether the sum of specific variance and error variance is similar across groups (Byrne, 2010; Kline, 2011; Putnick & Bornstein, 2016). The marijuana consequences score was used to validate the CUDIT-R.
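The sequence of constraints described above can be written compactly for a one-factor model. The notation is an assumption introduced here for illustration and treats the items as continuous indicators: x is the response of person i in group g to item j, τ is the item intercept, λ the factor loading, η the latent factor, and ε the residual with variance θ.

```latex
\begin{align*}
x_{ijg} &= \tau_{jg} + \lambda_{jg}\,\eta_{ig} + \varepsilon_{ijg},
  \qquad \operatorname{Var}(\varepsilon_{ijg}) = \theta_{jg} \\
\text{Configural:}\quad & \text{same one-factor pattern in both groups, all parameters free} \\
\text{Metric:}\quad & \lambda_{j,\text{men}} = \lambda_{j,\text{women}} \quad \text{for all } j \\
\text{Scalar:}\quad & \text{metric constraints and } \tau_{j,\text{men}} = \tau_{j,\text{women}} \quad \text{for all } j \\
\text{Strict:}\quad & \text{scalar constraints and } \theta_{j,\text{men}} = \theta_{j,\text{women}} \quad \text{for all } j
\end{align*}
```

Each level is nested in the one before it, so a level is tested only when the preceding constraints have held.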

RESULTS
The criteria of sphericity and sampling adequacy were met, as indicated by a significant Bartlett's test of sphericity (p < .001) and a KMO value of 0.85. Parallel analysis and EFA suggested a one-factor solution for the CUDIT-R for both men and women, with 83% of variance explained in both samples (see Table 3).
Next, we tested goodness of fit of the one-factor structure using CFA. The final sample for the CFA (N = 890) consisted of 348 men and 542 women. There was no significant difference in gender by site, χ²(2) = 1.36, p = .506. A one-factor model with no correlated residuals showed poor to adequate fit (i.e., χ²(20) = 92.431, p < .001, CFI = .964, TLI = .950, RMSEA [90% CI] = .064 [.051, .077], SRMR = .034). Evaluation of the modification indices showed strong evidence of a correlated residual between item one ("How often do you use cannabis?") and item seven ("How often do you use cannabis in situations that could be physically hazardous, such as driving, operating machinery, or caring for children?"). Given that these two items tapped similar content (frequency of use), we made the decision to correlate their residuals. The model with this residual correlation showed a significant improvement in model fit, Δχ² = 21.597, Δdf = 1, p < .001. Modification indices were then reevaluated and suggested that the residuals of item one and item two ("How many hours were you 'stoned' on a typical day when you had been using cannabis?") also covaried, likely because both items are indicators of consumption (as opposed to problems). Adding this residual correlation resulted in a further significant improvement in model fit, Δχ² = 21.229, Δdf = 1, p < .001.

Table 4 shows the results from invariance testing. After establishing configural invariance, we tested group invariance by entering the configural model as the baseline step (Step 1) and constraining factor loadings to be equal across groups (metric invariance; Step 2). We found that the strengths of the factor loadings were invariant across men and women. We then evaluated scalar invariance by further constraining item intercepts to be equal across groups (Step 3). We found a non-invariant intercept for item one, suggesting that scalar invariance did not hold between men and women. Specifically, men had a higher unstandardized item intercept than women (intercept = 2.91, SE = 0.083 and intercept = 2.41, SE = 0.062, respectively).
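As a rough arithmetic check on the model-comparison statistics reported above, the following Python sketch computes the p-value of a chi-square difference with one degree of freedom and a single-group RMSEA point estimate. It rests on assumptions not confirmed by the source: the difference test is treated as an unscaled chi-square difference (software using DWLS often applies a scaling correction), the RMSEA uses the N - 1 convention, and the function names are hypothetical.

```python
import math


def chi2_sf_df1(x):
    """Upper-tail p-value of a chi-square statistic with 1 degree of freedom."""
    # With df = 1 the statistic is the square of a standard normal variable,
    # so P(X > x) equals erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(x / 2.0))


def rmsea(chi2, df, n):
    """Single-group RMSEA point estimate using the (N - 1) convention."""
    return math.sqrt(max(chi2 - df, 0.0) / (df * (n - 1)))


p_first = chi2_sf_df1(21.597)   # adding the item one / item seven residual correlation
p_second = chi2_sf_df1(21.229)  # adding the item one / item two residual correlation
initial_rmsea = rmsea(92.431, 20, 890)  # initial one-factor model, N = 890
```

Under these assumptions both difference tests fall well below p = .001, and the RMSEA of the initial model rounds to the reported value of .064.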
We further evaluated whether a sum score for the CUDIT-R reliably indexed the measure for both men and women by computing factor scores for each individual and correlating these scores with the CUDIT-R sum score. A Pearson correlation revealed that the CUDIT-R factor score was strongly correlated with the sum score (r = .991, p < .001), suggesting that the variance in these indices was largely overlapping. Finally, to determine whether the differences observed in our test of scalar invariance would have practical implications for the CUDIT-R at a substantive level, we investigated the concurrent validity of both index measures (sum scores and factor scores) and found that both were significantly correlated with cannabis use consequences, with equivalent magnitude (r = .711, p < .001 and r = .711, p < .001, respectively). When the sample was split by gender, both index measures correlated significantly with consequences among men (r = .684, p < .001 and r = .689, p < .001, respectively) and among women (r = .724, p < .001 and r = .720, p < .001, respectively).
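The comparison of sum scores and factor scores above rests on the Pearson correlation. A minimal Python sketch of that computation follows; the helper name is hypothetical and the two short lists are made-up toy values used only to show the call, not study data.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length (at least 2 values)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cross = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    if ss_x == 0 or ss_y == 0:
        raise ValueError("correlation is undefined when a sequence has no variance")
    return cross / math.sqrt(ss_x * ss_y)


# Hypothetical toy values for illustration only (not data from the study).
toy_sum_scores = [2, 5, 8, 11, 15, 20]
toy_factor_scores = [-1.1, -0.6, -0.1, 0.3, 0.9, 1.8]
r = pearson_r(toy_sum_scores, toy_factor_scores)
```

A correlation close to 1 between the two scores, as reported above, indicates that individuals are ranked almost identically by the sum score and the factor score.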

DISCUSSION
The present study sought to replicate the factor structure of the CUDIT-R items proposed by Adamson et al. (2010) in a nonclinical young adult sample of cannabis users and to extend previous studies by examining gender invariance in the CUDIT-R. In line with the conceptualization of the CUDIT-R, our model confirmed a one-factor structure. Our test of whether the CUDIT-R factor structure was the same across men and women indicated that the number of factors and item loadings were invariant between men and women.
Although our model replicated the factor structure of the CUDIT-R, we used modification indices to identify items with shared residual variance. Specifically, the residual of item one ("How often do you use cannabis") was allowed to correlate with the residuals of items two ("How many hours were you 'stoned' on a typical day when you had been using cannabis") and seven ("How often do you use cannabis in situations that could be physically hazardous, such as driving, operating machinery, or caring for children"). These items strongly overlapped in asking about consumption patterns. Invariance held at the level of factor loadings (i.e., metric invariance). This finding indicates that the relationship between CUDIT-R items and the underlying latent construct is the same for men and women and suggests that these items are interpreted consistently by both genders.
In this sample, the intercept of item one ("How often do you use cannabis?") was non-invariant across groups, indicating that endorsement of this item varied between men and women. Given known gender differences in cannabis use, including prevalence (Cuttler, Mischley, & Sexton, 2016) and rates of and progression to CUD (Hasin et al., 2016; Hernandez-Avila et al., 2004; Khan, 2013), differences in endorsement of CUDIT-R items were expected. Specifically, we found that men had higher endorsement of item one. This difference is in line with previous research indicating that men use cannabis more frequently than women (Cuttler et al., 2016).
Because of the gender difference in the item one intercept described above, we compared the traditionally derived CUDIT-R sum score to a CUDIT-R factor score based on our psychometric models. Factor scores are composite scores that identify an individual's placement on a latent factor. When we compared the factor score with the sum score, results indicated that both scores measured virtually the same thing (i.e., they were correlated at .99). This finding suggests that despite non-invariance of the item one intercept, the sum score is still appropriate to use for both young adult men and women. However, more research is needed to determine whether clinical applications of the CUDIT-R, such as cut-off scores, should be reevaluated by gender.
The results of the study need to be considered within the context of some limitations. The CUDIT-R is a self-report measure; thus, responses may be over- or under-reported. The present study results were based on a sample of university students who reported using both alcohol and cannabis in the past year and may not generalize to other college students or to non-student samples. The CUDIT-R may perform differently in other samples, such as older adults or those with less regular cannabis use, who may endorse items related to frequency at lower levels. Nonetheless, our sample represents an important age group given that the annual prevalence of cannabis use is highest among 19- to 30-year-olds (38%), with highest use at ages 21-22 (44%; Schulenberg et al., 2019), and odds of CUD diagnosis are highest in young adults aged 18-24 (Hasin et al., 2016). Our sample included relatively few non-White students, and analyses were limited to those who identified as either men or women; replication in more diverse samples and across non-binary gender groups is an area for future research. Due to the self-report nature of the assessment and the lack of a diagnostic measure of CUD in the data set, we were unable to determine potential cut-off scores for hazardous use and probable CUD. Future research should work to determine appropriate cut-off scores for men and women.
Despite these limitations, this is the first study that has evaluated the factor structure of the CUDIT-R separately for men and women. This study makes a significant contribution through the evaluation of this screening tool across genders, which could have clinical implications for the identification of problematic cannabis use and CUD. With recent legislative changes in cannabis legalization as well as increased prevalence of cannabis use, identifying problematic use will be imperative.

Funding and Acknowledgements:
The writing of this paper was supported by the National Institute on Drug Abuse (R01 DA040880, MPIs: Jackson and White). Alex W. Sokolovsky is funded by the National Institute on Drug Abuse (T32 DA016184, PI: Rohsenow). Points of view in this document are those of the authors and do not necessarily represent the official position or policies of the National Institutes of Health. The funding sources had no role in the analysis or interpretation of the data, the preparation of this manuscript, or the decision to submit the manuscript for publication.