Sociodemographics and Transdiagnostic Mental Health Symptoms in SOCIAL (Studies of Online Cohorts for Internalizing Symptoms and Language) I and II: Cross-sectional Survey and Botometer Analysis

Background
Internalizing, externalizing, and somatoform disorders are the most common and disabling forms of psychopathology. Our understanding of these clinical problems is limited by a reliance on self-report along with research using small samples. Social media has emerged as an exciting channel for collecting a large sample of longitudinal data from individuals to study psychopathology.

Objective
This study reports the results of 2 large ongoing studies in which we collected data from Twitter and self-reported clinical screening scales: the Studies of Online Cohorts for Internalizing Symptoms and Language (SOCIAL) I and II.

Methods
The participants were a sample of Twitter-using adults (SOCIAL I: N=1123) targeted to be nationally representative in terms of age, sex assigned at birth, race, and ethnicity, as well as a sample of college students in the Midwest (SOCIAL II: N=1988), of whom 61.77% (1228/1988) were Twitter users. We asked all participants who were Twitter users for their Twitter handle, which we analyzed using Botometer, a tool that rates the likelihood of an account belonging to a bot. We divided participants into 4 groups: Twitter users who did not give us their handle or gave us invalid handles (invalid), those who denied being Twitter users (no Twitter, only available for SOCIAL II), Twitter users who gave their handles but whose accounts had high bot scores (bot-like), and Twitter users who provided their handles and had low bot scores (valid). We explored whether there were significant differences among these groups in terms of their sociodemographic features, clinical symptoms, and aspects of social media use (ie, platforms used and time).

Results
In SOCIAL I, most individuals were classified as valid (580/1123, 51.65%), and a few were deemed bot-like (190/1123, 16.92%). A total of 31.43% (353/1123) gave no handle or gave an invalid handle (eg, entered “N/A”). In SOCIAL II, many individuals were not Twitter users (760/1988, 38.23%). Of the Twitter users in SOCIAL II (1228/1988, 61.77%), most were classified as either invalid (515/1228, 41.94%) or valid (484/1228, 39.41%), with a smaller fraction deemed bot-like (229/1228, 18.65%). Participants reported high rates of mental health diagnoses as well as high levels of symptoms, especially in SOCIAL II. In general, the differences between individuals who provided or did not provide their social media handles were small and not statistically significant.

Conclusions
Triangulating passively acquired social media data and self-reported questionnaires offers new possibilities for large-scale assessment and evaluation of vulnerability to mental disorders. The propensity of participants to share social media handles is likely not a source of sample bias in subsequent social media analytics.


Introduction
Background
So-called mental disorders, including depression, anxiety, substance use, and pain-related conditions, account for a substantial proportion of disabilities attributed to illness worldwide [1]. According to hierarchical models of psychopathology [2], most of these clinical problems can be grouped into dimensions that include an internalizing dimension, involving emotional dysfunction, and an externalizing dimension, involving disinhibition or antagonism. Research implicates various mechanisms in the etiology and maintenance of mental disorder symptoms, including sustained negative affect, disturbances in positive affect, disrupted social processes, disturbances in arousal and regulatory processes, sensorimotor problems, and cognitive dysfunction [3]. However, it has been extremely difficult to determine reliable mechanisms of psychopathology. Although mental disorders are very common [4], they are also highly heterogeneous in their presenting characteristics [2]. In addition, the longitudinal course of mental health symptoms is also heterogeneous, with some individuals having brief courses and others having highly chronic or relapsing-recovering courses [5].

Social Media
Characterizing heterogeneity in psychopathology requires large samples, which have consequently become a staple of modern clinical research, including clinical trials such as STAR*D (Sequenced Treatment Alternatives to Relieve Depression) [6], epidemiological studies [7], neuroimaging cohorts [8], and neurocognitive assessment studies [9]. More recently, analyses of naturalistic social media data have also facilitated the collection of large samples. Social media is well suited for collecting research data because it is ubiquitous in modern life; 72% of adults in the United States report belonging to at least one social media platform [10]. Twitter, specifically, is used by 23% of the population in the United States [10]. Although the use of Twitter follows a Pareto distribution, wherein a few individuals account for most Twitter activity, approximately three-fourths of Twitter users use the platform at least once a week (46% use it daily and 27% use it at least weekly). As a social media platform, Twitter is geared toward sharing frequent, brief, and introspective posts that are suitable for longitudinal, within-subject text analysis at high temporal resolutions.
We have previously used Twitter to study vulnerability to mental health symptoms. For example, in one study, we reported that individuals who had disclosed in their tweets that they were diagnosed with depression (eg, "I was diagnosed with depression a couple of months ago...") had different circadian patterns of Twitter activity than a random sample of Twitter users [11]. Specifically, individuals who disclosed a depression diagnosis used Twitter more frequently later into the night and less frequently earlier in the day, possibly indicating circadian differences between the depressed users and the random sample. In another study, we measured lexical proxies (words such as "should," "must," "have to," "nobody," or "always") of cognitive distortions, a concept from the literature on cognitive behavioral therapy that points to rigid or inflexible thinking [12,13]. As suggested by the generic cognitive model underlying cognitive behavioral therapy [14], individuals with depression make more use of cognitive distortions than a random sample of individuals [15]. Al-Mosaiwi and Johnstone [16] reported a similar finding with language that they deemed "absolutist." Others have also found associations between features of written text and depressive symptoms. For example, greater use of personal pronouns (eg, "I") in social media and other contexts appears to be correlated with symptoms of depression [17], a finding that connects with research using cognitive tasks linking depression to increased self-referential processing [18]. Similarly, greater use of negative emotional words, including those expressing depressive symptoms, appears to be related to depressive symptoms [19].
Despite the potential offered by social media data for research into the mechanisms involved in the development and maintenance of mental disorders, passively acquired social media data have limitations. One limitation is that social media users are not representative of the general population [10,20]. There are data on sociodemographic differences between individuals who use specific social media sites and those who do not. Relative to the broader population, Twitter users are more likely to be male, younger, more educated, and more liberal leaning in their political orientation [10].
It has also been hypothesized that differences in variables such as need for self-disclosure [20] may bias samples of individuals who are on Twitter versus those who are not. Likewise, individuals who volunteer to give researchers access to their social media accounts may provide a biased subsample of individuals with a stronger disposition to self-disclose. Another limitation to using social media data for research is that researchers lack information to support inferences about participants' health from their web-based activity (eg, Is someone actually depressed even if they explicitly said so?).

This Study
To address these limitations of social media research, namely the lack of sample representativeness and the inability to verify health status, we conducted the Studies of Online Cohorts for Internalizing Symptoms and Language (SOCIAL). SOCIAL comprises cohort-based studies in which we triangulated self-reported disorder screening questionnaires with data acquired from social media. Participants in SOCIAL I and II completed a series of disorder screening questionnaires focused on internalizing symptoms that are meant to capture psychopathology more broadly. They were also asked to provide their Twitter handles, which we subsequently checked for validity, including how closely the accounts resembled the behavior of bots. SOCIAL I is a sample of Twitter users (Methods section) targeted to be nationally representative in terms of age, sex assigned at birth, race, and ethnicity, and SOCIAL II is a large sample of college students.
Here, we describe the baseline sociodemographic characteristics, social media use data, and mental health characteristics of individuals in SOCIAL I and II. Because we asked individuals to self-report whether they used Twitter and to give us access to their Twitter accounts, we could compare sociodemographic characteristics, social media use data, and mental health differences between groups of individuals depending on their willingness to share their social media data. We broadly distinguished 2 groups of participants: those who provided valid Twitter handles pointing to their own social media content and those who did not. The latter group can be separated into 3 subgroups: (1) users who did not provide a valid Twitter handle (invalid handle), (2) users who denied being Twitter users (not a Twitter user), and (3) users who did provide an existing Twitter handle but whose accounts were deemed to be bot-like as defined by a machine learning classifier [21].

Overview
Both SOCIAL samples answered self-reported questionnaires probing internalizing, externalizing, somatoform, and thought disorder symptoms (Table 1). We also collected demographic information and aspects of social media use, including whether the individual was a Twitter user, whether they were willing to let us access their Twitter timeline, and which other social media platforms they used.

Participant Recruitment
SOCIAL I purposefully sampled individuals via Qualtrics panels. We aimed to recruit approximately 1000 Twitter users, given the budgetary constraints for this study. Individuals were recruited from July 2020 to March 2021 for a study on "social media and mental health." The sample was selected to represent the United States at the intersections of age, gender, race and ethnicity.
All individuals in SOCIAL I were Twitter users. Accordingly, we could not ascertain the role that being a Twitter user in itself has on potential differences between individuals in baseline sociodemographic characteristics, social media use, and mental health symptoms. To have a sample of individuals who did not use Twitter as well as to have an additional sample with which to assess the transportability of results from SOCIAL I, we began the SOCIAL II study. SOCIAL II recruited college students from a predominantly White and Asian university in the Midwest. Individuals were compensated with credits in an introductory psychology course. Individuals were recruited from September 2020 to the present date.

Measures
For individuals in SOCIAL I and SOCIAL II, we collected information on characteristics described in the following sections.

Demographic Characteristics
Specifically, we collected age, political orientation on a 10-point Likert scale (1=extremely liberal, 10=extremely conservative), race, ethnicity, sex assigned at birth (male, female, other or inconclusive, or prefer not to say), gender identity (male, female, nonbinary, genderqueer, agender, other, or prefer not to say), and sexual orientation (heterosexual or straight, homosexual or gay, bisexual or pansexual, other, or prefer not to say). In SOCIAL I, we asked participants for their annual household income. In SOCIAL II, we asked participants to estimate their parents' annual household income. We present both of these as the same variable (ie, estimated household income). In addition, in SOCIAL I, we asked participants to indicate their race by using a single category from a list (White, Black or African American, American Indian or Alaska Native, Native Hawaiian or Pacific Islander, Hispanic, or other). In SOCIAL II, we allowed participants to select multiple racial and ethnic identities, including all the possibilities in SOCIAL I along with Middle Eastern or North African. We recoded the categories in SOCIAL II to fit a version of the race variable in SOCIAL I that identified whether individuals were non-Hispanic White, non-Hispanic Black, Hispanic, Asian, or other (eg, Native Hawaiian or Pacific Islander, Middle Eastern or North African, or multiracial but not Hispanic).

Social Media
Individuals who were Twitter users (ie, all individuals in SOCIAL I and some in SOCIAL II) were queried about how often they used Twitter (less than once every few weeks; every few weeks; a few days (more like 1-2) a week; a few days (3-5) a week; about once a day; or several times a day). This item was selected because it was used by the Pew Research Center in a previous study on social media use in the United States. In addition, all individuals were queried about their use of Twitter as well as other social media platforms on a binary scale (ie, user vs nonuser of that platform). We asked all individuals in the study who identified as Twitter users for a Twitter handle. Individuals could choose to enter a valid Twitter handle or to enter text to bypass the question (eg, "I don't want to give my Twitter handle").

Mental Health
We compiled a battery of self-report disorder screening questionnaires for psychopathology (Table 1). These measures were chosen because (1) they measure symptoms that are relatively common (eg, depression) or relatively uncommon but highly impairing (eg, drug use), (2) they are indicators of some of the major domains of psychopathology as per contemporary nosologies (eg, the study by Kotov et al [2]), (3) they were freely available, and (4) they are widely used. Most of the measures we used were the Diagnostic and Statistical Manual of Mental Disorders (DSM) severity measures recommended by the American Psychiatric Association (eg, social anxiety, panic, worry, and substance use) or were measures that were eventually adapted into the DSM severity measures (ie, Patient Health Questionnaire (PHQ)-9 and PHQ-15 for depression and somatic symptoms, respectively). Given that these measures have different response types and numbers of items, and accordingly different ranges, we standardized them all as percent of maximum possible (POMP) scores [22]. The POMP scores are defined as follows: POMP = ((observed score - minimum possible) / (maximum possible - minimum possible)) × 100. A POMP score therefore expresses an observed score as a percentage of the measure's possible range. For example, for the PHQ-9, with its score range of 0 to 27, a score of 0 is 0% of the POMP, 14 is 51.85%, and 27 is 100%. In addition to characterizing the symptoms of psychopathology that individuals currently experienced, we also asked them whether they were aware of having received a medical diagnosis of the following mental disorders: depression, social anxiety, generalized anxiety, specific phobia, panic disorder, agoraphobia, posttraumatic stress disorder, somatic symptom disorder (or "chronic pain"), insomnia, alcohol use, drug use, or bipolar disorder (I or II). Individuals were allowed to answer "yes," "no," "no, but I should be," or "I don't know."
In this study, we differentiated between individuals who were sure they had a diagnosis (ie, those answering "yes") and all others.
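The POMP rescaling described in this section can be illustrated with a minimal Python sketch. The function name `pomp` and the range checks are illustrative assumptions rather than the study's own code (the analyses themselves were run in R); only the formula and the PHQ-9 range of 0 to 27 come from the text.

```python
def pomp(observed: float, minimum: float, maximum: float) -> float:
    """Return a score as a percentage of the scale's possible range.

    Hypothetical helper: the name and the validation are assumptions;
    the formula is the POMP definition given in the text.
    """
    if maximum <= minimum:
        raise ValueError("maximum must be greater than minimum")
    if not minimum <= observed <= maximum:
        raise ValueError("observed score lies outside the scale range")
    return (observed - minimum) / (maximum - minimum) * 100


# PHQ-9 totals range from 0 to 27 (range taken from the text).
for score in (0, 14, 27):
    print(score, round(pomp(score, 0, 27), 2))
```

Because every measure is mapped onto the same 0 to 100 range, scales with different numbers of items and response formats can be placed side by side, which is the stated reason for the rescaling.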
We conducted preliminary analyses to describe the samples, including the ranges represented in the different variables. The results of these analyses suggested that individuals gave relatively high ratings of self-reported manic symptoms, a problem that has been previously reported in the literature assessing hypomanic symptoms via self-report. Zimmerman [23] suggested that screening for bipolar disorder should be accompanied by a subsequent evaluation by a clinician. Similarly, individuals endorsed relatively few agoraphobic symptoms, and these were highly correlated with other internalizing symptoms. Considering these factors, we removed the mania rating scale (the Altman Self-Rating Mania Scale) as well as the DSM Severity Scale for Agoraphobia from SOCIAL II, leaving only a subsample of individuals with ratings on these scales (n=665).

Twitter Status
All individuals who reported that they were Twitter users were asked to provide their Twitter handles, which identify the individual on Twitter. The Twitter application programming interface, a free and public interface provided by Twitter, provides access to an individual's past tweets (timelines) via their individual handle (provided the tweets were public). Hence, for individuals who provided a Twitter handle, we retrieved individual timelines (a time-sorted record of their past tweets). We assessed whether the corresponding Twitter accounts were valid and belonged to real users using the Botometer application programming interface, an algorithm that uses machine learning to predict whether a given account belongs to a bot from its web-based behavior and content (eg, frequency of posting, specific content features, evidence that they have purchased followers, whether the account self-declares as being a bot, or whether the account has been declared a bot by others). As per recommendations of the Botometer developers, we explored the distribution of bot scores and created a cutoff of 0.42 to classify individuals as bot-like or valid users.
Individuals were classified as providing invalid handles if they refused to provide their handle, answered the question about handles with a response that was not a syntactically valid Twitter account (eg, "I don't want to give you this information"), or if Botometer failed to access the Twitter account. In addition to these 3 groups (ie, invalid, bot-like, and valid), in SOCIAL II, we included individuals who denied being Twitter users (not a Twitter user). We focused on the differences between these subgroups (3 in SOCIAL I and 4 in SOCIAL II), using Twitter users who did not provide handles or who provided handles that were not syntactically valid account names (ie, the invalid handle group) as the reference group.
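The grouping rules in this section can be summarized in a short Python sketch. This is an illustration under stated assumptions, not the study's pipeline: the function name, the regular expression for a syntactically valid handle (1 to 15 letters, digits, or underscores, with an optional leading "@"), and the choice to treat a score exactly at the cutoff as bot-like are all assumptions. Only the 0.42 cutoff and the four group labels come from the text, and the sketch expects a bot score that has already been retrieved rather than calling Botometer itself.

```python
import re
from typing import Optional

BOT_SCORE_CUTOFF = 0.42  # cutoff reported in the text

# Assumption: a handle is syntactically valid if it has 1-15 letters,
# digits, or underscores, optionally preceded by "@".
HANDLE_PATTERN = re.compile(r"@?[A-Za-z0-9_]{1,15}")


def classify_twitter_status(
    is_twitter_user: bool,
    handle_response: Optional[str],
    bot_score: Optional[float],
) -> str:
    """Assign a participant to one of the four Twitter status groups.

    bot_score is None when no score could be retrieved for the account.
    """
    if not is_twitter_user:
        return "not a Twitter user"
    response = (handle_response or "").strip()
    if not HANDLE_PATTERN.fullmatch(response):
        return "invalid"  # refusal text, "N/A", or an empty answer
    if bot_score is None:
        return "invalid"  # the account could not be accessed
    # Assumption: a score at or above the cutoff counts as bot-like.
    return "bot-like" if bot_score >= BOT_SCORE_CUTOFF else "valid"


print(classify_twitter_status(True, "I don't want to give my Twitter handle", None))
print(classify_twitter_status(True, "@some_user", 0.10))
```

A short free-text answer such as "no" would pass the syntax check in this sketch; it would end up in the invalid group only if no bot score could be retrieved for it.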

Analytic Plan
All analyses were conducted using the R programming language (version 4.1.2) [24] in RStudio [25]. Given that we collected samples that differ substantially in demographic characteristics, we report all analyses according to the study cohort (ie, first in SOCIAL I and then in SOCIAL II). For continuous variables, we provide descriptive statistics in the form of means, SDs, medians, and IQR values. For categorical variables, we present frequencies and percentages.
To assess whether the groups differed significantly in sociodemographic factors, social media use, and mental health variables, we tested the association of each of these variables (eg, age, frequency of Twitter use, and depression) with group membership (ie, invalid, bot-like, valid, and no Twitter [in SOCIAL II]). For continuous variables, we reported the P values from a Kruskal-Wallis rank-sum test. For categorical variables, we reported the P values from a chi-square test to assess whether Twitter group membership is significantly related to specific baseline characteristics (eg, race and gender identity) or the P values from the Fisher exact test when a cell size was <5. To characterize the magnitude of these associations (ie, the strength of the effect beyond its statistical significance), for binary variables, we report odds ratios (ORs) with 95% CIs, using individuals who provided invalid user names as the reference group. For nominal variables (eg, gender as male, female, or nonbinary), we report a Cramer V. For continuous variables, we report the standardized β values and 95% CIs, representing the differences in SD units of each variable in question.
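Two of the effect sizes named in this plan can be sketched from a contingency table in plain Python. This is a minimal illustration, not the study's code: the text does not state how the 95% CIs for the ORs were obtained, so the Wald interval on the log odds scale used here is an assumption, as are the function names and the counts in the usage lines, which are made up.

```python
import math
from typing import Sequence, Tuple


def odds_ratio_wald_ci(
    a: int, b: int, c: int, d: int, z: float = 1.96
) -> Tuple[float, float, float]:
    """Odds ratio and Wald CI for a 2x2 table.

    a, b: comparison group with / without the characteristic.
    c, d: reference group with / without the characteristic.
    Assumption: Wald interval on the log odds scale (z=1.96 for 95%).
    """
    if min(a, b, c, d) <= 0:
        raise ValueError("all four cells must be positive")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    return odds_ratio, lower, upper


def cramers_v(table: Sequence[Sequence[float]]) -> float:
    """Cramer V for an r x c table of counts with no empty row or column."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi_square = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi_square += (observed - expected) ** 2 / expected
    k = min(len(table), len(table[0]))  # smaller table dimension
    return math.sqrt(chi_square / (n * (k - 1)))


# Made-up counts, for illustration only (not study data).
print(odds_ratio_wald_ci(20, 10, 10, 20))
print(cramers_v([[30, 20, 10], [20, 25, 15]]))
```

In the first usage line, the second pair of counts plays the role of the reference group, mirroring the use of the invalid handle group as the reference in the text.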

Ethics Approval
Both studies were approved by the Indiana University Institutional Review Board (2002549202 and 2005948214).

Demographic Characteristics
In SOCIAL I (N=1123), the average participant was in their mid-30s, although there was variability in the ages represented (Table 2). Approximately half of the individuals (580/1123, 51.65%) provided valid Twitter handles. Of the remainder (ie, the 543/1123, 48.35% who did not provide valid Twitter handles), most provided invalid Twitter handles (353/1123, 31.43%), and only 16.92% (190/1123) provided Twitter handles that were deemed to be bot-like. Most of those who used Twitter reported using the platform at least "several times a day." Individuals were approximately evenly split along the political spectrum, and there appeared to be variability in sexual orientation, gender identity, and socioeconomic status. Hispanic and Asian individuals appeared to be underrepresented relative to the US population.
There were various statistically significant demographic differences among individuals in SOCIAL I based on Twitter status (Table 2). In general, we focused on differences relevant to the individuals who provided valid Twitter handles versus those who refused to provide a handle or provided an invalid one (eg, we ignored differences between people who provided invalid handles vs bot-like handles). Compared with Twitter users who provided invalid handles, Twitter users who provided valid handles were more liberal (β=-0.14, 95% CI -0.20 to -0.07), used Twitter less (OR 0.73, 95% CI 0.56-0.95), and reported lower incomes (OR 0.58, 95% CI 0.46-0.74). In addition, compared with Twitter users who provided invalid handles, Twitter users who provided valid handles were relatively more likely to identify as genderqueer, nonbinary, or otherwise unwilling to use a male or female designation than to identify as male (Cramer V=0.14, 95% CI 0.11-0.19) and were relatively more likely to identify as gay, lesbian, or bisexual than as heterosexual (OR 1.59, 95% CI 1.12-2.28).

Social Media Use
In SOCIAL I, all individuals were recruited to be Twitter users. Other social media platforms used by most of the sample were Facebook, Instagram, and YouTube (Table 3). There appeared to be several statistically significant differences in social media use by Twitter status, but these effects were mostly attributable to bot-like users (eg, bot-like users were more likely to report being on LINE than valid users). Individuals who provided valid Twitter handles were more likely than individuals who provided invalid Twitter handles to report that they used Tumblr (OR 1.48, 95% CI 1.07-2.05) and Pinterest (OR 1.79, 95% CI 1.37-2.34).

Mental Health
The POMP scores for the various measures of psychopathology as well as the reported diagnoses are presented in Figure 1 and Table 4. Stress and insomnia were the most commonly endorsed symptoms. Major depression and generalized and social anxiety were the most commonly reported clinical diagnoses. There were a few statistically significant differences between the groups in clinical symptoms or diagnoses, but the differences we did find were very small. For example, the largest difference between the groups was in self-reported manic symptoms and suggested that individuals who provided valid Twitter handles had lower symptoms of hypomania than individuals who did not provide Twitter handles, although this difference was small by conventional standards (β=-0.22, 95% CI -0.28 to -0.15). The next largest difference between the groups was in self-reported issues with alcohol and suggested that individuals who provided valid Twitter handles had lower alcohol use symptoms than individuals who did not provide Twitter handles, although this difference was also small (β=-0.19, 95% CI -0.25 to -0.12). Compared with individuals who provided invalid handles, individuals who provided valid handles were less likely to report relatively rare diagnoses, such as somatic symptom disorder (OR 0.38, 95% CI 0.21-0.68) and drug use disorder (OR 0.60, 95% CI 0.39-0.92).

Demographic Characteristics
In SOCIAL II (N=1988), age was more restricted to the range 18 to 22 years, as would be expected of undergraduate students (mean 19.07, SD 2.91 years; Table 5). The sample was primarily female. There were 3 statistically significant differences between individuals based on their Twitter user status, of which 2 involved the individuals with valid Twitter user names. First, individuals who provided valid Twitter handles used Twitter more frequently than individuals who were Twitter users but did not provide their handles or provided invalid handles (OR 2.48, 95% CI 1.98-3.13). Second, there were differences in reported race and ethnicity by Twitter user status (Cramer V=0.03, 95% CI 0.02-0.07). Specifically, individuals who provided valid Twitter handles were less likely to be Hispanic than individuals who were Twitter users but did not provide their handles (OR 0.40, 95% CI 0.21-0.77).

Social Media
In SOCIAL II, Instagram and Snapchat were the most popular platforms and were used by almost all individuals (Table 3). Twitter, Facebook, YouTube, and TikTok were also quite popular, being used by 61.77% (1228/1988) to 78.72% (1530/1988) of the sample. A number of differences emerged with respect to which social media platforms the participants used. However, most of these differences indicated that individuals who denied being Twitter users were also less likely to use other platforms. The 2 exceptions were that the individuals in SOCIAL II who used Twitter and provided valid handles were more likely to also use TikTok (OR 1.50, 95% CI 1.05-2.14) and Pinterest (OR 1.35, 95% CI 1.04-1.77) than the individuals who refused to provide handles or provided invalid handles.

Mental Health
The POMP scores for the various measures of psychopathology as well as the reported diagnoses are presented in Figure 2 and Table 6. Similar to SOCIAL I, in SOCIAL II, stress and insomnia were the most commonly endorsed symptoms, and major depression and generalized and social anxiety were the most commonly reported clinical diagnoses. Relative to SOCIAL I (Table 4), there were even fewer statistically significant differences between the groups in clinical symptoms or diagnoses. The largest difference between individuals who provided valid (vs invalid) handles was in agoraphobic symptoms, and it was relatively small in magnitude (β=-0.13, 95% CI 0.03-0.15).

Principal Findings
Our results suggest that it is feasible to collect social media data from individuals who also provide information about a breadth of mental health symptoms. We found no evidence that individuals who provide valid Twitter accounts are a biased sample when compared with individuals who provide invalid handles, do not provide their handles, do not use Twitter, or are classified as bot-like. The widespread availability of social media [10] has facilitated research on large samples with longitudinal observations [26-28]. Although the nature of social media activity can be fairly simple (eg, posting short bits of text and sharing audiovisual content), researchers have made well-supported inferences from this activity about the way mood [26-28], sleep patterns [11], social relations [26], and personality [28] manifest in real-world contexts. Most of this work lacks the measurement of clinically relevant variables, such as the validated assessments of depression, anxiety, and other mental disorder symptoms that we used. For example, we conducted a study characterizing the language of individuals who self-identified as having received a clinical diagnosis of depression [15], finding that they use language that is more negative and rigid than that of a random sample [15]. Although the findings obtained using individual self-identification are interesting, they are subject to a variety of possible sample and observation biases and bear replication against validated clinical screening scales such as the ones we used in this study.
We conducted the SOCIAL I and SOCIAL II studies to triangulate data and metadata obtained from social media with a range of validated clinical self-reports of symptoms of distress (ie, depression, stress, and generalized anxiety), fear (ie, panic and social anxiety), substance use (ie, alcohol and other drugs), somatoform problems (ie, insomnia and chronic pain), and potential thought disorder symptoms (ie, symptoms consistent with hypomania). However, a concern about studies triangulating clinical data and social media data remains that individuals who volunteer their social media accounts in such studies are not representative of individuals on social media in general [20]. In this report, we compared the baseline sociodemographic, clinical, and social media variables of Twitter users who provided valid Twitter handles with those of Twitter users who provided handles associated with accounts with high bot scores, Twitter users who provided invalid account names, and, in SOCIAL II, non-Twitter users. Relative to individuals who did not provide handles or who provided invalid handles, individuals who provided valid Twitter handles used Twitter less often in SOCIAL I and more often in SOCIAL II, and most individuals reported using Twitter "several times a day." By and large, the differences between the groups were not statistically significant, and when they were statistically significant, they were small in magnitude. This suggests that prior work that focuses on individuals who self-disclose valid Twitter handles is generalizable, at least with regard to the demographic, clinical, and social media features measured here. We observed other demographic differences between the 2 cohorts. For example, in SOCIAL I, cisgender women were more likely to provide their handles, as were lesbian, gay, bisexual, and queer individuals (vs heterosexual individuals) and those who reported lower (vs higher) incomes.
Nonetheless, in all cases in which we did detect differences, there was complete overlap in the distributions of continuous and ordinal variables, and the effect sizes were relatively small in magnitude. Again, these results are encouraging regarding the generalizability of research on people who volunteer their handles to social media users more broadly and, therefore, do not support the critique that relying on a sample of users who are willing to provide their Twitter handles will lead to significant sample bias.

Limitations and Strengths
Some limitations inherent in our data are worth considering. First, social media use, especially frequent social media use, is neither random nor normally distributed. Evidence suggests, for example, that a small portion of users are responsible for a large number of tweets. Thus, future analyses of the SOCIAL I and II data sets and related data sets should consider the frequency of social media activity as well as the nature of that activity. In addition, the decision to enter a study focused on social media may in itself introduce a selection bias that we cannot guard against. Although our samples allow us to study mental health and social media across units of analysis (ie, self-report, text data, and metadata), we lack more objective data, including biomarkers or even observer reports of mental health symptoms. Importantly, although we did not conduct semistructured interviews about mental disorder diagnoses, the diagnosis of mental disorders is largely influenced by the severity of symptoms [29]. For many clinical problems such as depression [30], anxiety, and alcohol use [31], scores on disorder screening scales such as the ones we used are excellent predictors of diagnoses in clinical interviews.

Future Directions
Although social media samples are not representative of the entire population, social media users represent 20%-70% of all individuals in the United States [10], thereby providing a sample that constitutes a plurality of the entire population in the United States. In SOCIAL I, we collected a relatively heterogeneous sample of Twitter users. SOCIAL II was a more homogeneous sample, but it had the advantage of containing a subsample of individuals who did not use Twitter or were unwilling to share these data. We collected an assortment of transdiagnostic features of psychopathology representing the most common symptoms of poor mental health. These data will allow us to assess how the spectrum and range of psychopathology manifest in natural language and social networks.
With these data sets, we can triangulate self-reported clinical data and data collected from social media. In both samples, mental health symptoms were relatively well represented, making them good, large-scale samples for studying psychopathology. Our current analyses suggest that individuals in these data sets who volunteered to give their Twitter handle, and provided a valid handle, were not different from other individuals in terms of their demographic characteristics, social media use, and mental health. A future direction for this line of work is to use self-reported mental health to replicate findings in which mental health is inferred through social media activity. Another direction is to extend the data collection to include ecological momentary assessments to triangulate to what extent social media behavior is a valid window into individuals' mental health.