ORIGINAL RESEARCH article

Front. Psychiatry, 09 February 2023
Sec. Digital Mental Health
This article is part of the Research Topic Understanding Public Discourse for Digital Mental Health Promotion

Detecting depression of Chinese microblog users via text analysis: Combining Linguistic Inquiry Word Count (LIWC) with culture and suicide related lexicons

Sihua Lyu1,2, Xiaopeng Ren1,2, Yihua Du3, Nan Zhao1,2*
  • 1CAS Key Laboratory of Behavioral Science, Institute of Psychology, Chinese Academy of Sciences, Beijing, China
  • 2Department of Psychology, University of Chinese Academy of Sciences, Beijing, China
  • 3Computer Network Information Center, Chinese Academy of Sciences, Beijing, China

Introduction: In recent years, research has used psycholinguistic features in public discourse, networking behaviors on social media and profile information to train models for depression detection. However, the most widely adopted approach for the extraction of psycholinguistic features is to use the Linguistic Inquiry Word Count (LIWC) dictionary and various affective lexicons. Other features related to cultural factors and suicide risk have not been explored. Moreover, the use of social networking behavioral features and profile features limits the generalizability of the model. Therefore, our study aimed to build a prediction model of depression for text-only social media data using a wider range of linguistic features related to depression, and to illuminate the relationship between linguistic expression and depression.

Methods: We collected 789 users’ depression scores as well as their past posts on Weibo, and extracted a total of 117 lexical features via Simplified Chinese Linguistic Inquiry Word Count, Chinese Suicide Dictionary, Chinese Version of Moral Foundations Dictionary, Chinese Version of Moral Motivation Dictionary, and Chinese Individualism/Collectivism Dictionary.

Results: All the dictionaries contributed to the prediction. The best-performing model was linear regression: the Pearson correlation coefficient between predicted values and self-reported values was 0.33, the R-squared was 0.10, and the split-half reliability was 0.75.

Discussion: This study not only developed a predictive model applicable to text-only social media data, but also demonstrated the importance of taking cultural psychological factors and suicide-related expressions into consideration in the calculation of word frequency. Our research provides a more comprehensive understanding of how lexicons related to cultural psychology and suicide risk are associated with depression, and could contribute to the recognition of depression.

1. Introduction

Depression is considered a common mental illness worldwide, affecting an estimated 3.8% of the population, including 5.0% of young adults and 5.7% of adults over 60 (1). Globally, approximately 280 million people suffer from depression (1). In China, nearly 50 million people were estimated to have depression, accounting for 3.6% of the country’s population (1). In addition, studies have shown that the prevalence of depressive symptoms among different age groups and occupational groups ranged from 17 to 40% (2–6).

Depression has a significant negative effect on patients, leading to poorer quality of life, cognitive dysfunction, low work productivity and unemployment (7–10). Moreover, it poses a huge economic and psychological burden for both the family and society (11–14). Recent research further suggested that the COVID-19 pandemic has increased the prevalence and burden of depressive disorders, especially for vulnerable populations such as females, younger people and medical staff (15–17). As the early detection and intervention of depression could mitigate its negative effects (18, 19), it is of great value to provide depression screening and tracking services, especially for those vulnerable groups.

Many depression assessment scales, such as the Center for Epidemiological Studies Depression Scale (CES-D), have been designed for depression screening. The U.S. Preventive Services Task Force (20) demonstrated that these scales have good sensitivity (80–90%) and fair specificity (70–85%). Nevertheless, two major disadvantages arise when these scales are used for routine depression screening at large scale. Firstly, although online technology has eased the difficulty and cost of conducting traditional questionnaires, the survey response rate still needs to be improved (21, 22), especially for follow-up surveys. Secondly, collecting data from the targeted population for large-scale depression screening can be time-consuming.

Given these shortcomings, an increasing number of studies have investigated the possibility of screening depressive symptoms passively and automatically. Some researchers have shown the potential of using neurophysiological biomarkers and text data collected from social media platforms to detect and measure depression (23–25). With the development of online technology, it is more cost-effective to screen for depression at large scale through social media. Our concern is therefore how to understand and improve prediction models of depression that use social media data.

Throughout the literature, previous research based on social media platforms has mainly made use of three types of features when training models: social networking behaviors, profile features, and psycholinguistic features (26–31). In terms of psycholinguistic features, the most widely adopted approach for feature extraction is to use the Linguistic Inquiry Word Count (LIWC) dictionary and various affective lexicons. Other factors in individual characteristics have not been included in the analysis of textual content in past research.

However, sociocultural factors have been considered by an increasing number of mental health scholars and practitioners in their conceptions of the causality and treatment of mental disorders. For example, Marsella et al. (32) proposed the interactional model of behavior, which relates both the physical environment and socioenvironmental phenomena to individuals’ biological and psychological variables. In addition, many studies have explained how culture-related factors are entangled with individuals’ mental health. Specifically, Helgeson (33) stated that moral motivation such as communion could benefit mental well-being by improving relationship satisfaction. Moreover, Kirsh and Kuiper (34) pointed out that people with excessive individualism and relatedness were more likely to engage in negative thinking. Furthermore, discrepancies between one’s own moral standards and those of society could positively predict depression (35). Thus, cultural factors such as moral motivation, individualism, and moral foundation might have an impact on mental health, in particular depression. Moreover, Coryell and Young (36) found that depressive individuals have a higher suicide risk. Thus, it also seems plausible to use suicidal expression to help train the detection model.

Checking a wider range of possible linguistic features would give a more comprehensive understanding of the relationship between real-life verbal expression and depression, and might also increase the performance of the prediction model. Furthermore, developing and optimizing the linguistic features would greatly help model generalization, as the use of social networking behavioral features and profile features usually makes it impossible to apply the model on a different platform or in a text-only condition.

Therefore, our study aimed at building a prediction model of depression for text-only social media data so that the model could be generalized across platforms. Furthermore, a wider range of depression-related lexicons was tested in order to explore how these lexical features are related to depression. Specifically, we not only used the Simplified Chinese Linguistic Inquiry Word Count (SCLIWC) dictionary, but also covered psycholinguistic features representing suicidal expression, moral foundation, moral motivation and individualism/collectivism.

2. Materials and methods

2.1. Participants

We recruited 1,813 Chinese users of Sina Weibo, the most popular Chinese social media platform, to participate in this study. Participants were informed about the research and required to fill in the electronic consent form. Referring to the participant screening procedures of prior research (37–39), the inclusion criteria for our study were as follows: (1) users who had over 50 posts with a total word count of more than 500 words in the month before the questionnaire was completed (n = 895); (2) at least 70% of the words should be identifiable by the SCLIWC lexicon (n = 851); (3) users whose questionnaire completion time was over 90 s (n = 789). Finally, 789 valid participants remained (287 males), with a mean age of 24.3 years (SD = 6.2, range = 13–57).
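The three inclusion criteria can be read as a simple filter. The following is a minimal sketch, assuming a hypothetical per-user record; the field names are invented for illustration and are not taken from the study's data.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    # All field names are hypothetical; the source does not describe its data schema.
    n_posts: int                # posts in the month before the questionnaire
    n_words: int                # total word count of those posts
    liwc_coverage: float        # share of words identifiable by SCLIWC, from 0 to 1
    completion_seconds: float   # time spent completing the questionnaire


def is_eligible(p: Participant) -> bool:
    """Apply the three inclusion criteria in the order given in the text."""
    active_enough = p.n_posts > 50 and p.n_words > 500   # criterion (1)
    covered = p.liwc_coverage >= 0.70                    # criterion (2)
    careful = p.completion_seconds > 90                  # criterion (3)
    return active_enough and covered and careful
```

Applied in sequence, filters of this kind would reproduce the reported reduction from 1,813 recruits to 895, 851, and finally 789 participants.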

2.2. Measurement

2.2.1. CES-D

The CES-D scale was used in the study to assess self-reported depressive symptoms (40). The CES-D consists of 20 items, each rated on a scale of 0–3, with 0 representing rarely or none of the time (less than 1 day in the past week) and 3 indicating most or all of the time (5–7 days in the past week). Previous research has suggested cutoff scores to differentiate patients by severity of depressive symptoms: (1) no depression, score 0–15; (2) mild depression, score 16–20; (3) moderate depression, score 21–25; (4) severe depression, score 26–60 (40, 41).
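The cutoff scheme above amounts to a lookup on the total score. The sketch below assumes the total has already been computed from the 20 items; the function name is illustrative only.

```python
def cesd_severity(total_score: int) -> str:
    """Map a CES-D total score (0-60) to the severity bands quoted in the text."""
    if not 0 <= total_score <= 60:
        raise ValueError("CES-D total must lie between 0 and 60")
    if total_score <= 15:
        return "no depression"
    if total_score <= 20:
        return "mild depression"
    if total_score <= 25:
        return "moderate depression"
    return "severe depression"
```

For instance, a total of 16 falls in the mild band and a total of 26 in the severe band.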

2.3. Procedure

2.3.1. Data collection

We developed a Weibo-based application named “XinLiDiTu” to recruit participants (Figure 1). Weibo users were paid for filling out an online survey containing the CES-D scale and demographic questions (e.g., gender, age). Then, a crawler collected their original public Weibo posts from the pre-constructed Weibo data pool (42). It should be noted that we only collected the posts in the month prior to the day the survey was completed, as the CES-D scale was designed to measure current psychological state.

Figure 1. “XinLiDiTu” website.

2.3.2. Feature extraction

We employed the Simplified Chinese LIWC (SCLIWC) dictionary, Chinese Suicide Dictionary, Chinese Version of Moral Foundations Dictionary, Chinese Version of Moral Motivation Dictionary and Chinese Individualism/Collectivism Dictionary to extract word frequency features from Weibo posts.

SCLIWC maps written expression into over 80 psychologically or linguistically meaningful categories, covering individuals’ psychological aspects such as emotional and cognitive processes (43).

The Chinese Suicide Dictionary consists of 2,168 words, which belong to 13 different categories related to the risk of suicide (e.g., suicide ideation) (44).

Chinese Version of Moral Foundations Dictionary is composed of 580 words for five moral foundations (Care/Fairness/Loyalty/Authority/Sanctity), with each foundation containing both foundation-supporting words (virtues) and foundation-violating words (vices) (45, 46).

The Chinese Version of Moral Motivation Dictionary was adapted by Zhang and Yu (47) from the work of Frimer (48). The Chinese version has 690 words for the agency dimension and 260 words for the communion dimension.

Chinese Individualism/Collectivism Dictionary was developed by Ren et al. (49), including 53 individualism words and 64 collectivism words.

We aggregated all the posts of each individual into a single text and calculated the word frequency of the lexical categories from the above dictionaries. Finally, a total of 117 lexical features were extracted from our dataset.
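As an illustration of dictionary-based word frequency features, here is a minimal sketch. It assumes the aggregated text has already been segmented into tokens (Chinese word segmentation is not shown) and that each category is a plain set of words; the toy lexicon in the usage note is invented and does not reproduce any entry of the dictionaries above.

```python
from collections import Counter
from typing import Dict, List, Set


def category_frequencies(tokens: List[str],
                         lexicon: Dict[str, Set[str]]) -> Dict[str, float]:
    """Return, for each category, the share of tokens that belong to it."""
    total = len(tokens)
    if total == 0:
        # No text: every category frequency is defined as zero here.
        return {category: 0.0 for category in lexicon}
    counts = Counter(tokens)
    return {
        category: sum(counts[word] for word in words) / total
        for category, words in lexicon.items()
    }
```

With the placeholder lexicon `{"Psychache": {"lonely", "cry"}, "We": {"we"}}` and the tokens `["i", "feel", "lonely", "lonely"]`, two of the four tokens fall in the first category and none in the second.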

2.3.3. Feature selection

To optimize the performance of the regression models, a greedy algorithm was adopted as the search strategy for feature selection. At each step, the greedy stepwise forward algorithm adds the one feature that provides the highest increase in the evaluation measure. It terminates when the evaluation measure no longer improves or when no candidate features remain. The whole process is described in Figure 2. In our study, the Pearson correlation coefficient between predicted values and true values was used as the evaluation measure.

Figure 2. Forward greedy feature selection. FS(i) represents the selected features in i-th iteration, F(i) represents features that have not been selected in i-th iteration.
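The loop in Figure 2 can be sketched as follows. This is a generic illustration rather than the authors' code: `evaluate` is a caller-supplied function (an assumption of this sketch) that returns the evaluation measure for a candidate feature set, which in the study would be the Pearson correlation between predicted and true values.

```python
from typing import Callable, List, Sequence, Tuple


def greedy_forward_select(
    candidates: Sequence[str],
    evaluate: Callable[[List[str]], float],
) -> Tuple[List[str], float]:
    """Add one feature per step; stop when the score no longer improves."""
    selected: List[str] = []
    remaining = list(candidates)
    best_score = float("-inf")
    while remaining:
        # Score every feature set obtained by adding one remaining feature.
        scored = [(evaluate(selected + [f]), f) for f in remaining]
        step_score, step_feature = max(scored, key=lambda pair: pair[0])
        if step_score <= best_score:
            break  # no candidate improves the evaluation measure
        selected.append(step_feature)
        remaining.remove(step_feature)
        best_score = step_score
    return selected, best_score
```

Because each step only looks one feature ahead, the result is not guaranteed to be the globally best subset; the procedure trades optimality for a number of evaluations that grows at most quadratically with the number of candidates.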

3. Results

3.1. Descriptive statistics

The scores of the CES-D scale were distributed over a relatively wide range from 0 to 57, with a mean of 17.15 and a standard deviation of 11.67. Table 1 shows the percentage distribution of no depression and depression with different levels of symptoms [i.e., mild, moderate, severe; (40, 41)] among all subjects. It can be seen that nearly half of the sample had depressive symptomatology to some extent.

Table 1. The percentage distribution of subjects with different levels of depressive symptoms.

3.2. Correlation between lexical features and depression

We calculated Spearman correlation coefficients between word frequency features and CES-D scores. Before Bonferroni correction, a total of 30 lexical features from these five dictionaries were significantly correlated with depression (see Figure 3). Among them, psychache words from the Chinese Suicide Dictionary reached the highest correlation coefficient with depression scores (r = 0.19, p < 0.001). After Bonferroni correction, only space words (r = −0.13, p < 0.001) from SCLIWC and psychache words (r = 0.19, p < 0.001) were significantly correlated with CES-D scores.

Figure 3. The heatmap between lexical features and CES-D score. I, We, Prepositions, Quantifiers, MultiFunction, Sadness, Inhibition, Inclusive, BiologicalProcess, Health, Sexuality, Relative, Motion, Space, Work, Achieve, Religion, Death, WordsPerSentence, and WordsPerPost were extracted from SCLIWC; Agency and Communion were extracted from The Chinese Version of Moral Motivation Dictionary; Fairness Virtue, IngroupVirtue, AuthorityVirtue, AuthorityVice were extracted from The Chinese Version of Moral Foundation Dictionary, Collectivism was extracted from The Chinese Individualism/Collectivism Dictionary; Psychache, SomaticComplaints, and Trauma/Hurt were extracted from The Chinese Suicide Dictionary.
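For readers who want to see the mechanics, the sketch below computes a Spearman coefficient as the Pearson correlation of average ranks and derives a Bonferroni-adjusted significance threshold. The significance level of 0.05 and the count of 117 tests used in the usage note are assumptions for illustration; the text does not state which values were applied. The p-value computation is omitted.

```python
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman correlation as the Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


def bonferroni_alpha(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold after Bonferroni correction."""
    return alpha / n_tests
```

Under the assumed values, `bonferroni_alpha(0.05, 117)` gives a per-test threshold of roughly 0.0004, which illustrates why most of the 30 uncorrected correlations would no longer count as significant.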

3.3. The performance of regression models

To evaluate the performance of the regression models, we adopted 5-fold cross-validation to calculate the mean R-squared score and the mean Pearson correlation coefficient between predicted values and true values for each algorithm. We used ridge regression, linear regression, support vector regression, random forest regression, and gradient boosting regression to build machine learning models. To validate the effectiveness of the culture and suicide related lexicons in predicting depression, we compared the results for data with and without these added features (full dataset vs. SCLIWC dataset). The three best-performing models for each dataset are shown in Table 2, which supports the view that the performance of the predictive model could be improved with culture and suicide related features.

Table 2. The performance of the regression models with 5-fold-cross validation.
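The two evaluation measures and the fold layout can be sketched in plain Python. This is an illustrative sketch, not the authors' pipeline: the folds are contiguous and unshuffled for simplicity, whereas the source does not say how its folds were drawn.

```python
from typing import Iterator, List, Sequence, Tuple


def pearson_r(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Pearson correlation between true and predicted values."""
    n = len(y_true)
    mt, mp = sum(y_true) / n, sum(y_pred) / n
    cov = sum((t - mt) * (p - mp) for t, p in zip(y_true, y_pred))
    st = sum((t - mt) ** 2 for t in y_true) ** 0.5
    sp = sum((p - mp) ** 2 for p in y_pred) ** 0.5
    return cov / (st * sp)


def r_squared(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


def k_fold_indices(n_samples: int, k: int = 5) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train, test) index lists for k contiguous, unshuffled folds."""
    sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    start = 0
    for size in sizes:
        test = list(range(start, start + size))
        train = list(range(0, start)) + list(range(start + size, n_samples))
        yield train, test
        start += size
```

With 789 users and five folds, each model would be trained on roughly 631 users and scored on the remaining 157 or 158, and the per-fold values of the two measures would then be averaged.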

For the full dataset, the best outcome occurred with linear regression (R2 = 0.10, r = 0.33), with the selected features shown in Table 3. The scatterplots of these selected features against CES-D scores are presented in Figure 4. It can be seen that some of the features significantly associated with depression scores (e.g., psychache) were also selected into the model.

Table 3. Selected features after feature selection.

Figure 4. Scatter plots of selected features with CES-D scores. The horizontal axis represents the word frequency, and the vertical axis represents the CES-D score.

3.4. The split-half reliability of regression models

To obtain the split-half reliability, we split our sample into two parts, one of which was used for model building and the other for testing. We ranked users in descending order of the number of posts and selected the last 90% of users (n = 711) to build the CES-D model. While developing the model, we again used the greedy forward stepwise algorithm for feature selection and linear regression for model training. For the remaining top 10% of users (n = 78), each individual’s Weibo posts were sorted by posting time in ascending order. Their posts were then divided into halves individually (i.e., odd-numbered and even-numbered posts). The CES-D model then predicted a score for each individual from the odd-numbered posts and another from the even-numbered posts, so there were two CES-D scores for each user. The scatterplot (Figure 5) shows a strong, positive relationship (r = 0.75, p < 0.01) between these two scores.

Figure 5. User’s depression scores calculated from even-numbered posts and odd-numbered posts.
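The odd/even procedure can be sketched as below. The `predict` argument is a hypothetical stand-in for the trained CES-D model: any function that turns a list of posts into a predicted score. It is an assumption of this sketch, not part of the source.

```python
from typing import Callable, List, Sequence, Tuple


def split_odd_even(posts_sorted_by_time: Sequence[str]) -> Tuple[List[str], List[str]]:
    """Return (odd-numbered, even-numbered) posts, counting from 1."""
    odd = list(posts_sorted_by_time[0::2])   # 1st, 3rd, 5th, ...
    even = list(posts_sorted_by_time[1::2])  # 2nd, 4th, 6th, ...
    return odd, even


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation; assumes neither input is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def split_half_reliability(
    users_posts: Sequence[Sequence[str]],
    predict: Callable[[List[str]], float],
) -> float:
    """Correlate the scores predicted from each user's two halves."""
    odd_scores, even_scores = [], []
    for posts in users_posts:
        odd, even = split_odd_even(posts)
        odd_scores.append(predict(odd))
        even_scores.append(predict(even))
    return pearson_r(odd_scores, even_scores)
```

Interleaving odd and even posts, rather than cutting each timeline in the middle, keeps both halves spread over the same period, so a change in mood during the month affects the two predictions similarly.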

4. Discussion

4.1. The feasibility of recognizing depression via word frequency information

In this study, lexical features were extracted using SCLIWC, the Chinese Suicide Dictionary, the Chinese Version of Moral Foundations Dictionary, the Chinese Version of Moral Motivation Dictionary and the Chinese Individualism/Collectivism Dictionary to build predictive models of depression through machine learning methods. Results show that the correlation between actual scores and predicted scores reached 0.33, and the split-half reliability was 0.75.

Hu et al. (50) used a total of 927 features, including linguistic features (via SCLIWC), stable behavioral features (i.e., profile features, self-expression behaviors, privacy settings, interpersonal behaviors), and dynamic behavioral features (i.e., microblog updates, mentions, use of apps, recordable browsing behaviors) to build a depression recognition model. The Pearson correlation coefficient between predicted values and self-reported values of the best-performing model in their study reached 0.38. Although their model slightly outperformed ours, they used more complex and less accessible data such as recordable browsing behavior. By taking cultural psychological factors and suicide expressions into consideration in the calculation of word frequency, we obtained a model that was slightly inferior in performance but greatly reduced the difficulty and complexity of feature extraction. Moreover, using only lexical features could improve the generalization ability of the model, so that its application would not be limited to the microblogging platform.

Moreover, linear regression offered the best outcome in terms of model performance and split-half reliability. This might imply a linear relationship between word frequencies in public discourse and users’ depression. Compared to random forest regression and support vector machine, one advantage of linear regression models is that they are easier to interpret. Building a linear model thus provided new evidence that helps reveal and understand the expression pattern of depressive populations.

4.2. The features contributing to depression identification

Compared with correlation analysis, linear regression can estimate the change in depression scores due to a change in one or more independent variables. Thus, the features selected by the linear regression model did contribute to the detection of depression, to varying degrees. Half of the chosen features were from SCLIWC: prepositions, second person plural pronouns, semicolon, present tense, leisure, achieve, humans, swear, non-fluencies and sadness. The other half were from the cultural psychological dictionaries and the suicide dictionary: fairness vice, purity vice, authority virtue, authority vice and morality general from the Chinese Version of Moral Foundations Dictionary; communion from the Chinese Version of Moral Motivation Dictionary; collectivism from the Chinese Individualism/Collectivism Dictionary; psychache, guilt/shame and personality from the Chinese Suicide Dictionary. In the following sections, we elucidate how some of the selected features could be related to depression.

4.2.1. SCLIWC

Linguistic Inquiry Word Count has been widely used to investigate the linguistic markers of depressed people (26–29, 51). The findings suggest that depressed patients differ in writing pattern from the non-depressed group. These differences enable us to detect depression through plain text. Our findings were partially consistent with the previous literature, while some cross-cultural differences emerged.

Firstly, unlike in most past studies (52–54), first person singular pronouns did not play an important role in the model. According to self-awareness theory, self-focused attention is one of the vulnerability factors for the onset and maintenance of depression (55, 56). Thus, the use of first person singular pronouns has long been considered an effective indication of depressive narratives. We considered that this discrepancy could be explained by two reasons: firstly, the absence of the subject was very common in Weibo comments posted by depressed individuals (57); secondly, first person singular pronouns in Chinese do not merely refer to the addresser himself or herself, but are also used as pragmatic empathetic deixis to narrow the psychological distance between addresser and addressee (58). Therefore, we suspect that more research is needed to support the use of first person singular pronouns as a significant indicator of depression in the Chinese corpus.

Secondly, we observed that second person plural pronouns appeared more frequently in the depression group. This was the opposite of the results obtained from studies conducted on English texts (54). However, this result replicated the findings of studies focusing on Chinese texts (59, 60). This could possibly be because social media has become a platform where the depressed group seeks social support and advice (61, 62), and thus they would use second person pronouns in their posts more often.

Despite the above differences, some findings were in line with the previous literature. Present tense and sadness words have also been identified as valid cues for recognizing depression in prior research (28, 63). Depressed people have been described as being stuck in the past (64), focusing more on the past than on the present. Therefore, a lower rate of present tense words might indicate the possibility of depression. The more frequent use of sadness words supported the view that depressed individuals express more negative emotions (53). Furthermore, achievement words were mentioned less by the depressed group in our study. O’Connor et al. (65) suggested that depressed people were significantly higher in survivor guilt. Survivor guilt refers to a dysfunctional belief that the pursuit and achievement of one’s own happiness and fulfillment will cause others to suffer by comparison. Therefore, we inferred that depressed patients were less likely to share their achievements in their posts.

4.2.2. Chinese Suicide Dictionary

For suicide-related expressions, results indicated that psychache words, personality words and guilt/shame words could help identify people with depression. Psychache words (e.g., want to cry, loneliness) reflect one’s psychological distress, which people with depression are more likely to experience and express (66). Guilt/shame words (e.g., lose status, make an apology) embody a sense of guilt and shame. It has been suggested that survivor guilt and omnipotent responsibility guilt are important factors in depression (65). Personality words reflect negative personality traits such as an inferiority complex. The results showed that, compared to healthy individuals, those with depressive tendencies were less likely to mention negative personality words in their public discourse. This brought new insights into our understanding of how depressed people present their self-image on social media. Rosen et al. (67) illustrated that frequent impression management, such as updating profile information, was positively related to depression. Thus, depressed people might also be motivated to avoid negative self-disclosure in order to leave a better impression on others.

4.2.3. Individualism/Collectivism Dictionary and Chinese Version of Moral Motivation Dictionary

Our study shows that the non-depressed group scored higher in collectivism and communion. In collectivist cultures, people stress the importance of the community and value the trait of altruism. Previous literature shows that collectivism could predict greater social support and enhance group identification (68, 69). Further, a higher level of social support and group identification could buffer vulnerable individuals from stressful conditions, and prevent people from developing depressive symptoms (70, 71).

Communion is the motive to promote the interests of others, with themes of caring for others and contributing to society, involving qualities such as benevolence, attachment and empathy (72). Indeed, when we compare communion with collectivist values, it can be seen that communion could also be encouraged in a collectivist culture. A study conducted in Japan, which like China is a typically collectivist country, found that communion was positively associated with social support and contributed to psychological well-being (73). Thus, in East Asian societies, collectivism and communion could probably act as protective factors against depression.

4.2.4. Chinese Version of Moral Foundation Dictionary

The Chinese Version of Moral Foundation Dictionary provided a lot of information for prediction, as nearly half of the lexical features from this dictionary were selected by our model. Graham et al. (74) proposed five moral foundations rooted in human nature, each with a positive and a negative dimension: care/harm, fairness/cheating, loyalty/betrayal, authority/subversion, and sanctity/degradation. The part before the slash is the name of a moral foundation (i.e., virtues), and the part after the slash refers to the corresponding foundation-violating behavior (i.e., vices). Results show that the depressed group differed from the healthy group in the use of fairness vice words, sanctity vice words, authority vice words, authority virtue words and morality general words.

The fairness foundation is based on the ethic of justice, emphasizing the rights and welfare of individuals. A higher score for the depressed group on fairness vice (cheating) meant that they mentioned fairness-violating behavior (e.g., prejudice, inequality) more than the normal group. We suggest that this group might perceive a lower level of social fairness, resulting in poorer mental health, especially for disadvantaged minority groups. Individuals who hold fewer favorable beliefs about the fairness of society would be expected to be less likely to believe that their hard work and efforts would pay off, leading to an increase in depression (75, 76).

Sanctity, also called purity, focuses not only on pathogen avoidance to protect us from being contaminated, but also on cultivating a more spiritual mindset by living in a pure and sanctified way. Han et al. (77) found that purity vice had a significant positive mediating effect on suicidal behavior via the mediator psychache. They pointed out that this could be because the psychology of purity is associated with stigma, which in turn has a negative impact on mental health.

The authority foundation focuses on forging beneficial relationships in the hierarchy. This foundation comprises both virtues of subordinates (e.g., obedience and respect for authority) and virtues of authorities (e.g., leadership and protection). The authority virtue words include expressions that promote obedience and leadership (e.g., obey, respect), whereas the authority vice words contain expressions that describe subversion (e.g., rebel, riot). We observed that non-depressed populations mentioned both the authority vice vocabulary and the authority virtue vocabulary more frequently than depressed populations, two findings that seem to contradict each other. However, in our opinion, the greater use of these two vocabularies is indicative of the group’s emphasis on adherence to authority, since most of the words in the authority vice category are negative descriptions that are more likely to be used in the context of accusations and condemnations of violations of authority. Chinese culture has been largely influenced by Confucianism, which emphasizes order and conformity to authority. For the non-depressed group, obedience to authorities enhanced their adaptation to the social environment, reducing the likelihood of depression (78).

4.2.5. The significance of features beyond basic expression for depression identification

Our results not only demonstrated the importance of word frequencies of online expressions for depression recognition, but more importantly, brought new insights into the relationship between depressive symptoms and linguistic expression by building an interpretable linear model. This study, along with previous research using LIWC to detect depression, has shown a tendency for depressed individuals to use several basic categories of words, including tenses, emotion, personal concerns and so on. Other features found to play an important role in our study, such as communion, focus on the socio-cultural elements of linguistic expression and reflect the motivation of users’ social behavior. This finding suggests that researchers should pay more attention to the socio-cultural features of linguistic expression when observing and monitoring depression. The reason why these features have not been explored in prior literature on depression recognition might be that researchers believed these socio-cultural factors were relatively difficult to capture in the form of word frequencies. Our study, on the other hand, confirmed that the calculation of word frequencies could, to some extent, extract socio-cultural elements in linguistic expression, and that these socio-cultural features had a non-negligible role in the identification and monitoring of depression.

To conclude, this study showed the feasibility of relying solely on lexical features of the text for depression detection, with all five dictionaries contributing to the prediction to varying degrees. That is to say, in addition to LIWC/SCLIWC, which was frequently used in previous research, lexical features related to cultural psychology and suicide risk also played a role in the identification of depression. To better identify the mental health status of individuals on social media, we should attach more importance to textual expressions related to cultural psychology and suicide risk, as these factors could provide information on individuals’ predispositions shaped by the culture and also their perception of the world.

4.3. Implications

The present study demonstrated that public discourse could reflect one’s mental status to some extent. Public discourse reveals important information about individuals’ different values and their perception toward the world (79, 80). Through linguistic analysis along with machine learning methods, we could discriminate depressed content from non-depressed content.

One of the advantages of this model is its applicability to text-only scenarios. A number of previous studies have utilized social media users’ social networking behaviors and profile information to train recognition models (30, 31, 50). However, if such personal information were not accessible, the validity of these models would be greatly compromised, and they might even lose their predictive power completely. By contrast, our study extracted features only from text, which makes our model generalizable across platforms, especially those where only text can be collected.
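To illustrate what a text-only, dictionary-based feature looks like, the following is a minimal sketch of computing category word frequencies from one user's segmented posts. It is not the pipeline used in this study: the lexicon `TOY_LEXICON`, its category names, its word entries, and the function `category_frequencies` are all hypothetical placeholders, and the sketch assumes the posts have already been segmented into words by a Chinese tokenizer.

```python
from collections import Counter

# Hypothetical toy lexicon: category name -> set of words.
# Placeholder entries only; NOT taken from SCLIWC or any dictionary used in the study.
TOY_LEXICON = {
    "first_person_singular": {"我"},              # "I"
    "negative_emotion": {"难过", "痛苦"},          # "sad", "painful"
    "communion": {"我们", "朋友", "一起"},          # "we", "friend", "together"
}


def category_frequencies(tokens, lexicon):
    """Return, for each category, the share of tokens that belong to it."""
    total = len(tokens)
    counts = Counter(tokens)
    features = {}
    for category, words in lexicon.items():
        hits = sum(counts[word] for word in words)
        # Guard against empty input so the feature is always defined.
        features[category] = hits / total if total else 0.0
    return features


# Assumption: this list stands in for one user's posts after word segmentation.
tokens = ["我", "今天", "很", "难过", "我", "想", "和", "朋友", "一起", "吃饭"]
features = category_frequencies(tokens, TOY_LEXICON)
for name, value in features.items():
    print(f"{name}: {value:.2f}")
```

Because features of this kind depend only on the words a user writes, the same computation can be applied to any platform that exposes post text, which is the sense in which a text-only model is portable.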

Furthermore, to our knowledge, this is the first study to take cultural factors and suicide-related expressions into account in feature extraction. This study deepens our understanding of the possible roles of cultural-psychological factors and suicide-related expression in depression identification, and sheds light on the relationship between these lexical features and depression.

4.4. Limitations

This study has a few limitations. Firstly, our analysis of the text is limited to the extraction of word frequency features, which represent the information in a text only to a very limited degree. Using a wider range of feature extraction strategies could enrich our feature set and perhaps further improve the performance of the model. Secondly, we set minimum requirements for posting frequency and total word count when selecting users to construct our dataset, so the model is more suitable for predicting the depression status of individuals who are more active on social media. Thus, there might be a sampling bias, as some individuals with depression may be unwilling to share their lives on social platforms, which would limit the generalization of our results and model to other populations. Thirdly, we only employed conventional machine learning methods (e.g., random forest regression) to build the predictive model. In addition to conventional machine learning methods, deep learning could also be utilized for depression detection, and previous studies have validated the effectiveness of this approach for depression prediction (81). Fourthly, a feature-to-sample ratio of less than 0.10 has been suggested to avoid overfitting (82). In our study, the feature-to-sample ratio was 0.15 (117 features for 789 users), which is close to but still above this threshold. Thus, expanding the sample size to improve the robustness of the model is recommended for future research. In addition, future research could try further combinations of textual feature extraction methods (e.g., n-gram, word2seq) and deep learning algorithms to improve model performance.
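The feature-to-sample arithmetic in the fourth limitation can be checked with a short calculation. The sketch below uses the 117 features and 789 users reported in the Methods and the 0.10 threshold cited from (82); the variable names are illustrative, and the derived sample size and feature count are simple consequences of that threshold rather than figures reported by the study.

```python
import math
from fractions import Fraction

N_FEATURES = 117               # lexical features extracted in this study
N_USERS = 789                  # users in the dataset
THRESHOLD = Fraction(1, 10)    # suggested upper bound on features per sample (82)

# Observed feature-to-sample ratio.
ratio = N_FEATURES / N_USERS

# Smallest sample size at which the ratio no longer exceeds the threshold,
# keeping all 117 features. Exact fractions avoid floating-point rounding.
min_users = math.ceil(Fraction(N_FEATURES) / THRESHOLD)

# Largest feature count that stays within the threshold for the current sample.
max_features = math.floor(N_USERS * THRESHOLD)

print(f"ratio = {ratio:.3f}")            # ratio = 0.148
print(f"min_users = {min_users}")        # min_users = 1170
print(f"max_features = {max_features}")  # max_features = 78
```

Under these assumptions, the threshold could be approached either by recruiting roughly 1,170 or more users or by reducing the feature set to about 78 features, for example through feature selection.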

5. Conclusion

This study found that depression could be detected solely through word frequency features using machine learning methods. This model could have potential value in screening for depression and could be generalized across platforms. Furthermore, our study demonstrated that in addition to LIWC, which was commonly used in previous studies, lexicons related to cultural psychology and suicide risk were also associated with depression and could contribute to the recognition of depression.

Data availability statement

The raw data supporting the conclusions of this article will be made available by the authors, without undue reservation.

Ethics statement

The studies involving human participants were reviewed and approved by the Scientific Research Ethics Committee of the Chinese Academy of Sciences Institute of Psychology. The patients/participants provided their written informed consent to participate in this study.

Author contributions

SL, NZ, and XR contributed to the conception and design of the study and final version of the manuscript. SL was responsible for the data collection and the statistical analysis. SL and NZ wrote the manuscript. YD helped with the manuscript revision. All authors contributed to the article and approved the submitted version.

Funding

This work was financially supported by the Strategic Priority Research Program of Chinese Academy of Sciences (No. XDC02060300), the Scientific Foundation of Institute of Psychology, Chinese Academy of Sciences (No. E2CX4735YZ), Youth Innovation Promotion Association CAS, and the Strategic Priority Research Program of Chinese Academy of Sciences (No. XDA27000000).

Acknowledgments

We thank all subjects for their participation in this study.

Conflict of interest

The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.

Publisher’s note

All claims expressed in this article are solely those of the authors and do not necessarily represent those of their affiliated organizations, or those of the publisher, the editors and the reviewers. Any product that may be evaluated in this article, or claim that may be made by its manufacturer, is not guaranteed or endorsed by the publisher.

References

1. Depression fact sheet. Institute for Health Metrics and Evaluation. (2019). Available online at: https://vizhub.healthdata.org/gbd-results/?params=gbd-api-2019-permalink/d780dffbe8a381b25e1416884959e88b (accessed August 6, 2022).

2. Gong Y, Han T, Chen W, Dib HH, Yang G, Zhuang R, et al. Prevalence of anxiety and depressive symptoms and related risk factors among physicians in China: a cross-sectional study. PLoS One. (2014) 9:e103242. doi: 10.1371/journal.pone.0103242

3. Qin X, Wang S, Hsieh C-R. The prevalence of depression and depressive symptoms among adults in China: estimation based on a National Household Survey. China Econ Rev. (2018) 51:271–82. doi: 10.1016/j.chieco.2016.04.001

4. Tang T, Jiang J, Tang X. Prevalence of depressive symptoms among older adults in mainland China: a systematic review and meta-analysis. J Affect Disord. (2021) 293:379–90. doi: 10.1016/j.jad.2021.06.050

5. Tang X, Tang S, Ren Z, Wong DFK. Prevalence of depressive symptoms among adolescents in secondary school in mainland China: a systematic review and meta-analysis. J Affect Disord. (2019) 245:498–507. doi: 10.1016/j.jad.2018.11.043

6. Xu D-D, Rao W-W, Cao X-L, Wen S-Y, An F-R, Che W-I, et al. Prevalence of depressive symptoms in primary school students in China: a systematic review and meta-analysis. J Affect Disord. (2020) 268:20–7. doi: 10.1016/j.jad.2020.02.034

7. Chung L, Pan A-W, Hsiung P-C. Quality of life for patients with major depression in Taiwan: a model-based study of predictive factors. Psychiatry Res. (2009) 168:153–62. doi: 10.1016/j.psychres.2008.04.003

8. Clark M, DiBenedetti D, Perez V. Cognitive dysfunction and work productivity in major depressive disorder. Expert Rev Pharmacoecon Outcomes Res. (2016) 16:455–63. doi: 10.1080/14737167.2016.1195688

9. Lerner D, Adler DA, Chang H, Lapitsky L, Hood MY, Perissinotto C, et al. Unemployment, job retention, and productivity loss among employees with depression. Psychiatr Serv. (2004) 55:1371–8. doi: 10.1176/appi.ps.55.12.1371

10. Wells KB. The functioning and well-being of depressed patients: results from the medical outcomes study. JAMA. (1989) 262:914. doi: 10.1001/jama.1989.03430070062031

11. Bodden DHM, Stikkelbroek Y, Dirksen CD. Societal burden of adolescent depression, an overview and cost-of-illness study. J Affect Disord. (2018) 241:256–62. doi: 10.1016/j.jad.2018.06.015

12. Briley M, Lépine JP. The increasing burden of depression. Neuropsychiatr Dis Treat. (2011) 7:3–7. doi: 10.2147/NDT.S19617

13. Maurer DM, Raymond TJ, Davis BN. Depression: screening and diagnosis. Am Family Phys. (2018) 98:508–15.

14. Sartorius N. The economic and social burden of depression. J Psychiatry. (2001) 62:8–11.

15. Santomauro DF, Herrera AMM, Shadid J, Zheng P, Ashbaugh C, Pigott DM, et al. Global prevalence and burden of depressive and anxiety disorders in 204 countries and territories in 2020 due to the COVID-19 pandemic. Lancet. (2021) 398:1700–12. doi: 10.1016/S0140-6736(21)02143-7

16. Varma P, Junge M, Meaklim H, Jackson ML. Younger people are more vulnerable to stress, anxiety and depression during COVID-19 pandemic: a global cross-sectional survey. Prog Neuro-Psychopharmacol Biol Psychiatry. (2021) 109:110236. doi: 10.1016/j.pnpbp.2020.110236

17. Wu K, Wei X. Analysis of psychological and sleep status and exercise rehabilitation of front-line clinical staff in the fight against COVID-19 in China. Med Sci Monit Basic Res. (2020) 26:e924085. doi: 10.12659/MSMBR.924085

18. Halfin A. Depression: the benefits of early and appropriate treatment. Am J Managed Care. (2007) 13(4 Suppl):S92–7.

19. Picardi A, Lega I, Tarsitani L, Caredda M, Matteucci G, Zerella MP, et al. A randomised controlled trial of the effectiveness of a program for early detection and treatment of depression in primary care. J Affect Disord. (2016) 198:96–101. doi: 10.1016/j.jad.2016.03.025

20. U.S. Preventive Services Task Force. Screening for depression in adults: US Preventive Services Task Force recommendation statement. Ann Internal Med. (2009) 151:784–92. doi: 10.7326/0003-4819-151-11-200912010-00006

21. BinDhim NF, Shaman AM, Trevena L, Basyouni MH, Pont LG, Alhawassi TM. Depression screening via a smartphone app: cross-country user characteristics and feasibility. J Am Med Informat Assoc. (2015) 22:29–34. doi: 10.1136/amiajnl-2014-002840

22. Stenman M, Sartipy U. Depression screening in cardiac surgery patients. Heart Lung Circ. (2019) 28:953–8. doi: 10.1016/j.hlc.2018.04.298

23. Liu D, Feng XL, Ahmed F, Shahid M, Guo J. Detecting and measuring depression on social media using a machine learning approach: systematic review. JMIR Ment Health. (2022) 9:e27244. doi: 10.2196/27244

24. Mumtaz W, Malik AS, Yasin MAM, Xia L. Review on EEG and ERP predictive biomarkers for major depressive disorder. Biomed Signal Process Control. (2015) 22:85–98. doi: 10.1016/j.bspc.2015.07.003

25. Orrù G, Pettersson-Yeo W, Marquand AF, Sartori G, Mechelli A. Using Support Vector Machine to identify imaging biomarkers of neurological and psychiatric disease: a critical review. Neurosci Biobehav Rev. (2012) 36:1140–52. doi: 10.1016/j.neubiorev.2012.01.004

26. Fatima I, Mukhtar H, Ahmad HF, Rajpoot K. Analysis of user-generated content from online social communities to characterise and predict depression degree. J Informat Sci. (2018) 44:683–95. doi: 10.1177/0165551517740835

27. Islam R, Kabir MA, Ahmed A, Kamal ARM, Wang H, Ulhaq A. Depression detection from social network data using machine learning techniques. Health Informat Sci Syst. (2018) 6:8. doi: 10.1007/s13755-018-0046-0

28. Nguyen T, Phung D, Dao B, Venkatesh S, Berk M. Affective and content analysis of online depression communities. IEEE Transac Affect Comput. (2014) 5:217–26. doi: 10.1109/TAFFC.2014.2315623

29. Shatte ABR, Hutchinson DM, Fuller-Tyszkiewicz M, Teague SJ. Social media markers to identify fathers at risk of postpartum depression: a machine learning approach. Cyberpsychol Behav Soc Netw. (2020) 23:611–8. doi: 10.1089/cyber.2019.0746

30. Shen G, Jia J, Nie L, Feng F, Zhang C, Hu T, et al. Depression detection via harvesting social media: a multimodal dictionary learning solution. Proceedings of the Twenty-Sixth International Joint Conference on Artificial Intelligence. Palo Alto, CA: The AAAI Press (2017). p. 3838–44. doi: 10.24963/ijcai.2017/536

31. Wang X, Zhang C, Ji Y, Sun L, Wu L, Bao Z. A depression detection model based on sentiment analysis in micro-blog social network. In: Li J, Cao L, Wang C, Tan KC, Liu B, Pei J, et al. editors. Trends and Applications in Knowledge Discovery and Data Mining. (Vol. 7867), Berlin: Springer (2013). p. 201–13. doi: 10.1007/978-3-642-40319-4_18

32. Marsella AJ, Tharp RG, Ciborowski TP, Ciborowski TJ. Perspectives on Cross-Cultural Psychology. Cambridge, MA: Academic Press (1979).

33. Helgeson VS. Relation of agency and communion to well-being: evidence and potential explanations. Psychol Bull. (1994) 116:412–28. doi: 10.1037/0033-2909.116.3.412

34. Kirsh GA, Kuiper NA. Individualism and relatedness themes in the context of depression, gender, and a self-schema model of emotion. Can Psychol. (2002) 43:76–90. doi: 10.1037/h0086904

35. Peker M, Gündoğdu N, Booth RW. Perceived self-society moral discrepancies predict depression but not anxiety: moral discrepancies and depression. Asian J Soc Psychol. (2015) 18:337–42. doi: 10.1111/ajsp.12100

36. Coryell W, Young EA. Clinical predictors of suicide in primary major depressive disorder. J Clin Psychiatry. (2005) 66:412–7. doi: 10.4088/jcp.v66n0401

37. Li S, Wang Y, Xue J, Zhao N, Zhu T. The impact of COVID-19 epidemic declaration on psychological consequences: a study on active Weibo users. Int J Environ Res Public Health. (2020) 17:20–32. doi: 10.3390/ijerph17062032

38. Vine V, Boyd RL, Pennebaker JW. Natural emotion vocabularies as windows on distress and well-being. Nat Commun. (2020) 11:4525. doi: 10.1038/s41467-020-18349-0

39. De Choudhury M, Gamon M, Counts S, Horvitz E. Predicting depression via social media. Proceedings of the Seventh International AAAI Conference on Weblogs and Social Media. Palo Alto, CA: Association for the Advancement of Artificial Intelligence (2013). doi: 10.1609/icwsm.v7i1.14432

40. Radloff LS. The CES-D scale: a self-report depression scale for research in the general population. Appl Psychol Meas. (1977) 1:385–401. doi: 10.1177/014662167700100306

41. Chwastiak L, Ehde DM, Gibbons LE, Sullivan M, Bowen JD, Kraft GH. Depressive symptoms and severity of illness in multiple sclerosis: epidemiologic study of a large community sample. Am J Psychiatry. (2002) 159:1862–8. doi: 10.1176/appi.ajp.159.11.1862

42. Li L, Li A, Hao B, Guan Z, Zhu T. Predicting active users’ personality based on micro-blogging behaviors. PLoS One. (2014) 9:e84997. doi: 10.1371/journal.pone.0084997

43. Zhao N, Jiao D, Bai S, Zhu T. Evaluating the validity of simplified Chinese version of LIWC in detecting psychological expressions in short texts on social network services. PLoS One. (2016) 11:e0157947. doi: 10.1371/journal.pone.0157947

44. Lv M, Li A, Liu T, Zhu T. Creating a Chinese suicide dictionary for identifying suicide risk on social media. PeerJ. (2015) 3:e1455.

45. Graham J, Haidt J, Nosek BA. Liberals and conservatives rely on different sets of moral foundations. J Pers Soc Psychol. (2009) 96:1029–46. doi: 10.1037/a0015141

46. Wu S, Yang C, Zhang Y. The Chinese version of moral foundations dictionary: a brief introduction and pilot analysis. ChinaXiv. (2019). doi: 10.12074/201911.00002

47. Zhang Y, Yu F. Which socio-economic indicators influence collective morality? Big data analysis on online Chinese social media. Emerg Mark Finance Trade. (2018) 54:792–800. doi: 10.1080/1540496X.2017.1321984

48. Frimer JA. The Moral Motivation Dictionary. (2013). Available online at: http://www.jeremyfrimer.com (accessed October 1, 2020).

49. Ren X, Xiang Y, Zhou Y, Zhu T. Individualism/collectivism Map of China Based on Weibo. J Inner Mong Norm Univ. (2017) 46:59–64.

50. Hu Q, Li A, Heng F, Li J, Zhu T. Predicting depression of social media user on different observation windows. Proceedings of the 2015 IEEE/WIC/ACM International Conference on Web Intelligence and Intelligent Agent Technology (WI-IAT). (Vol. 1), Niagara Falls, NY: IEEE (2015). p. 361–4. doi: 10.1109/WI-IAT.2015.166

51. Preoţiuc-Pietro D, Eichstaedt J, Park G, Sap M, Smith L, Tobolsky V, et al. The role of personality, age, and gender in tweeting about mental illness. Proceedings of the 2nd Workshop on Computational Linguistics and Clinical Psychology: From Linguistic Signal to Clinical Reality. Denver: Association for Computational Linguistics (2015). p. 21–30. doi: 10.3115/v1/W15-1203

52. Liu J, Shi M. What are the characteristics of user texts and behaviors in Chinese depression posts? Int J Environ Res Public Health. (2022) 19:10. doi: 10.3390/ijerph19106129

53. Rude S, Gortner E-M, Pennebaker J. Language use of depressed and depression-vulnerable college students. Cognit Emot. (2004) 18:1121–33. doi: 10.1080/02699930441000030

54. Vedula N, Parthasarathy S. Emotional and linguistic cues of depression from social media. Proceedings of the 2017 International Conference on Digital Health. London: ACM Digital Library (2017). p. 127–36. doi: 10.1145/3079452.3079465

55. Brockmeyer T, Zimmermann J, Kulessa D, Hautzinger M, Bents H, Friederich H-C, et al. Me, myself, and I: self-referent word use as an indicator of self-focused attention in relation to depression and anxiety. Front Psychol. (2015) 6:1564. doi: 10.3389/fpsyg.2015.01564

56. Pyszczynski T, Greenberg J. Self-regulatory perseveration and the depressive self-focusing style: a self-awareness theory of reactive depression. Psychol Bull. (1987) 102:122–38. doi: 10.1037/0033-2909.102.1.122

57. Yating C. Online Discourse of Depression in China: Linguistic Characteristics of ‘Zoufan’ Community. (2022). Available online at: https://discourseanalysis.net/sites/default/files/2022-07/Yating_2022_DNCWPS_7.pdf (accessed June 7, 2022).

58. Zhang S, Wu X, Feng Y. An analysis of cultural differences in Chinese and English first-person deixis from the perspective of pragmatic empathy. Theory Pract Lang Stud. (2013) 3:1868–72. doi: 10.4304/tpls.3.10.1868-1872

59. Cheng Q, Li TM, Kwok C-L, Zhu T, Yip PS. Assessing suicide risk and emotional distress in Chinese social media: a text mining and machine learning study. J Med Internet Res. (2017) 19:e7276. doi: 10.2196/jmir.7276

60. Leis A, Ronzano F, Mayer MA, Furlong LI, Sanz F. Detecting signs of depression in tweets in Spanish: behavioral and linguistic analysis. J Med Internet Res. (2019) 21:e14199. doi: 10.2196/14199

61. De Choudhury M, De S. Mental health discourse on reddit: self-disclosure, social support, and anonymity. Proceedings of the Eighth International AAAI Conference on Weblogs and Social Media. Palo Alto, CA: The AAAI Press (2014). doi: 10.1609/icwsm.v8i1.14526

62. Tian X, Batterham P, Song S, Yao X, Yu G. Characterizing depression issues on Sina Weibo. Int J Environ Res Public Health. (2018) 15:4. doi: 10.3390/ijerph15040764

63. Rodriguez AJ, Holleran SE, Mehl MR. Reading between the lines: the lay assessment of subclinical depression from written self-descriptions. J Pers. (2010) 78:575–98. doi: 10.1111/j.1467-6494.2010.00627.x

64. Habermas T, Ott L-M, Schubert M, Schneider B, Pate A. Stuck in the past: negative bias, explanatory style, temporal order, and evaluative perspectives in life narratives of clinically depressed individuals. Depress Anxiety. (2008) 25:E121–32. doi: 10.1002/da.20389

65. O’Connor LE, Berry JW, Weiss J, Gilbert P. Guilt, fear, submission, and empathy in depression. J Affect Disord. (2002) 71:19–27. doi: 10.1016/S0165-0327(01)00408-6

66. Olié E, Guillaume S, Jaussent I, Courtet P, Jollant F. Higher psychological pain during a major depressive episode may be a factor of vulnerability to suicidal ideation and act. J Affect Disord. (2010) 120:226–30. doi: 10.1016/j.jad.2009.03.013

67. Rosen LD, Whaling K, Rab S, Carrier LM, Cheever NA. Is Facebook creating “iDisorders”? The link between clinical symptoms of psychiatric disorders and technology use, attitudes and anxiety. Comput Hum Behav. (2013) 29:1243–54. doi: 10.1016/j.chb.2012.11.012

68. Chen J, Zhou G. Chinese international students’ sense of belonging in North American postsecondary institutions: a critical literature review. Brock Educ J. (2019) 28:48–63. doi: 10.26522/brocked.v28i2.642

69. Goodwin R, Hernandez PS. Perceived and received social support in two cultures: collectivism and support among British and Spanish students. J Soc Pers Relat. (2000) 17:282–91. doi: 10.1177/0265407500172007

70. Cruwys T, Alexander Haslam S, Dingle GA, Jetten J, Hornsey MJ, Desdemona Chong EM, et al. Feeling connected again: interventions that increase social identification reduce depression symptoms in community and clinical settings. J Affect Disord. (2014) 159:139–46. doi: 10.1016/j.jad.2014.02.019

71. Moscardino U, Scrimin S, Capello F, Altoè G. Social support, sense of community, collectivistic values, and depressive symptoms in adolescent survivors of the 2004 Beslan terrorist attack. Soc Sci Med. (2010) 70:27–34. doi: 10.1016/j.socscimed.2009.09.035

72. Frimer JA, Walker LJ, Lee BH, Riches A, Dunlop WL. Hierarchical integration of agency and communion: a study of influential moral figures. J Pers. (2012) 80:1117–45. doi: 10.1111/j.1467-6494.2012.00764.x

73. Hirokawa K, Dohi I. Agency and communion related to mental health in Japanese young adults. Sex Roles. (2007) 56:517–24. doi: 10.1007/s11199-007-9190-8

74. Graham J, Haidt J, Koleva S, Motyl M, Iyer R, Wojcik SP, et al. Moral foundations theory: the pragmatic validity of moral pluralism. In: Zanna MP editor. Advances in Experimental Social Psychology. (Vol. 47), Amsterdam: Elsevier (2013). p. 55–130. doi: 10.1016/B978-0-12-407236-7.00002-4

75. Dover TL, Major B, Glace AM. Discrimination, health, and the costs and benefits of believing in system fairness. Health Psychol. (2020) 39:230–9. doi: 10.1037/hea0000841

76. Roh M. The effects of perceived social fairness and the possibility of upward social mobility on emotional depression. J Korea Contents Assoc. (2021) 21:173–84. doi: 10.5392/JKCA.2021.21.01.173

77. Han Y, Li H, Xiao Y, Li A, Zhu T. Influential path of social risk factors toward suicidal behaviour-evidence from Chinese Sina Weibo Users 2013-2018. Int J Environ Res Public Health. (2021) 18:2604. doi: 10.3390/ijerph18052604

78. Paradiso S, Naridze R, Holm-Brown E. Lifetime romantic attachment style and social adaptation in late-onset depression. Int J Geriatr Psychiatry. (2012) 27:1008–16. doi: 10.1002/gps.2814

79. Amirmokhtar Radi S, Shokouhyar S. Toward consumer perception of cellphones sustainability: a social media analytics. Sustain Product Consumpt. (2021) 25:217–33. doi: 10.1016/j.spc.2020.08.012

80. Chen J, Hsieh G, Mahmud JU, Nichols J. Understanding individuals’ personal values from social media word use. Proceedings of the 17th ACM Conference on Computer Supported Cooperative Work & Social Computing. New York, NY: Association for Computing Machinery (2014). p. 405–14. doi: 10.1145/2531602.2531608

81. Husseini Orabi A, Buddhitha P, Husseini Orabi M, Inkpen D. Deep learning for depression detection of twitter users. Proceedings of the Fifth Workshop on Computational Linguistics and Clinical Psychology: From Keyboard to Clinic. New Orleans, LA: Association for Computational Linguistics (ACL) (2018). p. 88–97. doi: 10.18653/v1/W18-0609

82. Berrar D. Cross-validation. Encycl Bioinformat Computat Biol. (2018) 1:542–5. doi: 10.1016/B978-0-12-809633-8.20349-X

Keywords: CES-D, depression, prediction, microblogging, machine learning, text mining

Citation: Lyu S, Ren X, Du Y and Zhao N (2023) Detecting depression of Chinese microblog users via text analysis: Combining Linguistic Inquiry Word Count (LIWC) with culture and suicide related lexicons. Front. Psychiatry 14:1121583. doi: 10.3389/fpsyt.2023.1121583

Received: 12 December 2022; Accepted: 26 January 2023;
Published: 09 February 2023.

Edited by:

Ang Li, Beijing Forestry University, China

Reviewed by:

Xunbing Shen, Jiangxi University of Traditional Chinese Medicine, China
Jing Zhao, Capital Normal University, China

Copyright © 2023 Lyu, Ren, Du and Zhao. This is an open-access article distributed under the terms of the Creative Commons Attribution License (CC BY). The use, distribution or reproduction in other forums is permitted, provided the original author(s) and the copyright owner(s) are credited and that the original publication in this journal is cited, in accordance with accepted academic practice. No use, distribution or reproduction is permitted which does not comply with these terms.

*Correspondence: Nan Zhao, zhaonan@psych.ac.cn
