Suicide risk assessment using word-level model with dictionary-based risky posts selection

Multimedia Tools and Applications

Abstract

Suicide is a serious issue around the world and is a leading cause of death in the US. Over the past 20 years, the suicide rate has increased by 35%. With the rapid development of information technology, more and more people use social media to share their inner feelings, which makes social media data widely usable for research on suicide risk assessment. However, not all social media posts are suicide related. Previous research addressed this problem with a post-level attention mechanism, but such a mechanism may fail to find the relevant suicide-related posts. The problem is more serious with feature-based post embeddings, since each post is converted into a single vector that serves as the model input, so word-level information is lost during training. In this paper, we address this problem by introducing a novel word-level model that includes a post-selection layer. First, we use a suicide keyword dictionary to identify risky posts that may be missed by the post-level attention mechanism. We then convert the words in the risky posts into word embeddings and use self-attention to generate post embeddings for the risky posts. Finally, we pass the post embeddings to a multilayer perceptron to classify the suicide risk. We also demonstrate that the FScore used in previous studies can be reduced to a function of accuracy, so it does not reflect model performance on imbalanced datasets. Therefore, we additionally adopt the macro F1 score as an evaluation metric. Experimental results show that our model not only outperforms previous studies in FScore, but also achieves a nearly 4% improvement in macro F1 score.
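Two ideas from the abstract can be illustrated with a minimal Python sketch: dictionary-based selection of risky posts, and the gap between accuracy and macro F1 on imbalanced labels. This is not the authors' implementation. The keyword set, the example posts, the toy labels, and the function names `select_risky_posts`, `accuracy`, and `macro_f1` are all assumptions made for illustration, and the embedding, self-attention, and multilayer perceptron stages are omitted.

```python
import re

# Hypothetical keyword dictionary; the paper's actual dictionary is not reproduced here.
SUICIDE_KEYWORDS = {"suicide", "kill myself", "end my life", "hopeless"}


def select_risky_posts(posts, keywords=SUICIDE_KEYWORDS):
    """Return the posts that contain at least one dictionary keyword."""
    patterns = [
        re.compile(r"\b" + re.escape(k) + r"\b", re.IGNORECASE) for k in keywords
    ]
    return [p for p in posts if any(pat.search(p) for pat in patterns)]


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over the classes present in y_true."""
    scores = []
    for c in sorted(set(y_true)):
        tp = sum(t == c and p == c for t, p in zip(y_true, y_pred))
        fp = sum(t != c and p == c for t, p in zip(y_true, y_pred))
        fn = sum(t == c and p != c for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Illustrative posts from one user (made up for this sketch).
posts = [
    "Had a great day hiking with friends.",
    "I feel hopeless and keep thinking about suicide.",
    "Anyone have tips for the exam next week?",
]
risky = select_risky_posts(posts)  # keeps only the second post

# Imbalanced toy labels: nine low-risk users (0) and one high-risk user (1).
y_true = [0] * 9 + [1]
y_majority = [0] * 10  # a degenerate model that always predicts the majority class

acc = accuracy(y_true, y_majority)
mf1 = macro_f1(y_true, y_majority)
print(f"accuracy={acc:.3f} macro_f1={mf1:.3f}")
```

In this toy setting the majority-class predictor reaches an accuracy of 0.9 while its macro F1 is only about 0.47, because the F1 of the minority class is zero and macro averaging weights every class equally. Any metric that reduces to a function of accuracy would hide this failure.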

Data availability

The datasets generated during the current study are available from the corresponding author on reasonable request.

Notes

  1. https://sites.google.com/stevens.edu/infinitylab/suicide-risk-detection

  2. https://fasttext.cc/

  3. https://commoncrawl.org/

  4. https://www.wikipedia.org/

  5. We also experimented with the sum and the attention results of the two vectors, but empirically found that concatenation performed better.

  6. https://microsoft.github.io/presidio/analyzer/

Author information

Corresponding author

Correspondence to Arbee L. P. Chen.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Tsai, Y.S., Chen, A.L.P. Suicide risk assessment using word-level model with dictionary-based risky posts selection. Multimed Tools Appl 83, 21435–21454 (2024). https://doi.org/10.1007/s11042-023-16361-2

  • DOI: https://doi.org/10.1007/s11042-023-16361-2
