Abstract
The growth of online health communities, particularly those built on socially generated content, can provide considerable value for society. On medical forum platforms, participants can learn about medical topics and interact with peers. Analysing the sentiment that members of a health community express in forum discourse is valuable in several ways: it can help identify particular aspects of an information space, determine the themes that predominate in a large data set, and summarize the topics it contains. In this paper, we identify sentiments expressed in online medical forums that discuss Lyme disease. Our research has two goals: first, to identify a complete and relevant set of categories that characterize Lyme disease discourse; and second, to test and investigate strategies, both individually and in combination, for automating the classification of medical forum posts into those categories. We present a feature-based model built on three feature sets: content-free, content-specific and meta-level features. Using inductive learning algorithms to build a feature-based classification model, we assess the feasibility and accuracy of automated classification. We further evaluate the model by assessing its ability to adapt to an online medical forum discussing lupus. The experimental results demonstrate the effectiveness of our approach.
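As a rough illustration of the three feature sets named above, the following minimal sketch builds one feature dictionary per forum post. It rests on assumptions not stated in the abstract: content-free features are taken to be simple stylistic counts, content-specific features are counts of terms from a caller-supplied vocabulary, and meta-level features are aggregate scores from a sentiment lexicon. The function names, the toy lexicon and the example vocabulary are all hypothetical and are not the authors' implementation.

```python
import re
import string

# Hypothetical toy lexicon (assumption): the lexicons actually used in the
# paper are not listed in the abstract.
TOY_LEXICON = {"better": 1, "relief": 1, "hope": 1,
               "pain": -1, "worse": -1, "tired": -1}

TOKEN_RE = re.compile(r"[a-z']+")


def content_free_features(post):
    """Stylistic cues that do not depend on the topic of the post."""
    words = post.split()
    n_chars = max(len(post), 1)
    return {
        "cf_word_count": len(words),
        "cf_avg_word_len": sum(len(w) for w in words) / max(len(words), 1),
        "cf_punct_ratio": sum(ch in string.punctuation for ch in post) / n_chars,
        "cf_upper_ratio": sum(ch.isupper() for ch in post) / n_chars,
    }


def content_specific_features(post, vocabulary):
    """Counts of domain terms chosen by the caller (assumed representation)."""
    tokens = TOKEN_RE.findall(post.lower())
    return {f"cs_{term}": tokens.count(term) for term in vocabulary}


def meta_level_features(post, lexicon=TOY_LEXICON):
    """Aggregate scores derived from a sentiment lexicon (assumed representation)."""
    tokens = TOKEN_RE.findall(post.lower())
    scores = [lexicon[t] for t in tokens if t in lexicon]
    return {
        "ml_pos_count": sum(s > 0 for s in scores),
        "ml_neg_count": sum(s < 0 for s in scores),
        "ml_net_score": sum(scores),
    }


def extract_features(post, vocabulary):
    """Merge the three feature sets into a single feature dictionary."""
    features = {}
    features.update(content_free_features(post))
    features.update(content_specific_features(post, vocabulary))
    features.update(meta_level_features(post))
    return features


if __name__ == "__main__":
    example = "The antibiotics gave me some relief, but the joint pain is worse at night."
    print(extract_features(example, vocabulary=["antibiotics", "pain", "tick"]))
```

Dictionaries of this kind could then be vectorized and passed to any inductive learner. Keeping each feature set in its own function makes it straightforward to evaluate the sets individually and in combination, which is the comparison the abstract describes.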
Notes
The terms sentiment and affect have been used interchangeably in the literature, where they refer to the extraction of opinions, emotions or views that may be expressed in text.
Cite this article
Alnashwan, R., Sorensen, H., O’Riordan, A. et al. Accurate classification of socially generated medical discourse. Int J Data Sci Anal 8, 353–365 (2019). https://doi.org/10.1007/s41060-018-0128-8
DOI: https://doi.org/10.1007/s41060-018-0128-8