Abstract
The National Happiness Index (NHI) is a national indicator of development that estimates the economic and social well-being of a nation's individuals. With the proliferation of the internet, people share a significant amount of data on social media websites, and this data can be processed with sentiment analysis techniques to calculate the NHI. In the literature, the NHI has been calculated with lexicon-based and machine learning approaches. All of these existing approaches were proposed for sentiments written in English. However, these methods fail for complex Roman Urdu tweets that contain more than two sub-opinions. This research has three primary objectives: (1) to investigate whether current sentiment analysis techniques are sufficient for classifying complex Roman Urdu sentiments; (2) to propose a rule-based classifier for Roman Urdu sentiments comprising multiple sub-opinions; and (3) to calculate the NHI using Roman Urdu sentiments. For this purpose, we propose a discourse information extractor, a rule-based method (3-RBC), and a machine learning classifier. The experimental results show that 3-RBC is efficient for feature identification and that its gains over the baseline classifiers are statistically significant. 3-RBC increased accuracy by 7% and precision by 8%, which provides evidence that the proposed technique improves the calculation of the NHI.
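The abstract does not spell out the 3-RBC rules or the exact NHI formula, so the following is only an illustrative sketch of the general idea: split a tweet into sub-opinions at discourse markers, score each one against a lexicon, and aggregate the per-tweet labels into an index. The tiny lexicon, the contrast markers, the `final_weight` parameter, and the `(positive - negative) / (positive + negative)` aggregate are all assumptions made for this example, not the authors' method.

```python
# Toy Roman Urdu lexicon and discourse markers (assumed for illustration only).
POSITIVE = {"acha", "khush", "zabardast"}
NEGATIVE = {"bura", "kharab", "udaas"}
CONTRAST_MARKERS = {"lekin", "magar"}  # both roughly mean "but"


def split_sub_opinions(tweet):
    """Split a tweet into sub-opinions at contrast markers."""
    segments, current = [], []
    for raw in tweet.lower().split():
        token = raw.strip(".,!?")
        if token in CONTRAST_MARKERS:
            segments.append(current)
            current = []
        elif token:
            current.append(token)
    segments.append(current)
    # Drop empty segments (e.g. a tweet that starts or ends with a marker).
    return [seg for seg in segments if seg]


def score_segment(tokens):
    """Lexicon score: +1 per positive word, -1 per negative word."""
    return sum((t in POSITIVE) - (t in NEGATIVE) for t in tokens)


def classify(tweet, final_weight=2.0):
    """Label a tweet; the clause after the last contrast marker counts more."""
    segments = split_sub_opinions(tweet)
    total = 0.0
    for i, seg in enumerate(segments):
        is_final = len(segments) > 1 and i == len(segments) - 1
        total += (final_weight if is_final else 1.0) * score_segment(seg)
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"


def happiness_index(tweets):
    """Assumed aggregate in [-1, 1]: (pos - neg) / (pos + neg)."""
    labels = [classify(t) for t in tweets]
    pos, neg = labels.count("positive"), labels.count("negative")
    return 0.0 if pos + neg == 0 else (pos - neg) / (pos + neg)
```

In this sketch, a tweet such as "khana acha tha lekin service kharab thi" contains two sub-opinions of opposite polarity; weighting the clause after the contrast marker more heavily is one simple way a discourse-aware rule can resolve the conflict, which a plain bag-of-words lexicon count would leave tied.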
Index Terms
- A Technique to Calculate National Happiness Index by Analyzing Roman Urdu Messages Posted on Social Media