short-paper

A Technique to Calculate National Happiness Index by Analyzing Roman Urdu Messages Posted on Social Media

Authors:
Rabia Habiba, University of Engineering and Technology, Lahore, Punjab, Pakistan
Dr. Muhammad Awais, University of Engineering and Technology, Lahore, Punjab, Pakistan
Dr. Muhammad Shoaib, University of Engineering and Technology, Lahore, Punjab, Pakistan
ACM Transactions on Asian and Low-Resource Language Information Processing, Volume 19, Issue 6, Article No. 76, pp. 1–16. https://doi.org/10.1145/3400712

Published: 26 October 2020


Abstract

The National Happiness Index (NHI) is a national indicator of development that estimates the economic and social well-being of a nation's individuals. With the proliferation of the internet, people share a significant amount of data on social media websites, and this data can be processed with sentiment analysis techniques to calculate the NHI. In the literature, both lexicon-based and machine learning approaches have been used to calculate the NHI. All of these existing approaches were proposed for sentiments written in English. However, these methods fail for complex Roman Urdu tweets that contain more than two sub-opinions. This research has three primary objectives: (1) to investigate whether current sentiment analysis techniques are sufficient for the classification of complex Roman Urdu sentiments; (2) to propose a rule-based classifier for the classification of Roman Urdu sentiments comprising multiple sub-opinions; and (3) to calculate the NHI using Roman Urdu sentiments. For this purpose, we propose a discourse information extractor, a rule-based method (3-RBC), and a machine learning classifier. The experimental results show that 3-RBC is efficient for feature identification and that its improvement over the baseline classifiers is statistically significant. 3-RBC increased accuracy by 7% and precision by 8%, which provides evidence that the proposed technique significantly improves the calculation of the NHI.
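
As a rough illustration of the pipeline outlined in the abstract (split a message into sub-opinions, classify it with rules, aggregate the labels into an index), the following is a minimal sketch and not the authors' 3-RBC. The toy lexicon, the connective and negation lists, the rule that the clause after a contrastive connective counts double, and the index formula (share of positive messages among polar ones) are all assumptions made for illustration; the abstract does not specify any of them.

```python
import re

# Hypothetical toy lexicon of Roman Urdu polarity words (illustrative only).
LEXICON = {"acha": 1, "khush": 1, "behtareen": 1,
           "bura": -1, "udaas": -1, "kharab": -1}
# Assumed contrastive connectives ("but") and negation words.
CONTRAST = ("lekin", "magar")
NEGATIONS = {"nahi", "nahin"}


def split_sub_opinions(message):
    """Split a message into clauses at contrastive connectives."""
    pattern = r"\b(?:" + "|".join(CONTRAST) + r")\b"
    return [c.strip() for c in re.split(pattern, message.lower()) if c.strip()]


def clause_score(clause):
    """Sum lexicon polarities; a negation right after a polar word flips it."""
    tokens = re.findall(r"[a-z]+", clause)
    score = 0
    for i, tok in enumerate(tokens):
        polarity = LEXICON.get(tok, 0)
        if polarity and i + 1 < len(tokens) and tokens[i + 1] in NEGATIONS:
            polarity = -polarity
        score += polarity
    return score


def classify(message):
    """Label a message; the final clause is weighted double (assumed rule)."""
    clauses = split_sub_opinions(message)
    if not clauses:
        return "neutral"
    scores = [clause_score(c) for c in clauses]
    total = sum(scores[:-1]) + 2 * scores[-1]
    if total > 0:
        return "positive"
    if total < 0:
        return "negative"
    return "neutral"


def happiness_index(messages):
    """Assumed index: positive messages divided by all polar messages."""
    labels = [classify(m) for m in messages]
    pos = labels.count("positive")
    neg = labels.count("negative")
    if pos + neg == 0:
        return None  # no polar messages, index undefined
    return pos / (pos + neg)


if __name__ == "__main__":
    sample = [
        "aaj main bohat khush hoon",
        "khana acha tha lekin service kharab thi",
        "match kal hai",
        "film behtareen thi",
    ]
    for msg in sample:
        print(classify(msg), "<-", msg)
    print("index:", happiness_index(sample))
```

In this sketch a message with mixed sub-opinions, such as one praising the food but criticising the service, is labelled by the clause that follows the connective, which is one simple way discourse information can change the outcome of a purely lexicon-based count.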

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Language resources
      2. Lexical semantics
  2. Machine learning
    1. Learning paradigms
      1. Supervised learning
        Supervised learning by classification

Published in
ACM Transactions on Asian and Low-Resource Language Information Processing Volume 19, Issue 6
November 2020
277 pages
ISSN:2375-4699
EISSN:2375-4702
DOI:10.1145/3426881
Editor:
Imed Zitouni
Google, USA
Copyright © 2020 ACM
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]
Publisher
Association for Computing Machinery
New York, NY, United States
Publication History
- Published: 26 October 2020
- Accepted: 1 May 2020
- Revised: 1 February 2020
- Received: 1 October 2018
Author Tags
National happiness index
discourse information
lexicon-based approach
rule-based classifier
sentiment analysis
support vector machines
Qualifiers
- short-paper
- Research
- Refereed