Comparing Feature Extraction Methods for Sarcasm Detection in Twitter

Jenq-Haur WANG; Rahmat Fadli ISNANTO

doi:10.11517/pjsai.JSAI2023.0_1U5IS2b01

37th (2023)

Session ID: 1U5-IS-2b-01

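Conference information

Organizer: The Japanese Society for Artificial Intelligence

Conference name: The 37th Annual Conference of the Japanese Society for Artificial Intelligence, 2023

Edition: 37

Venue: Kumamoto-jo Hall + Online

Dates: 2023/06/06 - 2023/06/09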

*Jenq-Haur WANG, Rahmat Fadli ISNANTO

Keywords: sarcasm detection, feature extraction, machine learning

Proceedings and abstracts: free access

Abstract

Sarcasm detection is the challenging task of identifying expressions whose intended meaning is the opposite of what is written. Most previous work measures only the sentiment polarity of sentences, but more features are needed to improve results. In this paper, we compare different feature extraction methods for sarcasm detection: n-gram, sentiment, punctuation, and part-of-speech features. First, sarcastic data are collected using the Twitter API and preprocessed by removing all hashtags, mentions, and URLs. Then, after all features are extracted, they are combined using one-hot encoding. Finally, we compare two classification methods: Support Vector Machine and Logistic Regression. In our experimental results, the n-gram feature gives the best performance among the individual features. Support Vector Machine performs better than Logistic Regression, achieving an F1-measure of 79.64%. This shows the potential of combining different features for sarcasm detection.
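The preprocessing and feature extraction steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the regular expressions, the function names (`preprocess`, `ngram_features`, `punctuation_features`, `extract_features`), the lowercasing step, the bigram limit, and the particular punctuation marks counted are all assumptions made for illustration. Sentiment and part-of-speech features are omitted because they would require a lexicon and a tagger that the abstract does not name, and features are kept here as a plain dictionary of counts rather than the one-hot encoding the abstract mentions.

```python
import re
from collections import Counter

# Assumed patterns: the abstract only states that hashtags, mentions and URLs are removed.
URL_RE = re.compile(r"https?://\S+|www\.\S+")
TAG_OR_MENTION_RE = re.compile(r"[#@]\w+")


def preprocess(tweet: str) -> str:
    """Strip URLs, hashtags and mentions, then normalise case and whitespace."""
    text = URL_RE.sub(" ", tweet)
    text = TAG_OR_MENTION_RE.sub(" ", text)
    return " ".join(text.lower().split())


def ngram_features(text: str, n_max: int = 2) -> Counter:
    """Count word n-grams for n = 1..n_max (n_max = 2 is an illustrative choice)."""
    tokens = re.findall(r"[a-z0-9']+", text)
    counts = Counter()
    for n in range(1, n_max + 1):
        for i in range(len(tokens) - n + 1):
            counts[" ".join(tokens[i:i + n])] += 1
    return counts


def punctuation_features(text: str) -> dict:
    """Count a few punctuation marks (a hypothetical selection, not from the paper)."""
    return {
        "exclamations": text.count("!"),
        "questions": text.count("?"),
        "ellipses": text.count("..."),
        "quotes": text.count('"'),
    }


def extract_features(tweet: str) -> dict:
    """Combine n-gram and punctuation counts into one feature dictionary."""
    text = preprocess(tweet)
    features = {f"ngram={g}": c for g, c in ngram_features(text).items()}
    features.update({f"punct={k}": v for k, v in punctuation_features(text).items()})
    return features
```

A dictionary of this shape could then be converted to a numeric matrix, for example with scikit-learn's `DictVectorizer`, before training a Support Vector Machine or Logistic Regression classifier. That step is left out so the sketch depends only on the standard library.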

*: Corresponding author
