Paper

Authors: Zubair Tusar (1, 2); Sadat Sharfuddin (1, 2); Muhtasim Abid (1, 2); Md. Haque (1, 2) and Md. Mostafa (1, 2)

Affiliations: (1) Software Engineering Lab (SELab), Islamic University of Technology (IUT), Gazipur, Bangladesh; (2) Department of Computer Science and Engineering, Islamic University of Technology (IUT), Gazipur, Bangladesh

Keyword(s): Sentiment Analysis, Pre-Trained Transformer-Based Models, Ensembling, Data Augmentation.

Abstract: Sentiment analysis for software engineering has been the subject of considerable research aimed at developing efficient tools and approaches for Software Engineering (SE) artifacts. State-of-the-art tools achieve better performance by using transformer-based models such as BERT and RoBERTa to classify sentiment polarity. However, existing tools overlook the data imbalance problem and do not consider the effectiveness of ensembling multiple pre-trained models on SE-specific datasets. To overcome these limitations, we applied context-specific data augmentation using SE-specific vocabularies and ensembled multiple models to classify sentiment polarity. We trained our ensembled models on four gold-standard SE-specific datasets and evaluated their performance. Our approach achieved improvements ranging from 1% to 26% in weighted-average and macro-average F1 scores. Our findings demonstrate that the ensemble models outperform the pre-trained models on the original datasets and that data augmentation further improves the performance of all the previous approaches.
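
The abstract reports both weighted-average and macro-average F1 scores. The following is a minimal sketch of how the two averages can diverge when classes are imbalanced. The function names and the toy labels are hypothetical, chosen only for illustration; they are not the paper's code or data.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1} for every label seen in y_true or y_pred."""
    labels = sorted(set(y_true) | set(y_pred))
    scores = {}
    for label in labels:
        pairs = list(zip(y_true, y_pred))
        tp = sum(1 for t, p in pairs if t == label and p == label)
        fp = sum(1 for t, p in pairs if t != label and p == label)
        fn = sum(1 for t, p in pairs if t == label and p != label)
        denom = 2 * tp + fp + fn
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0.
        scores[label] = 2 * tp / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1: every class counts equally."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


def weighted_f1(y_true, y_pred):
    """Mean of per-class F1 weighted by each class's support in y_true."""
    scores = per_class_f1(y_true, y_pred)
    support = Counter(y_true)
    return sum(scores[label] * support[label] for label in scores) / len(y_true)


# Toy, invented labels: 8 "neutral" items and 2 "negative" items.
y_true = ["neutral"] * 8 + ["negative"] * 2
# The classifier misses one of the two minority-class items.
y_pred = ["neutral"] * 8 + ["neutral", "negative"]

macro = macro_f1(y_true, y_pred)
weighted = weighted_f1(y_true, y_pred)
print(f"macro F1 = {macro:.4f}, weighted F1 = {weighted:.4f}")
```

In this toy case the weighted average is higher than the macro average, because the majority class dominates the weighting while the macro average exposes the weaker minority-class score. This is one reason both figures are commonly reported on imbalanced datasets.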

CC BY-NC-ND 4.0

Paper citation in several formats:
Tusar, Z.; Sharfuddin, S.; Abid, M.; Haque, M. and Mostafa, M. (2023). Effectiveness of Data Augmentation and Ensembling Using Transformer-Based Models for Sentiment Analysis: Software Engineering Perspective. In Proceedings of the 18th International Conference on Software Technologies - ICSOFT; ISBN 978-989-758-665-1; ISSN 2184-2833, SciTePress, pages 438-447. DOI: 10.5220/0012092500003538

@conference{icsoft23,
author={Zubair Tusar and Sadat Sharfuddin and Muhtasim Abid and Md. Haque and Md. Mostafa},
title={Effectiveness of Data Augmentation and Ensembling Using Transformer-Based Models for Sentiment Analysis: Software Engineering Perspective},
booktitle={Proceedings of the 18th International Conference on Software Technologies - ICSOFT},
year={2023},
pages={438-447},
publisher={SciTePress},
organization={INSTICC},
doi={10.5220/0012092500003538},
isbn={978-989-758-665-1},
issn={2184-2833},
}

TY - CONF
JO - Proceedings of the 18th International Conference on Software Technologies - ICSOFT
TI - Effectiveness of Data Augmentation and Ensembling Using Transformer-Based Models for Sentiment Analysis: Software Engineering Perspective
SN - 978-989-758-665-1
IS - 2184-2833
AU - Tusar, Z.
AU - Sharfuddin, S.
AU - Abid, M.
AU - Haque, M.
AU - Mostafa, M.
PY - 2023
SP - 438
EP - 447
DO - 10.5220/0012092500003538
PB - SciTePress