A Data Fusion Framework for Multi-Domain Morality Learning

Authors

  • Siyi Guo, Information Sciences Institute, University of Southern California
  • Negar Mokhberian, Information Sciences Institute, University of Southern California
  • Kristina Lerman, Information Sciences Institute, University of Southern California

DOI:

https://doi.org/10.1609/icwsm.v17i1.22145

Keywords:

Subjectivity in textual data; sentiment analysis; polarity/opinion identification and extraction, Linguistic analyses of social media behavior, Psychological, personality-based and ethnographic studies of social media, Qualitative and quantitative studies of social media

Abstract

Language models can be trained to recognize the moral sentiment of text, creating new opportunities to study the role of morality in human life. As interest in language and morality has grown, several ground truth datasets with moral annotations have been released. However, these datasets vary in data collection method, domain, topics, and annotator instructions, among other factors. Simply aggregating such heterogeneous datasets during training can yield models that fail to generalize well. We describe a data fusion framework for training on multiple heterogeneous datasets that improves performance and generalizability. The model uses domain adversarial training to align the datasets in feature space and a weighted loss function to deal with label shift. We show that the proposed framework achieves state-of-the-art performance on different datasets compared to prior work in morality inference.
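
As a rough illustration of the weighted-loss idea mentioned above, the sketch below shows one common way to counter label shift: weighting each class by its inverse frequency and computing a weighted cross-entropy. This is an assumption for illustration only; the abstract does not specify the weighting scheme, and the function names, class labels, and toy data here are hypothetical. The domain adversarial component is not shown.

```python
import math
from collections import Counter


def inverse_frequency_weights(labels):
    """Hypothetical helper: weight each class by n_samples / (n_classes * class_count)."""
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {c: n / (k * cnt) for c, cnt in counts.items()}


def weighted_cross_entropy(probs, labels, weights):
    """Weighted mean negative log-likelihood.

    probs: list of dicts mapping class -> predicted probability.
    labels: list of true classes, aligned with probs.
    weights: dict mapping class -> weight.
    """
    total = sum(weights[y] * -math.log(p[y]) for p, y in zip(probs, labels))
    norm = sum(weights[y] for y in labels)
    return total / norm


# Toy, made-up labels: the rare class receives the larger weight.
labels = ["moral", "non-moral", "non-moral", "non-moral"]
weights = inverse_frequency_weights(labels)

# An uninformative classifier that predicts 0.5 for every class.
probs = [{"moral": 0.5, "non-moral": 0.5} for _ in labels]
loss = weighted_cross_entropy(probs, labels, weights)
```

With weights of this kind, errors on a class that is rare in one dataset contribute more to the loss, so a model trained on pooled data is less likely to simply follow the majority label distribution.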

Published

2023-06-02

How to Cite

Guo, S., Mokhberian, N., & Lerman, K. (2023). A Data Fusion Framework for Multi-Domain Morality Learning. Proceedings of the International AAAI Conference on Web and Social Media, 17(1), 281-291. https://doi.org/10.1609/icwsm.v17i1.22145