There is a newer version of the record available.

Published January 15, 2021 | Version v2
Dataset Open

SWSR: A Chinese Dataset and Lexicon for Sexist Hate Speech Detection

  • 1. Queen Mary University of London
  • 2. Oxford Brookes University

Description

This repository presents the Sina Weibo Sexism Review (SWSR) dataset, which contains sexism-related posts in Chinese collected from Sina Weibo, together with the Chinese lexicon SexHateLex. The SWSR dataset consists of two files, hateWeibo.csv and hateComment.csv. The SexHateLex lexicon is a list of 3016 abusive terms stored in the file SexHateLex.txt.
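As a rough illustration of how the three files named above might be read, here is a minimal Python sketch. It rests on assumptions the record does not confirm: that the files are UTF-8 encoded, that each CSV has a header row, and that SexHateLex.txt holds one term per line. The helper names load_lexicon and load_csv_rows are hypothetical and not part of the dataset.

```python
import csv
from pathlib import Path


def load_lexicon(path):
    """Return the non-empty lines of a lexicon file as a list of terms.

    Assumption: UTF-8 text with one term per line (not stated by the record).
    "utf-8-sig" also accepts files that start with a byte-order mark.
    """
    text = Path(path).read_text(encoding="utf-8-sig")
    return [line.strip() for line in text.splitlines() if line.strip()]


def load_csv_rows(path):
    """Return (header, rows) for a CSV file without assuming column names.

    Assumption: the first row is a header (not stated by the record).
    """
    with open(path, newline="", encoding="utf-8-sig") as handle:
        reader = csv.reader(handle)
        header = next(reader, [])
        rows = list(reader)
    return header, rows
```

With the files downloaded to the working directory, load_lexicon("SexHateLex.txt") would return the term list, and load_csv_rows("hateWeibo.csv") would return the header and data rows. Inspecting the header first is the safer route, since the column names are not described in the record.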

Files

Files (3.8 MB)

Checksum, Size
md5:a358b579275fa09d26f7e3a5eb3bf872, 2.3 MB
md5:776a6008ccc2bcf5cd50863a8da388c9, 1.4 MB
md5:d4af6df1948c0c01e070a6c1b296bf5b, 2.0 kB
md5:641671e316ae413012ca5c96a0ff3c48, 27.2 kB
md5:cd8104f778c61143b10e8fbd67a09e61, 151.3 kB
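The checksums listed above can be used to confirm that a download arrived intact. The following is a minimal sketch using Python's standard hashlib module. Because this listing does not show which checksum belongs to which file, the sketch only checks whether a file's MD5 digest appears among the listed values. The names LISTED_MD5, md5_of_file, and is_listed are hypothetical.

```python
import hashlib

# MD5 checksums copied from the file listing. The listing does not say which
# checksum belongs to which file, so they are kept as an unordered set.
LISTED_MD5 = {
    "a358b579275fa09d26f7e3a5eb3bf872",
    "776a6008ccc2bcf5cd50863a8da388c9",
    "d4af6df1948c0c01e070a6c1b296bf5b",
    "641671e316ae413012ca5c96a0ff3c48",
    "cd8104f778c61143b10e8fbd67a09e61",
}


def md5_of_file(path, chunk_size=1 << 20):
    """Compute the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def is_listed(path):
    """Return True if the file's MD5 digest matches any listed checksum."""
    return md5_of_file(path) in LISTED_MD5
```

A call such as is_listed("hateWeibo.csv") returning False would suggest a corrupted or altered download, or a file from a different version of the record.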

Additional details

References

  • A. Jiang, X. Yang, Y. Liu and A. Zubiaga (2021). SWSR: A Chinese Dataset and Lexicon for Sexist Hate Speech Detection. Under review.