What We Can Learn from Looking at Profanity

  • Conference paper
Computational Processing of the Portuguese Language (PROPOR 2014)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 8775)

Abstract

Profanity is common in online text. Recent studies found swear words in over 7% of English tweets and 9% of Yahoo! Buzz messages. However, efforts to recognize, understand, and deal with profanity do not share resources, namely their datasets, which leads to duplicated effort and non-comparable results.

We present a freely available dataset of 2500 messages from a popular Portuguese sports website. About 20% of the messages contained profanity; in them we annotated 726 swear words, 510 of which had been obfuscated by the message authors. We also identified the most frequent profanities and the methods, and combinations of methods, that people used to disguise their cursing.
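
The counts in the abstract imply a few proportions that are not stated outright. The sketch below works them out from the quoted figures only; the constant and function names are illustrative, and because the abstract says "about 20%", the derived message count is an approximation rather than a reported value.

```python
# Figures quoted in the abstract; nothing here comes from the dataset itself.
TOTAL_MESSAGES = 2500
PROFANE_MESSAGE_SHARE = 0.20  # "about 20%", so derived counts are approximate
SWEAR_WORDS = 726
OBFUSCATED = 510


def summarize():
    """Derive simple proportions from the abstract's headline numbers."""
    plain = SWEAR_WORDS - OBFUSCATED              # swear words written undisguised
    obfuscated_share = OBFUSCATED / SWEAR_WORDS   # fraction of swear words disguised
    approx_profane_messages = round(TOTAL_MESSAGES * PROFANE_MESSAGE_SHARE)
    return {
        "plain": plain,
        "obfuscated_share": obfuscated_share,
        "approx_profane_messages": approx_profane_messages,
    }


if __name__ == "__main__":
    stats = summarize()
    print(f"undisguised swear words: {stats['plain']}")
    print(f"obfuscated share: {stats['obfuscated_share']:.1%}")
    print(f"approx. messages with profanity: {stats['approx_profane_messages']}")
```

Under these figures, roughly seven in ten annotated swear words were obfuscated, which is why the disguising methods receive separate attention in the abstract.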

Copyright information

© 2014 Springer International Publishing Switzerland

About this paper

Cite this paper

Laboreiro, G., Oliveira, E. (2014). What We Can Learn from Looking at Profanity. In: Baptista, J., Mamede, N., Candeias, S., Paraboni, I., Pardo, T.A.S., Volpe Nunes, M.d.G. (eds) Computational Processing of the Portuguese Language. PROPOR 2014. Lecture Notes in Computer Science, vol 8775. Springer, Cham. https://doi.org/10.1007/978-3-319-09761-9_11

  • DOI: https://doi.org/10.1007/978-3-319-09761-9_11

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-319-09760-2

  • Online ISBN: 978-3-319-09761-9

  • eBook Packages: Computer Science, Computer Science (R0)