DOI: 10.3115/1219840.1219894

A quantitative analysis of lexical differences between genders in telephone conversations

Published: 25 June 2005

ABSTRACT

In this work, we provide an empirical analysis of differences in word use between genders in telephone conversations, complementing the considerable body of sociolinguistic work on gender-related linguistic differences. Experiments are performed on a large speech corpus of roughly 12,000 conversations. We employ machine learning techniques to automatically categorize the gender of each speaker given only the transcript of his or her speech, achieving 92% accuracy. An analysis of the most characteristic words for each gender is also presented. Experiments reveal that the gender of one conversation side influences the lexical use of the other side. A surprising result is that we were able to classify male-only vs. female-only conversations with almost perfect accuracy.

Published in

ACL '05: Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
June 2005
657 pages
General Chair: Kevin Knight
Publisher: Association for Computational Linguistics, United States

Acceptance Rates

ACL '05 Paper Acceptance Rate: 77 of 423 submissions, 18%
Overall Acceptance Rate: 85 of 443 submissions, 19%
