Abstract
Idiolects are person-dependent similarities in language use. They imply that texts by one author show more similarities in language use than texts by different authors. Sociolects, on the other hand, are group-dependent similarities in language use. They imply that texts by a group of authors, defined for instance by gender or time period, share more similarities within the group than between groups. Although idiolect and sociolect are common terms in the humanities, they have received little investigation from corpus and computational linguistic points of view. To test several idiolect and sociolect hypotheses, a factorial combination of time period (Modernism, Realism), gender of author (male, female) and author (Eliot, Dickens, Woolf, Joyce) was used, totaling 16 literary texts. In a series of corpus linguistic studies using Boolean and vector models, no conclusive evidence was found for the selected idiolect and sociolect hypotheses. Final analyses of the semantics within each literary text attributed this lack of evidence to the low homogeneity within a literary text.
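The within-versus-between comparison behind the idiolect hypothesis can be illustrated with a minimal sketch. The abstract does not specify which Boolean or vector models were used, so the two measures below are stand-ins chosen for illustration only: Jaccard overlap of vocabularies as a Boolean measure, and cosine similarity over raw term frequencies as a vector measure. The function names, the `texts` dictionary, and the toy snippets are all hypothetical and are not drawn from the study's corpus.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase a text and split it into alphabetic word tokens."""
    return re.findall(r"[a-z]+", text.lower())


def boolean_overlap(a, b):
    """Boolean stand-in: Jaccard overlap of the two vocabularies."""
    set_a, set_b = set(tokenize(a)), set(tokenize(b))
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 0.0


def cosine_similarity(a, b):
    """Vector stand-in: cosine between raw term-frequency vectors."""
    freq_a, freq_b = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(freq_a[w] * freq_b[w] for w in freq_a.keys() & freq_b.keys())
    norm_a = math.sqrt(sum(c * c for c in freq_a.values()))
    norm_b = math.sqrt(sum(c * c for c in freq_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def mean_similarity(texts, same_author, measure):
    """Average a measure over all text pairs within or between authors."""
    keys = sorted(texts)
    scores = [
        measure(texts[k1], texts[k2])
        for i, k1 in enumerate(keys)
        for k2 in keys[i + 1:]
        if (k1[0] == k2[0]) == same_author
    ]
    return sum(scores) / len(scores) if scores else 0.0


# Hypothetical placeholder snippets keyed by (author, text number).
texts = {
    ("author_a", 1): "the river ran past the old mill and the quiet town",
    ("author_a", 2): "the old town slept while the river ran on",
    ("author_b", 1): "she thought of light, of waves, of the moment passing",
}

within = mean_similarity(texts, True, cosine_similarity)
between = mean_similarity(texts, False, cosine_similarity)
print(f"within-author cosine:  {within:.3f}")
print(f"between-author cosine: {between:.3f}")
```

Under the idiolect hypothesis, the within-author mean would be expected to exceed the between-author mean. The same comparison could be run with texts grouped by gender or time period to probe a sociolect hypothesis, by changing the first element of each key.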
Cite this article
Louwerse, M.M. Semantic Variation in Idiolect and Sociolect: Corpus Linguistic Evidence from Literary Texts. Computers and the Humanities 38, 207–221 (2004). https://doi.org/10.1023/B:CHUM.0000031185.88395.b1
DOI: https://doi.org/10.1023/B:CHUM.0000031185.88395.b1