MTLD, vocd-D, and HD-D: A validation study of sophisticated approaches to lexical diversity assessment

McCarthy, Philip M.; Jarvis, Scott

doi:10.3758/BRM.42.2.381

Articles From the SCiP Conference
Published: 01 May 2010

Volume 42, pages 381–392, (2010)
Behavior Research Methods

Abstract

The main purpose of this study was to examine the validity of the approach to lexical diversity assessment known as the measure of textual lexical diversity (MTLD). The index for this approach is calculated as the mean length of word strings that maintain a criterion level of lexical variation. To validate the MTLD approach, we compared it against the performances of the primary competing indices in the field, which include vocd-D, TTR, Maas, Yule’s K, and an HD-D index derived directly from the hypergeometric distribution function. The comparisons involved assessments of convergent validity, divergent validity, internal validity, and incremental validity. The results of our assessments of these indices across two separate corpora suggest three major findings. First, MTLD performs well with respect to all four types of validity and is, in fact, the only index not found to vary as a function of text length. Second, HD-D is a viable alternative to the vocd-D standard. And third, three of the indices—MTLD, vocd-D (or HD-D), and Maas—appear to capture unique lexical information. We conclude by advising researchers to consider using MTLD, vocd-D (or HD-D), and Maas in their studies, rather than any single index, noting that lexical diversity can be assessed in many ways and each approach may be informative as to the construct under investigation.
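The abstract describes MTLD as the mean length of word strings that maintain a criterion level of lexical variation, and HD-D as an index derived from the hypergeometric distribution. The following is a minimal Python sketch of both ideas, not the authors' implementation. Several details are assumptions that the abstract does not state: the type-token ratio criterion of 0.72, the averaging of a forward and a backward pass, the partial-factor rule for leftover tokens, the HD-D sample size of 42, and the scaling of each type's contribution by 1/sample size. The function names `mtld` and `hdd` are illustrative only.

```python
from collections import Counter
from math import comb


def _mtld_pass(tokens, threshold):
    """One directional pass: number of tokens divided by the factor count."""
    factors = 0.0
    types, count = set(), 0
    for tok in tokens:
        count += 1
        types.add(tok)
        # Assumption: a factor ends when the running TTR reaches or
        # drops below the criterion; the string then restarts.
        if len(types) / count <= threshold:
            factors += 1.0
            types, count = set(), 0
    if count:
        # Assumption: leftover tokens add a partial factor, proportional
        # to how far their TTR has fallen from 1.0 toward the criterion.
        factors += (1.0 - len(types) / count) / (1.0 - threshold)
    # With no repetition at all, no factor is ever formed; this sketch
    # reports infinity rather than guessing a value.
    return len(tokens) / factors if factors else float("inf")


def mtld(tokens, threshold=0.72):
    """Mean factor length, averaged over forward and backward passes.

    The default threshold of 0.72 is an assumption of this sketch.
    """
    if not tokens:
        raise ValueError("need at least one token")
    forward = _mtld_pass(tokens, threshold)
    backward = _mtld_pass(tokens[::-1], threshold)
    return (forward + backward) / 2.0


def hdd(tokens, sample_size=42):
    """Sum over types of P(type occurs in a random sample) / sample_size.

    The sample size of 42 and the 1/sample_size scaling are assumptions.
    """
    n = len(tokens)
    if n < sample_size:
        raise ValueError("text is shorter than the sample size")
    total = comb(n, sample_size)
    score = 0.0
    for freq in Counter(tokens).values():
        # Hypergeometric probability that none of this type's `freq`
        # tokens appears in a sample drawn without replacement.
        p_absent = comb(n - freq, sample_size) / total
        score += (1.0 - p_absent) / sample_size
    return score


if __name__ == "__main__":
    words = "the cat sat on the mat and the dog sat on the log".split()
    print(mtld(words))
    print(hdd(words, sample_size=10))
```

Both functions take a list of already tokenized, case-normalized words. Tokenization and lemmatization choices affect every lexical diversity index and are deliberately left outside this sketch.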

Author information

Authors and Affiliations

Department of English, University of Memphis, 467 Patterson Hall, Memphis, TN 38152-3530
Philip M. McCarthy
Ohio University, Athens, Ohio
Scott Jarvis

Corresponding author

Correspondence to Philip M. McCarthy.

Additional information

This research was supported in part by the Institute for Education Sciences (IES; Grants R305GA080589, R305G020018-02, and R305G040046) and in part by the National Science Foundation (NSF; Grant IIS-0735682). The views expressed in this article do not necessarily reflect the views of the IES or the NSF.

About this article

Cite this article

McCarthy, P.M., Jarvis, S. MTLD, vocd-D, and HD-D: A validation study of sophisticated approaches to lexical diversity assessment. Behavior Research Methods 42, 381–392 (2010). https://doi.org/10.3758/BRM.42.2.381

Received: 11 November 2009
Accepted: 07 February 2010
Published: 01 May 2010
Issue Date: May 2010
DOI: https://doi.org/10.3758/BRM.42.2.381
