Finding Appropriate Lexical Diversity Measurements for Small-Size Corpus

Abstract:

In this study, four lexical diversity measures were applied to sets of word chunks of monotonically increasing size. A computational experiment involving corpus processing and statistical testing was conducted to determine which measure is most effective for evaluating a small corpus of 350 to 550 words. The results show that D-estimate is the most appropriate of the four measures considered. D-estimate also gives more stable results than the other measures when the number of words varies between texts.
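As a rough illustration of the kind of experiment described, the sketch below computes two lexical diversity values on chunks of increasing size. It is not the authors' procedure: the abstract does not name the other three measures or the fitting method, so the plain type-token ratio (TTR) stands in as a comparison measure, and D is estimated by fitting the commonly used curve TTR = (D/N)(sqrt(1 + 2N/D) - 1) to mean TTRs of random samples. The sample sizes (35 to 50 tokens), the number of trials, the grid search over D, and all function names are assumptions made for this example.

```python
import math
import random


def ttr(tokens):
    """Type-token ratio: number of distinct words divided by total words."""
    return len(set(tokens)) / len(tokens)


def model_ttr(n, d):
    """TTR predicted by the D curve for a sample of n tokens."""
    return (d / n) * (math.sqrt(1 + 2 * n / d) - 1)


def estimate_d(tokens, sizes=range(35, 51), trials=100, seed=0):
    """Estimate D by least-squares fit of the D curve to sampled mean TTRs.

    Assumption: sample sizes, trial count, and search grid are illustrative
    choices, not values taken from the paper.
    """
    rng = random.Random(seed)
    # Mean TTR over random samples (without replacement) for each sample size.
    observed = {
        n: sum(ttr(rng.sample(tokens, n)) for _ in range(trials)) / trials
        for n in sizes
    }

    def loss(d):
        return sum((model_ttr(n, d) - t) ** 2 for n, t in observed.items())

    # Coarse grid search over D in [1.0, 200.0] with step 0.1.
    candidates = [x / 10 for x in range(10, 2001)]
    return min(candidates, key=loss)


def diversity_by_chunk_size(tokens, chunk_sizes):
    """Return (size, TTR, D) for the first `size` tokens of the text."""
    return [(n, ttr(tokens[:n]), estimate_d(tokens[:n])) for n in chunk_sizes]


if __name__ == "__main__":
    # Hypothetical token stream with a Zipf-like frequency profile,
    # used here only so the example runs without an external corpus.
    rng = random.Random(42)
    vocab = [f"w{i}" for i in range(1, 301)]
    weights = [1 / i for i in range(1, 301)]
    tokens = rng.choices(vocab, weights=weights, k=600)

    for n, t, d in diversity_by_chunk_size(tokens, range(350, 551, 50)):
        print(f"{n} tokens: TTR = {t:.3f}, D = {d:.1f}")
```

Comparing how much each column changes across the chunk sizes gives an informal sense of how sensitive a measure is to text length, which is the property the study evaluates with statistical tests.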

Info:

Pages: 1244-1248

Online since: October 2011
