Finding Appropriate Lexical Diversity Measurements for Small-Size Corpus

Woon Ho Choi

doi:10.4028/www.scientific.net/AMM.121-126.1244

Abstract:

In this study, four lexical diversity measures were applied to sets of word chunks of monotonically increasing size. A computational experiment combining corpus processing with statistical testing was conducted to identify the most effective measure for evaluating a small corpus of 350 to 550 words. The results show that D-estimate is the most appropriate of the four measures considered. D-estimate also gives more stable results than the other measures when the number of words varies between texts.
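The abstract does not give formulas, so the following is only a minimal sketch of how the four measures listed in the keywords (TTR, Guiraud's R, Yule's K, and D-estimate) are commonly defined. It is not the paper's own code. All function names are hypothetical. The D-estimate routine assumes the widely described vocd-style procedure (mean TTR over random samples of 35 to 50 tokens, fitted to the model TTR = (D/N)(sqrt(1 + 2N/D) - 1)), and it swaps in a coarse grid search over D for a proper curve fit.

```python
import math
import random
from collections import Counter


def ttr(tokens):
    """Type-token ratio: V / N."""
    return len(set(tokens)) / len(tokens)


def guiraud_r(tokens):
    """Guiraud's R: V / sqrt(N)."""
    return len(set(tokens)) / math.sqrt(len(tokens))


def yule_k(tokens):
    """Yule's K: 10^4 * (sum_i i^2 * V_i - N) / N^2,
    where V_i is the number of types that occur exactly i times."""
    n = len(tokens)
    spectrum = Counter(Counter(tokens).values())  # maps i -> V_i
    s2 = sum(i * i * v_i for i, v_i in spectrum.items())
    return 1e4 * (s2 - n) / (n * n)


def model_ttr(n, d):
    """TTR predicted by the D model for a sample of n tokens."""
    return (d / n) * (math.sqrt(1 + 2 * n / d) - 1)


def estimate_d(tokens, sizes=range(35, 51), trials=100, seed=0):
    """Illustrative D-estimate: least-squares fit of the D model to
    mean sample TTRs. The sample sizes, trial count, and the grid of
    candidate D values (0.1 to 200.0) are assumptions, not taken from
    the paper."""
    rng = random.Random(seed)
    observed = []
    for n in sizes:
        # Mean TTR over random samples of n tokens drawn without replacement.
        mean = sum(ttr(rng.sample(tokens, n)) for _ in range(trials)) / trials
        observed.append((n, mean))

    def sse(d):
        return sum((t - model_ttr(n, d)) ** 2 for n, t in observed)

    candidates = [k / 10 for k in range(1, 2001)]
    return min(candidates, key=sse)
```

For a whitespace-tokenized text of at least 50 tokens, usage might look like `tokens = text.lower().split()` followed by `ttr(tokens)`, `guiraud_r(tokens)`, `yule_k(tokens)`, and `estimate_d(tokens)`. TTR and Guiraud's R depend directly on the token count N, which is one reason measures of this kind are compared across chunk sizes.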

Info:

Periodical: Applied Mechanics and Materials (Volumes 121-126)
Pages: 1244-1248
DOI: https://doi.org/10.4028/www.scientific.net/AMM.121-126.1244

Online since: October 2011
Authors: Woon Ho Choi
Keywords: D-Estimate, Guiraud's R, Lexical Diversity, Type-Token Ratio (TTR), Yule's K
