Open Access. Published by De Gruyter Mouton, January 13, 2010

Chinese Syntactic and Typological Properties Based on Dependency Syntactic Treebanks

  • Haitao Liu, Yiyi Zhao and Wenwen Li

This paper offers a quantitative analysis of the syntactic and typological properties of Chinese based on five Chinese dependency treebanks. The study shows that the mean dependency distance of Chinese is 2.84; that 40-50% of dependencies hold between non-adjacent words; that Chinese is a mixed language with a governor-final and SV-VO-AdjN preference; and that the mean dependency distance of governor-initial dependencies is greater than that of governor-final ones. Methodologically, the paper uses five treebanks with different text genres and annotation schemes as a resource for studying the syntactic features of a language. This method avoids corpus influences on the results, so the conclusions can be more reliable and robust. If suitable treebanks are available, the method can easily be applied to other languages. In this way, the method has a broad theoretical and cross-linguistic perspective.
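The measures named in the abstract can be illustrated with a minimal sketch. It assumes that dependency distance is the absolute difference between the linear positions of a governor and its dependent, so that adjacent words have distance 1, and that the root is excluded because it has no governor. The data layout, the function name `dependency_stats`, and the toy English sentence are hypothetical and are not taken from the paper or its treebanks.

```python
from typing import Dict, List, Optional, Tuple

# Assumed layout: a sentence is a list of (dependent_position, governor_position)
# pairs with 1-indexed positions; governor_position 0 marks the root.
Sentence = List[Tuple[int, int]]


def _mean(values: List[int]) -> Optional[float]:
    """Arithmetic mean, or None when there is nothing to average."""
    return sum(values) / len(values) if values else None


def dependency_stats(treebank: List[Sentence]) -> Dict[str, Optional[float]]:
    """Summarise dependency distance and direction over a list of sentences."""
    final: List[int] = []    # governor follows its dependent (governor-final)
    initial: List[int] = []  # governor precedes its dependent (governor-initial)
    for sentence in treebank:
        for dep, gov in sentence:
            if gov == 0:
                continue  # the root has no governor, so it has no distance
            distance = abs(gov - dep)
            (final if gov > dep else initial).append(distance)
    distances = final + initial
    if not distances:
        raise ValueError("treebank contains no dependencies")
    total = len(distances)
    return {
        "mean_distance": _mean(distances),
        "non_adjacent_share": sum(d > 1 for d in distances) / total,
        "governor_final_share": len(final) / total,
        "mean_distance_governor_final": _mean(final),
        "mean_distance_governor_initial": _mean(initial),
    }


# Toy sentence "She reads good books": She -> reads, reads = root,
# good -> books, books -> reads.
toy_treebank = [[(1, 2), (2, 0), (3, 4), (4, 2)]]
print(dependency_stats(toy_treebank))
```

Running the same function separately on each of several treebanks and comparing the summaries would mirror, in a very reduced form, the multi-treebank comparison the abstract describes.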

Published Online: 2010-1-13
Published in Print: 2009-12-1

This content is open access.

Downloaded on 30.4.2024 from https://www.degruyter.com/document/doi/10.2478/v10010-009-0025-3/html