Abstract
This paper presents the structure of a cross-linguistic database of production data. The database contains annotated texts collected from a sample of fifteen languages by means of identical data-gathering methods, which are designed to enable studies on the typology and universals of information structure. The distinguishing property of this database is that it combines the features of a natural language corpus with those of a typological database. The challenge for the exploration interface is to provide user-friendly support for exploiting this particular type of resource, thereby facilitating empirical generalizations about the collected data in the individual languages and comparisons among them.
We would like to thank Sam Hellmuth and two anonymous reviewers for their valuable comments. This paper is part of the projects D1 “Linguistic Databases for Information Structure: Annotation and Retrieval” and D2 “Typology of Information Structure” at the University of Potsdam (sponsored by the German Research Foundation).
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Götze, M., Skopeteas, S., Roloff, T., Stoel, R. (2007). Towards a Cross-Linguistic Production Data Archive: Structure and Exploration. In: ten Cate, B.D., Zeevat, H.W. (eds) Logic, Language, and Computation. TbiLLC 2005. Lecture Notes in Computer Science, vol 4363. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75144-1_10
DOI: https://doi.org/10.1007/978-3-540-75144-1_10
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-75143-4
Online ISBN: 978-3-540-75144-1
eBook Packages: Computer Science, Computer Science (R0)