Abstract
This chapter is concerned primarily with the development of new parallel corpora for English paired with Indic languages. Our discussion focuses on Panjabi, though the issues we explore apply fairly equally to other Indic languages and scripts. We highlight a range of difficulties facing those who construct corpus resources for these languages, especially parallel corpora. To this end, we describe two corpora, one of 16th-century Panjabi and one of modern Panjabi, and briefly present some preliminary work on English/Panjabi alignment.
© 2000 Springer Science+Business Media Dordrecht
About this chapter
Cite this chapter
Singh, S., McEnery, T., Baker, P. (2000). Building a parallel corpus of English/Panjabi. In: Véronis, J. (ed.) Parallel Text Processing. Text, Speech and Language Technology, vol 13. Springer, Dordrecht. https://doi.org/10.1007/978-94-017-2535-4_17
DOI: https://doi.org/10.1007/978-94-017-2535-4_17
Publisher Name: Springer, Dordrecht
Print ISBN: 978-90-481-5555-2
Online ISBN: 978-94-017-2535-4
eBook Packages: Springer Book Archive