Abstract
Machine transliteration is an automatic method for converting characters or words written in one alphabetical system into the corresponding characters or words in another alphabetical system. Interest in machine transliteration has grown because it supports machine translation and information retrieval. Three kinds of machine transliteration models have been proposed: the “grapheme-based model”, the “phoneme-based model”, and the “hybrid model”. However, few works make use of the correspondence between source graphemes and phonemes, although this correspondence plays an important role in machine transliteration. Furthermore, few works handle source graphemes and phonemes dynamically. In this paper, we propose a new transliteration model based on an ensemble of grapheme and phoneme. Our model makes use of the correspondence and dynamically uses source graphemes and phonemes. Our method outperforms previous work by about 15–23% in English-to-Korean transliteration and by about 15–43% in English-to-Japanese transliteration.
Copyright information
© 2005 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Oh, JH., Choi, KS. (2005). An Ensemble of Grapheme and Phoneme for Machine Transliteration. In: Dale, R., Wong, KF., Su, J., Kwong, O.Y. (eds) Natural Language Processing – IJCNLP 2005. IJCNLP 2005. Lecture Notes in Computer Science, vol 3651. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11562214_40
DOI: https://doi.org/10.1007/11562214_40
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-29172-5
Online ISBN: 978-3-540-31724-1
eBook Packages: Computer Science (R0)