An Ensemble of Grapheme and Phoneme for Machine Transliteration

  • Conference paper
Natural Language Processing – IJCNLP 2005 (IJCNLP 2005)

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 3651)

Abstract

Machine transliteration automatically converts characters or words written in one alphabetical system into the corresponding characters in another alphabetical system. Interest in machine transliteration has been growing because it supports machine translation and information retrieval. Three kinds of machine transliteration model have been proposed: the “grapheme-based model”, the “phoneme-based model”, and the “hybrid model”. However, few works make use of the correspondence between source grapheme and phoneme, although this correspondence plays an important role in machine transliteration. Furthermore, few works handle source grapheme and phoneme dynamically. In this paper, we propose a new transliteration model based on an ensemble of grapheme and phoneme. Our model makes use of the correspondence and dynamically uses source grapheme and phoneme. Our method outperforms previous works by about 15~23% in English-to-Korean transliteration and by about 15~43% in English-to-Japanese transliteration.
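As a rough illustration of what using the grapheme-phoneme correspondence could look like, the sketch below builds context features from a source word whose graphemes have already been aligned with their phonemes. This is not the model from the paper: the alignment of "board", the phoneme symbols, the padding symbol `$`, and the function name `context_features` are all assumptions made for this example only.

```python
from typing import Dict, List, Tuple

# Hypothetical hand-made alignment of the English word "board".
# Each source grapheme is paired with a phoneme; "-" marks a silent grapheme.
# The phoneme symbols are illustrative, not taken from the paper.
ALIGNED_BOARD: List[Tuple[str, str]] = [
    ("b", "B"), ("o", "AO"), ("a", "-"), ("r", "R"), ("d", "D"),
]


def context_features(
    aligned: List[Tuple[str, str]], i: int, window: int = 1
) -> Dict[str, str]:
    """Collect grapheme and phoneme features in a window around position i.

    Positions outside the word are padded with "$" (an assumed boundary symbol).
    """
    feats: Dict[str, str] = {}
    for offset in range(-window, window + 1):
        j = i + offset
        if 0 <= j < len(aligned):
            grapheme, phoneme = aligned[j]
        else:
            grapheme, phoneme = "$", "$"
        # Keeping both views at the same offset preserves their correspondence.
        feats[f"g[{offset}]"] = grapheme
        feats[f"p[{offset}]"] = phoneme
    return feats


if __name__ == "__main__":
    for idx in range(len(ALIGNED_BOARD)):
        print(idx, context_features(ALIGNED_BOARD, idx))
```

A classifier that predicts a target grapheme from such features sees the source grapheme and its phoneme together, instead of only one of them. How the paper's model weighs or selects between the two is not stated in the abstract, so it is left out of this sketch.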

Copyright information

© 2005 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Oh, JH., Choi, KS. (2005). An Ensemble of Grapheme and Phoneme for Machine Transliteration. In: Dale, R., Wong, KF., Su, J., Kwong, O.Y. (eds) Natural Language Processing – IJCNLP 2005. IJCNLP 2005. Lecture Notes in Computer Science, vol 3651. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11562214_40

  • DOI: https://doi.org/10.1007/11562214_40

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-29172-5

  • Online ISBN: 978-3-540-31724-1

  • eBook Packages: Computer Science, Computer Science (R0)
