DOI: 10.3115/1219840.1219906

Clause restructuring for statistical machine translation

Published: 25 June 2005

ABSTRACT

We describe a method for incorporating syntactic information in statistical machine translation systems. The first step of the method is to parse the source language string that is being translated. The second step is to apply a series of transformations to the parse tree, effectively reordering the surface string on the source language side of the translation system. The goal of this step is to recover an underlying word order that is closer to the target language word order than that of the original string. The reordering approach is applied as a pre-processing step in both the training and decoding phases of a phrase-based statistical MT system. We describe experiments on translation from German to English, showing an improvement from a BLEU score of 25.2% for a baseline system to a BLEU score of 26.8% for the system with reordering, a statistically significant improvement.
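
The two steps described in the abstract (parse, then transform the parse tree to reorder the source string) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the `Node` class, the hand-built toy parse, the tag names, and the single `verb_initial` rule are hypothetical and are not taken from the paper's actual rule set or parser output.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class Node:
    """A parse-tree node: leaves carry a word, internal nodes carry children."""
    label: str
    word: Optional[str] = None
    children: List["Node"] = field(default_factory=list)


def leaves(node: Node) -> List[str]:
    """Read the surface string off the tree, left to right."""
    if node.word is not None:
        return [node.word]
    words: List[str] = []
    for child in node.children:
        words.extend(leaves(child))
    return words


def verb_initial(node: Node) -> Node:
    """Hypothetical rule: inside every VP, move the first child whose tag
    starts with 'VV' (a full verb) to the front of that VP."""
    for child in node.children:
        verb_initial(child)
    if node.label == "VP":
        for i, child in enumerate(node.children):
            if child.label.startswith("VV"):
                node.children.insert(0, node.children.pop(i))
                break  # stop right after mutating the list
    return node


def reorder(tree: Node, rules: List[Callable[[Node], Node]]) -> str:
    """Apply each transformation in turn, then return the reordered string."""
    for rule in rules:
        tree = rule(tree)
    return " ".join(leaves(tree))


# Toy, hand-built parse of "ich habe das Buch gelesen" (assumed structure).
tree = Node("S", children=[
    Node("PPER", "ich"),
    Node("VAFIN", "habe"),
    Node("VP", children=[
        Node("NP", children=[Node("ART", "das"), Node("NN", "Buch")]),
        Node("VVPP", "gelesen"),
    ]),
])

reordered = reorder(tree, [verb_initial])
print(reordered)  # ich habe gelesen das Buch
```

In this sketch the reordered German string follows the English order "I have read the book" more closely than the original does. As the abstract states, such reordering would be applied as pre-processing to the source side in both training and decoding of the phrase-based system.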

Published in

ACL '05: Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics
June 2005, 657 pages
General Chair: Kevin Knight

Publisher

Association for Computational Linguistics, United States


Acceptance Rates

ACL '05 Paper Acceptance Rate: 77 of 423 submissions, 18%
Overall Acceptance Rate: 85 of 443 submissions, 19%
