ABSTRACT
Decision rules that explicitly account for non-probabilistic evaluation metrics in machine translation typically require special training, often to estimate parameters in exponential models that govern the search space and the selection of candidate translations. While the traditional Maximum A Posteriori (MAP) decision rule can be optimized as a piecewise linear function in a greedy search of the parameter space, the Minimum Bayes Risk (MBR) decision rule is not well suited to this technique, a condition that makes past results difficult to compare. We present a novel training approach for non-tractable decision rules, allowing us to compare and evaluate these and other decision rules on a large-scale translation task, taking advantage of the high-dimensional parameter space available to the phrase-based Pharaoh decoder. This comparison is timely and important, as decoders evolve to represent more complex search space decisions and are evaluated against innovative evaluation metrics of translation quality.
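The contrast between the two decision rules named in the abstract can be illustrated with a minimal sketch over a toy n-best list. This is not the training approach of the paper: the function names, the candidate list, the posterior values, and the word-overlap loss are all hypothetical, chosen only to show that MAP selects the single most probable candidate while MBR selects the candidate with the lowest expected loss against the other candidates.

```python
from typing import Callable, Sequence


def map_decision(candidates: Sequence[str], posteriors: Sequence[float]) -> str:
    """MAP rule: return the candidate with the highest posterior probability."""
    best = max(range(len(candidates)), key=lambda i: posteriors[i])
    return candidates[best]


def mbr_decision(
    candidates: Sequence[str],
    posteriors: Sequence[float],
    loss: Callable[[str, str], float],
) -> str:
    """MBR rule: return the candidate with the lowest expected loss,
    where the expectation is taken over the same candidate list."""

    def expected_loss(i: int) -> float:
        return sum(
            p * loss(candidates[i], other)
            for other, p in zip(candidates, posteriors)
        )

    best = min(range(len(candidates)), key=expected_loss)
    return candidates[best]


def word_overlap_loss(hyp: str, ref: str) -> float:
    """Toy loss (an assumption, not BLEU): one minus the Jaccard overlap
    of the two word sets."""
    h, r = set(hyp.split()), set(ref.split())
    union = h | r
    return 1.0 - len(h & r) / len(union) if union else 0.0


# Hypothetical 3-best list with made-up posterior probabilities.
CANDIDATES = ["the cat sat", "a cat sat down", "a cat sat"]
POSTERIORS = [0.4, 0.3, 0.3]

map_choice = map_decision(CANDIDATES, POSTERIORS)
mbr_choice = mbr_decision(CANDIDATES, POSTERIORS, word_overlap_loss)
print("MAP:", map_choice)
print("MBR:", mbr_choice)
```

In this toy setting the two rules disagree: the most probable candidate is not the one closest, on average, to the rest of the list. A real system would replace the toy loss with a metric-derived loss and obtain posteriors from the decoder's model scores.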
- Training and evaluating error minimization rules for statistical machine translation