Training and evaluating error minimization rules for statistical machine translation

Authors: Ashish Venugopal, Andreas Zollmann, Alex Waibel (all Carnegie Mellon University)

ParaText '05: Proceedings of the ACL Workshop on Building and Using Parallel Texts, June 2005, Pages 208–215

Published: 29 June 2005

ABSTRACT

Decision rules that explicitly account for non-probabilistic evaluation metrics in machine translation typically require special training, often to estimate parameters in exponential models that govern the search space and the selection of candidate translations. While the traditional Maximum A Posteriori (MAP) decision rule can be optimized as a piecewise linear function in a greedy search of the parameter space, the Minimum Bayes Risk (MBR) decision rule is not well suited to this technique, which makes past results difficult to compare. We present a novel training approach for non-tractable decision rules, allowing us to compare and evaluate these and other decision rules on a large-scale translation task, taking advantage of the high-dimensional parameter space available to the phrase-based Pharaoh decoder. This comparison is timely and important, as decoders evolve to represent more complex search space decisions and are evaluated against innovative evaluation metrics of translation quality.
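To make the contrast between the two decision rules concrete, the following is a minimal sketch of MAP and MBR selection over a small n-best list. It is an illustration under stated assumptions, not the training approach the paper presents: the function names, the toy n-best list, the model scores, and the unigram-overlap loss (a simple stand-in for a metric-based loss such as 1 - BLEU) are all hypothetical and do not come from the paper or from the Pharaoh decoder.

```python
import math
from collections import Counter


def posteriors(scores, scale=1.0):
    """Turn log-linear model scores into a distribution over the n-best list."""
    top = max(scores)  # subtract the maximum for numerical stability
    weights = [math.exp(scale * (s - top)) for s in scores]
    total = sum(weights)
    return [w / total for w in weights]


def unigram_loss(hyp, ref):
    """Toy loss: 1 - unigram F1 overlap (assumed stand-in for 1 - BLEU)."""
    h, r = Counter(hyp.split()), Counter(ref.split())
    overlap = sum((h & r).values())
    if overlap == 0:
        return 1.0
    precision = overlap / sum(h.values())
    recall = overlap / sum(r.values())
    return 1.0 - 2 * precision * recall / (precision + recall)


def map_decision(nbest, scores):
    """MAP: return the hypothesis with the highest model score."""
    best = max(range(len(nbest)), key=lambda i: scores[i])
    return nbest[best]


def mbr_decision(nbest, scores, loss=unigram_loss, scale=1.0):
    """MBR: return the hypothesis with the lowest expected loss,
    where the expectation is taken over the n-best posterior."""
    post = posteriors(scores, scale)

    def risk(i):
        return sum(p * loss(nbest[i], other) for other, p in zip(nbest, post))

    return nbest[min(range(len(nbest)), key=risk)]


# Hypothetical n-best list with made-up log-linear scores.
nbest = [
    "he went home",
    "she walked to the house",
    "she walked to the home",
    "she walked to a house",
]
scores = [-1.0, -1.2, -1.3, -1.4]

map_choice = map_decision(nbest, scores)
mbr_choice = mbr_decision(nbest, scores)
```

On this toy list, MAP returns the single top-scoring hypothesis ("he went home"), while MBR returns "she walked to the home", because three similar hypotheses together hold most of the probability mass and the chosen one has the lowest expected loss against all of them. The expected-loss sum over the whole list is also what makes the MBR objective harder to optimize than MAP with a simple line search over one parameter at a time.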


Published in
ParaText '05: Proceedings of the ACL Workshop on Building and Using Parallel Texts, June 2005, 233 pages
Program Chairs: Philipp Koehn (University of Edinburgh), Joel Martin (National Research Council of Canada), Rada Mihalcea (University of North Texas), Christof Monz (University of Maryland), Ted Pedersen (University of Minnesota, Duluth)
Publisher: Association for Computational Linguistics, United States