Publication

Hybrid Machine Translation Using Joint, Binarised Feature Vectors

Christian Federmann
In: Proceedings of the 20th Conference of the Association for Machine Translation in the Americas (AMTA-12), October 28 - November 1, San Diego, CA, USA, Association for Machine Translation in the Americas, 10/2012.

Abstract

We present an approach to Hybrid Machine Translation based on a Machine-Learning framework. Our method combines output from several source systems. We first define an extensible, total order on translations and use it to estimate a sentence-level ranking for a given set of systems. We introduce and define the notion of joint, binarised feature vectors. We train an SVM-based classifier and show how its classification results can be used to create hybrid translations. We describe a series of oracle experiments on data sets from the WMT11 translation task to establish an upper bound on the achievable translation quality. We also present results from initial experiments with an implemented version of our system. Evaluation using NIST and BLEU metrics indicates that the proposed method can outperform its individual source systems. An interesting finding is that our approach makes it possible to leverage good translations from otherwise bad systems, because translation quality estimation is based on sentence-level phenomena rather than corpus-level metrics. We conclude by summarising our findings and giving an outlook on future work.
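The sentence-level selection step can be illustrated with a minimal sketch. The abstract does not specify how the feature vectors are built or how classifier decisions are combined, so everything below is an assumption made for illustration: the encoding in `joint_binarised` (1 where system A's feature value exceeds system B's), the round-robin win count in `select_best`, the feature values, and the system names are all hypothetical. The function `majority_vote` is a simple stand-in for the trained SVM-based classifier, not the classifier itself.

```python
from itertools import permutations


def joint_binarised(feats_a, feats_b):
    """Compare two systems feature by feature (assumed encoding).

    Returns 1 where A's value exceeds B's, else 0. Higher is assumed better.
    """
    return [1 if a > b else 0 for a, b in zip(feats_a, feats_b)]


def majority_vote(joint_vector):
    """Hypothetical stand-in for the trained classifier.

    Prefers A over B when A wins on more than half of the features.
    """
    return sum(joint_vector) * 2 > len(joint_vector)


def select_best(candidates, classify):
    """Pick the system with the most pairwise wins for one sentence.

    candidates: dict mapping system name -> list of feature values.
    classify:   function from joint vector -> True if A is preferred over B.
    """
    wins = {name: 0 for name in candidates}
    for a, b in permutations(candidates, 2):
        if classify(joint_binarised(candidates[a], candidates[b])):
            wins[a] += 1
    # Sort names first so ties are broken deterministically.
    return max(sorted(wins), key=lambda name: wins[name])


def hybrid_translation(sentences, classify):
    """Build a hybrid output by choosing one system's translation per sentence.

    sentences: list of dicts mapping system name -> (translation, features).
    """
    output = []
    for per_system in sentences:
        features = {name: feats for name, (_, feats) in per_system.items()}
        best = select_best(features, classify)
        output.append(per_system[best][0])
    return output


# Invented example data: two sentences, hypothetical systems and features.
sentences = [
    {
        "sysA": ("translation A1", [0.9, 0.2, 0.7]),
        "sysB": ("translation B1", [0.4, 0.5, 0.1]),
        "sysC": ("translation C1", [0.1, 0.1, 0.9]),
    },
    {
        "sysA": ("translation A2", [0.1, 0.2, 0.3]),
        "sysC": ("translation C2", [0.5, 0.6, 0.1]),
    },
]

hybrid = hybrid_translation(sentences, majority_vote)
```

In this invented data, `sysC` loses on the first sentence but is selected for the second, which mirrors the point that the decision is made per sentence rather than per corpus.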