Improving an Existing RBMT System by Stochastic Analysis

Christian Federmann, Sabine Hunsicker

In: Hamburg Working Paper in Multilingualism. Conference of the German Society for Computational Linguistics and Language Technology (GSCL-11), September 28-30, Hamburg, Germany. German Society for Computational Linguistics and Language Technology, 9/2011.


In this paper, we describe how an existing rule-based machine translation (RBMT) system that follows a transfer-based translation approach can be improved by integrating stochastic knowledge into its analysis phase. First, we investigate how often the rule-based system selects the wrong analysis tree, in order to determine the potential benefit of an improved selection method. Afterwards, we describe an extended architecture that allows an external stochastic parser to be integrated into the analysis phase of the RBMT system. We report results from both automatic metrics and human evaluation, and give some examples that show the improvements that can be obtained with such a hybrid machine translation setup. While the work reported in this paper is a dedicated extension of a specific rule-based machine translation system, the overall approach can be used with any transfer-based RBMT system. Adding stochastic knowledge to an existing rule-based machine translation system is an example of a successful hybrid combination of different MT paradigms in a joint system.


GSCL-2011-Federmann-Hunsicker.pdf (pdf, 136 KB)

German Research Center for Artificial Intelligence
Deutsches Forschungszentrum für Künstliche Intelligenz