Involving language professionals in the evaluation of machine translation

Maja Popovic, Eleftherios Avramidis, Aljoscha Burchardt, Sabine Hunsicker, Sven Schmeier, Cindy Tscherwinka, David Vilar, Hans Uszkoreit

In: Journal on Language Resources and Evaluation 48 4 Seiten 541-559 Springer 2014.


Significant breakthroughs in machine translation only seem possible if human translators are taken into the loop. While automatic evaluation and scoring mechanisms such as \bleu have enabled the fast development of systems, it is not clear how systems can meet real-world (quality) requirements in industrial translation scenarios today. The taraXÜ project has paved the way for wide usage of multiple machine translation outputs through various feedback loops in system development. The project has integrated human translators into the development process thus collecting feedback for possible improvements. This paper describes results from detailed human evaluation. Performance of different types of translation systems has been compared and analysed via ranking, error analysis and post-editing.


Weitere Links

evaluation-revised.pdf (pdf, 175 KB )

Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence