Fine-grained evaluation of Quality Estimation for Machine translation based on a linguistically-motivated Test Suite

Eleftherios Avramidis, Vivien Macketanz, Arle Lommel, Hans Uszkoreit

In: Proceedings for AMTA 2018 Workshop: Translation Quality Estimation and Automatic Post-Editing. Conference of the Association for Machine Translation in the Americas (AMTA-2018), March 21, Boston, United States. Pages 243-248. Association for Machine Translation in the Americas, 3/2018.


We present an alternative method of evaluating Quality Estimation systems, which is based on a linguistically-motivated Test Suite. We create a test set consisting of 14 linguistic error categories, and for each category we gather a set of samples with both correct and erroneous translations. We then measure the performance of 5 Quality Estimation systems by checking their ability to distinguish between the correct and the erroneous translations. The detailed, per-category results are much more informative about the abilities of each system. The fact that different Quality Estimation systems perform differently on various phenomena confirms the usefulness of the Test Suite.
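
The evaluation idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the pass criterion (a system "passes" a sample when it scores the correct translation strictly higher than the erroneous one), the category names, the example sentences, and the `toy_qe` scorer with its fixed scores are all assumptions made for illustration only.

```python
from collections import defaultdict


def per_category_accuracy(samples, qe_score):
    """Return, per category, the fraction of samples in which the QE system
    scores the correct translation strictly higher than the erroneous one.

    samples  : iterable of (category, source, correct_mt, erroneous_mt)
    qe_score : callable (source, translation) -> float, higher means better
    """
    hits = defaultdict(int)
    totals = defaultdict(int)
    for category, source, correct_mt, erroneous_mt in samples:
        totals[category] += 1
        # Assumed pass criterion: strict preference for the correct translation.
        if qe_score(source, correct_mt) > qe_score(source, erroneous_mt):
            hits[category] += 1
    return {category: hits[category] / totals[category] for category in totals}


# Hypothetical stand-in for a real QE system: fixed, made-up scores.
toy_scores = {
    "He does not come.": 0.8,
    "He comes.": 0.6,
    "She did not see it.": 0.5,
    "She saw it.": 0.7,
    "They had left.": 0.9,
    "They have leave.": 0.2,
}


def toy_qe(source, translation):
    # Ignores the source; a real QE system would use both inputs.
    return toy_scores[translation]


# Hypothetical test items; the category names are placeholders.
samples = [
    ("negation", "Er kommt nicht.", "He does not come.", "He comes."),
    ("negation", "Sie sah es nicht.", "She did not see it.", "She saw it."),
    ("verb tense", "Sie waren gegangen.", "They had left.", "They have leave."),
]

result = per_category_accuracy(samples, toy_qe)
print(result)
```

Reporting one accuracy value per category, rather than a single aggregate number, is what makes differences between systems visible at the level of individual phenomena.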

Avramidis_et_al_2018_fine-grained-QE.pdf (pdf, 253 KB)

Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence