Publication
Information Density and Quality Estimation Features as Translationese Indicators for Human Translation Classification
Raphael Rubino; Ekaterina Lapshinova-Koltunski; Josef van Genabith
In: Proceedings of the 2016 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies. Annual Conference of the North American Chapter of the Association for Computational Linguistics (HLT-NAACL-2016), 15th, June 12-17, San Diego, CA, USA, 2016.
Abstract
This paper introduces features inspired by information density and machine translation
quality estimation to automatically detect and classify human-translated texts. We investigate
two settings: discriminating between translations and comparable originally authored texts,
and distinguishing two levels of translation professionalism. Our framework is based on
delexicalised sentence-level dense feature vector representations combined with a
supervised machine learning approach. The results show state-of-the-art performance for
mixed-domain translationese detection with features based on information density and
quality estimation, while results on translation expertise classification are mixed.
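
To make the idea of a delexicalised, sentence-level dense feature vector more concrete, the following is a minimal sketch and not the feature set used in the paper. It assumes that sentences have already been reduced to part-of-speech tag sequences and that information density is approximated by surprisal under an add-one smoothed unigram model over those tags. The function names, the toy tag inventory, and the three chosen features are all hypothetical.

```python
import math
from collections import Counter


def train_unigram(tag_sequences):
    """Count tags over training sentences (each sentence is a list of POS tags)."""
    counts = Counter(tag for sent in tag_sequences for tag in sent)
    return counts, sum(counts.values())


def surprisal(tag, counts, total, vocab_size):
    """Surprisal in bits of one tag under an add-one smoothed unigram model."""
    prob = (counts[tag] + 1) / (total + vocab_size)
    return -math.log2(prob)


def sentence_features(tags, counts, total, vocab_size):
    """Dense vector for one non-empty sentence: [length, mean surprisal, max surprisal]."""
    values = [surprisal(t, counts, total, vocab_size) for t in tags]
    return [float(len(tags)), sum(values) / len(values), max(values)]


# Toy training data (assumption): two sentences, already delexicalised to POS tags.
train = [["DET", "NOUN", "VERB"], ["DET", "ADJ", "NOUN", "VERB"]]
counts, total = train_unigram(train)
vocab_size = len(counts) + 1  # one extra slot reserved for unseen tags

# "ADV" never occurs in the training data, so it receives the highest surprisal.
features = sentence_features(["DET", "NOUN", "ADV"], counts, total, vocab_size)
```

Vectors of this kind, one per sentence, could then be passed with their labels (translated versus original, or professional versus student translation) to any supervised classifier. Working on tags rather than words keeps the representation independent of topic vocabulary, which is the motivation for delexicalisation in a mixed-domain setting.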