Mobile Texting: Can Post-ASR Correction Solve the Issues? An Experimental Study on Gain vs. Costs

Michael Feld; Saeedeh Momtazi; Farina Freigang; Dietrich Klakow; Christian Müller

In: Proceedings of the 17th International Conference on Intelligent User Interfaces (IUI 2012), February 14-17, Lisbon, Portugal, ACM, 2/2012.


The next big step in embedded, mobile speech recognition will be to allow completely free input, as needed for messaging such as SMS or email. However, unconstrained dictation remains error-prone, especially in noisy environments. In this paper, we compare different methods for improving a given free-text dictation system used to enter text-based messages in embedded mobile scenarios, where distraction, interaction cost, and hardware limitations impose stricter constraints than in traditional scenarios. We present a corpus-based evaluation that measures the trade-off between the improvement in word error rate and the number of interaction steps required under various parameters. Results show that by post-processing the output of a "black box" speech recognizer (e.g. a web-based speech recognition service), a reduction of word error rate by 55% (10.3% abs.) can be obtained. For further error reduction, however, a richer representation of the original hypotheses (e.g. a lattice) is necessary.
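
The abstract reports both a relative and an absolute reduction in word error rate. As a reading aid, the following minimal sketch shows how the two figures relate. It assumes the standard definition of word error rate (word-level edit distance divided by the number of reference words); the function names and the example strings are hypothetical and are not taken from the paper's evaluation setup.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words.

    Assumption: standard WER definition with unit cost for substitutions,
    insertions, and deletions; whitespace tokenization.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1      # reference word missing in hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


def wer_reduction(wer_before: float, wer_after: float) -> tuple[float, float]:
    """Return (absolute, relative) reduction between two WER values."""
    if wer_before <= 0:
        raise ValueError("wer_before must be positive")
    absolute = wer_before - wer_after
    relative = absolute / wer_before
    return absolute, relative


if __name__ == "__main__":
    # Hypothetical dictated message and recognizer output.
    reference = "see you at the station at five"
    raw_output = "see you at the nation at five"
    corrected_output = "see you at the station at five"

    before = word_error_rate(reference, raw_output)
    after = word_error_rate(reference, corrected_output)
    absolute, relative = wer_reduction(before, after)
    print(f"WER before: {before:.3f}, after: {after:.3f}")
    print(f"absolute reduction: {absolute:.3f}, relative reduction: {relative:.1%}")
```

The absolute reduction is the difference between the two error rates in percentage points, while the relative reduction expresses that difference as a fraction of the error rate before correction.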


German Research Center for Artificial Intelligence
Deutsches Forschungszentrum für Künstliche Intelligenz