Determining Latency for on-line Dialog Act Classification

Sebastian Germesin, Peter Poller, Tilman Becker

In: Andrei Popescu-Belis, Rainer Stiefelhagen (Eds.). Machine Learning for Multimodal Interaction. Machine Learning and Multimodal Interaction (MLMI-08), September 8-10, Utrecht, Netherlands, Lecture Notes in Computer Science (LNCS) 5237, ISBN 978-3-540-85852-2, Springer, Heidelberg, 9/2008.


This paper presents results from our ongoing research on the recursive classification of dialog acts. We successfully used dynamic features, such as the labels of previous and upcoming dialog acts, in a statistical machine learning approach to gain information about the class of the current dialog act. In a real-time application, however, the labels of upcoming dialog acts are not yet available when the current one is classified. These features therefore change over time: whenever new dialog acts are classified, the already classified dialog acts have to be re-classified with the new information. We found that a latency of about 60 dialog acts, which corresponds to nearly 2 minutes, is sufficient to reach the maximum detection rate. Furthermore, a latency of only 30 segments (60 seconds) already yields about 50% of the maximum achievable improvement.
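
The re-classification scheme described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function `classify_online`, the rule-based `toy_classify`, the label names, and the choice to revisit earlier segments from newest to oldest are all assumptions made for illustration. The `latency` parameter is the number of earlier segments that are re-classified each time a new segment arrives; at the rate implied by the abstract (about 2 seconds per dialog act), a latency of 60 would correspond to roughly 2 minutes.

```python
from typing import Callable, List, Optional, Sequence

Label = str
# Hypothetical classifier interface: (segment, previous label, next label) -> label.
Classifier = Callable[[str, Optional[Label], Optional[Label]], Label]


def classify_online(segments: Sequence[str],
                    classify: Classifier,
                    latency: int) -> List[Label]:
    """Label segments as they arrive, re-labelling up to `latency` earlier ones."""
    labels: List[Label] = []
    for i, segment in enumerate(segments):
        # First pass: the label of the upcoming dialog act is not available yet.
        prev_label = labels[i - 1] if i > 0 else None
        labels.append(classify(segment, prev_label, None))

        # Re-classify the last `latency` segments, newest first (an assumption),
        # so that each one sees the freshest label of its successor.
        start = max(0, i - latency)
        for j in range(i - 1, start - 1, -1):
            prev_j = labels[j - 1] if j > 0 else None
            labels[j] = classify(segments[j], prev_j, labels[j + 1])
    return labels


def toy_classify(segment: str,
                 prev_label: Optional[Label],
                 next_label: Optional[Label]) -> Label:
    """Toy stand-in for a statistical model; the rules are purely illustrative."""
    if segment.endswith("?"):
        return "question"
    if segment == "hm":
        # Ambiguous segment: only the label of the next dialog act disambiguates it.
        return "stall" if next_label == "statement" else "backchannel"
    return "statement"


segments = ["hm", "we should start", "ready?"]
no_latency = classify_online(segments, toy_classify, latency=0)
with_latency = classify_online(segments, toy_classify, latency=1)
```

With `latency=0` the ambiguous first segment keeps its initial label, whereas with `latency=1` it is revised once the label of the following segment is known. This is the trade-off the paper quantifies: a larger latency lets more labels be corrected, at the cost of a longer delay before they become final.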

Further Links

germesin_da_mlmi08.pdf (pdf, 129 KB)

Deutsches Forschungszentrum für Künstliche Intelligenz
German Research Center for Artificial Intelligence