Welcome to the Real World

Since work on the VERBMOBIL system always has to take into account the complete processing chain from speech input to speech output, we have to deal with incorrect information caused by shortcomings in other modules, especially speech recognition. The following two examples show how drastically even small recognition errors can affect interpretation (in each pair, the spoken utterance comes first and the recognized string second):

(a)
Unfortunately I have only time in December.
unfortunately I have a meeting of December.

(b)
When would be a good time for us to meet?
one would be a good time for us to meet.

Utterance (a) will be interpreted as a rejection of December as a possible date (since mentioning a meeting usually signals a rejection of a date together with its reason), although the speaker in fact offers December. Utterance (b) triggers the false suggestion of one o'clock for the meeting.

These utterances were taken from a corpus of end-to-end evaluation dialogues recorded in test runs under realistic conditions, i.e., with only the VERBMOBIL system as a translator between an English and a German speaker. Although it is almost impossible to recover the original meaning of utterances such as those above, other heavily damaged input, like the next example, is easily interpreted correctly by our shallow analysis approach:

(c)
OK, could you repeat please. Do you have time on the fifth and sixth, or don't you?
repeat please do you have time. on the fifth and sixth why don't you.

Apart from imperfect input, two other factors influence our approach to dialogue understanding in VERBMOBIL. First, users of VERBMOBIL in its current state have to deal with inaccurate translations and therefore make ample use of confirmations and clarifications. In general, people behave differently in a real application of VERBMOBIL than under the ideal conditions of the monolingual sample dialogues from the VERBMOBIL corpus. Second, since VERBMOBIL acts as a mediator between two or more parties, it is not supposed to pose questions itself (clarification dialogues). Its dialogue engine does not interact with the users but solely tracks the dialogue; it has to obey the principle of unobtrusiveness. As a result, we pursue an approach using methods not unlike those of Information Extraction (IE, cf. [Hobbs et al. 1996]): we know what to expect, we try to extract as much information as possible and put it into frames, checking consistency along the way.

To guarantee robustness under such working conditions, a number of domain-specific assumptions have to be made and implemented (see section 4.4).
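
The frame-filling idea described above can be illustrated with a minimal sketch. It is not the VERBMOBIL implementation: the names `DateFrame` and `extract_date_frame`, the tiny lexicons, and the consistency rule are all assumptions made for illustration. The sketch only shows how a keyword-driven extraction can ignore damaged parts of a recognized string, such as the one in example (c), and still fill the slots it expects.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# Hypothetical, deliberately tiny lexicons (assumptions for this sketch only).
MONTHS = {"january": 1, "february": 2, "march": 3, "april": 4, "june": 6,
          "july": 7, "august": 8, "september": 9, "october": 10,
          "november": 11, "december": 12}
ORDINALS = {"first": 1, "second": 2, "third": 3, "fourth": 4, "fifth": 5,
            "sixth": 6, "seventh": 7, "eighth": 8, "ninth": 9, "tenth": 10,
            "thirtieth": 30}
DAYS_IN_MONTH = {1: 31, 2: 29, 3: 31, 4: 30, 5: 31, 6: 30,
                 7: 31, 8: 31, 9: 30, 10: 31, 11: 30, 12: 31}


@dataclass
class DateFrame:
    """A frame with the slots we expect in a scheduling utterance."""
    month: Optional[int] = None
    days: List[int] = field(default_factory=list)

    def is_consistent(self) -> bool:
        # Assumed consistency rule: every day must exist in the given month
        # (or be a plausible day of month if no month was mentioned).
        limit = DAYS_IN_MONTH[self.month] if self.month is not None else 31
        return all(1 <= day <= limit for day in self.days)


def extract_date_frame(recognized: str) -> DateFrame:
    """Fill a DateFrame from known keywords, skipping everything else."""
    frame = DateFrame()
    for token in re.findall(r"[a-z']+", recognized.lower()):
        if token in MONTHS:
            frame.month = MONTHS[token]
        elif token in ORDINALS:
            frame.days.append(ORDINALS[token])
    return frame


# Recognized string of example (c): the days survive the recognition errors.
frame_c = extract_date_frame(
    "repeat please do you have time. on the fifth and sixth why don't you.")
print(frame_c.days, frame_c.is_consistent())  # [5, 6] True
```

Because the extraction looks only for expected items, punctuation and word errors around "the fifth and sixth" do not matter. The same property explains why example (a) cannot be repaired this way: the word that carried the original meaning is no longer in the recognized string.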



Jan Alexandersson
Thu Nov 11 15:15:06 MET 1999