
Introduction

Travel planning has been a common application domain for spoken-language dialogue systems almost from their inception, both as a pure research vehicle and now, with maturing speech technology, in fielded prototypes. Fielded systems naturally tend to employ simpler linguistic and dialogue processing. For utterance interpretation, domain-specific keyword/phrase spotting and slot-filling techniques are preferred. At the dialogue level, systems tend to keep the initiative to themselves, treating the user simply as a supplier of answers. Individual systems may also implement particular instances of more sophisticated processing. The simple methods nevertheless fit together well: the more expectations a system can impose on a dialogue, the more those expectations can be used to aid interpretation of user utterances. (For a range of recent work, see [Aust and Oerder 1995], [Allen et al. 1996], [Lamel et al. 1998], [Litman et al. 1998] and [Bos et al. 1999].)

In the work described here, we are primarily interested in relaxing the constraint that dialogues be system-driven, and in combining sophisticated (but sometimes brittle) linguistic processing with simple (but generally robust) processing. We hypothesize that different techniques may be applicable at different points in a dialogue. The specific scenario used was that of booking a business trip within Sweden, by air or train, and accessing information about times, destinations and fares. Communication in both directions was entirely in spoken Swedish. The underlying database was the Travellink (TM) system, accessible at http://www.travellink.se.

Prior to designing the system, we collected a corpus of data through a Wizard-of-Oz experiment, obtaining altogether 131 dialogues from 47 subjects (31 male and 16 female); the Wizard's conversational style was purposely chosen so as to permit mixed-initiative user strategies. Analysis of the data showed significant variation in user behaviour. With respect to verbosity, for example, behaviour ranged from consistent use of short, telegraphic-style utterances to very long, disfluent utterances. With respect to initiative, there were inactive users who refrained completely from taking the initiative (in effect leaving it to the system to cross-examine them), active users who quickly took the initiative by means of counter-questions and kept it more or less throughout the dialogue, and users whose behaviour fell between these extremes. One of our immediate conclusions was that if mixed-initiative dialogues were supported, a large proportion of the people interacting with the system would make use of this capability.

Typically, we found that a dialogue about (a leg of) a trip could be subdivided into two phases. First, there is a specification phase, in which the user, possibly in response to system prompting, gives the basic constraints on the trip they are looking for: where they are going to, where they are coming from, the date, and some information about the desired departure or arrival time. We regard the specification phase as terminated when the system has collected enough information to access the database and suggest a specific trip. After this, there is a negotiation phase, in which the user may request additional information about the initially suggested trip, ask for alternative trips, and eventually make a booking. The balance between the two phases varied considerably. For the most active users, the negotiation phase dominated: it sometimes started even before the system had suggested any alternative, and could persist more or less throughout the dialogue. For the least active users, in contrast, the negotiation phase could be non-existent.
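The termination criterion for the specification phase can be illustrated with a minimal Python sketch. It is not taken from the system described here: the class `TripConstraints`, its field names, and the assumption that all four constraints are required before a database lookup are hypothetical choices made for illustration only. It also ignores the observation above that active users may start negotiating before any trip has been suggested.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TripConstraints:
    """Basic constraints collected during the specification phase.

    Field names are hypothetical; the text only lists the kinds of
    information the user supplies.
    """
    origin: Optional[str] = None
    destination: Optional[str] = None
    date: Optional[str] = None
    time: Optional[str] = None  # desired departure or arrival time

    def missing(self) -> List[str]:
        """Return the names of constraints not yet supplied, in field order."""
        return [name for name, value in vars(self).items() if value is None]


def current_phase(constraints: TripConstraints) -> str:
    """Assumed rule: negotiation can begin once nothing is missing."""
    return "specification" if constraints.missing() else "negotiation"
```

In this sketch a system-driven dialogue would prompt for the first entry of `missing()`, whereas a mixed-initiative dialogue would accept whichever constraints the user volunteers, in any order.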

In general, we found that analysis of utterances during the negotiation phase required a higher degree of linguistic sophistication than during the specification phase. For example, it was often necessary to understand expressions referring to objects previously mentioned in the dialogue ("that flight", "the first flight"), or to distinguish between questions expecting a yes/no response ("Is that a direct flight?") and questions expecting a new object as a response ("Is there a direct flight?").

The above characteristics of the data and domain prompted us to focus on the following aspects in the design of the system:

Ability to handle context-dependent, mixed-initiative dialogues, in order to cover both phases of the dialogue as well as the range of active and inactive users.
Ability to do linguistic analysis deeper than surface slot-filling, so as to be able to distinguish between different forms of utterances critical to the domain.
Robustness, so as to be able to advance the dialogue even in the case of complex, disfluent utterances and errors likely to be introduced by the speech recognizer.

To meet these desiderata, we have taken an approach with the following distinguishing characteristics:

Linguistic analysis is factored into context-independent and context-dependent processing phases. The initial context-independent phase produces a set of descriptions based on the explicit form of the input utterance; the descriptions are then interpreted in the relevant context by the dialogue manager.
The local exchange of initiatives and responses is guided by domain-dependent moves and games, whereas the global goals are handled using an agenda.
To achieve both deep linguistic analysis and robustness, and to cover the different phases of the dialogue equally well, we augment the slot-filling method with a more sophisticated grammar-based method. The two parsing engines run in parallel and feed independently into the dialogue manager.
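The last characteristic, two analysers running side by side and each delivering its own result to the dialogue manager, can be sketched roughly as follows. This is a toy Python illustration under stated assumptions, not the system's implementation: the functions `slot_filler`, `grammar_parser` and `analyze`, the small city lexicon, and the two hard-coded question patterns are all hypothetical, and English stands in for the Swedish actually used. How the dialogue manager weighs the two results is left open.

```python
from concurrent.futures import ThreadPoolExecutor

CITIES = {"stockholm", "göteborg", "malmö"}  # toy lexicon (assumption)


def slot_filler(utterance):
    """Shallow but robust: spot 'from CITY' and 'to CITY' anywhere."""
    words = utterance.lower().replace("?", "").split()
    slots = {}
    for prev, word in zip(words, words[1:]):
        if word in CITIES and prev == "from":
            slots["origin"] = word
        elif word in CITIES and prev == "to":
            slots["destination"] = word
    return slots


def grammar_parser(utterance):
    """Deeper but brittle: recognizes only two toy question forms."""
    text = utterance.lower().rstrip("?").strip()
    if text.startswith("is that a "):
        # Asks about a property of an already mentioned object.
        return {"type": "yes_no_question", "property": text[len("is that a "):]}
    if text.startswith("is there a "):
        # Asks for a new object matching a description.
        return {"type": "object_request", "description": text[len("is there a "):]}
    return None  # no analysis for anything else


def analyze(utterance):
    """Run both analysers in parallel; pass both results on unmerged."""
    with ThreadPoolExecutor(max_workers=2) as pool:
        shallow = pool.submit(slot_filler, utterance)
        deep = pool.submit(grammar_parser, utterance)
        return {"slots": shallow.result(), "parse": deep.result()}
```

For "Is that a direct flight?" only the grammar-based analyser yields a result, while for "I want to go from Stockholm to Malmö" only the slot filler does, which mirrors the idea that different techniques may suit different points in a dialogue.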


Mats Wiren
Mon Oct 25 13:51:54 MET DST 1999