Anke Kölzer

Universal Dialogue Specification for Conversational Systems

[Full Text]
[send contribution]
[debate procedure]
[copyright]


Overview of interactions

No   Comment(s)                Answer(s)                 Continued discussion
1    16.2.2000 Arne Jönsson    24.2.2000 Anke Kölzer

C1. Arne Jönsson (16.2.00):

The paper presents an interesting attempt to develop tools for the development of dialogue systems. From the paper it seems that your current application is in focus rather than general theories of dialogue, i.e. it is more about how to graphically describe a domain in terms of what information is needed to access the background system than about how utterances are incorporated in the dialogue and what a user intends to convey in an utterance. How general do you think your model is: can it be useful for other applications as well, and can the same set of tools be used for other theories of dialogue?

The dialogue description language mentioned in section 5 is very interesting; what limitations does it have? We are currently working on such a language and have encountered a number of interesting problems related to, for instance, a declarative description of focus management and concurrent processes such as accessing the background system. How do you handle that in the statecharts, or do you consider it beyond the coverage of the statecharts?

It is also unclear how you represent recursive clarifications, i.e. clarifications of clarifications etc., and also what will happen if the user jumps from one statechart-box into another and then back again, e.g. in the process of making a reservation asks for information and then, for instance based on the received information, returns to complete the reservation task. Furthermore, are these requests represented in the same data structure or do you have different structures for the reservation task and the information task? BTW how is the dialogue history represented in your dialogue model?

In subsection 4.1. you distinguish between the information shown to the application developer and the expert. What information is hidden and how do you know what information to hide?

Another comment relating to your statement in the beginning of section 2. I am not sure that DaimlerChrysler's is the only system able to handle spontaneous speech. I am aware of other systems with the ability to handle more than single commands, at least research systems, for instance VerbMobil - or have I missed something here?

A1. Anke Kölzer (24.2.00):

Arne Jönsson:

The paper presents an interesting attempt to develop tools for the development of dialogue systems. From the paper it seems that your current application is in focus rather than general theories of dialogue, i.e. it is more about how to graphically describe a domain in terms of what information is needed to access the background system than about how utterances are incorporated in the dialogue and what a user intends to convey in an utterance. How general do you think your model is: can it be useful for other applications as well, and can the same set of tools be used for other theories of dialogue?

The authors reply:

Currently my focus is clearly on information-giving dialogue systems. At the beginning of my work the task was to find a way to model dialogues in a rather easy way. The method should be capable of representing the structure of dialogue in terms of sub-dialogues, dialogue-steps, dialogue parameters, and so on. The approach is very useful for "question-answer" dialogue systems, where finding out the intentions of a speaker is (usually) not so difficult.
Adapting the method for approaches like, e.g., machine translation or chatting systems has not been investigated so far but would be an interesting topic. There it would be necessary to find out whether one can define patterns for how dialogue works in these systems (compare dialogue patterns below) and to represent them as statecharts.

Arne Jönsson:

The dialogue description language mentioned in section 5 is very interesting; what limitations does it have? We are currently working on such a language and have encountered a number of interesting problems related to, for instance, a declarative description of focus management and concurrent processes such as accessing the background system. How do you handle that in the statecharts, or do you consider it beyond the coverage of the statecharts?

The authors reply:

Statecharts in their full complexity (Harel 1987) are a kind of graphical programming language with means to express concurrency, state history, conditions for state-transitions, system-events (which in dialogue systems could be, e.g., barge-in or a time-out in the speech recognizer), etc. Thus you could say that nearly everything that can be programmed in a common programming language can be modeled with statecharts.
But in my approach I tried to find a sub-language of statecharts to keep it easy to use, even for non-dialogue-experts. The statecharts are mainly used for application modeling (see 4.1 - application mode). The expert mode is currently only used for describing generic features of the dialogue system so that the application developer can work with them. E.g. in our dialogue system there is only a fixed set of dialogue acts. Thus the application developer has to fill in system prompts for those, but may not add new dialogue acts (only the expert - the one who implements the generic system - may add something here).
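The core ingredients named above - states, events, and guarded transitions - can be sketched as a tiny state machine. This is a minimal illustration only; the state and event names are invented here and are not taken from the paper's statechart dialect:

```python
# Minimal sketch of a statechart-style dialogue core: states, events,
# and guarded transitions. All names are illustrative assumptions.

class Statechart:
    """Flat statechart: a transition table keyed by (state, event)."""

    def __init__(self, initial):
        self.state = initial
        self.transitions = {}  # (state, event) -> (target, guard)

    def add(self, state, event, target, guard=None):
        self.transitions[(state, event)] = (target, guard)

    def fire(self, event, **context):
        """Take the transition for `event` if its guard holds."""
        entry = self.transitions.get((self.state, event))
        if entry is not None:
            target, guard = entry
            if guard is None or guard(context):
                self.state = target
        return self.state

# A tiny request/confirm fragment with one guarded transition.
sc = Statechart("ask_time")
sc.add("ask_time", "time_heard", "confirm_time",
       guard=lambda ctx: ctx.get("valid", False))
sc.add("confirm_time", "yes", "done")
sc.add("confirm_time", "no", "ask_time")

sc.fire("time_heard", valid=True)   # -> "confirm_time"
sc.fire("no")                       # -> back to "ask_time"
```

A full statechart dialect would add hierarchy, concurrency, and history on top of this core; the sketch only shows the transition-table idea.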

The more-or-less unchanging generic dialogue systems the tool-system is intended for must have means to deal with linguistic phenomena (ellipsis, anaphora, etc.), dialogue history, focus, and dialogue strategy. Thus they have their own "intelligence" for handling this, and the application developer does not have to (and even may not) specify anything concerning this with the tool-system. Currently, therefore, the management of dialogue history and focus is not represented in the dialogue-statecharts.

What I like about the statechart approach is that it can easily be extended if someone wants to describe more dialogue concepts in the application models. If you want to allow the user to specify something about focus and so on, you would add this as a concept to the statechart-dialect.

Arne Jönsson:

It is also unclear how you represent recursive clarifications, i.e. clarifications of clarifications etc., and also what will happen if the user jumps from one statechart-box into another and then back again, e.g. in the process of making a reservation asks for information and then, for instance based on the received information, returns to complete the reservation task.

The authors reply:

Again the question here is how the generic dialogue system handles such jumps. In our system "wild" jumps are not allowed: the statechart dialect we use only allows state-jumps between states in the same context (Ehrlich 1999). Thus, when modeling system prompts, the application developer can only use dialogue-parameters which were defined in the current or a super-ordinated state. We did it like that because we found that dialogues become more structured and consistent this way, and are easier to maintain. Of course this is not as flexible as human-human dialogues, but it is a useful means.
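The "no wild jumps" restriction can be sketched as a check over a state hierarchy: a jump is allowed only between states that share the same top-level context. The hierarchy below is an invented example, not the paper's actual state model:

```python
# Sketch of the same-context restriction on state jumps.
# The state hierarchy is an illustrative assumption.

PARENT = {
    "reservation": None,              # top-level context
    "ask_date": "reservation",
    "ask_time": "reservation",
    "information": None,              # another top-level context
    "ask_topic": "information",
}

def top_context(state):
    """Walk up to the top-level (super-ordinated) state."""
    while PARENT[state] is not None:
        state = PARENT[state]
    return state

def jump_allowed(src, dst):
    """Permit jumps only within one context, forbidding 'wild' jumps."""
    return top_context(src) == top_context(dst)

jump_allowed("ask_date", "ask_time")   # True: both under reservation
jump_allowed("ask_date", "ask_topic")  # False: crosses contexts
```

The same lookup could enforce the prompt rule mentioned above: a prompt may only reference parameters defined in the current state or one of its ancestors.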

Concerning clarifications of clarifications, there is no special way of representing this with the statecharts. What we did here is a rather simple and pragmatic approach (with not much flexibility, I admit). In every system state the application developer can enter a list of system prompts. If the system comes to the same state several times, e.g. because of misunderstandings, it takes the next prompt in the list. E.g. the dialogue system will ask for a departure time with "When do you want to leave?" when it is in the request state for the first time. If it comes there again after problems, it might say "I did not understand. Please tell me again when you want to leave." We get along quite well with this approach.

Arne Jönsson:

Furthermore, are these requests represented in the same data structure or do you have different structures for the reservation task and the information task?

The authors reply:

No - usually they are not the same. One nice aspect of using statecharts is that sub-dialogues can be modeled differently as long as the generic dialogue system can handle it. E.g. the dialogue expert can define generic dialogue-modules, something like dialogue-patterns, which the application developer can work with. Such a pattern could be something like:
- first request a parameter
- if there is an utterance from the speaker and it contains a valid semantic concept (like a departure time), confirm the value
- if it is not confirmed, go to a problem handler
- etc.
If the expert prepares such modules (which would be modeled as generic statecharts) for the application developer in the expert phase, the only thing the application developer has to do is combine the module with parameters and fill in some system prompts. This is the basis for rather fast application development. You can provide a library of such patterns and reuse old dialogue modules with it.
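The request-confirm pattern described above can be sketched as a generic function that the developer parameterizes only with prompts and a validator. The function and field names are hypothetical, not the tool-system's actual API:

```python
# Sketch of a reusable request-confirm dialogue pattern. All names
# (prompt keys, validator) are illustrative assumptions.

def request_confirm(prompts, user_answer, is_valid):
    """Generic pattern: request a parameter, validate the answer,
    then confirm it or hand over to a problem handler.
    Returns (value_or_None, list_of_system_utterances)."""
    said = [prompts["request"]]
    if is_valid(user_answer):
        said.append(prompts["confirm"].format(value=user_answer))
        return user_answer, said
    said.append(prompts["problem"])  # problem handler takes over
    return None, said

# The application developer only supplies prompts and a validator;
# the user answer is simulated here.
value, turns = request_confirm(
    {"request": "When do you want to leave?",
     "confirm": "So you want to leave at {value}?",
     "problem": "Sorry, I did not get a valid departure time."},
    "8 am",
    lambda v: v.endswith("am") or v.endswith("pm"),
)
# value == "8 am"
```

A library of such pattern functions (or, in the paper's setting, generic statecharts) is what makes the fast, fill-in-the-prompts style of application development possible.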

Arne Jönsson:

BTW how is the dialogue history represented in your dialogue model?

The authors reply:

Currently it is not represented (see above).

Arne Jönsson:

In subsection 4.1. you distinguish between the information shown to the application developer and the expert. What information is hidden and how do you know what information to hide?

The authors reply:

One idea with this was what I described above - adaptable dialogue patterns. If the application developer uses such patterns, he just has to fill in parameters and prompts. He does not have to know how the states follow each other, under which conditions, or how internal variables of the dialogue system change their values when states change. You could just offer him a table like that shown in figure 4 (which is a little simplified, because it is also possible to combine several parameters in a system prompt). You could hide the complete generic statechart, with its conditions and transitions describing the dialogue-pattern, from him. That is the way we do it using the task hierarchies. How much we hide in the statecharts depends on the actual user of the tool-system. We still have to gain some experience with that.

Arne Jönsson:

Another comment relating to your statement in the beginning of section 2. I am not sure that DaimlerChrysler's is the only system able to handle spontaneous speech. I am aware of other systems with the ability to handle more than single commands, at least research systems, for instance VerbMobil - or have I missed something here?

The authors reply:

Sorry about this misunderstanding. That was an improper translation from German and should say "... whereas many other dialogue systems are only capable of processing single commands". As far as I can see, it is still not state-of-the-art to use spontaneous speech in commercial systems, though this is changing quite dynamically, as you can see with Philips-SpeechMania, Vocalis-SpeecHTML, VoiceML and others. These approaches usually allow for simple context-free grammars without dealing with linguistic phenomena and use word-spotting, whereas our system works with a linguistic module and word-hypothesis-nets containing phrases. But as we found out, implementing and maintaining such linguistic grammars is very costly and slow. Currently we are trying to extend the module in a way that lets you switch between more "intelligent" linguistic analysis and simple word-spotting.
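For contrast, the word-spotting baseline mentioned above can be sketched in a few lines: scan the utterance for known keywords and ignore all other structure. The keyword table is an invented example; a phrase-level linguistic module would instead parse whole word-hypothesis-nets:

```python
# Sketch of simple word-spotting: fill slots from isolated keywords,
# ignoring syntax entirely. The keyword table is an assumption.

KEYWORDS = {
    "hamburg": ("destination", "Hamburg"),
    "monday": ("day", "Monday"),
}

def word_spot(utterance):
    """Return the slots filled by any spotted keywords."""
    slots = {}
    for word in utterance.lower().split():
        if word in KEYWORDS:
            slot, value = KEYWORDS[word]
            slots[slot] = value
    return slots

word_spot("uh I want to go to Hamburg on Monday please")
# -> {'destination': 'Hamburg', 'day': 'Monday'}
```

The appeal is robustness and low grammar-maintenance cost; the price is that negations, corrections, and phrase structure ("not Monday, Tuesday") are invisible to the spotter, which is exactly what the linguistic module is there for.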


Additional questions and answers will be added here.
To contribute, please click [send contribution] above and send your question or comment as an E-mail message.
For additional details, please click [debate procedure] above.
This debate is moderated by the guest editors.