The VOICE Awards Corpus in Numbers
Number of dialogs and systems per year
Year | Dialogs | Systems |
---|---|---|
2005 | 213 | 34 |
2006 | 529 | 42 |
2007 | 452 | 26 |
2008 | 471 | 29 |
2009 | 305 | 18 |
Total | 1970 | 120 |
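As a quick consistency check, the per-year figures can be summed in a few lines of Python. This is only a sketch: the variable names are illustrative and not part of any corpus tooling. The dialog counts add up to the stated total of 1970, whereas the per-year system counts add up to 149 rather than 120. The Total row for systems therefore presumably counts distinct systems, with some systems appearing in more than one year; this is an assumption, as the table itself does not say.

```python
# Per-year counts copied from the table above.
# The dictionary layout and names are illustrative assumptions.
per_year = {
    2005: {"dialogs": 213, "systems": 34},
    2006: {"dialogs": 529, "systems": 42},
    2007: {"dialogs": 452, "systems": 26},
    2008: {"dialogs": 471, "systems": 29},
    2009: {"dialogs": 305, "systems": 18},
}

# Sum each column over all years.
total_dialogs = sum(row["dialogs"] for row in per_year.values())
systems_summed = sum(row["systems"] for row in per_year.values())

print(total_dialogs)   # 1970, matches the Total row
print(systems_summed)  # 149, larger than the 120 in the Total row
```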
Domains
All systems in the VOICE Awards Corpus were hand-classified
by domain along two axes: first, the kind of interaction
taking place between the user and the system ("goal
domain"), and second, the topic of the dialog system
("content domain").
Goal domain | Systems |
---|---|
Banking | 23 |
Connect | 4 |
Data entry | 18 |
Game | 8 |
Information | 36 |
Order | 18 |
Transit | 8 |
Other | 40 |
Content domain | Systems |
---|---|
Banking | 23 |
Cards | 5 |
Communication | 11 |
Flight | 3 |
Hotel | 3 |
Insurance | 2 |
Lotto | 2 |
Medical | 3 |
Meters | 3 |
Mobile phone | 13 |
Movies | 3 |
News | 4 |
Package tracking | 5 |
Prices | 2 |
Product ordering | 3 |
Public service | 4 |
Ringtones | 3 |
Sports | 8 |
Taxi | 4 |
Traffic | 2 |
Transit | 9 |
TV | 4 |
University | 1 |
Weather | 3 |
Other | 33 |
Comparing VOICE Awards with other corpora
With a total of 1970 dialogs, the VOICE Awards Corpus is
the largest existing German human-machine dialog corpus.
It combines the advantages of several other corpora:
Verbmobil consists of dialogs but is a human-human corpus,
while Smartweb is also a large human-machine corpus but
contains queries rather than dialogs. In addition, the
VOICE Awards Corpus is the only human-machine dialog corpus
that is annotated with dialog acts and much other
information, such as miscommunication and success measures.