
Publication

Automatic Speaker Age and Gender Recognition in the Car for Tailoring Dialog and Mobile Services

Michael Feld; Felix Burkhardt; Christian Müller
In: Proceedings of Interspeech (2010). Conference in the Annual Series of Interspeech Events (INTERSPEECH-2010), September 26-30, Makuhari, Japan, ISCA, 9/2010.

Abstract

Car manufacturers face a new challenge. While a new generation of "digital natives" with a natural affinity for computers and Internet services has grown up and become a new customer group, the challenges of an aging society continue to grow. This emphasizes the need to provide flexible in-car dialog and access to mobile services that take into account the specific needs and preferences of the respective user (or user group). To tackle this issue, three top-level problems have to be solved: (1) How do we find out which group the current user belongs to? (2) How should this knowledge be represented and linked to knowledge about the system or service in order to support adaptation? (3) What would be the best adaptation strategy? In this paper, we address the first question. We consider speech as one of the possible sources of information that allow non-intrusive user model acquisition. We present a GMM/SVM-supervector system (Gaussian Mixture Model combined with Support Vector Machine) for speaker age and gender recognition, a technique adopted from state-of-the-art speaker recognition research. We furthermore describe an experimental study that evaluates the performance of the system and explores the selection of parameters. Compared with previous work on age and gender recognition (by the authors and others), the underlying task is considered more difficult because we modified the definition of the age classes and used particularly short utterances for testing. Nevertheless, comparable accuracy was obtained. Additional contributions of this paper are a structured itemization of experimental results, which sheds light on the effect of various design decisions, and a concrete conceptual outline with respect to problems two (knowledge representation) and three (adaptation strategies).
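As a rough illustration of the GMM/SVM-supervector idea named in the abstract, the sketch below shows how one utterance might be turned into a fixed-length supervector: the means of a background GMM are MAP-adapted to the utterance's feature frames and then stacked into a single vector, which would serve as the input to an SVM classifier. The abstract does not give the system's configuration, so everything here is an assumption made for illustration: the function name `map_adapted_supervector`, the diagonal covariances, the relevance factor of 16, and the weight/variance scaling of the stacked means (a normalization commonly used with linear supervector kernels).

```python
import numpy as np


def map_adapted_supervector(frames, weights, means, variances, relevance=16.0):
    """Map one utterance to a GMM mean supervector (illustrative sketch).

    frames:    (T, D) acoustic feature frames of the utterance
    weights:   (K,)   mixture weights of the background GMM
    means:     (K, D) component means of the background GMM
    variances: (K, D) diagonal covariances of the background GMM
    relevance: MAP relevance factor (assumed value, not from the paper)
    """
    frames = np.asarray(frames, dtype=float)

    # Log-density of every frame under every diagonal Gaussian: shape (T, K).
    diff = frames[:, None, :] - means[None, :, :]
    log_prob = -0.5 * np.sum(
        diff ** 2 / variances + np.log(2.0 * np.pi * variances), axis=2
    )

    # Component posteriors per frame, normalized stably in the log domain.
    log_post = np.log(weights) + log_prob
    log_post -= log_post.max(axis=1, keepdims=True)
    post = np.exp(log_post)
    post /= post.sum(axis=1, keepdims=True)

    # Zeroth- and first-order statistics of the utterance.
    n_k = post.sum(axis=0)                                  # (K,)
    data_means = (post.T @ frames) / np.maximum(n_k, 1e-10)[:, None]

    # MAP adaptation of the means: components that saw more data move
    # further from the background mean towards the utterance mean.
    alpha = (n_k / (n_k + relevance))[:, None]
    adapted = alpha * data_means + (1.0 - alpha) * means

    # Scale each component by sqrt(weight) / std-dev, then stack into one vector.
    scaled = np.sqrt(weights)[:, None] * adapted / np.sqrt(variances)
    return scaled.ravel()


# Toy usage: a hypothetical 2-component background model over 3-dimensional features.
rng = np.random.default_rng(0)
ubm_weights = np.array([0.4, 0.6])
ubm_means = np.array([[0.0, 0.0, 0.0], [3.0, 3.0, 3.0]])
ubm_variances = np.ones((2, 3))
utterance = rng.normal(loc=1.0, scale=1.0, size=(50, 3))

supervector = map_adapted_supervector(utterance, ubm_weights, ubm_means, ubm_variances)
print(supervector.shape)  # (6,)
```

Because the supervector has the same length (K times D) for every utterance regardless of its duration, utterances of different lengths, including very short ones, can be compared by a standard SVM trained on age and gender class labels.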