Publication
Comparison of Four Approaches to Age and Gender Recognition for Telephone Applications
Florian Metze; Jitendra Ajmera; Roman Englert; Udo Bub; Felix Burkhardt; Joachim Stegmann; Christian Müller; Richard Huber; Bernt Andrassy; Josef G. Bauer; Bernhard Littel
In: Proceedings of the 32nd International Conference on Acoustics, Speech, and Signal Processing. International Conference on Acoustics, Speech and Signal Processing (ICASSP-2007), April 15-20, Honolulu, Hawaii, USA, 2007.
Abstract
This paper presents a comparative study of four different approaches
to automatic age and gender classification using seven
classes on a telephony speech task and also compares the results with
human performance on the same data. The automatic approaches
compared are based on (1) a parallel phone recognizer, derived from
an automatic language identification system; (2) a system using dynamic
Bayesian networks to combine several prosodic features; (3)
a system based solely on linear prediction analysis; and (4) Gaussian
mixture models based on MFCCs for separate recognition of
age and gender. On average, the parallel phone recognizer performs
as well as human listeners do, while losing performance on short
utterances. The system based on prosodic features, however, shows
very little dependence on the length of the utterance.