A DevOps Manifesto for Speech Corpus Management

Ingmar Steiner

In: Jürgen Trouvain, Ingmar Steiner, Bernd Möbius (eds.). 28th Conference on Electronic Speech Signal Processing (ESSV). Elektronische Sprachsignalverarbeitung (ESSV), 28th, March 15-17, Saarbrücken, Germany. Pages 160-166. TUD Press, Dresden, 3/2017.


In this paper, we introduce selected concepts from the DevOps philosophy and, more generally, from the software development lifecycle. We argue that the separation between source code and the way it is built and released for distribution can be applied to speech corpora as well. We draw a distinction between the developers and maintainers of a speech corpus on the one hand, and the researchers who use it on the other. We propose conventions for efficiently managing corpus metadata in the same way as source code, and speech data in the same way as static assets that can be retrieved automatically. Finally, we describe several use cases that illustrate the merits of these conventions.

Further Links

Steiner_2.pdf (pdf, 233 KB)

German Research Center for Artificial Intelligence