Publication
Time Series Generation for Augmenting Multi-Channel Automotive Audio Data
Philipp Peter Engler; Ludger van Elst; Andreas Dengel; Sheraz Ahmed
In: Lecture Notes in Computer Science. 34th International Conference on Artificial Neural Networks (ICANN-2025), September 9-12, Kaunas, Lithuania, Springer, 2025.
Abstract
One of the main limiting factors for machine learning applications in industry is the lack of data, as training powerful models often requires vast datasets. Acquiring data at scale is generally expensive and sometimes not even possible, as it may depend on costly measurements with specialized equipment, trained personnel, and possibly manual annotation by experts. Thus, augmenting or synthesizing data plays a key role in making machine learning solutions feasible and cost-effective for industrial applications. In this paper, we propose a diffusion model for synthesizing audio data in a real-world multi-label audio classification task from the automotive industry. Augmenting the training dataset with synthetic data, we obtain an improvement of 1.8 percentage points (p.p.) in mean average precision (mAP) over training with real data alone. We discuss the difficulties in generating such domain-specific data and examine these issues further by comparing our method to an alternative generative approach for environmental sound augmentation.
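
To make the reported metric concrete, the following is a minimal sketch of how mean average precision (mAP) is commonly computed for a multi-label task, and how a gain between two models is expressed in percentage points. It is not the authors' evaluation code: the function names, the toy labels, and the scores are all hypothetical, and the paper may use a different AP variant (for example an interpolated one).

```python
from typing import List, Sequence


def average_precision(y_true: Sequence[int], scores: Sequence[float]) -> float:
    """Non-interpolated average precision for a single label.

    Clips are ranked by descending score; precision is averaged over the
    ranks at which a positive clip appears. Ties keep their input order.
    """
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    hits = 0
    precision_sum = 0.0
    for rank, idx in enumerate(order, start=1):
        if y_true[idx] == 1:
            hits += 1
            precision_sum += hits / rank
    # Assumption: a label with no positive clips contributes an AP of 0.
    return precision_sum / hits if hits else 0.0


def mean_average_precision(
    y_true: List[Sequence[int]], scores: List[Sequence[float]]
) -> float:
    """Mean of the per-label APs (macro average over labels)."""
    n_labels = len(y_true[0])
    aps = [
        average_precision([row[j] for row in y_true], [row[j] for row in scores])
        for j in range(n_labels)
    ]
    return sum(aps) / n_labels


# Toy data (hypothetical): 4 clips, 2 labels; rows are clips, columns are labels.
y_true = [[1, 0], [0, 1], [1, 1], [0, 0]]
scores_real_only = [[0.9, 0.2], [0.8, 0.7], [0.6, 0.4], [0.1, 0.6]]
scores_augmented = [[0.9, 0.2], [0.1, 0.7], [0.8, 0.9], [0.2, 0.1]]

map_real_only = mean_average_precision(y_true, scores_real_only)
map_augmented = mean_average_precision(y_true, scores_augmented)

# A difference in percentage points is the absolute difference of the two
# mAP values on a 0-100 scale, not a relative change.
gain_pp = (map_augmented - map_real_only) * 100
print(f"mAP real only: {map_real_only:.4f}")
print(f"mAP augmented: {map_augmented:.4f}")
print(f"gain: {gain_pp:.1f} p.p.")
```

Read this way, an improvement of 1.8 p.p. means the mAP on a 0-100 scale rose by 1.8 in absolute terms, whatever the baseline value was.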