Publication
Eliciting Multimodal Approaches for Machine Learning--Assisted Photobook Creation
Sara-Jane Bittner; Michael Barz; Daniel Sonntag
In: Carmelo Ardito; Simone Diniz Junqueira Barbosa; Tayana Conte; André Freire; Isabela Gasparini; Philippe Palanque; Raquel Prates (Eds.). Human-Computer Interaction -- INTERACT 2025. IFIP Conference on Human-Computer Interaction (INTERACT-2025), 20th IFIP TC 13 International Conference, September 8-12, 2025, Belo Horizonte, Brazil, Proceedings, Part II, Pages 576-601, ISBN 978-3-032-05002-1, Springer Nature Switzerland, 9/2025.
Abstract
Machine learning (ML) is increasingly applied in various end-user applications. To enable successful human-AI collaboration, co-creation for Interactive Machine Learning (IML) has become a growing research topic, iteratively fusing the human creative perspective with algorithmic strength to generate divergent ideas. Interactive photobook creation represents an ideal use case for investigating ML co-creation, as it covers a range of typical ML tasks such as image retrieval, caption generation, and layout generation. However, existing solutions do not exploit the benefits of introducing multimodal interaction to co-creation. We propose common operations for IML tasks related to interactive photobook creation and conduct an elicitation study (N = 14) investigating which modalities, and which combinations of modalities, could best support these tasks. An open-ended questionnaire revealed how users imagine an ideal IML environment, focusing on device setup, key factors, and the utility of specific features. Our findings show that 1) enabling a wide variety of modalities allows for the most intuitive interactions, 2) informing users about uncommon modalities opens up suitable modality choices that are otherwise missed, and 3) multimodal interactions achieve high consensus when chosen by users.