Publication
Towards Multimodal Stream Processing Systems
Uélison Santos; Alessandro Ferri; Szilard Nistor; Riccardo Tommasini; Carsten Binnig; Manisha Luthra
In: Wolfgang Lehner; Vanessa Braganholo; Kostas Stefanidis; Zheying Zhang; Alexander Krause; João Felipe Nicolaci Pimentel (Eds.). Proceedings 29th International Conference on Extending Database Technology, EDBT 2026, Tampere, Finland, March 24-27, 2026. International Conference on Extending Database Technology (EDBT), Pages 627-633, OpenProceedings.org, 2026.
Abstract
In this paper, we present a vision for a new generation of multimodal streaming systems that embed MLLMs as first-class operators, enabling real-time query processing across multiple modalities. Achieving this is non-trivial: while recent work has integrated MLLMs into databases for multimodal queries, streaming systems require fundamentally different approaches due to their strict latency and throughput requirements. Our approach proposes novel optimizations at all levels, including logical, physical, and semantic query transformations that reduce model load to improve throughput while preserving accuracy. We demonstrate this with Saṁsāra, a prototype leveraging such optimizations to improve performance by more than an order of magnitude. Moreover, we discuss a research roadmap that outlines open research challenges for building scalable and efficient multimodal stream processing systems.
