
Publication

Towards Multimodal Stream Processing Systems

Uélison Santos; Alessandro Ferri; Szilard Nistor; Riccardo Tommasini; Carsten Binnig; Manisha Luthra
In: Wolfgang Lehner; Vanessa Braganholo; Kostas Stefanidis; Zheying Zhang; Alexander Krause; João Felipe Nicolaci Pimentel (Eds.). Proceedings 29th International Conference on Extending Database Technology, EDBT 2026, Tampere, Finland, March 24-27, 2026. International Conference on Extending Database Technology (EDBT), Pages 627-633, OpenProceedings.org, 2026.

Abstract

In this paper, we present a vision for a new generation of multimodal streaming systems that embed MLLMs as first-class operators, enabling real-time query processing across multiple modalities. Achieving this is non-trivial: while recent work has integrated MLLMs into databases for multimodal queries, streaming systems require fundamentally different approaches due to their strict latency and throughput requirements. Our approach proposes novel optimizations at all levels, including logical, physical, and semantic query transformations that reduce model load to improve throughput while preserving accuracy. We demonstrate this with Saṁsāra, a prototype leveraging such optimizations to improve performance by more than an order of magnitude. Moreover, we discuss a research roadmap that outlines open research challenges for building scalable and efficient multimodal stream processing systems.
