Publication
Efficient Migration of Very Large Distributed State for Scalable Stream Processing
Bonaventura Del Monte
In: Proceedings of the VLDB 2017 PhD Workshop. International Conference on Very Large Data Bases (VLDB-2017), located at co-located with the 43rd International Conference on Very Large Databases (VLDB 2017), August 28, München, Germany, CEUR-WS.org, 2017.
Abstract
Any scalable stream data processing engine must handle the
dynamic nature of data streams and it must quickly react to
every fluctuation in the data rate. Many systems successfully
address data rate spikes through resource elasticity and dynamic
load balancing. The main challenge is the presence of stateful op-
erators because their internal, mutable state must be scaled out
while assuring fault-tolerance and continuous stream processing.
Both rescaling, load balancing, and recovering demand state
movement among work units. Therefore, how to guarantee those
features in the presence of large distributed state with minimal
impact on the performance is still an open issue. We propose an
incremental migration mechanism for fine-grained state shards
through periodic incremental checkpoints and replica groups.
This enables moving large state with minimal impact on stream
processing. Finally, we present a low-latency hand-over protocol
that smoothly migrates tuples processing among work units.