Project | E2Data

Duration: 01/01/2018 - 12/31/2020

European Extreme Performing Big Data Stacks

Research Topics

Data Management & Analysis
Other

Application fields

Knowledge & Business Intelligence
Other

In today's world, data is streamed from the local network or edge devices to a cloud provider, which a customer rents to perform the data execution. The Big Data software stack, in an application- and hardware-agnostic manner, splits the execution stream into multiple tasks and sends them for processing on the nodes the customer has paid for. If the outcome does not meet a strict three-second business requirement, the customer has three options: 1) scale up (by upgrading processors at node level), 2) scale out (by adding nodes to their clusters), or 3) manually implement code optimizations specific to the underlying hardware. However, the customer does not have the financial capability to do so. Ideally, they would like to meet their business requirements without stretching their hardware budget. To address these alarming scalability concerns, both end users and cloud infrastructure vendors (such as Google, Microsoft, Amazon, and Alibaba) are investing in heterogeneous hardware resources that combine a diverse selection of architectures such as CPUs, GPUs, FPGAs, and MICs, aiming to further increase performance while minimizing climbing operational costs. Beyond these investments in heterogeneous resources, large companies such as Google also develop in-house ASICs, with the Tensor Processing Unit (TPU) built for TensorFlow being the prime example.

E2Data proposes an end-to-end solution for Big Data deployments that will fully exploit and advance the state of the art in infrastructure services by delivering a performance increase of up to 10x while utilizing up to 50% fewer cloud resources. E2Data will provide a new Big Data software paradigm for achieving maximum resource utilization in heterogeneous cloud deployments without affecting current Big Data programming norms (i.e., no code changes in the original source). The proposed solution takes a cross-layer approach by allowing vertical communication between the four key layers of Big Data deployments (application, Big Data software, scheduler/cloud provider, and execution runtime).

Partners

The University of Manchester, Institute of Communications and Computer Systems, Neurocom Luxembourg, KALEAO Limited, Computer Technology Institute and Press "Diophantus" (CTI), Spark Works Limited, iProov Limited

Keyfacts

Involved research areas

Website

https://e2data.eu/

Publications

Efficient Compilation and Execution of JVM-Based Data Processing Frameworks on Heterogeneous Co-Processors
Christos Kotselidis; Sotiris Diamantopoulos; Orestis Akrivopoulos; Viktor Rosenfeld; Katerina Doka; Hazeef Mohammed; Georgios Mylonas; Vassilis Spitadakis; Will Morgan; Juan Fumero; Foivos S. Zakkak; Michail Papadimitriou; Maria Xekalaki; Nikos Foutris; Athanasios Stratikopoulos; Nectarios Koziris; Ioannis Konstantinou; Ioannis Mytilinis; Constantinos Bitsakos; Christos Tsalidis; Christos Tselios; Nikolaos Kanakis; Clemens Lutz; Sebastian Breß; Volker Markl
In: Design, Automation & Test in Europe. Design, Automation & Test in Europe (DATE-2020), March 9-13, Grenoble, France, IEEE, 2020.
Pump Up the Volume: Processing Large Data on GPUs with Fast Interconnects
Clemens Lutz; Sebastian Breß; Steffen Zeuch; Tilmann Rabl; Volker Markl
In: David Maier; Rachel Pottinger (Eds.). Proceedings of the 2020 ACM SIGMOD International Conference on Management of Data. ACM SIGMOD International Conference on Management of Data (SIGMOD-2020), June 14-19, Portland, OR, USA, Pages 1633-1649, ISBN 978-1-4503-6735-6, The Association for Computing Machinery, 2020.
Analyzing Efficient Stream Processing on Modern Hardware
Steffen Zeuch; Bonaventura Del Monte; Jeyhun Karimov; Clemens Lutz; Manuel Renz; Jonas Traub; Sebastian Breß; Tilmann Rabl; Volker Markl
In: Proceedings of the VLDB Endowment (PVLDB), Vol. 12, No. 5, Pages 516-530, VLDB Endowment, 2019.

Funding Authorities

EU - European Union

780245
