The DAPHNE project aims to define and build an open and extensible system infrastructure for integrated data analysis pipelines, including data management and processing, high-performance computing (HPC), and machine learning (ML) training and scoring. The key observations are that (1) systems in these areas share many compilation and runtime techniques, (2) there is a trend towards complex data analysis pipelines that combine these systems, and (3) the increasingly heterogeneous hardware infrastructure they run on is converging as well. Yet programming paradigms, cluster resource management, and data formats and representations still differ substantially. Therefore, this project, with a joint consortium of experts from the data management, ML systems, and HPC communities, systematically investigates the system infrastructure, language abstractions, compilation and runtime techniques, and systems and tools needed to increase productivity when building such data analysis pipelines and to eliminate unnecessary performance bottlenecks.
Know-Center GmbH (coordinator), Austria
AVL List GmbH, Austria
Deutsches Zentrum fuer Luft- und Raumfahrt e.V., Germany
Eidgenoessische Technische Hochschule Zuerich, Switzerland
Hasso-Plattner-Institut für Digital Engineering gGmbH, Germany
Institute of Communication and Computer Systems, Greece
Infineon Technologies Austria AG, Austria
Intel Technology Poland sp. z o.o., Poland
IT-Universitetet i København, Denmark
Kompetenzzentrum Automobil- und Industrieelektronik GmbH, Austria
Technische Universität Dresden, Germany
Univerza v Mariboru, Slovenia
Universitaet Basel, Switzerland