While we are already used to seeing more than 1,000 cores within a single machine, the next processing platforms for database engines will be widely heterogeneous, with built-in GPU-style processors as well as specialized FPGAs and chips whose domain-specific instruction sets take advantage of the “Dark Silicon” effect. Moreover, traditional volatile as well as upcoming non-volatile RAM with capacities in the hundreds of terabytes per machine will provide great opportunities for storage engines but also call for radical changes to the architecture of such systems. Finally, the emergence of economically affordable, high-speed/low-latency interconnects as a basis for rack-scale computing is calling long-standing algorithmic folklore into question and will certainly play an important role in the big picture of building modern data management platforms. While database research on modern hardware has already produced a rich bouquet of promising results targeting a wide variety of hardware directions, this talk will try to classify and review existing approaches from a performance, robustness, and energy-efficiency perspective. Moreover, the talk will discuss the overall question of how these results can be incorporated into the design and implementation of modern DB systems. The goal is therefore to outline current trends and research activities as well as to point out interesting starting points for further research.
Wolfgang Lehner is a full professor and head of the Database Technology Group as well as director of the Institute for System Architecture at TU Dresden, Germany. His research focuses on database system architecture, specifically looking at crosscutting aspects from algorithms down to hardware in main-memory-centric settings. He is part of TU Dresden’s research cluster of excellence, with topics in energy-aware computing, resilient data structures on unreliable hardware, and orchestration of widely heterogeneous systems. He heads a Research Training Group on large-scale adaptive system software design and acts as a principal investigator in Germany’s national “Competence Center for Scalable Data Services and Solutions” (ScaDS). Wolfgang also maintains a close research relationship with the SAP HANA development team. He serves the community on many program committees, is an elected member of the VLDB Endowment, chairs the review board for Computer Science within the German Research Foundation (DFG), and is an appointed member of the Academy of Europe.
The Big Data revolution has been enabled in part by a wealth of innovation in software platforms for data storage, analytics, and machine learning. The design of Big Data platforms such as Hadoop and Spark focused on scalability, fault tolerance, and performance. As these and other systems increasingly become part of the mainstream, the next set of challenges is becoming clearer. Requirements for performance are changing as workloads evolve to include techniques such as hardware-accelerated deep learning. But more fundamentally, other issues are moving to the forefront. These include ease of use for a wide range of users, security, concerns about privacy and potential bias in results, and the perennial problems of data quality and integration from heterogeneous sources. Fortunately, the database community has much to say about all of these topics, and it can and should take a leading role in addressing them. In this talk, I will give an overview of how we got here, with an emphasis on the development of the Apache Spark system. I will then focus on these emerging issues with an eye towards where the database community can most effectively engage.
Michael J. Franklin is the Liew Family Chair of Computer Science and Sr. Advisor to the Provost for Computation and Data at the University of Chicago, where his research focuses on database systems, data analytics, human-in-the-loop computing, and distributed computing systems. Previously he was the Thomas M. Siebel Professor and Chair of Computer Science at UC Berkeley. He co-founded and directed the Algorithms, Machines and People Laboratory (AMPLab), which created industry-changing open source Big Data software such as Apache Spark and BDAS, the Berkeley Data Analytics Stack. Franklin has nearly three decades(!) of experience with database systems projects, including the Bubba massively parallel DBMS, the SHORE object-oriented DBMS, the TelegraphCQ and Truviso stream processing systems, the TinyDB and HiFi sensor query processing systems, and Spark/BDAS. He currently serves as a board member of the Computing Research Association and on the NSF CISE Advisory Committee. He is an ACM Fellow, a two-time recipient of the ACM SIGMOD “Test of Time” award, and a recipient of the Outstanding Advisor award from Berkeley’s Computer Science Graduate Student Association.
Imagine a machine that is able to compose music and write poems; paint realistic artificial images and dream up videos from textual descriptions; render pictures or entire videos in the style of any artist; translate between any pair of natural languages. A machine that can recognize any content in images and videos; diagnose diseases; imitate spoken language in any voice. A machine that wins games thought to be exclusive to human intelligence. All of that with superhuman performance, of course. Sounds like science fiction? Well, then welcome to the year 2017!
Currently we are witnessing the biggest revolution in computer science since the invention of the Internet. Deep Learning is shaking the world of computer science and overrunning entire (sub-)disciplines.
In this talk I will briefly sketch some of the recent advances in deep learning and what they have to do with databases. Where are the synergies? Where should we be looking? The talk will have a particular focus on recent technical developments at the intersection of databases and deep learning in Europe.
Jens Dittrich is a Full Professor of Computer Science in the area of Databases, Data Management, and Big Data at Saarland University, Germany. Previous affiliations include U Marburg, SAP AG, and ETH Zurich. He received an Outrageous Ideas and Vision Paper Award at CIDR 2011, a BMBF VIP Grant in 2011, a best paper award at VLDB 2014 (the second ever given to an E&A paper), two CS teaching awards in 2011 and 2013, as well as several presentation awards, including a qualification for the interdisciplinary German science slam finals in 2012 and three presentation awards at CIDR (2011, 2013, and 2015). He has been a PC member and area chair/group leader of prestigious international database conferences and journals such as PVLDB/VLDB, SIGMOD, ICDE, and VLDB Journal. At Saarland University he co-organizes the Data Science Summer School (http://datasciencemaster.de).
Since 2013 he has been teaching some of his classes on data management as flipped classrooms. See http://datenbankenlernen.de or http://youtube.com/jensdit for a list of freely available videos on database technology in German (introduction to databases) and English (database architectures and implementation techniques). He is also the author of a "flipped textbook" on databases. Since 2016 he has been working on a start-up at the intersection of deep learning and databases (http://daimond.ai).
His research focuses on fast access to big data, including in particular data analytics on large datasets, main-memory databases, database indexing, reproducibility, and deep learning.