



# How Disruptive is Modern Hardware?

Wolfgang Lehner

## Interest in Modern HW?



...IN RESEARCH





Thirteenth International Workshop on Data Management on New Hardware (DaMoN 2017)

#### HardBD 2017

International Workshop on Big Data Management on Emerging Hardware, Sponsored and Held in conjunction with ICDE 2017 April 22, 2017, San Diego, USA



First International Workshop on Data Management on Virtualized Active Systems.









## Interest in Modern HW?



## ...in commercial DB settings

- some developments (acceleration models, ...)
- last disruptive development: >10 years back!



Why was In-Memory Computing development disruptive?



## Disruptiveness



### disruptive 4



a. relating to or noting a new product, service, or idea that radically changes an industry or business strategy, especially by creating a new market and disrupting an existing one:

### Disruptions in Data Management always require two ingredients





...novel technology and hardware





...novel types of DB(!!!) applications



## Disruptiveness (2)



### Why was In-Memory Computing development disruptive?

## A Common Database Approach for OLTP and OLAP Using an In-Memory Column Database

Hasso Plattner
Hasso Plattner Institute for IT Systems Engineering
University of Potsdam
Prof.-Dr.-Helmert-Str. 2-3
14482 Potsdam, Germany
hasso.plattner@hpi.uni-potsdam.de

#### Categories and Subject Descriptors

H.2.0 [Information Systems]: DATABASE MANAGEMENT - General

#### General Terms

Design, Performance

After the conversion of attributes into integers, processing becomes faster. More recently, the use of column store databases for analytics has become quite popular. Dictionary compression on the database level and reading only those columns necessary to process a query speed up query processing significantly in the column store case.

I always believed the introduction of so-called data warehouses was a compromise. The flexibility and speed we

→ enabler for HTAP = OLTP & OLAP
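
The dictionary compression mentioned in the excerpt above can be illustrated with a minimal sketch. It assumes a single string column; the `DictColumn` type and its members are hypothetical names chosen for illustration and are not taken from the paper or from any product. Each distinct value is stored once, rows hold integer codes, and an equality predicate is evaluated on integers only.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical dictionary-encoded column (illustration only).
struct DictColumn {
    std::vector<std::string> dictionary;              // code -> value
    std::unordered_map<std::string, uint32_t> index;  // value -> code
    std::vector<uint32_t> codes;                      // one code per row

    // Appends a row, assigning a new code on first sight of a value.
    void append(const std::string& value) {
        auto it = index.find(value);
        if (it == index.end()) {
            const uint32_t code = static_cast<uint32_t>(dictionary.size());
            it = index.emplace(value, code).first;
            dictionary.push_back(value);
        }
        codes.push_back(it->second);
    }

    // Counts rows equal to `value` by comparing integer codes, not strings.
    std::size_t count_equal(const std::string& value) const {
        auto it = index.find(value);
        if (it == index.end()) return 0;  // value never occurs in the column
        std::size_t n = 0;
        for (uint32_t c : codes) n += (c == it->second) ? 1 : 0;
        return n;
    }
};
```

A scan touches only the `codes` vector of the column it needs, which is the "reading only those columns necessary" part of the argument.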



# UseCase - Hybrid Transactional Processing





Accounting, Reconciliation, and Reporting application

**OLTP** 

**OLAP** 

#### **STATISTICS**

- 203m active accounts (1st Quarter 2017)
- Online payments in 200+ countries (1st Quarter 2017)
- **6.1 billion** payment transactions in 2016

→ All on a Single HANA box (48 TB)



#### TRANSACTIONAL DATA VOLUME

- 500 million FTs
- 500 million business partners
- 100 million transactions per day
- ~321 million sub-ledger documents per day
- 6 million PDAs per day
- 6 million VDAs per day
- 150 million VDAs in incoming layer
- 150,000 cash entries per day
- ...







## The DB Sandwich





## Outline



### Characteristics / Opportunities / Challenges



What is the design space? / What might be a hypothetical HW blueprint?



## Outline









# Compute Unit Diversity









## Compute Unit Diversity





#### **FOCUS ON**

Heterogeneous Computing by offloading "simple tasks"





# PERFORMANCE, PERFORMANCE, PERFORMANCE!!!

- specialized data structures and algorithms
- parallel programming models and compiler support
- data and operator placement strategies



# Compute Unit Diversity









https://www.nvidia.com/en-us/data-center/tesla-v100/

https://www.top500.org/featured/systems/asci-white-lawrence-livermore-national-laboratory/

|                                          | Nvidia V100 (2017) | IBM ASCI White (2000)                    |
|------------------------------------------|--------------------|------------------------------------------|
| Number of Processor Cores                | 3584               | 8192 (512 nodes x 16 IBM Power3)         |
| Double-Precision Performance             | 7.5 TeraFLOPS      | 7.2 TeraFLOPS                            |
| NVIDIA NVLink™ v2 Interconnect Bandwidth | 2x150 GB/s         | N/A                                      |
| PCIe x16 Interconnect Bandwidth          | 2x16 GB/s          | N/A                                      |
| Memory Capacity                          | 16 GB              | 6 TB DRAM<br>(Power 3 w/ 16 MB L2 cache) |
| Max. overall data transfer speed         | 900 GB/s           | ?                                        |
| Weight                                   | 450 g              | 106 tons                                 |
| Energy consumption                       | 300W               | 3 MW                                     |
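
As a rough worked comparison, using only the double-precision and energy figures from the table (peak values, so this is an upper-bound sketch rather than a measurement):

$$
\frac{7.5 \times 10^{12}\ \mathrm{FLOPS}}{300\ \mathrm{W}} = 2.5 \times 10^{10}\ \mathrm{FLOPS/W},
\qquad
\frac{7.2 \times 10^{12}\ \mathrm{FLOPS}}{3 \times 10^{6}\ \mathrm{W}} = 2.4 \times 10^{6}\ \mathrm{FLOPS/W}
$$

On these numbers the 2017 device delivers roughly $10^{4}$ times more double-precision throughput per watt than the 2000 system.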

# ...but: we are hitting the "Energy Wall"









### **QUESTIONS:**

1) Does it matter and is there an impact on database systems (regarding energy savings without compromising performance)?

2) Why should the DB community care about it?

Appears in the Proceedings of the 38th International Symposium on Computer Architecture (ISCA '11)

# Dark Silicon and the End of Multicore Scaling

Hadi Esmaeilzadeh† Emily Blem‡ Renée St. Amant§ Karthikeyan Sankaralingam‡ Doug Burger°
†University of Washington ‡University of Wisconsin-Madison
§The University of Texas at Austin °Microsoft Research

hadianeh@cs.washington.edu blem@cs.wisc.edu stamant@cs.utexas.edu karu@cs.wisc.edu dburger@microsoft.com



## **Energy Awareness**



#### POWER BREAKDOWN HASWELL-EP



#### **INITIAL EVALUATION**



#### HARDWARE CONFIGURATION KNOBS



#### **OBSERVATIONS:**

- 1) There are opportunities
- 2) There are many knobs to tune



# ...but: workload knowledge makes a difference





# **Energy Savings**



Query Load

Linux Governor

DB-controlled







# **Energy Awareness**





Figure 1: CPU performance trend for Android mobile devices.



## HW/SW-CoDesign





## Tomahawk DBA Primitives



| Category                                         | Primitives                                                                                  |
|--------------------------------------------------|---------------------------------------------------------------------------------------------|
| Bitmap Compression and Processing (AND, OR, XOR) | WAH, PLWAH, COMPAX                                                                          |
| Hashing                                          | Hash + Lookup, Hash + Insert, Hash Keys, Hash Sampling, CityHash32                          |
| Sorted Set Operations                            | Merge Sort, Intersection, Union, Difference, Sort-Merge Join, Sort-Merge Aggregation (SUM)  |

+ development is going on

...more on Friday 11:40am – 12pm



## Energy-Efficient Hash Join Implementations in Hardware-Accelerated MPSoCs

Sebastian Haas, Gerhard Fettweis Vodafone Chair Mobile Communications Systems Center for Advancing Electronics Dresden (cfaed) Technische Universität Dresden, Germany

sebastian.haas@tu-dresden.de, gerhard.fettweis@tu-dresden.de






## Tomahawk DBA: Sorted Set Intersection



|                         | Intel i7-920        |    | DBA_2LSU_EIS        |       |
|-------------------------|---------------------|----|---------------------|-------|
| Throughput (elements/s) | 1,100 million       |    | 1,203 million       | ~ ±x% |
| Clock frequency         | $2.67~\mathrm{GHz}$ |    | $0.41~\mathrm{GHz}$ |       |
| Max. TDP                | $130~\mathrm{W}$    | >> | $0.135~\mathrm{W}$  |       |
| Cores/Threads           | 4/8                 |    | 1/1                 |       |
| Feature size            | $45~\mathrm{nm}$    |    | $65~\mathrm{nm}$    |       |
| Area (logic & memory)   | $263~\mathrm{mm}^2$ | >> | $1.5~\mathrm{mm}^2$ |       |

# Relative Area Consumption (DBA\_2LSU\_EIS)

| Part             | $\mathrm{Area}[\%]$ |
|------------------|---------------------|
| Basic Core       | 20.5                |
| Decoding/Muxing  | 14.4                |
| States           | 14.7                |
| Op: All          | 11.3                |
| Op: Intersection | 6.8                 |
| Op: Difference   | 9.0                 |
| Op: Union        | 17.6                |
| Op: Merge-Sort   | 5.7                 |









## Summary - Compute Unit Diversity







## Outline









# **Memory Diversity**









# Memory Diversity







## **DRAM Process Scaling Challenge**


- DRAM process scaling is slowing significantly
  - Approaching physical limit
  - Migrating to new process difficult and requires large investment
  - Core parameters (ex. Refresh, t<sub>WR</sub>, VRT) are getting worse
    - ⇒ increasing fail bit count

## **DRAM Family Tree and Applications**

[Figure: DRAM family tree across 40nm, 30nm, and 20nm process classes, showing cost/bit and application segments such as GPU/GPGPU and commodity DRAM, with a transition period around 2011.]



## Memory Diversity







1X 10X Latency

1X 100X Capacity

#### **DEVELOPMENTS**

HDD: huge demand for extremely cheap cloud storage (→ e.g. new form factors)

#### SSD:

- large capacity (> 1PByte) and (relatively) high bandwidth
- significant development ahead
- still (relatively) poor latency



## Memory Diversity



### Merging Point between Storage and Memory





# Game Changer? Non-Volatile Memory (NVRAM)

#### **ADVANTAGES**

- ... does not consume energy if not used
- ... is persistent, byte-addressable
- ... x-times denser than DRAM

#### **DRAWBACKS**

- ... has higher latency than DRAM
  - Read latency ~2x slower than DRAM
  - Write latency ~10x slower than DRAM
- Number of writes is limited.

|                      | MRAM    | DRAM   | PCM                  | ReRAM                | TLC NAND |
|----------------------|---------|--------|----------------------|----------------------|----------|
| Cost per Bit         | ~5      | 1      | >0.5                 | >0.5                 | 0.05     |
| Read Latency         | ~1      | 1      | 10                   | 100                  | 1000     |
| Write Latency        | ~1      | 1      | 50                   | 1000                 | 10000    |
| Volatility           | no      | yes    | no                   | no                   | no       |
| Endurance            | >1E15   | >1E16  | 1E6~1E8              | 1E6                  | 1E3      |
| Write Energy (J/bit) | 0.1~100 | 1      | 0.1~10               | 0.1~10               | 10       |

IEEE TRANSACTIONS ON PARALLEL AND DISTRIBUTED SYSTEMS

### A Survey of Software Techniques for Using Non-Volatile Memories for Storage and Main Memory Systems

Sparsh Mittal, Member, IEEE, and Jeffrey S. Vetter, Senior Member, IEEE

Abstract—Non-volatile memory (NVM) devices, such as Flash, phase change RAM, spin transfer torque RAM, and resistive RAM, offer several advantages and challenges when compared to conventional memory technologies, such as DRAM and magnetic hard disk drives (HDDs). In this paper, we present a survey of software techniques that have been proposed to exploit the advantages and mitigate the disadvantages of NVMs when used for designing memory systems, and, in particular, secondary storage (e.g., solid state drive) and main memory. We classify these software techniques along several dimensions to highlight their similarities and differences. Given that NVMs are growing in popularity, we believe that this survey will motivate further research in the field of software technology for NVMs.

Index Terms—Review, classification, non-volatile memory (NVM) (NVRAM), flash memory, phase change RAM (PCM) (PCRAM), spin transfer torque RAM (STT-RAM) (STT-MRAM), resistive RAM (ReRAM) (RRAM), storage class memory (SCM), Solid State Drive (SSD).

#### 1 INTRODUCTION

In all computing systems ranging from hand-held embedded systems to massive supercomputers, memory systems play the primary role in determining their power consumption, reliability, and, unquestionably, application performance. The ever-increasing data-intensive nature of state-of-the-art applications

caches to main memory to secondary storage. The potential benefits of these NVMs stem from their projected physical properties that may allow them to both consume very low power and provide much higher density than projected traditional technologies. For example, the size of a typical SRAM cell is in the range of 125–200 F<sup>2</sup>, while that of a PCM and Flash

| Parameter                     | NAND            | DRAM            | PCM                              | STT-RAM                         |
|-------------------------------|-----------------|-----------------|----------------------------------|---------------------------------|
| Read / Write Latency          | 25 μs / 500 μs  | 50 ns           | 50 ns / 500 ns                   | 10 ns / 50 ns                   |
| Byte-addressable              | No              | Yes             | Yes                              | Yes                             |
| Endurance                     | $10^4 - 10^5$   | $> 10^{15}$     | $10^8 - 10^9$                    | $> 10^{15}$                     |



## NVRAM as Transient Main Memory





### NVRAM operates in two modes



DRAM as hardware-managed cache for NVRAM



#### NVRAM next to DRAM





## NVRAM as Persistent Main Memory





- SNIA recommends accessing NVRAM via file mmap()
- NVRAM-optimized filesystem provides zero-copy mmap(), bypassing the OS page cache
  - → Linux ext4 and xfs already provide Direct Access support



may result in a single-level database, i.e. the persistent version == the working copy



# NVRAM as Universal Memory





"...not fast enough to replace main memory...not cheap enough to replace flash" M. Stonebraker, https://www.nextplatform.com/2017/08/15/hardware-drives-shape-databases-come/

Is this disruptive?







# NVRAM as Universal Memory: Pros and Cons



### CONS / THREATS

- NVRAM too expensive to fill the gap between DRAM and SSDs
- higher latency is directly visible for state-of-the-art data structures
- little control over when data is persisted due to CPU cache eviction policy or memory reordering
- testing methods required to cover novel types of bugs

#### Pros / Opportunities

- DRAM may be hitting scalability limits soon
- fits nicely into rack-scale architectural blueprints
- very limited performance degradations for the right data structure with matching access patterns
- provides near instant recovery! (loading an X-TeraByte database into Main Memory is a pain!)





#### **Adaptive Recovery for SCM-Enabled Databases**

Ismail Oukid<sup>†‡</sup>

Anisoara Nica‡

**Daniel Dos Santos** 

Wolfgang Lehner†

first.last@tu-dresden.de first.last@sap.com

\*Intel Deutschland GmbH first.last@intel.com



# Programming and Testing Challenges



### (DATABASE) DEVELOPERS ARE USED TO

- ordering operations at the logical level (e.g., write undo log, then update primary data)
- fully controlling when data is made persistent (e.g., log durability must precede data durability)

#### **NVM** INVALIDATES THESE ASSUMPTIONS

- little control over when data is made persistent
- writes need to be ordered at the system level resulting in novel failure scenarios



How to ensure consistency of data structures in NVM?



## Example: Array Append Operation



```
// Variant 1: explicit ordering with flush and fence instructions
void push_back(int val){
    m_array[m_size] = val;
    sfence();
    clwb(&m_array[m_size]);
    sfence();
    m_size++;
    sfence();
    clwb(&m_size);
    sfence();
}

// Variant 2: transactional interface
void push_back(int val){
    TXBEGIN {
        m_array[m_size] = val;
        m_size++;
    } TXEND
}

Array.push_back(2017);
```
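
Why the order of flushes matters can be shown with a toy model. The sketch below is an assumption-laden simulation and not real persistent-memory code: `cache_*` stands for what the CPU sees, `nvm_*` for what survives a crash, and the `crash_after` parameter is a hypothetical test hook that stops the operation after a given number of steps.

```cpp
#include <array>
#include <cstddef>

// Toy model of a persistent array (simulation only, no real flush instructions).
struct ToyPersistentArray {
    static constexpr std::size_t kCap = 8;
    std::array<int, kCap> cache_data{};
    std::array<int, kCap> nvm_data{};
    std::size_t cache_size = 0;
    std::size_t nvm_size = 0;

    void flush_slot(std::size_t i) { nvm_data[i] = cache_data[i]; }
    void flush_size() { nvm_size = cache_size; }

    // Element becomes durable first, then the size that publishes it.
    // crash_after = k executes only the first k of the four steps.
    void push_back(int val, int crash_after = 4) {
        if (cache_size == kCap) return;            // toy model: ignore overflow
        if (crash_after < 1) return;
        cache_data[cache_size] = val;              // step 1: store element
        if (crash_after < 2) return;
        flush_slot(cache_size);                    // step 2: persist element
        if (crash_after < 3) return;
        cache_size++;                              // step 3: update size
        if (crash_after < 4) return;
        flush_size();                              // step 4: persist size
    }

    // Simulated restart: volatile state is lost and rebuilt from NVM.
    void crash_and_recover() {
        cache_data = nvm_data;
        cache_size = nvm_size;
    }
};
```

Whatever step the simulated crash hits, the recovered size never covers a slot whose value was not persisted; swapping steps 2 and 4 would break that invariant.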

#### What is in NVM?



### PROS:

- low-level optimizations possible

### Cons:

- programmer must reason about the application state
  - → harder to use and error prone

### à la software transactional memory

pmem.io: Persistent Memory Programming

This site is focused on making persistent memory programming easier. The current focus is on the NVM Library, which is a library (set of libraries, actually) designed to provide some useful APIs for server applications wanting to use persistent memory. You can read more about the NVM Library or go directly to the source. Contributions are welcome!

#### PROS:

- easy to use and to reason about

#### CONS:

- overhead due to systematic logging
- low-level optimizations not possible



#### **NVM Performance Challenges**

#### WHAT IS THE COST OF FLUSHING INSTRUCTIONS?

- Prototype hybrid NVM-DRAM storage engine
- TPC-C throughput relative to "without flushes"
- Flushes incur ~18% performance overhead

Flushes are expensive but agnostic to latency

#### WHAT IS THE EFFECT OF HIGHER NVM LATENCIES?

- TPC-C throughput relative to "baseline NVM latency" (154ns)
- 4x higher latency → ~32% performance penalty with or without flushes

NVM latency is the main performance-deciding factor







### Persistent Memory Leaks



Novel class of memory leaks resulting from failures

**Example:** crash during a linked-list insertion

```
void append(int val){
    node *newNode = new node();      // persistent allocation
    newNode->value = val;
    persist(&(newNode->value));
    // A crash at this point leaves newNode allocated but unreachable:
    // failure-induced persistent memory leak!
    m_tail->next = newNode;
    persist(m_tail);
    m_tail = newNode;
    persist(&m_tail);
}
List.append(9);
```



PAllocator, a fail-safe persistent SCM allocator whose design



Resistive RAM (RRAM) [13], and HP's Memristors [26]. Given



novel types of bugs and additional testing overhead



#### Summary - Memory Diversity



Testing





#### Outline









# **Network Diversity**







#### **Network Diversity**











Ring Network

Haswell-EP

Fully Connected

Fully Connected

Fat Tree Infiniband, etc.

On-Chip → On-Board → Cross-Board → Cross-Node

data locality is king, moving data is evil!!!



#### ...most critical component for data-centric systems

- key for separation of compute and memory
  - core prerequisite for providing elasticity in database systems



#### Fast Networks: Infiniband & RDMA



#### **Network vs. Memory Bandwidth:**

The End of Slow Networks: It's Time for a Redesign

Carsten Binnig Andrew Crotty Alex Galakatos Tim Kraska Erfan Zamanian

Department of Computer Science, Brown University

{firstname\_lastname}@brown.edu PVLDB 2016





Machine: 2 Sockets, 4 NICs

⇒ Network bandwidth is not a bottleneck anymore (Latency is still 10x higher for remote access)



#### Example: Distributed Radix Join






#### Workload:

- Joins: Classic Dist. Join (Slow Net), Classic Dist. Join (Fast Net), RDMA Joins (Fast Net)
- Data: 512M records per table x 2 on 4 servers (1Gb Ethernet + FDR 4x IB)

Listing 1: Remote Radix-Partitioning

```
for each tuple r in R do {
  h = radix-hash(key(r));
  buffer[h].append(r);

  if (buffer[h].isFull()) {
    counter[h]++;
    if (partition[h] is local) {
      memcpy(buffer[h], partition[h]);
    }
    else {
      // request a completion signal only when the counter reaches N
      signalled = (counter[h] == N);
      RDMA_WRITE(buffer[h], partition[h], signalled);
    }
    buffer[h].empty();
  }
}
```
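
The buffering idea in Listing 1 can be tried out locally with a small sketch. It is a single-node stand-in under stated assumptions: the radix hash is taken to be the low-order key bits, the bulk copy replaces both `memcpy` and `RDMA_WRITE`, and the function name and parameters are hypothetical.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Local stand-in for remote radix partitioning: keys are staged in small
// per-partition buffers and copied out in bulk whenever a buffer fills up.
std::vector<std::vector<uint64_t>> radix_partition(const std::vector<uint64_t>& keys,
                                                   unsigned radix_bits,
                                                   std::size_t buffer_capacity) {
    const std::size_t fanout = std::size_t{1} << radix_bits;
    const uint64_t mask = static_cast<uint64_t>(fanout) - 1;
    std::vector<std::vector<uint64_t>> partitions(fanout), buffers(fanout);

    // Bulk copy of one staging buffer into its partition. A distributed
    // version would issue a network write here for non-local partitions.
    auto flush = [&](std::size_t h) {
        partitions[h].insert(partitions[h].end(), buffers[h].begin(), buffers[h].end());
        buffers[h].clear();
    };

    for (uint64_t key : keys) {
        const std::size_t h = static_cast<std::size_t>(key & mask);  // low-order bits
        buffers[h].push_back(key);
        if (buffers[h].size() == buffer_capacity) flush(h);
    }
    for (std::size_t h = 0; h < fanout; ++h) flush(h);  // drain partial buffers
    return partitions;
}
```

Batching tuples per partition turns many small transfers into a few large ones, which is the property the remote variant relies on to keep the network busy.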

# **Example: Memory Extensions**









Figure 1: Integrating remote memory into an RDBMS.



#### ...even more Memory Extensions



#### **Efficient Memory Disaggregation with INFINISWAP**

Juncheng Gu, Youngmoon Lee, Yiwen Zhang, Mosharaf Chowdhury, Kang G. Shin University of Michigan



**NSDI 2017** 



Figure 3: INFINISWAP architecture. Each machine loads a block device as a kernel module (set as swap device) and runs an INFINISWAP daemon. The block device divides its address space into slabs and transparently maps them across many machines' remote memory; paging happens at page granularity via RDMA.



pathfinding projects towards Rack-scale computing



#### ...but still: Latency matters!!!





Topology







### Solution? - Accelerated Memory Operations





#### SGI Global Reference Unit (GRU)

- Global Shared Memory & Cache Coherency
- **Explicit Offloading**



...more on Friday 1.10 pm -1.35 pm

**Hardware-Accelerated Memory Operations** on Large-Scale NUMA Systems

Markus Dreseler Timo Djürken Matthias Uflacker Hasso Plattner Hasso Plattner Institute Potsdam, Germany

{first.last}@hpi.uni-potsdam.de

Thomas Kissinger Eric Lübke Dirk Habich Wolfgang Lehner Database Systems Group Technische Universität Dresden

{first.last}@tu-dresden.de



### Network Developments



All-to-All topology, e.g. NUMAlink 7



#### Network vs. Memory Bandwidth:



Is this disruptive?



What is next?



### **Network Diversity**





Ring Network

Haswell-EP



**Fully Connected** 

On-Board -







Cross-Board ———

Fat Tree



Infiniband, etc.

Cross-Node

Rack-scale architecture








# Technology Advances in Network Technology (cross-board)





(2.5mm x 3.0mm)







single mode onboard waveguides



# Technology Advances in Network Technology (cross-board)







(cost-based) configurable topology











# Impact on Database System Design









Volume (scalability)

- Single query vs. overall system performance
- Scheduling & data placement
- Concurrency control















Millions of cores?



Scheduling a zoo?











Variety (heterogeneity)

- Impact on runtime
- Dealing with non-relational operators / application code















Scheduling a zoo?

Energy Awareness?









Volume (scalability)



Scheduling a zoo?

**Energy Constraints? Resilience Constraints?** 





#### Heterogeneous-Reliability Memory: **Exploiting Application-Level Memory Error Tolerance**

Yixin Luo Sriram Govindan<sup>†</sup> Bikash Sharma<sup>†</sup> Mark Santaniello<sup>†</sup> Justin Meza Aman Kansal<sup>†</sup> Jie Liu<sup>†</sup> Badriddine Khessib<sup>‡</sup> Kushagra Vaid<sup>†</sup> Onur Mutlu Carnegie Mellon University, yixinluo@cs.cmu.edu, [meza, onur]@cmu.edu <sup>†</sup>Microsoft Corporation, [srgovin, bsharma, marksan, kansal, jie.liu, bkhessib, kvaid]@microsoft.com

Recent studies estimate that server cost contributes to as much as 57% of the total cost of ownership (TCO) of a datacenter [1]. One key contributor to this high server cost is the procurement of memory devices such as DRAMs, especially for data-intensive datacenter cloud applications that need low

errors in the field [40, 33, 41, 20, 27, 18, 36], we wanted to design a framework to emulate the occurrence of a memory error in an application's data in a controlled manner. Second, we wanted an efficient way to measure how an application accesses its data. Third, we wanted our framework to be easily adaptable to other workloads or system configurations.











Variety (heterogeneity)











Energy Constraints?
Resilience Constraints?



(heterogeneity)



# Database Design Principles





Data-Centric Design



Fine-Grained Adaptivity



**Self-Adaptation** 





# DB Design Principle: Data Centric Architecture



#### **Scalability Limiters**

#### **Transaction-Oriented Architecture**

Latches in Data Structures

Remote Memory Accesses













### DB Design Principle: Data Centric Architecture





# DB Design Principle: Fine Grained Adaptivity



Clear abstraction between













# DB Design Principle: Fine Grained Adaptivity



Clear abstraction between ... allows for

















# ... the End!



#### Conclusion



#### Hardware developments are pushing system software development

#### **OPPORTUNITIES ARE MANIFOLD**

- We don't have a choice!
  - modern HW will be exploited for efficient data management
    → if not by "us", then by other communities
- Extremely interesting research questions, but
  - "there is no free lunch" still holds!
  - requires interdisciplinary research activities beyond DB system engine design



#### RECAP: DISRUPTIONS ALWAYS REQUIRE TWO INGREDIENTS:

- Novel technology
- Novel types of DB(!!!) applications

Now we have both!!!

Now it's time for the next disruption!







# How Disruptive is Modern Hardware?

Wolfgang Lehner

Thanks!