Hash Joins Meet CXL: A Fresh Look

Authors:
Wentao Huang, Mian Lu, Kian-Lee Tan
Abstract

Compute Express Link (CXL) has emerged as a promising technology for expanding memory capacity and bandwidth in data-intensive systems. Recent studies advocate interleaving CXL memory with local DRAM to create a new interleaved memory tier, thereby increasing aggregate bandwidth for bandwidth-bound workloads. Following this trend, many systems place data in the interleaved tier to accelerate query processing. However, in most applications the data initially resides in CXL memory, and existing work fails to account for the additional overhead of moving that data into the interleaved tier. In this paper, we revisit this design decision through the lens of main-memory hash joins, breaking down performance across execution phases and developing a performance model that captures both bandwidth benefits and data movement costs. Our analysis demonstrates that moving only a portion of the data from CXL memory to DRAM can outperform the conventional strategy of relocating the entire dataset to the interleaved tier for subsequent in-place processing, owing to reduced data movement and a balanced use of memory bandwidth. This work challenges established practices and offers practical guidance for designing CXL-aware query operators.
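
The trade-off between bandwidth benefit and data movement cost can be illustrated with a toy back-of-the-envelope model. This is not the performance model developed in the paper; it is a minimal sketch under simplifying assumptions that are ours: the dataset starts entirely in CXL memory, any relocation is bounded by CXL read bandwidth, processing makes a fixed number of bandwidth-bound passes over the data, and DRAM and CXL can be read in parallel. All function names, parameters, and numbers below are hypothetical.

```python
def full_interleave_time(size_gb, dram_bw, cxl_bw, passes):
    """Relocate the whole dataset, then process at the aggregate bandwidth."""
    # Assumption: every byte is read out of CXL once during relocation.
    move = size_gb / cxl_bw
    # Assumption: the interleaved tier delivers DRAM + CXL bandwidth.
    process = passes * size_gb / (dram_bw + cxl_bw)
    return move + process


def partial_move_time(size_gb, dram_bw, cxl_bw, passes, fraction):
    """Move only `fraction` of the data to DRAM and leave the rest in CXL."""
    move = fraction * size_gb / cxl_bw
    # Both tiers are scanned in parallel; the slower side bounds each pass.
    per_pass = size_gb * max(fraction / dram_bw, (1 - fraction) / cxl_bw)
    return move + passes * per_pass


def balanced_fraction(dram_bw, cxl_bw):
    """Fraction at which both tiers finish a pass at the same time."""
    return dram_bw / (dram_bw + cxl_bw)


if __name__ == "__main__":
    # Illustrative inputs only (GB and GB/s), not measurements.
    size_gb, dram_bw, cxl_bw, passes = 100, 100, 50, 3
    f = balanced_fraction(dram_bw, cxl_bw)
    full = full_interleave_time(size_gb, dram_bw, cxl_bw, passes)
    part = partial_move_time(size_gb, dram_bw, cxl_bw, passes, f)
    print(f"full relocation: {full:.2f} s")
    print(f"partial move (f={f:.2f}): {part:.2f} s")
```

With these illustrative inputs, both strategies process the data at the same effective bandwidth, but the partial strategy relocates only two thirds of the bytes, so the toy model gives roughly 3.33 s against 4.00 s for full relocation. A real hash join has phases with different access patterns, which is why a per-phase breakdown is needed rather than a single pass count.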