FLEET: High-Performance Durable Replicated State Machines using Scattered and Coordinated Log Entries

Authors:

Hua Fan, Hao Tan, Wenchao Zhou, Feifei Li

Abstract

Distributed coordination services are fundamental components of distributed systems, employing durable replicated state machines (RSMs) to ensure consistency across replicas and prevent data loss, even in the event of all nodes failing. These services typically rely on persistent logs for rapid recovery, as a universally agreed-upon log allows replicas to restore their state by sequentially replaying ordered log entries. However, the requirement for a totally ordered log inherently limits opportunities for parallelism. This paper introduces Fleet, a high-performance durable RSM protocol that combines a hybrid scattered-entry log with an asynchronous ordered log. Our approach integrates synchronous persistence of scattered entries with asynchronous persistence of ordered entries, ensuring both rapid recovery and high levels of parallelism. Additionally, we propose a parallel applying optimization for the etcd database, named pre-apply. Experimental results demonstrate that Fleet significantly outperforms Raft and Scalog in terms of throughput and latency, achieving up to 10× the throughput under specific configurations and scaling effectively across multiple nodes. Additionally, with the pre-apply optimization, Fleet delivers a 10-fold increase in throughput compared to sequential applying on etcd. Although Fleet incurs a 5% overhead in recovery time during leader failure, this delay is tolerable given the rarity of such events.
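
As a rough illustration of the idea of separating entry persistence from ordering, the toy sketch below stores entries in a scattered (unordered) store and records their order in a separate list, then recovers state by replaying entries in the recorded order. It is not Fleet's protocol: `ToyLog`, `append`, `commit_order`, and `recover` are hypothetical names, in-memory structures stand in for durable storage, and replication, leader election, and the handling of entries that were never ordered are left out.

```python
# Toy model only: names and structure are assumptions, not taken from Fleet.

class ToyLog:
    def __init__(self):
        # entry_id -> (key, value); stands in for entries persisted
        # synchronously, in whatever order they happen to arrive.
        self.scattered = {}
        # Sequence of entry ids; stands in for the ordered log that is
        # persisted asynchronously, after the entries themselves.
        self.order = []

    def append(self, entry_id, command):
        """Store an entry without deciding its position in the total order."""
        self.scattered[entry_id] = command

    def commit_order(self, entry_ids):
        """Record the agreed order for a batch of already stored entries."""
        for entry_id in entry_ids:
            if entry_id not in self.scattered:
                raise KeyError(f"unknown entry: {entry_id}")
        self.order.extend(entry_ids)


def recover(log):
    """Rebuild key-value state by replaying entries in the recorded order.

    Returns the state and the ids of stored entries that have no recorded
    order yet; what a real protocol does with those is out of scope here.
    """
    state = {}
    for entry_id in log.order:
        key, value = log.scattered[entry_id]
        state[key] = value
    ordered = set(log.order)
    pending = sorted(e for e in log.scattered if e not in ordered)
    return state, pending


log = ToyLog()
log.append("b", ("x", 2))   # entries may be stored in any order
log.append("a", ("x", 1))
log.append("c", ("y", 3))
log.commit_order(["a", "b"])  # the order is fixed separately
state, pending = recover(log)
```

In this toy run, replay applies `a` and then `b`, so the later write to `x` wins, while `c` is stored but still unordered.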

PVLDB is part of the VLDB Endowment Inc.


Volume 18, No. 5
