Approximately Synchronous Distributed Systems

Speaker: Arvind Krishnamurthy
University of Washington
Date: 02/07/2014
Time: 2:00 pm - 3:00 pm
Location: LINCS Meeting Room 40


Applications hosted within the datacenter often rely on distributed services such as ZooKeeper, Chubby, and Spanner for fault-tolerant storage, distributed coordination, and transaction support. These systems provide consistency and availability in the presence of limited failures by relying on sophisticated distributed algorithms such as state machine replication. Unfortunately, these distributed algorithms are expensive, incur additional latency, suffer from bottlenecks, and are difficult to optimize. This state of affairs arises because distributed systems are traditionally designed independently of the underlying network and supporting protocols, and make worst-case assumptions (e.g., complete asynchrony) about the network's behavior.

While this is reasonable for wide-area networks, many distributed applications are deployed in datacenters, where the network is more reliable, predictable, and extensible. Our position is that codesigning networks and distributed systems to operate under an “approximately synchronous” execution model can have substantial benefits in datacenter settings. We will illustrate this with two case studies in this talk. The first is Speculative Paxos, a distributed coordination service for datacenters that relies on the network to exhibit approximately synchronous behavior in the normal case while remaining correct if the network exhibits weaker properties. The second is Optimistic Replicated Two-Phase Commit (OR-2PC), a new distributed transaction protocol that uses an optimistic ordering technique based on loosely synchronized clocks to improve both throughput and latency.