Brendan Ang

Search

Search IconIcon to open search

Distributed Data Management

Last updated Mar 9, 2023 Edit Source

# Distributed Data Management

# Distributed Transactions

Transaction Management for distributed systems. All shards should either commit/abort the same transaction.

# Atomic Commit

The de-facto protocol for atomic commit is two-phase commit (2PC)

Approach: use 1 process as the coordinator (leader). Given a proposed transaction T, commit if all followers agree to commit. Abort if at least one follower aborts/fails. Problem: if the process was to fail after the decision was made by the coordinator it will be unable to apply the changes locally in the shard.

# Distributed Snapshotting

Capturing the global state of a distributed system.

# Consistent Cuts

Properties:

  1. Termination: eventually every process records its state
  2. Validity: all recorded states correspond to a consistent cut

# Chandy Lamport Algorithm

Approach: disseminate a special marker to mark events during the cut. 500

# Epoch-based Snapshotting

For continuous data stream processing, it is difficult to log individual task executions.

Approach: divide computations into epochs, such as stages, and treat them as 1 transaction. The Chandy Lamport algorithm is not enough, as it will capture a lot of in-flight messages. We want to capture just the states which would in itself reflect the effect of these messages. This is done by epoch alignment:

  1. Allow all messages to go through until an epoch change marker is introduced
  2. On receiving the marker, log the state
  3. When a process receiving the marker has multiple channels, prioritise inputs from channels which have not seen the marker until they all see the marker.
  4. Terminate once all processes seen the marker. 500