Demystifying Distributed Consistency: CAP Theorem and Raft Algorithm Explained

Categories: Distributed Systems | Tags: MIT 6.824, Distributed Consistency, CAP Theorem, Raft Algorithm


1. Everyday Analogy: “Agreement” in Team Collaboration

Imagine a group of friends in different cities planning a trip together. Messages arrive late and some friends drop offline, so the group can easily end up with conflicting versions of the plan. Distributed systems face the same consistency problem: multiple machines must agree on shared state even though the network between them is unreliable.


2. Consistency Models and the CAP Theorem

1. Overview of Consistency Models

Model                | Description                                                           | Everyday Analogy
---------------------|-----------------------------------------------------------------------|--------------------------------------------------
Strong Consistency   | Every read on any node returns the most recently written value        | Everyone receives the updated plan simultaneously
Eventual Consistency | Replicas converge over time but may be temporarily inconsistent       | Some get the plan earlier, others later
Weak Consistency     | No guarantee of synchronization; states may diverge for long periods  | Everyone has different travel plans

2. CAP Theorem Trade-offs

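The CAP theorem states that a distributed system cannot simultaneously guarantee Consistency (C), Availability (A), and Partition tolerance (P). Because network partitions cannot be ruled out in practice, the real choice arises when a partition occurs: the system must either refuse some requests to stay consistent (CP) or keep answering and risk returning stale or conflicting data (AP).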

CAP Theorem Diagram:

          Consistency (C)
              /   \
             /     \
Availability (A) --- Partition tolerance (P)

Trade-off | Representative Systems | Use Cases
----------|------------------------|-----------------------------------------------
CA        | Single-node databases  | Stable network, no partitions
CP        | ZooKeeper              | Systems needing strong consistency
AP        | Dynamo, Cassandra      | Highly available, eventually consistent systems
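
To make the CP versus AP choice concrete, here is a minimal sketch of one replica handling a write while it may be cut off from its peers. The `Replica` type, the `Mode` values, and the `reachable` parameter are hypothetical names invented for illustration; they do not correspond to the API of ZooKeeper, Dynamo, or Cassandra.

```go
package main

import (
	"errors"
	"fmt"
)

// Mode is a hypothetical switch for how a replica behaves during a partition.
type Mode int

const (
	CP Mode = iota // prefer consistency: refuse writes without a majority
	AP             // prefer availability: accept writes locally, reconcile later
)

// ErrNoQuorum is returned by a CP replica that cannot reach a majority.
var ErrNoQuorum = errors.New("cannot reach a majority of replicas")

// Replica is an illustrative stand-in for one node of a replicated store.
type Replica struct {
	mode        Mode
	clusterSize int
	data        map[string]string
	pending     []string // keys written without a majority (AP only)
}

func NewReplica(mode Mode, clusterSize int) *Replica {
	return &Replica{mode: mode, clusterSize: clusterSize, data: map[string]string{}}
}

// Write stores key=value. reachable counts the replicas, including this one,
// that the node can currently contact.
func (r *Replica) Write(key, value string, reachable int) error {
	if reachable <= r.clusterSize/2 { // on the minority side of a partition
		if r.mode == CP {
			return ErrNoQuorum // stay consistent by giving up availability
		}
		r.pending = append(r.pending, key) // stay available, reconcile later
	}
	r.data[key] = value
	return nil
}

func main() {
	cp, ap := NewReplica(CP, 5), NewReplica(AP, 5)
	fmt.Println("CP, 2 of 5 reachable:", cp.Write("plan", "Paris", 2))
	fmt.Println("AP, 2 of 5 reachable:", ap.Write("plan", "Paris", 2))
}
```

The CP replica returns an error on the minority side, while the AP replica accepts the write and records that it still has to be reconciled once the partition heals.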

3. Replica Mechanisms and Data Consistency

Replication improves reliability and read performance, but keeping replicas consistent is challenging. Common replication methods:

  • Primary-Backup: The primary handles all writes; backups replicate from it, often asynchronously
  • Multi-Master: Multiple nodes accept writes, which requires conflict resolution

Strong consistency across replicas typically relies on a consensus algorithm to agree on the order of log entries and on the resulting state.
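
The following sketch shows why asynchronous Primary-Backup replication can expose stale data: the primary acknowledges a write before any backup has applied it. All type and method names here (`Primary`, `Backup`, `Write`, `Sync`) are assumptions made for illustration, and the explicit `Sync` call stands in for a background replication loop.

```go
package main

import "fmt"

type update struct{ key, value string }

// Backup holds a copy of the primary's data.
type Backup struct{ data map[string]string }

// Primary accepts all writes and queues them for later shipment to backups.
type Primary struct {
	data    map[string]string
	backups []*Backup
	queue   []update // acknowledged writes not yet applied on backups
}

func NewPrimary(numBackups int) *Primary {
	p := &Primary{data: map[string]string{}}
	for i := 0; i < numBackups; i++ {
		p.backups = append(p.backups, &Backup{data: map[string]string{}})
	}
	return p
}

// Write applies the update locally and returns before backups see it.
func (p *Primary) Write(key, value string) {
	p.data[key] = value
	p.queue = append(p.queue, update{key, value})
}

// Sync ships queued updates to every backup, in write order.
func (p *Primary) Sync() {
	for _, u := range p.queue {
		for _, b := range p.backups {
			b.data[u.key] = u.value
		}
	}
	p.queue = nil
}

func main() {
	p := NewPrimary(2)
	p.Write("plan", "Paris")
	_, ok := p.backups[0].data["plan"]
	fmt.Println("backup has write before sync:", ok)
	p.Sync()
	_, ok = p.backups[0].data["plan"]
	fmt.Println("backup has write after sync:", ok)
}
```

If the primary fails between `Write` and `Sync`, the acknowledged write exists nowhere else. That gap is what consensus-based replication is meant to close.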


4. Core Consistency Protocol: Raft Algorithm Explained

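Raft was designed to be easy to understand. At any moment each node is in exactly one of three roles: Leader, Follower, or Candidate.

Raft Roles:

Follower  --(election timeout)-->      Candidate
Candidate --(votes from a majority)--> Leader
Candidate or Leader --(sees a higher term)--> Follower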

1. Leader Election

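  • All nodes start as Followers
  • A Follower that hears no heartbeat before its election timeout becomes a Candidate, increments its term, votes for itself, and requests votes from the other nodes
  • A Candidate that receives votes from a majority of the cluster becomes Leader for that term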
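
The election steps above can be sketched as follows. This is a simplified, single-threaded model under stated assumptions: the `Node` type and its methods are hypothetical, votes are requested through direct method calls instead of RPCs, and the check that a candidate's log is sufficiently up to date is omitted.

```go
package main

import "fmt"

type Role int

const (
	Follower Role = iota
	Candidate
	Leader
)

// Node is an illustrative Raft participant with only election state.
type Node struct {
	id          int
	role        Role
	currentTerm int
	votedFor    int // -1 means no vote cast in currentTerm
}

func NewNode(id int) *Node { return &Node{id: id, role: Follower, votedFor: -1} }

// HandleRequestVote grants at most one vote per term.
func (n *Node) HandleRequestVote(term, candidateID int) bool {
	if term < n.currentTerm {
		return false // request from an older term
	}
	if term > n.currentTerm { // newer term: adopt it and forget the old vote
		n.currentTerm, n.votedFor, n.role = term, -1, Follower
	}
	if n.votedFor == -1 || n.votedFor == candidateID {
		n.votedFor = candidateID
		return true
	}
	return false
}

// StartElection models what a node does when its election timeout fires.
// It reports whether the node won the election.
func (n *Node) StartElection(peers []*Node) bool {
	n.role = Candidate
	n.currentTerm++
	n.votedFor = n.id
	votes := 1 // the candidate's own vote
	for _, p := range peers {
		if p.HandleRequestVote(n.currentTerm, n.id) {
			votes++
		}
	}
	if votes > (len(peers)+1)/2 { // strict majority of the whole cluster
		n.role = Leader
		return true
	}
	return false // a real node would wait for another timeout and retry
}

func main() {
	n0, n1, n2 := NewNode(0), NewNode(1), NewNode(2)
	won := n0.StartElection([]*Node{n1, n2})
	fmt.Println("node 0 won:", won, "term:", n0.currentTerm)
}
```

Because each node votes at most once per term, two candidates cannot both collect a majority in the same term.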

2. Log Replication

  • The Leader receives client commands and appends them to its log
  • It sends the new entries to all Followers in parallel via AppendEntries
  • Once a majority of nodes has stored an entry, the Leader commits it and applies it to the state machine
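
The majority rule for committing can be expressed as a small calculation. In this sketch, `matchIndex` holds the highest log index known to be stored on each server (the leader included) and `logTerms[i]` is the term of log entry `i`, with index 0 as a sentinel. The function names are assumptions for illustration. The extra check that the entry belongs to the leader's current term follows the commitment rule in the Raft paper.

```go
package main

import (
	"fmt"
	"sort"
)

// majorityMatch returns the highest log index stored on a majority of servers.
func majorityMatch(matchIndex []int) int {
	sorted := append([]int(nil), matchIndex...) // copy so the input is not reordered
	sort.Ints(sorted)
	// In ascending order, the value at position (n-1)/2 is present on
	// at least n/2+1 servers, which is a strict majority.
	return sorted[(len(sorted)-1)/2]
}

// advanceCommit returns the leader's new commit index. An entry is committed
// by counting replicas only if it was created in the leader's current term.
func advanceCommit(commitIndex, currentTerm int, logTerms, matchIndex []int) int {
	n := majorityMatch(matchIndex)
	if n > commitIndex && n < len(logTerms) && logTerms[n] == currentTerm {
		return n
	}
	return commitIndex
}

func main() {
	matchIndex := []int{5, 3, 4, 2, 5}  // five servers, leader first
	logTerms := []int{0, 1, 1, 2, 2, 2} // index 0 is a sentinel
	fmt.Println("new commit index:", advanceCommit(2, 2, logTerms, matchIndex))
}
```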

3. Safety and Fault Tolerance

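  • Allows at most one Leader per term, which prevents split-brain
  • Uses term numbers so that Followers reject requests from outdated Leaders
  • Keeps logs consistent across network partitions and node failures by checking each AppendEntries request against the Follower's existing log

// Raft AppendEntries handler (simplified). Assumes the Raft-paper fields PrevLogIndex,
// PrevLogTerm, and reply.Term exist, and that rf.log[0] is a sentinel entry.
func (rf *Raft) AppendEntries(args *AppendEntriesArgs, reply *AppendEntriesReply) {
    rf.mu.Lock()
    defer rf.mu.Unlock()
    reply.Term, reply.Success = rf.currentTerm, false
    if args.Term < rf.currentTerm {
        return // reject a stale leader
    }
    rf.currentTerm, reply.Term = args.Term, args.Term // adopt the leader's term
    if args.PrevLogIndex >= len(rf.log) || rf.log[args.PrevLogIndex].Term != args.PrevLogTerm {
        return // log does not match the leader's at PrevLogIndex
    }
    for i, e := range args.Entries {
        idx := args.PrevLogIndex + 1 + i
        if idx < len(rf.log) && rf.log[idx].Term == e.Term {
            continue // entry already present
        }
        rf.log = append(rf.log[:idx], e) // drop any conflicting suffix, then append
    }
    reply.Success = true
}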
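
Term numbers also protect the cluster from the leader's side. The sketch below, which uses a hypothetical `Server` type and method names chosen for illustration, shows an outdated leader stepping down as soon as a reply carries a higher term, so the reply can no longer count toward committing an entry.

```go
package main

import "fmt"

// Server keeps only the state needed to illustrate term handling.
type Server struct {
	currentTerm int
	isLeader    bool
}

// observeTerm applies the Raft rule that any server seeing a higher term
// adopts it and reverts to follower.
func (s *Server) observeTerm(term int) {
	if term > s.currentTerm {
		s.currentTerm = term
		s.isLeader = false
	}
}

// handleAppendReply processes a follower's AppendEntries reply on the sender.
// It reports whether the reply may count toward committing an entry.
func (s *Server) handleAppendReply(replyTerm int, success bool) bool {
	s.observeTerm(replyTerm)
	return s.isLeader && success
}

func main() {
	old := &Server{currentTerm: 1, isLeader: true}
	counted := old.handleAppendReply(2, false) // follower already moved to term 2
	fmt.Println("reply counted:", counted, "still leader:", old.isLeader)
}
```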

5. Practical Tips for Observation and Debugging

  • Observe leader election and heartbeat via logs
  • Test fault tolerance by simulating network partitions
  • Use Go debugging tools like Delve to trace state changes

6. Terminology Mapping Table

Everyday Term | Technical Term | Explanation
--------------|----------------|----------------------------------------------
Meeting Host  | Leader         | Coordinates log replication and state updates
Attendees     | Follower       | Receives leader commands and stays in sync
Candidate     | Candidate      | Initiates election to become leader
Voting        | Vote           | Mechanism for electing leader

7. Reflection Questions and Exercises

  • How does the CAP theorem guide real-world system design?
  • How does Raft prevent split-brain scenarios?
  • Implement a simplified Raft supporting election and log replication.

8. Conclusion: Protect Distributed Data Consistency with Raft

Consistency across replicas is a cornerstone of reliable distributed systems. The CAP theorem frames the design trade-offs, and Raft offers a clear, practical way to implement consensus. Understanding both is a solid step toward designing and debugging distributed systems.