Fault-tolerant Key-Value Store Based on Raft: Practical Analysis

Categories: Distributed Systems | Tags: MIT 6.824, Raft, Key-Value Store, Fault Tolerance, Distributed Consistency



1. Everyday Analogy: Group Accounting, No Lost Ledgers

Imagine a team jointly managing a ledger, where everyone can record transactions anytime but must ensure everyone sees the latest and consistent accounts. The Raft-based fault-tolerant key-value store solves exactly this “ledger synchronization” problem.


2. System Design Goals

  • Fault Tolerance: Continue serving despite node failures
  • Consistency: All clients see synchronized data
  • High Performance: Fast response for read/write requests

3. Architecture and Core Workflow

Client Request Flow:

Client ----> Leader Node ----> Append Raft Log ----> Replicate Log to Followers ----> Commit Log ----> Update State Machine ----> Respond to Client
  • Client requests are received by the Leader
  • Leader packages operations into log entries appended to local log
  • Log entries are replicated concurrently to a majority of Followers
  • After commit, entries are applied to the key-value store state machine
  • Finally, Leader returns execution results to the client
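The code in the next section assumes a few shared types. The sketch below shows one plausible shape for them; the field names (`ClientId`, `SeqNum`, the `store` map) are illustrative assumptions, not part of the Raft protocol itself.

```go
package main

import "sync"

// Op is the command replicated through the Raft log. Any serializable
// struct works as a log entry; these fields are illustrative.
type Op struct {
	Key      string
	Value    string
	Type     string // "Put", "Append", or "Get"
	ClientId int64  // identifies the issuing client (for duplicate detection)
	SeqNum   int64  // per-client request sequence number
}

// KVServer wraps a Raft peer with the key-value state machine.
type KVServer struct {
	mu    sync.Mutex
	store map[string]string // the replicated state machine
	// rf *raft.Raft        // underlying Raft instance (omitted in this sketch)
}
```

Keeping `ClientId` and `SeqNum` inside the log entry is what later makes duplicate detection possible: every replica sees the same metadata in the same order.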

4. Key Code Examples (Go)

1. Client Write Request Handling

func (kv *KVServer) PutAppend(args *PutAppendArgs, reply *PutAppendReply) {
    op := Op{
        Key:   args.Key,
        Value: args.Value,
        Type:  args.Op, // "Put" or "Append"
    }

    // Start returns immediately; it does not wait for the entry to commit.
    // It also reports whether this node currently believes it is Leader,
    // so no separate leadership pre-check is needed.
    index, _, isLeader := kv.rf.Start(op)
    if !isLeader {
        reply.Err = ErrWrongLeader
        return
    }

    // Wait for the log entry to be committed and applied. Do NOT hold
    // kv.mu while waiting: the apply loop needs that lock to update the
    // store, and holding it here would deadlock.
    if !kv.waitForCommit(index) {
        // Leadership changed and a different entry was committed at this
        // index; the client must retry against the new Leader.
        reply.Err = ErrWrongLeader
        return
    }
    reply.Err = OK
}

2. State Machine Update (After Log Commit)

// applyCommand mutates the state machine. The caller (the apply loop)
// must hold kv.mu and should have already filtered out duplicate requests.
func (kv *KVServer) applyCommand(cmd Op) {
    switch cmd.Type {
    case "Put":
        kv.store[cmd.Key] = cmd.Value
    case "Append":
        kv.store[cmd.Key] += cmd.Value
    }
}

5. Consistency Maintenance and Idempotency Design

  • Avoid duplicate execution: Record each client’s latest request ID so that a retried request is answered from the cached result rather than executed a second time
  • Read request handling: Reads are usually served from the Leader’s local state, but for true linearizability the Leader must first confirm it still holds leadership (e.g., by committing a no-op entry or using a read lease); otherwise a deposed Leader could return stale data
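The duplicate-execution rule above can be sketched as a small dedup table: remember the highest sequence number applied per client, and skip anything at or below it. The type and method names here are illustrative assumptions, not the lab’s actual API.

```go
package main

import "sync"

// DedupKV pairs the key-value store with a per-client watermark of the
// last applied sequence number. (Illustrative sketch, not the lab API.)
type DedupKV struct {
	mu      sync.Mutex
	store   map[string]string
	lastSeq map[int64]int64 // clientId -> highest seqNum applied
}

func NewDedupKV() *DedupKV {
	return &DedupKV{store: map[string]string{}, lastSeq: map[int64]int64{}}
}

// Apply executes a command at most once per (clientId, seqNum) pair.
// Because every replica applies the same log in the same order, each
// replica makes the same skip/apply decision.
func (kv *DedupKV) Apply(clientId, seqNum int64, typ, key, value string) {
	kv.mu.Lock()
	defer kv.mu.Unlock()
	if seqNum <= kv.lastSeq[clientId] {
		return // duplicate delivery: already applied, skip
	}
	kv.lastSeq[clientId] = seqNum
	switch typ {
	case "Put":
		kv.store[key] = value
	case "Append":
		kv.store[key] += value
	}
}
```

This assumes each client issues requests one at a time with monotonically increasing sequence numbers, which is the usual client-side discipline in such labs.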

6. Debugging Tips and Practical Skills

  • Use Raft logs to trace request states
  • Simulate node crashes to verify fault recovery
  • Test repeated requests to ensure idempotency correctness
  • Use network delay simulation to locate system bottlenecks
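The repeated-request tip above can be turned into a quick self-check: apply the same slice of committed entries twice (as happens when an apply message is delivered again after a crash or leader change) and assert that the final state is unchanged. Everything here is an illustrative harness, not code from the lab.

```go
package main

import "fmt"

// entry mimics a committed Raft log entry carrying one client request.
type entry struct {
	clientId, seqNum int64
	key, value       string
}

// replay applies entries onto store with Append semantics, skipping any
// (clientId, seqNum) pair that was already executed.
func replay(entries []entry, store map[string]string, lastSeq map[int64]int64) {
	for _, e := range entries {
		if e.seqNum <= lastSeq[e.clientId] {
			continue // duplicate delivery: skip
		}
		lastSeq[e.clientId] = e.seqNum
		store[e.key] += e.value
	}
}

func main() {
	log := []entry{
		{1, 1, "k", "a"},
		{1, 2, "k", "b"},
	}
	store := map[string]string{}
	lastSeq := map[int64]int64{}
	replay(log, store, lastSeq)
	replay(log, store, lastSeq) // replaying the same prefix must be a no-op
	fmt.Println(store["k"])     // prints "ab", not "abab"
}
```

If the second replay changes the store, duplicate detection is broken; this is exactly the failure mode that surfaces in crash-recovery tests.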

7. Terminology Mapping Table

| Everyday Term | Technical Term | Explanation |
| --- | --- | --- |
| Ledger | Key-Value Store | Data structure storing key-value pairs |
| Accounting Action | Client Request | Write or append data operation |
| Meeting Resolution | Raft Log Commit | Achieving consensus and applying operations |
| Chairperson | Leader | Coordinates requests and log replication |

8. Thought Exercises and Practice

  • How to ensure the order consistency of concurrent write requests?
  • Design an idempotency mechanism to prevent duplicate request execution.
  • Extend the implementation to support snapshotting to avoid infinite log growth.

9. Conclusion: Protect Your Data Ledger with Raft

The fault-tolerant key-value store built on Raft uses distributed log replication and state machine application to achieve strong consistency and reliability. Understanding and mastering this design is a crucial step toward building production-grade distributed storage systems.