Distributed Systems Topics: Consistent Hashing and Cache Consistency Explained
HeWen · 2025-08-01 · https://ehewen.com/blog/MIT6.824-11

## 1. Everyday Analogy: Parcel Sorting and Pickup Stores

Imagine a courier company handling tens of thousands of parcels—how do they assign them to different sorting centers? And how do pickup stores keep their inventory up to date, avoiding customers receiving “expired” goods? Consistent hashing and cache consistency in distributed systems solve similar challenges of “allocation” and “synchronization.”


## 2. Consistent Hashing and Data Distribution

### 1. Concept of Consistent Hashing

Consistent hashing maps both data and nodes onto a virtual ring, storing data on the first node clockwise from the data point. This design minimizes data migration when nodes are dynamically added or removed.

Consistent Hash Ring Illustration:

```
[Node A]---[Node B]----[Node C]---[Node D]---(Ring Structure)
        ↑                  ↑
       Data X             Data Y
```

### 2. Advantages

- Smooth and efficient scaling up and down
- Minimizes data movement
- Well-suited for caching systems and distributed storage

---

## 3. Distributed Cache and Cache Consistency

### 1. Introduction to Distributed Cache

A distributed cache keeps hot data in memory across multiple nodes to speed up responses; Memcached and Redis Cluster are common examples.

### 2. Challenges of Cache Consistency

- **Cache update latency**: Cache not refreshed promptly after data changes
- **Stale data risk**: Clients read expired cache entries
- **Concurrent update conflicts**: Simultaneous writes can leave the cache and the database out of sync

### 3. Common Solutions

| Strategy              | Description                                                     | Suitable Scenario                            |
| --------------------- | --------------------------------------------------------------- | -------------------------------------------- |
| Cache Aside           | Update database first, then delete cache                        | Simple, widely used                          |
| Write Through         | Write synchronously to cache and database                       | Read-heavy, write-light workloads            |
| Write Back            | Delay writing back to database                                  | Write-heavy scenarios for better performance |
| TTL & Version Control | Use expiration time and version numbers to maintain consistency | Avoid stale data and cache avalanches        |
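
The Cache Aside strategy in the first row is simple enough to sketch directly. The snippet below is a minimal illustration, assuming hypothetical `cache` and `db` helpers and a `userTTL` expiration constant:

```go
// Read path: try the cache first, fall back to the database and refill.
func GetUser(id string) (string, error) {
    if v, ok := cache.Get(id); ok {
        return v, nil
    }
    v, err := db.QueryUser(id)
    if err != nil {
        return "", err
    }
    cache.Set(id, v, userTTL) // refill with an expiration time
    return v, nil
}

// Write path: update the database first, then delete (invalidate) the cache entry.
func UpdateUser(id, value string) error {
    if err := db.UpdateUser(id, value); err != nil {
        return err
    }
    cache.Delete(id) // the next read repopulates the cache from the database
    return nil
}
```

Deleting rather than updating the cache on writes keeps the pattern simple and reduces the chance of concurrent writers storing stale values.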

---

## 4. Distributed File Systems (DFS) Overview

### 1. Purpose

Enable massive file sharing across multiple machines, e.g., Google File System (GFS), Hadoop Distributed File System (HDFS).

### 2. Key Design Points

- File chunking and replica management
- Metadata services (e.g., HDFS NameNode, ZooKeeper)
- Fault tolerance and load balancing

---

## 5. Go Language Example: Simple Consistent Hash Algorithm

```go
// HashRing stores the nodes' hash positions sorted clockwise on a virtual
// ring; building and sorting hr.hashes is assumed to happen when nodes are
// added. Requires the "hash/fnv" and "sort" packages.
type HashRing struct {
    hashes []uint32          // sorted hash positions of the nodes
    nodes  map[uint32]string // hash position -> node name
}

// GetNode returns the first node clockwise from the key's position on the ring.
func (hr *HashRing) GetNode(key string) string {
    h := fnv.New32a()
    h.Write([]byte(key))
    hash := h.Sum32()
    idx := sort.Search(len(hr.hashes), func(i int) bool { return hr.hashes[i] >= hash })
    if idx == len(hr.hashes) { // wrapped past the largest position on the ring
        idx = 0
    }
    return hr.nodes[hr.hashes[idx]]
}
```

---

## 6. Debugging and Practical Advice

- Use monitoring tools to observe cache hit rates and expiration
- Simulate dynamic node joins/leaves to verify consistent hashing migration efficiency
- Design reasonable cache expiration and update mechanisms for consistency

---

## 7. Terminology Mapping Table

| Everyday Term   | Technical Term     | Description                                  |
| --------------- | ------------------ | -------------------------------------------- |
| Parcel Sorting  | Consistent Hashing | Efficient data allocation algorithm          |
| Pickup Stores   | Distributed Cache  | Multi-node caching system for hot data       |
| Parcel Tracking | Metadata Service   | Component managing file locations and states |

---

## 8. Thought Exercises and Practice

- How does consistent hashing reduce data migration caused by node changes?
- Design cache expiration policies to prevent cache avalanches.
- Implement a simple metadata management module for a distributed file system.

---

## 9. Conclusion: The "Soft Power" Design of Distributed Systems

Consistent hashing, distributed caching, and file systems form the backbone of distributed systems. Mastering these technologies helps build more stable and efficient distributed applications.
Distributed Transaction Processing: Ensuring Cross-Node Data Consistency
HeWen · 2025-08-01 · https://ehewen.com/blog/MIT6.824-10

1. Everyday Analogy: Group House Purchase and Escrow Accounts

Imagine several people jointly buying a house. Each person must first deposit funds into an escrow account. Only when all funds are confirmed does the transaction complete; if someone backs out or funds are insufficient, the entire deal is canceled. Distributed transactions solve similar “all-or-nothing” operations across multiple nodes.


2. Basic Concepts and Properties of Transactions

| Transaction Property (ACID) | Explanation | Everyday Analogy |
| --- | --- | --- |
| Atomicity | All operations within a transaction either succeed completely or fail completely | All funds in the group purchase are either fully deposited or the deal is void |
| Consistency | The system remains in a valid state before and after the transaction | The ledger numbers are accurate and correct |
| Isolation | Concurrent transactions do not interfere with each other | Multiple people sign contracts simultaneously without impact |
| Durability | Results of a completed transaction are permanently saved | Property registration is completed and cannot be lost |

3. Challenges of Distributed Transaction Processing

  • Network delays and failures: Communications may timeout or drop packets
  • Partial node crashes: Some participants may be unavailable, making decisions difficult
  • Coordination difficulty: Multiple nodes must unanimously “agree” or “reject”
  • Blocking issues: Participants may be blocked waiting for coordinator instructions for a long time

4. Two-Phase Commit Protocol (2PC)

Phase 1: Prepare
Coordinator ------> Participants: Request to prepare to commit
Participants ------> Coordinator: Respond ready (Yes/No)

Phase 2: Commit/Rollback
Coordinator ------> Participants: Global commit or rollback command
Participants ------> Coordinator: Acknowledge completion

Advantages

  • Simple to implement, guarantees atomic commit

Disadvantages

  • Blocking: If coordinator crashes, participants wait indefinitely
  • Single point of failure risk

5. Three-Phase Commit Protocol (3PC)

Phase 1: CanCommit?
Coordinator ------> Participants: Ask if they can commit
Participants ------> Coordinator: Reply Yes/No

Phase 2: PreCommit
Coordinator ------> Participants: Notify pre-commit
Participants ------> Coordinator: Confirm receipt

Phase 3: DoCommit
Coordinator ------> Participants: Final commit or rollback
Participants ------> Coordinator: Acknowledge completion

Advantages

  • Reduces blocking, participants can make decisions if coordinator fails
  • Improves fault tolerance

Disadvantages

  • More complex protocol, higher implementation cost
  • Still affected by network partitioning

6. Go Language Simple Example: Coordinator Logic for Two-Phase Commit

```go
func coordinatorCommit(participants []Participant) bool {
    // Phase 1: Prepare
    for _, p := range participants {
        if !p.Prepare() {
            // Some participant rejects, rollback all
            for _, p2 := range participants {
                p2.Rollback()
            }
            return false
        }
    }
    // Phase 2: Commit
    for _, p := range participants {
        p.Commit()
    }
    return true
}
```
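
The coordinator above drives each participant through prepare, commit, and rollback calls; a minimal interface matching that usage (a sketch inferred from the code above) is:

```go
// Participant is the contract the 2PC coordinator relies on. Prepare returns
// true only if the participant can guarantee a later Commit will succeed;
// Commit applies the prepared work and Rollback discards it.
type Participant interface {
    Prepare() bool
    Commit()
    Rollback()
}
```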

7. Thought Exercises and Practice

  • How can the blocking problem of 2PC be improved?
  • Implement a simple 3PC simulation to observe failure recovery flow.
  • Explore distributed transaction implementations based on Raft consensus.

8. Conclusion: The Art of Trade-offs in Distributed Transactions

Distributed transactions guarantee atomicity and consistency across nodes but come with complex coordination and fault handling challenges. Understanding and wisely choosing protocols like 2PC and 3PC is foundational for building strongly consistent distributed systems.

Sharded Key-Value Store in Practice: Design and Implementation
HeWen · 2025-07-30 · https://ehewen.com/blog/MIT6.824-09

1. Everyday Analogy: Division of Labor and Ledger Partitioning

Imagine a group managing a massive ledger; handling it alone is difficult and error-prone. They decide to split the ledger into multiple parts, each managed by different people while coordinating their work. This reduces individual burden and ensures data consistency. Sharded key-value stores similarly split large data across nodes for efficient cooperation.


2. System Goals and Challenges

  • Shard management: Partition data reasonably to evenly distribute load
  • Request routing: Client requests accurately target the corresponding shard
  • Data replication and fault tolerance: Ensure data reliability and prevent single points of failure
  • Dynamic scaling and migration: Support shard adjustments while maintaining system stability

3. Architecture Overview and Workflow

Overall Architecture:

Client
   ↓ Request shard mapping
Shard Controller (manages shard mappings)
   ↓ Specifies target shard
Shard Servers (shard node clusters)
   ↓ Data storage and replication

Request Flow:

Client
   └── Queries Shard Controller for shard info
        └── Sends request to specific Shard Server
            └── Read/write operations

4. Key Design Points

1. Shard Mapping Management

  • Maintain a mapping table recording which shard each key belongs to
  • Use consistent hashing or range partitioning for mapping

2. Request Routing Strategy

  • Client or proxy first accesses shard controller to get shard info
  • Requests are routed directly to the corresponding shard server to reduce forwarding

3. Shard Data Replication

  • Use Raft within each shard to guarantee consistency and fault tolerance
  • Multi-replica mechanism ensures data durability when nodes fail

4. Shard Migration and Scaling

  • When new nodes join, coordinate migration of partial data from old nodes
  • Ensure data consistency and availability during migration

5. Key Code Examples (Go)

1. Get Shard Number (Hash Function)

```go
func key2shard(key string, shardCount int) int {
    h := fnv.New32a()
    h.Write([]byte(key))
    return int(h.Sum32()) % shardCount
}
```

2. Client Requests Shard Controller for Routing Info

```go
func (client *Clerk) QueryShard(key string) int {
    shard := key2shard(key, client.shardCount)
    return client.config.Shards[shard] // Returns the server ID for the shard
}
```
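
QueryShard reads the routing table from a configuration object. A minimal shape for that configuration, with illustrative field names rather than a fixed API, might be:

```go
const NShards = 10 // total number of shards (illustrative)

// Config is a sketch of the shard controller's mapping table.
type Config struct {
    Num    int              // configuration version number
    Shards [NShards]int     // shard index -> replica group that serves it
    Groups map[int][]string // replica group ID -> server addresses
}
```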

3. Shard Server Handles Write Request (Invoking Raft)

```go
func (kv *ShardKV) Put(args *PutArgs, reply *PutReply) {
    if !kv.rf.IsLeader() {
        reply.Err = ErrWrongLeader
        return
    }
    op := Op{Key: args.Key, Value: args.Value, Type: "Put"}
    index, _, isLeader := kv.rf.Start(op)
    if !isLeader {
        reply.Err = ErrWrongLeader
        return
    }
    kv.waitForCommit(index)
    reply.Err = OK
}
```
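
The handler calls waitForCommit, which is not shown. One common way to implement it is to keep a notification channel per log index that the server's apply loop signals once the entry is applied; the sketch below assumes such a `notifyCh` map and apply loop exist elsewhere in the server:

```go
// waitForCommit blocks until the entry at the given log index has been
// applied, or until a timeout expires (for example after a leadership change).
func (kv *ShardKV) waitForCommit(index int) {
    kv.mu.Lock()
    ch, ok := kv.notifyCh[index]
    if !ok {
        ch = make(chan struct{})
        kv.notifyCh[index] = ch
    }
    kv.mu.Unlock()

    select {
    case <-ch: // closed by the apply loop once the entry is applied
    case <-time.After(500 * time.Millisecond):
        // Give up after a while so a lost leadership does not block the RPC forever.
    }
}
```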

6. Debugging and Practical Tips

  • Simulate shard node dynamic join/leave to verify migration mechanism
  • Test cross-shard requests to ensure accurate routing
  • Stress test shard balancing to avoid hotspot nodes
  • Use logs and monitoring to track shard states

7. Terminology Mapping Table

| Everyday Term | Technical Term | Description |
| --- | --- | --- |
| Ledger Partition | Data Sharding | Splitting data into parts for distributed storage |
| Chief Accountant | Shard Controller | Manages shard info and routing rules |
| Ledger Manager | Shard Server | Server storing shard data |
| Ledger Migration | Shard Migration | Reallocation of data among nodes |

8. Thought Exercises and Practice

  • How to implement dynamic shard scaling without service interruption?
  • Design client-side shard mapping caching to reduce shard controller load.
  • Implement leader election and failure recovery for shard replicas.

9. Conclusion: The Path to Scalable Sharded Key-Value Stores

Sharded key-value systems combine shard management, load balancing, and Raft replication to deliver highly available and high-performance data services. Mastering these design principles and practical skills is key to building large-scale distributed storage.

Data Sharding and Load Balancing: The Scalability Boosters for Distributed Systems
HeWen · 2025-07-29 · https://ehewen.com/blog/MIT6.824-08

1. Everyday Analogy: Library “Categorized Shelves” and “Visitor Diversion”

Imagine a large library where all books are piled in one area — searching is slow and crowded. Instead, books are categorized and shelved separately (data sharding), and visitors are directed to different reading areas (load balancing), making library operations orderly and efficient.


2. Principles of Distributed Data Sharding and Partitioning

1. What is Data Sharding?

Splitting massive data into multiple “chunks,” each stored on different servers, reducing single-node pressure and enabling horizontal scaling.

Data Sharding Illustration:

Data Set
  ├── Shard 1
  ├── Shard 2
  ├── Shard 3
  └── ...

2. Partitioning Strategies

| Strategy | Description | Pros & Cons |
| --- | --- | --- |
| Range Partition | Data divided by key ranges | Fast range queries but risks data skew |
| Hash Partition | Key hash modulo shard count assigns the shard | Good load balance but no range queries |
| Consistent Hash | Dynamic shard adjustment for smooth scaling | High scalability, complex implementation |

3. Load Balancing Strategies and Algorithms

1. Goals of Load Balancing

  • Evenly distribute requests to avoid node overload
  • Dynamically adapt to node join/leave

2. Common Load Balancing Algorithms

| Algorithm | Description | Suitable Scenarios |
| --- | --- | --- |
| Round Robin | Requests distributed in turns | Nodes with balanced capacity; simple to implement |
| Weighted Round Robin | Requests allocated based on node weight | Adjusting load across heterogeneous nodes |
| Least Connections | Assign to the node with the fewest active connections | Applications with long-lived connections |
| Consistent Hash | Requests mapped by key hash to a node | Cache systems and distributed storage |
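
As a concrete illustration of the simplest entry in the table, a round-robin selector is just a counter over the node list (a minimal sketch; the node addresses are illustrative):

```go
// RoundRobin hands out nodes in turn and is safe for concurrent use.
type RoundRobin struct {
    mu    sync.Mutex
    next  int
    nodes []string
}

func (rr *RoundRobin) Pick() string {
    rr.mu.Lock()
    defer rr.mu.Unlock()
    node := rr.nodes[rr.next%len(rr.nodes)]
    rr.next++
    return node
}
```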

4. Data Replication and Migration Mechanisms

1. Necessity of Data Replication

  • Improve data reliability
  • Support read scalability

2. Migration Challenges

  • Ensure data consistency
  • Minimize service disruption risk

3. Migration Flow Illustration

Data Migration Process:

Original Shard Node              New Shard Node
       ↓                                  ↑
Read/Write Requests --> Data Sync --> Switch Access Path

5. Go Example: Simple Hash-based Sharding

```go
func getShard(key string, shardCount int) int {
    h := fnv.New32a()
    h.Write([]byte(key))
    return int(h.Sum32()) % shardCount
}
```

6. Debugging and Practical Tips

  • Monitor shard loads and adjust partitioning timely
  • Simulate node join/leave to test migration mechanisms
  • Observe request distribution to detect hotspots and bottlenecks

7. Terminology Mapping Table

| Everyday Term | Technical Term | Description |
| --- | --- | --- |
| Bookshelf Partition | Data Shard | Horizontally split storage unit |
| Librarian | Load Balancer | Component distributing requests |
| Book Relocation | Data Migration | Data reassignment among nodes |

8. Thought Exercises and Practice

  • How to design a dynamic scaling data sharding strategy?
  • How to combine load balancing with consistent hashing for seamless scaling?
  • Implement a simple sharding function and simulate request assignment.

9. Conclusion: Sharding and Load Balancing Bring Systems to Life

Effective data sharding and load balancing are key technologies for horizontal scaling in distributed systems. Mastering these methods helps systems stay robust and efficient amid exploding data and surging access demands.

Fault-tolerant Key-Value Store Based on Raft: Practical Analysis
HeWen · 2025-07-29 · https://ehewen.com/blog/MIT6.824-07

1. Everyday Analogy: Group Accounting, No Lost Ledgers

Imagine a team jointly managing a ledger, where everyone can record transactions anytime but must ensure everyone sees the latest and consistent accounts. The Raft-based fault-tolerant key-value store solves exactly this “ledger synchronization” problem.


2. System Design Goals

  • Fault Tolerance: Continue serving despite node failures
  • Consistency: All clients see synchronized data
  • High Performance: Fast response for read/write requests

3. Architecture and Core Workflow

Client Request Flow:

Client ----> Leader Node ----> Append Raft Log ----> Replicate Log to Followers ----> Commit Log ----> Update State Machine ----> Respond to Client
  • Client requests are received by the Leader
  • Leader packages operations into log entries appended to local log
  • Log entries are replicated concurrently to a majority of Followers
  • After commit, entries are applied to the key-value store state machine
  • Finally, Leader returns execution results to the client

4. Key Code Examples (Go)

1. Client Write Request Handling

```go
func (kv *KVServer) PutAppend(args *PutAppendArgs, reply *PutAppendReply) {
    if !kv.rf.IsLeader() {
        reply.Err = ErrWrongLeader
        return
    }

    op := Op{
        Key:   args.Key,
        Value: args.Value,
        Type:  args.Op, // "Put" or "Append"
    }

    index, _, isLeader := kv.rf.Start(op)
    if !isLeader {
        reply.Err = ErrWrongLeader
        return
    }

    // Wait for the log entry to be committed and applied. The server mutex
    // must not be held while waiting, otherwise the apply loop (which also
    // needs kv.mu) could never apply the entry and this handler would block.
    kv.waitForCommit(index)
    reply.Err = OK
}
```

2. State Machine Update (After Log Commit)

```go
func (kv *KVServer) applyCommand(cmd Op) {
    switch cmd.Type {
    case "Put":
        kv.store[cmd.Key] = cmd.Value
    case "Append":
        kv.store[cmd.Key] += cmd.Value
    }
}
```

5. Consistency Maintenance and Idempotency Design

  • Avoid duplicate execution: Track client request IDs to ensure each request executes only once (see the sketch below)
  • Read request handling: Usually served directly from Leader’s local state to ensure linearizability
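
A sketch of such duplicate detection in the apply step, assuming each client stamps its requests with a ClientId and an increasing SeqNum and the server keeps a lastApplied map (all illustrative additions), might extend the earlier Op and applyCommand like this:

```go
// Op extended with the client's identity and a per-client sequence number.
type Op struct {
    Key      string
    Value    string
    Type     string
    ClientId int64
    SeqNum   int64
}

// applyCommand skips operations that were already applied for this client.
func (kv *KVServer) applyCommand(cmd Op) {
    if last, ok := kv.lastApplied[cmd.ClientId]; ok && cmd.SeqNum <= last {
        return // duplicate request: its effect was already applied once
    }
    kv.lastApplied[cmd.ClientId] = cmd.SeqNum

    switch cmd.Type {
    case "Put":
        kv.store[cmd.Key] = cmd.Value
    case "Append":
        kv.store[cmd.Key] += cmd.Value
    }
}
```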

6. Debugging Tips and Practical Skills

  • Use Raft logs to trace request states
  • Simulate node crashes to verify fault recovery
  • Test repeated requests to ensure idempotency correctness
  • Use network delay simulation to locate system bottlenecks

7. Terminology Mapping Table

| Everyday Term | Technical Term | Explanation |
| --- | --- | --- |
| Ledger | Key-Value Store | Data structure storing key-value pairs |
| Accounting Action | Client Request | Write or append data operation |
| Meeting Resolution | Raft Log Commit | Achieving consensus and applying operations |
| Chairperson | Leader | Coordinates requests and log replication |

8. Thought Exercises and Practice

  • How to ensure the order consistency of concurrent write requests?
  • Design an idempotency mechanism to prevent duplicate request execution.
  • Extend the implementation to support snapshotting to avoid infinite log growth.

9. Conclusion: Protect Your Data Ledger with Raft

The fault-tolerant key-value store built on Raft uses distributed log replication and state machine application to achieve strong consistency and reliability. Understanding and mastering this design is a crucial step toward building production-grade distributed storage systems.

Fault Tolerance and High Availability: Building Stable Distributed Systems
HeWen · 2025-07-28 · https://ehewen.com/blog/MIT6.824-06

1. Everyday Analogy: Airplane “Failures” and Passenger Safety

Imagine a flight where the airplane might encounter engine failure or turbulence. To ensure safety, multiple backup systems and emergency plans are designed. Distributed systems face similar “failures,” and ensuring stable operation is a core design challenge.


2. Fault Models and Fault Handling

1. Common Fault Types

| Fault Type | Description | Analogy Example |
| --- | --- | --- |
| Node Failure | Server crash or shutdown | Airplane engine failure |
| Network Failure | Network partition, message loss or delay | Airplane communication cut-off |
| Software Bug | Program bug causing abnormal behavior | Flight system software flaw |
| Hardware Fault | Disk failure, memory error | Airplane instrument failure |

2. Fault Tolerance Goals

  • Detect faults: Quickly identify anomalies
  • Recover service: Replace or repair failed nodes
  • Maintain consistency: Ensure data correctness

3. Fault Tolerance Techniques

1. Retry Mechanism

Automatically retry failed requests, suitable for transient faults.

```go
// Simple retry example
func Retry(op func() error, attempts int) error {
    for i := 0; i < attempts; i++ {
        if err := op(); err == nil {
            return nil
        }
        time.Sleep(time.Millisecond * 100)
    }
    return errors.New("all retries failed")
}
```
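
A fixed 100 ms delay can amplify load spikes when many callers fail at once. A common refinement is exponential backoff with random jitter; the sketch below uses the standard time, math/rand, and errors packages:

```go
// RetryWithBackoff retries op with exponentially growing, jittered delays,
// which spreads retries out and helps avoid thundering-herd effects.
func RetryWithBackoff(op func() error, attempts int) error {
    delay := 100 * time.Millisecond
    for i := 0; i < attempts; i++ {
        if err := op(); err == nil {
            return nil
        }
        // Sleep for the base delay plus up to 100% random jitter, then double it.
        jitter := time.Duration(rand.Int63n(int64(delay)))
        time.Sleep(delay + jitter)
        delay *= 2
    }
    return errors.New("all retries failed")
}
```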

2. Checkpoint

Periodically save system state to reduce recovery workload.

Checkpoint Illustration:

Running State ----> [Save Snapshot] ----> New State
    ↑                             |
    |-----------------------------|
      Recovery starts from snapshot

3. Failover

Automatically switch to standby nodes to ensure continuous service.

Failover Process:

Primary Node Failure
       ↓
Monitoring Detects
       ↓
Standby Node Takes Over
       ↓
Service Restored
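
A minimal detection-and-switchover loop corresponding to the diagram above might look like the following, where ping and promoteStandby are hypothetical helpers supplied by the surrounding system:

```go
// monitorPrimary pings the primary periodically and promotes the standby
// after several consecutive failures.
func monitorPrimary(primary, standby string) {
    failures := 0
    for {
        if err := ping(primary); err != nil {
            failures++
        } else {
            failures = 0
        }
        if failures >= 3 {
            promoteStandby(standby) // switch traffic to the standby node
            return
        }
        time.Sleep(time.Second)
    }
}
```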

4. High Availability and Service Level Agreement (SLA)

1. Availability Metric

  • Availability = (Uptime) / (Total Time)
  • Common targets: 99.9% (“three nines”) availability equals roughly 8.7 hours downtime per year

2. SLA Definition

SLA specifies quality and availability commitments, including response and recovery times.

| SLA Metric | Description | Example |
| --- | --- | --- |
| Availability | Percentage uptime | 99.9% |
| Response Time | Max time for a request | Within 100 ms |
| Recovery Time | Time to recover from failure | Within 5 minutes |

5. Practical Observations and Debugging Tips

  • Monitoring Systems: Real-time health detection and alerting
  • Log Analysis: Trace fault causes and bottlenecks
  • Fault Injection: Simulate failures to verify resilience
  • Recovery Drills: Regular failover process testing

6. Terminology Mapping Table

| Everyday Term | Technical Term | Explanation |
| --- | --- | --- |
| Backup Engine | Standby Node | Server that takes over when the primary fails |
| Repair Plane | Fault Recovery | Restoring the system to normal operation |
| Retry Attempt | Retry Mechanism | Automatic request resending on failure |
| Safety Net | Checkpoint | Periodic system snapshot |

7. Thought Exercises and Practice

  • How to design retry strategies to avoid cascading failures?
  • How do checkpoints and logs coordinate during recovery?
  • Implement a simple failover detection and switchover module.

8. Conclusion: The Engineering Wisdom of Fault Tolerance and High Availability

Fault tolerance techniques and high availability design form the foundation of business continuity in distributed systems. Understanding fault models, leveraging retries and checkpoints wisely, and designing reasonable failover and SLA agreements are essential skills for every distributed systems engineer.

Practical Raft: A Deep Dive into Distributed Replicated Log Systems
HeWen · 2025-07-28 · https://ehewen.com/blog/MIT6.824-05

1. Everyday Analogy: Team “Leader” Election and Task Synchronization

Imagine a project team that needs to select a leader through voting. The leader then assigns tasks to ensure everyone executes according to plan. Raft is such a “democratic” mechanism that guarantees multiple nodes coordinate consistently.


2. Core Design of Raft

1. Roles and States

Node Roles:
- Leader: Handles client requests and manages log replication
- Follower: Passively accepts commands from the leader
- Candidate: Competes to become the leader

2. Election Mechanism

  • Each follower waits for a randomized election timeout before becoming a candidate
  • Candidates request votes; the one with majority votes becomes leader
  • Leader periodically sends heartbeats (AppendEntries RPC) to prevent new elections

3. Log Replication

  • Leader receives client commands and appends them to its log
  • Leader replicates logs concurrently to all followers
  • Once a log entry is written by a majority, it is committed and applied to the state machine

3. Detailed Workflow

Raft Workflow:

Client Request
    ↓
Leader appends entry to log
    ↓
Sends AppendEntries RPC concurrently to Followers
    ↓
Followers write logs and respond success
    ↓
Leader confirms majority success, commits logs
    ↓
Apply to state machine

4. Core Code Examples (Go)

1. Election Timeout Triggering Election

```go
// checkElectionTimeout is called periodically; if this node is not the leader
// and no heartbeat has arrived within the election timeout, it starts an
// election. (It is named differently from the rf.electionTimeout duration
// field, since Go does not allow a field and a method to share a name.)
func (rf *Raft) checkElectionTimeout() {
    rf.mu.Lock()
    defer rf.mu.Unlock()
    if rf.role != Leader && time.Since(rf.lastHeartbeat) > rf.electionTimeout {
        rf.startElection()
    }
}
```
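
checkElectionTimeout only has an effect if something invokes it regularly; a typical driver is a long-lived goroutine started when the node boots (the 10 ms polling period is illustrative):

```go
// ticker periodically checks whether this node should start an election.
func (rf *Raft) ticker() {
    for {
        rf.checkElectionTimeout()
        time.Sleep(10 * time.Millisecond)
    }
}
```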

2. Sending Vote Requests

```go
func (rf *Raft) startElection() {
    rf.currentTerm++
    rf.role = Candidate
    rf.votedFor = rf.me
    votes := 1
    var mu sync.Mutex // protects votes, which is updated from several goroutines
    for _, peer := range rf.peers {
        if peer == rf.me {
            continue
        }
        go func(p int) {
            if rf.sendRequestVote(p) {
                mu.Lock()
                defer mu.Unlock()
                votes++
                if votes == len(rf.peers)/2+1 { // reached a majority, exactly once
                    rf.becomeLeader()
                }
            }
        }(peer)
    }
}
```

3. Appending Log Entries

```go
func (rf *Raft) AppendEntries(args *AppendEntriesArgs, reply *AppendEntriesReply) {
    rf.mu.Lock()
    defer rf.mu.Unlock()
    if args.Term < rf.currentTerm {
        reply.Success = false // reject requests from stale leaders
        return
    }
    rf.lastHeartbeat = time.Now()
    rf.role = Follower
    rf.currentTerm = args.Term
    // Simplified: a complete implementation must first check that the log
    // matches prevLogIndex/prevLogTerm and truncate any conflicting suffix
    // before appending the new entries.
    rf.log = append(rf.log, args.Entries...)
    reply.Success = true
}
```
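
On the leader's side, authority is maintained by periodically broadcasting empty AppendEntries RPCs, as described above. A simplified heartbeat loop, assuming a sendAppendEntries wrapper around the RPC, might be:

```go
// broadcastHeartbeats sends empty AppendEntries RPCs to every follower while
// this node remains leader; sendAppendEntries is an assumed RPC wrapper.
func (rf *Raft) broadcastHeartbeats() {
    for {
        rf.mu.Lock()
        if rf.role != Leader {
            rf.mu.Unlock()
            return
        }
        args := &AppendEntriesArgs{Term: rf.currentTerm, Entries: nil} // empty entries act as a heartbeat
        rf.mu.Unlock()

        for _, peer := range rf.peers {
            if peer == rf.me {
                continue
            }
            go rf.sendAppendEntries(peer, args)
        }
        time.Sleep(100 * time.Millisecond) // well below the election timeout
    }
}
```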

5. Debugging Tips and Practical Experience

  • Simulate network delays and partitions to test election stability
  • Monitor log consistency to avoid log loss or out-of-order entries
  • Use Go’s race detector to catch race conditions
  • Add detailed logs for state transitions to diagnose role changes

6. Terminology Mapping Table

| Everyday Term | Technical Term | Explanation |
| --- | --- | --- |
| Team Leader | Leader | Manages logs and commands the cluster |
| Team Member | Follower | Receives and executes leader commands |
| Candidate | Candidate | Runs for leadership |
| Vote | RequestVote RPC | Message requesting votes |
| Heartbeat | AppendEntries RPC | Leader's periodic authority message |

7. Thought Exercises and Practice

  • How does Raft prevent multiple leaders during network partitions?
  • Design log compaction and snapshot mechanisms to improve performance.
  • Implement AppendEntries RPC with retries and timeout handling.

8. Conclusion: Master Distributed Consensus with Raft

Raft’s clear role definitions and workflows make it a cornerstone of distributed system consistency. Understanding and implementing Raft is key to mastering distributed log replication and fault tolerance design.

Demystifying Distributed Consistency: CAP Theorem and Raft Algorithm Explained
HeWen · 2025-07-27 · https://ehewen.com/blog/MIT6.824-04

1. Everyday Analogy: “Agreement” in Team Collaboration

Imagine a group of friends planning a trip but living in different cities. Messages have delays and some may lose connection, so opinions may not be unified. The consistency challenge in distributed systems is similar: how to get multiple machines to “agree” even under unreliable networks is crucial.


2. Consistency Models and the CAP Theorem

1. Overview of Consistency Models

| Model | Description | Everyday Analogy |
| --- | --- | --- |
| Strong Consistency | All nodes see the latest data instantly | Everyone receives the updated plan simultaneously |
| Eventual Consistency | Data eventually syncs but may be temporarily inconsistent | Some get the plan earlier, others later |
| Weak Consistency | No guarantee of synchronization; states may diverge for long periods | Everyone has a different travel plan |

2. CAP Theorem Trade-offs

The CAP theorem states that a distributed system cannot simultaneously guarantee Consistency (C), Availability (A), and Partition tolerance (P): when a partition occurs, at most two of the three can be preserved.

CAP Theorem Diagram:

     Consistency (C)
        / \
       /   \
Availability (A) — Partition tolerance (P)

| Trade-off | Representative Systems | Use Cases |
| --- | --- | --- |
| CA | Single-node databases | Stable network, no partitions |
| CP | ZooKeeper | Systems needing strong consistency |
| AP | Dynamo, Cassandra | Highly available, eventually consistent systems |

3. Replica Mechanisms and Data Consistency

Replication improves reliability and performance, but keeping replicas consistent is challenging. Common replication methods:

  • Primary-Backup: Primary handles writes; backups asynchronously sync
  • Multi-Master: Multiple writable nodes, complex conflict resolution

Consistency guarantees rely on consensus algorithms to synchronize logs and state.


4. Core Consistency Protocol: Raft Algorithm Explained

Raft is known for simplicity and divides nodes into three roles:

Raft Roles:

Leader       Followers        Candidate
  ↑              ↑               ↑
  | ←——— Election Process ———→ |

1. Leader Election

  • All nodes start as Followers
  • After election timeout, a node becomes Candidate and requests votes
  • Gains majority votes to become Leader

2. Log Replication

  • Leader receives client commands and appends them to its log
  • Concurrently replicates logs to Followers
  • Commits logs after majority acknowledgement, updates state machine

3. Safety and Fault Tolerance

  • Ensures log consistency and prevents split-brain
  • Uses term numbers to prevent outdated leaders from committing logs
  • Handles network partitions and node failures

```go
// Raft log append pseudocode example
func (rf *Raft) AppendEntries(args *AppendEntriesArgs, reply *AppendEntriesReply) {
    rf.mu.Lock()
    defer rf.mu.Unlock()
    if args.Term < rf.currentTerm {
        reply.Success = false
        return
    }
    rf.log = append(rf.log, args.Entries...)
    reply.Success = true
}
```

5. Practical Tips for Observation and Debugging

  • Observe leader election and heartbeat via logs
  • Test fault tolerance by simulating network partitions
  • Use Go debugging tools like Delve to trace state changes

6. Terminology Mapping Table

| Everyday Term | Technical Term | Explanation |
| --- | --- | --- |
| Meeting Host | Leader | Coordinates log replication and state updates |
| Attendees | Follower | Receives leader commands and stays in sync |
| Candidate | Candidate | Initiates an election to become leader |
| Voting | Vote | Mechanism for electing the leader |

7. Thought Chain and Exercises

  • How does the CAP theorem guide real-world system design?
  • How does Raft prevent split-brain scenarios?
  • Implement a simplified Raft supporting election and log replication.

8. Conclusion: Protect Distributed Data Consistency with Raft

Distributed consistency is the cornerstone for stable system operation. The CAP theorem helps understand design trade-offs, and the Raft algorithm provides a clear and practical implementation path. Mastering these is a key step toward becoming a distributed systems expert.

MapReduce in Practice: An Introduction to Distributed Big Data Processing
HeWen · 2025-07-27 · https://ehewen.com/blog/MIT6.824-03

1. Analogy: Efficient Collaboration in a Distributed Kitchen

Imagine a large kitchen tasked with preparing thousands of dishes. If one chef does all the work, efficiency suffers. MapReduce works like dividing the chefs: some chop ingredients (Map), others cook the dishes (Reduce), and finally all dishes are served efficiently and orderly.


2. MapReduce Framework Design and Principles

MapReduce consists of two phases:

  • Map phase: Input data is split into independent chunks, each processed separately to generate a series of <key, value> pairs
  • Reduce phase: Data with the same key is aggregated and processed to produce the final results

This design inherently supports parallelism and fault tolerance.

MapReduce Flowchart:

Input Data
   ↓ Split into chunks
[Map Task 1] [Map Task 2] ... [Map Task N]
   ↓ Produce intermediate <key,value> pairs
Shuffle phase (group by key)
   ↓
[Reduce Task 1] [Reduce Task 2] ... [Reduce Task M]
   ↓ Aggregate processing
Final Results

3. Core Go Implementation: Writing Map and Reduce Functions

1. Example Map Function

Suppose counting word occurrences in a text; the Map function splits text into words and outputs key-value pairs for each word.

```go
func Map(filename string, contents string) []KeyValue {
    // Split text by whitespace into words
    words := strings.Fields(contents)
    kva := []KeyValue{}
    for _, w := range words {
        kva = append(kva, KeyValue{Key: w, Value: "1"})
    }
    return kva
}
```

2. Example Reduce Function

The Reduce function receives all values for a word and sums them up.

```go
func Reduce(key string, values []string) string {
    count := 0
    for range values {
        // Each value is "1"; count occurrences
        count += 1
    }
    return strconv.Itoa(count)
}
```
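
To see how the two functions fit together, here is a tiny sequential driver that performs the shuffle step in memory, grouping intermediate pairs by key before calling Reduce on each group (a sketch; a real framework distributes this work across machines):

```go
// sequentialRun applies Map to one input, groups the intermediate pairs by
// key (the shuffle step), and then calls Reduce once per key.
func sequentialRun(filename, contents string) map[string]string {
    intermediate := map[string][]string{}
    for _, kv := range Map(filename, contents) {
        intermediate[kv.Key] = append(intermediate[kv.Key], kv.Value)
    }

    results := map[string]string{}
    for key, values := range intermediate {
        results[key] = Reduce(key, values)
    }
    return results
}
```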

4. Basic Methods for Parallel Data Processing

  • Input splitting: Large files are divided into chunks, distributed to multiple Map tasks
  • Shuffle phase: Map outputs are grouped by key and sent to Reduce tasks
  • Concurrent execution: Map and Reduce tasks run in parallel across multiple machines or threads, improving throughput
  • Fault tolerance: Failed tasks can be restarted, ensuring final correctness

5. Practical Tips for Observation and Debugging

  • Local debugging: Use Go’s built-in test framework to verify Map and Reduce correctness
  • Log printing: Track anomalies during data processing
  • Simulate failures: Intentionally cause task failures to test fault tolerance
  • Performance monitoring: Observe execution time and optimize data chunk sizes

6. Terminology Mapping Table

| Everyday Term | Technical Term | Explanation |
| --- | --- | --- |
| Chef who chops | Map function | Processes data splits and generates intermediate results |
| Chef who cooks | Reduce function | Aggregates intermediate data and produces final results |
| Kitchen section | Data chunk | Input data split into multiple processing units |
| Serving process | Shuffle | Transfer of intermediate data from Map to Reduce |

7. Thinking and Exercises

  • How to design Map functions to accommodate different data types and aggregation needs?
  • How to implement complex aggregation operations in Reduce?
  • Design a simple word frequency program handling multiple text inputs, validating parallelism effectiveness.

8. Summary: MapReduce Makes Big Data Processing Accessible

By splitting tasks and executing in parallel, MapReduce greatly improves big data processing efficiency and reliability. Mastering Map and Reduce function design is the first step to understanding distributed computing and lays a solid foundation for learning distributed consistency and fault tolerance.

Distributed Communication Essentials: RPC and an Introduction to Go Concurrency
HeWen · 2025-07-26 · https://ehewen.com/blog/MIT6.824-02

## 1. Opening the Magic Box of Distributed Communication: What is RPC?

In distributed systems, different machines need to “talk” to each other to collaborate. RPC (Remote Procedure Call) is a magical technique that lets you call a remote service as if it were a local function.

### Everyday Analogy

Imagine you’re cooking at home but want to use your neighbor’s oven. You call them (RPC) to ask for help baking. Though not under the same roof, you can give instructions as if it’s your own kitchen.


## 2. Core Mechanism of RPC

The key to RPC is to pack a local function call into a request message, send it across the network, and unpack the response on the way back so the caller sees an ordinary return value. The main components are:

  • Client Stub: Packages function calls and sends requests
  • Server Stub: Receives requests and calls the actual implementation
  • Transport Protocol: Ensures safe and reliable data transfer over the network

RPC Call Flow Diagram:

```
Client Application
    ↓ calls local function
Client Stub
    ↓ encodes and sends request
Network Transport
    ↓ decodes request
Server Stub
    ↓ calls real service function
Returns result
```

---

## 3. Introduction to Go: A Tool for Concurrency and Networking

Go is popular for distributed development due to its simplicity, efficiency, and built-in concurrency support.

### 1. Basic Go Syntax Recap

```go
// Simple function example
func Add(a, b int) int {
    return a + b
}
```

### 2. goroutine: Lightweight Threads

Go implements concurrency with goroutines, supporting tens of thousands without issue.

```go
go func() {
    fmt.Println("Hello from goroutine")
}()
```

### 3. Channel: Safe Communication Pipelines

Goroutines communicate via channels, avoiding complex locks by passing messages.

```go
ch := make(chan int)
go func() {
    ch <- 42  // send data
}()
val := <-ch   // receive data
```

## 4. Combining RPC and Go Concurrency: Designing Efficient Distributed Communication

Go’s goroutines and channels simplify and enhance RPC implementations:

  • Each RPC request handled in a separate goroutine, naturally supporting concurrency
  • Channels can be used for asynchronous messaging and event notification
  • The built-in net/rpc package abstracts serialization and transport, easing development

## 5. Example: Simple RPC Server and Client in Go

```go
// Server: providing an addition service
type Args struct {
    A, B int
}

type Arith struct{}

func (a *Arith) Add(args *Args, reply *int) error {
    *reply = args.A + args.B
    return nil
}

func main() {
    arith := new(Arith)
    rpc.Register(arith)
    listener, _ := net.Listen("tcp", ":1234")
    for {
        conn, _ := listener.Accept()
        go rpc.ServeConn(conn) // one goroutine per connection
    }
}
```

```go
// Client call example
client, _ := rpc.Dial("tcp", "localhost:1234")
args := &Args{A: 10, B: 20}
var reply int
client.Call("Arith.Add", args, &reply)
fmt.Println("Result:", reply)
```
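
The exercises below ask for timeout and retry support. With net/rpc this can be built on client.Go, which returns an asynchronous call handle; the sketch below (using the standard time and errors packages) shows the idea, with the retry count and timeout chosen arbitrarily:

```go
// CallWithTimeout retries a call a few times, giving up on each attempt
// after a deadline. Note: a timed-out attempt may still complete later and
// write into reply; a production version should use a fresh reply per attempt.
func CallWithTimeout(client *rpc.Client, method string, args, reply interface{}) error {
    var err error
    for attempt := 0; attempt < 3; attempt++ {
        call := client.Go(method, args, reply, nil) // asynchronous invocation
        select {
        case <-call.Done:
            if call.Error == nil {
                return nil
            }
            err = call.Error
        case <-time.After(200 * time.Millisecond):
            err = errors.New("rpc timeout: " + method)
        }
    }
    return err
}
```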

## 6. Debugging and Performance Optimization Tips

  • Use Delve to debug goroutine scheduling and deadlocks
  • Capture packets with tcpdump to analyze RPC request details
  • Control the number of goroutines to avoid resource exhaustion
  • Use connection pooling to reuse TCP connections and reduce latency

## 7. Terminology Mapping Table

| Everyday Expression | Technical Term | Explanation |
| --- | --- | --- |
| Making a Phone Call | RPC | Remote, cross-machine function invocation mechanism |
| Courier | Stub | Proxy component packaging and receiving requests |
| Lightweight Knight | goroutine | Lightweight thread enabling efficient concurrency |
| Pipeline | Channel | Safe communication mechanism between goroutines |

## 8. Thinking and Exercises

  • How does RPC ensure reliability and order of calls?
  • How does Go’s concurrency model avoid pitfalls of traditional threads?
  • Implement an RPC client supporting timeout and retry mechanisms.

## 9. Summary: Making RPC and Go the Powerful Engines of Distributed Systems

RPC connects the “nerves” of distributed systems, while Go’s concurrency makes those “nerves” efficient and stable. Mastering both lets you build flexible and robust distributed communication systems.
