Distributed Systems Topics: Consistent Hashing and Cache Consistency Explained
HeWen · 2025-08-01 · https://ehewen.com/blog/MIT6.824-11

## 1. Everyday Analogy: Parcel Sorting and Pickup Stores

Imagine a courier company handling tens of thousands of parcels—how do they assign them to different sorting centers? And how do pickup stores keep their inventory up to date, avoiding customers receiving “expired” goods? Consistent hashing and cache consistency in distributed systems solve similar challenges of “allocation” and “synchronization.”


## 2. Consistent Hashing and Data Distribution

### 1. Concept of Consistent Hashing

Consistent hashing maps both data and nodes onto a virtual ring, storing data on the first node clockwise from the data point. This design minimizes data migration when nodes are dynamically added or removed.

Consistent Hash Ring Illustration:

```
[Node A]---[Node B]----[Node C]---[Node D]---(Ring Structure)
        ↑                  ↑
       Data X             Data Y
```

### 2. Advantages

- Smooth and efficient scaling up and down
- Minimizes data movement
- Well-suited for caching systems and distributed storage

---

## 3. Distributed Cache and Cache Consistency

### 1. Introduction to Distributed Cache

A distributed cache keeps hot data in memory across multiple nodes to speed up responses; Memcached and Redis Cluster are common examples.

### 2. Challenges of Cache Consistency

- **Cache update latency**: Cache not refreshed promptly after data changes
- **Stale data risk**: Clients read expired cache entries
- **Concurrent update conflicts**: Simultaneous writes can leave the cache and the database out of sync

### 3. Common Solutions

| Strategy              | Description                                                     | Suitable Scenario                            |
| --------------------- | --------------------------------------------------------------- | -------------------------------------------- |
| Cache Aside           | Update database first, then delete cache                        | Simple, widely used                          |
| Write Through         | Write synchronously to cache and database                       | Read-heavy, write-light workloads            |
| Write Back            | Delay writing back to database                                  | Write-heavy scenarios for better performance |
| TTL & Version Control | Use expiration time and version numbers to maintain consistency | Avoid stale data and cache avalanches        |
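
The Cache Aside strategy in the first row is simple enough to sketch directly. The snippet below is a minimal illustration, assuming hypothetical `cache` and `db` helpers and a `userTTL` expiration constant:

```go
// Read path: try the cache first, fall back to the database and refill.
func GetUser(id string) (string, error) {
    if v, ok := cache.Get(id); ok {
        return v, nil
    }
    v, err := db.QueryUser(id)
    if err != nil {
        return "", err
    }
    cache.Set(id, v, userTTL) // refill with an expiration time
    return v, nil
}

// Write path: update the database first, then delete (invalidate) the cache entry.
func UpdateUser(id, value string) error {
    if err := db.UpdateUser(id, value); err != nil {
        return err
    }
    cache.Delete(id) // the next read repopulates the cache from the database
    return nil
}
```

Deleting rather than updating the cache on writes keeps the pattern simple and reduces the chance of concurrent writers storing stale values.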

---

## 4. Distributed File Systems (DFS) Overview

### 1. Purpose

Enable massive file sharing across multiple machines, e.g., Google File System (GFS), Hadoop Distributed File System (HDFS).

### 2. Key Design Points

- File chunking and replica management
- Metadata services (e.g., HDFS NameNode, ZooKeeper)
- Fault tolerance and load balancing

---

## 5. Go Language Example: Simple Consistent Hash Algorithm

```go
// HashRing stores the nodes' hash positions sorted clockwise on a virtual
// ring; building and sorting hr.hashes is assumed to happen when nodes are
// added. Requires the "hash/fnv" and "sort" packages.
type HashRing struct {
    hashes []uint32          // sorted hash positions of the nodes
    nodes  map[uint32]string // hash position -> node name
}

// GetNode returns the first node clockwise from the key's position on the ring.
func (hr *HashRing) GetNode(key string) string {
    h := fnv.New32a()
    h.Write([]byte(key))
    hash := h.Sum32()
    idx := sort.Search(len(hr.hashes), func(i int) bool { return hr.hashes[i] >= hash })
    if idx == len(hr.hashes) { // wrapped past the largest position on the ring
        idx = 0
    }
    return hr.nodes[hr.hashes[idx]]
}
```

---

## 6. Debugging and Practical Advice

- Use monitoring tools to observe cache hit rates and expiration
- Simulate dynamic node joins/leaves to verify consistent hashing migration efficiency
- Design reasonable cache expiration and update mechanisms for consistency

---

## 7. Terminology Mapping Table

| Everyday Term   | Technical Term     | Description                                  |
| --------------- | ------------------ | -------------------------------------------- |
| Parcel Sorting  | Consistent Hashing | Efficient data allocation algorithm          |
| Pickup Stores   | Distributed Cache  | Multi-node caching system for hot data       |
| Parcel Tracking | Metadata Service   | Component managing file locations and states |

---

## 8. Thought Exercises and Practice

- How does consistent hashing reduce data migration caused by node changes?
- Design cache expiration policies to prevent cache avalanches.
- Implement a simple metadata management module for a distributed file system.

---

## 9. Conclusion: The "Soft Power" Design of Distributed Systems

Consistent hashing, distributed caching, and file systems form the backbone of distributed systems. Mastering these technologies helps build more stable and efficient distributed applications.
Distributed Transaction Processing: Ensuring Cross-Node Data Consistency
HeWen · 2025-08-01 · https://ehewen.com/blog/MIT6.824-10

1. Everyday Analogy: Group House Purchase and Escrow Accounts

Imagine several people jointly buying a house. Each person must first deposit funds into an escrow account. Only when all funds are confirmed does the transaction complete; if someone backs out or funds are insufficient, the entire deal is canceled. Distributed transactions solve similar “all-or-nothing” operations across multiple nodes.


2. Basic Concepts and Properties of Transactions

| Transaction Property (ACID) | Explanation | Everyday Analogy |
| --- | --- | --- |
| Atomicity | All operations within a transaction either succeed completely or fail completely | All funds in the group purchase are either fully deposited or the deal is void |
| Consistency | The system remains in a valid state before and after the transaction | The ledger numbers are accurate and correct |
| Isolation | Concurrent transactions do not interfere with each other | Multiple people sign contracts simultaneously without impact |
| Durability | Results of a completed transaction are permanently saved | Property registration is completed and cannot be lost |

3. Challenges of Distributed Transaction Processing

  • Network delays and failures: Communications may timeout or drop packets
  • Partial node crashes: Some participants may be unavailable, making decisions difficult
  • Coordination difficulty: Multiple nodes must unanimously “agree” or “reject”
  • Blocking issues: Participants may be blocked waiting for coordinator instructions for a long time

4. Two-Phase Commit Protocol (2PC)

Phase 1: Prepare
Coordinator ------> Participants: Request to prepare to commit
Participants ------> Coordinator: Respond ready (Yes/No)

Phase 2: Commit/Rollback
Coordinator ------> Participants: Global commit or rollback command
Participants ------> Coordinator: Acknowledge completion

Advantages

  • Simple to implement, guarantees atomic commit

Disadvantages

  • Blocking: If coordinator crashes, participants wait indefinitely
  • Single point of failure risk

5. Three-Phase Commit Protocol (3PC)

Phase 1: CanCommit?
Coordinator ------> Participants: Ask if they can commit
Participants ------> Coordinator: Reply Yes/No

Phase 2: PreCommit
Coordinator ------> Participants: Notify pre-commit
Participants ------> Coordinator: Confirm receipt

Phase 3: DoCommit
Coordinator ------> Participants: Final commit or rollback
Participants ------> Coordinator: Acknowledge completion

Advantages

  • Reduces blocking, participants can make decisions if coordinator fails
  • Improves fault tolerance

Disadvantages

  • More complex protocol, higher implementation cost
  • Still affected by network partitioning

6. Go Language Simple Example: Coordinator Logic for Two-Phase Commit

```go
func coordinatorCommit(participants []Participant) bool {
    // Phase 1: Prepare
    for _, p := range participants {
        if !p.Prepare() {
            // Some participant rejects, rollback all
            for _, p2 := range participants {
                p2.Rollback()
            }
            return false
        }
    }
    // Phase 2: Commit
    for _, p := range participants {
        p.Commit()
    }
    return true
}
```
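
The coordinator above drives each participant through prepare, commit, and rollback calls; a minimal interface matching that usage (a sketch inferred from the code above) is:

```go
// Participant is the contract the 2PC coordinator relies on. Prepare returns
// true only if the participant can guarantee a later Commit will succeed;
// Commit applies the prepared work and Rollback discards it.
type Participant interface {
    Prepare() bool
    Commit()
    Rollback()
}
```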

7. Thought Exercises and Practice

  • How can the blocking problem of 2PC be improved?
  • Implement a simple 3PC simulation to observe failure recovery flow.
  • Explore distributed transaction implementations based on Raft consensus.

8. Conclusion: The Art of Trade-offs in Distributed Transactions

Distributed transactions guarantee atomicity and consistency across nodes but come with complex coordination and fault handling challenges. Understanding and wisely choosing protocols like 2PC and 3PC is foundational for building strongly consistent distributed systems.

Sharded Key-Value Store in Practice: Design and Implementation
HeWen · 2025-07-30 · https://ehewen.com/blog/MIT6.824-09

1. Everyday Analogy: Division of Labor and Ledger Partitioning

Imagine a group managing a massive ledger; handling it alone is difficult and error-prone. They decide to split the ledger into multiple parts, each managed by different people while coordinating their work. This reduces individual burden and ensures data consistency. Sharded key-value stores similarly split large data across nodes for efficient cooperation.


2. System Goals and Challenges

  • Shard management: Partition data reasonably to evenly distribute load
  • Request routing: Client requests accurately target the corresponding shard
  • Data replication and fault tolerance: Ensure data reliability and prevent single points of failure
  • Dynamic scaling and migration: Support shard adjustments while maintaining system stability

3. Architecture Overview and Workflow

Overall Architecture:

Client
   ↓ Request shard mapping
Shard Controller (manages shard mappings)
   ↓ Specifies target shard
Shard Servers (shard node clusters)
   ↓ Data storage and replication

Request Flow:

Client
   └── Queries Shard Controller for shard info
        └── Sends request to specific Shard Server
            └── Read/write operations

4. Key Design Points

1. Shard Mapping Management

  • Maintain a mapping table recording which shard each key belongs to
  • Use consistent hashing or range partitioning for mapping

2. Request Routing Strategy

  • Client or proxy first accesses shard controller to get shard info
  • Requests are routed directly to the corresponding shard server to reduce forwarding

3. Shard Data Replication

  • Use Raft within each shard to guarantee consistency and fault tolerance
  • Multi-replica mechanism ensures data durability when nodes fail

4. Shard Migration and Scaling

  • When new nodes join, coordinate migration of partial data from old nodes
  • Ensure data consistency and availability during migration

5. Key Code Examples (Go)

1. Get Shard Number (Hash Function)

```go
func key2shard(key string, shardCount int) int {
    h := fnv.New32a()
    h.Write([]byte(key))
    return int(h.Sum32()) % shardCount
}
```

2. Client Requests Shard Controller for Routing Info

```go
func (client *Clerk) QueryShard(key string) int {
    shard := key2shard(key, client.shardCount)
    return client.config.Shards[shard] // Returns the server ID for the shard
}
```
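
QueryShard reads the routing table from a configuration object. A minimal shape for that configuration, with illustrative field names rather than a fixed API, might be:

```go
const NShards = 10 // total number of shards (illustrative)

// Config is a sketch of the shard controller's mapping table.
type Config struct {
    Num    int              // configuration version number
    Shards [NShards]int     // shard index -> replica group that serves it
    Groups map[int][]string // replica group ID -> server addresses
}
```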

3. Shard Server Handles Write Request (Invoking Raft)

```go
func (kv *ShardKV) Put(args *PutArgs, reply *PutReply) {
    if !kv.rf.IsLeader() {
        reply.Err = ErrWrongLeader
        return
    }
    op := Op{Key: args.Key, Value: args.Value, Type: "Put"}
    index, _, isLeader := kv.rf.Start(op)
    if !isLeader {
        reply.Err = ErrWrongLeader
        return
    }
    kv.waitForCommit(index)
    reply.Err = OK
}
```
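
The handler calls waitForCommit, which is not shown. One common way to implement it is to keep a notification channel per log index that the server's apply loop signals once the entry is applied; the sketch below assumes such a `notifyCh` map and apply loop exist elsewhere in the server:

```go
// waitForCommit blocks until the entry at the given log index has been
// applied, or until a timeout expires (for example after a leadership change).
func (kv *ShardKV) waitForCommit(index int) {
    kv.mu.Lock()
    ch, ok := kv.notifyCh[index]
    if !ok {
        ch = make(chan struct{})
        kv.notifyCh[index] = ch
    }
    kv.mu.Unlock()

    select {
    case <-ch: // closed by the apply loop once the entry is applied
    case <-time.After(500 * time.Millisecond):
        // Give up after a while so a lost leadership does not block the RPC forever.
    }
}
```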

6. Debugging and Practical Tips

  • Simulate shard node dynamic join/leave to verify migration mechanism
  • Test cross-shard requests to ensure accurate routing
  • Stress test shard balancing to avoid hotspot nodes
  • Use logs and monitoring to track shard states

7. Terminology Mapping Table

| Everyday Term | Technical Term | Description |
| --- | --- | --- |
| Ledger Partition | Data Sharding | Splitting data into parts for distributed storage |
| Chief Accountant | Shard Controller | Manages shard info and routing rules |
| Ledger Manager | Shard Server | Server storing shard data |
| Ledger Migration | Shard Migration | Reallocation of data among nodes |

8. Thought Exercises and Practice

  • How to implement dynamic shard scaling without service interruption?
  • Design client-side shard mapping caching to reduce shard controller load.
  • Implement leader election and failure recovery for shard replicas.

9. Conclusion: The Path to Scalable Sharded Key-Value Stores

Sharded key-value systems combine shard management, load balancing, and Raft replication to deliver highly available and high-performance data services. Mastering these design principles and practical skills is key to building large-scale distributed storage.

Data Sharding and Load Balancing: The Scalability Boosters for Distributed Systems
HeWen · 2025-07-29 · https://ehewen.com/blog/MIT6.824-08

1. Everyday Analogy: Library “Categorized Shelves” and “Visitor Diversion”

Imagine a large library where all books are piled in one area — searching is slow and crowded. Instead, books are categorized and shelved separately (data sharding), and visitors are directed to different reading areas (load balancing), making library operations orderly and efficient.


2. Principles of Distributed Data Sharding and Partitioning

1. What is Data Sharding?

Splitting massive data into multiple “chunks,” each stored on different servers, reducing single-node pressure and enabling horizontal scaling.

Data Sharding Illustration:

Data Set
  ├── Shard 1
  ├── Shard 2
  ├── Shard 3
  └── ...

2. Partitioning Strategies

| Strategy | Description | Pros & Cons |
| --- | --- | --- |
| Range Partition | Data divided by key ranges | Fast range queries but risks data skew |
| Hash Partition | Key hash modulo shard count assigns the shard | Good load balance but no range queries |
| Consistent Hash | Dynamic shard adjustment for smooth scaling | High scalability, complex implementation |

3. Load Balancing Strategies and Algorithms

1. Goals of Load Balancing

  • Evenly distribute requests to avoid node overload
  • Dynamically adapt to node join/leave

2. Common Load Balancing Algorithms

| Algorithm | Description | Suitable Scenarios |
| --- | --- | --- |
| Round Robin | Requests distributed in turns | Nodes with balanced capacity; simple to implement |
| Weighted Round Robin | Requests allocated based on node weight | Adjusting load across heterogeneous nodes |
| Least Connections | Assign to the node with the fewest active connections | Applications with long-lived connections |
| Consistent Hash | Requests mapped by key hash to a node | Cache systems and distributed storage |
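
As a concrete illustration of the simplest entry in the table, a round-robin selector is just a counter over the node list (a minimal sketch; the node addresses are illustrative):

```go
// RoundRobin hands out nodes in turn and is safe for concurrent use.
type RoundRobin struct {
    mu    sync.Mutex
    next  int
    nodes []string
}

func (rr *RoundRobin) Pick() string {
    rr.mu.Lock()
    defer rr.mu.Unlock()
    node := rr.nodes[rr.next%len(rr.nodes)]
    rr.next++
    return node
}
```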

4. Data Replication and Migration Mechanisms

1. Necessity of Data Replication

  • Improve data reliability
  • Support read scalability

2. Migration Challenges

  • Ensure data consistency
  • Minimize service disruption risk

3. Migration Flow Illustration

Data Migration Process:

Original Shard Node              New Shard Node
       ↓                                  ↑
Read/Write Requests --> Data Sync --> Switch Access Path

5. Go Example: Simple Hash-based Sharding

```go
func getShard(key string, shardCount int) int {
    h := fnv.New32a()
    h.Write([]byte(key))
    return int(h.Sum32()) % shardCount
}
```

6. Debugging and Practical Tips

  • Monitor shard loads and adjust partitioning timely
  • Simulate node join/leave to test migration mechanisms
  • Observe request distribution to detect hotspots and bottlenecks

7. Terminology Mapping Table

| Everyday Term | Technical Term | Description |
| --- | --- | --- |
| Bookshelf Partition | Data Shard | Horizontally split storage unit |
| Librarian | Load Balancer | Component distributing requests |
| Book Relocation | Data Migration | Data reassignment among nodes |

8. Thought Exercises and Practice

  • How to design a dynamic scaling data sharding strategy?
  • How to combine load balancing with consistent hashing for seamless scaling?
  • Implement a simple sharding function and simulate request assignment.

9. Conclusion: Sharding and Load Balancing Bring Systems to Life

Effective data sharding and load balancing are key technologies for horizontal scaling in distributed systems. Mastering these methods helps systems stay robust and efficient amid exploding data and surging access demands.

Fault-tolerant Key-Value Store Based on Raft: Practical Analysis
HeWen · 2025-07-29 · https://ehewen.com/blog/MIT6.824-07

1. Everyday Analogy: Group Accounting, No Lost Ledgers

Imagine a team jointly managing a ledger, where everyone can record transactions anytime but must ensure everyone sees the latest and consistent accounts. The Raft-based fault-tolerant key-value store solves exactly this “ledger synchronization” problem.


2. System Design Goals

  • Fault Tolerance: Continue serving despite node failures
  • Consistency: All clients see synchronized data
  • High Performance: Fast response for read/write requests

3. Architecture and Core Workflow

Client Request Flow:

Client ----> Leader Node ----> Append Raft Log ----> Replicate Log to Followers ----> Commit Log ----> Update State Machine ----> Respond to Client
  • Client requests are received by the Leader
  • Leader packages operations into log entries appended to local log
  • Log entries are replicated concurrently to a majority of Followers
  • After commit, entries are applied to the key-value store state machine
  • Finally, Leader returns execution results to the client

4. Key Code Examples (Go)

1. Client Write Request Handling

```go
func (kv *KVServer) PutAppend(args *PutAppendArgs, reply *PutAppendReply) {
    if !kv.rf.IsLeader() {
        reply.Err = ErrWrongLeader
        return
    }

    op := Op{
        Key:   args.Key,
        Value: args.Value,
        Type:  args.Op, // "Put" or "Append"
    }

    index, _, isLeader := kv.rf.Start(op)
    if !isLeader {
        reply.Err = ErrWrongLeader
        return
    }

    // Wait for the log entry to be committed and applied. The server mutex
    // must not be held while waiting, otherwise the apply loop (which also
    // needs kv.mu) could never apply the entry and this handler would block.
    kv.waitForCommit(index)
    reply.Err = OK
}
```

2. State Machine Update (After Log Commit)

```go
func (kv *KVServer) applyCommand(cmd Op) {
    switch cmd.Type {
    case "Put":
        kv.store[cmd.Key] = cmd.Value
    case "Append":
        kv.store[cmd.Key] += cmd.Value
    }
}
```

5. Consistency Maintenance and Idempotency Design

  • Avoid duplicate execution: Track client request IDs to ensure each request executes only once (see the sketch below)
  • Read request handling: Usually served directly from Leader’s local state to ensure linearizability
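
A sketch of such duplicate detection in the apply step, assuming each client stamps its requests with a ClientId and an increasing SeqNum and the server keeps a lastApplied map (all illustrative additions), might extend the earlier Op and applyCommand like this:

```go
// Op extended with the client's identity and a per-client sequence number.
type Op struct {
    Key      string
    Value    string
    Type     string
    ClientId int64
    SeqNum   int64
}

// applyCommand skips operations that were already applied for this client.
func (kv *KVServer) applyCommand(cmd Op) {
    if last, ok := kv.lastApplied[cmd.ClientId]; ok && cmd.SeqNum <= last {
        return // duplicate request: its effect was already applied once
    }
    kv.lastApplied[cmd.ClientId] = cmd.SeqNum

    switch cmd.Type {
    case "Put":
        kv.store[cmd.Key] = cmd.Value
    case "Append":
        kv.store[cmd.Key] += cmd.Value
    }
}
```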

6. Debugging Tips and Practical Skills

  • Use Raft logs to trace request states
  • Simulate node crashes to verify fault recovery
  • Test repeated requests to ensure idempotency correctness
  • Use network delay simulation to locate system bottlenecks

7. Terminology Mapping Table

| Everyday Term | Technical Term | Explanation |
| --- | --- | --- |
| Ledger | Key-Value Store | Data structure storing key-value pairs |
| Accounting Action | Client Request | Write or append data operation |
| Meeting Resolution | Raft Log Commit | Achieving consensus and applying operations |
| Chairperson | Leader | Coordinates requests and log replication |

8. Thought Exercises and Practice

  • How to ensure the order consistency of concurrent write requests?
  • Design an idempotency mechanism to prevent duplicate request execution.
  • Extend the implementation to support snapshotting to avoid infinite log growth.

9. Conclusion: Protect Your Data Ledger with Raft

The fault-tolerant key-value store built on Raft uses distributed log replication and state machine application to achieve strong consistency and reliability. Understanding and mastering this design is a crucial step toward building production-grade distributed storage systems.

Fault Tolerance and High Availability: Building Stable Distributed Systems
HeWen · 2025-07-28 · https://ehewen.com/blog/MIT6.824-06

1. Everyday Analogy: Airplane “Failures” and Passenger Safety

Imagine a flight where the airplane might encounter engine failure or turbulence. To ensure safety, multiple backup systems and emergency plans are designed. Distributed systems face similar “failures,” and ensuring stable operation is a core design challenge.


2. Fault Models and Fault Handling

1. Common Fault Types

| Fault Type | Description | Analogy Example |
| --- | --- | --- |
| Node Failure | Server crash or shutdown | Airplane engine failure |
| Network Failure | Network partition, message loss or delay | Airplane communication cut-off |
| Software Bug | Program bug causing abnormal behavior | Flight system software flaw |
| Hardware Fault | Disk failure, memory error | Airplane instrument failure |

2. Fault Tolerance Goals

  • Detect faults: Quickly identify anomalies
  • Recover service: Replace or repair failed nodes
  • Maintain consistency: Ensure data correctness

3. Fault Tolerance Techniques

1. Retry Mechanism

Automatically retry failed requests, suitable for transient faults.

```go
// Simple retry example
func Retry(op func() error, attempts int) error {
    for i := 0; i < attempts; i++ {
        if err := op(); err == nil {
            return nil
        }
        time.Sleep(time.Millisecond * 100)
    }
    return errors.New("all retries failed")
}
```
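
A fixed 100 ms delay can amplify load spikes when many callers fail at once. A common refinement is exponential backoff with random jitter; the sketch below uses the standard time, math/rand, and errors packages:

```go
// RetryWithBackoff retries op with exponentially growing, jittered delays,
// which spreads retries out and helps avoid thundering-herd effects.
func RetryWithBackoff(op func() error, attempts int) error {
    delay := 100 * time.Millisecond
    for i := 0; i < attempts; i++ {
        if err := op(); err == nil {
            return nil
        }
        // Sleep for the base delay plus up to 100% random jitter, then double it.
        jitter := time.Duration(rand.Int63n(int64(delay)))
        time.Sleep(delay + jitter)
        delay *= 2
    }
    return errors.New("all retries failed")
}
```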

2. Checkpoint

Periodically save system state to reduce recovery workload.

Checkpoint Illustration:

Running State ----> [Save Snapshot] ----> New State
    ↑                             |
    |-----------------------------|
      Recovery starts from snapshot

3. Failover

Automatically switch to standby nodes to ensure continuous service.

Failover Process:

Primary Node Failure
       ↓
Monitoring Detects
       ↓
Standby Node Takes Over
       ↓
Service Restored
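
A minimal detection-and-switchover loop corresponding to the diagram above might look like the following, where ping and promoteStandby are hypothetical helpers supplied by the surrounding system:

```go
// monitorPrimary pings the primary periodically and promotes the standby
// after several consecutive failures.
func monitorPrimary(primary, standby string) {
    failures := 0
    for {
        if err := ping(primary); err != nil {
            failures++
        } else {
            failures = 0
        }
        if failures >= 3 {
            promoteStandby(standby) // switch traffic to the standby node
            return
        }
        time.Sleep(time.Second)
    }
}
```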

4. High Availability and Service Level Agreement (SLA)

1. Availability Metric

  • Availability = (Uptime) / (Total Time)
  • Common targets: 99.9% (“three nines”) availability equals roughly 8.7 hours downtime per year

2. SLA Definition

SLA specifies quality and availability commitments, including response and recovery times.

| SLA Metric | Description | Example |
| --- | --- | --- |
| Availability | Percentage uptime | 99.9% |
| Response Time | Max time for a request | Within 100 ms |
| Recovery Time | Time to recover from failure | Within 5 minutes |

5. Practical Observations and Debugging Tips

  • Monitoring Systems: Real-time health detection and alerting
  • Log Analysis: Trace fault causes and bottlenecks
  • Fault Injection: Simulate failures to verify resilience
  • Recovery Drills: Regular failover process testing

6. Terminology Mapping Table

| Everyday Term | Technical Term | Explanation |
| --- | --- | --- |
| Backup Engine | Standby Node | Server that takes over when the primary fails |
| Repair Plane | Fault Recovery | Restoring the system to normal operation |
| Retry Attempt | Retry Mechanism | Automatic request resending on failure |
| Safety Net | Checkpoint | Periodic system snapshot |

7. Thought Exercises and Practice

  • How to design retry strategies to avoid cascading failures?
  • How do checkpoints and logs coordinate during recovery?
  • Implement a simple failover detection and switchover module.

8. Conclusion: The Engineering Wisdom of Fault Tolerance and High Availability

Fault tolerance techniques and high availability design form the foundation of business continuity in distributed systems. Understanding fault models, leveraging retries and checkpoints wisely, and designing reasonable failover and SLA agreements are essential skills for every distributed systems engineer.

Practical Raft: A Deep Dive into Distributed Replicated Log Systems
HeWen · 2025-07-28 · https://ehewen.com/blog/MIT6.824-05

1. Everyday Analogy: Team “Leader” Election and Task Synchronization

Imagine a project team that needs to select a leader through voting. The leader then assigns tasks to ensure everyone executes according to plan. Raft is such a “democratic” mechanism that guarantees multiple nodes coordinate consistently.


2. Core Design of Raft

1. Roles and States

Node Roles:
- Leader: Handles client requests and manages log replication
- Follower: Passively accepts commands from the leader
- Candidate: Competes to become the leader

2. Election Mechanism

  • Each follower waits for a randomized election timeout before becoming a candidate
  • Candidates request votes; the one with majority votes becomes leader
  • Leader periodically sends heartbeats (AppendEntries RPC) to prevent new elections

3. Log Replication

  • Leader receives client commands and appends them to its log
  • Leader replicates logs concurrently to all followers
  • Once a log entry is written by a majority, it is committed and applied to the state machine

3. Detailed Workflow

Raft Workflow:

Client Request
    ↓
Leader appends entry to log
    ↓
Sends AppendEntries RPC concurrently to Followers
    ↓
Followers write logs and respond success
    ↓
Leader confirms majority success, commits logs
    ↓
Apply to state machine

4. Core Code Examples (Go)

1. Election Timeout Triggering Election

```go
// checkElectionTimeout is called periodically; if this node is not the leader
// and no heartbeat has arrived within the election timeout, it starts an
// election. (It is named differently from the rf.electionTimeout duration
// field, since Go does not allow a field and a method to share a name.)
func (rf *Raft) checkElectionTimeout() {
    rf.mu.Lock()
    defer rf.mu.Unlock()
    if rf.role != Leader && time.Since(rf.lastHeartbeat) > rf.electionTimeout {
        rf.startElection()
    }
}
```
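
checkElectionTimeout only has an effect if something invokes it regularly; a typical driver is a long-lived goroutine started when the node boots (the 10 ms polling period is illustrative):

```go
// ticker periodically checks whether this node should start an election.
func (rf *Raft) ticker() {
    for {
        rf.checkElectionTimeout()
        time.Sleep(10 * time.Millisecond)
    }
}
```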

2. Sending Vote Requests

```go
func (rf *Raft) startElection() {
    rf.currentTerm++
    rf.role = Candidate
    rf.votedFor = rf.me
    votes := 1
    var mu sync.Mutex // protects votes, which is updated from several goroutines
    for _, peer := range rf.peers {
        if peer == rf.me {
            continue
        }
        go func(p int) {
            if rf.sendRequestVote(p) {
                mu.Lock()
                defer mu.Unlock()
                votes++
                if votes == len(rf.peers)/2+1 { // reached a majority, exactly once
                    rf.becomeLeader()
                }
            }
        }(peer)
    }
}
```

3. Appending Log Entries

```go
func (rf *Raft) AppendEntries(args *AppendEntriesArgs, reply *AppendEntriesReply) {
    rf.mu.Lock()
    defer rf.mu.Unlock()
    if args.Term < rf.currentTerm {
        reply.Success = false // reject requests from stale leaders
        return
    }
    rf.lastHeartbeat = time.Now()
    rf.role = Follower
    rf.currentTerm = args.Term
    // Simplified: a complete implementation must first check that the log
    // matches prevLogIndex/prevLogTerm and truncate any conflicting suffix
    // before appending the new entries.
    rf.log = append(rf.log, args.Entries...)
    reply.Success = true
}
```
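
On the leader's side, authority is maintained by periodically broadcasting empty AppendEntries RPCs, as described above. A simplified heartbeat loop, assuming a sendAppendEntries wrapper around the RPC, might be:

```go
// broadcastHeartbeats sends empty AppendEntries RPCs to every follower while
// this node remains leader; sendAppendEntries is an assumed RPC wrapper.
func (rf *Raft) broadcastHeartbeats() {
    for {
        rf.mu.Lock()
        if rf.role != Leader {
            rf.mu.Unlock()
            return
        }
        args := &AppendEntriesArgs{Term: rf.currentTerm, Entries: nil} // empty entries act as a heartbeat
        rf.mu.Unlock()

        for _, peer := range rf.peers {
            if peer == rf.me {
                continue
            }
            go rf.sendAppendEntries(peer, args)
        }
        time.Sleep(100 * time.Millisecond) // well below the election timeout
    }
}
```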

5. Debugging Tips and Practical Experience

  • Simulate network delays and partitions to test election stability
  • Monitor log consistency to avoid log loss or out-of-order entries
  • Use Go’s race detector to catch race conditions
  • Add detailed logs for state transitions to diagnose role changes

6. Terminology Mapping Table

| Everyday Term | Technical Term | Explanation |
| --- | --- | --- |
| Team Leader | Leader | Manages logs and commands the cluster |
| Team Member | Follower | Receives and executes leader commands |
| Candidate | Candidate | Runs for leadership |
| Vote | RequestVote RPC | Message requesting votes |
| Heartbeat | AppendEntries RPC | Leader's periodic authority message |

7. Thought Exercises and Practice

  • How does Raft prevent multiple leaders during network partitions?
  • Design log compaction and snapshot mechanisms to improve performance.
  • Implement AppendEntries RPC with retries and timeout handling.

8. Conclusion: Master Distributed Consensus with Raft

Raft’s clear role definitions and workflows make it a cornerstone of distributed system consistency. Understanding and implementing Raft is key to mastering distributed log replication and fault tolerance design.

Demystifying Distributed Consistency: CAP Theorem and Raft Algorithm Explained
HeWen · 2025-07-27 · https://ehewen.com/blog/MIT6.824-04

1. Everyday Analogy: “Agreement” in Team Collaboration

Imagine a group of friends planning a trip but living in different cities. Messages have delays and some may lose connection, so opinions may not be unified. The consistency challenge in distributed systems is similar: how to get multiple machines to “agree” even under unreliable networks is crucial.


2. Consistency Models and the CAP Theorem

1. Overview of Consistency Models

| Model | Description | Everyday Analogy |
| --- | --- | --- |
| Strong Consistency | All nodes see the latest data instantly | Everyone receives the updated plan simultaneously |
| Eventual Consistency | Data eventually syncs but may be temporarily inconsistent | Some get the plan earlier, others later |
| Weak Consistency | No guarantee of synchronization; states may diverge for long periods | Everyone has a different travel plan |

2. CAP Theorem Trade-offs

The CAP theorem states that a distributed system cannot simultaneously guarantee Consistency (C), Availability (A), and Partition tolerance (P): when a partition occurs, at most two of the three can be preserved.

CAP Theorem Diagram:

     Consistency (C)
        / \
       /   \
Availability (A) — Partition tolerance (P)

| Trade-off | Representative Systems | Use Cases |
| --- | --- | --- |
| CA | Single-node databases | Stable network, no partitions |
| CP | ZooKeeper | Systems needing strong consistency |
| AP | Dynamo, Cassandra | Highly available, eventually consistent systems |

3. Replica Mechanisms and Data Consistency

Replication improves reliability and performance, but keeping replicas consistent is challenging. Common replication methods:

  • Primary-Backup: Primary handles writes; backups asynchronously sync
  • Multi-Master: Multiple writable nodes, complex conflict resolution

Consistency guarantees rely on consensus algorithms to synchronize logs and state.


4. Core Consistency Protocol: Raft Algorithm Explained

Raft is known for simplicity and divides nodes into three roles:

Raft Roles:

Leader       Followers        Candidate
  ↑              ↑               ↑
  | ←——— Election Process ———→ |

1. Leader Election

  • All nodes start as Followers
  • After election timeout, a node becomes Candidate and requests votes
  • Gains majority votes to become Leader

2. Log Replication

  • Leader receives client commands and appends them to its log
  • Concurrently replicates logs to Followers
  • Commits logs after majority acknowledgement, updates state machine

3. Safety and Fault Tolerance

  • Ensures log consistency and prevents split-brain
  • Uses term numbers to prevent outdated leaders from committing logs
  • Handles network partitions and node failures

```go
// Raft log append pseudocode example
func (rf *Raft) AppendEntries(args *AppendEntriesArgs, reply *AppendEntriesReply) {
    rf.mu.Lock()
    defer rf.mu.Unlock()
    if args.Term < rf.currentTerm {
        reply.Success = false
        return
    }
    rf.log = append(rf.log, args.Entries...)
    reply.Success = true
}
```

5. Practical Tips for Observation and Debugging

  • Observe leader election and heartbeat via logs
  • Test fault tolerance by simulating network partitions
  • Use Go debugging tools like Delve to trace state changes

6. Terminology Mapping Table

| Everyday Term | Technical Term | Explanation |
| --- | --- | --- |
| Meeting Host | Leader | Coordinates log replication and state updates |
| Attendees | Follower | Receives leader commands and stays in sync |
| Candidate | Candidate | Initiates an election to become leader |
| Voting | Vote | Mechanism for electing the leader |

7. Thought Chain and Exercises

  • How does the CAP theorem guide real-world system design?
  • How does Raft prevent split-brain scenarios?
  • Implement a simplified Raft supporting election and log replication.

8. Conclusion: Protect Distributed Data Consistency with Raft

Distributed consistency is the cornerstone for stable system operation. The CAP theorem helps understand design trade-offs, and the Raft algorithm provides a clear and practical implementation path. Mastering these is a key step toward becoming a distributed systems expert.

MapReduce in Practice: An Introduction to Distributed Big Data Processing
HeWen · 2025-07-27 · https://ehewen.com/blog/MIT6.824-03

1. Analogy: Efficient Collaboration in a Distributed Kitchen

Imagine a large kitchen tasked with preparing thousands of dishes. If one chef does all the work, efficiency suffers. MapReduce works like dividing the chefs: some chop ingredients (Map), others cook the dishes (Reduce), and finally all dishes are served efficiently and orderly.


2. MapReduce Framework Design and Principles

MapReduce consists of two phases:

  • Map phase: Input data is split into independent chunks, each processed separately to generate a series of <key, value> pairs
  • Reduce phase: Data with the same key is aggregated and processed to produce the final results

This design inherently supports parallelism and fault tolerance.

MapReduce Flowchart:

Input Data
   ↓ Split into chunks
[Map Task 1] [Map Task 2] ... [Map Task N]
   ↓ Produce intermediate <key,value> pairs
Shuffle phase (group by key)
   ↓
[Reduce Task 1] [Reduce Task 2] ... [Reduce Task M]
   ↓ Aggregate processing
Final Results

3. Core Go Implementation: Writing Map and Reduce Functions

1. Example Map Function

Suppose counting word occurrences in a text; the Map function splits text into words and outputs key-value pairs for each word.

```go
func Map(filename string, contents string) []KeyValue {
    // Split text by whitespace into words
    words := strings.Fields(contents)
    kva := []KeyValue{}
    for _, w := range words {
        kva = append(kva, KeyValue{Key: w, Value: "1"})
    }
    return kva
}
```

2. Example Reduce Function

The Reduce function receives all values for a word and sums them up.

```go
func Reduce(key string, values []string) string {
    count := 0
    for range values {
        // Each value is "1"; count occurrences
        count += 1
    }
    return strconv.Itoa(count)
}
```
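
To see how the two functions fit together, here is a tiny sequential driver that performs the shuffle step in memory, grouping intermediate pairs by key before calling Reduce on each group (a sketch; a real framework distributes this work across machines):

```go
// sequentialRun applies Map to one input, groups the intermediate pairs by
// key (the shuffle step), and then calls Reduce once per key.
func sequentialRun(filename, contents string) map[string]string {
    intermediate := map[string][]string{}
    for _, kv := range Map(filename, contents) {
        intermediate[kv.Key] = append(intermediate[kv.Key], kv.Value)
    }

    results := map[string]string{}
    for key, values := range intermediate {
        results[key] = Reduce(key, values)
    }
    return results
}
```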

4. Basic Methods for Parallel Data Processing

  • Input splitting: Large files are divided into chunks, distributed to multiple Map tasks
  • Shuffle phase: Map outputs are grouped by key and sent to Reduce tasks
  • Concurrent execution: Map and Reduce tasks run in parallel across multiple machines or threads, improving throughput
  • Fault tolerance: Failed tasks can be restarted, ensuring final correctness

5. Practical Tips for Observation and Debugging

  • Local debugging: Use Go’s built-in test framework to verify Map and Reduce correctness
  • Log printing: Track anomalies during data processing
  • Simulate failures: Intentionally cause task failures to test fault tolerance
  • Performance monitoring: Observe execution time and optimize data chunk sizes

6. Terminology Mapping Table

| Everyday Term | Technical Term | Explanation |
| --- | --- | --- |
| Chef who chops | Map function | Processes data splits and generates intermediate results |
| Chef who cooks | Reduce function | Aggregates intermediate data and produces final results |
| Kitchen section | Data chunk | Input data split into multiple processing units |
| Serving process | Shuffle | Transfer of intermediate data from Map to Reduce |

7. Thinking and Exercises

  • How to design Map functions to accommodate different data types and aggregation needs?
  • How to implement complex aggregation operations in Reduce?
  • Design a simple word frequency program handling multiple text inputs, validating parallelism effectiveness.

8. Summary: MapReduce Makes Big Data Processing Accessible

By splitting tasks and executing in parallel, MapReduce greatly improves big data processing efficiency and reliability. Mastering Map and Reduce function design is the first step to understanding distributed computing and lays a solid foundation for learning distributed consistency and fault tolerance.

Distributed Communication Essentials: RPC and an Introduction to Go Concurrency
HeWen · 2025-07-26 · https://ehewen.com/blog/MIT6.824-02

## 1. Opening the Magic Box of Distributed Communication: What is RPC?

In distributed systems, different machines need to “talk” to each other to collaborate. RPC (Remote Procedure Call) is a magical technique that lets you call a remote service as if it were a local function.

### Everyday Analogy

Imagine you’re cooking at home but want to use your neighbor’s oven. You call them (RPC) to ask for help baking. Though not under the same roof, you can give instructions as if it’s your own kitchen.


## 2. Core Mechanism of RPC

The key to RPC is to pack a local function call into a request message, send it across the network, and unpack the response on the way back so the caller sees an ordinary return value. The main components are:

  • Client Stub: Packages function calls and sends requests
  • Server Stub: Receives requests and calls the actual implementation
  • Transport Protocol: Ensures safe and reliable data transfer over the network

RPC Call Flow Diagram:

```
Client Application
    ↓ calls local function
Client Stub
    ↓ encodes and sends request
Network Transport
    ↓ decodes request
Server Stub
    ↓ calls real service function
Returns result
```

---

## 3. Introduction to Go: A Tool for Concurrency and Networking

Go is popular for distributed development due to its simplicity, efficiency, and built-in concurrency support.

### 1. Basic Go Syntax Recap

```go
// Simple function example
func Add(a, b int) int {
    return a + b
}
```

### 2. goroutine: Lightweight Threads

Go implements concurrency with goroutines, supporting tens of thousands without issue.

```go
go func() {
    fmt.Println("Hello from goroutine")
}()
```

### 3. Channel: Safe Communication Pipelines

Goroutines communicate via channels, avoiding complex locks by passing messages.

```go
ch := make(chan int)
go func() {
    ch <- 42  // send data
}()
val := <-ch   // receive data
```

## 4. Combining RPC and Go Concurrency: Designing Efficient Distributed Communication

Go’s goroutines and channels simplify and enhance RPC implementations:

  • Each RPC request handled in a separate goroutine, naturally supporting concurrency
  • Channels can be used for asynchronous messaging and event notification
  • The built-in net/rpc package abstracts serialization and transport, easing development

## 5. Example: Simple RPC Server and Client in Go

```go
// Server: providing an addition service
type Args struct {
    A, B int
}

type Arith struct{}

func (a *Arith) Add(args *Args, reply *int) error {
    *reply = args.A + args.B
    return nil
}

func main() {
    arith := new(Arith)
    rpc.Register(arith)
    listener, _ := net.Listen("tcp", ":1234")
    for {
        conn, _ := listener.Accept()
        go rpc.ServeConn(conn) // one goroutine per connection
    }
}
```

```go
// Client call example
client, _ := rpc.Dial("tcp", "localhost:1234")
args := &Args{A: 10, B: 20}
var reply int
client.Call("Arith.Add", args, &reply)
fmt.Println("Result:", reply)
```
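
The exercises below ask for timeout and retry support. With net/rpc this can be built on client.Go, which returns an asynchronous call handle; the sketch below (using the standard time and errors packages) shows the idea, with the retry count and timeout chosen arbitrarily:

```go
// CallWithTimeout retries a call a few times, giving up on each attempt
// after a deadline. Note: a timed-out attempt may still complete later and
// write into reply; a production version should use a fresh reply per attempt.
func CallWithTimeout(client *rpc.Client, method string, args, reply interface{}) error {
    var err error
    for attempt := 0; attempt < 3; attempt++ {
        call := client.Go(method, args, reply, nil) // asynchronous invocation
        select {
        case <-call.Done:
            if call.Error == nil {
                return nil
            }
            err = call.Error
        case <-time.After(200 * time.Millisecond):
            err = errors.New("rpc timeout: " + method)
        }
    }
    return err
}
```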

## 6. Debugging and Performance Optimization Tips

  • Use Delve to debug goroutine scheduling and deadlocks
  • Capture packets with tcpdump to analyze RPC request details
  • Control the number of goroutines to avoid resource exhaustion
  • Use connection pooling to reuse TCP connections and reduce latency

## 7. Terminology Mapping Table

| Everyday Expression | Technical Term | Explanation |
| --- | --- | --- |
| Making a Phone Call | RPC | Remote, cross-machine function invocation mechanism |
| Courier | Stub | Proxy component packaging and receiving requests |
| Lightweight Knight | goroutine | Lightweight thread enabling efficient concurrency |
| Pipeline | Channel | Safe communication mechanism between goroutines |

## 8. Thinking and Exercises

  • How does RPC ensure reliability and order of calls?
  • How does Go’s concurrency model avoid pitfalls of traditional threads?
  • Implement an RPC client supporting timeout and retry mechanisms.

## 9. Summary: Making RPC and Go the Powerful Engines of Distributed Systems

RPC connects the “nerves” of distributed systems, while Go’s concurrency makes those “nerves” efficient and stable. Mastering both lets you build flexible and robust distributed communication systems.
