Zulfa Falah

Software Engineer

RAFT Consensus Protocol

The goal of RAFT is to ensure all nodes in a distributed system agree on the same data state, even when some nodes fail, experience delays, or get disconnected from the network.

RAFT aims to achieve a "Single Source of Truth": one unified source of truth for data across the entire cluster.


Three Main Components

  1. Leader Election

    • At most one node can be the leader in a given term (period).
    • A follower that receives no heartbeat within its election timeout → becomes a candidate → starts an election.
    • A candidate that collects votes from a majority of nodes becomes the leader.
  2. Log Replication

    • The leader receives commands (e.g., write new data).
    • The leader replicates each new log entry to all followers.
    • After a majority of nodes acknowledge (ACK) the entry, it is considered committed.
    • All nodes eventually hold the same committed entries in the same order.
  3. Safety & Fault Tolerance

    • Decisions that have been "committed" will not change even if the leader dies.
    • A new node joining can catch up from the leader's log.
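
The log replication steps above can be sketched in a few lines of Python. This is a minimal illustration, not a full Raft implementation: the Follower and Leader classes and their method names are hypothetical, and real Raft also checks the previous log index and term, retries failed followers, and only commits entries from the leader's current term directly.

```python
from dataclasses import dataclass, field


@dataclass
class Follower:
    """Hypothetical follower that stores entries and may be unreachable."""
    reachable: bool = True
    log: list = field(default_factory=list)

    def append(self, entry: str) -> bool:
        # Returns True as an ACK; an unreachable follower never answers.
        if not self.reachable:
            return False
        self.log.append(entry)
        return True


@dataclass
class Leader:
    followers: list
    log: list = field(default_factory=list)
    commit_index: int = 0  # number of entries known to be committed

    def replicate(self, entry: str) -> bool:
        """Append locally, send to followers, commit on majority ACK."""
        self.log.append(entry)
        acks = 1  # the leader counts itself
        for follower in self.followers:
            if follower.append(entry):
                acks += 1
        cluster_size = len(self.followers) + 1
        if acks > cluster_size // 2:
            self.commit_index = len(self.log)
            return True
        return False


# 5-node cluster: the leader plus 4 followers, 2 of which are down.
cluster = [Follower(), Follower(), Follower(reachable=False), Follower(reachable=False)]
leader = Leader(cluster)
print(leader.replicate("set x=1"))  # True: 3 of 5 nodes hold the entry
```

With two followers down the entry still commits, because the leader and the two reachable followers form a majority of 5.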

Basic Mechanism

Role        Responsibility
Leader      Receives requests from clients, sends heartbeats, replicates logs
Follower    Waits for commands from the leader, votes during elections
Candidate   Nominates itself as leader if it doesn't receive a heartbeat
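
To make the voting responsibility concrete, here is a rough sketch of how a node might answer a vote request and how a candidate might count the replies. The Node class, handle_vote_request, and run_election are hypothetical names chosen for illustration. The sketch assumes the Raft rule that a node grants at most one vote per term, and it deliberately omits the check that the candidate's log is at least as up to date as the voter's.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Node:
    node_id: str
    current_term: int = 0
    voted_for: Optional[str] = None

    def handle_vote_request(self, candidate_id: str, candidate_term: int) -> bool:
        """Grant at most one vote per term (log up-to-date check omitted)."""
        if candidate_term < self.current_term:
            return False  # stale candidate from an older term
        if candidate_term > self.current_term:
            # Newer term observed: adopt it and forget any earlier vote.
            self.current_term = candidate_term
            self.voted_for = None
        if self.voted_for in (None, candidate_id):
            self.voted_for = candidate_id
            return True
        return False


def run_election(candidate: Node, peers: list) -> bool:
    """Candidate starts a new term, votes for itself, and asks peers for votes."""
    candidate.current_term += 1
    candidate.voted_for = candidate.node_id
    votes = 1
    for peer in peers:
        if peer.handle_vote_request(candidate.node_id, candidate.current_term):
            votes += 1
    cluster_size = len(peers) + 1
    return votes > cluster_size // 2
```

Because each node remembers voted_for for the current term, two candidates competing in the same term cannot both collect a majority.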

Simple RAFT Cycle

Follower → (timeout) → Candidate → (wins voting) → Leader
Leader → (dies / disconnects) → Other followers time out → New election
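
The cycle above can also be written as a small state machine. This is only a sketch: the event names are made up for illustration, and the step-down transitions (a candidate or leader returning to follower when it sees a valid leader or a higher term) come from the Raft specification rather than from the two lines above.

```python
from enum import Enum


class Role(Enum):
    FOLLOWER = "follower"
    CANDIDATE = "candidate"
    LEADER = "leader"


# (current role, event) -> next role. Event names are illustrative only.
TRANSITIONS = {
    (Role.FOLLOWER, "election_timeout"): Role.CANDIDATE,
    (Role.CANDIDATE, "won_majority"): Role.LEADER,
    (Role.CANDIDATE, "election_timeout"): Role.CANDIDATE,  # split vote, retry
    (Role.CANDIDATE, "saw_leader"): Role.FOLLOWER,
    (Role.CANDIDATE, "saw_higher_term"): Role.FOLLOWER,
    (Role.LEADER, "saw_higher_term"): Role.FOLLOWER,
}


def step(role: Role, event: str) -> Role:
    # Unknown (role, event) pairs leave the role unchanged.
    return TRANSITIONS.get((role, event), role)


role = Role.FOLLOWER
for event in ("election_timeout", "won_majority"):
    role = step(role, event)
print(role)  # Role.LEADER
```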

Key Terms

Term            Meaning
Term            A time period in Raft; each election starts a new term
Heartbeat       A routine message from the leader to followers to confirm it's still alive
Majority Vote   More than half of the total nodes (floor(N/2) + 1), required for consensus
Commit Log      Log entries that have been acknowledged by a majority and are guaranteed not to be lost or changed
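
As a quick worked example of the majority rule, the snippet below computes the quorum size and the number of node failures a cluster can tolerate. The function names are hypothetical; the arithmetic is simply "more than half of the nodes".

```python
def majority(cluster_size: int) -> int:
    """Smallest number of nodes that is more than half of the cluster."""
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size: int) -> int:
    """How many nodes can fail while a majority remains reachable."""
    return cluster_size - majority(cluster_size)


for n in (3, 4, 5):
    print(n, majority(n), tolerated_failures(n))
# 3 2 1
# 4 3 1
# 5 3 2
```

Note that a 4-node cluster tolerates no more failures than a 3-node one, which is why Raft clusters are usually deployed with an odd number of nodes.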

Systems That Use Raft

  • TiDB / TiKV (PingCAP)
  • etcd (used by Kubernetes)
  • Consul (HashiCorp)
  • CockroachDB, rqlite

Simple Analogy

Imagine 5 people in a meeting (nodes).

  • One person (leader) leads the decisions.
  • If the leader leaves the room → the others elect a new leader.
  • Every new decision must be approved by at least 3 people (majority) to be valid.
  • Everyone records the same meeting notes (log replication).

"Raft maintains one leader, majority agreement, and identical logs across all nodes."

Ref: https://raft.github.io/