Currently the Raft log grows without bound, which makes memory usage unsustainable. A node that was offline for hours must receive potentially millions of log entries over the network via AppendEntries RPCs, and the persisted state in raft_state_*.gob also grows without limit.
Implement a snapshotting mechanism, as described in Section 7 of the Raft paper, to solve this.
In short,
- Periodically snapshot the KV state machine at a committed index
- Truncate log entries before that index
- Add an InstallSnapshot RPC so leaders can send snapshots to lagging followers instead of replaying the entire log
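
The three steps above could look roughly like the following. This is only a minimal sketch, not the project's actual code: the `Entry`, `Snapshot`, and `Log` types, their fields, and the `Compact` and `NeedsSnapshot` helpers are all hypothetical names chosen for illustration. It follows the Raft paper's convention that the snapshot covers everything up to and including `LastIncludedIndex`, so entries through that index are discarded.

```go
package main

import (
	"errors"
	"fmt"
)

// Entry is one replicated log record (hypothetical shape).
type Entry struct {
	Term    int
	Command string
}

// Snapshot captures the KV state machine as of LastIncludedIndex.
type Snapshot struct {
	LastIncludedIndex int
	LastIncludedTerm  int
	State             map[string]string
}

// Log keeps only the entries that follow the latest snapshot.
// entries[0] holds Raft index snap.LastIncludedIndex+1 (indices start at 1).
type Log struct {
	snap    Snapshot
	entries []Entry
}

// LastIndex returns the Raft index of the newest entry, counting compacted ones.
func (l *Log) LastIndex() int { return l.snap.LastIncludedIndex + len(l.entries) }

// Append adds an entry and returns its Raft index.
func (l *Log) Append(e Entry) int {
	l.entries = append(l.entries, e)
	return l.LastIndex()
}

// Compact records a snapshot of state at index and drops every entry up to
// and including index. Only committed entries may be snapshotted.
func (l *Log) Compact(index, commitIndex int, state map[string]string) error {
	if index > commitIndex || index > l.LastIndex() {
		return errors.New("cannot snapshot past the commit index")
	}
	if index <= l.snap.LastIncludedIndex {
		return errors.New("index already covered by a snapshot")
	}
	pos := index - l.snap.LastIncludedIndex - 1 // slice position of entry `index`
	term := l.entries[pos].Term

	// Copy the state so later writes to the live map do not alter the snapshot.
	copied := make(map[string]string, len(state))
	for k, v := range state {
		copied[k] = v
	}
	// Copy the tail so the old backing array can be garbage collected.
	l.entries = append([]Entry(nil), l.entries[pos+1:]...)
	l.snap = Snapshot{LastIncludedIndex: index, LastIncludedTerm: term, State: copied}
	return nil
}

// NeedsSnapshot reports whether a follower's nextIndex points at an entry that
// has been compacted away, in which case the leader would send InstallSnapshot
// instead of AppendEntries.
func (l *Log) NeedsSnapshot(nextIndex int) bool {
	return nextIndex <= l.snap.LastIncludedIndex
}

func main() {
	l := &Log{}
	for _, term := range []int{1, 1, 2, 2, 3} {
		l.Append(Entry{Term: term, Command: "set k v"})
	}
	if err := l.Compact(3, 4, map[string]string{"k": "v"}); err != nil {
		fmt.Println("compact failed:", err)
		return
	}
	fmt.Println("last index:", l.LastIndex(), "retained entries:", len(l.entries))
	fmt.Println("follower at nextIndex 2 needs snapshot:", l.NeedsSnapshot(2))
	fmt.Println("follower at nextIndex 4 needs snapshot:", l.NeedsSnapshot(4))
}
```

In a real implementation the snapshot would also need to be persisted alongside the trimmed log, and every place that indexes into the log would have to subtract the snapshot offset as `Compact` does here.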