"A no-nonsense, performant file syncing engine that thinks in diffs and sleeps when idle."
Tired of constant polling? So were we. Tired of syncing whole files just because 2 bytes changed? Same here.
Synclet is built to sync files precisely, using content-defined chunking, event-driven notifications, and peer-based snapshots, all without wasting cycles.
- 💤 No Polling – Uses `inotify` + `epoll` to only wake up when something changes.
- 🧩 Smart Chunk Sync – Breaks files into logical chunks using rolling hashes, syncing only what’s needed.
- 🧠 Snapshot Hashing – 32-byte global snapshot versioning detects even the smallest file change across machines (see the sketch after this list).
- 🔄 Two-Way Syncing – Automatically detects whether the peer is ahead or behind, then fetches or sends accordingly.
- 📦 Binary-safe Transfers – No corruption, even at weird chunk boundaries. Everything's binary-packed.
- 📂 Recursive Directory Tracking – Watches all nested directories efficiently.
- 🛠️ Protocol-Based Events – From `FILE_CREATE` to `MODIFIED_CHUNK`, your peer always knows what's happening.
- 📊 Progress Tracking – Visual indicators of how much of the sync has completed.
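
To make the snapshot hashing feature concrete, here is a minimal sketch of how a 32-byte snap-version could be derived: SHA-256 (which is 32 bytes) over a sorted list of per-file digests. The use of OpenSSL and the exact inputs to the hash are assumptions for illustration, not Synclet's actual recipe.

```cpp
// Sketch: derive a 32-byte snapshot version from per-file digests.
// Assumes OpenSSL; how each file's digest is computed is elided.
#include <openssl/sha.h>
#include <array>
#include <map>
#include <string>

// path -> hex digest of file contents (computed elsewhere)
using FileHashes = std::map<std::string, std::string>;

std::array<unsigned char, SHA256_DIGEST_LENGTH> snapshot_version(const FileHashes& files) {
    // std::map keeps paths sorted, so the snapshot is deterministic.
    std::string buf;
    for (const auto& [path, digest] : files) {
        buf += path;
        buf += '\0';      // separator so "a"+"bc" never collides with "ab"+"c"
        buf += digest;
        buf += '\0';
    }
    std::array<unsigned char, SHA256_DIGEST_LENGTH> out{};
    SHA256(reinterpret_cast<const unsigned char*>(buf.data()), buf.size(), out.data());
    return out;           // 32 bytes: the directory's snap-version
}
```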
Each file is chunked using rolling Rabin-Karp hashes:
- A window slides over bytes.
- If `hash % N == 0`, a chunk boundary is marked.
- SHA-256 ensures strong uniqueness of each chunk.
✅ CDC over fixed-size: Fixed-size fails for small inserts — everything shifts. Not here.
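
To make the boundary rule concrete, here is a minimal content-defined chunking sketch. The window size, base, modulus, and `N` are illustrative values rather than Synclet's actual parameters, and a real implementation would also enforce minimum/maximum chunk sizes.

```cpp
// Sketch: content-defined chunk boundaries with a Rabin-Karp style rolling hash.
// WINDOW, BASE, MOD, and N are illustrative, not Synclet's actual parameters.
#include <cstddef>
#include <cstdint>
#include <vector>

std::vector<size_t> chunk_boundaries(const std::vector<uint8_t>& data) {
    constexpr size_t   WINDOW = 48;              // bytes in the sliding window
    constexpr uint64_t BASE   = 257;
    constexpr uint64_t MOD    = 1000000007ULL;
    constexpr uint64_t N      = 4096;            // expected average chunk size

    std::vector<size_t> boundaries;
    if (data.size() < WINDOW) return boundaries;

    // Precompute BASE^(WINDOW-1) mod MOD so the oldest byte can be removed.
    uint64_t pow_w = 1;
    for (size_t i = 0; i + 1 < WINDOW; ++i) pow_w = (pow_w * BASE) % MOD;

    // Hash of the first window.
    uint64_t h = 0;
    for (size_t i = 0; i < WINDOW; ++i) h = (h * BASE + data[i]) % MOD;

    for (size_t i = WINDOW; i <= data.size(); ++i) {
        if (h % N == 0) boundaries.push_back(i);  // boundary after byte i-1
        if (i == data.size()) break;
        // Slide the window: drop data[i - WINDOW], add data[i].
        h = (h + MOD - (data[i - WINDOW] * pow_w) % MOD) % MOD;
        h = (h * BASE + data[i]) % MOD;
    }
    return boundaries;
}
```

Because boundaries depend only on local content, an insert early in the file shifts bytes but not the downstream cut points, so unchanged chunks keep their SHA-256 fingerprints.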
- Uses `inotify` + `epoll` to watch files/dirs without blocking.
- File events (`modify`, `create`, `move`, `delete`) are captured instantly, no polling.
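
A minimal sketch of this wiring: register one directory with inotify, then sleep in `epoll_wait` until the kernel reports a change. Synclet's recursive watch registration, rename pairing, and error handling are omitted here.

```cpp
// Sketch: wake up only when the kernel reports a change (no polling loop).
#include <sys/epoll.h>
#include <sys/inotify.h>
#include <unistd.h>
#include <cstdio>

int main() {
    int in_fd = inotify_init1(IN_NONBLOCK);
    inotify_add_watch(in_fd, "./watched",
                      IN_CREATE | IN_MODIFY | IN_DELETE | IN_MOVED_FROM | IN_MOVED_TO);

    int ep_fd = epoll_create1(0);
    epoll_event ev{};
    ev.events = EPOLLIN;
    ev.data.fd = in_fd;
    epoll_ctl(ep_fd, EPOLL_CTL_ADD, in_fd, &ev);

    alignas(inotify_event) char buf[4096];
    for (;;) {
        epoll_event ready[8];
        int n = epoll_wait(ep_fd, ready, 8, -1);   // sleeps until something happens
        for (int i = 0; i < n; ++i) {
            ssize_t len = read(ready[i].data.fd, buf, sizeof(buf));
            for (char* p = buf; p < buf + len;) {
                auto* e = reinterpret_cast<inotify_event*>(p);
                if (e->len) std::printf("event 0x%x on %s\n", e->mask, e->name);
                p += sizeof(inotify_event) + e->len;
            }
        }
    }
}
```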
The protocol isn't ad hoc. It's designed like this:
```text
TYPE: MODIFIED_CHUNK
PAYLOAD: {
  filename, offset, chunk_size, old_chunk_size, is_last_chunk
}
```

🔗 All operations are structured via typed messages + raw data when required.
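
As an illustration of a typed message followed by raw data, here is a sketch that serializes a `MODIFIED_CHUNK` header with `nlohmann/json` (used by the project, per the acknowledgments) and appends the chunk bytes. The length-prefixed framing is an assumption, not necessarily Synclet's exact wire format.

```cpp
// Sketch: JSON header with a length prefix, followed by the raw chunk bytes.
// The framing here is an assumption, not Synclet's exact wire format.
#include <nlohmann/json.hpp>
#include <cstdint>
#include <string>
#include <vector>

std::vector<uint8_t> encode_modified_chunk(const std::string& filename,
                                           uint64_t offset,
                                           const std::vector<uint8_t>& chunk,
                                           uint64_t old_chunk_size,
                                           bool is_last_chunk) {
    nlohmann::json header = {
        {"type", "MODIFIED_CHUNK"},
        {"payload", {
            {"filename", filename},
            {"offset", offset},
            {"chunk_size", chunk.size()},
            {"old_chunk_size", old_chunk_size},
            {"is_last_chunk", is_last_chunk},
        }},
    };
    std::string h = header.dump();

    std::vector<uint8_t> msg;
    uint32_t hlen = static_cast<uint32_t>(h.size());
    msg.insert(msg.end(), reinterpret_cast<uint8_t*>(&hlen),
               reinterpret_cast<uint8_t*>(&hlen) + sizeof(hlen));  // header length prefix
    msg.insert(msg.end(), h.begin(), h.end());                     // JSON header
    msg.insert(msg.end(), chunk.begin(), chunk.end());             // raw chunk bytes
    return msg;
}
```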
- Peers exchange `snap-version` hashes.
- If versions differ:
  - Only the required chunks/files are transferred.
  - Supports downloading missed changes or pushing local changes back.
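
A minimal sketch of the compare-then-act step implied by this flow; `SnapVersion`, `decide`, and the `local_is_behind` flag are placeholder names, not Synclet's actual API.

```cpp
// Sketch: decide whether to push or fetch based on snap-version comparison.
#include <array>
#include <cstdint>

using SnapVersion = std::array<uint8_t, 32>;      // the 32-byte snapshot hash

enum class SyncAction { InSync, Fetch, Push };

// local_is_behind would come from comparing snapshot metadata (which peer has
// files/chunks the other lacks); here it is simply a flag the caller supplies.
SyncAction decide(const SnapVersion& local, const SnapVersion& remote, bool local_is_behind) {
    if (local == remote) return SyncAction::InSync;  // nothing to do
    return local_is_behind ? SyncAction::Fetch       // download missed changes
                           : SyncAction::Push;       // send local changes
}
```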
```sh
make client   # Builds the client binary in ./client/output
make server   # Builds the server binary in ./server/output
```

See `dev-logs.txt` for more.
- ✅ Reduced polling overhead using `inotify` instead of file scan loops.
- ✅ Chunking optimization from char-by-char to buffer-based reads to reduce syscalls (see the sketch after this list).
- ✅ Smart rename detection using `cookie` and timestamps.
- ✅ Snap versioning to avoid sending the full snapshot every time.
- ✅ Progress bar for user feedback.
- ✅ File sanitization to avoid invalid chars in saved chunks.
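
The buffer-based read optimization above, sketched under assumptions: the file is read in 64 KiB blocks and bytes are fed to the chunker from memory, instead of one `read` per character. The callback shape and buffer size are illustrative.

```cpp
// Sketch: read in large blocks, then feed the rolling hash byte by byte from memory.
#include <cstdint>
#include <fstream>
#include <functional>
#include <string>
#include <vector>

void for_each_byte(const std::string& path, const std::function<void(uint8_t)>& on_byte) {
    std::ifstream in(path, std::ios::binary);
    std::vector<char> buf(64 * 1024);
    while (in.read(buf.data(), static_cast<std::streamsize>(buf.size())) || in.gcount() > 0) {
        std::streamsize got = in.gcount();           // one buffered read per 64 KiB block
        for (std::streamsize i = 0; i < got; ++i)
            on_byte(static_cast<uint8_t>(buf[i]));   // rolling hash consumes bytes here
    }
}
```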
- Event-driven I/O: Replaced polling-based file monitoring with `inotify` + `epoll` for CPU-efficient, non-blocking operations
- Content-defined chunking: Implemented Rabin-Karp rolling hash algorithm to detect logical chunk boundaries, enabling delta synchronization
- Snapshot versioning: Designed 32-byte hash-based versioning system to track directory states and detect changes across peers
- Binary-safe storage: Implemented custom binary format for chunk metadata and data to prevent corruption across all file types
- Smart rename detection: Developed timer-based pairing of `IN_MOVED_FROM`/`IN_MOVED_TO` events using 200ms windows and cookie matching
- Ordered reconstruction: Used ordered sets to ensure deterministic file reconstruction regardless of chunk arrival order (see the sketch after this list)
- Efficient I/O: Optimized from character-by-character to buffer-based reads to minimize system calls
- Bidirectional sync: Implemented comparison logic to determine whether to push local changes or fetch remote updates based on snapshot versions
- Progress visibility: Added real-time progress tracking for ongoing sync operations
- Recursive monitoring: Efficient tracking of nested directory structures with automatic watch registration
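
To illustrate the ordered reconstruction point, here is a sketch that keys received chunks by offset in an ordered container so the output is identical regardless of arrival order. The `Chunk` struct and the use of `std::map` are illustrative; the project itself describes an ordered set.

```cpp
// Sketch: chunks may arrive in any order; sorting them by offset makes
// reconstruction deterministic. Names here are illustrative.
#include <cstdint>
#include <fstream>
#include <map>
#include <string>
#include <vector>

struct Chunk {
    uint64_t offset;                 // position of the chunk within the file
    std::vector<uint8_t> data;
};

void reconstruct(const std::string& out_path, const std::vector<Chunk>& received) {
    std::map<uint64_t, const Chunk*> ordered;        // sorted by offset, not arrival
    for (const auto& c : received) ordered[c.offset] = &c;

    std::ofstream out(out_path, std::ios::binary);
    for (const auto& [offset, chunk] : ordered) {
        out.seekp(static_cast<std::streamoff>(offset));
        out.write(reinterpret_cast<const char*>(chunk->data.data()),
                  static_cast<std::streamsize>(chunk->data.size()));
    }
}
```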
All low-level strategies, from CDC algorithms to protocol structure, are in `blue-print.txt`. Here’s a glimpse:
- 🔍 Hybrid Chunk Hashing: Rolling hash to detect chunk boundaries, SHA-256 to fingerprint them.
- 🔐 Binary Chunk Storage: Metadata and data are stored in raw `.bin` files with custom structure.
- 📤 Ordered Set for Chunks: Ensures deterministic file reconstructions from random chunk delivery order.
- 🕵️ Rename detection: Paired `IN_MOVED_FROM`/`IN_MOVED_TO` using a 200ms time window (see the sketch below).
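
A minimal sketch of that cookie-plus-window pairing: an `IN_MOVED_FROM` is held for up to 200 ms and matched to an `IN_MOVED_TO` carrying the same inotify cookie; unmatched entries expire (e.g. on a `timerfd` tick) and fall back to delete/create handling. The data structures and function names are illustrative, not Synclet's actual code.

```cpp
// Sketch: pair IN_MOVED_FROM / IN_MOVED_TO by inotify cookie within 200 ms.
#include <chrono>
#include <cstdint>
#include <string>
#include <unordered_map>

using Clock = std::chrono::steady_clock;
constexpr auto RENAME_WINDOW = std::chrono::milliseconds(200);

struct PendingMove { std::string old_path; Clock::time_point seen; };
std::unordered_map<uint32_t, PendingMove> pending;   // keyed by inotify cookie

void on_moved_from(uint32_t cookie, const std::string& path) {
    pending[cookie] = {path, Clock::now()};
}

// Returns true if this IN_MOVED_TO completes a rename within the window.
bool on_moved_to(uint32_t cookie, const std::string& new_path, std::string& old_path_out) {
    auto it = pending.find(cookie);
    if (it == pending.end()) return false;           // file moved in from outside the tree
    old_path_out = it->second.old_path;
    pending.erase(it);
    return true;                                     // emit a rename instead of delete+create
}

// Called periodically (e.g. from a timerfd tick): expire unmatched moves.
void expire_pending(Clock::time_point now) {
    for (auto it = pending.begin(); it != pending.end();) {
        if (now - it->second.seen > RENAME_WINDOW) it = pending.erase(it);
        else ++it;
    }
}
```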
Engineers who hate noisy logs, wasted CPU cycles, or bloated tools — and want clarity, control, and clean syncing logic.
Thanks to:
- `inotify`, `epoll`, and `timerfd` – for event-driven efficiency.
- `nlohmann/json` – for no-fuss JSON serialization.
- Classic chunking papers + rsync for conceptual inspiration.