DrutaDB is an event-driven, single-threaded, in-memory key-value data store and real-time Pub/Sub engine written in C++23. It is built on POSIX sockets and I/O multiplexing, making it a fast, lightweight, Redis-compatible database.
- Event-Driven Concurrency: Utilizes the POSIX `poll()` API for I/O multiplexing. This allows the server to handle multiple concurrent TCP client connections efficiently on a single thread, without the context-switching overhead of a thread-per-connection model.
- Custom RESP Parser: Implements a state-machine-based parser for the Redis Serialization Protocol (RESP). The parser safely processes incoming network byte streams, handling TCP packet fragmentation and partial reads natively.
- Data Structures: The underlying datastore relies on standard C++ containers (`std::map`, `std::deque`, `std::string`, `std::set`) wrapped in a custom `DrutaValue` struct to support polymorphic data types.
- O(1) LRU Eviction: Implements a hash map + doubly linked list hybrid to track key recency. This architecture ensures that cache hits, updates, and evictions all occur in constant time, maintaining deterministic performance under high memory pressure.
- Connection Management: Maintains an active socket watchlist with O(1) connection teardown using back-swapping, minimizing latency during client disconnects.
- Lazy Eviction: Supports per-key Time-to-Live (TTL). Expired keys are lazily evaluated and evicted upon read attempts (`GET`).
- AOF Persistence & Durability: Implements Append-Only File (AOF) logging to ensure data durability across server restarts. Mutating commands are serialized into RESP format and committed to disk after successful execution.
- Smart AOF Rewriting: Features a growth-based trigger mechanism to prevent unbounded log growth. It compacts the current in-memory state into a minimal sequence of commands, using an atomic file-swap strategy to ensure zero data corruption.
- Single-Allocation Message Broadcasting: Implements a Pub/Sub engine where messages are serialized into RESP exactly once ($O(1)$ construction) and broadcast to all subscribers in $O(N)$.
- Resource Safety & Cleanup: Protects against resource exhaustion with a configurable subscription limit per client. Includes an automated cleanup policy that purges subscriber state upon connection teardown to prevent "ghost" subscriptions.
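To illustrate the partial-read handling mentioned above, here is a minimal, hypothetical sketch of an incremental RESP bulk-string parser. Names such as `RespBulkParser` are illustrative, not DrutaDB's actual API; the real parser is a fuller state machine covering all RESP frame types:

```cpp
// Incremental RESP bulk-string parsing that tolerates TCP fragmentation:
// bytes accumulate in a buffer, and a value is yielded only once a full
// frame ("$<len>\r\n<payload>\r\n") has arrived.
#include <cassert>
#include <cstddef>
#include <optional>
#include <string>
#include <string_view>

struct RespBulkParser {
    std::string buf;  // bytes accumulated across partial reads

    // Feed newly received bytes (may be an arbitrary slice of the stream).
    void feed(std::string_view bytes) { buf.append(bytes); }

    // Try to parse one bulk string; returns std::nullopt while incomplete.
    std::optional<std::string> next() {
        if (buf.empty() || buf[0] != '$') return std::nullopt;
        auto lf = buf.find("\r\n");
        if (lf == std::string::npos) return std::nullopt;   // length line not complete
        std::size_t len = std::stoul(buf.substr(1, lf - 1));
        std::size_t frame = lf + 2 + len + 2;               // header + payload + trailing CRLF
        if (buf.size() < frame) return std::nullopt;        // payload still in flight
        std::string out = buf.substr(lf + 2, len);
        buf.erase(0, frame);                                // consume the frame
        return out;
    }
};
```

The key property is that `next()` is safe to call after every `read()`, no matter where the kernel split the stream.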
DrutaDB supports the following data structures as values:

- STRINGS: String values implemented using `std::string`. Operations supported:
  - SET `<key> <val> [EX <ttl in seconds>]` (sets a string value; supports `EX` for a TTL in seconds and `PX` for milliseconds).
  - GET `<key>`
- LIST: Values stored in list format, imitating Redis lists. Operations supported:
  - RPUSH `<list_name> <element_1> <element_2> ...` (pushes onto the right/back of the list; initializes the list if it does not exist).
  - LPUSH `<list_name> <element_1> <element_2> ...` (pushes onto the left/front of the list; initializes the list if it does not exist).
  - LPOP `<list_name> [count]` (removes and returns elements from the front).
  - RPOP `<list_name> [count]` (removes and returns elements from the back).
  - LRANGE `<list_name> <start_index> <end_index>` (returns the elements within the given range).
  - LLEN `<list_name>` (returns the list length).
- HASH: Stores data in object-like format, imitating Redis hashes. Implemented using a dual data-structure policy: a hash is initialized as a `std::vector<std::pair<std::string, std::string>>` and maintained as a vector until it holds more than `DRUTA_MAX_KEY` (usually 64) keys, at which point it is converted internally into a `std::map<std::string, std::string>`. This limits the memory footprint of smaller hashes, inspired by Redis' listpack implementation.
  - HSET `<hash_name> <field_1> <value_1> <field_2> <value_2> ...` (sets fields in the hash, creating the hash if not present; the values are stored as `field_1: value_1 field_2: value_2` under the `hash_name` key).
  - HGET `<hash_name> <key>` (returns the value for a field).
  - HGETALL `<hash_name>` (returns all key-value pairs under `hash_name` in the format `<key1> <value1> <key2> <value2>`).
  - HDEL `<hash_name> <key_1> <key_2> ...` (deletes keys from the hash).
  - HLEN `<hash_name>` (returns the number of keys in the hash).
- SET: Stores unique data in set format, imitating Redis sets. Operations supported:
  - SADD `<set_name> <val_1> <val_2> ...` (creates a set or adds to an existing one).
  - SCARD `<set_name>` (returns the cardinality of the set).
  - SISMEMBER `<set_name> <key>` (checks whether an element is part of the set).
  - SMEMBERS `<set_name>` (returns all elements in the set).
  - SMISMEMBER `<set_name> <key1> <key2> ...` (checks whether multiple elements are part of the set).
  - SINTER `<set1> <set2> <set3> ...` (returns the intersection of the given sets).
  - SUNION `<set1> <set2> <set3> ...` (returns the union of the given sets).
  - SDIFF `<set1> <set2> <set3> ...` (returns the difference of the first set with the other sets).
  - SDEL `<val1> <val2> ...` (deletes values from the set).
- PUB/SUB: Real-time message broadcasting system. Operations supported:
  - SUBSCRIBE `<channel_1> <channel_2> ...` (subscribes the client to the specified channels).
  - UNSUBSCRIBE `[channel_1] [channel_2] ...` (unsubscribes the client from the given channels, or from all channels if none are specified).
  - PUBLISH `<channel> <message>` (posts a message to the given channel and returns the number of subscribers that received it).

Other commands which are supported:

- PING (returns `PONG`).
- ECHO `<message>` (returns `<message>`).
- DEL `<key_1> <key_2> ...` (deletes keys from the store).
- FLUSHDB `--sure` (removes all keys from the store; use the `--sure` flag for safer flushing).
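The HASH dual-representation policy described above can be sketched as follows. This is a simplified illustration, not DrutaDB's actual code: `SmallHash` and `kMaxVectorFields` are stand-ins for the real type and for `DRUTA_MAX_KEY`:

```cpp
// Small hashes live in a flat vector (cache-friendly, low overhead);
// once the field count crosses the threshold, the representation is
// promoted to a std::map for O(log N) lookups.
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <utility>
#include <variant>
#include <vector>

constexpr std::size_t kMaxVectorFields = 64;  // stand-in for DRUTA_MAX_KEY

struct SmallHash {
    using Vec = std::vector<std::pair<std::string, std::string>>;
    using Map = std::map<std::string, std::string>;
    std::variant<Vec, Map> rep;  // starts life as a flat vector

    void hset(const std::string& field, const std::string& value) {
        if (auto* v = std::get_if<Vec>(&rep)) {
            for (auto& [f, val] : *v)
                if (f == field) { val = value; return; }   // update in place
            v->emplace_back(field, value);
            if (v->size() > kMaxVectorFields) {            // promote to a map
                Map m(v->begin(), v->end());
                rep = std::move(m);
            }
            return;
        }
        std::get<Map>(rep)[field] = value;
    }

    const std::string* hget(const std::string& field) const {
        if (auto* v = std::get_if<Vec>(&rep)) {
            for (auto& [f, val] : *v)
                if (f == field) return &val;               // linear scan, tiny N
            return nullptr;
        }
        auto& m = std::get<Map>(rep);
        auto it = m.find(field);
        return it == m.end() ? nullptr : &it->second;
    }
};
```

The linear scan is cheap precisely because the vector representation is capped at a small, fixed size.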
DrutaDB ensures data persistence using an Append-Only File (AOF):

- Logging Strategy: Commands are logged to `data/drutadb.aof` in standard RESP format.
- State Recovery: During startup, the server replays the AOF through the internal `RespParser` to restore the in-memory state.
- Rewrite Trigger: To optimize disk space, the server triggers a rewrite when the AOF exceeds 10 KB and has doubled in size relative to the previous base.
- Crash Safety: The rewrite process generates a temporary file and utilizes `std::filesystem::rename` for an atomic swap, ensuring that a crash during the rewrite never results in data loss or corruption.
DrutaDB implements a memory-aware eviction policy designed for deterministic performance and strict memory bounds:

- $O(1)$ LRU Implementation: Utilizes a dual-layered architecture combining a `std::map` for key lookups and a custom doubly linked list for access tracking. This ensures that both cache hits (touches) and evictions occur in constant time.
- Physical Heap Accounting: Rather than tracking object counts, DrutaDB calculates actual heap consumption. It accounts for `std::string` capacity, `std::deque` internal node pointers (`sizeof(void*)`), and the overhead of the `DrutaNode` structure.
- Incremental Memory Updates: Memory deltas for complex types (like LIST) are calculated incrementally during mutation (`LPUSH`, `RPOP`). This avoids costly $O(N)$ full-container scans, maintaining high throughput for large collections.
- RAII-Based Ownership: Employs `std::unique_ptr` for primary node ownership within the `kv_store`. The LRU list maintains non-owning observer pointers, ensuring safe, leak-free teardowns during rapid eviction cycles.
- Memory-Triggered Eviction: Features a strict 64 MB ceiling (configurable). When exceeded, the database executes a synchronous "evict-until-safe" loop, purging the least recently used entries until the heap returns within safe bounds.
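The map + doubly-linked-list recency scheme can be sketched with standard containers. Note the substitutions: DrutaDB uses a custom list with `unique_ptr` ownership, whereas this illustration uses `std::unordered_map` plus `std::list` with `splice()`, which exhibits the same O(1) touch/evict mechanics:

```cpp
// LRU index: front of the list = most recently used; the map gives O(1)
// access to a key's list node, and splice() relinks nodes without
// allocation or iterator invalidation.
#include <cassert>
#include <cstddef>
#include <list>
#include <string>
#include <unordered_map>

class LruIndex {
    std::list<std::string> order_;  // front = most recent, back = LRU victim
    std::unordered_map<std::string, std::list<std::string>::iterator> pos_;
public:
    void touch(const std::string& key) {                        // O(1) hit or insert
        if (auto it = pos_.find(key); it != pos_.end())
            order_.splice(order_.begin(), order_, it->second);  // relink, no realloc
        else
            pos_[key] = order_.insert(order_.begin(), key);
    }
    std::string evict() {                                       // O(1): drop the tail
        std::string victim = order_.back();
        pos_.erase(victim);
        order_.pop_back();
        return victim;
    }
    std::size_t size() const { return pos_.size(); }
};
```

An "evict-until-safe" loop would simply call `evict()` repeatedly until the heap accounting falls back under the ceiling.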
DrutaDB features a high-performance messaging engine designed for real-time data broadcasting with a focus on resource safety and efficiency:

- Single-Allocation Broadcast: The engine serializes messages into RESP format exactly once per publish operation, achieving $O(1)$ message construction. It then iterates through the subscriber set and writes the pre-built buffer directly to each socket in $O(N)$ time, where N is the number of subscribers.
- Resource Protection (Limits): Implements a `MAX_SUBSCRIPTIONS_PER_CLIENT` limit (default: 128) to protect the server from heap-spraying attacks or runaway client scripts.
- Zero-Ghost Subscriber Policy: A dedicated cleanup handler is triggered on socket teardown. It performs a reverse lookup in the client's subscription set to purge all channel entries, ensuring that disconnected clients never leak memory or contaminate the broadcast loop.
- Idempotent State Management: Subscription logic verifies state before incrementing counters, ensuring that repeated `SUBSCRIBE` commands for the same channel are handled gracefully without inflating memory usage.
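The single-allocation broadcast path can be sketched as below. This is an illustrative reconstruction: the `send_fn` callback stands in for the real per-socket write, and the function names are hypothetical:

```cpp
// Build the RESP push frame once (O(1) in subscriber count), then fan the
// same buffer out to every subscriber (O(N)), with no per-subscriber
// serialization work.
#include <cassert>
#include <functional>
#include <string>
#include <vector>

std::string make_publish_frame(const std::string& channel, const std::string& msg) {
    auto bulk = [](const std::string& s) {
        return "$" + std::to_string(s.size()) + "\r\n" + s + "\r\n";
    };
    // RESP array: ["message", <channel>, <payload>]
    return "*3\r\n" + bulk("message") + bulk(channel) + bulk(msg);
}

int broadcast(const std::vector<int>& subscriber_fds, const std::string& frame,
              const std::function<void(int, const std::string&)>& send_fn) {
    for (int fd : subscriber_fds) send_fn(fd, frame);   // reuse the prebuilt buffer
    return static_cast<int>(subscriber_fds.size());     // PUBLISH reply: receiver count
}
```

Serializing once and reusing the buffer is what keeps publish latency flat as the subscriber count grows.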
DrutaDB is optimized for high-throughput, low-latency workloads. Below are representative results from the internal benchmarking suite (`test/benchmark.py`):
| Metric | Result |
|---|---|
| Single-client Sequential SET | ~21,000 req/s |
| Pipelined SET (x50 batch) | ~102,000 req/s |
| Concurrent Clients (50 clients) | ~30,500 req/s |
| Mixed Keyspace (70/30 Read/Write) | ~20,800 req/s |
| List Operations (RPUSH/LPOP) | ~20,000 req/s |
| Hash Operations (HSET/HGET) | ~19,000 req/s |
- `everysec` AOF Policy: Implements a time-buffered disk flush (every 1 second), maintaining a p99 tail latency of 0.31 ms and sustaining 10k+ req/s under concurrent load.
- TCP_NODELAY: Disables Nagle's algorithm to eliminate the delayed-ACK penalty, enabling 4.8x higher throughput for pipelined operations compared to the sequential baseline.
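Disabling Nagle's algorithm is a one-line `setsockopt` call on each accepted client socket; a minimal sketch (the helper name is illustrative):

```cpp
// Enable TCP_NODELAY so small RESP replies are sent immediately instead of
// being coalesced by Nagle's algorithm, which matters for pipelined clients.
#include <cassert>
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>
#include <unistd.h>

// Returns true if TCP_NODELAY was successfully enabled on the socket.
bool enable_nodelay(int fd) {
    int one = 1;
    return setsockopt(fd, IPPROTO_TCP, TCP_NODELAY, &one, sizeof(one)) == 0;
}
```

This is typically called right after `accept()`, before the socket joins the poll watchlist.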
> **Note**
> Benchmark Environment: Tests were conducted on a 12th Gen Intel(R) Core(TM) i5-1235U with peak throughput measured during Turbo Boost. Results may vary based on CPU thermal limits and background process interference.
Due to its reliance on standard POSIX APIs (`poll`, `socket`, `arpa/inet.h`), DrutaDB is highly portable and natively supported in the following environments:

- Linux
- macOS / FreeBSD
- Windows Subsystem for Linux (WSL)

(Note: While `poll()` provides excellent portability across Unix-like systems, scaling to extreme concurrent loads (e.g., C10K) would require swapping the multiplexer to OS-specific APIs such as Linux's `epoll` or macOS's `kqueue`.)
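The `poll()`-based readiness loop that the single-threaded design rests on can be sketched as follows. This is a simplified fragment, not the server's actual event loop; the helper name is illustrative, and a pipe stands in for client sockets in the test:

```cpp
// One thread blocks in poll() over the whole watchlist, then services only
// the descriptors that reported readiness — no thread per connection.
#include <cassert>
#include <poll.h>
#include <unistd.h>
#include <vector>

// Wait up to timeout_ms and return the fds that have data pending (POLLIN).
std::vector<int> readable_fds(std::vector<pollfd>& watchlist, int timeout_ms) {
    std::vector<int> ready;
    int n = poll(watchlist.data(), watchlist.size(), timeout_ms);
    if (n <= 0) return ready;                        // timeout or error: nothing to do
    for (const pollfd& p : watchlist)
        if (p.revents & POLLIN) ready.push_back(p.fd);
    return ready;
}
```

A real server loop would also watch the listening socket for new connections and handle `POLLHUP`/`POLLERR` for teardown.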
Requirements:

- CMake 3.13+
- A C++23-compatible compiler (e.g., GCC 13+)
- Pthreads

```shell
mkdir build
cd build
cmake ..
make
./drutadb
```

You can interact with DrutaDB using the standard redis-cli:
```shell
$ redis-cli -p 6379
127.0.0.1:6379> PING
PONG
127.0.0.1:6379> SET session "active" EX 5
OK
127.0.0.1:6379> GET session
"active"
127.0.0.1:6379> RPUSH mylist "item1" "item2"
(integer) 2
127.0.0.1:6379> LRANGE mylist 0 1
1) "item1"
2) "item2"
```
... and so on.

Shell scripts and a Python benchmark file for testing are included in the `test` folder. Run them from the root of the project:

```shell
./test/test.sh
./test/pubsub_test.sh
./test/eviction_test.sh
python3 test/benchmark.py
```

(For the initial shell script tests, you may first need to run `chmod +x <filename>`.)
This project is licensed under the MIT License - see the LICENSE file for details.
Contributions are welcome! Open issues or submit pull requests to improve the project!
The name of this project comes from the Sanskrit word *druta*, which means "fast" or "quick".