A blazing fast, type-safe, and low-latency decoder for the Nasdaq TotalView-ITCH 5.0 binary protocol, written in modern C++.
- Processing Speed: ~594,777 messages/second
- Average Latency: 1.68 microseconds per message
- Tested Dataset: Successfully processed 1.2M+ messages
- ⚡ Zero-Copy Memory Mapping: Uses `mmap` for optimal file reading performance
- 🔄 Graceful Interruption: Handles Ctrl+C with clean statistics output
- 📊 CSV Export: Generates structured logs for analysis
- 🎯 CPU Cache Optimization: Implements prefetching for enhanced performance
- 🔍 Real-time Monitoring: Tracks and reports processing statistics
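
The graceful-interruption feature can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the names `stop_requested`, `on_sigint`, `install_sigint_handler`, and `process_until_stopped` are hypothetical, and the loop only counts iterations where a real decoder would parse messages.

```cpp
#include <csignal>
#include <cstddef>

namespace itch {

// Flag written by the signal handler and polled by the processing loop.
// volatile std::sig_atomic_t is the type the standard allows handlers to write.
inline volatile std::sig_atomic_t stop_requested = 0;

// Hypothetical handler: only sets the flag, so it stays async-signal-safe.
inline void on_sigint(int) { stop_requested = 1; }

// Registers the handler for Ctrl+C; returns false if registration failed.
inline bool install_sigint_handler() {
    return std::signal(SIGINT, on_sigint) != SIG_ERR;
}

// Stand-in for the decode loop: stops early once Ctrl+C has been seen,
// so the caller can print its statistics before exiting.
inline std::size_t process_until_stopped(std::size_t total_messages) {
    std::size_t processed = 0;
    while (processed < total_messages && !stop_requested) {
        ++processed;  // a real decoder would parse one message here
    }
    return processed;
}

}  // namespace itch
```

Keeping the handler down to a single flag write leaves all printing and cleanup in normal code, where it is safe to call I/O functions.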
This project is built for ultra-low latency and maximum safety, balancing:
- ⚡ Raw performance (via `memcpy`-based zero-copy decoding)
- 🔐 Type and memory safety (via compile-time layout checks)
- 🧩 Modularity (message-type-specific structs in a namespace)
- 🚀 Production-level reliability (used in real-time feed decoding contexts)
We built a generic `PackedParser<T>` that performs a single `std::memcpy` into a pre-defined struct.
- It's much faster than bit-by-bit manual parsing
- Prevents duplicated logic for every message type
- Great for high-throughput environments like HFT
`static_assert(std::is_trivially_copyable<T>::value);`

`static_assert(std::is_standard_layout<T>::value);`

✅ Prevents use of classes with virtual tables, dynamic memory, or complex constructors.
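
A minimal sketch of how such a parser could look is shown here. The interface (`parse` returning `std::optional<T>`) and the two-byte `DemoMessage` are assumptions made for illustration; the project's real `PackedParser` may be shaped differently.

```cpp
#include <cstddef>
#include <cstring>
#include <optional>
#include <type_traits>

namespace itch {

// Hypothetical sketch of a generic memcpy-based parser.
template <typename T>
struct PackedParser {
    static_assert(std::is_trivially_copyable<T>::value,
                  "T must be trivially copyable to be filled via memcpy");
    static_assert(std::is_standard_layout<T>::value,
                  "T must have a standard layout");

    // Copies sizeof(T) bytes from the buffer into a T.
    // Returns std::nullopt when the buffer is missing or too short.
    static std::optional<T> parse(const char* buffer, std::size_t length) {
        if (buffer == nullptr || length < sizeof(T)) {
            return std::nullopt;
        }
        T out;
        std::memcpy(&out, buffer, sizeof(T));
        return out;
    }
};

// Illustrative two-byte message; not an actual ITCH message layout.
#pragma pack(push, 1)
struct DemoMessage {
    char message_type;
    char code;
};
#pragma pack(pop)

}  // namespace itch
```

Calling `itch::PackedParser<itch::DemoMessage>::parse(buf, len)` on a two-byte buffer yields a filled struct, while a one-byte buffer yields `std::nullopt`. Instantiating the template with a type such as `std::string` fails at compile time because of the first `static_assert`.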
`#pragma pack(1)` ensures no compiler-inserted padding.
- Nasdaq ITCH defines exact byte-level layouts
- Even a single padding byte would corrupt your data
`static_assert(sizeof(...))` ensures compile-time correctness of struct layouts.
- Protects against accidental layout shifts
- Catches bugs early in development
- Verifies ITCH compliance
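
The packing and size checks can be illustrated with the 12-byte system event message. This is a sketch under assumptions: the field names follow the ITCH 5.0 specification, but the project's own struct may name or order its helpers differently, and `UnpackedSystemEvent` exists only to show what padding would do.

```cpp
#include <cstddef>
#include <cstdint>

namespace itch {

#pragma pack(push, 1)  // no padding between members
struct SystemEventMessage {
    char          message_type;     // 'S'
    std::uint16_t stock_locate;     // big-endian on the wire
    std::uint16_t tracking_number;  // big-endian on the wire
    std::uint8_t  timestamp[6];     // 48-bit nanoseconds since midnight
    char          event_code;
};
#pragma pack(pop)

// Fails the build if the layout ever drifts from the 12-byte wire format.
static_assert(sizeof(SystemEventMessage) == 12,
              "SystemEventMessage must match the 12-byte wire layout");
static_assert(offsetof(SystemEventMessage, timestamp) == 5,
              "timestamp must start at byte offset 5");

// Same fields without packing: the compiler is free to insert padding
// (this is usually 14 bytes on common x86-64 ABIs), which would shift
// every field after message_type.
struct UnpackedSystemEvent {
    char          message_type;
    std::uint16_t stock_locate;
    std::uint16_t tracking_number;
    std::uint8_t  timestamp[6];
    char          event_code;
};

}  // namespace itch
```

Note that multi-byte integer fields still hold big-endian wire bytes after the copy, so they need a byte swap on little-endian hosts before use.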
ITCH timestamps are 48-bit nanoseconds-since-midnight values.
- C++ has no native `uint48_t`
- `uint8_t[6]` preserves the raw bytes safely
- Later decoding can be done via a helper
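
One possible shape for that decoding helper is sketched here. The name `decode_timestamp` is hypothetical; the only assumption about the data is that the six bytes are big-endian, as ITCH integers are on the wire.

```cpp
#include <cstdint>

namespace itch {

// Hypothetical helper: assembles six big-endian bytes into a 64-bit
// count of nanoseconds since midnight.
inline std::uint64_t decode_timestamp(const std::uint8_t (&raw)[6]) {
    std::uint64_t ns = 0;
    for (std::uint8_t byte : raw) {
        ns = (ns << 8) | byte;  // shift in one byte at a time, MSB first
    }
    return ns;
}

}  // namespace itch
```

For example, the bytes `1F 1A CE D9 F0 00` decode to 34,200,000,000,000 ns, which is 09:30:00 after midnight.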
E.g., `char stock_symbol[8];`, `char reason[4];`
- Nasdaq strings are fixed-length and space-padded
- `std::string` is dynamically allocated and not trivially copyable (breaks `PackedParser`)
- `char[]` ensures a byte-perfect layout and no heap allocations
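
When a readable symbol is needed, the padding can be stripped without copying. The helper below is a sketch; `trim_field` is a hypothetical name, and the returned view is only valid while the underlying struct is alive.

```cpp
#include <cstddef>
#include <string_view>

namespace itch {

// Hypothetical helper: returns a view of a fixed-width, space-padded field
// with the trailing spaces removed. No allocation, no null terminator needed.
template <std::size_t N>
std::string_view trim_field(const char (&field)[N]) {
    std::size_t len = N;
    while (len > 0 && field[len - 1] == ' ') {
        --len;
    }
    return std::string_view(field, len);
}

}  // namespace itch
```

With `char stock_symbol[8]` holding `"AAPL    "`, `itch::trim_field(stock_symbol)` compares equal to `"AAPL"`.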
All structs and logic live under the `itch` namespace.
- Avoids naming collisions
- Allows message-specific dispatch logic like `itch::parse()`
- Keeps code modular and organized
Every message struct begins with a `char message_type;`.
- Enables fast lookup dispatch: `ParserFn parsers[256]; parsers['R'] = [](const char* b, size_t l) { parseStockDirectoryMessage(b, l); };`
- Clean separation of parsing and handling
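
The lookup-table dispatch can be sketched end to end as follows. The handlers here only bump counters so the example stays self-contained; `make_dispatch_table`, `dispatch`, and the counters are hypothetical names, and the real handlers would decode the message instead.

```cpp
#include <array>
#include <cstddef>

namespace itch {

using ParserFn = void (*)(const char* buffer, std::size_t length);

// Hypothetical counters standing in for real message handling.
inline int system_events_seen = 0;
inline int directory_messages_seen = 0;

inline void parseSystemEventMessage(const char*, std::size_t) {
    ++system_events_seen;
}
inline void parseStockDirectoryMessage(const char*, std::size_t) {
    ++directory_messages_seen;
}

// Builds a 256-entry table indexed by the message_type byte.
// Unregistered types stay nullptr.
inline std::array<ParserFn, 256> make_dispatch_table() {
    std::array<ParserFn, 256> table{};
    table[static_cast<unsigned char>('S')] = &parseSystemEventMessage;
    table[static_cast<unsigned char>('R')] = &parseStockDirectoryMessage;
    return table;
}

// Routes one message by its first byte; returns false for empty input
// or for a type with no registered parser.
inline bool dispatch(const std::array<ParserFn, 256>& table,
                     const char* buffer, std::size_t length) {
    if (buffer == nullptr || length == 0) {
        return false;
    }
    const auto type = static_cast<unsigned char>(buffer[0]);
    const ParserFn fn = table[type];
    if (fn == nullptr) {
        return false;
    }
    fn(buffer, length);
    return true;
}

}  // namespace itch
```

Casting the type byte to `unsigned char` before indexing matters on platforms where `char` is signed: a byte of 0x80 or above would otherwise become a negative index.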
| Message Type | Struct Name | Size (bytes) | Purpose |
|---|---|---|---|
| `'S'` | `SystemEventMessage` | 12 | Market open/close events |
| `'R'` | `StockDirectoryMessage` | 39 | Stock definitions |
| `'H'` | `StockTradingActionMessage` | 25 | Halt/resume trading |
| `'Y'` | `RegSHOMessage` | 20 | Short sale rule info |
| `'L'` | `MarketParticipantPositionMessage` | 26 | MPID registration status |
| `'V'` | `MWCBDeclineLevelMessage` | 35 | Circuit breaker thresholds |
| `'W'` | `MWCBStatusMessage` | 12 | Circuit breaker breach alerts |
✅ Each struct:
- Begins with a `char message_type`
- Uses `uint8_t[6]` for timestamps
- Is packed with `#pragma pack(1)`
- Ends with `static_assert(sizeof(...))`
- SIMD for decoding batches of messages
- Arena allocators for short-lived parsed data
- Efficient timestamp converters
- C++20 concepts instead of `static_assert`
- Benchmark suite for throughput testing
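Casting the raw buffer directly to a struct pointer (for example with `reinterpret_cast`) can look faster than `memcpy`, but it:
- Is undefined behavior unless alignment is guaranteed
- Violates the strict aliasing rules that optimizing compilers rely on
- Breaks on virtual classes or non-POD layouts
We chose `memcpy` + `static_assert` + `pack(1)` for speed AND safety.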
- Memory-mapped I/O for zero-copy reading
- CPU cache prefetching via `__builtin_prefetch`
- Signal handling for clean interruption
- High-precision timing measurements using `CLOCK_MONOTONIC`
- Efficient binary message parsing
- CSV logging for data analysis
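
A rough sketch of how memory mapping, prefetching, and monotonic timing fit together is given here. It assumes a POSIX system and a GCC or Clang compiler; `scan_file`, `ScanResult`, and the prefetch distance of 256 bytes are illustrative choices, and the byte checksum stands in for real message parsing.

```cpp
#include <cstddef>
#include <cstdint>
#include <fcntl.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <time.h>
#include <unistd.h>

// Hypothetical result type for the sketch.
struct ScanResult {
    bool ok;
    std::size_t bytes;
    std::uint64_t checksum;
    std::int64_t elapsed_ns;
};

// Maps a file read-only, walks it once with prefetch hints, and times the walk.
inline ScanResult scan_file(const char* path) {
    ScanResult result{false, 0, 0, 0};

    const int fd = open(path, O_RDONLY);
    if (fd < 0) return result;

    struct stat st;
    if (fstat(fd, &st) != 0 || st.st_size == 0) {  // mmap rejects length 0
        close(fd);
        return result;
    }
    const std::size_t size = static_cast<std::size_t>(st.st_size);

    void* mapping = mmap(nullptr, size, PROT_READ, MAP_PRIVATE, fd, 0);
    close(fd);  // the mapping stays valid after the descriptor is closed
    if (mapping == MAP_FAILED) return result;

    const auto* data = static_cast<const unsigned char*>(mapping);
    constexpr std::size_t kPrefetchDistance = 256;  // assumed tuning value

    timespec start{}, end{};
    clock_gettime(CLOCK_MONOTONIC, &start);
    for (std::size_t i = 0; i < size; ++i) {
        // Once per 64 bytes, hint that data further ahead will be read soon.
        if (i % 64 == 0 && i + kPrefetchDistance < size) {
            __builtin_prefetch(data + i + kPrefetchDistance, 0, 1);
        }
        result.checksum += data[i];  // stand-in for parsing a message
    }
    clock_gettime(CLOCK_MONOTONIC, &end);

    munmap(mapping, size);
    result.ok = true;
    result.bytes = size;
    result.elapsed_ns = (end.tv_sec - start.tv_sec) * 1000000000LL +
                        (end.tv_nsec - start.tv_nsec);
    return result;
}
```

`CLOCK_MONOTONIC` is used instead of wall-clock time so that system clock adjustments during a run cannot distort the measured latency.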
Test conducted on real ITCH market data: