
repo refactor: M5: Implement bounded hot-path pre-allocation/allocation-reduction improvements and capture before/after benchmark deltas#12

Merged
Aroesler1 merged 1 commit into main from codex/repo_refactor-199-20260405-225632 on Apr 5, 2026

Conversation

@Aroesler1
Owner

Repo job repo_refactor for objective #38 is ready for review and merge.

Objective: Build project: Real-Time Limit Order Book Engine in C++

Job instructions:
On top of main, make narrowly scoped performance-engineering changes in the replay/book/analytics hot path that reduce avoidable heap activity without changing external behavior. Focus on explicit pre-allocation/reserve strategies for order lookup and rolling analytics buffers, removal of unnecessary per-message temporary allocations/copies, and any obvious benchmark-safe fast-path cleanup already implied by the current design. Do not change analytics schema or replay semantics. Extend the benchmark harness or report-generation path as needed so the repo can produce comparable before/after measurements for both container variants on the documented AAPL fixture and the additional in-repo ticker fixtures. Update the benchmark report to describe the exact optimization changes, measurement method, hardware/compiler context, and before/after throughput results. Validate with existing tests plus parity/correctness checks for both backends.

Why this PR looks credible:
Tests passed on the current branch. The change is coherent and appropriately scoped for M5: it introduces real source-level hot-path allocation reductions in replay and analytics, wires derived reserve hints into the CLI and benchmark paths, adds a backend-parity analytics test, and updates the benchmark report with methodology and before/after results. I did not find a clear semantic regression in the diff itself, and the implementation aligns with the objective of reducing avoidable heap activity without changing replay or analytics behavior. Audit cleared the diff, with 2 remaining validation gap(s) to watch. Backend verification returned unsupported.

Audit summary:
The change is coherent and appropriately scoped for M5: it introduces real source-level hot-path allocation reductions in replay and analytics, wires derived reserve hints into the CLI and benchmark paths, adds a backend-parity analytics test, and updates the benchmark report with methodology and before/after results. I did not find a clear semantic regression in the diff itself, and the implementation aligns with the objective of reducing avoidable heap activity without changing replay or analytics behavior.

Audit must-fix items:

  • None

Validation gaps:

  • The audit record only shows the pytest wrapper result (2 passed) and not a fresh ctest or raw benchmark command transcript, so benchmark reproducibility is documented but not independently evidenced in the supplied execution log.
  • Before/after benchmark numbers are documented in markdown, but there is no checked-in automation that regenerates or verifies the report tables from captured benchmark output.

Copilot AI review requested due to automatic review settings April 5, 2026 23:14

Copilot AI left a comment


Pull request overview

Performance-focused refactor to reduce avoidable heap activity in the replay + analytics hot path, while keeping replay semantics and analytics schema unchanged, and documenting before/after benchmark deltas.

Changes:

  • Derive and propagate OrderBookBuildConfig reserve hints from the parsed message stream (instead of hard-coded defaults) into CLI + benchmark flows.
  • Reduce per-message allocations/copies in analytics by reusing buffers (reset() reuse), switching to capacity-stable trade window storage, and constructing AnalyticsRow in place during replay.
  • Add a C++ test asserting analytics output parity across map and flat_vector backends; update README/report with measurement methodology and results.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 5 comments.

| File | Description |
| --- | --- |
| tests/test_analytics.cpp | Adds analytics parity test across backends and snapshot equality assertion. |
| src/replay.cpp | Adds message-stream-based build config derivation; uses it in benchmark path. |
| src/main.cpp | Uses derived build config; passes expected message count into analytics config. |
| src/analytics.cpp | Introduces pre-sized rolling buffers + in-place row fill; reuses allocations on reset. |
| include/lob/replay.hpp | Exposes derive_order_book_build_config() API. |
| include/lob/analytics.hpp | Adds expected_messages to config; adds in-place on_message overload. |
| benchmark/lob_benchmark.cpp | Prints derived reserve hints; uses derived build config per run mode. |
| report/benchmark_report.md | Expands methodology + before/after numbers and scope notes. |
| README.md | Summarizes optimization changes and includes updated benchmark table + method. |


Comment thread src/replay.cpp

```cpp
BuildEstimate estimate_build_requirements(const std::vector<LobsterMessage>& messages) {
  EstimatedOrders orders;
  orders.reserve(messages.size());
```

Copilot AI Apr 5, 2026


estimate_build_requirements() reserves orders to messages.size(), which can be orders of magnitude larger than the peak active order count. On large datasets this pre-pass can allocate a huge unordered_map bucket array (potentially OOM) just to compute reserve hints. Consider removing this reserve or capping it to a reasonable upper bound / heuristic (e.g., based on a sampled estimate or a user-provided hint).

Suggested change:

```diff
-  orders.reserve(messages.size());
```

Comment thread src/analytics.cpp
```cpp
explicit Impl(const AnalyticsConfig& config)
    : trade_window_capacity(std::max<std::size_t>(1, config.trade_window_messages)),
      trade_window(trade_window_capacity),
      mid_window(config.expected_messages) {}
```

Copilot AI Apr 5, 2026


mid_window reserves to config.expected_messages, and replay_with_analytics() defaults that to messages.size(). For large replays this can force a very large upfront allocation even though the realized-vol window only needs a sliding subset. Consider making expected_messages a bounded hint (or default it to 0) and reserving based on an estimate of samples retained in the time window rather than total message count.

Suggested change:

```diff
-      mid_window(config.expected_messages) {}
+      mid_window(0) {}
```

Comment thread src/analytics.cpp
Comment on lines +232 to +235
```cpp
if (config.expected_messages == 0) {
    config.expected_messages = messages.size();
}
```


Copilot AI Apr 5, 2026


replay_with_analytics() sets expected_messages = messages.size() when unset, which can cause large preallocation in AnalyticsEngine on big datasets. If expected_messages is meant as an optional hint, consider leaving it at 0 by default (no reserve) or applying an upper bound so analytics runs don't accidentally allocate proportional to the full input size.

Suggested change:

```diff
-if (config.expected_messages == 0) {
-    config.expected_messages = messages.size();
-}
```

Comment thread src/main.cpp
Comment on lines 239 to 244
```cpp
lob::AnalyticsConfig{
    options.trade_window_messages,
    options.realized_vol_window_seconds,
    std::max<std::size_t>(options.depth, 10),
    messages.size(),
});
```

Copilot AI Apr 5, 2026


This passes messages.size() as AnalyticsConfig::expected_messages, which can trigger very large upfront reserves in analytics on large datasets. Consider making this a bounded/optional hint (or omit it) so running the CLI with analytics enabled doesn’t unexpectedly allocate memory proportional to the entire input file.

Comment thread benchmark/lob_benchmark.cpp

Comment on lines 184 to +190

```cpp
for (bool enable_preallocation : reserve_modes) {
  lob::OrderBookBuildConfig build_config;
  build_config.expected_orders = messages.size();
  build_config.expected_levels_per_side = 64;
  build_config.enable_preallocation = enable_preallocation;
  build_config = lob::derive_order_book_build_config(messages, build_config);
  std::cout << "Reserve hints reserve=" << (enable_preallocation ? "on" : "off")
            << " expected_orders=" << build_config.expected_orders
            << " expected_levels_per_side=" << build_config.expected_levels_per_side << '\n';
```

Copilot AI Apr 5, 2026


derive_order_book_build_config() is called inside the reserve_modes loop, so the message-stream estimation pass runs once per reserve mode even though the hints are identical except for enable_preallocation. Consider deriving once before the loop and then only toggling build_config.enable_preallocation (or caching the derived values) to avoid redundant work when benchmarking large datasets.

@Aroesler1 Aroesler1 merged commit 0d108f8 into main Apr 5, 2026
8 checks passed
@Aroesler1 Aroesler1 deleted the codex/repo_refactor-199-20260405-225632 branch April 6, 2026 02:01