Add predictive signal report with IC by Aroesler1 · Pull Request #13 · Aroesler1/lob-engine-cpp

Aroesler1 · 2026-04-06T01:45:04Z

Implements the predictive-signal milestone by adding top-5 order-imbalance prediction reporting with configurable horizons, hit rate, and information coefficient, while preserving the existing analytics CSV path.

Validation run locally:

cmake -S . -B build_verify -DCMAKE_BUILD_TYPE=Release
cmake --build build_verify -j2
ctest --test-dir build_verify --output-on-failure -C Release
pytest tests/test_parser.py -q

Copilot

Pull request overview

Adds an optional post-replay “prediction report” pipeline that evaluates the directional usefulness of the existing top-5 order-imbalance signal across configurable message horizons, while keeping the existing analytics CSV export unchanged.

Changes:

Extend analytics/public API with prediction snapshot + summary types, horizon resolution, and CSV writer (including hit rate + information coefficient).
Add CLI flags (--prediction-report-out, --prediction-horizons) to generate per-backend prediction report CSVs alongside (unchanged) analytics CSVs.
Expand C++ and Python integration tests to validate report outputs, config behavior, and CLI validation errors; update docs/benchmark notes to describe the feature gate.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 3 comments.

Show a summary per file

File	Description
`include/lob/analytics.hpp`	Adds prediction reporting config fields + new public structs/APIs for prediction summaries and report writing.
`src/analytics.cpp`	Implements labeling, hit-rate + running Pearson IC calculation, snapshot collection, and prediction report CSV output.
`src/main.cpp`	Adds CLI parsing/validation for prediction flags and emits prediction report output per backend.
`tests/test_analytics.cpp`	Adds unit tests for prediction config resolution, labeling rules, IC behavior, and CSV writer output.
`tests/test_parser.py`	Extends integration test to verify prediction report files exist and analytics CSV output remains unchanged.
`README.md`	Documents prediction reporting semantics, flags, and label/metric definitions.
`report/benchmark_report.md`	Notes prediction reporting is feature-gated and outside the replay-only benchmark timing path.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-06T01:50:03Z

 #include <array>
+#include <cctype>
 #include <cstdlib>
 #include <fstream>
 #include <iomanip>
 #include <iostream>
+#include <limits>
 #include <optional>
 #include <string>
 #include <vector>


std::max is used in this file (e.g., when setting analytics_config.depth_levels), but <algorithm> is not included. Relying on transitive includes is non-portable; add #include <algorithm> to ensure this compiles consistently across standard library implementations.

Copilot · 2026-04-06T01:50:03Z

+std::vector<PredictionSummaryRow> summarize_prediction_horizons_impl(
+    const std::vector<PredictionSnapshot>& snapshots,
+    const std::vector<std::size_t>& horizons) {
+    std::vector<PredictionSummaryRow> summaries;
+    summaries.reserve(horizons.size());
+
+    for (const std::size_t horizon : horizons) {
+        if (horizon == 0) {
+            throw std::invalid_argument("Prediction horizons must be positive");
+        }
+
+        PredictionSummaryRow summary;
+        RunningPearsonCorrelation information_coefficient;
+        summary.horizon_messages = horizon;
+        summary.total_rows_seen = snapshots.size();
+
+        for (std::size_t index = 0; index < snapshots.size(); ++index) {
+            const PredictionSnapshot& snapshot = snapshots[index];
+            if (!snapshot.mid_price.has_value()) {
+                ++summary.skipped_no_valid_mid;
+                continue;
+            }
+
+            ++summary.eligible_rows_with_valid_mid;
+            const std::optional<int> label = first_future_label(snapshots, index, horizon);
+            if (!label.has_value()) {
+                ++summary.skipped_no_future_move_within_horizon;
+                continue;
+            }


summarize_prediction_horizons_impl calls first_future_label(...) for every row, and first_future_label scans up to horizon steps ahead. This makes runtime O(N * horizon) per horizon (and O(N * sum(horizons)) overall), which can get expensive on large datasets/horizons. Consider precomputing the first non-zero future move direction once (or using a two-pointer / next-nonzero index cache) and reusing it across horizons to keep the prediction-report path scalable.

Copilot · 2026-04-06T01:50:04Z

+    assert(almost_equal(summary.coverage_vs_total, 0.0));
+}
+
+void test_prediction_zero_signal_is_skipped_instead_of_labeled() {


The test name suggests zero-signal rows are skipped instead of labeled, but the assertions expect labeled_rows == 2 and skipped_zero_signal == 1 (i.e., the row is labeled but excluded from the hit-rate denominator). Renaming this test to reflect the actual behavior would make the intent clearer and avoid confusion with the report semantics.

Suggested change

void test_prediction_zero_signal_is_skipped_instead_of_labeled() {

void test_prediction_zero_signal_is_excluded_from_hit_rate_but_tracked_in_summary() {

Add predictive signal report with IC

1fc4474

Copilot AI review requested due to automatic review settings April 6, 2026 01:45

Copilot started reviewing on behalf of Aroesler1 April 6, 2026 01:45 View session

Aroesler1 merged commit f424e06 into main Apr 6, 2026
6 checks passed

Copilot AI reviewed Apr 6, 2026

View reviewed changes

Aroesler1 deleted the codex/repo_feature_build-205-20260406-010810 branch April 6, 2026 02:01

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Add predictive signal report with IC#13

Add predictive signal report with IC#13
Aroesler1 merged 1 commit into
mainfrom
codex/repo_feature_build-205-20260406-010810

Aroesler1 commented Apr 6, 2026

Uh oh!

Uh oh!

Copilot AI left a comment

Uh oh!

Copilot AI Apr 6, 2026

Uh oh!

Copilot AI Apr 6, 2026

Uh oh!

Copilot AI Apr 6, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

	void test_prediction_zero_signal_is_skipped_instead_of_labeled() {
	void test_prediction_zero_signal_is_excluded_from_hit_rate_but_tracked_in_summary() {

Conversation

Aroesler1 commented Apr 6, 2026

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 6, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants