Open
33 commits
ad12d74
feat: add post-run data integrity audit with reference source
AlexanderPietsch Apr 15, 2026
2fbbc8d
docs: add new module to README.md and example.yaml
AlexanderPietsch Apr 15, 2026
2af8125
Merge commit '6f4107ac8ad4e323bc54b9d49f4649a70063e462' into feat/VD-…
AlexanderPietsch Apr 20, 2026
24cc274
Merge remote-tracking branch 'origin/dev' into feat/VD-4355-data-inte…
AlexanderPietsch Apr 20, 2026
87881fc
Fix audit reference collection self-reference
cursoragent Apr 20, 2026
cdddccf
Fix reference exchange resolution in data integrity audit
cursoragent Apr 20, 2026
40cbceb
fix: active data_integrity gate tracking now on reference_source as e…
AlexanderPietsch Apr 20, 2026
c14fbf9
Merge remote-tracking branch 'origin/feat/VD-4355-data-integrity-audi…
AlexanderPietsch Apr 20, 2026
320f99d
perf: cache data integrity audit per job
AlexanderPietsch Apr 20, 2026
bd4939b
refactor: reduce parser and audit complexity
AlexanderPietsch Apr 20, 2026
72522c9
Fix data integrity threshold fallbacks for None
cursoragent Apr 20, 2026
9c7ee77
fix: preserve reference exchange for integrity audit source
AlexanderPietsch Apr 20, 2026
0e49ab4
Potential fix for pull request finding
AlexanderPietsch Apr 20, 2026
6a094fc
fix: normalize audit thresholds in indeterminate metadata
AlexanderPietsch Apr 20, 2026
175381a
Merge commit '0e49ab49907b88741b45c15b344108e485b36f59' into feat/VD-…
AlexanderPietsch Apr 20, 2026
a696bf0
Potential fix for pull request finding
AlexanderPietsch Apr 20, 2026
808d648
fix: fix the previously bot commit
AlexanderPietsch Apr 20, 2026
165af24
fix: normalize data integrity audit cache key thresholds
AlexanderPietsch Apr 20, 2026
413826c
docs: clarify data integrity audit cache key semantics
AlexanderPietsch Apr 20, 2026
a728241
Potential fix for pull request finding
AlexanderPietsch Apr 20, 2026
91a84ee
chore: extra lines
AlexanderPietsch Apr 20, 2026
e887eb1
Potential fix for pull request finding
AlexanderPietsch Apr 20, 2026
3260342
fix: honor cache-only mode in reference audit fetch
AlexanderPietsch Apr 20, 2026
883c8ba
Potential fix for pull request finding
AlexanderPietsch Apr 20, 2026
110644c
fix: align integrity gate activation with resolved policy
AlexanderPietsch Apr 20, 2026
5ea496f
fix: reintroduce run_only cached that got lost during merges
AlexanderPietsch Apr 20, 2026
1650b09
docs: clarify min_overlap_ratio overlap definition
AlexanderPietsch Apr 20, 2026
e94964c
fix: sanitize non-finite audit divergence metadata
AlexanderPietsch Apr 20, 2026
36f0dac
fix: reject audit on non-finite divergence metrics
AlexanderPietsch Apr 20, 2026
ba71233
test: cover non-finite divergence indeterminate audit path
AlexanderPietsch Apr 20, 2026
21fb470
fix: avoid all-nan warnings in audit divergence metrics
AlexanderPietsch Apr 20, 2026
858a495
fix: adjust gate order flow for performance
AlexanderPietsch Apr 20, 2026
e9e1075
feat: support reference_exchange in data integrity audit
AlexanderPietsch Apr 20, 2026
18 changes: 17 additions & 1 deletion README.md
@@ -254,6 +254,22 @@ See new collection examples under `config/collections/` for FX intraday via Finn
- collection-level overrides are supported via `collections[].validation.optimization`
and are resolved against global `validation.optimization` during config loading.
- `validation.result_consistency` controls strategy-result concentration checks:
- `data_integrity_audit` (optional thresholds module; gate is active when `collections[].reference_source` is set):
- purpose: compare canonicalized bars from the primary `source` and a secondary `reference_source`
to catch bad prints / ghost bars before accepting strategy results
- source routing:
- primary fetch uses `collections[].source` (+ `collections[].exchange` for ccxt)
- reference fetch uses `collections[].reference_source`
- for ccxt-vs-ccxt venue comparisons, set:
- `reference_source: ccxt`
- `reference_exchange: <venue>`
- when `reference_source` is ccxt and `reference_exchange` is unset, the runner falls back to
`collections[].exchange`
- `min_overlap_ratio` (optional, default `0.99`, `0..1`): minimum fraction of primary-source bars that must have matching reference-source timestamps (`overlap_bars / primary_bars`)
- `max_median_ohlc_diff_bps` (optional, default `5.0`, `>=0`): maximum allowed median OHLC drift (bps)
- `max_p95_ohlc_diff_bps` (optional, default `20.0`, `>=0`): maximum allowed p95 OHLC drift (bps)
- action: fixed to `reject_result` when overlap/drift thresholds are breached (or comparison is indeterminate)
- diagnostics are attached under `post_run_meta.data_integrity_audit`
- `outlier_dependency` (optional module; active when configured):
- `slices` (required, `>=2`): number of equal time-slices used for diagnostics
- `profit_share_threshold` (required, `0..1`)
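The three `data_integrity_audit` thresholds described in this hunk can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `audit_bars` is a hypothetical helper, and both the drift formula (absolute relative difference against the reference price, pooled over open/high/low/close) and the nearest-rank p95 are assumptions. Only the threshold names, their defaults, the `overlap_bars / primary_bars` definition, and the `reject_result` action come from the README text.

```python
import math
from statistics import median


def audit_bars(primary, reference, min_overlap_ratio=0.99,
               max_median_ohlc_diff_bps=5.0, max_p95_ohlc_diff_bps=20.0):
    """Compare two {timestamp: (open, high, low, close)} mappings.

    Hypothetical helper: the drift formula and percentile method are assumptions.
    """
    shared = sorted(set(primary) & set(reference))
    if not primary or not shared:
        # Nothing to compare: indeterminate, which is also rejected.
        return {"status": "indeterminate", "action": "reject_result"}

    overlap_ratio = len(shared) / len(primary)  # overlap_bars / primary_bars

    # Absolute relative difference per OHLC field, in basis points.
    diffs = []
    for ts in shared:
        for p, r in zip(primary[ts], reference[ts]):
            diffs.append(abs(p - r) / abs(r) * 1e4 if r else math.nan)

    # Non-finite divergence (NaN or inf prices) makes the comparison indeterminate.
    if not all(math.isfinite(d) for d in diffs):
        return {"status": "indeterminate", "action": "reject_result"}

    ordered = sorted(diffs)
    p95 = ordered[math.ceil(0.95 * len(ordered)) - 1]  # nearest-rank percentile
    med = median(ordered)

    passed = (overlap_ratio >= min_overlap_ratio
              and med <= max_median_ohlc_diff_bps
              and p95 <= max_p95_ohlc_diff_bps)
    return {
        "status": "passed" if passed else "failed",
        "action": None if passed else "reject_result",
        "overlap_ratio": overlap_ratio,
        "median_ohlc_diff_bps": med,
        "p95_ohlc_diff_bps": p95,
    }
```

In this sketch a single bad print (say a reference high of 110 against a primary high of 101) leaves the median untouched but pushes the p95 far above 20 bps, which is the reason for checking a tail statistic alongside the median.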
@@ -297,7 +313,7 @@ Structured logs reflect this directly via gate actions:
- `data_validation_gate` can emit `skip_optimization` (job-level optimization disable).
- `strategy_optimization_gate` can emit `baseline_only` (strategy-level baseline fallback) or `skip_job`.
- `strategy_validation_gate` can emit `reject_result` for outlier dependency,
-execution price variance, and lookahead shuffle testing.
+execution price variance, lookahead shuffle testing, data integrity audit, and transaction-cost robustness.

Numeric config parsing follows `src/config.py` coercion helpers:
- numeric fields are strict types: use YAML numbers, not quoted numeric strings
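The strict numeric rule can be illustrated with a small sketch. `coerce_float` below is a hypothetical stand-in, not one of the actual helpers in `src/config.py`, whose names and signatures are not shown in this diff. It assumes a quoted string such as `"0.99"` should be rejected rather than converted, and it applies the `0..1` and `>=0` bounds the README lists for the audit thresholds.

```python
import math


def coerce_float(value, field, minimum=None, maximum=None):
    """Hypothetical strict coercion helper; not the real src/config.py API."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool) or not isinstance(value, (int, float)):
        raise TypeError(
            f"{field} must be a YAML number, got {type(value).__name__}"
        )
    result = float(value)
    if not math.isfinite(result):
        raise ValueError(f"{field} must be finite")
    if minimum is not None and result < minimum:
        raise ValueError(f"{field} must be >= {minimum}")
    if maximum is not None and result > maximum:
        raise ValueError(f"{field} must be <= {maximum}")
    return result


# YAML `min_overlap_ratio: 0.99` loads as a float and passes.
ratio = coerce_float(0.99, "min_overlap_ratio", minimum=0.0, maximum=1.0)

# YAML `min_overlap_ratio: "0.99"` loads as a str and is rejected.
try:
    coerce_float("0.99", "min_overlap_ratio", minimum=0.0, maximum=1.0)
except TypeError as exc:
    print(exc)
```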
12 changes: 11 additions & 1 deletion config/example.yaml
@@ -54,6 +54,10 @@ validation:
result_consistency:
min_metric: 0.5 # fail fast: require at least this metric before expensive checks
min_trades: 20 # fail fast: require at least this many closed trades
data_integrity_audit:
min_overlap_ratio: 0.99 # min fraction of primary bars covered by reference timestamps (overlap_bars / primary_bars)
max_median_ohlc_diff_bps: 5.0 # median OHLC drift tolerance (bps)
max_p95_ohlc_diff_bps: 20.0 # tail OHLC drift tolerance (bps)
outlier_dependency:
slices: 5 # split trade history into N equal time-slices for diagnostics
profit_share_threshold: 0.80
@@ -80,6 +84,7 @@ collections:
# Stocks (large-cap growth)
- name: stocks_large_cap_growth
source: yfinance
reference_source: twelvedata # optional golden source for post-run data-integrity audit
symbols: ["CNDX.L", "AAPL", "MSFT", "NVDA"]
fees: 0.0005 # approx IBKR
slippage: 0.0005
@@ -100,8 +105,13 @@ collections:

# Crypto (Binance via ccxt)
- name: crypto
-source: binance
+source: ccxt
# For ccxt collections, `exchange` selects the primary venue adapter.
# Set `reference_source: ccxt` + `reference_exchange` to compare venues
# in data_integrity_audit (for example Binance vs Bybit).
reference_source: ccxt
exchange: binance
reference_exchange: bybit
quote: USDT
symbols: ["BTC/USDT", "ETH/USDT", "BNB/USDT", "SOL/USDT"]
fees: 0.0006 # approx Bybit/Binance taker
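The source routing used by the `stocks_large_cap_growth` and `crypto` collections can be sketched as a small resolver. `resolve_reference` is a hypothetical function written for illustration; the runner's real code is not part of this diff. It encodes only the rules stated in the README hunk: the gate is active when `reference_source` is set, and a ccxt reference without `reference_exchange` falls back to the collection's `exchange`.

```python
def resolve_reference(collection):
    """Return (source, exchange) for the reference fetch, or None if the gate is inactive.

    Hypothetical helper; `collection` is one entry of `collections` as a dict.
    """
    ref_source = collection.get("reference_source")
    if not ref_source:
        return None  # no reference_source: data_integrity_audit gate is inactive
    if ref_source == "ccxt":
        # An explicit reference venue wins; otherwise reuse the primary exchange.
        exchange = collection.get("reference_exchange") or collection.get("exchange")
        return ref_source, exchange
    return ref_source, None


crypto = {
    "name": "crypto",
    "source": "ccxt",
    "exchange": "binance",
    "reference_source": "ccxt",
    "reference_exchange": "bybit",
}
stocks = {
    "name": "stocks_large_cap_growth",
    "source": "yfinance",
    "reference_source": "twelvedata",
}

print(resolve_reference(crypto))
print(resolve_reference(stocks))
```

With the fallback, a ccxt collection that omits `reference_exchange` resolves to the same venue as the primary fetch, so setting `reference_exchange` explicitly is what turns the audit into a cross-venue comparison.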