Sub-project B PR 1: to_dict() sync + cohort tuple lock-in by sangicook · Pull Request #4 · sangicook/edgartools

sangicook · 2026-04-07T12:24:55Z

Summary

Sub-project B / PR 1 — two preventive consistency fixes, each with a permanent unit-test lock to catch future drift. No active bug is being patched; both are latent drift fixes called out by the Consensus 024 deep-investigation session.

Stacked PR: Built on top of feature/sub-a-determinism-strict-cqs (which has its own open upstream PR at dgunning#764). When that branch merges, this PR's base can be re-targeted to main.

fix(ledger): ExtractionRun.to_dict() had drifted out of sync with its dataclass — 10 provenance fields (concept, accession_number, statement_role, period_type, period_start, period_end, unit, decimals, reference_source, publish_confidence) existed on the dataclass but were silently dropped by the serializer. The method has zero callers in the standardization subsystem today (the production write path record_run() at schema.py:586-608 uses direct attribute access), so the bug is latent. New test uses dataclasses.fields() reflection + a TRANSIENT_FIELDS allowlist to lock the invariant against future drift.
refactor(auto-eval): EXPANSION_COHORT_50/100/500 were mutable lists; tests or overnight loops calling .append() could silently corrupt every downstream run depending on stable cohort membership. Convert all three to tuples, matching the DETERMINISM_TEST_COHORT precedent already established by Sub-project A. New test locks the type AND the chain ordering invariant via slice equality (EXPANSION_COHORT_100[:50] == EXPANSION_COHORT_50, etc.).

Both changes are scoped, independent, and bisectable. PR 1 has no external blockers (other than the upstream sub-a merge) and unblocks the planned PR 3 (provenance hashes) by establishing a consistent serializer path.

Plan & Context

Plan: docs/superpowers/plans/2026-04-07-subproject-b-pr1-to-dict-and-cohort-tuple.md (lives on local main)
Parent decision: docs/autonomous-system/consensus/024-2026-04-07-sub-project-b-scope.md — note the correction footer reframing this PR as preventive consistency work, not an active bug fix
Sequence: PR 1 of 3 in the Sub-project B chain. PR 2 is blocked on a prerequisite write-path investigation; PR 3 is blocked on PR 1 landing (it depends on this PR's to_dict() fix to have a consistent serializer path for new hash fields).
Depends on: Sub-project A determinism work in feature/sub-a-determinism-strict-cqs (upstream PR Sub-project A: strict EF-CQS + determinism CI gate dgunning/edgartools#764)

Diff Stats

File	Change
`edgar/xbrl/standardization/ledger/schema.py`	+32 / −9 (`to_dict()` body + drift-prevention docstring)
`edgar/xbrl/standardization/tools/auto_eval.py`	+20 / −8 (three coordinated `[`→`(` conversions + comment blocks)
`tests/xbrl/standardization/test_extraction_run_serialization.py`	+115 (new — 3 tests)
`tests/xbrl/standardization/test_cohort_immutability.py`	+104 (new — 5 tests)

Total: +258 / −17, 4 files, 2 commits.

Test Plan

pytest -m fast tests/xbrl/standardization/test_extraction_run_serialization.py -v — 3 tests pass (drift lock, no-unexpected-keys, provenance round-trip)
pytest -m fast tests/xbrl/standardization/test_cohort_immutability.py -v — 5 tests pass (precondition + 3 type asserts + content regression guard with len(_50)==50, len(_100)==100, len(_500)==500, slice-equality chain checks, and ticker spot-checks)
pytest -m fast tests/xbrl/standardization/ — full standardization fast suite, expect 184 passed / 1 pre-existing failure (the failure is TestO15SemanticInstruction::test_candidates_instruction_is_semantic_not_numerical in test_closed_loop.py:1462, a string-match assertion on prompt wording that pre-dates this PR and is unrelated)
python -c "from edgar.xbrl.standardization.tools.auto_eval import DETERMINISM_TEST_COHORT, compute_cqs, EXPANSION_COHORT_50, EXPANSION_COHORT_100, EXPANSION_COHORT_500" — determinism gate imports cleanly with new tuple cohorts
No test in the repo mutates the expansion cohorts (verified via grep for .append/.extend/.clear/.remove against the cohort names)

Quality Process

Built via superpowers:subagent-driven-development. Each commit went through two-stage review (spec compliance → code quality), with one revision round each:

Task	First-pass quality findings	Fixes amended
1 (`to_dict()` sync)	non-deterministic test helper; missing run_id-ordering docstring	helper now uses explicit `run_id`/`run_timestamp`; docstring explains run_id-first ordering
2 (cohort tuples)	missing `len(_500) == 500` assertion; unused imports	500-length assertion added; `QUICK_EVAL_COHORT`/`VALIDATION_COHORT` imports removed

A final whole-PR review across both commits caught one cross-commit consistency issue (pytestmark placement in test_cohort_immutability.py violated PEP 8 E402 and was inconsistent with the Task 1 test file and 8 sibling files). Fixed and re-verified.

🤖 Generated with Claude Code

Add the 10 provenance fields that had drifted out of to_dict(): concept, accession_number, statement_role, period_type, period_start, period_end, unit, decimals, reference_source, publish_confidence. This is a latent bug fix: to_dict() has zero callers in the standardization subsystem today (record_run() at schema.py:556-595 writes via direct field access), but any future caller would silently lose the fields. Add a test that locks to_dict() to the dataclass field set so future drift is caught immediately. See Consensus 024 for context. Part of Sub-project B PR 1.

Convert the expansion cohort chain from lists to tuples, matching the DETERMINISM_TEST_COHORT precedent set by Sub-project A. Prevents accidental mutation by tests or overnight loops that could silently corrupt every downstream run that depends on a stable cohort membership. Lock type discipline via new unit tests so future drift is caught at CI. See Consensus 024 Phase 1 Finding 4. Part of Sub-project B PR 1.

sangicook · 2026-04-07T13:46:51Z

Closing this PR — work has been merged into local `main` via `git merge --no-ff` (merge commit `0f4906b8`, PR commits `5ba0aba1` + `207dc163`).

This project is pivoting to a standalone product architecture where `edgartools` becomes a pinned dependency rather than a fork target. Public PR ceremony is no longer needed for autonomous-system work. The code is preserved in local main's history with the full 2-commit structure and merge commit intact.

The remote branch is being deleted as part of the close.

ychan added 2 commits April 7, 2026 14:20

sangicook closed this Apr 7, 2026

sangicook deleted the subproject-b/pr1-to-dict-and-cohort-tuple branch April 7, 2026 13:46

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Sub-project B PR 1: to_dict() sync + cohort tuple lock-in#4

Sub-project B PR 1: to_dict() sync + cohort tuple lock-in#4
sangicook wants to merge 2 commits intofeature/sub-a-determinism-strict-cqsfrom
subproject-b/pr1-to-dict-and-cohort-tuple

sangicook commented Apr 7, 2026

Uh oh!

sangicook commented Apr 7, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant

Conversation

sangicook commented Apr 7, 2026

Summary

Plan & Context

Diff Stats

Test Plan

Quality Process

Uh oh!

sangicook commented Apr 7, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

1 participant