CI: add smoke suite and gate unit/integration workflows #5
Merged
Pull request overview
Adds a lightweight smoke test suite and restructures CI so smoke tests run first and (attempt to) gate heavier test workflows behind a successful smoke run.
Changes:
- Introduces `tests/smoke` with fast checks covering CLI load/version, config defaults, and seen-store persistence.
- Adds/updates GitHub Actions workflows for smoke, unit, integration, lint, pre-commit, and coverage.
- Updates dev dependencies to include `pytest-cov` and `pre-commit`.
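As an illustration of the seen-store persistence check listed above, a smoke test of this kind might look like the sketch below. `TinySeenStore` is a hypothetical stand-in defined here only to keep the example self-contained; it is not the project's actual seen-store class, and the file name and ID are made up.

```python
import json
import tempfile
from pathlib import Path


class TinySeenStore:
    """Hypothetical stand-in for a seen-store: a set of IDs persisted as JSON."""

    def __init__(self, path: Path) -> None:
        self.path = path
        # Load previously saved IDs if the file exists, otherwise start empty.
        self.seen = set(json.loads(path.read_text())) if path.exists() else set()

    def mark(self, item_id: str) -> None:
        self.seen.add(item_id)

    def save(self) -> None:
        self.path.write_text(json.dumps(sorted(self.seen)))


def persistence_roundtrip(directory: Path) -> bool:
    """Write one ID, reopen the store from disk, and report whether the ID survived."""
    path = directory / "seen.json"
    store = TinySeenStore(path)
    store.mark("article-1")
    store.save()
    return "article-1" in TinySeenStore(path).seen


def test_seen_store_persists() -> None:
    # pytest would normally supply tmp_path; tempfile keeps the sketch standalone.
    with tempfile.TemporaryDirectory() as tmp:
        assert persistence_roundtrip(Path(tmp))
```

A check like this stays fast because it touches only a temporary directory and does no network or model work, which is the property that makes it suitable for a smoke suite.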
Reviewed changes
Copilot reviewed 14 out of 14 changed files in this pull request and generated 4 comments.
| File | Description |
|---|---|
| `tests/smoke/test_smoke_suite.py` | New fast smoke tests for CLI/config/SeenStore. |
| `pyproject.toml` | Adds dev tools needed by CI (pytest-cov, pre-commit). |
| `.pre-commit-config.yaml` | Defines pre-commit hooks (basic hygiene + ruff/format). |
| `.pre-commit-ci.yaml` | Configures pre-commit.ci service behavior (autoupdate schedule, no autofix PRs). |
| `.github/workflows/smoke-tests.yml` | Runs the smoke suite on PRs and main pushes. |
| `.github/workflows/unit-tests.yml` | Runs unit tests on workflow_run after smoke completion. |
| `.github/workflows/integration-tests.yml` | Runs integration tests on workflow_run after smoke completion. |
| `.github/workflows/lint.yml` | Adds ruff format/lint + mypy workflow. |
| `.github/workflows/pre-commit-ci.yml` | Runs pre-commit hooks in GitHub Actions. |
| `.github/workflows/codecoverage.yml` | Runs unit+integration with coverage and uploads XML artifact. |
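A general caveat when gating on `workflow_run`: the event fires when the upstream workflow completes, whether it succeeded or failed, so the downstream workflow has to check the conclusion itself (in a workflow file this is usually an `if:` condition on `github.event.workflow_run.conclusion`). The sketch below expresses the same check in Python against a trimmed event payload. The field names `conclusion`, `head_sha`, and `head_branch` follow the GitHub `workflow_run` payload; the function name, the `expected_sha` guard, and the sample payloads are hypothetical.

```python
from typing import Any, Mapping, Optional


def should_run_gated_tests(event: Mapping[str, Any], expected_sha: Optional[str] = None) -> bool:
    """Return True only when the upstream (smoke) run finished successfully.

    `event` mimics the `workflow_run` webhook payload; `expected_sha` is an
    optional, hypothetical extra guard that the run belongs to a given commit.
    """
    run = event.get("workflow_run", {})
    # "completed" alone is not enough: the conclusion must be "success".
    if run.get("conclusion") != "success":
        return False
    if expected_sha is not None and run.get("head_sha") != expected_sha:
        return False
    return True


# Hypothetical sample payloads, trimmed to the fields used above.
passed = {"workflow_run": {"conclusion": "success", "head_sha": "abc123", "head_branch": "main"}}
failed = {"workflow_run": {"conclusion": "failure", "head_sha": "abc123", "head_branch": "main"}}
```

With these samples, `should_run_gated_tests(passed)` allows the heavier tests to run, while `should_run_gated_tests(failed)` does not.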
shaypal5 added a commit that referenced this pull request on May 21, 2026
Fixes all 16 issues raised in the post-merge review:

Critical:
- [#1] Orchestrator now checks config.enabled / config.mode at the top of evaluate_thin/evaluate_thick: mode=OFF or enabled=False returns a _noop_pass immediately without running any stage or writing telemetry; SHADOW mode downgrades drop→pass in _conclude while preserving stopped_at_stage for recall analysis; ENFORCE respects drops.
- [#2] Stage objects are only instantiated when config.stages.X.enabled is True; disabled stages are stored as None, preventing model-load cost for stages like C (embedding) and D (SLM) that aren't in use.
- [#3] Added @runtime_checkable StageEvaluator Protocol in models.py with uniform evaluate(candidate, pass_kind, body=None) signature; all four stage stubs (A–D) updated to that signature so the orchestrator calls them uniformly.
- [#4] Removed duplicate ThinOrThick alias from cascade.py; PassKind from models.py is the single source of truth.

Major:
- [#5] StoppedAt = StageName | Literal["passed_all"]; no longer a copy-paste of the four stage letters.
- [#6] PrefilterDecision.decided_at changed from str to datetime; telemetry writer converts to .isoformat() at the serialisation boundary; _path_for uses .strftime() directly on the datetime.
- [#7] StageScore.__post_init__ validates p_negative and threshold are both in [0.0, 1.0], raising ValueError for out-of-range values.
- [#8] Stage A re-run in evaluate_thick documented with an explicit Note in the docstring; thin-result passthrough deferred to a later PR.
- [#9] Test fixtures now use typed aliases (StageName, Verdict, StoppedAt, PassKind); all type: ignore[arg-type] comments removed from helpers.

Minor:
- [#10] flush() removed from PrefilterDecisionWriter.
- [#11] _path_for no longer has a try/except; the datetime param makes it unnecessary.
- [#12] "short" removed from _hash_config docstring.
- [#13] test_frozen uses pytest.raises(FrozenInstanceError) instead of try/except/else antipattern.
- [#14] PrefilterStatePaths converted from pydantic BaseModel to @dataclasses.dataclass(frozen=True), consistent with StageScore / PrefilterDecision.
- [#15] __init__.py now exports CandidateView, StageEvaluator, PrefilterStatePaths, resolve_prefilter_state_paths, PassKind, Verdict.
- [#16] cli.py summary command no longer hardcodes agents/news/local.yaml; prints an actionable error and exits 1 when --config is not supplied.

Tests: 54 → 78 (+24), all passing. New coverage: StageScore bounds validation (5 tests), StageEvaluator protocol conformance for all four stages (5 tests), type-alias smoke checks (4 tests), OFF-mode no-telemetry (2 tests), disabled-flag suppression (1), shadow/enforce telemetry (2), shadow downgrade with monkeypatched stage (1), enforce drop with monkeypatched stage (1), disabled-stages-not-instantiated (1), decided_at-is-datetime (2).

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
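The bounds validation described in [#7] can be sketched as a frozen dataclass whose `__post_init__` rejects out-of-range values. This is a minimal sketch under assumptions: only `p_negative` and `threshold` are named in the commit message, so the `stage` field and the exact error wording are hypothetical, and the real `StageScore` may carry more fields.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StageScore:
    """Sketch only: the real StageScore may have a different field set."""

    stage: str  # hypothetical field, included for illustration
    p_negative: float
    threshold: float

    def __post_init__(self) -> None:
        # Reject values outside the closed interval [0.0, 1.0].
        for name in ("p_negative", "threshold"):
            value = getattr(self, name)
            if not 0.0 <= value <= 1.0:
                raise ValueError(f"{name} must be in [0.0, 1.0], got {value!r}")


# Boundary values are accepted; anything outside the interval raises ValueError.
ok = StageScore(stage="A", p_negative=0.0, threshold=1.0)
```

Validating in `__post_init__` works with `frozen=True` because the check only reads fields and never assigns to them. In a pytest suite, the failing case would typically be asserted with `pytest.raises(ValueError)`, in line with the style adopted in [#13].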
shaypal5 added a commit that referenced this pull request on May 21, 2026
* feat(prefilter): LPF-PR-01 - prefilter package foundation, models, config, telemetry, no-op cascade

  Introduces the `denbust.prefilter` package (10 modules, 54 unit tests, 0 ruff/mypy errors):

  - `models.py`: `CandidateView` runtime-checkable Protocol, `StageScore` and `PrefilterDecision` frozen dataclasses with Literal-typed `StageName`, `PassKind`, `Verdict`, and `StoppedAt` fields.
  - `config.py`: `PrefilterMode(StrEnum)` (off/shadow/enforce), per-stage configs (`StageAConfig`–`StageDConfig`), `PrefilterStagesConfig`, `PrefilterRefreshConfig`, and `PrefilterConfig` root with `~`-expansion model_validator.
  - `state_paths.py`: `PrefilterStatePaths` pydantic model + `resolve_prefilter_state_paths()` anchoring artefacts under `<state_root>/<dataset>/<job>/prefilter/`.
  - `telemetry.py`: `PrefilterDecisionWriter` appending decisions to date-sharded `<decisions_dir>/YYYY-MM-DD.jsonl` files.
  - `cascade.py`: `CascadeOrchestrator` with `evaluate_thin()` / `evaluate_thick()`; always returns `verdict="pass"` stub; records every decision via the writer.
  - `stage_a.py`–`stage_d.py`: stub `evaluate()` methods returning `None` so the cascade always passes through; full implementations land in LPF-PR-03 through LPF-PR-07.
  - `cli.py`: `denbust prefilter summary` Typer command stub.
  - `__init__.py`: re-exports `CascadeOrchestrator`, `PrefilterConfig`, `PrefilterMode`, `PrefilterDecision`, `StageScore`.
  - `src/denbust/config.py`: adds `prefilter: PrefilterConfig` field to the root `Config`.
  - `src/denbust/cli.py`: registers `prefilter_app` under `denbust prefilter`.
  - `README.md`: retitles the cascade section to reflect active implementation.

  Cascade ships disabled (`mode: off`); no pipeline insertion in this PR. 54 unit tests covering protocol conformance, config validation, YAML round-trips, state-path resolution, JSONL telemetry, and cascade no-op behaviour.

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* docs(plan): mark LPF-PR-01 done, update last-merged-PR reference (#158)

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

* fix(prefilter): address code-review issues from PR #158 self-review

  Fixes all 16 issues raised in the post-merge review:

  Critical:
  - [#1] Orchestrator now checks config.enabled / config.mode at the top of evaluate_thin/evaluate_thick: mode=OFF or enabled=False returns a _noop_pass immediately without running any stage or writing telemetry; SHADOW mode downgrades drop→pass in _conclude while preserving stopped_at_stage for recall analysis; ENFORCE respects drops.
  - [#2] Stage objects are only instantiated when config.stages.X.enabled is True; disabled stages are stored as None, preventing model-load cost for stages like C (embedding) and D (SLM) that aren't in use.
  - [#3] Added @runtime_checkable StageEvaluator Protocol in models.py with uniform evaluate(candidate, pass_kind, body=None) signature; all four stage stubs (A–D) updated to that signature so the orchestrator calls them uniformly.
  - [#4] Removed duplicate ThinOrThick alias from cascade.py; PassKind from models.py is the single source of truth.

  Major:
  - [#5] StoppedAt = StageName | Literal["passed_all"]; no longer a copy-paste of the four stage letters.
  - [#6] PrefilterDecision.decided_at changed from str to datetime; telemetry writer converts to .isoformat() at the serialisation boundary; _path_for uses .strftime() directly on the datetime.
  - [#7] StageScore.__post_init__ validates p_negative and threshold are both in [0.0, 1.0], raising ValueError for out-of-range values.
  - [#8] Stage A re-run in evaluate_thick documented with an explicit Note in the docstring; thin-result passthrough deferred to a later PR.
  - [#9] Test fixtures now use typed aliases (StageName, Verdict, StoppedAt, PassKind); all type: ignore[arg-type] comments removed from helpers.

  Minor:
  - [#10] flush() removed from PrefilterDecisionWriter.
  - [#11] _path_for no longer has a try/except; the datetime param makes it unnecessary.
  - [#12] "short" removed from _hash_config docstring.
  - [#13] test_frozen uses pytest.raises(FrozenInstanceError) instead of try/except/else antipattern.
  - [#14] PrefilterStatePaths converted from pydantic BaseModel to @dataclasses.dataclass(frozen=True), consistent with StageScore / PrefilterDecision.
  - [#15] __init__.py now exports CandidateView, StageEvaluator, PrefilterStatePaths, resolve_prefilter_state_paths, PassKind, Verdict.
  - [#16] cli.py summary command no longer hardcodes agents/news/local.yaml; prints an actionable error and exits 1 when --config is not supplied.

  Tests: 54 → 78 (+24), all passing. New coverage: StageScore bounds validation (5 tests), StageEvaluator protocol conformance for all four stages (5 tests), type-alias smoke checks (4 tests), OFF-mode no-telemetry (2 tests), disabled-flag suppression (1), shadow/enforce telemetry (2), shadow downgrade with monkeypatched stage (1), enforce drop with monkeypatched stage (1), disabled-stages-not-instantiated (1), decided_at-is-datetime (2).

  Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

---------

Co-authored-by: Claude Sonnet 4.6 <noreply@anthropic.com>
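The date-sharded JSONL telemetry and the `decided_at` change in [#6] fit together roughly as in the sketch below: the shard name comes from `.strftime()` on the datetime, and the datetime is converted with `.isoformat()` only when the record is serialised. `Decision`, `path_for`, and `append_decision` are hypothetical, trimmed stand-ins written for this example; they are not the project's `PrefilterDecision` or `PrefilterDecisionWriter`.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from datetime import datetime, timezone
from pathlib import Path


@dataclass(frozen=True)
class Decision:
    """Hypothetical, trimmed stand-in for a prefilter decision record."""

    candidate_id: str
    verdict: str
    decided_at: datetime


def path_for(decisions_dir: Path, decided_at: datetime) -> Path:
    # One shard per calendar day, named directly from the datetime.
    return decisions_dir / f"{decided_at.strftime('%Y-%m-%d')}.jsonl"


def append_decision(decisions_dir: Path, decision: Decision) -> Path:
    """Append one JSON line to the shard matching the decision's date."""
    decisions_dir.mkdir(parents=True, exist_ok=True)
    record = asdict(decision)
    # datetime is not JSON-serialisable; convert at the serialisation boundary.
    record["decided_at"] = decision.decided_at.isoformat()
    target = path_for(decisions_dir, decision.decided_at)
    with target.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return target


def demo() -> tuple:
    """Write two decisions for the same day and return (shard name, line count)."""
    when = datetime(2026, 5, 21, 12, 0, tzinfo=timezone.utc)
    with tempfile.TemporaryDirectory() as tmp:
        shard = append_decision(Path(tmp) / "decisions", Decision("c1", "pass", when))
        append_decision(Path(tmp) / "decisions", Decision("c2", "pass", when))
        return shard.name, len(shard.read_text(encoding="utf-8").splitlines())
```

Keeping `decided_at` as a datetime inside the program means the path helper needs no string parsing, which is consistent with the try/except removal noted in [#11].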
Summary
- `tests/smoke` suite with fast checks for CLI/config/seen-store
- `smoke-tests` workflow to run the smoke suite directly
- `unit-tests` and `integration-tests`

Validation
- `PYTHONPATH=src pytest -q tests/smoke` (3 passed)

Notes
- `smoke-tests` runs on the same head SHA/branch