feat(exceptions): Layer-1 structured exception hierarchy with NR-* error codes #31
Merged
Every public SDK exception now inherits from NullRunError and carries
four actionable fields (error_code, user_action, retryable, docs_url)
plus an optional chained cause. Users get a stable, grep-able error
code (NR-A001, NR-B002, NR-R001, ...) and a short imperative
next-step hint instead of a free-form message string.
New specialized classes (back-compat subclasses of existing
user-facing classes, so existing except clauses keep matching):
* NullRunConfigError — config/initialization failures
* NullRunAuthError — invalid/missing API key (subclass of
NullRunAuthenticationError)
* NullRunBackendError — gateway 5xx (subclass of
NullRunTransportError, retryable=True)
* NullRunBudgetError — budget exhausted (subclass of
NullRunBlockedException)
* NullRunToolBlockedError — tool blocked by policy (subclass of
NullRunBlockedException)
Existing except handlers keep working: every new class is a subclass
of an existing one, so e.g. 'except NullRunBlockedException' still
catches NullRunBudgetError and NullRunToolBlockedError.
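The back-compat claim above can be illustrated with a minimal sketch. The classes below are stand-ins written for this example, not the SDK's actual definitions; the two error codes come from this PR, while the user_action strings are placeholders.

```python
# Stand-in classes for illustration only; the real ones live in the SDK.
class NullRunError(Exception):
    """Assumed base carrying the four structured fields as class defaults."""
    error_code = ""
    user_action = ""
    retryable = False
    docs_url = ""


class NullRunBlockedException(NullRunError):
    """Existing user-facing class that older code already catches."""


class NullRunBudgetError(NullRunBlockedException):
    error_code = "NR-X001"
    user_action = "placeholder: raise the budget or stop the workflow"


class NullRunToolBlockedError(NullRunBlockedException):
    error_code = "NR-T001"
    user_action = "placeholder: use an allowed tool or update the policy"


def handle(exc):
    """A pre-existing handler: it only knows about the older parent class."""
    try:
        raise exc
    except NullRunBlockedException as err:
        # The new subclasses still land here, now with structured fields.
        return f"{err.error_code}: {err.user_action}"


print(handle(NullRunBudgetError("budget exhausted")))
print(handle(NullRunToolBlockedError("tool blocked by policy")))
```

Because the handler names only the parent class, it needs no change to pick up the new, more specific exceptions.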
Tests: tests/test_exception_hierarchy.py pins the hierarchy shape
(class roots), the structured fields on every public class, and the
five back-compat invariants (subclass matching for the user-facing
exception trees, BaseException isolation for WorkflowKilledInterrupt).
Verified locally: pytest 880 passed / 13 skipped, ruff check src/
clean, mypy src/ clean.
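The BaseException isolation invariant can be sketched as follows. Both classes are stand-ins for this example; the sketch assumes WorkflowKilledException derives directly from BaseException, which is what the invariant implies but is not shown in this PR text.

```python
# Stand-ins for illustration; the inheritance shape is an assumption.
class WorkflowKilledException(BaseException):
    """Assumed to sit outside the Exception tree."""


class WorkflowKilledInterrupt(WorkflowKilledException):
    """Kill signal: should never be swallowed by a broad 'except Exception'."""


def classify(exc):
    """Report which handler ends up catching the raised object."""
    try:
        raise exc
    except Exception:
        return "caught by except Exception"
    except WorkflowKilledException:
        return "caught by except WorkflowKilledException"


print(classify(ValueError("ordinary error")))
print(classify(WorkflowKilledInterrupt("killed")))
```

A broad `except Exception` in user code therefore lets the kill signal through, while a handler that names WorkflowKilledException still sees it.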
maltsev-dev
pushed a commit
that referenced
this pull request
Jun 24, 2026
…ection
Builds on the Layer-1 structured exception hierarchy (PR #31). Three deliverables in this commit (1-3), plus the supporting test updates (4) and source wiring (5):
1) nullrun.observability package
- error_hooks.py: global hook registry with thread-safe register / unregister / dispatch. Multiple hooks fire in registration order. Hook exceptions are caught and logged at DEBUG, so a misbehaving hook cannot break the SDK. The has_hooks() short-circuit keeps the hot path zero-cost when nothing is registered.
- status.py: NullRunStatus dataclass (frozen) + RecentError ring buffer (capacity 10) + WorkflowState enum. State derivation covers four headline buckets: ok / degraded / offline / misconfigured. Per-instance state queries never mutate the runtime.
- observability.py is renamed into the package (__init__.py keeps the previous public surface).
2) nullrun public API additions
- on_error(hook): Layer 2 entry point. Documented as 'give the user a chance' to observe every structured failure before it propagates. Skipped for WorkflowKilledInterrupt (BaseException subclass): kill is a signal, not an error.
- status(): Layer 3 entry point. Returns a frozen NullRunStatus snapshot. Raises NullRunConfigError (NR-C004) if no runtime has been init()'d. Never lazily creates a runtime as a side effect (pinned by test_status_never_lazily_creates_runtime).
- Both are added to __all__ so they appear in dir(nullrun) for discoverability.
3) Docs: docs/errors/
- 15 per-code pages (NR-A001..A003, B001..B005, C001/C003, L001, R001, T001, W002/W003) plus README index. Each page documents the error_code, the trigger conditions, the user_action, and the retryable hint.
- docs/integration-baseline-2026-06-19.md: pinned baseline for the next integration run.
4) Test updates
- test_error_hooks.py: registry + dispatch + bypass tests (killed interrupt does not fire; one bad hook does not prevent later hooks; unregister is idempotent).
- test_status.py: no-runtime / with-runtime / state derivation / recent-errors ring buffer.
- test_integration_contract.py: track_event setdefault race pinned against the locked helper.
- test_dead_code_removed.py::test_dir_size_unchanged: now keys off nullrun.__all__ (the source of truth for the curated surface), so the curated-surface contract is pinned without hardcoding the symbol count.
5) Source wiring
- runtime.py: _emit_sdk_error / _emit_for_transport_error wire the new error_hooks.emit_error into the two SDK failure paths. The status() builder reads runtime state and feeds the recent-errors ring buffer.
- transport.py: failed batches emit NullRunBackendError (retryable=True) through the new path so retries surface the correlation_id in the ErrorContext.
- decorators.py: @Protect catches the structured NullRunBlockedException family and emits with stage='tool' so a hook can attribute the failure to the right gate.
Verified locally on Windows / Python 3.14.2:
- pytest: 926 passed, 13 skipped
- ruff check: clean on src/ and tests/
- mypy src/: clean on 26 source files
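The hook-registry behaviour described in this commit (registration order, a thread-safe registry, hook failures logged at DEBUG, a cheap has_hooks() check) could look roughly like the sketch below. The function names follow the commit message, but the signatures and internals are assumptions made for illustration, not the contents of error_hooks.py.

```python
import logging
import threading

logger = logging.getLogger("hook_sketch")  # hypothetical logger name

_lock = threading.Lock()
_hooks = []  # list order is registration order, which is dispatch order


def register(hook):
    with _lock:
        _hooks.append(hook)


def unregister(hook):
    # Idempotent: removing a hook that is not registered is a no-op.
    with _lock:
        if hook in _hooks:
            _hooks.remove(hook)


def has_hooks():
    # Cheap check so callers can skip dispatch work when nothing is registered.
    return bool(_hooks)


def emit_error(error):
    if not has_hooks():
        return
    with _lock:
        snapshot = list(_hooks)  # call hooks outside the lock
    for hook in snapshot:
        try:
            hook(error)
        except Exception:
            # A misbehaving hook is logged and skipped; later hooks still run.
            logger.debug("error hook %r failed", hook, exc_info=True)
```

Copying the hook list before dispatch means a hook may register or unregister other hooks without deadlocking on the registry lock.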
maltsev-dev
added a commit
that referenced
this pull request
Jun 24, 2026
…ection (#32)
Co-authored-by: Anatolii <anatolii@nullrun.io>
maltsev-dev
added a commit
that referenced
this pull request
Jun 24, 2026
Bump version 0.6.0 → 0.6.1. This release lands all three layers
of the 'give the user a chance' design on top of the 0.6.0 P0
hardening pass:
* Layer 1 — structured exception hierarchy. Every public SDK
exception inherits from NullRunError and carries
error_code / user_action / retryable / docs_url / cause.
Five new typed classes (NullRunConfigError, NullRunAuthError,
NullRunBackendError, NullRunBudgetError, NullRunToolBlockedError)
are subclasses of the existing user-facing classes, so every
'except' clause from 0.6.0 keeps matching.
* Layer 2 — nullrun.on_error() global error hook. Fires for
every structured NullRunError before the exception
propagates. Skipped for WorkflowKilledInterrupt (BaseException
subclass — kill is a signal, not an error). Multiple hooks
fire in registration order; hook exceptions are caught and
logged at DEBUG. has_hooks() short-circuit keeps the hot
path zero-cost when no hook is registered.
* Layer 3 — nullrun.status() introspection. Synchronous,
thread-safe, side-effect-free snapshot of runtime state.
Returns a frozen NullRunStatus dataclass with one of four
headline states (ok / degraded / offline / misconfigured).
Raises NullRunConfigError (NR-C004) if no runtime has been
init()'d — never lazily creates a runtime as a side effect.
Per-code docs in docs/errors/ (15 pages + README index).
New tests pin the hierarchy, the hook semantics, the snapshot
fields, and the recent-errors ring buffer.
TestPyPI: the previous 0.6.0 (uploaded 2026-06-23, before
#31 and #32 landed) is yanked separately so the new 0.6.1
wheel can be uploaded. The yank is a TestPyPI-side action;
it does not change the source tree.
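The Layer 3 behaviour (a frozen snapshot, a ring buffer of recent errors, and a hard failure instead of lazy runtime creation) can be sketched as below. All names except the NR-C004 code and the four state labels are hypothetical, and the real NullRunStatus fields are not shown in this PR text.

```python
from collections import deque
from dataclasses import dataclass
from enum import Enum
from typing import Optional, Tuple


class NullRunConfigError(Exception):
    """Stand-in for the SDK class; only the code is taken from the release notes."""
    error_code = "NR-C004"


class HeadlineState(Enum):  # hypothetical name for the four headline states
    OK = "ok"
    DEGRADED = "degraded"
    OFFLINE = "offline"
    MISCONFIGURED = "misconfigured"


@dataclass(frozen=True)
class StatusSnapshot:  # hypothetical stand-in for NullRunStatus
    state: HeadlineState
    recent_errors: Tuple[str, ...]


class _Runtime:
    def __init__(self) -> None:
        self.state = HeadlineState.OK
        # deque(maxlen=10) silently drops the oldest entry once it is full.
        self.recent_errors = deque(maxlen=10)


_runtime: Optional[_Runtime] = None


def init() -> _Runtime:
    global _runtime
    _runtime = _Runtime()
    return _runtime


def status() -> StatusSnapshot:
    # Never create a runtime here: a missing runtime is reported, not repaired.
    if _runtime is None:
        raise NullRunConfigError("no runtime: call init() first")
    # Copy into a tuple so the snapshot cannot change after it is returned.
    return StatusSnapshot(_runtime.state, tuple(_runtime.recent_errors))
```

Returning an immutable copy keeps the call side-effect-free: later errors recorded by the runtime do not alter a snapshot that was already handed out.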
Summary
Introduces a Layer-1 structured exception hierarchy: every public
SDK exception now inherits from NullRunError and carries four
actionable fields (error_code, user_action, retryable, docs_url)
plus an optional chained cause. Users get a stable, grep-able code
(e.g. NR-A001, NR-B002, NR-R001) and a short imperative next-step
hint instead of a free-form message string.
What changes
* New base class: NullRunError(BreakerError) with the four
  structured fields and the __init__ per-instance override pattern,
  so subclass defaults don't leak across siblings.
* New specialized classes (each is a subclass of the existing
  user-facing class it refines, so existing except clauses keep
  matching):

| Class | Subclass of | error_code | retryable |
| --- | --- | --- | --- |
| NullRunConfigError | NullRunError | NR-C001 | |
| NullRunAuthError | NullRunAuthenticationError | NR-A001 | |
| NullRunBackendError | NullRunTransportError | NR-B002 | True |
| NullRunBudgetError | NullRunBlockedException | NR-X001 | |
| NullRunToolBlockedError | NullRunBlockedException | NR-T001 | |

* Runtime / Decorators / Transport: raise the new typed exceptions
  where the code path used to raise plain strings or the generic
  base. Transport continues to map the gateway 5xx envelope to
  NullRunBackendError so the retryable hint propagates cleanly.
* Public re-exports: src/nullrun/__init__.py exposes the new classes
  so cookbook examples and external code can
  except NullRunBudgetError directly.
Back-compat invariants (pinned by tests)
* except NullRunAuthenticationError still catches NullRunAuthError
* except NullRunBlockedException still catches NullRunBudgetError
  and NullRunToolBlockedError
* except NullRunTransportError still catches NullRunBackendError
  and RateLimitError
* except WorkflowKilledException still catches
  WorkflowKilledInterrupt (BaseException inheritance preserved)
* except Exception does not catch WorkflowKilledInterrupt
Tests
New file: tests/test_exception_hierarchy.py (258 lines) covers:
* every public class is a NullRunError with a non-empty user_action
  and a stable error_code matching the NR-LETTERNNN pattern
CI status (local verification on Windows / Python 3.14.2)

| Check | Result |
| --- | --- |
| pytest | 880 passed / 13 skipped |
| ruff check src/ | clean |
| mypy src/ | clean |

Note: ruff format --check for the whole repo fails on 79 files,
but this is preexisting on master (verified with git stash: the
same 79 files fail before this change) and is not part of the
CI workflow. A separate repo-wide format cleanup can land in a
dedicated commit.
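The NR-LETTERNNN check mentioned under Tests can be approximated with a small regular expression. This sketch reads the pattern as "NR-", one uppercase ASCII letter, then exactly three digits; that reading is inferred from the codes quoted in this PR and is not taken from the test file itself.

```python
import re

# Assumed reading of NR-LETTERNNN: one uppercase letter, then three digits.
ERROR_CODE_RE = re.compile(r"NR-[A-Z]\d{3}")


def is_valid_error_code(code: str) -> bool:
    """Return True when the whole string matches the assumed code shape."""
    return ERROR_CODE_RE.fullmatch(code) is not None


# Codes that appear in this PR and its follow-up commits.
for code in ("NR-A001", "NR-B002", "NR-R001", "NR-C004", "NR-X001", "NR-T001"):
    print(code, is_valid_error_code(code))
```

Using fullmatch rather than match rejects trailing characters, so a code such as "NR-A0011" does not pass.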
Files
* src/nullrun/__init__.py: re-exports
* src/nullrun/breaker/exceptions.py: new base + 5 new classes
* src/nullrun/decorators.py: typed raises
* src/nullrun/runtime.py: typed raises
* src/nullrun/transport.py: typed raises + retryable mapping
* tests/test_exception_hierarchy.py: new
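One way to read the "per-instance override pattern" named under What changes is sketched below: class attributes hold the defaults, and __init__ writes any override onto the instance only. This is an assumption about the approach, not the code in src/nullrun/breaker/exceptions.py; the stand-in base derives from Exception rather than BreakerError, and the keyword names are hypothetical.

```python
class NullRunError(Exception):
    # Class-level defaults; subclasses override these as class attributes.
    error_code = ""
    user_action = ""
    retryable = False
    docs_url = ""

    def __init__(self, message="", *, error_code=None, user_action=None,
                 retryable=None, docs_url=None, cause=None):
        super().__init__(message)
        # Overrides are set on the instance, never on the class, so one
        # call site cannot change the defaults seen by sibling classes.
        if error_code is not None:
            self.error_code = error_code
        if user_action is not None:
            self.user_action = user_action
        if retryable is not None:
            self.retryable = retryable
        if docs_url is not None:
            self.docs_url = docs_url
        self.cause = cause  # optional chained cause


class NullRunTransportError(NullRunError):
    pass


class NullRunBackendError(NullRunTransportError):
    error_code = "NR-B002"
    retryable = True


one_off = NullRunBackendError("gateway 503", retryable=False)
print(one_off.retryable, NullRunBackendError("gateway 502").retryable)
```

The instance created with retryable=False reports False, while a fresh NullRunBackendError and the parent classes keep their own defaults.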