Fix full-codebase review findings: hermetic tests, py.typed, CLI error handling, installer hardening, CI#129
Merged
Merged
Conversation
Tests / safety: - Stub MCP/skills installs in the four install-flow tests that executed real `npx add-mcp` / `npx skills add` against $HOME - Make the wheel-content test build from a temp copy instead of deleting the repo's build/ dir; stringify API fields before Rich-escaping in `scouts list` (non-string values no longer crash the CLI) SDK: - handle_response: treat 3xx as an error (redirects are not followed); include status + server body in 401/403 AuthenticationError messages - apply_chat_extra_body: stop mutating the caller's extra_body dict - Strip resolved API keys (trailing-newline keys no longer fail deep in httpx with LocalProtocolError) - Auth flow: detect EADDRINUSE by errno; report the orphaned key when save_config fails after key generation; YUTORI_API_BASE_URL override - __version__ fallback 0.1.0 -> 0.0.0+unknown yutori.n1 compat: - Shims now alias full module contents (non-__all__ names importable again) and the package exposes submodules as attributes Navigator: - Payload trimming converges on multi-image messages - Replay: survive valid-JSON non-object tool arguments, escape screenshot URLs in generated HTML, keep the pending tool screenshot across text-only user interjections - _assets: chained joinpath for py3.9 zip-import compatibility CLI: - Friendly errors (exit 1, no traceback) for AuthenticationError / APIError / httpx errors across data commands, with AUTH_FAILURE_MARKERS kept in sync - browse/research run and scouts create exit 1 when the API reports status=failed - Escape user/API/subprocess text rendered through Rich markup - Drop the dead auth-URL reprint after login failure; treat "N/A" task id as missing in verification; Windows venv Scripts/ path; Ctrl+C during verification polling now renders the summary and honors the exit-code contract Packaging / installer / CI: - Ship yutori/py.typed (Typing :: Typed was previously a no-op for consumers); setuptools>=77 floor to match the PEP 639 license syntax; MANIFEST.in grafts tests/ so the sdist suite can collect - install.sh regenerated (0.7.8 header); frame render dir uses mktemp -d (predictable /tmp path TOCTOU); uninstall.sh accepts "yes"; pre-commit freshness hook now also triggers on pyproject.toml - New CI workflow (ruff + pytest on 3.9-3.13 + wheel content test + twine check + install.sh freshness); publish workflow gates on tests and twine check; release action pinned to a commit SHA - Add __future__ imports to two test files whose PEP 604 annotations would crash collection on Python 3.9 Cleanup: internal tracker/hostname references removed from comments, stale redirected-stdout-guard comments and no-op isatty patches removed, examples use NAVIGATOR_N1_MODEL, api.md documents the new exit codes. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
SC2016 is info-level and flags the intentional literal backticks in uninstall.sh's single-quoted grep pattern and user-facing message. The test always skipped locally (no shellcheck) and only started running on the new CI's ubuntu runners, which ship shellcheck preinstalled. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Error handling (silent-failure review): - handle_response: a 2xx non-JSON body (captive portal / SSO gateway) now raises APIError instead of escaping as a bare JSONDecodeError that bypassed cli_api_errors; 3xx errors name the redirect Location; 401/403 body detail truncated to 500 chars - print_task_submission_result: a non-failed create response without a task_id now fails (red banner, exit 1) instead of printing a green "submitted" banner for an unpollable task; the installer's "N/A" guard becomes defense-in-depth - cli_api_errors: auth recovery hint tailored to the rejected key's source (env var vs saved config vs none) - run_login_flow: non-EADDRINUSE bind errors name the callback server host:port; orphaned-key message points at the settings page - yutori.n1: PEP 562 __dir__ so lazy submodules stay discoverable Comment accuracy (comment review): - chat namespace docstrings no longer suggest passing max_retries through create() (not accepted there); retry policy documented as not configurable via the SDK surface - credentials: trailing-newline example replaced ($(cat file) strips trailing newlines and cannot reproduce the failure) - n1 __init__ comment no longer overstates pre-rename eager imports or hardcodes the shim count; payload trimming comments/docstring state the per-phase single-strip constraint and best-effort protection - release action pin annotated with machine-readable trailing # v2.6.2 Test gaps (test-coverage review): - test_port_in_use now constructs OSError with errno so the EADDRINUSE branch is actually exercised; companion fallthrough test - scouts create failed-status exit code, missing-task-id failure, installer prompt/summary markup escaping, Windows Scripts venv branch (via an os proxy), YUTORI_API_BASE_URL override (module reload), new "invalid api key" auth marker, full n1 submodule set + dir(), positive escaped-URL assertion in the replay XSS test, mktemp -d and yes-regex pins for the shell scripts, and sdist content assertions (conftest/helpers/py.typed) in the packaging test Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
Quality pass over the review-fix PR (reuse / simplification / efficiency / altitude), no behavior changes: - CLI: escape-by-default in the shared print helpers (safe_str) — print_optional_field/print_aligned_fields/print_creation_result now escape values unconditionally, closing the six remaining unescaped API-value sites and making the next call site safe by default; the escape_value flag is gone, call sites pass raw values - CLI: cli_client() bundles cli_api_errors + get_authenticated_client so a new command cannot take the client without the error handling; test patches retarget to the single package-level symbol - _auth_recovery_hint uses the public get_auth_status().source instead of re-implementing source detection via a private credentials helper - n1 shims: install_shim() derives the target from the shim's own __name__ (13 files shrink to two lines each; a copy-pasted shim can no longer alias the wrong module); _SUBMODULES is derived from the files on disk via pkgutil, dropping the hand-list and the magic-13 test assertion - tests: a module-level autouse fixture in test_install_flow.py stubs the MCP/skills installs for every test — the dangerous npx path is now unreachable by default instead of guarded by per-test boilerplate (4 copies removed); direct-exercise tests are unaffected (import-time bindings) - test_packaging builds once via plain `python -m build` (sdist + wheel-from-sdist) instead of two separate invocations — halves the package-job build work and covers the more realistic path - test_http: non-JSON/redirect tests folded into TestHandleResponse; dead _all_scripts helper removed from test_bash_scripts - comments: N/A guard documented as a version-skew defense, APIError docstring covers unusable responses, alias_module_contents documents copy (non-aliasing) monkeypatch semantics Skipped (noted, not bugs): exit-code contract to replace AUTH_FAILURE_MARKERS grepping (behavior change — follow-up), payload incremental size accounting (pre-existing structure), negligible micro-costs on error/debug paths. Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Fixes every issue from a full-codebase review (Claude Code review with four area subagents, cross-reviewed by Codex — including the two issues Codex found that the original review missed, and one issue Codex found in this PR's own escaping path).
Highest impact
__install_flowtests ran unmockednpx add-mcp -y -g .../npx skills add ... -g, rewriting~/.cursor/mcp.json,~/.codex/config.toml,~/.gemini/settings.jsonon any machine (or CI runner) withnpx. Now stubbed like their sibling tests; full suite runs in ~4s with zero$HOMEwrites.py.typedwas missing despite theTyping :: Typedclassifier — type checkers ignored every annotation in the installed package. Added and asserted in the wheel-content test.install.shwas stale (0.7.7header at0.7.8) — andyutori.com/install.shserves this file, so it was live. Regenerated; the pre-commit freshness hook now also triggers onpyproject.toml, and new CI enforces it.twine check, installer freshness +bash -n); the publish workflow now runs tests andtwine checkbefore upload, and the release-asset action is pinned to a commit SHA.Behavior changes
yutori browse run/research run/scouts createnow exit 1 when the API reportsstatus: failed(previously exit 0 — scripts treated rejected tasks as success). Documented in api.md.AuthenticationError/APIError/ network failures instead of multi-screen tracebacks; rejected keys get "Run 'yutori auth login'" guidance.AUTH_FAILURE_MARKERSupdated in lockstep.APIErrorinstead of silently returning{}.export YUTORI_API_KEY=$(cat key)) no longer fails every request deep in httpx.Correctness fixes
watch [/b] releasescrashedscouts listwithMarkupError;[beta]-style tokens were silently deleted; the installer summary could crash after a successful install. All user/API/subprocess text rendered through Rich is now escaped (and stringified first — non-string API values also crashed).yutori.n1compat shims:yutori.n1.payloadattribute access andfrom yutori.n1.page_ready import SupportsAsyncPageReadywork again (star-import shims dropped submodule attributes and non-__all__names).max_bytesno matter how many passes ran; trimming now drains multi-image messages (and extra images sharing the last message).apply_chat_extra_bodyno longer mutates the caller'sextra_bodydict; install-flow Ctrl+C during verification polling renders the summary and honors the documented exit-code contract;Task ID: N/Ais treated as missing instead of pollingbrowse get N/A; Windows venvs resolveScripts\python.exe;EADDRINUSEdetected by errno; failedsave_configafter key generation reports the orphaned key.Packaging / installer
setuptools>=77build floor (PEP 639 stringlicensefails validation on 67–76).MANIFEST.ingraftstests/so the sdist suite can collect (it shippedtest_*.pywithoutconftest.py/helpers).mktemp -d(predictable/tmppath was symlink-clobberable on shared Linux);uninstall.shacceptsyes, not justy.__version__fallback is0.0.0+unknowninstead of a plausible-but-wrong0.1.0.Cleanup
Internal tracker/hostname references removed from comments; stale comments about a removed stdout guard (and their no-op
isattypatches) removed; examples use the canonicalNAVIGATOR_N1_MODEL; two test files gained__future__imports their PEP 604 annotations needed on 3.9; the packaging test builds from a temp copy instead of deleting the repo'sbuild/.Verification
ruff check .clean;bash -non all installer scriptsyutori/py.typed+ all 8 JS assets present; sdist inspected: full test suite included;twine checkPASSED on bothinstall.shregeneration verified idempotent; pinned action SHA verified upstreamNot included (needs external setup): OIDC trusted publishing requires configuring the publisher on PyPI first; the workflow keeps the token until then.
🤖 Generated with Claude Code
Note
Medium Risk
CLI exit-code changes can break scripts that assumed success on rejected creates; auth and HTTP error handling changes affect all API clients, while CI/publish gates reduce the risk of shipping broken releases.
Overview
This PR bundles review-driven fixes across CI/publish, CLI/SDK behavior, packaging, and shell installers, bumping to 0.7.8 with a regenerated
install.sh.Automation: New CI runs ruff, pytest on Python 3.9–3.13, slow wheel/sdist checks,
twine check, andinstall.shfreshness +bash -n. Publish now runs tests and metadata validation before upload; the GitHub release action is pinned to a commit SHA. Pre-commit also retriggersinstall.shregen whenpyproject.tomlversion changes.CLI (breaking for scripts):
browse run,research run, andscouts createexit 1 on APIstatus: failedor missing task ID. Commands use sharedcli_client/safe_strso API, auth, and network errors print short messages (no tracebacks) and user/API strings don’t break Rich markup.SDK/auth: Keys are stripped at resolve time; 401/403 include status and body; 3xx and 2xx non-JSON raise
APIErrorinstead of{}. OAuth bind errors use errno; failedsave_configafter key creation reports an orphaned key. Auth API base URL respectsYUTORI_API_BASE_URL.Navigator / compat: Payload trimming drains multi-image messages; replay HTML escapes screenshot URLs and keeps tool screenshots when a text-only user message follows.
yutori.n1shims useinstall_shim/ lazy submodules so submodule attributes match pre-rename imports; examples useNAVIGATOR_N1_MODEL.Packaging/installer: Ships
py.typed,MANIFEST.ingrafts fulltests/,setuptools>=77. Installer frame cache usesmktemp -d;uninstall.shacceptsyes. Install-flow tests stub MCPnpxinstalls; verification handles Ctrl+C,Task ID: N/A, and Windows venvScripts\python.exe.Reviewed by Cursor Bugbot for commit 0b1e0ba. Bugbot is set up for automated code reviews on this repo. Configure here.