Bump models (GLM-5.2 / MiniMax-M3 / DeepSeek-V4-Pro), add route preference, + GitHub Pages KB site by jimstratus · Pull Request #23 · jimstratus/argus

jimstratus · 2026-06-23T07:50:14Z

Summary

Updates reviewer models to the latest versions, adds an easy OpenRouter↔direct-API routing switch, and (incoming) a hosted GitHub Pages knowledge-base site.

Models

glm-5.1 → glm-5.2 (z.ai direct + OpenRouter dual route)
minimax-m2.7 → minimax-m3 (MiniMax direct + OpenRouter dual route)
DeepSeek now defaults to deepseek-v4-pro (V3.2 demoted to custom_only, kept for benchmark-history continuity)
New deepseek aichat client (api.deepseek.com) for the direct route
Profiles repointed; new direct profile (direct-API subs only, no Gemini)

Routing — OpenRouter vs direct API

A single knob decides which provider a dual-route reviewer (glm-5.2, minimax-m3, deepseek-v4-pro) tries first; the other becomes the automatic fallback.
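A minimal sketch of the reordering described above, assuming each reviewer's routes are an ordered list of route kinds such as `"direct"`, `"openrouter"`, or `"cli"`. The name `resolve_routes_sketch` and its signature are hypothetical; the real resolver is `_common.resolve_routes` and may be shaped differently.

```python
from typing import List

# Route kinds that take part in preference reordering (assumed labels).
DUAL_ROUTE_KINDS = {"direct", "openrouter"}


def resolve_routes_sketch(routes: List[str], preference: str) -> List[str]:
    """Return the routes in try-order for one reviewer.

    Only a pure {direct, openrouter} pair is reordered. Any other shape,
    for example a CLI primary with an OpenRouter fallback, is returned
    in its configured order.
    """
    if preference not in DUAL_ROUTE_KINDS:
        raise ValueError(f"unknown route preference: {preference!r}")

    # Not a dual-route reviewer: leave the configured order alone.
    if len(routes) != 2 or set(routes) != DUAL_ROUTE_KINDS:
        return list(routes)

    # Preferred provider first; the other becomes the automatic fallback.
    fallback = "openrouter" if preference == "direct" else "direct"
    return [preference, fallback]
```

With this shape, a reviewer configured as `["direct", "openrouter"]` resolves to `["openrouter", "direct"]` under the `openrouter` preference, while `["cli", "openrouter"]` comes back unchanged under either preference.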

defaults.route_preference: openrouter (public default — one OPENROUTER_API_KEY covers most reviewers) | direct
Per-run override: --route-pref {openrouter,direct}, shorthands --prefer-direct / --prefer-openrouter, or ARGUS_ROUTE_PREF env. Precedence: CLI flag › env › config
Wired through dispatch.py, verify.py, benchmark.py, estimate_cost.py via the single resolver _common.resolve_routes / resolve_route_preference
CLI reviewers are never reordered — their paid CLI sub stays primary and OpenRouter remains a true fallback
OR-balance pre-flight only fires when OpenRouter is the resolved primary, so a direct-preference run with a depleted OR balance isn't gated
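The precedence rule in the list above (CLI flag, then env, then config) can be sketched as follows. This is an illustration under assumptions: the function name, its parameters, the nested-dict config shape, and the validation behavior are all hypothetical and are not taken from `_common.resolve_route_preference`.

```python
from typing import Mapping, Optional

VALID_PREFERENCES = ("openrouter", "direct")


def resolve_route_preference_sketch(
    cli_value: Optional[str],
    env: Mapping[str, str],
    config: Mapping[str, Mapping[str, str]],
) -> str:
    """Pick the route preference: CLI flag first, then env, then config."""
    candidates = [
        ("CLI flag", cli_value),
        ("ARGUS_ROUTE_PREF", env.get("ARGUS_ROUTE_PREF")),
        ("config", config.get("defaults", {}).get("route_preference")),
    ]
    for source, value in candidates:
        if not value:
            continue  # unset or empty: fall through to the next source
        normalized = value.strip().lower()
        if normalized not in VALID_PREFERENCES:
            raise ValueError(f"{source}: invalid route preference {value!r}")
        return normalized
    # Nothing set anywhere: fall back to the documented public default.
    return "openrouter"
```

In a script this would typically be called with the parsed `--route-pref` value (or `None`), `os.environ`, and the loaded config.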

This covers the "depleted OR balance → fall back to direct API for all but Gemini (skipped)" workflow: set route_preference: direct (or --prefer-direct) and use the direct profile.

Docs

README / SKILL / CLAUDE / DEVELOPMENT / CONTRIBUTING updated (routing section, env vars incl. DEEPSEEK_API_KEY, flags, profiles, dual-route guidance)
GitHub Pages KB site under docs/ — responsive, collapsible sidebar (mobile-collapsed / desktop-open), dark/light toggle (system default + persisted override), Mermaid diagrams, benchmark chart, onboarding (beginner + advanced), contributing/FAQ/glossary, per-page "View as Markdown" links, llms.txt, plus a Pages deploy workflow. (Landing in a follow-up commit on this branch.)

Tests

New tests/test_routes.py (route classification, preference reordering, CLI-never-reordered invariant, precedence)
Full suite green (37 passed), py_compile clean, config validates (15 reviewers, 8 profiles)

Note: live provider dispatch isn't exercised in CI (no network/keys); model slugs for the bumped versions follow the repo's existing forward-dated convention and may need a verify.py pass against live endpoints.

🤖 Generated with Claude Code

… route preference

Update reviewer model versions and add a single knob to choose OpenRouter vs each provider's own API for dual-route reviewers.

Models:
- glm-5.1 -> glm-5.2 (z.ai direct + OR dual route)
- minimax-m2.7 -> minimax-m3 (MiniMax direct + OR dual route)
- deepseek-v4-pro is now the default DeepSeek (v3.2 demoted to custom_only); add api.deepseek.com direct route + new `deepseek` aichat client
- profiles repointed; new `direct` profile (subs only, no Gemini)

Routing:
- defaults.route_preference: openrouter (public default) | direct
- _common.resolve_routes / resolve_route_preference reorder the {direct, openrouter} pair only; CLI reviewers are never reordered
- --route-pref / --prefer-direct / --prefer-openrouter on dispatch/verify/benchmark/estimate_cost, plus ARGUS_ROUTE_PREF env (precedence: CLI flag > env > config)
- OR-balance pre-flight only fires when OR is the resolved primary

Docs (README/SKILL/CLAUDE/DEVELOPMENT/CONTRIBUTING) updated; new tests/test_routes.py (resolve_routes + preference precedence).

Constraint: keep paid CLI subs primary; OR stays a true fallback for them
Confidence: high
Scope-risk: moderate
Not-tested: live provider dispatch (no network/keys in CI)
Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011HG8Axfxjf6rv2mKKDEMxv

Static KB site under docs/, generated by a stdlib-only Python generator (docs/build.py + docs/build_content.py).

- Pages: Home, Getting Started (beginner + advanced), Configuration, Reviewers, Profiles, Architecture, Benchmarks, Contributing, FAQ, Glossary, 404
- Collapsible sidebar (mobile-collapsed / desktop-open, persisted, mobile overlay + backdrop), dark/light toggle (system default + persisted override, no flash)
- Mermaid diagrams (pipeline, dispatch sequence, merge, cost gate, history.db ERD) + Chart.js benchmark leaderboard, both re-theme on toggle
- Per-page "View as Markdown / on GitHub" links, copy buttons, llms.txt, meta/OG tags, .nojekyll
- .github/workflows/pages.yml: build docs/ -> upload-pages-artifact -> deploy-pages (pages: write, id-token: write; push to master/main + workflow_dispatch)

Reflects the bumped models + route_preference feature from the prior commit.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011HG8Axfxjf6rv2mKKDEMxv

Copilot

Copilot was unable to review this pull request because the user who requested the review has reached their quota limit.

…red OR-primary helper

Addresses two findings from a self code-review of this branch:

1. hermes-4.3 is also a {direct, openrouter} dual-route reviewer and is reordered by route_preference; the docstring + README + KB site claimed only glm-5.2/minimax-m3/deepseek-v4-pro. Behavior was already correct (and benign: under the openrouter default it now skips a guaranteed-fail nous-direct attempt); docs corrected to match.
2. Deduplicate the 3-line "is the resolved primary OpenRouter?" helper that was copy-pasted in benchmark.py and estimate_cost.py into _common.primary_is_openrouter (single source of truth).

No behavior change; 37 tests still pass.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011HG8Axfxjf6rv2mKKDEMxv

jimstratus · 2026-06-23T08:03:15Z

🔍 Self code-review (round 1)

Copilot review is unavailable (quota), so this is a manual self-review of the branch diff — covered correctness (line-by-line, removed-behavior, cross-file), reuse/simplification/efficiency, altitude, and CLAUDE.md conventions.

No correctness bugs. The route-resolution refactor is sound — the fallback flow is preserved in all four scripts (dispatch/verify/benchmark/estimate_cost), CLI reviewers correctly never reorder, and routing is centralized in one resolver (_common.resolve_routes), matching the repo's "single source of truth" convention.

Findings & fixes (both addressed in `c978931`)

| # | Severity | Finding | Fix |
|---|----------|---------|-----|
| 1 | low (accuracy) | `hermes-4.3` is actually a 4th dual-route reviewer (Nous-direct ↔ OpenRouter), so it's reordered by `route_preference` too, but the docstring, README, and KB site claimed only `glm-5.2`/`minimax-m3`/`deepseek-v4-pro`. Behavior was already correct and benign (under the `openrouter` default it now skips a guaranteed-fail nous-direct attempt, since `NOUSRESEARCH_API_KEY` is unset). | Corrected the `resolve_routes` docstring, README routing section, and `docs/` site content to list `hermes-4.3` (custom-only) as dual-route. |
| 2 | low (duplication) | The 3-line "is the resolved primary OpenRouter?" check was copy-pasted into `benchmark.py` and `estimate_cost.py`. | Extracted to `_common.primary_is_openrouter()` and called from both. |
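For finding 2, a shared helper of roughly this shape would replace the duplicated check. This is a sketch only: it assumes the resolved routes are an ordered sequence of route-kind strings, and both function names and signatures below are hypothetical stand-ins, not the actual `_common.primary_is_openrouter()`.

```python
from typing import Sequence


def primary_is_openrouter_sketch(resolved_routes: Sequence[str]) -> bool:
    """True when the first route to be tried is OpenRouter."""
    return bool(resolved_routes) and resolved_routes[0] == "openrouter"


def needs_or_balance_preflight(resolved_routes: Sequence[str]) -> bool:
    """Gate the OR-balance pre-flight on OpenRouter being the primary.

    A direct-preference run keeps OpenRouter only as a fallback, so a
    depleted OR balance should not block it.
    """
    return primary_is_openrouter_sketch(resolved_routes)
```

Keeping the check in one place means `benchmark.py` and `estimate_cost.py` cannot drift apart on what counts as an OpenRouter-primary run.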

No behavior change from the fixes. 37 unit tests pass, the site rebuilds cleanly, and there are no stale model names in docs/.

CI (lint-and-test 3.12/3.13 + fixture-integrity) is green on the prior commit and re-running on c978931.

(The copilot-pull-request-reviewer check showing "failure" is the unavailable Copilot bot, not this repo's CI.)

The deploy workflow uploaded the whole docs/ folder, so build.py, build_content.py, and a CI-generated __pycache__ were served as static files. Stage the rendered output into _site/ (HTML + assets + .nojekyll + llms.txt) and drop *.py / __pycache__ before upload, using portable cp (rsync isn't guaranteed on the runner). Published tree is now just the site.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011HG8Axfxjf6rv2mKKDEMxv

The Pages workflow stages the rendered site into _site/ before upload; ignore it so a local run of the staging steps can't be committed by accident.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011HG8Axfxjf6rv2mKKDEMxv

jimstratus · 2026-06-23T10:07:12Z

🔍 Self code-review — Pages-artifact tidy (`4ea1d7a`)

Focused review of the workflow change that slims the published tree (earlier rounds covered the scripts/config/site).

No correctness bugs. The staging step is sound and was verified locally with the same commands:

cp -r docs/. _site/ copies contents including dotfiles, so .nojekyll is preserved.
rm -f _site/*.py + rm -rf _site/__pycache__ drop the generator sources; rm -f is a no-op if the glob doesn't match.
Resulting published tree = 11 HTML pages + assets/ + .nojekyll + llms.txt, and no build.py / build_content.py / __pycache__.
Portable cp (not rsync, which isn't guaranteed on the runner); permissions / concurrency unchanged.
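The workflow itself uses the shell commands listed above. For checking the resulting tree locally, the same staging can be approximated in Python as below. This is an illustrative equivalent, not part of the PR; the function name `stage_site` is hypothetical.

```python
import shutil
from pathlib import Path


def stage_site(docs_dir: Path, site_dir: Path) -> None:
    """Copy the rendered docs into site_dir and drop generator sources."""
    # Rough equivalent of `cp -r docs/. _site/`: copies the directory
    # contents, dotfiles such as .nojekyll included.
    shutil.copytree(docs_dir, site_dir, dirs_exist_ok=True)

    # Rough equivalent of `rm -f _site/*.py`: top-level .py files only,
    # and no error when nothing matches.
    for py_file in site_dir.glob("*.py"):
        py_file.unlink()

    # Rough equivalent of `rm -rf _site/__pycache__`.
    shutil.rmtree(site_dir / "__pycache__", ignore_errors=True)
```

Calling `stage_site(Path("docs"), Path("_site"))` from the repo root should leave a `_site/` tree with the HTML pages, `assets/`, `.nojekyll`, and `llms.txt`, and without `build.py`, `build_content.py`, or `__pycache__`.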

Finding & fix

| # | Severity | Finding | Fix |
|---|----------|---------|-----|
| 1 | low (tidiness) | The workflow creates `_site/` at the repo root. It's never committed in CI, but a maintainer running the staging steps locally would get an untracked `_site/` not covered by `.gitignore`, a footgun for an accidental commit of a generated tree. | Added `_site/` to `.gitignore` (`5868ff7`). |

Net: one low-severity fix; no behavior change. The repo CI (lint-and-test 3.12/3.13 + fixture-integrity) runs on the push; the Pages deploy runs after merge to master.

(Reminder: the copilot-pull-request-reviewer "failure" is the unavailable Copilot bot — quota — not this repo's CI.)

Copilot AI review requested due to automatic review settings June 23, 2026 07:50

Copilot started reviewing on behalf of jimstratus June 23, 2026 07:50

Copilot AI reviewed Jun 23, 2026

jimstratus marked this pull request as ready for review June 23, 2026 10:06

jimstratus merged commit c8f3180 into master Jun 23, 2026
3 checks passed

jimstratus deleted the claude/model-versions-kb-site-7b30gw branch June 23, 2026 10:15

jimstratus merged 5 commits into master from claude/model-versions-kb-site-7b30gw
