Bump models (GLM-5.2 / MiniMax-M3 / DeepSeek-V4-Pro), add route preference, + GitHub Pages KB site #23

Merged
jimstratus merged 5 commits into master from claude/model-versions-kb-site-7b30gw
Jun 23, 2026

@jimstratus

Owner

Summary

Updates reviewer models to the latest versions, adds an easy OpenRouter↔direct-API routing switch, and (incoming) a hosted GitHub Pages knowledge-base site.

Models

  • glm-5.1 → glm-5.2 (z.ai direct + OpenRouter dual route)
  • minimax-m2.7 → minimax-m3 (MiniMax direct + OpenRouter dual route)
  • DeepSeek now defaults to deepseek-v4-pro (V3.2 demoted to custom_only, kept for benchmark-history continuity)
  • New deepseek aichat client (api.deepseek.com) for the direct route
  • Profiles repointed; new direct profile (direct-API subs only, no Gemini)

Routing — OpenRouter vs direct API

A single knob decides which provider a dual-route reviewer (glm-5.2, minimax-m3, deepseek-v4-pro) tries first; the other becomes the automatic fallback.

  • defaults.route_preference: openrouter (public default — one OPENROUTER_API_KEY covers most reviewers) | direct
  • Per-run override: --route-pref {openrouter,direct}, shorthands --prefer-direct / --prefer-openrouter, or ARGUS_ROUTE_PREF env. Precedence: CLI flag › env › config
  • Wired through dispatch.py, verify.py, benchmark.py, estimate_cost.py via the single resolver _common.resolve_routes / resolve_route_preference
  • CLI reviewers are never reordered — their paid CLI sub stays primary and OpenRouter remains a true fallback
  • OR-balance pre-flight only fires when OpenRouter is the resolved primary, so a direct-preference run with a depleted OR balance isn't gated

This covers the "depleted OR balance → fall back to direct API for all but Gemini (skipped)" workflow: set route_preference: direct (or --prefer-direct) and use the direct profile.
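
The precedence rule above (CLI flag, then env, then config) can be sketched as follows. This is a minimal illustration, not the repo's resolver: `build_parser` and `pick_route_preference` are hypothetical names, the flag and env-var names come from the description, and both the shape of the config mapping and the choice to raise on an invalid value are assumptions.

```python
import argparse
from typing import Mapping, Optional

VALID_PREFERENCES = ("openrouter", "direct")
PUBLIC_DEFAULT = "openrouter"  # stated public default in the PR description


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser exposing the three flags named in the PR."""
    parser = argparse.ArgumentParser(description="route-preference sketch")
    parser.add_argument("--route-pref", dest="route_pref",
                        choices=VALID_PREFERENCES, default=None)
    # Shorthands write into the same destination as --route-pref.
    parser.add_argument("--prefer-direct", dest="route_pref",
                        action="store_const", const="direct")
    parser.add_argument("--prefer-openrouter", dest="route_pref",
                        action="store_const", const="openrouter")
    return parser


def pick_route_preference(cli_value: Optional[str],
                          env: Mapping[str, str],
                          config: Mapping[str, dict]) -> str:
    """Return the first value that is set, checking CLI, then env, then config."""
    candidates = (
        cli_value,
        env.get("ARGUS_ROUTE_PREF"),
        config.get("defaults", {}).get("route_preference"),
    )
    for value in candidates:
        if not value:
            continue
        normalized = value.strip().lower()
        if normalized not in VALID_PREFERENCES:
            # Assumption: reject unknown values instead of silently ignoring them.
            raise ValueError(f"unknown route preference: {value!r}")
        return normalized
    return PUBLIC_DEFAULT
```

A caller would pass `build_parser().parse_args().route_pref`, `os.environ`, and the loaded config; a depleted-balance run then reduces to `--prefer-direct` or `ARGUS_ROUTE_PREF=direct`.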

Docs

  • README / SKILL / CLAUDE / DEVELOPMENT / CONTRIBUTING updated (routing section, env vars incl. DEEPSEEK_API_KEY, flags, profiles, dual-route guidance)
  • GitHub Pages KB site under docs/ — responsive, collapsible sidebar (mobile-collapsed / desktop-open), dark/light toggle (system default + persisted override), Mermaid diagrams, benchmark chart, onboarding (beginner + advanced), contributing/FAQ/glossary, per-page "View as Markdown" links, llms.txt, plus a Pages deploy workflow. (Landing in a follow-up commit on this branch.)

Tests

  • New tests/test_routes.py (route classification, preference reordering, CLI-never-reordered invariant, precedence)
  • Full suite green (37 passed), py_compile clean, config validates (15 reviewers, 8 profiles)

Note: live provider dispatch isn't exercised in CI (no network/keys); model slugs for the bumped versions follow the repo's existing forward-dated convention and may need a verify.py pass against live endpoints.

🤖 Generated with Claude Code


Generated by Claude Code

… route preference

Update reviewer model versions and add a single knob to choose OpenRouter
vs each provider's own API for dual-route reviewers.

Models:
- glm-5.1 -> glm-5.2 (z.ai direct + OR dual route)
- minimax-m2.7 -> minimax-m3 (MiniMax direct + OR dual route)
- deepseek-v4-pro is now the default DeepSeek (v3.2 demoted to custom_only);
  add api.deepseek.com direct route + new `deepseek` aichat client
- profiles repointed; new `direct` profile (subs only, no Gemini)

Routing:
- defaults.route_preference: openrouter (public default) | direct
- _common.resolve_routes / resolve_route_preference reorder the
  {direct, openrouter} pair only; CLI reviewers are never reordered
- --route-pref / --prefer-direct / --prefer-openrouter on
  dispatch/verify/benchmark/estimate_cost, plus ARGUS_ROUTE_PREF env
  (precedence: CLI flag > env > config)
- OR-balance pre-flight only fires when OR is the resolved primary

Docs (README/SKILL/CLAUDE/DEVELOPMENT/CONTRIBUTING) updated; new
tests/test_routes.py (resolve_routes + preference precedence).

Constraint: keep paid CLI subs primary; OR stays a true fallback for them
Confidence: high
Scope-risk: moderate
Not-tested: live provider dispatch (no network/keys in CI)

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011HG8Axfxjf6rv2mKKDEMxv
Copilot AI review requested due to automatic review settings June 23, 2026 07:50
Static KB site under docs/, generated by a stdlib-only Python generator
(docs/build.py + docs/build_content.py).

- Pages: Home, Getting Started (beginner + advanced), Configuration,
  Reviewers, Profiles, Architecture, Benchmarks, Contributing, FAQ,
  Glossary, 404
- Collapsible sidebar (mobile-collapsed / desktop-open, persisted, mobile
  overlay + backdrop), dark/light toggle (system default + persisted
  override, no flash)
- Mermaid diagrams (pipeline, dispatch sequence, merge, cost gate,
  history.db ERD) + Chart.js benchmark leaderboard, both re-theme on toggle
- Per-page "View as Markdown / on GitHub" links, copy buttons, llms.txt,
  meta/OG tags, .nojekyll
- .github/workflows/pages.yml: build docs/ -> upload-pages-artifact ->
  deploy-pages (pages: write, id-token: write; push to master/main +
  workflow_dispatch)

Reflects the bumped models + route_preference feature from the prior commit.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011HG8Axfxjf6rv2mKKDEMxv

Copilot AI left a comment

Copilot was unable to review this pull request because the user who requested the review has reached their quota limit.

…red OR-primary helper

Addresses two findings from a self code-review of this branch:

1. hermes-4.3 is also a {direct, openrouter} dual-route reviewer and is
   reordered by route_preference — docstring + README + KB site claimed only
   glm-5.2/minimax-m3/deepseek-v4-pro. Behavior was already correct (and benign:
   under the openrouter default it now skips a guaranteed-fail nous-direct
   attempt); docs corrected to match.
2. Deduplicate the 3-line "is the resolved primary OpenRouter?" helper that was
   copy-pasted in benchmark.py and estimate_cost.py into
   _common.primary_is_openrouter (single source of truth).

No behavior change; 37 tests still pass.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011HG8Axfxjf6rv2mKKDEMxv
@jimstratus

Owner Author

🔍 Self code-review (round 1)

Copilot review is unavailable (quota), so this is a manual self-review of the branch diff — covered correctness (line-by-line, removed-behavior, cross-file), reuse/simplification/efficiency, altitude, and CLAUDE.md conventions.

No correctness bugs. The route-resolution refactor is sound — the fallback flow is preserved in all four scripts (dispatch/verify/benchmark/estimate_cost), CLI reviewers correctly never reorder, and routing is centralized in one resolver (_common.resolve_routes), matching the repo's "single source of truth" convention.
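
The reordering invariant described here (only the {direct, openrouter} pair is reordered; CLI reviewers keep their order) can be illustrated with a small sketch. It is not the code in `_common.resolve_routes`: `order_routes` is a hypothetical helper, and representing a reviewer's routes as a list of the labels `"cli"`, `"direct"`, and `"openrouter"` is an assumption made for the example.

```python
from typing import List

DUAL_ROUTE_PAIR = frozenset({"direct", "openrouter"})


def order_routes(routes: List[str], preference: str) -> List[str]:
    """Put the preferred route first, but only for a {direct, openrouter} pair."""
    if preference not in DUAL_ROUTE_PAIR:
        raise ValueError(f"unknown route preference: {preference!r}")
    # Anything other than exactly the dual-route pair is returned unchanged,
    # so a CLI-primary reviewer keeps its CLI route first.
    if len(routes) != 2 or set(routes) != DUAL_ROUTE_PAIR:
        return list(routes)
    fallback = "direct" if preference == "openrouter" else "openrouter"
    return [preference, fallback]
```

Under these assumptions, `order_routes(["cli", "openrouter"], "direct")` leaves the list as it was, which is the property the CLI-never-reordered test is described as guarding.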

Findings & fixes (both addressed in c978931)

| # | Severity | Finding | Fix |
|---|----------|---------|-----|
| 1 | low (accuracy) | hermes-4.3 is actually a 4th dual-route reviewer (Nous-direct ↔ OpenRouter), so it's reordered by route_preference too, but the docstring, README, and KB site claimed only glm-5.2/minimax-m3/deepseek-v4-pro. Behavior was already correct and benign (under the openrouter default it now skips a guaranteed-fail nous-direct attempt, since NOUSRESEARCH_API_KEY is unset). | Corrected the resolve_routes docstring, README routing section, and docs/ site content to list hermes-4.3 (custom-only) as dual-route. |
| 2 | low (duplication) | The 3-line "is the resolved primary OpenRouter?" check was copy-pasted into benchmark.py and estimate_cost.py. | Extracted to _common.primary_is_openrouter() and called from both. |

No behavior change from the fixes. 37 unit tests pass, the site rebuilds cleanly, and there are no stale model names in docs/.
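
For readers without the diff open, a check of the kind extracted in finding 2 might look like the sketch below. The real `_common.primary_is_openrouter()` signature is not shown in this thread, so the function names, the list-of-labels route representation, and the "any reviewer" gating rule are all assumptions.

```python
from typing import Mapping, Sequence


def primary_is_openrouter_sketch(resolved_routes: Sequence[str]) -> bool:
    """True when the first (primary) resolved route is OpenRouter."""
    return bool(resolved_routes) and resolved_routes[0] == "openrouter"


def needs_or_balance_preflight(resolved: Mapping[str, Sequence[str]]) -> bool:
    """Assumed gating rule: check the balance only if some reviewer is OR-primary."""
    return any(primary_is_openrouter_sketch(routes) for routes in resolved.values())
```

With a direct preference every dual-route reviewer resolves to a direct primary, so under this rule the pre-flight is skipped unless some other reviewer still has OpenRouter first.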

CI (lint-and-test 3.12/3.13 + fixture-integrity) is green on the prior commit and re-running on c978931.

(The copilot-pull-request-reviewer check showing "failure" is the unavailable Copilot bot, not this repo's CI.)


Generated by Claude Code

The deploy workflow uploaded the whole docs/ folder, so build.py,
build_content.py, and a CI-generated __pycache__ were served as static
files. Stage the rendered output into _site/ (HTML + assets + .nojekyll +
llms.txt) and drop *.py / __pycache__ before upload, using portable cp
(rsync isn't guaranteed on the runner). Published tree is now just the site.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011HG8Axfxjf6rv2mKKDEMxv
@jimstratus jimstratus marked this pull request as ready for review June 23, 2026 10:06
The Pages workflow stages the rendered site into _site/ before upload;
ignore it so a local run of the staging steps can't be committed by accident.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_011HG8Axfxjf6rv2mKKDEMxv
@jimstratus

Owner Author

🔍 Self code-review — Pages-artifact tidy (4ea1d7a)

Focused review of the workflow change that slims the published tree (earlier rounds covered the scripts/config/site).

No correctness bugs. The staging step is sound and was verified locally with the same commands:

  • cp -r docs/. _site/ copies contents including dotfiles, so .nojekyll is preserved.
  • rm -f _site/*.py + rm -rf _site/__pycache__ drop the generator sources; rm -f is a no-op if the glob doesn't match.
  • Resulting published tree = 11 HTML pages + assets/ + .nojekyll + llms.txt, and no build.py / build_content.py / __pycache__.
  • Portable cp (not rsync, which isn't guaranteed on the runner); permissions / concurrency unchanged.
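
The staging steps listed above can be reproduced locally without a shell. The sketch below is a Python approximation of the three commands, not the workflow itself: `stage_site` is a hypothetical helper, and it assumes Python 3.8+ for `dirs_exist_ok`.

```python
import shutil
from pathlib import Path


def stage_site(docs_dir: Path, site_dir: Path) -> None:
    """Copy docs/ into the staging dir, then drop generator sources."""
    # Rough equivalent of `cp -r docs/. _site/`: copies everything, dotfiles included.
    shutil.copytree(docs_dir, site_dir, dirs_exist_ok=True)
    # Rough equivalent of `rm -f _site/*.py`; a no-op when nothing matches.
    # Note: unlike the shell glob, pathlib's "*" also matches dot-prefixed names.
    for py_file in site_dir.glob("*.py"):
        py_file.unlink()
    # Equivalent of `rm -rf _site/__pycache__`; ignores a missing directory.
    shutil.rmtree(site_dir / "__pycache__", ignore_errors=True)
```

Calling `stage_site(Path("docs"), Path("_site"))` from the repo root should leave only the rendered pages, `assets/`, `.nojekyll`, and `llms.txt`, assuming the generator has already been run.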

Finding & fix

| # | Severity | Finding | Fix |
|---|----------|---------|-----|
| 1 | low (tidiness) | The workflow creates _site/ at the repo root. It's never committed in CI, but a maintainer running the staging steps locally would get an untracked _site/ not covered by .gitignore, a footgun for an accidental commit of a generated tree. | Added _site/ to .gitignore (5868ff7). |

Net: one low-severity fix; no behavior change. The repo CI (lint-and-test 3.12/3.13 + fixture-integrity) runs on the push; the Pages deploy runs after merge to master.

(Reminder: the copilot-pull-request-reviewer "failure" is the unavailable Copilot bot — quota — not this repo's CI.)


Generated by Claude Code

@jimstratus jimstratus merged commit c8f3180 into master Jun 23, 2026
3 checks passed
@jimstratus jimstratus deleted the claude/model-versions-kb-site-7b30gw branch June 23, 2026 10:15