chore: update model smoke test results for v1.5.0#215

Merged
griffinmartin merged 1 commit into main from chore/update-smoke-test-results-2026-04-30
Apr 30, 2026

@griffinmartin
Owner

Summary

Re-run pnpm run test:models against v1.5.0 to refresh the model compatibility cache and the README "Supported models" table.

Results

| Metric | Before (v1.4.10) | After (v1.5.0) |
| --- | --- | --- |
| Tested | 17 | 16 |
| Passed | 16 | 15 |
| Failed | 1 | 1 |
| Skipped | 7 | 8 |

Newly failing

  • claude-3-haiku-20240307 — older Claude 3 Haiku is no longer reachable through the subscription auth path. Added to failed-models.json so subsequent runs skip it.
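
The skip-list behaviour described above could look roughly like the following. This is a minimal sketch under stated assumptions, not the repository's actual script: the schema of failed-models.json is not shown in this PR (a plain array of model IDs is assumed here), and the helper names `selectModelsToTest` and `recordFailure` are hypothetical.

```typescript
// Hypothetical sketch: the real failed-models.json schema is not shown in this PR.
// Assumption: the file holds a plain JSON array of model IDs.
type SkipList = readonly string[];

// Split candidate models into those to test and those to skip.
function selectModelsToTest(
  candidates: readonly string[],
  skipList: SkipList,
): { toTest: string[]; skipped: string[] } {
  const toTest: string[] = [];
  const skipped: string[] = [];
  for (const id of candidates) {
    if (skipList.indexOf(id) !== -1) {
      skipped.push(id);
    } else {
      toTest.push(id);
    }
  }
  return { toTest, skipped };
}

// Return a new, sorted skip list that includes a newly failing model.
// The input list is not mutated, and an existing entry is not duplicated.
function recordFailure(skipList: SkipList, failedId: string): string[] {
  const next =
    skipList.indexOf(failedId) !== -1
      ? skipList.slice()
      : skipList.concat(failedId);
  return next.sort();
}

// Example: the model that failed in this run is skipped on the next one.
const updatedSkipList = recordFailure([], "claude-3-haiku-20240307");
const nextRun = selectModelsToTest(
  ["claude-opus-4-7", "claude-3-haiku-20240307"],
  updatedSkipList,
);
console.log(nextRun);
```

Returning a new array instead of mutating the input keeps the update easy to diff before it is written back to disk.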

Confirmed passing (notable)

  • claude-opus-4-7
  • claude-opus-4-6
  • claude-sonnet-4-6 ✓ (with effort-2025-11-24, context-1m-2025-08-07 excluded as expected)
  • claude-haiku-4-5
  • All Claude 4 family models ✓

Files changed

  • test-results/model-smoke-test.json: full run output regenerated
  • test-results/failed-models.json: claude-3-haiku-20240307 added
  • README.md: "Supported models" table regenerated by the script (15 models, sorted)
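
For illustration, regenerating a sorted "Supported models" table from the passing model IDs might be sketched as below. The actual table layout and the script that produces it are not shown in this PR, so the single-column format and the function name `renderSupportedModels` are assumptions.

```typescript
// Hypothetical sketch: the real README table layout is not shown in this PR.
// Assumption: one row per passing model, sorted by model ID.
function renderSupportedModels(passed: readonly string[]): string {
  // Copy before sorting so the caller's array is left untouched.
  const rows = passed
    .slice()
    .sort()
    .map((id) => "| `" + id + "` |");
  return ["| Model |", "| --- |"].concat(rows).join("\n");
}

// Example with three of the models confirmed passing in this run.
const table = renderSupportedModels([
  "claude-sonnet-4-6",
  "claude-haiku-4-5",
  "claude-opus-4-7",
]);
console.log(table);
```

Sorting inside the renderer keeps the output stable between runs, so the README diff only changes when the set of passing models changes.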

Test plan

  • pnpm run lint passes (oxfmt run on regenerated files)
  • pnpm run build clean
  • All currently-supported models still pass under the v1.5.0 fingerprint

Run `pnpm run test:models` against v1.5.0:

- 15/16 models pass (was 16/17 on v1.4.10)
- claude-3-haiku-20240307 newly fails and joins the skip-list — older
  Claude 3 Haiku appears to no longer be reachable through the
  subscription auth path
- claude-opus-4-7 continues to pass (verified for the integration suite)
- All Claude 4 family models (haiku-4-5, sonnet-4-x, opus-4-x) pass

README "Supported models" table regenerated by the script.
@griffinmartin griffinmartin merged commit 4049822 into main Apr 30, 2026
4 checks passed