
Add guided grammar model setting #995

Open
MrNiceRicee wants to merge 1 commit into jundot:main from MrNiceRicee:feature/guided-grammar-settings


MrNiceRicee commented Apr 28, 2026

Summary

Adds guided grammar support as a first-class model setting in the admin dashboard.

This exposes a generic guided_grammar control without adding model-specific presets. The request path normalizes guided_grammar into the existing structured_outputs.grammar flow, so constrained decoding continues to use the current xgrammar-backed implementation.
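The alias normalization described above could look roughly like the sketch below. It is not the PR's code: the function name `normalize_guided_grammar`, the plain-dict request shape, and the rule that an explicit `structured_outputs.grammar` wins over the alias are all assumptions made for illustration.

```python
from copy import deepcopy
from typing import Any


def normalize_guided_grammar(request: dict[str, Any]) -> dict[str, Any]:
    """Fold a top-level ``guided_grammar`` alias into ``structured_outputs.grammar``.

    Hypothetical helper; the real request model in the PR may differ.
    """
    normalized = deepcopy(request)  # never mutate the caller's payload
    grammar = normalized.pop("guided_grammar", None)

    # Treat a missing, non-string, or whitespace-only alias as "not set".
    if not isinstance(grammar, str) or not grammar.strip():
        return normalized

    structured = normalized.get("structured_outputs") or {}
    # Assumption: an explicit structured_outputs.grammar takes precedence
    # over the compatibility alias.
    structured.setdefault("grammar", grammar)
    normalized["structured_outputs"] = structured
    return normalized


if __name__ == "__main__":
    payload = {"model": "example-model", "guided_grammar": 'root ::= "yes" | "no"'}
    print(normalize_guided_grammar(payload))
```

Because the alias is removed from the normalized request, downstream code only needs to look at `structured_outputs.grammar`, which matches the "existing flow" wording in the summary.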

Changes

  • Add guided_grammar request field as a compatibility alias
  • Add guided_grammar_enabled and guided_grammar to persisted model settings
  • Surface guided grammar in /admin/api/models and model settings update API
  • Add a Guided Grammar textarea/toggle to the model settings modal
  • Include guided grammar in profiles/templates as a universal profile field
  • Trim empty grammar values before persistence
  • Add tests for request parsing, settings roundtrip, admin API exposure, and precedence behavior
  • Classify existing preserve_thinking as profile-excluded to keep model settings field classification complete

Notes

Actual constrained decoding still requires the optional grammar dependency, xgrammar, as before. Without it, structured output requests return the existing dependency error.
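For readers unfamiliar with the optional-dependency pattern mentioned here, the following sketch shows one common way to guard a feature behind an installable package. It is illustrative only: `GrammarDependencyError` and `require_grammar_backend` are hypothetical names, and the project's existing dependency error may be raised differently.

```python
import importlib.util


class GrammarDependencyError(RuntimeError):
    """Hypothetical error type for a missing optional grammar backend."""


def has_module(name: str) -> bool:
    """Return True if a top-level module can be located without importing it."""
    return importlib.util.find_spec(name) is not None


def require_grammar_backend(module_name: str = "xgrammar") -> None:
    """Raise a clear error when the optional backend is not installed."""
    if not has_module(module_name):
        raise GrammarDependencyError(
            f"Grammar-constrained decoding requires the optional "
            f"'{module_name}' package, which is not installed."
        )


if __name__ == "__main__":
    try:
        require_grammar_backend()
        print("grammar backend available")
    except GrammarDependencyError as exc:
        print(f"structured output unavailable: {exc}")
```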

Validation

  • 14 passed focused guided grammar tests
  • 71 passed settings/profile regression tests
  • 157 passed nearby request/admin/settings suite
  • 59 passed, 3 skipped, 5 deselected grammar suite excluding optional structural-tag patch tests
  • JSON i18n validation
  • Python syntax check
  • dashboard JS syntax check
  • git diff --check

@MrNiceRicee MrNiceRicee marked this pull request as ready for review April 28, 2026 21:58
@MrNiceRicee (Author)

I tested this locally on Qwen3.6-27B-MLX-8bit. On one coding prompt, the grammar-constrained run reduced output from 400 tokens in 29.4 s to 94 tokens in 7.6 s.

I got the idea from
https://andthattoo.dev/blog/structured_cot

If possible, I would love to have a guided grammar setting.
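As a quick way to read the single-prompt numbers quoted in this comment, the sketch below computes the relative reductions and the tokens-per-second rate for each run. The helper names are hypothetical, and one prompt on one model is not a benchmark, so the result should be treated as indicative only.

```python
def percent_reduction(before: float, after: float) -> float:
    """Relative reduction from `before` to `after`, in percent."""
    return (before - after) / before * 100.0


def tokens_per_second(tokens: int, seconds: float) -> float:
    """Average generation rate for one run."""
    return tokens / seconds


if __name__ == "__main__":
    # Figures reported in the comment above (single coding prompt).
    base_tokens, base_seconds = 400, 29.4
    guided_tokens, guided_seconds = 94, 7.6

    print(f"tokens reduced by {percent_reduction(base_tokens, guided_tokens):.1f}%")
    print(f"time reduced by {percent_reduction(base_seconds, guided_seconds):.1f}%")
    print(f"baseline rate: {tokens_per_second(base_tokens, base_seconds):.1f} tok/s")
    print(f"guided rate: {tokens_per_second(guided_tokens, guided_seconds):.1f} tok/s")
```

The per-token rate is similar in both runs, so in this sample the time saved comes mainly from generating fewer tokens rather than from faster decoding.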
