fix(tool_calling): recover naked <function=...> calls emitted by Qwen3-Coder #844
Open
mrtkrcm wants to merge 2 commits into jundot:main
…3-Coder

Qwen3-Coder-30B (and other Qwen-Coder variants) sometimes emits <function=name>...</function> without the outer <tool_call> wrapper that mlx_lm.tool_parsers.qwen3_coder expects. Without this branch the structured call leaks into content with an empty tool_calls[], producing a visible score regression on tool-routing benchmarks.

Add a fallback after the XML-wrapper branch that recognises the naked envelope + embedded <parameter=key>val</parameter> pairs, rebuilds a proper ToolCall, and strips residual tags from content.

Kern quick-suite: 1/9 -> 5/9 for Qwen3-Coder-30B.
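The fallback described in the commit message can be sketched roughly as follows. This is an illustrative reconstruction, not the code in this PR: the regexes, the `ToolCall` dataclass, and the helper names `coerce_value` and `recover_naked_calls` are all assumptions made for the example, and the real implementation in `omlx/api/tool_calling.py` may differ.

```python
import json
import re
from dataclasses import dataclass, field
from typing import Any

# Hypothetical patterns; the PR's actual regexes are not shown on this page.
FUNCTION_RE = re.compile(r"<function=([^>\s]+)>(.*?)</function>", re.DOTALL)
PARAMETER_RE = re.compile(r"<parameter=([^>\s]+)>(.*?)</parameter>", re.DOTALL)


@dataclass
class ToolCall:
    """Illustrative stand-in for the project's ToolCall type."""
    name: str
    arguments: dict[str, Any] = field(default_factory=dict)


def coerce_value(raw: str) -> Any:
    """Return the JSON-decoded value if raw is valid JSON, else the stripped string."""
    text = raw.strip()
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return text


def recover_naked_calls(content: str) -> tuple[str, list[ToolCall]]:
    """Extract naked <function=...> envelopes and return (cleaned content, calls)."""
    calls = []
    for match in FUNCTION_RE.finditer(content):
        name, body = match.group(1), match.group(2)
        # Each <parameter=key>val</parameter> pair becomes one argument.
        args = {key: coerce_value(val) for key, val in PARAMETER_RE.findall(body)}
        calls.append(ToolCall(name=name, arguments=args))
    if not calls:
        return content, []
    # Strip the recovered envelopes so no raw markup leaks into content.
    cleaned = FUNCTION_RE.sub("", content).strip()
    return cleaned, calls
```

In this sketch the function is meant to run only after the `<tool_call>` wrapper branch has found nothing, matching the ordering the commit message describes. One consequence of JSON coercion worth keeping in mind: a string argument whose text happens to be valid JSON (for example `true` or `20`) comes back as a bool or number rather than a string.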
Summary
Recover Qwen-style naked `<function=...>...</function>` tool calls when the model omits the outer `<tool_call>` wrapper.
Changes
- `<parameter=...>` values with JSON coercion when possible.
Local validation
- Built and installed from this branch into `/Applications/oMLX.app` (`0.3.8.dev2`), with port `8801` owned by the visible app process.
- `201 passed, 12 deselected` for `tests/test_tool_calling.py`, `tests/integration/test_e2e_streaming.py`, and `tests/test_admin_profiles_api.py`.
- `Qwen3-Coder-30B-A3B-Instruct-4bit` forced `read_file`; response returned structured `tool_calls` and no raw `<function=...>` markup.
- `Qwen3-Coder-30B-A3B-Instruct-4bit` load `2.8s`, canary `25.1 tok/s`, code `23.7 tok/s`, tool `OK` in `2.3s`.
- `Ternary-Bonsai-8B-mlx-2bit` returned `ready`; benchmark tool `OK` in `0.6s`.

Note: benchmark host was not clean; preflight saw active desktop/client load, so throughput is smoke data only.
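The forced `read_file` check above (structured `tool_calls`, no raw markup in the content) could be automated along these lines. This is a minimal sketch that assumes an OpenAI-style assistant message dict with `content` and `tool_calls[].function.name` fields; that shape, the regex, and the helper name `check_structured_tool_call` are assumptions for illustration and are not taken from the PR.

```python
import re
from typing import Any

# Matches opening or closing <function...> / <parameter...> tags left in content.
RAW_MARKUP_RE = re.compile(r"</?(?:function|parameter)\b[^>]*>")


def check_structured_tool_call(message: dict[str, Any], expected_tool: str) -> list[str]:
    """Return a list of problems found; an empty list means the message looks right."""
    problems = []
    tool_calls = message.get("tool_calls") or []
    names = [call.get("function", {}).get("name") for call in tool_calls]
    if expected_tool not in names:
        problems.append(f"no structured call to {expected_tool!r} in tool_calls")
    content = message.get("content") or ""
    if RAW_MARKUP_RE.search(content):
        problems.append("raw <function=...> markup leaked into content")
    return problems
```

Returning a list of problems rather than a bool keeps the failure reason visible when the check is used in a smoke script.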
Test Plan
- `uv run pytest tests/test_tool_calling.py -q`
- `python3 -m py_compile omlx/api/tool_calling.py tests/test_tool_calling.py`