fix: disable thinking/reasoning for submodel calls by Lap1acian · Pull Request #1392 · kwaroran/Risuai

Lap1acian · 2026-04-13T06:21:08Z

Problem

#1373 fixed the `<Thoughts>` block leaking into SD submodel output by stripping it in post-processing.

However, with a local thinking model (gemma4:31b via OpenAI-compatible API), a deeper issue was found: the model's actual result (image tags for {{slot}}) is often empty or incomplete because the model spends its output budget on chain-of-thought reasoning that gets discarded anyway.

Disabling thinking at the request level — not just stripping it from the output — is needed so the model focuses entirely on producing useful tag content.

Solution

Three layers of defense for mode === 'submodel' calls:

| File | Change |
| --- | --- |
| `request.ts` | Strip thinking-related `LLMFlags` (`geminiThinking`, `deepSeekThinkingOutput`, `claudeThinking`, `claudeAdaptiveThinking`) from model info before the request is sent |
| `openAI/requests.ts` | Skip wrapping `reasoning_content` / `reasoning` response fields in `<Thoughts>` tags |
| `stableDiff.ts` | Strip residual `<Thoughts>` and `<think>` blocks from output (safety net for local models that emit thinking in text) |
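The first layer can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: the `LLMFlag` string union, the `ModelInfo` shape, and the `stripThinkingFlags` helper are hypothetical stand-ins for whatever `request.ts` actually uses; only the four flag names come from the description above.

```typescript
// Hypothetical stand-in for the project's LLMFlags; only the four
// thinking-related names are taken from the PR description.
type LLMFlag =
  | "otherCapability" // placeholder for any non-thinking flag
  | "geminiThinking"
  | "deepSeekThinkingOutput"
  | "claudeThinking"
  | "claudeAdaptiveThinking";

// Hypothetical model info shape, reduced to what the sketch needs.
interface ModelInfo {
  id: string;
  flags: LLMFlag[];
}

const THINKING_FLAGS: ReadonlySet<LLMFlag> = new Set<LLMFlag>([
  "geminiThinking",
  "deepSeekThinkingOutput",
  "claudeThinking",
  "claudeAdaptiveThinking",
]);

// Returns a copy of the model info without thinking flags for submodel
// calls; every other mode gets the original object back unchanged.
function stripThinkingFlags(info: ModelInfo, mode: string): ModelInfo {
  if (mode !== "submodel") {
    return info;
  }
  return { ...info, flags: info.flags.filter((f) => !THINKING_FLAGS.has(f)) };
}
```

Returning a copy rather than mutating the input keeps the shared model definition intact for regular chat requests that still want thinking enabled.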

Testing

Tested with gemma4:31b (local, OpenAI-compatible API):

- Tag generation works correctly in most cases — model produces clean keyword output without thinking wrappers
- In some cases, the thinking result is still not included in the prompt; the exact conditions for this are not yet identified

Known Limitations

- Only verified with a single local model (gemma4:31b). Behavior with other thinking models (DeepSeek R1, Claude thinking, Gemini thinking) has not been tested.
- The intermittent failure condition (thinking result missing from prompt) needs further investigation.

Feedback, additional test results from other thinking models, or insights on the intermittent issue would be very welcome.

Reasoning models (Gemini thinking, Claude thinking, DeepSeek R1, OpenRouter reasoning, local models with <think> tags) produce chain-of-thought content that pollutes the image generation prompt when used as submodel for Stable Diffusion / NovelAI tag generation.

Three changes to address the root cause:

1. request.ts: Strip thinking-related flags (geminiThinking, deepSeekThinkingOutput, claudeThinking, claudeAdaptiveThinking) from modelInfo when mode is 'submodel', so the API never requests thinking in the first place.
2. openAI/requests.ts: Skip wrapping reasoning_content / reasoning API response fields in <Thoughts> tags for submodel calls. If the model's text content is empty, fall back to reasoning content directly (without wrapper tags) so downstream processing can still work with it.
3. stableDiff.ts: Strip both <Thoughts> and <think> blocks from the submodel result, extending the previous <Thoughts>-only strip to also cover local models that emit <think> tags in their text output.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
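The safety-net strip in the third change could look something like the sketch below. The helper name `stripThinkingBlocks` and the exact regular expression are assumptions for illustration, not the code in `stableDiff.ts`; it only removes properly closed `<Thoughts>` and `<think>` blocks.

```typescript
// Hypothetical helper: removes closed <Thoughts>...</Thoughts> and
// <think>...</think> blocks, then trims surrounding whitespace.
// The backreference \1 makes each opening tag match its own closing tag,
// and the lazy [\s\S]*? keeps separate blocks from being merged into one.
function stripThinkingBlocks(text: string): string {
  return text.replace(/<(Thoughts|think)>[\s\S]*?<\/\1>/g, "").trim();
}

// Example: a local model that emits its reasoning inline before the tags.
const raw = "<think>The scene is a beach at sunset.</think>\n1girl, beach, sunset";
const cleaned = stripThinkingBlocks(raw); // "1girl, beach, sunset"
```

An unclosed block (for example, output truncated in the middle of `<think>`) is left untouched by this pattern, so a real implementation may need to handle that case separately.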

The previous approach fell back to raw reasoning_content when the model's text content was empty, which caused reasoning text to leak into the SD prompt for models that put everything in the reasoning field.

Simply discard reasoning for submodel calls — if the model produces no text content, an empty result is preferable to garbage.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
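The behavior after this second commit can be sketched as below. The `ChatMessage` shape, the `extractText` helper, and the exact `<Thoughts>` wrapper layout for non-submodel calls are assumptions for illustration; only the `reasoning_content` / `reasoning` field names and the "discard for submodel" rule come from the commit messages.

```typescript
// Hypothetical, reduced shape of an OpenAI-compatible response message.
interface ChatMessage {
  content?: string | null;
  reasoning_content?: string | null;
  reasoning?: string | null;
}

// For submodel calls the reasoning fields are ignored entirely, even when
// content is empty. For other modes the reasoning is wrapped in <Thoughts>
// tags (the wrapper layout here is an assumption).
function extractText(msg: ChatMessage, mode: string): string {
  const content = msg.content ?? "";
  const reasoning = msg.reasoning_content ?? msg.reasoning ?? "";
  if (mode === "submodel" || reasoning === "") {
    return content;
  }
  return `<Thoughts>\n${reasoning}\n</Thoughts>\n${content}`;
}
```

With this rule, a model that puts everything in the reasoning field yields an empty string for submodel calls instead of reasoning text in the SD prompt.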

kamukande and others added 2 commits April 10, 2026 17:35

Lap1acian wants to merge 2 commits into kwaroran:main from Lap1acian:fix/sd-submodel-disable-thinking

2 participants
