feat(prompt): runtime.prompt.inline_schemas flag — unblock local Ollama audits #1468
Draft
nextlevelshit wants to merge 2 commits into
Conversation
The contract prompt builder dumps the full json_schema body inline (`buildContractPrompt`, executor.go:~4940). On a 50KB shared schema, that consumes ~12-15K tokens — fine for frontier models, fatal for local Ollama models capped at 32K context. With the dump in place, glm-4.7-flash hangs on `audit-doc-scan` indefinitely (GPU 100%, 0 tokens emitted). The skeleton + required-fields hint that the same function generates right after is enough for any model that handled the YAML pipeline at all. The full schema is the wasteful part.

- Add `runtime.prompt.inline_schemas *bool` to the manifest. Default true (no behavior change for existing repos). Set false to drop the full dump and keep only the schema-path reference + skeleton.
- Pass `&execution.Manifest.Runtime.Prompt` through to `buildContractPrompt`. The function takes a third `*PromptConfig` arg; nil preserves historical behavior so callers in tests stay untouched.
- New table-style test `TestContractPrompt_InlineSchemasDisabled` asserts: skeleton + required-fields kept, schema reference kept, full schema body dropped.
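In Go terms, the nil-preserving default described above can be sketched roughly as follows. Everything beyond the `buildContractPrompt` / `PromptConfig` names is an assumption — field names, helper, and prompt wording are illustrative; the real function lives in executor.go:

```go
package main

import "fmt"

// PromptConfig mirrors the manifest's runtime.prompt block. InlineSchemas is a
// *bool so an absent key is distinguishable from an explicit false.
type PromptConfig struct {
	InlineSchemas *bool
}

// inlineSchemasEnabled treats both a nil config and a nil field as true,
// preserving historical behavior for callers (e.g. tests) that pass nil.
func inlineSchemasEnabled(cfg *PromptConfig) bool {
	if cfg == nil || cfg.InlineSchemas == nil {
		return true
	}
	return *cfg.InlineSchemas
}

// buildContractPrompt (simplified): the schema-path reference and the
// required-fields skeleton are always emitted; the full schema body only
// when inlining is enabled.
func buildContractPrompt(schemaPath, schemaBody, skeleton string, cfg *PromptConfig) string {
	out := "Output must conform to the schema at " + schemaPath + "\n"
	if inlineSchemasEnabled(cfg) {
		out += "Full schema:\n" + schemaBody + "\n"
	}
	out += "Required-fields skeleton:\n" + skeleton + "\n"
	return out
}

func main() {
	off := false
	slim := buildContractPrompt("schemas/audit.json", "{...50KB...}", `{"findings":[]}`, &PromptConfig{InlineSchemas: &off})
	full := buildContractPrompt("schemas/audit.json", "{...50KB...}", `{"findings":[]}`, nil)
	fmt.Println(len(slim) < len(full)) // prints "true": the slim path drops only the schema body
}
```

The three-valued `*bool` is what makes "default true, no behavior change" cheap: omitted key, explicit true, and explicit false all map to distinct states without a migration.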
Three knobs aligned for local-Ollama runs on this host:

- runtime.default_timeout_minutes: 30 -> 90
- runtime.timeouts.step_default_minutes: 15 -> 90
- runtime.stall_timeout: 10m -> 90m

GLM-4.7-flash and qwen3.5:27b take longer than frontier models on real audit prompts. The previous defaults treated even a successful local run as a failure.

Plus runtime.prompt.inline_schemas: false (the new flag from the companion commit) — frees ~12-15K tokens per step that GLM was spending on the schema dump alone. Together these unblock audit-* and impl-* on local; without them local was effectively limited to single-step local-* pipelines.
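Taken together, the knobs above land in wave.yaml roughly like this (key paths are taken from the commit message; the surrounding nesting is an assumption about the manifest layout):

```yaml
runtime:
  default_timeout_minutes: 90   # was 30
  timeouts:
    step_default_minutes: 90    # was 15
  stall_timeout: 90m            # was 10m
  prompt:
    inline_schemas: false       # new flag; treated as true when omitted
```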
Summary
`buildContractPrompt` inlines the full json_schema body (~50KB / ~12-15K tokens) into every step prompt. Frontier models tolerate it; local Ollama models at 32K context choke. Concrete observation: `audit-doc-scan` on `opencode-glm` hangs at GPU 100%, 0 tokens for 60+ min before timing out, regardless of timeout settings — the model never returns a streamable completion when the prompt is that heavy. The same function builds a required-fields skeleton + property hints right after the schema dump. The skeleton alone is enough for any model that handled the YAML pipeline at all.
Change
`runtime.prompt.inline_schemas: bool`. Defaults to `true` (zero-change for existing repos). Set to `false` to drop the full schema body and keep only the schema-path reference + the skeleton. `buildContractPrompt` takes a third `*PromptConfig` arg; nil preserves historical behavior so test callers stay valid. `wave.yaml` flips the flag to `false` and bumps the local timeouts to 90m so live runs on glm-4.7-flash and qwen3.5:27b actually complete.

Side bumps in this repo's wave.yaml

- runtime.default_timeout_minutes
- runtime.timeouts.step_default_minutes
- runtime.stall_timeout
- runtime.prompt.inline_schemas

Test plan
- `go test ./...` — full suite green
- `golangci-lint run ./...` — 0 issues
- `TestContractPrompt_InlineSchemasDisabled` covers the slimmed prompt path: skeleton kept, schema reference kept, full body dropped.
- `go run ./cmd/wave validate --all` — clean against current pipelines
- `audit-doc-scan` against `opencode-glm` — pending (will post the run ID + result in a comment)

Why default-true matters
Frontier models perform measurably better with the full schema visible (no degradation observed in the existing pipeline runs on Claude). The flag exists for the local-model envelope, not as a global behavior change.