Report date: 2026-05-20

Upstream versions compared against:
- llama.cpp b9119 (flag definitions in common/arg.cpp, CLI binary in tools/cli/cli.cpp)
- whisper.cpp v1.8.4 (examples/cli/cli.cpp)
- stable-diffusion.cpp master-596-90e87bc (CLI shell in examples/cli/main.cpp, model/gen flags in examples/common/common.cpp)

What triggered this audit: chimera sd had silently shipped without the entire split-checkpoint flag family (--diffusion-model, --vae, --clip-l, --t5xxl, --llm, --offload-to-cpu, --diffusion-fa), which made it impossible to run any Z-Image / Flux / SD3-class model. The hole was invisible because no one had cross-referenced sd's flag surface against ours; this audit closes that blind spot for the four CLI subcommands (gen/chat, embed, whisper, sd). chimera serve (wraps llama-server) is out of scope.
Status legend: ✅ exposed · 🔀 renamed · 🟡 partial · ❌ missing · 🚫 out-of-scope.
Note: chimera's CLI definitions live in src/chimera_cli/chimera.cpp (bind_*_cmd helpers) and the option structs in src/chimera/chimera.h. Both files are referenced throughout this report.
Status update 2026-05-20: the 20 flag groups identified as Tier 1–4 priorities in the original llama section have all landed on gen/chat/embed. The tables and "Notable gaps" sections below have been edited in-place to reflect this; see the CHANGELOG entry under [Unreleased] for the full list.

Whisper coverage closer 2026-05-20: the remaining whisper gaps flagged below have all landed on whisper: VAD bundle, offset/duration, segment shaping, decoder-fail thresholds, audio-ctx, tinydiarize, token suppression, context-params (--flash-attn/--no-gpu/--device), and --processors. The whisper table rows are flipped to ✅ in-place; the "Notable gaps worth filing" list is now empty save for the documented out-of-scope items.

sd partial: skip-layer guidance + the --high-noise-diffusion-model model-loading slot landed; the rest of the --high-noise-* family stays out of scope (video-only).

sd coverage closer 2026-05-20 (Rounds 1–8): 38 additional flags landed, covering perf/offload (--fa, --no-mmap, --max-vram, per-component CPU offload, SDXL VAE fix), sampler/generation (--img-cfg-scale, --eta, --timestep-shift, --sigmas, --prediction, --lora-apply-mode), model-loading (--taesd, --clip-vision, --llm-vision, --tensor-type-rules, --photo-maker, --embd-dir), PhotoMaker bundle (--pm-id-images-dir/--pm-id-embed-path/--pm-style-strength), reference images (--ref-image + supporting flags), the full hires-fix bundle, and the cache/SCM bundle (--cache-mode, --cache-option, --scm-mask, --scm-policy). Tables below are flipped to ✅ in-place. All sd ❌ rows are now resolved: --disable-image-metadata is reclassified 🚫 (moot: chimera's stock stb_image_write writes no text chunks, so there's nothing to disable; the inverse "embed metadata" feature is a separate item not yet on the roadmap).

| Subcommand | Upstream flags considered | Exposed | Renamed | Missing (real gap) | Deliberately out of scope |
|---|---|---|---|---|---|
| gen (llama-cli) | ~80 CLI-relevant | 51 | 1 | 0 | ~58 |
| chat (llama-cli, interactive) | ~85 | 56 | 0 | 0 | ~63 |
| embed (llama-embedding) | ~14 | 12 | 1 | 1 | 1 |
| whisper (whisper-cli) | 58 | 46 | 1 | 0 | ~13 |
| sd (sd cli) | 107 | 60 | 6 | 0 | ~51 |

"Real gaps" are flags whose absence we'd consider filing an issue for. "Deliberately out of scope" covers things like llama-cli's REPL plumbing (chimera replaces it with chat), perplexity/imatrix/training knobs, anything tied solely to llama-server, and obscure research/debug flags. The next two columns of the per-subcommand tables make each call individually.
The headline finding from the original audit — "the sd surface is by far the largest source of meaningful gaps" — no longer applies. After the 2026-05-20 sd closer (Rounds 1–8, 38 additional flags on top of the earlier Tier 1–2 work), the sd surface has zero unresolved ❌ rows. The lone remaining item, --disable-image-metadata, is reclassified 🚫 because chimera's stock stb_image_write writes no text chunks, so there is no metadata to disable; embedding generation params (the reverse direction, for parity with sd-cli's default behaviour) is a separate feature not yet on the roadmap. Everything else is documented out-of-scope (video, standalone modes, shell features, chroma/qwen tuning). The llama coverage is intentionally minimal — chimera leans on its own DSL (chat REPL, serve HTTP) and the wrapped subcommands are deliberately thin. The whisper surface, previously the most "leaky" relative to size, is now ~79% covered after the 2026-05-20 closer (Batches 1–3 + VAD + offset/duration + grammar + stereo diarize + detect-language). All non-niche whisper ❌ rows are resolved. Remaining 🚫/❌ items are token-level DTW, --word-thold, OpenVINO device selection, and a few decoder-print toggles — all explicitly out of scope or low-demand.
Upstream llama-cli inherits ~330 common_arg declarations from common/arg.cpp. Roughly 80 are tagged for the CLI context (the rest are server-only, training-only, perplexity-only, etc.). Chimera deliberately exposes only a thin generation slice and trusts upstream defaults for the rest; this is consistent with the project's framing as a thin C++ shell, so the size of the "missing" column below is expected — what matters is whether the missing ones meaningfully constrain users.
| Upstream flag | Chimera equivalent | Status | Notes |
|---|---|---|---|
| --model, -m | -m,--model | ✅ | Required for gen, soft-required for chat. |
| --prompt, -p | -p,--prompt (gen only) | ✅ | chat uses interactive input instead. |
| --prompt-file, -f / --file | -f,--prompt-file | ✅ | Stdin via - supported. |
| --predict, -n / --n-predict | -n,--n-predict | ✅ | |
| --ctx-size, -c | -c,--ctx-size | ✅ | |
| --batch-size, -b | -b,--batch-size | ✅ | |
| --ubatch-size | --ubatch-size | ✅ | Landed 2026-05-20. |
| --threads, -t | -t,--threads | ✅ | |
| --threads-batch | --threads-batch | ✅ | Landed 2026-05-20. -1 mirrors --threads. |
| --seed | --seed | ✅ | |
| --temp | --temp | ✅ | |
| --top-k | --top-k | ✅ | |
| --top-p | --top-p | ✅ | |
| --min-p | --min-p | ✅ | |
| --repeat-penalty | --repeat-penalty | ✅ | |
| --repeat-last-n | --repeat-last-n | ✅ | Landed 2026-05-20. |
| --presence-penalty / --frequency-penalty | --presence-penalty / --frequency-penalty | ✅ | Landed 2026-05-20. |
| --typical | --typical | ✅ | Landed 2026-05-20. Maps to sampling.typ_p; 1.0 disables. |
| --top-nsigma | --top-nsigma | ✅ | Landed 2026-05-20. sampling.top_n_sigma; -1 disables. |
| --xtc-probability / --xtc-threshold | same | ✅ | Landed 2026-05-20. |
| --dry-* (multiplier/base/allowed-length/penalty-last-n/sequence-breaker) | --dry-multiplier / --dry-base / --dry-allowed-length / --dry-penalty-last-n / --dry-sequence-breaker | ✅ | Landed 2026-05-20. Sequence-breaker is repeatable. |
| --mirostat / --mirostat-ent / --mirostat-lr | same | ✅ | Landed 2026-05-20. |
| --samplers / --sampler-seq | --samplers (sampler-seq 🚫) | 🟡 | Landed 2026-05-20. --samplers parses the same ';'-separated name list as llama-cli (via common_sampler_types_from_names(names, allow_alt_names=true)). --sampler-seq (single-char form) not added: same surface, redundant. |
| --dynatemp-range / --dynatemp-exp | same | ✅ | Landed 2026-05-20. |
| --logit-bias | --logit-bias | ✅ | Landed 2026-05-20. Repeatable, format `"(+ |
| --ignore-eos | --ignore-eos | ✅ | Landed 2026-05-20. |
| --grammar / --grammar-file / --json-schema / --json-schema-file | same | ✅ | Landed 2026-05-20. JSON schema converted via json_schema_to_grammar. Mutually exclusive group. End-to-end smoke verified. |
| --flash-attn | --flash-attn | ✅ | Landed 2026-05-20. Available on gen/chat/embed. |
| --mmap / --mlock | --no-mmap / --mlock | ✅ | Landed 2026-05-20. use_mmap default stays true; --no-mmap to opt out. |
| --gpu-layers | --gpu-layers | ✅ | |
| --main-gpu / --tensor-split / --split-mode | same | ✅ | Landed 2026-05-20. --split-mode accepts none/layer/row/tensor; --tensor-split parses comma-separated floats. |
| --device / --list-devices | --device only | 🟡 | --device landed 2026-05-20 (comma-separated device list). --list-devices skipped; better fit as a chimera info extension. |
| --n-cpu-moe / --cpu-moe | same | ✅ | Landed 2026-05-20. Both manipulate llama_model_params.tensor_buft_overrides via the upstream inline helpers llm_ffn_exps_cpu_override() and llm_ffn_exps_block_regex(i). They stack with --override-tensor. |
| --override-tensor / --override-kv | same | ✅ | Landed 2026-05-20. --override-tensor parses <pattern>=<buft_name> (multiple, comma-separated; backend lookup via ggml_backend_dev_buffer_type enumeration). --override-kv reuses upstream's string_parse_kv_override so the KEY=TYPE:VALUE grammar matches exactly. Both repeatable on the CLI. |
| --cache-type-k / --cache-type-v | same | ✅ | Landed 2026-05-20. Accepts f32/f16/bf16/q8_0/q5_0/q5_1/q4_0/q4_1/iq4_nl. End-to-end smoke verified. |
| --rope-freq-base / --rope-freq-scale / --rope-scaling / --rope-scale | same | ✅ | Landed 2026-05-20. --rope-scaling accepts none/linear/yarn/longrope. |
| --yarn-* (orig-ctx, ext-factor, attn-factor, beta-fast, beta-slow) | same | ✅ | Landed 2026-05-20. |
| --lora / --lora-scaled | --lora <path[:scale]> | ✅ | Landed 2026-05-20. Repeatable. Reuses the serve-side path[:scale] parser. Closes the asymmetry. |
| --mmproj | --mmproj | ✅ | |
| --mmproj-offload / --mmproj-auto / --mmproj-url | --no-mmproj-offload | 🟡 | --no-mmproj-offload landed 2026-05-20 (maps to mtmd_context_params.use_gpu). --mmproj-auto not modeled by upstream at b9119; --mmproj-url is network-fetch (out of scope). |
| --image | --image (gen; repeatable) | ✅ | chat injects images via /image REPL command. |
| --image-min-tokens / --image-max-tokens | same | ✅ | Landed 2026-05-20. Wired on both gen and chat mtmd paths via mtmd_context_params.{image_min,image_max}_tokens. -1 / 0 leaves the model's metadata default. |
| --system-prompt | --system (chat) | 🔀 | Renamed; gen lacks it (it's an interactive concept). |
| --system-prompt-file | --system-prompt-file (chat) | ✅ | |
| --chat-template | --chat-template (chat) | ✅ | |
| --chat-template-file / --chat-template-kwargs | same | ✅ | Landed 2026-05-20 (chat only). --chat-template-file is mutually exclusive with --chat-template. --chat-template-kwargs is repeatable. |
| --jinja | --no-jinja | ✅ | Landed 2026-05-20 (chat only). Jinja defaults ON; --no-jinja opts out. |
| --reasoning / --reasoning-budget / --reasoning-format / --reasoning-budget-message | same | ✅ | Landed 2026-05-20 (chat only). --reasoning-budget enforcement landed in a follow-up the same day: command_chat probes the template via a dummy common_chat_templates_apply to read thinking_{start,end}_tag, tokenizes via common_tokenize(parse_special=true), populates sampling.reasoning_budget_{tokens,start,end,forced}, and common_sampler_init chains the budget sampler into the chain. --reasoning-budget-message is tokenized into the forced-termination sequence as <message> + <end_tag> (mirrors llama-cli). When the active template has no thinking tags, a warning fires and the budget is silently ignored. |
| --keep | — | 🚫 | Architecture mismatch. --keep controls how many tokens llama-cli preserves when its sliding-window context-shift triggers on overflow. Chimera's chat reuses KV-prefix across turns and doesn't run llama-cli's shift loop; gen is one-shot. The field has no effect in chimera's code path. |
| --color | --color (chat) | ✅ | gen is non-interactive so this is fine. |
| --verbose-prompt / --special / --escape / --no-context-shift | — | 🚫 | Debug/edge; out-of-scope. |
| --prompt-cache / --prompt-cache-all / --prompt-cache-ro | — | 🚫 | Tied to llama-cli's prompt-cache on-disk format; nicer to layer above. |
| --ctx-checkpoints / --checkpoint-every-n-tokens | — | 🚫 | Server-only fields (common_params server block: n_ctx_checkpoints, checkpoint_every_nt). Not consumed by chimera's CLI subcommands. Re-evaluate if chimera serve ever surfaces them. |
| --swa-full | --swa-full | ✅ | Landed 2026-05-20. Wired on llama_context_params.swa_full. |
| --cache-ram | — | 🚫 | Server-only field (common_params.cache_ram_mib). Out of scope for the CLI subcommands. |
| --n-predict shorthand -n | covered | ✅ | |
| --single-turn / --interactive / --interactive-first / --in-prefix / --in-prefix-bos / --in-suffix / --reverse-prompt / --multiline-input / --conversation / --display-prompt / --simple-io / --print-token-count | — | 🚫 | Upstream's interactive REPL; chimera replaces with its own chat + linenoise. Do not port. |
| --no-warmup | — | 🚫 | |
| --hf-repo / --hf-file / --hf-token / --model-url / --offline / --docker-repo | — | 🚫 | Network model fetch; chimera assumes the user supplies a local path. |
| --cpu-mask* / --cpu-range* / --cpu-strict* / --prio* / --poll* | — | 🚫 | Thread-affinity knobs; specialist usage. |
| --draft* / --spec-* (~30 flags) | — | 🚫 | Speculative decoding. Out of scope until chimera grows a draft-model story. |
| --control-vector* | same | ✅ | Landed 2026-05-20. --control-vector PATH (scale=1.0), --control-vector-scaled PATH:SCALE, --control-vector-layer-start/-end N. Loaded via common_control_vector_load and applied via llama_set_adapter_cvec after context init. Layer defaults: start=1, end=llama_model_n_layer(model). Both load flags repeatable and comma-separable. |
| --diffusion-* (algorithm/steps/eps/etc.) | — | 🚫 | llama.cpp diffusion LM support; not the same thing as chimera sd. |
| --hellaswag* / --winogrande* / --multiple-choice* / --ppl* / --kl-divergence / --perplexity* | — | 🚫 | Eval-only. |
| --logits-output-dir / --save-logits / --save-all-logits | — | 🚫 | |
| --epochs / --learning-rate* / --optimizer / --weight-decay / --method / --pca-* | — | 🚫 | Training/fine-tune. |
| --license / --version / --help / --completion-bash / --list-devices | partial via top-level chimera | ✅/❌ | --version via chimera -V; --list-devices could be a nice chimera info follow-up. |
| --log-file / --log-disable / --log-colors / --log-prefix / --log-timestamps / --verbosity | partial via top-level -v | 🟡 | Chimera has a single -v/--verbose; finer-grained log control isn't exposed. |
| --no-host / --api-key* / --api-prefix | — | 🚫 | Server-mode only. |

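
The --lora row above describes a path[:scale] argument. A minimal sketch of how such an argument could be split is shown here; it is not chimera's serve-side parser. LoraSpec and parse_lora_spec are hypothetical names, and the 1.0 default scale and the "split at the last colon" rule are assumptions made for illustration.

```cpp
#include <cstdlib>
#include <string>

// Hypothetical result type; chimera's real option struct may differ.
struct LoraSpec {
    std::string path;
    float scale;
};

// Parse "path[:scale]". The text after the LAST ':' is treated as a scale only
// if all of it parses as a float; otherwise the whole argument is the path.
// The 1.0 default scale is an assumption, not taken from chimera's source.
inline LoraSpec parse_lora_spec(const std::string & arg) {
    LoraSpec spec{arg, 1.0f};
    const std::string::size_type colon = arg.rfind(':');
    if (colon == std::string::npos || colon == 0 || colon + 1 == arg.size()) {
        return spec; // no usable ":scale" suffix
    }
    const std::string suffix = arg.substr(colon + 1);
    char * end = nullptr;
    const float value = std::strtof(suffix.c_str(), &end);
    if (end == suffix.c_str() + suffix.size()) {
        spec.path  = arg.substr(0, colon);
        spec.scale = value;
    }
    return spec;
}
```

Requiring the whole suffix to parse as a number keeps drive-letter paths such as C:\models\style.gguf intact, since the text after that colon is not numeric.
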
| Chimera flag | Status | Notes |
|---|---|---|
| --persist, --resume, --list, --search, --list-limit, --db | ✅ chimera-native | Backed by the embedded SQLite tables. No equivalent in llama-cli. |

All five priorities from the original audit landed on 2026-05-20 (--flash-attn, grammar/json-schema, DRY + repeat-last-n, --lora in gen/chat, reasoning family). Residual items:

- Long-tail gen/chat closer (19 flags landed). ✅ Landed 2026-05-20: --typical, --top-nsigma, --xtc-probability/--xtc-threshold, --dynatemp-range/--dynatemp-exp, --samplers, --threads-batch, --swa-full, --image-min-tokens/--image-max-tokens, --cpu-moe/--n-cpu-moe, --override-tensor/--override-kv, full --control-vector* family. Four upstream flags (--keep, --ctx-checkpoints, --checkpoint-every-n-tokens, --cache-ram) reclassified 🚫: --keep is upstream's context-shift loop (chimera uses KV-prefix reuse; no shift), the other three are server-only common_params fields not consumed by the CLI subcommands.
- --reasoning-budget enforcement. ✅ Landed 2026-05-20. The earlier "needs chat_sample_loop restructure" comment turned out to be wrong on closer reading: common_sampler_init itself chains common_reasoning_budget_init into the sampler whenever the sampling.reasoning_budget_{tokens,start,end,forced} fields are populated, so the integration is entirely upstream of common_sampler_init and no sample-loop changes are needed. Implementation: command_chat probes the active chat template once at startup via a dummy common_chat_templates_apply, reads thinking_{start,end}_tag, tokenizes with parse_special=true, and stuffs the result into the sampling params before make_sampler. Forced-termination sequence = --reasoning-budget-message + thinking_end_tag. Templates without thinking tags warn and ignore the budget.
- --list-devices. ✅ Landed 2026-05-20 as chimera info --list-devices.
- --mmproj-auto: not modeled by mtmd_context_params at llama.cpp b9119. Revisit on next pin bump.

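
The --reasoning-budget item above describes how the forced-termination sequence is assembled from the budget message and the template's thinking end tag. The sketch below shows only that assembly step, under stated assumptions: Token, ReasoningBudget, toy_tokenize and plan_reasoning_budget are hypothetical stand-ins, the tokenizer is a one-token-per-byte toy rather than common_tokenize, and tokenizing the message and end tag separately before concatenating them is an assumption about ordering, not a confirmed detail of command_chat.

```cpp
#include <string>
#include <vector>

typedef int Token; // stand-in for llama_token

// Hypothetical mirror of sampling.reasoning_budget_{tokens,start,end,forced}.
struct ReasoningBudget {
    bool enabled;
    int tokens;
    std::vector<Token> start;
    std::vector<Token> end;
    std::vector<Token> forced;
};

// Toy tokenizer: one token per byte. A real caller would use the model's tokenizer.
inline std::vector<Token> toy_tokenize(const std::string & text) {
    std::vector<Token> out;
    for (unsigned char ch : text) {
        out.push_back(static_cast<Token>(ch));
    }
    return out;
}

// Build the budget plan. When the template exposes no thinking tags the plan
// stays disabled, which is where the caller would emit its warning.
inline ReasoningBudget plan_reasoning_budget(int budget,
                                             const std::string & start_tag,
                                             const std::string & end_tag,
                                             const std::string & message) {
    ReasoningBudget plan;
    plan.enabled = false;
    plan.tokens  = budget;
    if (start_tag.empty() || end_tag.empty()) {
        return plan;
    }
    plan.enabled = true;
    plan.start   = toy_tokenize(start_tag);
    plan.end     = toy_tokenize(end_tag);
    // Forced-termination sequence = message tokens followed by end-tag tokens.
    plan.forced  = toy_tokenize(message);
    plan.forced.insert(plan.forced.end(), plan.end.begin(), plan.end.end());
    return plan;
}
```
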
- Anything under "interactive REPL" or "prompt cache on disk": chimera owns its own REPL via chat + linenoise and persists via SQLite.
- HuggingFace/docker/network model fetch: chimera takes local paths.
- Speculative-decoding and draft-model flags: out of scope until chimera adds a draft-model wrapper.
- All training / perplexity / hellaswag / imatrix / cvector / pca / optimizer flags.
- CPU mask / affinity / strict / poll / prio knobs (specialist usage).
- llama.cpp's --diffusion-* flags: refer to diffusion-LMs, not stable-diffusion.cpp.

llama-embedding was retired as a standalone binary in current llama.cpp — the same flag set is now available via llama-cli with --embedding. Coverage here is excellent because the surface is small.
| Upstream flag | Chimera equivalent | Status | Notes |
|---|---|---|---|
| --model, -m | -m,--model | ✅ | |
| --prompt, -p / file via -f | -p,--prompt / -f,--prompt-file | ✅ | Stdin via -. |
| --embedding / --embeddings | implicit (subcommand intent) | ✅ | Chimera dispatches embed mode automatically. |
| --pooling {none,mean,cls,last,rank} | --pooling | ✅ | rank (reranker) landed 2026-05-20: LLAMA_POOLING_TYPE_RANK is now accepted alongside mean/cls/last/none. |
| --embd-normalize N (-1 / 0 / 1 / 2 / >2) | --no-normalize flag | 🔀 | Chimera reduces to a boolean (L2 or off). Loses access to taxicab/p-norm. Acceptable simplification; document the choice. |
| --embd-output-format | --embd-output-format | ✅ | Landed 2026-05-20. Values: '' (default; space-separated, preserves prior output), array, json (OpenAI envelope), raw. json+ (cosine-similarity matrix add-on) not implemented. |
| --embd-separator | --embd-separator | ✅ | Landed 2026-05-20. Literal-string splitter (no regex); emits one vector per piece. |
| --ctx-size, -c | -c,--ctx-size | ✅ | |
| --batch-size, -b | -b,--batch-size | ✅ | |
| --threads, -t | -t,--threads | ✅ | |
| --gpu-layers | --gpu-layers | ✅ | |
| --attention {causal,non-causal} | --attention | ✅ | Landed 2026-05-20. Pins llama_context_params.attention_type; empty leaves the model default. |
| --flash-attn | --flash-attn | ✅ | Landed 2026-05-20. |
| --cls-separator | — | 🚫 | Eval/retrieval-specific. |
| --chunk / --chunks / --chunk-size / --chunk-separator | — | 🚫 | Belongs to chimera's own index/search layer, not the model invocation. |
| --output-format (general) | — | 🚫 | See --embd-output-format. |

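
The --embd-normalize row above says chimera reduces upstream's numeric selector to a boolean: L2 normalization unless --no-normalize is given. A minimal sketch of that reduction follows; l2_normalize and finish_embedding are hypothetical names, and leaving an all-zero vector untouched is an assumption rather than documented behaviour.

```cpp
#include <cmath>
#include <vector>

// In-place Euclidean (L2) normalization; an all-zero vector is left unchanged.
inline void l2_normalize(std::vector<float> & v) {
    double sum_sq = 0.0;
    for (float x : v) {
        sum_sq += static_cast<double>(x) * x;
    }
    if (sum_sq <= 0.0) {
        return;
    }
    const float inv_norm = static_cast<float>(1.0 / std::sqrt(sum_sq));
    for (float & x : v) {
        x *= inv_norm;
    }
}

// Hypothetical finishing step: the boolean flag either skips or applies L2.
inline std::vector<float> finish_embedding(std::vector<float> v, bool no_normalize) {
    if (!no_normalize) {
        l2_normalize(v);
    }
    return v;
}
```

For the vector (3, 4) this yields (0.6, 0.8) by default and the unmodified input when normalization is disabled; the taxicab and p-norm variants have no counterpart in this boolean shape.
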
- --cache-embeddings / --cache-db: SQLite memoization layer. No upstream analogue.
- -o,--output: chimera writes to a file/stdout instead of embedding.txt style upstream behavior. Cleaner.

- --embd-output-format: ✅ Landed 2026-05-20.
- --embd-separator: ✅ Landed 2026-05-20.
- --attention causal|non-causal: ✅ Landed 2026-05-20.
- Pooling rank value: ✅ Landed 2026-05-20.

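
The embed table describes --embd-separator as a literal-string splitter (no regex) that emits one vector per piece. The sketch below illustrates literal splitting only; split_literal is a hypothetical name, and two details are assumptions: empty pieces are kept, and an empty separator returns the input as a single piece.

```cpp
#include <string>
#include <vector>

// Split text on every literal occurrence of sep. No regex semantics apply, so
// characters such as '.' or '*' in sep match only themselves.
inline std::vector<std::string> split_literal(const std::string & text,
                                              const std::string & sep) {
    std::vector<std::string> pieces;
    if (sep.empty()) {
        pieces.push_back(text); // assumption: empty separator means "do not split"
        return pieces;
    }
    std::string::size_type start = 0;
    for (;;) {
        const std::string::size_type pos = text.find(sep, start);
        if (pos == std::string::npos) {
            pieces.push_back(text.substr(start));
            break;
        }
        pieces.push_back(text.substr(start, pos - start));
        start = pos + sep.size();
    }
    return pieces;
}
```

Each returned piece would then be embedded on its own, which is what produces one vector per piece.
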
embed picked up --flash-attn, --ubatch-size, --no-mmap, --mlock, --main-gpu, --tensor-split, --split-mode, --device, and the full RoPE / YaRN family (--rope-freq-base, --rope-freq-scale, --rope-scale, --rope-scaling, --yarn-orig-ctx, --yarn-ext-factor, --yarn-attn-factor, --yarn-beta-fast, --yarn-beta-slow). These aren't part of llama-embedding's historic surface but are useful for embedding models on long-context fine-tunes / multi-GPU.
- All chunking flags (--chunk*): chimera handles chunking at the index/search layer.
- --cls-separator and other retrieval-helper flags: same reasoning.

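whisper-cli has a flat ~58-flag surface. The original audit found chimera exposing 5 of them: a deliberately minimal wrapper, but with several unforced gaps, particularly around output formats and VAD. After the 2026-05-20 closer the summary table counts 46 exposed, and the table below reflects that state.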
| Upstream flag (short) | Chimera | Status | Notes |
|---|---|---|---|
| -m / --model | -m,--model | ✅ | |
| -f / --file | -i,--input | 🔀 | Renamed; upstream supports repeating; chimera takes one. |
| -t / --threads | -t,--threads | ✅ | |
| -p / --processors | --processors | ✅ | Landed 2026-05-20. >1 routes through whisper_full_parallel; default 1 keeps the serial path. |
| -l / --language | -l,--language | ✅ | |
| -dl / --detect-language | --detect-language | ✅ | Landed 2026-05-20. Sets whisper_full_params.detect_language = true; whisper.cpp itself short-circuits before any decode pass (see whisper.cpp ~line 6815; returns 0 after whisper_lang_auto_detect_with_state). chimera reads the detected language via whisper_full_lang_id(ctx) → whisper_lang_str(...) and prints just the code (e.g. en) to the output sink, then exits. Format-file flags are silently no-op'd since result.segments is empty after the short-circuit. Note: English-only models (*.en.bin) produce garbage codes; language detection requires a multilingual model. |
| -tr / --translate | --translate | ✅ | |
| --prompt | --prompt | ✅ | Landed 2026-05-20. Initial-prompt biasing (whisper_full_params.initial_prompt). |
| --carry-initial-prompt | --carry-initial-prompt | ✅ | Landed 2026-05-20. |
| -bs / --beam-size | --beam-size | ✅ | Landed 2026-05-20. Sets WHISPER_SAMPLING_BEAM_SEARCH when N>0. |
| -bo / --best-of | --best-of | ✅ | Landed 2026-05-20. |
| -tp / --temperature | --temperature | ✅ | Landed 2026-05-20. |
| -tpi / --temperature-inc | --temperature-inc | ✅ | Landed 2026-05-20. NaN sentinel (not negative) because the field's upstream default is positive but logprob_thold's isn't; same scheme across the four fallback knobs. --no-fallback still wins. |
| -nf / --no-fallback | --no-fallback | ✅ | Landed 2026-05-20. Sets temperature_inc<0. |
| -mc / --max-context | — | ❌ | |
| -ml / --max-len | --max-len | ✅ | Landed 2026-05-20. 0 = unlimited (whisper default). Pairs with --output-srt / --output-vtt. |
| -sow / --split-on-word | --split-on-word | ✅ | Landed 2026-05-20. Only takes effect when --max-len > 0. |
| -wt / --word-thold | — | ❌ | |
| -et / --entropy-thold / -lpt / --logprob-thold / -nth / --no-speech-thold | --entropy-thold / --logprob-thold / --no-speech-thold | ✅ | Landed 2026-05-20. NaN sentinel leaves the upstream default (necessary because logprob_thold defaults to a negative value). |
| -ot / --offset-t / -on / --offset-n / -d / --duration | --offset / --duration | 🟡 | Landed 2026-05-20 for the ms-based pair (-ot / -d). -on (sample-offset) is not exposed by whisper_full_params; it's internal to whisper-cli's WAV reader, so deliberately skipped. |
| -ac / --audio-ctx | --audio-ctx | ✅ | Landed 2026-05-20. 0 = model default; common tweak for tiny.en. |
| -fa / --flash-attn / -nfa / --no-flash-attn | --flash-attn | 🟡 | Landed 2026-05-20 as --flash-attn. --no-flash-attn is redundant (default is off) so not added. |
| -ng / --no-gpu | --no-gpu | ✅ | Landed 2026-05-20. Inverts whisper's default use_gpu=true. |
| -dev / --device | --device | ✅ | Landed 2026-05-20. Single CUDA device index (whisper's gpu_device field). Not the comma-separated list shape used by llama-side --device. |
| -di / --diarize | --diarize | ✅ | Landed 2026-05-20. Wrapper-logic feature (no whisper_full_params field). Algorithm matches whisper-cli's estimate_diarization_speaker: per segment, sum \|amplitude\| over [t0, t1] for both 16 kHz channels; the 1.1× energy ratio picks (speaker 0)/(speaker 1), otherwise (speaker ?). WavData now retains a per_channel view alongside the downmixed mono so the stereo data is available; mono inputs fail before model load with a precise message. Label is both stamped on Segment.speaker (structured) and prefixed to Segment.text so existing format writers (SRT/VTT/JSON/CSV/LRC) render it without changes. |
| -tdrz / --tinydiarize | --tinydiarize | ✅ | Landed 2026-05-20. Requires a tdrz-trained model; silently ignored on others. |
| -otxt / -ovtt / -osrt / -ocsv / -olrc / -oj / -ojf | --output-txt / --output-vtt / --output-srt / --output-csv / --output-lrc / --output-json / --output-json-full | ✅ | Landed 2026-05-20. CLI11 rejects multi-char short flags, so long-only here (no -osrt aliases). All combinable; segment-level timestamps auto-enabled when any format is requested. |
| -owts | — | 🚫 | Karaoke video script; depends on font/ffmpeg toolchain. |
| -of / --output-file | --output-file | ✅ | Landed 2026-05-20. Base name; defaults to input WAV's stem. Each enabled format writes <base>.<ext>. |
| -fp / --font-path | — | 🚫 | Karaoke-only. |
| --timestamps (chimera) ↔ -nt / --no-timestamps | --timestamps flag | 🔀 | Inverted polarity vs upstream default. Document this; don't change. |
| --no-context | --no-context | ✅ | |
| --vad | --vad | ✅ | Landed 2026-05-20. Requires --vad-model; chimera fails with BadInput if the toggle is set without the model path. |
| --vad-model / --vad-threshold / --vad-min-speech-duration-ms / --vad-min-silence-duration-ms / --vad-max-speech-duration-s / --vad-speech-pad-ms / --vad-samples-overlap | same | ✅ | Landed 2026-05-20. Numeric knobs inherit whisper_vad_default_params() when unset (negative-one sentinels). |
| -sns / --suppress-nst / --suppress-regex | --suppress-nst / --suppress-regex | ✅ | Landed 2026-05-20. Regex is matched against token strings; empty string leaves the default. |
| --grammar / --grammar-rule / --grammar-penalty | same (plus --grammar-file) | ✅ | Landed 2026-05-20. Vendored whisper.cpp's examples/grammar-parser.{h,cpp} (~450 LOC, MIT) as src/chimera/chimera_whisper_grammar.{h,cpp}; whisper ships the parser in examples/ rather than libwhisper, so reuse meant copying. --grammar-rule defaults to "root" (whisper-cli convention); --grammar-penalty defaults to 100.0 (matches whisper-cli). --grammar-file added as a chimera-side ergonomic. Mutual-exclusion + bad-rule-name + GBNF parse errors all fire before whisper_full runs. The parser produces a parse_state whose rules outlive the borrowed pointer view (c_rules() output), so command_whisper keeps both on its stack frame for the duration of transcribe(). Verified end-to-end on JFK sample with a literal-string grammar; output is constrained as expected. |
| -dtw / --dtw | — | ❌ | Token-level timestamps. |
| -oved / --ov-e-device | — | 🚫 | OpenVINO-only. |
| -debug / --debug-mode / -np / --no-prints / -ps / --print-special / -pc / --print-colors / --print-confidence / -pp / --print-progress / -ls / --log-score | — | 🚫 | Debug / logging cosmetics; chimera owns its own logging. |

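
The --diarize row above describes the energy-ratio rule: sum the absolute amplitude of each stereo channel over the segment and pick a speaker only when one channel exceeds the other by the 1.1× ratio. A minimal sketch of that rule follows. estimate_speaker is a hypothetical name, the millisecond time unit and the clamping of out-of-range times are assumptions, and the function is a simplified illustration rather than a copy of whisper-cli's estimate_diarization_speaker.

```cpp
#include <algorithm>
#include <cmath>
#include <cstddef>
#include <string>
#include <vector>

// Label a segment by comparing per-channel energy over [t0_ms, t1_ms).
// left/right are the two channels of a stereo recording at sample_rate Hz.
inline std::string estimate_speaker(const std::vector<float> & left,
                                    const std::vector<float> & right,
                                    long long t0_ms, long long t1_ms,
                                    int sample_rate = 16000) {
    const long long n = static_cast<long long>(std::min(left.size(), right.size()));
    // Convert times to sample indices and clamp them into [0, n].
    const long long i0 = std::max(0LL, std::min(n, t0_ms * sample_rate / 1000));
    const long long i1 = std::max(0LL, std::min(n, t1_ms * sample_rate / 1000));

    double energy0 = 0.0;
    double energy1 = 0.0;
    for (long long i = i0; i < i1; ++i) {
        energy0 += std::fabs(left[static_cast<std::size_t>(i)]);
        energy1 += std::fabs(right[static_cast<std::size_t>(i)]);
    }

    if (energy0 > 1.1 * energy1) return "(speaker 0)";
    if (energy1 > 1.1 * energy0) return "(speaker 1)";
    return "(speaker ?)"; // neither channel dominates by the 1.1x margin
}
```

Because the rule needs two channels to compare, it has nothing to work with on mono input, which fits the table's note that mono inputs are rejected before model load.
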
- Output-format family (-osrt/-ovtt/-oj/-ojf/-ocsv/-olrc). ✅ Landed 2026-05-20.
- VAD bundle (--vad + the seven knobs). ✅ Landed 2026-05-20. --vad requires --vad-model; tuning knobs use -1 sentinels to inherit whisper_vad_default_params().
- --prompt / --carry-initial-prompt. ✅ Landed 2026-05-20.
- Decoding strategy (--beam-size, --best-of, --temperature, --no-fallback). ✅ Landed 2026-05-20.
- Offset/duration (-ot, -d). ✅ Landed 2026-05-20 as --offset / --duration (ms-based). -on is internal to whisper-cli's WAV reader and not exposed by whisper_full_params, so deliberately skipped.
- Segment shaping + decoder thresholds + audio-ctx + tinydiarize + suppression + flash-attn/no-gpu/device + processors. ✅ Landed 2026-05-20 as Batches 1–3 of the whisper closer (see CHANGELOG).

Remaining out-of-scope or deferred (do not re-flag): --dtw token-level DTW (niche), -wt / --word-thold (we already emit per-word timing in --output-json-full), OpenVINO device selection, and a handful of decoder-print toggles (-pc/-pp/-ls/-debug/-np/-ps/--print-confidence) where chimera owns its own logging. The --grammar family, stereo --diarize, and --detect-language were previously listed here; all three landed 2026-05-20 — see the whisper coverage table above.
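
The whisper table notes that the four fallback knobs use a NaN sentinel instead of a negative one, because logprob_thold's own default is negative. The sketch below shows the idea under stated assumptions: FallbackParams and FallbackOverrides are hypothetical stand-ins for the relevant whisper_full_params fields and chimera's option struct, and the default values are illustrative rather than read from whisper.cpp at the pinned version.

```cpp
#include <cmath>
#include <limits>

// Stand-in for the four fallback-related fields of whisper_full_params.
struct FallbackParams {
    float temperature_inc;
    float entropy_thold;
    float logprob_thold;
    float no_speech_thold;
};

// Illustrative defaults only; note that logprob_thold is legitimately negative.
inline FallbackParams illustrative_defaults() {
    return {0.2f, 2.4f, -1.0f, 0.6f};
}

// Hypothetical CLI-side overrides: NaN means "flag not given".
struct FallbackOverrides {
    float temperature_inc = std::numeric_limits<float>::quiet_NaN();
    float entropy_thold   = std::numeric_limits<float>::quiet_NaN();
    float logprob_thold   = std::numeric_limits<float>::quiet_NaN();
    float no_speech_thold = std::numeric_limits<float>::quiet_NaN();
};

// Copy a CLI value over the default only when the user actually supplied one.
inline void apply_if_set(float & field, float cli_value) {
    if (!std::isnan(cli_value)) {
        field = cli_value;
    }
}

inline FallbackParams apply_overrides(FallbackParams params, const FallbackOverrides & cli) {
    apply_if_set(params.temperature_inc, cli.temperature_inc);
    apply_if_set(params.entropy_thold,   cli.entropy_thold);
    apply_if_set(params.logprob_thold,   cli.logprob_thold);
    apply_if_set(params.no_speech_thold, cli.no_speech_thold);
    return params;
}
```

With a -1 sentinel, a user passing --logprob-thold -1 would be indistinguishable from a user passing nothing; NaN is never a meaningful threshold, so it avoids that collision.
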
- Karaoke / --font-path plumbing.
- OpenVINO device selection (-oved).
- All debug-print toggles: chimera has its own log control.
- -dtw (token-level DTW): niche.

In the original audit, sd remained the largest source of meaningful drift even after the Z-Image/Flux/SD3 model-loading gap was closed; the 2026-05-20 closer summarized above supersedes that finding. examples/common/common.cpp declares 107 unique long flags across model loading, perf, sampler, generation, and hires/video extensions.
| Upstream flag | Chimera | Status | Notes |
|---|---|---|---|
| --model, -m | -m,--model | ✅ | |
| --diffusion-model | --diffusion-model | ✅ | Landed in the audit that prompted this report. |
| --high-noise-diffusion-model | --high-noise-diffusion-model | ✅ | Landed 2026-05-20. Model-loading slot only; the full --high-noise-* sampler family is video-only and stays out of scope (chimera-sd is img_gen-only). |
| --vae | --vae | ✅ | |
| --taesd / --tae | --taesd | ✅ | Landed 2026-05-20. TAESD fast preview decode. Single --taesd (no --tae alias). |
| --clip_l | --clip-l | 🔀 | Naming drift. Upstream uses underscore; chimera uses kebab. Stay with kebab in chimera (project convention) but document. |
| --clip_g | --clip-g | 🔀 | Landed 2026-05-20. Naming drift (kebab vs underscore) tracked above. |
| --clip_vision | --clip-vision | 🔀 | Landed 2026-05-20. Kebab-cased per chimera convention. |
| --t5xxl | --t5xxl | ✅ | |
| --llm | --llm | ✅ | Z-Image text encoder. |
| --llm_vision / --qwen2vl / --qwen2vl_vision | --llm-vision (others 🚫) | 🟡 | --llm-vision landed 2026-05-20 (kebab). --qwen2vl is a deprecated alias of --llm; safe to skip. --qwen2vl_vision not modeled here. |
| --control-net | --control-net | ✅ | Landed 2026-05-20. Wired into sd_ctx_params_t.control_net_path. --control-image requires this. |
| --embd-dir | --embd-dir | ✅ | Landed 2026-05-20. Non-recursive scan for .gguf/.safetensors/.pt; filename stem becomes the prompt token. Validated before new_sd_ctx (non-directory exits with BadInput). Pointer-lifetime detail: the kv vector owns the strings, the sd_embedding_t vector borrows from it and is built only after the kv vector is fully sized to avoid realloc-induced pointer dangle. |
| --lora-model-dir | --lora-model-dir | ✅ | Landed 2026-05-20. Base directory used to resolve relative --lora paths (chimera-side; sd.cpp's C API takes resolved paths in sd_lora_t). |
| --photo-maker | --photo-maker | ✅ | Landed 2026-05-20. Model path only; paired with the PhotoMaker generation bundle below. |
| --upscale-model / --hires-upscalers-dir | --upscale-model (hires-upscalers-dir 🚫) | 🟡 | --upscale-model landed 2026-05-20 (sd_hires_params_t.model_path, used with --hires-upscaler Model). --hires-upscalers-dir is sd-cli-shell-only directory scan; out of scope. |
| --tensor-type-rules | --tensor-type-rules | ✅ | Landed 2026-05-20. Per-tensor wtype override. |
| --type | --type | ✅ | Landed 2026-05-20. Maps to sd_ctx_params_t.wtype via str_to_sd_type; unknown values exit with BadInput. |

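
The --embd-dir row above mentions a pointer-lifetime detail: the owning vector of strings must be fully sized before the borrowing vector of C-string views is built, otherwise a reallocation can leave the views dangling. The sketch below shows that ordering. EmbeddingView is a stand-in for sd_embedding_t (its two-pointer layout is an assumption here), and OwnedEmbeddings and make_views are hypothetical names.

```cpp
#include <string>
#include <utility>
#include <vector>

// Stand-in for a C-API struct that only borrows its strings.
struct EmbeddingView {
    const char * name;
    const char * path;
};

// Owning storage: (name, path) pairs discovered by the directory scan.
typedef std::vector<std::pair<std::string, std::string> > OwnedEmbeddings;

// Build the borrowed views only after `owned` has reached its final size.
// The caller must keep `owned` alive and unmodified while the views are used:
// any later push_back may reallocate the pairs and invalidate every pointer.
inline std::vector<EmbeddingView> make_views(const OwnedEmbeddings & owned) {
    std::vector<EmbeddingView> views;
    views.reserve(owned.size());
    for (const auto & kv : owned) {
        views.push_back(EmbeddingView{kv.first.c_str(), kv.second.c_str()});
    }
    return views;
}
```

Interleaving the two steps, pushing a pair and immediately taking its c_str(), is the pattern this ordering avoids: short strings live inside the pair object itself, so a vector reallocation moves the bytes a view points at.
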
| Upstream flag | Chimera | Status | Notes |
|---|---|---|---|
| --threads | -t,--threads | ✅ | |
| --offload-to-cpu | --offload-to-cpu | ✅ | Landed in audit. |
| --max-vram | --max-vram | ✅ | Landed 2026-05-20. Soft VRAM cap in GiB; 0 leaves the upstream default. |
| --mmap | --no-mmap | 🔀 | Landed 2026-05-20 with inverted polarity. Chimera defaults enable_mmap=true (sd's upstream default is off), so --no-mmap is the opt-out; mirrors the llama-side flag. |
| --fa | --fa | ✅ | Landed 2026-05-20. Global flash-attn (sd_ctx_params_t.flash_attn); distinct from --diffusion-fa which only flips the diffusion path. |
| --diffusion-fa | --diffusion-fa | ✅ | Landed in audit. |
| --diffusion-conv-direct / --vae-conv-direct | same | ✅ | Landed 2026-05-20. Map directly to sd_ctx_params_t.{diffusion,vae}_conv_direct. |
| --clip-on-cpu / --vae-on-cpu / --control-net-cpu | same | ✅ | Landed 2026-05-20. Per-component CPU offload; more surgical than --offload-to-cpu. |
| --force-sdxl-vae-conv-scale | --force-sdxl-vae-conv-scale | ✅ | Landed 2026-05-20. SDXL VAE conv-scale numerics fix. |


| Upstream flag | Chimera | Status | Notes |
|---|---|---|---|
| `--prompt`, `-p` | `-p,--prompt` | ✅ | |
| `--negative-prompt` | `--negative-prompt` | ✅ | |
| `--width` / `-W` | `-W,--width` | ✅ | |
| `--height` / `-H` | `-H,--height` | ✅ | |
| `--steps` | `-s,--steps` | ✅ | |
| `--batch-count` | `-b,--batch-count` | ✅ | |
| `--seed` | `--seed` | ✅ | |
| `--cfg-scale` | `--cfg-scale` | ✅ | |
| `--img-cfg-scale` | `--img-cfg-scale` | ✅ | Landed 2026-05-20. Sentinel `-1` leaves the upstream `INFINITY` default so sd falls back to `--cfg-scale`. |
| `--guidance` | `--guidance` | ✅ | Landed 2026-05-20. Maps to `sd_sample_params_t.guidance.distilled_guidance`; `-1` sentinel leaves upstream default. |
| `--clip-skip` | `--clip-skip` | ✅ | |
| `--sampling-method` | `--sample-method` | 🔀 | Naming drift (`sampling` vs `sample`). Document. |
| `--scheduler` | `--scheduler` | ✅ | |
| `--sigmas` | `--sigmas` | ✅ | Landed 2026-05-20. Comma-separated float list (e.g. `"14.6,10.0,5.0,1.0"`); non-float entries exit with `BadInput`; the parsed `std::vector<float>` is borrowed into `sd_sample_params_t.custom_sigmas` for the duration of `generate_image`. |
| `--rng` / `--sampler-rng` | same | ✅ | Landed 2026-05-20. Resolved via `str_to_rng_type`; `--sampler-rng cpu` matches ComfyUI seeds. |
| `--prediction` | `--prediction` | ✅ | Landed 2026-05-20. Enum string resolved via `str_to_prediction`: `eps`/`v`/`edm_v`/`flow`/`flux_flow`/`flux2_flow`. CLI11-validated. |
| `--eta` | `--eta` | ✅ | Landed 2026-05-20. DDIM-style stochasticity in [0,1]; sentinel `-1` leaves the upstream `INFINITY` default. |
| `--flow-shift` | `--flow-shift` | ✅ | Landed 2026-05-20. Maps to `sd_sample_params_t.flow_shift`. |
| `--timestep-shift` | `--timestep-shift` | ✅ | Landed 2026-05-20. Maps to `sd_sample_params_t.shifted_timestep`; 0 = no shift (upstream default). |
| `--moe-boundary` | — | ❌ | High-noise/low-noise MoE boundary. |
| `--slg-scale` / `--skip-layer-start` / `--skip-layer-end` / `--skip-layers` | same | ✅ | Landed 2026-05-20. `--skip-layers` parses a comma-separated int list into `sd_slg_params_t.layers`; empty disables SLG regardless of the other knobs; non-integer tokens fail with `BadInput`. Scalars use `-1.0f` sentinels. |
| `--high-noise-*` (cfg-scale, img-cfg-scale, guidance, slg-scale, skip-layer-start/end, eta, sampling-method, skip-layers, steps) | — | ❌ | Entire high-noise group missing (pairs with `--high-noise-diffusion-model`). |
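
The `--sigmas` note above describes a comma-separated float list that is rejected on the first non-float entry. A minimal sketch of that parsing step follows; `parse_sigmas` is a hypothetical name, and the rejection of empty tokens and out-of-range values is an assumption rather than documented chimera behaviour.

```cpp
#include <cerrno>
#include <cstdlib>
#include <string>
#include <vector>

// Hypothetical helper (not chimera's actual code): parse "14.6,10.0,5.0,1.0"
// into floats. Returns false on an empty token, a non-float token, or a value
// that strtof reports as out of range; the caller would map false to BadInput.
bool parse_sigmas(const std::string& text, std::vector<float>& out) {
    out.clear();
    std::size_t pos = 0;
    while (pos <= text.size()) {
        std::size_t comma = text.find(',', pos);
        if (comma == std::string::npos) comma = text.size();
        const std::string token = text.substr(pos, comma - pos);
        if (token.empty()) return false;
        char* end = nullptr;
        errno = 0;
        const float value = std::strtof(token.c_str(), &end);
        // The whole token must be consumed, otherwise it is not a plain float.
        if (errno != 0 || end != token.c_str() + token.size()) return false;
        out.push_back(value);
        pos = comma + 1;
    }
    return true;
}
```

Because the table says the parsed vector is borrowed for the duration of `generate_image`, the caller has to keep it alive until that call returns.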

| Upstream flag | Chimera | Status | Notes |
|---|---|---|---|
| `--init-img` | `--init-image` | 🔀 | Naming. |
| `--end-img` | — | ❌ | End-frame for img-to-img blending / video. |
| `--mask` | `--mask-image` | 🔀 | Naming. |
| `--control-image` | `--control-image` | ✅ | Landed 2026-05-20. Requires `--control-net`. Dimensions must match `-W`/`-H`. |
| `--control-strength` | `--control-strength` | ✅ | Landed 2026-05-20. Default 0.9; only used with `--control-image`. |
| `--control-video` | — | 🚫 | Video-only; chimera-sd is image-only today. |
| `--strength` | `--strength` | ✅ | |
| `--ref-image` | `--ref-image` | ✅ | Landed 2026-05-20. Repeatable; each entry is decoded to RGB and borrowed into `sd_img_gen_params_t.ref_images`. Companion flags `--increase-ref-index` and `--no-auto-resize-ref-image` also landed (chimera inverts sd's auto-resize default-on into an opt-out). |
| `--pm-id-images-dir` / `--pm-id-embed-path` / `--pm-style-strength` | same | ✅ | Landed 2026-05-20. `--pm-id-images-dir` scans the directory non-recursively in alphabetical order; non-image entries are skipped, an empty result is `BadInput`. Decoded images are borrowed into `sd_pm_params_t.id_images`. |

| Upstream flag | Chimera | Status | Notes |
|---|---|---|---|
| `--hires` | `--hires` | ✅ | Landed 2026-05-20. Toggles `sd_hires_params_t.enabled`. |
| `--hires-upscaler` / `--hires-width` / `--hires-height` / `--hires-steps` / `--hires-scale` / `--hires-denoising-strength` / `--hires-upscale-tile-size` | same | ✅ | Landed 2026-05-20. `--hires-upscaler` is the enum-string match against `hires_upscaler_to_str` (`None`/`Latent`/`Latent (nearest)`/`Latent (nearest-exact)`/`Latent (antialiased)`/`Latent (bicubic)`/`Latent (bicubic antialiased)`/`Lanczos`/`Nearest`/`Model`); values with spaces must be quoted at the shell. Scalar sentinels (0 for ints, -1 for floats) leave `sd_hires_params_init`'s defaults (scale=2.0, denoising=0.7, tile=128) untouched. `--upscale-model` (table above) provides the file path for `--hires-upscaler Model`. |
| `--vae-tiling` | `--vae-tiling` | ✅ | Landed 2026-05-20. Enables `sd_img_gen_params_t.vae_tiling_params.enabled`. |
| `--vae-tile-size` / `--vae-relative-tile-size` / `--vae-tile-overlap` | same | ✅ | Landed 2026-05-20. Sentinels (-1) leave the upstream default; otherwise applied symmetrically to both axes. |
| `--upscale-repeats` / `--upscale-tile-size` | — | ❌ | Standalone upscale mode. |
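
The sentinel convention used for the hires scalars (0 for ints, -1 for floats means "flag not given, keep the upstream default") can be sketched as below. Both structs are hypothetical stand-ins rather than the real `sd_hires_params_t`, and treating any negative float as unset is an assumption.

```cpp
// Stand-in for the defaults quoted above (scale=2.0, denoising=0.7, tile=128).
struct HiresSketch {
    float scale = 2.0f;
    float denoising_strength = 0.7f;
    int upscale_tile_size = 128;
};

// CLI-side values; the initial values are the sentinels.
struct HiresCli {
    float scale = -1.0f;
    float denoising_strength = -1.0f;
    int upscale_tile_size = 0;
};

// Copy only the fields the user actually set; everything else keeps its default.
HiresSketch apply_hires_overrides(HiresSketch params, const HiresCli& cli) {
    if (cli.scale >= 0.0f) params.scale = cli.scale;
    if (cli.denoising_strength >= 0.0f) params.denoising_strength = cli.denoising_strength;
    if (cli.upscale_tile_size > 0) params.upscale_tile_size = cli.upscale_tile_size;
    return params;
}
```

The benefit of this pattern is that chimera never has to duplicate upstream's default values: an untouched CLI struct leaves whatever the init function produced.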

| Upstream flag | Chimera | Status | Notes |
|---|---|---|---|
| `--video-frames` / `--fps` | — | 🚫 | Video mode out of scope for chimera-sd today (sd-cli has `vid_gen` mode). |
| `--vace-strength` / `--increase-ref-index` / `--disable-auto-resize-ref-image` | — | 🚫 | Video / VACE. |
| `--cache-mode` / `--cache-option` | same | ✅ | Landed 2026-05-20. Mirrors sd-cli's exact surface: `--cache-mode` picks the algorithm (`disabled`/`easycache`/`ucache`/`dbcache`/`taylorseer`/`cache-dit`/`spectrum`), `--cache-option` overrides per-mode tunables via `key=value,...` (15 keys with per-mode branching: `threshold`/`start`/`end`/`decay`/`relative`/`reset`/`Fn`/`Bn`/`warmup`/`w`/`m`/`lam`/`window`/`flex`/`stop`). Validated in `command_sd` before `load_model` via the chimera-side `parse_cache_options()` helper so typos exit fast. |
| `--scm-mask` / `--scm-policy` | same | ✅ | Landed 2026-05-20. `--scm-mask` borrows into `sd_cache_params_t.scm_mask` for the duration of generate; `--scm-policy` is `static` or `dynamic` (empty = sd's default `dynamic`). |
| `--lora-apply-mode` | `--lora-apply-mode` | ✅ | Landed 2026-05-20. Enum string via `str_to_lora_apply_mode`: `auto`/`immediately`/`at_runtime`. CLI11-validated. |
| `--circular` / `--circularx` / `--circulary` | — | 🚫 | Seamless-tile output; niche. |
| `--chroma-t5-mask-pad` / `--chroma-disable-dit-mask` / `--chroma-enable-t5-mask` / `--qwen-image-zero-cond-t` | — | 🚫 | Model-specific tuning; advanced. |
| `--disable-image-metadata` | — | 🚫 | Moot in chimera. sd-cli's flag disables a Civitai/A1111-style `parameters` tEXt chunk written by a patched `stbi_write_png` overload in sd's vendored fork of `stb_image_write.h`. Chimera uses stock `stb_image_write`, which writes no text chunks at all, so chimera's PNGs are already metadata-free and there is nothing to "disable". The reverse direction (embedding generation params for parity with sd-cli's default) is a separate feature, not yet on the roadmap. |
| `-o,--output` | `-o,--output` | ✅ | |
| `--mode -M {img_gen,vid_gen,upscale,convert,metadata}` | implicit | 🚫 | Chimera's `sd` subcommand is `img_gen`-only by design; other modes are out of scope today. |
| `--preview*` / `--metadata-*` | — | 🚫 | CLI-only sd-shell features; not portable into chimera. |
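
To illustrate the fail-fast validation of `--cache-option` described in the table above, here is a simplified sketch of a `key=value,...` parser. It is not the actual `parse_cache_options()` helper: the function name is hypothetical, values are kept as strings, and because this report does not spell out which keys belong to which `--cache-mode`, the sketch only checks against the global 15-key list.

```cpp
#include <map>
#include <set>
#include <string>

// Keys listed in the table above.
static const std::set<std::string> kCacheOptionKeys = {
    "threshold", "start", "end", "decay", "relative", "reset", "Fn", "Bn",
    "warmup", "w", "m", "lam", "window", "flex", "stop"};

// Hypothetical stand-in for the real helper: splits "k=v,k=v" and returns
// false (with `error` filled in) on the first malformed or unknown entry.
// An empty string means "no overrides" and is accepted.
bool parse_cache_option_sketch(const std::string& text,
                               std::map<std::string, std::string>& out,
                               std::string& error) {
    std::size_t pos = 0;
    while (pos < text.size()) {
        std::size_t comma = text.find(',', pos);
        if (comma == std::string::npos) comma = text.size();
        const std::string entry = text.substr(pos, comma - pos);
        const std::size_t eq = entry.find('=');
        // Require a non-empty key and a non-empty value around '='.
        if (eq == std::string::npos || eq == 0 || eq + 1 == entry.size()) {
            error = "expected key=value, got '" + entry + "'";
            return false;
        }
        const std::string key = entry.substr(0, eq);
        if (kCacheOptionKeys.count(key) == 0) {
            error = "unknown cache option '" + key + "'";
            return false;
        }
        out[key] = entry.substr(eq + 1);
        pos = comma + 1;
    }
    return true;
}
```

Running a check like this before the model is loaded is what lets a typo such as `treshold=0.2` fail in milliseconds instead of after a multi-gigabyte load.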
- ~~`--guidance` and `--flow-shift`.~~ ✅ Landed 2026-05-20.
- ~~`--clip_g` (alongside `--clip-l`).~~ ✅ Landed 2026-05-20 as `--clip-g`.
- ~~`--control-image` + `--control-strength` + `--control-net`.~~ ✅ Landed 2026-05-20 (ControlNet bundle).
- ~~`--vae-tiling` family.~~ ✅ Landed 2026-05-20.
- ~~`--diffusion-conv-direct` / `--vae-conv-direct`~~ ✅ Landed 2026-05-20.
- ~~Sampler-RNG / `--rng`~~ ✅ Landed 2026-05-20.
- ~~`--lora-model-dir`~~ ✅ Landed 2026-05-20 alongside `--lora <path[:scale]>` (repeatable). Note: prompt-side `<lora:foo:0.8>` extraction is not wired yet; `--lora` takes explicit paths. Follow-up.
- ~~`--type`~~ ✅ Landed 2026-05-20.
- ~~Perf/offload bundle (`--fa`, `--no-mmap`, `--max-vram`, `--clip-on-cpu`, `--vae-on-cpu`, `--control-net-cpu`, `--force-sdxl-vae-conv-scale`).~~ ✅ Landed 2026-05-20 (Round 1 of the closer).
- ~~Sampler/generation core (`--img-cfg-scale`, `--eta`, `--timestep-shift`, `--sigmas`, `--prediction`, `--lora-apply-mode`).~~ ✅ Landed 2026-05-20 (Round 2).
- ~~Model-loading completers (`--taesd`, `--clip-vision`, `--llm-vision`, `--tensor-type-rules`, `--photo-maker`).~~ ✅ Landed 2026-05-20 (Round 3).
- ~~PhotoMaker bundle (`--pm-id-images-dir`, `--pm-id-embed-path`, `--pm-style-strength`).~~ ✅ Landed 2026-05-20 (Round 4).
- ~~Reference images (`--ref-image`, `--increase-ref-index`, `--no-auto-resize-ref-image`).~~ ✅ Landed 2026-05-20 (Round 5).
- ~~Hires-fix bundle (`--hires`, `--hires-upscaler`, `--upscale-model`, `--hires-width/height/scale/steps/denoising-strength/upscale-tile-size`).~~ ✅ Landed 2026-05-20 (Round 6).
- ~~Cache / SCM bundle (`--cache-mode`, `--cache-option`, `--scm-mask`, `--scm-policy`).~~ ✅ Landed 2026-05-20 (Round 7). Mirrors sd-cli's 4-flag surface; the 15-key `--cache-option` kv-parser branches on the active mode just like sd-cli does.
- ~~`--embd-dir` (textual-inversion directory).~~ ✅ Landed 2026-05-20 (Round 8). Non-recursive scan for `.gguf`/`.safetensors`/`.pt`; filename stem becomes the prompt token; validated before `new_sd_ctx`.

All sd items in this list are now resolved. --disable-image-metadata (the prior residual) was reclassified 🚫 in the table above — chimera's stock stb_image_write doesn't embed any metadata to begin with, so the flag has nothing to disable. A future "embed metadata" feature would be net-new functionality, not a port.
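
The `--lora <path[:scale]>` form mentioned in the list above packs an optional scale into the argument. One way to split it is sketched here; `parse_lora_spec`, the `LoraSpec` struct, and the default scale of 1.0 are assumptions for illustration, not chimera's confirmed behaviour.

```cpp
#include <cstdlib>
#include <string>

struct LoraSpec {
    std::string path;
    float scale;
};

// Hypothetical parser for `path[:scale]`. The suffix after the LAST colon is
// treated as a scale only if it is entirely numeric, so a Windows drive
// prefix such as "C:\..." is left inside the path.
LoraSpec parse_lora_spec(const std::string& arg) {
    LoraSpec spec{arg, 1.0f};  // assumed default scale
    const std::size_t colon = arg.rfind(':');
    if (colon == std::string::npos || colon == 0 || colon + 1 == arg.size()) {
        return spec;
    }
    const std::string tail = arg.substr(colon + 1);
    char* end = nullptr;
    const float value = std::strtof(tail.c_str(), &end);
    if (end == tail.c_str() + tail.size()) {  // whole suffix parsed as a float
        spec.path = arg.substr(0, colon);
        spec.scale = value;
    }
    return spec;
}
```

Since the flag is repeatable, each occurrence would yield one such spec.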
- Video mode (`vid_gen`, `--video-frames`, `--fps`, `--vace-strength`, `--end-img`, `--control-video`).
- Upscale-only / convert-only / metadata-only sd modes (chimera-sd is `img_gen`-scoped).
- Seamless-tile (`--circular*`).
- sd-cli shell features: `--preview*`, `--metadata-*`, `--canny`, `--mode`.
- Chroma-specific advanced flags (`--chroma-*`) unless we land Chroma support.

- Kebab vs underscore. sd.cpp's text-encoder flags are underscored (`--clip_l`, `--clip_g`, `--llm_vision`, `--qwen2vl`); chimera normalizes everything to kebab (`--clip-l`). This is a defensible house style but should be called out in `--help` text so users porting sd command lines don't get a "no such option" surprise.
- `--sample-method` vs `--sampling-method`. Minor drift, but the kind of thing that breaks copy-pasting from sd-cpp docs. Same for `--init-image` vs `--init-img`, `--mask-image` vs `--mask`, `--input` vs `--file` (whisper).
- whisper `--timestamps` flips polarity vs upstream's `--no-timestamps` (chimera defaults to off, upstream to on). Document loudly; do not change.
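
For users porting sd command lines, the naming drift listed above is mechanical enough to express as a lookup plus an underscore-to-hyphen pass. The helper below is a hypothetical porting aid, not something chimera ships. It covers only flag names recorded in this report, and it deliberately leaves out polarity flips such as `--mmap` / `--no-mmap`, which change meaning rather than spelling.

```cpp
#include <map>
#include <string>

// Renames recorded in this report (upstream sd-cli spelling -> chimera spelling).
static const std::map<std::string, std::string> kRenamedFlags = {
    {"--sampling-method", "--sample-method"},
    {"--init-img", "--init-image"},
    {"--mask", "--mask-image"},
};

// Hypothetical helper: map an upstream sd-cli flag name to chimera's spelling.
std::string to_chimera_flag(const std::string& upstream) {
    const auto it = kRenamedFlags.find(upstream);
    if (it != kRenamedFlags.end()) return it->second;
    std::string flag = upstream;
    // Kebab normalization: underscores become hyphens, dashes stay as they are.
    for (std::size_t i = 0; i < flag.size(); ++i) {
        if (flag[i] == '_') flag[i] = '-';
    }
    return flag;
}
```

With this sketch, `to_chimera_flag("--clip_l")` yields `--clip-l`, and flags that already match pass through unchanged.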

- ~~`--flash-attn`: exists in upstream llama-cli, whisper-cli, and sd-cpp; not exposed in any of chimera's subcommands.~~ ✅ Landed everywhere (2026-05-20): `--flash-attn` on `gen`/`chat`/`embed` and `whisper`; on `sd`, both `--diffusion-fa` (sd-internal) and the generic global `--fa` are now exposed.
- `--lora`: exposed in `serve` but not in `gen`/`chat`/`embed`/`sd`. The asymmetry is a footgun.
- Output formatting: `embed` lacks `--embd-output-format`, `whisper` lacks `-oj`/`-osrt`/`-ovtt`. Both subcommands' output stories are unevenly developed compared to upstream.

llama.cpp's common_arg machinery wires several flags to env vars (LLAMA_ARG_CTX_CHECKPOINTS, LLAMA_ARG_CACHE_RAM, LLAMA_ARG_KV_UNIFIED, LLAMA_ARG_CONTEXT_SHIFT, LLAMA_ARG_CACHE_IDLE_SLOTS, …). Chimera honors none of these. For server use this can matter (containerized deploys); for the four CLI subcommands the omission is fine. Flag for follow-up only if chimera serve users start asking.
Three big slabs of upstream surface area are correctly out of scope and should stay that way:
- llama-cli's interactive REPL (`-i`, `--in-prefix`, `--reverse-prompt`, `--multiline-input`, etc.): chimera replaces it with `chat` + linenoise + SQLite persistence.
- Speculative decoding (`--draft*`, `--spec-*`): none of the chimera subcommands wrap a draft-model code path yet.
- Training / perplexity / hellaswag / cvector-generator / imatrix flags: those upstream binaries don't have chimera analogs.

In priority order (highest user impact first). Items struck through landed on 2026-05-20.
1. ~~sd: Flux/SD3 guidance pair (`--guidance`, `--flow-shift`).~~ ✅ Landed 2026-05-20.
2. ~~sd: ControlNet bundle (`--control-net`, `--control-image`, `--control-strength`).~~ ✅ Landed 2026-05-20.
3. ~~whisper: output-format family (`-osrt`, `-oj`, `-ovtt`, `-ojf`, `-ocsv`, `-olrc`).~~ ✅
4. ~~sd: VAE-tiling bundle (`--vae-tiling` + tile-size/overlap).~~ ✅ Landed 2026-05-20.
5. ~~llama: `--grammar` / `--json-schema` / `--json-schema-file` in `gen`.~~ ✅
6. ~~All three: `--flash-attn`.~~ ✅
7. ~~llama: `--lora` in `gen`/`chat`.~~ ✅
8. ~~whisper: `--prompt` + decoding-strategy basics (`--beam-size`, `--best-of`, `--temperature`, `--no-fallback`).~~ ✅ Landed 2026-05-20.
9. ~~sd: `--lora`, `--lora-model-dir`, `--clip_g`, `--type`~~ ✅ Landed 2026-05-20.
10. ~~embed: `--embd-output-format` + `--embd-separator` + `--attention`~~ ✅ Landed 2026-05-20 (also `--pooling rank`).
11. ~~sd coverage closer: Rounds 1–8 (38 flags).~~ ✅ Landed 2026-05-20. Perf/offload (Round 1), sampler/generation (Round 2), model-loading completers (Round 3), PhotoMaker bundle (Round 4), reference images (Round 5), hires-fix bundle (Round 6), cache/SCM bundle (Round 7), `--embd-dir` (Round 8). See the per-section tables above and the CHANGELOG entry for the full enumeration.
12. ~~whisper coverage closer: Batches 1–3 + VAD + offset/duration (22 flags).~~ ✅ Landed 2026-05-20.

No residual open items at the close of this audit cycle. Every flag on the gen/chat/embed/whisper/sd surfaces is either landed, deliberately renamed, partial-with-explanation, or explicitly out-of-scope. The remaining 🚫 rows are documented in their per-section tables with a sentence each: video-only sd modes, server-only common_params fields, llama-cli's REPL plumbing (replaced by chimera's own chat + linenoise), speculative decoding, training/perplexity/imatrix flags, OpenVINO and chroma/qwen tuning, low-level decoder-print toggles, and a handful of niche items where the chimera path already supplies an equivalent (e.g. --word-thold is moot because --output-json-full already emits per-word timing). The 14 prior gen/chat residuals all closed in a long-tail batch the same day — 19 flags landed (sampler nibbles, MoE offload, override-tensor/kv, control vectors, etc.) and four upstream flags reclassified 🚫 (--keep for architecture mismatch — chimera uses KV-prefix reuse, not context-shift; --ctx-checkpoints, --checkpoint-every-n-tokens, --cache-ram for server-only common_params fields the CLI never touches). The prior chat --reasoning-budget enforcement gap was closed the same day — the integration turned out to be entirely upstream of common_sampler_init, not inside the sample loop. The prior sd --disable-image-metadata residual was reclassified 🚫 — chimera's stock stb_image_write doesn't embed any text chunks, so there is nothing to disable; a future "embed metadata" feature for parity with sd-cli's default is tracked as net-new functionality, not a port. The three remaining whisper items are wrapper-logic features rather than param plumbing, so they're deferred as bigger lifts rather than mechanical ports.