DeepSeek Runtime Runbook

Use this runbook to diagnose the local DeepSeek Runtime integration without exposing secrets or killing useful processes.

Preconditions

Run commands from the project workspace unless a step says otherwise.
Do not print .env or token values.
Prefer read-only inspection before changing process state.
Treat 7878 as the product default and 7879 as a temporary diagnostic port.

Configuration

The Delphi adapter first reads these variables from the environment of the process that launches dex-code.exe. If a variable is not set there, the adapter looks for a project .env file, searching from both the current working directory and the executable directory and walking up through a few parent directories.

Variable	Purpose
`DEEPSEEK_RUNTIME_URL`	Runtime base URL. Defaults to `http://127.0.0.1:7878`.
`DEEPSEEK_RUNTIME_TOKEN`	Bearer token sent to the runtime when configured.
`DEEPSEEK_RUNTIME_MODEL`	Default model. Defaults to `deepseek-v4-pro`.
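The lookup order described above can be sketched in Python. This is an illustrative model, not the Delphi adapter's code: the function names, the `max_parents=3` depth (the runbook only says "a few"), the simple KEY=VALUE parsing, and treating an empty value as unset are all assumptions.

```python
from pathlib import Path

# Defaults taken from the variable table; the token has no default.
DEFAULTS = {
    "DEEPSEEK_RUNTIME_URL": "http://127.0.0.1:7878",
    "DEEPSEEK_RUNTIME_MODEL": "deepseek-v4-pro",
}

def parse_env_file(path):
    """Parse simple KEY=VALUE lines; blank lines and # comments are skipped."""
    values = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values

def find_env_file(start_dirs, max_parents=3):
    """Return the first .env found walking upward from each start directory.

    max_parents is an assumed depth; the real adapter's limit is not documented here.
    """
    for start in start_dirs:
        current = Path(start).resolve()
        for _ in range(max_parents + 1):
            candidate = current / ".env"
            if candidate.is_file():
                return candidate
            if current.parent == current:  # reached the filesystem root
                break
            current = current.parent
    return None

def resolve_setting(name, environ, start_dirs, max_parents=3):
    """Process environment first, then project .env, then the documented default."""
    if environ.get(name):
        return environ[name]
    env_file = find_env_file(start_dirs, max_parents)
    if env_file is not None:
        value = parse_env_file(env_file).get(name)
        if value:
            return value
    return DEFAULTS.get(name)
```

A caller would pass the current working directory and the executable directory as `start_dirs`. The resolved token must never be printed.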

Never log the actual token value. When diagnosing auth, only confirm whether the process has a token and whether the runtime expects the same token.
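A minimal Python sketch of an auth diagnostic that reports only presence and equality, never the value. The function names are hypothetical; the comparison assumes the operator has legitimate access to both environments.

```python
import hmac

TOKEN_VAR = "DEEPSEEK_RUNTIME_TOKEN"

def token_state(environ, name=TOKEN_VAR):
    """Report only whether a token is set; the value is never returned."""
    return "present" if environ.get(name, "").strip() else "absent"

def tokens_match(client_environ, server_environ, name=TOKEN_VAR):
    """Return True only when both sides hold the same non-empty token."""
    client = client_environ.get(name, "")
    server = server_environ.get(name, "")
    if not client or not server:
        return False
    # Constant-time comparison; only the boolean result leaves this function.
    return hmac.compare_digest(client.encode("utf-8"), server.encode("utf-8"))
```

Logging the returned strings and booleans is safe; logging either environment mapping is not.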

For local MVP testing, changing DEEPSEEK_RUNTIME_URL in .env is acceptable when the operator explicitly approves it. Do not commit .env or backup copies. Keep 7878 as the product default; use 7879 only as an explicit diagnostic selection.

Smoke Checks

Safe checks:

Invoke-RestMethod http://127.0.0.1:7878/health

Health only proves the runtime process is alive. Also check the useful endpoint that the UI depends on:

Invoke-RestMethod http://127.0.0.1:7878/v1/workspace/status

If /health returns 200 but /v1/workspace/status returns 401, classify the runtime as alive with authorization/configuration mismatch. Do not mark it ready just because the health check passed.
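The readiness rule above can be written as a small decision function. This is a sketch: the label strings are hypothetical names for the states the runbook describes, and `None` stands for "no HTTP response".

```python
def classify_runtime(health_status, workspace_status):
    """Combine the /health and /v1/workspace/status results into one label.

    Each argument is an HTTP status code, or None when no response arrived.
    """
    if health_status != 200:
        return "not_alive"
    if workspace_status == 200:
        return "ready"
    if workspace_status == 401:
        # Alive, but authorization/configuration does not line up.
        return "alive_auth_mismatch"
    return "alive_not_ready"
```

The key point is that a passing health check alone never produces `ready`.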

If the runtime requires auth, run the checks from a shell whose environment already holds the token, and do not echo the token. When authorization fails, the UI should report HTTP 401 with a safe message:

DeepSeek Runtime rejected authorization (HTTP 401). Check DEEPSEEK_RUNTIME_TOKEN
for this process and the runtime server; token value was not displayed.

UI smoke test:

Start the runtime on the chosen port. The app's manual local start uses --insecure --enable vision_model on 127.0.0.1, matching the diagnostic runtime that allowed repeated app launches without token handoff.
Launch bin64\dex-code.exe from a process that has the intended runtime environment, or from the project tree with the intended .env.
Confirm the footer shows the expected runtime URL.
Send: Responda apenas OK dex-code. (Portuguese for "Reply only with OK dex-code.")
Expected result: visible answer OK dex-code, or a controlled error that does not expose token values.

Image/OCR smoke test:

Attach or paste a non-sensitive image with visible text.
Confirm the UI reports the image as a local [image:path] reference.
If OCR is available, confirm the footer reports local OCR context attached to the prompt.
Ask the model to read the image text.
Expected result: answer based on OCR/local reference, or a controlled message that native multimodal payload is not used. Do not claim native vision unless a runtime contract and response prove it.

DeepSeek TUI Ctrl+V note:

The terminal TUI can read a Windows clipboard bitmap and save it under ~\.deepseek\clipboard-images\clipboard-*.png as attached media.
That terminal behavior is not the same as native image payload support in the dex-code runtime path. Keep the Delphi UI wording as local image/reference unless Provider Vision or OCR explicitly supplies text context.

7878 Versus 7879

7878 is the default runtime port used by the adapter.

7879 can be useful for isolation when checking whether another process is blocking or misconfigured, but it is not a product fallback. To test on 7879, start the runtime there and launch dex-code.exe with DEEPSEEK_RUNTIME_URL=http://127.0.0.1:7879.

Conversation Fallback And Emergency Checkpoint

Normal chat must use thread-first runtime identity. If thread-first is unavailable, do not use /v1/stream as a conversation fallback, memory path or Stop/Pause recovery path. The UI should block with route=blocked, replay_used=false and a clear fallback_reason.
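As a rough Python illustration of the blocking rule, the routing decision could look like this. The field names `route`, `replay_used` and `fallback_reason` come from the text above; the `thread_first` route value and the wording of the reason are assumptions.

```python
def route_for_chat(thread_first_available):
    """Return routing fields for a normal chat turn; never falls back to /v1/stream."""
    if thread_first_available:
        # "thread_first" is a hypothetical label for the normal route.
        return {"route": "thread_first", "replay_used": False, "fallback_reason": None}
    return {
        "route": "blocked",
        "replay_used": False,
        "fallback_reason": "thread-first runtime identity unavailable",
    }
```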

A regular Stop/Pause only marks the current turn as terminal or cancelled. It must not write memory automatically and must not wake /v1/stream later.

Emergency checkpoint is reserved for explicit shutdown/restart/loss-of-runtime moments. When the user asks for a recoverable emergency checkpoint, use memoria-viva for a short .agents/ACTIVE.md/.agents/HANDOFF.md update and napkin-projeto only for recurring tactical gotchas. Keep the checkpoint short: workspace, safe summary, thread/session/turn ids when available, lastSeq, route/mapping, terminal state, reason and next safe step. Do not store secrets, raw sensitive prompt, logs or long tool output.

/deepseek resume, /save and /load in dex-code use an explicit local checkpoint contract. They must not fall back to hidden text replay or read sensitive payloads.

/save creates a new metadata-only checkpoint under:

%LOCALAPPDATA%\DexCode\checkpoints

The checkpoint may contain workspace, Pythia session id, chat sessions file, DeepSeek thread-map file, DeepSeek thread id, model, mode and lastSeq. It must not contain messages, prompt text, response text, raw payloads, .env values, secrets or replay buffers. The command output must include command_kind=checkpoint, source=checkpoint_store, replay_used=false, sensitive_payload_read=false, title_policy=not_read and content_policy=not_read.
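One way to enforce the metadata-only rule is an allowlist check, sketched below in Python. Apart from `lastSeq`, the key names are hypothetical spellings of the fields listed above; the real checkpoint schema may differ.

```python
# Hypothetical key names for the fields a checkpoint may contain.
ALLOWED_KEYS = {
    "workspace", "pythia_session_id", "chat_sessions_file",
    "thread_map_file", "thread_id", "model", "mode", "lastSeq",
}

def checkpoint_violations(checkpoint):
    """Return, sorted, every key that falls outside the metadata-only allowlist."""
    return sorted(set(checkpoint) - ALLOWED_KEYS)

def save_report():
    """Fields the /save command output must include, as stated in the runbook."""
    return {
        "command_kind": "checkpoint",
        "source": "checkpoint_store",
        "replay_used": False,
        "sensitive_payload_read": False,
        "title_policy": "not_read",
        "content_policy": "not_read",
    }
```

An allowlist is used rather than a denylist so that a new, unexpected field (for example a raw payload under an unfamiliar name) is rejected by default.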

/load [checkpoint_id|checkpoint_file] and /deepseek resume [checkpoint_id|checkpoint_file] restore by session identity and thread mapping. They reload the existing Pythia sessions file, activate the stored session id and update only the DeepSeek mapping entry required for that session. Before writing the mapping file, they create a timestamped .bak-* copy. They do not replay chat text and do not rewrite saved conversations as a substitute for restore.
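The backup-before-write step can be sketched in Python as follows. The `.bak-` prefix comes from the text above; the timestamp format and the function name are assumptions.

```python
import shutil
import time
from pathlib import Path

def backup_then_write(path, new_text):
    """Copy the existing file to a timestamped .bak-* sibling, then write new_text.

    Returns the backup path, or None when there was no file to back up.
    """
    path = Path(path)
    backup = None
    if path.exists():
        stamp = time.strftime("%Y%m%d-%H%M%S")  # assumed timestamp format
        backup = path.with_name(f"{path.name}.bak-{stamp}")
        shutil.copy2(path, backup)
    path.write_text(new_text, encoding="utf-8")
    return backup
```

Only the mapping file is touched; the saved conversations are left as they are.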

If the checkpoint is missing, outside the checkpoint root, from another workspace, malformed or points to a missing visual session, the command must return executed=false, source=checkpoint_store, replay_used=false and sensitive_payload_read=false.
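A hedged Python sketch of the pre-load checks. The rejection fields are the ones listed above; the function names, the `workspace` metadata key, and passing already-parsed metadata (or `None` when parsing failed) are assumptions. The visual-session check is left out because its storage format is not described here.

```python
from pathlib import Path

REJECTED = {
    "executed": False,
    "source": "checkpoint_store",
    "replay_used": False,
    "sensitive_payload_read": False,
}

def is_inside_root(candidate, root):
    """True when candidate resolves to a path strictly below root ('..' is resolved first)."""
    return Path(root).resolve() in Path(candidate).resolve().parents

def precheck_load(checkpoint_file, checkpoint_root, current_workspace, metadata):
    """Return None when the load may proceed, otherwise a copy of REJECTED.

    metadata is the parsed checkpoint, or None if it was missing or malformed.
    """
    ok = (
        Path(checkpoint_file).is_file()
        and is_inside_root(checkpoint_file, checkpoint_root)
        and isinstance(metadata, dict)
        and metadata.get("workspace") == current_workspace
    )
    return None if ok else dict(REJECTED)
```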

Provider Configuration

dex-code.exe talks to the local DeepSeek Runtime. OpenRouter, Ollama and custom OpenAI-compatible /v1 endpoints are configured as providers below that runtime, not as replacement runtime URLs.

Keep these layers separate:

Layer	Example	Purpose
Runtime URL	`http://127.0.0.1:7878`	Local DeepSeek Runtime endpoint used by `dex-code.exe`.
OpenRouter provider base URL	`https://openrouter.ai/api/v1`	Backend used by DeepSeek Runtime for OpenRouter requests.
Ollama provider base URL	`http://localhost:11434/v1`	OpenAI-compatible Ollama endpoint used by DeepSeek Runtime.
Custom OpenAI-compatible base URL	`https://api.openai.com/v1` or another `/v1` endpoint	Optional backend that follows the OpenAI-compatible models/chat contract.
Active text model	`deepseek-v4-pro`, `llama3.2:1b` or `inclusionai/ring-2.6-1t:free`	Text model used by DeepSeek Runtime thread-first turns; when provider is external, this comes from `[providers.<provider>].text_model` or the compatible `model` alias.

When the active provider is deepseek, use DeepSeek model ids such as deepseek-v4-pro or deepseek-v4-flash. When the active provider is ollama, openrouter or custom-openai, keep the top-level default_text_model = "auto" and store the real text model in [providers.<provider>].text_model plus the backward-compatible model alias. This prevents a stale DeepSeek model id from being sent to an external provider.
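The selection rule can be illustrated with a small Python function over an already-parsed config. The keys `default_text_model`, `providers`, `text_model` and `model` follow the text above; how the active provider is stored in config.toml is not described here, so it is passed in as an argument.

```python
def resolve_text_model(config, active_provider):
    """Pick the text model id to send, given a parsed config and the active provider.

    For external providers the top-level default is deliberately ignored, so a
    stale DeepSeek model id can never be sent to them.
    """
    if active_provider == "deepseek":
        return config.get("default_text_model")
    section = config.get("providers", {}).get(active_provider, {})
    # text_model is preferred; model is the backward-compatible alias.
    return section.get("text_model") or section.get("model")
```

A `None` result for an external provider means no model is configured yet, which the UI can surface instead of guessing.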

Model discovery uses provider-specific read-only checks:

Invoke-RestMethod https://openrouter.ai/api/v1/models
Invoke-RestMethod http://localhost:11434/api/tags
Invoke-RestMethod https://api.openai.com/v1/models

For Ollama, a missing installation or a stopped ollama serve must be treated as a recoverable provider state. The app should use a short local timeout and show an actionable message, such as a prompt to install Ollama or start ollama serve; it must not freeze the settings panel. A fresh Ollama install can legitimately return an empty model list:

{"models":[]}

After a model is downloaded, /api/tags should include its models[].name.
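A minimal Python sketch of reading model names from an /api/tags response body, treating the empty list as a valid result. The function name is hypothetical.

```python
import json

def ollama_model_names(response_text):
    """Extract models[].name from an /api/tags body; an empty list is valid."""
    payload = json.loads(response_text)
    return [m["name"] for m in payload.get("models", []) if "name" in m]
```

An empty result should be shown as "no models downloaded yet", not as a provider error.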

The app settings panel can now:

choose deepseek, openrouter, ollama or custom-openai;
edit provider base URL;
fetch provider model lists;
save provider/text model and optional image/audio/TTS slots to the DeepSeek CLI config.toml with backup;
sync configured slots into the existing model selector categories (textGeneration, imageCreation, audioCreation, textToSpeech);
restart the local DeepSeek Runtime process when saving a provider, so the running server reloads the updated config.toml.

Empty image/audio/TTS slots mean sem modelo ("no model"). They are allowed and should not be treated as errors.

Vision/audio status must be labelled carefully:

connected: text provider path proven through DeepSeek Runtime.
advisory: model is cataloged/configured, but the runtime has not proven a native image/audio payload path.
hidden: no safe connected contract is available yet.

A separate OpenRouter vision skill/tool can be connected as image -> vision tool -> text -> runtime, but that is not the same as native multimodal support inside /v1/stream.

Secrets must be configured through the approved DeepSeek CLI/runtime mechanism. The panel only reports whether secret-like keys exist; it must not display, log, diff or version token/API key values.

If a copied dex-code.exe shows placeholder models, verify that the support files were deployed with it:

bin64\dex-code\support\dex-code-model-list.json
bin64\dex-code\dex-code-model-get-replace-version.json

Runtime Ownership

Today dex-code.exe is a runtime client, not a runtime supervisor. It reads the runtime URL, token and model from the launching process environment or approved project .env, connects to the selected runtime and reports health/auth errors in the UI. It does not start, restart, monitor or stop DeepSeek TUI sidecar servers.

Do not assume a process is safe to terminate because it listens on a diagnostic port. A DeepSeek TUI process can own useful child processes such as MCP servers, Node helpers, Python helpers or Playwright tools. Preserve it unless a PID-level inventory proves it is an orphan or a duplicate that is not serving the current session.

A future supervised mode should be implemented as an explicit product feature, not as ad hoc cleanup. It should include:

a single configured runtime port;
a PID/port lock owned by the supervisor;
startup timeout and readiness checks;
bearer-token alignment without logging the token;
restart/backoff policy;
graceful shutdown policy;
UI state for missing, starting, ready, auth failed, failed and stopping.

Process Hygiene

Never kill by image name such as node.exe, python.exe, uvx.exe or codex.exe.

Inventory first:

$names = @(
  'uvx.exe','uv.exe','mcp-filesystem-encoding.exe','python.exe',
  'node.exe','node_repl.exe','codex.exe','cmd.exe','pwsh.exe','dex-code.exe'
)
Get-CimInstance Win32_Process |
  Where-Object {
    $names -contains $_.Name -or
    ($_.CommandLine -match 'deepseek|mcp|playwright|desktop-commander')
  } |
  Select-Object ProcessId, ParentProcessId, Name, CreationDate, CommandLine

Before saving or sharing command lines, redact:

Bearer ...;
--token, --auth-token, --api-key, --password;
TOKEN=, SECRET=, API_KEY=, DEEPSEEK_RUNTIME_TOKEN=;
sk-..., github_pat_..., Slack/GitHub/OpenAI-style token strings.
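The redaction list above could be approximated with a few regular expressions, sketched here in Python. The patterns are illustrative and not exhaustive: the `ghp_` and `xox?-` prefixes are added as common GitHub and Slack token shapes, and quoted values containing spaces are only partly covered, so review the output before sharing it.

```python
import re

_MASK = "[REDACTED]"

_PATTERNS = [
    # Authorization headers: "Bearer <value>"
    (re.compile(r"(?i)(Bearer\s+)\S+"), r"\1" + _MASK),
    # CLI flags followed by "=" or whitespace and a value
    (re.compile(r"(?i)(--(?:token|auth-token|api-key|password)[=\s]+)\S+"), r"\1" + _MASK),
    # Assignments such as TOKEN=, SECRET=, API_KEY=, DEEPSEEK_RUNTIME_TOKEN=
    (re.compile(r"(?i)((?:[A-Z0-9_]*(?:TOKEN|SECRET|API_KEY))=)\S+"), r"\1" + _MASK),
    # Bare token-shaped strings
    (re.compile(r"\b(?:sk-|github_pat_|ghp_|xox[abprs]-)[A-Za-z0-9_\-]+"), _MASK),
]

def redact(command_line):
    """Return command_line with secret-like values replaced by a fixed mask."""
    for pattern, replacement in _PATTERNS:
        command_line = pattern.sub(replacement, command_line)
    return command_line
```

Apply `redact` to each `CommandLine` value from the inventory before writing it to a file or pasting it anywhere.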

Classify each process:

Class	Rule
active Codex current	In the active Codex tree or required by this session. Preserve.
preserve	Known runtime/MCP/app-server/listener currently in use.
candidate	Duplicate or orphan with known command line, no active parent and no required listener.
unknown	Generic process, missing command line, unclear parent, or unrelated project.

Only terminate a specific PID after it is classified as candidate and the operator accepts the risk. Re-check MCP visibility and runtime health after any termination.
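The classification table and the termination rule can be modelled as below. This Python sketch assumes the inventory has already been reduced to a few boolean facts per PID; the `ProcessFacts` fields and function names are hypothetical, and deriving those facts safely is the hard part that this code does not do.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class ProcessFacts:
    pid: int
    command_line: Optional[str]           # None when the command line is missing
    in_active_codex_tree: bool = False
    serves_current_session: bool = False
    has_required_listener: bool = False
    parent_alive: bool = True
    is_duplicate_or_orphan: bool = False

def classify(facts):
    """Map one process to a class from the table; anything unclear is unknown."""
    if facts.in_active_codex_tree:
        return "active Codex current"
    if facts.serves_current_session or facts.has_required_listener:
        return "preserve"
    if facts.command_line and facts.is_duplicate_or_orphan and not facts.parent_alive:
        return "candidate"
    return "unknown"

def may_terminate(facts, operator_accepted):
    """Only a candidate PID with explicit operator acceptance may be stopped."""
    return classify(facts) == "candidate" and operator_accepted
```

The default result is `unknown`, so a process with incomplete facts is never offered for termination.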

Common Failures

Symptom	Likely cause	Safe action
Runtime footer uses `7878` while smoke worked on `7879`	UI process still uses default URL and no approved `.env` override was loaded	Launch with `DEEPSEEK_RUNTIME_URL` for the chosen port, or set it in project `.env` for local testing.
HTTP 401	Runtime token mismatch or missing token in the `dex-code.exe` process	Align `DEEPSEEK_RUNTIME_TOKEN` without printing the value.
`/health` passes but `/v1/workspace/status` returns 401	Runtime is alive, but useful endpoint auth is not aligned	Treat as auth/config mismatch, not ready runtime.
`7878` returns 401 while old `7879` worked repeatedly	Old diagnostic runtime was started with `--insecure --enable vision_model`; current canonical runtime may be token-protected	Restart the local runtime through the app's manual start or align `DEEPSEEK_RUNTIME_TOKEN`.
`7879` appears to have many child Node/Python/MCP processes	Diagnostic runtime may be supervising active tools	Preserve until PID-level inventory proves it is safe to stop.
Reasoning card remains visible after auth failure	Visual state needs validation after controlled error	Test after auth is fixed; if still stuck, patch finalization state.
Many MCP/Node/Python processes	Multiple Codex sessions or duplicated MCP startup	Inventory and classify; do not kill by name.
Image chip appears but model says it did not receive an image	UI has a local reference, but native multimodal payload is not proven	Use OCR/local reference wording and keep native vision hidden until tested.
DeepSeek TUI Ctrl+V creates `clipboard-images`, but `dex-code` only has `[image:path]`	Terminal attached media and runtime `/v1/stream` are different contracts	Treat `dex-code` images as local/advisory unless Provider Vision or OCR attaches text context.
