Use this runbook to diagnose the local DeepSeek Runtime integration without exposing secrets or killing useful processes.
- Run commands from the project workspace unless a step says otherwise.
- Do not print `.env` or token values.
- Prefer read-only inspection before changing process state.
- Treat `7878` as the product default and `7879` as a temporary diagnostic port.
The Delphi adapter first reads these variables from the process that launches
dex-code.exe. If a variable is not present in the process environment, it then
looks for a project .env file, starting from the current working directory and
from the executable directory, walking upward a few parent directories.
| Variable | Purpose |
|---|---|
| `DEEPSEEK_RUNTIME_URL` | Runtime base URL. Defaults to `http://127.0.0.1:7878`. |
| `DEEPSEEK_RUNTIME_TOKEN` | Bearer token sent to the runtime when configured. |
| `DEEPSEEK_RUNTIME_MODEL` | Default model. Defaults to `deepseek-v4-pro`. |
Never log the actual token value. When diagnosing auth, only confirm whether the process has a token and whether the runtime expects the same token.
For local MVP testing, changing DEEPSEEK_RUNTIME_URL in .env is acceptable
when the operator explicitly approves it. Do not commit .env or backup copies.
Keep 7878 as the product default; use 7879 only as an explicit diagnostic
selection.
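The lookup order described above (process environment first, then a project `.env` found by walking upward) can be sketched as follows. This is a minimal illustration, not the Delphi adapter's code: the function names, the `KEY=VALUE` parsing rules and the limit of three parent levels are assumptions, since the runbook only says "a few parent directories". The function returns where the value came from, so a diagnostic can report the source without printing the value.

```python
from pathlib import Path

MAX_PARENT_LEVELS = 3  # assumption: the runbook only says "a few parent directories"


def parse_env_file(path):
    """Parse simple KEY=VALUE lines, skipping blank lines and # comments."""
    values = {}
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def find_env_file(start, max_levels=MAX_PARENT_LEVELS):
    """Look for .env in start, then in up to max_levels parent directories."""
    current = Path(start).resolve()
    for _ in range(max_levels + 1):
        candidate = current / ".env"
        if candidate.is_file():
            return candidate
        if current.parent == current:  # reached the filesystem root
            break
        current = current.parent
    return None


def resolve_setting(name, environ, start_dirs, default=None):
    """Return (value, source); the process environment wins over any .env file."""
    if name in environ:
        return environ[name], "process"
    for start in start_dirs:  # e.g. current working directory, then executable directory
        env_file = find_env_file(start)
        if env_file is not None:
            values = parse_env_file(env_file)
            if name in values:
                return values[name], "dotenv"
    return default, "default"
```

A caller would typically pass `os.environ` and the two start directories, then log only the returned source label (`process`, `dotenv` or `default`), never the token value itself.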
Safe checks:
```powershell
Invoke-RestMethod http://127.0.0.1:7878/health
```

Health only proves the runtime process is alive. Also check the useful endpoint that the UI depends on:

```powershell
Invoke-RestMethod http://127.0.0.1:7878/v1/workspace/status
```

If `/health` returns 200 but `/v1/workspace/status` returns 401, classify the
runtime as alive with authorization/configuration mismatch. Do not mark it ready
just because the health check passed.
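The two-endpoint rule can be expressed as a small decision function. The sketch below is illustrative only: the label strings and function names are hypothetical, and `probe` sends no token, so it only reflects what an unauthenticated client would see.

```python
import urllib.error
import urllib.request


def probe(url, timeout=2.0):
    """Return the HTTP status code for url, or None if no HTTP response arrived."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status
    except urllib.error.HTTPError as error:
        return error.code  # the server answered, e.g. with 401
    except (urllib.error.URLError, OSError):
        return None  # connection refused, timeout, DNS failure, ...


def classify_runtime(health_status, workspace_status):
    """Map the /health and /v1/workspace/status codes to a readiness label."""
    if health_status != 200:
        return "down"
    if workspace_status == 200:
        return "ready"
    if workspace_status == 401:
        return "alive_auth_mismatch"  # alive, but auth/config is not aligned
    return "alive_not_ready"
```

For example, `classify_runtime(probe(base + "/health"), probe(base + "/v1/workspace/status"))` with `base = "http://127.0.0.1:7878"` only yields `ready` when both endpoints answer 200.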
If the runtime requires auth, use a token-bearing environment in your shell, but do not echo the token. The UI should report HTTP 401 with a safe message:
```text
DeepSeek Runtime rejected authorization (HTTP 401). Check DEEPSEEK_RUNTIME_TOKEN
for this process and the runtime server; token value was not displayed.
```
UI smoke test:
- Start the runtime on the chosen port.
  The app's manual local start uses `--insecure --enable vision_model` on `127.0.0.1`, matching the diagnostic runtime that allowed repeated app launches without token handoff.
- Launch `bin64\dex-code.exe` from a process that has the intended runtime environment, or from the project tree with the intended `.env`.
- Confirm the footer shows the expected runtime URL.
- Send: `Responda apenas OK dex-code.`
- Expected result: visible answer `OK dex-code`, or a controlled error that does not expose token values.
Image/OCR smoke test:
- Attach or paste a non-sensitive image with visible text.
- Confirm the UI reports the image as a local `[image:path]` reference.
- If OCR is available, confirm the footer reports local OCR context attached to the prompt.
- Ask the model to read the image text.
- Expected result: answer based on OCR/local reference, or a controlled message that native multimodal payload is not used. Do not claim native vision unless a runtime contract and response prove it.
DeepSeek TUI Ctrl+V note:
- The terminal TUI can read a Windows clipboard bitmap and save it under `~\.deepseek\clipboard-images\clipboard-*.png` as attached media.
- That terminal behavior is not the same as native image payload support in the `dex-code` runtime path. Keep the Delphi UI wording as local image/reference unless Provider Vision or OCR explicitly supplies text context.
7878 is the default runtime port used by the adapter.
7879 can be useful for isolation when checking whether another process is
blocking or misconfigured, but it is not a product fallback. To test on 7879,
start the runtime there and launch dex-code.exe with
DEEPSEEK_RUNTIME_URL=http://127.0.0.1:7879.
Normal chat must use thread-first runtime identity. If thread-first is
unavailable, /v1/stream is not a conversation fallback, memory path or
Stop/Pause recovery path. The UI should block with route=blocked,
replay_used=false and a clear fallback_reason.
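As a rough illustration of the blocking rule, the routing decision reduces to a function that never selects `/v1/stream` as a fallback. The `thread_first` route label, the function name and the default reason string are hypothetical; only `route=blocked`, `replay_used=false` and `fallback_reason` come from the rule above.

```python
def decide_chat_route(thread_first_available, reason="thread_first_unavailable"):
    """Return the routing record for a normal chat turn.

    There is deliberately no branch that falls back to /v1/stream.
    """
    if thread_first_available:
        return {"route": "thread_first", "replay_used": False, "fallback_reason": None}
    return {"route": "blocked", "replay_used": False, "fallback_reason": reason}
```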
Common Stop/Pause only marks the current turn as terminal or cancelled. It must
not write memory automatically and must not wake /v1/stream later.
Emergency checkpoint is reserved for explicit shutdown/restart/loss-of-runtime
moments. When the user asks for a recoverable emergency checkpoint, use
memoria-viva for a short .agents/ACTIVE.md/.agents/HANDOFF.md update and
napkin-projeto only for recurring tactical gotchas. Keep the checkpoint short:
workspace, safe summary, thread/session/turn ids when available, lastSeq,
route/mapping, terminal state, reason and next safe step. Do not store secrets,
raw sensitive prompt, logs or long tool output.
/deepseek resume, /save and /load in dex-code use an explicit local
checkpoint contract. They must not fall back to hidden text replay or read
sensitive payloads.
/save creates a new metadata-only checkpoint under:
```text
%LOCALAPPDATA%\DexCode\checkpoints
```
The checkpoint may contain workspace, Pythia session id, chat sessions file,
DeepSeek thread-map file, DeepSeek thread id, model, mode and lastSeq. It
must not contain messages, prompt text, response text, raw payloads, .env
values, secrets or replay buffers. The command output must include
command_kind=checkpoint, source=checkpoint_store, replay_used=false,
sensitive_payload_read=false, title_policy=not_read and
content_policy=not_read.
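One way to enforce the metadata-only rule is an allow-list: anything outside the permitted fields is rejected rather than silently dropped. The sketch below assumes hypothetical JSON key names for the permitted fields (only `lastSeq` and the report keys are spelled as in this runbook); the real checkpoint schema may differ.

```python
# Hypothetical key names for the fields the checkpoint is allowed to contain.
ALLOWED_FIELDS = {
    "workspace",
    "pythia_session_id",
    "chat_sessions_file",
    "deepseek_thread_map_file",
    "deepseek_thread_id",
    "model",
    "mode",
    "lastSeq",
}


def build_checkpoint(metadata):
    """Return a copy of metadata, refusing any field that is not on the allow-list."""
    unexpected = sorted(set(metadata) - ALLOWED_FIELDS)
    if unexpected:
        # Report only the field names, never the values.
        raise ValueError(f"non-metadata fields rejected: {unexpected}")
    return dict(metadata)


def save_report():
    """Fields the /save command output must include."""
    return {
        "command_kind": "checkpoint",
        "source": "checkpoint_store",
        "replay_used": False,
        "sensitive_payload_read": False,
        "title_policy": "not_read",
        "content_policy": "not_read",
    }
```

An allow-list is used instead of a deny-list so that a newly introduced payload field cannot leak into a checkpoint by default.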
/load [checkpoint_id|checkpoint_file] and
/deepseek resume [checkpoint_id|checkpoint_file] restore by session identity
and thread mapping. They reload the existing Pythia sessions file, activate the
stored session id and update only the DeepSeek mapping entry required for that
session. Before writing the mapping file, they create a timestamped .bak-*
copy. They do not replay chat text and do not rewrite saved conversations as a
substitute for restore.
If the checkpoint is missing, outside the checkpoint root, from another
workspace, malformed or points to a missing visual session, the command must
return executed=false, source=checkpoint_store, replay_used=false and
sensitive_payload_read=false.
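The refusal conditions above can be checked in order before anything is restored. This is a sketch under assumptions: the checkpoint is taken to be a JSON file, the `workspace` and `pythia_session_id` keys and the extra `reason` field are hypothetical, and the function names are illustrative.

```python
import json
from pathlib import Path


def _report(executed, reason):
    # "reason" is a hypothetical extra field; the other four come from the runbook.
    return {
        "executed": executed,
        "source": "checkpoint_store",
        "replay_used": False,
        "sensitive_payload_read": False,
        "reason": reason,
    }


def check_checkpoint(checkpoint_file, checkpoint_root, current_workspace, known_session_ids):
    """Validate a checkpoint file without reading any chat content."""
    root = Path(checkpoint_root).resolve()
    path = Path(checkpoint_file).resolve()
    if root not in path.parents:
        return _report(False, "outside_checkpoint_root")
    if not path.is_file():
        return _report(False, "missing")
    try:
        data = json.loads(path.read_text(encoding="utf-8"))
    except (OSError, ValueError):  # JSONDecodeError is a ValueError
        return _report(False, "malformed")
    if not isinstance(data, dict):
        return _report(False, "malformed")
    if data.get("workspace") != current_workspace:
        return _report(False, "other_workspace")
    if data.get("pythia_session_id") not in known_session_ids:
        return _report(False, "missing_visual_session")
    return _report(True, None)
```

Resolving both paths before the containment test keeps `..` segments from escaping the checkpoint root.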
dex-code.exe talks to the local DeepSeek Runtime. OpenRouter, Ollama and
custom OpenAI-compatible /v1 endpoints are configured as providers below that
runtime, not as replacement runtime URLs.
Keep these layers separate:
| Layer | Example | Purpose |
|---|---|---|
| Runtime URL | `http://127.0.0.1:7878` | Local DeepSeek Runtime endpoint used by `dex-code.exe`. |
| OpenRouter provider base URL | `https://openrouter.ai/api/v1` | Backend used by DeepSeek Runtime for OpenRouter requests. |
| Ollama provider base URL | `http://localhost:11434/v1` | OpenAI-compatible Ollama endpoint used by DeepSeek Runtime. |
| Custom OpenAI-compatible base URL | `https://api.openai.com/v1` or another `/v1` endpoint | Optional backend that follows the OpenAI-compatible models/chat contract. |
| Active text model | `deepseek-v4-pro`, `llama3.2:1b` or `inclusionai/ring-2.6-1t:free` | Text model used by DeepSeek Runtime thread-first turns; when the provider is external, this comes from `[providers.<provider>].text_model` or the compatible `model` alias. |
When the active provider is deepseek, use DeepSeek model ids such as
deepseek-v4-pro or deepseek-v4-flash. When the active provider is
ollama, openrouter or custom-openai, keep the top-level
default_text_model = "auto" and store the real text model in
[providers.<provider>].text_model plus the backward-compatible model alias.
This prevents a stale DeepSeek model id from being sent to an external provider.
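The model selection rule can be sketched over an already parsed `config.toml`, represented here as a plain dictionary. The top-level `provider` key that names the active provider is an assumption, as is the function name; `default_text_model`, `text_model` and the `model` alias follow the description above.

```python
EXTERNAL_PROVIDERS = {"ollama", "openrouter", "custom-openai"}


def effective_text_model(config):
    """Pick the text model for the active provider.

    For external providers the top-level default is ignored, so a stale
    DeepSeek model id can never be sent to them.
    """
    provider = config.get("provider", "deepseek")  # hypothetical key name
    if provider in EXTERNAL_PROVIDERS:
        section = config.get("providers", {}).get(provider, {})
        # Prefer text_model; fall back to the backward-compatible alias.
        return section.get("text_model") or section.get("model")
    return config.get("default_text_model")
```

With an external provider and no configured model, the function returns `None` instead of falling back to the top-level default, which a caller can surface as "no model selected".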
Model discovery uses provider-specific read-only checks:
```powershell
Invoke-RestMethod https://openrouter.ai/api/v1/models
Invoke-RestMethod http://localhost:11434/api/tags
Invoke-RestMethod https://api.openai.com/v1/models
```

For Ollama, a missing installation or stopped `ollama serve` must be treated as
a recoverable provider state. The app should use a short local timeout and show
an actionable message such as installing Ollama or starting `ollama serve`; it
must not freeze the settings panel. A fresh Ollama install can legitimately
return an empty model list:

```json
{"models":[]}
```

After a model is downloaded, `/api/tags` should include its `models[].name`.
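A minimal sketch of the recoverable-state handling for Ollama follows. The state labels (`ok`, `no_models`, `unavailable`), the function names and the two-second timeout are assumptions chosen for illustration; only the `/api/tags` URL and the `models[].name` shape come from the text above.

```python
import json
import urllib.error
import urllib.request


def interpret_tags(payload_text):
    """Turn an /api/tags response body into (state, model_names)."""
    data = json.loads(payload_text)
    if not isinstance(data, dict):
        return ("no_models", [])
    names = [
        entry["name"]
        for entry in data.get("models") or []
        if isinstance(entry, dict) and "name" in entry
    ]
    # An empty list is a valid state for a fresh install, not an error.
    return ("ok", names) if names else ("no_models", [])


def check_ollama(url="http://localhost:11434/api/tags", timeout=2.0):
    """Query Ollama with a short timeout so a stopped server cannot hang the caller."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return interpret_tags(response.read().decode("utf-8"))
    except (urllib.error.URLError, OSError, ValueError):
        # Not installed, not running, timed out, or returned an unreadable body.
        return ("unavailable", [])
```

A settings panel could map `unavailable` to a hint about installing Ollama or starting `ollama serve`, and `no_models` to a hint about downloading a model.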
The app settings panel can now:
- choose `deepseek`, `openrouter`, `ollama` or `custom-openai`;
- edit provider base URL;
- fetch provider model lists;
- save provider/text model and optional image/audio/TTS slots to the DeepSeek CLI `config.toml` with backup;
- sync configured slots into the existing model selector categories (`textGeneration`, `imageCreation`, `audioCreation`, `textToSpeech`);
- restart the local DeepSeek Runtime process when saving a provider, so the running server reloads the updated `config.toml`.
Empty image/audio/TTS slots mean `sem modelo` ("no model"). They are allowed and should not
be treated as errors.
Vision/audio status must be labelled carefully:
- `connected`: text provider path proven through DeepSeek Runtime.
- `advisory`: model is cataloged/configured, but the runtime has not proven a native image/audio payload path.
- `hidden`: no safe connected contract is available yet.
A separate OpenRouter vision skill/tool can be connected as
`image -> vision tool -> text -> runtime`, but that is not the same as native
multimodal support inside `/v1/stream`.
Secrets must be configured through the approved DeepSeek CLI/runtime mechanism. The panel only reports whether secret-like keys exist; it must not display, log, diff or version token/API key values.
If a copied dex-code.exe shows placeholder models, verify that the support
files were deployed with it:
```text
bin64\dex-code\support\dex-code-model-list.json
bin64\dex-code\dex-code-model-get-replace-version.json
```
Today dex-code.exe is a runtime client, not a runtime supervisor. It reads the
runtime URL, token and model from the launching process environment or approved
project .env, connects to the selected runtime and reports health/auth errors
in the UI. It does not start, restart, monitor or stop DeepSeek TUI sidecar
servers.
Do not assume a process is safe to terminate because it listens on a diagnostic port. A DeepSeek TUI process can own useful child processes such as MCP servers, Node helpers, Python helpers or Playwright tools. Preserve it unless a PID-level inventory proves it is an orphan or a duplicate that is not serving the current session.
A future supervised mode should be implemented as an explicit product feature, not as ad hoc cleanup. It should include:
- a single configured runtime port;
- a PID/port lock owned by the supervisor;
- startup timeout and readiness checks;
- bearer-token alignment without logging the token;
- restart/backoff policy;
- graceful shutdown policy;
- UI state for `missing`, `starting`, `ready`, `auth failed`, `failed` and `stopping`.
Never kill by image name such as node.exe, python.exe, uvx.exe or
codex.exe.
Inventory first:
```powershell
# Image names worth inspecting; this list is for inventory only, never for killing by name.
$names = @(
  'uvx.exe','uv.exe','mcp-filesystem-encoding.exe','python.exe',
  'node.exe','node_repl.exe','codex.exe','cmd.exe','pwsh.exe','dex-code.exe'
)
# Read-only query: match by image name or by a relevant command-line keyword.
Get-CimInstance Win32_Process |
  Where-Object {
    $names -contains $_.Name -or
    ($_.CommandLine -match 'deepseek|mcp|playwright|desktop-commander')
  } |
  Select-Object ProcessId, ParentProcessId, Name, CreationDate, CommandLine
```

Before saving or sharing command lines, redact:
- `Bearer ...`;
- `--token`, `--auth-token`, `--api-key`, `--password`;
- `TOKEN=`, `SECRET=`, `API_KEY=`, `DEEPSEEK_RUNTIME_TOKEN=`;
- `sk-...`, `github_pat_...`, Slack/GitHub/OpenAI-style token strings.
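The redaction rules above can be approximated with a few regular expressions. This is a sketch, not a complete secret scanner: the patterns, the `[REDACTED]` marker and the function name are assumptions, and the `ghp_` and `xox?-` prefixes are added as common GitHub and Slack token shapes beyond the list above. It deliberately over-redacts rather than risk leaking a value.

```python
import re

_PATTERNS = [
    # Bearer <value>
    (re.compile(r"(?i)(bearer\s+)\S+"), r"\1[REDACTED]"),
    # --token value, --token=value and the other listed flags
    (re.compile(r"(?i)(--(?:token|auth-token|api-key|password)(?:=|\s+))\S+"), r"\1[REDACTED]"),
    # NAME_TOKEN=value, SECRET=value, API_KEY=value
    (re.compile(r"(?i)\b(\w*(?:TOKEN|SECRET|API_KEY)=)\S+"), r"\1[REDACTED]"),
    # Bare token strings recognised by their prefix
    (re.compile(r"\b(?:sk-|github_pat_|ghp_|xox[a-z]-)[A-Za-z0-9_\-]+"), "[REDACTED]"),
]


def redact(command_line):
    """Replace secret-looking values in a command line with [REDACTED]."""
    for pattern, replacement in _PATTERNS:
        command_line = pattern.sub(replacement, command_line)
    return command_line
```

Applying `redact` to every `CommandLine` value before the inventory is saved or shared keeps flag names and variable names visible for diagnosis while dropping the values.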
Classify each process:
| Class | Rule |
|---|---|
| active Codex current | In the active Codex tree or required by this session. Preserve. |
| preserve | Known runtime/MCP/app-server/listener currently in use. |
| candidate | Duplicate or orphan with known command line, no active parent and no required listener. |
| unknown | Generic process, missing command line, unclear parent, or unrelated project. |
Only terminate a specific PID after it is classified as candidate and the
operator accepts the risk. Re-check MCP visibility and runtime health after any
termination.
| Symptom | Likely cause | Safe action |
|---|---|---|
| Runtime footer uses `7878` while smoke worked on `7879` | UI process still uses default URL and no approved `.env` override was loaded | Launch with `DEEPSEEK_RUNTIME_URL` for the chosen port, or set it in project `.env` for local testing. |
| HTTP 401 | Runtime token mismatch or missing token in the `dex-code.exe` process | Align `DEEPSEEK_RUNTIME_TOKEN` without printing the value. |
| `/health` passes but `/v1/workspace/status` returns 401 | Runtime is alive, but useful endpoint auth is not aligned | Treat as auth/config mismatch, not ready runtime. |
| `7878` returns 401 while old `7879` worked repeatedly | Old diagnostic runtime was started with `--insecure --enable vision_model`; current canonical runtime may be token-protected | Restart the local runtime through the app's manual start or align `DEEPSEEK_RUNTIME_TOKEN`. |
| `7879` appears to have many child Node/Python/MCP processes | Diagnostic runtime may be supervising active tools | Preserve until PID-level inventory proves it is safe to stop. |
| Reasoning card remains visible after auth failure | Visual state needs validation after controlled error | Test after auth is fixed; if still stuck, patch finalization state. |
| Many MCP/Node/Python processes | Multiple Codex sessions or duplicated MCP startup | Inventory and classify; do not kill by name. |
| Image chip appears but model says it did not receive an image | UI has a local reference, but native multimodal payload is not proven | Use OCR/local reference wording and keep native vision hidden until tested. |
| DeepSeek TUI Ctrl+V creates `clipboard-images`, but `dex-code` only has `[image:path]` | Terminal attached media and runtime `/v1/stream` are different contracts | Treat `dex-code` images as local/advisory unless Provider Vision or OCR attaches text context. |