A local HTTP proxy that lets the Factory Droid CLI's BYOK (Bring Your Own Key) feature work with any OpenAI-compatible chat-completions endpoint — Zen Go, OpenRouter, Together, Groq, Fireworks, DeepSeek direct, etc.
Droid's `provider: "openai"` BYOK path uses the OpenAI Responses API (`POST /responses`) — the newer, agentic protocol with named output items, custom grammar tools, and reasoning items. Most third-party providers only implement the older Chat Completions API (`POST /v1/chat/completions`).
The proxy translates Responses ↔ Chat Completions in both directions, including:
- Streaming SSE event sequences (`response.output_item.added`, `response.output_text.delta`, `response.function_call_arguments.delta`, `response.custom_tool_call_input.delta`, etc.)
- Tool calling (both standard `function` and OpenAI's newer `custom` grammar-constrained tools, like Droid's `ApplyPatch`)
- Reasoning content roundtrip — critical for thinking-mode models (DeepSeek, GLM, Kimi, MiMo) that demand the prior turn's `reasoning_content` be echoed back
- Tool result inputs (`function_call_output`, `custom_tool_call_output`)
- Structured outputs (`text.format` ↔ `response_format`)
- Image inputs, system instructions, sampling params
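To see why a translation layer is needed, here is a deliberately minimal illustration of the same single-turn request in both protocols: a hand-trimmed sketch following the public API references, with an invented `run_shell` tool. Real Droid requests carry many more fields.

```python
# OpenAI Responses API — what Droid sends to the proxy (POST /v1/responses)
responses_request = {
    "model": "kimi-k2.6",
    "stream": True,
    "instructions": "You are a coding agent.",
    "input": [
        {"type": "message", "role": "user",
         "content": [{"type": "input_text", "text": "List the repo files."}]},
    ],
    # Internally tagged tool: name/parameters live at the top level
    "tools": [{"type": "function", "name": "run_shell",
               "parameters": {"type": "object",
                              "properties": {"cmd": {"type": "string"}}}}],
}

# Chat Completions — what the proxy forwards upstream (POST /v1/chat/completions)
chat_request = {
    "model": "kimi-k2.6",
    "stream": True,
    "messages": [
        {"role": "system", "content": "You are a coding agent."},  # from instructions
        {"role": "user", "content": "List the repo files."},
    ],
    # Externally tagged tool: everything nested under "function"
    "tools": [{"type": "function",
               "function": {"name": "run_shell",
                            "parameters": {"type": "object",
                                           "properties": {"cmd": {"type": "string"}}}}}],
}
```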
It also includes:
- Per-model upstream routing via a config file — different model names can hit different providers
- Retry with backoff on transient upstream failures (408/425/429/5xx + network errors)
- 2-hour idle self-exit — no orphaned processes
- Auto-start via Droid's `SessionStart` hook
Requirements:

- macOS (Linux probably works, untested)
- Python 3.10 or newer (uses `match`/`case`, the walrus operator, modern type hints)
- Factory Droid CLI installed (`droid` on `PATH`)
- An API key for at least one OpenAI-compatible upstream
```bash
git clone https://github.com/thelostorbital/factory-droid-bridge.git
cd factory-droid-bridge
./install.sh
```

`install.sh` copies `droid_responses_proxy.py` + `start_droid_proxy.sh` to `~/.factory/bin/`, makes the launcher executable, and creates `~/.factory/logs/`. It verifies Python 3.10+ is present.

After that, follow Configure below to register the `SessionStart` hook + your custom models in `~/.factory/settings.json`.
If you'd rather not run a script:

```bash
mkdir -p ~/.factory/bin ~/.factory/logs
cp droid_responses_proxy.py start_droid_proxy.sh ~/.factory/bin/
chmod +x ~/.factory/bin/start_droid_proxy.sh
```

No `pip install`, no daemon registration, no compiled deps. Stdlib Python only.
In `~/.factory/settings.json`, add this top-level `hooks` block (or merge with whatever you have):
```json
{
  "hooks": {
    "SessionStart": [
      {
        "matcher": "",
        "hooks": [
          {
            "type": "command",
            "command": "/Users/YOU/.factory/bin/start_droid_proxy.sh"
          }
        ]
      }
    ]
  }
}
```

Replace `/Users/YOU/` with your actual home path (the hook `command` field doesn't expand `~` or `$HOME`).
In the same `settings.json`, add `customModels` entries that point at the proxy (`http://127.0.0.1:18080`). The `model` field is what Droid will send on the wire — it's also what the proxy uses to look up the route.
```json
{
  "customModels": [
    {
      "model": "kimi-k2.6",
      "displayName": "Kimi K2.6",
      "baseUrl": "http://127.0.0.1:18080",
      "apiKey": "<your upstream api key>",
      "provider": "openai"
    },
    {
      "model": "deepseek-v4-flash",
      "displayName": "DeepSeek V4 Flash",
      "baseUrl": "http://127.0.0.1:18080",
      "apiKey": "<your upstream api key>",
      "provider": "openai"
    }
  ]
}
```

`baseUrl` must be exactly `http://127.0.0.1:18080` — no trailing slash, no path. Droid appends `/v1/responses` itself.

`provider` must be `"openai"` — this proxy translates the OpenAI Responses API specifically. (If you need Anthropic-native models, set `provider: "anthropic"` and a real Anthropic-compatible `baseUrl`; that bypasses the proxy.)
The first time the proxy starts, it writes `~/.factory/bin/proxy_routes.json` with a default config. Edit this file to add new upstreams.
```json
{
  "default_upstream": "https://opencode.ai/zen/go/v1/chat/completions",
  "routes": [
    {
      "models": ["kimi-k2.6", "kimi-k2.5", "deepseek-v4-pro", "deepseek-v4-flash"],
      "upstream": "https://opencode.ai/zen/go/v1/chat/completions"
    },
    {
      "prefix": "openrouter/",
      "strip_prefix": true,
      "upstream": "https://openrouter.ai/api/v1/chat/completions",
      "headers": {
        "HTTP-Referer": "https://factory.ai",
        "X-Title": "Droid CLI"
      }
    },
    {
      "prefix": "together/",
      "strip_prefix": true,
      "upstream": "https://api.together.xyz/v1/chat/completions"
    },
    {
      "regex": "^groq-",
      "upstream": "https://api.groq.com/openai/v1/chat/completions"
    }
  ]
}
```

| Matcher | Field | Behavior |
|---|---|---|
| Exact list | `"models": ["m1", "m2"]` | Match if the request model is exactly one of these |
| Prefix | `"prefix": "openrouter/"` | Match if the request model starts with this |
| Regex | `"regex": "^claude-"` | `re.search` on the model name |
| Field | Purpose |
|---|---|
| `upstream` (required) | Full Chat Completions URL to POST to |
| `strip_prefix` (bool, prefix-only) | Drop the prefix from the model name before forwarding |
| `model_rewrite` (string) | Replace the model name entirely |
| `headers` (dict) | Extra headers merged into the upstream request (overrides defaults — useful for `HTTP-Referer`, `X-Title`, etc.) |
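Putting the matchers and fields together, route resolution amounts to something like the following sketch (illustrative only, not the proxy's actual code; it just mirrors the semantics in the two tables above):

```python
import re

def resolve_route(model: str, config: dict) -> tuple[str, str, dict]:
    """Return (upstream_url, forwarded_model, extra_headers) for a model name."""
    for route in config.get("routes", []):
        matched = False
        forwarded = model
        if "models" in route:                  # exact-list matcher
            matched = model in route["models"]
        elif "prefix" in route:                # prefix matcher
            matched = model.startswith(route["prefix"])
            if matched and route.get("strip_prefix"):
                forwarded = model[len(route["prefix"]):]
        elif "regex" in route:                 # regex matcher (re.search)
            matched = re.search(route["regex"], model) is not None
        if matched:
            # model_rewrite, when present, wins over the (possibly stripped) name
            forwarded = route.get("model_rewrite", forwarded)
            return route["upstream"], forwarded, route.get("headers", {})
    # No route matched: fall back to the default upstream, name untouched
    return config["default_upstream"], model, {}
```

With the example config above, `resolve_route("openrouter/google/gemini-2.0-flash-001", config)` yields OpenRouter's Chat Completions URL, the stripped name `google/gemini-2.0-flash-001`, and the two extra headers.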
Restart the proxy after editing this file: `kill $(lsof -ti:18080); ~/.factory/bin/start_droid_proxy.sh`
The proxy forwards whatever `Authorization: Bearer <key>` Droid sends. Droid populates that header from the `apiKey` field of the matching `customModels` entry. So:
- One API key per model entry in settings.json.
- Different models can use different keys — perfect when you've routed them to different upstreams.
The `SessionStart` hook starts the proxy automatically when Droid launches. Manual control:

```bash
# Start (idempotent — no-op if already running)
~/.factory/bin/start_droid_proxy.sh

# Health check
curl -s http://127.0.0.1:18080/health

# Inspect current route config (also dumps default_upstream + retry settings)
curl -s http://127.0.0.1:18080/routes | python3 -m json.tool

# Stop
kill $(lsof -ti:18080)

# Restart (e.g., after editing proxy_routes.json)
kill $(lsof -ti:18080); ~/.factory/bin/start_droid_proxy.sh
```

To verify end to end:

- Start Droid, pick one of your custom models.
- Ask any question.
- You should see a streaming response (not a 400 or 404).

`tail -f ~/.factory/logs/droid_responses_proxy.log` should show a `POST /v1/responses` line per request.
If the response is empty or fails, check the most recent body dump:

```bash
ls -t ~/.factory/logs/proxy-bodies/ | head -1 | xargs -I{} python3 -m json.tool ~/.factory/logs/proxy-bodies/{}
```

This shows exactly what Droid sent and exactly what the proxy forwarded — the single best debugging tool.
Supported:

- `POST /v1/responses` (or any path ending in `/responses`)
- Streaming (SSE) and non-streaming
- Function tools (Chat Completions standard `{type: function}`)
- Custom tools (OpenAI's `{type: custom}` with Lark or regex grammar — e.g. Droid's ApplyPatch; see the sketch after this list)
- Function and custom tool result inputs (`function_call_output`, `custom_tool_call_output`)
- System instructions
- Image inputs (`input_image` → `image_url`)
- Structured outputs (`text.format` ↔ `response_format`)
- Reasoning content roundtrip across turns (cached by `call_id`)
- Per-model upstream routing
- Retry on 408/425/429/500/502/503/504 + network exceptions
- Sampling params: `temperature`, `top_p`, `max_output_tokens` → `max_tokens`, `parallel_tool_calls`, `seed`, `user`
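To make the custom-tool lowering concrete, here is a hand-written sketch of the tool-definition translation (not the proxy's actual code; field names follow the public OpenAI specs as described above):

```python
def translate_tool(tool: dict) -> dict:
    """Responses tools (internally tagged) -> Chat tools (externally tagged)."""
    if tool["type"] == "function":
        return {"type": "function",
                "function": {"name": tool["name"],
                             "description": tool.get("description", ""),
                             "parameters": tool.get("parameters", {})}}
    if tool["type"] == "custom":
        # Custom (grammar-constrained) tools become a plain function with a
        # single `input` string; the grammar rides along in the description.
        desc = tool.get("description", "")
        fmt = tool.get("format", {})
        if fmt.get("type") == "grammar":
            desc += (f"\nInput must conform to this {fmt.get('syntax', '')} "
                     f"grammar:\n{fmt.get('definition', '')}")
        return {"type": "function",
                "function": {"name": tool["name"],
                             "description": desc,
                             "parameters": {"type": "object",
                                            "properties": {"input": {"type": "string"}},
                                            "required": ["input"]}}}
    raise ValueError(f"unsupported tool type: {tool['type']}")
```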
Not supported:

- OpenAI built-in tools — `web_search`, `file_search`, `computer_use`, `code_interpreter`, `mcp`. These require server-side tooling; they're silently dropped from the request.
- ZDR-mode encrypted reasoning — `encrypted_content` blobs are opaque to the proxy. Plaintext `summary` is fine.
- Anthropic-native upstreams — this proxy translates only to Chat Completions. For Claude, configure Droid with `provider: "anthropic"` and a real Anthropic-compatible base URL; that path doesn't go through this proxy at all.
- Per-token streaming of custom-tool input — emitted as a single delta at end of stream. The rest streams per-token.
These upstream models have been verified end-to-end through the proxy with Droid — full streaming, tool calling, custom-tool (ApplyPatch) round-trips, and multi-turn agentic loops with reasoning preserved across turns:
| Model | Provider | Notes |
|---|---|---|
| `glm-5.1`, `glm-5` | OpenCode Zen Go | Reasoning + ApplyPatch + multi-turn confirmed |
| `kimi-k2.5`, `kimi-k2.6` | OpenCode Zen Go | `reasoning_content` cache hits across turns |
| `deepseek-v4-pro`, `deepseek-v4-flash` | OpenCode Zen Go | Thinking-mode round-trip verified; KV cache reuse working |
| `mimo-v2.5`, `mimo-v2.5-pro` | OpenCode Zen Go | Same Chat Completions translation path; reasoning preserved |
Other OpenAI-compatible providers (OpenRouter, Together, Groq, Fireworks, Anyscale, DeepSeek direct, etc.) should work via the per-model routing config but have not been personally smoke-tested. Open an issue with your config + the relevant body dump from `~/.factory/logs/proxy-bodies/` if a specific provider misbehaves.
Models you access via Droid's native Anthropic adapter (e.g., `minimax-m2.7`, `minimax-m2.5` against Zen Go's `/v1/messages` endpoint) bypass this proxy entirely — those use Droid's `provider: "anthropic"` code path directly and don't need a translation layer.
| Path | What's in it |
|---|---|
| `~/.factory/logs/droid_responses_proxy.log` | Per-request summary lines (model, upstream, stream flag, msg count, tool count) |
| `~/.factory/logs/droid_responses_proxy.stdout.log` | Process stdout/stderr — startup messages, watchdog, fatal errors |
| `~/.factory/logs/proxy-bodies/*.json` | Last 40 requests, full incoming Responses body + outgoing Chat body side by side |
The body dumps are the most useful debugging artifact when something behaves unexpectedly — they show you ground truth for both directions of the translation.
Set in `start_droid_proxy.sh` before launching, or as env vars when running manually.

| Variable | Default | Purpose |
|---|---|---|
| `PORT` | `18080` | Local listen port |
| `UPSTREAM` | `https://opencode.ai/zen/go/v1/chat/completions` | Fallback when no route matches |
| `PROXY_ROUTES_FILE` | `~/.factory/bin/proxy_routes.json` | Route config path |
| `PROXY_IDLE_TIMEOUT_SECONDS` | `7200` (2h) | Self-exit after this many seconds of inactivity. `0` disables. |
| `PROXY_RETRY_MAX_ATTEMPTS` | `3` | Max attempts on transient failures (total, not retries) |
| `PROXY_RETRY_BASE_BACKOFF` | `0.5` | Base seconds for exponential backoff |
| `PROXY_RETRY_TIMEOUT` | `600` | Per-attempt upstream timeout (seconds) |
| `PROXY_LOG` | `~/.factory/logs/droid_responses_proxy.log` | Per-request log file |
| `PROXY_DUMP_DIR` | `~/.factory/logs/proxy-bodies` | Body capture dir |
| `PROXY_DUMP_KEEP` | `40` | How many body dumps to retain |
| `PROXY_REASONING_CACHE_MAX` | `4096` | Max entries in the `call_id` → reasoning-text LRU cache |
| `PROXY_DEBUG` | `0` | Set `1` to log full outgoing Chat request bodies (truncated to 600 chars) |
| Event | Behavior |
|---|---|
| Launch Droid | `SessionStart` hook runs `start_droid_proxy.sh`; cold-starts proxy in ~600ms, or no-ops in ~50ms if already running |
| Make a request | Increments in-flight counter; refreshes idle timer; both reset on response complete |
| Long stream (model thinking 10 min) | In-flight stays >0; watchdog can't reap |
| Two parallel Droid sessions | Share the same proxy; either's traffic keeps it alive; closing one doesn't affect the other |
| Close all Droid sessions | Proxy keeps running |
| 2 hours with no traffic | Watchdog `os._exit(0)`; next session cold-starts a fresh one |
| `kill -9` droid mid-request | Proxy unaffected; recovers normally |
| Reboot | Proxy gone; next Droid launch cold-starts it |
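The watchdog behavior in the table above has roughly this shape (an illustrative sketch, not the proxy's actual code; the `in_flight` and `last_activity` names are invented here):

```python
import os
import threading
import time

in_flight = 0                     # requests currently being served
last_activity = time.monotonic()  # refreshed on every request and response
state_lock = threading.Lock()

def watchdog(idle_timeout: float = 7200.0) -> None:
    """Exit the process after idle_timeout seconds with no traffic.

    time.monotonic() never jumps, so a laptop sleep/wake cannot make the
    proxy believe it has been idle for days and reap itself spuriously.
    """
    while True:
        time.sleep(60)
        with state_lock:
            idle_for = time.monotonic() - last_activity
            busy = in_flight > 0   # a long-running stream blocks reaping
        if idle_timeout > 0 and not busy and idle_for >= idle_timeout:
            os._exit(0)            # hard exit: no orphaned threads or atexit hooks

threading.Thread(target=watchdog, daemon=True).start()
```

The `idle_timeout > 0` guard reflects the documented behavior that `PROXY_IDLE_TIMEOUT_SECONDS=0` disables the self-exit.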
Either the proxy isn't running, or your `baseUrl` is wrong:

- `curl http://127.0.0.1:18080/health` should return JSON with `"ok": true`.
- `baseUrl` in `settings.json` must be exactly `http://127.0.0.1:18080` — no trailing slash, no `/v1`.
```
Error from provider (DeepSeek): The reasoning_content in the thinking mode must be passed back to the API
```
Right after a proxy restart, the reasoning cache is empty — DeepSeek wants the prior turn's reasoning echoed back, and the proxy has nothing to attach. One turn is degraded (the proxy fills in a `(prior reasoning not retained)` placeholder so DeepSeek accepts the request). From the next turn onward, the cache is populated and reasoning is reattached automatically.
If this keeps happening every turn, something is restarting the proxy:
```bash
tail -f ~/.factory/logs/droid_responses_proxy.stdout.log
```

A `type: custom` tool (typically ApplyPatch) was dropped or mistranslated. Check the latest dump:
```bash
ls -t ~/.factory/logs/proxy-bodies/ | head -1 | xargs -I{} python3 -m json.tool ~/.factory/logs/proxy-bodies/{} | head -100
```

The `outgoing_chat.tools` list should include an ApplyPatch entry with a single `input` string parameter. If it's missing, the proxy build is older than the custom-tool support — replace `droid_responses_proxy.py` with the current version.
Your `apiKey` in `settings.json` is invalid or doesn't have access to the model. Verify directly:

```bash
curl -s -H "Authorization: Bearer <key>" https://YOUR-UPSTREAM/v1/models | python3 -m json.tool
```

You're being rate-limited. The proxy retries automatically (3 attempts, exponential backoff), but if all attempts fail, the 429 propagates. Options:
- Slow down agentic loops (reduce `parallel_tool_calls`).
- Increase `PROXY_RETRY_BASE_BACKOFF` and `PROXY_RETRY_MAX_ATTEMPTS`.
- Upgrade your tier with the provider.
The model probably hit `max_tokens` while still in reasoning. Increase `max_output_tokens` in Droid's request settings, or pick a model that uses fewer reasoning tokens for short answers.
Check `tail -100 ~/.factory/logs/droid_responses_proxy.stdout.log` for tracebacks or `idle ...s — exiting` lines. The 2h idle timeout is intentional; if you want it to live longer, set `PROXY_IDLE_TIMEOUT_SECONDS=0` or a larger value in `start_droid_proxy.sh`.
```
~/.factory/settings.json
  │  customModels[].baseUrl = http://127.0.0.1:18080
  ▼
Droid CLI
  │
  │  POST /v1/responses  (OpenAI Responses API)
  │  stream: true
  │  tools: [{type:function, ...}, {type:custom, format:{grammar...}}]
  ▼
┌─────────────────────────────────┐
│ droid_responses_proxy.py        │
│                                 │
│ 1. Parse Responses body         │
│ 2. Resolve route by model name  │ ◄── proxy_routes.json
│ 3. Translate items → messages   │
│ 4. Translate tools internally   │
│    → externally tagged          │
│ 5. POST upstream (with retry)   │
│ 6. Translate response stream    │
│    back to Responses SSE        │
└─────────────────────────────────┘
  │
  │  POST /v1/chat/completions
  │  stream: true
  │  tools: [{type:function, function:{name, parameters}}]
  ▼
Upstream provider
(Zen Go / OpenRouter / Together / Groq / etc.)
```
- Items → Messages — Responses input is an array of typed items (`message`, `function_call`, `function_call_output`, `custom_tool_call`, `custom_tool_call_output`, `reasoning`). Chat Completions wants `messages` with `tool_calls` glued onto assistant messages and `role: tool` messages for results. The proxy coalesces accordingly.
- Tool definitions — Responses tools are internally tagged (`{type:function, name, parameters}` or `{type:custom, name, format}`). Chat tools are externally tagged (`{type:function, function:{name, parameters}}`). Custom tools become a function with a single `input: string` parameter and the grammar embedded in the description.
- Reasoning preservation — Thinking-mode models (DeepSeek, GLM, Kimi, MiMo) require the prior turn's `reasoning_content` to be passed back on assistant messages. Droid doesn't roundtrip Responses-format `reasoning` items in subsequent turns. The proxy caches `reasoning_text` keyed by `call_id` whenever it emits a reasoning + tool-call response, and reattaches it as `reasoning_content` when those call_ids reappear in input (see the cache sketch after this list).
- Streaming events — Chat streams `delta.content` and `delta.tool_calls[i].function.arguments` chunks. Responses streams a named event sequence (`response.created`, `response.in_progress`, `response.output_item.added`, `response.output_text.delta`, `response.function_call_arguments.delta`, `response.custom_tool_call_input.delta`, `response.output_item.done`, `response.completed`, etc.). A per-request `StreamBridge` state machine handles the conversion.
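The reasoning cache can be tiny. A minimal sketch of the idea (illustrative only; the class and method names are invented here, and the real proxy caps the size via `PROXY_REASONING_CACHE_MAX`):

```python
from collections import OrderedDict

class ReasoningCache:
    """LRU map: tool-call id -> reasoning text from the turn that produced it."""

    def __init__(self, max_entries: int = 4096) -> None:
        self.max_entries = max_entries
        self._data: OrderedDict[str, str] = OrderedDict()

    def remember(self, call_id: str, reasoning_text: str) -> None:
        # Called when the proxy emits a reasoning + tool-call response.
        self._data[call_id] = reasoning_text
        self._data.move_to_end(call_id)
        while len(self._data) > self.max_entries:
            self._data.popitem(last=False)   # evict least recently used

    def reattach(self, call_id: str) -> str:
        # Called when Droid echoes the function_call item back next turn.
        if call_id in self._data:
            self._data.move_to_end(call_id)
            return self._data[call_id]
        # Cache miss (e.g., right after a restart): degrade gracefully so
        # strict upstreams like DeepSeek still accept the request.
        return "(prior reasoning not retained)"
```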
Worked example — adding OpenRouter:

1. Get an OpenRouter API key from https://openrouter.ai.
2. Add a route in `~/.factory/bin/proxy_routes.json`:

   ```json
   {
     "prefix": "openrouter/",
     "strip_prefix": true,
     "upstream": "https://openrouter.ai/api/v1/chat/completions",
     "headers": {
       "HTTP-Referer": "https://factory.ai",
       "X-Title": "Droid CLI"
     }
   }
   ```

3. Add a model entry in `~/.factory/settings.json`:

   ```json
   {
     "model": "openrouter/google/gemini-2.0-flash-001",
     "displayName": "Gemini 2.0 Flash (OpenRouter)",
     "baseUrl": "http://127.0.0.1:18080",
     "apiKey": "<openrouter key>",
     "provider": "openai"
   }
   ```

4. Restart the proxy: `kill $(lsof -ti:18080); ~/.factory/bin/start_droid_proxy.sh`
5. Restart Droid, pick "Gemini 2.0 Flash (OpenRouter)" from the model picker, chat.

The proxy strips `openrouter/` from the model name, so OpenRouter sees `google/gemini-2.0-flash-001` (its actual catalog id) with the OpenRouter-required headers attached.
Repo (what you `git clone`):

```
factory-droid-bridge/
├── README.md                    # this file
├── LICENSE                      # MIT
├── droid_responses_proxy.py     # the proxy (stdlib Python)
├── start_droid_proxy.sh         # idempotent launcher (bash)
├── install.sh                   # one-shot installer
├── proxy_routes.example.json    # commented multi-provider example
└── .gitignore
```

After installing, on your machine:

```
~/.factory/
├── settings.json                        # hooks.SessionStart + customModels[]
├── bin/
│   ├── droid_responses_proxy.py         # copied by install.sh
│   ├── start_droid_proxy.sh             # copied by install.sh
│   ├── proxy_routes.example.json        # copied by install.sh
│   └── proxy_routes.json                # auto-created on first proxy start
└── logs/
    ├── droid_responses_proxy.log        # per-request summary
    ├── droid_responses_proxy.stdout.log # process stdout/stderr
    └── proxy-bodies/                    # last 40 request/response dumps
        └── 20XXXX-XXXXXX-XXXX.json
```
- Single-file Python, stdlib only. No `pip install`, no virtualenv, no Node, no compiled dependencies. If you have a working Python 3.10+, the proxy runs.
- Threading server — one OS thread per request. Adequate for one person's CLI usage; not a production gateway.
- Retry only covers the initial `urlopen` call (see the sketch after this list). Once the proxy starts writing SSE bytes back to Droid, it's committed — retrying would duplicate events and corrupt the Stainless SDK's state machine on the client.
- The reasoning cache is in-memory (per process). On restart it's empty; the next multi-turn message in a thinking-mode model conversation fires the `(prior reasoning not retained)` placeholder once, then steady state resumes.
- The idle watchdog uses `time.monotonic()` and checks every 60–120s, so it survives system sleep/wake without spurious reaps.
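The retry policy described above would look roughly like this: a simplified sketch using stdlib `urllib`, consistent with the documented status codes and env-var defaults, but not the actual proxy code:

```python
import time
import urllib.error
import urllib.request

RETRYABLE = {408, 425, 429, 500, 502, 503, 504}

def post_with_retry(req: urllib.request.Request,
                    max_attempts: int = 3,        # PROXY_RETRY_MAX_ATTEMPTS
                    base_backoff: float = 0.5,    # PROXY_RETRY_BASE_BACKOFF
                    timeout: float = 600.0):      # PROXY_RETRY_TIMEOUT
    """Retry only the initial upstream connection.

    Once a response object is handed back and SSE bytes start flowing to
    the client, there is no retry: replaying events would corrupt state.
    """
    for attempt in range(max_attempts):
        try:
            return urllib.request.urlopen(req, timeout=timeout)
        except urllib.error.HTTPError as e:        # must precede URLError
            if e.code not in RETRYABLE or attempt == max_attempts - 1:
                raise
        except (urllib.error.URLError, TimeoutError):
            if attempt == max_attempts - 1:
                raise
        time.sleep(base_backoff * (2 ** attempt))   # 0.5s, 1s, 2s, ...
```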
Community-built. Not an official Factory product, not endorsed by Factory. It uses Factory's documented BYOK and hook configuration surface; the translation between OpenAI Responses API and Chat Completions API is implemented against the public specs of both. Use at your own risk; review the code before running it.
MIT — see LICENSE.