From e76293e9832744e8d15c4fb85b96e72c5f3c44f0 Mon Sep 17 00:00:00 2001
From: enyst <engel.nyst@gmail.com>
Date: Tue, 9 Jun 2026 15:08:51 +0000
Subject: [PATCH] docs: expand OpenAI-compatible endpoint guide

Co-authored-by: openhands <openhands@all-hands.dev>
---
 sdk/arch/overview.mdx                      |   1 +
 sdk/guides/agent-server/openai-gateway.mdx | 201 +++++++++++++++++++--
 sdk/guides/agent-server/overview.mdx       |   2 +
 sdk/index.mdx                              |  10 +-
 4 files changed, 202 insertions(+), 12 deletions(-)
diff --git a/sdk/arch/overview.mdx b/sdk/arch/overview.mdx
index 96de5f67c..5ef666f22 100644
--- a/sdk/arch/overview.mdx
+++ b/sdk/arch/overview.mdx
@@ -196,6 +196,7 @@ For full list of implemented workspaces, see the [source code](https://github.co
 
 **Features:**
 - REST API & WebSocket endpoints for conversations, bash, files, events, desktop, and VSCode
+- [OpenAI-compatible `/v1/chat/completions` endpoint](/sdk/guides/agent-server/openai-gateway) for clients that expect an OpenAI-style backend
 - Service management with isolated per-user sessions
 - API key authentication and health checking
 
diff --git a/sdk/guides/agent-server/openai-gateway.mdx b/sdk/guides/agent-server/openai-gateway.mdx
index f40a7a603..37ae59e8f 100644
--- a/sdk/guides/agent-server/openai-gateway.mdx
+++ b/sdk/guides/agent-server/openai-gateway.mdx
@@ -1,5 +1,5 @@
 ---
-title: OpenAI-Compatible Gateway
+title: OpenAI-Compatible Endpoint
 description: Call an OpenHands agent-server through the OpenAI Chat Completions protocol.
 ---
 
@@ -7,24 +7,203 @@ import RunExampleCode from "/sdk/shared-snippets/how-to-run-example.mdx";
 
 The agent-server exposes an OpenAI-compatible `/v1/chat/completions` endpoint so clients that already speak the OpenAI protocol can call an OpenHands agent.
 
-Use this when you want an existing chat UI, IDE integration, evaluation harness, or another agent to treat OpenHands as an OpenAI-style backend while still getting the full agent runtime behind the request.
+Use this when you want an existing chat UI, IDE integration, evaluation harness, voice platform, or another agent to treat OpenHands as an OpenAI-style backend while still getting the full agent runtime behind the request.
 
-## How it works
+## What to Configure
 
-1. Save an LLM profile through the agent-server profile API.
-2. List available gateway models with `GET /v1/models`.
-3. Call `POST /v1/chat/completions` with a model ID shaped like `openhands_<profile_name>`.
-4. Read `X-OpenHands-ServerConversation-ID` from the response.
-5. Pass that header back on later requests to continue the same OpenHands conversation.
+Most OpenAI-compatible clients ask for the same three fields:
+
+| Client Field | Value |
+| --- | --- |
+| Base URL | `https://YOUR_AGENT_SERVER/v1` |
+| API key | Your agent-server session API key |
+| Model | `openhands_<profile_name>` |
+
+For example, a saved LLM profile named `gateway_demo` is exposed to OpenAI clients as the model ID `openhands_gateway_demo`.
 
 The gateway accepts the same session key in either OpenHands or OpenAI-compatible form:
 
 - `X-Session-API-Key: <key>`
 - `Authorization: Bearer <key>`
 
-<Note>
-The current gateway supports non-streaming Chat Completions requests. Requests with `stream: true` return a `400` response until streaming support is added.
-</Note>
+## Prepare a Profile
+
+OpenAI-compatible traffic is backed by an agent-server LLM profile. Create one with the native profile API first:
+
+```bash
+export AGENT_SERVER_URL="http://localhost:8000"
+export SESSION_API_KEY="your-session-api-key"
+export PROFILE_NAME="gateway_demo"
+export OPENHANDS_MODEL="openhands_${PROFILE_NAME}"
+
+curl -X POST "$AGENT_SERVER_URL/api/profiles/$PROFILE_NAME" \
+  -H "X-Session-API-Key: $SESSION_API_KEY" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "llm": {
+      "model": "gpt-5-nano",
+      "api_key": "YOUR_LLM_API_KEY"
+    },
+    "include_secrets": true
+  }'
+```
+
+Then confirm the profile is visible to OpenAI clients:
+
+```bash
+curl "$AGENT_SERVER_URL/v1/models" \
+  -H "Authorization: Bearer $SESSION_API_KEY"
+```
+
+## Client Recipes
+
+<Tabs>
+<Tab title="curl">
+
+```bash
+curl -i "$AGENT_SERVER_URL/v1/chat/completions" \
+  -H "Authorization: Bearer $SESSION_API_KEY" \
+  -H "Content-Type: application/json" \
+  -d "{
+    \"model\": \"$OPENHANDS_MODEL\",
+    \"messages\": [
+      {
+        \"role\": \"system\",
+        \"content\": \"Answer directly unless you need to inspect files.\"
+      },
+      {
+        \"role\": \"user\",
+        \"content\": \"Explain what this OpenHands endpoint does in one sentence.\"
+      }
+    ]
+  }"
+```
+
+The response includes an `X-OpenHands-ServerConversation-ID` header. Save its value if you want a later request to continue the same agent conversation.
+
+</Tab>
+<Tab title="Python SDK">
+
+```python
+import os
+
+from openai import OpenAI
+
+client = OpenAI(
+    api_key=os.environ["SESSION_API_KEY"],
+    base_url=f"{os.environ['AGENT_SERVER_URL']}/v1",
+)
+
+response = client.chat.completions.with_raw_response.create(
+    model=os.environ["OPENHANDS_MODEL"],
+    messages=[
+        {"role": "user", "content": "Summarize this repository."},
+    ],
+)
+completion = response.parse()
+conversation_id = response.headers["X-OpenHands-ServerConversation-ID"]
+print(completion.choices[0].message.content)
+
+follow_up = client.chat.completions.create(
+    model=os.environ["OPENHANDS_MODEL"],
+    messages=[{"role": "user", "content": "Now list the main packages."}],
+    extra_headers={"X-OpenHands-ServerConversation-ID": conversation_id},
+)
+print(follow_up.choices[0].message.content)
+```
+
+</Tab>
+<Tab title="JavaScript SDK">
+
+```javascript
+import OpenAI from "openai";
+
+const client = new OpenAI({
+  apiKey: process.env.SESSION_API_KEY,
+  baseURL: `${process.env.AGENT_SERVER_URL}/v1`,
+});
+
+const first = await client.chat.completions
+  .create({
+    model: process.env.OPENHANDS_MODEL,
+    messages: [
+      { role: "user", content: "Summarize this repository." },
+    ],
+  })
+  .withResponse();
+
+const conversationId = first.response.headers.get(
+  "x-openhands-serverconversation-id",
+);
+console.log(first.data.choices[0].message.content);
+
+const followUp = await client.chat.completions.create(
+  {
+    model: process.env.OPENHANDS_MODEL,
+    messages: [{ role: "user", content: "Now list the main packages." }],
+  },
+  {
+    headers: { "X-OpenHands-ServerConversation-ID": conversationId },
+  },
+);
+console.log(followUp.choices[0].message.content);
+```
+
+</Tab>
+<Tab title="Chat UIs">
+
+For Open WebUI, LibreChat, Chatbot UI, and similar OpenAI-compatible frontends, configure a custom OpenAI provider with:
+
+- **Base URL**: `https://YOUR_AGENT_SERVER/v1`
+- **API key**: your agent-server session API key
+- **Model**: `openhands_<profile_name>`
+- **Streaming**: disabled for now
+
+If the UI can store a response header and send a custom request header, persist `X-OpenHands-ServerConversation-ID` per chat thread and send it on follow-up turns. If it cannot, each request starts a new OpenHands conversation, so the integration is best suited to one-shot tasks.
+
+</Tab>
+<Tab title="Voice or Webhook">
+
+Voice platforms and webhook integrations usually have their own session or call ID. Store a mapping from that external ID to the OpenHands conversation ID:
+
+```python
+conversation_id = conversation_ids.get(platform_session_id)
+headers = {}
+if conversation_id:
+    headers["X-OpenHands-ServerConversation-ID"] = conversation_id
+
+response = client.chat.completions.with_raw_response.create(
+    model="openhands_gateway_demo",
+    messages=[{"role": "user", "content": transcript_text}],
+    extra_headers=headers,
+)
+
+conversation_ids[platform_session_id] = response.headers[
+    "X-OpenHands-ServerConversation-ID"
+]
+reply_text = response.parse().choices[0].message.content
+```
+
+Return `reply_text` to the voice or webhook platform. Keep the mapping for as long as that external session should continue.
+
+</Tab>
+</Tabs>
+
+## Conversation State
+
+The OpenAI Chat Completions protocol usually sends full message history on every request. The OpenHands gateway does not reconstruct agent history from prior assistant messages. Instead:
+
+- Omit `X-OpenHands-ServerConversation-ID` to start a new OpenHands conversation.
+- Read `X-OpenHands-ServerConversation-ID` from the response.
+- Send that header on follow-up requests to continue the same OpenHands conversation.
+
+When reusing a conversation, send only the newest user turn in `messages`, as the follow-up requests in the client recipes do. The server-side OpenHands conversation owns the previous agent state, tool activity, and workspace context.
+
+## Current Limitations
+
+- Only non-streaming Chat Completions requests are supported. Requests with `stream: true` return a `400` response until streaming support is added.
+- The response contains the final assistant text only. Internal OpenHands tool activity is not exposed as OpenAI tool calls.
+- OpenAI request fields that the gateway does not need are intentionally either ignored or rejected by the server, so do not assume that they take effect.
 
 ## Ready-to-run example
 
diff --git a/sdk/guides/agent-server/overview.mdx b/sdk/guides/agent-server/overview.mdx
index 3d21f548b..915c75747 100644
--- a/sdk/guides/agent-server/overview.mdx
+++ b/sdk/guides/agent-server/overview.mdx
@@ -43,6 +43,7 @@ A Remote Agent Server is an HTTP/WebSocket server that:
 - **Manages workspaces** (Docker containers or remote sandboxes)
 - **Streams events** to clients via WebSocket
 - **Handles command and file operations** (execute command, upload, download), check [base class](https://github.com/OpenHands/software-agent-sdk/blob/main/openhands-sdk/openhands/sdk/workspace/base.py) for more details
+- **Accepts OpenAI-compatible Chat Completions requests** through the [OpenAI-compatible endpoint](/sdk/guides/agent-server/openai-gateway)
 - **Provides isolation** between different agent executions
 
 Think of it as the "backend" for your agent, while your Python code acts as the "frontend" client.
@@ -159,6 +160,7 @@ Explore different deployment options:
 - **[Local Agent Server](/sdk/guides/agent-server/local-server)** - Run agent server in the same process
 - **[Docker Sandboxed Server](/sdk/guides/agent-server/docker-sandbox)** - Run agent server in isolated Docker containers
 - **[API Sandboxed Server](/sdk/guides/agent-server/api-sandbox)** - Connect to hosted agent server via API
+- **[OpenAI-Compatible Endpoint](/sdk/guides/agent-server/openai-gateway)** - Access an OpenHands agent from OpenAI-compatible clients
 
 For architectural details:
 - **[Agent Server Package Architecture](/sdk/arch/agent-server)** - Remote execution architecture and deployment
diff --git a/sdk/index.mdx b/sdk/index.mdx
index 07ed9e81d..c16863acd 100644
--- a/sdk/index.mdx
+++ b/sdk/index.mdx
@@ -12,6 +12,7 @@ You can use the OpenHands Software Agent SDK for:
 - One-off tasks, like building a README for your repo
 - Routine maintenance tasks, like updating dependencies
 - Major tasks that involve multiple agents, like refactors and rewrites
+- OpenAI-compatible access to an OpenHands agent from chat UIs, IDEs, voice platforms, and other clients
 
 You can even use the SDK to build new developer experiences—it’s the engine behind the [OpenHands CLI](/openhands/usage/cli/quick-start) and [OpenHands Cloud](/openhands/usage/cloud/openhands-cloud).
 
@@ -19,7 +20,7 @@ Get started with some examples or keep reading to learn more.
 
 ## Features
 
-<Columns cols={3}>
+<Columns cols={4}>
   <Card title="Single Python API" icon="python">
     A unified Python API that enables you to run agents locally or in the cloud, define custom agent behaviors, and create custom tools.
   </Card>
@@ -29,6 +30,13 @@ Get started with some examples or keep reading to learn more.
   <Card title="REST-based Agent Server" icon="server">
     A production-ready server that runs agents anywhere, including Docker and Kubernetes, while connecting seamlessly to the Python API.
   </Card>
+  <Card
+    title="OpenAI-Compatible Endpoint"
+    icon="plug"
+    href="/sdk/guides/agent-server/openai-gateway"
+  >
+    Access the OpenHands agent via an OpenAI-compatible endpoint for chat UIs, IDEs, voice platforms, and other OpenAI-style clients.
+  </Card>
 </Columns>
 
 ## Why OpenHands Software Agent SDK?