From e76293e9832744e8d15c4fb85b96e72c5f3c44f0 Mon Sep 17 00:00:00 2001
From: enyst
Date: Tue, 9 Jun 2026 15:08:51 +0000
Subject: [PATCH] docs: expand OpenAI-compatible endpoint guide
Co-authored-by: openhands
---
 sdk/arch/overview.mdx                      |   1 +
 sdk/guides/agent-server/openai-gateway.mdx | 201 +++++++++++++++++++--
 sdk/guides/agent-server/overview.mdx       |   2 +
 sdk/index.mdx                              |  10 +-
 4 files changed, 202 insertions(+), 12 deletions(-)
diff --git a/sdk/arch/overview.mdx b/sdk/arch/overview.mdx
index 96de5f67c..5ef666f22 100644
--- a/sdk/arch/overview.mdx
+++ b/sdk/arch/overview.mdx
@@ -196,6 +196,7 @@ For full list of implemented workspaces, see the [source code](https://github.co
 **Features:**
 - REST API & WebSocket endpoints for conversations, bash, files, events, desktop, and VSCode
+- [OpenAI-compatible `/v1/chat/completions` endpoint](/sdk/guides/agent-server/openai-gateway) for clients that expect an OpenAI-style backend
 - Service management with isolated per-user sessions
 - API key authentication and health checking
diff --git a/sdk/guides/agent-server/openai-gateway.mdx b/sdk/guides/agent-server/openai-gateway.mdx
index f40a7a603..37ae59e8f 100644
--- a/sdk/guides/agent-server/openai-gateway.mdx
+++ b/sdk/guides/agent-server/openai-gateway.mdx
@@ -1,5 +1,5 @@
 ---
-title: OpenAI-Compatible Gateway
+title: OpenAI-Compatible Endpoint
 description: Call an OpenHands agent-server through the OpenAI Chat Completions protocol.
 ---
@@ -7,24 +7,203 @@ import RunExampleCode from "/sdk/shared-snippets/how-to-run-example.mdx";
 The agent-server exposes an OpenAI-compatible `/v1/chat/completions` endpoint so clients that already speak the OpenAI protocol can call an OpenHands agent.
-Use this when you want an existing chat UI, IDE integration, evaluation harness, or another agent to treat OpenHands as an OpenAI-style backend while still getting the full agent runtime behind the request.
+Use this when you want an existing chat UI, IDE integration, evaluation harness, voice platform, or another agent to treat OpenHands as an OpenAI-style backend while still getting the full agent runtime behind the request.
-## How it works
+## What to Configure
-1. Save an LLM profile through the agent-server profile API.
-2. List available gateway models with `GET /v1/models`.
-3. Call `POST /v1/chat/completions` with a model ID shaped like `openhands_`.
-4. Read `X-OpenHands-ServerConversation-ID` from the response.
-5. Pass that header back on later requests to continue the same OpenHands conversation.
+Most OpenAI-compatible clients ask for the same three fields:
+
+| Client Field | Value |
+| --- | --- |
+| Base URL | `https://YOUR_AGENT_SERVER/v1` |
+| API key | Your agent-server session API key |
+| Model | `openhands_` |
+
+For example, a saved LLM profile named `gateway_demo` appears as the OpenAI model `openhands_gateway_demo`.
 The gateway accepts the same session key in either OpenHands or OpenAI-compatible form:
 - `X-Session-API-Key: `
 - `Authorization: Bearer `
-
-The current gateway supports non-streaming Chat Completions requests. Requests with `stream: true` return a `400` response until streaming support is added.
-
+## Prepare a Profile
+
+OpenAI-compatible traffic is backed by an agent-server LLM profile.
Create one with the native profile API first:
+
+```bash
+export AGENT_SERVER_URL="http://localhost:8000"
+export SESSION_API_KEY="your-session-api-key"
+export PROFILE_NAME="gateway_demo"
+export OPENHANDS_MODEL="openhands_${PROFILE_NAME}"
+
+curl -X POST "$AGENT_SERVER_URL/api/profiles/$PROFILE_NAME" \
+  -H "X-Session-API-Key: $SESSION_API_KEY" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "llm": {
+      "model": "gpt-5-nano",
+      "api_key": "YOUR_LLM_API_KEY"
+    },
+    "include_secrets": true
+  }'
+```
+
+Then confirm the profile is visible to OpenAI clients:
+
+```bash
+curl "$AGENT_SERVER_URL/v1/models" \
+  -H "Authorization: Bearer $SESSION_API_KEY"
+```
+
+## Client Recipes
+
+
+
+```bash
+curl -i "$AGENT_SERVER_URL/v1/chat/completions" \
+  -H "Authorization: Bearer $SESSION_API_KEY" \
+  -H "Content-Type: application/json" \
+  -d "{
+    \"model\": \"$OPENHANDS_MODEL\",
+    \"messages\": [
+      {
+        \"role\": \"system\",
+        \"content\": \"Answer directly unless you need to inspect files.\"
+      },
+      {
+        \"role\": \"user\",
+        \"content\": \"Explain what this OpenHands endpoint does in one sentence.\"
+      }
+    ]
+  }"
+```
+
+The response includes `X-OpenHands-ServerConversation-ID`. Save that header if you want a later request to continue the same agent conversation.
+
+
+
+```python
+import os
+
+from openai import OpenAI
+
+client = OpenAI(
+    api_key=os.environ["SESSION_API_KEY"],
+    base_url=f"{os.environ['AGENT_SERVER_URL']}/v1",
+)
+
+response = client.chat.completions.with_raw_response.create(
+    model=os.environ["OPENHANDS_MODEL"],
+    messages=[
+        {"role": "user", "content": "Summarize this repository."},
+    ],
+)
+completion = response.parse()
+conversation_id = response.headers["X-OpenHands-ServerConversation-ID"]
+print(completion.choices[0].message.content)
+
+follow_up = client.chat.completions.create(
+    model=os.environ["OPENHANDS_MODEL"],
+    messages=[{"role": "user", "content": "Now list the main packages."}],
+    extra_headers={"X-OpenHands-ServerConversation-ID": conversation_id},
+)
+print(follow_up.choices[0].message.content)
+```
+
+
+
+```javascript
+import OpenAI from "openai";
+
+const client = new OpenAI({
+  apiKey: process.env.SESSION_API_KEY,
+  baseURL: `${process.env.AGENT_SERVER_URL}/v1`,
+});
+
+const first = await client.chat.completions
+  .create({
+    model: process.env.OPENHANDS_MODEL,
+    messages: [
+      { role: "user", content: "Summarize this repository." },
+    ],
+  })
+  .withResponse();
+
+const conversationId = first.response.headers.get(
+  "x-openhands-serverconversation-id",
+);
+console.log(first.data.choices[0].message.content);
+
+const followUp = await client.chat.completions.create(
+  {
+    model: process.env.OPENHANDS_MODEL,
+    messages: [{ role: "user", content: "Now list the main packages."
}],
+  },
+  {
+    headers: { "X-OpenHands-ServerConversation-ID": conversationId },
+  },
+);
+console.log(followUp.choices[0].message.content);
+```
+
+
+
+For Open WebUI, LibreChat, Chatbot UI, and similar OpenAI-compatible frontends, configure a custom OpenAI provider with:
+
+- **Base URL**: `https://YOUR_AGENT_SERVER/v1`
+- **API key**: your agent-server session API key
+- **Model**: `openhands_`
+- **Streaming**: disabled for now
+
+If the UI can store a response header and send a custom request header, persist `X-OpenHands-ServerConversation-ID` per chat thread and send it on follow-up turns. If it cannot, each request starts a new OpenHands conversation and works best for one-shot tasks.
+
+
+
+Voice platforms and webhook integrations usually have their own session or call ID. Store a mapping from that external ID to the OpenHands conversation ID:
+
+```python
+conversation_id = conversation_ids.get(platform_session_id)
+headers = {}
+if conversation_id:
+    headers["X-OpenHands-ServerConversation-ID"] = conversation_id
+
+response = client.chat.completions.with_raw_response.create(
+    model="openhands_gateway_demo",
+    messages=[{"role": "user", "content": transcript_text}],
+    extra_headers=headers,
+)
+
+conversation_ids[platform_session_id] = response.headers[
+    "X-OpenHands-ServerConversation-ID"
+]
+reply_text = response.parse().choices[0].message.content
+```
+
+Return `reply_text` to the voice or webhook platform. Keep the mapping for as long as that external session should continue.
+
+
+
+## Conversation State
+
+The OpenAI Chat Completions protocol usually sends full message history on every request. The OpenHands gateway does not reconstruct agent history from prior assistant messages. Instead:
+
+- Omit `X-OpenHands-ServerConversation-ID` to start a new OpenHands conversation.
+- Read `X-OpenHands-ServerConversation-ID` from the response.
+- Send that header on follow-up requests to continue the same OpenHands conversation.
+
+When reusing a conversation, send the newest user turn in `messages`. The server-side OpenHands conversation owns the previous agent state, tool activity, and workspace context.
+
+## Current Limitations
+
+- Only non-streaming Chat Completions requests are supported. Requests with `stream: true` return `400` until streaming support is added.
+- The response contains the final assistant text only. Internal OpenHands tool activity is not exposed as OpenAI tool calls.
+- OpenAI request fields that are not needed by the gateway are ignored or rejected intentionally by the server implementation.
 ## Ready-to-run example
diff --git a/sdk/guides/agent-server/overview.mdx b/sdk/guides/agent-server/overview.mdx
index 3d21f548b..915c75747 100644
--- a/sdk/guides/agent-server/overview.mdx
+++ b/sdk/guides/agent-server/overview.mdx
@@ -43,6 +43,7 @@ A Remote Agent Server is an HTTP/WebSocket server that:
 - **Manages workspaces** (Docker containers or remote sandboxes)
 - **Streams events** to clients via WebSocket
 - **Handles command and file operations** (execute command, upload, download), check [base class](https://github.com/OpenHands/software-agent-sdk/blob/main/openhands-sdk/openhands/sdk/workspace/base.py) for more details
+- **Accepts OpenAI-compatible Chat Completions requests** through the [OpenAI-compatible endpoint](/sdk/guides/agent-server/openai-gateway)
 - **Provides isolation** between different agent executions
 Think of it as the "backend" for your agent, while your Python code acts as the "frontend" client.
@@ -159,6 +160,7 @@ Explore different deployment options:
 - **[Local Agent Server](/sdk/guides/agent-server/local-server)** - Run agent server in the same process
 - **[Docker Sandboxed Server](/sdk/guides/agent-server/docker-sandbox)** - Run agent server in isolated Docker containers
 - **[API Sandboxed Server](/sdk/guides/agent-server/api-sandbox)** - Connect to hosted agent server via API
+- **[OpenAI-Compatible Endpoint](/sdk/guides/agent-server/openai-gateway)** - Access an OpenHands agent from OpenAI-compatible clients
 For architectural details:
 - **[Agent Server Package Architecture](/sdk/arch/agent-server)** - Remote execution architecture and deployment
diff --git a/sdk/index.mdx b/sdk/index.mdx
index 07ed9e81d..c16863acd 100644
--- a/sdk/index.mdx
+++ b/sdk/index.mdx
@@ -12,6 +12,7 @@ You can use the OpenHands Software Agent SDK for:
 - One-off tasks, like building a README for your repo
 - Routine maintenance tasks, like updating dependencies
 - Major tasks that involve multiple agents, like refactors and rewrites
+- OpenAI-compatible access to an OpenHands agent from chat UIs, IDEs, voice platforms, and other clients
 You can even use the SDK to build new developer experiences—it’s the engine behind the [OpenHands CLI](/openhands/usage/cli/quick-start) and [OpenHands Cloud](/openhands/usage/cloud/openhands-cloud).
@@ -19,7 +20,7 @@ Get started with some examples or keep reading to learn more.
 ## Features
-
+
 A unified Python API that enables you to run agents locally or in the cloud, define custom agent behaviors, and create custom tools.
@@ -29,6 +30,13 @@ Get started with some examples or keep reading to learn more.
 A production-ready server that runs agents anywhere, including Docker and Kubernetes, while connecting seamlessly to the Python API.
+
+  Access the OpenHands agent via an OpenAI-compatible endpoint for chat UIs, IDEs, voice platforms, and other OpenAI-style clients.
+
 ## Why OpenHands Software Agent SDK?