1 change: 1 addition & 0 deletions sdk/arch/overview.mdx
@@ -196,6 +196,7 @@ For full list of implemented workspaces, see the [source code](https://github.co

**Features:**
- REST API & WebSocket endpoints for conversations, bash, files, events, desktop, and VSCode
- [OpenAI-compatible `/v1/chat/completions` endpoint](/sdk/guides/agent-server/openai-gateway) for clients that expect an OpenAI-style backend
- Service management with isolated per-user sessions
- API key authentication and health checking

201 changes: 190 additions & 11 deletions sdk/guides/agent-server/openai-gateway.mdx
@@ -1,30 +1,209 @@
---
title: OpenAI-Compatible Gateway
title: OpenAI-Compatible Endpoint
description: Call an OpenHands agent-server through the OpenAI Chat Completions protocol.
---

import RunExampleCode from "/sdk/shared-snippets/how-to-run-example.mdx";

The agent-server exposes an OpenAI-compatible `/v1/chat/completions` endpoint so clients that already speak the OpenAI protocol can call an OpenHands agent.

Use this when you want an existing chat UI, IDE integration, evaluation harness, or another agent to treat OpenHands as an OpenAI-style backend while still getting the full agent runtime behind the request.
Use this when you want an existing chat UI, IDE integration, evaluation harness, voice platform, or another agent to treat OpenHands as an OpenAI-style backend while still getting the full agent runtime behind the request.

## How it works
## What to Configure

1. Save an LLM profile through the agent-server profile API.
2. List available gateway models with `GET /v1/models`.
3. Call `POST /v1/chat/completions` with a model ID shaped like `openhands_<profile_name>`.
4. Read `X-OpenHands-ServerConversation-ID` from the response.
5. Pass that header back on later requests to continue the same OpenHands conversation.
Most OpenAI-compatible clients ask for the same three fields:

| Client Field | Value |
| --- | --- |
| Base URL | `https://YOUR_AGENT_SERVER/v1` |
| API key | Your agent-server session API key |
| Model | `openhands_<profile_name>` |

For example, a saved LLM profile named `gateway_demo` appears as the OpenAI model `openhands_gateway_demo`.
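
The naming rule above can be sketched as a pair of small helpers. This is only an illustration: the function names `gateway_model_id` and `profile_name_from_model_id` are hypothetical and are not part of the agent-server or the SDK; only the `openhands_` prefix comes from the text above.

```python
# Hypothetical helpers; only the "openhands_" prefix comes from this guide.
GATEWAY_MODEL_PREFIX = "openhands_"


def gateway_model_id(profile_name: str) -> str:
    """Return the OpenAI-style model ID for a saved LLM profile name."""
    return f"{GATEWAY_MODEL_PREFIX}{profile_name}"


def profile_name_from_model_id(model_id: str) -> str:
    """Recover the profile name from a gateway model ID."""
    if not model_id.startswith(GATEWAY_MODEL_PREFIX):
        raise ValueError(f"not a gateway model ID: {model_id!r}")
    return model_id[len(GATEWAY_MODEL_PREFIX):]


print(gateway_model_id("gateway_demo"))  # prints openhands_gateway_demo
```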

The gateway accepts the same session key in either OpenHands or OpenAI-compatible form:

- `X-Session-API-Key: <key>`
- `Authorization: Bearer <key>`
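
As a rough sketch of the two header forms listed above, a client could build either one from the same key. The helper name `auth_headers` and its `style` parameter are assumptions made for illustration; only the two header names are taken from this guide.

```python
from typing import Dict


def auth_headers(session_api_key: str, style: str = "openai") -> Dict[str, str]:
    """Build the auth header in OpenAI-compatible or OpenHands form.

    `style` is a hypothetical switch for this example, not a server option.
    """
    if style == "openai":
        return {"Authorization": f"Bearer {session_api_key}"}
    if style == "openhands":
        return {"X-Session-API-Key": session_api_key}
    raise ValueError(f"unknown header style: {style!r}")
```

Most OpenAI client libraries send the `Authorization: Bearer` form automatically when given an API key, so the OpenHands form is mainly useful for code that already talks to the native agent-server API.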

<Note>
The current gateway supports non-streaming Chat Completions requests. Requests with `stream: true` return a `400` response until streaming support is added.
</Note>

## Prepare a Profile

OpenAI-compatible traffic is backed by an agent-server LLM profile. Create one with the native profile API first:

```bash
export AGENT_SERVER_URL="http://localhost:8000"
export SESSION_API_KEY="your-session-api-key"
export PROFILE_NAME="gateway_demo"
export OPENHANDS_MODEL="openhands_${PROFILE_NAME}"

curl -X POST "$AGENT_SERVER_URL/api/profiles/$PROFILE_NAME" \
  -H "X-Session-API-Key: $SESSION_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "llm": {
      "model": "gpt-5-nano",
      "api_key": "YOUR_LLM_API_KEY"
    },
    "include_secrets": true
  }'
```

Then confirm the profile is visible to OpenAI clients:

```bash
curl "$AGENT_SERVER_URL/v1/models" \
  -H "Authorization: Bearer $SESSION_API_KEY"
```
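
To check for gateway models programmatically, one option is to filter the returned IDs by the `openhands_` prefix. The sketch below assumes the response body follows the usual OpenAI list shape (`{"object": "list", "data": [{"id": ...}]}`); this guide does not show the actual payload, so treat that shape, the helper name `gateway_model_ids`, and the sample body as assumptions.

```python
import json
from typing import List


def gateway_model_ids(models_response_body: str) -> List[str]:
    """Extract gateway model IDs from a /v1/models response body.

    Assumes an OpenAI-style list payload; adjust if the server differs.
    """
    payload = json.loads(models_response_body)
    return [
        model["id"]
        for model in payload.get("data", [])
        if str(model.get("id", "")).startswith("openhands_")
    ]


# Illustrative body only, not captured from a real server.
sample_body = json.dumps(
    {"object": "list", "data": [{"id": "openhands_gateway_demo", "object": "model"}]}
)
print(gateway_model_ids(sample_body))  # prints ['openhands_gateway_demo']
```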

## Client Recipes

<Tabs>
<Tab title="curl">

```bash
curl -i "$AGENT_SERVER_URL/v1/chat/completions" \
  -H "Authorization: Bearer $SESSION_API_KEY" \
  -H "Content-Type: application/json" \
  -d "{
    \"model\": \"$OPENHANDS_MODEL\",
    \"messages\": [
      {
        \"role\": \"system\",
        \"content\": \"Answer directly unless you need to inspect files.\"
      },
      {
        \"role\": \"user\",
        \"content\": \"Explain what this OpenHands endpoint does in one sentence.\"
      }
    ]
  }"
```

The response includes `X-OpenHands-ServerConversation-ID`. Save that header if you want a later request to continue the same agent conversation.

</Tab>
<Tab title="Python SDK">

```python
import os

from openai import OpenAI

client = OpenAI(
    api_key=os.environ["SESSION_API_KEY"],
    base_url=f"{os.environ['AGENT_SERVER_URL']}/v1",
)

response = client.chat.completions.with_raw_response.create(
    model=os.environ["OPENHANDS_MODEL"],
    messages=[
        {"role": "user", "content": "Summarize this repository."},
    ],
)
completion = response.parse()
conversation_id = response.headers["X-OpenHands-ServerConversation-ID"]
print(completion.choices[0].message.content)

follow_up = client.chat.completions.create(
    model=os.environ["OPENHANDS_MODEL"],
    messages=[{"role": "user", "content": "Now list the main packages."}],
    extra_headers={"X-OpenHands-ServerConversation-ID": conversation_id},
)
print(follow_up.choices[0].message.content)
```

</Tab>
<Tab title="JavaScript SDK">

```javascript
import OpenAI from "openai";

const client = new OpenAI({
  apiKey: process.env.SESSION_API_KEY,
  baseURL: `${process.env.AGENT_SERVER_URL}/v1`,
});

const first = await client.chat.completions
  .create({
    model: process.env.OPENHANDS_MODEL,
    messages: [
      { role: "user", content: "Summarize this repository." },
    ],
  })
  .withResponse();

const conversationId = first.response.headers.get(
  "x-openhands-serverconversation-id",
);
console.log(first.data.choices[0].message.content);

const followUp = await client.chat.completions.create(
  {
    model: process.env.OPENHANDS_MODEL,
    messages: [{ role: "user", content: "Now list the main packages." }],
  },
  {
    headers: { "X-OpenHands-ServerConversation-ID": conversationId },
  },
);
console.log(followUp.choices[0].message.content);
```

</Tab>
<Tab title="Chat UIs">

For Open WebUI, LibreChat, Chatbot UI, and similar OpenAI-compatible frontends, configure a custom OpenAI provider with:

- **Base URL**: `https://YOUR_AGENT_SERVER/v1`
- **API key**: your agent-server session API key
- **Model**: `openhands_<profile_name>`
- **Streaming**: disabled for now

If the UI can store a response header and send a custom request header, persist `X-OpenHands-ServerConversation-ID` per chat thread and send it on follow-up turns. If it cannot, each request starts a new OpenHands conversation and works best for one-shot tasks.

</Tab>
<Tab title="Voice or Webhook">

Voice platforms and webhook integrations usually have their own session or call ID. Store a mapping from that external ID to the OpenHands conversation ID:

```python
# `conversation_ids` maps the platform's session ID to an OpenHands conversation ID.
conversation_id = conversation_ids.get(platform_session_id)
headers = {}
if conversation_id:
    headers["X-OpenHands-ServerConversation-ID"] = conversation_id

response = client.chat.completions.with_raw_response.create(
    model="openhands_gateway_demo",
    messages=[{"role": "user", "content": transcript_text}],
    extra_headers=headers,
)

# Remember the ID so the next turn continues the same OpenHands conversation.
conversation_ids[platform_session_id] = response.headers[
    "X-OpenHands-ServerConversation-ID"
]
reply_text = response.parse().choices[0].message.content
```

Return `reply_text` to the voice or webhook platform. Keep the mapping for as long as that external session should continue.

</Tab>
</Tabs>

## Conversation State

The OpenAI Chat Completions protocol usually sends full message history on every request. The OpenHands gateway does not reconstruct agent history from prior assistant messages. Instead:

- Omit `X-OpenHands-ServerConversation-ID` to start a new OpenHands conversation.
- Read `X-OpenHands-ServerConversation-ID` from the response.
- Send that header on follow-up requests to continue the same OpenHands conversation.

When reusing a conversation, you only need to send the newest user turn in `messages`. The server-side OpenHands conversation owns the previous agent state, tool activity, and workspace context.
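
One way to keep this bookkeeping in one place is a small tracker object. The class below is a minimal sketch, not part of the SDK: `ConversationTracker` and its method names are hypothetical, and only the header name comes from this guide. It matches the header name case-insensitively because HTTP clients differ in how they report header casing.

```python
from typing import Dict, Mapping, Optional

CONVERSATION_HEADER = "X-OpenHands-ServerConversation-ID"


class ConversationTracker:
    """Hypothetical helper that remembers one OpenHands conversation ID."""

    def __init__(self) -> None:
        self.conversation_id: Optional[str] = None

    def request_headers(self) -> Dict[str, str]:
        """Headers for the next request; empty starts a new conversation."""
        if self.conversation_id is None:
            return {}
        return {CONVERSATION_HEADER: self.conversation_id}

    def record_response(self, response_headers: Mapping[str, str]) -> None:
        """Store the conversation ID from response headers, if present."""
        for name, value in response_headers.items():
            if name.lower() == CONVERSATION_HEADER.lower():
                self.conversation_id = value
                return
```

With the OpenAI Python SDK, `tracker.request_headers()` can be passed as `extra_headers`, and `tracker.record_response(response.headers)` can be called on a raw response.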

## Current Limitations

- Only non-streaming Chat Completions requests are supported. Requests with `stream: true` return `400` until streaming support is added.
- The response contains the final assistant text only. Internal OpenHands tool activity is not exposed as OpenAI tool calls.
- The gateway intentionally ignores or rejects OpenAI request fields that it does not need.
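
Given these limitations, a client may prefer to fail fast before sending a request the gateway would reject. The sketch below is an assumption-laden illustration: `build_chat_request` is a hypothetical helper, and it sends only `model` and `messages` because this guide does not list which other fields the server accepts.

```python
from typing import Any, Dict, List


def build_chat_request(
    model: str, messages: List[Dict[str, str]], stream: bool = False
) -> Dict[str, Any]:
    """Build a minimal Chat Completions payload for the gateway.

    Raises locally instead of waiting for the server's 400 on streaming.
    """
    if stream:
        raise ValueError("the gateway does not support stream=True yet")
    if not messages:
        raise ValueError("messages must contain at least one turn")
    # Only the two fields used throughout this guide are included.
    return {"model": model, "messages": messages}
```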

## Ready-to-run example

2 changes: 2 additions & 0 deletions sdk/guides/agent-server/overview.mdx
@@ -43,6 +43,7 @@ A Remote Agent Server is an HTTP/WebSocket server that:
- **Manages workspaces** (Docker containers or remote sandboxes)
- **Streams events** to clients via WebSocket
- **Handles command and file operations** (execute command, upload, download); see the [base class](https://github.com/OpenHands/software-agent-sdk/blob/main/openhands-sdk/openhands/sdk/workspace/base.py) for details
- **Accepts OpenAI-compatible Chat Completions requests** through the [OpenAI-compatible endpoint](/sdk/guides/agent-server/openai-gateway)
- **Provides isolation** between different agent executions

Think of it as the "backend" for your agent, while your Python code acts as the "frontend" client.
@@ -159,6 +160,7 @@ Explore different deployment options:
- **[Local Agent Server](/sdk/guides/agent-server/local-server)** - Run agent server in the same process
- **[Docker Sandboxed Server](/sdk/guides/agent-server/docker-sandbox)** - Run agent server in isolated Docker containers
- **[API Sandboxed Server](/sdk/guides/agent-server/api-sandbox)** - Connect to hosted agent server via API
- **[OpenAI-Compatible Endpoint](/sdk/guides/agent-server/openai-gateway)** - Access an OpenHands agent from OpenAI-compatible clients

For architectural details:
- **[Agent Server Package Architecture](/sdk/arch/agent-server)** - Remote execution architecture and deployment
10 changes: 9 additions & 1 deletion sdk/index.mdx
@@ -12,14 +12,15 @@ You can use the OpenHands Software Agent SDK for:
- One-off tasks, like building a README for your repo
- Routine maintenance tasks, like updating dependencies
- Major tasks that involve multiple agents, like refactors and rewrites
- OpenAI-compatible access to an OpenHands agent from chat UIs, IDEs, voice platforms, and other clients

You can even use the SDK to build new developer experiences—it’s the engine behind the [OpenHands CLI](/openhands/usage/cli/quick-start) and [OpenHands Cloud](/openhands/usage/cloud/openhands-cloud).

Get started with some examples or keep reading to learn more.

## Features

<Columns cols={3}>
<Columns cols={4}>
<Card title="Single Python API" icon="python">
A unified Python API that enables you to run agents locally or in the cloud, define custom agent behaviors, and create custom tools.
</Card>
@@ -29,6 +30,13 @@ Get started with some examples or keep reading to learn more.
<Card title="REST-based Agent Server" icon="server">
A production-ready server that runs agents anywhere, including Docker and Kubernetes, while connecting seamlessly to the Python API.
</Card>
<Card
title="OpenAI-Compatible Endpoint"
icon="plug"
href="/sdk/guides/agent-server/openai-gateway"
>
Access the OpenHands agent via an OpenAI-compatible endpoint for chat UIs, IDEs, voice platforms, and other OpenAI-style clients.
</Card>
</Columns>

## Why OpenHands Software Agent SDK?