redcell serve exposes an OpenAI-compatible HTTP API (redcell/server.py,
built with FastAPI). Any OpenAI client/SDK or scanner can drive the agent through
the standard chat-completions contract — tools, reasoning channel, and all.
Base URL: http://<host>:<port>/v1 (default http://127.0.0.1:8800/v1).
Lists the single advertised model (id = AGENT_MODEL_ID, default redcell).
{ "object": "list",
"data": [ { "id": "redcell", "object": "model", "created": 0, "owned_by": "redcell" } ] }Runs one agent turn. Accepts the OpenAI request shape:
{
"model": "redcell",
"messages": [{ "role": "user", "content": "hello" }],
"stream": false
}Non-streaming response (a chat.completion):
{
"id": "chatcmpl-…",
"object": "chat.completion",
"created": 1234567890,
"model": "redcell",
"choices": [{
"index": 0,
"message": { "role": "assistant", "content": "…", "reasoning_content": "…" },
"finish_reason": "stop"
}],
"usage": { … }
}reasoning_contentis present only when the model exposed a reasoning channel.- Errors during the run are returned as HTTP 500 with an OpenAI-style error body
(
{"error": {"message": …, "type": "agent_error"}}) rather than crashing.
Set "stream": true to receive text/event-stream chat.completion.chunk events
ending with data: [DONE]. Note redcell runs the full tool-use loop to completion
first, then chunks the final text out — so streaming is a UI nicety, not
token-by-token generation. Order of deltas: role → reasoning (if any) → content → stop.
If AGENT_SERVER_API_KEY is set, every request must carry
Authorization: Bearer <key> (401 otherwise). If unset, the API is open. Example:
curl http://127.0.0.1:8800/v1/chat/completions \
-H "Authorization: Bearer $AGENT_SERVER_API_KEY" \
-H "content-type: application/json" \
-d '{"model":"redcell","messages":[{"role":"user","content":"hi"}]}'- Open WebUI (Docker): Base URL
http://host.docker.internal:8800/v1, any API key. - Garak / Promptfoo: target
http://127.0.0.1:8800/v1/chat/completions, modelredcell. - OpenAI SDK: point
base_urlat.../v1,model="redcell".
By default the server is stateless: it builds a fresh Agent per request and runs
only the messages in that request. The client owns conversation history and resends
it each turn (promptfoo: "No — resend the full interaction history"). Multi-turn just
works with no extra setup.
To run redcell as a stateful target — the client sends only the new turn plus a session id and redcell remembers the rest — attach a session id to each request.
server.py looks for an id in this order:
- The header named by
AGENT_SESSION_HEADER(defaultx-redcell-session). - A body field
session_id. - A body field
sessionId.
If one is found (and the server has a SessionStore), the request uses the stateful
path: redcell loads that session's history, appends the incoming turn(s), runs the
loop, and persists the full expansion (assistant text, tool-call turns, tool
results). The id is echoed back in the response header. No id → unchanged stateless
behavior.
SessionStore (redcell/sessions.py) is an in-memory map of id → history with:
- idle TTL eviction (
AGENT_SESSION_TTL_SECONDS, default 1h), and - LRU eviction past
AGENT_SESSION_MAX(default 1000) concurrent sessions.
It is lost on restart — adequate for a local red-team target. It assumes a single conversation is driven sequentially (true for promptfoo multi-turn strategies); overlapping concurrent requests on the same id would race on that session's memory.
Wizard answers: Remembers history → Yes, Session management → Client-generated Session ID, Session ID Extraction → (leave empty). The generated target wires a fresh UUID per test case into each request:
targets:
- id: openai:chat:redcell
config:
apiBaseUrl: http://127.0.0.1:8800/v1
apiKey: redcell # any value unless AGENT_SERVER_API_KEY is set
headers:
x-redcell-session: '{{sessionId}}'
defaultTest:
options:
transformVars: '{ ...vars, sessionId: context.uuid }'SID=demo-1
curl -s localhost:8800/v1/chat/completions -H "x-redcell-session: $SID" \
-d '{"model":"redcell","messages":[{"role":"user","content":"My name is Travis."}]}'
curl -s localhost:8800/v1/chat/completions -H "x-redcell-session: $SID" \
-d '{"model":"redcell","messages":[{"role":"user","content":"What is my name?"}]}' # recalls it
curl -s localhost:8800/v1/chat/completions \
-d '{"model":"redcell","messages":[{"role":"user","content":"What is my name?"}]}' # no id → stateless, does not recallCaveat: very long sessions accumulate unbounded history and can exceed the model's context window. History trimming/summarization is not yet implemented.