Usage

Chat Completions

Standard OpenAI-compatible chat completions. Works with any OpenAI SDK, library, or tool that supports custom base URLs.

# cloud provider (free tier, auto-fallback on rate limit)
curl http://localhost:4000/chat/completions \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "mistral-large", "messages": [{"role": "user", "content": "explain mixture of experts"}]}'

# streaming (SSE)
curl http://localhost:4000/chat/completions \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "cerebras-qwen3-235b", "messages": [{"role": "user", "content": "write a haiku"}], "stream": true}'

Python (openai SDK)

import os

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:4000",
    api_key=os.environ["LITELLM_MASTER_KEY"],  # same key the curl examples use
)

# chat
resp = client.chat.completions.create(
    model="cerebras-qwen3-235b",
    messages=[{"role": "user", "content": "hello"}],
)
print(resp.choices[0].message.content)

# streaming
stream = client.chat.completions.create(
    model="cerebras-qwen3-235b",
    messages=[{"role": "user", "content": "count to 10"}],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)
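
For clients that cannot use the SDK, the streamed response can also be read by hand. The sketch below assumes the proxy emits the usual OpenAI-style server-sent events (`data: {json}` lines, terminated by `data: [DONE]`). `collect_stream_text` is a hypothetical helper name, not part of any library.

```python
import json


def collect_stream_text(lines):
    """Join the content deltas from OpenAI-style SSE lines into one string."""
    parts = []
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            # the first chunk usually carries only the role, so content may be absent
            parts.append(choice.get("delta", {}).get("content") or "")
    return "".join(parts)


sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
print(collect_stream_text(sample))  # prints "Hello"
```

In real use, `lines` would be the decoded lines of the HTTP response body from a request sent with `"stream": true`.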

Browser Automation

The browser cluster can be used directly via the REST API, or indirectly by letting an LLM invoke browser tools through MCP.

Direct REST API

# navigate to a page
curl -X POST http://localhost:4000/stealthy-auto-browse/ \
  -H "Authorization: Bearer $STEALTHY_AUTO_BROWSE_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"action": "goto", "url": "https://example.com"}'

# get all visible text
curl -X POST http://localhost:4000/stealthy-auto-browse/ \
  -H "Authorization: Bearer $STEALTHY_AUTO_BROWSE_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"action": "get_text"}'

# find all interactive elements with their coordinates
curl -X POST http://localhost:4000/stealthy-auto-browse/ \
  -H "Authorization: Bearer $STEALTHY_AUTO_BROWSE_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"action": "get_interactive_elements", "visible_only": true}'

# click at coordinates (OS-level, undetectable)
curl -X POST http://localhost:4000/stealthy-auto-browse/ \
  -H "Authorization: Bearer $STEALTHY_AUTO_BROWSE_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"action": "system_click", "x": 640, "y": 400}'

# type text (OS-level keyboard input)
curl -X POST http://localhost:4000/stealthy-auto-browse/ \
  -H "Authorization: Bearer $STEALTHY_AUTO_BROWSE_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"action": "system_type", "text": "hello world"}'

# screenshot — returns raw PNG (1920x1080 by default, always resize)
curl -H "Authorization: Bearer $STEALTHY_AUTO_BROWSE_AUTH_TOKEN" \
  "http://localhost:4000/stealthy-auto-browse/screenshot/browser?whLargest=512" -o screenshot.png
curl -H "Authorization: Bearer $STEALTHY_AUTO_BROWSE_AUTH_TOKEN" \
  "http://localhost:4000/stealthy-auto-browse/screenshot/browser?width=800" -o screenshot.png

# run a multi-step script atomically (all steps on the same replica, single request)
curl -X POST http://localhost:4000/stealthy-auto-browse/ \
  -H "Authorization: Bearer $STEALTHY_AUTO_BROWSE_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "action": "run_script",
    "steps": [
      {"action": "goto", "url": "https://duckduckgo.com"},
      {"action": "system_click", "x": 950, "y": 513},
      {"action": "system_type", "text": "what is groq?"},
      {"action": "send_key", "key": "enter"},
      {"action": "wait_for_element", "selector": "[data-testid='\''result'\'']", "timeout": 10},
      {"action": "get_text"}
    ]
  }'

Browser sessions are sticky via the INSTANCEID cookie. Use a persistent HTTP client to keep your session on the same replica across requests.

Python — search, screenshot, upload, summarize

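import os

import requests

session = requests.Session()  # sticky via INSTANCEID cookie
BASE = "http://localhost:4000"
# tokens are read from the same environment variables the curl examples use
SAB_AUTH = {"Authorization": f"Bearer {os.environ['STEALTHY_AUTO_BROWSE_AUTH_TOKEN']}"}
HYBRIDS3_UPLOADS_KEY = os.environ["HYBRIDS3_UPLOADS_KEY"]
LITELLM_MASTER_KEY = os.environ["LITELLM_MASTER_KEY"]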

def browser(action, **kwargs):
    r = session.post(f"{BASE}/stealthy-auto-browse/", headers=SAB_AUTH, json={"action": action, **kwargs})
    r.raise_for_status()
    return r.json()["data"]

# navigate and search
browser("goto", url="https://duckduckgo.com")
browser("system_click", x=950, y=513)
browser("system_type", text="what is groq?")
browser("send_key", key="enter")
browser("wait_for_element", selector="[data-testid='result']", timeout=10000)
text = browser("get_text")["text"]

# screenshot and upload
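# width=800 resizes the default 1920x1080 capture, as recommended for screenshots
screenshot = session.get(f"{BASE}/stealthy-auto-browse/screenshot/browser", params={"width": 800}, headers=SAB_AUTH).content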
requests.put(
    f"{BASE}/storage/uploads/search.png",
    headers={"Authorization": f"Bearer {HYBRIDS3_UPLOADS_KEY}", "Content-Type": "image/png"},
    data=screenshot,
)

# ask an LLM to summarize
r = requests.post(f"{BASE}/chat/completions",
    headers={"Authorization": f"Bearer {LITELLM_MASTER_KEY}", "Content-Type": "application/json"},
    json={"model": "cerebras-qwen3-235b", "messages": [
        {"role": "user", "content": f"Summarize these search results:\n\n{text[:8000]}"}
    ]})
print(r.json()["choices"][0]["message"]["content"])

Object Storage

hybrids3 — S3-compatible, public-read uploads bucket, bearer token auth, TTL-based expiry.

Basic CRUD

# upload (MIME type auto-detected from content)
curl -X PUT http://localhost:4000/storage/uploads/image.png \
  -H "Authorization: Bearer $HYBRIDS3_UPLOADS_KEY" \
  -H "Content-Type: image/png" \
  --data-binary @image.png

# download — public, no auth required
curl http://localhost:4000/storage/uploads/image.png -o image.png

# list files (supports ?prefix= and ?max-keys=)
curl "http://localhost:4000/storage/uploads?prefix=images/" \
  -H "Authorization: Bearer $HYBRIDS3_UPLOADS_KEY"

# delete
curl -X DELETE http://localhost:4000/storage/uploads/image.png \
  -H "Authorization: Bearer $HYBRIDS3_UPLOADS_KEY"

Presigned URLs

Generate a time-limited URL that anyone can download without auth credentials:

# generate (default 1 hour, max 7 days)
curl -X POST "http://localhost:4000/storage/presign/uploads/report.pdf?expires=86400" \
  -H "Authorization: Bearer $HYBRIDS3_UPLOADS_KEY"

# response for public bucket — plain URL (no expiry needed since bucket is public-read anyway)
{"url": "http://localhost:4000/storage/uploads/report.pdf", "expires": null}

# download via presigned URL — no auth header
curl "http://localhost:4000/storage/uploads/report.pdf"
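
The presign call above can be sketched in Python with only the standard library. `presign_request` and `parse_presign_response` are hypothetical helper names, and the response shape is assumed to match the JSON shown above. Keys containing characters that need URL escaping are not handled here.

```python
import json
import urllib.request

BASE = "http://localhost:4000"  # gateway address used throughout this page


def presign_request(bucket, key, token, expires=3600):
    """Build (but do not send) the POST request that asks for a presigned URL."""
    url = f"{BASE}/storage/presign/{bucket}/{key}?expires={expires}"
    return urllib.request.Request(
        url, method="POST", headers={"Authorization": f"Bearer {token}"}
    )


def parse_presign_response(body):
    """Return (url, expires) from a JSON body shaped like the example above."""
    data = json.loads(body)
    return data["url"], data["expires"]


req = presign_request("uploads", "report.pdf", "example-token", expires=86400)
print(req.get_method(), req.full_url)

# Sending it requires a running gateway:
#   with urllib.request.urlopen(req) as resp:
#       url, expires = parse_presign_response(resp.read())
```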

Nested paths

Object keys support / for directory-like organization:

curl -X PUT "http://localhost:4000/storage/uploads/projects/myapp/build.tar.gz" \
  -H "Authorization: Bearer $HYBRIDS3_UPLOADS_KEY" \
  --data-binary @build.tar.gz

# list only that project's files
curl "http://localhost:4000/storage/uploads?prefix=projects/myapp/" \
  -H "Authorization: Bearer $HYBRIDS3_UPLOADS_KEY"

boto3

import os

import boto3
from botocore.config import Config

s3 = boto3.client(
    "s3",
    endpoint_url="http://localhost:4000/storage",
    aws_access_key_id="uploads",              # bucket name (acts as public_key)
    aws_secret_access_key=os.environ["HYBRIDS3_UPLOADS_KEY"],  # same key the curl examples use
    region_name="us-east-1",
    config=Config(signature_version="s3v4"),
)

s3.upload_file("image.png", "uploads", "images/photo.png")
obj = s3.get_object(Bucket="uploads", Key="images/photo.png")
data = obj["Body"].read()

s3.list_objects_v2(Bucket="uploads", Prefix="images/")
s3.delete_object(Bucket="uploads", Key="images/photo.png")

# generate presigned URL
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "uploads", "Key": "images/photo.png"},
    ExpiresIn=3600,
)

Configure TTL and size limits in .env:

HYBRIDS3_UPLOADS_TTL=168h        # auto-delete after N time (default 7 days)
HYBRIDS3_UPLOADS_MAX_SIZE=100MB  # per-file size limit

Agentic coding — Claudebox + Pibox-zai

Two agentic services wrap a coding agent in a Docker container and expose it as an API. Each request runs the agent's full loop — read/write files, run shell commands, install packages, browse the web, use tools, all within an isolated workspace.

Claudebox — Claude Code, OAuth token or Anthropic API key. Models: claudebox-haiku, claudebox-sonnet, claudebox-opus.
Pibox-zai — pi-coding-agent pointed at z.ai for GLM models. Models: pibox-zai-glm-4.5-air, pibox-zai-glm-4.7, pibox-zai-glm-5.1. Adds /files/* CRUD plus optional Telegram + cron modes.

Both speak the Anthropic wire protocol and expose the same shape of API (sync + async /run, OpenAI-compatible /v1/chat/completions, MCP server).

Via LiteLLM chat completions

The simplest way — just use claudebox models in the standard chat API:

curl http://localhost:4000/chat/completions \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claudebox-sonnet",
    "messages": [{"role": "user", "content": "list all Python files in this workspace"}],
    "extra_headers": {"X-Claude-Workspace": "myproject"}
  }'

Via direct API

More control: structured output formats, session resumption, fire-and-forget, tool call history.

# basic run
curl -X POST http://localhost:4000/claudebox/run \
  -H "Authorization: Bearer $CLAUDEBOX_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "write a Go HTTP server", "workspace": "go-project"}'

# with structured JSON output
curl -X POST http://localhost:4000/claudebox/run \
  -H "Authorization: Bearer $CLAUDEBOX_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "extract the name and version from package.json",
    "workspace": "myproject",
    "jsonSchema": "{\"type\":\"object\",\"properties\":{\"name\":{\"type\":\"string\"},\"version\":{\"type\":\"string\"}},\"required\":[\"name\",\"version\"]}"
  }'

# with full tool call history
curl -X POST http://localhost:4000/claudebox/run \
  -H "Authorization: Bearer $CLAUDEBOX_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "build the project and run tests", "workspace": "myapp", "outputFormat": "json-verbose"}'

# check which workspaces are busy
curl http://localhost:4000/claudebox/status \
  -H "Authorization: Bearer $CLAUDEBOX_API_TOKEN"

# cancel a running task
curl -X POST "http://localhost:4000/claudebox/run/cancel?workspace=myapp" \
  -H "Authorization: Bearer $CLAUDEBOX_API_TOKEN"

File operations

# upload a file to a workspace
curl -X PUT http://localhost:4000/claudebox/files/myproject/data.csv \
  -H "Authorization: Bearer $CLAUDEBOX_API_TOKEN" \
  --data-binary @data.csv

# list files in a workspace
curl http://localhost:4000/claudebox/files/myproject \
  -H "Authorization: Bearer $CLAUDEBOX_API_TOKEN"

# download a file from a workspace
curl http://localhost:4000/claudebox/files/myproject/results.json \
  -H "Authorization: Bearer $CLAUDEBOX_API_TOKEN" \
  -o results.json

# delete a file
curl -X DELETE http://localhost:4000/claudebox/files/myproject/old.log \
  -H "Authorization: Bearer $CLAUDEBOX_API_TOKEN"

File + task workflow

# 1. upload input data
curl -X PUT http://localhost:4000/claudebox/files/analysis/sales.csv \
  -H "Authorization: Bearer $CLAUDEBOX_API_TOKEN" \
  --data-binary @sales.csv

# 2. run analysis (Claude reads the file, writes a report)
curl -X POST http://localhost:4000/claudebox/run \
  -H "Authorization: Bearer $CLAUDEBOX_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "analyze sales.csv, compute monthly totals and trends, write a report to report.md", "workspace": "analysis"}'

# 3. download the report
curl http://localhost:4000/claudebox/files/analysis/report.md \
  -H "Authorization: Bearer $CLAUDEBOX_API_TOKEN"
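
The same three steps can be expressed in Python. This is a minimal sketch using only the standard library; `claudebox_request` and `analysis_workflow` are hypothetical helper names, and the requests are built but not sent so the snippet runs without a gateway.

```python
import json
import urllib.request

BASE = "http://localhost:4000"


def claudebox_request(method, path, token, body=None, json_body=None):
    """Build (but do not send) a request against the /claudebox/ route."""
    headers = {"Authorization": f"Bearer {token}"}
    data = body
    if json_body is not None:
        data = json.dumps(json_body).encode("utf-8")
        headers["Content-Type"] = "application/json"
    return urllib.request.Request(
        f"{BASE}/claudebox/{path}", data=data, method=method, headers=headers
    )


def analysis_workflow(token, csv_bytes, workspace="analysis"):
    """Return the upload, run and download requests, in that order."""
    prompt = ("analyze sales.csv, compute monthly totals and trends, "
              "write a report to report.md")
    return [
        claudebox_request("PUT", f"files/{workspace}/sales.csv", token, body=csv_bytes),
        claudebox_request("POST", "run", token,
                          json_body={"prompt": prompt, "workspace": workspace}),
        claudebox_request("GET", f"files/{workspace}/report.md", token),
    ]


steps = analysis_workflow("example-token", b"month,total\n2024-01,100\n")
for req in steps:
    print(req.get_method(), req.full_url)
```

Each request would be passed to `urllib.request.urlopen` in turn. The run step has to finish before the download, since the report is only written once the agent completes.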

Always-active skills

Drop a SKILL.md file into a named subdirectory under .data/claudebox/config/.always-skills/ — it will be injected into the system prompt of every Claude invocation automatically. No restarts needed. Applies to API, MCP, chat, everything.

.data/claudebox/config/.always-skills/
└── coding-rules/
    └── SKILL.md   ← injected into every session

Example SKILL.md:

When writing Go code, always use slog for structured logging, never fmt.Println.
When writing Python, always use pathlib for file paths, never os.path.
Always write tests alongside implementations.

Skills stack — every SKILL.md found is appended in alphabetical order by directory name. Per-request appendSystemPrompt or X-Claude-Append-System-Prompt is appended after always-skills, so per-request instructions take precedence.
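
To make the stacking order concrete, here is an illustrative model of it. This is not Claudebox's actual implementation: `compose_system_prompt` is a hypothetical function, and the blank-line separator between parts is an assumption. It only mirrors the order described above: always-skills alphabetically by directory name, then the per-request text last.

```python
import tempfile
from pathlib import Path


def compose_system_prompt(skills_root, per_request_append=None):
    """Concatenate every <dir>/SKILL.md in alphabetical order by directory name,
    then the optional per-request text (illustrative model only)."""
    parts = []
    skill_dirs = [p for p in Path(skills_root).iterdir() if p.is_dir()]
    for skill_dir in sorted(skill_dirs, key=lambda p: p.name):
        skill_file = skill_dir / "SKILL.md"
        if skill_file.is_file():
            parts.append(skill_file.read_text(encoding="utf-8").strip())
    if per_request_append:
        parts.append(per_request_append.strip())  # last, so it can override
    return "\n\n".join(parts)


with tempfile.TemporaryDirectory() as root:
    for name, text in [("testing", "Always write tests."),
                       ("coding-rules", "Use pathlib.")]:
        (Path(root) / name).mkdir()
        (Path(root) / name / "SKILL.md").write_text(text, encoding="utf-8")
    demo_prompt = compose_system_prompt(root, "Reply in French.")

print(demo_prompt)
```

Here `coding-rules` sorts before `testing`, and the per-request instruction lands at the end.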

Image Generation

# image generation (cloud — HuggingFace FLUX)
curl http://localhost:4000/images/generations \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "hf-flux-schnell", "prompt": "cyberpunk city at night"}'

# image generation (local CUDA — sd-turbo, fast)
curl http://localhost:4000/images/generations \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "local-sdcpp-cuda-sd-turbo", "prompt": "cyberpunk city at night", "size": "512x512"}'

# image generation (local CPU — sd-turbo, slower)
curl http://localhost:4000/images/generations \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "local-sdcpp-cpu-sd-turbo", "prompt": "cyberpunk city at night", "size": "512x512"}'

Vision

Upload an image to storage (public URL), then pass it to a vision model:

# upload the image
curl -X PUT http://localhost:4000/storage/uploads/photo.jpg \
  -H "Authorization: Bearer $HYBRIDS3_UPLOADS_KEY" \
  -H "Content-Type: image/jpeg" \
  --data-binary @photo.jpg

# public URL — no auth needed to read from uploads bucket
# http://localhost:4000/storage/uploads/photo.jpg

# ask a vision model
curl http://localhost:4000/chat/completions \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "groq-llama-4-scout",
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "what is in this image?"},
        {"type": "image_url", "image_url": {"url": "http://YOUR_HOST:4000/storage/uploads/photo.jpg"}}
      ]
    }]
  }'

Vision-capable models: groq-llama-4-scout, hf-llama-4-scout, hf-qwen-vl-72b, hf-qwen3-vl-8b, hf-gemma-3-12b, mistral-small, anthropic-claude-opus-4, anthropic-claude-sonnet-4, anthropic-claude-haiku-4, openai-gpt-4o, openai-gpt-4o-mini, claudebox-opus, claudebox-sonnet, claudebox-haiku, local-ollama-cpu-gemma4-e2b, local-ollama-cpu-gemma3-4b, local-ollama-cuda-gemma4-e2b, local-ollama-cuda-gemma4-e4b.
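
A small Python sketch of the request body above, for callers that build it programmatically. `vision_payload` and `public_upload_url` are hypothetical helpers. The example keeps the `YOUR_HOST` placeholder from the curl call, presumably needed because a hosted model has to fetch the image from an address it can reach rather than from `localhost`.

```python
import json


def vision_payload(model, question, image_url):
    """Build a chat-completions body with one text part and one image part."""
    return {
        "model": model,
        "messages": [{
            "role": "user",
            "content": [
                {"type": "text", "text": question},
                {"type": "image_url", "image_url": {"url": image_url}},
            ],
        }],
    }


def public_upload_url(host, key, bucket="uploads"):
    """URL at which an object in the public-read bucket can be fetched."""
    return f"http://{host}:4000/storage/{bucket}/{key}"


body = vision_payload(
    "groq-llama-4-scout",
    "what is in this image?",
    public_upload_url("YOUR_HOST", "photo.jpg"),
)
print(json.dumps(body, indent=2))
```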

Transcription

curl http://localhost:4000/audio/transcriptions \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
  -F "model=groq-whisper-large-v3" \
  -F "file=@audio.mp3"

Transcription models: local talkies (CPU and CUDA), plus hosted cloud offerings.

Cloud: groq-whisper-large-v3-turbo, groq-whisper-large-v3, voxtral-small, openai-whisper, openai-gpt-4o-transcribe, openai-gpt-4o-mini-transcribe
Local talkies CPU (TALKIES=1): local-talkies-whisper-large-v3, local-talkies-whisper-large-v3-turbo, local-talkies-canary-180m-flash
Local talkies CUDA (TALKIES_CUDA=1): same as CPU plus local-talkies-cuda-parakeet-tdt-0.6b-v3, local-talkies-cuda-canary-1b-flash (EN/DE/FR/ES + EN↔X translation), local-talkies-cuda-canary-qwen-2.5b (hybrid SALM)

talkies-specific knobs (any model): response_format=text|json|verbose_json|srt|vtt, diarization=true (stereo channel-split — left=L, right=R, segments tagged with channel).

Text-to-Speech

# CPU — talkies Kokoro (multiple voices)
curl http://localhost:4000/audio/speech \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "local-talkies-kokoro-tts", "input": "Hello world", "voice": "af_heart"}' \
  -o speech.mp3

# CUDA — Qwen3-TTS voice cloning (also inside talkies-cuda as of v0.4.0)
curl http://localhost:4000/audio/speech \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "local-talkies-cuda-qwen3-tts", "input": "Hello world", "voice": "alloy"}' \
  -o speech.mp3

TTS models: local-talkies-kokoro-tts (Kokoro 82M, CPU, ~41 voices across en/es/fr/hi/it/pt), local-talkies-cuda-kokoro-tts (same model, served inside the CUDA talkies container — Kokoro still runs on CPU there), local-talkies-cuda-qwen3-tts (Qwen3-TTS-0.6B voice cloning — drop reference .wav files into ${DATA_DIR_TALKIES}/custom-voices/ and use voice=<filename-without-ext>; samples alloy/echo/fable baked in), openai-tts-1, openai-tts-1-hd.
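
Since a custom voice is addressed by its filename without the extension, the available names can be listed from the directory. A minimal sketch, assuming the reference files sit directly in the custom-voices folder with a `.wav` suffix; `custom_voice_names` and `speech_payload` are hypothetical helpers.

```python
import tempfile
from pathlib import Path


def custom_voice_names(voices_dir):
    """Voice identifiers implied by the .wav files in a custom-voices directory."""
    return sorted(p.stem for p in Path(voices_dir).glob("*.wav"))


def speech_payload(text, voice, model="local-talkies-cuda-qwen3-tts"):
    """Request body for POST /audio/speech using a cloned voice."""
    return {"model": model, "input": text, "voice": voice}


with tempfile.TemporaryDirectory() as d:
    (Path(d) / "narrator.wav").write_bytes(b"")  # placeholder, not real audio
    (Path(d) / "notes.txt").write_text("ignored", encoding="utf-8")
    voices = custom_voice_names(d)

print(voices)  # ['narrator']
print(speech_payload("Hello world", voices[0]))
```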

LibreChat Web UI

Enable with LIBRECHAT=1 in .env. Access at http://localhost:4000/librechat/.

First-time setup

1. Navigate to http://localhost:4000/librechat/
2. Register an account. The first user automatically becomes admin.
3. Set LIBRECHAT_ALLOW_REGISTRATION=false in .env and restart (docker compose restart librechat) to lock registration.

What's pre-configured

All LiteLLM models are available in the model selector (auto-fetched)
All MCP tools (browser, storage, claudebox, image generation, TTS) are connected and available in conversations
Conversations are stored in MongoDB and persist across restarts
WebSocket streaming for real-time responses

Configuration

All settings are customizable via .env — see services-reference.md for the full list of environment variables.

The LibreChat config file at librechat/librechat.yaml controls endpoints, MCP servers, and interface settings. Edit it directly for advanced customization (e.g. adding more MCP servers, changing interface options).

Web search (SearXNG MCP)

With SEARXNG=1, the MCP search_web tool is auto-registered. Any function-calling model can search the web — the tool aggregates Google, Bing, DuckDuckGo, and Wikipedia results through the self-hosted SearXNG at /searxng/.

# direct MCP tools/call
curl -X POST http://localhost:4000/mcp/ \
  -H "Authorization: Bearer $MCP_TOOLS_AUTH_TOKEN" \
  -H "Accept: application/json, text/event-stream" \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/call",
       "params":{"name":"search_web","arguments":{"query":"site:arxiv.org diffusion models 2026","limit":5}}}'

You can also hit the SearXNG UI directly at http://localhost:4000/searxng/ for ad-hoc queries (protected by nginx admin auth).
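
The `tools/call` request shown earlier can also be built from Python. This sketch uses only the standard library; `tools_call_request` is a hypothetical helper, and the request is constructed without being sent.

```python
import json
import urllib.request

MCP_URL = "http://localhost:4000/mcp/"


def tools_call_request(token, name, arguments, request_id=1):
    """Build a JSON-RPC 2.0 tools/call request for the MCP endpoint."""
    payload = {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }
    return urllib.request.Request(
        MCP_URL,
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Accept": "application/json, text/event-stream",
            "Content-Type": "application/json",
        },
    )


req = tools_call_request(
    "example-token",
    "search_web",
    {"query": "site:arxiv.org diffusion models 2026", "limit": 5},
)
print(req.data.decode("utf-8"))
```

Because the `Accept` header allows `text/event-stream`, the reply may arrive as an event stream rather than a single JSON document, so a real client should be prepared for both.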

→ SearXNG service reference · MCP tool schema

Time-series forecasting (predictalot)

With PREDICTALOT=1 (CPU) or PREDICTALOT_CUDA=1 (GPU) the /predictalot/ route exposes five foundation forecasters — chronos-2, timesfm-2.5, moirai-2, toto-1, sundial-base-128m — across a type-routed API. Each forecast modality has its own URL prefix, and a model only appears under a type if it implements that modality. Direct route, not registered as a LiteLLM provider. Bearer auth via PREDICTALOT_AUTH_TOKEN. Unauthenticated /predictalot/healthz for liveness.

| Type | Base URL | Members |
| --- | --- | --- |
| univariate (1D series → quantiles) | `/v1/univariate` | all five |
| multivariate (channels per series) | `/v1/multivariate` | chronos-2, moirai-2, toto-1 |
| covariates, past only | `/v1/covariates/past` | chronos-2, moirai-2 |
| covariates, future only | `/v1/covariates/future` | chronos-2 |
| covariates, past + future | `/v1/covariates` | chronos-2 |
| samples (raw sample paths) | `/v1/samples` | toto-1, sundial-base-128m |

Every base URL exposes the same three sub-paths: <base>/forecast, <base>/forecast/ensemble, <base>/models.

# single-model univariate forecast — context is a list-of-series
curl http://localhost:4000/predictalot/v1/univariate/forecast \
  -H "Authorization: Bearer $PREDICTALOT_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "chronos-2",
    "context": [[10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]],
    "config": {"horizon": 5}
  }'

# per-type ensemble — run every member in parallel, weighted mean + every individual forecast.
# Weight 0 disables that model entirely. Omitted entry defaults to 1.
curl http://localhost:4000/predictalot/v1/univariate/forecast/ensemble \
  -H "Authorization: Bearer $PREDICTALOT_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "context": [[10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]],
    "config": {"horizon": 5},
    "weights": {"chronos-2": 2.0, "moirai-2": 1.0, "timesfm-2.5": 0}
  }'

# per-type model listing (which slugs implement this type + load state)
curl http://localhost:4000/predictalot/v1/univariate/models \
  -H "Authorization: Bearer $PREDICTALOT_AUTH_TOKEN"

The response contains a median point forecast and per-quantile arrays (or, for /v1/samples, raw sample paths). Models lazy-load on first call (~50-800MB HuggingFace snapshot) and auto-unload after PREDICTALOT_MODEL_IDLE_TIMEOUT (default 30m). Sundial runs in its own sidecar venv (transformers==4.40.1 pin) and is transparent over the wire.
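
The ensemble weighting rule from the example above (weight 0 disables a model, an omitted entry defaults to 1) can be read as a plain weighted mean. The sketch below is one interpretation of that rule, not the service's actual code; `weighted_ensemble` is a hypothetical function and the forecast values are made up for illustration.

```python
def weighted_ensemble(forecasts, weights=None):
    """Weighted mean of per-model point forecasts.

    forecasts: {model_slug: [value per horizon step]}
    weights:   {model_slug: weight}; missing entries count as 1, 0 drops the model.
    """
    weights = weights or {}
    active = {m: weights.get(m, 1.0) for m in forecasts if weights.get(m, 1.0) != 0}
    if not active:
        raise ValueError("all models are disabled")
    total = sum(active.values())
    horizon = len(next(iter(forecasts.values())))
    return [
        sum(forecasts[m][t] * w for m, w in active.items()) / total
        for t in range(horizon)
    ]


demo = weighted_ensemble(
    {"chronos-2": [21.0, 22.0], "moirai-2": [24.0, 25.0], "timesfm-2.5": [99.0, 99.0]},
    {"chronos-2": 2.0, "moirai-2": 1.0, "timesfm-2.5": 0},
)
print(demo)  # [22.0, 23.0]
```

With these weights, `timesfm-2.5` is ignored and `chronos-2` counts twice as much as `moirai-2`.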

The same surface is exposed as 26 MCP tools — one per (type, model) cell (e.g. predictalot-forecast_univariate_chronos_2, predictalot-forecast_multivariate_moirai_2, predictalot-forecast_samples_toto_1) plus per-type ensembles (predictalot-forecast_<type>_ensemble) and per-type listings (predictalot-list_<type>_models). Model slug dashes/dots become underscores in tool names (sundial-base-128m → sundial_base_128m). Any function-calling model can run forecasts autonomously.
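
The naming rule and the tool count can be checked with a few lines of Python. The membership table is copied from above and keyed by base-URL suffix. How the covariate types are spelled inside tool names is not stated here, so `forecast_tool_name` (a hypothetical helper) is only exercised on the documented examples.

```python
import re

# membership per type, copied from the table above
MEMBERS = {
    "univariate": ["chronos-2", "timesfm-2.5", "moirai-2", "toto-1", "sundial-base-128m"],
    "multivariate": ["chronos-2", "moirai-2", "toto-1"],
    "covariates/past": ["chronos-2", "moirai-2"],
    "covariates/future": ["chronos-2"],
    "covariates": ["chronos-2"],
    "samples": ["toto-1", "sundial-base-128m"],
}


def slug_to_tool_part(slug):
    """Replace dashes and dots in a model slug with underscores."""
    return re.sub(r"[-.]", "_", slug)


def forecast_tool_name(type_name, slug):
    """Tool name for one (type, model) cell, following the documented examples."""
    return f"predictalot-forecast_{type_name}_{slug_to_tool_part(slug)}"


cells = sum(len(models) for models in MEMBERS.values())
total_tools = cells + 2 * len(MEMBERS)  # one ensemble and one listing per type

print(forecast_tool_name("univariate", "chronos-2"))
print(slug_to_tool_part("sundial-base-128m"), total_tools)
```

The count works out to 14 (type, model) cells plus 6 ensembles and 6 listings, which matches the 26 tools mentioned above.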

→ predictalot service reference · predictalot MCP tools

Email gateway (mailbox)

With MAILBOX=1, the /mailbox/ route fronts N email accounts driven by a single YAML config (MAILBOX_CONFIG). Stateless — every read hits the upstream IMAP server live. Bearer auth via MAILBOX_AUTH_TOKEN (also mirrored into the config's auth.tokens: list).

# list configured accounts
curl http://localhost:4000/mailbox/mailboxes \
  -H "Authorization: Bearer $MAILBOX_AUTH_TOKEN"

# unified inbox across all accounts (paginated)
curl "http://localhost:4000/mailbox/inbox?limit=10" \
  -H "Authorization: Bearer $MAILBOX_AUTH_TOKEN"

# search a specific account
curl "http://localhost:4000/mailbox/inbox?mailbox=work&subject=invoice&limit=5" \
  -H "Authorization: Bearer $MAILBOX_AUTH_TOKEN"

# send through a configured account (SMTP)
curl -X POST http://localhost:4000/mailbox/mailboxes/work/send \
  -H "Authorization: Bearer $MAILBOX_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"to": ["someone@example.com"], "subject": "hello", "body_text": "from aigate"}'

# delete by uid
curl -X DELETE http://localhost:4000/mailbox/mailboxes/work/messages/<uid> \
  -H "Authorization: Bearer $MAILBOX_AUTH_TOKEN"

The MCP catalog is flat regardless of how many accounts you've configured — per-account tools take a mailbox parameter (name or address) instead of namespacing.
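
The inbox queries above differ only in their query string, so a small builder keeps them consistent. A minimal sketch using the standard library; `inbox_request` is a hypothetical helper and covers only the `mailbox`, `subject` and `limit` parameters that appear in the curl examples.

```python
import urllib.request
from urllib.parse import urlencode

BASE = "http://localhost:4000"


def inbox_request(token, mailbox=None, subject=None, limit=10):
    """Build a GET request for /mailbox/inbox, omitting unset filters."""
    params = {"mailbox": mailbox, "subject": subject, "limit": limit}
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return urllib.request.Request(
        f"{BASE}/mailbox/inbox?{query}",
        headers={"Authorization": f"Bearer {token}"},
    )


req = inbox_request("example-token", mailbox="work", subject="invoice", limit=5)
print(req.full_url)
```

Leaving `mailbox` unset yields the unified inbox query across all accounts.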

→ mailbox service reference · mailbox MCP tools

Telegram client (telethon)

With TELETHON=1, the /telethon/ route fronts a Telegram client using the official MTProto user-account API. Requires TELETHON_API_ID / TELETHON_API_HASH from my.telegram.org/apps and a string session in TELETHON_SESSION. Bearer auth via TELETHON_AUTH_KEY.

# who am I — verifies the session is authorized
curl http://localhost:4000/telethon/api/me \
  -H "Authorization: Bearer $TELETHON_AUTH_KEY"

# list dialogs
curl "http://localhost:4000/telethon/api/dialogs?limit=10" \
  -H "Authorization: Bearer $TELETHON_AUTH_KEY"

# send a message (markdown supported via parse_mode)
curl -X POST http://localhost:4000/telethon/api/messages \
  -H "Authorization: Bearer $TELETHON_AUTH_KEY" \
  -H "Content-Type: application/json" \
  -d '{"chat": "@username", "text": "**hello** from aigate", "parse_mode": "md"}'

# read recent messages from a chat
curl "http://localhost:4000/telethon/api/messages?chat=me&limit=5" \
  -H "Authorization: Bearer $TELETHON_AUTH_KEY"

Chat references accept @username, phone numbers, t.me/... links, or numeric IDs as strings. The same surface is exposed as MCP tools (telethon-send_message, telethon-get_dialogs, etc.) so any function-calling model can operate Telegram on your behalf.
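
Because numeric IDs are passed as strings, a client that holds integer chat IDs should convert them before sending. A minimal sketch with the standard library; `send_message_request` is a hypothetical helper that builds the request shown above without sending it, and the chat ID is made up.

```python
import json
import urllib.request

BASE = "http://localhost:4000"


def send_message_request(token, chat, text, parse_mode=None):
    """Build POST /telethon/api/messages; numeric chat IDs are sent as strings."""
    payload = {"chat": str(chat), "text": text}
    if parse_mode is not None:
        payload["parse_mode"] = parse_mode
    return urllib.request.Request(
        f"{BASE}/telethon/api/messages",
        data=json.dumps(payload).encode("utf-8"),
        method="POST",
        headers={
            "Authorization": f"Bearer {token}",
            "Content-Type": "application/json",
        },
    )


req = send_message_request(
    "example-token", 123456789, "**hello** from aigate", parse_mode="md"
)
print(json.loads(req.data))
```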

→ Telethon service reference · Telethon MCP tools
