Usage

Chat Completions

Standard OpenAI-compatible chat completions. Works with any OpenAI SDK, library, or tool that supports custom base URLs.

# cloud provider (free tier, auto-fallback on rate limit)
curl http://localhost:4000/chat/completions \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "mistral-large", "messages": [{"role": "user", "content": "explain mixture of experts"}]}'

# streaming (SSE)
curl http://localhost:4000/chat/completions \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "cerebras-qwen3-235b", "messages": [{"role": "user", "content": "write a haiku"}], "stream": true}'

Python (openai SDK)

import os

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:4000",
    api_key=os.environ["LITELLM_MASTER_KEY"],  # same key the curl examples use
)

# chat
resp = client.chat.completions.create(
    model="cerebras-qwen3-235b",
    messages=[{"role": "user", "content": "hello"}],
)
print(resp.choices[0].message.content)

# streaming
stream = client.chat.completions.create(
    model="cerebras-qwen3-235b",
    messages=[{"role": "user", "content": "count to 10"}],
    stream=True,
)
for chunk in stream:
    print(chunk.choices[0].delta.content or "", end="", flush=True)
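
For clients without an OpenAI SDK, the streamed response can be read line by line. The sketch below assumes the gateway emits the usual OpenAI-style SSE framing (`data: {...}` lines ending with `data: [DONE]`); `iter_sse_content` is a hypothetical helper name, and the sample lines are hand-written rather than captured from a real response.

```python
import json
from typing import Iterable, Iterator


def iter_sse_content(lines: Iterable[str]) -> Iterator[str]:
    """Yield text deltas from OpenAI-style SSE lines ("data: {...}")."""
    for raw in lines:
        line = raw.strip()
        if not line.startswith("data:"):
            continue  # skip blank separator lines and SSE comments
        payload = line[len("data:"):].strip()
        if payload == "[DONE]":
            break  # end-of-stream sentinel
        chunk = json.loads(payload)
        for choice in chunk.get("choices", []):
            text = (choice.get("delta") or {}).get("content")
            if text:
                yield text


# Hand-written sample lines, not captured from a real response.
sample = [
    'data: {"choices": [{"delta": {"role": "assistant"}}]}',
    "",
    'data: {"choices": [{"delta": {"content": "Hel"}}]}',
    'data: {"choices": [{"delta": {"content": "lo"}}]}',
    "data: [DONE]",
]
print("".join(iter_sse_content(sample)))
```

Any iterable of decoded lines works as input, so the same function can consume a streamed HTTP response body.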

Browser Automation

The browser cluster can be used directly via the REST API, or indirectly by letting an LLM invoke browser tools through MCP.

Direct REST API

# navigate to a page
curl -X POST http://localhost:4000/stealthy-auto-browse/ \
  -H "Authorization: Bearer $STEALTHY_AUTO_BROWSE_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"action": "goto", "url": "https://example.com"}'

# get all visible text
curl -X POST http://localhost:4000/stealthy-auto-browse/ \
  -H "Authorization: Bearer $STEALTHY_AUTO_BROWSE_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"action": "get_text"}'

# find all interactive elements with their coordinates
curl -X POST http://localhost:4000/stealthy-auto-browse/ \
  -H "Authorization: Bearer $STEALTHY_AUTO_BROWSE_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"action": "get_interactive_elements", "visible_only": true}'

# click at coordinates (OS-level, undetectable)
curl -X POST http://localhost:4000/stealthy-auto-browse/ \
  -H "Authorization: Bearer $STEALTHY_AUTO_BROWSE_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"action": "system_click", "x": 640, "y": 400}'

# type text (OS-level keyboard input)
curl -X POST http://localhost:4000/stealthy-auto-browse/ \
  -H "Authorization: Bearer $STEALTHY_AUTO_BROWSE_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"action": "system_type", "text": "hello world"}'

# screenshot — returns raw PNG (1920x1080 by default, always resize)
curl -H "Authorization: Bearer $STEALTHY_AUTO_BROWSE_AUTH_TOKEN" \
  "http://localhost:4000/stealthy-auto-browse/screenshot/browser?whLargest=512" -o screenshot.png
curl -H "Authorization: Bearer $STEALTHY_AUTO_BROWSE_AUTH_TOKEN" \
  "http://localhost:4000/stealthy-auto-browse/screenshot/browser?width=800" -o screenshot.png

# run a multi-step script atomically (all steps on the same replica, single request)
curl -X POST http://localhost:4000/stealthy-auto-browse/ \
  -H "Authorization: Bearer $STEALTHY_AUTO_BROWSE_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "action": "run_script",
    "steps": [
      {"action": "goto", "url": "https://duckduckgo.com"},
      {"action": "system_click", "x": 950, "y": 513},
      {"action": "system_type", "text": "what is groq?"},
      {"action": "send_key", "key": "enter"},
      {"action": "wait_for_element", "selector": "[data-testid='\''result'\'']", "timeout": 10},
      {"action": "get_text"}
    ]
  }'

Browser sessions are sticky via the INSTANCEID cookie. Use a persistent HTTP client to keep your session on the same replica across requests.

Python — search, screenshot, upload, summarize

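import os

import requests

session = requests.Session()  # sticky via INSTANCEID cookie
BASE = "http://localhost:4000"
# credentials come from the same environment variables the curl examples use
STEALTHY_AUTO_BROWSE_AUTH_TOKEN = os.environ["STEALTHY_AUTO_BROWSE_AUTH_TOKEN"]
HYBRIDS3_UPLOADS_KEY = os.environ["HYBRIDS3_UPLOADS_KEY"]
LITELLM_MASTER_KEY = os.environ["LITELLM_MASTER_KEY"]
SAB_AUTH = {"Authorization": f"Bearer {STEALTHY_AUTO_BROWSE_AUTH_TOKEN}"}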

def browser(action, **kwargs):
    r = session.post(f"{BASE}/stealthy-auto-browse/", headers=SAB_AUTH, json={"action": action, **kwargs})
    r.raise_for_status()
    return r.json()["data"]

# navigate and search
browser("goto", url="https://duckduckgo.com")
browser("system_click", x=950, y=513)
browser("system_type", text="what is groq?")
browser("send_key", key="enter")
browser("wait_for_element", selector="[data-testid='result']", timeout=10000)
text = browser("get_text")["text"]

# screenshot and upload
# resized via ?width= because the default capture is 1920x1080
screenshot = session.get(f"{BASE}/stealthy-auto-browse/screenshot/browser", headers=SAB_AUTH, params={"width": 800}).content
requests.put(
    f"{BASE}/storage/uploads/search.png",
    headers={"Authorization": f"Bearer {HYBRIDS3_UPLOADS_KEY}", "Content-Type": "image/png"},
    data=screenshot,
)

# ask an LLM to summarize
r = requests.post(f"{BASE}/chat/completions",
    headers={"Authorization": f"Bearer {LITELLM_MASTER_KEY}", "Content-Type": "application/json"},
    json={"model": "cerebras-qwen3-235b", "messages": [
        {"role": "user", "content": f"Summarize these search results:\n\n{text[:8000]}"}
    ]})
print(r.json()["choices"][0]["message"]["content"])
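
The step-by-step calls above can also be sent as one `run_script` request, which keeps every step on the same replica. A minimal sketch of building that body in Python follows; `run_script_payload` is a hypothetical helper, and only actions already shown in this section are used.

```python
import json


def run_script_payload(steps):
    """Wrap a list of (action, params) pairs in a single run_script request body."""
    return {
        "action": "run_script",
        "steps": [{"action": action, **params} for action, params in steps],
    }


search = run_script_payload([
    ("goto", {"url": "https://duckduckgo.com"}),
    ("system_click", {"x": 950, "y": 513}),
    ("system_type", {"text": "what is groq?"}),
    ("send_key", {"key": "enter"}),
    ("get_text", {}),
])
print(json.dumps(search, indent=2))
```

The body is posted like any other action, for example with `json=search` on the same endpoint. The shape of the `run_script` response is not described here, so inspect it before relying on specific fields.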

Object Storage

hybrids3 — S3-compatible, public-read uploads bucket, bearer token auth, TTL-based expiry.

Basic CRUD

# upload (MIME type auto-detected from content)
curl -X PUT http://localhost:4000/storage/uploads/image.png \
  -H "Authorization: Bearer $HYBRIDS3_UPLOADS_KEY" \
  -H "Content-Type: image/png" \
  --data-binary @image.png

# download — public, no auth required
curl http://localhost:4000/storage/uploads/image.png -o image.png

# list files (supports ?prefix= and ?max-keys=)
curl "http://localhost:4000/storage/uploads?prefix=images/" \
  -H "Authorization: Bearer $HYBRIDS3_UPLOADS_KEY"

# delete
curl -X DELETE http://localhost:4000/storage/uploads/image.png \
  -H "Authorization: Bearer $HYBRIDS3_UPLOADS_KEY"

Presigned URLs

Generate a time-limited URL that anyone can download without auth credentials:

# generate (default 1 hour, max 7 days)
curl -X POST "http://localhost:4000/storage/presign/uploads/report.pdf?expires=86400" \
  -H "Authorization: Bearer $HYBRIDS3_UPLOADS_KEY"

# response for public bucket — plain URL (no expiry needed since bucket is public-read anyway)
{"url": "http://localhost:4000/storage/uploads/report.pdf", "expires": null}

# download via presigned URL — no auth header
curl "http://localhost:4000/storage/uploads/report.pdf"
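
The presign URL can be built programmatically as well. This sketch uses only the standard library and checks the `expires` value on the client against the limits quoted above (default 1 hour, max 7 days). The client-side check and the helper name `presign_url` are assumptions; how the server treats out-of-range values is not documented here.

```python
from urllib.parse import quote, urlencode

BASE = "http://localhost:4000"   # gateway address used throughout this page
DEFAULT_EXPIRES = 3600           # 1 hour
MAX_EXPIRES = 7 * 24 * 3600      # 7 days


def presign_url(bucket, key, expires=DEFAULT_EXPIRES):
    """Build the POST URL for /storage/presign/<bucket>/<key>?expires=<seconds>."""
    if not 0 < expires <= MAX_EXPIRES:
        raise ValueError(f"expires must be between 1 and {MAX_EXPIRES} seconds")
    path = quote(f"{bucket}/{key}")  # "/" is left unescaped, so nested keys stay intact
    return f"{BASE}/storage/presign/{path}?{urlencode({'expires': expires})}"


print(presign_url("uploads", "report.pdf", expires=86400))
```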

Nested paths

Object keys support / for directory-like organization:

curl -X PUT "http://localhost:4000/storage/uploads/projects/myapp/build.tar.gz" \
  -H "Authorization: Bearer $HYBRIDS3_UPLOADS_KEY" \
  --data-binary @build.tar.gz

# list only that project's files
curl "http://localhost:4000/storage/uploads?prefix=projects/myapp/" \
  -H "Authorization: Bearer $HYBRIDS3_UPLOADS_KEY"

boto3

import os

import boto3
from botocore.config import Config

s3 = boto3.client(
    "s3",
    endpoint_url="http://localhost:4000/storage",
    aws_access_key_id="uploads",              # bucket name (acts as public_key)
    aws_secret_access_key=os.environ["HYBRIDS3_UPLOADS_KEY"],  # same key as the curl examples
    region_name="us-east-1",
    config=Config(signature_version="s3v4"),
)

s3.upload_file("image.png", "uploads", "images/photo.png")
obj = s3.get_object(Bucket="uploads", Key="images/photo.png")
data = obj["Body"].read()

s3.list_objects_v2(Bucket="uploads", Prefix="images/")
s3.delete_object(Bucket="uploads", Key="images/photo.png")

# generate presigned URL
url = s3.generate_presigned_url(
    "get_object",
    Params={"Bucket": "uploads", "Key": "images/photo.png"},
    ExpiresIn=3600,
)

Configure TTL and size limits in .env:

HYBRIDS3_UPLOADS_TTL=168h        # auto-delete after N time (default 7 days)
HYBRIDS3_UPLOADS_MAX_SIZE=100MB  # per-file size limit

Agentic coding — Claudebox + Pibox-zai

Two agentic services wrap a coding agent in a Docker container and expose it as an API. Each request runs the agent's full loop — read/write files, run shell commands, install packages, browse the web, use tools, all within an isolated workspace.

  • Claudebox: Claude Code, OAuth token or Anthropic API key. Models: claudebox-haiku, claudebox-sonnet, claudebox-opus.
  • Pibox-zai: pi-coding-agent pointed at z.ai for GLM models. Models: pibox-zai-glm-4.5-air, pibox-zai-glm-4.7, pibox-zai-glm-5.1. Adds /files/* CRUD plus optional Telegram + cron modes.

Both speak the Anthropic wire protocol and expose the same shape of API (sync + async /run, OpenAI-compatible /v1/chat/completions, MCP server).

Via LiteLLM chat completions

The simplest way — just use claudebox models in the standard chat API:

curl http://localhost:4000/chat/completions \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "claudebox-sonnet",
    "messages": [{"role": "user", "content": "list all Python files in this workspace"}],
    "extra_headers": {"X-Claude-Workspace": "myproject"}
  }'

Via direct API

More control: structured output formats, session resumption, fire-and-forget, tool call history.

# basic run
curl -X POST http://localhost:4000/claudebox/run \
  -H "Authorization: Bearer $CLAUDEBOX_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "write a Go HTTP server", "workspace": "go-project"}'

# with structured JSON output
curl -X POST http://localhost:4000/claudebox/run \
  -H "Authorization: Bearer $CLAUDEBOX_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "extract the name and version from package.json",
    "workspace": "myproject",
    "jsonSchema": "{\"type\":\"object\",\"properties\":{\"name\":{\"type\":\"string\"},\"version\":{\"type\":\"string\"}},\"required\":[\"name\",\"version\"]}"
  }'

# with full tool call history
curl -X POST http://localhost:4000/claudebox/run \
  -H "Authorization: Bearer $CLAUDEBOX_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "build the project and run tests", "workspace": "myapp", "outputFormat": "json-verbose"}'

# check which workspaces are busy
curl http://localhost:4000/claudebox/status \
  -H "Authorization: Bearer $CLAUDEBOX_API_TOKEN"

# cancel a running task
curl -X POST "http://localhost:4000/claudebox/run/cancel?workspace=myapp" \
  -H "Authorization: Bearer $CLAUDEBOX_API_TOKEN"
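
In the structured-output example, `jsonSchema` is a JSON-encoded string, not a nested object, which is why the curl body needs escaped quotes. Building the body in Python avoids hand-escaping. This is a sketch using only the fields shown above; `run_payload` is a hypothetical helper.

```python
import json


def run_payload(prompt, workspace, schema=None, output_format=None):
    """Build a /claudebox/run body; jsonSchema is sent as a JSON-encoded string."""
    body = {"prompt": prompt, "workspace": workspace}
    if schema is not None:
        body["jsonSchema"] = json.dumps(schema, separators=(",", ":"))
    if output_format is not None:
        body["outputFormat"] = output_format
    return body


schema = {
    "type": "object",
    "properties": {"name": {"type": "string"}, "version": {"type": "string"}},
    "required": ["name", "version"],
}
body = run_payload("extract the name and version from package.json", "myproject", schema)
print(json.dumps(body))
```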

File operations

# upload a file to a workspace
curl -X PUT http://localhost:4000/claudebox/files/myproject/data.csv \
  -H "Authorization: Bearer $CLAUDEBOX_API_TOKEN" \
  --data-binary @data.csv

# list files in a workspace
curl http://localhost:4000/claudebox/files/myproject \
  -H "Authorization: Bearer $CLAUDEBOX_API_TOKEN"

# download a file from a workspace
curl http://localhost:4000/claudebox/files/myproject/results.json \
  -H "Authorization: Bearer $CLAUDEBOX_API_TOKEN" \
  -o results.json

# delete a file
curl -X DELETE http://localhost:4000/claudebox/files/myproject/old.log \
  -H "Authorization: Bearer $CLAUDEBOX_API_TOKEN"

File + task workflow

# 1. upload input data
curl -X PUT http://localhost:4000/claudebox/files/analysis/sales.csv \
  -H "Authorization: Bearer $CLAUDEBOX_API_TOKEN" \
  --data-binary @sales.csv

# 2. run analysis (Claude reads the file, writes a report)
curl -X POST http://localhost:4000/claudebox/run \
  -H "Authorization: Bearer $CLAUDEBOX_API_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"prompt": "analyze sales.csv, compute monthly totals and trends, write a report to report.md", "workspace": "analysis"}'

# 3. download the report
curl http://localhost:4000/claudebox/files/analysis/report.md \
  -H "Authorization: Bearer $CLAUDEBOX_API_TOKEN"

Always-active skills

Drop a SKILL.md file into a named subdirectory under .data/claudebox/config/.always-skills/ — it will be injected into the system prompt of every Claude invocation automatically. No restarts needed. Applies to API, MCP, chat, everything.

.data/claudebox/config/.always-skills/
└── coding-rules/
    └── SKILL.md   ← injected into every session

Example SKILL.md:

When writing Go code, always use slog for structured logging, never fmt.Println.
When writing Python, always use pathlib for file paths, never os.path.
Always write tests alongside implementations.

Skills stack — every SKILL.md found is appended in alphabetical order by directory name. Per-request appendSystemPrompt or X-Claude-Append-System-Prompt is appended after always-skills, so per-request instructions take precedence.
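
The stacking order can be illustrated with a few lines of Python. This only mirrors the ordering described above (alphabetical by directory name, per-request text last); it is not Claudebox's actual loader, and both function names are hypothetical.

```python
import tempfile
from pathlib import Path


def collect_always_skills(root):
    """Return SKILL.md contents from each subdirectory of root, sorted by directory name."""
    skills = []
    for directory in sorted(Path(root).iterdir(), key=lambda p: p.name):
        skill_file = directory / "SKILL.md"
        if directory.is_dir() and skill_file.is_file():
            skills.append(skill_file.read_text(encoding="utf-8").strip())
    return skills


def compose_prompt(skills, append=None):
    """Join always-skills, then add the per-request text so it comes last."""
    parts = list(skills)
    if append:
        parts.append(append.strip())
    return "\n\n".join(parts)


# Demo on a throwaway directory instead of .data/claudebox/config/.always-skills/
with tempfile.TemporaryDirectory() as tmp:
    for name, text in [("testing", "Always write tests."), ("coding-rules", "Use slog in Go.")]:
        (Path(tmp) / name).mkdir()
        (Path(tmp) / name / "SKILL.md").write_text(text, encoding="utf-8")
    prompt = compose_prompt(collect_always_skills(tmp), append="Reply in English.")
print(prompt)
```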


Image Generation

# image generation (cloud — HuggingFace FLUX)
curl http://localhost:4000/images/generations \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "hf-flux-schnell", "prompt": "cyberpunk city at night"}'

# image generation (local CUDA — sd-turbo, fast)
curl http://localhost:4000/images/generations \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "local-sdcpp-cuda-sd-turbo", "prompt": "cyberpunk city at night", "size": "512x512"}'

# image generation (local CPU — sd-turbo, slower)
curl http://localhost:4000/images/generations \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "local-sdcpp-cpu-sd-turbo", "prompt": "cyberpunk city at night", "size": "512x512"}'

Vision

Upload an image to storage (public URL), then pass it to a vision model:

# upload the image
curl -X PUT http://localhost:4000/storage/uploads/photo.jpg \
  -H "Authorization: Bearer $HYBRIDS3_UPLOADS_KEY" \
  -H "Content-Type: image/jpeg" \
  --data-binary @photo.jpg

# public URL — no auth needed to read from uploads bucket
# http://localhost:4000/storage/uploads/photo.jpg

# ask a vision model
curl http://localhost:4000/chat/completions \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "groq-llama-4-scout",
    "messages": [{
      "role": "user",
      "content": [
        {"type": "text", "text": "what is in this image?"},
        {"type": "image_url", "image_url": {"url": "http://YOUR_HOST:4000/storage/uploads/photo.jpg"}}
      ]
    }]
  }'

Vision-capable models: groq-llama-4-scout, hf-llama-4-scout, hf-qwen-vl-72b, hf-qwen3-vl-8b, hf-gemma-3-12b, mistral-small, anthropic-claude-opus-4, anthropic-claude-sonnet-4, anthropic-claude-haiku-4, openai-gpt-4o, openai-gpt-4o-mini, claudebox-opus, claudebox-sonnet, claudebox-haiku, local-ollama-cpu-gemma4-e2b, local-ollama-cpu-gemma3-4b, local-ollama-cuda-gemma4-e2b, local-ollama-cuda-gemma4-e4b.
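
The request body for a vision call is a standard chat message whose `content` is a list of parts. A small sketch of assembling it in Python follows; `vision_message` and `storage_url` are hypothetical helpers, and `YOUR_HOST` remains a placeholder as in the curl example.

```python
import json


def storage_url(host, key, bucket="uploads"):
    """Public read URL for an object in the given bucket."""
    return f"http://{host}:4000/storage/{bucket}/{key}"


def vision_message(question, image_url):
    """Build one user message that pairs a text question with an image URL."""
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": question},
            {"type": "image_url", "image_url": {"url": image_url}},
        ],
    }


body = {
    "model": "groq-llama-4-scout",
    "messages": [vision_message("what is in this image?", storage_url("YOUR_HOST", "photo.jpg"))],
}
print(json.dumps(body, indent=2))
```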


Transcription

curl http://localhost:4000/audio/transcriptions \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
  -F "model=groq-whisper-large-v3" \
  -F "file=@audio.mp3"

Transcription models — talkies (CPU+CUDA), plus the hosted Groq/OpenAI offerings:

  • Cloud: groq-whisper-large-v3-turbo, groq-whisper-large-v3, voxtral-small, openai-whisper, openai-gpt-4o-transcribe, openai-gpt-4o-mini-transcribe
  • Local talkies CPU (TALKIES=1): local-talkies-whisper-large-v3, local-talkies-whisper-large-v3-turbo, local-talkies-canary-180m-flash
  • Local talkies CUDA (TALKIES_CUDA=1): same as CPU plus local-talkies-cuda-parakeet-tdt-0.6b-v3, local-talkies-cuda-canary-1b-flash (EN/DE/FR/ES + EN↔X translation), local-talkies-cuda-canary-qwen-2.5b (hybrid SALM)

talkies-specific knobs (any model): response_format=text|json|verbose_json|srt|vtt, diarization=true (stereo channel-split — left=L, right=R, segments tagged with channel).


Text-to-Speech

# CPU — talkies Kokoro (multiple voices)
curl http://localhost:4000/audio/speech \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "local-talkies-kokoro-tts", "input": "Hello world", "voice": "af_heart"}' \
  -o speech.mp3

# CUDA — Qwen3-TTS voice cloning (also inside talkies-cuda as of v0.4.0)
curl http://localhost:4000/audio/speech \
  -H "Authorization: Bearer $LITELLM_MASTER_KEY" \
  -H "Content-Type: application/json" \
  -d '{"model": "local-talkies-cuda-qwen3-tts", "input": "Hello world", "voice": "alloy"}' \
  -o speech.mp3

TTS models:

  • local-talkies-kokoro-tts: Kokoro 82M, CPU, ~41 voices across en/es/fr/hi/it/pt
  • local-talkies-cuda-kokoro-tts: same model, served inside the CUDA talkies container (Kokoro still runs on CPU there)
  • local-talkies-cuda-qwen3-tts: Qwen3-TTS-0.6B voice cloning. Drop reference .wav files into ${DATA_DIR_TALKIES}/custom-voices/ and use voice=<filename-without-ext>; samples alloy/echo/fable are baked in
  • openai-tts-1, openai-tts-1-hd
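
Since a cloned voice is addressed by its file name without the extension, the usable voice names can be listed from the custom-voices directory. The sketch below runs against a temporary directory; `custom_voices` is a hypothetical helper, and matching only lowercase `.wav` is an assumption.

```python
import tempfile
from pathlib import Path


def custom_voices(directory):
    """List names usable as voice=<name>: one per .wav file, extension dropped."""
    return sorted(p.stem for p in Path(directory).glob("*.wav"))


# Demo on a throwaway directory instead of ${DATA_DIR_TALKIES}/custom-voices/
with tempfile.TemporaryDirectory() as tmp:
    for name in ("narrator.wav", "alice.wav", "notes.txt"):
        (Path(tmp) / name).touch()
    voices = custom_voices(tmp)
print(voices)
```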


LibreChat Web UI

Enable with LIBRECHAT=1 in .env. Access at http://localhost:4000/librechat/.

First-time setup

  1. Navigate to http://localhost:4000/librechat/
  2. Register an account — the first user automatically becomes admin
  3. Set LIBRECHAT_ALLOW_REGISTRATION=false in .env and restart (docker compose restart librechat) to lock registration

What's pre-configured

  • All LiteLLM models are available in the model selector (auto-fetched)
  • All MCP tools (browser, storage, claudebox, image generation, TTS) are connected and available in conversations
  • Conversations are stored in MongoDB and persist across restarts
  • WebSocket streaming for real-time responses

Configuration

All settings are customizable via .env — see services-reference.md for the full list of environment variables.

The LibreChat config file at librechat/librechat.yaml controls endpoints, MCP servers, and interface settings. Edit it directly for advanced customization (e.g. adding more MCP servers, changing interface options).


Web search (SearXNG MCP)

With SEARXNG=1, the MCP search_web tool is auto-registered. Any function-calling model can search the web — the tool aggregates Google, Bing, DuckDuckGo, and Wikipedia results through the self-hosted SearXNG at /searxng/.

# direct MCP tools/call
curl -X POST http://localhost:4000/mcp/ \
  -H "Authorization: Bearer $MCP_TOOLS_AUTH_TOKEN" \
  -H "Accept: application/json, text/event-stream" \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/call",
       "params":{"name":"search_web","arguments":{"query":"site:arxiv.org diffusion models 2026","limit":5}}}'

You can also hit the SearXNG UI directly at http://localhost:4000/searxng/ for ad-hoc queries (protected by nginx admin auth).
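
The `tools/call` body in the curl example is a plain JSON-RPC 2.0 request, so a small builder is enough to call any MCP tool by name. A sketch follows; `tool_call` is a hypothetical helper, and the sequential ids are a convenience rather than a requirement of this gateway.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC request ids; any value unique per request works


def tool_call(name, arguments):
    """Build a JSON-RPC 2.0 request that invokes one MCP tool."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


request = tool_call("search_web", {"query": "site:arxiv.org diffusion models 2026", "limit": 5})
print(json.dumps(request))
```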

SearXNG service reference · MCP tool schema


Time-series forecasting (predictalot)

With PREDICTALOT=1 (CPU) or PREDICTALOT_CUDA=1 (GPU) the /predictalot/ route exposes five foundation forecasters — chronos-2, timesfm-2.5, moirai-2, toto-1, sundial-base-128m — across a type-routed API. Each forecast modality has its own URL prefix, and a model only appears under a type if it implements that modality. Direct route, not registered as a LiteLLM provider. Bearer auth via PREDICTALOT_AUTH_TOKEN. Unauthenticated /predictalot/healthz for liveness.

Type                                | Base URL              | Members
univariate (1D series → quantiles)  | /v1/univariate        | all five
multivariate (channels per series)  | /v1/multivariate      | chronos-2, moirai-2, toto-1
covariates, past only               | /v1/covariates/past   | chronos-2, moirai-2
covariates, future only             | /v1/covariates/future | chronos-2
covariates, past + future           | /v1/covariates        | chronos-2
samples (raw sample paths)          | /v1/samples           | toto-1, sundial-base-128m

Every base URL exposes the same three sub-paths: <base>/forecast, <base>/forecast/ensemble, <base>/models.
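
The type-routed layout maps naturally onto a lookup table. This sketch builds endpoint URLs from the type list above and rejects a model that does not implement the requested type before any request is sent. The `endpoint` helper and the client-side membership check are assumptions, not part of predictalot.

```python
BASE = "http://localhost:4000/predictalot"

ALL_FIVE = {"chronos-2", "timesfm-2.5", "moirai-2", "toto-1", "sundial-base-128m"}

# URL segment after /v1/ -> models that implement that forecast type
MEMBERS = {
    "univariate": ALL_FIVE,
    "multivariate": {"chronos-2", "moirai-2", "toto-1"},
    "covariates/past": {"chronos-2", "moirai-2"},
    "covariates/future": {"chronos-2"},
    "covariates": {"chronos-2"},
    "samples": {"toto-1", "sundial-base-128m"},
}
SUB_PATHS = {"forecast", "forecast/ensemble", "models"}


def endpoint(kind, sub="forecast", model=None):
    """Return the URL for one type and sub-path, validating the model if given."""
    if kind not in MEMBERS:
        raise ValueError(f"unknown forecast type: {kind}")
    if sub not in SUB_PATHS:
        raise ValueError(f"unknown sub-path: {sub}")
    if model is not None and model not in MEMBERS[kind]:
        raise ValueError(f"{model} does not implement {kind}")
    return f"{BASE}/v1/{kind}/{sub}"


print(endpoint("univariate", model="chronos-2"))
print(endpoint("covariates/past", "models"))
```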

# single-model univariate forecast — context is a list-of-series
curl http://localhost:4000/predictalot/v1/univariate/forecast \
  -H "Authorization: Bearer $PREDICTALOT_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "chronos-2",
    "context": [[10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]],
    "config": {"horizon": 5}
  }'

# per-type ensemble — run every member in parallel, weighted mean + every individual forecast.
# Weight 0 disables that model entirely. Omitted entry defaults to 1.
curl http://localhost:4000/predictalot/v1/univariate/forecast/ensemble \
  -H "Authorization: Bearer $PREDICTALOT_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "context": [[10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20]],
    "config": {"horizon": 5},
    "weights": {"chronos-2": 2.0, "moirai-2": 1.0, "timesfm-2.5": 0}
  }'

# per-type model listing (which slugs implement this type + load state)
curl http://localhost:4000/predictalot/v1/univariate/models \
  -H "Authorization: Bearer $PREDICTALOT_AUTH_TOKEN"

The response contains a median point forecast and per-quantile arrays (or, for /v1/samples, raw sample paths). Models lazy-load on first call (~50-800MB HuggingFace snapshot) and auto-unload after PREDICTALOT_MODEL_IDLE_TIMEOUT (default 30m). Sundial runs in its own sidecar venv (transformers==4.40.1 pin) and is transparent over the wire.
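
The ensemble weighting rule (weight 0 disables a model, an omitted weight defaults to 1) comes down to a weighted mean per horizon step. The sketch below shows only that arithmetic on made-up numbers; how the service combines quantiles or sample paths is not covered, and `ensemble_mean` is a hypothetical name.

```python
def ensemble_mean(forecasts, weights=None):
    """Weighted mean per horizon step; weight 0 drops a model, a missing weight counts as 1."""
    weights = weights or {}
    active = {m: weights.get(m, 1.0) for m in forecasts if weights.get(m, 1.0) != 0}
    if not active:
        raise ValueError("every model has weight 0")
    total = sum(active.values())
    horizon = len(next(iter(forecasts.values())))
    return [
        sum(w * forecasts[m][t] for m, w in active.items()) / total
        for t in range(horizon)
    ]


# Made-up point forecasts for illustration, not real model output.
forecasts = {
    "chronos-2": [1.0, 2.0],
    "moirai-2": [4.0, 5.0],
    "timesfm-2.5": [100.0, 100.0],
}
print(ensemble_mean(forecasts, {"chronos-2": 2.0, "timesfm-2.5": 0}))  # [2.0, 3.0]
```

Here chronos-2 counts twice, moirai-2 once by default, and timesfm-2.5 is excluded, so the first step is (2·1 + 1·4) / 3 = 2.0.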

The same surface is exposed as 26 MCP tools: one per (type, model) cell (e.g. predictalot-forecast_univariate_chronos_2, predictalot-forecast_multivariate_moirai_2, predictalot-forecast_samples_toto_1), plus per-type ensembles (predictalot-forecast_<type>_ensemble) and per-type listings (predictalot-list_<type>_models). Dashes and dots in model slugs become underscores in tool names (sundial-base-128m → sundial_base_128m). Any function-calling model can run forecasts autonomously.
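
The slug-to-tool-name rule is a one-line substitution. A minimal sketch follows, with hypothetical helper names. The type token used for the covariate variants is not shown in this section, so take it from the MCP tool listing instead of guessing.

```python
import re


def tool_slug(model):
    """Replace dashes and dots in a model slug with underscores."""
    return re.sub(r"[-.]", "_", model)


def forecast_tool(kind, model):
    """Name of the MCP tool for one (type, model) cell."""
    return f"predictalot-forecast_{kind}_{tool_slug(model)}"


print(forecast_tool("univariate", "chronos-2"))  # predictalot-forecast_univariate_chronos_2
print(forecast_tool("samples", "toto-1"))        # predictalot-forecast_samples_toto_1
print(tool_slug("sundial-base-128m"))            # sundial_base_128m
```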

predictalot service reference · predictalot MCP tools


Email gateway (mailbox)

With MAILBOX=1, the /mailbox/ route fronts N email accounts driven by a single YAML config (MAILBOX_CONFIG). Stateless — every read hits the upstream IMAP server live. Bearer auth via MAILBOX_AUTH_TOKEN (also mirrored into the config's auth.tokens: list).

# list configured accounts
curl http://localhost:4000/mailbox/mailboxes \
  -H "Authorization: Bearer $MAILBOX_AUTH_TOKEN"

# unified inbox across all accounts (paginated)
curl "http://localhost:4000/mailbox/inbox?limit=10" \
  -H "Authorization: Bearer $MAILBOX_AUTH_TOKEN"

# search a specific account
curl "http://localhost:4000/mailbox/inbox?mailbox=work&subject=invoice&limit=5" \
  -H "Authorization: Bearer $MAILBOX_AUTH_TOKEN"

# send through a configured account (SMTP)
curl -X POST http://localhost:4000/mailbox/mailboxes/work/send \
  -H "Authorization: Bearer $MAILBOX_AUTH_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"to": ["someone@example.com"], "subject": "hello", "body_text": "from aigate"}'

# delete by uid
curl -X DELETE http://localhost:4000/mailbox/mailboxes/work/messages/<uid> \
  -H "Authorization: Bearer $MAILBOX_AUTH_TOKEN"
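
When inbox queries are built in code, the filters are ordinary query parameters. This sketch covers only the three parameters used in the curl examples (`mailbox`, `subject`, `limit`) and URL-encodes values such as subjects with spaces; `inbox_url` is a hypothetical helper.

```python
from urllib.parse import urlencode

BASE = "http://localhost:4000/mailbox"


def inbox_url(mailbox=None, subject=None, limit=None):
    """Build an /inbox query URL, leaving out any filter that is not set."""
    params = {"mailbox": mailbox, "subject": subject, "limit": limit}
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE}/inbox?{query}" if query else f"{BASE}/inbox"


print(inbox_url("work", "invoice", 5))
print(inbox_url(subject="Q3 report"))
```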

The MCP catalog is flat regardless of how many accounts you've configured — per-account tools take a mailbox parameter (name or address) instead of namespacing.

mailbox service reference · mailbox MCP tools


Telegram client (telethon)

With TELETHON=1, the /telethon/ route fronts a Telegram client using the official MTProto user-account API. Requires TELETHON_API_ID / TELETHON_API_HASH from my.telegram.org/apps and a string session in TELETHON_SESSION. Bearer auth via TELETHON_AUTH_KEY.

# who am I — verifies the session is authorized
curl http://localhost:4000/telethon/api/me \
  -H "Authorization: Bearer $TELETHON_AUTH_KEY"

# list dialogs
curl "http://localhost:4000/telethon/api/dialogs?limit=10" \
  -H "Authorization: Bearer $TELETHON_AUTH_KEY"

# send a message (markdown supported via parse_mode)
curl -X POST http://localhost:4000/telethon/api/messages \
  -H "Authorization: Bearer $TELETHON_AUTH_KEY" \
  -H "Content-Type: application/json" \
  -d '{"chat": "@username", "text": "**hello** from aigate", "parse_mode": "md"}'

# read recent messages from a chat
curl "http://localhost:4000/telethon/api/messages?chat=me&limit=5" \
  -H "Authorization: Bearer $TELETHON_AUTH_KEY"

Chat references accept @username, phone numbers, t.me/... links, or numeric IDs as strings. The same surface is exposed as MCP tools (telethon-send_message, telethon-get_dialogs, etc.) so any function-calling model can operate Telegram on your behalf.

Telethon service reference · Telethon MCP tools