- Decoupled LLM Layer: The harness never bundles or serves LLMs itself. It connects via HTTP to any OpenAI-compatible endpoint.
- Self-Host First: Everything containerized, network-aware, and privacy-focused.
- Frontend: Leptos (Rust → WASM) or Dioxus — Monaco Editor, live preview, terminal, model selector, chat.
- Backend Orchestrator: Rust (Axum) — lightweight API server, streaming proxy, WebSockets, project FS.
- AI Agent Core: Python microservices (FastAPI, optional) for complex tool calling / RAG / agents. Callable from Rust.
- Model Integration:
- Unified client supporting multiple endpoints.
- Config UI for adding LLM servers (URL, API key if needed, model list).
- Auto-discovery: mDNS for Ollama on LAN; fallback manual entry.
- Built-in proxy/forwarder for consistent streaming and logging.
- Persistence: SQLite (or LiteFS) for projects, settings, chat history.
- Code Sandbox: Docker-based (isolated) or Wasmtime/WASM. Optional integration with user's Proxmox/Docker host.
- Deployment: Docker Compose (multi-service if Python agents used). ARM64 native. Self-Host Wizard generates full stack configs including LLM connection examples.
- User runs harness container (exposes port 3000).
- Configures LLM endpoint(s) via UI (e.g.,
http://host.docker.internal:11434or LAN IP). - Prompts route through Rust proxy → chosen LLM → streaming back to browser.
- Homelab integrations use local network (Proxmox API, MQTT broker, Coolify, etc.).
- Detects generated app stack.
- Generates
docker-compose.ymlfor the app + instructions for connecting the harness to user's existing LLM service. - One-click Coolify/PocketBase/Proxmox deployment paths.