Real agentic apps you can read in one sitting, built on CUGA.
The model is rarely the hard part of an agent. The work is wiring up tools,
holding state together across a long task, adding guardrails, and growing from
one agent to several without a rewrite. CUGA — the Configurable
Generalist Agent, an open-source harness from IBM Research
(pip install cuga) — handles that plumbing, so the part you write shrinks to
a tool list and a system prompt. It plans before it acts, executes with a mix
of tool calls and generated code, holds intermediate state across a long run, and
self-corrects — the machinery behind #1 on AppWorld
and WebArena.
cuga-apps is what that feels like in practice: 35 apps in
cuga-apps/apps/ (21 in the polished showcase set), each a
single-file FastAPI server wrapping one CugaAgent with a tool list and a system
prompt. The right-hand panel of every app shows live structured state the agent
pushes as it works. If you've written a FastAPI route, you can read every line.
Heads up: the real catalog is the inner
cuga-apps/ directory, not the repo root. Apps live in cuga-apps/apps/.
| If you… | …start here |
|---|---|
| want to try a CUGA agent without building one | the live gallery, or the one-command Docker stack |
| are building an agent app | clone the app closest to your idea and edit its tool list + prompt (Quick start) |
| want to learn the patterns — MCP + inline tools, multi-agent, RAG, web grounding | read one app end to end; they all share the same skeleton |
| just want a working CugaAgent example to copy | Recipe Composer (inline tools only, ~one file) |
Every app, behind a launch button, no install:
cuga-agent-apps.1gxwxi8kos9y.us-east.codeengine.appdomain.cloud ↗
Filter by ✦ Showcase for the polished set.
Prerequisite — Python 3.13.
cuga supports >=3.10,<3.14 (3.14 is unsupported). The cleanest setup is a uv venv on 3.13. You also need credentials for one LLM provider. The example below uses watsonx; CUGA also speaks OpenAI, Anthropic, Ollama, and LiteLLM (set LLM_PROVIDER accordingly, or point at a local Ollama model for zero API cost).
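To confirm an interpreter falls inside that range before installing, a check along these lines may help. It is a minimal standard-library sketch: the bounds are copied from the range quoted above, and the names `python_supported`, `MIN_VERSION`, and `MAX_VERSION` are illustrative rather than anything cuga ships.

```python
import sys

# Bounds copied from the range quoted above (>=3.10,<3.14); names are illustrative.
MIN_VERSION = (3, 10)  # inclusive
MAX_VERSION = (3, 14)  # exclusive


def python_supported(version=None):
    """Return True when (major, minor) falls inside the supported range."""
    major, minor = (version or sys.version_info)[:2]
    return MIN_VERSION <= (major, minor) < MAX_VERSION


if __name__ == "__main__":
    label = ".".join(map(str, sys.version_info[:2]))
    print(f"Python {label}:", "ok" if python_supported() else "unsupported by cuga")
```

Run it with the interpreter you plan to hand to uv; it prints whether that version is inside the range.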
Recipe Composer is the easiest first run: its tools are plain Python functions, so there are no MCP servers and no API keys beyond your LLM provider.
git clone https://github.com/cuga-project/cuga-apps
cd cuga-apps/cuga-apps/apps/recipe_composer
uv venv --python 3.13 .venv && source .venv/bin/activate
pip install -r requirements.txt cuga
export LLM_PROVIDER=watsonx
export LLM_MODEL=meta-llama/llama-3-3-70b-instruct
export WATSONX_APIKEY=<your-key>
export WATSONX_PROJECT_ID=<your-project-id> # or WATSONX_SPACE_ID
python main.py --port 28820
# open http://127.0.0.1:28820

AGENT_SETTING_CONFIG auto-defaults to settings.<provider>.toml, so you don't
set it. Try: "I have chicken, rice, and broccoli — what can I cook in under 25
minutes?" The pantry, diet, and recipe cards on the right update as the agent
works.
Movie Recommender adds the shared knowledge MCP server (Wikipedia / arXiv /
Semantic Scholar) to its inline tools. Rather than run MCP locally, point it at
the hosted set with CUGA_TARGET=ce — those servers already hold their own keys,
so your laptop still only needs the LLM provider:
cd cuga-apps/cuga-apps/apps/movie_recommender
uv venv --python 3.13 .venv && source .venv/bin/activate # or reuse the one above
pip install -r requirements.txt cuga
# LLM provider vars from step 1 still apply
export CUGA_TARGET=ce # use the hosted MCP servers; no extra keys needed
python main.py --port 28806
# open http://127.0.0.1:28806

Try: "I loved Inception and Arrival — recommend 5 cerebral sci-fi films." Same
shape as Recipe Composer; the only difference is one extra line in _make_tools()
(load_tools(["knowledge"])) and the prompt.
Every app folder has its own requirements.txt and README.md with example
prompts — the run pattern is always cd cuga-apps/apps/<app> && pip install -r requirements.txt cuga && python main.py --port <port> (ports in the
reference table).
One command brings up the umbrella UI + all 21 showcase apps + the MCP servers they need in a single container on port 8080:
cd build
cp .env.example .env # set your LLM provider + key; add TAVILY_API_KEY /
# OPENTRIPMAP_API_KEY / ALPHA_VANTAGE_API_KEY for the
# apps that use them
docker compose up --build # first build is large (cuga + Chromium + MCP deps)
# open http://localhost:8080

The UI loads at /; click any showcase app: it opens at /a/<app>/ and works
end-to-end (chat → agent → live panel). A missing optional key just degrades the
apps that need it (that tool returns a clear "missing key" error) — everything
still comes up. Details in build/README.md.
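The "missing key" behaviour can be pictured with a small sketch. This is not the repo's code: `web_search_stub`, the `missing_key` code string, and the error wording are hypothetical. Only the TAVILY_API_KEY variable name and the ok/code/error shape come from this README.

```python
import os


def web_search_stub(query, env=None):
    """Hypothetical tool: declare a missing key instead of raising."""
    env = os.environ if env is None else env
    key = env.get("TAVILY_API_KEY")
    if not key:
        # A declared failure the agent can read and route around.
        return {
            "ok": False,
            "code": "missing_key",  # hypothetical code string
            "error": "TAVILY_API_KEY is not set; web search is unavailable.",
        }
    # A real tool would call the search API here; this stub only echoes the query.
    return {"ok": True, "data": {"query": query, "results": []}}
```

Because the check happens inside the tool at call time, the app itself can start without the key and every other tool keeps working.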
Once you've read one app you've read them all — they share the skeleton; only the tools and the prompt change. They fan out across families, so whatever you're building, one already exercises the piece you need:
- Research & knowledge — Paper Scout (arXiv + Semantic Scholar, ranked by citations), Wiki Dive and Web Researcher (cited synthesis), YouTube Research (from transcripts), Webpage Summarizer, GitHub Trending, AI Labs News.
- Everyday & local — Travel Planner (multi-day itinerary), City Beat (daily city briefing), Recipe Composer (pantry-driven), Movie Recommender, Hiking Research, Find a Doctor (OSM + reviews), Meetup Finder (browser-driven events), Smart Todo.
- Content & pipelines — Newsletter Intelligence (RSS → scored daily digest), Architecture Diagram, Deck Forge, API Doc Gen.
- Documents & media Q&A — Box Q&A, Drop Summarizer, Video Q&A, Voice Journal — ingest PDFs / audio / video and answer over them with RAG.
- Ops & alerts — Server Monitor (live system metrics + thresholds), Stock & Crypto Alert (market prices).
- IBM stack — IBM Cloud Advisor (recommends real catalog services), IBM Docs Q&A, IBM What's New.
- Developer & eval — Code Reviewer, BIRD Invocable API.
- Multi-agent — Ouroboros, a seven-specialist lead-gen system under a CugaSupervisor; open this one for the multi-agent shape.
The full, filterable list with launch buttons is in the live gallery.
The 21 showcase apps, with the MCP servers each uses and a local port to run it
on. Run pattern: cd cuga-apps/apps/<app> && pip install -r requirements.txt cuga && python main.py --port <port>.
| App | MCP servers | Port | Notes |
|---|---|---|---|
| Recipe Composer | — | 28820 | inline tools only; no keys beyond LLM |
| Movie Recommender | knowledge | 28806 | |
| Travel Planner | web, knowledge, geo | 28090 | |
| City Beat | geo, web, knowledge, finance | 28821 | 4 MCPs + 7 inline session tools |
| Wiki Dive | knowledge | 28809 | |
| Paper Scout | knowledge | 28808 | |
| Web Researcher | web | 28798 | |
| Webpage Summarizer | web | 28071 | |
| YouTube Research | web | 28803 | |
| Newsletter Intelligence | web | 28793 | |
| Architecture Diagram | web | 28804 | |
| Hiking Research | geo, web | 28805 | |
| Find a Doctor | — | 28825 | keyless; OSM + DuckDuckGo reviews |
| GitHub Trending | — | 28823 | keyless; optional GITHUB_TOKEN raises rate limit |
| AI Labs News | — | 28824 | keyless; reads each lab's RSS/Atom feed |
| Meetup Finder | — (+ Playwright) | 28826 | browser-driven; python -m playwright install chromium first |
| Server Monitor | local | 28767 | optional CPU_THRESHOLD / RAM_THRESHOLD |
| Stock & Crypto Alert | finance | 28801 | Alpha Vantage key pasted in browser per session |
| IBM Cloud Advisor | web | 28812 | |
| IBM Docs Q&A | web | 28813 | |
| Ouroboros | — | 28822 | multi-agent (supervisor + 7 specialists); give it APP_MEM=4G |
Apps using an MCP server need either a local MCP server or export CUGA_TARGET=ce
(see Shared MCP servers).
The whole agent is four arguments:
return CugaAgent(
    model=create_llm(provider=os.getenv("LLM_PROVIDER"), model=os.getenv("LLM_MODEL")),
    tools=_make_tools(),                # MCP tools + inline @tools
    special_instructions=_SYSTEM,       # the procedure, written as ordered steps
    cuga_folder=str(_DIR / ".cuga"),    # state + policies live here
)

Two conventions do the heavy lifting:
- MCP for generic capabilities, inline @tools for app state. Stateless primitives (web search, Wikipedia/arXiv, geocoding, weather, finance quotes) come from shared MCP servers via load_tools(["web", ...]), so you host nothing. Anything specific to the app is a normal Python function whose docstring tells the agent when to call it. Concatenate both: tools=mcp_tools + [my_tool, ...].
- Every tool returns the same envelope: {"ok": true, "data": {...}} on success, {"ok": false, "code": "...", "error": "..."} on failure. CUGA's planner recovers from a declared failure ("geocoding returned nothing — skip that section and keep going") and derails on a raw stack trace. Boring, but load-bearing.
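One way to hold every tool to the envelope rule is a decorator. The sketch below assumes nothing about CUGA's internals: `enveloped`, `ToolError`, `geocode_city`, and the `not_found` / `internal_error` code strings are hypothetical names, and a real inline tool would also carry the @tool decorator mentioned above.

```python
import functools


class ToolError(Exception):
    """Hypothetical exception for failures a tool wants to declare."""

    def __init__(self, code, message):
        super().__init__(message)
        self.code = code


def enveloped(fn):
    """Wrap a function so it always returns the ok/data or ok/code/error envelope."""

    @functools.wraps(fn)
    def wrapper(*args, **kwargs):
        try:
            return {"ok": True, "data": fn(*args, **kwargs)}
        except ToolError as exc:
            return {"ok": False, "code": exc.code, "error": str(exc)}
        except Exception as exc:  # never hand a raw stack trace to the planner
            return {
                "ok": False,
                "code": "internal_error",
                "error": f"{type(exc).__name__}: {exc}",
            }

    return wrapper


# Toy lookup table standing in for a real geocoding service.
_KNOWN = {"paris": (48.8566, 2.3522)}


@enveloped
def geocode_city(name: str) -> dict:
    """Look up coordinates for a city; call this before any weather or map tool."""
    try:
        lat, lon = _KNOWN[name.lower()]
    except KeyError:
        raise ToolError("not_found", f"geocoding returned nothing for {name!r}") from None
    return {"city": name, "lat": lat, "lon": lon}
```

Calling `geocode_city("Atlantis")` yields a failure envelope with code `not_found` rather than an exception, which is the kind of declared failure the planner can route around.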
State is a per-thread_id Python dict that only the agent writes to, through its
tools; the live panel polls it and redraws the moment a tool fires. No database.
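That state model fits in a few lines. The following is a sketch under stated assumptions: `_STATE`, `add_pantry_item`, and `snapshot` are hypothetical names, the pantry/recipes keys are borrowed loosely from the Recipe Composer panel, and the lock is an added precaution not mentioned above. The apps' own implementation may differ.

```python
import copy
import threading
from collections import defaultdict

# One dict per thread_id; names here are illustrative, not taken from the repo.
_STATE = defaultdict(lambda: {"pantry": [], "recipes": []})
_LOCK = threading.Lock()


def add_pantry_item(thread_id: str, item: str) -> dict:
    """Tool-style writer: the only path by which state changes."""
    with _LOCK:
        pantry = _STATE[thread_id]["pantry"]
        if item not in pantry:
            pantry.append(item)
        return {"ok": True, "data": {"pantry": list(pantry)}}


def snapshot(thread_id: str) -> dict:
    """Read-only copy for the panel's poll; mutating it cannot touch the store."""
    with _LOCK:
        return copy.deepcopy(_STATE[thread_id])
```

A polling endpoint would return `snapshot(thread_id)` as JSON. Handing out a deep copy keeps the "only the agent writes" property even if the reader mutates what it receives.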
There is a single self-contained spec for building a CUGA app from scratch against the hosted MCP servers, with no need to clone the rest of this repo: cuga_external_app_spec.md.
It includes the LLM factory, MCP bridge, main.py / ui.py templates, the
tool-envelope rule, and a definition-of-done checklist. Building inside the
repo? See
docs/HOW_TO_BUILD_AN_APP_FAST.md
and docs/ADDING_AN_APP.md.
Hand the spec to any capable LLM coding agent:
You are an expert in creating Cuga web applications using Cuga Agent.
Follow the spec here: <path to> cuga_external_app_spec.md
Create a new web app to <what you want it to do> that is powered by Cuga Agent.
Replace <…> with your idea — "track my reading list and recommend the next
book," "summarise the GitHub PRs I'm reviewing today," "a daily briefing for
any city." A few apps here were generated exactly this way — regular enough for a
model to reproduce means regular enough for you to learn.
apps/city_beat/ is the cleanest worked example: 4
hosted MCP servers (geo, web, knowledge, finance) + 7 inline session
tools. See main.py:104 for the
load_tools([...]) call and
main.py:108-225 for the inline
@tool defs.
Generic capabilities live in shared MCP servers so apps don't reimplement them.
Apps reach them with load_tools(["web", ...]). Run them locally, or use the
hosted set with export CUGA_TARGET=ce (their upstream keys already live
server-side, so your laptop only needs an LLM provider):
| Server | Capabilities |
|---|---|
| web | web_search (Tavily), fetch_webpage, fetch_feed, YouTube transcripts |
| knowledge | Wikipedia, arXiv, Semantic Scholar |
| geo | geocode, get_weather, find_hikes, search_attractions |
| finance | get_crypto_price (CoinGecko), get_stock_quote (Alpha Vantage) |
| code | check_python_syntax, extract_code_metrics, detect_language |
| local | system metrics, processes, disk usage, audio transcription |
| text | chunk_text, count_tokens, extract_text (PDF/DOCX/HTML → markdown) |
An 8th server, invocable_apis, runs local-only (it needs BIRD benchmark data
bind-mounted from the host). The bridge in
apps/_mcp_bridge.py resolves each server's URL
automatically; set MCP_<NAME>_URL to override a single one.
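The override rule can be pictured as a lookup that prefers the per-server variable. This illustrates the precedence only and is not the code in apps/_mcp_bridge.py: the `resolve_mcp_url` name and the default URLs below are placeholders, and only the MCP_<NAME>_URL variable pattern comes from this README.

```python
import os

# Placeholder defaults for illustration; the real bridge computes its own URLs.
_DEFAULTS = {
    "web": "http://localhost:9001/mcp",
    "knowledge": "http://localhost:9002/mcp",
}


def resolve_mcp_url(name, env=None):
    """Return MCP_<NAME>_URL when set, otherwise the default for that server."""
    env = os.environ if env is None else env
    override = env.get(f"MCP_{name.upper()}_URL")
    if override:
        return override
    try:
        return _DEFAULTS[name]
    except KeyError:
        raise ValueError(f"unknown MCP server: {name!r}") from None
```

Under this scheme, exporting MCP_WEB_URL redirects only the web server; every other server keeps its automatically resolved address.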
cuga-apps/
├── README.md you are here
├── cuga_external_app_spec.md self-contained spec — point an LLM at this to build a new app
├── build/ all-in-one image (UI + showcase apps + MCP on port 8080) + docker compose
└── cuga-apps/ the umbrella repo: 35 apps, 8 MCP servers, the umbrella UI
├── apps/ one folder per app (main.py + ui.py + requirements.txt)
├── mcp_servers/ MCP servers (7 hosted-capable + invocable_apis)
├── ui/ the umbrella SPA
└── docs/
├── HOW_TO_BUILD_AN_APP_FAST.md 10-minute build guide
├── ADDING_AN_APP.md register a new app end-to-end
└── cuga_app_builder_spec.md full in-repo build spec (MCP + inline)
Apache License 2.0 — see LICENSE and NOTICE.
Contributions welcome; see CONTRIBUTING.md.