CoCo v2.0.0 — initial public release
The first public release of CoCo, a reference implementation of a
natural-language cohort copilot on Databricks. Clone the repository and deploy it to
your own Databricks workspace in about 30 minutes.
What's in the box
- DSPy ReAct agent with native tool calling on Foundation Model API
- Mosaic AI Agent Framework serving endpoint with typed resource bindings
- Databricks Apps (FastAPI + HTMX + SSE) front-end
- Lakebase (managed Postgres) for per-user session state, with short-lived OAuth credentials and a pool-rotation pattern
- MLflow Managed Prompt Registry with @production alias and evolutionary prompt optimization via mlflow.genai.optimize_prompts + GEPA
- Unity Catalog for data, registered models, prompts, and UC volumes, with everything namespaced per workshop attendee
- Databricks Vector Search for clinical knowledge RAG, with an auto-synced index over the knowledge chunks table
- End-to-end preflight check that exercises CREATE SCHEMA, CAN_QUERY on the LLM endpoint, and the Prompt Registry preview flag so deploys do not fail 20 minutes into setup
- Teardown notebook that removes every per-user resource idempotently. Two attendees running teardown at the same time touch disjoint resources and cannot interfere.
Multi-user isolation
Every per-attendee resource (schema, Lakebase instance, VS endpoint and index, agent endpoint, app, prompts, registered model, MLflow experiment) is auto-namespaced from the workspace username. The only shared resource an attendee touches is the UC catalog itself, which is inherently admin-managed.
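To make the namespacing idea concrete, here is a rough sketch of deriving resource names from a workspace username. The slug rules, the `coco_` prefix, and both function names are assumptions made for illustration; the repo's real naming scheme may differ.

```python
import re
from typing import Dict


def attendee_namespace(username: str, max_len: int = 32) -> str:
    """Turn a workspace username into a lowercase slug (hypothetical scheme)."""
    local_part = username.split("@", 1)[0].lower()
    # Collapse every run of characters other than a-z and 0-9 into one underscore.
    slug = re.sub(r"[^a-z0-9]+", "_", local_part).strip("_")
    if not slug:
        raise ValueError(f"cannot derive a namespace from {username!r}")
    return slug[:max_len].rstrip("_")


def resource_names(username: str, catalog: str) -> Dict[str, str]:
    """Build per-attendee resource names; the prefixes are illustrative only."""
    ns = attendee_namespace(username)
    return {
        "schema": f"{catalog}.coco_{ns}",
        # Endpoint-style names here use hyphens instead of underscores.
        "agent_endpoint": f"coco-agent-{ns.replace('_', '-')}",
        "experiment": f"/Users/{username}/coco-{ns}",
    }


names = resource_names("Jane.Doe@example.com", "main")
```

A scheme this simple can collide (for example, `jane.doe@a.com` and `jane_doe@b.com` map to the same slug), so a real deployment would need a tie-breaker such as the explicit `unique_id` variable passed at deploy time.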
Security posture
See docs/SECURITY.md for the full story. Short version: the guardrails in src/coco/agent/guardrails.py are defense-in-depth on top of the app's service principal having read-only, schema-scoped UC grants. The repo ships an adversarial test suite covering multi-statement injection, nested block comments, escaped quotes, and CTE-based bypass attempts.
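The attack classes listed above can be illustrated with a small read-only SQL check. This is a sketch under stated assumptions, not the contents of src/coco/agent/guardrails.py: the function names and keyword list are hypothetical, and the scanner assumes a dialect where block comments nest and where both backslashes and doubled quotes escape a quote inside a string. If the engine parses comments or strings differently, a check like this can be bypassed, which is why it only makes sense as a layer on top of read-only grants.

```python
import re

# Write/DDL keywords to reject anywhere in the statement (illustrative, not exhaustive).
_FORBIDDEN = re.compile(
    r"\b(insert|update|delete|merge|drop|alter|create|truncate|grant|revoke)\b",
    re.IGNORECASE,
)


def strip_literals_and_comments(sql: str) -> str:
    """Replace comments and quoted strings with spaces so checks see only code."""
    out = []
    i, n = 0, len(sql)
    while i < n:
        if sql.startswith("--", i):            # line comment: skip to end of line
            while i < n and sql[i] != "\n":
                i += 1
            out.append(" ")
        elif sql.startswith("/*", i):          # block comment, tracking nesting depth
            depth, i = 1, i + 2
            while i < n and depth > 0:
                if sql.startswith("/*", i):
                    depth, i = depth + 1, i + 2
                elif sql.startswith("*/", i):
                    depth, i = depth - 1, i + 2
                else:
                    i += 1
            if depth > 0:
                raise ValueError("unterminated block comment")
            out.append(" ")
        elif sql[i] in "'\"":                  # quoted string
            quote, i = sql[i], i + 1
            while True:
                if i >= n:
                    raise ValueError("unterminated quoted string")
                if sql[i] == "\\":             # backslash escape: skip next char
                    i += 2
                elif sql[i] == quote and sql.startswith(quote, i + 1):
                    i += 2                     # doubled quote is an escaped quote
                elif sql[i] == quote:
                    i += 1
                    break
                else:
                    i += 1
            out.append(" ")
        else:
            out.append(sql[i])
            i += 1
    return "".join(out)


def is_read_only_query(sql: str) -> bool:
    """Accept a single SELECT/WITH statement containing no write keywords."""
    try:
        code = strip_literals_and_comments(sql).strip()
    except ValueError:
        return False                           # malformed input is rejected
    if code.endswith(";"):
        code = code[:-1].rstrip()              # allow one trailing semicolon
    if not code or ";" in code:
        return False                           # empty, or more than one statement
    if not re.match(r"(select|with)\b", code, re.IGNORECASE):
        return False
    return _FORBIDDEN.search(code) is None     # catches writes hidden inside CTEs
```

The keyword check is deliberately conservative: a legitimate query that uses a column literally named `update` would be rejected, which is usually an acceptable trade-off for a defense-in-depth layer.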
Deploy
git clone https://github.com/debu-sinha/coco-reference.git
cd coco-reference
python scripts/preflight_check.py -p PROFILE --warehouse-id WH_ID --catalog CATALOG
databricks bundle deploy -t demo -p PROFILE --var unique_id=YOUR_ID --var warehouse_id=WH_ID --var catalog=CATALOG
databricks bundle run setup_workspace -t demo -p PROFILE --var unique_id=YOUR_ID --var warehouse_id=WH_ID --var catalog=CATALOG
Docs
- README: quick start
- docs/PERMISSIONS.md: every permission the setup job needs
- docs/SECURITY.md: threat model and limits
- docs/ARCHITECTURE.md: full request flow and design decisions
- docs/design/apps-mosaic-ai-agent-reference.md: canonical field reference capturing every Databricks Apps + Mosaic AI gotcha surfaced during development
- docs/cost-attribution/: per-workload cost tracking queries and a draft tagging policy