The AI-Q blueprint uses a multi-stage Dockerfile (deploy/Dockerfile) that produces two build targets: a development image with the CLI and a lean release image for production.
```mermaid
graph TD
    A["nvcr.io/nvidia/base/ubuntu:jammy-20251013"] -->|Stage 1| B["builder"]
    B -->|Install CLI + debug UI| C["dev-builder"]
    C -->|Copy /app| D["dev"]
    B -->|Copy /app| E["release"]
    F["nvcr.io/nvidia/distroless/python:3.12-v3.5.3"] -->|Runtime base| D
    F -->|Runtime base| E
    style A fill:#e0e0e0,stroke:#666
    style F fill:#e0e0e0,stroke:#666
    style B fill:#fff3cd,stroke:#856404
    style C fill:#fff3cd,stroke:#856404
    style D fill:#d1ecf1,stroke:#0c5460
    style E fill:#d4edda,stroke:#155724
```
The build consists of four stages:
| Stage | Base image | Purpose |
|---|---|---|
| `builder` | `nvcr.io/nvidia/base/ubuntu:jammy-20251013` | Installs Python 3.12, system dependencies, and all application packages. |
| `dev-builder` | `builder` | Extends `builder` with the CLI and debug UI packages. |
| `dev` | `nvcr.io/nvidia/distroless/python:3.12-v3.5.3` | Development runtime -- copies from `dev-builder`. |
| `release` | `nvcr.io/nvidia/distroless/python:3.12-v3.5.3` | Production runtime -- copies from `builder` (no CLI). |
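The stage relationships in the table can be modeled as a small parent map. The sketch below is illustrative only: `STAGE_PARENT` and `stages_for` are hypothetical names, not part of the blueprint, and the map simply restates the table. It shows why a `release` build never touches `dev-builder` (with BuildKit, stages that the requested target does not depend on are typically skipped).

```python
from typing import Dict, List, Optional

# Each stage maps to the stage it extends or copies /app from.
# The builder has no parent stage; it starts from the Ubuntu base image.
STAGE_PARENT: Dict[str, Optional[str]] = {
    "builder": None,
    "dev-builder": "builder",
    "dev": "dev-builder",
    "release": "builder",
}


def stages_for(target: str) -> List[str]:
    """Return the stages needed to produce `target`, in build order."""
    if target not in STAGE_PARENT:
        raise ValueError(f"unknown build target: {target!r}")
    chain: List[str] = []
    stage: Optional[str] = target
    while stage is not None:
        chain.append(stage)
        stage = STAGE_PARENT[stage]
    chain.reverse()  # walk was target -> builder; build order is the opposite
    return chain


if __name__ == "__main__":
    print(stages_for("dev"))      # ['builder', 'dev-builder', 'dev']
    print(stages_for("release"))  # ['builder', 'release']
```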
The builder stage handles all compilation and package installation:
- System dependencies -- Installs build tools, `curl`, `git`, and Python 3.12 from the `deadsnakes` PPA.
- Virtual environment -- Creates a venv at `/app/.venv` using `uv`.
- Dependency installation -- Runs `uv sync --frozen --no-dev --no-install-workspace` to install locked dependencies.
- Workspace packages -- Installs application packages with `uv pip install -e` (the root package uses `--no-deps`):
  - Root workspace package (`aiq-agent`) using `uv pip install --no-deps -e .`
  - `sources/google_scholar_paper_search` -- Google Scholar search
  - `sources/tavily_web_search` -- Tavily web search
  - `sources/exa_web_search` -- Exa web search
  - `sources/knowledge_layer[all]` -- Knowledge layer with all extras
  - `frontends/aiq_api` -- FastAPI frontend
  - `psycopg[binary]>=3.0.0` -- PostgreSQL driver (psycopg v3, installed non-editable)
- File setup -- Makes startup scripts executable, creates `/app/data`, and sets ownership to UID 1000.
Only runtime scripts (deploy/entrypoint.py and deploy/start_web.py) are copied from the deploy/ directory. The full deploy/ directory is excluded to avoid leaking .env files, Helm charts, compose files, or other development artifacts into the image.
The development image extends the builder with additional packages:
- `frontends/cli` -- Command-line interface (`nat` CLI commands)
- `frontends/debug` -- Debug UI for local development
```bash
docker build --target dev -t aiq:dev -f deploy/Dockerfile .
```

Characteristics of the dev image:

- All application packages plus CLI and debug UI.
- Python 3.12 runtime from the NVIDIA distroless base image.
- Startup scripts (`entrypoint.py`, `start_web.py`).
- Runs as non-root user (UID 1000).
The release image is built from the base builder stage (no CLI or debug packages):
```bash
docker build --target release -t aiq:prod -f deploy/Dockerfile .
```

Characteristics of the release image:

- Application packages only (no CLI, no debug UI).
- `APP_ENV=production` set by default.
- Same non-root user and distroless base as the dev image.
| Target | Command | Use case |
|---|---|---|
| `dev` | `docker build --target dev -t aiq:dev -f deploy/Dockerfile .` | Local development, testing, CLI access |
| `release` | `docker build --target release -t aiq:prod -f deploy/Dockerfile .` | Production deployment, CI/CD |
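For build scripts that need to pick a target programmatically, the two commands can be generated from one helper. This is a minimal sketch: `build_command` and `DEFAULT_TAGS` are hypothetical names introduced for illustration, and the tags mirror the ones used in the table.

```python
import shlex
from typing import List

# Image tags used by the documented build commands.
DEFAULT_TAGS = {"dev": "aiq:dev", "release": "aiq:prod"}


def build_command(
    target: str,
    dockerfile: str = "deploy/Dockerfile",
    context: str = ".",
) -> List[str]:
    """Return the `docker build` argument list for a supported target."""
    if target not in DEFAULT_TAGS:
        raise ValueError(f"unsupported target {target!r}; expected one of {sorted(DEFAULT_TAGS)}")
    return [
        "docker", "build",
        "--target", target,
        "-t", DEFAULT_TAGS[target],
        "-f", dockerfile,
        context,
    ]


if __name__ == "__main__":
    # Print the command instead of running it, so the sketch has no side effects.
    print(shlex.join(build_command("release")))
```

Passing the returned list to `subprocess.run(..., check=True)` would execute the build; an argument list avoids shell quoting problems.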
When using Docker Compose, the build target is controlled by the BUILD_TARGET variable:
```bash
# Dev build (default)
cd deploy/compose
docker compose --env-file ../.env -f docker-compose.yaml up -d --build

# Release build
BUILD_TARGET=release docker compose --env-file ../.env -f docker-compose.yaml up -d --build
```

| Image | Used in | Purpose |
|---|---|---|
| `nvcr.io/nvidia/base/ubuntu:jammy-20251013` | Builder stages | Full Ubuntu with package managers for compilation. |
| `nvcr.io/nvidia/distroless/python:3.12-v3.5.3` | Runtime stages (`dev`, `release`) | Minimal NVIDIA distroless image with Python 3.12. No shell, no package manager -- reduces attack surface. |
The container entrypoint is python /app/deploy/entrypoint.py, which orchestrates the full startup sequence. There are two scripts involved:
entrypoint.py -- Dask cluster launcher
entrypoint.py is the Docker ENTRYPOINT. It performs the following:
- Argument pass-through -- If command-line arguments are provided, it `exec`s them directly (useful for running one-off commands in the container).
- Dask scheduler -- Starts a `dask-scheduler` process on the configured port (default `8786`) with a dashboard on port `8787`.
- Wait for scheduler -- Polls the scheduler with a Dask `Client` for up to 30 attempts (1 second apart).
- Dask worker -- Starts a `dask-worker` process connected to the scheduler.
- Environment variable -- Sets `NAT_DASK_SCHEDULER_ADDRESS` so the web server can submit background jobs.
- Web server -- Launches `start_web.py` as a subprocess.
- Signal handling -- Installs SIGTERM/SIGINT handlers that gracefully shut down all three processes (web, worker, scheduler).
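The wait-and-shutdown pattern in the steps above can be sketched as follows. This is not the actual `entrypoint.py`: `wait_until_ready` and `shutdown` are hypothetical helpers, and the readiness probe is passed in as a plain callable so the sketch runs without Dask installed (the real script is described as probing with a Dask `Client`).

```python
import subprocess
import time
from typing import Callable, Sequence


def wait_until_ready(
    probe: Callable[[], bool],
    attempts: int = 30,
    delay: float = 1.0,
) -> bool:
    """Call `probe` until it returns True or `attempts` calls have been made."""
    for attempt in range(1, attempts + 1):
        try:
            if probe():
                return True
        except OSError:
            pass  # e.g. connection refused: the scheduler is not listening yet
        if attempt < attempts:
            time.sleep(delay)
    return False


def shutdown(processes: Sequence[subprocess.Popen], timeout: float = 10.0) -> None:
    """Terminate processes in the given order, killing any that ignore SIGTERM."""
    for proc in processes:
        if proc.poll() is None:  # still running
            proc.terminate()
    for proc in processes:
        try:
            proc.wait(timeout=timeout)
        except subprocess.TimeoutExpired:
            proc.kill()
            proc.wait()
```

A signal handler registered with `signal.signal(signal.SIGTERM, ...)` could call `shutdown([web, worker, scheduler])`, matching the order listed above so that dependents stop before the scheduler.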
Environment variables:
| Variable | Default | Description |
|---|---|---|
| `CONFIG_FILE` | `/app/configs/config_web_default_llamaindex.yml` | Path to the NeMo Agent Toolkit workflow config. |
| `HOST` | `0.0.0.0` | Bind address for the web server. |
| `PORT` | `8000` | Bind port for the web server. |
| `DASK_SCHEDULER_PORT` | `8786` | Dask scheduler port. |
| `DASK_NWORKERS` | `1` | Number of Dask workers. |
| `DASK_NTHREADS` | `4` | Threads per Dask worker. |
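One way to read these variables with the documented defaults is a small settings object. This is a sketch, not the code in `entrypoint.py`; the `EntrypointSettings` class is a hypothetical name, and only the variable names and defaults come from the table.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class EntrypointSettings:
    config_file: str
    host: str
    port: int
    dask_scheduler_port: int
    dask_nworkers: int
    dask_nthreads: int

    @classmethod
    def from_env(cls, env: Mapping[str, str] = os.environ) -> "EntrypointSettings":
        """Build settings from `env`, falling back to the documented defaults."""
        return cls(
            config_file=env.get(
                "CONFIG_FILE", "/app/configs/config_web_default_llamaindex.yml"
            ),
            host=env.get("HOST", "0.0.0.0"),
            # int() raises ValueError on malformed input, failing fast at startup.
            port=int(env.get("PORT", "8000")),
            dask_scheduler_port=int(env.get("DASK_SCHEDULER_PORT", "8786")),
            dask_nworkers=int(env.get("DASK_NWORKERS", "1")),
            dask_nthreads=int(env.get("DASK_NTHREADS", "4")),
        )


if __name__ == "__main__":
    print(EntrypointSettings.from_env())
```

Accepting any mapping rather than reading `os.environ` directly keeps the parsing testable, for example `EntrypointSettings.from_env({"DASK_NTHREADS": "8"})`.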
start_web.py bypasses the standard nat serve command to avoid an asyncio event loop conflict between the NeMo Agent Toolkit runtime and FastAPI/Starlette's anyio event loop management.
It performs the following:
- Configure logging -- Sets up structured logging matching `nat serve` behavior.
- Load config -- Validates the NeMo Agent Toolkit YAML config using `nat.runtime.loader.load_config()`.
- Set environment -- Writes `NAT_CONFIG_FILE` and `NAT_FRONT_END_WORKER` so NeMo Agent Toolkit's FastAPI app can find the config and worker class.
- Run uvicorn -- Starts uvicorn directly with `loop="asyncio"`, letting uvicorn create and manage its own event loop.
Environment variables:
| Variable | Default | Description |
|---|---|---|
| `CONFIG_FILE` | `/app/configs/config_web_frag.yml` | Path to the NeMo Agent Toolkit workflow config. |
| `HOST` | `0.0.0.0` | Bind address. |
| `PORT` | `8000` | Bind port. |
| `LOG_LEVEL` | `INFO` | Logging verbosity. |
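The overall shape of such a launcher might look like the sketch below. It is an illustration under stated assumptions, not the contents of `start_web.py`: `APP_IMPORT_PATH` and `WORKER_CLASS` are placeholder values (the real import path and worker class are not given here), `uvicorn_kwargs` and `main` are hypothetical names, and the config validation and logging setup steps are omitted.

```python
import os
from typing import Any, Dict, Mapping

# Placeholders only: substitute the real FastAPI app factory and worker class.
APP_IMPORT_PATH = "example_package.app:create_app"  # hypothetical
WORKER_CLASS = "example_package.worker.Worker"  # hypothetical


def uvicorn_kwargs(env: Mapping[str, str]) -> Dict[str, Any]:
    """Translate the documented environment variables into uvicorn.run() arguments."""
    return {
        "host": env.get("HOST", "0.0.0.0"),
        "port": int(env.get("PORT", "8000")),
        # uvicorn expects lowercase level names such as "info".
        "log_level": env.get("LOG_LEVEL", "INFO").lower(),
        # Ask uvicorn to create and own a plain asyncio event loop.
        "loop": "asyncio",
    }


def main() -> None:
    import uvicorn  # imported lazily so uvicorn_kwargs() is usable without it

    config_file = os.environ.get("CONFIG_FILE", "/app/configs/config_web_frag.yml")
    # Export the values the FastAPI app is described as reading at import time.
    os.environ["NAT_CONFIG_FILE"] = config_file
    os.environ["NAT_FRONT_END_WORKER"] = WORKER_CLASS
    uvicorn.run(APP_IMPORT_PATH, factory=True, **uvicorn_kwargs(os.environ))
```

Calling `main()` starts the server. Passing the app as an import string with `factory=True` defers app creation until uvicorn's loop exists, which is the kind of ordering the event loop workaround relies on.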