feat(BA-2327): dockerize agent with DooD (Docker-out-of-Docker)#9596

Draft
Yaminyam wants to merge 4 commits into main from feat/BA-2327/dockerize-agent

Yaminyam (Member) commented Mar 3, 2026

Summary

  • Dockerize the Backend.AI agent using the DooD (Docker-out-of-Docker) pattern, so that the agent runs in a container while managing session containers through the host Docker daemon
  • Fix REPL port binding and kernel_host for DooD agent-kernel TCP communication: session containers' REPL ports now bind to container_bind_host when advertised_host is configured
  • Add an agent entrypoint script that sets up krunner file sharing via symlinks to a shared volume, solving the DooD path resolution issue where Docker resolves bind mount paths on the host filesystem
  • Update docker-compose.monorepo.yml with all 6 services (manager, agent, webserver, storage-proxy, appproxy-coordinator, appproxy-worker)
  • Update the documentation to cover all component configurations, Docker address conventions, etcd/DB setup, and DooD-specific troubleshooting
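
To illustrate the path resolution issue from the third bullet: in DooD, a bind mount source passed to the host daemon is looked up on the host filesystem, so a path that exists only inside the agent container does not resolve. The sketch below shows one plausible arrangement, not the PR's actual entrypoint script (which is not shown in this conversation). It assumes a shared volume mounted at the same absolute path on the host and in the agent container; the function name `share_via_symlink` and the directory names are hypothetical.

```python
import shutil
import tempfile
from pathlib import Path


def share_via_symlink(package_dir: Path, shared_dir: Path) -> Path:
    """Move package_dir's contents into shared_dir and leave a symlink behind.

    Returns the resolved path. Under the assumption that shared_dir is mounted
    at the same absolute path on the host, this is a path the host Docker
    daemon can use as a bind mount source.
    """
    shared_dir.mkdir(parents=True, exist_ok=True)
    if not package_dir.is_symlink():
        # First run: relocate the files, then replace the directory by a symlink.
        for entry in package_dir.iterdir():
            shutil.move(str(entry), str(shared_dir / entry.name))
        package_dir.rmdir()
        package_dir.symlink_to(shared_dir, target_is_directory=True)
    # Resolving the symlink yields the shared-volume path, not the package path.
    return package_dir.resolve()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as root:
        pkg = Path(root) / "site-packages" / "krunner"  # hypothetical layout
        pkg.mkdir(parents=True)
        (pkg / "runner.sh").write_text("#!/bin/sh\n")
        shared = Path(root) / "shared" / "krunner"      # stands in for the shared volume
        print(share_via_symlink(pkg, shared))
```

The function is idempotent: a second call finds the symlink already in place and only resolves it, which suits an entrypoint that runs on every container start.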

Test plan

  • Build agent Docker image successfully
  • Agent container starts and registers with manager
  • Session creation succeeds in DooD mode (krunner paths resolved correctly)
  • Agent-kernel REPL communication works (REPL ports bind to correct host)
  • Session service ports (sshd, ttyd, jupyter, jupyterlab) initialized
  • AppProxy connectivity verified end-to-end (browser → coordinator → worker → session)
  • Session cleanup (DELETE) works correctly
  • CI checks pass

Resolves BA-2327

- Add agent Dockerfile with Docker CLI for DooD operations
- Add agent service to docker-compose.monorepo.yml
- Update Docker installation docs with agent config and DooD notes

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Copilot AI review requested due to automatic review settings March 3, 2026 04:32
github-actions bot added the size:L 100~500 LoC label Mar 3, 2026
github-actions bot added the area:docs Documentations label Mar 3, 2026
Copilot AI (Contributor) left a comment

Pull request overview

Dockerizes the Backend.AI agent for Docker-out-of-Docker (DooD) deployments and updates local compose/docs to run manager+agent+webserver together.

Changes:

  • Add a dedicated agent Dockerfile that installs Docker CLI for DooD workloads
  • Add backend-ai-agent service to docker-compose.monorepo.yml with Docker socket and required host mounts
  • Extend Docker install docs with agent build/run/configuration guidance

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

| File | Description |
| --- | --- |
| docs/install/install-docker.rst | Documents building/running the new agent container and DooD-specific configuration notes |
| docker/backend.ai-agent.dockerfile | New agent image build with Python wheels + Docker CLI installation |
| docker-compose.monorepo.yml | Adds agent service to the monorepo compose stack |
| changes/9596.feature.md | Adds changelog entry for agent DooD dockerization |

Comment on lines +18 to +24
RUN install -m 0755 -d /etc/apt/keyrings \
&& curl -fsSL https://download.docker.com/linux/debian/gpg -o /etc/apt/keyrings/docker.asc \
&& chmod a+r /etc/apt/keyrings/docker.asc \
&& echo "deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.asc] https://download.docker.com/linux/debian $(. /etc/os-release && echo "$VERSION_CODENAME") stable" > /etc/apt/sources.list.d/docker.list \
&& apt-get update \
&& apt-get install -y --no-install-recommends docker-ce-cli \
&& rm -rf /var/lib/apt/lists/*
Copilot AI Mar 3, 2026

The image uses curl to fetch Docker’s GPG key but never installs curl (and may also be missing ca-certificates for HTTPS trust depending on the base image variant). This can cause the Dockerfile build to fail on bases where curl isn’t preinstalled. Install required packages (e.g., ca-certificates + curl, and if needed gnupg) before invoking curl, ideally in the same apt-get install step as docker-ce-cli.

Copilot uses AI. Check for mistakes.
@@ -0,0 +1,33 @@
ARG PYTHON_VERSION
FROM python:${PYTHON_VERSION} AS builder
Copilot AI Mar 3, 2026

Using the full python:${PYTHON_VERSION} base for both stages significantly increases image size and pull time. If OS packages/build tooling aren’t required at runtime (beyond installing docker-ce-cli), consider switching the runtime stage to a slimmer variant (e.g., python:${PYTHON_VERSION}-slim) and only add the minimal apt packages needed for Docker CLI + certs.

# Install backend.ai packages from /dist (these are not in requirements.txt or PyPI)
RUN pip wheel --wheel-dir=/wheels --no-cache-dir backend.ai-agent==${PKGVER} --find-links=/dist --no-deps

FROM python:${PYTHON_VERSION}
Copilot AI Mar 3, 2026

Using the full python:${PYTHON_VERSION} base for both stages significantly increases image size and pull time. If OS packages/build tooling aren’t required at runtime (beyond installing docker-ce-cli), consider switching the runtime stage to a slimmer variant (e.g., python:${PYTHON_VERSION}-slim) and only add the minimal apt packages needed for Docker CLI + certs.

Comment on lines +404 to +407
.. code-block:: bash

docker stop backend-ai-manager backend-ai-webserver
docker rm backend-ai-manager backend-ai-webserver
docker stop backend-ai-manager backend-ai-agent backend-ai-webserver
docker rm backend-ai-manager backend-ai-agent backend-ai-webserver
Copilot AI Mar 3, 2026

These commands are intended to be inside the .. code-block:: bash block but they aren’t indented, so they may render as normal text instead of a code block in RST. Indent the command lines (consistent with other bash blocks in this doc) so the directive formats them correctly.
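
As an illustration of the rule this comment relies on, the following is a minimal sketch of a check that flags `.. code-block::` directives whose first non-blank following line is not indented deeper than the directive itself. The function name `unindented_code_blocks` is hypothetical, and the check is a simplification: it does not parse directive options or use a real reStructuredText parser.

```python
import re
from typing import List

# Matches a code-block directive and captures its leading indentation.
DIRECTIVE = re.compile(r"^(\s*)\.\. code-block::")


def unindented_code_blocks(rst_text: str) -> List[int]:
    """Return 1-based line numbers of code-block directives with an unindented body."""
    lines = rst_text.splitlines()
    flagged = []
    for index, line in enumerate(lines):
        match = DIRECTIVE.match(line)
        if match is None:
            continue
        directive_indent = len(match.group(1))
        # Inspect only the first non-blank line after the directive.
        for body in lines[index + 1:]:
            if not body.strip():
                continue
            body_indent = len(body) - len(body.lstrip())
            if body_indent <= directive_indent:
                flagged.append(index + 1)
            break
    return flagged


if __name__ == "__main__":
    broken = ".. code-block:: bash\n\ndocker stop backend-ai-manager\n"
    fixed = ".. code-block:: bash\n\n   docker stop backend-ai-manager\n"
    print(unindented_code_blocks(broken))
    print(unindented_code_blocks(fixed))
```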

…mponent configs

- Fix REPL port binding for DooD mode: bind to container_bind_host when
  advertised_host is configured so agent can reach kernel REPL ports
- Fix kernel_host to use data from container creation instead of hardcoded
  127.0.0.1, enabling DooD agent-kernel TCP communication
- Add agent entrypoint script for krunner file sharing via symlinks
- Update docker-compose with storage-proxy and appproxy services
- Remove 30000-31000 port mapping from agent (DooD port conflict)
- Comprehensive docs update with all 6 component configs, DooD notes,
  address configuration guide, and AppProxy troubleshooting

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
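
The first two fixes in the commit message above can be sketched as follows. This is a minimal illustration under stated assumptions, not the agent's actual code: the `ContainerNetConfig` class, the `kernel_host` dictionary key, and the loopback fallbacks are hypothetical. Only the names container_bind_host and advertised_host, and the previously hardcoded 127.0.0.1, come from the commit message.

```python
from dataclasses import dataclass
from typing import Mapping, Optional

LOOPBACK = "127.0.0.1"


@dataclass(frozen=True)
class ContainerNetConfig:
    """Hypothetical holder for the two settings named in the commit message."""
    container_bind_host: str
    advertised_host: Optional[str] = None


def repl_bind_host(cfg: ContainerNetConfig) -> str:
    """Pick the host address that kernel REPL ports are published on."""
    if cfg.advertised_host:
        # DooD case: the agent is in its own container, so the kernel's
        # ports must be reachable from outside the loopback interface.
        return cfg.container_bind_host
    # Assumed fallback when no advertised host is configured.
    return LOOPBACK


def kernel_host(creation_data: Mapping[str, str]) -> str:
    """Read the kernel host from container creation data instead of hardcoding it."""
    # "kernel_host" is an assumed key name; the fallback is also an assumption.
    return creation_data.get("kernel_host", LOOPBACK)


if __name__ == "__main__":
    dood = ContainerNetConfig(container_bind_host="0.0.0.0", advertised_host="agent.example")
    print(repl_bind_host(dood))
    print(kernel_host({"kernel_host": "172.17.0.1"}))
```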
Yaminyam marked this pull request as draft March 3, 2026 18:17
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Yaminyam requested a review from a team March 4, 2026 07:06