diff --git a/.claude/commands/memo.md b/.claude/commands/memo.md new file mode 100644 index 00000000..fb47a417 --- /dev/null +++ b/.claude/commands/memo.md @@ -0,0 +1,43 @@ +--- +description: Save current task state to auto-memory, then promote reusable lessons to skills and trim memory. +--- + +# Memo + +Save a snapshot of current work to persistent memory, then clean up. + +## Step 1 — Save current state + +Write a concise summary of in-progress or recently completed work to the +auto-memory `MEMORY.md` for this project. Include: + +- What was done (feature, bug, refactor, area of code) +- Current status (completed, blocked, in-progress) +- Key decisions or outcomes worth remembering across conversations + +Do not duplicate information already in skills, CLAUDE.md, or README-CLAUDE.md. + +## Step 2 — Promote to skills + +Review the memory file for items that represent **reusable patterns or +lessons** — things that would help future sessions on this project. For +each such item: + +1. Identify which skill file it belongs in (or create a new one under + `.claude/skills//SKILL.md`). +2. Add it to the appropriate skill. +3. Remove it from memory (it now lives in the skill). + +Examples of promotable items: +- A non-obvious convention specific to this project +- A "foot-gun" pattern worth warning future-you about +- A reusable recipe (test invocation, deploy command, debugging trick) + +## Step 3 — Trim memory + +Remove from memory anything that is: +- Already captured in skills, CLAUDE.md, or README-CLAUDE.md +- Too specific to a single completed task to be useful again +- Stale or superseded by later work + +Keep memory concise — ideally under 30 lines. 
diff --git a/.claude/commands/pr-squash.md b/.claude/commands/pr-squash.md new file mode 100644 index 00000000..bde9db09 --- /dev/null +++ b/.claude/commands/pr-squash.md @@ -0,0 +1,88 @@ +# PR Squash + +Create a clean PR by grouping the current branch's commits into logical squashed +commits on a new branch, then opening a pull request. + +## Instructions + +1. **Determine the base branch.** Use `$ARGUMENTS` if provided, otherwise detect + the repo's default branch (`main` or `master`) via `gh repo view --json + defaultBranchRef -q .defaultBranchRef.name`. + +2. **Collect the commit history.** Run: + ``` + git log --oneline --reverse .. + ``` + These are the commits to be grouped. + +3. **Analyse and group the commits.** Read the diffs for each commit + (`git show --stat ` and `git show ` for ambiguous cases). + Group commits into logical units: + - Each group should represent one cohesive change (a feature, a fix, a + refactor, a config change, etc.). + - Iterative fix-up commits ("fix typo", "try again", "wip") belong with the + feature they relate to. + - Keep genuinely independent changes in separate groups. + - Preserve chronological order between groups where possible. + +4. **Decide: one PR or multiple PRs.** If the groups fall into distinct, + unrelated topics (e.g. "developer tooling" vs "production feature"), plan + to create **separate PRs** — one per topic. Each PR gets its own squash + branch (`-squash-1`, `-squash-2`, etc.) and contains only + the groups for that topic. Groups that are closely related (e.g. a feature + and its config) stay in the same PR as separate squashed commits. + + Rule of thumb: if a reviewer would reasonably want to merge one topic + without the other, they belong in separate PRs. + +5. 
**Present the grouping plan.** Show the user a numbered list like: + ``` + PR 1: "Devcontainer hardening and tooling" + Group 1: "harden devcontainer and add Just task runner" + - abc1234 add security settings + - def5678 replace tox with just + + PR 2: "Add Dex OIDC authentication" + Group 2: "configure Dex and argocd-monitor" + - jkl3456 add Dex config + - mno7890 fix client secret + - pqr1234 fix audience mismatch + ``` + If all groups are closely related, show a single PR with multiple groups. + Ask the user to confirm or adjust before proceeding. + +6. **Create squash branch(es).** Once approved, for each PR: + ``` + git checkout -b + ``` + Use `-squash` for a single PR, or + `-squash-` (or a short descriptive suffix) for multiple. + +7. **Cherry-pick and squash each group.** For each group in the PR: + ``` + git cherry-pick --no-commit ... + git commit -m "" + ``` + Use a well-written conventional commit message for each group. Include a + short body if the group contains non-obvious changes. Preserve any + `Co-Authored-By` trailers from the original commits. + +8. **Push and create the PR(s).** + ``` + git push -u origin + ``` + Create each PR with `gh pr create` targeting the base branch. The PR body + should summarise its squashed commit group(s). + +9. **Switch back** to the original branch so the user's working state is + unchanged. + +## Edge cases +- If there are fewer than 3 commits, suggest the user just squash-merge + directly instead — but proceed if they insist. +- If cherry-pick conflicts arise, stop and inform the user rather than + auto-resolving. +- Never force-push or modify the original branch. +- If a `-squash` branch already exists, ask the user before overwriting. +- When merging a PR created by this command, use `gh pr merge --merge` + (not `--squash`) to preserve the curated commit structure. 
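
Steps 6 and 7 of the command above can be sketched as a small shell helper. This is an illustrative sketch only: `squash_group` is a hypothetical name that the command does not define, and it assumes the squash branch is already checked out at the base commit. It does not carry over `Co-Authored-By` trailers, which would still need to be added to the message by hand.

```shell
#!/bin/bash
# Sketch only: squash_group is a hypothetical helper, not part of the command.
# Usage: squash_group "<commit message>" <hash> [<hash> ...]
squash_group() {
  local message=$1
  shift
  # Stage every commit in the group without committing. On a conflict,
  # stop and report rather than auto-resolving.
  if ! git cherry-pick --no-commit "$@"; then
    echo "cherry-pick conflict in group: $message" >&2
    return 1
  fi
  # Record the staged changes as one commit for the whole group.
  git commit -q -m "$message"
}
```

Calling it once per group, in the order shown in the grouping plan, leaves one commit per group on the squash branch.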
diff --git a/.claude/commands/verify-sandbox.md b/.claude/commands/verify-sandbox.md new file mode 100644 index 00000000..788ae5da --- /dev/null +++ b/.claude/commands/verify-sandbox.md @@ -0,0 +1,130 @@ +--- +description: Verify Claude's mount-namespace sandbox is intact — env canaries, masked credentials, gitconfig bind, and the four VS Code IPC sockets from the Demmel writeup. +--- + +# Verify sandbox + +Run the full sandbox verification described in `README-CLAUDE.md` and +report a PASS/FAIL table. The threat model these checks defend against +is documented in: + +- `README-CLAUDE.md` (this repo) — sections **What's locked down** and + **Verifying the sandbox**. +- Daniel Demmel, *Coding agents in secured VS Code dev containers* — + + — describes the `vscode-ipc-*.sock`, `vscode-git-*.sock`, + `vscode-ssh-auth-*.sock`, and `vscode-remote-containers-ipc-*.sock` + bridges in `/tmp` that re-appear up to ~60s after window attach. Our + defence is the private mount namespace set up by `just claude`, not a + one-shot sweep. + +## How to run + +Execute every check below in a single Bash invocation where practical +(run them in parallel when independent). For each item, report PASS or FAIL +with a one-line reason. Do not skip a check because an earlier one +failed — collect everything, then summarise. + +If any check FAILs, end the report with: "Sandbox is leaking — do not +trust `--dangerously-skip-permissions` until fixed. Open an issue +against `gilesknap/python-copier-template`." + +## Checks + +### 1. Namespace markers + +- `IS_SANDBOX` env var must be `1` (set by `claude-sandbox.sh` after + `unshare -m`). If unset, Claude was not launched via `just claude`. +- `IN_DEVCONTAINER` env var must be set. + +### 2. Host bridge env vars (must all be unset) + +`SSH_AUTH_SOCK`, `GIT_ASKPASS`, `VSCODE_GIT_IPC_HANDLE`, +`VSCODE_GIT_ASKPASS_NODE`, `VSCODE_GIT_ASKPASS_MAIN`, +`VSCODE_IPC_HOOK_CLI`, `BROWSER`. + +### 3. 
SSH agent unreachable + +`ssh-add -l` must fail with "Could not open a connection to your +authentication agent." Anything that lists keys is a FAIL. + +### 4. `/tmp` and `/run/user` are private tmpfs + +- `mount | grep ' on /tmp '` must show a `tmpfs` entry (this confirms + the mount namespace is active for `/tmp`). +- `ls /tmp` must NOT contain any of the four Demmel sockets: + `vscode-ipc-*.sock`, `vscode-git-*.sock`, `vscode-ssh-auth-*.sock`, + `vscode-remote-containers-ipc-*.sock`. Glob each one explicitly. +- `ls /run/user/*/` must NOT contain `vscode-*` entries. + +### 5. Host credential dirs masked + +Each of these must be empty or absent: +`/root/.ssh`, `/root/.gnupg`, `/root/.aws`, `/root/.azure`, +`/root/.gcloud`, `/root/.docker`, `/root/.netrc`. + +A non-empty `/root/.ssh` (containing `id_*` or `authorized_keys`) is a +critical FAIL — the host SSH keys are reachable. + +### 6. Gitconfig bind-mount + +- `mount | grep '/root/.gitconfig'` must show a bind mount (typically + `fuse-overlayfs` or `bind` from `/etc/claude-gitconfig`). +- `git config --global --list` must contain ONLY: + - `user.name` / `user.email` (host identity, copied through), + - `safe.directory=*`, + - `url.https://github.com/.insteadof=git@github.com:`, + - `url.https://gitlab.diamond.ac.uk/.insteadof=git@gitlab.diamond.ac.uk:`, + - `credential.https://github.com.helper=` then `!/usr/bin/gh auth git-credential`, + - `credential.https://gitlab.diamond.ac.uk.helper=` then `!/usr/local/bin/glab auth git-credential`. +- Any other `credential.*.helper` (especially one pointing at + `/tmp/vscode-remote-containers-*.js` or `/.vscode-server/...`) is a + FAIL. +- `/etc/gitconfig` must be masked (bind-mounted to `/dev/null` or + absent). `mount | grep '/etc/gitconfig'` should show a bind mount + whose source is `/dev/null` (appears as `devtmpfs` with `mode=755`, + inode for major 1 / minor 3), OR `ls /etc/gitconfig` returns + "No such file or directory". 
A regular file at `/etc/gitconfig` with + any contents is a FAIL — the host's system-scope gitconfig is + reachable and could carry `url.insteadof`, `http.proxy`, + `core.hooksPath`, or credential helpers that bypass /root/.gitconfig. +- System scope must be empty: `git config --system --list` must produce + no output (exit 0 with empty stdout, or exit non-zero). Any line is a + FAIL — broader than just `credential.helper`, since `core.hooksPath` + or `url.insteadof` at system scope are equally dangerous. + +### 7. Credential source is gh, not a host bridge + +`printf 'protocol=https\nhost=github.com\n\n' | git credential fill` +must return a `password=` line. The token prefix tells you the source: + +- `gho_…` or `github_pat_…` from `gh auth git-credential` → PASS. +- Anything else (e.g. a token from a `vscode-git-*.sock` bridge) → FAIL. + +Do NOT print the token. Redact with `sed 's/password=.*/password=/'`. +Skip this check (mark N/A, not FAIL) if `just gh-auth` has not been run +for this repo — the README explicitly carves that out. + +## Output format + +Print a single table: + +``` +CHECK STATUS DETAIL +1. IS_SANDBOX=1 PASS/FAIL ... +2. Host bridge env vars unset PASS/FAIL ... +3. ssh-add -l fails PASS/FAIL ... +4a. /tmp is tmpfs PASS/FAIL ... +4b. No vscode-*.sock in /tmp PASS/FAIL ... +4c. No vscode-* in /run/user PASS/FAIL ... +5. Host credential dirs masked PASS/FAIL ... +6a. /root/.gitconfig bind-mounted PASS/FAIL ... +6b. Gitconfig contents are sandbox-only PASS/FAIL ... +6c. /etc/gitconfig masked PASS/FAIL ... +6d. System-scope gitconfig is empty PASS/FAIL ... +7. git credential fill source is gh PASS/FAIL/N/A ... +``` + +End with one line: `RESULT: SANDBOX OK` if every check is PASS or N/A, +otherwise `RESULT: SANDBOX LEAKING — see failures above` and the issue +pointer. 
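
Two of the checks above (2 and 7) can be sketched as bash functions that emit rows in the table format. This is a minimal sketch under stated assumptions: `report`, `check_bridge_env_unset`, and `classify_credential` are hypothetical names that the command does not define, an empty variable is treated the same as an unset one, and the N/A case for check 7 (when `just gh-auth` has not been run) is left to the caller.

```shell
#!/bin/bash
# Sketch only: report, check_bridge_env_unset and classify_credential are
# hypothetical helper names, not part of the command definition.

# Print one row of the PASS/FAIL table.
report() { printf '%-42s %-8s %s\n' "$1" "$2" "$3"; }

# Check 2: every host bridge variable must be unset. Empty counts as unset
# here, which is an assumption about how the sandbox blanks them.
check_bridge_env_unset() {
  local v leaked=()
  for v in SSH_AUTH_SOCK GIT_ASKPASS VSCODE_GIT_IPC_HANDLE \
           VSCODE_GIT_ASKPASS_NODE VSCODE_GIT_ASKPASS_MAIN \
           VSCODE_IPC_HOOK_CLI BROWSER; do
    # ${!v:-} expands the variable whose name is stored in v.
    [ -z "${!v:-}" ] || leaked+=("$v")
  done
  if [ "${#leaked[@]}" -eq 0 ]; then
    report "2. Host bridge env vars unset" PASS "all unset or empty"
  else
    report "2. Host bridge env vars unset" FAIL "set: ${leaked[*]}"
  fi
}

# Check 7: read `git credential fill` output on stdin and classify the
# token by its prefix. The token itself is never printed.
classify_credential() {
  local line status="FAIL" detail="no password= line returned"
  while IFS= read -r line; do
    case "$line" in
      password=gho_*|password=github_pat_*)
        status="PASS"; detail="token prefix matches gh" ;;
      password=*)
        detail="unrecognised token prefix" ;;
    esac
  done
  report "7. git credential fill source is gh" "$status" "$detail"
}
```

Inside the sandbox, `check_bridge_env_unset` runs as is, and check 7 becomes `printf 'protocol=https\nhost=github.com\n\n' | git credential fill | classify_credential`.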
diff --git a/.claude/hooks/sandbox-check.sh b/.claude/hooks/sandbox-check.sh new file mode 100755 index 00000000..1d560b84 --- /dev/null +++ b/.claude/hooks/sandbox-check.sh @@ -0,0 +1,44 @@ +#!/bin/bash +# UserPromptSubmit hook: verify the Claude sandbox is intact before +# executing any prompt. Exit code 2 blocks the prompt and shows the +# message to the user. See README-CLAUDE.md for the full sandbox model. + +fail() { echo "BLOCKED: $1" >&2; exit 2; } + +# Are we in the devcontainer at all? +[ -n "${IN_DEVCONTAINER:-}" ] || \ + fail "not in the devcontainer (IN_DEVCONTAINER unset). Reopen the project in the devcontainer." + +# IS_SANDBOX=1 is set by the inner `just claude` script after it sets up +# the private mount namespace. If it's missing, Claude was launched +# without the namespace and /tmp/vscode-*.sock host bridges are reachable. +[ -n "${IS_SANDBOX:-}" ] || \ + fail "IS_SANDBOX unset — Claude was not launched via \"just claude\", so the mount-namespace sandbox is not active." + +# Host SSH agent must not be reachable. remoteEnv blanks SSH_AUTH_SOCK and +# `just claude` re-blanks it; if it is set, neither layer applied. +[ -z "${SSH_AUTH_SOCK:-}" ] || \ + fail "SSH_AUTH_SOCK is set ($SSH_AUTH_SOCK) — host SSH agent is reachable. Run \"just claude\" or rebuild the devcontainer." + +# GIT_ASKPASS points at a script under /.vscode-server, which the +# namespace does NOT mask. If the env var is non-empty AND the file is +# reachable, claude-sandbox.sh's exec-line blank failed to apply. +[ ! -e "${GIT_ASKPASS:-}" ] || \ + fail "GIT_ASKPASS script ($GIT_ASKPASS) is reachable — claude-sandbox.sh did not blank the env var. Rebuild the devcontainer or re-run \"just claude\"." + +# /root/.gitconfig must be the bind-mounted /etc/claude-gitconfig (gh/glab +# helpers only). 
VS Code reconnects can drop the bind and re-expose the +# host gitconfig, whose [credential] helper invokes a node script under +# /.vscode-server via /tmp/vscode-remote-containers-*.js — leaking the +# host's git credentials into the sandbox. +! grep -q -e 'vscode-remote-containers' -e '\.vscode-server' /root/.gitconfig 2>/dev/null || \ + fail "/root/.gitconfig contains a VS Code credential bridge — the bind on /root/.gitconfig has been dropped (likely by a VS Code reconnect). Exit Claude and re-run \"just claude\"." + +# /etc/gitconfig must be masked (bind-mounted to /dev/null by claude-sandbox.sh). +# If the host's system-scope gitconfig is reachable, it can carry url.insteadof, +# core.hooksPath, http.proxy, or credential helpers that bypass /root/.gitconfig. +# `git config --system --list` returning any content means the mask is gone. +[ -z "$(git config --system --list 2>/dev/null)" ] || \ + fail "/etc/gitconfig is exposing system-scope settings — the bind-mount mask on /etc/gitconfig has been dropped. Exit Claude and re-run \"just claude\"." 
+ +exit 0 diff --git a/.claude/settings.json b/.claude/settings.json new file mode 100644 index 00000000..6c32c5ea --- /dev/null +++ b/.claude/settings.json @@ -0,0 +1,38 @@ +{ + "permissions": { + "allow": [ + "Edit(/workspaces/**)", + "Write(/workspaces/**)", + "Read(/workspaces/**)", + "Bash(*)" + ], + "deny": [ + "Bash(git push --force *)", + "Bash(git reset --hard*)", + "Bash(ssh *)", + "Bash(ssh-agent *)", + "Bash(*ssh-agent*)", + "Bash(scp *)", + "Bash(rsync *)", + "Bash(sftp *)", + "Bash(telnet *)", + "Bash(mail *)", + "Bash(sendmail *)" + ], + "additionalDirectories": [ + "/workspaces/**" + ] + }, + "hooks": { + "UserPromptSubmit": [ + { + "hooks": [ + { + "type": "command", + "command": ".claude/hooks/sandbox-check.sh" + } + ] + } + ] + } +} diff --git a/.devcontainer/claude-sandbox.sh b/.devcontainer/claude-sandbox.sh new file mode 100755 index 00000000..b502ebe4 --- /dev/null +++ b/.devcontainer/claude-sandbox.sh @@ -0,0 +1,91 @@ +#!/bin/bash +# Inner script for `just claude`: runs inside a private mount namespace +# (created by `unshare -m` from the justfile recipe). Mounts tmpfs over +# the locations VS Code uses for host bridges, builds a Claude-only +# /root/.gitconfig, then exec's claude with PR_SET_PDEATHSIG so it dies +# if its parent shell does. Requires CAP_SYS_ADMIN — granted via +# --cap-add=SYS_ADMIN in devcontainer.json's runArgs. See +# README-CLAUDE.md for the full sandbox model. +set -euo pipefail + +# VS Code drops IPC sockets (vscode-ipc-*.sock, vscode-git-*.sock, +# vscode-ssh-auth-*.sock, vscode-remote-containers-ipc-*.sock) and the +# vscode-remote-containers-*.js credential shim in /tmp, plus more in +# /run/user//. Replacing those directories with tmpfs in Claude's +# namespace makes them invisible. Outside the namespace (the user's +# regular terminal) VS Code keeps using them normally. 
+mount -t tmpfs tmpfs /tmp +if [ -d /run/user ]; then + mount -t tmpfs tmpfs /run/user +fi + +# Mask credential directories the user may bind-mount from the host for +# their own use from non-Claude terminals (e.g. ~/.ssh for SSH-based +# git push). Claude sees an empty tmpfs; the user's regular shell sees +# the originals. +for d in /root/.ssh /root/.gnupg /root/.aws /root/.azure /root/.gcloud /root/.docker; do + if [ -d "$d" ]; then + mount -t tmpfs tmpfs "$d" + fi +done +# .netrc is a single file, not a dir — mask via bind to /dev/null. +if [ -e /root/.netrc ]; then + mount --bind /dev/null /root/.netrc +fi + +# /etc/gitconfig (system scope) on a VS Code dev-container image carries a +# credential.helper that shells out via /tmp/vscode-remote-containers-*.js — +# the same bridge the per-user mask defends against. Bind /dev/null over it +# so Claude sees an empty system config; only the URL-scoped gh/glab helpers +# in /root/.gitconfig remain. The user's regular terminal is unaffected. +if [ -e /etc/gitconfig ]; then + mount --bind /dev/null /etc/gitconfig +fi + +# Build a Claude-only /root/.gitconfig containing the in-container +# credential helpers (gh / glab) and HTTPS rewrites — and nothing else +# the user has on the host (no SSH url rewrites, no host-specific +# helpers). User identity is read from the original gitconfig BEFORE +# we bind over it, so commits Claude makes are still attributed. +git_name=$(git config --get user.name 2>/dev/null || true) +git_email=$(git config --get user.email 2>/dev/null || true) +gh_path=$(command -v gh || echo /usr/bin/gh) +glab_path=$(command -v glab || echo /usr/local/bin/glab) +cat > /etc/claude-gitconfig <&2 <<'EOF' + +================================================================ +ERROR: This directory is not a git repository. + +setuptools-scm needs git history to compute the package version, +and pre-commit installs its hooks into .git/hooks. Neither will +work without a git repo. 
+ +To fix this, run on the host (outside the devcontainer): + + git init -b main && git add . && git commit -m 'Initial commit' + +then rebuild the devcontainer. + +================================================================ + +EOF + exit 1 +fi + +# Install Python dependencies and pre-commit hooks. `uv venv --clear` wipes +# the venv that lives in /cache (a persistent named volume), so any bash +# hash entries pointing into the old venv (e.g. cached `pre-commit` path) +# are stale. `hash -r` after `uv sync` forces re-resolution against the +# freshly populated venv and against any new `uv` location after a base +# image bump. +uv venv --clear +hash -r +uv sync +pre-commit install --install-hooks + +# Init only submodules that aren't checked out yet — first-clone +# protection without touching already-initialized submodules (which +# would yank in-progress branch work to detached HEAD on rebuild). +if [ -f .gitmodules ]; then + missing=$(git submodule status | awk '/^-/ {print $2}') + [ -n "$missing" ] && git submodule update --init $missing +fi + +# Install Claude Code CLI +curl -fsSL https://claude.ai/install.sh | bash diff --git a/.gitignore b/.gitignore index 0f33bf29..b57a8a3c 100644 --- a/.gitignore +++ b/.gitignore @@ -69,3 +69,7 @@ lockfiles/ # ruff cache .ruff_cache/ + +# Claude Code local state (commit settings.json, commands, skills, hooks) +.claude/settings.local.json +.claude/scheduled_tasks.lock diff --git a/CLAUDE.md b/CLAUDE.md new file mode 100644 index 00000000..daced9bd --- /dev/null +++ b/CLAUDE.md @@ -0,0 +1,65 @@ +# CLAUDE.md + +Behavioral guidelines to reduce common LLM coding mistakes. Merge with project-specific instructions as needed. + +**Tradeoff:** These guidelines bias toward caution over speed. For trivial tasks, use judgment. + +## 1. Think Before Coding + +**Don't assume. Don't hide confusion. Surface tradeoffs.** + +Before implementing: +- State your assumptions explicitly. If uncertain, ask. 
+- If multiple interpretations exist, present them - don't pick silently. +- If a simpler approach exists, say so. Push back when warranted. +- If something is unclear, stop. Name what's confusing. Ask. + +## 2. Simplicity First + +**Minimum code that solves the problem. Nothing speculative.** + +- No features beyond what was asked. +- No abstractions for single-use code. +- No "flexibility" or "configurability" that wasn't requested. +- No error handling for impossible scenarios. +- If you write 200 lines and it could be 50, rewrite it. + +Ask yourself: "Would a senior engineer say this is overcomplicated?" If yes, simplify. + +## 3. Surgical Changes + +**Touch only what you must. Clean up only your own mess.** + +When editing existing code: +- Don't "improve" adjacent code, comments, or formatting. +- Don't refactor things that aren't broken. +- Match existing style, even if you'd do it differently. +- If you notice unrelated dead code, mention it - don't delete it. + +When your changes create orphans: +- Remove imports/variables/functions that YOUR changes made unused. +- Don't remove pre-existing dead code unless asked. + +The test: Every changed line should trace directly to the user's request. + +## 4. Goal-Driven Execution + +**Define success criteria. Loop until verified.** + +Transform tasks into verifiable goals: +- "Add validation" → "Write tests for invalid inputs, then make them pass" +- "Fix the bug" → "Write a test that reproduces it, then make it pass" +- "Refactor X" → "Ensure tests pass before and after" + +For multi-step tasks, state a brief plan: +``` +1. [Step] → verify: [check] +2. [Step] → verify: [check] +3. [Step] → verify: [check] +``` + +Strong success criteria let you loop independently. Weak criteria ("make it work") require constant clarification. 
+ +--- + +**These guidelines are working if:** fewer unnecessary changes in diffs, fewer rewrites due to overcomplication, and clarifying questions come before implementation rather than after mistakes. diff --git a/Dockerfile b/Dockerfile index e8462413..655d8f0c 100644 --- a/Dockerfile +++ b/Dockerfile @@ -6,3 +6,31 @@ FROM ghcr.io/diamondlightsource/ubuntu-devcontainer:noble AS developer RUN apt-get update -y && apt-get install -y --no-install-recommends \ graphviz \ && apt-get dist-clean + +# Node is required by Claude Code's hook runtime; just powers the +# container's claude/gh-auth/glab-auth recipes in justfile. +# TODO: nodejs, just, gh and glab will move into the ubuntu-devcontainer +# base image once it ships on Ubuntu 26.04, where all are available +# from apt at sufficient versions. At that point these blocks can be +# dropped. +RUN apt-get update -y && apt-get install -y --no-install-recommends \ + nodejs \ + just \ + && apt-get dist-clean + +# GitHub CLI — used by Claude to authenticate to github.com via PAT. +RUN curl -fsSL https://cli.github.com/packages/githubcli-archive-keyring.gpg | \ + dd of=/usr/share/keyrings/githubcli-archive-keyring.gpg && \ + chmod go+r /usr/share/keyrings/githubcli-archive-keyring.gpg && \ + echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/githubcli-archive-keyring.gpg] https://cli.github.com/packages stable main" \ + | tee /etc/apt/sources.list.d/github-cli.list > /dev/null && \ + apt-get update && apt-get install -y --no-install-recommends gh && \ + apt-get dist-clean + +# GitLab CLI — used by Claude to authenticate to gitlab instances via PAT. +# No apt repo, so install from the upstream release tarball. 
+ARG GLAB_VERSION=1.93.0 +RUN curl -fsSL "https://gitlab.com/gitlab-org/cli/-/releases/v${GLAB_VERSION}/downloads/glab_${GLAB_VERSION}_linux_amd64.tar.gz" \ + | tar -xz -C /tmp bin/glab && \ + install -m 0755 /tmp/bin/glab /usr/local/bin/glab && \ + rm -rf /tmp/bin diff --git a/README-CLAUDE.md b/README-CLAUDE.md new file mode 100644 index 00000000..7fdd2998 --- /dev/null +++ b/README-CLAUDE.md @@ -0,0 +1,187 @@ +# Claude sandbox + +This project's devcontainer is configured to run Claude Code with +`--dangerously-skip-permissions` (see `justfile`'s `claude` recipe). To make +that safe, the container is set up as a sandbox: Claude can use the project +toolchain, push/pull through PATs it owns, and persist its own settings — +but it cannot reach back to the host's identity or shared resources. + +This file documents what's locked down, what's deliberately left exposed, +and how to verify the sandbox is intact. + +## What's locked down + +- **No host bridges via VS Code IPC sockets.** VS Code's server creates + several unix sockets in `/tmp` and `/run/user//` that are bridges + back to the host: `vscode-ipc-*.sock` (runs `code` CLI on the host), + `vscode-git-*.sock` (git credential bridge — surfaces host PATs), + `vscode-ssh-auth-*.sock` (host SSH agent forward), and + `vscode-remote-containers-ipc-*.sock` (Dev Containers extension RPC). + These are re-created on every window attach and continue to appear up + to ~60s later — see [the threat-model writeup][demmel-blog] — so any + one-shot cleanup leaves a window. The defence is the **`unshare -m`** + call in `just claude`: Claude runs in a private mount namespace where + `/tmp` and `/run/user//` are fresh tmpfs. The bridges still exist + in the parent namespace (VS Code keeps using them normally) but are + invisible to Claude. No race, no sweeper, no recurring check needed. + Requires `--cap-add=SYS_ADMIN` in `runArgs` for rootless podman. 
+ + [demmel-blog]: https://www.danieldemmel.me/blog/coding-agents-in-secured-vscode-dev-containers +- **No host SSH keys, AWS/GCP/Azure/Docker credentials, GPG keys, or + netrc.** The same `unshare -m` masks `/root/.ssh`, `/root/.gnupg`, + `/root/.aws`, `/root/.azure`, `/root/.gcloud`, `/root/.docker`, and + `/root/.netrc` (where present) with empty tmpfs. This means you *can* + bind-mount your host `~/.ssh` into the container if you want to use + SSH keys from a regular terminal — Claude's namespace blanks them out + while non-Claude shells see the originals. `SSH_AUTH_SOCK` is blanked + in the namespace exec line so VS Code's agent forwarding (which the + user terminal keeps) cannot reach Claude. +- **Claude dies with its parent shell.** `setpriv --pdeathsig SIGKILL` + on the inner `claude` exec sets `PR_SET_PDEATHSIG`, so if the wrapping + `unshare`'d shell exits (terminal closed, Ctrl-C, etc.) the kernel + immediately kills Claude — there's no orphaned-claude window where the + namespace context is gone but Claude is still running tools. +- **Claude has its own `/root/.gitconfig` via bind-mount.** + `claude-sandbox.sh` writes `/etc/claude-gitconfig` containing only the + in-container gh/glab credential helpers, the `git@*:` → `https://` + url rewrites, `safe.directory = *`, and the user identity (read from + the host-copied gitconfig before we bind over it, so commits Claude + makes are still attributed). It then `mount --bind`s that file onto + `/root/.gitconfig` inside Claude's namespace. The user's regular + terminal keeps the original `/root/.gitconfig` (host content, copied + by `dev.containers.copyGitConfig`'s default), so the host's SSH url + rewrites, custom credential helpers, and identity all work normally + outside Claude — but Claude only ever sees the curated config. 
+ `/etc/gitconfig` (system scope) is also masked: VS Code dev-container + images bake a `credential.helper` there that shells out via + `/tmp/vscode-remote-containers-*.js`, so `claude-sandbox.sh` binds + `/dev/null` over it inside the namespace. +- **The "log in to GitHub" popup is closed for Claude.** The user + terminal keeps `git.terminalAuthentication` at its default (true), so + `GIT_ASKPASS` and `VSCODE_GIT_IPC_HANDLE` are injected into terminals + and the user gets the natural VS Code OAuth popup when an HTTPS git + operation needs credentials. For Claude two things close that channel: + `claude-sandbox.sh`'s exec line blanks `GIT_ASKPASS`, + `VSCODE_GIT_IPC_HANDLE`, `VSCODE_GIT_ASKPASS_NODE`, + `VSCODE_GIT_ASKPASS_MAIN`, `VSCODE_IPC_HOOK_CLI`, and `BROWSER`; and + the IPC socket the askpass script would talk to lives in `/tmp`, + which is tmpfs-masked. Both layers must be defeated for Claude to + surface a popup. + + `.claude/hooks/sandbox-check.sh` is the periodic verifier: it fires + on every prompt submit and refuses to run Claude if `IS_SANDBOX` is + unset, `SSH_AUTH_SOCK` is set, or the path `GIT_ASKPASS` references + is reachable. +- **Auth is per-repo.** `gh-auth-${repo}` and `glab-auth-${repo}` are + named volumes, not bind mounts — each project gets its own scoped PAT + via `just gh-auth` / `just glab-auth`. Authenticate once per repo and + the token survives container rebuilds. + +## What the user terminal gets (and why) + +VS Code's regular terminal runs *outside* Claude's namespace. It is +deliberately set up with the standard developer experience so working +in the devcontainer feels natural: + +- **Host gitconfig copied in.** `dev.containers.copyGitConfig` defaults + to true, so `/root/.gitconfig` carries the user's name, email, push + preferences, and any host url rewrites. Claude overrides this via + bind-mount; the user terminal sees the original. 
+- **SSH agent forwarding.** VS Code forwards the host SSH agent into + the container as it normally would; `SSH_AUTH_SOCK` points at + `/tmp/vscode-ssh-auth-*.sock`. Inside Claude's namespace `/tmp` is + tmpfs and the variable is blanked, so Claude cannot reach the agent. +- **VS Code OAuth popup for HTTPS git.** `git.terminalAuthentication` + is left at its default, so when an HTTPS git operation needs creds + the user gets the standard "log in to GitHub" popup. Claude's exec + blanks `GIT_ASKPASS` / `VSCODE_GIT_IPC_HANDLE` and masks the IPC + socket path, so the popup channel does not exist for Claude. +- **`code` CLI and host browser.** `VSCODE_IPC_HOOK_CLI` and `BROWSER` + are inherited by the user terminal so `code ` and tools that + open URLs do the natural thing. Both env vars are blanked in + Claude's exec and the sockets they reference live in `/tmp`. + +## What's deliberately exposed (and why) + +- **`/root/.claude` is bind-mounted from the host's `~/.claude`.** Claude's + settings, memory, hooks, and skills are shared between the host and the + container — that's the whole point. Anything Claude writes to its own + config persists to the host home directory. Treat `~/.claude` on the + host as part of the sandbox boundary, not outside it. +- **`/workspaces` is the parent of the project, not the project itself.** + The `workspaceMount` source is `${localWorkspaceFolder}/..`, so all + sibling repos in the same parent directory are visible inside the + container. This is intentional — it lets `pip install -e ../peer-repo` + work and lets Claude read across related projects when asked. If you + keep unrelated work in the same parent dir, Claude can see it. +- **`--net=host` shares the host's network namespace.** The container's + hostname will match the host's, and any service bound to `localhost` on + the host is reachable from inside. This is needed for X11, EPICS CA, + and to avoid devcontainer port-forwarding hassles. 
It also means the + container can talk to anything the host can talk to on its LAN. +- **`/cache` is a shared named volume across all devcontainers** built + from this template — uv cache, pre-commit cache, and the project venv + live there. Faster rebuilds; the trade-off is that a poisoned cache + affects every project sharing the volume. + +## Verifying the sandbox + +Run inside `just claude` itself (use Claude's bash tool, or run the same +commands manually after dropping into a shell that has `unshare -m` set +up the way `just claude` does). The mount-namespace defences only apply +inside that namespace — a regular VS Code terminal will see the bridges +exactly as VS Code created them, which is correct. + +```bash +# Canaries: should be unset (env blanks) and 1 (sandbox marker) +echo "SSH_AUTH_SOCK='${SSH_AUTH_SOCK:-}'" +echo "GIT_ASKPASS='${GIT_ASKPASS:-}'" +echo "VSCODE_GIT_IPC_HANDLE='${VSCODE_GIT_IPC_HANDLE:-}'" +echo "VSCODE_IPC_HOOK_CLI='${VSCODE_IPC_HOOK_CLI:-}'" +echo "BROWSER='${BROWSER:-}'" +echo "IS_SANDBOX='${IS_SANDBOX:-}'" # should be 1 +ssh-add -l # "Could not open a connection..." + +# /tmp and /run/user should be empty tmpfs inside Claude's namespace. +ls /tmp # only claude-* runtime dirs +ls /run/user/*/ 2>/dev/null # nothing matching vscode-* +mount | grep -E ' on /tmp |/run/user' # tmpfs entries from claude-sandbox.sh + +# /root/.ssh and friends should be empty even if you bind-mount the host +# originals via devcontainer.json — Claude's namespace masks them. +ls /root/.ssh /root/.gnupg /root/.aws 2>/dev/null # all empty (or missing) + +# Claude's bind-mounted gitconfig: only gh/glab helpers + HTTPS rewrites, +# no host SSH url rewrites or unrelated host helpers. 
+git config --global --list | grep -E 'credential|insteadof' +mount | grep '/root/.gitconfig' # bind from /etc/claude-gitconfig +git config --system --get credential.helper # should exit non-zero +mount | grep '/etc/gitconfig' # bind from /dev/null + +# Should return creds only if `just gh-auth` has been run for this repo. +printf 'protocol=https\nhost=github.com\n\n' | git credential fill +``` + +If `git credential fill` returns a `password=gho_...` for github.com when +you have not run `just gh-auth`, or if `ls /tmp` shows any `vscode-*` +entries inside the namespace, the sandbox is leaking — open an issue +against the python-copier-template. + +## Authenticating + +```bash +just gh-auth # paste a github.com PAT (repo + workflow scope is enough) +just glab-auth # gitlab.com (pass a hostname arg for self-hosted instances) +``` + +## Starting Claude + +```bash +just claude # runs `claude --dangerously-skip-permissions` inside the mount namespace +``` + +After a rebuild from a previous version of this template, the user +terminal's `/root/.gitconfig` may still carry HTTPS rewrites or per-host +helpers that older `postStart.sh` runs added globally. Either rebuild +the devcontainer for a clean state, or `git config --global --unset-all` +the affected keys. diff --git a/copier.yml b/copier.yml index 23459245..0fe4f5f5 100644 --- a/copier.yml +++ b/copier.yml @@ -121,22 +121,6 @@ add_claude: the container, mounts ~/.claude from the host, installs Claude Code CLI, and enables `--dangerously-skip-permissions` autopilot mode. -install_gh: - type: bool - when: "{{ add_claude }}" - help: | - Install the GitHub CLI (gh) so Claude can push/pull via PAT auth? - Only useful inside the Claude sandbox — ordinary users typically - rely on SSH keys or VS Code git credentials. - -install_glab: - type: bool - when: "{{ add_claude }}" - help: | - Install the GitLab CLI (glab) for projects that talk to a GitLab - instance (e.g. gitlab.diamond.ac.uk submodules)? 
- Only useful inside the Claude sandbox. - docs_type: type: str help: | diff --git a/example-answers.yml b/example-answers.yml index 91ea509a..c053d6cd 100644 --- a/example-answers.yml +++ b/example-answers.yml @@ -8,8 +8,6 @@ distribution_name: dls-python-copier-template-example docker: true docker_debug: true add_claude: true -install_gh: true -install_glab: true docs_type: sphinx git_platform: github.com github_org: DiamondLightSource diff --git a/justfile b/justfile new file mode 100644 index 00000000..096e3ece --- /dev/null +++ b/justfile @@ -0,0 +1,29 @@ +# Start Claude Code in sandbox mode (no SSH agent, skip permission prompts). +# Runs Claude inside a private mount namespace so VS Code's host-bridge +# sockets (vscode-ipc-*.sock, vscode-git-*.sock, vscode-ssh-auth-*.sock, +# vscode-remote-containers-ipc-*.sock) in /tmp and /run/user// are +# invisible — Claude sees empty tmpfs at those paths. setpriv +# --pdeathsig SIGKILL inside the inner script makes Claude die if the +# wrapping shell exits. See README-CLAUDE.md for the full sandbox model. +claude: + exec unshare -m --propagation private .devcontainer/claude-sandbox.sh + + +# Authenticate gh CLI with a GitHub PAT (token not stored in shell history) +gh-auth: + #!/bin/bash + read -sp "GitHub PAT: " t && echo + echo "$t" | gh auth login --with-token + unset t + gh auth setup-git + gh auth status + + +# Authenticate glab CLI with a GitLab PAT (token not stored in shell history). +# --git-protocol https prevents glab's SSH insteadOf rewrite. 
+glab-auth hostname="gitlab.com": + #!/bin/bash + read -sp "GitLab PAT for {{ hostname }}: " t && echo + echo "$t" | glab auth login --stdin --hostname {{ hostname }} --git-protocol https + unset t + glab auth status diff --git a/template/.devcontainer/devcontainer.json.jinja b/template/.devcontainer/devcontainer.json.jinja index 24f72656..835fc410 100644 --- a/template/.devcontainer/devcontainer.json.jinja +++ b/template/.devcontainer/devcontainer.json.jinja @@ -11,17 +11,8 @@ "remoteUser": "root",{% endif %} "remoteEnv": { // Allow X11 apps to run inside the container - "DISPLAY": "${localEnv:DISPLAY}",{% if add_claude %} - // Disable SSH agent forwarding — prevents Claude from using host SSH keys - "SSH_AUTH_SOCK": "", - // Disable VS Code git credential injection — prevents askpass from - // relaying host GitHub credentials into the container over the IPC socket - "GIT_ASKPASS": "", - "VSCODE_GIT_IPC_HANDLE": "", - "VSCODE_GIT_ASKPASS_MAIN": "", - "VSCODE_GIT_ASKPASS_NODE": "", - "VSCODE_GIT_ASKPASS_EXTRA_ARGS": "",{% endif %} - // Mark this shell as running inside the devcontainer + "DISPLAY": "${localEnv:DISPLAY}", + // Mark this shell as running inside the devcontainer. "IN_DEVCONTAINER": "1", // Put things that allow it in the persistent cache "PRE_COMMIT_HOME": "/cache/pre-commit", @@ -44,10 +35,7 @@ "python.terminal.activateEnvironment": false, // Workaround to prevent garbled python REPL in the terminal // https://github.com/microsoft/vscode-python/issues/25505 - "python.terminal.shellIntegration.enabled": false{% if sphinx %}, - // Only forward explicitly listed ports — auto-detection races with - // sphinx-autobuild and steals the port on restart - "remote.autoForwardPorts": false{% endif %} + "python.terminal.shellIntegration.enabled": false }, // Add the IDs of extensions you want installed when the container is created. 
"extensions": [ @@ -61,18 +49,17 @@ "anthropic.claude-code"{% endif %} ] } - },{% if sphinx %} - // Explicitly forward sphinx-autobuild port (auto-detection disabled above) - "forwardPorts": [ - 8000 - ],{% endif %} + }, // Create host-side dirs needed for bind mounts before the container starts "initializeCommand": "mkdir -p ${localEnv:HOME}/.config/terminal-config{% if add_claude %} ${localEnv:HOME}/.claude{% endif %}", "runArgs": [ // Allow the container to access the host X11 display and EPICS CA "--net=host", // Make sure SELinux does not disable with access to host filesystems like tmp - "--security-opt=label=disable" + "--security-opt=label=disable"{% if add_claude %}, + // Required for `unshare -m` in the `just claude` recipe; without it, + // rootless podman blocks mount-namespace creation. See README-CLAUDE.md. + "--cap-add=SYS_ADMIN"{% endif %} ], "mounts": [ // Mount in the user terminal config folder so it can be edited @@ -86,19 +73,19 @@ "source": "devcontainer-shared-cache", "target": "/cache", "type": "volume" - }{% if install_gh %}, + }{% if add_claude %}, // Persist gh auth across container rebuilds with per-repo scoped PAT { "source": "gh-auth-${localWorkspaceFolderBasename}", "target": "/root/.config/gh", "type": "volume" - }{% endif %}{% if install_glab %}, + }, // Persist glab auth across container rebuilds (GitLab CLI) { "source": "glab-auth-${localWorkspaceFolderBasename}", "target": "/root/.config/glab-cli", "type": "volume" - }{% endif %}{% if add_claude %}, + }, // Mount Claude config from host (settings, memory, skills) { "source": "${localEnv:HOME}/.claude", @@ -107,13 +94,6 @@ }{% endif %} ], // Mount the parent as /workspaces so we can pip install peers as editable - "workspaceMount": "source=${localWorkspaceFolder}/..,target=/workspaces,type=bind",{% if add_claude %} - "postCreateCommand": ".devcontainer/postCreate.sh", - "postStartCommand": ".devcontainer/postStart.sh", - // VS Code's Dev Containers extension re-injects its 
credential bridge - // when the editor attaches — after postStart has already run. Re-run - // the cleanup at attach so the leak is closed before any git operation. - "postAttachCommand": ".devcontainer/postStart.sh"{% else %} - // After the container is created, recreate the venv then make pre-commit first run faster - "postCreateCommand": "uv venv --clear && uv sync && pre-commit install --install-hooks"{% endif %} + "workspaceMount": "source=${localWorkspaceFolder}/..,target=/workspaces,type=bind", + "postCreateCommand": ".devcontainer/postCreate.sh" } diff --git a/template/.devcontainer/postCreate.sh.jinja b/template/.devcontainer/postCreate.sh.jinja new file mode 100755 index 00000000..b7a1177f --- /dev/null +++ b/template/.devcontainer/postCreate.sh.jinja @@ -0,0 +1,52 @@ +#!/bin/bash +set -euo pipefail + +# Refuse to continue without a git repo. setuptools-scm needs git +# tags to compute the package version, and pre-commit installs its +# hooks into .git/hooks — both fail with cryptic errors that VS Code +# then hides behind a generic "postCreateCommand failed" message. +# Better to stop here with a clear explanation. +if [ ! -d .git ]; then + cat >&2 <<'EOF' + +================================================================ +ERROR: This directory is not a git repository. + +setuptools-scm needs git history to compute the package version, +and pre-commit installs its hooks into .git/hooks. Neither will +work without a git repo. + +To fix this, run on the host (outside the devcontainer): + + git init -b main && git add . && git commit -m 'Initial commit' + +then rebuild the devcontainer. + +================================================================ + +EOF + exit 1 +fi + +# Install Python dependencies and pre-commit hooks. `uv venv --clear` wipes +# the venv that lives in /cache (a persistent named volume), so any bash +# hash entries pointing into the old venv (e.g. cached `pre-commit` path) +# are stale. 
`hash -r` after `uv venv --clear` drops them, so later lookups re-resolve against the +# freshly populated venv and against any new `uv` location after a base +# image bump. +uv venv --clear +hash -r +uv sync +pre-commit install --install-hooks + +# Init only submodules that aren't checked out yet — first-clone +# protection without touching already-initialized submodules (which +# would yank in-progress branch work to detached HEAD on rebuild). +if [ -f .gitmodules ]; then + missing=$(git submodule status | awk '/^-/ {print $2}') + # Explicit `if`: a bare `[ -n ... ] && ...` leaves a non-zero exit status + # when nothing is missing, which fails the script if this block runs last. + if [ -n "$missing" ]; then + git submodule update --init $missing + fi +fi +{% if add_claude %} +# Install Claude Code CLI +curl -fsSL https://claude.ai/install.sh | bash +{% endif -%} diff --git a/template/.devcontainer/{% if add_claude %}claude-sandbox.sh{% endif %} b/template/.devcontainer/{% if add_claude %}claude-sandbox.sh{% endif %} new file mode 100755 index 00000000..8c611b26 --- /dev/null +++ b/template/.devcontainer/{% if add_claude %}claude-sandbox.sh{% endif %} @@ -0,0 +1,82 @@ +#!/bin/bash +# Inner script for `just claude`: runs inside a private mount namespace +# (created by `unshare -m` from the justfile recipe). Mounts tmpfs over +# the locations VS Code uses for host bridges, builds a Claude-only +# /root/.gitconfig, then exec's claude with PR_SET_PDEATHSIG so it dies +# if its parent shell does. Requires CAP_SYS_ADMIN — granted via +# --cap-add=SYS_ADMIN in devcontainer.json's runArgs. See +# README-CLAUDE.md for the full sandbox model. +set -euo pipefail + +# VS Code drops IPC sockets (vscode-ipc-*.sock, vscode-git-*.sock, +# vscode-ssh-auth-*.sock, vscode-remote-containers-ipc-*.sock) and the +# vscode-remote-containers-*.js credential shim in /tmp, plus more in +# /run/user//. Replacing those directories with tmpfs in Claude's +# namespace makes them invisible. Outside the namespace (the user's +# regular terminal) VS Code keeps using them normally.
+mount -t tmpfs tmpfs /tmp +if [ -d /run/user ]; then + mount -t tmpfs tmpfs /run/user +fi + +# Mask credential directories the user may bind-mount from the host for +# their own use from non-Claude terminals (e.g. ~/.ssh for SSH-based +# git push). Claude sees an empty tmpfs; the user's regular shell sees +# the originals. +for d in /root/.ssh /root/.gnupg /root/.aws /root/.azure /root/.gcloud /root/.docker; do + if [ -d "$d" ]; then + mount -t tmpfs tmpfs "$d" + fi +done +# .netrc is a single file, not a dir — mask via bind to /dev/null. +if [ -e /root/.netrc ]; then + mount --bind /dev/null /root/.netrc +fi + +# Build a Claude-only /root/.gitconfig containing the in-container +# credential helpers (gh / glab) and HTTPS rewrites — and nothing else +# the user has on the host (no SSH url rewrites, no host-specific +# helpers). User identity is read from the original gitconfig BEFORE +# we bind over it, so commits Claude makes are still attributed. +git_name=$(git config --get user.name 2>/dev/null || true) +git_email=$(git config --get user.email 2>/dev/null || true) +gh_path=$(command -v gh || echo /usr/bin/gh) +glab_path=$(command -v glab || echo /usr/local/bin/glab) +cat > /etc/claude-gitconfig < ` only replaces if there is a single value. -# IMPORTANT: VS Code writes its credential.helper to /etc/gitconfig -# (system scope), not ~/.gitconfig — so the system scope must also be -# cleared, otherwise the helper still runs. 
-for scope in --system --global; do - git config $scope --unset-all credential.helper 2>/dev/null || true - git config $scope --unset-all credential.https://github.com.helper 2>/dev/null || true -{%- if install_glab %} - git config $scope --unset-all credential.https://gitlab.diamond.ac.uk.helper 2>/dev/null || true -{%- endif %} - git config $scope --unset-all url.ssh://git@github.com/.insteadOf 2>/dev/null || true -done - -# VS Code drops a Node-based credential bridge in /tmp that talks back -# to the host over a named pipe — even with VSCODE_GIT_IPC_HANDLE blank -# it can still surface host PATs. Remove it so any stale `credential.helper` -# entries cannot fall through to it. -rm -f /tmp/vscode-remote-containers-*.js - -# Force all SSH-style remotes to use HTTPS so the gh/glab credential helpers -# handle auth. This keeps the container SSH-key-free (Claude stays sandboxed) -# while still allowing push/pull on repos whose remotes are set to git@...:. -git config --global url."https://github.com/".insteadOf "git@github.com:" -{%- if install_glab %} -git config --global url."https://gitlab.diamond.ac.uk/".insteadOf "git@gitlab.diamond.ac.uk:" -{%- endif %} - -{% if install_gh -%} -# Pin per-host helper to the in-container gh path. The host gitconfig may -# reference /usr/local/bin/gh which doesn't exist here (apt installs to -# /usr/bin/gh); without this, git falls through to the next helper. -if command -v gh >/dev/null; then - git config --global credential.https://github.com.helper "!$(command -v gh) auth git-credential" -fi - -# If gh CLI has cached credentials (survive container rebuild), re-register -# its git credential helper so HTTPS remotes authenticate automatically. -if gh auth status &>/dev/null; then - gh auth setup-git -fi -{%- endif %} -{% if install_glab %} -# Pin per-host helper to the in-container glab path. 
-if command -v glab >/dev/null; then - git config --global credential.https://gitlab.diamond.ac.uk.helper "!$(command -v glab) auth git-credential" -fi -{%- endif %} diff --git a/template/Dockerfile.jinja b/template/Dockerfile.jinja index 422cfa17..416bbb95 100644 --- a/template/Dockerfile.jinja +++ b/template/Dockerfile.jinja @@ -9,19 +9,23 @@ RUN apt-get update -y && apt-get install -y --no-install-recommends \ # Node is required by Claude Code's hook runtime; just powers the # container's claude/gh-auth/glab-auth recipes in justfile. +# TODO: nodejs, just, gh and glab will move into the ubuntu-devcontainer +# base image once it ships on Ubuntu 26.04, where all are available +# from apt at sufficient versions. At that point these blocks can be +# dropped. RUN apt-get update -y && apt-get install -y --no-install-recommends \ nodejs \ just \ - && apt-get dist-clean{% endif %}{% if install_gh %} + && apt-get dist-clean -# GitHub CLI — used by Claude to authenticate to github.com via PAT +# GitHub CLI — used by Claude to authenticate to github.com via PAT. RUN curl -fsSL https://cli.github.com/packages/githubcli-archive-keyring.gpg | \ dd of=/usr/share/keyrings/githubcli-archive-keyring.gpg && \ chmod go+r /usr/share/keyrings/githubcli-archive-keyring.gpg && \ echo "deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/githubcli-archive-keyring.gpg] https://cli.github.com/packages stable main" \ | tee /etc/apt/sources.list.d/github-cli.list > /dev/null && \ apt-get update && apt-get install -y --no-install-recommends gh && \ - apt-get dist-clean{% endif %}{% if install_glab %} + apt-get dist-clean # GitLab CLI — used by Claude to authenticate to gitlab instances via PAT. # No apt repo, so install from the upstream release tarball. diff --git a/template/README.md.jinja b/template/README.md.jinja index 1365caa9..473058fa 100644 --- a/template/README.md.jinja +++ b/template/README.md.jinja @@ -16,7 +16,7 @@ how it does it, and why people should use it. 
{% if pypi %}PyPI | `pip install {{distribution_name}}` {% endif %}{% if docker %}Docker | `docker run ghcr.io/{{github_org | lower}}/{{repo_name}}:latest` {% endif %}{% if sphinx %}Documentation | <{{docs_url}}> -{% endif %}{% if add_claude %}Claude sandbox | [README-CLAUDE.md](./README-CLAUDE.md) +{% endif %}{% if add_claude %}Claude sandbox | README-CLAUDE.md {% endif %}Releases | <{{repo_url}}/releases> This is where you should put some images or code snippets that illustrate diff --git a/template/{% if add_claude %}.claude{% endif %}/commands/pr-squash.md b/template/{% if add_claude %}.claude{% endif %}/commands/pr-squash.md new file mode 100644 index 00000000..bde9db09 --- /dev/null +++ b/template/{% if add_claude %}.claude{% endif %}/commands/pr-squash.md @@ -0,0 +1,88 @@ +# PR Squash + +Create a clean PR by grouping the current branch's commits into logical squashed +commits on a new branch, then opening a pull request. + +## Instructions + +1. **Determine the base branch.** Use `$ARGUMENTS` if provided, otherwise detect + the repo's default branch (`main` or `master`) via `gh repo view --json + defaultBranchRef -q .defaultBranchRef.name`. + +2. **Collect the commit history.** Run: + ``` + git log --oneline --reverse .. + ``` + These are the commits to be grouped. + +3. **Analyse and group the commits.** Read the diffs for each commit + (`git show --stat ` and `git show ` for ambiguous cases). + Group commits into logical units: + - Each group should represent one cohesive change (a feature, a fix, a + refactor, a config change, etc.). + - Iterative fix-up commits ("fix typo", "try again", "wip") belong with the + feature they relate to. + - Keep genuinely independent changes in separate groups. + - Preserve chronological order between groups where possible. + +4. **Decide: one PR or multiple PRs.** If the groups fall into distinct, + unrelated topics (e.g. "developer tooling" vs "production feature"), plan + to create **separate PRs** — one per topic. 
Each PR gets its own squash + branch (`-squash-1`, `-squash-2`, etc.) and contains only + the groups for that topic. Groups that are closely related (e.g. a feature + and its config) stay in the same PR as separate squashed commits. + + Rule of thumb: if a reviewer would reasonably want to merge one topic + without the other, they belong in separate PRs. + +5. **Present the grouping plan.** Show the user a numbered list like: + ``` + PR 1: "Devcontainer hardening and tooling" + Group 1: "harden devcontainer and add Just task runner" + - abc1234 add security settings + - def5678 replace tox with just + + PR 2: "Add Dex OIDC authentication" + Group 2: "configure Dex and argocd-monitor" + - jkl3456 add Dex config + - mno7890 fix client secret + - pqr1234 fix audience mismatch + ``` + If all groups are closely related, show a single PR with multiple groups. + Ask the user to confirm or adjust before proceeding. + +6. **Create squash branch(es).** Once approved, for each PR: + ``` + git checkout -b + ``` + Use `-squash` for a single PR, or + `-squash-` (or a short descriptive suffix) for multiple. + +7. **Cherry-pick and squash each group.** For each group in the PR: + ``` + git cherry-pick --no-commit ... + git commit -m "" + ``` + Use a well-written conventional commit message for each group. Include a + short body if the group contains non-obvious changes. Preserve any + `Co-Authored-By` trailers from the original commits. + +8. **Push and create the PR(s).** + ``` + git push -u origin + ``` + Create each PR with `gh pr create` targeting the base branch. The PR body + should summarise its squashed commit group(s). + +9. **Switch back** to the original branch so the user's working state is + unchanged. + +## Edge cases +- If there are fewer than 3 commits, suggest the user just squash-merge + directly instead — but proceed if they insist. +- If cherry-pick conflicts arise, stop and inform the user rather than + auto-resolving. 
+- Never force-push or modify the original branch. +- If a `-squash` branch already exists, ask the user before overwriting. +- When merging a PR created by this command, use `gh pr merge --merge` + (not `--squash`) to preserve the curated commit structure. diff --git a/template/{% if add_claude %}.claude{% endif %}/hooks/sandbox-check.sh b/template/{% if add_claude %}.claude{% endif %}/hooks/sandbox-check.sh index 2fe82937..1d560b84 100755 --- a/template/{% if add_claude %}.claude{% endif %}/hooks/sandbox-check.sh +++ b/template/{% if add_claude %}.claude{% endif %}/hooks/sandbox-check.sh @@ -9,25 +9,36 @@ fail() { echo "BLOCKED: $1" >&2; exit 2; } [ -n "${IN_DEVCONTAINER:-}" ] || \ fail "not in the devcontainer (IN_DEVCONTAINER unset). Reopen the project in the devcontainer." -# Host SSH agent must not be reachable. +# IS_SANDBOX=1 is set by the inner `just claude` script after it sets up +# the private mount namespace. If it's missing, Claude was launched +# without the namespace and /tmp/vscode-*.sock host bridges are reachable. +[ -n "${IS_SANDBOX:-}" ] || \ + fail "IS_SANDBOX unset — Claude was not launched via \"just claude\", so the mount-namespace sandbox is not active." + +# Host SSH agent must not be reachable. claude-sandbox.sh blanks +# SSH_AUTH_SOCK on its exec line; if it is set, Claude was not started +# via `just claude`. [ -z "${SSH_AUTH_SOCK:-}" ] || \ - fail "SSH_AUTH_SOCK is set ($SSH_AUTH_SOCK) — host SSH agent is reachable." - -# VS Code git credential bridge must be silenced. -[ -z "${VSCODE_GIT_IPC_HANDLE:-}" ] || \ - fail "VSCODE_GIT_IPC_HANDLE is set — VS Code credential bridge is reachable." -[ -z "${GIT_ASKPASS:-}" ] || \ - fail "GIT_ASKPASS is set — VS Code askpass is injected." - -# The /tmp credential helper script VS Code drops in must have been removed. -if compgen -G '/tmp/vscode-remote-containers-*.js' >/dev/null; then - fail "/tmp/vscode-remote-containers-*.js bridge present — re-run .devcontainer/postStart.sh."
-fi - -# system-scope credential.helper is where VS Code injects; if anything -# is set there git will use it before our per-host helpers. -if git config --system --get credential.helper >/dev/null 2>&1; then - fail "system credential.helper is still set — re-run .devcontainer/postStart.sh." -fi + fail "SSH_AUTH_SOCK is set ($SSH_AUTH_SOCK) — host SSH agent is reachable. Run \"just claude\" or rebuild the devcontainer." + +# GIT_ASKPASS points at a script under /.vscode-server, which the +# namespace does NOT mask. If the env var is non-empty AND the file is +# reachable, claude-sandbox.sh's exec-line blank failed to apply. +[ ! -e "${GIT_ASKPASS:-}" ] || \ + fail "GIT_ASKPASS script ($GIT_ASKPASS) is reachable — claude-sandbox.sh did not blank the env var. Rebuild the devcontainer or re-run \"just claude\"." + +# /root/.gitconfig must be the bind-mounted /etc/claude-gitconfig (gh/glab +# helpers only). VS Code reconnects can drop the bind and re-expose the +# host gitconfig, whose [credential] helper invokes a node script under +# /.vscode-server via /tmp/vscode-remote-containers-*.js — leaking the +# host's git credentials into the sandbox. +! grep -q -e 'vscode-remote-containers' -e '\.vscode-server' /root/.gitconfig 2>/dev/null || \ + fail "/root/.gitconfig contains a VS Code credential bridge — the bind on /root/.gitconfig has been dropped (likely by a VS Code reconnect). Exit Claude and re-run \"just claude\"." + +# /etc/gitconfig must be masked (bind-mounted to /dev/null by claude-sandbox.sh). +# If the host's system-scope gitconfig is reachable, it can carry url.insteadof, +# core.hooksPath, http.proxy, or credential helpers that bypass /root/.gitconfig. +# `git config --system --list` returning any content means the mask is gone. +[ -z "$(git config --system --list 2>/dev/null)" ] || \ + fail "/etc/gitconfig is exposing system-scope settings — the bind-mount mask on /etc/gitconfig has been dropped. Exit Claude and re-run \"just claude\"."
exit 0 diff --git a/template/{% if add_claude %}.claude{% endif %}/skills/copier-derived/SKILL.md b/template/{% if add_claude %}.claude{% endif %}/skills/copier-derived/SKILL.md deleted file mode 100644 index e6965e62..00000000 --- a/template/{% if add_claude %}.claude{% endif %}/skills/copier-derived/SKILL.md +++ /dev/null @@ -1,60 +0,0 @@ ---- -name: copier-derived -description: This project was generated from python-copier-template. Use when editing devcontainer / Dockerfile / .github / pre-commit / justfile / .gitleaks / renovate config, or when the user asks about updating from the template, resolving copier conflicts, or why a config looks the way it does. ---- - -# Copier-template-derived project - -This project was generated from -[python-copier-template](https://github.com/diamondlightsource/python-copier-template). -The template is recorded in `.copier-answers.yml`: - -```bash -grep _src_path .copier-answers.yml # template source -grep _commit .copier-answers.yml # version applied -``` - -## Template-managed files - -`copier update` overwrites these from the template. Local edits will -either merge cleanly (good) or produce `.rej` / inline conflicts. -**Prefer editing the upstream template** for any change that should -apply to all projects — otherwise the next update reverts it. 
- -- `.devcontainer/**` -- `Dockerfile` -- `.github/workflows/*.yml`, `.github/CONTRIBUTING.md`, - `.github/ISSUE_TEMPLATE/`, `.github/PULL_REQUEST_TEMPLATE/` -- `.pre-commit-config.yaml`, `.gitleaks.toml`, `renovate.json` -- `justfile` -- `pyproject.toml` — top-level metadata, build-system, ruff/pyright/mypy - config, tox config (project deps and scripts are project-owned) -- `tests/conftest.py`, `tests/test_cli.py` -- `CLAUDE.md`, `README-CLAUDE.md`, `.claude/**` - -## Project-owned files - -Edit freely; never overwritten by `copier update`: - -- `src//**` -- New tests under `tests/` (other than the seeded `test_cli.py`) -- `README.md` (rendered once with placeholders, then yours) -- `.copier-answers.yml` answers (only `_commit` / `_src_path` are bumped - by `copier update`) - -## When the user asks to change a template-managed file - -1. Make the requested change in this project so it works now. -2. **Tell the user** the file is template-managed, and offer to also - update the upstream template if they have it checked out (commonly - at `/workspaces/python-copier-template`). Phrase as a choice — they - may want a project-only patch. -3. If both edits are made, the project edit can be reverted on the - next `copier update` once the template change reaches a release. - -## Running `copier update` - -The user runs this themselves (it touches many files); only run it -yourself if explicitly asked. Always pass `--trust`. After update, -resolve any conflicts (look for `<<<<<<<` markers and `.rej` files) -before committing. diff --git a/template/{% if add_claude %}README-CLAUDE.md{% endif %}.jinja b/template/{% if add_claude %}README-CLAUDE.md{% endif %}.jinja index 73f14aa9..2230e580 100644 --- a/template/{% if add_claude %}README-CLAUDE.md{% endif %}.jinja +++ b/template/{% if add_claude %}README-CLAUDE.md{% endif %}.jinja @@ -11,30 +11,92 @@ and how to verify the sandbox is intact. 
## What's locked down -- **No host SSH keys.** `SSH_AUTH_SOCK` is unset in `remoteEnv`, so any - SSH-agent forwarded by the host is invisible inside the container. No - private keys are mounted into `/root/.ssh` either — only `known_hosts`. -- **No VS Code git credential injection.** `GIT_ASKPASS`, - `VSCODE_GIT_IPC_HANDLE`, `VSCODE_GIT_ASKPASS_*` are all blanked, and - `postStart.sh` aggressively unsets `credential.helper` and per-host - helpers in BOTH `--system` (`/etc/gitconfig`) and `--global` scopes — - VS Code writes the helper into the *system* gitconfig, so a - global-only cleanup leaves the leak open. The script also removes the - `/tmp/vscode-remote-containers-*.js` bridge that VS Code drops in. - The cleanup re-runs on `postAttachCommand` because VS Code re-injects - the helper after `postStartCommand`. -- **Per-host helpers point at the in-container CLI.** The host gitconfig - often references `/usr/local/bin/gh`; here `gh` is at `/usr/bin/gh`. We - rewrite the helper to `command -v gh` / `command -v glab` so it doesn't - fall through to a stale entry. -- **All git remotes forced to HTTPS.** `url..insteadOf` rewrites - `git@github.com:` and `git@gitlab.diamond.ac.uk:` so push/pull always - uses the gh/glab credential helper rather than SSH. +- **No host bridges via VS Code IPC sockets.** VS Code's server creates + several unix sockets in `/tmp` and `/run/user//` that are bridges + back to the host: `vscode-ipc-*.sock` (runs `code` CLI on the host), + `vscode-git-*.sock` (git credential bridge — surfaces host PATs), + `vscode-ssh-auth-*.sock` (host SSH agent forward), and + `vscode-remote-containers-ipc-*.sock` (Dev Containers extension RPC). + These are re-created on every window attach and continue to appear up + to ~60s later — see [the threat-model writeup][demmel-blog] — so any + one-shot cleanup leaves a window. 
The defence is the **`unshare -m`** + call in `just claude`: Claude runs in a private mount namespace where + `/tmp` and `/run/user//` are fresh tmpfs. The bridges still exist + in the parent namespace (VS Code keeps using them normally) but are + invisible to Claude. No race, no sweeper, no recurring check needed. + Requires `--cap-add=SYS_ADMIN` in `runArgs` for rootless podman. + + [demmel-blog]: https://www.danieldemmel.me/blog/coding-agents-in-secured-vscode-dev-containers +- **No host SSH keys, AWS/GCP/Azure/Docker credentials, GPG keys, or + netrc.** The same `unshare -m` masks `/root/.ssh`, `/root/.gnupg`, + `/root/.aws`, `/root/.azure`, `/root/.gcloud`, `/root/.docker`, and + `/root/.netrc` (where present) with empty tmpfs. This means you *can* + bind-mount your host `~/.ssh` into the container if you want to use + SSH keys from a regular terminal — Claude's namespace blanks them out + while non-Claude shells see the originals. `SSH_AUTH_SOCK` is blanked + in the namespace exec line so VS Code's agent forwarding (which the + user terminal keeps) cannot reach Claude. +- **Claude dies with its parent shell.** `setpriv --pdeathsig SIGKILL` + on the inner `claude` exec sets `PR_SET_PDEATHSIG`, so if the wrapping + `unshare`'d shell exits (terminal closed, Ctrl-C, etc.) the kernel + immediately kills Claude — there's no orphaned-claude window where the + namespace context is gone but Claude is still running tools. +- **Claude has its own `/root/.gitconfig` via bind-mount.** + `claude-sandbox.sh` writes `/etc/claude-gitconfig` containing only the + in-container gh/glab credential helpers, the `git@*:` → `https://` + url rewrites, `safe.directory = *`, and the user identity (read from + the host-copied gitconfig before we bind over it, so commits Claude + makes are still attributed). It then `mount --bind`s that file onto + `/root/.gitconfig` inside Claude's namespace. 
The user's regular + terminal keeps the original `/root/.gitconfig` (host content, copied + by `dev.containers.copyGitConfig`'s default), so the host's SSH url + rewrites, custom credential helpers, and identity all work normally + outside Claude — but Claude only ever sees the curated config. +- **The "log in to GitHub" popup is closed for Claude.** The user + terminal keeps `git.terminalAuthentication` at its default (true), so + `GIT_ASKPASS` and `VSCODE_GIT_IPC_HANDLE` are injected into terminals + and the user gets the natural VS Code OAuth popup when an HTTPS git + operation needs credentials. For Claude two things close that channel: + `claude-sandbox.sh`'s exec line blanks `GIT_ASKPASS`, + `VSCODE_GIT_IPC_HANDLE`, `VSCODE_GIT_ASKPASS_NODE`, + `VSCODE_GIT_ASKPASS_MAIN`, `VSCODE_IPC_HOOK_CLI`, and `BROWSER`; and + the IPC socket the askpass script would talk to lives in `/tmp`, + which is tmpfs-masked. Both layers must be defeated for Claude to + surface a popup. + + `.claude/hooks/sandbox-check.sh` is the periodic verifier: it fires + on every prompt submit and refuses to run Claude if `IS_SANDBOX` is + unset, `SSH_AUTH_SOCK` is set, or the path `GIT_ASKPASS` references + is reachable. - **Auth is per-repo.** `gh-auth-${repo}` and `glab-auth-${repo}` are named volumes, not bind mounts — each project gets its own scoped PAT via `just gh-auth` / `just glab-auth`. Authenticate once per repo and the token survives container rebuilds. +## What the user terminal gets (and why) + +VS Code's regular terminal runs *outside* Claude's namespace. It is +deliberately set up with the standard developer experience so working +in the devcontainer feels natural: + +- **Host gitconfig copied in.** `dev.containers.copyGitConfig` defaults + to true, so `/root/.gitconfig` carries the user's name, email, push + preferences, and any host url rewrites. Claude overrides this via + bind-mount; the user terminal sees the original. 
+- **SSH agent forwarding.** VS Code forwards the host SSH agent into + the container as it normally would; `SSH_AUTH_SOCK` points at + `/tmp/vscode-ssh-auth-*.sock`. Inside Claude's namespace `/tmp` is + tmpfs and the variable is blanked, so Claude cannot reach the agent. +- **VS Code OAuth popup for HTTPS git.** `git.terminalAuthentication` + is left at its default, so when an HTTPS git operation needs creds + the user gets the standard "log in to GitHub" popup. Claude's exec + blanks `GIT_ASKPASS` / `VSCODE_GIT_IPC_HANDLE` and masks the IPC + socket path, so the popup channel does not exist for Claude. +- **`code` CLI and host browser.** `VSCODE_IPC_HOOK_CLI` and `BROWSER` + are inherited by the user terminal so `code ` and tools that + open URLs do the natural thing. Both env vars are blanked in + Claude's exec and the sockets they reference live in `/tmp`. + ## What's deliberately exposed (and why) - **`/root/.claude` is bind-mounted from the host's `~/.claude`.** Claude's @@ -60,23 +122,43 @@ and how to verify the sandbox is intact. ## Verifying the sandbox -From inside the container: +Run inside `just claude` itself (use Claude's bash tool, or run the same +commands manually after dropping into a shell that has `unshare -m` set +up the way `just claude` does). The mount-namespace defences only apply +inside that namespace — a regular VS Code terminal will see the bridges +exactly as VS Code created them, which is correct. ```bash -# Should be empty / unset +# Canaries: should be unset (env blanks) and 1 (sandbox marker) echo "SSH_AUTH_SOCK='${SSH_AUTH_SOCK:-}'" -ssh-add -l # "Could not open a connection..." -ls /root/.ssh # only known_hosts - -# Should NOT return a host PAT +echo "GIT_ASKPASS='${GIT_ASKPASS:-}'" +echo "VSCODE_GIT_IPC_HANDLE='${VSCODE_GIT_IPC_HANDLE:-}'" +echo "VSCODE_IPC_HOOK_CLI='${VSCODE_IPC_HOOK_CLI:-}'" +echo "BROWSER='${BROWSER:-}'" +echo "IS_SANDBOX='${IS_SANDBOX:-}'" # should be 1 +ssh-add -l # "Could not open a connection..." 
+ +# /tmp and /run/user should be empty tmpfs inside Claude's namespace. +ls /tmp # only claude-* runtime dirs +ls /run/user/*/ 2>/dev/null # nothing matching vscode-* +mount | grep -E ' on /tmp |/run/user' # tmpfs entries from claude-sandbox.sh + +# /root/.ssh and friends should be empty even if you bind-mount the host +# originals via devcontainer.json — Claude's namespace masks them. +ls /root/.ssh /root/.gnupg /root/.aws 2>/dev/null # all empty (or missing) + +# Claude's bind-mounted gitconfig: only gh/glab helpers + HTTPS rewrites, +# no host SSH url rewrites or unrelated host helpers. +git config --global --list | grep -E 'credential|insteadof' +mount | grep '/root/.gitconfig' # bind from /etc/claude-gitconfig + +# Should return creds only if `just gh-auth` has been run for this repo. printf 'protocol=https\nhost=github.com\n\n' | git credential fill - -# Should show only gh/glab helpers (no /tmp/vscode-remote-containers-*.js) -git config --global --list | grep -i credential ``` If `git credential fill` returns a `password=gho_...` for github.com when -you have not run `just gh-auth`, the sandbox is leaking — open an issue +you have not run `just gh-auth`, or if `ls /tmp` shows any `vscode-*` +entries inside the namespace, the sandbox is leaking — open an issue against the python-copier-template. ## Authenticating @@ -89,5 +171,11 @@ just glab-auth # gitlab.com (pass a hostname arg for self-hosted instances) ## Starting Claude ```bash -just claude # runs `claude --dangerously-skip-permissions` with SSH_AUTH_SOCK blanked +just claude # runs `claude --dangerously-skip-permissions` inside the mount namespace ``` + +After a rebuild from a previous version of this template, the user +terminal's `/root/.gitconfig` may still carry HTTPS rewrites or per-host +helpers that older `postStart.sh` runs added globally. Either rebuild +the devcontainer for a clean state, or `git config --global --unset-all` +the affected keys. 
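Reviewer sketch (not part of the patch): the README text above describes `.claude/hooks/sandbox-check.sh` as refusing to run Claude when `IS_SANDBOX` is unset, `SSH_AUTH_SOCK` is set, or the path `GIT_ASKPASS` references is reachable. The hook's actual contents are not shown in this diff, so the following is only a minimal illustration of those three conditions. The function name `sandbox_check`, the messages, and the choice to treat an empty `IS_SANDBOX` the same as an unset one are all assumptions.

```bash
#!/usr/bin/env bash
# Hypothetical sketch of the three checks the README attributes to
# .claude/hooks/sandbox-check.sh. The real hook may differ.

sandbox_check() {
    local failures=0

    # 1. The sandbox marker must be present (unset or empty both fail here;
    #    treating empty as a failure is an assumption of this sketch).
    if [ -z "${IS_SANDBOX:-}" ]; then
        echo "sandbox-check: IS_SANDBOX is unset" >&2
        failures=$((failures + 1))
    fi

    # 2. No SSH agent socket may be advertised in the environment.
    if [ -n "${SSH_AUTH_SOCK:-}" ]; then
        echo "sandbox-check: SSH_AUTH_SOCK is set" >&2
        failures=$((failures + 1))
    fi

    # 3. If GIT_ASKPASS names a path, that path must not exist in this namespace.
    if [ -n "${GIT_ASKPASS:-}" ] && [ -e "${GIT_ASKPASS}" ]; then
        echo "sandbox-check: GIT_ASKPASS path is reachable" >&2
        failures=$((failures + 1))
    fi

    # Succeed only when every check passed.
    [ "$failures" -eq 0 ]
}
```

A hook script built on this would call `sandbox_check` and exit non-zero when it fails. Which exit status the hook runner interprets as "refuse" is not stated in the README, so that mapping is left out of the sketch.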
diff --git a/template/{% if add_claude %}justfile{% endif %}.jinja b/template/{% if add_claude %}justfile{% endif %}.jinja index c7fe2f97..f83a107e 100644 --- a/template/{% if add_claude %}justfile{% endif %}.jinja +++ b/template/{% if add_claude %}justfile{% endif %}.jinja @@ -1,6 +1,12 @@ -# Start Claude Code in sandbox mode (no SSH agent, skip permission prompts) +# Start Claude Code in sandbox mode (no SSH agent, skip permission prompts). +# Runs Claude inside a private mount namespace so VS Code's host-bridge +# sockets (vscode-ipc-*.sock, vscode-git-*.sock, vscode-ssh-auth-*.sock, +# vscode-remote-containers-ipc-*.sock) in /tmp and /run/user// are +# invisible — Claude sees empty tmpfs at those paths. setpriv +# --pdeathsig SIGKILL inside the inner script makes Claude die if the +# wrapping shell exits. See README-CLAUDE.md for the full sandbox model. claude: - SSH_AUTH_SOCK= IS_SANDBOX=1 claude --dangerously-skip-permissions{% if install_gh %} + exec unshare -m --propagation private .devcontainer/claude-sandbox.sh # Authenticate gh CLI with a GitHub PAT (token not stored in shell history) @@ -10,7 +16,7 @@ gh-auth: echo "$t" | gh auth login --with-token unset t gh auth setup-git - gh auth status{% endif %}{% if install_glab %} + gh auth status # Authenticate glab CLI with a GitLab PAT (token not stored in shell history). @@ -20,4 +26,4 @@ glab-auth hostname="gitlab.com": read -sp "GitLab PAT for {{ '{{' }} hostname {{ '}}' }}: " t && echo echo "$t" | glab auth login --stdin --hostname {{ '{{' }} hostname {{ '}}' }} --git-protocol https unset t - glab auth status{% endif %} + glab auth status diff --git a/tests/test_example.py b/tests/test_example.py index 42459c11..8cf83631 100644 --- a/tests/test_example.py +++ b/tests/test_example.py @@ -215,8 +215,6 @@ def test_meta_matches_no_claude_template(tmp_path: Path): copy_project( tmp_path, add_claude=False, - install_gh=False, - install_glab=False, docker=False, docker_debug=False, )