privsep: uid separation (PrivSec Layer 1b) — Linux-proven, macOS-pending#118
privsep: uid separation (PrivSec Layer 1b) — Linux-proven, macOS-pending#118mattjoyce wants to merge 58 commits into
Conversation
…+ ADR grill decisions
Pure resolver (config -> ResolvedWorker, fail-closed on undefined grant) and a pure credential builder kept separate from the Setpgid lifecycle (Hickey A1). Threaded through the executor and spawnPlugin; unconfined is a no-op, a confined worker on a non-Unix platform fails the spawn closed. Unit-tested on darwin; the privileged EACCES wall test is linux+root-gated and validated on the Dell host.
…#92) Validated on a privileged Linux host (Dell, root golang:1.25 container): the dropped worker gets EACCES on the 0600 age key. t.TempDir nests under 0700 root parents the worker can't traverse, so the probe never ran; a single 0755 dir under /tmp fixes it. Wall test now PASS on privilege, skips cleanly elsewhere.
…te_dir, no duplicate uid (#84) Open map (any number of rows); two tiers are the documented posture, not a cap. uid<=0 rejected so a worker can never be root; duplicate uid rejected as false isolation (#87 would chown both state_dirs to one owner). Adds workers + WorkerConf to config.schema.json (authoring aid, ADR §11). Absent/empty map is valid here — the capability/refuse boot gate is #86.
…ource (#85) No-grant now falls back to the shared `default` tier when configured (operator decision Q2), else unconfined; an explicit grant still wins. Adds Source (granted/default/unconfined) to ResolvedWorker — the third member of the value the grill named. Authority split holds by construction: resolveWorker takes only the Config, never the manifest, so a plugin cannot choose its own privilege.
Boot gate (cmd/ductile/runtime.go): capability-to-drop x workers-configured must agree or the daemon refuses to start; service.unconfined is the explicit, loud override. Pure decision (evaluateBootGate) + a platform capability probe (root, or Linux CAP_SETUID/SETGID via /proc/self/status). The dispatcher drops only when the gate says enforce (WithPrivsepEnforce), so dev/override paths skip resolution. A refused drop is now a typed ErrWorkerDropFailed with its own plugin.drop_failed event, distinct from a missing binary, and classified TERMINAL (never retried). Verified on macOS; wall + refuse paths to re-verify on the Dell.
TestHasDropCapabilityAsRoot asserts hasDropCapability()==true under root (Dell / privileged container); skips on non-root dev. Full #86 enforce half now proven on privileged Linux: wall still bites under the enforce gate, capability probe reads true, drop-failed is typed + terminal.
When enforcing, ReconcileWorkerFilesystem locks the secrets surface (gateway-owned files tightened to 0600/0700, foreign-owned or still-loose fails closed) and gives each worker a private 0700 dir it owns (created/chowned). All-or-refuse — a failure aborts the boot, never run half-confined (B3). Wired into buildRuntime's enforce path. Secret-path tightening tested on macOS; worker-dir chown is root-gated (Dell).
…verifies under cap-only (#88) A cap-only gateway (CAP_SETUID+CAP_SETGID, no CAP_CHOWN) cannot chown worker dirs, so they are init-provisioned (sysusers.d accounts + tmpfiles.d 0700 dirs) and the gateway VERIFIES them fail-closed (reconcileWorkerDir: best-effort provision, then verify). Ships deploy/systemd/ templates + DEPLOYMENT.md §5b. Test gates switch from euid to hasDropCapability; adds TestPrivsepDropUnderCapabilityOnly for systemd-run.
…; carve #95 for Mac/rollout
A binary swapped since its grant must not inherit a trusted worker identity: bindWorkerToFingerprint (reusing #12's PluginVerifier) downgrades it to the most-restricted tier (untrusted) so a supply-chain swap cannot reach a sibling's memory; fail-closed if there is no tier to downgrade to. Blast-radius reduction, not an execution gate (grill B5). Wired into spawnPlugin with a worker_downgraded event. Unit-tested with a fake verifier; downgrade/keep/fail-closed paths covered.
TestPrivsepNegativeSuite drops one worker and probes the whole surface — age key, config, state DB, and a SIBLING worker's dir — asserting EACCES on all, plus a positive control (own dir writable). Two different workers (never default/default) so the cross-worker probe can't pass trivially; within-default residual noted, not claimed. CI builds the test binary and runs the privileged privsep tests under sudo (they skip as non-root — skip is never a pass), making the wall a real CI gate.
…dir-0700 + validated conversions validateWorkers now rejects uid/gid > MaxInt32 so the spawn-time uint32 conversion is provably safe (not just suppressed); inline #nosec G115 on the conversions referencing the validation, and #nosec G302 on the worker-dir 0700 chmod (0700 on a directory is the isolation floor, not an over-permissive file). gosec -severity medium now clean of privsep findings (2 remaining are pre-existing SQL in stopwatch_prune.go). Fixes the validateWorkers test to match the new message.
…workers) The privsep identity concept collided with the dispatch concurrency 'workers' (service.max_workers, the worker pool) and the vault 'principal'. Rename to the service-account lineage: config keys workers:→accounts:, per-plugin worker:→run_as:; Go WorkerConf→AccountConf, ResolvedWorker→ResolvedAccount, resolveWorker→resolveAccount, etc. Pure rename — no behaviour change. Adds run_as to the PluginConf JSON schema (was missing) and TestAccountYAMLKeys to pin the struct tags. Concurrency 'workers' deliberately untouched. Validates uid/gid range (gosec G115) was kept.
…+rebase, review order code→docs
…face, error taxonomy) Folds the safe, high-signal feedback from the 4-panel code review (all panels approved, zero merge-blockers): - boot-time grant resolution (validateAccountGrants): a run_as grant naming an undefined account now fails at config LOAD, not per-job at first spawn — moves a knowable fault from spawn-time/late to boot-time/early (the 4/4 headline finding). Plus boot warns when the default/untrusted tiers are absent. - SecretSurfacePaths: single-sourced secrets surface that reconciles the config DIRECTORY (not just the file), closing the file-form sibling-secret gap. - ErrNoDowngradeTarget split from ErrAccountDropFailed (honest taxonomy); the dispatcher treats both as terminal/no-retry. - vocab residue cleaned from code (log keys, comments, "a account" strings, workersConfigured -> accountsConfigured); _other euid documented as unused. Deferred with reasons (card #97): ResolvedAccount sum-type (naive Confined() is a zero-value fail-open footgun), once-per-spawn attestation (risky refactor, minor). gofmt/golangci-lint/gosec clean; suite green; independent review confirmed no fail-open regression.
- #96 (ADR vocab sync) -> done. - #83 epic: luminary code review done (unanimous approve, zero blockers); Tier A+B folded; T7 finding (live host loads a vault -> compose-attestation active, but no accounts map -> privsep unconfined, enforce macOS-pending #95). Next: doc review. - #97: deferred non-blocking review follow-ups (T3, T5, T9, T15, vocab lint).
…next-action at it Replaces the planned "documentation review" with an action card: capture the privsep theory (Naur) sorted into Diátaxis registers, folding the synthesis Tier D doc items (T5c, T11, T12, T13, T14, T17). Last gate before the PR.
ADR (explanation): T5c downgrade-vs-secret-denial asymmetry in §4; T17 point-in-time-at-boot tradeoff in §8. DEPLOYMENT.md: §5b how-to gains T11 uid-coupling SSOT, T12 Linux-proven/ macOS-pending (#95), T13 root-refusal dev note, and account:→run_as: drift fix; new §5c reference (keys, reserved keywords, boot-gate matrix, failure modes). Schema: add service.unconfined to ServiceConfig — was a live key absent from the strict schema, so config validate --whole would reject the one documented escape hatch. Closes #98.
…#101, v1.0 line #102 Kay × Victor session split: privsep is mechanism (v1.0 must be true/correct/documented), explain is comprehension (v1.x). #99 broadened to the explain verb family (privsep + vault). #100 promotes the T3 ResolvedAccount sum-type (the one correctness deferral) to the v1.0 line. #101 is the cheap anti-footgun (valid != enforcing, loud at boot + config check). #102 tracks the v1.0 ship line. Epic #83 open-list + v1.0 framing updated.
Deploying with
|
| Status | Name | Latest Commit | Preview URL | Updated (UTC) |
|---|---|---|---|---|
| ✅ Deployment successful! View logs |
ductile | 367c028 | Commit Preview URL Branch Preview URL |
Jun 07 2026, 10:52 AM |
|
Note Reviews pausedIt looks like this branch is under active development. To avoid overwhelming you with review comments due to an influx of new commits, CodeRabbit has automatically paused this review. You can configure this behavior by changing the Use the following commands to manage reviews:
Use the checkboxes below for quick actions:
📝 WalkthroughWalkthroughAdds account-based privilege separation across config/schema, loader validation, boot-gate evaluation, filesystem reconciliation, spawn-time UID/GID drops with fingerprint binding and fail-closed handling, dispatcher wiring, privileged integration tests (CI sudo gating), systemd deploy/install artifacts, and operator runbooks/kanban. ChangesPrivilege Separation Core & Enforcement
Sequence Diagram(s): sequenceDiagram
participant Runtime as buildRuntime
participant BootGate as dispatch.BootGate
participant Dispatcher as Dispatcher
participant Verifier as PluginVerifier
participant Executor as subprocessExecutor
Runtime->>BootGate: BootGate(cfg)
BootGate-->>Runtime: BootMode(enforce|unconfined)
Runtime->>Dispatcher: New(..., WithPrivsepEnforce(enforce))
Dispatcher->>Dispatcher: resolveAccount(plugin)
Dispatcher->>Verifier: verify fingerprint (if confined)
Verifier-->>Dispatcher: match / mismatch
Dispatcher->>Executor: newSubprocessExecutor(cmd, ResolvedAccount)
Executor->>Executor: applyAccountCredential(cmd, ResolvedAccount)
Executor-->>Dispatcher: start success or ErrAccountDropFailed
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Poem
✨ Finishing Touches🧪 Generate unit tests (beta)
|
… (Thinkpad) Deploy-as-new beats migration for home-entangled gateways; chown-only-/etc footgun; empty-DB for newer binary; wall-proof recipe (sys_exec id → uid 1002); config-dir EACCES DX; confinable vs unconfinable plugin split (admin automation stays unconfined → Phase 2 #103).
…ry boundary First repo ADR. Canonical layout (binary /usr/bin root, config /etc/ductile 0750 ductile, state+vault /var/lib/ductile, plugin code /opt root world-r-x, accounts state dirs per-account 0700, runtime /run, journald logs, nothing under /home). Age key resolved sshd-style at /etc/ductile/secret/age.key 0600 ductile-owned, with backup-exclusion promoted to a TESTED invariant. Validated by the live Thinkpad enforce deploy. Provisioning via sysusers.d/tmpfiles.d.
…o test, #105 FHS install artifact #104 enforces the filesystem-layout ADR load-bearing invariant (age key never in a --scope config backup) as a regression test. #105 turns the hand-run deploy-as-new into a repeatable v1.0 install implementing the full ADR layout (/opt code, /etc/ductile/secret, /run, packaged binary+units).
Operator direction: convert fully to the new enforced/FHS ductile and migrate everything. #103 updated (all confinable, shared default, fabric last, establish via #105 packaging, 5-step migration sequence). #106 = the unconfinable admin automation (docker/apt/perf/file_handler + notifies) gets its own unconfined ductile instance — the ADR data-plane/admin split made concrete.
There was a problem hiding this comment.
Actionable comments posted: 5
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
docs/runbooks/privsep-thinkpad-enforce.md (3)
79-79:⚠️ Potential issue | 🟠 Major | ⚡ Quick winAge-key path here drifts from the ADR and weakens the backup-safety contract.
The runbook uses
/etc/ductile/age.key, but the accepted ADR defines/etc/ductile/secret/age.keyand ties exclusion/invariant checks tosecret/. Keep pathing consistent across deploy + wall-bite instructions.Also applies to: 81-81, 84-84, 116-116
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/runbooks/privsep-thinkpad-enforce.md` at line 79, The runbook uses the wrong age key path (/etc/ductile/age.key) which diverges from the ADR; update every occurrence (the sudo cp command and any later references at the noted locations) to use the ADR-approved path /etc/ductile/secret/age.key so backup/exclusion invariants remain correct; specifically change the copy command and subsequent mentions to point at /etc/ductile/secret/age.key and ensure any directory comments reflect the "secret/" component.
80-80:⚠️ Potential issue | 🟠 Major | ⚡ Quick winRemove recursive
chownover/var/lib/ductile; it breaks account ownership boundaries.This command can clobber per-account dir ownership and cause fail-closed boot refusal. The same runbook already documents this exact footgun in the validated learnings.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/runbooks/privsep-thinkpad-enforce.md` at line 80, The recursive chown command "sudo chown -R ductile:ductile /etc/ductile /var/lib/ductile" is unsafe because it reassigns ownership of all per-account directories under /var/lib/ductile; remove or narrow the /var/lib/ductile portion. Replace the single recursive command with one that only chowns /etc/ductile recursively (keep "sudo chown -R ductile:ductile /etc/ductile") and either drop the recursive change for /var/lib/ductile or change ownership only for specific known service subdirs (e.g., target /var/lib/ductile/<service> or use a non-recursive chown on /var/lib/ductile itself) so per-account directories remain untouched.
23-24:⚠️ Potential issue | 🟠 Major | ⚡ Quick winConfig-dir mode is inconsistent (
0700vs ADR’s canonical0750).Please align the runbook wording with the accepted ADR mode contract to avoid operator misconfiguration during cutover.
Also applies to: 18-18, 138-140
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/runbooks/privsep-thinkpad-enforce.md` around lines 23 - 24, The runbook currently states the boot "fs-reconcile" tightens the config dir to 0700 and verifies each account's state_dir as 0700; update all occurrences to match the ADR canonical mode 0750 and clarify ownership semantics (e.g., “config dir mode 0750, owned by ductile”) so operators follow the ADR contract; search for the phrases "boot fs-reconcile", "config dir", and "state_dir" (also other mentions around the top and near the bottom of the document) and replace 0700 with 0750 and adjust the explanatory text to reflect the ADR-prescribed permissions and ownership model.
♻️ Duplicate comments (1)
docs/runbooks/privsep-thinkpad-enforce.md (1)
89-89:⚠️ Potential issue | 🔴 Critical | ⚡ Quick winRun config check against the staged binary, not the production path yet.
At this step,
/usr/local/bin/ductilecan still be old/missing; the check should target~/staging/ductile-newprepared in Step 2.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/runbooks/privsep-thinkpad-enforce.md` at line 89, The runbook currently runs the config check against /usr/local/bin/ductile; update the instruction so the command uses the staged binary ~/staging/ductile-new instead (keep sudo -u ductile and the existing arguments `config check --config /etc/ductile/config.yaml`) so the check targets the prepared staged binary (`~/staging/ductile-new`) rather than the production path `/usr/local/bin/ductile`.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@kanban/103-privsep-thinkpad-phase2-restore-plugins.md`:
- Line 56: Update the integration list string that currently contains
"discord/web/youtube/identity/healthdata/github-confinable" to use the official
capitalization "YouTube" (i.e., change "youtube" to "YouTube") so the entry
reads "discord/web/YouTube/identity/healthdata/github-confinable".
- Around line 39-54: The MD022 lint failure is due to missing blank lines after
ATX headings; add a single blank line immediately after the "## Operator
direction (2026-06-07) — FULL conversion, migrate everything" heading and
another blank line immediately after the "## Migration sequence (high level —
execute with ductile-admin over radio)" heading so each heading is followed by
an empty line as required by markdownlint.
In `@kanban/104-backup-excludes-secret-zero-test.md`:
- Around line 27-35: Add a blank line after the "## Acceptance" and "##
Narrative" headings to satisfy MD022; locate the two headings in the markdown
(the literal strings "## Acceptance" and "## Narrative") and insert an empty
line immediately below each so the headings are separated from the following
paragraph content.
In `@kanban/105-v1.0-fhs-install-artifact.md`:
- Around line 30-44: The markdown has MD022 heading-spacing violations: add a
single blank line immediately after each top-level heading "## Scope", "##
Acceptance", and "## Narrative" so there is one empty line between the heading
and the following paragraph; update the document to insert those blank lines
after each of those headings to satisfy markdownlint.
In `@kanban/106-ductile-admin-glue-unconfined-instance.md`:
- Around line 34-49: The markdown headings "## Shape", "## Acceptance", and "##
Narrative" each lack the required blank line beneath them (MD022); add a single
blank line immediately after each of those headings so the heading lines are
followed by an empty line before the next content paragraph or list, ensuring
the sections render correctly and satisfy the linter rule.
---
Outside diff comments:
In `@docs/runbooks/privsep-thinkpad-enforce.md`:
- Line 79: The runbook uses the wrong age key path (/etc/ductile/age.key) which
diverges from the ADR; update every occurrence (the sudo cp command and any
later references at the noted locations) to use the ADR-approved path
/etc/ductile/secret/age.key so backup/exclusion invariants remain correct;
specifically change the copy command and subsequent mentions to point at
/etc/ductile/secret/age.key and ensure any directory comments reflect the
"secret/" component.
- Line 80: The recursive chown command "sudo chown -R ductile:ductile
/etc/ductile /var/lib/ductile" is unsafe because it reassigns ownership of all
per-account directories under /var/lib/ductile; remove or narrow the
/var/lib/ductile portion. Replace the single recursive command with one that
only chowns /etc/ductile recursively (keep "sudo chown -R ductile:ductile
/etc/ductile") and either drop the recursive change for /var/lib/ductile or
change ownership only for specific known service subdirs (e.g., target
/var/lib/ductile/<service> or use a non-recursive chown on /var/lib/ductile
itself) so per-account directories remain untouched.
- Around line 23-24: The runbook currently states the boot "fs-reconcile"
tightens the config dir to 0700 and verifies each account's state_dir as 0700;
update all occurrences to match the ADR canonical mode 0750 and clarify
ownership semantics (e.g., “config dir mode 0750, owned by ductile”) so
operators follow the ADR contract; search for the phrases "boot fs-reconcile",
"config dir", and "state_dir" (also other mentions around the top and near the
bottom of the document) and replace 0700 with 0750 and adjust the explanatory
text to reflect the ADR-prescribed permissions and ownership model.
---
Duplicate comments:
In `@docs/runbooks/privsep-thinkpad-enforce.md`:
- Line 89: The runbook currently runs the config check against
/usr/local/bin/ductile; update the instruction so the command uses the staged
binary ~/staging/ductile-new instead (keep sudo -u ductile and the existing
arguments `config check --config /etc/ductile/config.yaml`) so the check targets
the prepared staged binary (`~/staging/ductile-new`) rather than the production
path `/usr/local/bin/ductile`.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: 08dfe3d8-e998-4746-9970-d2c490447357
📒 Files selected for processing (6)
docs/adr/filesystem-layout.mddocs/runbooks/privsep-thinkpad-enforce.mdkanban/103-privsep-thinkpad-phase2-restore-plugins.mdkanban/104-backup-excludes-secret-zero-test.mdkanban/105-v1.0-fhs-install-artifact.mdkanban/106-ductile-admin-glue-unconfined-instance.md
| ## Operator direction (2026-06-07) — FULL conversion, migrate everything | ||
| This is no longer "restore a subset" — the new enforced/FHS ductile **becomes the system** and | ||
| **everything migrates onto it** (operator: "we are converting to the new ductile... establish it and | ||
| migrate everything; you have the shape"). Decisions locked: | ||
| - **Scope:** restore ALL currently-enabled confinable integrations. | ||
| - **Isolation:** shared `default` uid, accept the §2 sibling-residual (no per-plugin tier yet). | ||
| - **fabric:** confine it into the new setup, but **LAST** (home-bound external tool, most work). | ||
| - **Unconfinable admin automation → a SECOND, UNCONFINED ductile instance** (admin-role gateway, | ||
| hygiene-only ADR Layer 1a, runs as a privileged user with docker-group/apt access): `docker compose` | ||
| (astro_rebuild), `check-apt-security.sh`, `stopwatch-daily-perf.py`, `file_handler`(/home reader), | ||
| and their `*_notify` siblings. Tracked in [[106-ductile-admin-glue-unconfined-instance]]. The enforced | ||
| gateway is the data plane; this is the ADR data-plane/admin split made concrete (not a fallback). | ||
| - **Establish it *properly*, not hand-run** — couple with [[105-v1.0-fhs-install-artifact]] so the | ||
| enforced gateway is laid down from the packaged FHS layout, not the runbook by hand. | ||
|
|
||
| ## Migration sequence (high level — execute with ductile-admin over radio) |
There was a problem hiding this comment.
Fix markdown heading spacing to satisfy MD022.
## Operator direction and ## Migration sequence need a blank line after each heading (markdownlint MD022).
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 39-39: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
[warning] 54-54: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@kanban/103-privsep-thinkpad-phase2-restore-plugins.md` around lines 39 - 54,
The MD022 lint failure is due to missing blank lines after ATX headings; add a
single blank line immediately after the "## Operator direction (2026-06-07) —
FULL conversion, migrate everything" heading and another blank line immediately
after the "## Migration sequence (high level — execute with ductile-admin over
radio)" heading so each heading is followed by an empty line as required by
markdownlint.
Source: Linters/SAST tools
|
|
||
| ## Migration sequence (high level — execute with ductile-admin over radio) | ||
| 1. Enforced data-plane gateway established via the FHS install (#105) — vault carried, admission on, locked. | ||
| 2. Migrate all confinable integrations onto it (discord/web/youtube/identity/healthdata/github-confinable), |
There was a problem hiding this comment.
Use the official capitalization for “YouTube”.
Change youtube to YouTube in the integration list for consistency and correctness.
🧰 Tools
🪛 LanguageTool
[uncategorized] ~56-~56: The official name of this popular video platform is spelled with a capital “T”.
Context: ...nable integrations onto it (discord/web/youtube/identity/healthdata/github-confinable),...
(YOUTUBE)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@kanban/103-privsep-thinkpad-phase2-restore-plugins.md` at line 56, Update the
integration list string that currently contains
"discord/web/youtube/identity/healthdata/github-confinable" to use the official
capitalization "YouTube" (i.e., change "youtube" to "YouTube") so the entry
reads "discord/web/YouTube/identity/healthdata/github-confinable".
Source: Linters/SAST tools
| ## Acceptance | ||
| - A test creates a config with an age key under the config dir, runs `system backup --scope config`, | ||
| and **fails** if the archive contains the age key path or any `secret/` entry. | ||
| - Covers the `config`, `plugins`, and `all` scopes (the nested ladder — each must still exclude the key). | ||
| - Also assert the `BACKUP_MANIFEST.txt` records the key as excluded (the existing claim becomes checked). | ||
| - If the test cannot be made to hold, the ADR's fallback applies: move the key to a sibling outside the | ||
| archived tree (systemd `LoadCredential` / `/etc/credstore.encrypted`) and document that instead. | ||
|
|
||
| ## Narrative |
There was a problem hiding this comment.
Add blank lines after headings to clear MD022 warnings.
## Acceptance and ## Narrative should be followed by a blank line.
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 27-27: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
[warning] 35-35: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@kanban/104-backup-excludes-secret-zero-test.md` around lines 27 - 35, Add a
blank line after the "## Acceptance" and "## Narrative" headings to satisfy
MD022; locate the two headings in the markdown (the literal strings "##
Acceptance" and "## Narrative") and insert an empty line immediately below each
so the headings are separated from the following paragraph content.
Source: Linters/SAST tools
| ## Scope | ||
| 1. Extend `deploy/` to the full ADR layout: tmpfiles for `/opt/ductile`, `/etc/ductile/secret`, | ||
| `/run/ductile`; align ownership/modes to the ADR table. | ||
| 2. A repeatable install path — a `.deb`/`.rpm` (preferred) OR an `install.sh` + tarball — that places | ||
| binary, units, sysusers/tmpfiles, and the `/opt` plugin tree, idempotently. | ||
| 3. Decide binary path (`/usr/bin` packaged vs `/usr/local/bin` source) and make the unit match. | ||
| 4. Verify on a clean host: install → enforce boots green → wall-bite passes (the runbook's §7), | ||
| with zero hand steps. | ||
|
|
||
| ## Acceptance | ||
| A clean host goes from package-install to a green enforced gateway (wall-bite passing) with no manual | ||
| file shuffling; the layout matches the ADR exactly; the runbook becomes "install the package", not a | ||
| 20-step hand procedure. | ||
|
|
||
| ## Narrative |
There was a problem hiding this comment.
Resolve heading-spacing markdownlint violations (MD022).
Add a blank line after ## Scope, ## Acceptance, and ## Narrative.
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 30-30: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
[warning] 39-39: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
[warning] 44-44: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@kanban/105-v1.0-fhs-install-artifact.md` around lines 30 - 44, The markdown
has MD022 heading-spacing violations: add a single blank line immediately after
each top-level heading "## Scope", "## Acceptance", and "## Narrative" so there
is one empty line between the heading and the following paragraph; update the
document to insert those blank lines after each of those headings to satisfy
markdownlint.
Source: Linters/SAST tools
| ## Shape | ||
| - A second ductile service running **unconfined** (no `accounts:` map → boot gate = unconfined, | ||
| quiet) as a **privileged user** (matt, or a dedicated `ductile-admin` user in the docker group). | ||
| - Its OWN config + state + port (e.g. `/etc/ductile-admin` or a `--user` instance; distinct listen | ||
| addr from the enforced gateway's 8081). NOT under the enforced gateway's `/etc/ductile`. | ||
| - Hygiene-only is honest here: it runs trusted first-party admin scripts the operator wrote; the | ||
| threat it doesn't defend (popped admin plugin) is accepted because these need privilege anyway. | ||
|
|
||
| ## Acceptance | ||
| - Admin automation (docker rebuild, apt check, perf) runs on the unconfined instance + fires its | ||
| discord notifies, exactly as before the conversion. | ||
| - The enforced data-plane gateway carries NONE of these (verified: no docker/apt/home-reader plugins | ||
| enabled there). | ||
| - Both instances coexist; old `--user` ductile-local decommissioned once both are green. | ||
|
|
||
| ## Narrative |
There was a problem hiding this comment.
Add blank lines after section headings to satisfy MD022.
## Shape, ## Acceptance, and ## Narrative each need a blank line below the heading.
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 34-34: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
[warning] 42-42: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
[warning] 49-49: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@kanban/106-ductile-admin-glue-unconfined-instance.md` around lines 34 - 49,
The markdown headings "## Shape", "## Acceptance", and "## Narrative" each lack
the required blank line beneath them (MD022); add a single blank line
immediately after each of those headings so the heading lines are followed by an
empty line before the next content paragraph or list, ensuring the sections
render correctly and satisfy the linter rule.
Source: Linters/SAST tools
…vault fail-open ductile-admin source diligence found the secret_ref-for-plugins scheme cannot work: PluginConf has no secret_ref (only webhook/relay do); the only plugin secret path is the spawn-time stdin secrets map, needing principal==plugin-name (kebab, no normalization) AND the plugin reading that map. #107 = the dev workstream to make first-party plugins vault-native (the real secret-holder enforce path). #108 = vault compose is fail-OPEN on an unknown principal (silent no-secret) — out of step with privsep fail-closed spine.
…locate, secrets not a config reconcile Relocating plugin code to /opt invalidates attestation → #93 fail-safe downgrades to untrusted until plugin lock re-records fingerprints (witnessed live). Plugin secrets need the #107 vault-native workstream (PluginConf has no secret_ref; principals must be kebab; vault compose fail-open #108). migrate-everything splits 3 ways: keyless-now / secret-holders-after-#107 / unconfinable-admin-#106.
…substitution Tested live on discord_notify: config check rejects webhook_url_ref as not satisfying required webhook_url; runtime "No webhook_url configured". Vault→plugin is ONLY the stdin secrets map → per-plugin kebab-rename + code change to read secrets[...]. Not a config tweak.
Enforced data-plane gateway live on :8081: vault carried, 5 keyless integrations enforced+attested on default(1001), config/plugin lock, admission re-enabled, all wall-bites pass. #93 downgrade proven live. Remaining carded as dev workstreams: #107 (HEAVY, secret-holders), #106 (admin instance), #108 (vault fail-open), #105 (FHS package). Old --user decommission deferred until #106+#107 land.
Compose already errors on unknown principal, but composePluginSecrets deliberately opted out (silent no-secrets) — fine for keyless plugins, a footgun for one that should receive secrets. New plugins.<name>.requires_vault: true makes an unknown/unregistered principal (or no vault wired) fail the spawn CLOSED + loud (ErrVaultPrincipalRequired). Default false preserves the coexistence opt-out. + schema + 2 tests (unknown-principal and no-composer fail closed).
doctor/config-check now WARNs (valid, not error) when secrets/vault are configured but no accounts map (plugins run unconfined → a popped plugin reads the decrypted secrets — the T7 trap), and when service.unconfined sits on a configured accounts map (wall declared but disabled). Boot now logs an explicit UNCONFINED posture line for the plain dev case too, so posture is never silent. + 2 tests.
Idempotent root installer lays the ADR filesystem-layout package layer: service accounts + the FHS dir skeleton (/etc/ductile + secret/ ductile-0700, /opt/ductile/plugins root-0755 world-rx, /var/lib + /run via tmpfiles), the binary (root 0755 never-setuid), and the cap-only systemd unit. Config/secrets/plugin-code stay operator data (runbook). Extended tmpfiles for /run/ductile. Turns the hand-run deploy-as-new into "run the package", validated by the upcoming redeploy.
There was a problem hiding this comment.
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (2)
schemas/config.schema.json (1)
37-46:⚠️ Potential issue | 🟠 Major | ⚡ Quick winDuplicate
tcc_pathskey in JSON schema.The
tcc_pathsproperty is defined twice (lines 37-46). JSON objects cannot have duplicate keys — the second definition silently overrides the first. Remove one of them; they also have slightly differentitemsconstraints (minLength: 1vs none).🐛 Proposed fix
"tcc_paths": { "type": "array", "description": "macOS-only: paths stat()-ed on cold start to surface TCC permission popups synchronously.", "items": { "type": "string", "minLength": 1 } }, - "tcc_paths": { - "type": "array", - "description": "macOS-only: List of absolute paths to stat() on cold start to surface TCC popups synchronously.", - "items": { "type": "string" } - },🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@schemas/config.schema.json` around lines 37 - 46, Remove the duplicate "tcc_paths" property in the JSON schema and consolidate into a single definition for "tcc_paths"; keep the more descriptive description ("macOS-only: List of absolute paths to stat() on cold start to surface TCC popups synchronously.") and ensure the "items" constraint enforces non-empty strings (type: "string" with "minLength": 1) so the schema validates absolute path entries correctly.docs/runbooks/privsep-thinkpad-enforce.md (1)
80-80:⚠️ Potential issue | 🟠 Major | ⚡ Quick winRemove recursive ownership reset of
/var/lib/ductilein cutover instructions.Line 80 currently
chown -Rs/var/lib/ductile, which overwrites account-ownedstate_dirownership created by tmpfiles and can cause fail-closed startup refusal. The same runbook confirms this failure mode in Lines 151-153.Suggested doc fix
-sudo chown -R ductile:ductile /etc/ductile /var/lib/ductile +sudo chown -R ductile:ductile /etc/ductile +# Do not recursively chown /var/lib/ductile; tmpfiles owns account dirs.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/runbooks/privsep-thinkpad-enforce.md` at line 80, The runbook currently runs the command "sudo chown -R ductile:ductile /etc/ductile /var/lib/ductile" which recursively resets ownership of /var/lib/ductile and clobbers the tmpfiles-created state_dir ownership; remove /var/lib/ductile from that chown. Update the instruction to only chown /etc/ductile (e.g., "sudo chown -R ductile:ductile /etc/ductile") and add a short note explaining not to change ownership of tmpfiles-managed state_dir (/var/lib/ductile) to avoid startup refusal.
♻️ Duplicate comments (3)
kanban/83-privsep-epic.md (1)
164-165:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winAdd a blank line below the Narrative heading (MD022).
Line 164 should be followed by an empty line before the first bullet at Line 165.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@kanban/83-privsep-epic.md` around lines 164 - 165, The Markdown linter flagged a missing blank line after the "## Narrative" heading; open the document and insert a single empty line immediately after the "## Narrative" header (before the first bullet that starts with "- **Authoritative design:** `~/Obsidian/Personal1/ductile/Ductile - PrivSec and Secrets.md`") so the header is separated from the following list and satisfies MD022.Source: Linters/SAST tools
kanban/101-privsep-valid-not-enforcing-antifootgun.md (1)
32-33:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winFix MD022 heading spacing under Acceptance and Narrative.
Line 32 and Line 38 need a blank line immediately below each heading to satisfy markdown lint (
blanks-around-headings).Also applies to: 38-39
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@kanban/101-privsep-valid-not-enforcing-antifootgun.md` around lines 32 - 33, Add a blank line immediately after the "Acceptance" and "Narrative" headings to satisfy markdown lint rule MD022 (blanks-around-headings); locate the lines containing the headings "Acceptance" and "Narrative" and insert an empty line directly below each so there is one blank line between the heading and the following paragraph/content.Source: Linters/SAST tools
kanban/105-v1.0-fhs-install-artifact.md (1)
30-31:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winAdd missing blank lines after section headings (MD022).
Line 30, Line 39, and Line 44 should each be followed by a blank line before body content.
Also applies to: 39-40, 44-45
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@kanban/105-v1.0-fhs-install-artifact.md` around lines 30 - 31, Add a blank line immediately after each Markdown section heading in this document (for example after "## Scope" and the other top-level/subsection headings flagged in the review) so that every heading is followed by one empty line before its body content, thereby satisfying MD022; update all heading occurrences in the file to include that single blank line.Source: Linters/SAST tools
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@kanban/107-plugins-vault-native-conformance.md`:
- Around line 48-49: The markdown has MD022 heading-spacing violations: add
exactly one blank line after each affected heading (the "Work (per
secret-needing first-party plugin: ...)" heading and the "Acceptance" and
"Narrative" headings referenced) so that there is a single empty line between
the heading line and the following content; update those headings in the file
(search for the exact heading text shown in the diff) to insert one blank line
after each heading to resolve the lint errors.
In `@kanban/108-vault-compose-fail-open-unknown-principal.md`:
- Around line 30-31: The markdown violates MD022 by not having blank lines after
heading lines—insert a single blank line immediately after the "Options (decide
on pickup)" heading and likewise after the "Acceptance" and "Narrative" headings
so each heading is followed by one empty line before its paragraph/list; update
the blocks containing those headings (e.g., the "Options (decide on pickup)"
heading and the subsequent lists at lines referenced) to include the blank line
to satisfy MD022.
---
Outside diff comments:
In `@docs/runbooks/privsep-thinkpad-enforce.md`:
- Line 80: The runbook currently runs the command "sudo chown -R ductile:ductile
/etc/ductile /var/lib/ductile" which recursively resets ownership of
/var/lib/ductile and clobbers the tmpfiles-created state_dir ownership; remove
/var/lib/ductile from that chown. Update the instruction to only chown
/etc/ductile (e.g., "sudo chown -R ductile:ductile /etc/ductile") and add a
short note explaining not to change ownership of tmpfiles-managed state_dir
(/var/lib/ductile) to avoid startup refusal.
In `@schemas/config.schema.json`:
- Around line 37-46: Remove the duplicate "tcc_paths" property in the JSON
schema and consolidate into a single definition for "tcc_paths"; keep the more
descriptive description ("macOS-only: List of absolute paths to stat() on cold
start to surface TCC popups synchronously.") and ensure the "items" constraint
enforces non-empty strings (type: "string" with "minLength": 1) so the schema
validates absolute path entries correctly.
---
Duplicate comments:
In `@kanban/101-privsep-valid-not-enforcing-antifootgun.md`:
- Around line 32-33: Add a blank line immediately after the "Acceptance" and
"Narrative" headings to satisfy markdown lint rule MD022
(blanks-around-headings); locate the lines containing the headings "Acceptance"
and "Narrative" and insert an empty line directly below each so there is one
blank line between the heading and the following paragraph/content.
In `@kanban/105-v1.0-fhs-install-artifact.md`:
- Around line 30-31: Add a blank line immediately after each Markdown section
heading in this document (for example after "## Scope" and the other
top-level/subsection headings flagged in the review) so that every heading is
followed by one empty line before its body content, thereby satisfying MD022;
update all heading occurrences in the file to include that single blank line.
In `@kanban/83-privsep-epic.md`:
- Around line 164-165: The Markdown linter flagged a missing blank line after
the "## Narrative" heading; open the document and insert a single empty line
immediately after the "## Narrative" header (before the first bullet that starts
with "- **Authoritative design:** `~/Obsidian/Personal1/ductile/Ductile -
PrivSec and Secrets.md`") so the header is separated from the following list and
satisfies MD022.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: ba54dd28-5bd6-4a92-8cd0-b30aa5054d03
📒 Files selected for processing (16)
cmd/ductile/runtime.godeploy/install.shdeploy/systemd/ductile-accounts.tmpfiles.confdocs/runbooks/privsep-thinkpad-enforce.mdinternal/config/types.gointernal/dispatch/dispatcher.gointernal/dispatch/secret_delivery.gointernal/dispatch/secret_delivery_test.gointernal/doctor/doctor.gointernal/doctor/doctor_test.gokanban/101-privsep-valid-not-enforcing-antifootgun.mdkanban/105-v1.0-fhs-install-artifact.mdkanban/107-plugins-vault-native-conformance.mdkanban/108-vault-compose-fail-open-unknown-principal.mdkanban/83-privsep-epic.mdschemas/config.schema.json
| ## Options (decide on pickup) | ||
| - **Fail-closed:** if a plugin is configured/known to require secrets and its principal is unknown, |
There was a problem hiding this comment.
Insert blank lines after Options/Acceptance/Narrative headings.
Line 30, Line 38, and Line 42 should have a blank line before the following paragraph/list to satisfy MD022.
Also applies to: 38-39, 42-43
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 30-30: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@kanban/108-vault-compose-fail-open-unknown-principal.md` around lines 30 -
31, The markdown violates MD022 by not having blank lines after heading
lines—insert a single blank line immediately after the "Options (decide on
pickup)" heading and likewise after the "Acceptance" and "Narrative" headings so
each heading is followed by one empty line before its paragraph/list; update the
blocks containing those headings (e.g., the "Options (decide on pickup)" heading
and the subsequent lists at lines referenced) to include the blank line to
satisfy MD022.
Source: Linters/SAST tools
… never become a root drop ResolvedAccount encodes one fact in several fields; a zero/malformed value could be misread as silent-unconfined or (under a naive Confined()) confined-to-uid-0 = root. Validate() asserts the invariant (unconfined carries no identity; confined = uid>0/gid>0/name + confined source) and applyAccountCredential calls it at the drop seam → fail closed (ErrAccountDropFailed), never root. Matrix test + seam test.
Existing TestBackupIncludesVaultBlobNotKey covers age.key in the config dir; this adds the ADR layout (key in a secret/ subdir, secrets.age_key_file pointing inside the config tree) and asserts a config/all archive contains NEITHER the key file NOR any secret/ entry. The age-key-outside-the- bundle invariant is now enforced, not incidental.
…l (+ schema vault_file) The vault rejects non-kebab principal names; #107 would otherwise need every secret-holding plugin RENAMED to kebab (breaking config/pipeline refs). New plugins.<name>.vault_principal lets a snake_case plugin (discord_notify) compose its secrets under a kebab principal (discord-notify) WITHOUT a rename; attestation still uses the plugin name. This is the repo-side enabler — the remaining #107 work (each plugin reading the stdin secrets map) lives in the external plugin repos. Also added the missing secrets.vault_file to the schema (drift: live key, would fail strict validate). +test.
The unconfined second ductile role for the unconfinable admin automation (docker/apt/fabric/ file_handler + their notifies). deploy/systemd/ductile-admin.service (template) + a runbook for the two-instance split + stand-up. The live stand-up on the host is operational (separate user/port/config).
…dmin instance template)
…live) discord_notify proved end-to-end: webhook from the encrypted vault over stdin, uid 1001, attested, requires_vault fail-closed. Recorded the reusable 5-step recipe on card #107 (read stdin secrets map, manifest required:[], kebab principal, vault_principal+requires_vault, relocate+lock) + a runbook pointer. NOTE: the discord_notify plugin code change is in the external ductile-plugins repo.
github_repo_sync et al are vault-native + ready but exit 127 under enforce: uv run --script needs uv on PATH + writable HOME/cache and walks config up-tree into the 0700 cwd. Affects the 4 git plugins; disabled for now. Fix: account-friendly uv (UV_CACHE_DIR/HOME + cwd off 0700) or de-uv to plain python3. Found during the #107 push.
There was a problem hiding this comment.
Actionable comments posted: 3
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (3)
docs/runbooks/privsep-thinkpad-enforce.md (1)
80-80:⚠️ Potential issue | 🟠 Major | ⚡ Quick winRemove recursive chown of
/var/lib/ductilefrom cutover steps.Line 80 can break privsep boot by overwriting account-owned state-dir ownership; this is the same footgun documented later in this runbook.
Suggested fix
-sudo chown -R ductile:ductile /etc/ductile /var/lib/ductile +sudo chown -R ductile:ductile /etc/ductile🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@docs/runbooks/privsep-thinkpad-enforce.md` at line 80, Remove the recursive ownership change of /var/lib/ductile in the cutover step that currently runs "sudo chown -R ductile:ductile /etc/ductile /var/lib/ductile"; instead only change ownership for /etc/ductile (keep the -R if needed) and do not touch /var/lib/ductile (or, if specific files under /var/lib/ductile must be fixed, target them explicitly and non-recursively). Ensure the offending command string is updated so /var/lib/ductile is omitted to avoid overwriting privsep-owned state-dir ownership.internal/dispatch/process_unix.go (1)
49-50: 🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick winMake the process setup helpers order-insensitive.
applyAccountCredentialnow preserves an existingSysProcAttr, butconfigurePluginProcessstill overwrites it. If a caller ever invokes these helpers in the opposite order, the credential disappears and the plugin silently runs unconfined.♻️ Proposed fix
func configurePluginProcess(cmd *exec.Cmd) { - cmd.SysProcAttr = &syscall.SysProcAttr{Setpgid: true} + if cmd.SysProcAttr == nil { + cmd.SysProcAttr = &syscall.SysProcAttr{} + } + cmd.SysProcAttr.Setpgid = true }Also applies to: 64-76
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/dispatch/process_unix.go` around lines 49 - 50, configurePluginProcess currently overwrites cmd.SysProcAttr and can clobber credentials set by applyAccountCredential; instead, make configurePluginProcess merge into an existing SysProcAttr (or create one if nil) by setting the Setpgid field to true on the existing *syscall.SysProcAttr so existing fields like Credential are preserved; apply the same merge approach to the other helper(s) around lines 64-76 that also set SysProcAttr to avoid order-sensitive behavior between configurePluginProcess and applyAccountCredential.internal/dispatch/dispatcher.go (1)
400-425:⚠️ Potential issue | 🟠 Major | ⚡ Quick winWrite vault audit rows with the resolved principal.
When
vault_principalremaps the auth identity, the failure audit still persistsPrincipal: job.Plugin. Unknown/revoked-principal failures are then recorded against the wrong subject, which breaks principal-based audit lookups and obscures the actual misconfiguration.🐛 Proposed fix
principal := job.Plugin if vp := d.cfg.Plugins[job.Plugin].VaultPrincipal; vp != "" { principal = vp // `#107`: map snake plugin name → kebab vault principal without renaming } secrets, err := composePluginSecrets(d.secretComposer, d.pluginVerifier, job.Plugin, principal, d.cfg.Plugins[job.Plugin].RequiresVault, jobLogger) @@ if auditErr := d.state.AppendVaultAudit(ctx, state.VaultAuditEvent{ - Op: op, Principal: job.Plugin, Actor: "core", Outcome: outcome, Detail: err.Error(), + Op: op, Principal: principal, Actor: "core", Outcome: outcome, Detail: err.Error(), }); auditErr != nil { jobLogger.Error("vault audit write failed", "op", op, "error", auditErr) }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@internal/dispatch/dispatcher.go` around lines 400 - 425, The audit row is written using the original plugin name (job.Plugin) rather than the resolved vault principal; update the call that appends the vault audit (d.state.AppendVaultAudit) to use the resolved principal variable (principal) so failures due to unknown/revoked vault identities are recorded against the correct subject; locate the flow around composePluginSecrets, composeFailureEscalation and the error handling block that builds the VaultAuditEvent and change Principal: job.Plugin to Principal: principal (and audit any related event publishing or logging that should reflect the resolved principal).
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@cmd/ductile/system_backup_test.go`:
- Line 496: The test loop over backup scopes omits scopePlugins; update the
iteration slice that currently reads "for _, scope := range
[]backupScope{scopeConfig, scopeAll} {" to include scopePlugins (e.g.,
[]backupScope{scopeConfig, scopePlugins, scopeAll}) so the secret-zero exclusion
invariant is exercised for plugins as well; adjust any expectations in the
surrounding test that rely on the scope set in the loop (function/test using
backupScope, scopeConfig, scopePlugins, scopeAll) accordingly.
In `@docs/runbooks/ductile-admin-instance.md`:
- Around line 46-47: Add a blank line after the header "## Notes" to satisfy
markdownlint rule MD022; edit the heading block in
docs/runbooks/ductile-admin-instance.md (the "## Notes" line) and insert one
empty line between that header and the following list item so the header is
separated from the content.
In `@schemas/config.schema.json`:
- Around line 351-357: Add a kebab-case constraint to the vault_principal schema
and update the requires_vault description to reflect that fail-closed uses
vault_principal when set; specifically, for the "vault_principal" property add a
pattern that enforces lowercase alphanumerics and hyphen-separated segments
(kebab-case) and optional minLength, and for the "requires_vault" property
update the description to state the fail-closed decision will use the configured
vault_principal (if present) rather than only the plugin name. Ensure you
reference the existing "vault_principal" and "requires_vault" property entries
when making these changes.
---
Outside diff comments:
In `@docs/runbooks/privsep-thinkpad-enforce.md`:
- Line 80: Remove the recursive ownership change of /var/lib/ductile in the
cutover step that currently runs "sudo chown -R ductile:ductile /etc/ductile
/var/lib/ductile"; instead only change ownership for /etc/ductile (keep the -R
if needed) and do not touch /var/lib/ductile (or, if specific files under
/var/lib/ductile must be fixed, target them explicitly and non-recursively).
Ensure the offending command string is updated so /var/lib/ductile is omitted to
avoid overwriting privsep-owned state-dir ownership.
In `@internal/dispatch/dispatcher.go`:
- Around line 400-425: The audit row is written using the original plugin name
(job.Plugin) rather than the resolved vault principal; update the call that
appends the vault audit (d.state.AppendVaultAudit) to use the resolved principal
variable (principal) so failures due to unknown/revoked vault identities are
recorded against the correct subject; locate the flow around
composePluginSecrets, composeFailureEscalation and the error handling block that
builds the VaultAuditEvent and change Principal: job.Plugin to Principal:
principal (and audit any related event publishing or logging that should reflect
the resolved principal).
In `@internal/dispatch/process_unix.go`:
- Around line 49-50: configurePluginProcess currently overwrites cmd.SysProcAttr
and can clobber credentials set by applyAccountCredential; instead, make
configurePluginProcess merge into an existing SysProcAttr (or create one if nil)
by setting the Setpgid field to true on the existing *syscall.SysProcAttr so
existing fields like Credential are preserved; apply the same merge approach to
the other helper(s) around lines 64-76 that also set SysProcAttr to avoid
order-sensitive behavior between configurePluginProcess and
applyAccountCredential.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: ASSERTIVE
Plan: Pro Plus
Run ID: 5a2d2071-5253-4acf-9e51-3f1dfd10853a
📒 Files selected for processing (17)
cmd/ductile/system_backup_test.godeploy/systemd/ductile-admin.servicedocs/runbooks/ductile-admin-instance.mddocs/runbooks/privsep-thinkpad-enforce.mdinternal/config/types.gointernal/dispatch/account.gointernal/dispatch/account_test.gointernal/dispatch/account_unix_test.gointernal/dispatch/dispatcher.gointernal/dispatch/process_unix.gointernal/dispatch/secret_delivery.gointernal/dispatch/secret_delivery_test.gokanban/100-privsep-resolvedaccount-sumtype-t3.mdkanban/104-backup-excludes-secret-zero-test.mdkanban/106-ductile-admin-glue-unconfined-instance.mdkanban/107-plugins-vault-native-conformance.mdschemas/config.schema.json
| // inside the config tree). A `--scope config`/`all` archive must contain NEITHER the | ||
| // key file NOR any `secret/` entry — the key's safety is enforced, not incidental. | ||
| func TestBackupNeverContainsSecretZeroADRLayout(t *testing.T) { | ||
| for _, scope := range []backupScope{scopeConfig, scopeAll} { |
There was a problem hiding this comment.
🛠️ Refactor suggestion | 🟠 Major | ⚡ Quick win
Include scopePlugins in the secret-zero exclusion matrix.
The new invariant test skips a meaningful scope case; add scopePlugins so the exclusion guarantee is exercised across the full scope ladder.
Suggested fix
- for _, scope := range []backupScope{scopeConfig, scopeAll} {
+ for _, scope := range []backupScope{scopeConfig, scopePlugins, scopeAll} {📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| for _, scope := range []backupScope{scopeConfig, scopeAll} { | |
| for _, scope := range []backupScope{scopeConfig, scopePlugins, scopeAll} { |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@cmd/ductile/system_backup_test.go` at line 496, The test loop over backup
scopes omits scopePlugins; update the iteration slice that currently reads "for
_, scope := range []backupScope{scopeConfig, scopeAll} {" to include
scopePlugins (e.g., []backupScope{scopeConfig, scopePlugins, scopeAll}) so the
secret-zero exclusion invariant is exercised for plugins as well; adjust any
expectations in the surrounding test that rely on the scope set in the loop
(function/test using backupScope, scopeConfig, scopePlugins, scopeAll)
accordingly.
| ## Notes | ||
| - It is correct for this instance to log **UNCONFINED** at boot (the #101 posture line) — that's the |
There was a problem hiding this comment.
Add a blank line after ## Notes to satisfy MD022.
Suggested fix
## Notes
+
- It is correct for this instance to log **UNCONFINED** at boot (the `#101` posture line) — that's the📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| ## Notes | |
| - It is correct for this instance to log **UNCONFINED** at boot (the #101 posture line) — that's the | |
| ## Notes | |
| - It is correct for this instance to log **UNCONFINED** at boot (the `#101` posture line) — that's the |
🧰 Tools
🪛 markdownlint-cli2 (0.22.1)
[warning] 46-46: Headings should be surrounded by blank lines
Expected: 1; Actual: 0; Below
(MD022, blanks-around-headings)
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@docs/runbooks/ductile-admin-instance.md` around lines 46 - 47, Add a blank
line after the header "## Notes" to satisfy markdownlint rule MD022; edit the
heading block in docs/runbooks/ductile-admin-instance.md (the "## Notes" line)
and insert one empty line between that header and the following list item so the
header is separated from the content.
Source: Linters/SAST tools
| "vault_principal": { | ||
| "type": "string", | ||
| "description": "The vault principal this plugin composes its secrets under (default: the plugin name). The vault rejects non-kebab principal names, so this maps a snake_case plugin (discord_notify) to a kebab principal (discord-notify) WITHOUT renaming the plugin and breaking its config/pipeline refs (#107). Attestation still uses the plugin name." | ||
| }, | ||
| "requires_vault": { | ||
| "type": "boolean", | ||
| "description": "When true, vault secret delivery is MANDATORY for this plugin: an unknown/unregistered vault principal (the plugin name) fails the spawn CLOSED instead of opting out (#108). Closes the fail-open seam where a misnamed/unregistered principal silently runs the plugin with no secrets. Default false preserves the coexistence opt-out for keyless plugins.", |
There was a problem hiding this comment.
Reject invalid vault_principal values in the schema.
Line 351 currently accepts any string, but the description on Line 353 says the vault rejects non-kebab principals. That means config validate can pass a value the secret-delivery path is documented to reject later. Line 357 is also now stale: with vault_principal set, the fail-closed decision is no longer based only on the plugin name.
Suggested schema fix
"vault_principal": {
"type": "string",
- "description": "The vault principal this plugin composes its secrets under (default: the plugin name). The vault rejects non-kebab principal names, so this maps a snake_case plugin (discord_notify) to a kebab principal (discord-notify) WITHOUT renaming the plugin and breaking its config/pipeline refs (`#107`). Attestation still uses the plugin name."
+ "minLength": 1,
+ "pattern": "^[a-z0-9]+(?:-[a-z0-9]+)*$",
+ "description": "The vault principal this plugin composes its secrets under (default: the plugin name). This must be kebab-case, so a snake_case plugin (discord_notify) can map to a principal like discord-notify WITHOUT renaming the plugin and breaking its config/pipeline refs (`#107`). Attestation still uses the plugin name."
},
"requires_vault": {
"type": "boolean",
- "description": "When true, vault secret delivery is MANDATORY for this plugin: an unknown/unregistered vault principal (the plugin name) fails the spawn CLOSED instead of opting out (`#108`). Closes the fail-open seam where a misnamed/unregistered principal silently runs the plugin with no secrets. Default false preserves the coexistence opt-out for keyless plugins.",
+ "description": "When true, vault secret delivery is MANDATORY for this plugin: an unknown/unregistered resolved vault principal (vault_principal if set, otherwise the plugin name) fails the spawn CLOSED instead of opting out (`#108`). Closes the fail-open seam where a misnamed/unregistered principal silently runs the plugin with no secrets. Default false preserves the coexistence opt-out for keyless plugins.",
"default": false
},🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@schemas/config.schema.json` around lines 351 - 357, Add a kebab-case
constraint to the vault_principal schema and update the requires_vault
description to reflect that fail-closed uses vault_principal when set;
specifically, for the "vault_principal" property add a pattern that enforces
lowercase alphanumerics and hyphen-separated segments (kebab-case) and optional
minLength, and for the "requires_vault" property update the description to state
the fail-closed decision will use the configured vault_principal (if present)
rather than only the plugin name. Ensure you reference the existing
"vault_principal" and "requires_vault" property entries when making these
changes.
…on + fail-closed proven live Recipe step 1 generalized: name the vault secret via a config key (salt_secret/token_secret) so a shared base plugin stays reusable across instances. Status: discord_notify + ap_canary vault-native + live (B form: snake instance + vault_principal + requires_vault); fail-closed PROVEN live (bogus principal → refused spawn). Plugin code @ ductile-plugins a1934e5. git/uv plugins → #109.
…on card #110 captures the proven notify model (on-event + on-hook, real Discord posts under enforce), the bucket classification, and the 2 findings (notify_on_complete gating; each notify route needs its own vault_principal'd instance — corrects the trivial-carry assumption). Epic #83 records Phase 3: discord_notify + ap_canary vault-native + fail-closed live, full pipeline model proven, in-binary hardening (#100/#101/#104/#108) redeployed, plugin code @ a1934e5.
… not uv exit-127 fixed (system uv); real failure is exit-2: confined plugins get NO writable HOME and cannot reach their own state_dir (0700 /var/lib/ductile blocks traversal) — affects EVERY plugin (proven by github_repo_sync mkdir). Frame: privsep silently changed the plugin runtime contract. Fix = C (tmpfiles 0711 + gateway HOME/XDG=state_dir at spawn). Revised B = urllib + drop the spurious downstream mkdir (works today). A rejected (cache poisoning). + design norm.
Under enforce a dropped account had no writable HOME and could not even reach its own 0700 state_dir (the 0700 on /var/lib/ductile blocked traversal) — so any plugin writing state, or a runtime needing a cache, failed closed. Two-part fix: - tmpfiles: /var/lib/ductile + .../accounts -> 0711 (traverse-only, not listable; per-account dirs stay 0700). The /home pattern: an account reaches the dir it owns, secret files inside stay 0600-unreadable. The boot fs-reconcile gate tightens the db FILE to 0600 and never touches the dir mode, so 0711 survives boot. - spawn: for a confined account, withAccountRuntimeEnv drops inherited HOME/XDG_CACHE_HOME and re-points both at the account state_dir, and cmd.Dir is set there. Env carries it because Go drops uid before chdir; with the 0711 floor the dropped uid can now also chdir into its own dir. Unconfined plugins untouched. Tests in env_test.go (rebases home+cache; no gateway-home leak).
Codify the runtime contract privsep made explicit, and give authors a copy-me Tier-1 pattern so the exemplar rewrite is "clone and fill in". - ADR docs/adr/confined-plugin-runtime-contract.md: guaranteed (writable HOME/cache/cwd = state_dir, secrets over stdin) vs forbidden (writes outside state_dir, ambient home, deps fetched at spawn) + the 3-tier decision rule (stdlib default / sys_exec floor / fetch-at-spawn advanced). - plugins/_template/: a real, runnable, tested stdlib plugin embodying the contract (run.py, manifest.yaml, test_run.py 3/3, vendored _lib helpers, README with operator wiring + node/TS build-ahead appendix). - PLUGIN_DEVELOPMENT.md: fix the now-misleading defaults — the "canonical" example cargo-culted `uv run --script` with empty deps; §9/§10 claimed plugins inherit the dispatcher cwd and run as the gateway user (both false under enforce). Add §10.6 Runtime contract + extend the §11 checklist with contract items.
What
Delivers PrivSec ADR Layer 1b — uid separation (privilege separation): a privileged gateway drops each plugin to an unprivileged account (a dedicated OS user) at spawn, so filesystem permissions finally bite. This turns plugin scoping from an honour system into an OS-enforced wall.
The property bought: blast-radius containment of your own fallible code. A popped first-party plugin can no longer read the gateway's age key/config/state DB, nor attach to a sibling plugin's memory (same-uid
ptrace).How (the theory — see the ADR for the full why)
CAP_SETUID+CAP_SETGID(not full root). No privilege → hygiene-only. The binary is never setuid; privilege is init-conferred (systemd/launchd), so the same binary runs safely as an unprivileged utility.service.unconfined: true.defaulttier for trusted first-party plugins, an isolateduntrustedtier for arbitrary-command/third-party. Openaccounts:map — adding isolation is config-only.untrusted(or fails closed if no such tier) — a substituted binary cannot inherit the original's identity/tokens.0600/0700and each account gets a private0700dir — all-or-refuse at boot.unconfined: it loads a vault (so the secret-path swap-defence is active), but has noaccounts:map and noCAP_SETUIDunder launchd → the boot gate resolves tounconfined. Merging this does not confine the Mac yet — that's 🛡️ Sentinel: [MEDIUM] Fix unhandled json.Marshal errors #95 + the v1.0 host decision (🛡️ Sentinel: [CRITICAL] Fix SQL injection in sqliteColumnExists #102).Review & state
docs/DEPLOYMENT.md§5b how-to / §5c reference, Naur × Diátaxis; schema drift fixed (service.unconfined).-raceflakes (not regressions):TestSpawnPluginTimeoutKillsProcessGroup,TestDispatcher_Start_ParallelExecution— pass in isolation/serial.Follow-ups (none block this PR)
config check) · 🛡️ Sentinel: [CRITICAL] Fix SQL injection in sqliteColumnExists #102 (v1.0 ship line + macOS-host decision).explainverb family — privsep + vault) · 🛡️ Sentinel: [CRITICAL] Fix SQL injection in sqliteColumnExists #97 (T5/T9/T15/vocab lint) · 🛡️ Sentinel: [MEDIUM] Fix unhandled json.Marshal errors #95 (macOS launchd enforce).Vocabulary
account(privsep OS user) ≠ concurrencyworkers(service.max_workers) ≠ vaultprincipal— three distinct concepts, kept distinct (rename landed commit5590091).Summary by CodeRabbit