Skip to content

feat(identity): peer-PID binary digest + env per paper3.1#88

Merged
manzil-infinity180 merged 6 commits into
devfrom
fix/identity-discovery-63
May 9, 2026
Merged

feat(identity): peer-PID binary digest + env per paper3.1#88
manzil-infinity180 merged 6 commits into
devfrom
fix/identity-discovery-63

Conversation

@manzil-infinity180
Copy link
Copy Markdown
Contributor

@manzil-infinity180 manzil-infinity180 commented Apr 28, 2026

closes #63

What

Replaces aflock's PATH-based identity discovery with kernel-attested
peer introspection over Unix domain sockets. The MCP server now derives
the connecting agent's identity from the peer process itself, not from
whatever binary happens to sit on aflock's $PATH.

Why

Pre-#88, internal/identity/discover.go looked up "claude" on aflock's
own $PATH and used aflock's own /proc/self/cgroup + os.Environ().
Any peer could be identified as whatever binary aflock happened to
resolve — a clear spoofing gap. Paper §3.1 specifies the identity must
be derived from the actual peer.

How

Kernel peer credentials (internal/identity/peercred/)

  • Linux: SO_PEERCRED getsockopt → real UID/GID/PID
  • macOS: LOCAL_PEERCRED + LOCAL_PEERPID
  • Other platforms: stub returning unsupported

Peer introspection (internal/identity/peer{,_linux,_darwin}.go)

  • Linux reads peer's /proc/<pid>/exe, /cgroup, /environ directly
  • FD-pinning on /proc/<pid>/exe closes the exec-race TOCTOU
  • Starttime check guards against PID-reuse during introspection
  • macOS: binary digest + UID/GID only (no /proc analog) via hardened
    lsof lookup

UDS hardening (internal/mcp/server.go, umask_unix.go)

  • 0077 umask around socket bind
  • Explicit unlink of stale sockets
  • Stricter listener flags

Identity derivation (internal/identity/agent.go, discover.go)

  • Now consumes peer-introspected binary digest + env, not aflock's own

Signed-off-by: Rahul Vishwakarma <rahulvs2809@gmail.com>
ServeUnix now reads peer's /proc/<pid>/exe, /proc/<pid>/cgroup,
and /proc/<pid>/environ instead of aflock's own PATH lookup,
/proc/self/cgroup, and os.Environ(). UID/GID come from SO_PEERCRED.

Closes the spoofing gap where any peer's identity hash pinned to
whatever "claude" happened to be on aflock's PATH.

macOS: binary digest + UID/GID only (no /proc analog).

Refs: paper §3.1
@github-actions
Copy link
Copy Markdown

Docs Preview

Status Deployed
Branch fix/identity-discovery-63
Preview URL https://fix-identity-discovery-63.aflock-d0m.pages.dev

@manzil-infinity180 manzil-infinity180 changed the title feat(identity): peer-PID binary digest + env per paper §3.1 feat(identity): peer-PID binary digest + env per paper3.1 Apr 29, 2026
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a Unix-domain-socket (UDS) MCP transport that derives agent identity from kernel-attested peer credentials (PID/UID/GID) and peer-process introspection, aligning implementation with the paper’s §3.1 identity guarantees.

Changes:

  • Add aflock serve --unix <socket> and a new ServeUnix server path that uses UDS peer credentials to drive identity discovery.
  • Introduce internal/identity/peercred (SO_PEERCRED / LOCAL_PEERPID+LOCAL_PEERCRED) and peer PID introspection helpers (/proc/<pid>/... on Linux; ps fallbacks on macOS).
  • Refactor identity discovery to support peer-derived environment/binary/container data and update docs to reflect the new modes.

Reviewed changes

Copilot reviewed 16 out of 16 changed files in this pull request and generated 3 comments.

Show a summary per file
File Description
internal/mcp/server.go Adds ServeUnix and refactors identity init to support peer-cred discovery.
cmd/aflock/main.go Adds --unix flag and routes serve execution to UDS/HTTP/stdio modes.
internal/identity/peercred/peercred.go New cross-platform API for extracting kernel-attested peer PID/UID/GID from a UDS conn.
internal/identity/peercred/peercred_linux.go Linux SO_PEERCRED implementation.
internal/identity/peercred/peercred_darwin.go macOS LOCAL_PEERPID + LOCAL_PEERCRED implementation.
internal/identity/peercred/peercred_other.go Stub for unsupported platforms.
internal/identity/peercred/peercred_test.go Tests for peercred extraction and non-UDS rejection.
internal/identity/peer.go Adds PeerInfo + peer introspection + peer-derived Binary/Environment identity helpers.
internal/identity/peer_linux.go Linux peer introspection via /proc/<pid>/* (exe, cgroup, environ, status).
internal/identity/peer_darwin.go macOS peer introspection via ps (comm/uid) with documented gaps.
internal/identity/peer_other.go Stubs for unsupported platforms.
internal/identity/peer_test.go Tests for peer introspection + peer-derived identity properties (linux/darwin).
internal/identity/discover.go Generalizes discovery to start from a provided PID and uses peer env in peer mode.
internal/identity/agent.go Adds peer-based agent identity entrypoints and wires peer-derived binary/env identity.
docs/reference/specification.md Updates spec notes for macOS PID extraction and current HTTP limitations.
docs/concepts/identity.md Updates identity docs to reflect heuristic vs kernel-attested UDS modes.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread internal/mcp/server.go Outdated
Comment thread internal/identity/peer_linux.go Outdated
Comment thread cmd/aflock/main.go
Signed-off-by: Rahul Vishwakarma <rahulvs2809@gmail.com>
@manzil-infinity180
Copy link
Copy Markdown
Contributor Author

Known gaps / follow-ups

  • test/ integration tests are fragile in hardened environments: test/mcp_client_test.go:54
    builds aflock to /tmp/aflock-test and execs it, which fails when /tmp is mounted noexec
    (permission denied). Follow-up: build into t.TempDir() (or another known-executable temp dir)
    instead of hardcoding /tmp.

  • macOS peer-binary attestation remains “hardened heuristic”: no /proc//exe FD pinning /
    pidfd, so there’s an unavoidable exec race window vs Linux (even with post-hash Kill(pid, 0)).

  • Heuristic (stdio/HTTP) discovery still shells out to ps/lsof and may be blocked by sandboxing;
    consider syscall-based alternatives or explicit graceful-degradation behavior.

  • ProcessMetadata.ParentPID effectively means “start PID for discovery” in peer-cred mode;
    consider adding PeerPID/StartPID to avoid confusion while keeping backward compatibility.

pls take a look at this closely or create a new issue to fix this all in follow-up pr

@manzil-infinity180 manzil-infinity180 marked this pull request as ready for review April 30, 2026 14:48
Signed-off-by: Rahul Vishwakarma <rahulvs2809@gmail.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 20 out of 20 changed files in this pull request and generated 6 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +47 to +51
// probe-only signal; ESRCH means the process exited or was recycled.
// Same caveat as SPIRE's darwin attestor: this catches recycle but
// not in-flight exec().
if err := syscall.Kill(pid, 0); err != nil {
return "", "", fmt.Errorf("peer process exited during introspection: %w", err)
Comment on lines +86 to +100
scanner := bufio.NewScanner(strings.NewReader(string(out)))
atTxt := false
for scanner.Scan() {
line := scanner.Text()
switch {
case line == "ftxt":
atTxt = true
case strings.HasPrefix(line, "f"):
atTxt = false
case atTxt && strings.HasPrefix(line, "n"):
return line[1:], nil
}
}
return "", fmt.Errorf("no ftxt entry in lsof output for pid %d", pid)
}
Comment on lines +24 to +32
// FromConn extracts peer credentials from an accepted Unix-domain-socket
// connection. The connection must be a *net.UnixConn (or wrap one); other
// connection types are rejected.
func FromConn(conn net.Conn) (PeerCred, error) {
uc, ok := conn.(*net.UnixConn)
if !ok {
return PeerCred{}, errors.New("peercred: connection is not a *net.UnixConn")
}
return fromUnixConn(uc)
AflockPID: os.Getpid(),
ParentPID: os.Getppid(),
UserID: os.Getuid(),
ParentPID: startPID,
Comment on lines +5 to +7
// withRestrictiveUmask is a no-op on non-unix platforms. ServeUnix
// rejects connections on those platforms before reaching the umask
// guard anyway (peercred.ErrUnsupportedPlatform).
Comment thread internal/mcp/server.go
Comment on lines +313 to +317
func (s *Server) ServeUnix(policyPath, socketPath string) error {
if socketPath == "" {
return fmt.Errorf("socket path is required")
}

@manzil-infinity180
Copy link
Copy Markdown
Contributor Author

hmm , testing it with oscal compliance example manually

Signed-off-by: Rahul Vishwakarma <rahulvs2809@gmail.com>
@manzil-infinity180
Copy link
Copy Markdown
Contributor Author

┌─────────────────────────────────────────────────────────────────────┐
│                         User's machine                              │
│                                                                     │
│   ┌──────────────────┐         ┌──────────────────┐                 │
│   │   Claude Code    │ stdio   │  socat bridge    │                 │
│   │   (CLI agent)    │◄───────►│ stdin/stdout ↔   │                 │
│   │                  │ MCP     │ unix socket      │                 │
│   └──────────────────┘         └────────┬─────────┘                 │
│                                         │ UDS                       │
│                                         ▼ MCP                       │
│   ┌─────────────────────────────────────────────────────┐           │
│   │                aflock MCP server                    │           │
│   │                                                     │           │
│   │   • Reads peer identity via SO_PEERCRED (kernel)    │           │
│   │   • Loads policy (.aflock)                          │           │
│   │   • Exposes 9 tools (read_file, bash, get_token...) │           │
│   │   • Signs every action with SPIRE-issued SVID       │           │
│   │   • Writes state + attestations to disk             │           │
│   └────┬─────────────────────────────────────┬──────────┘           │
│        │                                     │                      │
│        │ workload-API                        │ writes               │
│        ▼ socket                              ▼                      │
│   ┌──────────────────┐                  ┌─────────────────────┐     │
│   │   SPIRE agent    │  attests aflock  │  ~/.aflock/sessions │     │
│   │ (workload attst) │  binary digest   │   <session-id>/     │     │
│   └────────┬─────────┘                  │     state.json      │     │
│            │ join token                 │     attestations/   │     │
│            ▼                            │     *.intoto.json   │     │
│   ┌──────────────────┐                  └─────────────────────┘     │
│   │   SPIRE server   │ trust root              ▲                    │
│   │   (CA, registry) │                         │                    │
│   └──────────────────┘                         │                    │
│                                                │                    │
│                              ┌─────────────────┴─────────────────┐  │
│                              │   aflock verify --session <id>   │   │
│                              │   (6-phase verify pipeline)      │   │
│                              └──────────────────────────────────┘   │
└─────────────────────────────────────────────────────────────────────┘

colek42
colek42 previously approved these changes May 7, 2026
Copy link
Copy Markdown
Contributor

@colek42 colek42 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Architecture and security thinking are sound — this is the right direction for #63 and a real improvement over the PATH-spoofable status quo. Inline comments on concrete issues; the parent-dir TOCTOU (server.go:344) is the one that materially weakens the property the PR is selling, so I'd hold merge on that one.

Strengths

  • FD pinning + starttime check correctly closes the exec-race TOCTOU on Linux. Comments explain why, not just what.
  • Fail-closed on attestation error, with a test (TestDiscoverAgentIdentityFromPeer_DeadProcessFailsClosed) that pins the invariant.
  • extractContainerHexID regex handles cgroup v2 systemd-scope, cri-containerd, cgroup v1, and the mountinfo fallback for namespaced cgroups.
  • Honest darwin caveats — "hardened heuristic," no env, no container ID, no FD pin. Better than pretending parity.
  • peerBinaryIdentity preserves digest when path is empty (Readlink failure path). The digest is the load-bearing component; dropping it because of an unrelated path-resolution failure would be a silent identity downgrade.

Minor (not inlined)

  • peer_unix.go is build-constrained linux || darwin, but Go's unix build tag covers FreeBSD/OpenBSD/etc. Naming is misleading vs. umask_unix.go which actually uses //go:build unix.
  • Container ID truncated to 12 chars preserves prior behavior, but throws away entropy. Worth a TODO to keep the full 64-char digest in canonical identity (12-char short form has known collisions in large fleets).
  • stdioServer.Listen(context.Background(), ...) has no cancellation path — SIGINT won't unblock a stuck session.
  • Darwin lsof per-connect is an extra fork + parse on the hot path. Negligible at single-session, but swap for proc_pidpath via cgo (or golang.org/x/sys/unix.Sysctl) if multi-session ever lands.

Test gaps worth filling

  • extractContainerHexID is unit-tested with synthetic data — no test reads a real /proc/<pid>/cgroup.
  • ServeUnix's pre-bind Lstat check has no test (drop a file at the path → expect refusal).
  • No test that the socket actually lands at 0600 (stat after bind in a goroutine).
  • No darwin test that lsofExecPath correctly parses real lsof -Fn output.

Comment thread internal/mcp/server.go Outdated
return fmt.Errorf("stat socket path: %w", err)
}

if err := os.MkdirAll(filepath.Dir(socketPath), 0700); err != nil {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Parent-directory TOCTOU — partially defeats the hijack-race claim.

MkdirAll is a no-op on existing directories — it does not tighten permissions. If the user passes /tmp/aflock.sock, the parent is /tmp (1777) and the 0700 here does nothing. A local attacker can win the bind race by creating the path between the Lstat at line 338 and net.Listen at line 353. The 0077 umask only governs what your bind sets — it doesn't help if someone else creates the file first.

The ServeUnix doc-comment claim "Refuses to start if the socket path already exists, to avoid a hijack race against an attacker who pre-creates it" is only partially true — the LstatListen window is the race.

Fix options:

  • Require the parent dir to be 0700 and owned by the current UID before binding (os.Stat parent → check Mode().Perm() == 0700 and Sys().(*syscall.Stat_t).Uid == os.Getuid()); reject otherwise.
  • Or bind into a freshly-created private dir (os.MkdirTemp + chmod 0700) whose perms you fully control, and document the constraint on --unix /path accordingly.

Without one of these, the kernel-attested-identity property still has a local-attacker bypass.

Comment thread internal/mcp/server.go Outdated
// Single-connection by design: the connection itself is the MCP session, and
// the kernel-attested PID is bound to that session's identity. A second client
// connecting to the same socket path after the first disconnects starts a
// fresh server invocation.
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doc text contradicts the code.

This comment says "A second client connecting to the same socket path after the first disconnects starts a fresh server invocation," but nothing in ServeUnix re-listens — the function accepts once, closes the listener, serves the session, and returns. "Starts a fresh server invocation" is true only if the operator re-runs aflock serve --unix. Single-session-per-process matches stdio's lifecycle and is probably correct — just fix the doc to say so explicitly, or implement the re-listen loop.

Comment thread internal/mcp/server.go

pc, err := peercred.FromConn(conn)
if err != nil {
return fmt.Errorf("extract peer credentials: %w", err)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No explicit same-UID assertion before serving.

The 0600 socket and same-UID requirements of /proc/<pid>/environ make cross-UID connections fail in practice, but there's no if pc.UID != uint32(os.Getuid()) { return error }. Adding it makes the trust boundary unambiguous and prevents future regressions if the UDS perms are ever loosened. Cheap defense in depth — costs one line.

Comment thread internal/identity/peer_darwin.go Outdated
//
// The 'n' line that immediately follows 'ftxt' is the executable path.
func lsofExecPath(pid int) (string, error) {
lsofPath, err := exec.LookPath("lsof")
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

exec.LookPath undermines this PR's whole premise on darwin.

The stated goal is "don't trust aflock's $PATH for identity binaries." Then for darwin attestation we shell out to lsof resolved via exec.LookPath — a hostile PATH shadowing /usr/sbin/lsof defeats the very property being added.

Flip the order: prefer /usr/sbin/lsof unconditionally, and only fall back to LookPath if the absolute path is missing. Same reasoning the surrounding peer code uses for not trusting $PATH.

//
// If the readlink fails (rare — file unlinked, etc.) we accept an
// empty path; peerBinaryIdentity preserves the digest in that case.
path, _ := os.Readlink(fmt.Sprintf("/proc/self/fd/%d", f.Fd()))
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(deleted) suffix can land verbatim in BinaryPath.

If the peer's binary was unlinked between open and readlink (which doesn't change the FD-pinned digest, by design), the kernel returns /foo/bar (deleted) as the readlink target. Today this string flows directly into BinaryIdentity.Path and the canonical env. Strip the trailing (deleted) (or set Path to "" in that case — the digest is what matters, and peerBinaryIdentity already preserves it without a path).

Comment thread cmd/aflock/main.go Outdated
serveCmd.Flags().StringVarP(&servePolicyPath, "policy", "p", "", "Path to .aflock policy file")
serveCmd.Flags().IntVar(&serveHTTPPort, "http", 0, "HTTP port for SSE transport (default: stdio)")
serveCmd.Flags().StringVar(&serveUnixSocket, "unix", "", "Unix-domain socket path for kernel-attested peer-cred identity (mutually exclusive with --http)")
serveCmd.MarkFlagsMutuallyExclusive("http", "unix") //nolint:errcheck // flag validation is best-effort; serve still guards at runtime
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Duplicate flag-conflict logic.

Cobra enforces MarkFlagsMutuallyExclusive at parse time — by the time Run executes, the conflict has already been rejected. The manual check + os.Exit(2) at line 659 is dead code under normal cobra usage, and the //nolint:errcheck comment hand-waves it as "best-effort." Pick one — the MFM call is the cleaner of the two; drop the runtime check (or the MFM, but the runtime check is the harder-to-reason-about one).


// TestErrUnsupportedPlatform_IsExported verifies the sentinel is reachable.
func TestErrUnsupportedPlatform_IsExported(t *testing.T) {
if !errors.Is(ErrUnsupportedPlatform, ErrUnsupportedPlatform) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test is trivially true.

errors.Is(x, x) is true for any non-nil x by definition. This isn't testing anything about ErrUnsupportedPlatform — not its export, not its identity, not its message. Just delete it. If the goal was to assert the sentinel is exported, the test only compiles because it's exported, so the compilation itself is the test.

Signed-off-by: Rahul Vishwakarma <rahulvs2809@gmail.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

arch: kernel-level identity via SO_PEERCRED instead of PID-walking

3 participants