Land capability-based security #6

Merged: jserv merged 1 commit into main from capability on May 12, 2026

jserv (Contributor) commented May 12, 2026

Object-bearing syscalls (FD, timer, sync primitives, message queue) now gate on a per-process cap_space instead of relying solely on the coarse syscall allow-list. The legacy struct proc::fd_table is retired; small-integer file descriptors are slot indices into the caller's cap_space, and capability-management syscalls (SYS_CAP_{GET_TOKEN,DROP,TRANSFER,REVOKE_DELEGATE}) take 64-bit cap_handle tokens carrying a generation snapshot for stale-handle detection.
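For reviewers who want a concrete picture of the token scheme, here is a minimal userspace sketch of a generation-snapshot handle. Everything in it is an assumption made for illustration: the bit layout (generation:32, type:8, rights:8, slot:16), the `demo_*` names, and the slot count are hypothetical and are not taken from the patch. It only shows the idea that a token is accepted when its snapshot matches the live slot, and that bumping the generation on drop makes old tokens fail with EBADF.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/* Hypothetical token layout: [generation:32 | type:8 | rights:8 | slot:16]. */
typedef uint64_t demo_handle;

enum { DEMO_TYPE_NONE = 0, DEMO_TYPE_FD = 1 };
enum { DEMO_RIGHT_READ = 1, DEMO_RIGHT_WRITE = 2, DEMO_RIGHT_GRANT = 4 };

#define DEMO_SLOTS 16

struct demo_slot {
    uint32_t generation; /* bumped every time the slot is dropped */
    uint8_t type;        /* DEMO_TYPE_NONE marks an empty slot */
    uint8_t rights;
};

struct demo_space {
    struct demo_slot slots[DEMO_SLOTS];
};

/* Snapshot the live slot into a token. Caller guarantees slot < DEMO_SLOTS. */
static demo_handle demo_make_token(const struct demo_space *cs, uint16_t slot)
{
    const struct demo_slot *s = &cs->slots[slot];
    return ((uint64_t) s->generation << 32) | ((uint64_t) s->type << 24) |
           ((uint64_t) s->rights << 16) | slot;
}

/* Accept the token only if every snapshot field matches the live slot. */
static int demo_validate_token(const struct demo_space *cs, demo_handle h)
{
    uint16_t slot = (uint16_t) (h & 0xffff);
    if (slot >= DEMO_SLOTS)
        return -EBADF;

    const struct demo_slot *s = &cs->slots[slot];
    if (s->type == DEMO_TYPE_NONE)
        return -EBADF;
    if (s->generation != (uint32_t) (h >> 32) ||
        s->type != (uint8_t) (h >> 24) ||
        s->rights != (uint8_t) (h >> 16))
        return -EBADF;
    return 0;
}

/* Drop: bump the generation first, then clear the slot. */
static void demo_drop_slot(struct demo_space *cs, uint16_t slot)
{
    struct demo_slot *s = &cs->slots[slot];
    s->generation++;
    s->type = DEMO_TYPE_NONE;
    s->rights = 0;
}
```

In this sketch a token taken before `demo_drop_slot` keeps failing even after the slot is reused, because the reused slot carries a newer generation. A token whose rights bits were edited by the holder fails for the same reason: it no longer matches the live slot.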

Properties enforced:

  • Unforgeability via the generation/type/rights snapshot in cap_handle. cap_validate_token_locked rejects any token whose snapshot diverges from the live slot.
  • Authority confinement: a cap minted in process A is never visible in process B unless an explicit cap_transfer or spawn-time inherit mints a fresh slot in B.
  • Lazy revocation: cap_drop bumps the slot generation before clearing the slot. Old tokens for that slot observe EBADF on the next validate. No poisoning, no exit-time scan.
  • Single-hop delegation: cap_transfer requires GRANT on the source, strips GRANT from the destination, and records the originating grant_epoch so the supervisor retains a revocable handle. cap_inherit_fd applies the same attenuation during spawn inheritance.
  • Dup-escape-safe mass revocation: cap_revoke_delegate scans the destination cap_space for slots matching (type, object_index, delegate_epoch) and invalidates every match, neutralizing any dup() performed by the delegate before revocation.
  • Active-use pin: cap_lookup_fd / cap_lookup_timer / cap_lookup_object bump a per-object refcount inside the caller's fd_lock and return a cap_ref. Callers pair with cap_put_ref so a concurrent destroy cannot recycle the underlying pool entry while another thread is still operating on it.
  • Deadlock-free SMP: two-process cap operations acquire fd_locks in ascending pid order via the cap_lock_pair helpers, so concurrent A->B and B->A flows cannot deadlock. cap_revoke_delegate runs a four-phase protocol (snapshot under src lock; release; reacquire in pid order; re-validate; scan) so the supervisor's revoke is robust against a concurrent cap_drop on the delegate slot.
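The pid-ordered locking in the last bullet can be sketched in userspace as follows. This is an illustration under stated assumptions, not the patch's cap_lock_pair helpers: `struct demo_proc`, the `demo_*` functions, and the use of a C11 `atomic_flag` spinlock in place of the kernel's fd_lock are all hypothetical. The point is only that both callers of a two-process operation take the lower-pid lock first, whichever process is the source.

```c
#include <assert.h>
#include <stdatomic.h>

struct demo_proc {
    int pid;
    atomic_flag fd_lock; /* stand-in for the per-process fd_lock */
};

static void demo_lock(struct demo_proc *p)
{
    /* Spin until the flag was previously clear. */
    while (atomic_flag_test_and_set_explicit(&p->fd_lock, memory_order_acquire))
        ;
}

static void demo_unlock(struct demo_proc *p)
{
    atomic_flag_clear_explicit(&p->fd_lock, memory_order_release);
}

/*
 * Lock both processes in ascending pid order, regardless of argument order.
 * Returns the process that was locked first (for illustration only).
 */
static struct demo_proc *demo_lock_pair(struct demo_proc *a, struct demo_proc *b)
{
    if (a == b) { /* same process: take the lock once */
        demo_lock(a);
        return a;
    }
    struct demo_proc *first = (a->pid < b->pid) ? a : b;
    struct demo_proc *second = (first == a) ? b : a;
    demo_lock(first);
    demo_lock(second);
    return first;
}

/* Release in the reverse of the acquisition order. */
static void demo_unlock_pair(struct demo_proc *a, struct demo_proc *b)
{
    if (a == b) {
        demo_unlock(a);
        return;
    }
    struct demo_proc *first = (a->pid < b->pid) ? a : b;
    struct demo_proc *second = (first == a) ? b : a;
    demo_unlock(second);
    demo_unlock(first);
}
```

Because an A->B transfer and a B->A transfer both start with the lower-pid lock, neither can hold one lock while waiting on the other in the opposite order, which is the circular wait that ordered acquisition rules out.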

Spawn integrates in two ordered phases. sys_spawn does blanket GRANT-gated inheritance of every valid parent FD via cap_inherit_fd (which strips GRANT from the destination); a non-GRANT parent FD aborts spawn with EACCES. spawn_apply_file_actions then applies SPAWN_FA_* in caller order against the child's already-inherited table. SPAWN_FA_OPEN mints with GRANT because the supervisor is explicitly configuring the child's FD table; plain sys_open mints non-delegable caps (READ + optional WRITE, no GRANT).
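The first spawn phase can be sketched like this. The structures and `demo_*` names are hypothetical, and two details are assumptions rather than facts from the patch: that the child keeps the parent's slot index, and that the child table is wiped when inheritance aborts. The sketch shows the behaviour described above: every valid parent FD must carry GRANT, the child's copy loses GRANT, and a single non-GRANT parent FD fails the whole inheritance with EACCES.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <string.h>

enum { DEMO_READ = 1, DEMO_WRITE = 2, DEMO_GRANT = 4 };

#define DEMO_SLOTS 8

struct demo_cap {
    uint8_t valid;
    uint8_t rights;
    uint16_t object_index;
};

struct demo_space {
    struct demo_cap slots[DEMO_SLOTS];
};

/* Copy one parent slot into the child, stripping GRANT from the copy. */
static int demo_inherit_fd(const struct demo_space *parent,
                           struct demo_space *child, int fd)
{
    const struct demo_cap *src = &parent->slots[fd];
    if (!(src->rights & DEMO_GRANT))
        return -EACCES; /* source is not delegable */

    child->slots[fd] = *src;
    child->slots[fd].rights = (uint8_t) (src->rights & ~DEMO_GRANT);
    return 0;
}

/* Blanket inheritance: all valid parent FDs or nothing. */
static int demo_spawn_inherit(const struct demo_space *parent,
                              struct demo_space *child)
{
    memset(child, 0, sizeof(*child));
    for (int fd = 0; fd < DEMO_SLOTS; fd++) {
        if (!parent->slots[fd].valid)
            continue;
        int rc = demo_inherit_fd(parent, child, fd);
        if (rc < 0) {
            /* Assumption: a failed spawn leaves no caps in the child. */
            memset(child, 0, sizeof(*child));
            return rc;
        }
    }
    return 0;
}
```

File actions would then run against the child table that this step produced, which is why the ordering of the two phases matters: an action can close or replace an inherited slot, but it cannot rescue a spawn that already failed on a non-GRANT parent FD.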


Summary by cubic

Introduces per‑process capability tables that gate all object‑bearing syscalls and replace the legacy fd_table. FDs are now cap‑space slot indices backed by unforgeable 64‑bit tokens, with safe delegation, revocation, active‑use pins, and thread handles for per‑thread scheduling.

  • New Features

    • Per‑process cap_space replaces fd_table; FDs are slot indices.
    • Unforgeable tokens for SYS_CAP_{GET_TOKEN,DROP,TRANSFER,REVOKE_DELEGATE} with rights and GRANT‑gated single‑hop delegation; bulk revoke of delegated duplicates.
    • Lifetime safety: active‑use pins plus refcounted teardown for timers, sync primitives, and message queues; stale tokens hit EBADF.
    • Thread capabilities: SYS_THREAD_{SET,GET}SCHEDPARAM accept CAP_TYPE_THREAD handles (0 = self); user tasks get reserved cap slots.
    • Deadlock‑free SMP for cross‑process cap ops via ordered locks and a revoke protocol.
    • VFS marks non‑seekable nodes with VFS_FLAG_NOSEEK (dirs and synthetic devices); lseek on these returns ESPIPE.
  • Migration

    • Spawn: only FDs with GRANT inherit; others cause EACCES. SPAWN_FA_OPEN mints with GRANT; sys_open mints non‑delegable caps.
    • Use cap_get_token/cap_drop/cap_transfer; cap_revoke_delegate revokes all delegated duplicates.
    • Expect ESPIPE on lseek against non‑seekable descriptors.
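As a rough illustration of the lseek change, the check could look like the sketch below. The flag value, the `demo_vnode` and `demo_file` structures, and `demo_lseek` itself are hypothetical stand-ins; only the flag name VFS_FLAG_NOSEEK and the ESPIPE result come from the summary above.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

#define DEMO_VFS_FLAG_NOSEEK 0x1u /* hypothetical value */

enum { DEMO_SEEK_SET, DEMO_SEEK_CUR, DEMO_SEEK_END };

struct demo_vnode {
    uint32_t flags;
    int64_t size; /* assumed non-negative */
};

struct demo_file {
    const struct demo_vnode *node;
    int64_t offset; /* assumed non-negative */
};

/* Returns the new offset, or a negative errno value on failure. */
static int64_t demo_lseek(struct demo_file *f, int64_t off, int whence)
{
    /* Directories and synthetic devices are marked non-seekable. */
    if (f->node->flags & DEMO_VFS_FLAG_NOSEEK)
        return -ESPIPE;

    int64_t base;
    switch (whence) {
    case DEMO_SEEK_SET: base = 0; break;
    case DEMO_SEEK_CUR: base = f->offset; break;
    case DEMO_SEEK_END: base = f->node->size; break;
    default: return -EINVAL;
    }

    if (off > 0 && base > INT64_MAX - off)
        return -EOVERFLOW; /* base + off would overflow */
    if (base + off < 0)
        return -EINVAL; /* resulting offset would be negative */

    f->offset = base + off;
    return f->offset;
}
```

Callers that previously treated every descriptor as seekable would need to handle the ESPIPE return on such nodes rather than assume the offset moved.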

Written for commit 06308aa. Summary will update on new commits.


jserv merged commit 49fdb61 into main on May 12, 2026
6 checks passed
jserv deleted the capability branch on May 12, 2026 14:26