feat: Workspaces — multi-user teams (backend + dashboard) #282

Draft
jiashuoz wants to merge 9 commits into main from feat/workspaces

@jiashuoz

Member

Summary

Introduces a Workspace (team) concept to e2a. A workspace is the tenant that owns agents, domains, keys, limits, and usage; users are members with an admin / member role. Every user gets one default workspace at signup and can invite teammates into it; multi-membership is supported (a user may belong to several workspaces and switch the active one), while creating additional workspaces is deferred to a future feature.

Full design: docs/design/2026-06-23-workspaces.md.

How this was built

Design-first, then two adversarial multi-agent review passes (52 then 86 agents) that found and fixed 7 blockers before implementation — multi-tenant cascade teardown, NULL-row migration wedges, deploy-window ordering, a last-admin write-skew race, incomplete table audit / token revocation, and unsequenced PK flips. Implemented in ordered, verified slices.

Backend (5 commits)

  • Migration A: workspaces / workspace_members / workspace_invitations / audit_log tables, ws_system sentinel, workspace_id retrofit + backfill across owned tables, PK/UNIQUE flips, account_usage storage-trigger re-key, last_active_workspace_id column.
  • Identity layer: ensurePersonalWorkspace shared helper (no user-creation path can mint a workspace-less user), workspace/member/invitation store methods, Principal + role, re-keyed tenant queries.
  • AuthZ: active-workspace resolution (header → re-verified last-active → default), requireWorkspaceRole, last-admin shared-row lock, OAuth/MCP tokens re-verify live membership, account-scope OAuth fails closed.
  • Endpoints: /v1/workspaces (list/get/rename), members (list/role/remove), invitations (create/list/revoke/accept), whoami additions; X-E2A-Workspace header modeled; audit-log writes in-tx; invite rate-limit + system-mail send.
  • Keys & teardown: api_keys re-key + created_by; creator/admin revoke; workspace-aware account deletion (revokes tokens, detaches from multi-member workspaces, runs the SES deprovision hook).

Dashboard (4 commits)

  • Context + header plumbing: WorkspaceProvider, X-E2A-Workspace injection in the central fetch wrapper, types + SWR keys/invalidators.
  • Switcher + nav: workspace switcher above the user card; a "Workspace" nav item directly above Settings.
  • /workspace page: rename (admin), members table with role pills + role dropdown + remove/leave, invite form + pending-invitations table; role-gated.
  • /invite/accept: token accept flow handling joined (200) / email-mismatch (403) / gone (410); plus the settings danger-zone sole-admin guidance.

Credential model (unchanged security stance, extended)

  • Admin is session-only. No API key or OAuth/MCP token carries admin authority.
  • API keys are workspace service credentials (survive offboarding); user-consented OAuth/MCP tokens track live membership (revoked on removal).
  • e2a_agt_ retained as the static, single-inbox, least-privilege credential.

Verification

  • Go: go build ./..., make test-unit, DB-backed tests (incl. last-admin race, invitation edge cases, multi-member teardown, two-members-one-key) — green. Spec-drift gate green.
  • Web: npm run build + npm run lint (0 errors), 233 web tests pass.
  • ⚠️ Not yet manually smoke-tested in a browser against a running backend — recommend a click-through before merge.

Deferred (by design — follow-ups, not bugs)

  • Migration B: cascade-FK drop + NOT NULL finalize (kept separate for rollback safety; ships after the code deploy is stable). Currently a no-op scaffold.
  • usage_events bulk backfill out-of-band script.
  • Per-workspace limits/usage reads in whoami (still user-keyed; fine for single-member v1, billing-seam re-key is the proprietary ops repo's job).
  • web/public/openapi.yaml (frozen docs copy for the API-reference page) is stale vs. the regenerated backend spec — reconcile separately.

🤖 Generated with Claude Code

jiashuoz and others added 9 commits June 23, 2026 13:43
Implements the additive, safe-on-prod phase of the Workspaces design
(docs/design/2026-06-23-workspaces.md §4.1/§4.7/§4.8 "Migration A").

migrations/048_workspaces_migration_a.sql:
- CREATE TABLE workspaces, workspace_members (role CHECK admin|member,
  PK (workspace_id,user_id)), workspace_invitations (partial UNIQUE on
  (workspace_id,email) WHERE status='pending'), audit_log; add advisory
  last_active_workspace_id column on user_sessions.
- Seed the protected ws_system sentinel (owns the shared agents.e2a.dev
  domain + any user-NULLed usage_events rows).
- Backfill one personal workspace per existing user (deterministic id
  ws_+md5(user_id)) + admin membership, fully idempotent via ON CONFLICT.
- ADD COLUMN workspace_id (nullable) on every workspace-owned table and
  backfill from user_id→personal workspace else ws_system (no NULLs left);
  add api_keys.created_by. Identity-owned tables (user_sessions, oauth_*)
  are NOT re-keyed and keep their user cascade.
- Constraint flips on the small tables: account_limits / account_usage /
  usage_summaries / idempotency_keys PK → workspace_id; suppressions UNIQUE
  and uniq_domains_primary_per_user → workspace_id. DROP NOT NULL on the
  retained user_id columns so workspace-keyed writes succeed.
- CREATE OR REPLACE the e2a_messages_storage_delta trigger to resolve+upsert
  account_usage by workspace_id (ON CONFLICT (workspace_id)), keeping its
  NULL-guard so window-created agents' message writes never abort.
- usage_events: additive ADD COLUMN + bounded sweep; the bulk historical
  backfill runs out-of-band (noted), not blocking the migration.

migrations/049_workspaces_migration_b.sql: deferred scaffold (no-op today)
for the cascade-FK drop + NOT NULL finalize, with templated steps so it is
promoted only after the code deploy is stable (deploy-1 rollback safety, B1).

Tests (DB-backed, internal/identity/workspaces_migration_test.go): idempotent
re-apply is stable; backfill leaves no NULL workspace_id; ws_system owns the
shared domain; the re-keyed storage trigger accrues by workspace_id and the
NULL-guard holds. Verified the full embedded set applies cleanly twice via the
real RunMigrations runner on a fresh DB. testutil truncate now resets the
workspace tables + re-seeds ws_system.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…nt re-point

Slice 2 of the Workspaces design (docs/design/2026-06-23-workspaces.md
§4.1/§4.2/§4.5), built on Slice 1's Migration A schema.

Shared provisioning helper (B3, §4.5):
- ensurePersonalWorkspace(tx, userID, name, email) — single choke point that
  inserts the deterministic personal workspace (ws_+md5(userID), matching the
  migration backfill) + admin membership, fully idempotent (ON CONFLICT DO
  NOTHING). Exported DefaultWorkspaceID(userID) is the one source of truth for
  that id, shared by store + the limits/usage/idempotency re-points.
- CreateOrGetUser + BootstrapUser now thread a single tx through user-insert →
  ensurePersonalWorkspace → signing-secret, so no creation path can mint a
  workspace-less user. CreateOrGetUserWithStatus returns the xmax=0
  new-vs-returning discriminant (CreateOrGetUser keeps its 2-value signature
  for the ~200 existing call sites).
- ensureUserHasSigningSecretTx is the tx-accepting signing-secret variant;
  EnsureUserHasSigningSecret + CreateSigningSecret now stamp workspace_id (no
  NULL rows). EnsureSharedDomain stamps ws_system. CreateAgent(Tx) /
  CreateScopedAPIKey stamp workspace_id (+ created_by on keys) — every
  owned-row INSERT path is now workspace_id-aware (B3 deploy ordering).
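
For reviewers, the deterministic id described above can be sketched as follows. This is a minimal illustration, assuming the id is "ws_" followed by the lowercase hex MD5 of the user id (the form Postgres's md5() returns, so the Go helper and the SQL backfill would agree). The exact encoding and the sample user id are assumptions, not taken from the diff.

```go
package main

import (
	"crypto/md5"
	"encoding/hex"
	"fmt"
)

// DefaultWorkspaceID derives the personal-workspace id for a user.
// Assumption: "ws_" + lowercase hex MD5 of the user id, matching what
// Postgres md5(user_id) yields for the same text input.
func DefaultWorkspaceID(userID string) string {
	sum := md5.Sum([]byte(userID))
	return "ws_" + hex.EncodeToString(sum[:])
}

func main() {
	// "usr_example" is a hypothetical user id used only for illustration.
	fmt.Println(DefaultWorkspaceID("usr_example"))
}
```

Because the id is a pure function of the user id, re-running the backfill or the helper cannot create a second personal workspace for the same user; ON CONFLICT DO NOTHING absorbs the repeat.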

Workspace store methods (internal/identity/workspaces.go):
- workspaces: Get / ListWorkspacesForUser (+role) / RenameWorkspace.
- members: ResolveMembership, ListMembers, AddMember, SetMemberRole,
  RemoveMember, CountAdmins. Last-admin guard uses the correct shared-row
  lock (SELECT … FROM workspaces FOR UPDATE then plain count(*)) — §5/B1, not
  the rejected FOR-UPDATE-on-aggregate approach.
- invitations: CreateInvitation (e2a_inv_ CSPRNG token, hash-only persist,
  re-invite upserts the pending row), GetInvitationByToken, ListPending,
  RevokeInvitation, AcceptInvitation (single tx: lock row, re-check
  pending/unexpired, email match, INSERT member ON CONFLICT DO NOTHING, flip
  status). Idempotent double-accept → member/no-error; email mismatch →
  ErrInvitationEmailMismatch; torn-down/revoked/expired → ErrInvitationNotFound.
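
The last-admin guard is easier to reason about with a small model. The sketch below is an in-memory analogy, not the store code: a sync.Mutex stands in for the shared workspace-row lock (SELECT … FROM workspaces FOR UPDATE), and the admin count is a plain count taken while the lock is held. All type and function names here are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

var ErrLastAdmin = errors.New("workspace must keep at least one admin")

// workspace models one tenant. mu is a stand-in for locking the shared
// workspaces row; every role change in the workspace serializes on it.
type workspace struct {
	mu    sync.Mutex
	roles map[string]string // userID -> "admin" | "member"
}

func (w *workspace) countAdmins() int {
	n := 0
	for _, r := range w.roles {
		if r == "admin" {
			n++
		}
	}
	return n
}

// Demote takes the shared lock first, then counts, then writes. Without
// the shared lock, two demotes could each see two admins and both proceed.
func (w *workspace) Demote(userID string) error {
	w.mu.Lock()
	defer w.mu.Unlock()
	if w.roles[userID] != "admin" {
		return nil // already a member (or absent): nothing to do
	}
	if w.countAdmins() <= 1 {
		return ErrLastAdmin
	}
	w.roles[userID] = "member"
	return nil
}

// raceDemotes demotes both admins concurrently and reports how many
// demotes succeeded and how many admins remain.
func raceDemotes() (succeeded, adminsLeft int) {
	w := &workspace{roles: map[string]string{"alice": "admin", "bob": "admin"}}
	ids := []string{"alice", "bob"}
	errs := make([]error, len(ids))
	var wg sync.WaitGroup
	for i, id := range ids {
		wg.Add(1)
		go func(i int, id string) {
			defer wg.Done()
			errs[i] = w.Demote(id)
		}(i, id)
	}
	wg.Wait()
	for _, err := range errs {
		if err == nil {
			succeeded++
		}
	}
	return succeeded, w.countAdmins()
}

func main() {
	ok, left := raceDemotes()
	fmt.Println("demotes succeeded:", ok, "admins left:", left)
}
```

Whichever goroutine wins the lock demotes its target; the loser then counts a single admin and is refused, so the workspace always keeps one admin regardless of ordering.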

Principal & resolution (§4.2/§4.2.1/§4.3.1):
- Principal gains Workspace + Role. Key auth fixes Role=member (member-capped
  regardless of minter) and resolves the workspace intrinsically from
  api_keys.workspace_id.

Tenant re-point (keep behavior identical, just workspace-scoped):
- ClaimOrCreateDomain squat/claim + the primary-per-tenant logic in
  SetDomainPrimary now key on workspace_id (a different member of the same
  workspace can re-claim; the primary unique index is per-workspace).
- The constraint-flipped writers broken by Migration A are re-pointed to
  workspace_id: suppressions (UNIQUE), account_limits (PK, limits.Upsert),
  usage_summaries (PK, usage.IncrementUsageSummary), idempotency_keys (PK,
  idempotency Claim/Complete/Release — dedup widens to the workspace, §4.1).
  usage.GetStorageBytes / RecordUsageEvent and the webhookpub outbox stamp/
  read workspace_id. user_id is retained on every table for audit.

Tests: DB-backed coverage for helper idempotency + email fallback,
bootstrap provisioning, list/rename, membership CRUD, last-admin guard
(both orderings of concurrent demotes under the shared-row lock), full
invitation accept tx (double-accept/mismatch/revoke/re-invite), and
key→workspace/role resolution. Updated the seed/read SQL in the
usage/limits/dashboard/outbox DB tests that Slice 1's PK flips broke.

Build, make test-unit, and make test-integration all green; no /v1 handler
changed so the spec/SDK golden gates are untouched.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…lice 3)

Design §4.2/§4.3/§4.3.1/§5. Builds on slice 2's store layer.

- Active-workspace resolution for human sessions
  (identity.ResolveActiveWorkspace): X-E2A-Workspace header
  (membership-verified, non-member → 403) → re-verified
  last_active_workspace_id → default workspace. The no-header path never
  403s. last_active is written conditionally (IS DISTINCT FROM), advisory
  only — never an authz input. Wired into agent.principalFromSession.
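
The resolution order above can be read as a three-step fallback. The sketch below is a simplified model under stated assumptions: the function name, its signature, and the membership callback are hypothetical, and the real identity.ResolveActiveWorkspace presumably reads membership from the store rather than from a closure.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrWorkspaceForbidden is returned when the header names a workspace the
// user does not belong to (mapped to 403 by the caller).
var ErrWorkspaceForbidden = errors.New("not a member of the requested workspace")

// resolveActive picks the active workspace for a human session:
//  1. an explicit header wins, but only if membership verifies;
//  2. otherwise the advisory last-active id, re-verified for membership;
//  3. otherwise the user's default workspace.
// Only step 1 can fail; the no-header path always resolves.
func resolveActive(header, lastActive, defaultWS string, isMember func(string) bool) (string, error) {
	if header != "" {
		if !isMember(header) {
			return "", ErrWorkspaceForbidden
		}
		return header, nil
	}
	if lastActive != "" && isMember(lastActive) {
		return lastActive, nil
	}
	return defaultWS, nil
}

// memberOf builds a membership check from a fixed list of workspace ids.
func memberOf(ids ...string) func(string) bool {
	set := make(map[string]bool, len(ids))
	for _, id := range ids {
		set[id] = true
	}
	return func(id string) bool { return set[id] }
}

func main() {
	isMember := memberOf("ws_default", "ws_team")
	// A stale last-active id (user was removed) silently falls through.
	ws, err := resolveActive("", "ws_old", "ws_default", isMember)
	fmt.Println(ws, err)
}
```

Treating last-active as a hint that is re-verified on every request is what keeps it out of the authorization path: a stale value degrades to the default instead of granting or denying anything.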

- requireWorkspaceRole(ctx, minRole) choke point alongside
  requireAccountScope, with requireWorkspaceMember / requireWorkspaceAdmin
  conveniences. Resource ops require member; people/workspace/billing ops
  require admin (+ account scope). Agent-scoped key pinning is preserved.
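
A rough sketch of the role gate follows. It assumes roles are ordered (member below admin) and that a principal without a resolved workspace fails closed; the real requireWorkspaceRole takes a context, so the Principal-based signature and the keyPrincipal helper here are hypothetical simplifications.

```go
package main

import (
	"errors"
	"fmt"
)

// Role is ordered so that a minimum-role check is a simple comparison.
type Role int

const (
	RoleMember Role = iota + 1
	RoleAdmin
)

var ErrForbidden = errors.New("insufficient workspace role")

// Principal is a trimmed-down stand-in for the authenticated caller.
type Principal struct {
	WorkspaceID string
	Role        Role
}

// keyPrincipal models API-key auth: the role is fixed to member no matter
// who minted the key, so no key can reach admin-only operations.
func keyPrincipal(workspaceID string) Principal {
	return Principal{WorkspaceID: workspaceID, Role: RoleMember}
}

// requireRole is the single choke point: no workspace means no access,
// and the caller's role must be at least min.
func requireRole(p Principal, min Role) error {
	if p.WorkspaceID == "" {
		return ErrForbidden // fail closed when no workspace resolved
	}
	if p.Role < min {
		return ErrForbidden
	}
	return nil
}

func main() {
	admin := Principal{WorkspaceID: "ws_team", Role: RoleAdmin}
	fmt.Println(requireRole(admin, RoleAdmin))
	fmt.Println(requireRole(keyPrincipal("ws_team"), RoleAdmin))
}
```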

- Keys/tokens carry NO admin authority (Role fixed to member). OAuth
  ate2a_ tokens re-verify the consenting user's LIVE membership of the
  resolved workspace per request (B4) — replacing the dropped
  ag.UserID==u.ID check — so a removed member's token is rejected on the
  next request. account-scope OAuth pins WorkspaceID into oauth.Session at
  consent; an empty WorkspaceID fails closed (force re-consent), never a
  default fallback.

- ErrWorkspaceForbidden maps a session naming a non-member workspace to
  403 (not 401) at requirePrincipal.

Tests: ResolveActiveWorkspace header/last-active/default + fail-closed +
conditional last_active; the role × scope authz matrix (member/admin gate,
agent-key cannot reach admin, no-workspace fails closed); a removed
member's OAuth token rejected next request; account-scope-no-workspace
fail-closed; a concurrent-leave last-admin race (complements slice 2's
concurrent-demote race).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…slice 4)

Add the Huma handlers for the workspace surface (design 2026-06-23 §4.4–§4.6),
building on the slice-3 store + authz layer:

- Workspaces: GET /v1/workspaces, GET/PATCH /v1/workspaces/{id} (rename = admin).
  No POST/DELETE in v1 (creation/teardown deferred, §2).
- Members: GET .../members, PATCH .../members/{user_id} (set role, admin,
  last-admin guard → 409), DELETE .../members/{user_id} (admin remove, or self
  = leave).
- Invitations: POST .../invitations (admin, NormalizeEmail case-fold,
  invite-existing-member → 409 already_member, per-workspace rate limit → 429,
  send accept link via the system-mail noreply path), GET/DELETE .../invitations,
  POST /v1/invitations/{token}/accept (idempotent 200, email mismatch → 403,
  torn-down/revoked/expired → 410).

Model X-E2A-Workspace as a shared Huma header input embed (WorkspaceHeaderInput),
the way Idempotency-Key is — so the header is declared in the OpenAPI contract
and visible to the generated SDKs (not a SecurityScheme).
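
To make the "shared header input embed" concrete, here is a dependency-free sketch. It assumes Huma's convention of declaring request headers with a `header` struct tag on input structs; the field names, the doc text, and ListAgentsInput are hypothetical. The reflection helper only demonstrates that an embedded struct's header tag is discoverable from the outer input type, which is the property a spec generator relies on.

```go
package main

import (
	"fmt"
	"reflect"
)

// WorkspaceHeaderInput is embedded into any operation input that should
// accept the workspace selector header. The field shape is an assumption.
type WorkspaceHeaderInput struct {
	Workspace string `header:"X-E2A-Workspace" doc:"Workspace to act in (illustrative text)"`
}

// ListAgentsInput is a hypothetical operation input that opts in by embedding.
type ListAgentsInput struct {
	WorkspaceHeaderInput
	Limit int `query:"limit"`
}

// headerNames walks a struct type, descending into embedded structs, and
// collects every declared header name.
func headerNames(t reflect.Type) []string {
	var names []string
	for i := 0; i < t.NumField(); i++ {
		f := t.Field(i)
		if f.Anonymous && f.Type.Kind() == reflect.Struct {
			names = append(names, headerNames(f.Type)...)
			continue
		}
		if name, ok := f.Tag.Lookup("header"); ok {
			names = append(names, name)
		}
	}
	return names
}

// headersOf is a convenience wrapper that accepts a value instead of a type.
func headersOf(v any) []string {
	return headerNames(reflect.TypeOf(v))
}

func main() {
	fmt.Println(headersOf(ListAgentsInput{}))
}
```

Declaring the header as an input field, rather than as a security scheme, keeps it a per-request parameter in the generated contract, which is how the SDKs come to expose it.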

GET /v1/account (whoami) extends additively with the active workspace {id,name}
and the caller's role.

Audit log: invite / revoke / remove / role-change / rename each write an
audit_log row in the SAME tx as the mutation (writeAuditTx); the four mutating
store methods gain an actorUserID arg + ListAuditLog reader. Slice-3 call sites
updated.

Wiring: new httpapi.Deps workspace closures bound in apiserver.BuildDeps;
agent.API gains SendInvitationCore (noreply system-mail) + a per-workspace
inviteLimit. Ran `make generate`; committed the regenerated api/openapi.yaml +
TS/Python SDK bases (TestSpecGoldenNoDrift green).

Tests: endpoint happy-path + the permission/invitation edge cases (member-vs-
admin, last-admin, leave-vs-remove, already_member, rate-limited, email
mismatch, torn-down). DB-backed identity store tests updated + passing.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…slice 5)

API keys are workspace service credentials (§4.3.1): list is per-workspace and
surfaces created_by; revoke is creator-own / admin-any (ErrAPIKeyForbidden →
403, ErrAPIKeyNotFound → 404). No role axis on keys — every key tops out at the
member floor. CreateScopedAPIKey now stamps the returned struct's WorkspaceID +
CreatedBy (column wiring landed in slice 4).
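
The revoke rule (creator-own / admin-any) can be sketched as a pure check. The struct shapes and the authorizeRevoke helper are hypothetical; treating a key from another workspace as not found is an assumption made here to avoid leaking key existence across tenants, and is not confirmed by the description.

```go
package main

import (
	"errors"
	"fmt"
)

var (
	ErrAPIKeyForbidden = errors.New("api key: forbidden") // maps to 403
	ErrAPIKeyNotFound  = errors.New("api key: not found") // maps to 404
)

// apiKey is a trimmed-down stand-in for a stored key row.
type apiKey struct {
	ID          string
	WorkspaceID string
	CreatedBy   string // may be empty once the creator's account is deleted
}

// caller describes the session attempting the revoke.
type caller struct {
	UserID      string
	WorkspaceID string
	IsAdmin     bool
}

// authorizeRevoke allows a revoke when the caller is a workspace admin or
// the key's creator. Keys outside the caller's workspace look absent.
func authorizeRevoke(key *apiKey, c caller) error {
	if key == nil || key.WorkspaceID != c.WorkspaceID {
		return ErrAPIKeyNotFound
	}
	if c.IsAdmin {
		return nil
	}
	if key.CreatedBy != "" && key.CreatedBy == c.UserID {
		return nil
	}
	return ErrAPIKeyForbidden
}

func main() {
	key := &apiKey{ID: "key_1", WorkspaceID: "ws_team", CreatedBy: "alice"}
	fmt.Println(authorizeRevoke(key, caller{UserID: "bob", WorkspaceID: "ws_team"}))
}
```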

Rewrites DeleteUserData to be workspace-aware (§5, blockers B1/B2): it no longer
blanket-cascades through users. Multi-member workspaces detach — the leaving
user's owned rows are re-homed to a surviving admin so the still-live
user-cascade (Migration B deferred) leaves shared agents/domains/keys intact;
solo workspaces (incl. the deterministic default) are torn down by workspace_id
(the cascade owner), with the per-domain SES deprovision hook run only for those
torn-down tenants. usage_events is deleted only within torn-down workspaces
(tenant-aware GDPR vs workspace-owns-usage). Sole-admin-of-a-multi-member
account delete fails closed (ErrSoleAdminWorkspace). The workspace classify
read takes the shared workspace-row lock (FOR UPDATE OF w) for write-skew safety.
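
The per-workspace decision that DeleteUserData makes can be modeled as a small classifier. This is a sketch of the three outcomes described above, assuming the inputs are read under the shared workspace-row lock; the snapshot struct, action names, and classify function are hypothetical.

```go
package main

import (
	"errors"
	"fmt"
)

var ErrSoleAdminWorkspace = errors.New("user is the sole admin of a multi-member workspace")

type action int

const (
	actionNone     action = iota
	actionTeardown        // solo workspace: delete by workspace_id
	actionDetach          // multi-member: re-home the user's rows, keep the workspace
)

// snapshot is what the classify read is assumed to return for one of the
// leaving user's workspaces.
type snapshot struct {
	Members     int  // total members, including the leaving user
	Admins      int  // total admins, including the leaving user if admin
	UserIsAdmin bool // whether the leaving user is an admin here
}

// classify decides what account deletion does to one workspace.
func classify(s snapshot) (action, error) {
	if s.Members <= 1 {
		return actionTeardown, nil
	}
	if s.UserIsAdmin && s.Admins <= 1 {
		// Others would be left without an admin: refuse the whole delete.
		return actionNone, ErrSoleAdminWorkspace
	}
	return actionDetach, nil
}

func main() {
	fmt.Println(classify(snapshot{Members: 1, Admins: 1, UserIsAdmin: true}))
	fmt.Println(classify(snapshot{Members: 3, Admins: 1, UserIsAdmin: true}))
}
```

Under this model a single blocked workspace aborts the entire deletion, which is why the dashboard guidance points the user at promoting another admin or removing members first.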

Audit FKs (workspaces.created_by, api_keys.created_by, *.invited_by,
audit_log.actor_user_id) are already ON DELETE SET NULL in Migration A (048) —
no schema change needed this slice.

Re-ran make generate (APIKeyView gained created_by + redocumented scope);
committed the regenerated spec + TS/Python SDK bases.

Tests (DB-backed, run against Postgres :5433):
- multi-member workspace member delete leaves shared resources intact
- sole-admin-of-multi-member delete fails closed
- user delete revokes their OAuth/MCP tokens (no orphan bearer)
- a removed member's surviving service key still authenticates (member-capped,
  workspace intrinsic) — the conscious §4.3.1 decision
- idempotency dedup widening: two members in one workspace collide on
  (workspace_id, key)
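
The dedup widening in the last test can be illustrated with an in-memory claim table. This is a toy model, assuming the claim key is exactly (workspace_id, key) and that the calling user is not part of it; the type and method names are hypothetical.

```go
package main

import (
	"fmt"
	"sync"
)

// claimKey mirrors the widened primary key: the user id is deliberately absent.
type claimKey struct {
	WorkspaceID string
	Key         string
}

// claims is a minimal stand-in for the idempotency_keys table.
type claims struct {
	mu   sync.Mutex
	seen map[claimKey]bool
}

func newClaims() *claims {
	return &claims{seen: make(map[claimKey]bool)}
}

// Claim reports whether this is the first use of key within the workspace.
// userID is accepted only to show that it does not affect the outcome.
func (c *claims) Claim(workspaceID, userID, key string) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	k := claimKey{WorkspaceID: workspaceID, Key: key}
	if c.seen[k] {
		return false
	}
	c.seen[k] = true
	return true
}

func main() {
	c := newClaims()
	fmt.Println(c.Claim("ws_team", "alice", "k1")) // first claim in ws_team
	fmt.Println(c.Claim("ws_team", "bob", "k1"))   // same workspace, same key
	fmt.Println(c.Claim("ws_other", "bob", "k1"))  // different workspace
}
```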

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Plumbing slice for the Workspace dashboard UI (§4.2/§4.4/§4.6). No
visible UI yet — later slices depend on this.

- types: add Workspace, WorkspaceMember, Invitation, WorkspaceRole,
  CreateInvitationResponse + whoami workspace/role shapes.
- onboarding/api: typed functions for every /v1/workspaces endpoint
  (list/get/rename, members list/set-role/remove, invitations
  list/create/revoke/accept). Inject the X-E2A-Workspace header in the
  central request<T> from a module-level active-workspace slot, exposed
  via setActiveWorkspaceId + workspaceHeaders; stamp the same selector on
  the settings page's direct /api/auth/me PATCH.
- WorkspaceProvider: fetch GET /v1/workspaces, seed active workspace +
  role from whoami, persist active id to localStorage, expose
  { workspaces, activeWorkspace, role, switchWorkspace }. switchWorkspace
  flips the active id, updates the header slot, and clears tenant-scoped
  SWR cache so agents/domains/messages refetch under the new tenant.
  Wired into the (app) layout.
- swrKeys: workspacesKey/membersKey/invitationsKey + invalidators and a
  tenant-scoped cache invalidator for workspace switches.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Build the active-workspace switcher and surface the Workspace screen in
the sidebar, on top of W1's WorkspaceProvider context (§4.2).

- WorkspaceSwitcher (loft/): dropdown above the user card listing the
  workspaces the session belongs to, each with a role pill (accent=admin,
  neutral=member) and a check on the active one; clicking calls
  switchWorkspace(id). No create-workspace affordance (v1 scope).
  Collapses to a static label (no dropdown chrome) when the user belongs
  to a single workspace; renders nothing while unresolved.
- Sidebar: drop it into the reserved slot above the user card; add a
  'Workspace' bottom-nav entry (route /workspace, new 'users' icon)
  directly above Settings, with the shared active-route treatment.
- Add a /workspace stub page so the nav entry isn't a dead link; W3
  replaces it with the real members/invitations UI.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
Build out the /workspace management surface (slice W3, §4.6) on top of the
W1/W2 WorkspaceProvider + switcher:

- PageShell with an inline, admin-only Rename affordance on the active
  workspace name (renameWorkspace PATCH); non-admins see it read-only.
- Stats strip: members / admins / pending invites.
- Members table (api-keys table markup): avatar initials, name + email,
  role pill (Chip accent=admin / neutral=member), joined date. Admin-only
  row actions: a role select (admin<->member) and Remove/Leave (own row =
  Leave, confirm() guarded). Members see a read-only roster.
- Toggle-inline Invite form (AddDomainForm pattern) with email + role,
  plus a Pending invitations table with a Revoke action.
- last_admin / already_member 409s surface as dismissible inline banners.
- SWR keyed by workspace id; members/invitations invalidated after each
  mutation; invitations fetched only for admins (admin-only endpoint).

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
…e (W4)

Add /invite/accept?token= under the auth-gated (app) layout: auto-accepts the
token via POST /v1/invitations/{token}/accept and handles the three designed
outcomes — 200 joins (switch into the workspace + route to /workspace), 403
email-mismatch (name the signed-in account + offer switch), 410 gone (expired/
revoked state back to /dashboard). Adds WorkspaceProvider.enterWorkspace(id) so
a freshly-joined workspace (not yet in the cached list) becomes active before
routing.

Settings danger zone now parses the error envelope and surfaces the server's
message inline instead of a generic "Failed:". Detects the sole-admin-of-a-
multi-member-workspace block and appends actionable guidance pointing at the
workspace to promote another admin or remove members first.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>