services: add path-based matching with name as canonical identifier#164
Conversation
|
💬 Discussion in Slack: #pr-review-agent-vault-164-services-add-path-based-matching-with-name-as-canonica Posted by Review Police — reviews, comments, new commits, and CI failures will stream into this channel. |
|
Preview deployment for your docs. Learn more about Mintlify Previews.
💡 Tip: Enable Workflows to automatically generate PRs for you. |
Services can now scope by host AND optional URL path, so the same host
can carry multiple credentials at different paths (the motivating Slack
case: bot-token at /api/* alongside connection-token at
/api/apps.connections.*). The host-only matcher is replaced by
specificity-scored selection — exact-host beats wildcard-host, then
longer literal path prefix wins, then declaration order.
The host-keyed identity model is replaced by a slugged `name` as the
canonical per-vault service identifier, fixing a long-standing weakness
where two services on the same host couldn't be addressed
independently. Existing host-only services are auto-named on first read
(via broker.NormalizeServices) — no migration step, no manual config
edits, no schema change.
Spans broker matcher, brokercore credential injection, proposal merge
flow, server endpoints, CLI, React UI, TypeScript SDK, and docs.
Highlights:
- broker.MatchService(host, path, services) replaces MatchHost; returns
(*Service, MatchScore) so the debug log can audit which rule won and
why without recomputing the comparison.
- DELETE/PATCH /v1/vaults/{vault}/services/{name-or-host} tries name
first then falls back to host with 409+candidate list on ambiguity.
- Inline-form host (e.g. `slack.com/api/*`) is split server-side at
every ingest path; CLI mirrors the same split for `--host` flags.
- ProxyEvent.MatchedService stores the canonical name (slug); new
MatchedHost/MatchedPath fields carry the matched pattern in-memory
(no schema change). Operators filtering request logs by `?service=`
should use the slug going forward — pre-upgrade rows still contain
hosts; documented in cmd/skill_http.md.
- broker.ValidateSlug consolidates the 3-64 lowercase-alphanumeric-
hyphens rule that previously lived in a private validateSlug in
internal/server/handle_agents.go; vault and agent name validation
now share the same source of truth.
- TypeScript SDK v0.2.0: additive `name?` and `path?` on Service, new
removeByName(name) method; existing remove(host) keeps working
against the back-compat host-shim route (surfaces 409 as ApiError).
6364906 to
606b951
Compare
The path-based-matching commit tripped two gocritic checks: the character-class branch in Slugify reads more naturally as a switch, and the in-place splice in handleServiceRemove was assigning the append result to a renamed variable. Rebuild the filtered slice explicitly so the intent is unambiguous.
handleServicesUpsert auto-slugged missing names by calling broker.NormalizeServices on the incoming batch *before* loading existing services, so a no-name upsert whose slug happened to match an unrelated stored service's Name silently overwrote it via the byName index. Same hazard the proposal flow already guards against in normalizeProposalServices — apply the same seeding pattern here. The new normalizeIncomingAgainstExisting helper preserves the legacy "upsert by host" pattern by first looking up an existing service with matching (Host, Path) and adopting its Name; if none exists, it auto-slugs and bumps (`-2`) on collision against names already taken by existing services or explicit names elsewhere in the batch. Two regression tests pin the behavior: - TestServicesUpsertAutoSlugBumpsAroundExistingName: the bot's example (incoming slack.com /api/* whose slug "slack-com-api" collides with an unrelated stored service of the same explicit Name) — both services must survive, the new one stored under "slack-com-api-2". - TestServicesUpsertExplicitNameMatchingExistingReplaces: explicit Name matching an existing service still replaces (intended POST semantic, no bumping). Reported in claude[bot] review on PR #164.
The proposal apply path was loading existing broker config via raw GetBrokerConfig+Unmarshal — bypassing the broker.NormalizeServices backfill that every other service-loading path in this PR uses. When a vault held services stored before the name-as-identifier PR (Name=""), MergeServices's nameIndex collapsed every legacy entry onto the "" key. A pre-PR proposal applied post-deploy (proposedServices also Name="") would then upsert against nameIndex[""] and overwrite the wrong entry — silent data loss whose blast radius is whichever legacy service was indexed last. Use s.loadServices for existing (already normalized via NormalizeServices) and slug-backfill proposedServices Names from broker.Slugify(Host, Path) so the merge index has canonical Names on both sides. Inline-form host split is applied to proposed too, in case a pre-PR proposal stored "slack.com/api/*" in the Host field. TestAdminProposalApprovePreservesLegacyServices pins the corruption path: two pre-PR existing services (api.github.com first, api.stripe.com second) plus a pre-PR proposal targeting api.github.com — the test asserts api.stripe.com's auth survives untouched (the bug overwrote it because nameIndex[""] resolved to stripe). Closes the apply-path gap flagged in the PR review.
- Drop the `description` field from Service / proposal Service end-to- end (Go, SDK, web, docs); server returns 400 on legacy description-bearing payloads instead of silently dropping. - Consolidate path scoping under inline-form host (slack.com/api/*). Remove `--path` / `--description` flags; the web form drops the separate Path Glob input. Reads still return host + path split so the matcher rule is inspectable. - Per-vault sync.Mutex serializes the load → mutate → save cycle on every service-mutating handler (POST/PATCH/DELETE/PUT, proposal apply). Without it, two concurrent upserts can both pass the auto-slug collision check against the same pre-state. - Apply-time auto-slug normalization mirrors the bulk-upsert path: a stale pre-PR proposal whose Slugify(host, path) collides with an unrelated existing service is bumped to slug-2 rather than overwriting it. A host-only set proposal that matches an existing (host, path) adopts the existing Name so legacy rotate-credentials proposals update in place rather than create a duplicate. A delete with no host match is rejected at create with "name is required" rather than fabricating a slug. - ValidateSlug rejects leading/trailing hyphens and consecutive hyphens (defensive — Slugify never produces them, but user/agent-supplied names bypass Slugify). - Slugify distinguishes root-literal paths (`/`, `/*`) from empty-path catch-alls so the two don't collide on the same host. - MergeServices panics on empty Name on either side — a programming- error invariant, since the Name-keyed index would otherwise collapse every empty-name entry onto "" and silently overwrite. - ServicesTab UX: single Host Pattern field with tooltip-style help, refetch after save (server may rename), stable React keys for custom-header and substitution rows so deleting a middle row no longer bleeds values into adjacent rows. Default auth type is passthrough.
|
@claude review |
…tcher field everywhere Writes already accept a single inline-form `host` (e.g. `slack.com/api/*`) and the server splits it into bare Host + Path on ingest. Reads, though, returned `host` and `path` separately — asymmetric with the write shape and surprising for SDK/CLI/UI consumers that wrote one field and read back two. Make every read surface emit a single `host` carrying the joined inline form, matching what was sent. Mechanism: custom MarshalJSON on broker.Service and proposal.Service joins Host + Path back into `host` and suppresses the `path` field via omitempty. The Path json tag stays in place so on-disk records written before this change still unmarshal correctly. Every read site that unmarshals from storage (loadServices, brokercore.Inject, handle_discovery, cmd/service_interactive.fetchServices) calls SplitInlineHost defensively after decode so the matcher invariant — stored Host has no '/' — holds regardless of whether the row was persisted in legacy split form or new joined form. Idempotent on both. Wire shape consequences: - discoverService, candidateRef, serviceRef lost their Path field; callers compute the wire `host` via svc.MatcherPattern(). - SDK Service / CredentialUsageEntry / WireService / WireCredentialUsageEntry lost path? on the read shape (also dropped a stale description?: string|null leftover on WireService — description was already removed end-to-end in 09c2a6e). - CLI `vault discover` table dropped the PATH column; service_interactive applies the defensive split so interactive append over a vault with path-scoped services validates instead of failing on the Host-must-not-contain-/ invariant. - Web ServicesTab and ProposalPreview lost path? from their interfaces and display. - skill_cli.md / skill_http.md and the docs/learn services/protocol/ vaults/proposals pages reframed to single-host. Request-log fields ("method, host, path, status") stay as-is — that `path` is the request URL path, a different concept from the matcher's path glob. TestServicesGetReturnsJoinedHostNoPathField pins both the new wire shape AND the legacy-format read path: it seeds storage with the old split form and asserts the GET response carries the joined-form host and no `path` field. TestServicesUpsertSplitsInlineHost flipped to assert the new persistence shape — joined form on disk, no `path` on the wire — while still proving the auto-slug derives from the split (so the inline form was actually parsed).
…/UI fixes
Four issues flagged in the PR review.
Apply-time delete bug (silent data loss). handleAdminProposalApprove
normalized proposed services inline: an ActionDelete with no Name
fell through to broker.Slugify(host) and was marked autoSlugged=false
(the gate was Action==ActionSet), so the bump-on-collision pass
skipped it. MergeServices then indexed by Name and silently deleted
whichever existing service happened to carry that slug as its Name —
even if its host was unrelated. Slugify is many-to-one
(api.foo.com and api-foo.com both produce api-foo-com), so the
collision is realistic, not theoretical. Trigger window: pre-PR
proposals stored with no Name, applied within their 7-day TTL.
Replace the inline loop with a call to the create-time helper
normalizeProposalServices, which already has the correct delete-arm
logic: 1 host match adopts Name, 2+ matches return hostAmbiguityError
(now surfaced as 409 + candidates), 0 matches leave Name empty. Add a
guard after the call to return 409 if any ActionDelete reaches
MergeServices with Name="" — MergeServices panics on empty Name and
silently dropping the entry would let a stale approval succeed
without applying anything, which is worse than a clear error.
Two regression tests pin the apply-path behavior:
- TestAdminProposalApproveRejectsStaleDeleteWithoutName: 0-hit case
(host no longer in vault) returns 409, unrelated service whose
Name collides with Slugify(host) survives untouched.
- TestAdminProposalApproveRejects409OnAmbiguousDelete: 2+ hit case
(host shared by path-scoped services) returns 409 with candidates
array, both services survive.
SDK README example. The "Configure proxy rules" snippet passed fields
that aren't on ServiceInput: description on stripe (TS2353, server
returns 400 via rejectDeprecatedDescription), and path as a separate
property on the slack entries. Drop description; switch slack to
inline-form host (slack.com/api/*, slack.com/api/apps.connections.*).
Daytona quickstart (run.mjs). The example POST body still carried
description: "OpenAI Realtime", which now 400s. Drop it.
CredentialsTab React key collision. Pre-PR, host was unique per
vault so key={svc.host} was safe in the delete-credential modal.
With path-based matching, two services can share a host (the Slack
split that motivated this PR), and if both reference the same
credential, /credential-usage returns two entries with identical
host — duplicate React keys. The server already populates `name` on
every entry; switch to key={svc.name ?? svc.host}.
- close SSRF gap on direct upsert: move ValidateHost into broker package
and call it from broker.Validate so admin POST /services rejects IPs
and internal hosts the same way the proposal flow does
- fix interactive `vault service set` (was failing with "name is required"
on every run) by NormalizeServices-ing the merged slice before validate;
drop client-side normalize+validate from `vault service add -f` so the
server's existing-aware collision protection isn't defeated
- replace confusing "name is required" on stale-host deletes with a
structured hostNotFoundError → 404 at create, 409 at apply; unify both
call sites via a single writeNormalizeError helper
- redirect skill_http.md / protocol.mdx Notes off the unregistered
GET /services/{name} endpoint to list-and-filter
- ValidatePath now uses unicode.IsControl (covers C1 range alongside C0)
normalizeProposalServices now honors Path when the caller scoped via inline form (e.g. `slack.com/api/*`), so a path-scoped delete resolves to its single match instead of 409'ing against unrelated path siblings on the same host. Bare-host deletes still surface multi-service ambiguity. Adds a regression test. Also trims verbose docstrings and inline comments across internal/ (net -183 lines) while keeping load-bearing WHY annotations: the MergeServices panic rationale, SSRF reasoning on internalHosts, the vaultServiceMu SQLite invariant, MarshalJSON/SplitInlineHost coupling, and InjectResult's secret-vs-safe-to-log distinctions.
- handleServicesClear: acquire lockVaultServices to match the documented invariant covering every /services write handler - cmd/service.go: quote the slack.com/api/* example so zsh's nomatch doesn't trip - skill_http.md: move the Matching priority H3 out of the Key fields bullet list so the trailing credentials bullets render under Key fields - services.set() JSDoc: rewrite to upsert-by-name (was still "by host" pre-PR semantics)
|
@claude review |
Drop host-based name auto-derivation (Slugify, NormalizeServices, EnsureUniqueName) so service identity is always a deliberate choice rather than a side effect of host parsing. Server validation rejects ActionSet without name; CLI flag handlers and the interactive prompt surface the same error up front. ActionDelete keeps the by-host fallback for the proposal flow.
- AGENTS.md: /discover entries are {name, host}, not name+host+path
- cmd/proposal.go: print service Name alongside host so the CLI review
shows the identifier the merge actually keys on
- cmd/service_interactive.go: rename findDuplicateHosts to
findDuplicateMatchers and key on (Host, Path) so legitimate
path-scoped siblings (slack.com/api/* vs slack.com/api/apps.connections.*)
don't trip the duplicate warning; reword the message to match the
matcher's first-declared-wins tiebreak
- cmd/cmd_test.go: add "name" to TestProposalCreateFlagsRegistered
|
@claude review |
| func MergeServices(existing []broker.Service, proposed []Service) ([]broker.Service, []string) { | ||
| // Index existing services by host for O(1) lookup. | ||
| hostIndex := make(map[string]int, len(existing)) | ||
| for i, s := range existing { | ||
| hostIndex[s.Host] = i | ||
| if s.Name == "" { | ||
| panic(fmt.Sprintf("proposal.MergeServices: existing[%d] has empty Name (host=%q)", i, s.Host)) | ||
| } | ||
| } | ||
| for i, p := range proposed { | ||
| if p.Name == "" { | ||
| panic(fmt.Sprintf("proposal.MergeServices: proposed[%d] has empty Name (host=%q, action=%q)", i, p.Host, p.Action)) | ||
| } | ||
| } |
There was a problem hiding this comment.
🔴 Approving any proposal on a pre-PR vault panics the request goroutine. Legacy services persisted before this PR have Name="" (the field is new in this diff), and MergeServices's new guards (merge.go:15-25) fire on the first existing entry. The PR description claims "Legacy services are normalized lazily on read," but broker.NormalizeServices and broker.Slugify don't exist anywhere in this codebase — loadServices (handle_services.go:134-150) only applies SplitInlineHost, never backfilling Name. Fix: implement the missing normalizer and call it in loadServices so existing entries are backfilled before reaching MergeServices, or replace the panic with a returned error surfaced as a 500 with a migration message. The Web UI's PUT-services edit path is also broken for the same root cause (broker.Validate hard-rejects empty Name at broker.go:374-376), so operators have no in-product remediation.
Extended reasoning...
What the bug is
Legacy services persisted before this PR have no Name field on the struct (the diff shows Description *string being removed and Name string being added on broker.Service). When json.Unmarshal runs against a pre-PR services_json row, every legacy entry materialises with Name="". This PR adds new panic guards at the top of proposal.MergeServices (merge.go:15-25):
for i, s := range existing {
if s.Name == "" {
panic(fmt.Sprintf("proposal.MergeServices: existing[%d] has empty Name (host=%q)", i, s.Host))
}
}
for i, p := range proposed {
if p.Name == "" {
panic(fmt.Sprintf("proposal.MergeServices: proposed[%d] has empty Name (host=%q, action=%q)", i, p.Host, p.Action))
}
}The PR description and the docstring on MergeServices both say "callers must normalize before MergeServices" / "Legacy services are normalized lazily on read." That contract is unimplemented.
The missing helpers
grep -rn 'NormalizeServices\|Slugify' across the entire repository (Go, MDX, MD, TS, TSX) returns zero matches. The PR description, several review comments, and the docstring on MergeServices reference these helpers as if they exist, but only the call sites that depend on them landed — not the helpers themselves. Functions actually present in internal/broker/broker.go are Validate, MatchService, ValidateSlug, ValidatePath, ValidateHost, SplitInlineHost. None of them backfill Name.
The specific code path that triggers it
handleAdminProposalApprove (handle_proposals.go:478-493) runs:
existingServices, err := s.loadServices(ctx, ns.ID)— returns legacy entries withName="".proposedServices, err = normalizeProposalServices(proposedServices, existingServices)— only fillsNameforActionDelete(handle_services.go:71-83);ActionSetentries with emptyNamepass through unchanged. Even onActionDelete, when the matched existing service is legacy, the adoptedNameis still"".merged, _ := proposal.MergeServices(existingServices, proposedServices)— panics on the first iteration of the first guard loop at merge.go:17, before any merge work happens.
I verified loadServices end-to-end: it calls GetBrokerConfig, unmarshals into []broker.Service, applies SplitInlineHost per entry, returns. No name normalization step exists. I verified no migration was added — internal/store/migrations/ ends at 045_no_access_role.sql; nothing touches broker_configs.services_json. I verified there is no recover() anywhere in internal/server/, so Go's net/http default panic handler closes the connection and the operator sees a 500 / dropped connection with the stack trace in logs.
Step-by-step proof (legacy vault)
- Operator upgrades to this PR. Vault holds
[{"host":"api.stripe.com","auth":{...}}, {"host":"*.github.com","auth":{...}}]from before — JSON has no"name"key. - Agent submits a fresh proposal post-upgrade with a valid Name field:
{"services":[{"action":"set","name":"slack-bot","host":"slack.com","auth":{...}}], ...}. Create succeeds. - Admin clicks Approve.
handleAdminProposalApproveruns.loadServicesreturns the two legacy entries, both withName="".normalizeProposalServicesreturns the proposed slice unchanged (the new entry already hasName="slack-bot"). proposal.MergeServices(existingServices, proposedServices)enters the first guard loop. Ati=0,existing[0].Name == ""— panic fires withproposal.MergeServices: existing[0] has empty Name (host="api.stripe.com").- The goroutine's stack trace gets dropped to logs; net/http's connection handler catches it and the operator sees a connection drop / 500. The proposal cannot be approved — there is no UI workaround.
Web UI breakage (same root cause)
The Web UI's PUT-services edit flow is also fully broken for legacy vaults. GET /v1/vaults/{name}/services returns legacy entries with name="" because the read path doesn't backfill either. Saving any edit via the ServicesTab modal triggers PUT /services with the full list (ServicesTab.tsx:saveServices), which calls broker.Validate server-side — and broker.go:374-376 hard-rejects every empty-Name entry: return fmt.Errorf("service %d: name is required", i). Operators cannot edit a single service without first manually filling Names for every legacy service. There is no UI path that backfills automatically.
Why this is normal severity and how to fix
Severity is normal (not critical) because Go's net/http catches the panic at the connection level so the process keeps running, and brand-new installs with no legacy data are unaffected. But the trigger is a routine operator action (approve a proposal) on the headline data shape this PR claims to support, and there is no in-product remediation path. The fix the PR description implies is missing: implement Slugify(host, path) (a deterministic, idempotent slug builder) plus NormalizeServices(svcs []Service) []Service that backfills Name = Slugify(Host, Path) (with collision-bumping if needed) for any entry with empty Name, and call it in loadServices so every read path produces normalized entries. Alternatively/additionally: replace the panic in MergeServices with a returned error, and have the apply path surface it as a 500 with a clear migration message — at least no crash. The first option restores the documented "lazy normalization on read" contract; the second is a defense-in-depth backstop for the data-loss class concern the panic docstring cites.
| export interface ServiceInput { | ||
| /** Service name (slug, 3–64 chars, lowercase alphanumeric and hyphens). Required on write. */ | ||
| name?: string; | ||
| /** Host pattern. Accepts `api.stripe.com`, `*.github.com`, or an inline path form like `slack.com/api/*`. */ | ||
| host: string; | ||
| /** Whether the service is active. Omitted/undefined is treated as enabled. */ | ||
| enabled?: boolean; | ||
| /** Authentication configuration. */ | ||
| auth: ServiceAuth; | ||
| /** Optional placeholder→credential substitutions applied before forwarding. */ | ||
| substitutions?: Substitution[]; | ||
| } |
There was a problem hiding this comment.
🟡 The TypeScript SDK type ServiceInput.name at sdks/sdk-typescript/src/resources/services.ts:99 is declared name?: string (optional), but the JSDoc literally one line above says "Required on write" and the server hard-rejects missing Name with 400 (broker.Validate at internal/broker/broker.go:353-355: service N: name is required). A TypeScript user can write vault.services!.set([{host:"api.stripe.com", auth:{...}}]) and tsc --strict compiles cleanly — the runtime failure is the first signal. The same mismatch exists on WireService.name at sdks/sdk-typescript/src/types.ts:108-109. Fix: drop the ? on both interfaces so the type matches the contract.
Extended reasoning...
What the bug is
The SDK exports two write-side types whose name field is optional in the type system but mandatory at runtime:
ServiceInput.name?: stringat sdks/sdk-typescript/src/resources/services.ts:99 — the JSDoc directly above on line 98 says "Required on write". This is the write shape consumed byServicesResource.set()andreplaceAll().WireService.name?: stringat sdks/sdk-typescript/src/types.ts:108-109 — the JSDoc says "Required on write — pick deliberately; the server does not derive it fromhost". This is the internal wire type.
The server enforces the contract hard at broker.Validate (internal/broker/broker.go:353-355):
if s.Name == "" {
return fmt.Errorf("service %d: name is required", i)
}Surfaced by handleServicesUpsert as HTTP 400 with Invalid services: service 0: name is required.
Why it matters
Every other write-required field on ServiceInput is non-optional in the type — host: string, auth: ServiceAuth. Optional fields are correctly typed optional — enabled?, substitutions?. The lone outlier is name?, where prose says required but the type system says optional. TypeScript strict mode protects against typos in object literals (e.g. the description excess-property error users hit), but it cannot protect against a field whose own type signature permits omission.
Step-by-step proof
- TypeScript user installs
@infisical/agent-vault-sdk@0.2.0, opensServiceInputin their IDE, seesname?: string. - Following the visible type, they write:
await vault.services!.set([{ host: "api.stripe.com", auth: { type: "bearer", token: "STRIPE_KEY" }, }]);
tsc --strictcompiles cleanly —name?permits omission.- SDK marshals
{services:[{host:"api.stripe.com",auth:{...}}]}and POSTs to/v1/vaults/{name}/services. - Server
handleServicesUpsertrunsbroker.Validate(&incoming). At broker.go:353 the check fires. - Server returns HTTP 400
{"error":"Invalid services: service 0: name is required"}. - The user got no compile-time signal — only a runtime 400 on a field they were never told they had to fill in.
Why the JSDoc is not enough
JSDoc renders in IDE hover tooltips but does not constrain the build. The strict-mode excess-property check that catches typos on the user-supplied side does NOT engage for missing optional fields. The type system is what users trust; the JSDoc is the second source of truth at best.
Why this is nit, not normal
The runtime behavior is correct (server returns a clear 400 with an actionable message), no data loss, no security risk. But this is the v0.2.0 release whose headline feature is exactly "name is canonical" — the SDK's type system is the first line of error feedback for users adopting the new model, and a self-contradicting type signature on the canonical write method undermines the migration story this release exists to deliver. Fix is one character per location.
How to fix
Drop the ? in both files:
// sdks/sdk-typescript/src/resources/services.ts:99
- name?: string;
+ name: string;
// sdks/sdk-typescript/src/types.ts:109
- name?: string;
+ name: string;Existing callers that supply name are unaffected. Callers omitting name now fail at compile time with TS2741: Property 'name' is missing in type '{ host: string; auth: ...; }' but required in type 'ServiceInput'. — immediate, actionable feedback matching the server contract.
| In agent mode (AGENT_VAULT_TOKEN set), AGENT_VAULT_VAULT (or --vault) is | ||
| required — there is no project-file or interactive-picker fallback. | ||
|
|
||
| Flag-driven mode (common cases): | ||
| Flag-driven mode (common cases). When --host is provided, --name is required: | ||
|
|
||
| # Service + credential | ||
| agent-vault vault proposal create \ | ||
| --host api.stripe.com --auth-type bearer --token-key STRIPE_KEY \ | ||
| --name stripe --host api.stripe.com --auth-type bearer --token-key STRIPE_KEY \ | ||
| --credential STRIPE_KEY="Stripe API key" --message "Need Stripe access" | ||
|
|
||
| # Path-scoped service (inline-form host) | ||
| agent-vault vault proposal create \ | ||
| --name slack-bot --host 'slack.com/api/*' \ | ||
| --auth-type bearer --token-key SLACK_BOT_TOKEN \ | ||
| --credential SLACK_BOT_TOKEN="Slack Bot token" --message "Slack bot access" | ||
|
|
||
| # Credential only (no host/service) | ||
| agent-vault vault proposal create \ | ||
| --credential DB_PASSWORD="Database password" --message "Need DB access" |
There was a problem hiding this comment.
🟡 The fallback error message printed when vault proposal create is invoked with no flags (cmd/proposal_create.go:86) shows an example that copies-and-pastes into an immediate failure: --host api.example.com is given without the now-required --name flag, so the user hits --name is required when --host is specified from buildFromFlags. The Long help block (lines 27-45) was correctly migrated to include --name my-service in every --host example; this parallel error-fallback string was missed. Fix is one token: insert --name my-service before --host in the second example on line 86.
Extended reasoning...
What the bug is
The error returned at cmd/proposal_create.go:86 (when the user runs agent-vault vault proposal create with no -f, no --host, and no --credential) prints three examples to guide the user:
provide either --host/--credential flags or -f <file>
Examples:
agent-vault vault proposal create --credential MY_KEY="description" --message "reason"
agent-vault vault proposal create --host api.example.com --auth-type bearer --token-key MY_KEY --credential MY_KEY --message "reason"
agent-vault vault proposal create -f proposal.json
The second example is broken on this PR: it sets --host api.example.com but omits --name. This PR introduced a hard requirement in buildFromFlags (cmd/proposal_create.go:194-197):
name, _ := cmd.Flags().GetString("name")
if name == "" {
return nil, fmt.Errorf("--name is required when --host is specified")
}A user who copies the example verbatim hits this check immediately and gets an error referencing a flag that was not in the help they just read.
Why existing code does not prevent it
The Long help block at cmd/proposal_create.go:27-45 was correctly updated in this PR to include --name in every --host example (e.g. --name stripe --host api.stripe.com, --name slack-bot --host 'slack.com/api/*'). The author was aware of the new requirement and migrated the canonical docs surface — but missed this second string constant at line 86, which is a separate piece of inline text only visible when the user provides no flags at all. The asymmetry between two places in the same file is concrete evidence this is an oversight rather than a deliberate choice.
Step-by-step proof
- User runs
agent-vault vault proposal createwith no flags. - Lines 79-86 hit the
elsebranch (nofilePath, nohost, nocredentials), returning the error string at line 86. - User reads the output and copies the second example verbatim:
agent-vault vault proposal create --host api.example.com --auth-type bearer --token-key MY_KEY --credential MY_KEY --message "reason". - The new invocation reaches
buildFromFlags(cmd/proposal_create.go:164), which builds the auth config and then runs the--namecheck at line 194-197. cmd.Flags().GetString("name")returns""(the flag was never passed); buildFromFlags returns--name is required when --host is specified.- The proposal create aborts; the user sees an error about a flag that was not present in the example they just copied.
Impact
Documentation-only — no runtime regression, no data loss, no security risk. Severity is nit because the user gets a clear error and can recover by adding --name. But the printed examples are a copy-paste invitation and currently guide users straight into a contradiction with a contract this PR newly introduced. The Long help that was correctly updated in this PR already demonstrates the right shape (--name slack-bot --host 'slack.com/api/*'), so the fix is unambiguous.
How to fix
One-token change at line 86 — insert --name my-service before --host in the second example:
- agent-vault vault proposal create --host api.example.com --auth-type bearer --token-key MY_KEY --credential MY_KEY --message "reason"
+ agent-vault vault proposal create --name my-service --host api.example.com --auth-type bearer --token-key MY_KEY --credential MY_KEY --message "reason"This matches the convention the Long help block at lines 27-45 already uses for every --host example.
PR #164 advertised that legacy services without a Name field would be slug-backfilled lazily on read and that write-time callers could omit Name to have one derived from host+path. Neither path was implemented: loadServices returned legacy entries with Name="", and the UI's PUT-all-services edit flow then failed validation on the unedited siblings with "Invalid services: service N: name is required". Add broker.Slugify(host, path) and broker.AssignSlugNames(services, reserved), and call them at every ingress where an empty Name should heal: loadServices (read), handleServicesUpsert + handleServicesSet (write), and normalizeProposalServices for ActionSet entries. CLI flag-level --name requirement drops to match the server contract.
…heal legacy on read (#165) ## Summary Fixes the **"Invalid services: service N: name is required"** error users hit in the Edit Service sidebar after PR #164. The vault held two-plus services persisted before `name` became required; opening any one for edit and clicking Save failed validation on the unedited siblings. PR #164's commit message advertised auto-naming at write time + lazy heal on read, but neither path was implemented (`broker.Slugify` did not exist; `loadServices` only split inline hosts). After review iteration, this PR lands a tighter contract than the original auto-everywhere design: | Path | Behavior | | -------------------------- | ------------------------------------------------------------------------------------------------------- | | **Reads** (legacy heal) | `loadServices`, `/discover`, `credential.Inject` auto-slugify empty `Name` from `host`+`path` per read. | | **Writes** (new services) | `name` is **required**. Empty `name` with no host match → 400 "name is required". | | **Writes** (host-adoption) | Empty `name` whose `(host, path)` uniquely matches an existing service adopts that entry's name. | | **Proposal apply** | Stale `name` (e.g. existing service was renamed between create and apply) rebinds by `(host, path)`. | This unifies set/upsert with the existing `ActionDelete`-by-host pattern: agents that know the host but not the canonical name can still target an existing service, but the server never silently invents a name for a brand-new service. ## Implementation **Broker helpers** (`internal/broker/broker.go`) - `Slugify(host, path)` — deterministic, `ValidateSlug`-conformant id derivation. Strips wildcards, normalizes non-`[a-z0-9]` chars to `-`, collapses, trims, truncates to 64, pads `<3` chars. - `AssignSlugNames(services)` — read-path heal. Fills empty `Name` via `Slugify`+`DisambiguateSlug` (intra-slice disambiguation only). - `DisambiguateSlug(base, taken)` — lowest `base-<n>` (n≥2) not in taken, length-bounded. - `AssignSlugNamesAvoiding(services, existing)` — kept for existing-state-aware heal; reservation branch is now defensive (write paths use `adoptByHost` instead, see below). **Read-side heal — three ingress points** - `loadServices` (`handle_services.go`) — heals on every read; the slug becomes durable on the next write. - `handleDiscover` (`handle_discovery.go`) — agents identify services by `name` per skill docs; blank `Name` would make a service un-addressable from `/discover`. - `StoreCredentialProvider.Inject` (`brokercore/credential.go`) — `MatchedName` lands in the request log + `X-Vault-Service` header; the documented `?service=<name>` log filter depends on this. **Write-side adopt-by-host — server handlers** (`internal/server/handle_services.go`) - `adoptByHost(services, existing, rebindStale)` — single helper used by both writes and proposals. Empty `Name` adopts a unique `(Host, Path)` match in existing. When `rebindStale=true`, non-empty `Name` that misses existing's nameset is also rebound — proposals pass `true` to close the create→rename→apply race; direct admin upserts pass `false` so a caller-supplied name is never silently rewritten. - `handleServicesUpsert` (POST) → `adoptByHost(..., false)`. - `handleServicesSet` (PUT) → no adopt; PUT is replace-all, callers round-trip names from GET. - `normalizeProposalServices` → `adoptByHost(..., true)` for `ActionSet` entries with non-empty `Host`. Called at both proposal create AND proposal apply, so the rebind heals the rename race at apply time. **CLI + docs** - `cmd/service.go`, `cmd/proposal_create.go` — flag-help reflects the new contract: "Required for new services; may be omitted when `--host` uniquely matches an existing service." - `cmd/skill_http.md`, `cmd/skill_cli.md`, `docs/learn/services.mdx`, `docs/reference/cli.mdx`, `AGENTS.md` — same wording across every agent-facing surface. **Unrelated thread bundled here**: removal of the deprecated `AGENT_VAULT_SESSION_TOKEN` env-var alias from server, CLI, SDK, and docs (commit `2a4c7db`). Self-contained; the env var was already documented as deprecated. ## Test plan - [x] `go test ./...` green. - [x] `Slugify` / `AssignSlugNames` / `DisambiguateSlug` / `AssignSlugNamesAvoiding` unit tests in `internal/broker/broker_test.go`. - [x] **Rejection paths**: `TestServicesUpsertRejectsMissingNameForNewService`, `TestServicesUpsertEmptyNameDoesNotReplaceUnrelatedExisting`, `TestProposalCreateActionSetRejectsMissingNameForNewService`, `TestProposalCreateActionSetEmptyNameUnrelatedHostRejects` — all assert 400 with `name is required` instead of silent auto-slug-and-overwrite. - [x] **Adopt-by-host**: `TestServicesUpsertEmptyNameAdoptsExistingByHostPath`, `TestProposalCreateActionSetEmptyNameAdoptsExistingByHostPath` — empty `Name` + matching host → adopts the existing entry's name in place. - [x] **Create→rename→apply race**: `TestAdminProposalApplyRebindsStaleNameByHost` — admin renames `api-stripe-com` → `stripe-prod` between proposal create and approve; apply rebinds and updates `stripe-prod` instead of appending a ghost. - [x] **Read-path heal**: `TestDiscoverHealsLegacyUnnamedServices`, `TestInject_HealsLegacyUnnamedServiceMatchedName`, `TestLegacyUnnamedServicesGetSetRoundTrip` — pins heal on `/discover`, proxy hot path, and GET→modify-one→PUT. - [ ] Manually verify: open Edit Service sidebar on a legacy unnamed service in the running web UI, set a custom name, save — succeeds; refresh confirms all sibling services now show backfilled slugs. Closes the immediate user-reported bug from the post-PR-#164 release.
Summary
/api/*for the Bot token and/api/apps.connections.*for Socket Mode).namebecomes the canonical per-vault service identifier, fixing a long-standing weakness where two services on the same host couldn't be addressed independently.Highlights
broker.MatchService(host, path, services)replacesMatchHost. Returns(*Service, MatchScore)so the debug log can audit(host_tier, path_prefix_len, decl_order)without recomputing.DELETE/PATCH /v1/vaults/{vault}/services/{name-or-host}: tries name first, falls back to host, returns 409 with the candidate list on ambiguity. Single-rule callers see no behavior change.slack.com/api/*) is split server-side at every ingest path; CLI mirrors the split for--host.nameare slugged viabroker.Slugify(host, path)at write time. Legacy services are normalized lazily on read — no migration step, no manual config edits, no DB schema change.broker.ValidateSlugconsolidates the 3–64 lowercase-alphanumeric-hyphens rule that previously lived in a privatevalidateSlugshared by vault and agent name handlers.name?/path?onService, newremoveByName(name)method, existingremove(host)still works against the back-compat host-shim route.Breaking changes (operator-visible)
ProxyEvent.MatchedServicesemantic flip: previously stored the matched host string; now stores the canonical service name (slug). NewMatchedHost/MatchedPathfields carry the matched pattern alongside (in-memory only — no schema change). Documented incmd/skill_http.md. Operators filtering request logs by?service=should switch to the slug; pre-upgrade rows still contain hosts.Test plan
make test— full Go suite passes (only pre-existingTestParseAndValidateMount_RejectDockerSocketininternal/isolationfails, environment-dependent on a local docker socket; unrelated to this change)cd sdks/sdk-typescript && npm test(98/98)cd web && npx tsc --noEmitgo vet ./...cleanmake buildproduces a working binary end-to-end/, declaration-order tiebreak, port-stripping, empty-path catch-allname/pathround-trip andremoveByNameslack.com/api/*, one via separate Host + Path fields), toggle one off, confirm the other still proxies, list shows distinct rows keyed by nameagent-vault vault service add --name slack-bot --host slack.com --path '/api/*' …, then--name slack-conn --host slack.com --path '/api/apps.connections.*' …, send requests through the proxy and confirm the right credential is injected for each (visible invault rundebug logs asservice=slack-bot|slack-conn)