Skip to content

Add agent policy hooks enforcement#9957

Open
etherman-os wants to merge 26 commits intowarpdotdev:masterfrom
etherman-os:etherman-os/spec-agent-policy-hooks
Open

Add agent policy hooks enforcement#9957
etherman-os wants to merge 26 commits intowarpdotdev:masterfrom
etherman-os:etherman-os/spec-agent-policy-hooks

Conversation

@etherman-os
Copy link
Copy Markdown

@etherman-os etherman-os commented May 3, 2026

Description

Adds the product/tech spec plus a working Agent Policy Hooks implementation. Hooks are disabled by default in Agent Profiles, but when configured they run as host-side preflight before governed Warp-owned Agent actions execute.

The implementation covers shell command execution, writes to existing long-running shell commands, file reads, file write/code diff requests, MCP tool calls, and MCP resource reads. Warp remains the enforcement point: hook denials return policy-blocked action results before the underlying command, shell write, file operation, or MCP call starts. Run-to-completion decisions still pass through policy hooks.

Implemented in code

  • Disabled-by-default agent_policy_hooks field on AIExecutionProfile.
  • Serializable stdio and HTTP hook config with timeout, unavailable behavior, environment-variable credential references, HTTPS-only remote HTTP validation with localhost exception, rejection of embedded URL credentials even when hooks are disabled, rejection of credential-looking HTTP query/fragment values, rejection of raw token-shaped query/fragment values even behind innocuous keys, rejection of credential-looking stdio argv values, adjacent-argument rejection for split stdio credential forms such as --token secret, --client-secret secret, --refresh-token secret, --authorization "Bearer ...", and X-API-Key: secret, and rejection of partial literal credential bypasses such as secret$TOKEN while still allowing pure env references such as $TOKEN, ${TOKEN}, Bearer $TOKEN, or X-API-Key: $TOKEN.
  • Hook secret references are validated as environment variable names only; object-shaped or literal secret refs such as { "env": "sk-..." } or { "env": "Bearer ..." } are rejected and unsafe profile serialization falls back to the default disabled config.
  • Stable warp.agent_policy_hook.v1 event/response schema with redacted command payloads, redacted long-running shell write payloads, bounded path and MCP argument-key collections with omitted counts, and MCP argument keys only.
  • Policy events and preflight coverage for writes to existing long-running shell commands, including redacted/truncated input, write mode, block id, and stable policy-denied result handling.
  • Shell command and long-running shell write policy denials round-trip through API/persistence with explicit host-policy markers; unlabeled/generic errors remain cancelled instead of being reclassified as host-policy denials.
  • Policy-denied command results redact command text both when the result is created and again before agent inputs are sent through the API path.
  • File-edit hook ask decisions defer diff preprocessing and original-file reads until after the user confirms the policy prompt; confirmed or hook-autoapproved file edits preprocess and then retry execution without dropping the prior confirmation/autoapproval state.
  • Async preflight state in BlocklistAIActionExecutor with pending decision cache scoped by conversation id, action id, and redacted action payload; cancellation cleanup; retry event wiring; and cached hook-result recomposition against the current Warp permission snapshot on retry.
  • Conservative decision composition: hook deny wins, hook ask routes to confirmation, hard Warp denials are not upgraded, and hook allow only autoapproves Warp ask when explicitly configured, at least one hook is configured, every hook autoapproval condition passes, every hook evaluation is a successful allow response, and the cached hook saw the same Warp permission snapshot.
  • Explicit preservation of terminal Warp denials such as command denylist matches, protected file-write paths, and MCP denylist matches before hook autoapproval can run.
  • Stdio hook execution with JSON stdin/stdout, cleared child process environment, explicitly configured env references only, timeout, nonzero-exit handling, malformed-response fallback, bounded event input, bounded stdout, and truncated/redacted stderr.
  • HTTP hook execution with POST JSON body, event id idempotency header, redirect refusal, single timeout budget, status/error fallback, bounded event input, and bounded response body.
  • Local JSONL audit writer under secure state, with private Unix directory creation (0700), private audit file creation (0600), redacted action payload, hook results, effective decision, error class, reason, and external audit id.
  • Policy-blocked agent results for shell commands, long-running shell writes, file reads, file edits, MCP tools, and MCP resources. File-edit denials use RequestFileEditsResult::PolicyDenied rather than a generic diff-application failure.
  • Policy ask reason is carried into the blocked conversation status while preserving the existing confirmation flow.

Security notes

  • Long-running shell writes are governed by the same hook preflight before input reaches an existing shell or REPL; write input is redacted and truncated before event/audit serialization.
  • Shell command and long-running shell write policy denials use explicit markers when converted through the API shape, so generic unlabeled errors are not misclassified as host-policy blocks after reload/session-sharing.
  • Policy-denied command results do not retain raw token/API-key-like command text in the result object or in the agent input redaction pass.
  • File-edit policy denials survive API conversion/reload as RequestFileEditsResult::PolicyDenied rather than becoming generic diff failures.
  • File-edit hook ask decisions do not apply candidate diffs or load original file contents before the user sees and confirms the policy prompt.
  • File-edit retries after deferred preprocessing preserve the confirmation or hook autoapproval that allowed preprocessing to begin, so the action does not block on the same confirmation again.
  • Stdio hook credentials must use the persisted env secret-reference map; credential-looking literal args such as --token=..., --token ..., --client-secret ..., --refresh-token ..., --authorization "Bearer ...", API_KEY=..., clientSecret=..., Authorization: Bearer ..., split header forms such as X-API-Key: ..., partial literal/env hybrids such as secret$TOKEN, or bearer/basic literal fragments are rejected before validation/persistence.
  • HTTP hook credentials must use header env references; URL userinfo plus credential-looking query and fragment keys/values such as ?token=..., ?api_key=..., ?clientSecret=..., ?accessToken=..., ?refresh-token=..., #access_token=..., and raw token-shaped values such as ?q=sk-... or #state=ghp_... are rejected before validation/persistence.
  • Disabled or malformed HTTP hook URLs are also checked before persistence, including malformed http:/https: values with userinfo-like authority or token-shaped query/fragment suffixes.
  • Hook autoapproval requires a nonempty hook list and successful hook allow evaluations. Timeout, malformed response, invalid configuration, HTTP/stdio failure, oversized payload, or other unavailable fallback decisions cannot autoapprove a Warp ask even if on_unavailable: allow is configured.
  • Hook payloads and stdio hook child processes do not include file contents, full inherited environment variables, unbounded MCP arguments, command output, unredacted bearer/API-token-like command values, raw hook credentials, basic-auth credentials, URL-embedded credentials, unbounded path/key collections, or unbounded hook output by default.
  • Command redaction covers env-like secret assignments, bearer/basic Authorization headers, common token shapes, URL userinfo, split and inline secret flags such as --token secret, --password secret, --api-key secret, --client-secret secret, --refresh-token secret, --access-token secret, --clientSecret=secret, --authorization Bearer ..., and curl -u/--user/--proxy-user credentials before hook/audit/API payloads are serialized.
  • Hook credentials are persisted only as environment-variable references; raw inline stdio env/header secret values and object-shaped raw literal refs are rejected.
  • Credential-bearing fields are validated even when hook config is disabled, so inactive profile config cannot validate with URL-embedded credentials or token-shaped query/fragment values.
  • AgentPolicyHookConfig serialization runs safe-to-persist checks and sanitizes unsafe config to the default disabled config instead of panicking, so disabled/inactive hook config with persisted credentials cannot be written into local or cloud-synced execution profile JSON.
  • URL-embedded credential detection no longer depends only on successful URL parsing; malformed http:/https: values with userinfo-like authority or query/fragment credentials are rejected before profile storage.
  • Cached hook results are recomposed with the current Warp permission snapshot before reuse, and cached hook allow cannot autoapprove if that snapshot changed while the hook was pending.
  • Cached hook results are keyed by conversation id, action id, and redacted action payload; edited commands, changed shell-write input, or other changed action payloads cannot reuse a previous hook decision for the same action id.
  • A user click while a hook is pending is not treated as confirmation for a later hook ask; the completed hook reason must be shown in the confirmation flow.
  • Policy event JSON is serialized through a capped writer before stdio/HTTP dispatch. Oversized events return the configured unavailable decision and are not sent to hooks.
  • Model-controlled path lists and MCP argument-key lists are capped with omitted counts; individual command/resource/path/key strings are redacted/truncated before serialization.
  • Configured hook secrets are redacted from hook-returned reasons and external audit ids by full value and by bearer/basic credential fragment when hooks echo only the token portion.
  • Remote HTTP hook URLs must use HTTPS unless targeting localhost, must not include embedded username/password credentials, and redirects are not followed.
  • Hook stdout/HTTP response bodies are capped; stderr is truncated and configured secret values are redacted before inclusion in failure detail.
  • Hook timeout/crash/malformed response/oversized event maps to configured unavailable behavior, defaulting to ask.
  • Audit directories are created with private Unix permissions at creation time, and audit records are redacted JSONL written with private Unix file permissions.
  • cargo audit and cargo deny check advisories were run. Both fail on existing Cargo.lock baseline advisories; this PR does not modify Cargo.toml or Cargo.lock. Current reported baseline items include rustls-webpki RUSTSEC-2026-0098/0099/0104, local crate-name collision on warp RUSTSEC-2022-0082, websocket RUSTSEC-2022-0035, plus existing diesel, rand, bincode, core2, and safemem warnings/advisories.

Follow-ups

  • Add dedicated settings UI for editing hook config instead of relying on the serialized Agent Profile field.
  • Add broader end-to-end UI/integration coverage with a real terminal/MCP peer harness.

Linked Issue

N/A - no active linked issue. This PR is a standalone implementation.

Screenshots / Videos

N/A - no new visual UI surface.

Testing

  • cargo fmt --all --check
  • git diff --check
  • cargo test -p ai --lib (157 passed)
  • CARGO_BUILD_JOBS=1 cargo test -p warp policy_hooks --lib (61 passed)
  • CARGO_BUILD_JOBS=1 cargo test -p warp redact_inputs_redacts_policy_denied_command_result_command --lib (1 passed)
  • CARGO_BUILD_JOBS=1 cargo test -p warp test_convert_tool_call_result_to_input --lib (10 passed)
  • CARGO_BUILD_JOBS=1 cargo check -p warp --features agent_mode_evals --lib
  • CARGO_BUILD_JOBS=1 cargo test -p warp terminal::model::grid::grid_handler::secrets::tests::test_bytes_processed_for_secrets_after_turning_redaction --lib -- --nocapture (4 passed; isolates the secret-redaction tests that fail only in the full-suite order/environment)
  • CARGO_BUILD_JOBS=1 cargo test -p warp --lib (3801 passed, 13 failed, 7 ignored. All ai::policy_hooks tests pass in the full suite, including stdio_engine_does_not_inherit_parent_environment. Remaining failures are outside this PR surface in this local environment: telemetry/settings/bootstrap/terminal view/input/workspace tests; the secret-redaction failures pass when isolated as noted above.)
  • cargo audit (run; fails on existing Cargo.lock baseline advisories, no dependency file changes in this PR; 5 vulnerabilities and 15 allowed warnings reported)
  • cargo deny check advisories (run; fails on existing Cargo.lock baseline advisories, no dependency file changes in this PR; reported diesel, rand, and rustls-webpki advisories)

Agent Mode

  • Warp Agent Mode - This PR was created via Warp Agent Mode

@cla-bot cla-bot Bot added the cla-signed label May 3, 2026
@github-actions github-actions Bot added the external-contributor Indicates that a PR has been opened by someone outside the Warp team. label May 3, 2026
@etherman-os etherman-os changed the title Spec Agent Policy Hooks for governed autonomous actions Add agent policy hooks spec and preflight scaffold May 3, 2026
@etherman-os etherman-os changed the title Add agent policy hooks spec and preflight scaffold Add agent policy hooks enforcement May 3, 2026
@etherman-os etherman-os marked this pull request as ready for review May 3, 2026 04:02
@oz-for-oss
Copy link
Copy Markdown
Contributor

oz-for-oss Bot commented May 3, 2026

@etherman-os

I'm starting a first review of this pull request.

You can view the conversation on Warp.

I completed the review and no human review was requested for this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Copy link
Copy Markdown
Contributor

@oz-for-oss oz-for-oss Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overview

This PR adds host-enforced agent policy hooks across command, file, and MCP action surfaces, plus specs and audit/event plumbing. The enforcement direction is sound, but several implementation details need fixes before merge.

Concerns

  • Hook stdout/stderr and HTTP response bodies are buffered before size checks, so the advertised bounds do not prevent memory exhaustion.
  • Stdio hook stderr is included in policy failure details without secret redaction.
  • Write-file policy events report zero additions/deletions, which can mislead external policy decisions.
  • API conversion collapses policy denials into denylist results and drops the policy reason.

Security

  • A faulty or malicious configured hook can return unbounded stdio/HTTP output and exhaust memory before the current caps are checked.
  • Hook stderr can contain secrets from configured env values and is propagated after truncation but without redaction.

Verdict

Found: 0 critical, 5 important, 0 suggestions

Request changes

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Comment thread app/src/ai/policy_hooks/engine.rs Outdated

let timeout = Duration::from_millis(self.config.hook_timeout_ms(hook));
let output = child
.output()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] [SECURITY] child.output() buffers all stdout and stderr before MAX_HOOK_STDOUT_BYTES is checked, so a noisy hook can exhaust memory; read the pipes with an enforced byte cap and kill the child when the cap is exceeded.

Comment thread app/src/ai/policy_hooks/engine.rs Outdated
detail: format!(
"policy hook exited with {}; stderr={}",
output.status,
truncate_for_policy(stderr.trim())
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] [SECURITY] Truncating stderr is not enough here because hooks receive secret env values and can echo them on failure; redact stderr before storing it in the failure detail/audit reason.

Comment thread app/src/ai/policy_hooks/engine.rs Outdated
}

let response_bytes = response
.bytes()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] [SECURITY] response.bytes() buffers the entire HTTP response before the size check, so a remote hook can still cause unbounded memory use; stream the body with a hard cap or reject by content_length before buffering.

.collect::<Vec<_>>();
let diff_stats = PolicyDiffStats {
files_changed: paths.len(),
additions: 0,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] Write-file policy events always report zero additions/deletions, so hooks that enforce diff-size or destructive-change policies receive misleading data; compute the real diff stats or omit the field until accurate.

}
RequestCommandOutputResult::Denylisted { command } =>
RequestCommandOutputResult::Denylisted { command }
| RequestCommandOutputResult::PolicyDenied { command, .. } =>
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] Converting PolicyDenied as a denylist result drops the hook reason and mislabels the denial for API clients; add/preserve a policy-denied result so agents can distinguish host policy blocks and avoid blind retries.

Copy link
Copy Markdown
Contributor

@oz-for-oss oz-for-oss Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overview

This PR adds host-enforced agent policy hooks across command, file, and MCP action surfaces, plus specs and audit/event plumbing. The enforcement direction is sound, but several implementation details need fixes before merge.

Concerns

  • Hook stdout/stderr and HTTP response bodies are buffered before size checks, so the advertised bounds do not prevent memory exhaustion.
  • Stdio hook stderr is included in policy failure details without secret redaction.
  • Write-file policy events report zero additions/deletions, which can mislead external policy decisions.
  • API conversion collapses policy denials into denylist results and drops the policy reason.

Security

  • A faulty or malicious configured hook can return unbounded stdio/HTTP output and exhaust memory before the current caps are checked.
  • Hook stderr can contain secrets from configured env values and is propagated after truncation but without redaction.

Verdict

Found: 0 critical, 5 important, 0 suggestions

Request changes

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Comment thread app/src/ai/policy_hooks/engine.rs Outdated

let timeout = Duration::from_millis(self.config.hook_timeout_ms(hook));
let output = child
.output()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] [SECURITY] child.output() buffers all stdout and stderr before MAX_HOOK_STDOUT_BYTES is checked, so a noisy hook can exhaust memory; read the pipes with an enforced byte cap and kill the child when the cap is exceeded.

Comment thread app/src/ai/policy_hooks/engine.rs Outdated
detail: format!(
"policy hook exited with {}; stderr={}",
output.status,
truncate_for_policy(stderr.trim())
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] [SECURITY] Truncating stderr is not enough here because hooks receive secret env values and can echo them on failure; redact stderr before storing it in the failure detail/audit reason.

Comment thread app/src/ai/policy_hooks/engine.rs Outdated
}

let response_bytes = response
.bytes()
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] [SECURITY] response.bytes() buffers the entire HTTP response before the size check, so a remote hook can still cause unbounded memory use; stream the body with a hard cap or reject by content_length before buffering.

.collect::<Vec<_>>();
let diff_stats = PolicyDiffStats {
files_changed: paths.len(),
additions: 0,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] Write-file policy events always report zero additions/deletions, so hooks that enforce diff-size or destructive-change policies receive misleading data; compute the real diff stats or omit the field until accurate.

}
RequestCommandOutputResult::Denylisted { command } =>
RequestCommandOutputResult::Denylisted { command }
| RequestCommandOutputResult::PolicyDenied { command, .. } =>
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] Converting PolicyDenied as a denylist result drops the hook reason and mislabels the denial for API clients; add/preserve a policy-denied result so agents can distinguish host policy blocks and avoid blind retries.

@etherman-os
Copy link
Copy Markdown
Author

Addressed the Oz review findings in d5d6be5:

  • Replaced stdio child.output() buffering with capped concurrent stdout/stderr reads and process kill on timeout or cap failure.
  • Streamed HTTP hook responses with a content-length/chunk cap before buffering.
  • Redacted configured stdio hook secret env values from stderr before surfacing failure details/audit reasons.
  • Stopped emitting misleading zero diff_stats for write-file policy events when accurate stats are unavailable.
  • Stopped labeling host policy denials as denylisted commands; the current MAA schema has no host-policy reason variant, so PermissionDenied.reason is left unset and the host policy reason is preserved in the result output/round-trip conversion.

Validation run locally:

  • cargo fmt --all --check
  • git diff --check
  • cargo test -p warp policy_hooks --lib
  • cargo test -p warp test_convert_tool_call_result_to_input_preserves_host_policy_denial --lib
  • cargo test -p ai --lib

Note: cargo audit and cargo deny check advisories could not run locally because those cargo subcommands are not installed in this environment.

/oz-review

@oz-for-oss
Copy link
Copy Markdown
Contributor

oz-for-oss Bot commented May 3, 2026

@etherman-os

I'm re-reviewing this pull request in response to a review request.

You can view the conversation on Warp.

I completed the review and no human review was requested for this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Copy link
Copy Markdown
Contributor

@oz-for-oss oz-for-oss Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overview

This PR adds host-enforced Agent Policy Hooks across shell, file, MCP, conversion, and audit paths, plus product and technical specs. The implementation is close, but the preflight lifecycle and hook I/O handling have correctness/security issues that should be fixed before merge.

Concerns

  • Cached ask decisions are consumed before user confirmation, causing confirmed actions to run a second hook preflight with a new event id and duplicate external side effects/audit entries.
  • Stdio hook timeouts do not cover stdin writes, so a hook that stops reading can leave the governed action stuck indefinitely.

Security

  • Hook-provided reason strings are propagated into audit/conversation surfaces without redaction or truncation.

Verdict

Found: 0 critical, 3 important, 0 suggestions

Request changes

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

return None;
}

if let Some(decision) = self.completed_policy_preflights.remove(&action.id) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] Consuming the cached Ask decision here means the subsequent user-confirmed retry starts a brand-new hook preflight with a new event id, so an ask result produces duplicate hook calls/audit records and can re-trigger external approval flows. Keep the decision cached through confirmation (or mark it confirmed) until the action executes/cancels.

Comment thread app/src/ai/policy_hooks/engine.rs Outdated
});
};

stdin
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] The stdio timeout starts after both stdin writes finish, so a hook that stops reading stdin can hang write_all indefinitely and leave the action stuck. Wrap the entire spawn/write/read/wait sequence in the configured timeout (or separately time-bound stdin writes) and kill the child on expiry.

Comment thread app/src/ai/policy_hooks/decision.rs Outdated
Self {
hook_name: hook_name.into(),
decision: response.decision,
reason: response.reason,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] [SECURITY] Hook response reasons are stored, audited, and shown to the agent without redaction/truncation, so a misconfigured hook that echoes tokens or sensitive policy context leaks them into conversation/audit data. Sanitize and cap response.reason before persisting it.

@wiserautomation

This comment was marked as spam.

@wiserautomation

This comment was marked as spam.

@etherman-os
Copy link
Copy Markdown
Author

Addressed the second Oz review in a7aa47f:

  • Kept cached ask decisions through the confirmation step, so a user-confirmed retry consumes the original decision instead of starting a second hook preflight/event/audit flow.
  • Moved stdio hook stdin writes inside the configured timeout window and kill the child when the lifecycle times out, including blocked stdin writes.
  • Sanitized and capped hook-provided reason, external_audit_id, and hook names before they enter audit/conversation/result surfaces.
  • Redacted configured stdio env secrets and HTTP header secrets from hook response reasons, including Bearer <token> header values where the hook echoes only the raw token.
  • Removed reqwest URLs from HTTP transport error details before surfacing unavailable-hook reasons.

Additional local audit pass focused on policy-hook ingress/egress surfaces: hook stdout/stderr, HTTP response bodies, hook reasons, external audit ids, unavailable details, API conversion, conversation YAML, SDK output, and preflight cache cleanup/cancellation paths.

Validation run locally after this commit:

  • cargo fmt --all --check
  • git diff --check
  • cargo test -p warp policy_hooks --lib
  • cargo test -p warp test_convert_tool_call_result_to_input_preserves_host_policy_denial --lib
  • cargo test -p ai --lib

Dependency advisory tools are still unavailable in this environment:

  • cargo audit -> cargo subcommand not installed
  • cargo deny check advisories -> cargo subcommand not installed

/oz-review

@oz-for-oss
Copy link
Copy Markdown
Contributor

oz-for-oss Bot commented May 3, 2026

@etherman-os

I'm re-reviewing this pull request in response to a review request.

You can view the conversation on Warp.

I completed the review and no human review was requested for this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Copy link
Copy Markdown
Contributor

@oz-for-oss oz-for-oss Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overview

This PR adds agent policy hook configuration, event/decision plumbing, hook execution transports, audit logging, and executor preflight enforcement for governed agent actions.

Concerns

  • Remote HTTP hooks use the default reqwest redirect policy, which can bypass the HTTPS/localhost validation after the initial URL check.
  • Command redaction only captures unquoted whitespace-free assignment values, so quoted secret values can be partially sent to hooks and audit records.

Security

  • The redirect behavior can downgrade or reroute policy event delivery, including configured hook headers, outside the validated endpoint constraints.
  • The redaction regex can leak parts of quoted secret assignments in command payloads.

Verdict

Found: 0 critical, 2 important, 0 suggestions

Request changes

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Comment thread app/src/ai/policy_hooks/engine.rs Outdated
detail: format!("failed to serialize policy event: {source}"),
})?;

let client = reqwest::Client::new();
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] [SECURITY] Client::new() follows redirects by default, so an HTTPS hook can redirect to plaintext/non-local HTTP and bypass the URL validation. Disable redirects or validate every redirect target before sending the policy event and configured headers.

Comment thread app/src/ai/policy_hooks/redaction.rs Outdated

static SECRET_ASSIGNMENT_RE: Lazy<Regex> = Lazy::new(|| {
Regex::new(
r"(?i)\b([A-Z0-9_.-]*(?:TOKEN|SECRET|PASSWORD|PASSWD|API[_-]?KEY|ACCESS[_-]?KEY)[A-Z0-9_.-]*)=([^\s;&|]+)",
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] [SECURITY] This only redacts until whitespace, so quoted assignments can leak trailing secret content; for example PASSWORD='foo bar' becomes PASSWORD=<redacted> bar' before being sent to hooks/audit. Redact complete quoted assignment values or use shell-aware parsing.

@etherman-os
Copy link
Copy Markdown
Author

Follow-up for the latest Oz findings, pushed in e31aec2:

  • Disabled redirects for HTTP policy hooks with reqwest::redirect::Policy::none(), so configured hook headers and event payloads are not replayed to a redirected endpoint after the initial URL validation.
  • Added http_engine_does_not_follow_redirects, which asserts a 307 response is treated as hook unavailability instead of being followed.
  • Expanded command secret-assignment redaction to cover double-quoted, single-quoted, escaped-quote, and unterminated quoted values.
  • Added command_redaction_handles_quoted_secret_assignments to ensure quoted secret values are not leaked to hook/audit payloads.
  • Fixed the agent_mode_evals profile initializer for the new agent_policy_hooks field and made the test-only event helper #[cfg(test)] to keep feature checks clean.

Verification run locally:

  • cargo fmt --all --check
  • git diff --check
  • cargo test -p warp policy_hooks --lib
  • cargo test -p warp test_convert_tool_call_result_to_input_preserves_host_policy_denial --lib
  • cargo test -p ai --lib
  • cargo check -p warp --features agent_mode_evals --lib

/oz-review

@oz-for-oss
Copy link
Copy Markdown
Contributor

oz-for-oss Bot commented May 3, 2026

@etherman-os

I'm re-reviewing this pull request in response to a review request.

You can view the conversation on Warp.

I completed the review and no human review was requested for this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

@wiserautomation

This comment was marked as spam.

Copy link
Copy Markdown
Contributor

@oz-for-oss oz-for-oss Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overview

This PR adds policy hook configuration, event/decision plumbing, hook execution, audit output, and enforcement integration for shell, file, and MCP agent actions.

Concerns

  • Policy preflight is invoked from try_to_execute_action, after RequestFileEdits preprocessing has already generated candidate diffs and read file contents, so a deny cannot prevent that governed work from starting.
  • The configured hook autoapproval path returns Allowed but then falls through to the existing needs_confirmation check, so it still prompts instead of executing.

Security

  • Hook env/header secret values serialize transparently into the execution profile, which is a synced JSON cloud object; configured tokens can be uploaded in raw form.

Verdict

Found: 0 critical, 3 important, 0 suggestions

Request changes

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

|| can_auto_execute
|| (is_agent_autonomous && action.action.is_request_command_output()));
#[cfg(not(target_family = "wasm"))]
if let Some(preflight_state) = self.start_policy_preflight_if_needed(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] This preflight runs after queue_actions has awaited executor preprocessing; RequestFileEdits::preprocess_action already applies/builds candidate diffs and reads file contents before a hook denial can happen. Move policy preflight before governed preprocessing, or defer diff preprocessing until the hook allows/asks is confirmed.


return TryExecuteResult::ExecutedSync;
}
PolicyPreflightState::Allowed => {}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] A hook allow with allow_hook_autoapproval does not autoapprove because this branch falls through to the original if needs_confirmation block below. Carry an autoapproval override or skip that confirmation branch when the effective policy decision explicitly allowed it.

Comment thread app/src/ai/policy_hooks/config.rs Outdated
}

#[derive(Clone, PartialEq, Eq, Serialize, Deserialize)]
#[serde(transparent)]
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] [SECURITY] #[serde(transparent)] serializes hook env/header secrets as raw strings, and execution profiles are synced JSON cloud objects. Store secret references or keep these values in a private local secret store instead of embedding tokens in the profile payload.

@etherman-os
Copy link
Copy Markdown
Author

Follow-up for the latest Oz review, pushed in a83865e:

  • Moved RequestFileEdits policy preflight ahead of file-edit diff preprocessing when policy hooks are active, so a hook deny prevents the governed diff generation/file-read work from starting.
  • Carried hook autoapproval through PolicyPreflightState::Allowed { skip_confirmation: true }, so an effective hook allow over a Warp ask no longer falls through to the normal confirmation prompt.
  • Stopped serializing hook env and HTTP headers secret maps into execution profile JSON while still allowing local deserialization, preventing raw hook tokens from being uploaded with synced profiles.
  • Added regression coverage for hook autoapproval and secret serialization.

Verification run locally:

  • cargo fmt --all --check
  • git diff --check
  • cargo test -p warp policy_hooks --lib
  • cargo test -p warp hook_autoapproval_skips_warp_confirmation --lib
  • cargo test -p warp test_convert_tool_call_result_to_input_preserves_host_policy_denial --lib
  • cargo test -p ai --lib
  • cargo check -p warp --features agent_mode_evals --lib

@etherman-os
Copy link
Copy Markdown
Author

/oz-review

Copy link
Copy Markdown
Contributor

@oz-for-oss oz-for-oss Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overview

This PR adds the agent policy hook configuration, event schema, hook engine, audit logging, and action-executor enforcement paths. The implementation covers the main governed surfaces, but I found two correctness issues that should be fixed before merge.

Concerns

  • Policy hooks are bypassed for long-running shell writes because unsupported action kinds fall through without starting a preflight.
  • File-edit policy denials lose the new stable PolicyDenied result when converted through the API shape, so reloaded/round-tripped results become generic diff failures.

Security

  • Long-running shell writes can send arbitrary text to an existing shell/REPL without invoking the configured host policy hook, leaving a bypass around the intended shell-action enforcement.

Verdict

Found: 0 critical, 2 important, 0 suggestions

Request changes

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

} => Some(AgentPolicyAction::ReadMcpResource(
PolicyReadMcpResourceAction::new(*server_id, name.clone(), uri.clone()),
)),
_ => None,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] [SECURITY] WriteToLongRunningShellCommand and other shell-write actions fall through here, so configured hooks are never run before text is sent to an existing shell/REPL. Add policy coverage for PTY writes or explicitly deny them while hooks are active so shell execution cannot bypass host policy.

},
),
),
RequestFileEditsResult::PolicyDenied { reason } => Ok(
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] Converting PolicyDenied to a generic ApplyFileDiffs::Error loses the stable policy-denied variant on API/persistence round trips; the loader maps this back to DiffApplicationFailed. Preserve a machine-readable policy denial, or add inverse parsing like the shell-command path, so agents and restored conversations do not treat host-policy blocks as diff failures.

@etherman-os
Copy link
Copy Markdown
Author

/oz-review

Latest fixes pushed in aa0a1b6:

  • Added policy hook coverage for WriteToLongRunningShellCommand, including redacted/truncated write input in the policy event and PolicyDenied handling before input reaches the existing shell.
  • Preserved RequestFileEditsResult::PolicyDenied across API conversion/reload instead of round-tripping into a generic diff failure.
  • Added regression coverage for shell-write policy action generation/redaction, shell-write policy denial, file-edit policy denial round-trip, and shell-write policy denial reload handling.

Verification:

  • cargo fmt --all --check
  • git diff --check
  • cargo test -p ai --lib (157 passed)
  • CARGO_BUILD_JOBS=1 cargo test -p warp policy_hooks --lib (51 passed)
  • CARGO_BUILD_JOBS=1 cargo test -p warp test_convert_tool_call_result_to_input_preserves --lib (3 passed)
  • CARGO_BUILD_JOBS=1 cargo check -p warp --features agent_mode_evals --lib
  • cargo audit and cargo deny check advisories were run; both still fail on existing Cargo.lock baseline advisories and this PR does not modify dependency files.

@oz-for-oss
Copy link
Copy Markdown
Contributor

oz-for-oss Bot commented May 3, 2026

@etherman-os

I'm re-reviewing this pull request in response to a review request.

You can view the conversation on Warp.

I completed the review and no human review was requested for this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

@moirahuang
Copy link
Copy Markdown
Contributor

Could a maintainer please minimize the unrelated promotional comments above as off-topic/spam? They are not part of this PR review and may distract from the implementation discussion. @moirahuang

@etherman-os I just reported them as spam!

Copy link
Copy Markdown
Contributor

@oz-for-oss oz-for-oss Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overview

This PR adds product/tech specs plus host-enforced Agent Policy Hooks for shell commands, file operations, MCP calls, hook transports, redaction, auditing, and result plumbing.

Security

  • Hook autoapproval can currently treat fail-open/unavailable fallback decisions as successful hook approval, allowing a timed-out or malformed hook to bypass Warp confirmation when on_unavailable: allow and autoapproval are configured.
  • Empty hook lists are treated as autoapproval-capable during cached recomposition because Iterator::all returns true for an empty iterator, which compounds invalid fail-open configs.

Concerns

  • See inline comments for the two autoapproval bypass paths that should be fixed before merge.

Verdict

Found: 0 critical, 2 important, 0 suggestions

Request changes

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Comment thread app/src/ai/policy_hooks/decision.rs Outdated
warp_permission,
hook_results,
},
WarpPermissionDecisionKind::Ask if allow_hook_autoapproval && !hook_results.is_empty() => {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] [SECURITY] This autoapproves a Warp Ask for any non-empty allow result set, including synthetic Allow evaluations produced by timeout/malformed/unavailable fallback (error: Some(_)). With on_unavailable: allow, an unavailable hook can bypass the confirmation UI; only autoapprove when the allow came from successful hook responses.

Comment thread app/src/ai/policy_hooks/config.rs Outdated

pub(crate) fn allow_autoapproval_for_all_hooks(&self) -> bool {
self.allow_hook_autoapproval
|| self
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] [SECURITY] Iterator::all returns true for an empty hook list, so an enabled-but-invalid config can be treated as autoapproval-capable during cached recomposition. Require at least one configured hook before applying per-hook autoapproval.

@etherman-os
Copy link
Copy Markdown
Author

/oz-review

Latest fixes pushed in d5086da:

  • Disabled hook autoapproval for empty hook lists, so enabled-but-invalid configs cannot become autoapproval-capable through Iterator::all on an empty list.
  • Restricted Warp ask autoapproval to successful hook allow evaluations only; unavailable/fallback allow results with error: Some(_) now leave Warp confirmation intact.
  • Added regression coverage for empty hook-list autoapproval, nonempty hook autoapproval, and unavailable-allow composition.

Verification:

  • cargo fmt --all --check
  • git diff --check
  • CARGO_BUILD_JOBS=1 cargo test -p warp policy_hooks --lib (54 passed)
  • CARGO_BUILD_JOBS=1 cargo check -p warp --features agent_mode_evals --lib
  • Prior full checks on this head series: cargo test -p ai --lib (157 passed), CARGO_BUILD_JOBS=1 cargo test -p warp test_convert_tool_call_result_to_input_preserves --lib (3 passed), cargo audit, and cargo deny check advisories with only existing dependency baseline failures and no dependency file changes.

@oz-for-oss
Copy link
Copy Markdown
Contributor

oz-for-oss Bot commented May 3, 2026

@etherman-os

I'm re-reviewing this pull request in response to a review request.

You can view the conversation on Warp.

I completed the review and no human review was requested for this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Copy link
Copy Markdown
Contributor

@oz-for-oss oz-for-oss Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overview

This PR adds profile-scoped agent policy hooks, policy-event schemas, async preflight enforcement, audit records, and policy-denied result plumbing for shell, file, and MCP actions.

Concerns

  • File-edit hook ask decisions still allow diff preprocessing to run before the user confirms, which can read original file contents and prepare candidate diffs before the policy prompt is shown.

Security

  • Stdio hook arguments can persist secrets and expose them through process argv.
  • HTTP hook URLs reject userinfo but still allow token-like query or fragment credentials to be persisted in the profile.

Verdict

Found: 0 critical, 3 important, 0 suggestions

Request changes

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

return;
}

if denied {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] A hook ask still falls through to request_file_edits_executor.preprocess_action, which applies candidate diffs and loads original file contents before the user sees the policy confirmation; defer diff preprocessing until after confirmation.

Stdio {
command: String,
#[serde(default)]
args: Vec<String>,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] [SECURITY] args are serialized in the profile and later passed as process argv, so hook credentials placed here are persisted and exposed in process listings; reject credential-looking args or require secrets to use the env map.

match self {
Self::Stdio { env, .. } => validate_secret_value_map(env)?,
Self::Http { url, headers } => {
if http_url_contains_credentials(url) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] [SECURITY] This only rejects URL userinfo, so https://policy.example/hook?token=... is still accepted and persisted; reject credential-looking query/fragment values or require credentials through header env refs.

@moirahuang
Copy link
Copy Markdown
Contributor

Could a maintainer please minimize the unrelated promotional comments above as off-topic/spam? They are not part of this PR review and may distract from the implementation discussion. @moirahuang

@etherman-os I just reported them as spam!

@etherman-os and one of our repo owners has now blocked them

@etherman-os
Copy link
Copy Markdown
Author

/oz-review

Latest fixes pushed in 667e9fd:

  • Deferred file-edit diff preprocessing/original-file reads when the policy hook returns ask; preprocessing now starts only after user confirmation and then retries execution.
  • Rejected credential-looking stdio hook args before validation/persistence; credentials should use the hook env secret-reference map.
  • Rejected credential-looking HTTP URL query/fragment values before validation/persistence; HTTP credentials should use header env refs.
  • Added regression coverage for file-edit policy ask preprocessing deferral, stdio arg credential rejection, and HTTP query/fragment credential rejection.

Verification:

  • cargo fmt --all --check
  • git diff --check
  • CARGO_BUILD_JOBS=1 cargo test -p warp policy_hooks --lib (56 passed)
  • CARGO_BUILD_JOBS=1 cargo check -p warp --features agent_mode_evals --lib
  • Prior full checks on this head series: cargo test -p ai --lib (157 passed), CARGO_BUILD_JOBS=1 cargo test -p warp test_convert_tool_call_result_to_input_preserves --lib (3 passed), cargo audit, and cargo deny check advisories with only existing dependency baseline failures and no dependency file changes.

@oz-for-oss
Copy link
Copy Markdown
Contributor

oz-for-oss Bot commented May 3, 2026

@etherman-os

I'm re-reviewing this pull request in response to a review request.

You can view the conversation on Warp.

I completed the review and no human review was requested for this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Copy link
Copy Markdown
Contributor

@oz-for-oss oz-for-oss Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overview

This PR adds specs and a host-side Agent Policy Hooks implementation across shell, file, and MCP action surfaces.

Concerns

  • File-edit actions that pass policy but need deferred preprocessing can lose the prior user confirmation or hook autoapproval on retry, causing the same action to block again instead of executing.

Verdict

Found: 0 critical, 1 important, 0 suggestions

Request changes

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

.remove(&(conversation_id, action.id.clone()))
{
return Some(PolicyPreflightState::Allowed {
skip_confirmation: false,
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] This retry path drops the confirmation or hook autoapproval that already allowed the file edit to start preprocessing. start_request_file_edits_preprocess_if_needed returns NotReady, then the retry is no longer user-initiated; with skip_confirmation: false, the later needs_confirmation branch blocks on the same confirmation again instead of executing. Preserve the confirmed/autoapproved state through the preprocessing retry.

Suggested change
skip_confirmation: false,
skip_confirmation: true,

@etherman-os
Copy link
Copy Markdown
Author

/oz-review

@oz-for-oss
Copy link
Copy Markdown
Contributor

oz-for-oss Bot commented May 3, 2026

@etherman-os

I'm re-reviewing this pull request in response to a review request.

You can view the conversation on Warp.

I completed the review and no human review was requested for this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Copy link
Copy Markdown
Contributor

@oz-for-oss oz-for-oss Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overview

This PR adds specs and a host-side Agent Policy Hooks implementation covering profile config, async action preflight, stdio/HTTP transports, redacted policy events, audit logging, and policy-denied result plumbing.

Concerns

  • [SECURITY] Stdio hook arg validation can still persist raw secrets supplied as split flag/value args, so credential-bearing profile JSON can slip through validation.
  • Long-running shell write result conversion now treats every unlabeled shell write error as a host-policy denial, which can misclassify older or generic API payloads after reload/session-sharing.

Security

  • Stdio hook configuration needs adjacent-argument credential detection before persistence.

Verdict

Found: 0 critical, 2 important, 0 suggestions

Request changes

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

}

fn validate_stdio_args(args: &[String]) -> Result<(), AgentPolicyHookConfigError> {
if args.iter().any(|arg| stdio_arg_contains_credentials(arg)) {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] [SECURITY] Validating each argv element independently still allows raw secrets in common split forms like ["--token", "secret"] or ["--authorization", "Bearer ..."]. Track sensitive option names across adjacent args and reject non-env-reference values there too.

r#type: Some(api::shell_command_error::Type::CommandNotFound(()))
})) => WriteToLongRunningShellCommandResult::Error(ShellCommandError::BlockNotFound),
Some(api::write_to_long_running_shell_command_result::Result::Error(_)) | None => WriteToLongRunningShellCommandResult::Cancelled,
Some(api::write_to_long_running_shell_command_result::Result::Error(api::ShellCommandError{
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] This treats every ShellCommandError { type: None } as a policy denial. Older or generic unlabeled errors were previously cancelled, so API reload/session-sharing can now report unrelated failures as host-policy blocks; add an explicit policy marker/reason before mapping this to PolicyDenied.

@etherman-os
Copy link
Copy Markdown
Author

@oz-for-oss /oz-review

Latest fixes pushed in f81b9cc:

  • Reject split stdio credential args such as --token secret, --api-key secret, and --authorization "Bearer ..." before validation/persistence, while still allowing env-reference values such as $API_TOKEN and Bearer $POLICY_TOKEN.
  • Preserve long-running shell write policy denials through API/persistence with an explicit host-policy marker, and keep unlabeled/generic shell-write errors as Cancelled instead of reclassifying them as policy denials.
  • PR description now has no #9914/SupraWall issue reference and marks the PR as standalone.

Verification:

  • cargo fmt --all --check
  • git diff --check
  • cargo test -p ai --lib (157 passed)
  • CARGO_BUILD_JOBS=1 cargo test -p warp policy_hooks --lib (58 passed)
  • CARGO_BUILD_JOBS=1 cargo test -p warp test_convert_tool_call_result_to_input --lib (9 passed)
  • CARGO_BUILD_JOBS=1 cargo check -p warp --features agent_mode_evals --lib

@oz-for-oss
Copy link
Copy Markdown
Contributor

oz-for-oss Bot commented May 3, 2026

@etherman-os

I'm re-reviewing this pull request in response to a review request.

You can view the conversation on Warp.

I completed the review and no human review was requested for this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Copy link
Copy Markdown
Contributor

@oz-for-oss oz-for-oss Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overview

This PR adds Agent Policy Hooks across Warp-owned agent action surfaces, including profile configuration, async preflight execution, policy decision composition, API round-tripping, audit logging, and redaction.

Concerns

  • Command-result reload logic can misclassify legacy or generic permission-denied command results as host-policy denials when the expected marker prefix is absent.

Security

  • Command redaction for policy hook and audit payloads misses common split-argument secret forms, which can expose model-proposed credentials to external hooks and local audit logs.

Verdict

Found: 0 critical, 2 important, 0 suggestions

Request changes

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

let output = result.output.as_str();
let reason = output
.strip_prefix("Command blocked by host policy: ")
.unwrap_or(output);
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] This treats any legacy PermissionDenied { reason: None } with non-empty output as a host-policy denial. Require the Command blocked by host policy: prefix before producing PolicyDenied; otherwise older or generic permission-denied results will be reclassified after reload and fed back to the agent as policy blocks.

redact_sensitive_text_for_policy(command)
}

pub(crate) fn redact_sensitive_text_for_policy(value: &str) -> String {
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] [SECURITY] Command payload redaction does not cover adjacent secret flags such as --token secret, --password secret, or --api-key secret, so those values can be sent to external hooks and written to audit JSONL. Add split-argument credential redaction before serializing command actions.

@etherman-os
Copy link
Copy Markdown
Author

Thanks for the repeated reviews. This ended up being broader than I expected, and I did not anticipate how many edge cases this policy-hook surface would expose. I am going to pause further re-review requests for now and do a more comprehensive local audit/test pass before asking for another review. Apologies for the review noise.

@etherman-os
Copy link
Copy Markdown
Author

/oz-review

Final local audit pass completed after pausing repeated review requests.

Latest fixes pushed in 1efb850:

  • Require an explicit host-policy marker before reloading legacy/generic command PermissionDenied results as policy denials; unmarked legacy/generic permission-denied results remain cancelled.
  • Redact adjacent split secret args before hook/audit serialization, including --token secret, --password secret, --api-key secret, --authorization Bearer ..., and inline equivalents.
  • Additional local hardening: reject partial literal/env stdio credential values such as secret$TOKEN while still allowing pure env references like $TOKEN, ${TOKEN}, and Bearer $TOKEN.

Verification after the latest changes:

  • cargo fmt --all --check
  • git diff --check
  • CARGO_BUILD_JOBS=1 cargo test -p warp policy_hooks --lib (59 passed)
  • CARGO_BUILD_JOBS=1 cargo test -p warp test_convert_tool_call_result_to_input --lib (10 passed)
  • CARGO_BUILD_JOBS=1 cargo test -p ai --lib (157 passed)
  • CARGO_BUILD_JOBS=1 cargo check -p warp --features agent_mode_evals --lib
  • CARGO_BUILD_JOBS=1 cargo test -p warp --lib was also run after the latest changes; remaining failures are local baseline/environment failures outside this PR surface and are documented in the PR description.
  • cargo audit and cargo deny check advisories were run; both fail on existing Cargo.lock baseline advisories with no dependency file changes in this PR.

I will avoid further re-review requests unless there is a new code change backed by a local audit/test pass.

@oz-for-oss
Copy link
Copy Markdown
Contributor

oz-for-oss Bot commented May 3, 2026

@etherman-os

I'm re-reviewing this pull request in response to a review request.

You can view the conversation on Warp.

I completed the review and no human review was requested for this pull request.

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

Copy link
Copy Markdown
Contributor

@oz-for-oss oz-for-oss Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overview

This PR adds host-enforced Agent Policy Hooks across shell, file, and MCP action surfaces, along with config validation, redaction, audit writing, API conversion, and specs.

Concerns

  • HTTP hook URL validation still permits raw token-shaped secrets in query or fragment values when the key is not credential-looking.
  • The new stdio environment isolation test depends on HOME, and the PR's own validation notes that this fails in the full suite when the environment is cleared.

Security

  • The URL credential validation should reject common raw token patterns in query/fragment values before profile JSON can persist them.

Verdict

Found: 0 critical, 2 important, 0 suggestions

Request changes

Comment /oz-review on this pull request to retrigger a review (up to 3 times on the same pull request).

Powered by Oz

return !parsed.username().is_empty()
|| parsed.password().is_some()
|| parsed.query_pairs().any(|(key, value)| {
text_contains_credentials(&key) || text_contains_credentials(&value)
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] [SECURITY] Raw token-shaped query or fragment values are still accepted when the key is innocuous because text_contains_credentials only checks credential keywords; reject common token patterns here too so configs like ?q=sk-... cannot persist secrets in profile JSON.

Comment thread app/src/ai/policy_hooks/tests.rs Outdated
#[cfg(all(unix, not(target_family = "wasm")))]
#[tokio::test]
async fn stdio_engine_does_not_inherit_parent_environment() {
std::env::var("HOME").expect("HOME must be set for policy hook environment inheritance test");
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

⚠️ [IMPORTANT] This test requires HOME, and the PR notes it fails in the full suite when the environment is cleared; use a test-scoped parent env var and assert the hook does not inherit that instead.

@etherman-os
Copy link
Copy Markdown
Author

Pushed 8ad569b with another security/regression pass over the policy hook changes.

What changed:

  • Reject raw token-shaped values in HTTP hook query/fragment values even when the key is innocuous, including disabled/malformed hook URLs before persistence.
  • Validate hook secret refs as environment variable names only, so object-shaped literal refs such as { "env": "sk-..." } or bearer literals cannot persist.
  • Extend stdio config validation and command redaction for compound/camelCase credential names such as --client-secret, --refresh-token, --access-token, and --clientSecret=... while preserving pure env references.
  • Redact policy-denied command results at creation and again before agent inputs are sent through the API path.
  • Replace the env-isolation test's dependency on inherited HOME with a parent-only sentinel env var.

Validation run locally:

  • cargo fmt --all --check
  • git diff --check
  • cargo test -p ai --lib (157 passed)
  • CARGO_BUILD_JOBS=1 cargo test -p warp policy_hooks --lib (61 passed)
  • CARGO_BUILD_JOBS=1 cargo test -p warp redact_inputs_redacts_policy_denied_command_result_command --lib (1 passed)
  • CARGO_BUILD_JOBS=1 cargo test -p warp test_convert_tool_call_result_to_input --lib (10 passed)
  • CARGO_BUILD_JOBS=1 cargo check -p warp --features agent_mode_evals --lib
  • CARGO_BUILD_JOBS=1 cargo test -p warp terminal::model::grid::grid_handler::secrets::tests::test_bytes_processed_for_secrets_after_turning_redaction --lib -- --nocapture (4 passed)
  • CARGO_BUILD_JOBS=1 cargo test -p warp --lib (3801 passed, 13 failed, 7 ignored; all ai::policy_hooks tests pass in the full suite, including the env-isolation test. Remaining failures are outside this PR surface in my local environment.)
  • cargo audit and cargo deny check advisories were rerun; both still fail on existing Cargo.lock baseline advisories, with no dependency file changes in this PR.

I am holding off on another automated review trigger from this comment while I keep the PR quiet unless another review is explicitly requested.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

cla-signed external-contributor Indicates that a PR has been opened by someone outside the Warp team.

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants