feat: skill security audit (skill-audit crate + add-skill gate)#229
feat: skill security audit (skill-audit crate + add-skill gate)#229Fldicoahkiin wants to merge 9 commits into
Conversation
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
📝 WalkthroughWalkthroughAdds a new offline skill-audit crate with bundled YARA rules and an injection scanner, exposes an audit API, and gates CLI skill installs with a --force-unsafe override; blocked audits return exit code 2. ChangesSkill Audit Feature
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Suggested reviewers
Poem
🚥 Pre-merge checks | ✅ 5✅ Passed checks (5 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: 6739082fa3
ℹ️ About Codex in GitHub
Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".
| Ok(AuditInput { | ||
| name: parsed.name, | ||
| skill_md, | ||
| resources: Vec::new(), |
There was a problem hiding this comment.
Audit packaged skill resources before allowing install
For .skill/.zip inputs this builds an audit from only parsed.content and leaves resources empty, but the same path is then installed by manager.add_skill_from_path. Since skill::parser::parse_skill_file records scripts/, references/, and assets/ entries from archives, a packaged skill can put the malicious payload in scripts/setup.sh (or another resource) while keeping SKILL.md benign and bypass the new gate entirely. Please extract/read the archived resources for package inputs or block packages until they can be fully scanned.
Useful? React with 👍 / 👎.
There was a problem hiding this comment.
Actionable comments posted: 16
🧹 Nitpick comments (1)
crates/skill-audit/src/verdict.rs (1)
14-16: ⚡ Quick winAlign the Medium-threshold docs with actual logic.
The docs say “across categories,” but the code counts all Medium findings regardless of category. Please make one source of truth (logic or docs) and pin it with a test.
Also applies to: 30-33
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/skill-audit/src/verdict.rs` around lines 14 - 16, The documentation for the Medium-threshold in verdict.rs is inconsistent with the implementation: docs say “two or more Medium across categories” but the code actually counts all Medium findings regardless of category; update either the documentation or the logic in the verdict calculation so they match, and add a unit test that asserts the chosen behavior. Specifically, locate the verdict calculation (e.g., the function determine_verdict or method on Verdict that tallies severity counts and applies the Medium threshold) and either change the counting to require Mediums from distinct categories or revise the doc comment to state that Mediums are counted globally; then add a test in the verdict tests (or tests module) that covers the two-Medium threshold case to pin the behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@Cargo.toml`:
- Around line 57-58: The manifest currently pins yara-x by tag ("v1.17.0") only;
update the Cargo.toml dependency declaration for yara-x to include the explicit
git commit SHA that Cargo.lock resolved (rev =
"e0096cd41fe16be76ca5321618ac880d1894b086") alongside the existing tag and
feature settings so the crate is immutably pinned (keep default-features = false
and the features list unchanged).
In `@crates/cli/src/commands/add.rs`:
- Around line 31-39: The current flow lets
skill_audit::AuditInput::from_skill_path errors merely print "audit skipped" and
continue; change it so that when from_skill_path returns Err(e) you treat that
as a blocking audit failure unless --force-unsafe is true: after printing the
error include logic that returns an Err (or otherwise aborts installation) when
force_unsafe is false, and only continue when force_unsafe is true. Apply the
same behavior for the later similar occurrence; keep references to
skill_audit::AuditInput::from_skill_path, skill_audit::audit, and handle_audit
so the blocking/error-return integrates with the existing error handling flow.
In `@crates/skill-audit/rules/cisco/code_execution_generic.yara`:
- Around line 55-58: The condition currently applies global exclusions
($zig_rust_fn, $js_eval_json, $security_doc) to all branches, letting a benign
token like $security_doc suppress high-confidence hits; change the logic so
high-confidence indicators ($obfuscated_exec, $pickle_network,
$shell_injection_var, $eval_user_input, $import_user_input,
$eval_variable_network, $exec_fstring) are evaluated independently (e.g., if any
high-confidence pattern matches then rule matches regardless of those
exclusions) while still applying the exclusions to lower-confidence branches;
update the condition around those symbol names to implement this precedence and
add a regression test case where a high-confidence pattern (like $exec_fstring
or $eval_user_input) co-occurs with the $security_doc phrase to ensure the rule
still fires.
In `@crates/skill-audit/rules/cisco/embedded_binary_detection.yara`:
- Around line 89-90: The $shebang_python3 string currently only matches
"#!/usr/bin/python" and therefore misses "#!/usr/bin/python3"; update the
$shebang_python3 symbol in the yara rule to match Python 3 shebangs (either
replace it with "#!/usr/bin/python3" or, preferably, convert it to a regex that
matches both python and python3 shebangs, e.g. a pattern like
^#!/usr/bin/python(3)? with the appropriate yara regex delimiters) so the rule
branch correctly detects Python 3 payloads.
In `@crates/skill-audit/rules/cisco/prompt_injection_generic.yara`:
- Around line 63-83: The rule currently applies broad exclusions
($python_type_comment, $legitimate_ignore, $negation_context,
$security_doc_context, $test_context) to the entire detection block, letting a
single benign marker silence all serious matches like
$instruction_override_strong, $do_not_obey, $advanced_overrides,
$tool_injection_commands, $shadow_parameters, $hidden_behavior,
$role_redefinition, and $privilege_escalation; fix by removing the global
exclusion guard and instead apply exclusions narrowly (attach
$python_type_comment, $legitimate_ignore, $negation_context,
$security_doc_context, $test_context only to specific low-risk patterns) or
require multiple exclusion tokens to suppress a match, updating the boolean
logic around the high-risk symbols (e.g., in the clauses referencing
$instruction_override_strong, $do_not_obey, $tool_injection_commands,
$shadow_parameters, $hidden_behavior, $role_redefinition, $privilege_escalation)
so those remain detected unless an explicit, context-specific exclusion is
present.
In `@crates/skill-audit/rules/cisco/script_injection_generic.yara`:
- Around line 64-74: The current condition gate uses global "not" exclusions
($xml_namespace, $react_component, $markdown_code, $documentation_example,
$inline_code_marker, $vue_template, $svelte_component) which can entirely
suppress detections; instead, move these exclusions into the specific branches
so only relevant branches are negated (e.g., apply these "not" checks to the
$script_suspicious branch rather than the top-level conjunction), and ensure
$js_protocol_handler and other high-confidence checks remain unaffected by
unrelated HTML/markdown markers; update the rule logic around the
($script_suspicious and not $legitimate_html_app) clause to be (
$script_suspicious and not $legitimate_html_app and not ($xml_namespace or
$react_component or $markdown_code or $documentation_example or
$inline_code_marker or $vue_template or $svelte_component) ) so exclusions are
scoped per-detection.
In `@crates/skill-audit/rules/cisco/sql_injection_generic.yara`:
- Around line 56-84: The top-level "not $non_sql_sleep" and broad "not
$common_context_phrases" exclusions are overly broad and suppress real SQLi
hits; restrict their scope by removing "not $non_sql_sleep" from the global
conjunct and applying it only to the time-based branch (i.e., combine "not
$non_sql_sleep" with $time_based_injections), and replace the single global
$common_context_phrases exclusion with per-branch context filters (or narrower
variables) so each branch ($injection_tautologies, $destructive_injections,
$union_based_attacks, $time_based_injections, $error_based_techniques,
$sql_cast_injection, $database_system_objects, $malicious_user_functions) uses
an appropriate contextual exclusion instead of the global "not
$common_context_phrases"; keep the $common_sql_ops exclusion as-is.
In `@crates/skill-audit/rules/cisco/system_manipulation_generic.yara`:
- Around line 64-91: The top-level benign exclusions ($safe_cleanup,
$testing_commands, $safe_mkdir, $security_doc_context) are gating the entire
rule and can suppress all matches; remove these global "not" checks and instead
apply them as localized exclusions to each suspicious pattern (e.g., add "and
not $safe_mkdir" to the $file_destruction or $recursive_operations checks, add
"and not $testing_commands" to $process_manipulation, etc.), or wrap each
detection clause ($env_var_manipulation, $file_destruction,
$permission_manipulation, $critical_system_write, $privilege_escalation,
$process_manipulation, $recursive_operations, $path_manipulation) with its own
"and not <benign_token>" expressions so benign tokens no longer produce a
full-rule bypass.
In `@crates/skill-audit/rules/cisco/tool_chaining_abuse_generic.yara`:
- Around line 55-80: The rule currently uses global not exclusions
($security_docs, $auth_code, $test_fixture) which let benign files suppress true
positives; change the condition to first compute a combined exfil indicator
(e.g. $exfil_any = $exfil_discord or $exfil_telegram or $exfil_pastebin or
$exfil_requestbin or $ssh_key_exfil or $aws_cred_exfil or $env_file_exfil or
$explicit_exfil or $attacker_dest or $env_var_exfil) and then require $exfil_any
while only suppressing matches when a benign marker appears without any exfil
indicators (i.e. condition: $exfil_any and not ($security_docs and not
$exfil_any) and not ($auth_code and not $exfil_any) and not ($test_fixture and
not $exfil_any)); also tighten the $exfil_telegram regex to include common
domain and endpoint patterns so Telegram endpoints aren’t missed.
- Line 21: The $exfil_telegram YARA pattern only matches "telegram.org/bot" and
misses the common "api.telegram.org/bot" endpoint; update the $exfil_telegram
regex to also match the API host (e.g., allow an optional "api." prefix or
explicitly include "api.telegram.org") so it captures both telegram.org/bot and
api.telegram.org/bot while preserving the existing send|post|upload and /i
flags; modify the pattern in the rule where $exfil_telegram is defined.
In `@crates/skill-audit/src/engine/injection.rs`:
- Around line 10-14: The INVISIBLE_CHARS constant is missing the BIDI isolate
control characters U+2066..U+2069 (LRI/RLI/FSI/PDI), which allows easy evasion;
update the INVISIBLE_CHARS array to include '\u{2066}', '\u{2067}', '\u{2068}',
and '\u{2069}' so these bidi-isolate controls are treated as invisible by the
injection detection logic (look for the INVISIBLE_CHARS constant in injection.rs
and append those four code points to the existing list).
- Around line 72-74: The code currently exits the loop when
lower[body_start..].find("-->") returns None, which lets unclosed <!-- comments
bypass detection; instead, when rel_close is None treat the comment as running
to EOF by setting rel_close = lower.len() - body_start (or equivalent
end-of-input index) and proceed with the same comment-handling logic so the
scanner advances past the rest of the input rather than breaking; update the
block around lower[body_start..].find("-->") and the use of rel_close to handle
both Some and None cases (variables: lower, body_start, rel_close).
In `@crates/skill-audit/src/input.rs`:
- Around line 60-63: The package-handling branch that builds an input for
`.skill`/`.zip` currently leaves resources empty so only SKILL.md is audited;
update the package path handling (the function that "Build[s] an input from any
skill path" / the code path that detects `.skill` or `.zip`) to open the
archive, iterate its entries, and populate the Input.resources vector the same
way directory scanning does (collect resource file names, sizes and any needed
metadata, excluding SKILL.md) instead of leaving it empty; use the existing
archive/zip-reading utilities (or add a zip reader) to enumerate entries, map
them into the same Resource type used by directory scanning, and add/adjust unit
tests to assert resources are included for packaged inputs.
- Around line 45-50: The loop that calls std::fs::read_to_string and pushes a
ResourceFile into the resources vector must not ignore read failures; change the
code so that failures from std::fs::read_to_string are propagated or returned
(e.g., return Err(...) or use ?/map_err with a contextual error) instead of
being silently skipped, and include the relative path (rel) in the error context
so callers can fail-closed on I/O or non-UTF8 errors; update the surrounding
function signature to return a Result if needed and use the existing
ResourceFile and resources symbols when locating the change.
In `@crates/skill-audit/src/report.rs`:
- Around line 72-76: from_threat_type currently matches the generic
s.contains("injection") before the more specific command-exec check, causing
"command injection" to be classified as PromptInjection; to fix, reorder the
branches in from_threat_type so the specific CommandInjection checks (e.g.,
s.contains("command injection") or s.contains("code execution")) run before the
generic s.contains("injection") branch, ensuring CommandInjection is returned
for command-related strings instead of PromptInjection.
In `@crates/skill-audit/src/rules.rs`:
- Around line 37-43: The current rules() lazily initializes RULES and uses
compiler.add_source(...).expect(...), which panics on compile failures; change
rules() to return a Result so compilation errors are propagated instead of
panicking: replace the expect call with the ? (or map_err) to convert yara_x
errors into your crate's error type, store a Result<CompiledRules, Error> (or
similar) in RULES/OnceLock so the initializer returns that Result, and update
callers to handle the Result; refer to RULES, rules(), RULE_SOURCES,
yara_x::Compiler::new, and compiler.add_source to locate and modify the
initialization and call sites.
---
Nitpick comments:
In `@crates/skill-audit/src/verdict.rs`:
- Around line 14-16: The documentation for the Medium-threshold in verdict.rs is
inconsistent with the implementation: docs say “two or more Medium across
categories” but the code actually counts all Medium findings regardless of
category; update either the documentation or the logic in the verdict
calculation so they match, and add a unit test that asserts the chosen behavior.
Specifically, locate the verdict calculation (e.g., the function
determine_verdict or method on Verdict that tallies severity counts and applies
the Medium threshold) and either change the counting to require Mediums from
distinct categories or revise the doc comment to state that Mediums are counted
globally; then add a test in the verdict tests (or tests module) that covers the
two-Medium threshold case to pin the behavior.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 2f880b21-21b0-4e5d-a220-06aa834ca4df
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (35)
Cargo.tomlcrates/cli/Cargo.tomlcrates/cli/src/commands/add.rscrates/cli/src/main.rscrates/cli/tests/audit_cli.rscrates/skill-audit/Cargo.tomlcrates/skill-audit/LICENSE-APACHEcrates/skill-audit/rules/NOTICE.mdcrates/skill-audit/rules/aghub/real_world.yaracrates/skill-audit/rules/cisco/autonomy_abuse_generic.yaracrates/skill-audit/rules/cisco/capability_inflation_generic.yaracrates/skill-audit/rules/cisco/code_execution_generic.yaracrates/skill-audit/rules/cisco/coercive_injection_generic.yaracrates/skill-audit/rules/cisco/command_injection_generic.yaracrates/skill-audit/rules/cisco/credential_harvesting_generic.yaracrates/skill-audit/rules/cisco/embedded_binary_detection.yaracrates/skill-audit/rules/cisco/indirect_prompt_injection_generic.yaracrates/skill-audit/rules/cisco/prompt_injection_generic.yaracrates/skill-audit/rules/cisco/prompt_injection_unicode_steganography.yaracrates/skill-audit/rules/cisco/script_injection_generic.yaracrates/skill-audit/rules/cisco/sql_injection_generic.yaracrates/skill-audit/rules/cisco/system_manipulation_generic.yaracrates/skill-audit/rules/cisco/tool_chaining_abuse_generic.yaracrates/skill-audit/rules/clawhub/agent_specific.yaracrates/skill-audit/src/engine/injection.rscrates/skill-audit/src/engine/mod.rscrates/skill-audit/src/engine/yara.rscrates/skill-audit/src/input.rscrates/skill-audit/src/lib.rscrates/skill-audit/src/policy.rscrates/skill-audit/src/report.rscrates/skill-audit/src/rules.rscrates/skill-audit/src/verdict.rscrates/skill-audit/tests/audit.rscrates/skill-audit/tests/real_threats.rs
| # Skill security audit — yara-x git tag (crates.io 1.14 sub-crates out of sync) | ||
| yara-x = { git = "https://github.com/VirusTotal/yara-x", tag = "v1.17.0", default-features = false, features = [ "constant-folding", "exact-atoms", "fast-regexp" ] } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
echo "[1] Current manifest entry:"
rg -n '^\s*yara-x\s*=' Cargo.toml
echo
echo "[2] Current lockfile source (if present):"
rg -n -A3 -B1 'name = "yara-x"|source = "git\+https://github.com/VirusTotal/yara-x' Cargo.lock || true
echo
echo "[3] Remote commit currently referenced by tag v1.17.0:"
git ls-remote --tags https://github.com/VirusTotal/yara-x refs/tags/v1.17.0Repository: AkaraChen/aghub
Length of output: 1076
Pin yara-x with an immutable rev in Cargo.toml
Cargo.toml uses tag = "v1.17.0", but Cargo.lock already resolves that tag to commit e0096cd41fe16be76ca5321618ac880d1894b086. Still, add rev in the manifest to make the pin explicit (defense-in-depth, and safer when Cargo.lock changes or is not present).
🔒 Suggested manifest hardening
-yara-x = { git = "https://github.com/VirusTotal/yara-x", tag = "v1.17.0", default-features = false, features = [ "constant-folding", "exact-atoms", "fast-regexp" ] }
+yara-x = { git = "https://github.com/VirusTotal/yara-x", tag = "v1.17.0", rev = "<commit-sha>", default-features = false, features = [ "constant-folding", "exact-atoms", "fast-regexp" ] }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@Cargo.toml` around lines 57 - 58, The manifest currently pins yara-x by tag
("v1.17.0") only; update the Cargo.toml dependency declaration for yara-x to
include the explicit git commit SHA that Cargo.lock resolved (rev =
"e0096cd41fe16be76ca5321618ac880d1894b086") alongside the existing tag and
feature settings so the crate is immutably pinned (keep default-features = false
and the features list unchanged).
| // Security audit before installing (blocks on Malicious unless --force-unsafe). | ||
| match skill_audit::AuditInput::from_skill_path(&from_path) { | ||
| Ok(input) => { | ||
| handle_audit(&skill_audit::audit(&input), force_unsafe)? | ||
| } | ||
| Err(e) => eprintln!( | ||
| "# Skill audit skipped — could not read skill: {e}" | ||
| ), | ||
| } |
There was a problem hiding this comment.
Fail closed when audit input cannot be built
At Line 32-38, a from_skill_path error only prints “audit skipped” and then Line 46 continues installation. That creates a bypass path for the audit gate. Treat audit-input failures as blocking by default, and only continue when --force-unsafe is set.
Suggested patch
if let Some(from_path) = from {
// Security audit before installing (blocks on Malicious unless --force-unsafe).
match skill_audit::AuditInput::from_skill_path(&from_path) {
Ok(input) => {
handle_audit(&skill_audit::audit(&input), force_unsafe)?
}
- Err(e) => eprintln!(
- "# Skill audit skipped — could not read skill: {e}"
- ),
+ Err(e) => {
+ if force_unsafe {
+ eprintln!(
+ "# Skill audit unavailable ({e}); proceeding due --force-unsafe"
+ );
+ } else {
+ return Err(anyhow!(
+ "# Refusing to install: skill audit failed ({e})"
+ ));
+ }
+ }
}Also applies to: 46-46
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/cli/src/commands/add.rs` around lines 31 - 39, The current flow lets
skill_audit::AuditInput::from_skill_path errors merely print "audit skipped" and
continue; change it so that when from_skill_path returns Err(e) you treat that
as a blocking audit failure unless --force-unsafe is true: after printing the
error include logic that returns an Err (or otherwise aborts installation) when
force_unsafe is false, and only continue when force_unsafe is true. Apply the
same behavior for the later similar occurrence; keep references to
skill_audit::AuditInput::from_skill_path, skill_audit::audit, and handle_audit
so the blocking/error-return integrates with the existing error handling flow.
| not $zig_rust_fn and | ||
| not $js_eval_json and | ||
| not $security_doc and | ||
| ( |
There was a problem hiding this comment.
Global exclusions enable trivial evasion of high-confidence matches.
In crates/skill-audit/rules/cisco/code_execution_generic.yara (lines 55-58), the condition gates both the high-confidence branch and the rest of the logic behind not $zig_rust_fn and not $js_eval_json and not $security_doc. Since $security_doc matches phrases like security audit (case-insensitive), adding that benign token can suppress detections even when high-confidence indicators such as $obfuscated_exec / $pickle_network / $shell_injection_var / $eval_user_input / $import_user_input / $eval_variable_network / $exec_fstring are present.
<details>
<summary>🔧 Proposed fix</summary>
- condition:
- not $zig_rust_fn and
- not $js_eval_json and
- not $security_doc and
- (
- // High confidence patterns - always flag
- $obfuscated_exec or
- $pickle_network or
- $shell_injection_var or
- $eval_user_input or
- $import_user_input or
- $eval_variable_network or
- $exec_fstring
- or
- // Medium confidence - flag unless clearly documentation
- (
- ($system_format or $exec_network) and
- not $markdown_codeblock
- )
- )
+ condition:
+ // High confidence patterns: do not allow global doc tokens to disable.
+ (
+ $obfuscated_exec or
+ $pickle_network or
+ $shell_injection_var or
+ $eval_user_input or
+ $import_user_input or
+ $eval_variable_network or
+ $exec_fstring
+ )
+ or
+ // Medium confidence patterns: keep exclusions here.
+ (
+ not $zig_rust_fn and
+ not $js_eval_json and
+ not $security_doc and
+ not $markdown_codeblock and
+ ($system_format or $exec_network)
+ )</details>
Add a regression case where malicious content hits a high-confidence pattern (e.g., $exec_fstring or $eval_user_input) while also including security audit, and ensure the rule still matches.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/skill-audit/rules/cisco/code_execution_generic.yara` around lines 55 -
58, The condition currently applies global exclusions ($zig_rust_fn,
$js_eval_json, $security_doc) to all branches, letting a benign token like
$security_doc suppress high-confidence hits; change the logic so high-confidence
indicators ($obfuscated_exec, $pickle_network, $shell_injection_var,
$eval_user_input, $import_user_input, $eval_variable_network, $exec_fstring) are
evaluated independently (e.g., if any high-confidence pattern matches then rule
matches regardless of those exclusions) while still applying the exclusions to
lower-confidence branches; update the condition around those symbol names to
implement this precedence and add a regression test case where a high-confidence
pattern (like $exec_fstring or $eval_user_input) co-occurs with the
$security_doc phrase to ensure the rule still fires.
| $shebang_python3 = "#!/usr/bin/python" | ||
| $shebang_perl = "#!/usr/bin/perl" |
There was a problem hiding this comment.
$shebang_python3 currently matches Python (not Python 3).
Line 89 uses #!/usr/bin/python, so #!/usr/bin/python3 payloads are missed by this rule branch.
Suggested fix
- $shebang_python3 = "#!/usr/bin/python"
+ $shebang_python3 = "#!/usr/bin/python3"
+ $shebang_env_python3 = "#!/usr/bin/env python3"📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| $shebang_python3 = "#!/usr/bin/python" | |
| $shebang_perl = "#!/usr/bin/perl" | |
| $shebang_python3 = "#!/usr/bin/python3" | |
| $shebang_env_python3 = "#!/usr/bin/env python3" | |
| $shebang_perl = "#!/usr/bin/perl" |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/skill-audit/rules/cisco/embedded_binary_detection.yara` around lines
89 - 90, The $shebang_python3 string currently only matches "#!/usr/bin/python"
and therefore misses "#!/usr/bin/python3"; update the $shebang_python3 symbol in
the yara rule to match Python 3 shebangs (either replace it with
"#!/usr/bin/python3" or, preferably, convert it to a regex that matches both
python and python3 shebangs, e.g. a pattern like ^#!/usr/bin/python(3)? with the
appropriate yara regex delimiters) so the rule branch correctly detects Python 3
payloads.
| not $python_type_comment and | ||
| not $legitimate_ignore and | ||
| not $negation_context and | ||
| not $security_doc_context and | ||
| not $test_context and | ||
|
|
||
| ( | ||
| // Instruction overrides | ||
| $instruction_override_strong or | ||
| $instruction_override_your or | ||
| $do_not_obey or | ||
|
|
||
| // Advanced overrides | ||
| $advanced_overrides or | ||
|
|
||
| // These are suspicious regardless | ||
| $tool_injection_commands or | ||
| $shadow_parameters or | ||
| $hidden_behavior or | ||
| $role_redefinition or | ||
| $privilege_escalation |
There was a problem hiding this comment.
Global exclusion guards make this rule easy to evade.
Lines 63-68 apply exclusions to the entire rule. A malicious file can include one benign marker (e.g., test or security-doc phrasing) to suppress all detections in Lines 71-83.
Suggested fix
- condition:
-
- not $python_type_comment and
- not $legitimate_ignore and
- not $negation_context and
- not $security_doc_context and
- not $test_context and
-
- (
- // Instruction overrides
- $instruction_override_strong or
- $instruction_override_your or
- $do_not_obey or
-
- // Advanced overrides
- $advanced_overrides or
-
- // These are suspicious regardless
- $tool_injection_commands or
- $shadow_parameters or
- $hidden_behavior or
- $role_redefinition or
- $privilege_escalation
- )
+ condition:
+ (
+ ($instruction_override_strong or $instruction_override_your or
+ $do_not_obey) and
+ not $python_type_comment and
+ not $legitimate_ignore and
+ not $negation_context and
+ not $security_doc_context and
+ not $test_context
+ ) or
+ $advanced_overrides or
+ $tool_injection_commands or
+ $shadow_parameters or
+ $hidden_behavior or
+ $role_redefinition or
+ $privilege_escalation🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/skill-audit/rules/cisco/prompt_injection_generic.yara` around lines 63
- 83, The rule currently applies broad exclusions ($python_type_comment,
$legitimate_ignore, $negation_context, $security_doc_context, $test_context) to
the entire detection block, letting a single benign marker silence all serious
matches like $instruction_override_strong, $do_not_obey, $advanced_overrides,
$tool_injection_commands, $shadow_parameters, $hidden_behavior,
$role_redefinition, and $privilege_escalation; fix by removing the global
exclusion guard and instead apply exclusions narrowly (attach
$python_type_comment, $legitimate_ignore, $negation_context,
$security_doc_context, $test_context only to specific low-risk patterns) or
require multiple exclusion tokens to suppress a match, updating the boolean
logic around the high-risk symbols (e.g., in the clauses referencing
$instruction_override_strong, $do_not_obey, $tool_injection_commands,
$shadow_parameters, $hidden_behavior, $role_redefinition, $privilege_escalation)
so those remain detected unless an explicit, context-specific exclusion is
present.
| let Some(rel_close) = lower[body_start..].find("-->") else { | ||
| break; | ||
| }; |
There was a problem hiding this comment.
Unclosed HTML comments currently bypass hidden-instruction detection.
When <!-- has no closing -->, the function breaks and returns no finding. Treating EOF as comment body avoids this evasion path.
Suggested patch
- let Some(rel_close) = lower[body_start..].find("-->") else {
- break;
- };
+ let Some(rel_close) = lower[body_start..].find("-->") else {
+ let body = &lower[body_start..];
+ return COMMENT_VERBS.iter().any(|v| body.contains(v));
+ };📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| let Some(rel_close) = lower[body_start..].find("-->") else { | |
| break; | |
| }; | |
| let Some(rel_close) = lower[body_start..].find("-->") else { | |
| let body = &lower[body_start..]; | |
| return COMMENT_VERBS.iter().any(|v| body.contains(v)); | |
| }; |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/skill-audit/src/engine/injection.rs` around lines 72 - 74, The code
currently exits the loop when lower[body_start..].find("-->") returns None,
which lets unclosed <!-- comments bypass detection; instead, when rel_close is
None treat the comment as running to EOF by setting rel_close = lower.len() -
body_start (or equivalent end-of-input index) and proceed with the same
comment-handling logic so the scanner advances past the rest of the input rather
than breaking; update the block around lower[body_start..].find("-->") and the
use of rel_close to handle both Some and None cases (variables: lower,
body_start, rel_close).
There was a problem hiding this comment.
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
| if let Ok(content) = std::fs::read_to_string(dir.join(rel)) { | ||
| resources.push(ResourceFile { | ||
| path: rel.clone(), | ||
| content, | ||
| }); | ||
| } |
There was a problem hiding this comment.
Do not silently discard resource read errors.
This currently drops all read failures, which can produce an incomplete scan input and still continue. That undermines fail-closed behavior for audit coverage.
✅ Fail closed on non-UTF8-safe errors
- for rel in parsed.all_resources() {
- if let Ok(content) = std::fs::read_to_string(dir.join(rel)) {
- resources.push(ResourceFile {
- path: rel.clone(),
- content,
- });
- }
- }
+ for rel in parsed.all_resources() {
+ match std::fs::read_to_string(dir.join(rel)) {
+ Ok(content) => resources.push(ResourceFile {
+ path: rel.clone(),
+ content,
+ }),
+ Err(e) if e.kind() == std::io::ErrorKind::InvalidData => {
+ // Keep intentional UTF-8 skip behavior.
+ }
+ Err(e) => return Err(e),
+ }
+ }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/skill-audit/src/input.rs` around lines 45 - 50, The loop that calls
std::fs::read_to_string and pushes a ResourceFile into the resources vector must
not ignore read failures; change the code so that failures from
std::fs::read_to_string are propagated or returned (e.g., return Err(...) or use
?/map_err with a contextual error) instead of being silently skipped, and
include the relative path (rel) in the error context so callers can fail-closed
on I/O or non-UTF8 errors; update the surrounding function signature to return a
Result if needed and use the existing ResourceFile and resources symbols when
locating the change.
| /// Build an input from any skill path: a directory, a single `.md`, or a | ||
| /// packaged `.skill`/`.zip`. Directories get full resource scanning; | ||
| /// single files and packages scan SKILL.md only (resource scanning for | ||
| /// packages is a follow-up). |
There was a problem hiding this comment.
Package-based audits are bypassable because resources are not scanned.
For .skill/.zip paths, resources is always empty, so only SKILL.md content is audited. Malicious payloads in bundled resource files can evade L1/L2 checks.
Also applies to: 80-84
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/skill-audit/src/input.rs` around lines 60 - 63, The package-handling
branch that builds an input for `.skill`/`.zip` currently leaves resources empty
so only SKILL.md is audited; update the package path handling (the function that
"Build[s] an input from any skill path" / the code path that detects `.skill` or
`.zip`) to open the archive, iterate its entries, and populate the
Input.resources vector the same way directory scanning does (collect resource
file names, sizes and any needed metadata, excluding SKILL.md) instead of
leaving it empty; use the existing archive/zip-reading utilities (or add a zip
reader) to enumerate entries, map them into the same Resource type used by
directory scanning, and add/adjust unit tests to assert resources are included
for packaged inputs.
There was a problem hiding this comment.
Seems like the humans are having a chat. I'll hop back into my burrow for now. If you need me again, just tag @coderabbitai in a new comment, and I'll come hopping out!
| } else if s.contains("injection") || s.contains("steganography") { | ||
| Self::PromptInjection | ||
| } else if s.contains("code execution") { | ||
| Self::CommandInjection | ||
| } else if s.contains("supply") || s.contains("trust") { |
There was a problem hiding this comment.
from_threat_type misclassifies "command injection" as prompt injection.
The generic s.contains("injection") branch runs before command-exec-specific handling, so command-injection threat text gets the wrong category.
🛠️ Minimal ordering fix
- } else if s.contains("injection") || s.contains("steganography") {
- Self::PromptInjection
- } else if s.contains("code execution") {
+ } else if s.contains("command injection") || s.contains("code execution") {
Self::CommandInjection
+ } else if s.contains("injection") || s.contains("steganography") {
+ Self::PromptInjection
} else if s.contains("supply") || s.contains("trust") {🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/skill-audit/src/report.rs` around lines 72 - 76, from_threat_type
currently matches the generic s.contains("injection") before the more specific
command-exec check, causing "command injection" to be classified as
PromptInjection; to fix, reorder the branches in from_threat_type so the
specific CommandInjection checks (e.g., s.contains("command injection") or
s.contains("code execution")) run before the generic s.contains("injection")
branch, ensuring CommandInjection is returned for command-related strings
instead of PromptInjection.
| RULES.get_or_init(|| { | ||
| let mut compiler = yara_x::Compiler::new(); | ||
| for src in RULE_SOURCES { | ||
| compiler | ||
| .add_source(*src) | ||
| .expect("bundled YARA rules must compile"); | ||
| } |
There was a problem hiding this comment.
🧩 Analysis chain
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# show relevant section of rules.rs
sed -n '1,120p' crates/skill-audit/src/rules.rs | cat -n
# show more around RULE_SOURCES and RULES usage if beyond 120
sed -n '120,240p' crates/skill-audit/src/rules.rs | cat -n
# locate style guide
ls -la . | head
ls -la .impeccable.md || true
if [ -f .impeccable.md ]; then
sed -n '1,120p' .impeccable.md | cat -n
fi
# search for RULES/rules usage and expect panic handling
rg -n "RULES|get_or_init|RULE_SOURCES|add_source|expect\\(" crates/skill-audit/src -S
# search in whole repo for "bundled YARA rules must compile"
rg -n "bundled YARA rules must compile" -S .
# find where rules() is used
rg -n "rules\\(\\)" crates/skill-audit/src -SRepository: AkaraChen/aghub
Length of output: 5812
🏁 Script executed:
#!/bin/bash
set -euo pipefail
# show relevant section of rules.rs
sed -n '1,140p' crates/skill-audit/src/rules.rs | cat -n
# show rest of file
sed -n '140,260p' crates/skill-audit/src/rules.rs | cat -n || true
# locate style guide
ls -la .impeccable.md || true
if [ -f .impeccable.md ]; then
sed -n '1,160p' .impeccable.md | cat -n
fi
# search for RULES/rules usage and expect panic handling
rg -n "RULES|get_or_init|RULE_SOURCES|add_source|expect\\(" crates/skill-audit/src -S
# search in whole repo for error string
rg -n "bundled YARA rules must compile" -S .
# find where rules() is used
rg -n "\\brules\\s*\\(" crates/skill-audit/src -SRepository: AkaraChen/aghub
Length of output: 5407
🏁 Script executed:
#!/bin/bash
set -euo pipefail
sed -n '1,220p' crates/skill-audit/src/rules.rs | cat -n
ls -la .impeccable.md || true
if [ -f .impeccable.md ]; then
sed -n '1,200p' .impeccable.md | cat -n
fi
rg -n "RULES|get_or_init|RULE_SOURCES|add_source|expect\\(" crates/skill-audit/src -S || true
rg -n "bundled YARA rules must compile" -S . || true
rg -n "\\brules\\s*\\(" crates/skill-audit/src -S || trueRepository: AkaraChen/aghub
Length of output: 5407
Avoid panicking on bundled-rule compile failures.
rules() compiles the embedded rule set lazily via OnceLock::get_or_init, and compiler.add_source(*src).expect("bundled YARA rules must compile") will panic if any bundled YARA fails to compile—crashing the scan path instead of returning a structured error. Propagate the compilation error (e.g., make rules() return a Result and handle/report it in the caller).
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/skill-audit/src/rules.rs` around lines 37 - 43, The current rules()
lazily initializes RULES and uses compiler.add_source(...).expect(...), which
panics on compile failures; change rules() to return a Result so compilation
errors are propagated instead of panicking: replace the expect call with the ?
(or map_err) to convert yara_x errors into your crate's error type, store a
Result<CompiledRules, Error> (or similar) in RULES/OnceLock so the initializer
returns that Result, and update callers to handle the Result; refer to RULES,
rules(), RULE_SOURCES, yara_x::Compiler::new, and compiler.add_source to locate
and modify the initialization and call sites.
6739082 to
96f552b
Compare
There was a problem hiding this comment.
♻️ Duplicate comments (1)
crates/cli/src/commands/add.rs (1)
31-39:⚠️ Potential issue | 🟠 Major | ⚡ Quick winFail closed when audit input construction fails.
This branch only prints “audit skipped”, and Line 46 still installs the skill. That leaves a bypass around the new gate whenever
AuditInput::from_skill_pathcannot read or parse the source, even though this PR’s contract is to fail closed on audit/input errors and reserve exit code 2 for blocked installs.Suggested patch
-#[derive(Debug)] -pub struct AuditBlocked; +#[derive(Debug)] +pub struct AuditBlocked(pub String); impl std::fmt::Display for AuditBlocked { fn fmt(&self, f: &mut std::fmt::Formatter<'_>) -> std::fmt::Result { - f.write_str( - "# Refusing to install: skill flagged Malicious (re-run with --force-unsafe to override)", - ) + f.write_str(&self.0) } } @@ match skill_audit::AuditInput::from_skill_path(&from_path) { Ok(input) => { handle_audit(&skill_audit::audit(&input), force_unsafe)? } - Err(e) => eprintln!( - "# Skill audit skipped — could not read skill: {e}" - ), + Err(e) => { + if force_unsafe { + eprintln!( + "# Skill audit unavailable ({e}); proceeding due to --force-unsafe" + ); + } else { + return Err(AuditBlocked(format!( + "# Refusing to install: skill audit failed ({e})" + )) + .into()); + } + } } @@ if force_unsafe { eprintln!( "# --force-unsafe set: installing despite a malicious verdict" ); } else { - return Err(AuditBlocked.into()); + return Err(AuditBlocked( + "# Refusing to install: skill flagged Malicious (re-run with --force-unsafe to override)" + .to_string(), + ) + .into()); } } }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/cli/src/commands/add.rs` around lines 31 - 39, The code currently prints "audit skipped" when skill_audit::AuditInput::from_skill_path(&from_path) fails, allowing installs to proceed; instead fail closed: when AuditInput::from_skill_path returns Err(e) do not continue to install—either propagate an error so the caller exits with the blocked-install exit code (2) or invoke handle_audit with a blocking audit result equivalent to a failed audit; update the Err branch (the match on skill_audit::AuditInput::from_skill_path) to return or trigger the same failure path as handle_audit(&skill_audit::audit(&input), force_unsafe) would for a blocked audit (use the existing handle_audit/audit/force_unsafe symbols to ensure consistent blocking behavior and exit code 2).
🧹 Nitpick comments (2)
crates/cli/tests/audit_cli.rs (2)
20-27: ⚡ Quick winUse the shared
TestConfigisolation helper here.These tests hand-roll
HOME/USERPROFILE/APPDATAinstead of using the repository’s standard integration-test harness. That makes this suite drift from the agent-specific directory layout and path-override behavior the rest of the tests rely on.As per coding guidelines, integration tests under
crates/*/tests/**/*.rsshould useTestConfigpluscrate::adapter::set_skills_path_override(agent_id, path)for isolation.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/cli/tests/audit_cli.rs` around lines 20 - 27, The helper aghub_cli currently hard-codes HOME/USERPROFILE/APPDATA using a TempDir; replace that with the repo test harness by constructing and using TestConfig for isolation and applying its environment to the Command, and call crate::adapter::set_skills_path_override(agent_id, path) (using the test agent id used elsewhere in tests) so the agent-specific directory/layout and path overrides are respected; update aghub_cli to accept or create a TestConfig, apply its env overrides to the Command from Command::cargo_bin("aghub-cli"), and remove the manual env(...) TempDir wiring.Sources: Coding guidelines, Learnings
32-93: ⚡ Quick winAdd an end-to-end
Suspiciousverdict case.
handle_audithas a separate warn-and-proceed branch, but this suite only exercises Allow and Block/override. A fixture that yieldsSuspiciousand asserts success plus stderr findings would lock down the last policy path.🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@crates/cli/tests/audit_cli.rs` around lines 32 - 93, Add a new end-to-end test named like suspicious_skill_warns_and_installs in the same test file to exercise the warn-and-proceed branch of handle_audit: create a TempDir skill via make_skill whose contents trigger the audit engine's Suspicious verdict (not outright malicious), run aghub_cli with the same args used by other tests (e.g., ["-g","add","skill","--from"] plus .arg(&skill)), assert the command succeeds (success exit code) and assert stderr contains the Suspicious warning text (e.g., "Suspicious" or the audit findings string) to ensure the warn-and-proceed path is covered. Ensure the new test mirrors the pattern of malicious_skill_is_blocked_with_exit_2 and benign_skill_installs_normally for setup and assertions.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Duplicate comments:
In `@crates/cli/src/commands/add.rs`:
- Around line 31-39: The code currently prints "audit skipped" when
skill_audit::AuditInput::from_skill_path(&from_path) fails, allowing installs to
proceed; instead fail closed: when AuditInput::from_skill_path returns Err(e) do
not continue to install—either propagate an error so the caller exits with the
blocked-install exit code (2) or invoke handle_audit with a blocking audit
result equivalent to a failed audit; update the Err branch (the match on
skill_audit::AuditInput::from_skill_path) to return or trigger the same failure
path as handle_audit(&skill_audit::audit(&input), force_unsafe) would for a
blocked audit (use the existing handle_audit/audit/force_unsafe symbols to
ensure consistent blocking behavior and exit code 2).
---
Nitpick comments:
In `@crates/cli/tests/audit_cli.rs`:
- Around line 20-27: The helper aghub_cli currently hard-codes
HOME/USERPROFILE/APPDATA using a TempDir; replace that with the repo test
harness by constructing and using TestConfig for isolation and applying its
environment to the Command, and call
crate::adapter::set_skills_path_override(agent_id, path) (using the test agent
id used elsewhere in tests) so the agent-specific directory/layout and path
overrides are respected; update aghub_cli to accept or create a TestConfig,
apply its env overrides to the Command from Command::cargo_bin("aghub-cli"), and
remove the manual env(...) TempDir wiring.
- Around line 32-93: Add a new end-to-end test named like
suspicious_skill_warns_and_installs in the same test file to exercise the
warn-and-proceed branch of handle_audit: create a TempDir skill via make_skill
whose contents trigger the audit engine's Suspicious verdict (not outright
malicious), run aghub_cli with the same args used by other tests (e.g.,
["-g","add","skill","--from"] plus .arg(&skill)), assert the command succeeds
(success exit code) and assert stderr contains the Suspicious warning text
(e.g., "Suspicious" or the audit findings string) to ensure the warn-and-proceed
path is covered. Ensure the new test mirrors the pattern of
malicious_skill_is_blocked_with_exit_2 and benign_skill_installs_normally for
setup and assertions.
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 0da9e552-392c-4d3c-a68a-1b5e14615ba4
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (35)
Cargo.tomlcrates/cli/Cargo.tomlcrates/cli/src/commands/add.rscrates/cli/src/main.rscrates/cli/tests/audit_cli.rscrates/skill-audit/Cargo.tomlcrates/skill-audit/LICENSE-APACHEcrates/skill-audit/rules/NOTICE.mdcrates/skill-audit/rules/aghub/real_world.yaracrates/skill-audit/rules/cisco/autonomy_abuse_generic.yaracrates/skill-audit/rules/cisco/capability_inflation_generic.yaracrates/skill-audit/rules/cisco/code_execution_generic.yaracrates/skill-audit/rules/cisco/coercive_injection_generic.yaracrates/skill-audit/rules/cisco/command_injection_generic.yaracrates/skill-audit/rules/cisco/credential_harvesting_generic.yaracrates/skill-audit/rules/cisco/embedded_binary_detection.yaracrates/skill-audit/rules/cisco/indirect_prompt_injection_generic.yaracrates/skill-audit/rules/cisco/prompt_injection_generic.yaracrates/skill-audit/rules/cisco/prompt_injection_unicode_steganography.yaracrates/skill-audit/rules/cisco/script_injection_generic.yaracrates/skill-audit/rules/cisco/sql_injection_generic.yaracrates/skill-audit/rules/cisco/system_manipulation_generic.yaracrates/skill-audit/rules/cisco/tool_chaining_abuse_generic.yaracrates/skill-audit/rules/clawhub/agent_specific.yaracrates/skill-audit/src/engine/injection.rscrates/skill-audit/src/engine/mod.rscrates/skill-audit/src/engine/yara.rscrates/skill-audit/src/input.rscrates/skill-audit/src/lib.rscrates/skill-audit/src/policy.rscrates/skill-audit/src/report.rscrates/skill-audit/src/rules.rscrates/skill-audit/src/verdict.rscrates/skill-audit/tests/audit.rscrates/skill-audit/tests/real_threats.rs
✅ Files skipped from review due to trivial changes (3)
- crates/skill-audit/Cargo.toml
- crates/skill-audit/LICENSE-APACHE
- crates/skill-audit/rules/NOTICE.md
🚧 Files skipped from review as they are similar to previous changes (27)
- crates/skill-audit/tests/audit.rs
- crates/cli/Cargo.toml
- crates/skill-audit/src/rules.rs
- crates/skill-audit/rules/cisco/prompt_injection_unicode_steganography.yara
- crates/skill-audit/rules/clawhub/agent_specific.yara
- crates/skill-audit/src/policy.rs
- crates/skill-audit/rules/aghub/real_world.yara
- crates/skill-audit/rules/cisco/code_execution_generic.yara
- crates/skill-audit/rules/cisco/credential_harvesting_generic.yara
- crates/skill-audit/rules/cisco/tool_chaining_abuse_generic.yara
- crates/skill-audit/rules/cisco/sql_injection_generic.yara
- crates/skill-audit/rules/cisco/prompt_injection_generic.yara
- crates/skill-audit/rules/cisco/autonomy_abuse_generic.yara
- crates/skill-audit/src/engine/mod.rs
- crates/skill-audit/rules/cisco/embedded_binary_detection.yara
- crates/skill-audit/rules/cisco/system_manipulation_generic.yara
- crates/skill-audit/rules/cisco/capability_inflation_generic.yara
- crates/skill-audit/rules/cisco/indirect_prompt_injection_generic.yara
- crates/skill-audit/src/lib.rs
- crates/skill-audit/tests/real_threats.rs
- crates/skill-audit/rules/cisco/coercive_injection_generic.yara
- crates/skill-audit/src/report.rs
- crates/skill-audit/src/verdict.rs
- crates/skill-audit/src/engine/injection.rs
- crates/skill-audit/src/engine/yara.rs
- crates/skill-audit/rules/cisco/script_injection_generic.yara
- crates/cli/src/main.rs
96f552b to
bb56548
Compare
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@crates/skill-audit/rules/cisco/credential_harvesting_generic.yara`:
- Around line 99-151: The current rule applies broad exclusions
($python_imports, $python_type_hints, $function_definitions,
$template_indicators, $documentation_env_setup, $documentation_config_hint,
$documentation_env_var_hint, $markdown_export_example, $shell_var_reference,
$negation_context, $security_doc_context) before evaluating high-confidence
secret signatures (e.g. $api_credentials_aws, $api_credentials_github,
$api_credentials_sk_legacy, $api_credentials_openai_project,
$api_credentials_anthropic, $key_certificate_content, $credential_file_access,
$hardcoded_credential_paths, $ai_model_credential_names,
$credential_theft_actions, $network_credential_transfer, $env_var_theft,
$leak_param, $dump_creds, $base64_credential_encoding, $whatsapp_exploit);
change the condition precedence so the concrete secret group (the big OR block
of $api_* / $key_certificate_content / credential theft/exfiltration signatures)
is evaluated independently and allowed to match before applying the
documentation/template exclusions, or alternatively apply exclusions only to the
less-specific indicators (template/documentation markers) by splitting the rule
into two branches — one that returns a match for any high-confidence signature
regardless of doc/template exclusions and a second branch that applies
exclusions when matching heuristic/low-confidence indicators — ensuring
functions/patterns named above are referenced and not globally suppressed.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro
Run ID: 16bccc78-d1a0-4b03-83b9-3b8673b21ed7
⛔ Files ignored due to path filters (1)
Cargo.lockis excluded by!**/*.lock
📒 Files selected for processing (35)
Cargo.tomlcrates/cli/Cargo.tomlcrates/cli/src/commands/add.rscrates/cli/src/main.rscrates/cli/tests/audit_cli.rscrates/skill-audit/Cargo.tomlcrates/skill-audit/LICENSE-APACHEcrates/skill-audit/rules/NOTICE.mdcrates/skill-audit/rules/aghub/real_world.yaracrates/skill-audit/rules/cisco/autonomy_abuse_generic.yaracrates/skill-audit/rules/cisco/capability_inflation_generic.yaracrates/skill-audit/rules/cisco/code_execution_generic.yaracrates/skill-audit/rules/cisco/coercive_injection_generic.yaracrates/skill-audit/rules/cisco/command_injection_generic.yaracrates/skill-audit/rules/cisco/credential_harvesting_generic.yaracrates/skill-audit/rules/cisco/embedded_binary_detection.yaracrates/skill-audit/rules/cisco/indirect_prompt_injection_generic.yaracrates/skill-audit/rules/cisco/prompt_injection_generic.yaracrates/skill-audit/rules/cisco/prompt_injection_unicode_steganography.yaracrates/skill-audit/rules/cisco/script_injection_generic.yaracrates/skill-audit/rules/cisco/sql_injection_generic.yaracrates/skill-audit/rules/cisco/system_manipulation_generic.yaracrates/skill-audit/rules/cisco/tool_chaining_abuse_generic.yaracrates/skill-audit/rules/clawhub/agent_specific.yaracrates/skill-audit/src/engine/injection.rscrates/skill-audit/src/engine/mod.rscrates/skill-audit/src/engine/yara.rscrates/skill-audit/src/input.rscrates/skill-audit/src/lib.rscrates/skill-audit/src/policy.rscrates/skill-audit/src/report.rscrates/skill-audit/src/rules.rscrates/skill-audit/src/verdict.rscrates/skill-audit/tests/audit.rscrates/skill-audit/tests/real_threats.rs
✅ Files skipped from review due to trivial changes (2)
- crates/skill-audit/rules/NOTICE.md
- crates/skill-audit/LICENSE-APACHE
🚧 Files skipped from review as they are similar to previous changes (30)
- crates/cli/Cargo.toml
- Cargo.toml
- crates/skill-audit/rules/cisco/coercive_injection_generic.yara
- crates/skill-audit/rules/cisco/capability_inflation_generic.yara
- crates/skill-audit/rules/aghub/real_world.yara
- crates/skill-audit/Cargo.toml
- crates/skill-audit/tests/real_threats.rs
- crates/skill-audit/rules/clawhub/agent_specific.yara
- crates/skill-audit/src/policy.rs
- crates/cli/src/commands/add.rs
- crates/skill-audit/src/engine/injection.rs
- crates/skill-audit/tests/audit.rs
- crates/skill-audit/src/rules.rs
- crates/skill-audit/rules/cisco/prompt_injection_generic.yara
- crates/skill-audit/rules/cisco/tool_chaining_abuse_generic.yara
- crates/skill-audit/rules/cisco/prompt_injection_unicode_steganography.yara
- crates/skill-audit/rules/cisco/command_injection_generic.yara
- crates/skill-audit/src/engine/mod.rs
- crates/skill-audit/src/verdict.rs
- crates/skill-audit/rules/cisco/system_manipulation_generic.yara
- crates/skill-audit/rules/cisco/indirect_prompt_injection_generic.yara
- crates/skill-audit/rules/cisco/autonomy_abuse_generic.yara
- crates/cli/tests/audit_cli.rs
- crates/skill-audit/rules/cisco/script_injection_generic.yara
- crates/skill-audit/src/engine/yara.rs
- crates/skill-audit/src/report.rs
- crates/skill-audit/rules/cisco/sql_injection_generic.yara
- crates/cli/src/main.rs
- crates/skill-audit/rules/cisco/embedded_binary_detection.yara
- crates/skill-audit/src/input.rs
| not $python_imports and | ||
| not $python_type_hints and | ||
| not $function_definitions and | ||
| not $template_indicators and | ||
| not $documentation_env_setup and | ||
| not $documentation_config_hint and | ||
| not $documentation_env_var_hint and | ||
| not $markdown_export_example and | ||
| not $shell_var_reference and | ||
| not $negation_context and | ||
| not $security_doc_context and | ||
|
|
||
| ( | ||
| // Real API credentials (high confidence specific patterns) | ||
| $api_credentials_aws or | ||
| $api_credentials_github or | ||
| $api_credentials_sk_legacy or | ||
| $api_credentials_openai_project or | ||
| $api_credentials_anthropic or | ||
|
|
||
| // Actual private key content | ||
| $key_certificate_content or | ||
|
|
||
| // Specific credential file access | ||
| $credential_file_access or | ||
|
|
||
| // Hardcoded credential paths | ||
| $hardcoded_credential_paths or | ||
|
|
||
| // AI model API keys with actual values | ||
| $ai_model_credential_names or | ||
|
|
||
| // Credential theft actions | ||
| $credential_theft_actions or | ||
|
|
||
| // Network credential transfer | ||
| $network_credential_transfer or | ||
|
|
||
| // Environment variable theft | ||
| $env_var_theft or | ||
|
|
||
| // Exfiltration attempts | ||
| $leak_param or | ||
|
|
||
| // Credential dumping | ||
| $dump_creds or | ||
|
|
||
| // Base64 credential encoding | ||
| $base64_credential_encoding or | ||
|
|
||
| // WhatsApp exploit | ||
| $whatsapp_exploit | ||
| ) |
There was a problem hiding this comment.
Global exclusions allow trivial suppression of real-secret matches.
At Line 99 through Line 110, exclusions are applied before all indicators, so a benign token (for example, security-doc wording) can suppress matches even when concrete secrets are present (Line 113-121). Split condition precedence so concrete secret signatures are evaluated independently of documentation/template exclusions.
🔧 Proposed condition structure
- condition:
-
- not $python_imports and
- not $python_type_hints and
- not $function_definitions and
- not $template_indicators and
- not $documentation_env_setup and
- not $documentation_config_hint and
- not $documentation_env_var_hint and
- not $markdown_export_example and
- not $shell_var_reference and
- not $negation_context and
- not $security_doc_context and
-
- (
- // Real API credentials (high confidence specific patterns)
- $api_credentials_aws or
- $api_credentials_github or
- $api_credentials_sk_legacy or
- $api_credentials_openai_project or
- $api_credentials_anthropic or
-
- // Actual private key content
- $key_certificate_content or
+ condition:
+ // High-confidence secret material: always match.
+ (
+ $api_credentials_aws or
+ $api_credentials_github or
+ $api_credentials_sk_legacy or
+ $api_credentials_openai_project or
+ $api_credentials_anthropic or
+ $key_certificate_content
+ )
+ or
+ // Contextual indicators: apply exclusions here.
+ (
+ not $python_imports and
+ not $python_type_hints and
+ not $function_definitions and
+ not $template_indicators and
+ not $documentation_env_setup and
+ not $documentation_config_hint and
+ not $documentation_env_var_hint and
+ not $markdown_export_example and
+ not $shell_var_reference and
+ not $negation_context and
+ not $security_doc_context and
+ (
+ // Specific credential file access
+ $credential_file_access or
+ // Hardcoded credential paths
+ $hardcoded_credential_paths or
+ // AI model API keys with actual values
+ $ai_model_credential_names or
+ // Credential theft actions
+ $credential_theft_actions or
+ // Network credential transfer
+ $network_credential_transfer or
+ // Environment variable theft
+ $env_var_theft or
+ // Exfiltration attempts
+ $leak_param or
+ // Credential dumping
+ $dump_creds or
+ // Base64 credential encoding
+ $base64_credential_encoding or
+ // WhatsApp exploit
+ $whatsapp_exploit
+ )
+ )
- ...
- )📝 Committable suggestion
‼️ IMPORTANT
Carefully review the code before committing. Ensure that it accurately replaces the highlighted code, contains no missing lines, and has no issues with indentation. Thoroughly test & benchmark the code to ensure it meets the requirements.
| not $python_imports and | |
| not $python_type_hints and | |
| not $function_definitions and | |
| not $template_indicators and | |
| not $documentation_env_setup and | |
| not $documentation_config_hint and | |
| not $documentation_env_var_hint and | |
| not $markdown_export_example and | |
| not $shell_var_reference and | |
| not $negation_context and | |
| not $security_doc_context and | |
| ( | |
| // Real API credentials (high confidence specific patterns) | |
| $api_credentials_aws or | |
| $api_credentials_github or | |
| $api_credentials_sk_legacy or | |
| $api_credentials_openai_project or | |
| $api_credentials_anthropic or | |
| // Actual private key content | |
| $key_certificate_content or | |
| // Specific credential file access | |
| $credential_file_access or | |
| // Hardcoded credential paths | |
| $hardcoded_credential_paths or | |
| // AI model API keys with actual values | |
| $ai_model_credential_names or | |
| // Credential theft actions | |
| $credential_theft_actions or | |
| // Network credential transfer | |
| $network_credential_transfer or | |
| // Environment variable theft | |
| $env_var_theft or | |
| // Exfiltration attempts | |
| $leak_param or | |
| // Credential dumping | |
| $dump_creds or | |
| // Base64 credential encoding | |
| $base64_credential_encoding or | |
| // WhatsApp exploit | |
| $whatsapp_exploit | |
| ) | |
| condition: | |
| // High-confidence secret material: always match. | |
| ( | |
| $api_credentials_aws or | |
| $api_credentials_github or | |
| $api_credentials_sk_legacy or | |
| $api_credentials_openai_project or | |
| $api_credentials_anthropic or | |
| $key_certificate_content | |
| ) | |
| or | |
| // Contextual indicators: apply exclusions here. | |
| ( | |
| not $python_imports and | |
| not $python_type_hints and | |
| not $function_definitions and | |
| not $template_indicators and | |
| not $documentation_env_setup and | |
| not $documentation_config_hint and | |
| not $documentation_env_var_hint and | |
| not $markdown_export_example and | |
| not $shell_var_reference and | |
| not $negation_context and | |
| not $security_doc_context and | |
| ( | |
| // Specific credential file access | |
| $credential_file_access or | |
| // Hardcoded credential paths | |
| $hardcoded_credential_paths or | |
| // AI model API keys with actual values | |
| $ai_model_credential_names or | |
| // Credential theft actions | |
| $credential_theft_actions or | |
| // Network credential transfer | |
| $network_credential_transfer or | |
| // Environment variable theft | |
| $env_var_theft or | |
| // Exfiltration attempts | |
| $leak_param or | |
| // Credential dumping | |
| $dump_creds or | |
| // Base64 credential encoding | |
| $base64_credential_encoding or | |
| // WhatsApp exploit | |
| $whatsapp_exploit | |
| ) | |
| ) |
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@crates/skill-audit/rules/cisco/credential_harvesting_generic.yara` around
lines 99 - 151, The current rule applies broad exclusions ($python_imports,
$python_type_hints, $function_definitions, $template_indicators,
$documentation_env_setup, $documentation_config_hint,
$documentation_env_var_hint, $markdown_export_example, $shell_var_reference,
$negation_context, $security_doc_context) before evaluating high-confidence
secret signatures (e.g. $api_credentials_aws, $api_credentials_github,
$api_credentials_sk_legacy, $api_credentials_openai_project,
$api_credentials_anthropic, $key_certificate_content, $credential_file_access,
$hardcoded_credential_paths, $ai_model_credential_names,
$credential_theft_actions, $network_credential_transfer, $env_var_theft,
$leak_param, $dump_creds, $base64_credential_encoding, $whatsapp_exploit);
change the condition precedence so the concrete secret group (the big OR block
of $api_* / $key_certificate_content / credential theft/exfiltration signatures)
is evaluated independently and allowed to match before applying the
documentation/template exclusions, or alternatively apply exclusions only to the
less-specific indicators (template/documentation markers) by splitting the rule
into two branches — one that returns a match for any high-confidence signature
regardless of doc/template exclusions and a second branch that applies
exclusions when matching heuristic/low-confidence indicators — ensuring
functions/patterns named above are referenced and not globally suppressed.
bb56548 to
5dbde92
Compare
- new `skill-audit` crate: scans a skill for poisoning / credential exfiltration / prompt injection before it is installed - scans SKILL.md plus every text file under the skill dir; .skill/.zip packages are unpacked (via the skill crate) and their resources scanned too - L1 static layer: YARA rules via yara-x (cisco rules vendored under Apache-2.0; clawhub rules rewritten to YARA under MIT) - aghub's own rules generalise real-world SafeSkills samples (credential-file exfil, download-pipe-exec, reverse shell, raw-IP, external-payload docs) - behavioral-lite: correlate a secret-reading source with a network sink across files into a data-flow chain finding (catches cross-file exfil that single-file rules miss, e.g. cisco's multi-file-exfiltration eval) - L2 layer: pure-Rust injection detector (zero-width / bidi chars, hidden HTML-comment instructions, prompt-override phrases) - AuditReport (Benign / Suspicious / Malicious) + suggested Action; verdict and policy kept as separate types; a scan error fails closed - yara-x pinned to git tag v1.17.0 (crates.io 1.14 sub-crates out of sync)
- run the skill-audit before installing a skill from --from - a Malicious verdict blocks install (exit code 2) unless --force-unsafe - the block propagates to main as an error, not exit() inside the handler - a Suspicious verdict prints findings to stderr but proceeds - audit output goes to stderr; stdout stays reserved for the skill JSON - end-to-end tests: block / force-unsafe / benign / root-script / .zip package
5dbde92 to
aebbae9
Compare
- code files (.py/.js/.ts) are scanned by ast-grep (AST), which does not false-positive on danger words sitting in comments / docstrings / string literals the way the regex YARA layer does - rules ported from apiiro/malicious-code-ruleset (MIT): dynamic-execution AST patterns (eval / exec / os.system / subprocess / pickle); regex and obfuscation patterns stay in the YARA layer where regex belongs - runs via `ast-grep scan --stdin --inline-rules` — rules bundled inline, no temp files; the binary is located via $AST_GREP_BIN or PATH - optional enhancement: if ast-grep is absent the scan is a no-op and YARA still covers the file (bundling the binary via build.rs is a follow-up) - new FindingSource::AstGrep; verified against cisco/SafeSkills samples — os.system, socket+dup2 reverse shell etc. matched with zero benign noise
Reverts 462575e. aghub's skill-audit targets skill poisoning specifically (credential exfiltration, prompt injection, supply-chain download lures, reverse shells), not generic code SAST. ast-grep's generic dynamic-execution patterns (os.system / eval / subprocess) also fire on legitimate skills that run commands, so they don't fit a skill-security tool. The intent-based layers already cover skill threats: YARA rules + L2 injection + behavioral-lite cross-file source->sink correlation (SafeSkills 6/6).
- run skills.sh install as clone → audit → install in one pass: clone once, audit each skill, gate on the worst verdict, and reuse the cached clone session for the force-install path - add audit DTOs (AuditReport / Finding / Verdict / Severity / Category / Confidence / FindingSource) plus the audit fields on the install request/response - register the audit DTOs in export-dto and regenerate the TS bindings
# Conflicts: # crates/api/src/routes/skills.rs
- engine: derive a 1-based line number for each finding from the earliest YARA match offset (previously always None) - api: git_scan audits each scanned skill in the cloned repo and returns the verdict on GitScanSkillEntry, so the import UI can surface it before install - fix: install_skill's GitCloneSession was missing scanned_skill_paths after the main merge
…tional ts-rs generated them as required `T | null`, so install callers that don't gate on the audit (deep link, base install) failed the desktop build. Mark both `#[ts(optional = nullable)]` — optional but still nullable.
What
Adds
skill-audit— an offline security audit that scans an agent skill for poisoning, credential exfiltration, and prompt injection before it's installed — and gatesaghub add skillon the verdict.How
yara-x: cisco rules (Apache-2.0, vendored as-is), clawhub rules (MIT, rewritten to YARA), and aghub's own rules generalised from real-world SafeSkills samples.AuditReport { verdict: Benign | Suspicious | Malicious, findings }+ a suggestedAction(policy.rs). The CLI blocks a Malicious verdict (exit 2) unless--force-unsafe; Suspicious prints findings and proceeds. Audit text → stderr, stdout stays the skill JSON. A scan error fails closed (surfaced as a finding, never Benign).Detection check
Against 6 real malicious skills from Sangfor SafeSkills:
6/6 flagged, benign cases stay clean.
Trade-offs to know
yara-xis pinned to git tag v1.17.0: crates.io is stuck at 1.14 with out-of-sync sub-crates that don't build. It pulls in wasmtime, so the first compile is heavy (CI caches it). TheCargo.lockdiff is wide because yara-x's constraints force a workspace-wide re-resolve — it couldn't be narrowed.verdict.rs::aggregateand the ciscoclassification="harmful" → Criticalmapping inyara.rs.Tests
skill-audit: 5 unit + 6 real-threatcli: 3 end-to-end (block / force-unsafe / benign)Summary by CodeRabbit
New Features
Detection
Tests
Documentation