Skip to content

Fix scan_directory sandbox validation#159

Merged
gautammanak1 merged 5 commits into
fetchai:mainfrom
Anshika0998:fix-scan-directory-sandbox
Jun 21, 2026
Merged

Fix scan_directory sandbox validation#159
gautammanak1 merged 5 commits into
fetchai:mainfrom
Anshika0998:fix-scan-directory-sandbox

Conversation

@Anshika0998

Copy link
Copy Markdown
Contributor

Summary

This PR fixes sandbox validation issues for scan_directory operations and strengthens path enforcement across the workflow.

Changes

  • Added planning-time validation for scan_directory paths in the orchestrator policy.
  • Updated connector path validation to use Path.relative_to() instead of string prefix matching.
  • Restricted scan operations to the demo sandbox directory.
  • Ensured generated scan plans use the demo sandbox path.
  • Updated workflow behavior to prevent scanning outside the allowed sandbox.

Testing

  • Ran the full test suite locally.
  • All tests passing:

68 passed in 10.73s
closes #112

@github-actions github-actions Bot added bug Something isn't working gssoc26 GirlScript Summer of Code 2026 contribution level2 GSSoC level 2 - intermediate (medium points) labels Jun 16, 2026
@gautammanak1 gautammanak1 self-requested a review June 17, 2026 16:54
@gautammanak1 gautammanak1 added the gssoc:approved GSSoC validation: approved by Project Admin label Jun 17, 2026
@gautammanak1

Copy link
Copy Markdown
Collaborator

Review: Fix scan_directory sandbox validation

Summary

This PR strengthens sandbox validation for scan_directory operations by adding planning-time checks in the orchestrator policy and improving path validation logic. The changes address path traversal risks by using Path.relative_to() instead of string prefix matching, and enforce that scan_directory operations are restricted to the demo sandbox directory. Overall risk level: low — this is a defensive security improvement.

What I liked

  • The use of Path.relative_to() in connector/policy.py:73 is the correct approach for path containment checks — it properly handles edge cases like ../demo_projects that string prefix matching would miss.
  • The defense-in-depth strategy is solid: validation now happens at both planning time (orchestrator/policy.py) and execution time (connector/policy.py), with path enforcement in the planner (orchestrator/planner.py:345-348).
  • The workflow simplification in weekly_report.py — removing the unused params["path"] handling — makes the intent clearer: scans always use the demo sandbox.

Security

  • orchestrator/policy.py:82-88 — The new planning-time validation for scan_directory correctly uses Path.relative_to() to reject paths outside the demo sandbox. This blocks malicious or malformed paths like /etc/passwd or ../../sensitive before execution.
  • connector/policy.py:73 — The execution-time validation also uses Path.relative_to(), providing a second layer of defense. The dual validation (planning + execution) is good practice.
  • orchestrator/planner.py:345-348 — The planner now explicitly overrides any LLM-generated path for scan_directory to "./demo_projects". This prevents an LLM hallucination from bypassing the sandbox.

Correctness / logic

  • orchestrator/policy.py:82-88 — The indentation issue on lines 82-88 uses spaces instead of Python's standard 4-space indentation. While this works, it's inconsistent with the rest of the file and could cause issues in strict linting environments.
  • orchestrator/policy.py:82 — The raw_path variable is declared but only used if non-None. The logic is correct, but the variable could be inlined for clarity: if (raw_path := step.params.get("path")):.

Performance

  • No performance concerns. The path resolution and relative_to() calls are O(1) and only run on planning/execution boundaries, not in hot loops.

Code quality

  • orchestrator/policy.py:82-88 — Inconsistent indentation (appears to use 1-2 spaces instead of 4). Consider running black or ruff to standardize.
  • connector/policy.py:73 — The try/except ValueError pattern for relative_to() is correct and idiomatic.

Tests

  • The PR description mentions 68 passing tests. Given the security-sensitive nature of path validation, consider adding explicit tests for:
    • Paths with ../ escape attempts
    • Absolute paths outside the sandbox
    • Symlink-based escapes (if supported)
    • Empty or missing path parameters

Suggested fixes

  1. Fix indentation in orchestrator/policy.py:82-88:
if step.action == "scan_directory":
    raw_path = step.params.get("path")

    if raw_path:
        demo_dir = Path("./demo_projects").resolve()
        requested = Path(raw_path).expanduser().resolve()

        try:
            requested.relative_to(demo_dir)
        except ValueError:
            logger.warning("Path '%s' outside demo sandbox", requested)
            return RejectionReason.PATH_NOT_ALLOWED
  1. Optional: Use walrus operator for conciseness:
if step.action == "scan_directory":
    if (raw_path := step.params.get("path")):
        demo_dir = Path("./demo_projects").resolve()
        requested = Path(raw_path).expanduser().resolve()

        try:
            requested.relative_to(demo_dir)
        except ValueError:
            logger.warning("Path '%s' outside demo sandbox", requested)
            return RejectionReason.PATH_NOT_ALLOWED

Questions for the author

  • The orchestrator/policy.py validation hardcodes "./demo_projects" instead of reading from DEMO_PROJECTS_DIR. Is this intentional? The connector policy reads from the env var, so there's a slight inconsistency.

Overall: This is a clean, focused security improvement. The path validation logic is sound, and the defense-in-depth approach is well-executed. The only real issue is the indentation inconsistency, which is minor. Nice work closing #112!

@gautammanak1

gautammanak1 commented Jun 17, 2026

Copy link
Copy Markdown
Collaborator

(could not anchor to openclaw/fetchai-openclaw-orchestrator/orchestrator/policy.py:82 — see below)

Inconsistent indentation — lines 82-88 use 1-2 spaces instead of the file's standard 4-space indentation. Consider running black or ruff format to standardize.

@Anshika0998

Copy link
Copy Markdown
Contributor Author

Addressed the review feedback and pushed the changes in commit 039f613. The policy validation logic has been updated accordingly. Please take another look when you get a chance. Thanks!

@Anshika0998 Anshika0998 force-pushed the fix-scan-directory-sandbox branch from fddc7ee to c979a29 Compare June 20, 2026 09:56
@Anshika0998

Copy link
Copy Markdown
Contributor Author

Hi @gautammanak1,

I've addressed all the review feedback, added the required changelog entries, and updated the branch with the latest changes from main. The checks are passing on my side.

Could you please take a final look when you have a chance? Thank you for the review and feedback.

@gautammanak1 gautammanak1 merged commit 71f35a5 into fetchai:main Jun 21, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

bug Something isn't working gssoc:approved GSSoC validation: approved by Project Admin gssoc26 GirlScript Summer of Code 2026 contribution level2 GSSoC level 2 - intermediate (medium points)

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Bug: OpenClaw scan_directory allows LLM-steered paths outside demo sandbox

2 participants