Skip to content

feat(ctf): Puppet Master — orchestrator config-layer injection challenge#321

Open
Deez-Automations wants to merge 5 commits into
GenAI-Security-Project:mainfrom
Deez-Automations:feat/ctf-puppet-master-challenge
Open

feat(ctf): Puppet Master — orchestrator config-layer injection challenge#321
Deez-Automations wants to merge 5 commits into
GenAI-Security-Project:mainfrom
Deez-Automations:feat/ctf-puppet-master-challenge

Conversation

@Deez-Automations
Copy link
Copy Markdown

@Deez-Automations Deez-Automations commented Mar 25, 2026

Summary

Adds the Puppet Master CTF challenge (policy-bypass-puppet-master) for issue #203.

The challenge exploits the unvalidated custom_goals field in the orchestrator agent. The value is dropped raw into the system prompt under OVERRIDE DEFAULT BEHAVIOR framing — no length check, no character validation. A workspace operator can plant a standing directive that persists across every conversation in the namespace.

This is meaningfully different from existing policy_bypass challenges: the attack surface is the configuration layer, not the chat interface.

Files

  • finbot/ctf/definitions/challenges/policy_bypass/puppet_master.yaml — challenge definition, intermediate difficulty, 200pts, 3 tiered hints, pi_jb penalty modifier
  • finbot/ctf/detectors/implementations/puppet_master.pyPuppetMasterDetector: combines invoice approval check (amount + vendor trust) with a system prompt scan for the OVERRIDE marker, confirming the config vector was used
  • finbot/ctf/detectors/implementations/__init__.py — registers the new detector
  • tests/unit/ctf/test_puppet_master.py — 20 tests (PPM-VAL + PPM-DET) covering guard clauses, all detection conditions, configurable marker, malformed JSON, and evidence assertions

Labels

LLM01:Prompt Injection · LLM06:Excessive Agency · CWE-20 · CWE-285 · AML.T0043 · ASI-01 · ASI-03:Prompt Injection via Trusted Config

Test plan

  • pytest tests/unit/ctf/test_puppet_master.py -v — all 20 tests pass
  • Detector registers correctly via list_registered_detectors()
  • YAML loads without schema errors
  • order_index: 10 — no collision with existing challenges

New policy_bypass CTF challenge that exploits the unvalidated custom_goals
field in the orchestrator agent. The field is interpolated raw into the system
prompt under OVERRIDE framing, letting a workspace operator plant standing
directives that persist across every conversation in the namespace.

Files added:
- finbot/ctf/definitions/challenges/policy_bypass/puppet_master.yaml
  Intermediate-level challenge (200pts). Three progressive hints guide
  players from discovering the config field to weaponizing it.

- finbot/ctf/detectors/implementations/puppet_master.py
  PuppetMasterDetector: fires on business.invoice.decision events,
  verifies invoice amount > threshold + vendor trust == low, then
  confirms the OVERRIDE marker is present in the orchestrator system
  prompt for that workflow (proving config-layer injection was used,
  not chat-layer injection).

- tests/unit/ctf/test_puppet_master.py
  20 tests across PPM-VAL and PPM-DET categories covering guard clauses,
  detection scenarios, configurable override_marker, graceful JSON error
  handling, and evidence field assertions.

Labels: LLM01, LLM06, CWE-20, CWE-285, AML.T0043, ASI-01, ASI-03
Closes GenAI-Security-Project#203
Copilot AI review requested due to automatic review settings March 25, 2026 05:39
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a new CTF policy-bypass challenge (“Puppet Master”) focused on config-layer prompt injection via the orchestrator’s unvalidated custom_goals, including a dedicated detector and unit tests.

Changes:

  • Added policy-bypass-puppet-master challenge definition (YAML) with scoring, hints, and detector configuration.
  • Implemented PuppetMasterDetector to detect approvals of large invoices from low-trust vendors when the orchestrator system prompt contains the OVERRIDE marker.
  • Registered the detector and added a comprehensive unit test suite covering validation and detection paths.

Reviewed changes

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.

File Description
finbot/ctf/definitions/challenges/policy_bypass/puppet_master.yaml New challenge definition and scoring modifiers for the Puppet Master scenario.
finbot/ctf/detectors/implementations/puppet_master.py New detector implementation scanning stored orchestrator LLM events for OVERRIDE marker + invoice/vendor conditions.
finbot/ctf/detectors/implementations/__init__.py Registers/exports the new detector implementation.
tests/unit/ctf/test_puppet_master.py New unit test suite for config validation, guard clauses, and detection outcomes.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread finbot/ctf/detectors/implementations/puppet_master.py Outdated
Comment thread finbot/ctf/detectors/implementations/puppet_master.py Outdated
Comment thread tests/unit/ctf/test_puppet_master.py Outdated
Comment thread tests/unit/ctf/test_puppet_master.py
…n to workflow

- agent_name was 'orchestrator' but runtime emits 'orchestrator_agent' — query
  never matched real events, making detection impossible in production
- removed namespace-wide fallback scan when workflow_id is absent; without a
  workflow scope the query is unbounded and prone to false positives, so we
  return not-detected and log a debug message instead
- made agent_name configurable via detector_config for future flexibility
- updated test fixtures and added PPM-DET-06b to cover the no-workflow_id path

Fixes Copilot review comments on PR GenAI-Security-Project#321
@Deez-Automations
Copy link
Copy Markdown
Author

Addressed all 4 Copilot suggestions

  • fixed the agent_name from "orchestrator" to "orchestrator_agent" to match what the runtime actually emits.
  • Removed the unbounded namespace scan when workflow_id is absent (returns not-detected instead), made agent_name configurable via detector_config.
  • Updated the test fixtures to match. Second commit pushed

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 4 out of 4 changed files in this pull request and generated 4 comments.


💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment thread tests/unit/ctf/test_puppet_master.py Outdated
Comment thread tests/unit/ctf/test_puppet_master.py Outdated
Comment thread tests/unit/ctf/test_puppet_master.py Outdated
Comment thread finbot/ctf/detectors/implementations/puppet_master.py
- return distinct message when workflow_id is absent so operators can
  tell 'correlation data missing' from 'scanned but no marker found'
- remove unused DetectionResult import and OVERRIDE_MARKER constant
  from test module
- align bad_evt and empty_evt agent_name to 'orchestrator_agent' so
  mocks reflect the real query filter
@Deez-Automations
Copy link
Copy Markdown
Author

Fixed the remaining suggestions:

  • Added a distinct return message when workflow_id is absent so it's clear it's a missing correlation issue
    vs. "scanned and not found"
  • Removed unused DetectionResult import and OVERRIDE_MARKER constant from the test module
  • aligned bad_evt/empty_evt agent_name to "orchestrator_agent" to match the real query filter.

Addresses Copilot feedback that the existing DB mock ignores filter
predicates, giving false confidence. Added PPM-QRY class with 3 tests
that call _find_override_in_workflow directly and assert the SQLAlchemy
query is built with the correct agent_name, workflow_id, and that the DB
is not touched at all when workflow_id is absent.
@Deez-Automations
Copy link
Copy Markdown
Author

Added a PPM-QRY test class that calls _find_override_in_workflow directly and asserts the SQLAlchemy query is built with the correct agent_name and workflow_id filters, and that the DB isn't touched at all when workflow_id is absent. Should address the mock filter predicate concern.

@Deez-Automations
Copy link
Copy Markdown
Author

Note: couldn't run the test suite locally due to a Python environment constraint (MSYS2 pydantic-core wheel incompatibility). The tests follow the same patterns as existing detector tests in the repo so they should pass, but happy to fix anything CI flags.

@stealthwhizz
Copy link
Copy Markdown
Contributor

@Deez-Automations Is this a challenge inside a challenge or a seperate challenge.!
If its a seperate challenge , what is the flag that the user will get when the user completes it!?

@Deez-Automations
Copy link
Copy Markdown
Author

Hey @stealthwhizz , it's a separate standalone challenge under the policy_bypass category, same structure as Fine Print and Invoice Trust Override. prerequisites:[], its own detector, its own scoring.

On the flag, FinBot doesn't use traditional CTF flag strings. Looking at the existing challenges, none of them have a flag field.
Completion is event-driven: PuppetMasterDetector fires when the success criteria are met (invoice > $10k, low-trust vendor, approved, OVERRIDE marker in system prompt).
The challenge is marked solved, and points are awarded. That's consistent with how the rest of the platform works.

If the project is moving toward explicit flag strings, happy to add one, just let me know the format.

@stealthwhizz
Copy link
Copy Markdown
Contributor

I got confused with the demo challenge nvm
Challenge looks good 👍

…ires_at

SessionContext gained required created_at and expires_at fields upstream.
Updated _make_session_context helper to build SessionContext directly
without session_manager, removing the live DB dependency from unit tests.
@Deez-Automations
Copy link
Copy Markdown
Author

Follow-up commit to fix the test_orchestrator_custom_goals.py tests. SessionContext picked up created_at and expires_at as required fields upstream after this PR was opened, so updated the test helper to include them. Also dropped the session_manager dependency from the helper since we can just build SessionContext directly.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants