DeepExtract Agent Analysis Runtime is an AI-driven binary analysis toolkit that operates on top of DeepExtractIDA extraction outputs. It transforms per-binary SQLite databases, decompiled C++ source files, and JSON metadata into a queryable runtime with slash commands, specialized subagents, analysis skills, lifecycle hooks, and a shared helper library.
AI coding agents (Cursor, Claude Code, Codex) parse source code repositories effectively: they resolve imports, follow type definitions, and navigate call hierarchies through language servers and syntax trees. Compiled binaries present a structural gap. The cross-references, PE metadata, assembly instructions, and stack frame layouts required for binary analysis are locked inside reverse engineering frameworks and are inaccessible to these agents through their native code navigation tools. Decompiled C++ output compounds the problem: it consists of isolated function definitions grouped into flat .cpp files with no project structure, no #include headers, and no shared type definitions. Standard code indexing mechanisms (LSPs, Tree-sitter parsers, embedding-based search) fail to resolve cross-references across these files, forcing the agent to fall back to unreliable text search for callgraph traversal.
DeepExtractIDA addresses the extraction side by running a deterministic pipeline through IDA Pro 9.x and the Hex-Rays decompiler, producing structured SQLite databases with full function records, cross-reference tables, PE metadata, and JSON indexes. The Agent Analysis Runtime addresses the consumption side: it provides deterministic Python scripts that query those databases directly, replacing semantic search with structured tool invocation. The agent invokes a skill script through its shell tool, the script queries the database, and the agent reasons on the structured result. Large payloads (function bodies, callgraph data, scan results) remain on disk in workspace directories; the agent operates on compact summaries and loads full data on demand.
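The shell-tool round trip described above can be sketched as follows. The wrapper and the stand-in script are illustrative only; a real call would target a script under `.claude/skills/<skill-id>/scripts/` (path hypothetical):

```python
import json
import subprocess
import sys

def run_skill(cmd):
    """Invoke a skill script with --json and parse the compact summary it
    prints to stdout -- the shape of the shell-tool round trip described above."""
    proc = subprocess.run(cmd + ["--json"], capture_output=True, text=True, check=True)
    return json.loads(proc.stdout)

# Stand-in for a real skill script: echoes a small JSON summary to stdout.
stub = "import json, sys; print(json.dumps({'ok': True, 'argv': sys.argv[1:]}))"
result = run_skill([sys.executable, "-c", stub])
print(result)
```

The agent reasons over `result`; any large payload the script produces stays in a workspace file referenced by path.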
The runtime deploys as a .claude/ directory alongside the extraction data and operates across Claude Code, Cursor, Codex, and any AI coding environment that supports CLAUDE.md or equivalent agent configuration.
New here? Start with the Onboarding Guide.
The runtime is organized into five layers. Each layer depends only on the layers below it.
```
User
  |
  v
Slash Commands (/triage, /audit, /scan, ...)
  |
  v
Specialized Agents (re-analyst, security-auditor, ...)
  |
  v
Analysis Skills (callgraph-tracer, map-attack-surface, ...)
  |
  v
Shared Helper Library (DB access, function resolution, caching, ...)
  |
  v
Data: Analysis DBs (SQLite) + JSON Metadata + Decompiled C++
```
Execution proceeds through three stages:
Stage 1: Session Initialization. The sessionStart hook scans the extraction output directory, reads skill and agent registries, resolves module databases, and injects a compact workspace context table into the agent session. Context injection uses progressive disclosure: module summaries and registry frontmatter load at session start; full skill instructions and function data load only when the agent activates a specific workflow.
Stage 2: Command Dispatch. The user issues a slash command (for example, /triage appinfo.dll). The agent reads the corresponding command definition (a Markdown file with step-by-step instructions), then executes the prescribed sequence of skill scripts and subagent delegations. Each skill script queries the analysis databases through the helper library and returns structured JSON or writes results to a workspace directory. Subagents run in isolated context windows, absorbing the cost of large code payloads and returning only their conclusions to the parent agent.
Stage 3: Result Synthesis. The agent synthesizes outputs from multiple skills and subagents into a consolidated report. For multi-step workflows, intermediate results are written to run directories under .claude/workspace/ with a manifest.json tracking each step. This workspace handoff pattern keeps large payloads out of the agent's context window and prevents reasoning degradation across complex analysis pipelines.
- **Command**: A user-facing slash command defined in a Markdown file under `commands/`. Commands orchestrate agents and skills, specifying which scripts to run, in what order, and how to synthesize the results. Commands range from lightweight single-skill lookups (`/explain`, `/xref`) to multi-phase analysis pipelines (`/full-report`, `/scan`).
- **Agent**: A specialized subagent that runs in its own context window. Agents are defined in `agents/` and registered in `agents/registry.json`. Some agents execute Python entry scripts (for example, the triage-coordinator runs `analyze_module.py`); others operate as LLM-only subagents with no scripts, relying on skill-prepared context and their own reasoning (for example, the memory-corruption-scanner).
- **Skill**: A reusable analysis pipeline consisting of a `SKILL.md` descriptor and one or more Python scripts under `skills/<skill-id>/scripts/`. Skills perform the actual data retrieval and computation: querying databases, building call graphs, scanning for vulnerability patterns, reconstructing types. Each script supports `--json` for machine-readable output.
- **Helper**: A shared Python module under `helpers/`. Helpers own all database access, function resolution, API classification, caching, error handling, JSON output formatting, and workspace I/O. Every skill imports from the same library. No script reimplements database queries or output formatting.
- **Hook**: A lifecycle script triggered by the host IDE at specific events. Hooks are configured in `hooks.json` and execute Python scripts under `hooks/`. The runtime uses hooks for session context injection, iterative task continuation, and workspace cleanup.
- **Workspace Handoff**: The pattern used by multi-step workflows to keep large payloads out of the agent context. Run directories under `.claude/workspace/` store per-step `results.json` and `summary.json` files alongside a `manifest.json` that tracks step completion. The agent coordinates using summaries and file paths, not by holding full data in its context window.
- **Grind Loop**: A batch processing mechanism for iterative workflows. The agent writes a Markdown scratchpad with checkbox items. When the agent's turn ends, the `stop` hook checks for unchecked items and re-invokes the agent to continue, bounded by a configurable iteration limit. Used by commands that process multiple functions or phases (`/batch-audit`, `/scan`, `/full-report`).
- **Pipeline**: A headless batch execution mode defined by YAML configuration files. Pipelines specify a sequence of analysis steps (triage, security scan, type reconstruction) to run across one or more modules without interactive input. The `/pipeline` command and `pipeline_cli.py` provide interactive and CLI access respectively.
The headless batch extractor in DeepExtractIDA writes two bootstrap files (AGENTS.md and CLAUDE.md) into the extraction output directory. These files contain the full installation procedure and are recognized automatically by AI coding agents.
Install:
- Open the extraction output directory (the `StorageDir` passed to `headless_batch_extractor.ps1`) as a project in Cursor, Claude Code, or Codex.
- Type `install DeepExtractRuntime` in the agent chat.
The agent reads the bootstrap instructions and executes the setup automatically: cloning the DeepExtractRuntime repository into .claude/, creating the .claude symlink for Claude Code, copying hooks.json and rule files into .cursor/ for Cursor, and verifying the installation.
Update:
Type `update DeepExtractRuntime` in the agent chat. The agent pulls the latest changes into `.claude/` and re-copies hooks and rules.
Bootstrap Templates:
Example bootstrap files are available in the bootstrap/ directory. bootstrap/AGENTS.md targets Cursor and Codex; bootstrap/CLAUDE.md targets Claude Code. These are the templates that the headless batch extractor writes into each extraction output directory.
Installed Workspace Layout:
```
<extraction_output_root>/
  AGENTS.md                  Bootstrap instructions (written by extractor)
  CLAUDE.md                  Claude Code bootstrap pointer
  extraction_report.json     Batch extraction provenance and status
  logs/                      Extractor and symbol resolution logs
  idb_cache/                 Optional cached IDA databases
  extracted_code/
    <module>/
      *.cpp                  Grouped decompiled functions
      file_info.json         PE metadata and analysis report
      function_index.json    Function-to-file index with library tags
      module_profile.json    Pre-computed module fingerprint
      reports/               Generated analysis reports
  extracted_dbs/
    analyzed_files.db        Tracking database (module index)
    <module>_<hash>.db       Per-module analysis database (read-only)
  .claude/                   Installed DeepExtractRuntime
    commands/                Slash command definitions
    agents/                  Subagent definitions and entry scripts
    skills/                  Analysis skills with Python scripts
    helpers/                 Shared Python library
    hooks/                   Lifecycle hook scripts
    rules/                   Behavioral convention rules
    config/
      defaults.json          Runtime configuration
      assets/                COM, RPC, WinRT, and misc ground-truth data
      pipelines/             YAML pipeline definitions
    cache/                   Cached analysis outputs
    workspace/               Run directories for multi-step workflows
    tests/                   Test suite
    docs/                    Documentation
  .cursor/                   Cursor IDE integration (created by bootstrap)
    hooks.json               Copy of .claude/hooks.json
    rules/                   Copies of .claude/rules/ with .mdc extension
```
In this source checkout, the runtime content lives at repository root. When installed, that source tree becomes .claude/.
Verify the installation:
/health
Validates that extraction data, databases, and runtime infrastructure are present and functional.
Triage a module:
/triage appinfo.dll
Classifies every function, discovers entry points, maps the attack surface, and generates a summary report. This is the recommended first step for any module.
Explain a function:
/explain appinfo.dll AiLaunchProcess
Produces a structured explanation: purpose, parameters, return value, called APIs, cross-references, and security implications.
Audit a function:
/audit appinfo.dll AiLaunchProcess
Builds a security dossier with attack reachability, dangerous API mapping, data flow exposure, resource patterns, and risk assessment.
Scan a module for vulnerabilities:
/scan appinfo.dll
Runs unified memory corruption, logic vulnerability, and taint analysis scanners with independent skeptic verification and exploitability scoring.
The following table summarizes the analysis operations the runtime provides on top of the extraction data produced by DeepExtractIDA.
| Category | Operations |
|---|---|
| Module Triage | Function classification across multiple categories, entry point discovery, attack surface ranking by callgraph reachability |
| Call Graph Analysis | Forward and backward traversal, cross-module resolution, topology analysis (SCCs, hubs, roots, leaves), path queries, Mermaid diagram generation |
| IPC Analysis | RPC procedure enumeration with client correlation, COM server mapping with SDDL permission parsing, WinRT activation server analysis, privilege boundary auditing across all three IPC mechanisms |
| AI Vulnerability Scanning | Memory corruption scanning (buffer overflows, integer issues, use-after-free), logic vulnerability scanning (auth bypass, TOCTOU, confused deputy), taint analysis (entry point to dangerous sink tracing with trust boundary detection), each with independent skeptic verification |
| Security Auditing | Per-function security dossiers, attack reachability verification, dangerous API mapping, batch auditing of top-ranked entry points |
| Code Lifting | Batch lifting of decompiled functions into clean C++ with shared struct definitions, constant maps, and dependency ordering across class methods |
| Type Reconstruction | Struct and class inference from assembly memory access patterns, vtable reconstruction, COM interface reconstruction, compilable C++ header generation with per-field confidence annotations |
| PE Analysis | Import and export resolution across modules, dependency graphs, forwarded export chain resolution, cross-module consumer mapping |
| Batch Processing | YAML pipeline definitions for headless execution across multiple modules, parallel module processing, cross-module result aggregation |
| Finding Management | Finding persistence with SQLite-backed store, cross-report comparison (new, recurring, missed), cross-module prioritization by exploitability, reachability, and impact |
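The attack-surface ranking and call graph operations above both reduce to traversal over caller-callee edges of the kind stored in the extraction databases. A minimal backward-reachability sketch (edge data invented for illustration):

```python
from collections import defaultdict, deque

# Illustrative caller->callee edges; the runtime reads these from the
# function_xrefs table rather than hardcoding them.
edges = [("ServiceMain", "Dispatch"), ("Dispatch", "ParseRequest"),
         ("ParseRequest", "memcpy"), ("Logger", "memcpy")]

callers = defaultdict(set)
for caller, callee in edges:
    callers[callee].add(caller)

def reachable_from_callers(target):
    """BFS upward: every function from which `target` is reachable."""
    seen, queue = set(), deque([target])
    while queue:
        fn = queue.popleft()
        for c in callers[fn]:
            if c not in seen:
                seen.add(c)
                queue.append(c)
    return seen

print(sorted(reachable_from_callers("memcpy")))
```

Forward traversal is the same walk over a callee-keyed map, and ranking entry points by how many dangerous sinks they reach is a repeated application of this query.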
The runtime operates on extraction outputs produced by DeepExtractIDA. Each analyzed binary produces:
- SQLite analysis database (`extracted_dbs/<module>_<hash>.db`) containing three tables: `file_info` (binary-level metadata, PE headers, security features), `functions` (per-function decompiled code, assembly, cross-references, strings, dangerous APIs, loop analysis, stack frames), and `function_xrefs` (deduplicated caller-callee edges for SQL-based callgraph queries).
- Grouped C++ source files (`extracted_code/<module>/*.cpp`) containing decompiled functions organized by class and namespace, sized to fit within LLM context windows.
- JSON metadata: `function_index.json` (function-to-file mapping with library tags), `module_profile.json` (pre-computed module fingerprint covering scale, library composition, API surface, complexity), and `file_info.json` (PE metadata and analysis report).
A typical Windows DLL contains 30 to 60 percent library boilerplate: C++ runtime support, Windows Implementation Library (WIL) helpers, Windows Runtime (WRL) template instantiations, STL internals, and ETW tracing stubs. The runtime filters these functions automatically using the library classification in function_index.json, allowing every skill, agent, and command to focus on application-specific logic by default.
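The filtering idea can be sketched as below. The `function_index.json` shape here is simplified for illustration; the real layout is documented in the Function Index Format Reference:

```python
import json

# Hypothetical, simplified function_index.json content: a null library tag
# marks application-specific code, a non-null tag marks library boilerplate.
index = json.loads("""
{
  "AiLaunchProcess": {"file": "appinfo_01.cpp", "library": null},
  "std::vector<int>::push_back": {"file": "appinfo_stl.cpp", "library": "stl"},
  "wil::ResultFromException": {"file": "appinfo_wil.cpp", "library": "wil"}
}
""")

# Default behavior: drop library boilerplate, keep application logic.
app_funcs = [name for name, meta in index.items() if not meta["library"]]
print(app_funcs)
```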
All analysis databases are treated as read-only. Helper-mediated connections enforce PRAGMA query_only = ON.
See the DeepExtractIDA README for full extraction capabilities and database schema details.
The runtime ships slash commands under commands/, organized by analysis category. The live command set is defined in commands/registry.json.
| Command | Purpose |
|---|---|
| `/triage <module> [--with-security]` | Module orientation: identity, classification, call graph, attack surface, optional quick taint pass |
| `/full-report <module> [--brief]` | End-to-end multi-phase analysis: RE report, classification, attack surface, topology, specialized analysis |
| `/compare-modules <A> <B> [C ...] \| --all` | Cross-module comparison: dependencies, API overlap, classification distributions |
| Command | Purpose |
|---|---|
| `/explain [module] <function> [--depth N]` | Structured explanation of a function: purpose, parameters, APIs, call context |
| `/search [module] <term> [--dimensions ...]` | Cross-dimensional search: function names, signatures, strings, APIs, classes, exports |
| `/xref [module] <function> [--depth N]` | Cross-reference lookup: callers and callees in compact tables |
| Command | Purpose |
|---|---|
| `/lift-class [module] <class>` | Batch-lift all methods of a C++ class with shared struct context |
| `/reconstruct-types <module> [class] [--validate]` | Reconstruct C/C++ struct and class definitions from memory access patterns |
| Command | Purpose |
|---|---|
| `/audit [module] <function> [--diagram]` | Security audit: dossier, verification, call chain, risk assessment |
| `/batch-audit <module> [--top N] [--privilege-boundary]` | Batch audit of top-ranked entry points or privilege-boundary handlers |
| `/taint <module> [function] [--from-entrypoints]` | AI-driven taint analysis from entry points to dangerous sinks |
| Command | Purpose |
|---|---|
| `/scan <module> [--memory-only\|--logic-only\|--taint-only]` | Unified vulnerability scan: memory, logic, and taint with verification |
| `/memory-scan <module> [function]` | AI-driven memory corruption scan: buffer overflows, integer issues, UAF |
| `/ai-logical-bug-scan <module> [function]` | AI-driven logic scan: auth bypass, state errors, TOCTOU, confused deputy |
| Command | Purpose |
|---|---|
| `/hunt-plan [mode] [module] [target]` | VR campaign planning, hypothesis testing, cross-module research, re-planning |
| `/hunt-execute [module] [--plan-file <path>]` | Execute a hunt plan: run commands, collect evidence, score confidence |
| Command | Purpose |
|---|---|
| `/callgraph <module> [function] [--stats\|--scc\|--path A B]` | Call graph queries: topology, SCCs, hubs, roots, leaves, path finding, diagrams |
| `/imports [module] [--function name] [--consumers]` | PE import/export relationships, dependency graphs, forwarder chains |
| Command | Purpose |
|---|---|
| `/rpc <module> \| surface \| audit \| trace \| clients \| topology` | RPC interface enumeration, attack surface, audit, trace, clients, topology |
| `/winrt <module> \| surface \| methods \| audit \| privesc` | WinRT server enumeration, attack surface, methods, audit, EoP targets |
| `/com <module_or_clsid> \| surface \| methods \| audit \| privesc` | COM server enumeration, attack surface, audit (permissions, elevation, DCOM), EoP targets |
| Command | Purpose |
|---|---|
| `/diff <module_old> <module_new>` | Compare two module versions: function deltas, classification shifts, code diffs |
| `/prioritize [--modules A B C \| --all]` | Cross-module finding prioritization by exploitability, reachability, impact |
| `/compare-scans <module> [--type logic\|memory\|taint]` | Compare findings across AI scan reports: recurring, new, missed, severity changes |
| Command | Purpose |
|---|---|
| `/health [--quick\|--full]` | Pre-flight workspace validation: extraction data, DBs, skills, config |
| `/cache-manage stats\|clear\|refresh\|purge-runs` | Cache and workspace run management |
| `/runs list\|show\|latest` | List, inspect, and reopen prior workspace runs |
| `/pipeline run <yaml> [--dry-run] \| validate \| list-steps` | Run or validate headless batch analysis pipelines |
See commands/README.md for the full command catalog.
The runtime ships specialized subagents under agents/. The live agent set is defined in agents/registry.json. Agents divide into two categories: script-backed agents that execute Python entry scripts, and LLM-only agents that operate purely through prepared context and model reasoning.
Script-backed agents:
| Agent | Type | Purpose | Entry Scripts |
|---|---|---|---|
| `re-analyst` | analyst | Explain and analyze decompiled functions using IDA domain knowledge | `re_query.py`, `explain_function.py` |
| `triage-coordinator` | coordinator | Orchestrate multi-skill analysis workflows for module triage, security, and full analysis | `analyze_module.py`, `generate_analysis_plan.py` |
| `security-auditor` | analyst | Vulnerability scanning, exploitability analysis, finding verification | `run_security_scan.py` |
| `code-lifter` | lifter | Lift related function groups with shared struct context across methods | `batch_extract.py`, `track_shared_state.py` |
| `type-reconstructor` | reconstructor | Reconstruct C/C++ struct and class definitions from memory access patterns | `reconstruct_all.py`, `merge_evidence.py`, `validate_layout.py` |
LLM-only agents:
| Agent | Type | Purpose |
|---|---|---|
| `memory-corruption-scanner` | analyst | AI-driven memory corruption scanning with callgraph navigation and adversarial prompting |
| `logic-scanner` | analyst | AI-driven logic vulnerability scanning (auth bypass, state confusion, TOCTOU) |
| `taint-scanner` | analyst | AI-driven taint analysis with cross-module data flow tracing and trust boundary detection |
LLM-only agents receive skill-prepared context (threat models, callgraph JSON, preloaded function code) and navigate the analysis space through their own reasoning. Each uses a mandatory skeptic verification pass before reporting findings.
See agents/README.md for the full agent architecture and decision table.
The runtime ships analysis skills under skills/. Each skill consists of a SKILL.md descriptor and Python scripts under scripts/. The live skill set is defined in skills/registry.json.
Foundation and Indexing:
| Skill | Type | Purpose |
|---|---|---|
| `decompiled-code-extractor` | foundation | Extract function data from analysis DBs: decompiled code, assembly, xrefs, signatures, strings, vtable contexts |
| `function-index` | index | Fast function-to-file resolution and library-tag filtering via `function_index.json` |
Analysis:
| Skill | Type | Purpose |
|---|---|---|
| `callgraph-tracer` | analysis | Build and query call graphs, trace execution paths, cross-module chain traversal |
| `classify-functions` | analysis | Classify every function by purpose (file I/O, registry, crypto, security) and interest score |
| `import-export-resolver` | analysis | PE-level import/export resolution across modules, dependency graphs, forwarder chains |
Reconstruction:
| Skill | Type | Purpose |
|---|---|---|
| `reconstruct-types` | reconstruction | Reconstruct C/C++ struct and class layouts from assembly memory access patterns |
| `com-interface-reconstruction` | reconstruction | Reconstruct COM/WRL interface definitions from vtable patterns and mangled names |
| `batch-lift` | code_generation | Lift related function groups with shared struct definitions and dependency ordering |
Security:
| Skill | Type | Purpose |
|---|---|---|
| `map-attack-surface` | security | Discover entry points (exports, COM, RPC, WinRT, callbacks) and rank by attack value |
| `security-dossier` | security | Build pre-audit dossiers: identity, reachability, dangerous ops, data exposure, complexity |
| `ai-memory-corruption-scanner` | security | LLM-driven memory corruption scanning with adversarial prompting and skeptic verification |
| `ai-logic-scanner` | security | LLM-driven logic vulnerability scanning with callgraph navigation |
| `ai-taint-scanner` | security | LLM-driven taint tracing from entry points to dangerous sinks with trust boundary analysis |
| `rpc-interface-analysis` | security | RPC interface enumeration, surface mapping, audit, chain tracing, client correlation, topology |
| `winrt-interface-analysis` | security | WinRT server analysis: enumeration, privilege-boundary risk scoring, audit, EoP detection |
| `com-interface-analysis` | security | COM server analysis: CLSID enumeration, SDDL parsing, elevation/UAC audit, EoP detection |
Reporting:
| Skill | Type | Purpose |
|---|---|---|
| `generate-re-report` | reporting | Multi-section RE reports: provenance, imports, architecture, complexity, strings, topology |
See skills/README.md for per-skill documentation and the full inventory.
The helpers/ directory is the shared Python library for the entire runtime. It includes importable modules, standalone CLI scripts, and subpackages (analyzed_files_db/, function_index/, individual_analysis_db/). Public symbols are re-exported via lazy imports in helpers/__init__.py.
The library covers the following functional areas:
- Database access and path resolution: `db_paths`, `individual_analysis_db`, `analyzed_files_db`, `sql_utils`
- Function resolution: `function_resolver` (name and ID lookup with index-exact, index-partial, and DB fallback), `batch_operations` (bulk function loading)
- API classification: `api_taxonomy` (Win32/NT API prefix classification across functional and security categories, dangerous API set)
- Call graph: `callgraph` (in-module graph construction, BFS/DFS, Tarjan SCC), `cross_module_graph` (cross-module resolution via tracking DB and forwarded exports)
- Module profiles: `module_profile` (pre-computed fingerprints for scale, library composition, complexity)
- IPC indexes: `com_index`, `rpc_index`, `winrt_index`, `rpc_stub_parser`, `ipc_workspace` (COM/RPC/WinRT server and client correlation)
- Parameter and type analysis: `param_risk` (C-style parameter surface risk classification), `type_constants`, `calling_conventions`, `struct_scanner` (assembly memory access pattern scanning)
- Parsing: `decompiled_parser` (function call extraction), `mangled_names` (MSVC C++ name demangling)
- Findings: `finding_schema`, `finding_merge`, `findings_store` (SQLite-backed persistence), `report_comparison`, `taint_helpers`
- Error handling and output: `errors` (`ScriptError`, `emit_error`, error codes), `json_output` (`emit_json`, `emit_json_list`), `progress` (throttled stderr progress reporting)
- Caching: `cache` (filesystem cache with DB mtime-based TTL and atomic writes)
- Pipeline: `pipeline_schema` (YAML parsing and validation), `pipeline_executor` (batch module dispatch)
- Configuration: `config` (hierarchical config from `defaults.json` with `DEEPEXTRACT_*` env-var overrides)
- Workspace and session: `workspace`, `workspace_bootstrap`, `workspace_validation`, `session_utils`, `module_discovery`
- Validation: `validation` (DB schema and integrity checks), `command_validation` (command argument preflight)
- Security analysis: `sddl_parser` (SDDL ACE parsing with deny-before-allow evaluation)
Standalone CLI scripts (not importable, run directly): `unified_search.py`, `health_check.py`, `pipeline_cli.py`, `qa_runner.py`, `cleanup_workspace.py`, `select_audit_callees.py`, `select_backward_traces.py`, `json_extract.py`, `ipc_index_inspect.py`.
Key rule: use helpers instead of reimplementing DB queries, path logic, classification, or output formatting in commands, skills, agents, or hooks.
Developer references: helpers/README.md, docs/helper_api_reference.md.
Installed workspaces configure hook events in the root-level hooks.json. Hook commands execute relative to the output root, not relative to .claude/.
| Trigger | Script | Timeout | Purpose |
|---|---|---|---|
| `sessionStart` | `.claude/hooks/inject-module-context.py` | 15s | Scan extraction data and runtime registries; inject workspace context into the agent session |
| `stop` | `.claude/hooks/grind-until-done.py` | 5s | Read the session scratchpad; re-invoke the agent if unchecked items remain (bounded by `loop_limit`) |
| `sessionEnd` | `.claude/hooks/cleanup-workspace.py` | 10s | Remove stale run directories, agent state files, and cache entries |
The sessionStart hook supports three context levels controlled by the DEEPEXTRACT_CONTEXT_LEVEL environment variable:
- minimal: Module count, database list, skill/agent/command names.
- standard (default): Full module table, registry tables, quick-reference command list.
- full: Module profiles, RPC/COM/WinRT tables, README summaries, cached results, triage highlights.
For workspaces with many modules, compact mode activates automatically, reducing context size by caching the module list and trimming per-module detail.
Scratchpads are session-scoped and live at .claude/hooks/scratchpads/{session_id}.md. Run directories live under .claude/workspace/.
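The stop hook's continuation check amounts to scanning the scratchpad for unchecked Markdown checkboxes. A sketch, with invented scratchpad contents:

```python
import re

# Sketch of the grind-loop check: count unchecked "- [ ]" items; the
# stop hook re-invokes the agent while any remain (up to loop_limit).
scratchpad = """\
# Scan appinfo.dll
- [x] Phase 1: classify functions
- [x] Phase 2: map attack surface
- [ ] Phase 3: memory scan
- [ ] Phase 4: skeptic verification
"""

unchecked = re.findall(r"^- \[ \] (.+)$", scratchpad, flags=re.MULTILINE)
should_continue = bool(unchecked)
print(len(unchecked), should_continue)
```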
See hooks/README.md for lifecycle details.
The runtime ships always-on rules under rules/. Each rule is a Markdown file with optional YAML frontmatter (alwaysApply, description, globs). For Cursor, rules are copied to .cursor/rules/ with a .mdc extension during installation.
| Rule | Purpose |
|---|---|
| `workspace-pattern` | Filesystem handoff contract for multi-step workflows |
| `workspace-layout` | Path conventions for the output root and the `.claude/` overlay |
| `script-invocation-guide` | Canonical script signatures, DB path resolution, common invocation mistakes |
| `call-discovery-convention` | Ground-truth call discovery via xrefs; forbidden regex-only patterns |
| `grind-loop-protocol` | Scratchpad structure and iterative task protocol |
| `error-handling-convention` | `ScriptError`, `emit_error()`, error codes, and warning conventions |
| `json-output-convention` | stdout/stderr separation and `--json` behavior |
| `missing-dependency-handling` | Graceful degradation when data or tools are missing |
| `ai-scanner-orchestration` | Self-driving AI scanner phases, escalation protocol, skeptic verification |
| `agent-tool-guardrails` | Shell pre-flight checklist, data access decision tree, path quoting |
| `cache-conventions` | Cache location, TTL, DB-mtime invalidation, `--no-cache` bypass |
Runtime configuration lives in config/defaults.json. Individual values can be overridden via environment variables using the DEEPEXTRACT_* prefix (see helpers/config.py for override behavior).
Configuration sections cover:
- `classification`: Weights for API, structural, and library signals used in function classification
- `scoring`: Severity thresholds, guard weights, scanner defaults
- `callgraph`: Vtable edge inclusion, max traversal depths for reachability and taint
- `triage`: COM/RPC/security density thresholds, worker counts, step timeouts
- `security_auditor`: Step timeouts, dynamic top-N selection based on module size
- `pipeline`: Default step timeout, worker counts, continue-on-error, parallel module processing
- `script_runner`: Default timeout, max retries
- `explain`: Max callee depth and count
- `cache`: Max age (hours), max size (MB)
- `findings_store`: SQLite path, retention days
- `hooks`: Session timeout, grind loop limit, scratchpad stale hours, cleanup age
- `rpc`: Server index path, client stubs path, enabled flag, cache behavior
- `winrt`: Data root, enabled flag, cache behavior
- `com`: Data root, enabled flag, cache behavior
- `dangerous_apis`: JSON path to the API list, auto-classify flag
- `scale`: Compact mode threshold, context truncation limits, cross-scan limits, connection pool size
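A sketch of the override mechanism. The exact environment-variable-to-key mapping lives in helpers/config.py; the underscore convention and key names below are assumptions for illustration:

```python
import json
import os

# Stand-in for defaults.json content (keys hypothetical).
defaults = json.loads('{"cache": {"max_age_hours": 24}, "hooks": {"loop_limit": 10}}')

def resolve(section, key):
    """Return the configured value, preferring a DEEPEXTRACT_* env override
    (assumed naming: DEEPEXTRACT_<SECTION>_<KEY>) over the JSON default."""
    default = defaults[section][key]
    raw = os.environ.get(f"DEEPEXTRACT_{section.upper()}_{key.upper()}")
    return type(default)(raw) if raw is not None else default

os.environ["DEEPEXTRACT_HOOKS_LOOP_LIMIT"] = "25"
print(resolve("cache", "max_age_hours"), resolve("hooks", "loop_limit"))
```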
Ground-truth asset data for COM, RPC, and WinRT server registrations, dangerous API lists, and vulnerability patterns lives in config/assets/.
See docs/cache_conventions.md for cache policy.
The runtime supports headless batch execution via YAML pipeline definitions stored in config/pipelines/. Pipelines specify a sequence of analysis steps to run across one or more modules without interactive input.
Built-in pipelines:
| Pipeline | Purpose |
|---|---|
| `quick-triage.yaml` | Triage for all modules, minimal cost |
| `security-sweep.yaml` | Triage, security analysis, and vulnerability scan for selected modules |
| `full-analysis.yaml` | Triage, full analysis, type reconstruction, memory scan, logic scan, callgraph |
| `function-deep-dive.yaml` | Entry points, security dossiers, taint analysis, classification, callgraph |
CLI access:
```
python .claude/helpers/pipeline_cli.py run config/pipelines/security-sweep.yaml
python .claude/helpers/pipeline_cli.py validate config/pipelines/security-sweep.yaml
python .claude/helpers/pipeline_cli.py list-steps
```
Interactive access: The `/pipeline` slash command wraps the same CLI.
Pipeline output is written to .claude/workspace/batch_{name}_{timestamp}/ with per-module results and a batch summary.
See docs/pipeline_guide.md for YAML schema, step mapping, and configuration options.
The installed workspace consists of two layers: extractor-managed root artifacts produced by DeepExtractIDA, and the runtime-managed overlay installed at .claude/.
Extractor-managed data:
- `extracted_code/<module>/` with grouped `.cpp` files, `file_info.json`, `function_index.json`, and `module_profile.json`
- `extracted_dbs/<module>_<hash>.db` with per-module SQLite analysis databases
- `extracted_dbs/analyzed_files.db` as the tracking database (module index, status, hashes)
- `extraction_report.json`, `logs/`, and optional `idb_cache/`
Runtime-managed data:
- `.claude/cache/` for cached skill-script results (TTL-based, DB mtime-validated)
- `.claude/workspace/` for multi-step workflow manifests and per-step results
- `.claude/hooks/scratchpads/` for grind-loop session state
- `.claude/config/assets/` for ground-truth COM, RPC, WinRT, and miscellaneous data files
- `.claude/config/pipelines/` for YAML pipeline definitions
All analysis databases are read-only. Helper-mediated connections enforce PRAGMA query_only = ON. The tracking database normally resides at extracted_dbs/analyzed_files.db; for compatibility with single-file or older layouts, helpers/db_paths.py also accepts a root-level analyzed_files.db.
Format references: file_info | function_index | module_profile | database schema
Installed-workspace command:
```
cd <extraction_output_root>/.claude && python -m pytest tests/ -v
```
Source-checkout command:
```
python -m pytest tests/ -v
```
The test suite covers registry consistency, helper behavior, hook behavior, workspace handoff, pipeline execution, and integration across commands, agents, and skills.
Integration tests are executed via:
```
python helpers/qa_runner.py
```
The runner parses the testing guide, resolves database paths, executes script-level test cases, and validates output against the JSON output convention.
See docs/testing_guide.md for the full test documentation.
- Python: 3.10 or later
- Runtime dependency: `pyyaml>=6.0`
- Optional test dependencies: `pytest>=7.0`, `pytest-timeout>=2.0`
- Optional development dependencies: `ruff>=0.4`, `mypy>=1.10`
- Supported AI environments: Claude Code, Cursor, Codex, and any environment that supports `CLAUDE.md` or equivalent agent configuration
- License: MIT
| Document | Description |
|---|---|
| Onboarding Guide | Getting started in 5 minutes |
| Architecture | Full system design and installed workspace model |
| Integration Guide | End-to-end request flow for /triage and /pipeline |
| Data Format Reference | SQLite schema, data architecture, analysis heuristics |
| File Info Format Reference | file_info.json and file_info.md layout |
| Function Index Format Reference | function_index.json format and library tagging |
| Module Profile Format Reference | module_profile.json computation and fields |
| Helper API Reference | Full helper module reference |
| Command Authoring Guide | How to add or update slash commands |
| Agent Authoring Guide | How to create or extend subagents |
| Skill Authoring Guide | How to create or extend skills |
| AI Scanner Authoring Guide | How to create AI vulnerability scanners |
| Pipeline Guide | Headless batch execution and YAML pipelines |
| Cache Conventions | Cache location, TTL, invalidation policy |
| Performance Guide | Optimization strategies for large modules |
| VR Workflow Overview | Vulnerability research workflow and methodology |
| Scan-Audit-Taint Workflow | Security scanning workflow patterns |
| Cross-Module Callgraph Guide | Cross-module call graph traversal |
| IDA Conventions Reference | IDA Pro output conventions and Hex-Rays artifacts |
| Technical Reference | Internal architecture and implementation details |
| Persistence and Lifecycle | Data persistence and session lifecycle |
| Command Depth Spectrum | Lightweight vs heavyweight command classification |
| Examples | Concrete usage examples and walkthroughs |
| Testing Guide | Full test suite documentation |
| Testing Guide Prompts | Prompt templates for testing guide generation |
| Troubleshooting | Common failures and recovery guidance |
| commands/README.md | Complete command catalog and file inventory |
| agents/README.md | Agent architecture, files, and usage guidance |
| skills/README.md | Skill inventory and per-skill documentation |
| hooks/README.md | Hook lifecycle and generated artifacts |
| helpers/README.md | Helper library import patterns and module index |
Feature requests and planned capabilities are tracked in docs/feature_requests/.
DeepExtract Agent Analysis Runtime, developed by Marcos Oviedo for Agentic Vulnerability Research.