Arkana's security model, sandboxing configuration, and testing infrastructure.
When running Arkana in HTTP mode (--mcp-transport streamable-http), any MCP client can call open_file to read files on the server. Use --allowed-paths to restrict access:
# Only allow access to /samples and /tmp
python arkana.py --mcp-server --mcp-transport streamable-http \
--allowed-paths /samples /tmp
# Docker with sandboxing (via run.sh - extra flags are passed through)
./run.sh --allowed-paths /samples
# Equivalent manual docker command
docker run --rm -it -p 8082:8082 \
--user "$(id -u):$(id -g)" \
-e HOME=/app/home \
-v "$(pwd)/samples:/samples:ro" \
arkana-toolkit \
--mcp-server --mcp-transport streamable-http --mcp-host 0.0.0.0 \
--allowed-paths /samples \
--samples-path /samplesIf --allowed-paths is not set in HTTP mode, Arkana logs a warning at startup.
- Non-root Docker: The
run.shhelper runs the container as your host UID/GID (--user "$(id -u):$(id -g)"), never as root. - API key storage: Keys are stored in
~/.arkana/config.jsonwith 0o600 (owner-only) permissions. - Zip-slip protection: Archive extraction validates member paths against directory traversal.
- No hardcoded secrets: API keys are sourced from environment variables or the config file.
- CSP: Dashboard Content-Security-Policy uses
script-src 'self'(no inline scripts) and restrictsimg-srcto'self'only (nodata:URIs). All event handlers useaddEventListeneror event delegation withdata-*attributes. - Dashboard XSS prevention: All dynamic values in dashboard JS (functions.js, strings.js, callgraph.js) are escaped via
escapeHtml()before innerHTML insertion — including addresses, names, numeric counts, scores, and complexity values. Cache-Control: no-store: Security middleware injectsCache-Control: no-storeon all non-static dashboard responses to prevent browser caching of sensitive API data.fetchJSON()HTTP error checking: All dashboard JS files use a sharedfetchJSON()helper that checksr.okbefore parsing JSON and throws on HTTP errors, preventing silent consumption of error responses.- Diff path traversal prevention:
get_diff_data()validates comparison file paths viaos.path.realpath()with samples directory containment check. - Dashboard query bounds: Code search queries capped at 500 chars; callgraph edges capped at 5,000; URL redirect parameters use
urllib.parse.urlencode(). - Dashboard
asyncio.to_thread(): All API endpoints and htmx partials run data functions off the event loop viaasyncio.to_thread(), withtry/exceptfallbacks on partials.
Arkana applies defence-in-depth input validation across all layers:
- Search regex safety:
searchparameters on decompile/disassembly tools validate patterns viavalidate_regex_pattern()— rejects patterns longer than 1,000 chars, nested quantifiers (ReDoS), and invalid syntax. Context lines clamped to[0, 20], matches capped at 500. - Emulation limits: Qiling tools validate
max_instructions(0–10M) via_validate_max_instructions()to prevent CPU/memory exhaustion. - Auth header handling:
BearerAuthMiddlewarelowercases ASGI header keys defensively before matching. - Error message sanitisation: Crypto tool errors truncate user input to 50 chars to prevent information disclosure.
- Address validation: Dashboard endpoints reject address parameters longer than 40 characters.
- Content-Length parsing: POST endpoints wrap
int(content_length)in try/except for malformed values. - Delay-load imports: PE parser bounds delay-load thunk iteration at 10,000 to prevent infinite loops on malformed binaries.
- ThreadPool cap: PE parser limits
ThreadPoolExecutorworkers tomin(cpu_count, 8). - IOC regex: Domain TLD matching tightened to
{2,16}characters. - Cache eviction: CAPA and YARA rule caches use LRU (OrderedDict) instead of FIFO.
- ELF symbol parsing: Per-symbol error handling continues iteration on individual failures.
- Path validation:
_get_filepath()always validates paths viastate.check_path_allowed(), even when falling back to the default loaded file. - Recursion depth guard:
_make_hashable()enforces a max depth of 20 to prevent stack overflow on cyclic data structures. - Cache size bounds:
ARKANA_CACHE_MAX_SIZE_MBenv var is clamped to 1–50,000 MB to prevent misconfiguration. - File size limit:
DEFAULT_MAX_FILE_SIZE_MB(256) caps file loading, overridable viaARKANA_MAX_FILE_SIZE_MBenv var (parsed safely via_safe_env_int()). - Systemic limit clamping: All ~60 tool functions that accept a
limitparameter clamp it viamax(1, min(limit, MAX_TOOL_LIMIT))whereMAX_TOOL_LIMITis 100,000 (fromarkana.constants). - Decompression bomb protection:
refinery_decompressuses streaming chunk iteration with a cumulative byte counter, aborting when output exceeds 100 MB (_MAX_DECOMPRESS_OUTPUT). Prevents zip-bomb style attacks. - Hex input validation: All
bytes.fromhex()call sites validate input length before decoding._hex_to_bytes()in_refinery_helpers.pyprovides friendly error messages for invalid hex.patch_binary_memorycaps hex input at 2 MB. - Refinery pipeline loop limits: Pipeline iteration loops break when item count reaches the configured
limit, preventing unbounded output accumulation. - IOC IP filtering:
_is_non_routable_ip()uses Python'sipaddressmodule to correctly filter CGNAT (100.64.0.0/10), multicast, and other non-routable ranges. - Session limits:
MAX_ACTIVE_SESSIONS(default 100) caps concurrent HTTP sessions with oldest-session eviction. Overridable viaARKANA_MAX_SESSIONSenv var. - Debug session limits:
MAX_DEBUG_SESSIONS(3) caps concurrent debug sessions with oldest-session eviction.DEBUG_SESSION_TTL(1800s) auto-cleans idle sessions.DEBUG_COMMAND_TIMEOUT(300s) pauses execution commands viaemu_stop()(session preserved, not killed).DEBUG_RUNNER_TIMEOUT_BUFFER(15s) provides a client-side safety net beyond the runner deadline.MAX_DEBUG_INSTRUCTIONS(10M) caps execution per continue/run_until.MAX_DEBUG_MEMORY_READ(1MB) andMAX_DEBUG_WATCHPOINT_SIZE(1MB) cap memory operations.MAX_DEBUG_BREAKPOINTS(100) andMAX_DEBUG_WATCHPOINTS(50) cap per-session hook counts.MAX_DEBUG_SNAPSHOTS(10) caps saved states. - Emulation inspect session limits:
MAX_EMULATION_SESSIONS(3) caps concurrent emulation inspect sessions with oldest-session eviction.EMULATION_SESSION_TTL(1800s) auto-cleans idle sessions.EMULATION_COMMAND_TIMEOUT(60s) per-command timeout for memory operations.EMULATION_RUN_TIMEOUT(300s) fallback timeout for the initial emulation run; user'stimeout_secondstakes precedence when larger.MAX_EMULATION_MEMORY_READ(1MB) caps memory read per call.MAX_EMULATION_SEARCH_MATCHES(100) caps search results. Defense-in-depth timeouts prevent infinite hangs: runner hard timer (os._exitat timeout+15s), MCPthreading.Timersubprocess kill (at timeout+45s),asyncio.wait_for(at timeout+30s).emulation_resume(Qiling only) allows staged long-running emulation with per-resume timeout enforcement. - Cache atomic writes:
cache.put()andupdate_session_data()usetempfile.NamedTemporaryFile()+os.replace()for atomic writes, preventing.tmpfile collisions and partial writes on crash. - Resource entry cap: PE resource directory traversal is bounded at 1,000 entries (
_MAX_RESOURCE_ENTRIES). - Config atomic writes:
user_config.pyusestempfile.mkstemp()+os.replace()for atomic config file writes. - Result cache defensive copy:
_ToolResultCache.set()stores a shallow copy of the items list to prevent callers from mutating cached data.
Arkana has two layers of testing, with automated CI via GitHub Actions:
- Unit tests (
tests/) - 2853 fast tests covering core modules (utils, cache, state, hashing, parsers, MCP helpers), plus parametrised edge-case tests, concurrency tests for session isolation, and ResettableLock/partial CFG tests. No server or binary samples required. Run in ~13 seconds. - Integration tests (
mcp_test_client.py) - End-to-end tests for all 294 MCP tools against a running server, organised into 19 test categories with pytest markers. - CI/CD (
.github/workflows/ci.yml) - Automated unit tests on Python 3.10/3.11/3.12, coverage enforcement (65% floor with branch coverage), and syntax checking on every push, PR, and manual dispatch. Dependabot monitors pip dependencies weekly.
# Run unit tests (no server needed)
pytest tests/ -v
# Run unit tests with coverage
pytest tests/ -v --cov=arkana --cov-report=term-missing
# Run integration tests (requires running server)
pytest mcp_test_client.py -vFor detailed instructions on running tests, writing new tests, coverage, CI/CD, environment variables, test categories, markers, and troubleshooting, see Testing.