Aar is configured via AgentConfig — either in code or through a JSON config file.
from agent import AgentConfig, GuardrailsConfig, ProviderConfig, SafetyConfig, ToolConfig
from agent.core.config import SandboxConfig, TUIConfig
config = AgentConfig(
    provider=ProviderConfig(
        name="anthropic",  # "anthropic" | "openai" | "ollama" | "gemini" | "generic"
        model="claude-sonnet-4-20250514",
        api_key="...",  # or set via env var
        max_tokens=4096,
        temperature=0.0,
        context_window=None,  # override AgentConfig.context_window per provider
        token_budget=None,  # override AgentConfig.token_budget per provider
        cost_limit=None,  # override AgentConfig.cost_limit per provider
        response_format="",  # "" | "json" | "json_schema"
        json_schema={},  # schema when response_format="json_schema"
    ),
    providers={  # named provider profiles for runtime switching
        "claude": ProviderConfig(name="anthropic", model="claude-sonnet-4-6"),
        "gpt4": ProviderConfig(name="openai", model="gpt-4o"),
    },
    tools=ToolConfig(
        enabled_builtins=["read_file", "write_file", "edit_file", "list_directory", "bash", "grep", "find_files"],
        command_timeout=30,  # per-tool execution limit in seconds; 0 = no limit
        max_output_chars=50_000,
    ),
    safety=SafetyConfig(
        read_only=False,  # block all writes
        require_approval_for_writes=True,  # ask before every write
        require_approval_for_execute=True,  # ask before every shell command
        denied_paths=["**/.env", "**/*.key"],  # glob patterns (see docs/safety.md for defaults)
        allowed_paths=["<cwd>/**"],  # hard path boundary; <cwd> expands to Path.cwd() at startup
                                     # empty list = allow all non-denied paths
        sandbox=SandboxConfig(  # see docs/safety.md for all modes and per-mode options
            mode="local",  # "local" | "linux" | "windows" | "wsl" | "auto"
        ),
        acp_approval_timeout=0.0,  # seconds the ACP client has to respond to a permission request; 0.0 = wait indefinitely
    ),
    guardrails=GuardrailsConfig(
        max_tokens_recoveries=2,  # retry after output truncation (0 = disabled)
        max_repeated_tool_steps=3,  # stop after N identical tool-call patterns
        reserve_tokens=512,  # budget proximity threshold
        reserve_cost_fraction=0.1,  # cost proximity fraction
    ),
    max_steps=50,
    max_retries=3,  # provider request retry attempts
    timeout=0.0,  # wall-clock limit in seconds for the whole run; 0.0 = no limit
    streaming=False,  # use token-level streaming when supported
    context_window=0,  # model context limit in tokens; 0 = no management
    context_strategy="sliding_window",  # "sliding_window" | "compact" | "none"
    system_prompt="You are a helpful assistant.",
    tui=TUIConfig(
        theme="default",  # "default" | "contrast" | "decker" | "sleek" or custom name
        layout={},  # section visibility (see docs/themes.md)
    ),
    token_budget=0,  # max total tokens per run; 0 = unlimited
    cost_limit=0.0,  # max USD cost per run; 0.0 = unlimited
    token_warning_threshold=0.8,  # TUI warning at 80% of budget
    cost_warning_threshold=0.8,  # TUI warning at 80% of cost limit
    session_dir=".agent/sessions",
    project_rules_dir=".agent",  # project rules folder (see below)
    log_level="WARNING",  # DEBUG | INFO | WARNING | ERROR | CRITICAL
    log_file=None,  # opt-in file logging path (append mode)
)

The providers dict lets you pre-configure multiple LLM providers and switch between them at runtime with /model <key>. Each entry is a full ProviderConfig: API keys, temperature, max_tokens, and provider-specific extra settings are all per-profile.
The provider field selects the active profile:
- String: a key into providers (e.g. "provider": "claude")
- Inline object: a ProviderConfig dict (backward compatible with existing configs)
See Providers — Runtime provider switching for usage details.
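The two forms of the provider field can be pictured with a small lookup over a parsed config dict. This is only a sketch: resolve_provider is a hypothetical helper written for illustration, not Aar's actual loader, and the error raised for an unknown key is an assumption.

```python
from typing import Any


def resolve_provider(config: dict[str, Any]) -> dict[str, Any]:
    """Return the active provider settings from a parsed config dict (hypothetical helper)."""
    provider = config.get("provider", {})
    if isinstance(provider, str):
        # String form: treat the value as a key into the named profiles.
        profiles = config.get("providers", {})
        if provider not in profiles:
            # Assumption: an unknown key is an error rather than a silent fallback.
            raise KeyError(f"unknown provider profile: {provider!r}")
        return profiles[provider]
    # Inline form: the value already is a provider settings object.
    return provider


config = {
    "provider": "claude",
    "providers": {
        "claude": {"name": "anthropic", "model": "claude-sonnet-4-6"},
        "gpt4": {"name": "openai", "model": "gpt-4o"},
    },
}
active = resolve_provider(config)
```

With the config above, active is the "claude" profile; passing {"provider": {"name": "openai"}} returns the inline object unchanged.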
ProviderConfig accepts three optional override fields that take effect when that provider is active:
| Field | Type | Default | Description |
|---|---|---|---|
| context_window | int \| null | null | Model context limit in tokens; overrides global context_window |
| token_budget | int \| null | null | Max total tokens per run; overrides global token_budget |
| cost_limit | float \| null | null | Max USD cost per run; overrides global cost_limit |
When null (the default), the global AgentConfig value is used. Set to 0 explicitly to disable budgets for local/free models.
This lets you configure context windows and cost limits per model — a 32k Ollama model doesn't need a 200k sliding window, and a free local model doesn't need a $5 cost cap.
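The fallback rule described above hinges on telling "not set" (null) apart from "explicitly zero". A minimal sketch of that rule follows; effective_limit is a hypothetical name used for illustration, not part of Aar's API.

```python
from typing import Optional, TypeVar

T = TypeVar("T", int, float)


def effective_limit(provider_value: Optional[T], global_value: T) -> T:
    """Pick the limit that applies while a given provider is active (hypothetical helper)."""
    # None means "not set on this provider": fall back to the global value.
    # 0 is a real value meaning "disabled", so it must not be treated as unset.
    return global_value if provider_value is None else provider_value


# Global cost cap of $5; a local profile sets cost_limit to 0 to disable it.
hosted_cap = effective_limit(None, 5.0)
local_cap = effective_limit(0.0, 5.0)
```

Testing with `is None` rather than truthiness is the point of the sketch: a check like `provider_value or global_value` would silently turn an explicit 0 back into the global cap.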
Aar has several independent timeouts that operate at different layers. They interact — setting one without considering the others can cause confusing failures.
| Setting | Layer | Provider | Default | What it controls |
|---|---|---|---|---|
| provider.extra.read_timeout | HTTP read | Ollama | null (unlimited) | Max seconds to wait for the next byte while streaming; null = no cap |
| provider.extra.timeout | HTTP request | Anthropic, OpenAI, Generic | SDK default / 60 s | Whole-request timeout passed to the provider SDK or httpx client |
| tools.command_timeout | Tool executor | all | 30 s | Max wall-clock seconds a single shell/bash tool call may run; 0 = unlimited |
| safety.acp_approval_timeout | ACP transport | all | 0.0 (unlimited) | Seconds the ACP client has to respond to a permission approval request |
| timeout | Agent loop | all | 0.0 (unlimited) | Total wall-clock limit for a whole Agent.run() call |
| Provider | Knob | Default | Notes |
|---|---|---|---|
| Ollama | extra.read_timeout | null | Controls the streaming read phase only. null is strongly recommended for local models, since response generation can take many minutes. |
| Anthropic | extra.timeout | null → SDK default (600 s) | Passed directly to AsyncAnthropic(timeout=...). null uses the SDK's own default. |
| Anthropic | extra.prompt_caching | false | Enable prompt caching to cut repeated system-prompt costs by ~90% on turns 2+. |
| OpenAI | extra.timeout | null → SDK default (600 s) | Passed directly to AsyncOpenAI(timeout=...). null uses the SDK's own default. |
| Gemini | extra.timeout | 120.0 s | Passed to httpx.Timeout(timeout, connect=10.0) (HTTP mode) or SDK client (SDK mode). Increase for Pro with large thinking budgets. |
| Generic | extra.timeout | 60.0 s | Passed to httpx.Timeout(timeout, connect=10.0). Covers the full round-trip. Increase for slow proxies. |
Agent.run() wall-clock limit (timeout)
└─ each loop step
   ├─ provider call ─── provider-level timeout (read_timeout / extra.timeout)
   └─ tool execution ─── command_timeout caps a single bash/shell invocation
      └─ if ACP ─── acp_approval_timeout caps the human-approval round-trip
Key rules:
- Provider timeout must be ≥ the longest single model response you expect. Large local models on slow hardware can need 3–10 minutes per step. null is the safe default: it means unlimited for Ollama, and for Anthropic/OpenAI it falls back to the SDK's own 600 s default, which is usually sufficient.
- command_timeout must be ≥ the longest shell command the agent may run. Build steps, test suites, or long compilations need a generous value (120–300 s). 0 disables it entirely.
- timeout (agent loop) is the outer bound: set it larger than the provider timeout × expected number of steps. If it fires mid-stream the run is cancelled cleanly.
- acp_approval_timeout only matters in ACP mode (aar acp). 0.0 waits indefinitely for the editor to respond, which is usually correct.
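The outer-bound rule above (agent-loop timeout larger than provider timeout × expected steps) is simple arithmetic that can be checked before a run. The sketch below assumes a hypothetical check_timeouts helper; Aar is not stated to perform this validation itself.

```python
from typing import Optional


def check_timeouts(
    run_timeout: float,
    provider_timeout: Optional[float],
    expected_steps: int,
) -> list[str]:
    """Return warnings for timeout combinations that conflict (hypothetical helper)."""
    warnings: list[str] = []
    # 0 / 0.0 means "unlimited" for the run; None or 0 means no provider cap to compare against.
    if run_timeout > 0 and provider_timeout:
        worst_case = provider_timeout * expected_steps
        if run_timeout <= worst_case:
            warnings.append(
                f"timeout={run_timeout}s may fire before {expected_steps} steps "
                f"of up to {provider_timeout}s each ({worst_case}s) can finish"
            )
    return warnings


# A 300 s run limit cannot cover five provider calls of up to 120 s each.
problems = check_timeouts(run_timeout=300.0, provider_timeout=120.0, expected_steps=5)
```

Here problems holds one warning; raising the run limit above 600 s, or leaving it at 0.0, returns an empty list.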
{
  "provider": {
    "extra": {
      "read_timeout": null
    }
  },
  "tools": {
    "command_timeout": 120
  },
  "timeout": 0.0
}

Use null / 0 / 0.0 (unlimited) at every layer when running large models locally. Add explicit caps only when you want to protect against runaway steps.
If you want a ceiling (e.g. to surface a hung model faster):
{
  "provider": {
    "extra": {
      "read_timeout": 600
    }
  }
}

600 means the HTTP read times out after 10 minutes of no new bytes from Ollama. The agent loop then retries up to max_retries times before failing the step.
The Anthropic and OpenAI SDKs both default to 600 s. Tighten it for enterprise proxy setups where you want faster failure:

{
  "provider": {
    "name": "anthropic",
    "extra": {
      "timeout": 120
    }
  }
}

The generic provider defaults to 60 s, which suits fast proxies but may be too short for slow ones:
{
  "provider": {
    "name": "generic",
    "extra": {
      "timeout": 300
    }
  }
}

All CLI modes and the web transport load configuration from multiple sources. The order of precedence (highest wins):
| Source | aar chat / aar run / aar tui | aar serve | WebTransport() programmatic | Agent() programmatic |
|---|---|---|---|---|
| Explicit CLI flag | highest | yes (fewer flags — see Web API) | — | — |
| --config <file> | yes | yes | — | — |
| ~/.aar/config.json | auto-discovered | auto-discovered | auto-discovered | not loaded |
| Built-in defaults | lowest | yes | yes | only source unless you pass config= |
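Precedence of this kind is commonly implemented by merging layers from lowest to highest, so later layers win. The sketch below illustrates that idea with plain dicts; deep_merge and resolve_config are hypothetical names, and whether Aar merges nested sections key by key (as here) or replaces them wholesale is an assumption.

```python
from typing import Any


def deep_merge(base: dict[str, Any], override: dict[str, Any]) -> dict[str, Any]:
    """Return a new dict where values from override win; nested dicts merge key by key."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def resolve_config(*layers: dict[str, Any]) -> dict[str, Any]:
    """Merge layers given in order from lowest to highest precedence."""
    result: dict[str, Any] = {}
    for layer in layers:
        result = deep_merge(result, layer)
    return result


defaults = {"log_level": "WARNING", "tools": {"command_timeout": 30, "max_output_chars": 50_000}}
user_file = {"tools": {"command_timeout": 120}}  # e.g. ~/.aar/config.json
cli_flags = {"log_level": "DEBUG"}               # e.g. --log-level DEBUG
effective = resolve_config(defaults, user_file, cli_flags)
```

In this example effective keeps max_output_chars from the defaults, takes command_timeout from the user file, and takes log_level from the CLI layer.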
When using Agent() directly in code, the config file is not loaded automatically — pass a config explicitly if you need it:
from pathlib import Path
from agent.core.config import load_config
from agent import Agent
config = load_config(Path("~/.aar/config.json").expanduser())
agent = Agent(config=config)

SafetyConfig.require_approval_for_writes and require_approval_for_execute default to True. What happens when a tool triggers an approval check depends on the transport:
| Mode | Approval behaviour |
|---|---|
| aar chat, aar tui, aar run | Terminal prompt: y / n / always |
| aar serve / WebTransport | Auto-approved: the HTTP request is treated as implicit approval |
| Agent() programmatic (no callback) | Denied: logged as "No approval callback configured" |
To disable approval prompts in terminal mode entirely:
aar chat --no-require-approval
aar run --no-require-approval "do something"

Or set it permanently in ~/.aar/config.json:

{
  "safety": {
    "require_approval_for_writes": false,
    "require_approval_for_execute": false
  }
}

To inject a custom approval callback for the web transport (e.g. call a webhook before each write):
from agent.safety.permissions import ApprovalCallback, ApprovalResult
from agent.transports.web import create_asgi_app

async def my_approval(spec, tc) -> ApprovalResult:
    # call your external system here
    return ApprovalResult.APPROVED

# config is an AgentConfig built in code or loaded with load_config()
app = create_asgi_app(config, approval_callback=my_approval)

To supply an approval callback when using Agent() directly:
from agent import Agent, AgentConfig
from agent.safety.permissions import ApprovalResult

async def my_approval(spec, tc) -> ApprovalResult:
    print(f"Allow {tc.tool_name}? (y/n) ", end="")
    return ApprovalResult.APPROVED if input().strip().lower() == "y" else ApprovalResult.DENIED

agent = Agent(config=AgentConfig(), approval_callback=my_approval)

Control how much the agent logs to stderr. The default is WARNING.
| Level | What you see |
|---|---|
| DEBUG | Everything: full tracebacks, HTTP traces, step-by-step loop internals |
| INFO | Step counts, provider timing, tool execution summaries |
| WARNING | Provider errors, safety policy hits, unexpected conditions (default) |
| ERROR | Only hard failures |
| CRITICAL | Silent except for fatal errors |
Via config file (~/.aar/config.json or --config):
{
  "log_level": "DEBUG"
}

Via AgentConfig in code:

config = AgentConfig(log_level="DEBUG")

Via CLI flag (overrides the config file for that run):

aar chat --log-level DEBUG
aar run "do something" --log-level INFO
aar tui --log-level WARNING
aar serve --log-level DEBUG

By default aar logs to stderr only (12-factor / container-friendly). Opt in to file logging with log_file in the config or --log-file on the CLI:
{
  "log_level": "DEBUG",
  "log_file": "/var/log/aar/agent.log"
}

aar serve --log-level INFO --log-file /var/log/aar/agent.log
aar chat --log-file ./debug.log

The file handler uses append mode and includes timestamps. Both stderr and file handlers are active when log_file is set.
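The behaviour described above (stderr always, plus an append-mode file handler when log_file is set) maps directly onto Python's standard logging module. The following is a sketch of an equivalent setup, not Aar's own code: the logger name "aar_example", the configure_logging helper, and the shared format string are assumptions.

```python
import logging
import sys
from typing import Optional


def configure_logging(level: str = "WARNING", log_file: Optional[str] = None) -> logging.Logger:
    """Attach a stderr handler, plus an append-mode file handler if log_file is given."""
    logger = logging.getLogger("aar_example")  # hypothetical logger name
    logger.setLevel(getattr(logging, level.upper()))
    logger.propagate = False  # avoid duplicate lines via the root logger

    # Drop handlers from any earlier call so reconfiguring does not double-log.
    for handler in list(logger.handlers):
        logger.removeHandler(handler)
        handler.close()

    formatter = logging.Formatter("%(asctime)s %(levelname)s %(name)s: %(message)s")

    stderr_handler = logging.StreamHandler(sys.stderr)
    stderr_handler.setFormatter(formatter)
    logger.addHandler(stderr_handler)

    if log_file is not None:
        file_handler = logging.FileHandler(log_file, mode="a", encoding="utf-8")
        file_handler.setFormatter(formatter)
        logger.addHandler(file_handler)
    return logger
```

Calling configure_logging("DEBUG", "./debug.log") yields a logger with two handlers, so every record goes to both stderr and the file; omitting log_file leaves only the stderr handler.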
You can cap how many tokens or how much estimated cost a single agent run is allowed to consume. Both limits default to zero (unlimited).
Token counts are read from the ProviderMeta event that fires after every provider call — for both streaming and non-streaming responses. In streaming mode the final chunk from the provider carries the usage data; _consume_stream() (in agent/core/provider_runner.py) captures it and attaches it to the response before the event is emitted. See Tokens, costs, and budgets for the full pipeline, per-provider details, and how each transport displays the counts.
Run aar init to get ~/.aar/pricing.template.json — a copy of the full built-in pricing table — as a reference. Rename to pricing.json and adjust if needed.
Aar loads a built-in pricing table from agent/core/pricing.json (shipped with the package). If ~/.aar/pricing.json exists it is merged on top, letting you extend or override any entry. After each provider call the framework multiplies token counts by the matching per-token price to produce an estimated USD cost. The estimate is approximate — prompt-caching discounts, batching, and future price changes are not reflected.
- Cost is accumulated across all steps in the run, just like tokens.
- When cost_limit (if > 0) is exceeded the agent stops the same way as for token_budget.
- Local or Ollama models that don't match any pricing-table entry will report $0.00 cost.
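The cost estimate described above is a per-token multiplication against a table keyed by model-name prefix. The sketch below shows that arithmetic; the table entries are made-up numbers, estimate_cost is a hypothetical helper, and the "longest matching prefix wins" rule is an assumption about how overlapping keys are resolved. Cache read/write prices are left out for brevity.

```python
from typing import Mapping

# Hypothetical entries in USD per 1M tokens; these are not real prices.
PRICING: dict[str, dict[str, float]] = {
    "example-model": {"input_per_million": 3.0, "output_per_million": 15.0},
    "example-model-mini": {"input_per_million": 0.5, "output_per_million": 2.0},
}


def estimate_cost(
    model: str,
    input_tokens: int,
    output_tokens: int,
    pricing: Mapping[str, Mapping[str, float]] = PRICING,
) -> float:
    """Estimate USD cost for one provider call from token counts."""
    # Assumption: when several prefixes match, the longest one is the most specific.
    matches = [key for key in pricing if model.startswith(key)]
    if not matches:
        return 0.0  # unknown models report zero cost
    entry = pricing[max(matches, key=len)]
    return (
        input_tokens * entry["input_per_million"]
        + output_tokens * entry["output_per_million"]
    ) / 1_000_000


call_cost = estimate_cost("example-model-mini-2025", 200_000, 10_000)
```

Summing call_cost over every provider call in a run gives the running total that is compared against cost_limit.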
To add prices for custom or local models (e.g. Ollama), create or edit ~/.aar/pricing.json:
{
  "_comment": "USD per 1M tokens. Keys are model-name prefixes.",
  "gemma4": { "input_per_million": 0.05, "output_per_million": 0.10, "cache_read_per_million": 0.0, "cache_write_per_million": 0.0 }
}

token_warning_threshold and cost_warning_threshold are fractions (0.0–1.0) of the corresponding limit. When the running total crosses the threshold the TUI switches the counter display to red. This is a visual cue only: the agent keeps running until the hard limit is hit.
| Behaviour | Detail |
|---|---|
| Checked | After each provider call, before the next step |
| Scope | Per-run (resets when Agent.run() is called again) |
| State on exceed | AgentState.BUDGET_EXCEEDED |
| Event emitted | ErrorEvent with a descriptive message |
| Warning thresholds | Visual only — TUI counter turns red |
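The table above distinguishes a soft warning threshold from the hard limit. A compact way to picture the two checks is a single status function; budget_status and its return strings are hypothetical, and treating "exceeded" as strictly greater than the limit is an assumption.

```python
def budget_status(used: float, limit: float, warning_threshold: float = 0.8) -> str:
    """Classify a running total against a limit (hypothetical helper)."""
    if limit <= 0:
        return "unlimited"  # 0 / 0.0 disables the budget entirely
    if used > limit:
        return "exceeded"   # the run would stop here
    if used >= limit * warning_threshold:
        return "warning"    # visual cue only; the run continues
    return "ok"


# 90k of a 100k token budget is past the default 80% warning line but not over the limit.
status = budget_status(used=90_000, limit=100_000)
```

The same function works for cost by passing the accumulated USD total and cost_limit, for example budget_status(5.5, 5.0, 0.9).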
Via AgentConfig in code:
config = AgentConfig(
    token_budget=100_000,  # stop after 100 k tokens
    cost_limit=5.0,  # stop after ~$5
    token_warning_threshold=0.9,  # yellow → red at 90 %
    cost_warning_threshold=0.9,
)

Via config file (~/.aar/config.json or --config):
{
  "token_budget": 100000,
  "cost_limit": 5.0,
  "token_warning_threshold": 0.9,
  "cost_warning_threshold": 0.9
}

For full details on how token counts flow through the system, how each transport displays them, and per-provider caveats, see Tokens, costs, and budgets.
GuardrailsConfig provides mechanical safety nets for the agent loop — things that cannot be expressed as system prompt instructions.
| Field | Default | Meaning |
|---|---|---|
| max_tokens_recoveries | 2 | How many times the loop retries after output truncation (max_tokens). Set to 0 to disable. |
| max_repeated_tool_steps | 3 | Stop the loop when the same tool-call pattern repeats this many times in a row. |
| reserve_tokens | 512 | Token budget proximity threshold: the loop reports "near budget" below this margin. |
| reserve_cost_fraction | 0.1 | Cost proximity: fraction of cost_limit that triggers "near budget". |
The guardrails are deliberately minimal. Agent behavior (planning, persistence, completion quality) is guided entirely by the system prompt — see the rules.md file loaded via the configurable system prompt layers.
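Two of the guardrails in the table, max_repeated_tool_steps and reserve_tokens, reduce to small checks over loop state. The sketch below is one plausible reading, not Aar's implementation: it assumes each step is summarised as a string signature (for example tool name plus arguments), and both function names are hypothetical.

```python
from typing import Sequence


def is_stuck(signatures: Sequence[str], max_repeated: int = 3) -> bool:
    """True when the last max_repeated step signatures are all identical."""
    if max_repeated <= 0 or len(signatures) < max_repeated:
        return False
    tail = signatures[-max_repeated:]
    return all(sig == tail[0] for sig in tail)


def near_token_budget(used: int, budget: int, reserve_tokens: int = 512) -> bool:
    """True when fewer than reserve_tokens remain; a budget of 0 means unlimited."""
    return budget > 0 and (budget - used) < reserve_tokens


history = ["read_file:a.py", "bash:pytest", "bash:pytest", "bash:pytest"]
stuck = is_stuck(history)
```

With the history above, stuck is True because the last three steps repeat the same call; a single differing step in the tail resets the check.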
By default, the system prompt is assembled automatically from up to five layers (all optional except Base):
| # | Layer | Source | Purpose |
|---|---|---|---|
| 1 | Base | built-in | Runtime facts — OS, working directory, shell |
| 2 | Global rules | ~/.aar/rules.md | Personal preferences that apply to all projects |
| 3 | Global drop-ins | ~/.aar/rules.d/*.md (sorted) | Environment-specific additions; drop files in without editing the main file |
| 4 | Project rules | <project_rules_dir>/rules.md | Project-specific instructions (checked into git) |
| 5 | Project drop-ins | <project_rules_dir>/rules.d/*.md (sorted) | Per-contributor or per-machine overrides; can be gitignored |
If no rules files exist, only the base prompt is used. When present, the layers are concatenated in order, separated by ---.
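The layering above amounts to reading a handful of optional files in a fixed order and joining them. The sketch below reproduces that order with pathlib; collect_layers and assemble_prompt are hypothetical names, and the exact whitespace around the --- separator is an assumption.

```python
from pathlib import Path


def collect_layers(rules_dir: Path) -> list[str]:
    """Read rules.md, then rules.d/*.md sorted by filename, skipping anything missing."""
    texts: list[str] = []
    main = rules_dir / "rules.md"
    if main.is_file():
        texts.append(main.read_text(encoding="utf-8").strip())
    drop_in_dir = rules_dir / "rules.d"
    if drop_in_dir.is_dir():
        for path in sorted(drop_in_dir.glob("*.md")):
            texts.append(path.read_text(encoding="utf-8").strip())
    return texts


def assemble_prompt(base: str, global_dir: Path, project_dir: Path) -> str:
    """Concatenate base, global, and project layers in order, separated by ---."""
    layers = [base, *collect_layers(global_dir), *collect_layers(project_dir)]
    # Assumption: blank lines surround the separator; empty files are skipped.
    return "\n\n---\n\n".join(layer for layer in layers if layer)
```

A call such as assemble_prompt(base_text, Path.home() / ".aar", Path(".agent")) returns only base_text when neither directory contains rules files.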
Global rules: create ~/.aar/rules.md for preferences that follow you across projects:

# My rules
- Always use type hints on public functions.
- Prefer pathlib over os.path.
- Use ruff for formatting.

Global drop-ins: place any number of .md files in ~/.aar/rules.d/ and they are appended after rules.md, sorted by filename. Useful for environment-specific rules (e.g. 10-work-proxy.md, 20-local-models.md) without touching the main file.
Project rules: create <project_rules_dir>/rules.md (default .agent/rules.md) for instructions specific to the current repo:

# Project rules
- This is a FastAPI app. Use pytest-asyncio for async tests.
- Follow the existing service pattern in app/services/.

Project drop-ins: place .md files in <project_rules_dir>/rules.d/ for per-contributor or per-machine additions. Add rules.d/ to .gitignore if you don't want them committed, or commit them for shared team overrides.
Run aar init to create the skeleton files and directories. The init command pre-installs default agent rules at ~/.aar/rules.md and a multi-provider reference config at ~/.aar/config.example.json. Edit the rules file to add your own global preferences, or use the example config as a starting point for new provider profiles.
Override — if you pass system_prompt explicitly to AgentConfig, the auto-assembly is skipped entirely and your string is used as-is.
The project rules directory defaults to .agent. Change it with project_rules_dir so the agent reads <project_rules_dir>/rules.md instead of .agent/rules.md:
Via config file:
{
  "project_rules_dir": ".config/aar"
}

Via AgentConfig in code:

config = AgentConfig(project_rules_dir=".config/aar")

This only affects where project rules are loaded from. The session_dir is configured independently.
The tui section controls the TUI's visual appearance and section visibility. See Themes & Layout for full details.
Via config file:
{
  "tui": {
    "theme": "default",
    "layout": {
      "reasoning": { "visible": false },
      "token_usage": { "visible": false }
    }
  }
}

Via CLI flag (theme and mode):

aar tui --theme decker
aar tui -t contrast
aar tui --fixed                 # full-screen mode with fixed bars, scrollable body, mouse support
aar tui --fixed --theme decker  # fixed mode with a specific theme

Fixed mode includes keyboard shortcuts: Ctrl+S (send), Ctrl+X (cancel agent), Ctrl+T (cycle theme), Ctrl+K (toggle thinking), Ctrl+L (clear), Ctrl+G (log viewer), Ctrl+Q (quit), Ctrl+Up/Down (input history), Page Up/Down (scroll). Enter adds a new line; Ctrl+S submits. See Themes & Layout for the full reference.
At runtime (inside the TUI):
/theme # list available themes
/theme decker # switch theme
/theme next # cycle themes
Built-in themes: default, contrast, decker, sleek. Custom themes go in ~/.aar/themes/<name>.json — run aar init to get a template and JSON schema.