easyai ships as ONE library — libeasyai. There is no split between
"engine" and "cli" libraries; the same shared object carries Engine,
Client, Session, every tool, and the system-prompt composer. CLI /
server / MCP binaries are demos that prove the lib's surface; they
link a single target (easyai / easyai::easyai). Legacy aliases
easyai::engine and easyai::cli resolve to the unified target so
existing split-layout link lines still work.
| Concern | Lives where |
|---|---|
| AI connection — local llama.cpp | easyai::Engine (lib) |
| AI connection — remote OpenAI-protocol | easyai::Client (lib) |
| Backend abstraction (uniform chat/reset) | easyai::Backend, LocalBackend, RemoteBackend (lib) |
| Built-in tools (datetime/web/fs/bash/python/memory/tool_lookup) | easyai::tools::* (lib) |
| System-prompt composition | easyai::preamble::* + easyai::Session (lib) |
| One-call agent setup | easyai::Session (lib) |
| External tool loader (EASYAI-*.tools manifests) | easyai::load_external_tools_from_dir (lib) |
| MCP server / client | easyai::mcp::*, easyai::McpClient (lib) |
| HTTP server, SSE, web UI | examples/server.cpp ONLY |
| REPL, hybrid shell, signal handling | examples/cli.cpp ONLY |
| --show-system-prompt, presets, banners | shared via lib helpers; binary owns the flag |
Rule: anything that touches the model, registers a tool, or composes a system prompt MUST live in the lib so a third-party agent reads as short as ours. The binaries own the surface their job requires (HTTP routes, terminal UX, signal handling), nothing more.
easyai::Session is the OpenAI-Python-SDK-shaped one-call entry
point. See LIB_GUIDE.md for the prose; the contract:
| Surface | Contract |
|---|---|
| Session::local(path) · Session::local(Config) · Session::remote(url, model) | Pick the backend once; same fluent API after. |
| .system(text) | Replace the BASE prompt verbatim; lib default suppressed. |
| .no_builtin_system() | Drop the lib's default BASE; combine with .system_append(...) to author from scratch. |
| .system_append(text) / .system_append(callable) | Concatenated after the BASE in call order. Dynamic form recomputed on every refresh_system(). |
| .preamble_options(opt) | Override per-turn preamble::Options (date/time, cutoff, memory_root, cite_sources). |
| .with_default_tools(bool) | Toggle the canonical toolset (datetime + web + tool_lookup baseline, plus gated fs/bash/python/memory/external). On by default. |
| .add_tool(Tool) | Append a custom tool; its Tool::system_addendum is collected. |
| .init(err) | Builds backend + tools + system. Once per session. |
| .refresh_system() | Re-render and push the system prompt after a mid-session .system_append. |
| .chat(user) | Runs the agentic loop, returns visible reply. Tokens stream via .on_token callback. |
| .render_system() | Returns the resolved system prompt for inspection. |
System-prompt composition order (5 layers, see LIB_GUIDE.md §4):
base → tool addenda → operator appends → dynamic preamble → tools catalogue.
The tools-catalogue tail is local-only; remote sessions delegate to the
server's own build_session_info (server-side, per-request).
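The layer order above can be sketched as a plain concatenation. This is a minimal illustration, not the library's composer: `PromptLayers` and `compose_system_prompt` are hypothetical names, and joining every layer with a blank line is an assumption (the source only states blank-line separators for tool addenda).

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical container for the five layers named in the text.
struct PromptLayers {
    std::string base;
    std::vector<std::string> tool_addenda;
    std::vector<std::string> operator_appends;
    std::string dynamic_preamble;
    std::string tools_catalogue;  // local sessions only
};

// Joins non-empty layers in the documented order, separated by blank lines.
std::string compose_system_prompt(const PromptLayers& layers, bool is_local) {
    std::vector<std::string> parts;
    auto add = [&parts](const std::string& text) {
        if (!text.empty()) parts.push_back(text);
    };
    add(layers.base);
    for (const std::string& addendum : layers.tool_addenda) add(addendum);
    for (const std::string& append : layers.operator_appends) add(append);
    add(layers.dynamic_preamble);
    if (is_local) add(layers.tools_catalogue);  // remote: the server adds its own

    std::string prompt;
    for (std::size_t i = 0; i < parts.size(); ++i) {
        if (i != 0) prompt += "\n\n";
        prompt += parts[i];
    }
    return prompt;
}
```

Keeping the catalogue last means the earlier layers form a stable prefix when only the tool list changes.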
Tool::system_addendum is a new optional std::string field on easyai::Tool. When a tool is
registered onto a Session (or any Backend that honours
Config::extra_tools), the addendum is concatenated into the system
prompt with blank-line separators. Authoring rule: a tool that needs
prompt-level guardrails (policy, citation rules, "always confirm X
first") ships them itself instead of asking the application to
mirror-paste a paragraph into its own system prompt. Removes the
"did we update both places?" drift.
Builder access: Tool::builder("name").system_addendum("…").
LocalBackend::Config and RemoteBackend::Config gained two fields,
populated by Session and also usable directly by callers who skip
the Session layer:
| Field | What |
|---|---|
| std::vector<Tool> extra_tools | Caller-supplied tools, registered AFTER built-ins / memory / external, BEFORE tool_lookup. |
| std::string system_appendix | Static text appended to the system prompt AFTER tool addenda, BEFORE the AVAILABLE-TOOLS catalogue (local). |
Every text source that gets spliced into the system prompt by the lib
runs through one of two sanitizers in easyai::preamble:
| Source | Sanitizer | Cap | Why |
|---|---|---|---|
| Single-line tool name / wire_description in tools_block | sanitize_for_prompt (anonymous, src/preamble.cpp); strips ALL C0 incl. \n | 64 / 200 | Structural: must stay one bullet item per tool. (§23.1) |
| Multi-paragraph Tool::system_addendum | preamble::sanitize_addendum | 8 KiB / tool | Paragraph block: keeps \n / \t, strips ESC / DEL / bell / other C0. (§25.1) |
| Operator's Config::system_appendix / Session::system_append(...) | preamble::sanitize_addendum | 16 KiB | Same. (§25.1) |
Applied at the THREE splice sites: LocalBackend::init,
RemoteBackend::Impl::rebuild, Session::Impl::full_compose (which
also serves Session::refresh_system / set_system /
render_system). Today every source is operator-controlled, but the
sanitizer is the seatbelt for future plumbing (external manifests,
MCP server tool descriptors) and defends the operator's TTY against
ESC injection via --show-system-prompt.
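A minimal sketch of the paragraph-block sanitizer described in the table, written from the stated behaviour only. The signature, and the choice to drop stripped bytes and truncate at the byte cap, are assumptions; the real preamble::sanitize_addendum may differ.

```cpp
#include <cstddef>
#include <string>

// Keeps '\n', '\t' and printable bytes (including UTF-8 bytes >= 0x80);
// drops every other C0 control byte and DEL. Output stops at cap_bytes.
// Assumption: a byte cap may cut a multi-byte UTF-8 sequence; a production
// version would back off to a code-point boundary.
std::string sanitize_addendum(const std::string& input, std::size_t cap_bytes) {
    std::string out;
    for (unsigned char c : input) {
        if (out.size() >= cap_bytes) break;
        const bool keep = c == '\n' || c == '\t' || (c >= 0x20 && c != 0x7f);
        if (keep) out.push_back(static_cast<char>(c));
    }
    return out;
}
```

With ESC removed, a sequence such as `ESC [ 0 m` degrades to the inert text `[0m`, which is what protects the operator's terminal when the prompt is printed.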
| Call | History preserved? | Underlying backend call |
|---|---|---|
| Session::refresh_system() | YES | engine_ptr()->system(...) / client_ptr()->system(...) (pure setter) |
| Session::set_system(text) | NO (cleared, matches REPL /system <text>) | backend->set_system(...) |
| Session::add_tool(t) post-init | YES | engine_ptr()->add_tool(t) / client_ptr()->add_tool(t) + refresh_system() |
| Session::add_tool(t) pre-init | n/a (no history yet) | queued into Config::extra_tools |
| Session::reset() | NO | backend->reset() |
Calling engine_ptr()->add_tool(t) (or client_ptr()->add_tool(t))
directly is allowed but bypasses Session — the tool will be visible
in the model's <tools> block on the next turn, but its
system_addendum will NOT be added to the system prompt until the
next Session::set_system / refresh_system runs. Use
Session::add_tool for the safe path.
| Mode | Invocation | Description |
|---|---|---|
| One-shot | easyai-cli --url URL -p PROMPT | Execute prompt and exit |
| Interactive REPL | easyai-cli --url URL | AI prompt loop with / commands |
| Hybrid AI Shell | easyai-cli --url URL --shell | User's shell with > AI prefix |
| Management | easyai-cli --url URL --list-models | Server diagnostics |
| State | Ctrl+C effect |
|---|---|
| AI generating | Stop generation, return to prompt |
| At prompt | Clear line, show new prompt (like bash) |
| Shell command running (--shell) | Kill command, return to prompt |
| Triple rapid Ctrl+C | Force-exit (escape hatch) |
Exit via /exit, /quit, or Ctrl+D (EOF).
First Ctrl+C hard-cancels and exits. Second force-exits.
Hybrid shell: the user's $SHELL executes normal commands, lines prefixed
with > are forwarded to the AI model.
| Input pattern | Action |
|---|---|
| > prompt text | Send to AI model |
| /exit, /quit | Exit shell |
| /clear, /reset, /compress | Session management |
| /plan, /tools, /help | Info commands |
| cd [dir] | Change CWD (persists) |
| export KEY=VALUE | Set env var (persists) |
| unset VAR | Remove env var (persists) |
| anything else | Execute via $SHELL -c command |
- --shell implies --allow-bash (AI can run commands)
- INI: [cli] shell = true (CLI flag overrides)
~/project $ command
~/project $ > ask the AI something
CWD shown with ~ abbreviation for $HOME.
These run in-process to persist state across commands:
- cd: Supports ~, - (OLDPWD), relative and absolute paths
- export KEY=VALUE: Sets env var, strips surrounding quotes
- unset VAR: Removes env var
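The in-process builtins can be sketched as two small helpers. Both names (`resolve_cd_target`, `parse_export`) are hypothetical, and treating a bare `cd` as "go to $HOME" is an assumption borrowed from ordinary shell behaviour rather than something the source states.

```cpp
#include <stdlib.h>  // setenv (POSIX)
#include <cstddef>
#include <string>

// Maps a cd argument to the directory to chdir() into.
std::string resolve_cd_target(const std::string& arg, const std::string& home,
                              const std::string& oldpwd) {
    if (arg.empty() || arg == "~") return home;  // bare cd: assumed to mean $HOME
    if (arg == "-") return oldpwd;               // previous directory
    if (arg.compare(0, 2, "~/") == 0) return home + arg.substr(1);
    return arg;                                  // relative or absolute path
}

// Splits KEY=VALUE and strips one pair of matching surrounding quotes.
bool parse_export(const std::string& assignment, std::string& key, std::string& value) {
    const std::size_t eq = assignment.find('=');
    if (eq == std::string::npos || eq == 0) return false;
    key = assignment.substr(0, eq);
    value = assignment.substr(eq + 1);
    if (value.size() >= 2) {
        const char first = value.front();
        const char last = value.back();
        if ((first == '"' || first == '\'') && first == last) {
            value = value.substr(1, value.size() - 2);
        }
    }
    return true;
}

// Applies the assignment to this process so later commands inherit it.
bool apply_export(const std::string& assignment) {
    std::string key, value;
    return parse_export(assignment, key, value) &&
           setenv(key.c_str(), value.c_str(), 1) == 0;
}
```

Running these in the parent process is what makes the state persist: a forked child could change its own directory or environment, but the change would vanish when it exits.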
Normal commands fork the user's shell ($SHELL -c command). The child
shares the parent's process group so Ctrl+C (SIGINT) reaches it directly
from the kernel. The parent's signal handler ignores SIGINT while a shell
command is running and resumes waitpid on EINTR.
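The fork-and-wait pattern described above might look like the following POSIX sketch. It simplifies one detail: the parent sets SIGINT to SIG_IGN for the lifetime of the child instead of keeping a handler that checks a flag. The function name, the /bin/sh fallback, and the 128 + signal return convention are assumptions.

```cpp
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>
#include <cerrno>
#include <cstdlib>
#include <string>

// Runs `command` via $SHELL -c. Returns the exit status, 128 + signal
// number if the child died from a signal, or -1 if it could not be run.
int run_shell_command(const std::string& command) {
    const char* shell = std::getenv("SHELL");
    if (shell == nullptr || *shell == '\0') shell = "/bin/sh";  // assumed fallback

    // Parent ignores SIGINT while the child runs.
    struct sigaction ignore_action {};
    struct sigaction saved_action {};
    ignore_action.sa_handler = SIG_IGN;
    sigemptyset(&ignore_action.sa_mask);
    sigaction(SIGINT, &ignore_action, &saved_action);

    int status = -1;
    const pid_t pid = fork();
    if (pid == 0) {
        // Child stays in the parent's process group, so a terminal Ctrl+C
        // reaches it directly. Ignored signals survive exec, so restore
        // the default disposition first.
        struct sigaction default_action {};
        default_action.sa_handler = SIG_DFL;
        sigemptyset(&default_action.sa_mask);
        sigaction(SIGINT, &default_action, nullptr);
        execl(shell, shell, "-c", command.c_str(), static_cast<char*>(nullptr));
        _exit(127);  // exec failed
    }
    if (pid > 0) {
        int raw = 0;
        pid_t waited;
        do {
            waited = waitpid(pid, &raw, 0);  // resume if interrupted
        } while (waited == -1 && errno == EINTR);
        if (waited == pid) {
            if (WIFEXITED(raw)) status = WEXITSTATUS(raw);
            else if (WIFSIGNALED(raw)) status = 128 + WTERMSIG(raw);
        }
    }
    sigaction(SIGINT, &saved_action, nullptr);  // restore the prompt's handler
    return status;
}
```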
- .easyai_session written after every AI turn (atomic temp+rename)
- --continue to resume, --compress to recap
- --no-local-session for read-only mode
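The "atomic temp+rename" write can be sketched as below. The helper name and the `.tmp` suffix are hypothetical, and the sketch does not fsync, so it protects readers from seeing a half-written file but makes no durability claim across power loss.

```cpp
#include <cstdio>
#include <fstream>
#include <string>

// Writes `contents` to a sibling temp file, then renames it over `path`.
// On POSIX, rename() replaces the destination atomically, so a reader
// sees either the old session file or the new one, never a partial write.
bool write_file_atomically(const std::string& path, const std::string& contents) {
    const std::string temp_path = path + ".tmp";  // hypothetical temp name
    {
        std::ofstream out(temp_path, std::ios::binary | std::ios::trunc);
        if (!out) return false;
        out << contents;
        out.flush();
        if (!out) {
            std::remove(temp_path.c_str());
            return false;
        }
    }  // stream closed here, before the rename
    if (std::rename(temp_path.c_str(), path.c_str()) != 0) {
        std::remove(temp_path.c_str());
        return false;
    }
    return true;
}
```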
| Tool | Condition |
|---|---|
| datetime | always |
| plan | unless --no-plan |
| web | always (runtime check for libcurl) |
| fs (split/unified) | --sandbox DIR |
| bash | --allow-bash or --shell |
| evaluate (formerly python3; runtime is Python 3) | --sandbox and not --no-python |
Green ● icon (same style as tool-call success markers in streaming output).
When the spinner transitions FROM token-streaming mode (showing tk/s) TO thinking/shimmer mode or end-of-turn, a summary line is emitted in dark blue:
● XX% / NNNN tokens last: 00.0tk/s
- XX%: context window fill percentage
- NNNN tokens: absolute context token count (omitted if unavailable)
- 00.0tk/s: last observed generation speed
Emitted on:
- Thinking transition (between agentic hops, when prompt eval restarts)
- End of turn (finish())
Not emitted when token_speed_ < 0.1 (no meaningful generation occurred).
The shimmer's suffix shows BOTH the live prompt-eval percentage AND the running context-fill percentage:
thinking <N>% · ctx <M>%
| Element | Source |
|---|---|
| <N>% | processed / total from the server's easyai.prompt_progress SSE event (one tick per n_batch tokens decoded). |
| <M>% | LIVE ctx-%: (cached + processed) / n_ctx. Reflects where the KV cache will be when this prompt-eval pass finishes, not stale data from the prior turn. |
When n_ctx is unknown (first turn before the server reports it), only the <N>% part renders.
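The two percentages reduce to a few lines of arithmetic. In this sketch the function name is hypothetical, integer truncation is an assumed rounding rule, and `n_ctx <= 0` stands in for "unknown".

```cpp
#include <string>

// Builds the shimmer suffix from one prompt_progress tick.
//   N = processed / total
//   M = (cached + processed) / n_ctx
std::string shimmer_suffix(long processed, long total, long cached, long n_ctx) {
    std::string suffix = "thinking";
    if (total > 0) {
        suffix += " " + std::to_string(processed * 100 / total) + "%";
    }
    if (n_ctx > 0) {  // omitted until the server has reported n_ctx
        suffix += " · ctx " + std::to_string((cached + processed) * 100 / n_ctx) + "%";
    }
    return suffix;
}
```

For example, 512 of 2048 prompt tokens processed on top of 1024 cached tokens in an 8192-token context gives `thinking 25% · ctx 18%`.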
CLI flag + INI key [cli] prompt_progress = on|off. When off, the cli sends stream_options.easyai_prompt_progress = false in the request body. The server inspects this and skips wiring the per-batch on_prompt_progress callback for that request — no easyai.prompt_progress SSE events fire. The final easyai.prompt_eval summary still fires either way.
| State | --verbose | Per-batch SSE | Spinner % during eval | Per-batch log lines | Final summary on screen | Final summary in log |
|---|---|---|---|---|---|---|
| default | OFF | yes | thinking N% · ctx M% | none | ● prompt eval: N tok · ms · t/s · ctx M% | yes |
| default | ON | yes | same | per batch via easyai::log::write | same | yes |
| --no-prompt-progress | OFF | NO | static "thinking" | none | same | yes |
| --no-prompt-progress | ON | NO | static "thinking" | none (no events to log) | same | yes |
on_prompt_eval (the final summary, fires once per agentic hop) always calls easyai::log::write with the structured line [prompt_eval] N tok (M cached) · X ms · Y t/s · ctx Z% (used/total). log::write tees stderr + the --log-file file (if set), so the final metrics land in the log regardless of --verbose.
Output always prefixes every line with <n>| (line numbers on by default in both modes). Reports total line count for files ≤ 8 MiB. Description tells the model to read before fs_edit for accurate line references.
| Mode | Trigger | Default limit | Line numbers | Total count |
|---|---|---|---|---|
| Line mode | start_line set | 200 lines (max 2000) | Always on | Yes |
| Byte mode | default / offset set | 65536 bytes (max 1 MiB) | On (default true) | Yes (≤ 8 MiB files) |
The unified fs tool accepts a batch via ops — an array of single-op shapes. Lets a model land many file edits in one round trip and amortises the per-call overhead.
| Cap | Value | Why |
|---|---|---|
| Ops per call | 50 | Bound the report length and the worst-case file-system churn per turn. |
| Distinct files per call | 20 | Counted only over ops that name a path (cwd/sandbox/glob/grep/list are free). Stops a runaway batch from blasting many files at once. |
| Same-path edits reorder | descending start_line | Each edit's start_line refers to the file's ORIGINAL line numbers, so the model doesn't have to track line drift. |
| Read clip in batch | 2 KiB per op | Successful read ops are clipped to 2 KiB in the batch report; re-issue a standalone read for the full body. |
| Error visibility | full body | Failed ops emit their complete diagnostic so the model can self-correct without re-running. |
| continue_on_error | default false | Stop-on-first matches small-model debugging flow. |
| Report header | batch: N ops across F files (path1, path2, …) | Single line at the top so the model can verify it touched the right set before reading per-op statuses. |
Exposed only on the unified fs surface. Default ToolMode is Split (one focused tool per action — better small-model dispatch); opt into the ops batch with --tools-mode unified or --tools-mode both.
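The same-path reorder rule in the cap table is easiest to see on an in-memory file. This is a sketch under assumptions: `LineEdit` is a hypothetical edit shape (the real op schema is not shown in this section), edits are assumed not to overlap, and out-of-range edits are simply skipped.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical edit shape: replace lines [start_line, end_line]
// (1-based, inclusive, ORIGINAL numbering) with `replacement`.
struct LineEdit {
    std::size_t start_line;
    std::size_t end_line;
    std::vector<std::string> replacement;
};

// Applies edits bottom-up. Because later lines are rewritten first,
// the line numbers of edits nearer the top are still valid when reached.
std::vector<std::string> apply_edits(std::vector<std::string> lines,
                                     std::vector<LineEdit> edits) {
    std::sort(edits.begin(), edits.end(),
              [](const LineEdit& a, const LineEdit& b) {
                  return a.start_line > b.start_line;  // descending
              });
    for (const LineEdit& edit : edits) {
        if (edit.start_line < 1 || edit.end_line < edit.start_line ||
            edit.end_line > lines.size()) {
            continue;  // sketch: skip invalid ranges
        }
        auto first = lines.begin() + static_cast<std::ptrdiff_t>(edit.start_line - 1);
        auto last = lines.begin() + static_cast<std::ptrdiff_t>(edit.end_line);
        auto insert_at = lines.erase(first, last);
        lines.insert(insert_at, edit.replacement.begin(), edit.replacement.end());
    }
    return lines;
}
```

Applied top-down instead, an edit that turns line 1 into two lines would shift every later line by one, and an edit addressed to original line 3 would hit the wrong text.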
easyai::cli::Toolbelt::tool_mode_ defaults to ToolMode::Split — one focused tool per action (fs_read, fs_write, fs_edit, …, web_search, web_fetch, memory_search, …). Small / weaker tool-callers dispatch more reliably against flat one-verb-per-tool schemas than against an action-discriminated union.
To pick up the unified fs(action="ops") batch (or the web(action=…) dispatcher), opt in with .tool_mode(ToolMode::Unified) or --tools-mode unified. Both registers both surfaces side-by-side.
Mandatory workflow when both memory and web tools are available:
- Memory first — search/load relevant keywords
- Web second — also search the web, even if memory had results
- Merge & answer — combine both, prefer more recent/authoritative on conflict
- Update memory — save durable new facts the web provided
Enforced in three places: system preamble (preamble.cpp), memory tool description (rag_tools.cpp), web tool descriptions (builtin_tools.cpp).
| Operation | Default | Max |
|---|---|---|
| search (results per page) | 10 | 20 |
| list (entries) | 50 | 200 |
| keywords (vocabulary) | 200 | 500 |
Keywords are the sole identifier. Sorted + joined by _ = filename stem.
Example: "python async sockets" → file async_python_sockets.md.
Files starting with fix- are immutable (cannot overwrite or delete).
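The keyword-to-filename rule can be sketched as follows. The function names are hypothetical, and the sketch assumes whitespace-separated keywords with no de-duplication or case folding, since the source does not say whether either is applied.

```cpp
#include <algorithm>
#include <cstddef>
#include <sstream>
#include <string>
#include <vector>

// Sorts the keywords and joins them with '_' to form the filename stem.
std::string memory_filename(const std::string& keywords) {
    std::istringstream in(keywords);
    std::vector<std::string> words;
    std::string word;
    while (in >> word) words.push_back(word);
    std::sort(words.begin(), words.end());

    std::string stem;
    for (std::size_t i = 0; i < words.size(); ++i) {
        if (i != 0) stem += '_';
        stem += words[i];
    }
    return stem + ".md";
}

// Entries whose filename starts with "fix-" may not be overwritten or deleted.
bool is_immutable_entry(const std::string& filename) {
    return filename.compare(0, 4, "fix-") == 0;
}
```

Sorting makes the identifier independent of keyword order, so "sockets python async" and "python async sockets" address the same entry.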
| Entry exists? | Behavior | Return message |
|---|---|---|
| Yes | Append content after --- separator | updated "key.md" (+N B → M B total) |
| No | Create new entry | created "key.md" (N bytes) |
| Fixed (fix-*) | Error | Immutable, cannot append |
Three independent channels the model sees per turn — these MUST stay in sync.
| Channel | Contains | Producer | Source field |
|---|---|---|---|
| <tools> block in system message (Jinja template) | name + short trigger + JSON schema | Engine::Pimpl::chat_tools() / Client::tool_to_json() | Tool::wire_description() |
| Inline "Active tools" enumeration in system prompt | name and short trigger per tool | easyai::preamble::tools_block(view) | Tool::wire_description() |
| MCP tools/list response | name + full description + inputSchema | easyai::mcp::tool_descriptor(t) | Tool::description (full) |
Authoring rule: every tool sets both .short_describe(...) and .describe(...). wire_description() falls back to the first line of .describe(...) if .short_describe(...) was omitted (pre-Shape-C tools keep working).
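The fallback in the authoring rule amounts to "short text if present, else the first line of the full text". A minimal sketch, using a hypothetical `ToolText` struct in place of the real easyai::Tool:

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for the two description fields on a tool.
struct ToolText {
    std::string short_description;  // set via .short_describe(...)
    std::string description;        // set via .describe(...)
};

// Returns the short trigger, or the first line of the full description
// when no short trigger was provided.
std::string wire_description(const ToolText& tool) {
    if (!tool.short_description.empty()) return tool.short_description;
    const std::size_t end_of_line = tool.description.find('\n');
    return tool.description.substr(0, end_of_line);  // npos: whole string
}
```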
tool_lookup:
- No arg → INDEX view: numbered name: short list. Cheap, scannable.
- name=<substring> → MANUAL view: full .describe() body for every match. The expanded help text the model drills into when the short trigger isn't enough.
| Tool | Disk reads | Disk writes/edits |
|---|---|---|
| fs (or split fs_*) | yes | YES (primary) |
| bash | yes | YES, for shell features fs can't do |
| evaluate (legacy alias python3; runtime is Python 3) | yes (read-only) | NO: sandbox preamble rejects any write-mode open(), even inside the sandbox root |
Code enforcement: kPythonSandboxPreamble (src/builtin_tools.cpp) wraps builtins.open, io.open, and os.open to raise PermissionError on any w/a/x/+ mode or O_WRONLY|O_RDWR|O_CREAT|O_TRUNC|O_APPEND flag. Read-only r, rb, default-mode open continue to work inside the sandbox.
Prompt enforcement: easyai::preamble::tools_block(view) emits the ## Write/edit policy (AUTHORITATIVE) section whenever evaluate is registered. MCP servers inject the same policy via initialize.result.instructions so MCP clients' models see it too.
To defeat the model's strong "python = write files / call subprocess / fetch URL" training prior, the model-facing tool name was renamed from python3 to evaluate. The operator-facing surface (CLI flag --no-python, INI key allow_python, the runtime binary python3 -I -S -E -c) is unchanged.
canonical_tool_name("python3") returns "evaluate" so resumed chat sessions, hardcoded manifest reservations, and any legacy caller that dispatches by the old name still work — the dispatcher routes the legacy name to the new tool, no second schema shipped.
The model's short trigger is now: "Evaluate Python 3 code for compute / algorithm prototyping. FORBIDDEN: filesystem, subprocess, network, ctypes. Stdlib compute only." — the first non-generic word is evaluate, framing the affordance as expression evaluation, not file authoring.
easyai::preamble::build() emits blocks in this order so prompt-eval cache survives a memory write:
- AUTHORITATIVE DATE/TIME
- KNOWLEDGE CUTOFF
- KNOWLEDGE LOOP rules
- CITE SOURCES
- MEMORY VOCABULARY (volatile — appended at the tail)
render_memory_vocabulary caches the rendered string keyed on (root_dir, directory mtime, file count). Memory saves bump the directory mtime via rename(2), invalidating the cache without explicit signalling. Warm path: one stat() per request. Edge case: filesystems with second-resolution mtime (HFS+, some NFS) can serve one second of stale vocab if two writes within the same second leave the file-count unchanged — accepted because vocab is advisory (cf. SECURITY_AUDIT §23.3).
easyai::preamble::tools_block runs t.name and t.wire_description() through sanitize_for_prompt(s, cap) before emitting the active-tools bullet list. C0 control bytes (0x00–0x1f) and DEL (0x7f) collapse to a single space; UTF-8 multi-byte (0x80+) passes through. Caps: 64 chars (name), 200 chars (description). Closes the structural-corruption prompt-injection vector when active_tools is populated from a less-trusted source (e.g. cli's /v1/tools runtime fetch). See SECURITY_AUDIT §23.1.
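A minimal sketch of a single-line sanitizer with the behaviour described above. It is not the code in src/preamble.cpp: collapsing each run of control bytes into one space and counting the cap in bytes (backing off so a UTF-8 sequence is never split) are both assumptions.

```cpp
#include <cstddef>
#include <string>

// Replaces each run of C0 control bytes (0x00-0x1f) and DEL (0x7f) with a
// single space, passes bytes >= 0x80 through, then truncates to `cap`.
std::string sanitize_for_prompt(const std::string& input, std::size_t cap) {
    std::string out;
    bool last_was_control = false;
    for (unsigned char c : input) {
        const bool is_control = c < 0x20 || c == 0x7f;
        if (!is_control) {
            out.push_back(static_cast<char>(c));
        } else if (!last_was_control) {
            out.push_back(' ');
        }
        last_was_control = is_control;
    }
    if (out.size() > cap) {
        std::size_t cut = cap;
        // If the first dropped byte is a UTF-8 continuation byte, back off
        // to the start of that sequence so it is removed whole.
        while (cut > 0 && (static_cast<unsigned char>(out[cut]) & 0xC0) == 0x80) {
            --cut;
        }
        out.resize(cut);
    }
    return out;
}
```

Because newlines are among the stripped bytes, a hostile description cannot break out of its bullet and start a new heading or instruction line in the system prompt.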
The kPythonSandboxPreamble injected before every evaluate snippet enforces TWO invariants:
- Sandbox containment: open() / io.open() / os.open() reject paths resolving outside the cwd (sandbox root).
- Read-only: write-mode open(...) rejected regardless of path. Mode chars w/a/x/+ (any case) on builtins.open / io.open; flags O_WRONLY | O_RDWR | O_CREAT | O_TRUNC | O_APPEND on os.open.
PermissionError messages point the model at the filesystem write tool registered this session (the exact callable name is read from the model's AVAILABLE TOOLS list — that way the error message stays correct whether the operator chose Split mode fs_write or Unified mode fs(action="write")). Read-only opens inside the sandbox continue to work (legitimate "load CSV, compute, print result" flows are unaffected).
Documented residual: Python's __closure__ introspection on builtins.open recovers the unwrapped open from the closure cell, bypassing both checks. Same class as the existing ctypes / _io.FileIO / subprocess bypasses — adversarial intent is out of scope; defense is against accident. See SECURITY_AUDIT §23.2.