A lightweight CLI coding agent built as a reference implementation.
Merlion is a working coding agent you can run from the terminal or from WeChat. It is built as a reference implementation: small enough to read, but complete enough to show the real shape of a coding agent. Context assembly, tool execution, session persistence, and verification are all here in code you can actually follow.
Compared with broader tools such as Claude Code and Codex CLI, Merlion keeps the product layer intentionally thin so the runtime stays legible. The point is twofold: the core path is compact without being partial, and if coding agents are going to matter, we need a lightweight system that helps us understand what one actually is.
- A runtime loop with planning, tool execution, retries, guardrails, and verification
- A context system with orientation, compact summaries, path guidance, and layered AGENTS.md/MERLION.md
- A builtin tool layer for files, search, shell, git, config, and LSP-assisted edits
- A real sandbox stack: OS-level sandboxing for subprocesses plus application-layer policy enforcement for file, fetch, and approval flows
- Two transports: terminal REPL first, plus optional WeChat inbox mode
- Bench and regression lanes for fixture tests, BugsInPy, and SWE-bench Lite
- The core path stays short, but the essential pieces are still there: loop, tools, context, sessions, guardrails, verification
- The codebase is small enough to read end-to-end without reverse-engineering a large product surface
- It runs as a local Node.js runtime rather than depending on a hosted control plane
- The tool layer is practical, but still narrow enough to understand without days of setup
- The architecture is opinionated on purpose: fewer abstractions, fewer hidden systems, less ceremony
Lightweight here does not mean incomplete. It means the runtime is kept narrow enough that the design decisions are still visible.
Merlion requires Node.js >=22.
Global install:
npm install -g merlion
merlion
Project-local install:
npm install merlion
npx merlion
On first run, Merlion opens a setup wizard for provider, API key, and model. It works with OpenAI-compatible endpoints, including custom base URLs.
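As a rough illustration of what "OpenAI-compatible with a custom base URL" usually implies, the sketch below builds a chat-completions request from a provider config. The `ProviderConfig` shape and the `buildChatRequest` helper are hypothetical names for this example, not Merlion's actual config schema or code; only the `/chat/completions` path and Bearer authorization follow the common OpenAI-style convention.

```typescript
// Hypothetical config shape; Merlion's real config fields may differ.
interface ProviderConfig {
  baseUrl: string; // e.g. an OpenAI-style ".../v1" endpoint or a custom gateway
  apiKey: string;
  model: string;
}

interface ChatMessage {
  role: "system" | "user" | "assistant";
  content: string;
}

// Build the URL and request options for an OpenAI-style chat completion call.
function buildChatRequest(cfg: ProviderConfig, messages: ChatMessage[]) {
  // Strip trailing slashes so "…/v1" and "…/v1/" produce the same URL.
  const url = `${cfg.baseUrl.replace(/\/+$/, "")}/chat/completions`;
  return {
    url,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${cfg.apiKey}`,
      },
      body: JSON.stringify({ model: cfg.model, messages }),
    },
  };
}
```

The returned `url` and `init` could be handed to the global `fetch` in a recent Node.js release; keeping request construction separate from the network call makes it easy to inspect without sending anything.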
Common usage:
# one-shot
merlion "read src/index.ts and summarize the startup flow"
# interactive REPL
merlion
# continue a previous session
merlion --resume <session-id>
# restore the last git checkpoint for a session
merlion undo <session-id>
Default CLI execution runs with:
- --sandbox workspace-write
- --approval on-failure
- --network off
That means Merlion can change files in the current workspace, blocks outbound network access by default, and asks to widen the sandbox only after a sandbox or policy failure.
Useful overrides:
# strict read-only investigation
merlion --sandbox read-only --approval never
# allow networked shell/tool execution
merlion --network full
# fully unsandboxed local run
merlion --sandbox danger-full-access --approval never
Legacy flags still work:
- --auto-allow maps to --approval never
- --auto-deny maps to --approval untrusted
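To make the defaults and the legacy-flag mapping concrete, here is a minimal sketch of how an argument list might be folded into one policy object. `ExecutionPolicy`, `pick`, and `resolvePolicy` are hypothetical names, not Merlion's actual parser. The sketch also assumes that the last flag wins when a legacy flag and an explicit `--approval` both appear, and that `off` and `full` are the only network values, since those are the only ones shown here.

```typescript
type SandboxMode = "read-only" | "workspace-write" | "danger-full-access";
type ApprovalPolicy = "untrusted" | "on-failure" | "on-request" | "never";
type NetworkMode = "off" | "full"; // assumption: only the two values shown in this README

interface ExecutionPolicy {
  sandbox: SandboxMode;
  approval: ApprovalPolicy;
  network: NetworkMode;
}

const SANDBOX_MODES: readonly SandboxMode[] = ["read-only", "workspace-write", "danger-full-access"];
const APPROVAL_POLICIES: readonly ApprovalPolicy[] = ["untrusted", "on-failure", "on-request", "never"];
const NETWORK_MODES: readonly NetworkMode[] = ["off", "full"];

// Documented defaults for plain CLI execution.
const DEFAULT_POLICY: ExecutionPolicy = {
  sandbox: "workspace-write",
  approval: "on-failure",
  network: "off",
};

// Validate a flag value against its allowed set, failing loudly otherwise.
function pick<T extends string>(flag: string, value: string | undefined, allowed: readonly T[]): T {
  const match = allowed.find((candidate) => candidate === value);
  if (match === undefined) {
    throw new Error(`${flag} expects one of: ${allowed.join(", ")}`);
  }
  return match;
}

// Fold CLI arguments into a policy; unrelated arguments are ignored.
function resolvePolicy(argv: string[]): ExecutionPolicy {
  const policy: ExecutionPolicy = { ...DEFAULT_POLICY };
  for (let i = 0; i < argv.length; i++) {
    switch (argv[i]) {
      case "--sandbox":
        policy.sandbox = pick("--sandbox", argv[++i], SANDBOX_MODES);
        break;
      case "--approval":
        policy.approval = pick("--approval", argv[++i], APPROVAL_POLICIES);
        break;
      case "--network":
        policy.network = pick("--network", argv[++i], NETWORK_MODES);
        break;
      case "--auto-allow": // legacy alias
        policy.approval = "never";
        break;
      case "--auto-deny": // legacy alias
        policy.approval = "untrusted";
        break;
    }
  }
  return policy;
}
```

Under these assumptions, `resolvePolicy(["--sandbox", "read-only", "--approval", "never"])` corresponds to the strict read-only invocation above, and an empty argument list returns the defaults.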
Shell-like tools such as bash and run_script execute through the sandbox backend. File tools (read_file, write_file, edit_file, create_file, and related mutations) use the same sandbox policy at the application layer, so read-only, deny-read, and deny-write still apply even when no shell is involved. The fetch tool also respects --network.
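The application-layer check that file tools perform can be pictured roughly as follows. This is a sketch under stated assumptions, not Merlion's implementation: the `FilePolicy` fields (`workspace`, `writableRoots`, `denyRead`, `denyWrite`) and the helper names are hypothetical, deny lists are assumed to apply in every mode, and symlink resolution is deliberately left out.

```typescript
import * as path from "node:path";

type SandboxMode = "read-only" | "workspace-write" | "danger-full-access";

// Hypothetical policy shape for illustration only.
interface FilePolicy {
  mode: SandboxMode;
  workspace: string;
  writableRoots: string[];
  denyRead: string[];
  denyWrite: string[];
}

// True when `target` is `root` itself or lies somewhere beneath it.
function isInside(root: string, target: string): boolean {
  const rel = path.relative(path.resolve(root), path.resolve(target));
  if (rel === "") return true;
  const escapes = rel === ".." || rel.startsWith(".." + path.sep);
  return !escapes && !path.isAbsolute(rel);
}

function canRead(policy: FilePolicy, target: string): boolean {
  return !policy.denyRead.some((root) => isInside(root, target));
}

function canWrite(policy: FilePolicy, target: string): boolean {
  if (policy.mode === "read-only") return false;
  // Assumption: deny-write entries win even in danger-full-access.
  if (policy.denyWrite.some((root) => isInside(root, target))) return false;
  if (policy.mode === "danger-full-access") return true;
  // workspace-write: only the workspace and explicit writable roots.
  return [policy.workspace, ...policy.writableRoots].some((root) => isInside(root, target));
}
```

Comparing resolved relative paths, rather than testing string prefixes, keeps a sibling directory such as `/work/app-evil` from being mistaken for part of `/work/app`.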
Sandboxing is one of Merlion's core runtime features, not an afterthought.
Merlion separates two concerns:
- sandbox: the execution boundary
- approval: when Merlion is allowed to widen that boundary
The main modes are:
- read-only: no file mutations
- workspace-write: writes are limited to the workspace or explicit writable roots
- danger-full-access: no filesystem sandbox
Approval policies are:
- untrusted: deny escalation
- on-failure: ask only after a sandbox or policy failure
- on-request: allow interactive escalation requests
- never: never ask; stay inside the configured boundary
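One way to read the four approval policies is as a small decision function over "why is escalation being considered" and "is there a user to ask". The sketch below is illustrative only. `EscalationTrigger`, `Decision`, and `decideEscalation` are hypothetical names, and treating `on-request` as also prompting after a failure is an assumption, not documented behavior.

```typescript
type ApprovalPolicy = "untrusted" | "on-failure" | "on-request" | "never";

// Why escalation is being considered (hypothetical categories).
type EscalationTrigger = "sandbox-failure" | "agent-request";
type Decision = "ask-user" | "deny";

function decideEscalation(
  policy: ApprovalPolicy,
  trigger: EscalationTrigger,
  interactive: boolean,
): Decision {
  // Without an interactive channel there is nobody to ask.
  if (!interactive) return "deny";
  switch (policy) {
    case "untrusted":
    case "never":
      // Both stay inside the configured boundary.
      return "deny";
    case "on-failure":
      return trigger === "sandbox-failure" ? "ask-user" : "deny";
    case "on-request":
      // Assumption: explicit requests and failures both prompt.
      return "ask-user";
  }
}
```

In this reading the sandbox mode sets where the boundary is, while the approval policy only decides whether a prompt to move it is ever shown.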
This model applies across the runtime:
- bash and run_script run inside the sandbox backend
- file tools enforce the same policy at the application layer
- fetch respects network policy
- subagents inherit and can only narrow the parent sandbox
- WeChat runs without interactive escalation
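The "inherit and can only narrow" rule for subagents can be sketched as an ordering over sandbox modes. The ranking table and the `narrowSandbox` helper are hypothetical, and clamping a wider request to the parent's mode (rather than rejecting it) is an assumption made for this example.

```typescript
type SandboxMode = "read-only" | "workspace-write" | "danger-full-access";

// Lower rank means a tighter boundary.
const SANDBOX_RANK: Record<SandboxMode, number> = {
  "read-only": 0,
  "workspace-write": 1,
  "danger-full-access": 2,
};

// A subagent inherits the parent's mode unless it asks for something tighter.
function narrowSandbox(parent: SandboxMode, requested?: SandboxMode): SandboxMode {
  if (requested === undefined) return parent;
  // Assumption: a request wider than the parent is clamped, not an error.
  return SANDBOX_RANK[requested] <= SANDBOX_RANK[parent] ? requested : parent;
}
```

For example, a subagent of a workspace-write session that requests read-only gets read-only, while one that requests danger-full-access stays at workspace-write.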
Merlion also creates a git checkpoint for writable local sessions and provides merlion undo <session-id> and /undo as a recovery path.
If you want to read the code rather than just run it, start here:
- src/index.ts: CLI bootstrap, config resolution, session wiring
- src/runtime/loop.ts: main agent loop
- src/runtime/executor.ts: tool execution and model turn handling
- src/runtime/query_engine.ts: conversation runtime
- src/context/*: orientation, compacting, path guidance
- src/tools/*: tool registry and builtin tools
- src/transport/wechat/*: WeChat transport
There is also a higher-level technical overview in docs/merlion_runtime_technical_overview.md.
Merlion can use WeChat as an agent inbox.
# first time or token refresh
merlion wechat --login
# daily use
merlion wechat
Inside the REPL, you can also trigger login directly:
:wechat
/wechat
Credentials are stored at ~/.config/merlion/wechat.json.
By default, WeChat receives final replies and concise error hints, not internal tool logs. If you want progress updates, set MERLION_WECHAT_PROGRESS=1. For more detailed progress, set MERLION_WECHAT_PROGRESS_VERBOSE=1.
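The two progress variables suggest a three-level setting. A minimal sketch of how they might be read is shown below. `ProgressLevel` and `wechatProgressLevel` are hypothetical names, and the sketch assumes that only the literal value `1` enables a level and that the verbose variable takes precedence when both are set.

```typescript
type ProgressLevel = "off" | "progress" | "verbose";

// Takes the environment as a plain record so the function is easy to test.
function wechatProgressLevel(env: Record<string, string | undefined>): ProgressLevel {
  // Assumption: verbose wins when both variables are set.
  if (env.MERLION_WECHAT_PROGRESS_VERBOSE === "1") return "verbose";
  if (env.MERLION_WECHAT_PROGRESS === "1") return "progress";
  // Default: only final replies and concise error hints.
  return "off";
}
```

In a Node.js process this would typically be called as `wechatProgressLevel(process.env)`.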
Interactive terminal approvals are not available in WeChat mode. WeChat sessions run with approval=never, and the default sandbox is workspace-write, so the agent can edit files in the current workspace but cannot widen permissions mid-session. If you want a different startup boundary, pass sandbox flags explicitly when launching WeChat mode, for example:
# cautious
merlion wechat --sandbox read-only
# trusted local automation
merlion wechat --sandbox danger-full-access
- Not a product-comparison project; it is a runtime to read, run, and extend
- Not trying to reproduce every workflow and integration from broader agent tools
- Not a stable SDK or platform layer yet
- Not optimized primarily for non-technical onboarding
- Not interested in hiding architectural tradeoffs behind a black box
Merlion is a small, opinionated runtime meant to stay understandable while still covering the essential shape of a real coding agent.