Skip to content

SafeRL-Lab/nano-claude-code

Repository files navigation

English | 中文 | Français | 한국어 | 日本語 | Deutsch | Português

Logo

Nano Claude Code: A Fast, Easy-to-Use Python Reimplementation of Claude Code Supporting Any Model

The newest source of Claude Code · Issue · Brief Intro


🔥🔥🔥 News (Pacific Time)

  • 02:16 PM, Apr 05, 2026: Reasoning, Rendering, and Packaging Improvements
    • Post-merge fixes — removed a debug debug_payload.json file write that was firing on every OpenAI-compatible API call (left over from PR #11 development). Also fixed ANSI dim color not being reset after the thinking block ends, which caused subsequent text to appear dim in non-Rich terminals. Bumped pyproject.toml version to 3.05.4, and moved sounddevice to the optional voice extra (pip install nano-claude-code[voice]).
    • Native Ollama reasoning + terminal rendering fix — local reasoning models (deepseek-r1, qwen3, gemma4) now stream their <think> blocks to the terminal. Ollama exposes thoughts in msg["thinking"], but nano-claude was previously dropping them; this is now fixed by yielding ThinkingChunk from the Ollama adapter. Also fixed a Windows CMD/PowerShell rendering issue where token-by-token ANSI dim resets caused thoughts to print vertically, and corrected flush_response() so it runs once at the end instead of on every thinking token. Enable with /verbose and /thinking.
    • uv support — added pyproject.toml; install with uv tool install . to make the nano_claude command available globally from anywhere in an isolated environment, without manual PATH setup.
  • 00:41 PM, Apr 05, 2026: v3.05.4 — Structured session history: on every exit, sessions are saved to daily/YYYY-MM-DD/ (capped at session_daily_limit, default 5 per day) and appended to a master history.json (capped at session_history_limit, default 100). Each session file now includes session_id and saved_at metadata. /load groups sessions by date with time, ID, and turn-count display; supports multi-select (1,2,3) to merge sessions and H to load the full history with token-count confirmation. Both limits are configurable via /config.
  • 00:41 PM, Apr 05, 2026: v3.05.3 fix session — Structured session history: on every exit, sessions are saved to daily/YYYY-MM-DD/ (capped at session_daily_limit, default 5 per day) and appended to a master history.json (capped at session_history_limit, default 100). Each session file now includes session_id and saved_at metadata. /load groups sessions by date with time, ID, and turn-count display; supports multi-select (1,2,3) to merge sessions and H to load the full history with token-count confirmation. Both limits are configurable via /config.
  • 09:34 AM, Apr 05, 2026: v3.05.3 — Added GitHub Gist cloud sync: /cloudsave setup <token> to configure, /cloudsave to upload the current session to a private Gist, /cloudsave auto on to sync automatically on /exit, /cloudsave list to browse cloud sessions, and /cloudsave load <id> to restore from the cloud. Uses stdlib urllib — no new dependencies. Also added version number (e.g., v3.05.2) in the startup banner: The startup banner now displays the current version number (v3.05.2) in green, making it easy to identify which version is running at a glance.
  • 08:58 AM, Apr 05, 2026: v3.05.2 — Introduced /proactive [duration] command: a background daemon thread watches for user inactivity and automatically wakes the agent up after the specified interval (e.g. /proactive 5m), enabling continuous monitoring loops without user intervention. /proactive with no args now shows current status; /proactive off disables it explicitly. Proactive polling state is stored in config (no module-level globals). Watcher exceptions are logged via traceback instead of silently swallowed. Also fixed duplicated output in Rich-enabled terminals by buffering text during streaming and rendering Markdown once via rich.live.Live — updates happen in-place for a true streaming Markdown experience.
  • 10:51 PM, Apr 04, 2026: v3.05_fix04 — Fixed a crash on /model and config save commands caused by the newly introduced _run_query_callback being serialized to JSON; also added SleepTimer usage
    guidance to the system prompt so the agent knows when to invoke background timers proactively.
  • 10:28 PM, Apr 04, 2026: v3.05_fix03 — Added a native SleepTimer tool that lets the agent schedule background timers and autonomously wake itself up after a delay — no user prompt required. Paired with a threading.Lock to prevent output collisions when background and foreground calls overlap. Also includes cross-platform fixes: Windows ANSI color support, CRLF-aware Edit tool matching, an interactive numbered menu for /load, native Ollama streaming via /api/chat, and auto-capping max_tokens per provider to prevent API errors.
  • 08:31 PM, Apr 04, 2026: v3.05_fix — Autosave + /resume: session is automatically saved to mr_sessions/session_latest.json on /exit, /quit, Ctrl+C, and Ctrl+D. Run /resume to restore the last session instantly, or /resume <file> to load a specific file from mr_sessions/, and better support for api and local Ollama models (specifically gemma4), along with Windows compatibility enhancements, session management UX improvements, and cross-platform reliability fixes for the Edit tool.
  • 00:41 AM, Apr 04, 2026: v3.05 — Voice input (voice/ package): sounddevicearecord → SoX recording backends, faster-whisperopenai-whisper → OpenAI API STT backends. Smart keyterm extraction from git branch + project name + recent files passed as Whisper initial_prompt for coding-domain accuracy. /voice, /voice status, /voice lang <code> REPL commands. Works fully offline with no API key. 29 new tests (~11.6K lines of Python).
  • 10:29 PM, Apr 03, 2026: v3.04 — Expanded tool coverage: NotebookEdit (edit Jupyter .ipynb cells — replace/insert/delete with full JSON round-trip) and GetDiagnostics (LSP-style diagnostics via pyright/mypy/flake8/tsc/shellcheck). Also fixed a pre-existing schema-index bug in _register_builtins by switching to name-based lookup (~10.5K lines of Python).
  • 06:00 PM, Apr 03, 2026: v3.03 — Task management system (task/ package): TaskCreate / TaskUpdate / TaskGet / TaskList tools with sequential IDs, dependency edges (blocks/blocked_by), metadata, persistence to .nano_claude/tasks.json, thread-safe store, /tasks REPL command, 37 new tests (~9500 lines of Python).
  • 02:50 PM, Apr 03, 2026: v3.02 — Plugin system (plugin/ package): install/uninstall/enable/disable/update via /plugin CLI, recommendation engine (keyword+tag matching), multi-scope (user/project), git-based marketplace. AskUserQuestion tool: interactive mid-task user prompts with numbered options and free-text input (~8500 lines of Python).
  • 10:00 AM, Apr 03, 2026: v3.01 — MCP (Model Context Protocol) support: mcp/ package, stdio + SSE + HTTP transports, auto tool discovery, /mcp command, 34 new tests (~7000 lines of Python).
  • 12:20 PM, Apr 02, 2026: v3.0 — Multi-agent packages (multi_agent/), memory package (memory/), skill package (skill/) with built-in skills, argument substitution, fork/inline execution, AI memory search, git worktree isolation, agent type definitions (~5000 lines of Python), see update.
  • 10:00 AM, Apr 02, 2026: v2.0 — Context compression, memory, sub-agents, skills, diff view, tool plugin system (~3400 lines of Python Code).
  • 01:47 PM, Apr 01, 2026: Support VLLM inference (~2000 lines of Python Code).
  • 11:30 AM, Apr 01, 2026: Support more closed-source models and open-source models: Claude, GPT, Gemini, Kimi, Qwen, Zhipu, DeepSeek, and local open-source models via Ollama or any OpenAI-compatible endpoint. (~1700 lines of Python Code).
  • 09:50 AM, Apr 01, 2026: Support more closed-source models: Claude, GPT, Gemini. (~1300 lines of Python Code).
  • 08:23 AM, Apr 01, 2026: Release the initial version of Nano Claude Code (~900 lines of Python Code).

Nano Claude Code

Nano Claude Code: A Lightweight and Easy-to-Use Python Reimplementation of Claude Code Supporting Any Model, such as Claude, GPT, Gemini, Kimi, Qwen, Zhipu, DeepSeek, and local open-source models via Ollama or any OpenAI-compatible endpoint.


Content

Why Nano Claude Code

Claude Code is a powerful, production-grade AI coding assistant — but its source code is a compiled, 12 MB TypeScript/Node.js bundle (~1,300 files, ~283K lines). It is tightly coupled to the Anthropic API, hard to modify, and impossible to run against a local or alternative model.

Nano Claude Code reimplements the same core loop in ~10K lines of readable Python, keeping everything you need and dropping what you don't. See here for more detailed analysis (Nano Claude code v3.03), English version and Chinese version

At a glance

Dimension Claude Code (TypeScript) Nano Claude Code (Python)
Language TypeScript + React/Ink Python 3.8+
Source files ~1,332 TS/TSX files 51 Python files
Lines of code ~283K ~11.6K
Built-in tools 44+ 25
Slash commands 88 20
Voice input Proprietary Anthropic WebSocket (OAuth required) Local Whisper / OpenAI API — works offline, no subscription
Model providers Anthropic only 7+ (Anthropic · OpenAI · Gemini · Kimi · Qwen · DeepSeek · Ollama · …)
Local models No Yes — Ollama, LM Studio, vLLM, any OpenAI-compatible endpoint
Build step required Yes (Bun + esbuild) No — run directly with python nano_claude.py (or install to use nano_claude)
Runtime extensibility Closed (compile-time) Open — register_tool() at runtime, Markdown skills, git plugins
Task dependency graph No Yes — blocks / blocked_by edges in task/ package

Where Claude Code wins

  • UI quality — React/Ink component tree with streaming rendering, fine-grained diff visualization, and dialog systems.
  • Tool breadth — 44 tools including RemoteTrigger, EnterWorktree, and more UI-integrated tools.
  • Enterprise features — MDM-managed config, team permission sync, OAuth, keychain storage, GrowthBook feature flags.
  • AI-driven memory extractionextractMemories service proactively extracts knowledge from conversations without explicit tool calls.
  • Production reliability — single distributable cli.js, comprehensive test coverage, version-locked releases.

Where Nano Claude Code wins

  • Multi-provider — switch between Claude, GPT-4o, Gemini 2.5 Pro, DeepSeek, Qwen, or a local Llama model with --model or /model — no recompile needed.
  • Local model support — run entirely offline with Ollama, LM Studio, or any vLLM-hosted model.
  • Readable source — the full agent loop is 174 lines (agent.py). Any Python developer can read, fork, and extend it in minutes.
  • Zero buildpip install -r requirements.txt and you're running. Changes take effect immediately.
  • Dynamic extensibility — register new tools at runtime with register_tool(ToolDef(...)), install skill packs from git URLs, or wire in any MCP server.
  • Task dependency graphTaskCreate / TaskUpdate support blocks / blocked_by edges for structured multi-step planning (not available in Claude Code).
  • Two-layer context compression — rule-based snip + AI summarization, configurable via preserve_last_n_turns.
  • Notebook editingNotebookEdit directly manipulates .ipynb JSON (replace/insert/delete cells) with no kernel required.
  • Diagnostics without LSP serverGetDiagnostics chains pyright → mypy → flake8 → py_compile for Python and tsc/shellcheck for other languages, with zero configuration.
  • Offline voice input/voice records via sounddevice/arecord/SoX, transcribes with local faster-whisper (no API key, no subscription), and auto-submits. Keyterms from your git branch and project files boost coding-term accuracy.
  • Cloud session sync/cloudsave backs up conversations to private GitHub Gists with zero extra dependencies; restore any past session on any machine with /cloudsave load <id>.
  • Proactive background monitoring/proactive 5m activates a sentinel daemon that wakes the agent automatically after a period of inactivity, enabling continuous monitoring loops, scheduled checks, or trading bots without user prompts.
  • Rich Live streaming rendering — When rich is installed, responses stream as live-updating Markdown in place (no duplicate raw text), with clean tool-call interleaving.
  • Native Ollama reasoning — Local reasoning models (deepseek-r1, qwen3, gemma4) stream their <think> tokens directly to the terminal via ThinkingChunk events; enable with /verbose and /thinking.

Key design differences

Agent loop — Nano uses a Python generator that yields typed events (TextChunk, ToolStart, ToolEnd, TurnDone). The entire loop is visible in one file, making it easy to add hooks, custom renderers, or logging.

Tool registration — every tool is a ToolDef(name, schema, func, read_only, concurrent_safe) dataclass. Any module can call register_tool() at import time; MCP servers, plugins, and skills all use the same mechanism.

Context compression

Claude Code Nano Claude Code
Trigger Exact token count len / 3.5 estimate, fires at 70 %
Layer 1 Snip: truncate old tool outputs (no API cost)
Layer 2 AI summarization AI summarization of older turns
Control System-managed preserve_last_n_turns parameter

Memory — Claude Code's extractMemories service has the model proactively surface facts. Nano's memory/ package is tool-driven: the model calls MemorySave explicitly, which is more predictable and auditable.

Who should use Nano Claude Code

  • Developers who want to use a local or non-Anthropic model as their coding assistant.
  • Researchers studying how agentic coding assistants work — the entire system fits in one screen.
  • Teams who need a hackable baseline to add proprietary tools, custom permission policies, or specialised agent types.
  • Anyone who wants Claude Code-style productivity without a Node.js build chain.

Features

Feature Details
Multi-provider Anthropic · OpenAI · Gemini · Kimi · Qwen · Zhipu · DeepSeek · Ollama · LM Studio · Custom endpoint
Interactive REPL readline history, Tab-complete slash commands
Agent loop Streaming API + automatic tool-use loop
25 built-in tools Read · Write · Edit · Bash · Glob · Grep · WebFetch · WebSearch · NotebookEdit · GetDiagnostics · MemorySave · MemoryDelete · MemorySearch · MemoryList · Agent · SendMessage · CheckAgentResult · ListAgentTasks · ListAgentTypes · Skill · SkillList · AskUserQuestion · TaskCreate/Update/Get/List · SleepTimer · (MCP + plugin tools auto-added at startup)
MCP integration Connect any MCP server (stdio/SSE/HTTP), tools auto-registered and callable by Claude
Plugin system Install/uninstall/enable/disable/update plugins from git URLs or local paths; multi-scope (user/project); recommendation engine
AskUserQuestion Claude can pause and ask the user a clarifying question mid-task, with optional numbered choices
Task management TaskCreate/Update/Get/List tools; sequential IDs; dependency edges; metadata; persisted to .nano_claude/tasks.json; /tasks REPL command
Diff view Git-style red/green diff display for Edit and Write
Context compression Auto-compact long conversations to stay within model limits
Persistent memory Dual-scope memory (user + project) with 4 types, AI search, staleness warnings
Multi-agent Spawn typed sub-agents (coder/reviewer/researcher/…), git worktree isolation, background mode
Skills Built-in /commit · /review + custom markdown skills with argument substitution and fork/inline execution
Plugin tools Register custom tools via tool_registry.py
Permission system auto / accept-all / manual modes
19 slash commands /model · /config · /save · /cost · /memory · /skills · /agents · /voice · /proactive · …
Voice input Record → transcribe → auto-submit. Backends: sounddevice / arecord / SoX + faster-whisper / openai-whisper / OpenAI API. Works fully offline.
Proactive monitoring /proactive [duration] starts a background sentinel daemon; agent wakes automatically after inactivity, enabling continuous monitoring loops without user prompts
Rich Live streaming When rich is installed, responses render as live-updating Markdown in place — no duplicate raw text, clean tool-call interleaving
Context injection Auto-loads CLAUDE.md, git status, cwd, persistent memory
Session persistence Autosave on exit to daily/YYYY-MM-DD/ (per-day limit) + history.json (master, all sessions) + session_latest.json (/resume); sessions include session_id and saved_at metadata; /load grouped by date
Cloud sync /cloudsave syncs sessions to private GitHub Gists; auto-sync on exit; load from cloud by Gist ID. No new dependencies (stdlib urllib).
Extended Thinking Toggle on/off for Claude models; native <think> block streaming for local Ollama reasoning models (deepseek-r1, qwen3, gemma4)
Cost tracking Token usage + estimated USD cost
Non-interactive mode --print flag for scripting / CI

Supported Models

Closed-Source (API)

Provider Model Context Strengths API Key Env
Anthropic claude-opus-4-6 200k Most capable, best for complex reasoning ANTHROPIC_API_KEY
Anthropic claude-sonnet-4-6 200k Balanced speed & quality ANTHROPIC_API_KEY
Anthropic claude-haiku-4-5-20251001 200k Fast, cost-efficient ANTHROPIC_API_KEY
OpenAI gpt-4o 128k Strong multimodal & coding OPENAI_API_KEY
OpenAI gpt-4o-mini 128k Fast, cheap OPENAI_API_KEY
OpenAI o3-mini 200k Strong reasoning OPENAI_API_KEY
OpenAI o1 200k Advanced reasoning OPENAI_API_KEY
Google gemini-2.5-pro-preview-03-25 1M Long context, multimodal GEMINI_API_KEY
Google gemini-2.0-flash 1M Fast, large context GEMINI_API_KEY
Google gemini-1.5-pro 2M Largest context window GEMINI_API_KEY
Moonshot (Kimi) moonshot-v1-8k 8k Chinese & English MOONSHOT_API_KEY
Moonshot (Kimi) moonshot-v1-32k 32k Chinese & English MOONSHOT_API_KEY
Moonshot (Kimi) moonshot-v1-128k 128k Long context MOONSHOT_API_KEY
Alibaba (Qwen) qwen-max 32k Best Qwen quality DASHSCOPE_API_KEY
Alibaba (Qwen) qwen-plus 128k Balanced DASHSCOPE_API_KEY
Alibaba (Qwen) qwen-turbo 1M Fast, cheap DASHSCOPE_API_KEY
Alibaba (Qwen) qwq-32b 32k Strong reasoning DASHSCOPE_API_KEY
Zhipu (GLM) glm-4-plus 128k Best GLM quality ZHIPU_API_KEY
Zhipu (GLM) glm-4 128k General purpose ZHIPU_API_KEY
Zhipu (GLM) glm-4-flash 128k Free tier available ZHIPU_API_KEY
DeepSeek deepseek-chat 64k Strong coding DEEPSEEK_API_KEY
DeepSeek deepseek-reasoner 64k Chain-of-thought reasoning DEEPSEEK_API_KEY

Open-Source (Local via Ollama)

Model Size Strengths Pull Command
llama3.3 70B General purpose, strong reasoning ollama pull llama3.3
llama3.2 3B / 11B Lightweight ollama pull llama3.2
qwen2.5-coder 7B / 32B Best for coding tasks ollama pull qwen2.5-coder
qwen2.5 7B / 72B Chinese & English ollama pull qwen2.5
deepseek-r1 7B–70B Reasoning, math ollama pull deepseek-r1
deepseek-coder-v2 16B Coding ollama pull deepseek-coder-v2
mistral 7B Fast, efficient ollama pull mistral
mixtral 8x7B Strong MoE model ollama pull mixtral
phi4 14B Microsoft, strong reasoning ollama pull phi4
gemma3 4B / 12B / 27B Google open model ollama pull gemma3
codellama 7B / 34B Code generation ollama pull codellama

Note: Tool calling requires a model that supports function calling. Recommended local models: qwen2.5-coder, llama3.3, mistral, phi4.

Reasoning models: deepseek-r1, qwen3, and gemma4 stream native <think> blocks. Enable with /verbose and /thinking to see thoughts in the terminal. Note: models fed a large system prompt (like nano-claude's 25 tool schemas) may suppress their thinking phase to avoid breaking the expected JSON format — this is model behavior, not a bug.


Installation

Recommended: install as a global command with uv

uv installs nano_claude into an isolated environment and puts it on your PATH so you can run it from anywhere:

# Install uv (if not already installed)
curl -LsSf https://astral.sh/uv/install.sh | sh

# Clone and install
git clone <repo-url>
cd nano-claude-code
uv tool install .

After that, nano_claude is available as a global command:

nano_claude                        # start REPL
nano_claude --model gpt-4o         # choose a model
nano_claude -p "explain this"      # non-interactive

To update after pulling new code:

uv tool install . --reinstall

To uninstall:

uv tool uninstall nano-claude-code

Alternative: run directly from the repo

git clone <repo-url>
cd nano-claude-code

pip install -r requirements.txt
# or manually (sounddevice is optional — only needed for /voice):
pip install anthropic openai httpx rich
pip install sounddevice  # optional: voice input

python nano_claude.py

Usage: Closed-Source API Models

Anthropic Claude

Get your API key at console.anthropic.com.

export ANTHROPIC_API_KEY=sk-ant-api03-...

# Default model (claude-opus-4-6)
nano_claude

# Choose a specific model
nano_claude --model claude-sonnet-4-6
nano_claude --model claude-haiku-4-5-20251001

# Enable Extended Thinking
nano_claude --model claude-opus-4-6 --thinking --verbose

OpenAI GPT

Get your API key at platform.openai.com.

export OPENAI_API_KEY=sk-...

nano_claude --model gpt-4o
nano_claude --model gpt-4o-mini
nano_claude --model gpt-4.1-mini
nano_claude --model o3-mini

Google Gemini

Get your API key at aistudio.google.com.

export GEMINI_API_KEY=AIza...

nano_claude --model gemini/gemini-2.0-flash
nano_claude --model gemini/gemini-1.5-pro
nano_claude --model gemini/gemini-2.5-pro-preview-03-25

Kimi (Moonshot AI)

Get your API key at platform.moonshot.cn.

export MOONSHOT_API_KEY=sk-...

nano_claude --model kimi/moonshot-v1-32k
nano_claude --model kimi/moonshot-v1-128k

Qwen (Alibaba DashScope)

Get your API key at dashscope.aliyun.com.

export DASHSCOPE_API_KEY=sk-...

nano_claude --model qwen/Qwen3.5-Plus
nano_claude --model qwen/Qwen3-MAX
nano_claude --model qwen/Qwen3.5-Flash

Zhipu GLM

Get your API key at open.bigmodel.cn.

export ZHIPU_API_KEY=...

nano_claude --model zhipu/glm-4-plus
nano_claude --model zhipu/glm-4-flash   # free tier

DeepSeek

Get your API key at platform.deepseek.com.

export DEEPSEEK_API_KEY=sk-...

nano_claude --model deepseek/deepseek-chat
nano_claude --model deepseek/deepseek-reasoner

Usage: Open-Source Models (Local)

Option A — Ollama (Recommended)

Ollama runs models locally with zero configuration. No API key required.

Step 1: Install Ollama

# macOS / Linux
curl -fsSL https://ollama.com/install.sh | sh

# Or download from https://ollama.com/download

Step 2: Pull a model

# Best for coding (recommended)
ollama pull qwen2.5-coder          # 4.7 GB (7B)
ollama pull qwen2.5-coder:32b      # 19 GB (32B)

# General purpose
ollama pull llama3.3               # 42 GB (70B)
ollama pull llama3.2               # 2.0 GB (3B)

# Reasoning
ollama pull deepseek-r1            # 4.7 GB (7B)
ollama pull deepseek-r1:32b        # 19 GB (32B)

# Other
ollama pull phi4                   # 9.1 GB (14B)
ollama pull mistral                # 4.1 GB (7B)

Step 3: Start Ollama server (runs automatically on macOS; on Linux run manually)

ollama serve     # starts on http://localhost:11434

Step 4: Run nano claude

nano_claude --model ollama/qwen2.5-coder
nano_claude --model ollama/llama3.3
nano_claude --model ollama/deepseek-r1

List your locally available models:

ollama list

Then use any model from the list:

nano_claude --model ollama/<model-name>

Option B — LM Studio

LM Studio provides a GUI to download and run models, with a built-in OpenAI-compatible server.

Step 1: Download LM Studio and install it.

Step 2: Search and download a model inside LM Studio (GGUF format).

Step 3: Go to Local Server tab → click Start Server (default port: 1234).

Step 4:

nano_claude --model lmstudio/<model-name>
# e.g.:
nano_claude --model lmstudio/phi-4-GGUF
nano_claude --model lmstudio/qwen2.5-coder-7b

The model name should match what LM Studio shows in the server status bar.


Option C — vLLM / Self-Hosted OpenAI-Compatible Server

For self-hosted inference servers (vLLM, TGI, llama.cpp server, etc.) that expose an OpenAI-compatible API:

Quick Start for option C: Step 1: Start vllm:

CUDA_VISIBLE_DEVICES=7 python -m vllm.entrypoints.openai.api_server \
     --model Qwen/Qwen2.5-Coder-7B-Instruct \
     --host 0.0.0.0 \
     --port 8000 \
     --enable-auto-tool-choice \
     --tool-call-parser hermes

Step 2: Start nano claude:

  export CUSTOM_BASE_URL=http://localhost:8000/v1
  export CUSTOM_API_KEY=none
  nano_claude --model custom/Qwen/Qwen2.5-Coder-7B-Instruct
# Example: vLLM serving Qwen2.5-Coder-32B
python -m vllm.entrypoints.openai.api_server \
    --model Qwen/Qwen2.5-Coder-32B-Instruct \
    --port 8000

# Then run nano claude pointing to your server:
nano_claude

Inside the REPL:

/config custom_base_url=http://localhost:8000/v1
/config custom_api_key=token-abc123    # skip if no auth
/model custom/Qwen2.5-Coder-32B-Instruct

Or set via environment:

export CUSTOM_BASE_URL=http://localhost:8000/v1
export CUSTOM_API_KEY=token-abc123

nano_claude --model custom/Qwen2.5-Coder-32B-Instruct

For a remote GPU server:

/config custom_base_url=http://192.168.1.100:8000/v1
/model custom/your-model-name

Model Name Format

Three equivalent formats are supported:

# 1. Auto-detect by prefix (works for well-known models)
nano_claude --model gpt-4o
nano_claude --model gemini-2.0-flash
nano_claude --model deepseek-chat

# 2. Explicit provider prefix with slash
nano_claude --model ollama/qwen2.5-coder
nano_claude --model kimi/moonshot-v1-128k

# 3. Explicit provider prefix with colon (also works)
nano_claude --model kimi:moonshot-v1-32k
nano_claude --model qwen:qwen-max

Auto-detection rules:

Model prefix Detected provider
claude- anthropic
gpt-, o1, o3 openai
gemini- gemini
moonshot-, kimi- kimi
qwen, qwq- qwen
glm- zhipu
deepseek- deepseek
llama, mistral, phi, gemma, mixtral, codellama ollama

CLI Reference

nano_claude [OPTIONS] [PROMPT]
# or: python nano_claude.py [OPTIONS] [PROMPT]

Options:
  -p, --print          Non-interactive: run prompt and exit
  -m, --model MODEL    Override model (e.g. gpt-4o, ollama/llama3.3)
  --accept-all         Auto-approve all operations (no permission prompts)
  --verbose            Show thinking blocks and per-turn token counts
  --thinking           Enable Extended Thinking (Claude only)
  --version            Print version and exit
  -h, --help           Show help

Examples:

# Interactive REPL with default model
nano_claude

# Switch model at startup
nano_claude --model gpt-4o
nano_claude -m ollama/deepseek-r1:32b

# Non-interactive / scripting
nano_claude --print "Write a Python fibonacci function"
nano_claude -p "Explain the Rust borrow checker in 3 sentences" -m gemini/gemini-2.0-flash

# CI / automation (no permission prompts)
nano_claude --accept-all --print "Initialize a Python project with pyproject.toml"

# Debug mode (see tokens + thinking)
nano_claude --thinking --verbose

Slash Commands (REPL)

Type / and press Tab to autocomplete.

Command Description
/help Show all commands
/clear Clear conversation history
/model Show current model + list all available models
/model <name> Switch model (takes effect immediately)
/config Show all current config values
/config key=value Set a config value (persisted to disk)
/save Save session (auto-named by timestamp)
/save <filename> Save session to named file
/load Interactive list grouped by date; enter number, 1,2,3 to merge, or H for full history
/load <filename> Load a saved session by filename
/resume Restore the last auto-saved session (mr_sessions/session_latest.json)
/resume <filename> Load a specific file from mr_sessions/ (or absolute path)
/history Print full conversation history
/context Show message count and token estimate
/cost Show token usage and estimated USD cost
/verbose Toggle verbose mode (tokens + thinking)
/thinking Toggle Extended Thinking (Claude only)
/permissions Show current permission mode
/permissions <mode> Set permission mode: auto / accept-all / manual
/cwd Show current working directory
/cwd <path> Change working directory
/memory List all persistent memories
/memory <query> Search memories by keyword
/skills List available skills
/agents Show sub-agent task status
/mcp List configured MCP servers and their tools
/mcp reload Reconnect all MCP servers and refresh tools
/mcp reload <name> Reconnect a single MCP server
/mcp add <name> <cmd> [args] Add a stdio MCP server to user config
/mcp remove <name> Remove a server from user config
/voice Record voice, transcribe with Whisper, auto-submit as prompt
/voice status Show recording and STT backend availability
/voice lang <code> Set STT language (e.g. zh, en, ja; auto to detect)
/proactive Show current proactive polling status (ON/OFF and interval)
/proactive <duration> Enable background sentinel polling (e.g. 5m, 30s, 1h)
/proactive off Disable background polling
/cloudsave setup <token> Configure GitHub Personal Access Token for Gist sync
/cloudsave Upload current session to a private GitHub Gist
/cloudsave push [desc] Upload with an optional description
/cloudsave auto on|off Toggle auto-upload on /exit
/cloudsave list List your nano-claude-code Gists
/cloudsave load <gist_id> Download and restore a session from Gist
/exit / /quit Exit

Switching models inside a session:

[myproject] ❯ /model
  Current model: claude-opus-4-6  (provider: anthropic)

  Available models by provider:
    anthropic     claude-opus-4-6, claude-sonnet-4-6, ...
    openai        gpt-4o, gpt-4o-mini, o3-mini, ...
    ollama        llama3.3, llama3.2, phi4, mistral, ...
    ...

[myproject] ❯ /model gpt-4o
  Model set to gpt-4o  (provider: openai)

[myproject] ❯ /model ollama/qwen2.5-coder
  Model set to ollama/qwen2.5-coder  (provider: ollama)

Configuring API Keys

Method 1: Environment Variables (recommended)

# Add to ~/.bashrc or ~/.zshrc
export ANTHROPIC_API_KEY=sk-ant-...
export OPENAI_API_KEY=sk-...
export GEMINI_API_KEY=AIza...
export MOONSHOT_API_KEY=sk-...       # Kimi
export DASHSCOPE_API_KEY=sk-...      # Qwen
export ZHIPU_API_KEY=...             # Zhipu GLM
export DEEPSEEK_API_KEY=sk-...       # DeepSeek

Method 2: Set Inside the REPL (persisted)

/config anthropic_api_key=sk-ant-...
/config openai_api_key=sk-...
/config gemini_api_key=AIza...
/config kimi_api_key=sk-...
/config qwen_api_key=sk-...
/config zhipu_api_key=...
/config deepseek_api_key=sk-...

Keys are saved to ~/.nano_claude/config.json and loaded automatically on next launch.

Method 3: Edit the Config File Directly

// ~/.nano_claude/config.json
{
  "model": "qwen/qwen-max",
  "max_tokens": 8192,
  "permission_mode": "auto",
  "verbose": false,
  "thinking": false,
  "qwen_api_key": "sk-...",
  "kimi_api_key": "sk-...",
  "deepseek_api_key": "sk-..."
}

Permission System

Mode Behavior
auto (default) Read-only operations always allowed. Prompts before Bash commands and file writes.
accept-all Never prompts. All operations proceed automatically.
manual Prompts before every single operation, including reads.

When prompted:

  Allow: Run: git commit -am "fix bug"  [y/N/a(ccept-all)]
  • y — approve this one action
  • n or Enter — deny
  • a — approve and switch to accept-all for the rest of the session

Commands always auto-approved in auto mode: ls, cat, head, tail, wc, pwd, echo, git status, git log, git diff, git show, find, grep, rg, python, node, pip show, npm list, and other read-only shell commands.


Built-in Tools

Core Tools

Tool Description Key Parameters
Read Read file with line numbers file_path, limit, offset
Write Create or overwrite file (shows diff) file_path, content
Edit Exact string replacement (shows diff) file_path, old_string, new_string, replace_all
Bash Execute shell command command, timeout (default 30s)
Glob Find files by glob pattern pattern (e.g. **/*.py), path
Grep Regex search in files (uses ripgrep if available) pattern, path, glob, output_mode
WebFetch Fetch and extract text from URL url, prompt
WebSearch Search the web via DuckDuckGo query

Notebook & Diagnostics Tools

Tool Description Key Parameters
NotebookEdit Edit a Jupyter notebook (.ipynb) cell notebook_path, new_source, cell_id, cell_type, edit_mode (replace/insert/delete)
GetDiagnostics Get LSP-style diagnostics for a source file (pyright/mypy/flake8 for Python; tsc/eslint for JS/TS; shellcheck for shell) file_path, language (optional override)

Memory Tools

Tool Description Key Parameters
MemorySave Save or update a persistent memory name, type, description, content, scope
MemoryDelete Delete a memory by name name, scope
MemorySearch Search memories by keyword (or AI ranking) query, scope, use_ai, max_results
MemoryList List all memories with age and metadata scope

Sub-Agent Tools

Tool Description Key Parameters
Agent Spawn a sub-agent for a task prompt, subagent_type, isolation, name, model, wait
SendMessage Send a message to a named background agent name, message
CheckAgentResult Check status/result of a background agent task_id
ListAgentTasks List all active and finished agent tasks
ListAgentTypes List available agent type definitions

Background & Autonomy Tools

Tool Description Key Parameters
SleepTimer Schedule a silent background timer; injects an automated wake-up prompt when it fires so the agent can resume monitoring or deferred tasks seconds

Skill Tools

Tool Description Key Parameters
Skill Invoke a skill by name from within the conversation name, args
SkillList List all available skills with triggers and metadata

MCP Tools

MCP tools are discovered automatically from configured servers and registered under the name mcp__<server>__<tool>. Claude can use them exactly like built-in tools.

Example tool name Where it comes from
mcp__git__git_status git server, git_status tool
mcp__filesystem__read_file filesystem server, read_file tool
mcp__myserver__my_action custom server you configured

Adding custom tools: See Architecture Guide for how to register your own tools.


Memory

The model can remember things across conversations using the built-in memory system.

How it works: Memories are stored as markdown files. There are two scopes:

  • User scope (~/.nano_claude/memory/) — follows you across all projects
  • Project scope (.nano_claude/memory/ in cwd) — specific to the current repo

A MEMORY.md index (≤ 200 lines / 25 KB) is auto-rebuilt on every save or delete and injected into the system prompt so Claude always has an overview.

Memory types:

Type Use for
user Your role, preferences, background
feedback How you want the model to behave
project Ongoing work, deadlines, decisions
reference Links to external resources

Memory file format (~/.nano_claude/memory/coding_style.md):

---
name: coding style
description: Python formatting preferences
type: feedback
created: 2026-04-02
---
Prefer 4-space indentation and full type hints in all Python code.
**Why:** user explicitly stated this preference.
**How to apply:** apply to every Python file written or edited.

Example interaction:

You: Remember that I prefer 4-space indentation and type hints in all Python code.
AI: [calls MemorySave] Memory saved: coding_style [feedback/user]

You: /memory
  [feedback/user] coding_style (today): Python formatting preferences

You: /memory python
  [feedback/user] coding_style: Prefers 4-space indent and type hints in Python

Staleness warnings: Memories older than 1 day get a freshness note in /memory output so you know when to review or update them.

AI-ranked search: MemorySearch(query="...", use_ai=true) uses the model to rank results by relevance rather than simple keyword matching.


Skills

Skills are reusable prompt templates that give the model specialized capabilities. Two built-in skills ship out of the box — no setup required.

Built-in skills:

Trigger Description
/commit Review staged changes and create a well-structured git commit
/review [PR] Review code or PR diff with structured feedback

Quick start — custom skill:

mkdir -p ~/.nano_claude/skills

Create ~/.nano_claude/skills/deploy.md:

---
name: deploy
description: Deploy to an environment
triggers: [/deploy]
allowed-tools: [Bash, Read]
when_to_use: Use when the user wants to deploy a version to an environment.
argument-hint: [env] [version]
arguments: [env, version]
context: inline
---

Deploy $VERSION to the $ENV environment.
Full args: $ARGUMENTS

Now use it:

You: /deploy staging 2.1.0
AI: [deploys version 2.1.0 to staging]

Argument substitution:

  • $ARGUMENTS — the full raw argument string
  • $ARG_NAME — positional substitution by named argument (first word → first name)
  • Missing args become empty strings

Execution modes:

  • context: inline (default) — runs inside current conversation history
  • context: fork — runs as an isolated sub-agent with fresh history; supports model override

Priority (highest wins): project-level > user-level > built-in

List skills: /skills — shows triggers, argument hint, source, and when_to_use

Skill search paths:

./.nano_claude/skills/     # project-level (overrides user-level)
~/.nano_claude/skills/     # user-level

Sub-Agents

The model can spawn independent sub-agents to handle tasks in parallel.

Specialized agent types — built-in:

Type Optimized for
general-purpose Research, exploration, multi-step tasks
coder Writing, reading, and modifying code
reviewer Security, correctness, and code quality analysis
researcher Web search and documentation lookup
tester Writing and running tests

Basic usage:

You: Search this codebase for all TODO comments and summarize them.
AI: [calls Agent(prompt="...", subagent_type="researcher")]
    Sub-agent reads files, greps for TODOs...
    Result: Found 12 TODOs across 5 files...

Background mode — spawn without waiting, collect result later:

AI: [calls Agent(prompt="run all tests", name="test-runner", wait=false)]
AI: [continues other work...]
AI: [calls CheckAgentResult / SendMessage to follow up]

Git worktree isolation — agents work on an isolated branch with no conflicts:

Agent(prompt="refactor auth module", isolation="worktree")

The worktree is auto-cleaned up if no changes were made; otherwise the branch name is reported.

Custom agent types — create ~/.nano_claude/agents/myagent.md:

---
name: myagent
description: Specialized for X
model: claude-haiku-4-5-20251001
tools: [Read, Grep, Bash]
---
Extra system prompt for this agent type.

List running agents: /agents

Sub-agents have independent conversation history, share the file system, and are limited to 3 levels of nesting.


MCP (Model Context Protocol)

MCP lets you connect any external tool server — local subprocess or remote HTTP — and Claude can use its tools automatically. This is the same protocol Claude Code uses to extend its capabilities.

Supported transports

Transport Config type Description
stdio "stdio" Spawn a local subprocess (most common)
SSE "sse" HTTP Server-Sent Events stream
HTTP "http" Streamable HTTP POST (newer servers)

Configuration

Place a .mcp.json file in your project directory or edit ~/.nano_claude/mcp.json for user-wide servers.

{
  "mcpServers": {
    "git": {
      "type": "stdio",
      "command": "uvx",
      "args": ["mcp-server-git"]
    },
    "filesystem": {
      "type": "stdio",
      "command": "uvx",
      "args": ["mcp-server-filesystem", "/tmp"]
    },
    "my-remote": {
      "type": "sse",
      "url": "http://localhost:8080/sse",
      "headers": {"Authorization": "Bearer my-token"}
    }
  }
}

Config priority: .mcp.json (project) overrides ~/.nano_claude/mcp.json (user) by server name.

Quick start

# Install a popular MCP server
pip install uv        # uv includes uvx
uvx mcp-server-git --help   # verify it works

# Add to user config via REPL
/mcp add git uvx mcp-server-git

# Or create .mcp.json in your project dir, then:
/mcp reload

REPL commands

/mcp                          # list servers + their tools + connection status
/mcp reload                   # reconnect all servers, refresh tool list
/mcp reload git               # reconnect a single server
/mcp add myserver uvx mcp-server-x   # add stdio server
/mcp remove myserver          # remove from user config

How Claude uses MCP tools

Once connected, Claude can call MCP tools directly:

You: What files changed in the last git commit?
AI: [calls mcp__git__git_diff_staged()]
    → shows diff output from the git MCP server

Tool names follow the pattern mcp__<server_name>__<tool_name>. All characters that are not alphanumeric or _ are automatically replaced with _.

Popular MCP servers

Server Install Provides
mcp-server-git uvx mcp-server-git git operations (status, diff, log, commit)
mcp-server-filesystem uvx mcp-server-filesystem <path> file read/write/list
mcp-server-fetch uvx mcp-server-fetch HTTP fetch tool
mcp-server-postgres uvx mcp-server-postgres <conn-str> PostgreSQL queries
mcp-server-sqlite uvx mcp-server-sqlite --db-path x.db SQLite queries
mcp-server-brave-search uvx mcp-server-brave-search Brave web search

Browse the full registry at modelcontextprotocol.io/servers


Plugin System

The plugin/ package lets you extend nano-claude-code with additional tools, skills, and MCP servers from git repositories or local directories.

Install a plugin

/plugin install my-plugin@https://github.com/user/my-plugin
/plugin install local-plugin@/path/to/local/plugin

Manage plugins

/plugin                   # list installed plugins
/plugin enable my-plugin  # enable a disabled plugin
/plugin disable my-plugin # disable without uninstalling
/plugin disable-all       # disable all plugins
/plugin update my-plugin  # pull latest from git
/plugin uninstall my-plugin
/plugin info my-plugin    # show manifest details

Plugin recommendation engine

/plugin recommend                    # auto-detect from project files
/plugin recommend "docker database"  # recommend by keyword context

The engine matches your context against a curated marketplace (git-tools, python-linter, docker-tools, sql-tools, test-runner, diagram-tools, aws-tools, web-scraper) using tag and keyword scoring.

Plugin manifest (plugin.json)

{
  "name": "my-plugin",
  "version": "0.1.0",
  "description": "Does something useful",
  "author": "you",
  "tags": ["git", "python"],
  "tools": ["tools"],        // Python module(s) that export TOOL_DEFS
  "skills": ["skills/my.md"],
  "mcp_servers": {},
  "dependencies": ["httpx"]  // pip packages
}

Alternatively use YAML frontmatter in PLUGIN.md.

Scopes

Scope Location Config
user (default) ~/.nano_claude/plugins/ ~/.nano_claude/plugins.json
project .nano_claude/plugins/ .nano_claude/plugins.json

Use --project flag: /plugin install name@url --project


AskUserQuestion Tool

Claude can pause mid-task and interactively ask you a question before proceeding.

Example invocation by Claude:

{
  "tool": "AskUserQuestion",
  "question": "Which database should I use?",
  "options": [
    {"label": "SQLite", "description": "Simple, file-based"},
    {"label": "PostgreSQL", "description": "Full-featured, requires server"}
  ],
  "allow_freetext": true
}

What you see in the terminal:

❓ Question from assistant:
   Which database should I use?

  [1] SQLite — Simple, file-based
  [2] PostgreSQL — Full-featured, requires server
  [0] Type a custom answer

Your choice (number or text):
  • Select by number or type free text directly
  • Claude receives your answer and continues the task
  • 5-minute timeout (returns "(no answer — timeout)" if unanswered)

Task Management

The task/ package gives Claude (and you) a structured task list for tracking multi-step work within a session.

Tools available to Claude

Tool Parameters What it does
TaskCreate subject, description, active_form?, metadata? Create a task; returns #id created: subject
TaskUpdate task_id, subject?, description?, status?, owner?, add_blocks?, add_blocked_by?, metadata? Update any field; status='deleted' removes the task
TaskGet task_id Return full details of one task
TaskList (none) List all tasks with status icons and pending blockers

Valid statuses: pendingin_progresscompleted / cancelled / deleted

Dependency edges

TaskUpdate(task_id="3", add_blocked_by=["1","2"])
# Task 3 is now blocked by tasks 1 and 2.
# Reverse edges are set automatically: tasks 1 and 2 get task 3 in their "blocks" list.

Completed tasks are treated as resolved — TaskList hides their blocking effect on dependents.

Persistence

Tasks are saved to .nano_claude/tasks.json in the current working directory after every mutation and reloaded on first access.

REPL commands

/tasks                    list all tasks
/tasks create <subject>   quick-create a task
/tasks start <id>         mark in_progress
/tasks done <id>          mark completed
/tasks cancel <id>        mark cancelled
/tasks delete <id>        remove a task
/tasks get <id>           show full details
/tasks clear              delete all tasks

Typical Claude workflow

User: implement the login feature

Claude:
  TaskCreate(subject="Design auth schema", description="JWT vs session")  → #1
  TaskCreate(subject="Write login endpoint", description="POST /auth/login") → #2
  TaskCreate(subject="Write tests", description="Unit + integration") → #3
  TaskUpdate(task_id="2", add_blocked_by=["1"])
  TaskUpdate(task_id="3", add_blocked_by=["2"])

  TaskUpdate(task_id="1", status="in_progress", active_form="Designing schema")
  ... (does the work) ...
  TaskUpdate(task_id="1", status="completed")
  TaskList()  → task 2 is now unblocked
  ...

Voice Input

Nano Claude Code v3.05 adds a fully offline voice-to-prompt pipeline. Speak your request — it is transcribed and submitted as if you had typed it.

Quick start

# 1. Install a recording backend (choose one)
pip install sounddevice        # recommended: cross-platform, no extra binary
# sudo apt install alsa-utils  # Linux arecord fallback
# sudo apt install sox         # SoX rec fallback

# 2. Install a local STT backend (recommended — works offline, no API key)
pip install faster-whisper numpy

# 3. Start Nano Claude Code and speak
nano_claude
[myproject] ❯ /voice
  🎙  Listening… (speak now, auto-stops on silence, Ctrl+C to cancel)
  🎙  ████
✓  Transcribed: "fix the authentication bug in user.py"
[auto-submitting…]

STT backends (tried in order)

Backend Install Notes
faster-whisper pip install faster-whisper Recommended — local, offline, fastest, GPU optional
openai-whisper pip install openai-whisper Local, offline, original OpenAI model
OpenAI Whisper API set OPENAI_API_KEY Cloud, requires internet + API key

Override the Whisper model size with NANO_CLAUDE_WHISPER_MODEL (default: base):

export NANO_CLAUDE_WHISPER_MODEL=small   # better accuracy, slower
export NANO_CLAUDE_WHISPER_MODEL=tiny    # fastest, lightest

Recording backends (tried in order)

Backend Install Notes
sounddevice pip install sounddevice Recommended — cross-platform, Python-native
arecord sudo apt install alsa-utils Linux ALSA, no pip needed
sox rec sudo apt install sox / brew install sox Built-in silence detection

Keyterm boosting

Before each recording, Nano extracts coding vocabulary from:

  • Git branch (e.g. feat/voice-input → "feat", "voice", "input")
  • Project root name (e.g. "nano-claude-code")
  • Recent source file stems (e.g. authentication_handler.py → "authentication", "handler")
  • Global coding terms: MCP, grep, TypeScript, OAuth, regex, gRPC, …

These are passed as Whisper's initial_prompt so the STT engine prefers correct spellings of coding terms.

Commands

Command Description
/voice Record voice and auto-submit the transcript as your next prompt
/voice status Show which recording and STT backends are available
/voice lang <code> Set transcription language (en, zh, ja, de, fr, … default: auto)

How it compares to Claude Code

Claude Code Nano Claude Code v3.05
STT service Anthropic private WebSocket (voice_stream) faster-whisper / openai-whisper / OpenAI API
Requires Anthropic OAuth Yes No
Works offline No Yes (with local Whisper)
Keyterm hints Deepgram keyterms param Whisper initial_prompt (git + files + vocab)
Language support Server-allowlisted codes Any language Whisper supports

Proactive Background Monitoring

Nano Claude Code v3.05.2 adds a sentinel daemon that automatically wakes the agent after a configurable period of inactivity — no user prompt required. This enables use cases like continuous log monitoring, market script polling, or scheduled code checks.

Quick start

[myproject] ❯ /proactive 5m
Proactive background polling: ON  (triggering every 300s of inactivity)

[myproject] ❯ keep monitoring the build log and alert me if errors appear

╭─ Claude ● ─────────────────────────
│ Understood. I'll check the build log each time I wake up.

[Background Event Triggered]
╭─ Claude ● ─────────────────────────
│ ⚙ Bash(tail -50 build.log)
│ ✓ → Build failed: ImportError in auth.py line 42
│ **Action needed:** fix the import before the next CI run.

Commands

Command Description
/proactive Show current status (ON/OFF and interval)
/proactive 5m Enable — trigger every 5 minutes of inactivity
/proactive 30s Enable — trigger every 30 seconds
/proactive 1h Enable — trigger every hour
/proactive off Disable sentinel polling

Duration suffix: s = seconds, m = minutes, h = hours. Plain integer = seconds.

How it works

  • A background daemon thread starts when the REPL launches (paused by default).
  • The daemon checks elapsed time since the last user or agent interaction every second.
  • When the inactivity threshold is reached, it calls the agent with a wake-up prompt.
  • The threading.Lock used by the main agent loop ensures wake-ups never interrupt an active session — they queue and fire after the current turn completes.
  • Watcher exceptions are logged via traceback so failures are visible and debuggable.

Complements SleepTimer

SleepTimer /proactive
Who initiates The agent The user
Trigger After a fixed delay from now After N seconds of inactivity
Use case "Check back in 10 minutes" "Keep watching until I stop typing"

Context Compression

Long conversations are automatically compressed to stay within the model's context window.

Two layers:

  1. Snip — Old tool outputs (file reads, bash results) are truncated after a few turns. Fast, no API cost.
  2. Auto-compact — When token usage exceeds 70% of the context limit, older messages are summarized by the model into a concise recap.

This happens transparently. You don't need to do anything.


Diff View

When the model edits or overwrites a file, you see a git-style diff:

  Changes applied to config.py:

--- a/config.py
+++ b/config.py
@@ -12,7 +12,7 @@
     "model": "claude-opus-4-6",
-    "max_tokens": 8192,
+    "max_tokens": 16384,
     "permission_mode": "auto",

Green lines = added, red lines = removed. New file creations show a summary instead.


CLAUDE.md Support

Place a CLAUDE.md file in your project to give the model persistent context about your codebase. Nano Claude automatically finds and injects it into the system prompt.

~/.claude/CLAUDE.md          # Global — applies to all projects
/your/project/CLAUDE.md      # Project-level — found by walking up from cwd

Example CLAUDE.md:

# Project: FastAPI Backend

## Stack
- Python 3.12, FastAPI, PostgreSQL, SQLAlchemy 2.0, Alembic
- Tests: pytest, coverage target 90%

## Conventions
- Format with black, lint with ruff
- Full type annotations required
- New endpoints must have corresponding tests

## Important Notes
- Never hard-code credentials — use environment variables
- Do not modify existing Alembic migration files
- The `staging` branch deploys automatically to staging on push

Session Management

Storage layout

Every exit automatically saves to three places:

~/.nano_claude/sessions/
├── history.json                          ← master: all sessions ever (capped)
├── mr_sessions/
│   └── session_latest.json              ← always the most recent (/resume)
└── daily/
    ├── 2026-04-05/
    │   ├── session_110523_a3f9.json     ← per-day files, newest kept
    │   └── session_143022_b7c1.json
    └── 2026-04-04/
        └── session_183100_3b4c.json

Each session file includes metadata:

{
  "session_id": "a3f9c1b2",
  "saved_at": "2026-04-05 11:05:23",
  "turn_count": 8,
  "messages": [...]
}

Autosave on exit

Every time you exit — via /exit, /quit, Ctrl+C, or Ctrl+D — the session is saved automatically:

✓ Session saved → /home/.../.nano_claude/sessions/mr_sessions/session_latest.json
✓              → /home/.../.nano_claude/sessions/daily/2026-04-05/session_110523_a3f9.json  (id: a3f9c1b2)
✓   history.json: 12 sessions / 87 total turns

Quick resume

To continue where you left off:

nano_claude
[myproject] ❯ /resume
✓  Session loaded from …/mr_sessions/session_latest.json (42 messages)

Resume a specific file:

/resume session_latest.json          # loads from mr_sessions/
/resume /absolute/path/to/file.json  # loads from absolute path

Manual save / load

/save                          # save with auto-name (session_TIMESTAMP_ID.json)
/save debug_auth_bug           # named save to ~/.nano_claude/sessions/

/load                          # interactive list grouped by date
/load debug_auth_bug           # load by filename

/load interactive list:

  ── 2026-04-05 ──
  [ 1] 11:05:23  id:a3f9c1b2  turns:8   session_110523_a3f9.json
  [ 2] 09:22:01  id:7e2d4f91  turns:3   session_092201_7e2d.json

  ── 2026-04-04 ──
  [ 3] 22:18:00  id:3b4c5d6e  turns:15  session_221800_3b4c.json

  ── Complete History ──
  [ H] Load ALL history  (3 sessions / 26 total turns)  /home/.../.nano_claude/sessions/history.json

  Enter number(s) (e.g. 1 or 1,2,3), H for full history, or Enter to cancel >
  • Enter a single number to load one session
  • Enter comma-separated numbers (e.g. 1,3) to merge multiple sessions in order
  • Enter H to load the entire history — shows message count and token estimate before confirming

Configurable limits

Config key Default Description
session_daily_limit 5 Max session files kept per day in daily/
session_history_limit 100 Max sessions kept in history.json
/config session_daily_limit=10
/config session_history_limit=200

history.json — full conversation history

history.json accumulates every session in one place, making it possible to search your complete conversation history or analyze usage patterns:

{
  "total_turns": 150,
  "sessions": [
    {"session_id": "a3f9c1b2", "saved_at": "2026-04-05 11:05:23", "turn_count": 8, "messages": [...]},
    {"session_id": "7e2d4f91", "saved_at": "2026-04-05 09:22:01", "turn_count": 3, "messages": [...]}
  ]
}

Cloud Sync (GitHub Gist)

Nano Claude Code v3.05.3 adds optional cloud backup of conversation sessions via GitHub Gist. Sessions are stored as private Gists (JSON), browsable in the GitHub UI. No extra dependencies — uses Python's stdlib urllib.

Setup (one-time)

  1. Go to github.com/settings/tokensGenerate new token (classic)
  2. Enable the gist scope
  3. Copy the token and run:
[myproject] ❯ /cloudsave setup ghp_xxxxxxxxxxxxxxxxxxxx
✓ GitHub token saved (logged in as: Chauncygu). Cloud sync is ready.

Upload a session

[myproject] ❯ /cloudsave
Uploading session to GitHub Gist…
✓ Session uploaded → https://gist.github.com/abc123def456

Add an optional description:

[myproject] ❯ /cloudsave push auth refactor debug session

Auto-sync on exit

[myproject] ❯ /cloudsave auto on
✓ Auto cloud-sync ON — session will be uploaded to Gist on /exit.

From that point on, every /exit or /quit automatically uploads the session before closing.

Browse and restore

[myproject] ❯ /cloudsave list
  Found 3 session(s):
  abc123de…  2026-04-05 11:02  auth refactor debug session
  7f9e12ab…  2026-04-04 22:18  proactive monitoring test
  3b4c5d6e…  2026-04-04 18:31

[myproject] ❯ /cloudsave load abc123de...full-gist-id...
✓ Session loaded from Gist (42 messages).

Commands reference

Command Description
/cloudsave setup <token> Save GitHub token (needs gist scope)
/cloudsave Upload current session to a new or existing Gist
/cloudsave push [desc] Upload with optional description
/cloudsave auto on|off Toggle auto-upload on exit
/cloudsave list List all nano-claude-code Gists
/cloudsave load <gist_id> Download and restore a session

Project Structure

nano_claude_code/
├── nano_claude.py        # Entry point: REPL + slash commands + diff rendering + Rich Live streaming + proactive sentinel daemon
├── agent.py              # Agent loop: streaming, tool dispatch, compaction
├── providers.py          # Multi-provider: Anthropic, OpenAI-compat streaming
├── tools.py              # Core tools (Read/Write/Edit/Bash/Glob/Grep/Web/NotebookEdit/GetDiagnostics) + registry wiring
├── tool_registry.py      # Tool plugin registry: register, lookup, execute
├── compaction.py         # Context compression: snip + auto-summarize
├── context.py            # System prompt builder: CLAUDE.md + git + memory
├── config.py             # Config load/save/defaults; DAILY_DIR, SESSION_HIST_FILE paths
├── cloudsave.py          # GitHub Gist cloud sync (upload/download/list sessions)
│
├── multi_agent/          # Multi-agent package
│   ├── __init__.py       # Re-exports
│   ├── subagent.py       # AgentDefinition, SubAgentManager, worktree helpers
│   └── tools.py          # Agent, SendMessage, CheckAgentResult, ListAgentTasks, ListAgentTypes
├── subagent.py           # Backward-compat shim → multi_agent/
│
├── memory/               # Memory package
│   ├── __init__.py       # Re-exports
│   ├── types.py          # MEMORY_TYPES and format guidance
│   ├── store.py          # save/load/delete/search, MEMORY.md index rebuilding
│   ├── scan.py           # MemoryHeader, age/freshness helpers
│   ├── context.py        # get_memory_context(), truncation, AI search
│   └── tools.py          # MemorySave, MemoryDelete, MemorySearch, MemoryList
├── memory.py             # Backward-compat shim → memory/
│
├── skill/                # Skill package
│   ├── __init__.py       # Re-exports; imports builtin to register built-ins
│   ├── loader.py         # SkillDef, parse, load_skills, find_skill, substitute_arguments
│   ├── builtin.py        # Built-in skills: /commit, /review
│   ├── executor.py       # execute_skill(): inline or forked sub-agent
│   └── tools.py          # Skill, SkillList
├── skills.py             # Backward-compat shim → skill/
│
├── mcp/                  # MCP (Model Context Protocol) package
│   ├── __init__.py       # Re-exports
│   ├── types.py          # MCPServerConfig, MCPTool, MCPServerState, JSON-RPC helpers
│   ├── client.py         # StdioTransport, HttpTransport, MCPClient, MCPManager
│   ├── config.py         # Load .mcp.json (project) + ~/.nano_claude/mcp.json (user)
│   └── tools.py          # Auto-discover + register MCP tools into tool_registry
│
├── voice/                # Voice input package (v3.05)
│   ├── __init__.py       # Public API: check_voice_deps, voice_input
│   ├── recorder.py       # Audio capture: sounddevice → arecord → sox rec
│   ├── stt.py            # STT: faster-whisper → openai-whisper → OpenAI API
│   └── keyterms.py       # Coding-domain vocab from git branch + project files
│
└── tests/                # 239+ unit tests
    ├── test_mcp.py
    ├── test_memory.py
    ├── test_skills.py
    ├── test_subagent.py
    ├── test_tool_registry.py
    ├── test_compaction.py
    ├── test_diff_view.py
    └── test_voice.py      # 29 voice tests (no hardware required)

For developers: Each feature package (multi_agent/, memory/, skill/, mcp/, voice/) is self-contained. Add custom tools by calling register_tool(ToolDef(...)) from any module imported by tools.py.


FAQ

Q: How do I add an MCP server?

Option 1 — via REPL (stdio server):

/mcp add git uvx mcp-server-git

Option 2 — create .mcp.json in your project:

{
  "mcpServers": {
    "git": {"type": "stdio", "command": "uvx", "args": ["mcp-server-git"]}
  }
}

Then run /mcp reload or restart. Use /mcp to check connection status.

Q: An MCP server is showing an error. How do I debug it?

/mcp                    # shows error message per server
/mcp reload git         # try reconnecting

If the server uses stdio, make sure the command is in your $PATH:

which uvx               # should print a path
uvx mcp-server-git      # run manually to see errors

Q: Can I use MCP servers that require authentication?

For HTTP/SSE servers with a Bearer token:

{
  "mcpServers": {
    "my-api": {
      "type": "sse",
      "url": "https://myserver.example.com/sse",
      "headers": {"Authorization": "Bearer sk-my-token"}
    }
  }
}

For stdio servers with env-based auth:

{
  "mcpServers": {
    "brave": {
      "type": "stdio",
      "command": "uvx",
      "args": ["mcp-server-brave-search"],
      "env": {"BRAVE_API_KEY": "your-key"}
    }
  }
}

Q: Tool calls don't work with my local Ollama model.

Not all models support function calling. Use one of the recommended tool-calling models: qwen2.5-coder, llama3.3, mistral, or phi4.

ollama pull qwen2.5-coder
nano_claude --model ollama/qwen2.5-coder

Q: How do I connect to a remote GPU server running vLLM?

/config custom_base_url=http://your-server-ip:8000/v1
/config custom_api_key=your-token
/model custom/your-model-name

Q: How do I check my API cost?

/cost

  Input tokens:  3,421
  Output tokens:   892
  Est. cost:     $0.0648 USD

Q: Can I use multiple API keys in the same session?

Yes. Set all the keys you need upfront (via env vars or /config). Then switch models freely — each call uses the key for the active provider.

Q: How do I make a model available across all projects?

Add keys to ~/.bashrc or ~/.zshrc. Set the default model in ~/.nano_claude/config.json:

{ "model": "claude-sonnet-4-6" }

Q: Qwen / Zhipu returns garbled text.

Ensure your DASHSCOPE_API_KEY / ZHIPU_API_KEY is correct and the account has sufficient quota. Both providers use UTF-8 and handle Chinese well.

Q: Can I pipe input to nano claude?

echo "Explain this file" | nano_claude --print --accept-all
cat error.log | nano_claude -p "What is causing this error?"

Q: How do I run it as a CLI tool from anywhere?

Use uv tool install — it creates an isolated environment and puts nano_claude on your PATH:

cd nano-claude-code
uv tool install .

After that, just run nano_claude from any directory. To update after pulling changes, run uv tool install . --reinstall.

Q: How do I set up voice input?

# Minimal setup (local, offline, no API key):
pip install sounddevice faster-whisper numpy

# Then in the REPL:
/voice status          # verify backends are detected
/voice                 # speak your prompt

On first use, faster-whisper downloads the base model (~150 MB) automatically. Use a larger model for better accuracy: export NANO_CLAUDE_WHISPER_MODEL=small

Q: Voice input transcribes my words wrong (misses coding terms).

The keyterm booster already injects coding vocabulary from your git branch and project files. For persistent domain terms, put them in a .nano_claude/voice_keyterms.txt file (one term per line) — this is checked automatically on each recording.

Q: Can I use voice input in Chinese / Japanese / other languages?

Yes. Set the language before recording:

/voice lang zh    # Mandarin Chinese
/voice lang ja    # Japanese
/voice lang auto  # reset to auto-detect (default)

Whisper supports 99 languages. auto detection works well but explicit codes improve accuracy for short utterances.

About

Nano Claude Code: A Fast, Easy-to-Use Python Reimplementation of Claude Code Supporting Any Model.

Topics

Resources

License

Stars

Watchers

Forks

Packages

 
 
 

Contributors

Languages