chore: upstream sync v4 (48 commits) + CI workflow by kmbandy · Pull Request #13 · kmbandy/nanobot

kmbandy · 2026-03-23T21:29:41Z

Summary

Upstream sync merging 48 commits from HKUDS/nanobot, plus CI setup.

Upstream changes merged

CommandRouter — clean nanobot/command/ dispatch system replacing ad-hoc if/elif chains
Per-session locks — _session_locks + NANOBOT_MAX_CONCURRENT_REQUESTS semaphore (replaces global lock)
Streaming — _build_chat_kwargs refactor in LiteLLMProvider, chat_stream, nanobot/cli/stream.py
MCP schema normalization — _normalize_schema_for_openai for nullable type unions
WeChat channel (weixin), WhatsApp media, centralized think-tag filtering
Mistral + OVMS providers, keep_recent_messages config
_sanitize_persisted_blocks for multimodal session history
Zombie process reap in shell tool, estimate_message_tokens improvements
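
The move from if/elif chains to a CommandRouter can be pictured as a dispatch table. The following is only a minimal sketch: the class shape, the `register` and `dispatch` method names, and the two handlers are assumptions for illustration, not the code under nanobot/command/.

```python
import asyncio
from typing import Awaitable, Callable, Dict, Optional

Handler = Callable[[str], Awaitable[str]]


class CommandRouter:
    """Hypothetical dispatch table mapping a command name to an async handler."""

    def __init__(self) -> None:
        self._handlers: Dict[str, Handler] = {}

    def register(self, name: str, handler: Handler) -> None:
        self._handlers[name.lower()] = handler

    async def dispatch(self, text: str) -> Optional[str]:
        """Return the handler's reply, or None if text is not a known command."""
        parts = text.strip().split(maxsplit=1)
        if not parts:
            return None
        handler = self._handlers.get(parts[0].lower())
        if handler is None:
            return None
        args = parts[1] if len(parts) > 1 else ""
        return await handler(args)


async def ping(args: str) -> str:
    return "pong"


async def status(args: str) -> str:
    return "status: ok"


def build_router() -> CommandRouter:
    # Illustrative registrations only; the real command set lives in the repo.
    router = CommandRouter()
    router.register("!ping", ping)
    router.register("/status", status)
    return router
```

Returning None for unknown input lets the caller fall through to normal agent processing, which is one plausible way to keep commands and chat messages on separate paths.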

Custom changes preserved

suppress_tools_param + request_timeout — eng-2 Jinja2 crash workaround
_extract_text_tool_calls + think-tag stripping — Qwen3/Nemotron text-mode tool calls
_deep_merge + extends config inheritance
HttpConfig, NvidiaConfig, sysmon, searxng_url in schema
SysmonTool + NvidiaEscalateTool
Fleet commands (!status, !tasks, !ping, etc.) migrated to _register_fleet_commands() via CommandRouter
MCPServerConnection auto-reconnect on McpError code 32600
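
For readers unfamiliar with the `_deep_merge` + extends pattern, here is a rough sketch of how config inheritance of this kind usually works. The function names, the in-memory `configs` lookup, and the sample keys are assumptions; the fork's loader may resolve parents differently (for example from files), and this sketch does no cycle detection.

```python
from typing import Any, Dict


def deep_merge(base: Dict[str, Any], override: Dict[str, Any]) -> Dict[str, Any]:
    """Return a new dict where override wins, merging nested dicts recursively."""
    merged = dict(base)
    for key, value in override.items():
        if isinstance(value, dict) and isinstance(merged.get(key), dict):
            merged[key] = deep_merge(merged[key], value)
        else:
            merged[key] = value
    return merged


def resolve_extends(name: str, configs: Dict[str, Dict[str, Any]]) -> Dict[str, Any]:
    """Resolve a config that may name a parent under the 'extends' key."""
    config = configs[name]
    parent_name = config.get("extends")
    child = {k: v for k, v in config.items() if k != "extends"}
    if parent_name is None:
        return child
    # Resolve the parent first, then layer the child's values on top.
    return deep_merge(resolve_extends(parent_name, configs), child)
```

With a parent `{"agent": {"model": "a", "max_tokens": 100}}` and a child that sets only `{"agent": {"model": "b"}}`, the resolved child keeps `max_tokens` from the parent and takes `model` from itself.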

CI additions

.github/workflows/test.yml — pytest on PR + push, Python 3.11 & 3.12
tests/conftest.py — shared make_mock_provider() / make_agent_loop() fixtures with correct AgentLoop init shape, preventing test drift after future syncs

Test plan

CI workflow runs green on this PR (529 tests, matrix channel excluded — missing nh3 dep)
Smoke test eng-1 and eng-2 after merge: !ping, !status, !tasks
Verify suppress_tools_param: true still works for eng-2 (Nemotron model)
Check streaming works in CLI: nanobot agent

Part 1: Make system prompt static
- Move Current Time from system prompt to user message prefix
- System prompt now only changes when config/skills change, not every minute
- Timestamp injected as [YYYY-MM-DD HH:MM (Day) (TZ)] prefix on each user message

Part 2: Add second cache_control breakpoint
- Existing: system message breakpoint (caches static system prompt)
- New: second-to-last message breakpoint (caches conversation history prefix)
- Refactored _apply_cache_control with shared _mark() helper

Before: 0% cache hit rate (system prompt changed every minute)
After: ~90% savings on cached input tokens for multi-turn conversations

Closes HKUDS#981
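
The two-breakpoint scheme and the timestamp prefix described in this commit can be sketched as below. This is an illustration under assumptions, not the code in litellm_provider.py: the function bodies are guesses at the general shape, messages are plain dicts, the weekday is rendered with `%A`, and `now` is expected to be timezone-aware. The `{"type": "ephemeral"}` marker is the cache_control form Anthropic documents for prompt caching.

```python
import copy
from datetime import datetime, timezone
from typing import Any, Dict, List, Optional

Message = Dict[str, Any]


def _mark(message: Message) -> Message:
    """Return a copy of message whose last content block carries cache_control."""
    marked = copy.deepcopy(message)
    content = marked.get("content")
    if isinstance(content, str):
        # Cache markers attach to content blocks, so wrap plain strings.
        content = [{"type": "text", "text": content}]
    if not content:
        return marked
    content[-1]["cache_control"] = {"type": "ephemeral"}
    marked["content"] = content
    return marked


def apply_cache_control(messages: List[Message]) -> List[Message]:
    """Mark the system message and the second-to-last message as breakpoints."""
    result = list(messages)
    targets = set()
    if result and result[0].get("role") == "system":
        targets.add(0)
    if len(result) >= 2:
        targets.add(len(result) - 2)
    for index in targets:
        result[index] = _mark(result[index])
    return result


def with_time_prefix(text: str, now: Optional[datetime] = None) -> str:
    """Prefix a user message with [YYYY-MM-DD HH:MM (Day) (TZ)]."""
    now = now or datetime.now(timezone.utc)
    stamp = now.strftime("%Y-%m-%d %H:%M (%A)")
    return f"[{stamp} ({now.tzname()})] {text}"
```

Because the timestamp now travels with the newest user message, everything before the second breakpoint stays byte-identical between turns, which is what allows the cached prefix to be reused.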

# Conflicts: # nanobot/channels/telegram.py

Exclude openai_codex alongside github_copilot from generated config, filter OAuth-only providers out of the onboarding wizard, and clarify in README that OAuth login stores session state outside config. Also unify the GitHub Copilot login command spelling and add regression tests. Made-with: Cursor

Keep multimodal tool outputs on the native content-block path while restoring redirect SSRF checks for web_fetch image responses. Also share image block construction, simplify persisted history sanitization, and add regression tests for image reads and blocked private redirects. Made-with: Cursor

…rception Add native image content blocks for read_file and web_fetch, preserve the multimodal tool-result path through the agent loop, and keep session history compact with image placeholders. Also harden web_fetch against redirect-based SSRF bypasses and add regression coverage for image reads and blocked private redirects.
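
A redirect-based SSRF bypass works by serving a harmless public URL that redirects to an internal address, so every hop has to be checked, not just the first. The sketch below shows that idea only. The helper names are hypothetical, it handles IP literals and `localhost` but does not resolve DNS names, and it is not the web_fetch implementation.

```python
import ipaddress
from typing import Iterable, Optional
from urllib.parse import urlparse


def is_blocked_host(host: str) -> bool:
    """True for literal IPs in private, loopback, link-local or reserved ranges."""
    try:
        address = ipaddress.ip_address(host)
    except ValueError:
        # Not an IP literal; a real check would resolve DNS and test every address.
        return host.lower() == "localhost"
    return (
        address.is_private
        or address.is_loopback
        or address.is_link_local
        or address.is_reserved
        or address.is_unspecified
    )


def first_blocked_hop(urls: Iterable[str]) -> Optional[str]:
    """Return the first URL in a redirect chain that must not be fetched, else None."""
    for url in urls:
        parsed = urlparse(url)
        if parsed.scheme not in ("http", "https") or not parsed.hostname:
            return url
        if is_blocked_host(parsed.hostname):
            return url
    return None
```

In practice the chain would be built by following redirects one hop at a time and validating each `Location` target before requesting it.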

Only normalize nullable MCP tool schemas for OpenAI-compatible providers so optional params still work without collapsing unrelated unions. Also teach local validation to honor nullable flags and add regression coverage for nullable and non-nullable schemas. Made-with: Cursor
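
To make "normalize nullable schemas without collapsing unrelated unions" concrete, here is a minimal sketch. The function name `normalize_nullable` and the choice to emit a `nullable` flag are assumptions; the real `_normalize_schema_for_openai` may cover more JSON Schema keywords (anyOf, oneOf and so on).

```python
from typing import Any, Dict


def normalize_nullable(schema: Dict[str, Any]) -> Dict[str, Any]:
    """Collapse ["T", "null"] type unions to T plus nullable, recursively."""
    result = dict(schema)
    type_value = result.get("type")
    if isinstance(type_value, list):
        non_null = [t for t in type_value if t != "null"]
        # Only collapse the simple "one type or null" case; leave other unions alone.
        if "null" in type_value and len(non_null) == 1:
            result["type"] = non_null[0]
            result["nullable"] = True
    if isinstance(result.get("properties"), dict):
        result["properties"] = {
            name: normalize_nullable(sub) if isinstance(sub, dict) else sub
            for name, sub in result["properties"].items()
        }
    if isinstance(result.get("items"), dict):
        result["items"] = normalize_nullable(result["items"])
    return result
```

A property declared as `{"type": ["integer", "null"]}` becomes `{"type": "integer", "nullable": True}`, while `{"type": ["string", "integer"]}` is returned unchanged.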

Handle /status at the run-loop level so it can return immediately while the agent is busy, and reset last-usage stats when providers omit usage data. Also keep Telegram help/menu coverage for /status without changing the existing final-response send path. Made-with: Cursor

Keep status output responsive while estimating current context from session history, dropping low-value queue/subagent counters, and marking command-style replies for plain-text rendering in CLI. Also route direct CLI calls through outbound metadata so help/status formatting stays explicit instead of relying on content heuristics. Made-with: Cursor

Only use process_direct_outbound when the agent loop actually exposes it as an async method, and otherwise fall back to the legacy process_direct path. This keeps the new CLI render-metadata flow without breaking existing test doubles or older direct-call implementations. Made-with: Cursor

feat: add /status command to show runtime info

Merge process_direct() and process_direct_outbound() into a single interface returning OutboundMessage | None. This eliminates the dual-path detection logic in CLI single-message mode that relied on inspect.iscoroutinefunction to distinguish between the two APIs. Extract status rendering into a pure function build_status_content() in utils/helpers.py, decoupling it from AgentLoop internals. Made-with: Cursor

estimate_prompt_tokens() only counted the `content` text field, completely missing tool_calls JSON (~72% of actual payload), reasoning_content, tool_call_id, name, and per-message framing overhead. This caused the memory consolidator to never trigger for tool-heavy sessions (e.g. cron jobs), leading to context window overflow errors from the LLM provider. Also adds reasoning_content counting and proper per-message overhead to estimate_message_tokens() for consistent boundary detection. Made-with: Cursor
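
The undercount described above is easy to reproduce with a toy estimator. The sketch below counts the fields the commit names (content, tool_calls JSON, reasoning_content, tool_call_id, name) plus a per-message overhead. The characters-per-token ratio and the overhead constant are assumed placeholder values, and the real functions may use an actual tokenizer.

```python
import json
from typing import Any, Dict, List

CHARS_PER_TOKEN = 4        # assumed rough ratio, not a real tokenizer
PER_MESSAGE_OVERHEAD = 4   # assumed framing cost per message, in tokens


def estimate_message_tokens(message: Dict[str, Any]) -> int:
    """Roughly estimate tokens for one chat message, including tool-call payloads."""
    parts: List[str] = []
    content = message.get("content")
    if isinstance(content, str):
        parts.append(content)
    elif isinstance(content, list):
        parts.extend(
            str(block.get("text", "")) for block in content if isinstance(block, dict)
        )
    for key in ("reasoning_content", "tool_call_id", "name"):
        value = message.get(key)
        if isinstance(value, str):
            parts.append(value)
    if message.get("tool_calls"):
        # Tool calls are serialised as JSON on the wire, so count that text too.
        parts.append(json.dumps(message["tool_calls"], ensure_ascii=False))
    chars = sum(len(part) for part in parts)
    return PER_MESSAGE_OVERHEAD + (chars + CHARS_PER_TOKEN - 1) // CHARS_PER_TOKEN


def estimate_prompt_tokens(messages: List[Dict[str, Any]]) -> int:
    return sum(estimate_message_tokens(m) for m in messages)
```

An assistant message with `content` set to None and a single tool call would score only the overhead under a content-only estimator, which is how tool-heavy sessions could slip past the consolidation threshold.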

Resolve conflict in context.py: keep main's build_messages which already merges runtime context into user message (achieving the same cache goal). The real value-add from this PR is the second cache breakpoint in litellm_provider.py. Made-with: Cursor

…ic models perf: optimize prompt cache hit rate for Anthropic models

Preserve the provider and agent-loop streaming primitives plus the CLI experiment scaffolding so this work can be resumed later without blocking urgent bug fixes on main. Made-with: Cursor

Provider layer: add chat_stream / chat_stream_with_retry to all providers (base fallback, litellm, custom, azure, codex). Refactor shared kwargs building in each provider.

Channel layer: BaseChannel gains send_delta (no-op) and supports_streaming (checks config + method override). ChannelManager routes _stream_delta / _stream_end to send_delta, skips _streamed final messages.

AgentLoop._dispatch builds bus-backed on_stream/on_stream_end callbacks when _wants_stream metadata is set. Non-streaming path unchanged.

CLI: clean up spinner ANSI workarounds, simplify commands.py flow.

Made-with: Cursor

Move ThinkingSpinner and StreamRenderer into a dedicated module to keep commands.py focused on orchestration. Uses Rich Live with manual refresh (auto_refresh=False) and ellipsis overflow for stable streaming output. Made-with: Cursor

…aming
- Add strip_think() to helpers.py as single source of truth
- Filter deltas in agent loop before dispatching to consumers
- Implement send_delta in TelegramChannel with progressive edit_message_text
- Remove duplicate think filtering from CLI stream.py and telegram.py
- Remove legacy fake streaming (send_message_draft) from Telegram
- Default Telegram streaming to true
- Update CHANNEL_PLUGIN_GUIDE.md with streaming documentation

Made-with: Cursor
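
A centralized think-tag filter of the kind this commit describes might look like the sketch below. The regular expressions and the handling of an unclosed tag are assumptions about what such a helper needs during streaming; the actual strip_think() in helpers.py may differ.

```python
import re

_THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)
_OPEN_THINK = re.compile(r"<think>.*\Z", re.DOTALL)


def strip_think(text: str) -> str:
    """Remove closed <think> blocks and any trailing unclosed one."""
    without_closed = _THINK_BLOCK.sub("", text)
    # While streaming, the closing tag may not have arrived yet, so also drop
    # everything from a dangling <think> to the end of the buffer.
    without_open = _OPEN_THINK.sub("", without_closed)
    return without_open.strip()
```

When filtering streamed output, a helper like this is best applied to the accumulated buffer rather than to each delta, since a tag can be split across two deltas.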

Register Mistral as a first-class provider with LiteLLM routing, MISTRAL_API_KEY env var, and https://api.mistral.ai/v1 default base. Includes schema field, registry entry, and tests.

add OpenVINO Model Server provider

Trigger token consolidation before prompt usage reaches the full context window so response tokens and tokenizer estimation drift still fit safely within the model budget. Made-with: Cursor
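
The headroom rule can be expressed as a small budget calculation. This is a sketch under assumptions: the function name, the parameters, and the 5% drift reserve are illustrative, not values taken from the commit.

```python
def should_consolidate(
    prompt_tokens: int,
    context_window: int,
    max_completion_tokens: int,
    safety_margin: float = 0.05,
) -> bool:
    """True once the prompt would leave too little room for the reply."""
    # Reserve a slice of the window for tokenizer estimation drift.
    drift_reserve = int(context_window * safety_margin)
    budget = context_window - max_completion_tokens - drift_reserve
    return prompt_tokens >= max(budget, 0)
```

With a 128,000-token window, 8,192 completion tokens, and the assumed 5% reserve, consolidation would start at 113,408 prompt tokens instead of waiting for the full window.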

…t dispatch

Replace the single _processing_lock (asyncio.Lock) with per-session locks so that different sessions can process LLM requests concurrently, while messages within the same session remain serialised. An optional global concurrency cap is available via the NANOBOT_MAX_CONCURRENT_REQUESTS env var (default 3, <=0 for unlimited).

Also re-binds tool context before each tool execution round to prevent concurrent sessions from clobbering each other's routing info.

Tested in production and manually reviewed.

(cherry picked from commit c397bb4)
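
The locking scheme in this commit combines one lock per session with an optional global semaphore. The sketch below illustrates that combination. `SessionGate`, `acquire`, and `demo` are hypothetical names; only the NANOBOT_MAX_CONCURRENT_REQUESTS variable, its default of 3, and the "<=0 means unlimited" rule come from the commit message.

```python
import asyncio
import os
from collections import defaultdict
from contextlib import asynccontextmanager
from typing import AsyncIterator, DefaultDict, Optional


class SessionGate:
    """Serialise work per session while capping total concurrent requests."""

    def __init__(self, max_concurrent: Optional[int] = None) -> None:
        if max_concurrent is None:
            max_concurrent = int(os.environ.get("NANOBOT_MAX_CONCURRENT_REQUESTS", "3"))
        self._session_locks: DefaultDict[str, asyncio.Lock] = defaultdict(asyncio.Lock)
        # <= 0 means unlimited, so no semaphore is created in that case.
        self._semaphore = asyncio.Semaphore(max_concurrent) if max_concurrent > 0 else None

    @asynccontextmanager
    async def acquire(self, session_key: str) -> AsyncIterator[None]:
        # Same-session messages queue on the lock; the semaphore caps the rest.
        async with self._session_locks[session_key]:
            if self._semaphore is None:
                yield
            else:
                async with self._semaphore:
                    yield


async def demo() -> dict:
    gate = SessionGate(max_concurrent=2)
    active = {"now": 0, "peak": 0}
    order = []

    async def handle(session: str, label: str) -> None:
        async with gate.acquire(session):
            active["now"] += 1
            active["peak"] = max(active["peak"], active["now"])
            await asyncio.sleep(0.01)  # stand-in for an LLM request
            order.append(label)
            active["now"] -= 1

    await asyncio.gather(
        handle("a", "a1"), handle("a", "a2"), handle("b", "b1"), handle("c", "c1")
    )
    return {"peak": active["peak"], "order": order}
```

Taking the session lock before the semaphore means a queued message from a busy session does not occupy a global slot while it waits. A long-running process would also want to evict idle locks, which this sketch omits.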

(cherry picked from commit 5c871d7)

Add a new WeChat (微信) channel that connects to personal WeChat using the ilinkai.weixin.qq.com HTTP long-poll API. Protocol reverse-engineered from @tencent-weixin/openclaw-weixin v1.0.2.

Features:
- QR code login flow (nanobot weixin login)
- HTTP long-poll message receiving (getupdates)
- Text message sending with proper WeixinMessage format
- Media download with AES-128-ECB decryption (image/voice/file/video)
- Voice-to-text from WeChat + Groq Whisper fallback
- Quoted message (ref_msg) support
- Session expiry detection and auto-pause
- Server-suggested poll timeout adaptation
- Context token caching for replies
- Auto-discovery via channel registry

No WebSocket, no Node.js bridge, no local WeChat client needed: pure HTTP with a bot token obtained via QR code scan.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

Cherry-picked from PR HKUDS#2355 (ad128a7) — only agent/context.py and agent/tools/message.py. Co-Authored-By: qulllee <qullkui@tencent.com>

During testing, we discovered that when a user requests the agent to send a file (e.g., "send me IMG_1115.png"), the agent would call read_file to view the content and then reply with text claiming "file sent", but never actually deliver the file to the user.

Root cause: The system prompt stated "Reply directly with text for conversations. Only use the 'message' tool to send to a specific chat channel", which led the LLM to believe text replies were sufficient for all responses, including file delivery.

Fix: Add an explicit IMPORTANT instruction in the system prompt telling the LLM it MUST use the 'message' tool with the 'media' parameter to send files, and that read_file only reads content for its own analysis.

Co-Authored-By: qulllee <qullkui@tencent.com>

Previously the WeChat channel's send() method only handled text messages, completely ignoring msg.media. When the agent called message(media=[...]), the file was never delivered to the user.

Implement the full WeChat CDN upload protocol following the reference @tencent-weixin/openclaw-weixin v1.0.2:
1. Generate a client-side AES-128 key (16 random bytes)
2. Call getuploadurl with file metadata + hex-encoded AES key
3. AES-128-ECB encrypt the file and POST to CDN with filekey param
4. Read x-encrypted-param from CDN response header as download param
5. Send message with the media item (image/video/file) referencing the CDN upload

Also adds:
- _encrypt_aes_ecb() for AES-128-ECB encryption (reverse of existing _decrypt_aes_ecb)
- Media type detection from file extension (image/video/file)
- Graceful error handling: failed media sends notify the user via text without blocking subsequent text delivery

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>

…ands

Move channel-specific login logic from CLI into each channel class via a new `login(force=False)` method on BaseChannel. The `channels login <name>` command now dynamically loads the channel and calls its login() method.
- WeixinChannel.login(): calls existing _qr_login(), with force to clear saved token
- WhatsAppChannel.login(): sets up bridge and spawns npm process for QR login
- CLI no longer contains duplicate login logic per channel
- Update CHANNEL_PLUGIN_GUIDE to document the login() hook

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Key upstream changes merged:
- CommandRouter: clean command dispatch system (nanobot/command/)
- Per-session locks + concurrency gate (replaces global lock)
- Streaming support in LiteLLMProvider (_build_chat_kwargs refactor)
- MCP schema normalization (_normalize_schema_for_openai)
- WeChat channel (weixin), WhatsApp media, centralized think-tag filtering
- Mistral + OVMS providers, keep_recent_messages config
- _sanitize_persisted_blocks for multimodal session history
- Zombie process reap in shell tool, estimate_message_tokens improvements

Custom changes preserved:
- suppress_tools_param + request_timeout (eng-2 Jinja2 workaround)
- _extract_text_tool_calls + think-tag stripping (Qwen3/Nemotron text tools)
- _deep_merge + extends config inheritance (loader.py)
- HttpConfig, NvidiaConfig, sysmon, searxng_url in schema
- SysmonTool + NvidiaEscalateTool in AgentLoop
- Fleet commands migrated to _register_fleet_commands() via CommandRouter
- MCPServerConnection auto-reconnect on McpError code 32600

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

- .github/workflows/test.yml: runs pytest on PRs and main pushes, Python 3.11 + 3.12, ignores matrix channel (missing nh3 dep)
- tests/conftest.py: make_mock_provider() + make_agent_loop() helpers with correct AgentLoop init shape (chat_with_retry AsyncMock, provider.generation attrs); prevents test drift after upstream syncs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

ExecTool.execute() was ignoring the timeout kwarg (falling through to **kwargs) and only appending exit code for non-zero returns. Both caused test_tool_validation failures introduced in the upstream sync. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
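
The two behaviours this fix restores (honouring an explicit timeout and always reporting the exit code), together with the zombie reaping mentioned in the upstream shell commits, can be sketched as follows. `run_command`, its output format, and `demo` are assumptions for illustration, not ExecTool's real signature or messages.

```python
import asyncio
import sys
from typing import Sequence


async def run_command(argv: Sequence[str], timeout: float = 60.0) -> str:
    """Run argv, honour the timeout, and always append the exit code."""
    process = await asyncio.create_subprocess_exec(
        *argv,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.STDOUT,
    )
    try:
        stdout, _ = await asyncio.wait_for(process.communicate(), timeout=timeout)
    except asyncio.TimeoutError:
        process.kill()
        return f"Error: command timed out after {timeout} seconds"
    finally:
        # Wait on the child in every path so a killed process is reaped
        # instead of lingering as a zombie.
        if process.returncode is None:
            await process.wait()
    output = stdout.decode("utf-8", errors="replace")
    return f"{output}\nExit code: {process.returncode}"


async def demo() -> str:
    # Uses the current interpreter as a portable stand-in for a shell command.
    return await run_command([sys.executable, "-c", "print('hi')"], timeout=10)
```

Accepting `timeout` as a named parameter rather than letting it fall into `**kwargs` is the key point: a caller-supplied value then actually reaches `asyncio.wait_for`.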

coldxiangyu163 and others added 30 commits February 28, 2026 22:41

feat: add /status command to show runtime info

a628741

Merge remote-tracking branch 'upstream/main' into feat/status-command

43475ed

# Conflicts: # nanobot/channels/telegram.py

feat: implement native multimodal autonomous sensory capabilities

71a88da

docs: add github copilot oauth channel setup instructions

055e2f3

chore: remove redundant github_copilot field from config.json

e029d52

Merge branch 'main' into pr-2304

834f1e3

fix: normalize MCP tool schema for OpenAI-compatible providers

b6cf702

Merge branch 'main' into pr-1985

570ca47

Merge PR HKUDS#1985: feat: add /status command to show runtime info

064ca25

feat: add /status command to show runtime info

Merge PR HKUDS#1109: perf: optimize prompt cache hit rate for Anthrop…

5fd66ca

…ic models perf: optimize prompt cache hit rate for Anthropic models

feat(agent): add streaming groundwork for future TUI

e79b9f4

Preserve the provider and agent-loop streaming primitives plus the CLI experiment scaffolding so this work can be resumed later without blocking urgent bug fixes on main. Made-with: Cursor

feat(providers): add Mistral AI provider

7878340

Register Mistral as a first-class provider with LiteLLM routing, MISTRAL_API_KEY env var, and https://api.mistral.ai/v1 default base. Includes schema field, registry entry, and tests.

feat(provider): add OpenVINO Model Server provider (HKUDS#2193)

f64ae3b

add OpenVINO Model Server provider

docs(provider): add mistral intro

a46803c

fix(cli): stop spinner after non-streaming interactive replies

8f5c2d1

fix(memory): reserve completion headroom for consolidation

aba0b83

Trigger token consolidation before prompt usage reaches the full context window so response tokens and tokenizer estimation drift still fit safely within the model budget. Made-with: Cursor

Re-bin and others added 21 commits March 23, 2026 16:48

refactor command routing for future plugins and clearer CLI structure

20494a2

fix(shell): reap zombie processes when command timeout kills subprocess

e423cee

refactor(shell): use finally block to reap zombie processes on timeout

dbcc7cb

refactor(shell): use finally block to reap zombie processes on timeoutx

e2e1c9c

refactor(shell): fix syntax error

84a7f8a

fix: clear heartbeat session to prevent token overflow

ba0a3d1

(cherry picked from commit 5c871d7)

refine heartbeat session retention boundaries

2056061

feat: add media message support in agent context and message tool

bc9f861

Cherry-picked from PR HKUDS#2355 (ad128a7) — only agent/context.py and agent/tools/message.py. Co-Authored-By: qulllee <qullkui@tencent.com>

fix(cli): use discovered class for channel login

0ca639b

docs(weixin): add setup guide and focused channel tests

d164548

docs: require explicit channel login command

bef88a5

feat(whatsapp): add outbound media support via bridge

25288f9

docs: update channel table and add plugin dev note

1d58c9b

kmbandy merged commit 5e99315 into main Mar 23, 2026
0 of 3 checks passed

kmbandy deleted the feature-forksync-v4 branch March 23, 2026 22:09

kmbandy merged 51 commits into main from feature-forksync-v4


15 participants
