chore: upstream sync v4 (48 commits) + CI workflow #13
Merged
Part 1: Make system prompt static
- Move Current Time from system prompt to user message prefix
- System prompt now only changes when config/skills change, not every minute
- Timestamp injected as [YYYY-MM-DD HH:MM (Day) (TZ)] prefix on each user message

Part 2: Add second cache_control breakpoint
- Existing: system message breakpoint (caches static system prompt)
- New: second-to-last message breakpoint (caches conversation history prefix)
- Refactored _apply_cache_control with shared _mark() helper

Before: 0% cache hit rate (system prompt changed every minute)
After: ~90% savings on cached input tokens for multi-turn conversations

Closes HKUDS#981
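A minimal sketch of the two ideas in this commit, not the code from litellm_provider.py. The function names `apply_cache_control` and `prefix_timestamp`, the block-style message layout, and the full weekday name in the timestamp are assumptions made for illustration; only the `_mark()` helper name and the `cache_control` marker come from the commit message.

```python
from copy import deepcopy
from datetime import datetime


def _mark(message: dict) -> dict:
    """Return a copy of `message` whose last content block carries cache_control."""
    msg = deepcopy(message)
    content = msg.get("content")
    if isinstance(content, str):
        # Plain-string content is converted to block form so it can hold the marker.
        content = [{"type": "text", "text": content}]
    if isinstance(content, list) and content:
        content[-1] = {**content[-1], "cache_control": {"type": "ephemeral"}}
        msg["content"] = content
    return msg


def apply_cache_control(messages: list) -> list:
    """Mark the system message and the second-to-last message as breakpoints."""
    out = list(messages)
    if out and out[0].get("role") == "system":
        out[0] = _mark(out[0])  # breakpoint 1: static system prompt
    if len(out) >= 3:
        out[-2] = _mark(out[-2])  # breakpoint 2: conversation history prefix
    return out


def prefix_timestamp(text: str, now: datetime, tz_name: str) -> str:
    """Prepend a [YYYY-MM-DD HH:MM (Day) (TZ)] stamp to a user message."""
    return f"[{now:%Y-%m-%d %H:%M} ({now:%A}) ({tz_name})] {text}"
```

Because the timestamp travels with the newest user message, everything before the second breakpoint stays byte-identical between turns, which is what lets the provider reuse the cached prefix.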
# Conflicts:
#   nanobot/channels/telegram.py
Exclude openai_codex alongside github_copilot from generated config, filter OAuth-only providers out of the onboarding wizard, and clarify in README that OAuth login stores session state outside config. Also unify the GitHub Copilot login command spelling and add regression tests. Made-with: Cursor
Keep multimodal tool outputs on the native content-block path while restoring redirect SSRF checks for web_fetch image responses. Also share image block construction, simplify persisted history sanitization, and add regression tests for image reads and blocked private redirects. Made-with: Cursor
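The redirect check can be illustrated with a small sketch: every hop in a redirect chain is validated, not only the first URL. The helper names (`is_blocked_address`, `validate_hop`, `validate_chain`) and the injected `resolve` callable are hypothetical and are not taken from web_fetch.

```python
import ipaddress
import socket
from urllib.parse import urlparse


def is_blocked_address(addr: str) -> bool:
    """True for addresses that a fetch tool should not reach."""
    ip = ipaddress.ip_address(addr)
    return (
        ip.is_private or ip.is_loopback or ip.is_link_local
        or ip.is_reserved or ip.is_multicast or ip.is_unspecified
    )


def resolve_host(hostname: str) -> list:
    """Default resolver: every address the hostname maps to."""
    return sorted({info[4][0] for info in socket.getaddrinfo(hostname, None)})


def validate_hop(url: str, resolve=resolve_host) -> None:
    """Raise ValueError if one URL in a redirect chain is unsafe."""
    parsed = urlparse(url)
    if parsed.scheme not in ("http", "https") or not parsed.hostname:
        raise ValueError(f"unsupported URL: {url}")
    for addr in resolve(parsed.hostname):
        if is_blocked_address(addr):
            raise ValueError(f"blocked address {addr} for {parsed.hostname}")


def validate_chain(urls: list, resolve=resolve_host) -> None:
    """Check the original URL and every redirect target in order."""
    for url in urls:
        validate_hop(url, resolve)
```

Passing the resolver as a parameter keeps the check testable without network access.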
…rception Add native image content blocks for read_file and web_fetch, preserve the multimodal tool-result path through the agent loop, and keep session history compact with image placeholders. Also harden web_fetch against redirect-based SSRF bypasses and add regression coverage for image reads and blocked private redirects.
Only normalize nullable MCP tool schemas for OpenAI-compatible providers so optional params still work without collapsing unrelated unions. Also teach local validation to honor nullable flags and add regression coverage for nullable and non-nullable schemas. Made-with: Cursor
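As a rough illustration of the normalization described here, the sketch below rewrites a two-member `[T, "null"]` type union into `type: T` plus a `nullable` flag and leaves any other union untouched. The function name `normalize_nullable` and the exact output shape are assumptions; the real helper may differ.

```python
def normalize_nullable(schema: dict) -> dict:
    """Rewrite `type: [T, "null"]` as `type: T, nullable: true`, recursively."""
    out = dict(schema)
    declared = out.get("type")
    if isinstance(declared, list) and "null" in declared:
        non_null = [t for t in declared if t != "null"]
        if len(non_null) == 1:
            # Only the simple optional case is collapsed; wider unions stay as-is.
            out["type"] = non_null[0]
            out["nullable"] = True
    props = out.get("properties")
    if isinstance(props, dict):
        out["properties"] = {
            key: normalize_nullable(value) if isinstance(value, dict) else value
            for key, value in props.items()
        }
    if isinstance(out.get("items"), dict):
        out["items"] = normalize_nullable(out["items"])
    return out
```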
Handle /status at the run-loop level so it can return immediately while the agent is busy, and reset last-usage stats when providers omit usage data. Also keep Telegram help/menu coverage for /status without changing the existing final-response send path. Made-with: Cursor
Keep status output responsive while estimating current context from session history, dropping low-value queue/subagent counters, and marking command-style replies for plain-text rendering in CLI. Also route direct CLI calls through outbound metadata so help/status formatting stays explicit instead of relying on content heuristics. Made-with: Cursor
Only use process_direct_outbound when the agent loop actually exposes it as an async method, and otherwise fall back to the legacy process_direct path. This keeps the new CLI render-metadata flow without breaking existing test doubles or older direct-call implementations. Made-with: Cursor
feat: add /status command to show runtime info
Merge process_direct() and process_direct_outbound() into a single interface returning OutboundMessage | None. This eliminates the dual-path detection logic in CLI single-message mode that relied on inspect.iscoroutinefunction to distinguish between the two APIs. Extract status rendering into a pure function build_status_content() in utils/helpers.py, decoupling it from AgentLoop internals. Made-with: Cursor
estimate_prompt_tokens() only counted the `content` text field, completely missing tool_calls JSON (~72% of actual payload), reasoning_content, tool_call_id, name, and per-message framing overhead. This caused the memory consolidator to never trigger for tool-heavy sessions (e.g. cron jobs), leading to context window overflow errors from the LLM provider. Also adds reasoning_content counting and proper per-message overhead to estimate_message_tokens() for consistent boundary detection. Made-with: Cursor
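The gap described above can be sketched with a simple character-based estimator. This is an illustration only: the 4-characters-per-token ratio and the per-message overhead of 4 are assumed values, and the actual functions may rely on a real tokenizer.

```python
import json
import math


def estimate_message_tokens(message: dict, chars_per_token: float = 4.0,
                            overhead: int = 4) -> int:
    """Estimate tokens for one chat message, counting every serialized field."""
    parts = []
    content = message.get("content")
    if isinstance(content, str):
        parts.append(content)
    elif content is not None:
        # Block-style (multimodal) content is counted via its JSON form.
        parts.append(json.dumps(content, ensure_ascii=False))
    for key in ("reasoning_content", "tool_call_id", "name"):
        value = message.get(key)
        if isinstance(value, str):
            parts.append(value)
    if message.get("tool_calls"):
        # Tool-call JSON is the part the old estimate skipped entirely.
        parts.append(json.dumps(message["tool_calls"], ensure_ascii=False))
    chars = sum(len(part) for part in parts)
    return overhead + math.ceil(chars / chars_per_token)


def estimate_prompt_tokens(messages: list) -> int:
    """Sum the per-message estimates for a whole prompt."""
    return sum(estimate_message_tokens(m) for m in messages)
```

With this shape, an assistant message whose `content` is empty but which carries a large `tool_calls` payload still contributes to the total, so a consolidation threshold can be reached in tool-heavy sessions.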
Resolve conflict in context.py: keep main's build_messages which already merges runtime context into user message (achieving the same cache goal). The real value-add from this PR is the second cache breakpoint in litellm_provider.py. Made-with: Cursor
…ic models perf: optimize prompt cache hit rate for Anthropic models
Preserve the provider and agent-loop streaming primitives plus the CLI experiment scaffolding so this work can be resumed later without blocking urgent bug fixes on main. Made-with: Cursor
Provider layer: add chat_stream / chat_stream_with_retry to all providers (base fallback, litellm, custom, azure, codex). Refactor shared kwargs building in each provider.

Channel layer: BaseChannel gains send_delta (no-op) and supports_streaming (checks config + method override). ChannelManager routes _stream_delta / _stream_end to send_delta, skips _streamed final messages. AgentLoop._dispatch builds bus-backed on_stream/on_stream_end callbacks when _wants_stream metadata is set. Non-streaming path unchanged.

CLI: clean up spinner ANSI workarounds, simplify commands.py flow.

Made-with: Cursor
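A sketch of how a no-op `send_delta` and a `supports_streaming` check based on config plus method override might fit together. The `send_delta` signature, the `streaming` config key, and the dict-based config are assumptions for illustration and may not match BaseChannel.

```python
class BaseChannel:
    """Hypothetical base class showing the opt-in streaming check."""

    def __init__(self, config: dict):
        self.config = config

    async def send_delta(self, chat_id: str, delta: str) -> None:
        """Default behaviour: streamed chunks are ignored."""
        return None

    @property
    def supports_streaming(self) -> bool:
        # Streaming needs both the config flag and a subclass override.
        overridden = type(self).send_delta is not BaseChannel.send_delta
        return bool(self.config.get("streaming", False)) and overridden


class EchoChannel(BaseChannel):
    """Example subclass that collects deltas instead of sending them."""

    def __init__(self, config: dict):
        super().__init__(config)
        self.chunks = []

    async def send_delta(self, chat_id: str, delta: str) -> None:
        self.chunks.append((chat_id, delta))
```

Checking for an override means a channel that never implemented `send_delta` keeps the non-streaming path even if its config enables streaming.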
Move ThinkingSpinner and StreamRenderer into a dedicated module to keep commands.py focused on orchestration. Uses Rich Live with manual refresh (auto_refresh=False) and ellipsis overflow for stable streaming output. Made-with: Cursor
…aming
- Add strip_think() to helpers.py as single source of truth
- Filter deltas in agent loop before dispatching to consumers
- Implement send_delta in TelegramChannel with progressive edit_message_text
- Remove duplicate think filtering from CLI stream.py and telegram.py
- Remove legacy fake streaming (send_message_draft) from Telegram
- Default Telegram streaming to true
- Update CHANNEL_PLUGIN_GUIDE.md with streaming documentation

Made-with: Cursor
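One plausible shape for a shared think-tag filter is sketched below. The `<think>...</think>` tag format and the handling of an unclosed block (common while a response is still streaming) are assumptions; the real strip_think() in helpers.py may behave differently.

```python
import re

# A complete reasoning block, possibly spanning several lines.
_THINK_BLOCK = re.compile(r"<think>.*?</think>", re.DOTALL)
# An opening tag with no closing tag yet, as seen mid-stream.
_THINK_UNCLOSED = re.compile(r"<think>.*\Z", re.DOTALL)


def strip_think(text: str) -> str:
    """Remove reasoning blocks so only user-facing text remains."""
    text = _THINK_BLOCK.sub("", text)
    text = _THINK_UNCLOSED.sub("", text)
    return text.strip()
```

Applying the filter to the accumulated text rather than to each delta in isolation avoids leaking a tag that is split across two chunks.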
Register Mistral as a first-class provider with LiteLLM routing, MISTRAL_API_KEY env var, and https://api.mistral.ai/v1 default base. Includes schema field, registry entry, and tests.
add OpenVINO Model Server provider
Trigger token consolidation before prompt usage reaches the full context window so response tokens and tokenizer estimation drift still fit safely within the model budget. Made-with: Cursor
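The arithmetic behind an early trigger can be sketched as follows. The parameter names and the 10% safety ratio are assumptions chosen for illustration, not values taken from the commit.

```python
def should_consolidate(prompt_tokens: int, context_window: int,
                       max_response_tokens: int, safety_ratio: float = 0.1) -> bool:
    """Trigger before the window is full, leaving room for the reply and drift."""
    margin = int(context_window * safety_ratio)  # covers tokenizer estimation error
    budget = context_window - max_response_tokens - margin
    return prompt_tokens >= max(budget, 0)
```

For a 128,000-token window with 8,192 tokens reserved for the response, this example budget is 128,000 - 8,192 - 12,800 = 107,008 prompt tokens.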
…t dispatch Replace the single _processing_lock (asyncio.Lock) with per-session locks so that different sessions can process LLM requests concurrently, while messages within the same session remain serialised. An optional global concurrency cap is available via the NANOBOT_MAX_CONCURRENT_REQUESTS env var (default 3, <=0 for unlimited). Also re-binds tool context before each tool execution round to prevent concurrent sessions from clobbering each other's routing info. Tested in production and manually reviewed. (cherry picked from commit c397bb4)
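The locking scheme described above can be sketched with one `asyncio.Lock` per session and an optional shared `asyncio.Semaphore`. The class name `SessionGate` and its `run()` method are hypothetical; only the env var name, its default of 3, and the "<=0 means unlimited" rule come from the commit message.

```python
import asyncio
import os
from typing import Awaitable, Callable, Optional


class SessionGate:
    """Serialise work per session while letting different sessions overlap."""

    def __init__(self, max_concurrent: Optional[int] = None):
        if max_concurrent is None:
            max_concurrent = int(os.environ.get("NANOBOT_MAX_CONCURRENT_REQUESTS", "3"))
        self._locks = {}
        # A value <= 0 disables the global cap.
        self._gate = asyncio.Semaphore(max_concurrent) if max_concurrent > 0 else None

    async def run(self, session_key: str, work: Callable[[], Awaitable]):
        lock = self._locks.setdefault(session_key, asyncio.Lock())
        async with lock:  # same session: one message at a time
            if self._gate is None:
                return await work()
            async with self._gate:  # all sessions: bounded concurrency
                return await work()
```

Taking the session lock before the semaphore means a queued message for a busy session does not occupy one of the global slots while it waits.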
(cherry picked from commit 5c871d7)
Add a new WeChat (微信) channel that connects to personal WeChat using the ilinkai.weixin.qq.com HTTP long-poll API. Protocol reverse-engineered from @tencent-weixin/openclaw-weixin v1.0.2.

Features:
- QR code login flow (nanobot weixin login)
- HTTP long-poll message receiving (getupdates)
- Text message sending with proper WeixinMessage format
- Media download with AES-128-ECB decryption (image/voice/file/video)
- Voice-to-text from WeChat + Groq Whisper fallback
- Quoted message (ref_msg) support
- Session expiry detection and auto-pause
- Server-suggested poll timeout adaptation
- Context token caching for replies
- Auto-discovery via channel registry

No WebSocket, no Node.js bridge, no local WeChat client needed: pure HTTP with a bot token obtained via QR code scan.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Cherry-picked from PR HKUDS#2355 (ad128a7) — only agent/context.py and agent/tools/message.py. Co-Authored-By: qulllee <qullkui@tencent.com>
During testing, we discovered that when a user requests the agent to send a file (e.g., "send me IMG_1115.png"), the agent would call read_file to view the content and then reply with text claiming "file sent" — but never actually deliver the file to the user. Root cause: The system prompt stated "Reply directly with text for conversations. Only use the 'message' tool to send to a specific chat channel", which led the LLM to believe text replies were sufficient for all responses, including file delivery. Fix: Add an explicit IMPORTANT instruction in the system prompt telling the LLM it MUST use the 'message' tool with the 'media' parameter to send files, and that read_file only reads content for its own analysis. Co-Authored-By: qulllee <qullkui@tencent.com>
Previously the WeChat channel's send() method only handled text messages,
completely ignoring msg.media. When the agent called message(media=[...]),
the file was never delivered to the user.
Implement the full WeChat CDN upload protocol following the reference
@tencent-weixin/openclaw-weixin v1.0.2:
1. Generate a client-side AES-128 key (16 random bytes)
2. Call getuploadurl with file metadata + hex-encoded AES key
3. AES-128-ECB encrypt the file and POST to CDN with filekey param
4. Read x-encrypted-param from CDN response header as download param
5. Send message with the media item (image/video/file) referencing
the CDN upload
Also adds:
- _encrypt_aes_ecb() for AES-128-ECB encryption (reverse of existing
_decrypt_aes_ecb)
- Media type detection from file extension (image/video/file)
- Graceful error handling: failed media sends notify the user via text
without blocking subsequent text delivery
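The client-side preparation in steps 1 and 2, together with the extension-based media type detection, can be sketched with the standard library alone. The helper names, the extension sets, and the use of PKCS#7 padding before encryption are assumptions; the AES-128-ECB cipher call itself would come from a crypto library and is left out here.

```python
import secrets
from pathlib import Path

# Hypothetical extension sets; the channel's real mapping may differ.
IMAGE_EXTS = {".png", ".jpg", ".jpeg", ".gif", ".webp", ".bmp"}
VIDEO_EXTS = {".mp4", ".mov", ".mkv", ".avi", ".webm"}


def detect_media_type(path: str) -> str:
    """Classify a file as image, video, or generic file by its extension."""
    ext = Path(path).suffix.lower()
    if ext in IMAGE_EXTS:
        return "image"
    if ext in VIDEO_EXTS:
        return "video"
    return "file"


def new_upload_key() -> tuple:
    """Step 1: 16 random bytes, plus the hex form sent with the upload request."""
    key = secrets.token_bytes(16)
    return key, key.hex()


def pkcs7_pad(data: bytes, block_size: int = 16) -> bytes:
    """Pad to a whole number of AES blocks (assumed padding scheme)."""
    pad = block_size - len(data) % block_size
    return data + bytes([pad]) * pad
```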
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ands

Move channel-specific login logic from CLI into each channel class via a new `login(force=False)` method on BaseChannel. The `channels login <name>` command now dynamically loads the channel and calls its login() method.

- WeixinChannel.login(): calls existing _qr_login(), with force to clear saved token
- WhatsAppChannel.login(): sets up bridge and spawns npm process for QR login
- CLI no longer contains duplicate login logic per channel
- Update CHANNEL_PLUGIN_GUIDE to document the login() hook

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
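A sketch of the hook pattern, assuming `login()` is a coroutine that returns a success flag. `DemoQRChannel`, the `REGISTRY` dict, and `channels_login()` are hypothetical stand-ins for the real channel classes, channel registry, and CLI command.

```python
import asyncio


class BaseChannel:
    """Hypothetical base: channels without interactive login need no override."""

    async def login(self, force: bool = False) -> bool:
        return True


class DemoQRChannel(BaseChannel):
    """Stand-in for a channel that logs in by scanning a QR code."""

    def __init__(self):
        self.token = "saved-token"

    async def _qr_login(self) -> str:
        return "fresh-token"  # placeholder for the real QR flow

    async def login(self, force: bool = False) -> bool:
        if force:
            self.token = None  # discard the saved token and start over
        if self.token is None:
            self.token = await self._qr_login()
        return True


REGISTRY = {"demo": DemoQRChannel}


async def channels_login(name: str, force: bool = False) -> bool:
    """What a generic `channels login <name>` command could do."""
    channel_cls = REGISTRY.get(name)
    if channel_cls is None:
        raise ValueError(f"unknown channel: {name}")
    return await channel_cls().login(force=force)
```

The CLI side then needs no per-channel branches: adding a channel with its own `login()` is enough for the command to support it.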
Key upstream changes merged:
- CommandRouter: clean command dispatch system (nanobot/command/)
- Per-session locks + concurrency gate (replaces global lock)
- Streaming support in LiteLLMProvider (_build_chat_kwargs refactor)
- MCP schema normalization (_normalize_schema_for_openai)
- WeChat channel (weixin), WhatsApp media, centralized think-tag filtering
- Mistral + OVMS providers, keep_recent_messages config
- _sanitize_persisted_blocks for multimodal session history
- Zombie process reap in shell tool, estimate_message_tokens improvements

Custom changes preserved:
- suppress_tools_param + request_timeout (eng-2 Jinja2 workaround)
- _extract_text_tool_calls + think-tag stripping (Qwen3/Nemotron text tools)
- _deep_merge + extends config inheritance (loader.py)
- HttpConfig, NvidiaConfig, sysmon, searxng_url in schema
- SysmonTool + NvidiaEscalateTool in AgentLoop
- Fleet commands migrated to _register_fleet_commands() via CommandRouter
- MCPServerConnection auto-reconnect on McpError code 32600

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- .github/workflows/test.yml: runs pytest on PRs and main pushes, Python 3.11 + 3.12, ignores matrix channel (missing nh3 dep)
- tests/conftest.py: make_mock_provider() + make_agent_loop() helpers with correct AgentLoop init shape (chat_with_retry AsyncMock, provider.generation attrs); prevents test drift after upstream syncs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
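A shared provider helper along these lines might look like the sketch below. The response fields (`content`, `tool_calls`) and the `generation` attributes (`max_tokens`, `temperature`) are assumptions; the actual conftest.py follows whatever AgentLoop reads from the provider.

```python
from types import SimpleNamespace
from unittest.mock import AsyncMock, MagicMock


def make_mock_provider(reply: str = "ok") -> MagicMock:
    """Build a provider double with an awaitable chat_with_retry."""
    provider = MagicMock()
    # AsyncMock lets tests `await provider.chat_with_retry(...)`.
    provider.chat_with_retry = AsyncMock(
        return_value=SimpleNamespace(content=reply, tool_calls=[])
    )
    # Hypothetical generation settings an agent loop might read at init time.
    provider.generation = SimpleNamespace(max_tokens=1024, temperature=0.0)
    return provider
```

Keeping the double in one place means a changed constructor or provider interface after a sync is fixed once rather than in every test module.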
ExecTool.execute() was ignoring the timeout kwarg (falling through to **kwargs) and only appending exit code for non-zero returns. Both caused test_tool_validation failures introduced in the upstream sync. Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
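The two fixes can be illustrated with a standalone coroutine that accepts an explicit `timeout` and always reports the exit code. The function name `run_command`, the default of 60 seconds, and the message formats are assumptions and do not reproduce ExecTool.execute().

```python
import asyncio
from typing import Optional


async def run_command(command: str, timeout: Optional[float] = None,
                      default_timeout: float = 60.0) -> str:
    """Run a shell command, honouring a per-call timeout."""
    limit = timeout if timeout is not None else default_timeout
    proc = await asyncio.create_subprocess_shell(
        command,
        stdout=asyncio.subprocess.PIPE,
        stderr=asyncio.subprocess.STDOUT,
    )
    try:
        output, _ = await asyncio.wait_for(proc.communicate(), timeout=limit)
    except asyncio.TimeoutError:
        proc.kill()
        await proc.wait()  # reap the process so it does not linger as a zombie
        return f"Error: command timed out after {limit} seconds"
    text = output.decode(errors="replace").rstrip()
    # The exit code is appended for every result, including success.
    return f"{text}\nExit code: {proc.returncode}"
```

Naming `timeout` as an explicit parameter keeps it from being swallowed by a catch-all `**kwargs`.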
Summary
Upstream sync merging 48 commits from HKUDS/nanobot, plus CI setup.
Upstream changes merged
- nanobot/command/ dispatch system replacing ad-hoc if/elif chains
- _session_locks + NANOBOT_MAX_CONCURRENT_REQUESTS semaphore (replaces global lock)
- _build_chat_kwargs refactor in LiteLLMProvider, chat_stream, nanobot/cli/stream.py
- _normalize_schema_for_openai for nullable type unions
- WeChat channel (weixin), WhatsApp media, centralized think-tag filtering
- keep_recent_messages config
- _sanitize_persisted_blocks for multimodal session history
- estimate_message_tokens improvements
Custom changes preserved
- suppress_tools_param + request_timeout: eng-2 Jinja2 crash workaround
- _extract_text_tool_calls + think-tag stripping: Qwen3/Nemotron text-mode tool calls
- _deep_merge + extends config inheritance
- HttpConfig, NvidiaConfig, sysmon, searxng_url in schema
- SysmonTool + NvidiaEscalateTool
- Fleet commands (!status, !tasks, !ping, etc.) migrated to _register_fleet_commands() via CommandRouter
- MCPServerConnection auto-reconnect on McpError code 32600
CI additions
- .github/workflows/test.yml: pytest on PR + push, Python 3.11 & 3.12
- tests/conftest.py: shared make_mock_provider() / make_agent_loop() fixtures with correct AgentLoop init shape, preventing test drift after future syncs
Test plan
- … nh3 dep)
- … !ping, !status, !tasks
- suppress_tools_param: true still works for eng-2 (Nemotron model)
- … nanobot agent