chore: upstream sync v4 (48 commits) + CI workflow #13

Merged: kmbandy merged 51 commits into main from feature-forksync-v4 on Mar 23, 2026
@kmbandy kmbandy commented Mar 23, 2026

Summary

Upstream sync merging 48 commits from HKUDS/nanobot, plus CI setup.

Upstream changes merged

  • CommandRouter — clean nanobot/command/ dispatch system replacing ad-hoc if/elif chains
  • Per-session locks: _session_locks + NANOBOT_MAX_CONCURRENT_REQUESTS semaphore (replaces global lock)
  • Streaming: _build_chat_kwargs refactor in LiteLLMProvider, chat_stream, nanobot/cli/stream.py
  • MCP schema normalization: _normalize_schema_for_openai for nullable type unions
  • WeChat channel (weixin), WhatsApp media, centralized think-tag filtering
  • Mistral + OVMS providers, keep_recent_messages config
  • _sanitize_persisted_blocks for multimodal session history
  • Zombie process reap in shell tool, estimate_message_tokens improvements

Custom changes preserved

  • suppress_tools_param + request_timeout — eng-2 Jinja2 crash workaround
  • _extract_text_tool_calls + think-tag stripping — Qwen3/Nemotron text-mode tool calls
  • _deep_merge + extends config inheritance
  • HttpConfig, NvidiaConfig, sysmon, searxng_url in schema
  • SysmonTool + NvidiaEscalateTool
  • Fleet commands (!status, !tasks, !ping, etc.) migrated to _register_fleet_commands() via CommandRouter
  • MCPServerConnection auto-reconnect on McpError code 32600

CI additions

  • .github/workflows/test.yml — pytest on PR + push, Python 3.11 & 3.12
  • tests/conftest.py — shared make_mock_provider() / make_agent_loop() fixtures with correct AgentLoop init shape, preventing test drift after future syncs

Test plan

  • CI workflow runs green on this PR (529 tests, matrix channel excluded — missing nh3 dep)
  • Smoke test eng-1 and eng-2 after merge: !ping, !status, !tasks
  • Verify suppress_tools_param: true still works for eng-2 (Nemotron model)
  • Check streaming works in CLI: nanobot agent

coldxiangyu163 and others added 30 commits February 28, 2026 22:41
Part 1: Make system prompt static
- Move Current Time from system prompt to user message prefix
- System prompt now only changes when config/skills change, not every minute
- Timestamp injected as [YYYY-MM-DD HH:MM (Day) (TZ)] prefix on each user message

Part 2: Add second cache_control breakpoint
- Existing: system message breakpoint (caches static system prompt)
- New: second-to-last message breakpoint (caches conversation history prefix)
- Refactored _apply_cache_control with shared _mark() helper

Before: 0% cache hit rate (system prompt changed every minute)
After: ~90% savings on cached input tokens for multi-turn conversations

Closes HKUDS#981
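The two breakpoints described above can be sketched roughly as follows. This is a minimal illustration, not the code in litellm_provider.py: `apply_cache_control` and the message shapes are assumptions, and only the `_mark` helper name and the `cache_control` idea come from the commit message. The `{"type": "ephemeral"}` marker follows Anthropic's prompt-caching format.

```python
from copy import deepcopy
from typing import Any

Message = dict[str, Any]
EPHEMERAL = {"type": "ephemeral"}


def _mark(message: Message) -> Message:
    """Return a copy of `message` whose last content block carries cache_control."""
    marked = deepcopy(message)
    content = marked.get("content")
    if isinstance(content, str):
        # Promote plain text to a block list so the marker has somewhere to live.
        marked["content"] = [
            {"type": "text", "text": content, "cache_control": dict(EPHEMERAL)}
        ]
    elif isinstance(content, list) and content:
        content[-1] = {**content[-1], "cache_control": dict(EPHEMERAL)}
    return marked


def apply_cache_control(messages: list[Message]) -> list[Message]:
    """Hypothetical stand-in for _apply_cache_control: mark two breakpoints."""
    out = list(messages)
    # Breakpoint 1: the static system prompt.
    for index, message in enumerate(out):
        if message.get("role") == "system":
            out[index] = _mark(message)
            break
    # Breakpoint 2: the second-to-last message, so the history prefix is cached.
    if len(out) >= 3:
        out[-2] = _mark(out[-2])
    return out
```

The newest user message stays unmarked, so each turn only pays for the tokens after the second breakpoint.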
Exclude openai_codex alongside github_copilot from generated config,
filter OAuth-only providers out of the onboarding wizard, and clarify in
README that OAuth login stores session state outside config. Also unify
the GitHub Copilot login command spelling and add regression tests.

Made-with: Cursor
Keep multimodal tool outputs on the native content-block path while
restoring redirect SSRF checks for web_fetch image responses. Also share
image block construction, simplify persisted history sanitization, and
add regression tests for image reads and blocked private redirects.

Made-with: Cursor
…rception

Add native image content blocks for read_file and web_fetch, preserve the multimodal tool-result path through the agent loop, and keep session history compact with image placeholders. Also harden web_fetch against redirect-based SSRF bypasses and add regression coverage for image reads and blocked private redirects.
Only normalize nullable MCP tool schemas for OpenAI-compatible providers so optional params still work without collapsing unrelated unions. Also teach local validation to honor nullable flags and add regression coverage for nullable and non-nullable schemas.

Made-with: Cursor
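A rough sketch of the nullable normalization described above, assuming the nullable unions arrive as JSON Schema `type` lists such as `["string", "null"]`. The function name `normalize_nullable` is hypothetical and the real `_normalize_schema_for_openai` may handle more cases (for example `anyOf`).

```python
from typing import Any


def normalize_nullable(schema: dict[str, Any]) -> dict[str, Any]:
    """Collapse ["T", "null"] into type "T" plus nullable=True; leave other unions alone."""
    out = dict(schema)
    declared = out.get("type")
    if isinstance(declared, list):
        non_null = [t for t in declared if t != "null"]
        # Only the simple "one real type or null" case is rewritten.
        if "null" in declared and len(non_null) == 1:
            out["type"] = non_null[0]
            out["nullable"] = True
    properties = out.get("properties")
    if isinstance(properties, dict):
        out["properties"] = {
            key: normalize_nullable(value) if isinstance(value, dict) else value
            for key, value in properties.items()
        }
    if isinstance(out.get("items"), dict):
        out["items"] = normalize_nullable(out["items"])
    return out
```

Unions such as `["string", "integer"]` pass through unchanged, which matches the stated goal of not collapsing unrelated unions.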
Handle /status at the run-loop level so it can return immediately while the agent is busy, and reset last-usage stats when providers omit usage data. Also keep Telegram help/menu coverage for /status without changing the existing final-response send path.

Made-with: Cursor
Keep status output responsive while estimating current context from session history, dropping low-value queue/subagent counters, and marking command-style replies for plain-text rendering in CLI. Also route direct CLI calls through outbound metadata so help/status formatting stays explicit instead of relying on content heuristics.

Made-with: Cursor
Only use process_direct_outbound when the agent loop actually exposes it as an async method, and otherwise fall back to the legacy process_direct path. This keeps the new CLI render-metadata flow without breaking existing test doubles or older direct-call implementations.

Made-with: Cursor
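The fallback described above might look something like this. It is a sketch only: `run_direct` is a hypothetical wrapper, and the exact signatures of `process_direct` and `process_direct_outbound` are assumed to take a single text argument.

```python
import asyncio
import inspect


async def run_direct(agent_loop, text: str):
    """Prefer process_direct_outbound when it is a real async method, else fall back."""
    outbound = getattr(agent_loop, "process_direct_outbound", None)
    if inspect.iscoroutinefunction(outbound):
        return await outbound(text)
    # Legacy path: older loops and simple test doubles only expose process_direct.
    return await agent_loop.process_direct(text)
```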
feat: add /status command to show runtime info
Merge process_direct() and process_direct_outbound() into a single
interface returning OutboundMessage | None. This eliminates the
dual-path detection logic in CLI single-message mode that relied on
inspect.iscoroutinefunction to distinguish between the two APIs.

Extract status rendering into a pure function build_status_content()
in utils/helpers.py, decoupling it from AgentLoop internals.

Made-with: Cursor
estimate_prompt_tokens() only counted the `content` text field, completely
missing tool_calls JSON (~72% of actual payload), reasoning_content,
tool_call_id, name, and per-message framing overhead. This caused the
memory consolidator to never trigger for tool-heavy sessions (e.g. cron
jobs), leading to context window overflow errors from the LLM provider.

Also adds reasoning_content counting and proper per-message overhead to
estimate_message_tokens() for consistent boundary detection.

Made-with: Cursor
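To make the undercount concrete, here is a simplified estimator that counts the fields listed above. The 4-characters-per-token ratio and the 4-token per-message overhead are assumptions for illustration, not the values used in the repository.

```python
import json
from typing import Any

PER_MESSAGE_OVERHEAD = 4  # assumed framing cost per message
CHARS_PER_TOKEN = 4  # rough heuristic, not a real tokenizer


def estimate_message_tokens(message: dict[str, Any]) -> int:
    """Estimate tokens for one message, including tool calls and reasoning."""
    parts: list[str] = []
    content = message.get("content")
    if isinstance(content, str):
        parts.append(content)
    elif isinstance(content, list):
        parts.extend(b.get("text", "") for b in content if isinstance(b, dict))
    for key in ("reasoning_content", "tool_call_id", "name"):
        value = message.get(key)
        if isinstance(value, str):
            parts.append(value)
    if message.get("tool_calls"):
        # Tool-call JSON is part of the payload and must be counted too.
        parts.append(json.dumps(message["tool_calls"], ensure_ascii=False))
    chars = sum(len(p) for p in parts)
    return PER_MESSAGE_OVERHEAD + (chars + CHARS_PER_TOKEN - 1) // CHARS_PER_TOKEN


def estimate_prompt_tokens(messages: list[dict[str, Any]]) -> int:
    """Sum the per-message estimates so both functions agree on boundaries."""
    return sum(estimate_message_tokens(m) for m in messages)
```

With a content-only estimator, a message whose body is almost entirely `tool_calls` would register as nearly empty, which is how tool-heavy sessions slipped past the consolidator.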
Resolve conflict in context.py: keep main's build_messages which already
merges runtime context into user message (achieving the same cache goal).
The real value-add from this PR is the second cache breakpoint in
litellm_provider.py.

Made-with: Cursor
…ic models

perf: optimize prompt cache hit rate for Anthropic models
Preserve the provider and agent-loop streaming primitives plus the CLI experiment scaffolding so this work can be resumed later without blocking urgent bug fixes on main.

Made-with: Cursor
Provider layer: add chat_stream / chat_stream_with_retry to all providers
(base fallback, litellm, custom, azure, codex). Refactor shared kwargs
building in each provider.

Channel layer: BaseChannel gains send_delta (no-op) and supports_streaming
(checks config + method override). ChannelManager routes _stream_delta /
_stream_end to send_delta, skips _streamed final messages.

AgentLoop._dispatch builds bus-backed on_stream/on_stream_end callbacks
when _wants_stream metadata is set. Non-streaming path unchanged.

CLI: clean up spinner ANSI workarounds, simplify commands.py flow.

Made-with: Cursor
Move ThinkingSpinner and StreamRenderer into a dedicated module to keep
commands.py focused on orchestration. Uses Rich Live with manual refresh
(auto_refresh=False) and ellipsis overflow for stable streaming output.

Made-with: Cursor
…aming

- Add strip_think() to helpers.py as single source of truth
- Filter deltas in agent loop before dispatching to consumers
- Implement send_delta in TelegramChannel with progressive edit_message_text
- Remove duplicate think filtering from CLI stream.py and telegram.py
- Remove legacy fake streaming (send_message_draft) from Telegram
- Default Telegram streaming to true
- Update CHANNEL_PLUGIN_GUIDE.md with streaming documentation

Made-with: Cursor
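A minimal sketch of what a shared think-tag filter could look like. The literal `<think>` tag name and the handling of an unclosed tag are assumptions; the actual `strip_think()` in helpers.py may differ.

```python
import re

# Assumed tag name; the commit only says "think-tag".
_CLOSED = re.compile(r"<think>.*?</think>", re.DOTALL)
_UNCLOSED = re.compile(r"<think>.*\Z", re.DOTALL)


def strip_think(text: str) -> str:
    """Remove complete think blocks, then any trailing block that never closed."""
    text = _CLOSED.sub("", text)
    text = _UNCLOSED.sub("", text)
    return text.strip()
```

Dropping an unclosed block matters for streaming, where a partial buffer can end in the middle of the model's reasoning.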
Register Mistral as a first-class provider with LiteLLM routing,
MISTRAL_API_KEY env var, and https://api.mistral.ai/v1 default base.

Includes schema field, registry entry, and tests.
Trigger token consolidation before prompt usage reaches the full context window so response tokens and tokenizer estimation drift still fit safely within the model budget.

Made-with: Cursor
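The early trigger can be pictured as a budget calculation. The function names and the 1024-token safety margin below are hypothetical; the commit does not state the actual margin.

```python
def consolidation_threshold(
    context_window: int,
    max_response_tokens: int,
    safety_margin_tokens: int = 1024,  # assumed allowance for estimator drift
) -> int:
    """Prompt size at which consolidation should start, never below zero."""
    return max(0, context_window - max_response_tokens - safety_margin_tokens)


def should_consolidate(
    estimated_prompt_tokens: int,
    context_window: int,
    max_response_tokens: int,
    safety_margin_tokens: int = 1024,
) -> bool:
    """True once the estimated prompt no longer leaves room for the reply."""
    threshold = consolidation_threshold(
        context_window, max_response_tokens, safety_margin_tokens
    )
    return estimated_prompt_tokens >= threshold
```

For a 128,000-token window with 8,000 tokens reserved for the response, this sketch would start consolidating at 118,976 estimated prompt tokens rather than at 128,000.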
Re-bin and others added 21 commits March 23, 2026 16:48
…t dispatch

Replace the single _processing_lock (asyncio.Lock) with per-session locks
so that different sessions can process LLM requests concurrently, while
messages within the same session remain serialised.

An optional global concurrency cap is available via the
NANOBOT_MAX_CONCURRENT_REQUESTS env var (default 3, <=0 for unlimited).

Also re-binds tool context before each tool execution round to prevent
concurrent sessions from clobbering each other's routing info.

Tested in production and manually reviewed.

(cherry picked from commit c397bb4)
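The locking scheme above can be sketched as follows. `SessionGate` and `load_cap` are hypothetical names; only `_session_locks`, the `NANOBOT_MAX_CONCURRENT_REQUESTS` variable, its default of 3, and the "<=0 means unlimited" rule come from the commit message.

```python
import asyncio
import os
from collections import defaultdict
from contextlib import asynccontextmanager
from typing import Optional


def load_cap(default: int = 3) -> Optional[int]:
    """Read the global cap from the environment; None means unlimited."""
    raw = os.environ.get("NANOBOT_MAX_CONCURRENT_REQUESTS", str(default))
    try:
        value = int(raw)
    except ValueError:
        value = default
    return value if value > 0 else None


class SessionGate:
    """One lock per session, plus an optional global semaphore."""

    def __init__(self, max_concurrent: Optional[int]) -> None:
        self._session_locks = defaultdict(asyncio.Lock)
        self._semaphore = (
            asyncio.Semaphore(max_concurrent) if max_concurrent else None
        )

    @asynccontextmanager
    async def acquire(self, session_key: str):
        # Messages in the same session queue behind the session lock.
        async with self._session_locks[session_key]:
            if self._semaphore is None:
                yield
            else:
                # Different sessions only contend for the global cap.
                async with self._semaphore:
                    yield
```

Taking the session lock before the semaphore means a queued message for a busy session does not hold one of the global slots while it waits.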
Add a new WeChat (微信) channel that connects to personal WeChat using
the ilinkai.weixin.qq.com HTTP long-poll API. Protocol reverse-engineered
from @tencent-weixin/openclaw-weixin v1.0.2.

Features:
- QR code login flow (nanobot weixin login)
- HTTP long-poll message receiving (getupdates)
- Text message sending with proper WeixinMessage format
- Media download with AES-128-ECB decryption (image/voice/file/video)
- Voice-to-text from WeChat + Groq Whisper fallback
- Quoted message (ref_msg) support
- Session expiry detection and auto-pause
- Server-suggested poll timeout adaptation
- Context token caching for replies
- Auto-discovery via channel registry

No WebSocket, no Node.js bridge, no local WeChat client needed — pure
HTTP with a bot token obtained via QR code scan.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Cherry-picked from PR HKUDS#2355 (ad128a7) — only agent/context.py and agent/tools/message.py.

Co-Authored-By: qulllee <qullkui@tencent.com>
During testing, we discovered that when a user requests the agent to
send a file (e.g., "send me IMG_1115.png"), the agent would call
read_file to view the content and then reply with text claiming
"file sent" — but never actually deliver the file to the user.

Root cause: The system prompt stated "Reply directly with text for
conversations. Only use the 'message' tool to send to a specific
chat channel", which led the LLM to believe text replies were
sufficient for all responses, including file delivery.

Fix: Add an explicit IMPORTANT instruction in the system prompt
telling the LLM it MUST use the 'message' tool with the 'media'
parameter to send files, and that read_file only reads content
for its own analysis.

Co-Authored-By: qulllee <qullkui@tencent.com>
Previously the WeChat channel's send() method only handled text messages,
completely ignoring msg.media. When the agent called message(media=[...]),
the file was never delivered to the user.

Implement the full WeChat CDN upload protocol following the reference
@tencent-weixin/openclaw-weixin v1.0.2:
  1. Generate a client-side AES-128 key (16 random bytes)
  2. Call getuploadurl with file metadata + hex-encoded AES key
  3. AES-128-ECB encrypt the file and POST to CDN with filekey param
  4. Read x-encrypted-param from CDN response header as download param
  5. Send message with the media item (image/video/file) referencing
     the CDN upload

Also adds:
- _encrypt_aes_ecb() for AES-128-ECB encryption (reverse of existing
  _decrypt_aes_ecb)
- Media type detection from file extension (image/video/file)
- Graceful error handling: failed media sends notify the user via text
  without blocking subsequent text delivery

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
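Two of the smaller pieces above, key generation (step 1) and media type detection, can be sketched with the standard library alone. Both function names are hypothetical, and the encryption and CDN calls are deliberately left out because their details are not given here.

```python
import mimetypes
import os


def new_aes_key() -> tuple[bytes, str]:
    """Generate a 16-byte AES-128 key and its hex form for the upload request."""
    key = os.urandom(16)
    return key, key.hex()


def detect_media_kind(path: str) -> str:
    """Map a file name to image, video, or file based on its extension."""
    mime, _ = mimetypes.guess_type(path)
    if mime and mime.startswith("image/"):
        return "image"
    if mime and mime.startswith("video/"):
        return "video"
    # Anything unrecognised is sent as a generic file attachment.
    return "file"
```

Falling back to "file" keeps unknown extensions deliverable instead of failing the send.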
…ands

Move channel-specific login logic from CLI into each channel class via a
new `login(force=False)` method on BaseChannel. The `channels login <name>`
command now dynamically loads the channel and calls its login() method.

- WeixinChannel.login(): calls existing _qr_login(), with force to clear saved token
- WhatsAppChannel.login(): sets up bridge and spawns npm process for QR login
- CLI no longer contains duplicate login logic per channel
- Update CHANNEL_PLUGIN_GUIDE to document the login() hook

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
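The login hook might be shaped roughly like this. `DemoChannel`, the `CHANNELS` registry, and `channels_login` are illustrative stand-ins, and whether `login()` is async and what it returns are assumptions; only the `login(force=False)` signature and the `_qr_login()` name come from the commit message.

```python
import asyncio


class BaseChannel:
    name = "base"

    async def login(self, force: bool = False) -> bool:
        """Default hook: channels without interactive login succeed trivially."""
        return True


class DemoChannel(BaseChannel):
    """Hypothetical channel with a saved token and a QR login flow."""

    name = "demo"

    def __init__(self) -> None:
        self.token = "saved-token"

    async def login(self, force: bool = False) -> bool:
        if force:
            self.token = None  # discard the saved token and start over
        if self.token is None:
            self.token = await self._qr_login()
        return True

    async def _qr_login(self) -> str:
        return "fresh-token"  # placeholder for a real QR scan


CHANNELS = {"demo": DemoChannel}  # stand-in for the channel registry


async def channels_login(name: str, force: bool = False) -> bool:
    """What a `channels login <name>` command could do: load, then delegate."""
    channel = CHANNELS[name]()
    return await channel.login(force=force)
```

With this shape the CLI needs no per-channel branches; adding a channel with its own login flow only requires overriding `login()`.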
Key upstream changes merged:
- CommandRouter: clean command dispatch system (nanobot/command/)
- Per-session locks + concurrency gate (replaces global lock)
- Streaming support in LiteLLMProvider (_build_chat_kwargs refactor)
- MCP schema normalization (_normalize_schema_for_openai)
- WeChat channel (weixin), WhatsApp media, centralized think-tag filtering
- Mistral + OVMS providers, keep_recent_messages config
- _sanitize_persisted_blocks for multimodal session history
- Zombie process reap in shell tool, estimate_message_tokens improvements

Custom changes preserved:
- suppress_tools_param + request_timeout (eng-2 Jinja2 workaround)
- _extract_text_tool_calls + think-tag stripping (Qwen3/Nemotron text tools)
- _deep_merge + extends config inheritance (loader.py)
- HttpConfig, NvidiaConfig, sysmon, searxng_url in schema
- SysmonTool + NvidiaEscalateTool in AgentLoop
- Fleet commands migrated to _register_fleet_commands() via CommandRouter
- MCPServerConnection auto-reconnect on McpError code 32600

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
- .github/workflows/test.yml: runs pytest on PRs and main pushes,
  Python 3.11 + 3.12, ignores matrix channel (missing nh3 dep)
- tests/conftest.py: make_mock_provider() + make_agent_loop() helpers
  with correct AgentLoop init shape (chat_with_retry AsyncMock,
  provider.generation attrs) — prevents test drift after upstream syncs

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
ExecTool.execute() was ignoring the timeout kwarg (falling through to
**kwargs) and only appending exit code for non-zero returns. Both caused
test_tool_validation failures introduced in the upstream sync.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
@kmbandy kmbandy merged commit 5e99315 into main Mar 23, 2026
0 of 3 checks passed
@kmbandy kmbandy deleted the feature-forksync-v4 branch March 23, 2026 22:09