From 2db51c0c9a3ffdac63f2ea4bdffae5b9c80ad5e6 Mon Sep 17 00:00:00 2001
From: Alex's Mac
Date: Sun, 29 Mar 2026 20:02:51 +0800
Subject: [PATCH 1/4] =?UTF-8?q?feat:=20A2A=20protocol=20v2=20=E2=80=94=20s?=
 =?UTF-8?q?elective=20independence=20+=20Discussion=20mode?=
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Evolve the A2A protocol based on deep research, source code verification,
and architectural analysis:

Core architecture change — selective independence:
- Only high-value cross-cutting agents (e.g., CoS) get independent
  Slack Apps. Execution-layer agents (CTO, Builder, CIO) keep sharing
  one App via channel-based routing (existing proven model)
- CoS-Bot joins existing channels (#cto, #build) to collaborate
  directly — like walking into another agent's office
- CoS is the orchestrator (maps to Harness Design's external
  orchestrator), CTO is a participant/generator

Protocol (shared/A2A_PROTOCOL.md — agent-facing):
- Delegation mode: preserved unchanged, all platforms
- Discussion mode: CoS-Bot enters target channel, @mention-driven
  conversation with per-account channel config isolation
- Platform matrix with honest "why not" for Discord/Feishu
- Agent-executable config guide (user sends to their OpenClaw)
- POC verification steps (4 tests)
- Source-code-verified: self-loop per-account, allowBots 3-tier
  fallback, per-account channel config (prepare.ts)

Concepts (docs/CONCEPTS.md + en — human-facing):
- "Selective independence" explained: not every agent needs own App
- CoS as orchestrator, execution layer unchanged
- Discord: OpenClaw code bug (#11199), not platform limitation
- Feishu: platform API limitation (bot messages invisible)

All Discussion content marked [待 POC 验证] / [Pending POC Verification].
Co-Authored-By: Claude Opus 4.6 (1M context) --- docs/CONCEPTS.md | 50 +++++- docs/en/CONCEPTS.md | 50 +++++- shared/A2A_PROTOCOL.md | 344 +++++++++++++++++++++++++++++++++-------- 3 files changed, 379 insertions(+), 65 deletions(-) diff --git a/docs/CONCEPTS.md b/docs/CONCEPTS.md index 200ca33..b474930 100644 --- a/docs/CONCEPTS.md +++ b/docs/CONCEPTS.md @@ -119,6 +119,51 @@ Builder → 不能派单(只接单执行) CIO → 独立运作(必要时与 CoS 同步) ``` +### A2A 的两种模式:Delegation 与 Discussion + +上面描述的两步触发是 **Delegation(委派)** 模式——一个 Agent 通过 `sessions_send` 把结构化任务交给另一个 Agent。这是 A2A 的基础,所有平台都支持,流程清晰、单向可控。 + +v2 引入了第二种模式:**Discussion(讨论)**。少数高价值 Agent(如 CoS)拥有独立 Slack App,直接进入其他 Agent 的频道进行实时讨论。 [待 POC 验证] + +**核心思路:选择性独立化。** 不需要每个 Agent 都有独立 App——执行层(CTO、Builder、CIO 等)继续共享一个 Slack App。只让 CoS(代表用户推进方向)等需要跨域协作的 Agent 拥有独立 App,然后把它拉进目标频道就能直接对话。 + +**什么时候用哪个?** + +| | Delegation(委派) | Discussion(讨论)[待 POC 验证] | +|--|-------------------|-------------------------------| +| 场景 | "CTO 给 Builder 派一个具体任务" | "CoS 进 #cto 跟 CTO 讨论方案,然后去 #build 跟 Builder 确认可行性" | +| 触发方式 | `sessions_send` | @mention / 直接发消息 | +| 方向性 | 单向,一对一 | 多向,多对多 | +| 平台 | Slack / Discord / Feishu | 仅 Slack(需独立 App) | + +两种模式共存,不互相替代。Delegation 是"给任务",Discussion 是"一起想"。 + +**为什么需要 Discussion?** Delegation 是任务分发:CTO 说做什么,Builder 照做。但真实团队不只是派活——他们讨论。CoS 走进 CTO 的办公室说"这个方向你怎么看?",CTO 说"技术上可以但成本高",CoS 又去问 Builder"你觉得多久能做完?"。Discussion 模式让 Agent 也能这样协作——CoS-Bot 被拉进 #cto 频道,直接在 CTO 的地盘上对话,你可以实时旁观、随时插话。 + +### 平台能力对比 + +| 能力 | Slack | Discord | Feishu | +|------|-------|---------|--------| +| Delegation(sessions_send) | ✅ | ✅ | ✅ | +| Discussion(跨 bot 对话)| 待 POC 验证 | 不支持(OpenClaw 代码层 bug) | 不支持(飞书平台限制) | +| Thread / Topic 隔离 | 原生 thread | Thread(自动归档) | groupSessionScope(>= 2026.3.1) | + +**为什么 Discord 和 Feishu 不支持 Discussion?** + +- **Discord**:平台本身支持跨 bot 消息可见,但 OpenClaw 的 bot 消息过滤代码存在 bug(Issue #11199),把所有已配置的 bot 都当作"自己"并丢弃消息。属于代码层问题,理论上可修复,但修复 PR 均已关闭。 +- **Feishu**:飞书的 `im.message.receive_v1` 事件**只投递用户发送的消息**——bot 发的消息对其他 bot 完全不可见。这是飞书平台 API 
的设计决策,无法通过配置绕过。 + +### Discussion 模式的当前状态 + +坦率地说:Discussion 模式还没有被端到端验证过。 + +通过 OpenClaw 源码验证(`extensions/slack/src/monitor/message-handler/prepare.ts`),以下机制已确认: +- Self-loop 过滤是 per-account 的(不同 Slack App 不互相过滤) +- `allowBots` 支持三级 fallback(per-channel > per-account > global) +- Per-account channel config 可以给同一频道的不同 bot 设置不同的 `requireMention` + +但完整链路——CoS-Bot 在 #cto 发消息,CTO 收到并回复,CoS 看到回复并继续对话——还需要实际 POC 验证。详见 `shared/A2A_PROTOCOL.md` 附录 B。 + --- ## 5. 结构化产物:Closeout 和 Checkpoint @@ -270,8 +315,9 @@ Layer 2: 抽象知识 ├─ 任务按 QAPS 分类处理 │ └─ Q 轻量处理,A/P/S 必须 Closeout │ - ├─ Agent 之间通过 A2A 两步触发协作 - │ └─ 权限矩阵 + 循环保护 + ├─ Agent 之间通过 A2A 协作 + │ ├─ Delegation:两步触发 + 权限矩阵 + │ └─ Discussion:@mention 多 Agent 讨论 [待 POC 验证] │ └─ 知识通过三层沉淀积累 └─ 对话 → Closeout → KO 抽象知识 diff --git a/docs/en/CONCEPTS.md b/docs/en/CONCEPTS.md index 63f706b..b50005a 100644 --- a/docs/en/CONCEPTS.md +++ b/docs/en/CONCEPTS.md @@ -119,6 +119,51 @@ Builder -> Cannot assign tasks (receives and executes only) CIO -> Operates independently (syncs with CoS when needed) ``` +### Two A2A Modes: Delegation and Discussion + +The two-step trigger described above is the **Delegation** mode -- one Agent hands a structured task to another via `sessions_send`. This is the foundation of A2A, supported on all platforms, with a clear one-directional flow. + +v2 introduces a second mode: **Discussion**. A small number of high-value Agents (e.g., CoS) get their own independent Slack App, then join other Agents' channels to collaborate in real-time. [Pending POC Verification] + +**Core idea: selective independence.** Not every Agent needs its own App -- execution-layer Agents (CTO, Builder, CIO, etc.) keep sharing one Slack App. Only Agents that need cross-domain collaboration (like CoS, who represents the user and drives strategy) get their own App, then get invited into target channels for direct conversation. 
+ +**When to use which?** + +| | Delegation | Discussion [Pending POC Verification] | +|--|------------|---------------------------------------| +| Scenario | "CTO assigns a specific task to Builder" | "CoS walks into #cto to discuss the approach with CTO, then checks with Builder in #build" | +| Trigger | `sessions_send` | @mention / direct message | +| Directionality | One-way, one-to-one | Multi-directional, many-to-many | +| Platform | Slack / Discord / Feishu | Slack only (requires independent App) | + +The two modes coexist -- they do not replace each other. Delegation is for "assign the work." Discussion is for "think it through together." + +**Why Discussion?** Delegation is task distribution: CTO says what to do, Builder executes. But real teams don't just assign tasks -- they discuss. CoS walks into CTO's office and asks "what do you think about this direction?", CTO says "technically feasible but expensive", CoS then asks Builder "how long would this take?". Discussion mode lets Agents work the same way -- CoS-Bot gets invited into #cto and talks directly on CTO's home turf. You can watch in real-time and intervene at any point. + +### Platform Capability Comparison + +| Capability | Slack | Discord | Feishu | +|------------|-------|---------|--------| +| Delegation (sessions_send) | YES | YES | YES | +| Discussion (cross-bot) | Pending POC Verification | Not supported (OpenClaw code-level bug) | Not supported (platform limitation) | +| Thread / Topic isolation | Native thread | Thread (auto-archive) | groupSessionScope (>= 2026.3.1) | + +**Why can't Discord and Feishu support Discussion?** + +- **Discord**: The platform itself supports cross-bot message visibility, but OpenClaw's bot message filter (Issue #11199) treats ALL configured bots as "self" and drops their messages. This is a code-level bug, not a platform limitation -- but fix PRs have all been closed. 
+- **Feishu**: Feishu's `im.message.receive_v1` event **only delivers user-sent messages** -- bot messages are completely invisible to other bots. This is a platform API design decision and cannot be worked around through configuration. + +### Current Status of Discussion Mode + +To be candid: Discussion mode has not been verified end-to-end. + +Through OpenClaw source code verification (`extensions/slack/src/monitor/message-handler/prepare.ts`), the following mechanisms are confirmed: +- Self-loop filtering is per-account (different Slack Apps don't filter each other) +- `allowBots` supports three-tier fallback (per-channel > per-account > global) +- Per-account channel config can give different bots different `requireMention` settings on the same channel + +But the complete chain -- CoS-Bot posts in #cto, CTO receives and replies, CoS sees the reply and continues the conversation -- still needs a real POC test. See `shared/A2A_PROTOCOL.md` Appendix B for verification steps. + --- ## 5. 
Structured Artifacts: Closeout and Checkpoint @@ -270,8 +315,9 @@ You (decision-maker) |-- Tasks are classified and handled via QAPS | +-- Q gets lightweight handling; A/P/S require Closeout | - |-- Agents collaborate via A2A two-step trigger - | +-- Permission matrix + loop prevention + |-- Agents collaborate via A2A + | |-- Delegation: two-step trigger + permission matrix + | +-- Discussion: @mention multi-agent deliberation [Pending POC Verification] | +-- Knowledge accumulates through three-layer distillation +-- Conversation -> Closeout -> KO abstract knowledge diff --git a/shared/A2A_PROTOCOL.md b/shared/A2A_PROTOCOL.md index d9f3bd5..f3c4ebd 100644 --- a/shared/A2A_PROTOCOL.md +++ b/shared/A2A_PROTOCOL.md @@ -1,141 +1,363 @@ -# A2A 协作协议(Slack 多 Agent) +# A2A 协作协议 v2(跨平台多 Agent) -> 目标:让 Agent 之间的协作 **自动发生在正确的 Slack 频道/线程里**,做到: +> 目标:让 Agent 之间的协作 **自动发生在正确的频道/线程里**,做到: > - 可见(用户 能在频道里看到) > - 可追踪(每个任务一个 thread/session) > - 不串上下文(thread 级隔离 + 任务包完整) +> +> v2 覆盖平台:Slack / Feishu / Discord +> v2 协作模式:**Delegation**(全平台)+ **Discussion**(Slack 多 Bot)[待 POC 验证] --- ## 0. 
术语 -- **A2A(本文)**:Agent-to-Agent 协作流程(不等同于 OpenClaw 的某一个单独工具名)。 -- **Task Thread**:在目标 Agent 的 Slack 频道里创建的任务线程;该线程即该任务的独立 Session。 +- **A2A**:Agent-to-Agent 协作流程总称,包含 Delegation 和 Discussion 两种模式。 +- **Task Thread**:在目标 Agent 频道里创建的任务线程;该线程即该任务的独立 Session。 +- **Delegation(委派)**:由 `sessions_send` 触发的结构化任务委派,全平台可用。 +- **Discussion(讨论)**:由 @mention 触发的多 Agent 实时讨论,仅 Slack 多 Bot [待 POC 验证]。 +- **Multi-Account(多账户)**:每个 Agent 使用独立 Slack App(独立 bot token / app token / bot user ID)。 +- **Orchestrator(编排者)**:控制讨论节奏的角色。默认是 CoS(代表用户推进),也可以是人类。 --- ## 1) 权限矩阵(必须遵守) -- CoS → 只能给 CTO 派单/对齐方向(默认不直达 Builder) -- CTO → 可以派单给 Builder / Research / KO / Ops +- CoS → CTO(默认不直达 Builder);Discussion 模式中 CoS 是编排者 +- CTO → Builder / Research / KO / Ops - Builder → 只接单执行;需要澄清时回到 CTO thread 提问 - CIO → 尽量独立;仅必要时与 CoS/KO 同步 - KO/Ops → 作为审计/沉淀,通常不主动派单 -(注:技术上 Slack bot 可以给任意频道发消息,但这是组织纪律,不遵守视为 bug。) +(注:技术上 bot 可以给任意频道发消息,但这是组织纪律,不遵守视为 bug。) --- -## 2) A2A 触发方式(核心) +## 2a) Delegation Mode(委派模式 — 全平台) 当 A 想让 B 开工时(**不允许人工复制粘贴**): -> ⚠️ 重要现实:Slack 中所有 Agent 共用同一个 bot 身份。 -> **bot 自己发到别的频道的消息,默认不会触发对方 Agent 自动运行**(OpenClaw 默认忽略 bot-authored inbound,避免自循环)。 -> 因此:跨 Agent 的"真正触发"必须通过 **sessions_send(agent-to-agent)** 完成;Slack 发消息仅作为"可见性锚点"。 +> ⚠️ 重要现实:单 Bot 模式下所有 Agent 共用一个 bot 身份。 +> **bot 自己发到别的频道的消息,默认不会触发对方 Agent**(OpenClaw 忽略 bot-authored inbound,防自循环)。 +> 因此:跨 Agent 触发必须通过 **sessions_send** 完成;频道消息仅作"可见性锚点"。 ### Step 1 - 在目标频道创建可见的 root message(锚点) -A 在 B 的 Slack 频道创建一个任务根消息(root message),第一行固定前缀: +A 在 B 的频道创建 root message,第一行固定前缀: ``` A2A | | TID:<YYYYMMDD-HHMM>-<short> ``` 正文必须是完整任务包(建议使用 `~/.openclaw/shared/SUBAGENT_PACKET_TEMPLATE.md`): -- Objective(目标) -- DoD(完成标准) -- Inputs(已有信息/链接/文件) -- Constraints(约束/边界) -- Output format(输出格式) -- CC(需要同步到哪个频道/人) +- Objective / DoD / Inputs / Constraints / Output format / CC -> 前置条件:OpenClaw bot 必须被邀请进目标频道,否则会报 `not_in_channel`。 +> 前置条件:bot 必须已加入目标频道,否则报 `not_in_channel`。 -### Step 2 - 用 sessions_send 触发 B 在该 thread/session 中运行 -A 读取 root message 的 
Slack message id(ts),拼出 thread sessionKey: +### Step 2 - 用 sessions_send 触发目标 Agent +A 读取 root message 的 message id(ts),拼出 thread sessionKey: -- 频道 session:`agent:<B>:slack:channel:<channelId>` -- 线程 session:`agent:<B>:slack:channel:<channelId>:thread:<root_ts>` +| 平台 | Session Key 格式 | +|------|-----------------| +| Slack | `agent:<B>:slack:channel:<channelId>:thread:<root_ts>` | +| Discord | `agent:<B>:discord:channel:<channelId>:thread:<root_ts>` | +| Feishu | `agent:<B>:feishu:group:<chatId>:topic:<root_id>` | -然后 A 用 `sessions_send(sessionKey=..., message=<完整任务包或第一步的引用>)` 触发 B。 +然后 A 用 `sessions_send(sessionKey=..., message=<完整任务包>)` 触发 B。 -> ⚠️ **timeout 容错**:`sessions_send` 返回 timeout **≠ 没送达**。消息可能已送达并被处理。 -> 规避:在 Slack thread 里补发一条兜底消息("已通过 A2A 发送,如未收到可在此查看全文")。 - -> ⚠️ **SessionKey 注意**:不要手打 sessionKey。优先从 `sessions_list` 复制 `deliveryContext=slack` 的 key。 -> 注意 channel ID 大小写一致性——大小写不一致可能导致 session 被拆分,路由到 webchat 而非 Slack。 +> ⚠️ **timeout ≠ 失败**。消息可能已送达。规避:在 thread 里补发兜底消息。 +> ⚠️ **SessionKey 不要手打**。从 `sessions_list` 复制 `deliveryContext` 匹配的 key。 ### Step 3 — 执行与汇报 - B 的执行与产出都留在该 thread。 -- 需要上游(如 CTO)掌控节奏时,上游应在自己的协调 thread 里同步 checkpoint/closeout(见第 3 节)。 +- 上游在自己的协调 thread 里同步 checkpoint/closeout。 --- ## 2.5) 多轮 WAIT 纪律(实战验证) -当 A2A 任务需要多轮迭代时(大部分非 Q 类任务都是): +当 A2A 任务需要多轮迭代时: - **每轮只聚焦 1-2 个改动点**,完成后**必须 WAIT**。 -- **禁止一次性做完所有步骤**--等上游下一轮指令后再继续。 -- 每轮输出格式固定: +- **禁止一次性做完所有步骤**——等上游指令后再继续。 +- 每轮输出格式: ``` [<角色>] Round N/M Done: <做了什么> Run: <执行了什么命令> - Output: <关键输出,允许截断> + Output: <关键输出> WAIT: 等待上游指令 ``` -- 最终轮贴 closeout 到 thread,A2A reply 中回复 `REPLY_SKIP` 表示完成。 +- 最终轮贴 closeout,A2A reply 中回复 `REPLY_SKIP`。 ### Round0 审计握手(推荐) +在 Round1 前,先验证审计链路:要求目标 Agent 执行 `pwd` 并贴到 thread。看不到回传就停止。 + +--- + +## 2b) Discussion Mode(讨论模式 — Slack 多 Bot)[待 POC 验证] + +> Discussion 是 Delegation 的增强,不是替代。适用于需要多方实时讨论的场景。 +> 仅 Slack 平台支持。原因见 §7。 + +### 核心思路:选择性独立化 + +不需要每个 Agent 都有独立 Slack App。只需让**少数高价值的横向 Agent**(如 CoS、QA)拥有独立 App,然后**把它们拉进现有 Agent 的频道**进行协作: + +``` + 独立 
Slack App 共享 Slack App (现有) + ┌─────────┐ ┌─────────────────┐ + │ CoS-Bot │ │ Default-Bot │ + └────┬────┘ └───┬───┬───┬─────┘ + │ │ │ │ + 频道: #hq(home) #cto #build #cto #build #invest ... + ────────────────────────────────────────────────────── + Agent: CoS ← 进入协作 → CTO Builder CIO ... +``` + +**CoS-Bot 被拉进 #cto** → 直接在 CTO 的地盘对话 → 像两个人在同一间办公室讨论。 + +### 技术原理(源码验证) + +1. **Self-loop 按 account 隔离**:每个 Slack App 有独立 `botUserId`,OpenClaw 只过滤来自自己的消息(`message.user === ctx.botUserId`),不同 App 之间不互相过滤。 +2. **`allowBots: true`**:允许处理其他 bot 的消息。须在目标频道的 channel config 中开启。 +3. **Per-account channel config**:同一频道可以给不同 account 设置不同的 `requireMention`。例如 #cto 频道:CTO 的 account → `requireMention: false`(照常响应所有消息);CoS 的 account → `requireMention: true`(只在被 @mention 时响应)。 +4. **Thread participation 隐式 mention**:CoS-Bot 一旦在某个 thread 中发过消息,后续该 thread 的消息会触发 CoS 的隐式 mention,让对话可以持续进行。 + +### 协作流程 + +``` +用户在 #cto: "@CoS 请协调评审 X 功能的架构方案" + +CoS-Bot 收到 @mention → 进入 #cto thread + → CoS: "好的,我来协调。@CTO 请先提出你的方案。" + +CTO (Default-Bot) 收到消息(requireMention: false,在自己频道) + → CTO: "方案如下:... 
建议用方案 A。" + +CoS-Bot 收到(thread participation 隐式 mention) + → CoS: "@CTO 方案 A 的成本如何?另外 @Builder 请评估可行性。" + (CoS 决定下一个参与者——编排者角色) + +Builder (Default-Bot) 在 #cto 收到 @mention → 加载 thread 历史 + → Builder: "方案 A 工期约 2 周,有一个依赖需要先解决。" + +CoS-Bot 收到 → 综合意见 + → CoS: "DISCUSSION_CLOSE | 共识:采用方案 A,Builder 先处理依赖。" + → CoS 用 Delegation (sessions_send) 给 CTO 派正式任务 +``` + +### 关键约束 + +- **CoS 是编排者**:决定讨论节奏,谁下一个发言。CTO/Builder 是参与者,不主动 @mention 其他 Agent。 +- **轮次上限**:CoS 的 AGENTS.md 写明 `maxDiscussionTurns: 5`。到限后必须结束。 +- **Thread 隔离**:每个讨论 = 一个 thread。 +- **讨论后转 Delegation**:Discussion 的 Action Item 通过 Delegation 执行。 + +### Discussion 终止协议 + +讨论结束时,Orchestrator 发送: + +``` +DISCUSSION_CLOSE +Topic: <讨论主题> +Consensus: <共识 / "未达成共识"> +Actions: <后续 Delegation 任务列表,含 TID> +Participants: <参与 Agent 列表> +``` + +--- + +## 2c) 平台能力矩阵 + +| 能力 | Slack | Discord | Feishu | +|------|-------|---------|--------| +| Delegation | YES | YES | YES | +| Discussion | 待 POC 验证 | NO(OpenClaw 代码层阻塞) | NO(飞书平台限制) | +| Multi-Account | YES | YES | YES(注意 #47436) | +| Thread/Topic 隔离 | YES (native) | YES (auto-archive) | YES (groupSessionScope >= 2026.3.1) | + +**为什么 Discord 和 Feishu 不能用 Discussion?** -在正式 Round1 前,先做一个**极小的真实动作**验证审计链路: -- 要求目标 Agent 执行一个无副作用命令(如 `pwd`)并把结果贴到 Slack thread。 -- **看不到 Round0 回传就停止**--说明目标 Agent 的 session 可能没绑定 Slack(deliveryContext 落到 webchat),继续执行会导致"在跑但 Slack 不可审计"。 +- **Discord**:平台层面支持跨 bot 消息可见,但 OpenClaw 的 bot 消息过滤器(Issue #11199)将所有已配置 bot 视为"自己"并丢弃,导致 Bot-A 的消息被 Bot-B 的 handler 忽略。此外 `requireMention` 在多账户下也失效(Issue #45300)。两个 issue 均已关闭但未修复——属于 OpenClaw 代码层 bug,非平台限制。 +- **Feishu**:飞书 `im.message.receive_v1` 事件**仅投递用户发送的消息**,bot 发送的消息对其他 bot 完全不可见。这是飞书平台的 API 设计,无法通过 OpenClaw 配置绕过。 --- ## 3) 可见性(用户 必须能看到) -- 任务根消息必须在目标频道可见(root message 作为锚点)。 -- 关键 checkpoint(开始/阻塞/完成)至少更新 1 次。 -- **上游负责到底**:谁派单(例如 CTO 派给 Builder),谁负责在自己的协调 thread 里持续跟进: - - Builder thread 的输出由 CTO 通过 sessions_send 的 tool result 捕获 - - CTO 必须在 #cto 的对应协调 thread 里同步 checkpoint(避免 用户 去多个频道"捞信息") -- **双通道留痕**: 
- - A2A reply(给上游的结构化回复) - - Slack thread message(给用户可见的审计日志,格式 `[角色] 内容...`) - - **两者都要做**--A2A reply 只有上游能看到,thread message 用户才能看到。 -- 完成后必须 closeout(DoD 硬规则,缺一不可): - 1. 在目标 Agent thread 贴 closeout(产物路径 + 验证命令) - 2. **上游本机复核**(CLI-first):至少执行关键命令 + 贴 exit code - 3. **回发起方频道汇报**:同步最终结果 + 如何验证 + 风险遗留。**不做视为任务未完成** - 4. 通知 KO 沉淀(默认:同步到 #know + 触发 KO ingest) +- 任务根消息必须在目标频道可见。 +- 关键 checkpoint 至少更新 1 次。 +- **上游负责到底**:派单方在自己的频道同步 checkpoint。 +- **双通道留痕**:A2A reply(上游可见)+ Thread message(用户可见),两者都要做。 +- 完成后必须 closeout: + 1. 在目标 thread 贴 closeout + 2. 上游本机复核(CLI-first) + 3. 回发起方频道汇报(**不做视为未完成**) + 4. 通知 KO 沉淀 --- -## 4) 频道映射(约定) +## 4) 频道映射 -- #hq → CoS +- #hq → CoS(home) - #cto → CTO - #build → Builder - #invest → CIO - #know → KO - #ops → Ops - #research → Research -- #main(可选:你的主入口频道) → Main Agent(可选)(不属于本系统,但可作为 用户 的总入口) + +Discussion 模式下,CoS-Bot 进入其他 Agent 的频道(如 #cto、#build)进行协作,无需额外创建共享频道。 --- -## 5) 命名与并行 +## 5) Session Key 格式与命名 + +**一个任务 = 一个 thread = 一个 session。** -- **一个任务 = 一个 thread = 一个 session**。 -- 同一个频道可以并行多个任务 thread;不要在频道主线里混聊多个任务。 +| 平台 | Session Key | +|------|------------| +| Slack (thread) | `agent:<B>:slack:channel:<channelId>:thread:<root_ts>` | +| Discord (thread) | `agent:<B>:discord:channel:<channelId>:thread:<root_ts>` | +| Feishu (topic) | `agent:<B>:feishu:group:<chatId>:topic:<root_id>` | --- ## 6) 失败回退 -如果 Slack thread 行为异常: -- 退回到"单频道单任务":临时在频道主线完成该任务 -- 或让 CTO/CoS 在 thread 里发 /new 重置(开始新 session id) +| 模式 | 故障 | 回退 | +|------|------|------| +| Delegation | `sessions_send` timeout | 在 thread 补兜底消息;检查 session key | +| Delegation | Agent 无回复 | Round0 审计握手可提前发现;检查 deliveryContext | +| Discussion | Agent 未响应 @mention | 检查 `allowBots` + `requireMention` 配置;检查 bot 是否在频道 | +| Discussion | 讨论死循环 | Orchestrator 强制 DISCUSSION_CLOSE | +| Discussion | 平台不支持 | 降级为 Delegation(`sessions_send` 串联意见) | + +--- + +## 7) 已知限制与待验证 + +1. 
**Slack Discussion [待 POC 验证]**:各组件已通过源码验证(self-loop per-account、allowBots 三级 fallback、per-account channel config),但端到端链路尚无实测记录。 +2. **Discord Discussion [NO]**:OpenClaw Issues #11199(bot filter 全局化)+ #45300(requireMention 多账户失效),均已关闭未修复。 +3. **Feishu Discussion [NO]**:飞书 `im.message.receive_v1` 仅投递用户消息(平台限制,非 OpenClaw bug)。 +4. **Issue #15836**:OpenClaw 关闭了 Slack A2A routing 请求(NOT_PLANNED)。`sessions_send` 仍是官方推荐方式。Discussion 作为增强,非替代。 + +--- + +## 附录 A:Discussion Mode 配置指南(Slack) + +> 以下配置可由你的 OpenClaw agent 协助完成。人工操作仅需创建 Slack App 和邀请 bot。 + +### 人工操作(一次性) + +1. **创建独立 Slack App**(如 CoS-Bot): + - 前往 [api.slack.com/apps](https://api.slack.com/apps) → Create New App + - 启用 Socket Mode,获取 App Token (`xapp-`) + - 添加 Bot Token Scopes: `channels:history`, `channels:read`, `chat:write`, `users:read` + - 添加 Event Subscriptions: `message.channels`, `app_mention` + - 获取 Bot Token (`xoxb-`) + - 记录 Bot User ID(Settings → Basic Info → App Credentials,或在 Slack 中查看 bot 的 profile) + +2. **邀请 CoS-Bot 到目标频道**:在 #cto、#build 等频道中运行 `/invite @CoS-Bot` + +### Agent 可执行的配置(发给你的 OpenClaw) + +> 以下是给 OpenClaw agent 的执行提示。将凭证替换为实际值后,发送给你的 agent。 + +``` +请帮我配置 Discussion Mode。 + +CoS-Bot 凭证(写入配置,不要回显): +- Bot Token: xoxb-cos-xxx +- App Token: xapp-cos-xxx + +请在 openclaw.json 中执行以下增量修改(不要覆盖现有配置): + +1. 在 channels.slack 下添加 accounts 块: + - 将现有的 botToken/appToken 移入 accounts.default + - 添加 accounts.cos(使用上面的凭证) + +2. 添加 CoS 的 account binding: + { "agentId": "cos", "match": { "channel": "slack", "accountId": "cos" } } + +3. 在需要跨 bot 协作的频道配置中添加 allowBots: true: + channels.slack.accounts.default.channels.<CTO_CHANNEL_ID>.allowBots = true + channels.slack.accounts.cos.channels.<CTO_CHANNEL_ID>.requireMention = true + channels.slack.accounts.cos.channels.<CTO_CHANNEL_ID>.allowBots = true + +4. 不要修改现有的 agent bindings、models、auth、gateway 配置。 + +5. 
重启 gateway 并验证 CoS-Bot 在 #cto 中可以被 @mention 触发。 +``` + +### 配置结构参考 + +```jsonc +{ + "channels": { + "slack": { + "accounts": { + "default": { + "botToken": "${SLACK_BOT_TOKEN}", + "appToken": "${SLACK_APP_TOKEN}" + }, + "cos": { + "botToken": "${SLACK_BOT_TOKEN_COS}", + "appToken": "${SLACK_APP_TOKEN_COS}", + "channels": { + "<CTO_CHANNEL_ID>": { + "requireMention": true, // CoS 在 #cto 只响应 @mention + "allowBots": true // CoS 能看到 CTO 的回复 + } + } + } + }, + "channels": { + "<CTO_CHANNEL_ID>": { + "allow": true, + "allowBots": true // CTO 能看到 CoS-Bot 的消息 + } + }, + "thread": { + "historyScope": "thread", + "initialHistoryLimit": 50 + } + } + }, + "bindings": [ + // CoS: account-level binding + { "agentId": "cos", "match": { "channel": "slack", "accountId": "cos" } }, + // 执行层 agents: peer-level binding(现有,不变) + { "agentId": "cto", "match": { "channel": "slack", "peer": { "kind": "channel", "id": "<CTO_CHANNEL_ID>" } } } + // ... builder, cio, ko, ops, research 同理 + ] +} +``` + +--- + +## 附录 B:POC 验证步骤 + +``` +测试 1:基础跨 bot 消息传递 + - 在 #cto 中用人类账号 @mention CoS-Bot + - 验证 CoS 响应 + - 验证 CTO 不会"抢答"(如果 CTO 也回复了,属正常——requireMention: false) + +测试 2:CoS 和 CTO 在 thread 中对话 + - CoS 在 #cto thread 中 @mention CTO(或直接发消息,CTO 的 requireMention 为 false) + - 验证 CTO 回复 + - 验证 CoS 收到 CTO 的回复(thread participation 隐式 mention) + - 验证 CoS 可以继续对话(第二轮) + +测试 3:轮次控制 + - 在 CoS 的 AGENTS.md 中设定 maxDiscussionTurns: 3 + - 发起讨论 + - 验证 CoS 在第 3 轮后发布 DISCUSSION_CLOSE + +测试 4:多 Agent 讨论 + - 邀请 CoS-Bot 到 #build + - 在 #cto thread 中 CoS @mention Builder + - 验证 Builder 收到并响应 + - 验证 CoS 能综合 CTO + Builder 的意见 +``` From 83f631b2eec50892acc74c8631476037656a97bb Mon Sep 17 00:00:00 2001 From: Alex's Mac <alexmac@AlexsdeMac-mini-2.local> Date: Thu, 2 Apr 2026 21:47:02 +0800 Subject: [PATCH 2/4] =?UTF-8?q?feat:=20A2A=20v2=20=E2=80=94=20verified=20D?= =?UTF-8?q?iscussion=20Mode=20+=20setup=20guide=20+=20harness=20design=20i?= =?UTF-8?q?ntegration?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 
Content-Transfer-Encoding: 8bit

Based on 2026-03-30 ~ 2026-04-02 live testing:

Config experience (Block 1):
- Multi-account setup with accounts.default requirement
- Complete Slack App manifest (all scopes + events)
- Binding with accountId+peer for routing
- P1 incident lesson: accounts.default must be explicit

Collaboration experience (Block 2):
- requireMention: true bypassed in threads by implicitMention
- allowBots: 'mentions' only works on Discord, not Slack
- Two-layer defense: Config (channel level) + Prompt rules (thread level)
- Explicit @mention protocol for message routing
- File-based collaboration from Anthropic Harness Design
- Orchestrator = Planner + Evaluator (separation of concerns)

Files:
- docs/A2A_SETUP_GUIDE.md: new — complete setup + collaboration guide
- shared/A2A_PROTOCOL.md: updated — [待 POC 验证] → [已验证]
- docs/CONCEPTS.md: updated — verified status + harness design context
- docs/en/CONCEPTS.md: updated — English sync
---
 .harness/reports/architecture_collab_r1.md   | 1099 +++++++++++++++++
 .harness/reports/architecture_protocol_r1.md |  918 ++++++++++++++
 .harness/reports/qa_a2a_research_r1.md       |  237 ++++
 .harness/reports/qa_docs_official_r1.md      |  217 ++++
 .../reports/research_autonomous_slack_r1.md  |  567 +++++++++
 .harness/reports/research_discord_r1.md      |  382 ++++++
 .harness/reports/research_feishu_r1.md       |  433 +++++++
 .../research_platform_limitations_r1.md      |  109 ++
 .../reports/research_selective_agents_r1.md  |  501 ++++++++
 .harness/reports/research_slack_r1.md        |  382 ++++++
 .harness/reports/verify_source_code_r1.md    |  389 ++++++
 CLAUDE.md                                    |   98 ++
 docs/A2A_SETUP_GUIDE.md                      |  442 ++++---
 docs/CONCEPTS.md                             |   38 +-
 docs/en/CONCEPTS.md                          |   38 +-
 shared/A2A_PROTOCOL.md                       |  189 ++-
 16 files changed, 5755 insertions(+), 284 deletions(-)
 create mode 100644 .harness/reports/architecture_collab_r1.md
 create mode 100644 .harness/reports/architecture_protocol_r1.md
 create mode 100644 .harness/reports/qa_a2a_research_r1.md
 create mode 100644
.harness/reports/qa_docs_official_r1.md
 create mode 100644 .harness/reports/research_autonomous_slack_r1.md
 create mode 100644 .harness/reports/research_discord_r1.md
 create mode 100644 .harness/reports/research_feishu_r1.md
 create mode 100644 .harness/reports/research_platform_limitations_r1.md
 create mode 100644 .harness/reports/research_selective_agents_r1.md
 create mode 100644 .harness/reports/research_slack_r1.md
 create mode 100644 .harness/reports/verify_source_code_r1.md
 create mode 100644 CLAUDE.md

diff --git a/.harness/reports/architecture_collab_r1.md b/.harness/reports/architecture_collab_r1.md
new file mode 100644
index 0000000..65bf3f0
--- /dev/null
+++ b/.harness/reports/architecture_collab_r1.md
@@ -0,0 +1,1099 @@
+commit 7e825263db36aef68792a050c324daef598b4c56
+Author: Alex's Mac <alexmac@AlexsdeMac-mini-2.local>
+Date:   Sat Mar 28 17:38:48 2026 +0800
+
+    feat: add A2A v2 research harness, architecture, and agent definitions
+
+    Multi-agent harness for researching and designing A2A v2 protocol:
+
+    Research reports (Phase 1):
+    - Slack: true multi-agent collaboration via multi-account + @mention
+    - Feishu: groupSessionScope + platform limitation analysis
+    - Discord: multi-bot routing + Issue #11199 blocker analysis
+
+    Architecture designs (Phase 2):
+    - A2A v2 Protocol: Delegation (v1) + Discussion (v2) dual-mode
+    - 5 collaboration patterns: Architecture Review, Strategic Alignment,
+      Code Review, Incident Response, Knowledge Synthesis
+    - 3-level orchestration: Human → Agent → Event-Driven
+    - Platform configs, migration guides, 6 ADRs
+
+    Agent definitions for Claude Code Agent Teams:
+    - researcher.md, architect.md, doc-fixer.md, qa.md
+
+    QA verification: all issues resolved, PASS verdict after fixes.
+ + Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> + +diff --git a/.harness/reports/architecture_collab_r1.md b/.harness/reports/architecture_collab_r1.md +new file mode 100644 +index 0000000..8ff71a1 +--- /dev/null ++++ b/.harness/reports/architecture_collab_r1.md +@@ -0,0 +1,1066 @@ ++# Architecture Report: Multi-Agent Collaboration Patterns (U0/U3, Round 1) ++ ++> Architect: Claude Opus 4.6 | Date: 2026-03-27 | Contract: `.harness/contracts/architecture-a2a.md` ++ ++--- ++ ++## Executive Summary ++ ++本报告定义了 OpenCrew 从"委派式 A2A"演化为"协作式 A2A"的完整架构设计。核心洞察:当前 A2A 是**单向委派**(A 发任务包给 B,B 执行后汇报),我们需要的是**多方协作**(多个 Agent 在同一 thread 中各自带入领域判断,互相挑战和完善)。 ++ ++设计产出五个部分: ++1. **协作模式目录** -- 5 种可落地的协作模式,每种含完整机制描述 ++2. **编排模型** -- 3 级编排层次,从人工驱动到事件驱动 ++3. **共享 Thread 协议** -- 多 Agent thread 的命名、轮次、终止、升级规范 ++4. **Harness 集成** -- 文件式(harness)与聊天式(OpenCrew)协作的映射关系 ++5. **配置模板** -- 可直接使用的 Slack 多账号配置片段 ++ ++**平台支持矩阵**(贯穿全文): ++ ++| 平台 | 多方协作 | 阻塞因素 | 替代方案 | ++|------|---------|---------|---------| ++| **Slack** | NOW | 无 -- `allowBots` + `requireMention` + multi-account 已就绪 | -- | ++| **Discord** | BLOCKED | Issue #11199(同实例 bot 消息互相过滤) | `sessions_send` 委派模式可用 | ++| **Feishu** | NOT POSSIBLE | 平台限制:`im.message.receive_v1` 仅投递用户消息,bot 消息对其他 bot 不可见 | `sessions_send` + 话题隔离可用 | ++ ++--- ++ ++## 1. 
Collaboration Patterns Catalog ++ ++### Pattern 1: Architecture Review(架构评审) ++ ++**描述**:CTO 提出技术方案,Builder 从可行性角度质疑,QA/Ops 识别风险,迭代至收敛。这是"提案-挑战-精炼"循环。 ++ ++**适用场景**: ++- 实现重大功能前的技术方案评审 ++- 引入新依赖或架构变更 ++- 跨系统集成方案论证 ++ ++**参与者**: ++| Agent | 角色 | 贡献 | ++|-------|------|------| ++| CTO | 提案者 | 提出架构方案、回应质疑、迭代设计 | ++| Builder | 可行性评审者 | 从实现角度评估复杂度、工期、技术债 | ++| Ops | 风险评审者 | 评估运维影响、安全风险、回滚策略 | ++| 用户 | 决策者 | 定目标、验收、在分歧时拍板 | ++ ++**机制(step-by-step)**: ++ ++``` ++Step 1: 用户(或 CoS)在 #collab 频道发帖 ++ "我们需要评审 X 功能的技术方案" ++ → CTO 作为频道绑定 Agent 自动收到(或被 @mention) ++ ++Step 2: CTO 发布架构提案(在 thread 中) ++ 格式:[CTO] Proposal: <标题> ++ 内容:目标、方案、技术选型、风险评估 ++ ++Step 3: 用户 @mention Builder ++ "@Builder 请从实现角度评估这个方案" ++ → Builder 的 bot 收到 mention event ++ → Builder 加载 thread 历史(通过 initialHistoryLimit) ++ → Builder 发布可行性评估 ++ ++Step 4: 用户 @mention Ops(如需要) ++ "@Ops 请评估运维和安全影响" ++ → Ops 加载 thread 历史,看到 CTO 提案 + Builder 评估 ++ → Ops 发布风险评估 ++ ++Step 5: 用户 @mention CTO ++ "@CTO 请回应 Builder 和 Ops 的反馈" ++ → CTO 看到所有历史,发布修订方案或反驳 ++ ++Step 6: 重复 Step 3-5 直到收敛 ++ 收敛标志:所有参与者表示"无进一步反对"或用户拍板 ++ ++Step 7: CTO 发布最终决议总结(thread 内) ++ 格式:[CTO] DECISION: <结论> ++ 内容:最终方案、遗留风险、下一步 action ++``` ++ ++**平台支持**: ++- **Slack**: NOW -- 多账号 + `allowBots: true` + `requireMention: true` ++- **Discord**: AFTER #11199 -- 修复后机制完全相同 ++- **Feishu**: NOT SUPPORTED -- bot 消息对其他 bot 不可见;替代:用户手动转述或 `sessions_send` 逐步委派 ++ ++**防护栏**: ++- **循环防护**:设定 `maxDiscussionTurns = 5`(AGENTS.md 指令层)。超过 5 轮未收敛,必须升级到用户拍板 ++- **噪音防护**:所有共享频道 `requireMention: true`,Agent 不被 @mention 就不响应 ++- **上下文溢出防护**:每位 Agent 的发言不超过 500 字。超过时拆分为"摘要 + 详细附件链接" ++- **发散防护**:每轮必须聚焦 1 个议题。CTO 作为提案者负责"下一轮聚焦什么" ++ ++**示例场景**: ++ ++> 用户想给 OpenCrew 增加"自动 PR 创建"功能。 ++> ++> 1. 用户在 #collab: "评审自动 PR 创建功能的技术方案" ++> 2. CTO: "建议用 GitHub API + Builder Agent 的 bash tool 实现。架构:CoS 收到用户请求 → 转 CTO 拆解 → Builder 执行 git/gh 命令 → Builder closeout 含 PR URL" ++> 3. @Builder: "git 操作需要 repo 写权限。当前 Builder 的 tool 配置没有 gh auth。需要增加 GITHUB_TOKEN 环境变量。工期估计:配置 30 分钟,测试 1 小时" ++> 4. 
@Ops: "风险:GITHUB_TOKEN 泄露到 closeout 日志。建议:使用 SecretRef 而非明文。另外需要 L3 用户确认(创建 PR 是可回滚但影响较大的动作)" ++> 5. @CTO: "接受两点反馈。修订:使用 SecretRef 管理 token;PR 创建前在 thread 里 WAIT 用户确认(L2→L3 升级)" ++> 6. 用户: "APPROVED。CTO 请拆解任务给 Builder" ++> 7. CTO 发布 DECISION 总结 → 转入 A2A v1 委派流程给 Builder ++ ++--- ++ ++### Pattern 2: Strategic Alignment(战略对齐) ++ ++**描述**:用户表达高层目标,CoS 解读深层意图,CTO 规划技术路径,CIO 补充领域约束,多方收敛出可执行计划。 ++ ++**适用场景**: ++- 新项目/新方向启动 ++- 季度 OKR 拆解 ++- 重大方向转变(pivot) ++ ++**参与者**: ++| Agent | 角色 | 贡献 | ++|-------|------|------| ++| CoS | 意图解读者 | 澄清用户真实意图、优先级、价值判断 | ++| CTO | 技术规划者 | 将意图转化为技术路径、评估可行性 | ++| CIO | 领域约束者 | 补充领域知识、市场约束、合规要求 | ++| 用户 | 决策者 | 定方向、校正偏差 | ++ ++**机制(step-by-step)**: ++ ++``` ++Step 1: 用户在 #hq 频道发布目标 ++ "我想做 X,因为 Y"(可以是模糊的一句话) ++ → CoS 作为 #hq 绑定 Agent 首先响应 ++ ++Step 2: CoS 解读意图 ++ 格式:[CoS] Intent Alignment ++ 内容:我理解的目标、隐含假设、需要确认的点 ++ → 用户确认/校正 ++ ++Step 3: CoS @mention CTO(在同一 thread) ++ "@CTO 基于以上对齐的目标,请规划技术路径" ++ → CTO 加载 thread(看到用户原始目标 + CoS 解读 + 用户确认) ++ → CTO 发布技术方案草案 ++ ++Step 4: CoS @mention CIO(如涉及领域) ++ "@CIO 请补充领域约束和市场视角" ++ → CIO 加载 thread,看到全部上下文 ++ → CIO 发布领域分析 ++ ++Step 5: CoS 综合所有输入,产出对齐摘要 ++ 格式:[CoS] Alignment Summary ++ 内容:确认的目标、技术路径、领域约束、优先级排序、下一步 ++ → 用户最终确认 ++ ++Step 6: 确认后,CoS 通过 A2A v1 委派给 CTO 执行拆解 ++``` ++ ++**关键设计决策**:CoS 作为编排者而非信息中转。CoS 的价值不是"传话",而是在每一轮中加入自己的判断——"用户说了 X,但我认为真正的需求是 Y"。这与 ARCHITECTURE.md 中"CoS 是战略伙伴不是秘书"的定位一致。 ++ ++**平台支持**: ++- **Slack**: NOW ++- **Discord**: AFTER #11199 ++- **Feishu**: NOT SUPPORTED(替代:CoS 在各 Agent 群组间用 `sessions_send` 逐步推进,手动综合) ++ ++**防护栏**: ++- **意图漂移防护**:CoS 每轮输出必须包含"与用户原始目标的对齐度"评估 ++- **过度规划防护**:战略对齐最多 3 轮。3 轮后必须产出可执行的 next action ++- **CTO/CIO 范围防护**:CTO 只谈技术、CIO 只谈领域。越界时 CoS 有权引导回正题 ++ ++**示例场景**: ++ ++> 用户: "我想让 OpenCrew 支持自动处理 GitHub Issues" ++> ++> 1. CoS: "理解目标。我的解读:你想让 Agent 自动 triage issues、分类、分配。但我想确认——是全自动(Agent 直接处理)还是半自动(Agent 分析后等你确认)?另外范围是所有 repo 还是特定 repo?" ++> 2. 用户: "先做半自动,只针对 opencrew repo" ++> 3. CoS @CTO: "目标确认:半自动 GitHub Issues triage for opencrew repo。请规划技术路径。" ++> 4. 
CTO: "方案:用 GitHub Webhook → OpenClaw channel → Ops Agent 接收。Ops 分析 issue 后产出建议(分类 + 优先级 + 建议处理者),发到 #ops 等用户确认。技术需要:新增 GitHub channel 配置、Ops AGENTS.md 增加 triage 指令。" ++> 5. CoS @CIO: "从项目管理视角,有什么分类标准建议?" ++> 6. CIO: "建议分类:bug/feature/docs/question。优先级用 impact x urgency 矩阵。注意:外部贡献者的 issue 应该比内部的优先响应(社区建设)" ++> 7. CoS 综合: "Alignment Summary: 目标=半自动 issue triage for opencrew。路径=GitHub Webhook + Ops Agent。分类=bug/feature/docs/question。优先级=impact x urgency。社区 issue 优先。下一步:CTO 拆解任务。" → 用户确认 → 委派 ++ ++--- ++ ++### Pattern 3: Code/Design Review(代码/设计评审) ++ ++**描述**:Builder 产出代码或设计文档,CTO 评审架构合理性,QA/Ops 评审正确性和安全性,KO 检查知识一致性。这是"生产-多维评审-修订"循环。 ++ ++**适用场景**: ++- PR 合并前评审 ++- 设计文档评审 ++- 配置变更评审 ++- 知识库内容评审 ++ ++**参与者**: ++| Agent | 角色 | 贡献 | ++|-------|------|------| ++| Builder | 生产者 | 产出代码/文档,根据反馈修订 | ++| CTO | 架构评审者 | 评审架构合理性、设计模式、长期可维护性 | ++| Ops | 正确性/安全评审者 | 评审安全风险、运维影响、合规性 | ++| KO | 知识一致性评审者 | 检查与已有知识(principles/patterns/scars)的一致性 | ++ ++**机制(step-by-step)**: ++ ++``` ++Step 1: Builder 在 #build thread 完成实现,产出 closeout ++ closeout 包含:产出物路径、变更摘要、验证命令 ++ ++Step 2: CTO 在 #build thread 中 @mention(或 CTO 主动发起评审 thread) ++ → 评审在一个"评审 thread"中进行(可以在 #collab 或 #cto) ++ → CTO 发布架构评审 ++ ++Step 3: @Ops 评审安全/运维 ++ → Ops 看到 Builder 的 closeout + CTO 的评审 ++ → 发布安全/运维评估 ++ ++Step 4: @KO 评审知识一致性(如涉及系统变更) ++ → KO 看到全部上下文 ++ → 检查是否与 principles.md/patterns.md/scars.md 冲突 ++ → 如有冲突则指出 ++ ++Step 5: CTO 综合所有评审意见 ++ 格式:[CTO] Review Summary ++ 状态:APPROVED / NEEDS_REVISION / REJECTED ++ 如 NEEDS_REVISION:列出具体修改项 → 回到 Builder ++ ++Step 6: Builder 修订后重新提交(thread 内) ++ → 重复 Step 2-5(通常 1-2 轮即可收敛) ++``` ++ ++**与 Harness Evaluator 的对比**: ++ ++Anthropic harness 设计中的 Evaluator 是单一评审者,采用"Generator vs Evaluator"对抗模式。OpenCrew 的评审是**多维度评审**——不同 Agent 从不同维度审查同一产出。这更接近真实团队中"架构师看设计、安全团队看漏洞、文档团队看一致性"的工作方式。 ++ ++**平台支持**: ++- **Slack**: NOW ++- **Discord**: AFTER #11199 ++- **Feishu**: NOT SUPPORTED(替代:CTO 手动在各群组间转述评审结论,或人工触发 `sessions_send`) ++ ++**防护栏**: ++- **评审范围限定**:每位评审者只评审自己领域。CTO 不评审安全,Ops 不评审架构 ++- 
**评审轮次限制**:最多 3 轮修订。3 轮后要么 APPROVED 要么升级到用户决策 ++- **上下文容量管理**:`initialHistoryLimit` 建议设为 80-100(评审 thread 内容较多) ++- **评审格式标准化**:每条评审必须包含 `severity: [BLOCKER|MAJOR|MINOR|NITPICK]` + `具体修改建议` ++ ++**示例场景**: ++ ++> Builder 完成了 A2A_PROTOCOL.md v2 的草案。 ++> ++> 1. Builder closeout: "A2A_PROTOCOL_V2.md 已产出,路径 shared/A2A_PROTOCOL_V2.md。变更:新增多 bot 模式、简化触发流程、新增协作模式引用。" ++> 2. @CTO: "架构评审:新增的 multi-bot 模式逻辑清晰。但第 3 节协作模式引用缺少 Feishu 的降级方案。MAJOR:请补充。另外命名建议:A2A v1/v2 改为 delegation-mode/discussion-mode,避免暗示 v1 被废弃。MINOR。" ++> 3. @Ops: "安全评审:multi-bot 配置示例中 botToken 是明文。BLOCKER:必须改为 SecretRef 或环境变量引用。另外 allowBots: true 的安全含义需要在文档中明确警告。MAJOR。" ++> 4. @KO: "知识一致性检查:与 scars.md 中'sessions_send timeout 不等于未送达'的记录一致。与 patterns.md 中'一个任务 = 一个 thread = 一个 session'的原则需要更新——discussion-mode 中一个 thread 可能对应多个 Agent 的 session。MAJOR:建议更新 patterns.md。" ++> 5. CTO Review Summary: "NEEDS_REVISION。3 项需修改:Feishu 降级方案、token 安全化、patterns.md 更新。" ++> 6. Builder 修订 → 重新提交 → 第二轮评审 → APPROVED ++ ++--- ++ ++### Pattern 4: Incident Response(事件响应) ++ ++**描述**:Ops 检测到异常,CTO 诊断根因,Builder 提出修复方案,QA 验证修复。快速迭代直至解决。 ++ ++**适用场景**: ++- 生产环境异常(Agent 无响应、消息路由错误、配置漂移) ++- A2A 通信故障 ++- 安全事件 ++ ++**参与者**: ++| Agent | 角色 | 贡献 | ++|-------|------|------| ++| Ops | 检测 + 初步分析 | 发现问题、收集初步证据、触发响应流程 | ++| CTO | 诊断者 | 分析根因、确定影响范围、决定修复策略 | ++| Builder | 修复者 | 实施修复、验证修复 | ++| 用户 | 审批者 | L3 动作审批(如需要) | ++ ++**机制(step-by-step)**: ++ ++``` ++Step 1: Ops 在 #ops 检测到异常(手动或自动) ++ → Ops 在 #collab 创建事件响应 thread ++ 格式:[Ops] INCIDENT: <标题> | Severity: P1/P2/P3 ++ 内容:症状描述、初步证据(日志/错误信息)、影响范围 ++ ++Step 2: @CTO 诊断 ++ → CTO 加载 thread,分析 Ops 提供的证据 ++ → 发布诊断结果:根因假设、影响范围评估、建议修复策略 ++ → 如需更多信息:@Ops "请补充 X 日志" ++ ++Step 3: @Builder 修复(如诊断明确) ++ → CTO 在 thread 中 @Builder 并附带修复方案 ++ → Builder 实施修复、在 thread 中报告修复步骤和验证结果 ++ ++Step 4: @Ops 验证 ++ → Ops 验证修复后系统状态 ++ → 发布验证结果:修复有效 / 部分有效 / 无效 ++ ++Step 5: 如修复无效 → 回到 Step 2 ++ 如修复有效 → CTO 发布事件总结 ++ ++Step 6: CTO 发布 INCIDENT RESOLVED ++ 格式:[CTO] INCIDENT RESOLVED: <标题> ++ 内容:根因、修复措施、遗留风险、防复发建议 ++ → 同步到 KO(signal 
≥ 2,记入 scars.md) ++``` ++ ++**紧急性设计**:事件响应与其他协作模式的关键区别是**时间压力**。机制设计体现为: ++- **快速启动**:Ops 直接 @mention CTO,不需要 CoS 中转 ++- **并行诊断**:CTO 可以同时 @mention Builder 准备修复环境 ++- **简化格式**:允许短消息,不强制完整的任务包格式 ++- **L3 快速通道**:P1 事件中,用户可以预授权"Builder 可以执行通常需要确认的操作" ++ ++**平台支持**: ++- **Slack**: NOW ++- **Discord**: AFTER #11199(事件响应对实时性要求最高,Discord 解决后应优先支持此模式) ++- **Feishu**: PARTIAL -- Ops 可以在各群组间用 `sessions_send` 分步协调,但缺乏所有参与者共同可见的统一 thread ++ ++**防护栏**: ++- **升级时限**:P1 事件如果 3 轮内未解决(约 15 分钟),自动升级到用户 ++- **操作审计**:所有修复操作必须在 thread 中明文记录(命令 + 输出) ++- **回滚预案**:Builder 每次修复前必须声明回滚步骤 ++- **事后复盘**:INCIDENT RESOLVED 后 24 小时内必须完成 postmortem(可由 KO 协助) ++ ++**示例场景**: ++ ++> Ops 发现 Builder Agent 在 #build 频道不响应。 ++> ++> 1. Ops: "INCIDENT: Builder Agent 无响应 | Severity: P2。症状:#build 频道 @Builder 无反应已 30 分钟。初步检查:gateway 进程正常、Slack WebSocket 连接正常。" ++> 2. @CTO: "诊断:可能原因:(1) Builder 的 session 卡在长任务中 (2) binding 配置问题 (3) Slack app token 过期。请 @Ops 检查 sessions_list 中 Builder 的活跃 session 数量和最近活动时间。" ++> 3. @Ops: "sessions_list 结果:Builder 有 3 个活跃 session,最近一个 45 分钟前创建,状态 active。看起来是卡在长任务。" ++> 4. CTO: "确认根因:Builder session 卡在长任务。修复策略:(A) 等待当前任务完成 (B) 重置卡住的 session。建议 A,如果 30 分钟后仍无响应再执行 B。@Builder 你当前在执行什么任务?" ++> 5. Builder(恢复后): "刚完成一个大文件的 git 操作,耗时 40 分钟。已恢复正常。" ++> 6. 
CTO: "INCIDENT RESOLVED: Builder 因长任务阻塞导致短暂无响应。根因:大文件 git 操作超过预期耗时。防复发:在 Builder AGENTS.md 中增加'长任务必须每 10 分钟发 checkpoint'的指令。" ++ ++--- ++ ++### Pattern 5: Knowledge Synthesis(知识综合) ++ ++**描述**:KO 呈现提炼后的知识(从 closeout 中提取),CTO 验证技术准确性,CIO 验证领域准确性,CoS 评估战略相关性。 ++ ++**适用场景**: ++- 周期性知识复盘(每周/每月) ++- 新知识条目入库前的交叉验证 ++- 知识库重大更新 ++ ++**参与者**: ++| Agent | 角色 | 贡献 | ++|-------|------|------| ++| KO | 呈现者 | 提炼知识、组织结构、提出入库建议 | ++| CTO | 技术验证者 | 验证技术内容的准确性和时效性 | ++| CIO | 领域验证者 | 验证领域知识的准确性和适用性 | ++| CoS | 战略评估者 | 评估知识的战略相关性和优先级 | ++ ++**机制(step-by-step)**: ++ ++``` ++Step 1: KO 在 #know 或 #collab 创建知识综合 thread ++ 格式:[KO] Knowledge Synthesis: <主题/周期> ++ 内容: ++ - 新增原则(candidates for principles.md) ++ - 新增模式(candidates for patterns.md) ++ - 新增教训(candidates for scars.md) ++ - 建议变更(对现有条目的更新) ++ ++Step 2: @CTO 技术验证 ++ → CTO 逐条验证技术内容 ++ → 标注:ACCURATE / OUTDATED / NEEDS_CONTEXT / INCORRECT ++ → 对 NEEDS_CONTEXT 和 INCORRECT 提供修正建议 ++ ++Step 3: @CIO 领域验证(如涉及领域知识) ++ → CIO 验证领域内容 ++ → 同样标注 + 修正建议 ++ ++Step 4: @CoS 战略评估 ++ → CoS 评估每条知识的战略权重 ++ → 标注:HIGH_VALUE / USEFUL / LOW_VALUE / IRRELEVANT ++ → 对 HIGH_VALUE 建议"升级为原则"或"影响后续规划" ++ ++Step 5: KO 综合所有反馈 ++ → 发布最终版本:哪些入库、哪些修改、哪些丢弃 ++ → 执行知识库更新 ++ → 发布 closeout(signal ≥ 2) ++``` ++ ++**与现有知识管道的集成**: ++ ++当前 KNOWLEDGE_PIPELINE.md 定义了 closeout → KO ingest 的单向流。Knowledge Synthesis 模式将其扩展为**双向验证**:KO 不仅是接收者,还是综合者,主动发起交叉验证。这增加了知识库的可靠性。 ++ ++**平台支持**: ++- **Slack**: NOW ++- **Discord**: AFTER #11199 ++- **Feishu**: NOT SUPPORTED(替代:KO 在各 Agent 群组逐一发送验证请求,手动综合结果) ++ ++**防护栏**: ++- **验证粒度**:每次综合不超过 10 条知识条目(避免评审者认知过载) ++- **频率控制**:每周最多 1 次全面综合。临时入库可以走简化流程(KO 自行决定,signal < 2 不需要交叉验证) ++- **否决权**:CTO 对技术内容有否决权,CIO 对领域内容有否决权。被否决的条目不入库 ++ ++**示例场景**: ++ ++> KO 每周五执行知识综合。 ++> ++> 1. KO: "Knowledge Synthesis: Week 12。新增候选条目:(1) Principle: 'A2A sessions_send timeout 不等于失败,必须有兜底消息' (2) Pattern: '多 Agent 评审用 severity 标签分级' (3) Scar: 'Feishu bot 消息对其他 bot 不可见,跨 bot 触发必须走 sessions_send'" ++> 2. 
@CTO: "(1) ACCURATE,已在实战中多次验证。(2) ACCURATE,建议补充 severity 定义。(3) ACCURATE,这是 Feishu 平台限制非 OpenClaw bug。" ++> 3. @CIO: "本周无领域相关条目,SKIP。" ++> 4. @CoS: "(1) HIGH_VALUE:影响所有 A2A 流程设计。(2) USEFUL:评审效率提升。(3) HIGH_VALUE:影响 Feishu 多 Agent 架构决策。" ++> 5. KO: "综合结果:3 条全部入库。(1) → principles.md (2) → patterns.md,补充 severity 定义后入库 (3) → scars.md。已更新。" ++ ++--- ++ ++## 2. Orchestration Model ++ ++### Level 1: Human Orchestrated(人工编排) ++ ++**描述**:用户手动 @mention Agent 驱动讨论。用户完全控制节奏、话题、参与者顺序。 ++ ++**机制**: ++1. 用户在共享频道/thread 中发帖 ++2. 用户通过 @mention 指定下一位发言的 Agent ++3. Agent 响应后 WAIT,不主动 @mention 其他 Agent ++4. 用户阅读响应后,决定下一步:@mention 另一位 Agent、要求当前 Agent 深入、或结束讨论 ++ ++**配置要求**: ++```json ++{ ++ "channels": { ++ "slack": { ++ "channels": { ++ "<COLLAB_CHANNEL_ID>": { ++ "allow": true, ++ "requireMention": true, ++ "allowBots": true ++ } ++ } ++ } ++ } ++} ++``` ++ ++**防护栏**: ++- 所有 Agent 的 AGENTS.md 中增加指令:`"在协作 thread 中,响应后 WAIT。不要主动 @mention 其他 Agent,除非处于 Level 2 编排模式。"` ++- `requireMention: true` 是硬约束:即使 Agent "想"响应也触发不了(无 mention = 无 event) ++ ++**推荐成熟度**:初始部署阶段。建立信任期。用户需要理解每位 Agent 的判断质量和领域能力。 ++ ++**优势**:完全可控、无限循环风险为零、用户随时可以重定向讨论 ++**劣势**:用户成为瓶颈,每一步都需要人工输入 ++ ++--- ++ ++### Level 2: Agent Orchestrated(Agent 编排) ++ ++**描述**:指定一位"编排 Agent"(CTO 负责技术讨论、CoS 负责战略讨论),该 Agent 有权 @mention 其他 Agent 并管理讨论节奏。 ++ ++**机制**: ++1. 用户启动讨论并指定编排者:"@CTO 请主持这次架构评审,涉及 @Builder 和 @Ops" ++2. 编排 Agent(CTO)分析需求,决定第一步找谁 ++3. CTO @mention Builder: "请评估可行性" ++4. Builder 响应后(`allowBots: true` 使 CTO 能看到 Builder 的消息),CTO 决定下一步 ++5. CTO @mention Ops: "请评估安全影响" ++6. CTO 综合后发布结论或继续迭代 ++7. 如需用户决策,CTO @mention 用户 ++ ++**配置要求**: ++```jsonc ++// 编排者 Agent 的 AGENTS.md 增加以下指令段: ++// ## 协作编排模式(Level 2) ++// 当用户指定你为编排者时: ++// 1. 分析参与者列表和讨论目标 ++// 2. 按逻辑顺序 @mention 参与者 ++// 3. 每位参与者响应后,判断:需要更多输入?收敛了?有分歧需要用户裁决? ++// 4. 最多 {maxOrchestratedRounds} 轮后必须产出结论或升级到用户 ++// 5. 
每轮 @mention 最多 1 位 Agent(避免并发冲突) ++``` ++ ++技术实现的关键点(编排者如何看到其他 Agent 的响应): ++- 编排者的 bot 在共享频道中,`allowBots: true` 使其收到其他 bot 的消息 ++- `requireMention: true` 确保编排者不会对非相关消息响应 ++- 编排者的 AGENTS.md 指令决定何时主动 @mention 下一位参与者 ++ ++**防护栏**: ++- **轮次硬限制**:`maxOrchestratedRounds = 8`(AGENTS.md 指令层)。超过时编排者必须产出"当前最佳结论 + 遗留分歧" ++- **沉默检测**:如果被 @mention 的 Agent 30 秒内无响应,编排者应在 thread 中标注 `[TIMEOUT: <Agent> 未响应]` 并继续 ++- **用户干预**:用户随时可以在 thread 中发言,所有 Agent 看到用户消息后应暂停自动编排,等待用户指示 ++- **编排者不自封**:编排者角色由用户指定,Agent 不能自行升级为编排者 ++ ++**推荐成熟度**:经过 Level 1 验证 Agent 判断质量后。适合重复性高的协作模式(如每周评审)。 ++ ++**优势**:减少用户参与频率,Agent 自主推进讨论 ++**劣势**:编排者可能引入偏见(总是先问某个 Agent),讨论可能偏离用户预期 ++ ++--- ++ ++### Level 3: Event-Driven(事件驱动) ++ ++**描述**:Agent 基于"相关性信号"自主决定是否加入讨论。不需要被 @mention,而是检测到与自身领域相关的内容时主动贡献。 ++ ++**机制**: ++1. 用户或 Agent 在 thread 中发言 ++2. 所有参与频道的 Agent 收到消息(`allowBots: true` + `requireMention: false`) ++3. 每位 Agent 内部评估"这条消息与我的领域相关吗?" ++4. 如果相关度高于阈值,Agent 主动发言 ++5. 如果相关度低,Agent 保持沉默 ++ ++**配置要求**: ++```jsonc ++{ ++ "channels": { ++ "slack": { ++ "channels": { ++ "<COLLAB_CHANNEL_ID>": { ++ "allow": true, ++ "requireMention": false, // 关键:不需要 mention 即可触发 ++ "allowBots": true ++ } ++ } ++ } ++ } ++} ++ ++// 每位 Agent 的 AGENTS.md 增加: ++// ## 事件驱动参与(Level 3) ++// 当你在协作频道收到非 @mention 消息时: ++// 1. 评估与你领域的相关性(0-10) ++// 2. 相关性 < 7:不响应 ++// 3. 相关性 >= 7 且你有独特视角:发言,前缀 "[proactive]" ++// 4. 相关性 >= 7 但已有其他 Agent 覆盖:不响应 ++// 5. 每个 thread 最多主动发言 2 次(避免噪音) ++``` ++ ++**为什么列为 FUTURE**: ++ ++Event-Driven 模式的核心挑战是**相关性判断的可靠性**。当前 LLM 在"是否应该发言"这个元判断上不够稳定:可能过度参与(噪音)或遗漏关键时刻(静默错误)。需要以下前置条件成熟后再启用: ++1. 通过 Level 1/2 积累足够多的"Agent 应该在什么时候发言"的实战数据 ++2. 在 AGENTS.md 中精炼出每位 Agent 的"触发条件清单" ++3. 
建立"proactive 发言质量"的评估机制 ++ ++**防护栏**: ++- **频率限制**:每位 Agent 在每个 thread 中最多主动发言 2 次 ++- **冷却期**:同一 Agent 在同一 thread 中两次主动发言间隔至少 3 分钟 ++- **[proactive] 前缀**:主动发言必须标注,方便用户区分"被要求的"和"自主的" ++- **用户静音**:用户可以在 thread 中发 `@Agent MUTE` 让特定 Agent 在该 thread 中保持沉默 ++- **紧急刹车**:如果一个 thread 中 Agent 消息数超过 20 且无用户参与,所有 Agent 自动进入 WAIT ++ ++**推荐成熟度**:远期目标。需要 Level 2 运行稳定 + 相关性判断经过充分验证。 ++ ++**优势**:最接近"真实团队"的讨论体验,Agent 主动贡献洞察 ++**劣势**:噪音风险最高,调试困难,用户可能感到"Agent 在自说自话" ++ ++--- ++ ++## 3. Shared Thread Protocol ++ ++### 3.1 Thread 命名规范 ++ ++多 Agent 协作 thread 的 root message 必须包含以下前缀: ++ ++``` ++COLLAB <TYPE> | <TITLE> | <DATE> ++``` ++ ++其中 TYPE 对应协作模式: ++ ++| TYPE | 对应模式 | 示例 | ++|------|---------|------| ++| `REVIEW` | Architecture Review / Code Review | `COLLAB REVIEW \| A2A v2 协议草案评审 \| 2026-03-27` | ++| `ALIGN` | Strategic Alignment | `COLLAB ALIGN \| GitHub Issues 自动化方向 \| 2026-03-27` | ++| `INCIDENT` | Incident Response | `COLLAB INCIDENT \| Builder 无响应 P2 \| 2026-03-27` | ++| `SYNTH` | Knowledge Synthesis | `COLLAB SYNTH \| Week 12 知识综合 \| 2026-03-27` | ++| `DISCUSS` | 通用讨论(不匹配以上模式) | `COLLAB DISCUSS \| 是否引入 MCP 支持 \| 2026-03-27` | ++ ++**与现有 A2A 前缀的共存**: ++- 委派式 A2A 继续使用 `A2A <FROM>→<TO> | <TITLE> | TID:<timestamp>` 前缀 ++- 协作式 thread 使用 `COLLAB <TYPE>` 前缀 ++- 两种前缀可以在同一频道共存——人和 Agent 都能快速区分"委派任务"和"协作讨论" ++ ++### 3.2 Turn Structure(轮次格式) ++ ++每位 Agent 在协作 thread 中的发言遵循以下格式: ++ ++``` ++[<角色>] <动作类型> ++<内容> ++[STATUS: <状态>] ++``` ++ ++**动作类型**: ++ ++| 动作 | 含义 | 使用者 | ++|------|------|-------| ++| `Proposal` | 提出方案 | 任何提案者 | ++| `Review` | 评审意见 | 评审者 | ++| `Response` | 回应他人意见 | 被评审者 | ++| `Diagnosis` | 诊断分析 | 技术角色 | ++| `Fix` | 修复方案 | 实施者 | ++| `Synthesis` | 综合总结 | 编排者/KO | ++| `Escalation` | 升级到用户 | 任何 Agent | ++| `[proactive]` | 主动发言 | Level 3 模式 | ++ ++**状态标签**: ++ ++| STATUS | 含义 | ++|--------|------| ++| `WAIT` | 等待下一步指令 | ++| `NEEDS_INPUT:<Agent/User>` | 需要特定方的输入 | ++| `CONVERGED` | 认为讨论已收敛 | ++| `BLOCKED:<原因>` | 被阻塞 | ++| `DECISION:<结论>` | 最终决策(通常由编排者/用户发出) | ++ ++**示例**: ++ 
++``` ++[CTO] Proposal ++建议使用 multi-account Slack 配置实现多 Agent 协作。 ++核心变更:3 个 Slack app(CoS/CTO/Builder),共享 #collab 频道,allowBots + requireMention。 ++[STATUS: NEEDS_INPUT:Builder] ++``` ++ ++``` ++[Builder] Review ++可行性评估:multi-account 配置本身简单(已有文档),但需要创建 3 个 Slack app + 配置 OAuth。 ++工期估计:2 小时配置 + 1 小时验证。 ++风险:Socket Mode 下 3 个 app 的连接稳定性未验证。 ++severity: MINOR(风险可控) ++[STATUS: WAIT] ++``` ++ ++### 3.3 Context Management(上下文管理) ++ ++**Thread 历史控制**: ++ ++| 场景 | `initialHistoryLimit` 建议值 | 理由 | ++|------|---------------------------|------| ++| Architecture Review | 50 | 方案文本通常较长 | ++| Strategic Alignment | 30 | 对话式,轮次多但每轮短 | ++| Code Review | 80-100 | 代码/配置内容占用大量 token | ++| Incident Response | 30 | 快速迭代,每轮短 | ++| Knowledge Synthesis | 50 | 条目列表 + 评审意见 | ++ ++**上下文压缩策略**: ++ ++当 thread 超过 `initialHistoryLimit` 时,后加入的 Agent 只能看到最近 N 条消息。为此: ++ ++1. **编排者摘要责任**:每 5 轮,编排者发布一条 `[<角色>] Synthesis: Thread Summary`,概括到目前为止的关键决策和未决问题 ++2. **"到此为止"锚点**:编排者可以发布 `--- CONTEXT ANCHOR ---`,后续 Agent 只需要从这个锚点开始阅读 ++3. **附件而非内联**:超过 200 字的内容(代码块、配置文件、日志)应该放在 Slack 文件附件或外部链接中,而非内联到 thread 消息 ++ ++### 3.4 Termination Criteria(终止条件) ++ ++讨论"完成"的判定标准: ++ ++**显式终止**(推荐): ++1. **用户宣布**:用户在 thread 中发布 `RESOLVED` 或 `APPROVED` ++2. **编排者宣布**:编排者发布 `[<角色>] DECISION: <结论>` + `[STATUS: CONVERGED]` ++3. **所有参与者同意**:每位参与 Agent 发布 `[STATUS: CONVERGED]` ++ ++**隐式终止**(防护栏触发): ++1. **轮次上限**:达到 `maxDiscussionTurns`(默认 5 轮 for Level 1, 8 轮 for Level 2) ++2. **时间上限**:thread 最后一条消息超过 4 小时无新内容 ++3. **Agent 消息上限**:thread 中 Agent 消息总数超过 20 条(Level 3 的紧急刹车) ++ ++**终止后动作**: ++1. 编排者(或最后发言的 Agent)发布 discussion closeout(精简版 CLOSEOUT_TEMPLATE) ++2. 如果讨论产生了可执行任务 → 转入 A2A v1 委派流程 ++3. 如果讨论产生了知识 → 同步到 #know ++4. 
Thread 保留为可搜索的组织记忆 ++ ++### 3.5 Escalation(升级到人类) ++ ++以下情况 Agent 必须升级到用户: ++ ++| 触发条件 | 升级方式 | 升级消息格式 | ++|---------|---------|------------| ++| Agent 间分歧无法收敛(2+ 轮) | Thread 内 @mention 用户 | `[<角色>] Escalation: <Agent-A> 认为 X,<Agent-B> 认为 Y。需要你裁决。` | ++| 涉及 L3 动作 | Thread 内 @mention 用户 | `[<角色>] Escalation: 需要 L3 审批——<具体动作>。` | ++| 讨论偏离原始目标 | Thread 内 @mention 用户 | `[<角色>] Escalation: 讨论已偏离 "<原始目标>"。请确认是否调整方向。` | ++| 编排 Agent 不确定下一步 | Thread 内 @mention 用户 | `[<角色>] Escalation: 不确定应该咨询哪位 Agent 或是否可以结论。` | ++ ++### 3.6 与现有模板的集成 ++ ++**Discussion Closeout**(讨论收尾,基于 CLOSEOUT_TEMPLATE 精简版): ++ ++``` ++## Discussion Closeout ++- Thread: [COLLAB <TYPE> | <TITLE> | <DATE>] ++- Participants: [Agent 列表] ++- Rounds: [轮次数] ++ ++## Decisions ++1. ... ++2. ... ++ ++## Dissent(未达成共识的点) ++- ... ++ ++## Next Actions ++| Action | Owner | Type | ++|--------|-------|------| ++| ... | ... | A2A v1 委派 / 用户确认 / 知识入库 | ++ ++## Signal Score ++- [0-3] ++``` ++ ++**Discussion Checkpoint**(讨论中间切割,当讨论跨天或上下文膨胀时): ++ ++``` ++## Discussion Checkpoint #N ++- Thread: [COLLAB <TYPE> | <TITLE> | <DATE>] ++- Current Round: [M/max] ++ ++## 到目前为止的决策 ++- ... ++ ++## 未决问题 ++- ... ++ ++## 下一步 ++- 继续讨论:@<Agent> <问题> ++- 或:升级到用户 ++``` ++ ++--- ++ ++## 4. Integration with Harness Design ++ ++### 4.1 模式映射 ++ ++Anthropic 的 harness 设计使用 Planner → Builder → QA 的流水线。OpenCrew 的协作模式可以映射到 harness 的角色: ++ ++| Harness 角色 | OpenCrew 对应 | 协作模式中的体现 | ++|-------------|-------------|----------------| ++| **Planner** | CoS + CTO | Strategic Alignment (Pattern 2) 中 CoS 解读意图、CTO 规划路径 | ++| **Builder/Generator** | Builder | 所有模式中作为实施者/生产者 | ++| **Evaluator/QA** | CTO + Ops + KO | Code Review (Pattern 3) 中的多维评审 | ++| **Orchestrator/Harness** | CoS (Level 2) / 用户 (Level 1) | 编排模型中的编排者角色 | ++ ++**关键差异**: ++ ++1. **Harness 的 Evaluator 是单一的,OpenCrew 的评审是多维的**。Harness 的 GAN-inspired 模式是"Generator 产出 → Evaluator 挑战 → 迭代"。OpenCrew 的评审是"Builder 产出 → CTO 评架构 → Ops 评安全 → KO 评知识一致性"。多维评审能发现单一评审者的盲区。 ++ ++2. 
**Harness 的 Planner 是预先规划的,OpenCrew 的对齐是协商的**。Harness 的 Planner 独立写 spec,其他 Agent 执行。OpenCrew 的 Strategic Alignment 中,CTO 和 CIO 可以挑战 CoS 的解读——方案是讨论出来的,不是单方面决定的。 ++ ++3. **Harness 是批处理的,OpenCrew 是流式的**。Harness 中 Planner 写完 spec 文件再由 Builder 读取。OpenCrew 中 Agent 可以在 thread 中实时看到其他 Agent 的思路演化,实时调整自己的判断。 ++ ++### 4.2 Slack Thread 作为"Live Blackboard" ++ ++Harness 设计中的核心通信模式是 **Blackboard Pattern**:Agent 写文件到共享空间,其他 Agent 读取。 ++ ++Slack thread 可以视为**实时版 Blackboard**: ++ ++| Blackboard 概念 | 文件式(Harness) | 聊天式(OpenCrew Slack) | ++|----------------|-----------------|------------------------| ++| 写入 | Agent 写文件到 `output/` | Agent 在 thread 中发消息 | ++| 读取 | Agent 读取 `output/` 中的文件 | Agent 加载 thread 历史(`initialHistoryLimit`) | ++| 结构 | 文件名 + 文件内容 | 消息前缀 `[角色] 动作类型` + 内容 | ++| 持久化 | Git 仓库 | Slack 搜索(免费版 90 天) | ++| 版本控制 | Git diff | Thread 消息时间线 | ++| 访问控制 | 文件系统权限 | Slack 频道权限 + `requireMention` | ++ ++**优势**: ++- **人可读**:Slack thread 是人类自然使用的界面。不需要"打开文件查看 Agent 在做什么" ++- **可介入**:人在 thread 中发消息等于"实时写入 Blackboard",Agent 立即可见 ++- **可搜索**:Slack 的全文搜索等于 Blackboard 的历史索引 ++ ++**劣势**: ++- **结构松散**:文件可以有精确的 schema,thread 消息是自由文本 ++- **上下文窗口受限**:文件可以无限大,thread 受 `initialHistoryLimit` 限制 ++- **持久性不足**:Slack 免费版 90 天历史限制(Harness 的 Git 是永久的) ++ ++### 4.3 文件式 vs 聊天式:何时用哪个 ++ ++**用文件式(Harness)**: ++- 纯代码生成/修改任务(Builder 在仓库中工作) ++- 需要精确结构的产出物(配置文件、协议文档) ++- 需要 Git 版本控制的内容 ++- 长期运行的自动化流水线(无人值守) ++ ++**用聊天式(OpenCrew Slack)**: ++- 需要多方判断的决策(架构评审、战略对齐) ++- 需要实时人类参与的场景(事件响应、方向校正) ++- 需要跨领域视角的综合(知识综合) ++- 需要渐进式信任建设的场景(先 Level 1 再 Level 2) ++ ++**混合使用**: ++- 一次 Strategic Alignment(聊天式)产出结论后 → CTO 创建 Harness spec(文件式)→ Builder 在 Harness 中执行 → 产出通过 Code Review(聊天式)评审 ++- 这是"讨论决定做什么,Harness 执行怎么做,讨论验证做对了"的闭环 ++ ++### 4.4 Harness Evaluator 与 OpenCrew 多维评审的融合 ++ ++Harness 设计的最强概念之一是"Generator-Evaluator 对抗循环":Generator 倾向于创造,Evaluator 倾向于挑战,两者对抗产出高质量结果。 ++ ++OpenCrew 可以将这个概念泛化: ++ ++``` ++[Builder 产出] ++ ↓ ++[CTO 评架构] ← 对抗维度 1:设计合理性 ++ ↓ ++[Ops 评安全] ← 对抗维度 2:运维风险 ++ ↓ ++[KO 评一致性] ← 对抗维度 3:知识冲突 ++ ↓ 
++[综合反馈 → Builder 修订] ++ ↓ ++[重复直到收敛] ++``` ++ ++这不是简单的"一个 Evaluator 说好或不好",而是"多个 Evaluator 从不同角度挑战,Builder 必须同时满足所有维度"。质量上限更高,但收敛时间更长——这就是为什么防护栏中设了评审轮次上限。 ++ ++--- ++ ++## 5. Config Template ++ ++### 5.1 Multi-Account Slack 配置(3 核心 Agent) ++ ++以下是启用多 Agent 协作的完整配置片段。基于 research_slack_r1.md 中确认的 OpenClaw 能力。 ++ ++```jsonc ++{ ++ "channels": { ++ "slack": { ++ // ===== 多账号配置 ===== ++ // 每个 Agent 一个独立的 Slack App,拥有独立的 bot identity ++ "accounts": { ++ "cos": { ++ "botToken": "${SLACK_BOT_TOKEN_COS}", // 环境变量引用,不明文 ++ "appToken": "${SLACK_APP_TOKEN_COS}", ++ "name": "CoS" ++ }, ++ "cto": { ++ "botToken": "${SLACK_BOT_TOKEN_CTO}", ++ "appToken": "${SLACK_APP_TOKEN_CTO}", ++ "name": "CTO" ++ }, ++ "builder": { ++ "botToken": "${SLACK_BOT_TOKEN_BUILDER}", ++ "appToken": "${SLACK_APP_TOKEN_BUILDER}", ++ "name": "Builder" ++ } ++ }, ++ ++ // ===== 频道配置 ===== ++ "channels": { ++ // --- 协作频道(多 Agent 讨论发生在这里) --- ++ "<COLLAB_CHANNEL_ID>": { ++ "allow": true, ++ "requireMention": true, // 必须 @mention 才触发(循环防护) ++ "allowBots": true // 允许处理其他 bot 的消息(协作核心) ++ }, ++ ++ // --- 各 Agent 专属频道(保持现有行为) --- ++ "<HQ_CHANNEL_ID>": { ++ "allow": true, ++ "requireMention": false, // CoS 在自己频道不需要 mention ++ "allowBots": false // 专属频道不接受 bot 消息 ++ }, ++ "<CTO_CHANNEL_ID>": { ++ "allow": true, ++ "requireMention": false, ++ "allowBots": false ++ }, ++ "<BUILD_CHANNEL_ID>": { ++ "allow": true, ++ "requireMention": false, ++ "allowBots": false ++ } ++ }, ++ ++ // ===== Thread 配置 ===== ++ "thread": { ++ "historyScope": "thread", // 每个 thread 独立 session ++ "inheritParent": true, // thread 继承 root message 上下文 ++ "initialHistoryLimit": 50 // 加入 thread 时加载最近 50 条消息 ++ } ++ } ++ }, ++ ++ // ===== Agent 绑定 ===== ++ "bindings": [ ++ // CoS: 绑定到 cos 账号 ++ { ++ "agentId": "cos", ++ "match": { ++ "channel": "slack", ++ "accountId": "cos" ++ } ++ }, ++ // CTO: 绑定到 cto 账号 ++ { ++ "agentId": "cto", ++ "match": { ++ "channel": "slack", ++ "accountId": "cto" ++ } ++ }, ++ // Builder: 绑定到 builder 账号 ++ { ++ 
"agentId": "builder", ++ "match": { ++ "channel": "slack", ++ "accountId": "builder" ++ } ++ } ++ ] ++} ++``` ++ ++### 5.2 Slack App 创建清单(每个 Agent 重复) ++ ++每个 Slack App 需要以下配置: ++ ++**Bot Token Scopes**(OAuth & Permissions): ++- `channels:history` -- 读取频道历史 ++- `channels:read` -- 读取频道信息 ++- `chat:write` -- 发送消息 ++- `chat:write.customize` -- 自定义发送者名称/图标(可选,multi-account 下不需要) ++- `users:read` -- 读取用户信息 ++ ++**Event Subscriptions**: ++- `message.channels` -- 接收频道消息事件 ++- `app_mention` -- 接收 @mention 事件 ++ ++**Socket Mode**: ++- 启用 Socket Mode ++- App-level token scope: `connections:write` ++ ++**注意事项**: ++- 每个 App 必须被邀请进所有需要参与的频道(专属频道 + 共享协作频道) ++- Socket Mode 连接数:3 个 App = 3 个持久 WebSocket 连接。研究报告指出这在正常范围内,但如果扩展到 5-7 个 App,建议考虑切换到 HTTP Events API 模式(每个 account 配置不同的 `webhookPath`) ++ ++### 5.3 AGENTS.md 协作指令段(追加到现有 AGENTS.md) ++ ++以下指令段应追加到参与协作的 Agent 的 AGENTS.md 中: ++ ++```markdown ++## 多 Agent 协作协议(追加段) ++ ++### 识别协作 Thread ++- 以 `COLLAB <TYPE>` 开头的 thread 是协作讨论 ++- 以 `A2A <FROM>→<TO>` 开头的 thread 是委派任务 ++- 在协作 thread 中,你是讨论参与者,不是任务执行者 ++ ++### 发言格式 ++- 每条发言以 `[<你的角色>] <动作类型>` 开头 ++- 结尾标注 `[STATUS: <状态>]` ++- 动作类型:Proposal / Review / Response / Diagnosis / Fix / Synthesis / Escalation ++- 状态:WAIT / NEEDS_INPUT:<谁> / CONVERGED / BLOCKED:<原因> / DECISION:<结论> ++ ++### 编排纪律 ++- Level 1(默认):响应后 WAIT。不主动 @mention 其他 Agent ++- Level 2(用户指定你为编排者时):可以 @mention 其他 Agent 推进讨论。最多 8 轮后必须收敛或升级 ++- 未被 @mention 的协作 thread 消息:不响应(由 requireMention 硬约束保证) ++ ++### 防护栏 ++- 每条发言不超过 500 字。超过时拆分为"摘要 + 链接" ++- 每个 thread 最多参与 5 轮讨论。超过时发布 `[STATUS: CONVERGED]` 或 `Escalation` ++- 如果发现讨论偏离原始目标,发布 Escalation 提醒用户 ++``` ++ ++### 5.4 迁移策略(从单 bot 到多 bot) ++ ++``` ++Phase 0: 准备(无风险) ++ - 创建 3 个 Slack App(CoS/CTO/Builder) ++ - 获取 token,配置环境变量 ++ - 不修改 openclaw 配置 ++ ++Phase 1: 创建协作频道(低风险) ++ - 创建 #collab 频道 ++ - 邀请 3 个 bot 进入 #collab ++ - 在 openclaw 配置中添加 #collab 的频道配置(allowBots + requireMention) ++ - 原有频道配置不变 ++ ++Phase 2: 切换 Agent 绑定(中风险,可回滚) ++ - 修改 bindings 为 multi-account 模式 ++ - 保留原 bot 作为 
fallback(如果用了新 App Token 后出问题) ++ - 逐个 Agent 切换:先 CTO → 验证 → 再 Builder → 验证 → 最后 CoS ++ - 每步验证:在专属频道和 #collab 分别测试响应 ++ ++Phase 3: 验证协作模式(低风险) ++ - 在 #collab 中手动测试 Architecture Review 模式 ++ - 验证:Agent A 的消息 Agent B 是否能看到 ++ - 验证:requireMention 是否正确过滤非相关消息 ++ - 验证:thread 历史加载是否完整 ++ ++Phase 4: 启用 Level 2 编排(中风险) ++ - 在 CTO/CoS 的 AGENTS.md 中追加协作指令段 ++ - 测试 Agent 编排模式 ++ - 观察是否有无限循环或噪音问题 ++``` ++ ++--- ++ ++## 6. Open Questions & Risks ++ ++### 确认度高的发现 ++ ++1. **Slack multi-account 支持一切所需能力** -- `allowBots: true` + `requireMention: true` + `initialHistoryLimit` 组合已被 OpenClaw 官方文档和社区实践确认 ++2. **Feishu 无法支持实时多 Agent 讨论** -- 平台限制(bot 消息不触发其他 bot 事件),这是 Feishu Open Platform 的设计决策,非 bug ++3. **Discord 等待 #11199 修复** -- 修复后机制与 Slack 类似 ++ ++### 需要验证的假设 ++ ++1. **Agent @mention 其他 bot 的可靠性**:当 CTO 的 LLM 在消息中输出 "@Builder" 时,OpenClaw 是否会将其转换为 Slack 的原生 mention 格式(`<@BOT_USER_ID>`)?如果只是纯文本 "@Builder",接收方的 bot 可能不会识别为 mention。**这是 Level 2 编排的关键前提,需要实测。** ++ ++2. **Thread 历史中 bot 消息的呈现**:当 Agent 加载 thread 历史时,其他 bot 的消息是否以可理解的方式呈现(包含发送者身份)?还是所有 bot 消息都显示为同一个"bot"? ++ ++3. **Socket Mode 多连接稳定性**:3 个 Slack App 各自维持 WebSocket 连接。在网络波动时是否存在重连竞争或消息丢失?研究报告建议 5+ App 时切换 HTTP 模式。 ++ ++4. **`allowBots: "mentions"` 模式的可用性**:DeepWiki 分析提到这个值但官方文档未明确列出。需要确认是否可用——如果可用,它比 `allowBots: true` 更安全(只处理 @mention 自己的 bot 消息)。 ++ ++### 架构风险 ++ ++| 风险 | 影响 | 缓解 | ++|------|------|------| ++| Agent 讨论质量不够(废话多、不收敛) | 用户体验差,噪音 > 信号 | 严格的发言格式 + 轮次限制 + 持续迭代 AGENTS.md | ++| 上下文窗口不够(长讨论后 Agent 忘记早期内容) | 讨论循环、重复、遗漏 | Context Anchor 机制 + 编排者摘要 | ++| Slack 免费版 90 天历史限制 | 历史讨论不可回溯 | 重要讨论的结论同步到 Git(closeout → 仓库) | ++| 多 bot 配置复杂度高 | 上手门槛增加 | 分阶段迁移 + 详细配置文档 | ++| Level 2 编排者偏见 | 讨论结论受编排者立场影响 | 多种编排者可选(CoS 主持战略、CTO 主持技术)+ 用户随时可介入 | ++ ++--- ++ ++## Appendix A: Quick Reference Card ++ ++### 选择协作模式 ++ ++``` ++需要做什么? 
++├─ 评审技术方案 → Pattern 1: Architecture Review ++├─ 对齐新方向 → Pattern 2: Strategic Alignment ++├─ 评审已完成的工作 → Pattern 3: Code/Design Review ++├─ 处理紧急问题 → Pattern 4: Incident Response ++├─ 验证知识准确性 → Pattern 5: Knowledge Synthesis ++└─ 以上都不是 → COLLAB DISCUSS(通用讨论) ++``` ++ ++### 选择编排级别 ++ ++``` ++对 Agent 判断质量有信心吗? ++├─ 不确定 → Level 1: Human Orchestrated ++├─ 有信心,但想保持监督 → Level 2: Agent Orchestrated ++└─ 完全信任 + 已验证 → Level 3: Event-Driven (FUTURE) ++``` ++ ++### 讨论何时结束? ++ ++``` ++是否达到以下任一条件? ++├─ 用户说 RESOLVED/APPROVED → 结束 ++├─ 编排者发布 DECISION + CONVERGED → 结束 ++├─ 所有参与者都 CONVERGED → 结束 ++├─ 超过 maxDiscussionTurns → 强制结束,产出当前最佳结论 ++├─ 超过 4 小时无新消息 → 隐式结束 ++└─ 以上都没有 → 继续讨论 ++``` ++ ++## Appendix B: Collaboration vs Delegation Decision Matrix ++ ++| 维度 | 用 Delegation (A2A v1) | 用 Collaboration (A2A v2) | ++|------|----------------------|--------------------------| ++| 任务清晰度 | 高(DoD 明确) | 低(需要讨论才能明确) | ++| 参与者数量 | 2(指派方 + 执行方) | 3+(多方讨论) | ++| 是否需要对抗/挑战 | 否(执行即可) | 是(需要不同视角) | ++| 人类参与需求 | 低(启动 + 验收) | 高(引导讨论 + 裁决分歧) | ++| 适用任务类型 | A/P(执行型) | P/S(决策型、评审型) | ++| 平台要求 | 所有平台均支持 | Slack NOW / Discord AFTER #11199 / Feishu NOT SUPPORTED | ++ ++--- ++ ++> 本报告基于 research_slack_r1.md、research_discord_r1.md、research_feishu_r1.md 的发现,以及 ARCHITECTURE.md、A2A_PROTOCOL.md、SYSTEM_RULES.md 的现有架构设计。所有协作模式在 Slack 上仅需配置变更即可实现,无需上游代码修改。 diff --git a/.harness/reports/architecture_protocol_r1.md b/.harness/reports/architecture_protocol_r1.md new file mode 100644 index 0000000..a31024f --- /dev/null +++ b/.harness/reports/architecture_protocol_r1.md @@ -0,0 +1,918 @@ +commit 7e825263db36aef68792a050c324daef598b4c56 +Author: Alex's Mac <alexmac@AlexsdeMac-mini-2.local> +Date: Sat Mar 28 17:38:48 2026 +0800 + + feat: add A2A v2 research harness, architecture, and agent definitions + + Multi-agent harness for researching and designing A2A v2 protocol: + + Research reports (Phase 1): + - Slack: true multi-agent collaboration via multi-account + @mention + - Feishu: groupSessionScope + platform limitation 
analysis + - Discord: multi-bot routing + Issue #11199 blocker analysis + + Architecture designs (Phase 2): + - A2A v2 Protocol: Delegation (v1) + Discussion (v2) dual-mode + - 5 collaboration patterns: Architecture Review, Strategic Alignment, + Code Review, Incident Response, Knowledge Synthesis + - 3-level orchestration: Human → Agent → Event-Driven + - Platform configs, migration guides, 6 ADRs + + Agent definitions for Claude Code Agent Teams: + - researcher.md, architect.md, doc-fixer.md, qa.md + + QA verification: all issues resolved, PASS verdict after fixes. + + Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> + +diff --git a/.harness/reports/architecture_protocol_r1.md b/.harness/reports/architecture_protocol_r1.md +new file mode 100644 +index 0000000..6afaa80 +--- /dev/null ++++ b/.harness/reports/architecture_protocol_r1.md +@@ -0,0 +1,885 @@ ++# A2A 协作协议 v2(跨平台多 Agent) ++ ++> **版本**: v2.0-draft | **日期**: 2026-03-27 | **作者**: Architecture Agent ++> ++> 目标:统一 Slack / Feishu / Discord 三平台的 Agent-to-Agent 协作,同时支持: ++> - **Delegation(委派模式)**:结构化任务包 + `sessions_send` 触发(v1 继承,全平台可用) ++> - **Discussion(讨论模式)**:@mention 驱动的多方对话(v2 新增,Slack 立即可用,Discord 待修复) ++> ++> 设计原则: ++> 1. 单 bot 用户不受破坏(向后兼容) ++> 2. 多 bot 解锁新能力(渐进增强) ++> 3. 不同平台不同能力(平台感知) ++> 4. 两种模式共存,互补而非替代 ++ ++--- ++ ++## 0. 
术语 Terminology ++ ++| 术语 | 定义 | ++|------|------| ++| **A2A** | Agent-to-Agent 协作流程总称,包含 Delegation 和 Discussion 两种模式 | ++| **Delegation(委派)** | v1 模式。Agent-A 构造完整任务包,通过 `sessions_send` 触发 Agent-B 执行。单向、结构化、全平台可用 | ++| **Discussion(讨论)** | v2 模式。多个 Agent 在同一 thread/topic 中通过 @mention 参与多方对话。多向、自然语言、平台受限 | ++| **Task Thread** | 在目标 Agent 的频道/群组/channel 里创建的任务线程。该线程即该任务的独立 Session | ++| **Anchor Message(锚点消息)** | 在目标频道发布的可见 root message,作为任务的人类审计入口 | ++| **Multi-Account(多账户)** | OpenClaw 特性:每个 Agent 使用独立的 bot app(独立 token/identity/quota) | ++| **Cross-Bot Visibility(跨 bot 可见性)** | 平台层面:Bot-A 的消息是否能被 Bot-B 的事件处理器接收 | ++| **Self-Loop Filter(自循环过滤)** | OpenClaw 层面:忽略"自己发出的消息"以防止无限循环 | ++| **Session Key** | OpenClaw 用于标识一次对话会话的唯一键,格式因平台和配置而异 | ++| **Turn** | 一次 Agent 响应周期。Discussion 模式中由 @mention 触发,Delegation 模式中由 `sessions_send` 触发 | ++| **Orchestrator** | 控制讨论节奏的角色(人类或指定 Agent),决定下一个发言者 | ++ ++--- ++ ++## 1. 权限矩阵 Permission Matrix(必须遵守) ++ ++> 本节与 v1 完全一致。技术上 bot 可以给任意频道发消息,但这是组织纪律,不遵守视为 bug。 ++ ++| 角色 | 可派单/发起讨论 | 约束 | ++|------|----------------|------| ++| CoS | CTO(默认不直达 Builder) | 方向对齐、战略决策 | ++| CTO | Builder / Research / KO / Ops | 技术决策、任务分解 | ++| Builder | 不主动派单;需澄清时回 CTO thread | 执行、实现 | ++| CIO | 尽量独立;必要时与 CoS/KO 同步 | 领域专长 | ++| KO/Ops | 通常不主动派单 | 审计、沉淀 | ++ ++**v2 补充(Discussion 模式)**: ++- Discussion 中的 @mention 也必须遵守权限矩阵。Builder 不应 @mention CoS 发起战略讨论。 ++- Orchestrator(通常是发起讨论的角色或 CTO)控制 @mention 顺序,间接执行权限。 ++- Agent SOUL.md/AGENTS.md 中须写明:"在 Discussion 中只回应自己被 @mention 的消息,不主动跨域发言。" ++ ++--- ++ ++## 2. 触发模式 Trigger Modes ++ ++### 2a. 
Delegation Mode(委派模式 -- v1 继承) ++ ++> 全平台可用。适用于:结构化任务委派、需要完整任务包的执行型工作。 ++ ++**触发流程(三步)**: ++ ++#### Step 1 -- 创建可见锚点消息 Anchor Message ++ ++A 在 B 的频道/群组创建 root message,第一行固定前缀: ++ ++``` ++A2A <FROM>-><TO> | <TITLE> | TID:<YYYYMMDD-HHMM>-<short> ++``` ++ ++正文必须是完整任务包(使用 `SUBAGENT_PACKET_TEMPLATE.md`): ++- Objective / DoD / Inputs / Constraints / Output format / CC ++ ++> 前置条件:bot 必须已加入目标频道/群组。Slack 报 `not_in_channel`;Feishu 需手动拉 bot 进群;Discord 需 bot 有 View Channel 权限。 ++ ++#### Step 2 -- `sessions_send` 触发目标 Agent ++ ++A 读取 root message 的消息 ID,拼出 thread/topic sessionKey: ++ ++| 平台 | Session Key 格式 | ++|------|-----------------| ++| Slack | `agent:<B>:slack:channel:<channelId>:thread:<root_ts>` | ++| Feishu | `agent:<B>:feishu:group:<chatId>:topic:<root_id>` (需 `groupSessionScope: "group_topic"`) | ++| Discord | `agent:<B>:discord:channel:<channelId>` (thread 继承父 channel) | ++ ++然后 A 用 `sessions_send(sessionKey=..., message=<完整任务包>)` 触发 B。 ++ ++> **timeout 容错**:`sessions_send` 返回 timeout 不等于没送达。消息可能已送达并被处理。 ++> 规避:在 thread 里补发兜底消息。 ++> ++> **SessionKey 注意**:不要手打。优先从 `sessions_list` 复制 `deliveryContext` 匹配的 key。注意大小写一致性。 ++ ++#### Step 3 -- 执行与汇报 ++ ++- B 的执行与产出都留在该 thread/topic。 ++- 上游(如 CTO)在自己的协调 thread 里同步 checkpoint/closeout。 ++- 完成后必须 closeout(见第 4 节)。 ++ ++### 2b. Discussion Mode(讨论模式 -- v2 新增) ++ ++> 仅在支持跨 bot 可见性的平台可用。适用于:多方讨论、设计评审、头脑风暴。 ++ ++**前提条件**: ++1. Multi-Account 已配置(每个参与讨论的 Agent 有独立 bot app) ++2. 共享频道配置了 `allowBots: true`(或 `"mentions"`)+ `requireMention: true` ++3. 
平台支持跨 bot 消息可见性(见 2c 平台能力矩阵) ++ ++**触发流程(单步)**: ++ ++#### Step 1 -- 在共享 thread 中 @mention 目标 Agent ++ ++Orchestrator(人类或指定 Agent)在 thread 中发送包含 @mention 的消息: ++ ++``` ++@CTO 这个架构方案的可行性如何?请从技术角度评估。 ++``` ++ ++目标 Agent 的 bot 收到 mention 事件,加载 thread history 作为上下文,然后响应。 ++ ++#### Step 2 -- 多轮迭代 ++ ++Orchestrator 根据回复决定下一步: ++- @mention 另一个 Agent 获取不同视角 ++- @mention 同一个 Agent 追问 ++- 人类直接介入修正方向 ++- 达成共识后总结并结束讨论 ++ ++**关键配置**: ++- `thread.historyScope: "thread"` -- 确保 Agent 看到完整 thread 历史 ++- `thread.initialHistoryLimit >= 50` -- 讨论可能较长,需要足够历史 ++- `thread.inheritParent: true` -- thread 参与者继承 root message 上下文 ++ ++**Discussion 专用规则**: ++- 每个 Agent 只在被 @mention 时响应(`requireMention: true` 强制) ++- 响应格式:`[角色] 内容...`(与 Delegation 模式保持一致的审计格式) ++- Agent 不得在 Discussion 中自行 @mention 其他 Agent,除非其 SOUL.md 明确授权为 Orchestrator ++- 讨论必须有明确的发起者和终止者(通常是同一个角色) ++ ++### 2c. 平台能力矩阵 Platform Capability Matrix ++ ++| 能力 | Slack | Discord | Feishu | ++|------|-------|---------|--------| ++| **Multi-Account** | YES | YES (PR #3672) | YES (已知 bug #47436) | ++| **跨 bot 消息可见性** | YES (Events API) | YES (MESSAGE_CREATE) 但 OpenClaw 过滤 (#11199) | NO (平台限制) | ++| **Delegation Mode** | YES | YES | YES | ++| **Discussion Mode** | **YES (NOW)** | **BLOCKED** (待 #11199 修复) | **NO** (平台不支持) | ++| **Self-loop 隔离** | 每 bot 独立 user ID | 全局过滤所有配置 bot (bug) | N/A (bot 消息不触发事件) | ++| **Thread/Topic 隔离** | `historyScope: "thread"` | Thread 继承 parent channel | `groupSessionScope: "group_topic"` | ++| **视觉身份** | 多 bot = 多身份 | 多 bot = 多身份 | 多 bot = 多身份 | ++ ++**平台特性详解**: ++ ++**Slack(能力最强)**: ++- Slack Events API 将所有频道消息投递给所有已加入的 bot app,不区分来源 ++- OpenClaw 的 self-loop 过滤是 per-bot-user-ID:Bot-CTO 的消息不会被 Bot-Builder 过滤 ++- `allowBots: true` + `requireMention: true` 即可实现安全的跨 bot 讨论 ++- **一步触发 @mention 现在就能用**,无需代码修改 ++ ++**Discord(待修复后接近 Slack)**: ++- Discord MESSAGE_CREATE 事件在平台层面支持跨 bot 可见 ++- **BLOCKER**: OpenClaw Issue #11199 -- bot 消息过滤器将所有已配置的 bot 视为"自己",Bot-A 的消息被 Bot-B 的 handler 丢弃 ++- 相关修复 PR: #11644, #22611, 
#35479(状态待确认) ++- 另有 Issue #45300: `requireMention` 在多账户配置下可能失效 ++- **修复后**:Discord 的 Discussion 能力将与 Slack 接近 ++ ++**Feishu(仅支持 Delegation)**: ++- **平台硬限制**:`im.message.receive_v1` 事件仅对用户发送的消息触发,bot 消息对其他 bot 不可见 ++- 这不是 OpenClaw 的问题,是飞书平台的设计 ++- 两步触发(anchor + `sessions_send`)无法简化 ++- Multi-Account 的价值:视觉身份、API 配额独立、权限隔离 ++ ++### 2d. 模式选择指南 Mode Selection Guide ++ ++| 场景 | 推荐模式 | 原因 | ++|------|---------|------| ++| CTO 给 Builder 派具体任务 | Delegation | 结构化任务包,明确 DoD | ++| 架构方案多方评审 | Discussion (Slack) / Delegation chain (Feishu, Discord) | 多方观点汇聚 | ++| 紧急故障协同 | Discussion (Slack) / Human-in-loop (all) | 实时交互需求 | ++| 长期项目多阶段交付 | Delegation | 需要 checkpoint/closeout 完整链路 | ++| 知识整理、复盘 | Delegation (KO) | 结构化产出 | ++ ++--- ++ ++## 3. 可见性契约 Visibility Contract ++ ++> 用户必须能在聊天 UI 中看到所有关键信息。Agent 之间的内部通信(`sessions_send`)对用户不可见,因此必须有配套的可见性保证。 ++ ++### 3.1 基础可见性(全模式、全平台) ++ ++1. **任务根消息可见**:每个任务的 anchor message 必须在目标频道可见 ++2. **关键 checkpoint 可见**:开始/阻塞/完成 至少更新 1 次到 thread ++3. **上游负责到底**:谁派单谁在自己的协调 thread 跟进(避免用户跨频道"捞信息") ++4. 
**双通道留痕**: ++ - A2A reply(给上游的结构化回复)-- 仅上游能看到 ++ - Thread/topic message(给用户可见的审计日志)-- 用户能看到 ++ - **两者都要做** ++ ++### 3.2 Delegation 模式可见性 ++ ++- Step 1 的 anchor message 是用户的唯一入口 ++- 所有执行过程在 anchor message 的 thread/topic 内进行 ++- 完成后必须 closeout(见第 4 节 closeout 规则) ++- `sessions_send` 触发后,发送方应等待并验证 thread 内出现回复 ++- 如果未收到回复,标记 `failed-delivery` 并上报 ++ ++### 3.3 Discussion 模式可见性(v2 新增) ++ ++- Discussion 天然可见 -- 所有对话都在共享 thread 中,用户直接可读 ++- 每个 Agent 的消息附带视觉身份(多 bot 模式下,各 bot 有独立头像/名称) ++- **Discussion 可见性优势**: ++ - 用户实时看到讨论过程,可随时介入 ++ - 不存在 Delegation 中"A2A reply 对用户不可见"的问题 ++ - Thread 本身即审计日志 ++- **Discussion 可见性要求**: ++ - 讨论结束后,Orchestrator 必须在 thread 中发布结论摘要 ++ - 如果 Discussion 产生了后续 Delegation 任务,必须在摘要中注明 TID 关联 ++ ++### 3.4 Multi-Bot 视觉身份 ++ ++| 平台 | 单 bot 模式 | 多 bot 模式 | ++|------|-----------|-----------| ++| Slack | 所有 Agent 共享一个 bot 名称/头像 | 每个 Agent 有独立 Slack app 身份 | ++| Feishu | 所有 Agent 共享一个飞书应用身份 | 每个 Agent 有独立飞书应用身份 | ++| Discord | 所有 Agent 共享一个 bot 身份 | 每个 Agent 有独立 Discord bot 身份 | ++ ++多 bot 模式下的视觉身份不需要额外配置 -- 每个 bot app 的 profile(名称、头像)即是 Agent 身份。Slack 额外支持 `chat:write.customize` 做运行时身份覆盖,但多 bot 模式下不需要。 ++ ++--- ++ ++## 4. 
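Multi-Round Discipline
++
++> This section applies to both Delegation and Discussion modes.
++
++### 4.1 Multi-round rules for Delegation mode (inherited from v1)
++
++When a Delegation task needs several rounds of iteration:
++
++- **Each round focuses on only 1-2 changes**, and the agent **must WAIT** once they are done
++- **Do not complete every step in one go** -- continue only after upstream sends the next-round instruction
++- The per-round output format is fixed:
++  ```
++  [<role>] Round N/M
++  Done: <what was done>
++  Run: <commands executed>
++  Output: <key output, truncation allowed>
++  WAIT: awaiting upstream instruction
++  ```
++- In the final round, post the closeout to the thread and answer `REPLY_SKIP` in the A2A reply to signal completion
++
++#### Round0 audit handshake (recommended)
++
++Before the real Round1, perform one tiny real action to verify the audit chain:
++- Ask the target Agent to run a side-effect-free command (such as `pwd`) and post the result to the thread
++- **If the Round0 result does not show up, stop** -- the session is probably not bound to the correct deliveryContext
++
++### 4.2 Multi-round rules for Discussion mode (new in v2)
++
++Round control matters even more in Discussion mode -- an uncontrolled Agent discussion can loop forever.
++
++**Orchestrator control principles**:
++- @mention only one Agent at a time (avoids conflicting concurrent responses)
++- After a reply arrives, the Orchestrator decides the next step: continue / switch Agent / end
++- The Orchestrator can be a human or a designated Agent (for example, CTO chairing a technical discussion)
++
++**Discussion turn limit**:
++- `maxDiscussionTurns`: suggested value = 5 (Level 1, human-orchestrated) / 8 (Level 2, Agent-orchestrated)
++- On reaching the limit, the Orchestrator must summarize the current state and decide: end / convert to Delegation / ask a human to step in
++- The Orchestrator enforces this limit itself through SOUL.md/AGENTS.md (it is not enforced at system level)
++
++> Like Delegation's `maxPingPongTurns = 4`, the Discussion turn limit prevents runaway loops.
++> Delegation's `maxPingPongTurns` is an OpenClaw system-level parameter; Discussion's `maxDiscussionTurns` is a protocol-level convention that Agents follow on their own.
++
++**Agent response rules**:
++- Respond only when @mentioned (enforced by `requireMention: true`)
++- A response must contain a concrete opinion or suggestion (empty replies such as "I agree" are not allowed)
++- If an Agent believes it has nothing valuable to add, it should reply `[role] PASS: <one-sentence reason>`
++- If an Agent believes the discussion has reached consensus, it should end its reply with `CONSENSUS: <one-sentence consensus>`
++
++**Discussion close protocol**:
++- The Orchestrator posts `DISCUSSION_CLOSE`:
++  ```
++  DISCUSSION_CLOSE
++  Topic: <discussion topic>
++  Consensus: <consensus / "no consensus reached">
++  Actions: <follow-up Delegation tasks, with TIDs>
++  Participants: <participating Agents>
++  ```
++
++### 4.3 Closeout rules (all modes)
++
++A closeout is mandatory on completion (hard DoD rule; every item is required):
++1. Post the closeout in the target Agent's thread/topic (artifact paths + verification commands)
++2. **Upstream re-checks locally** (CLI-first): run at least the key commands and post the exit codes
++3. **Report back to the originating channel**: share the final result, how to verify it, and any remaining risks. **Skipping this means the task is not done**
++4. Notify KO to capture the knowledge (default: sync to #know / the KO group and trigger KO ingest)
++
++The Discussion-mode equivalent of a closeout is the `DISCUSSION_CLOSE` summary. If the discussion produced Delegation tasks, each of those tasks gets its own closeout.
++
++---
++
++## 5. Channel & Group Mapping
++
++### 5.1 Standard mapping
++
++| Role | Slack Channel | Feishu Group | Discord Channel |
++|------|--------------|-------------|----------------|
++| CoS | #hq | HQ group | #hq |
++| CTO | #cto | CTO group | #cto |
++| Builder | #build | Build group | #build |
++| CIO | #invest | Invest group | #invest |
++| KO | #know | Know group | #know |
++| Ops | #ops | Ops group | #ops |
++| Research | #research | Research group | #research |
++
++### 5.2 Dedicated Discussion channels (new in v2, optional)
++
++For multi-bot Discussion mode, adding shared channels is recommended:
++
++| Channel (Slack / Feishu / Discord) | Purpose | Participating bots |
++|------|------|---------|
++| #collab / Collab group / #collab | Cross-domain discussion (architecture review, proposal alignment) | CoS, CTO, Builder, CIO |
++| #war-room / War-room group / #war-room | Urgent incident coordination | All |
++
++**Shared channel configuration notes**:
++- Every participating bot must join the channel/group
++- `requireMention: true` -- prevents all bots from responding at once
++- `allowBots: true` or `"mentions"` -- lets a bot see messages from other bots
++- Each Agent's binding for the channel must be configured explicitly
++
++### 5.3 Session key formats at a glance
++
++**Slack**:
++```
++# Channel level
++agent:<agentId>:slack:channel:<channelId>
++# Thread level (recommended)
++agent:<agentId>:slack:channel:<channelId>:thread:<root_ts>
++# Multi-account (accountId does not affect the session key format)
++```
++
++**Feishu**:
++```
++# Group level (default)
++agent:<agentId>:feishu:group:<chatId>
++# Topic level (with groupSessionScope: "group_topic" enabled)
++agent:<agentId>:feishu:group:<chatId>:topic:<root_id>
++```
++
++**Discord**:
++```
++# Channel level
++agent:<agentId>:discord:channel:<channelId>
++# Multi-account (session key includes accountId)
++agent:<agentId>:discord:<accountId>:channel:<channelId>
++```
++
++### 5.4 Parallelism rules
++
++- **One task = one thread/topic = one session**
++- A single channel can run several task threads/topics in parallel; do not mix multiple tasks in the channel's main timeline
++- Discussion and Delegation can run in parallel in different threads of the same channel
++
++---
++
++## 6. 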
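Failure & Fallback
++
++### 6.1 Delegation failure fallback
++
++| Failure | Symptom | Fallback |
++|------|------|---------|
++| `sessions_send` timeout | The tool returns a timeout | This does not mean failure. Post a fallback message in the thread, then wait and check for a reply |
++| Target Agent does not respond | No Round0 result in the thread | Stop the remaining steps; mark `failed-delivery`; check the session key and deliveryContext |
++| Session routed to webchat | The Agent is running but the thread shows no output | The Round0 audit handshake catches this early; re-check the letter case of the session key |
++| Bot has not joined the channel | `not_in_channel` / send fails | Manually invite the bot to the channel/group |
++| Thread misbehaves | Messages land in the wrong thread or outside any thread | Fall back to "one task in the channel main timeline" mode, or send /new to reset the session |
++
++### 6.2 Discussion failure fallback
++
++| Failure | Symptom | Fallback |
++|------|------|---------|
++| Agent does not respond to an @mention | No reply in the thread | Check the `allowBots` / `requireMention` configuration; check that the bot has joined the channel |
++| Agent responds in the wrong thread | The reply appears in an unexpected place | Check the `thread.historyScope` and `inheritParent` configuration |
++| Discussion enters an endless loop | Agents keep repeating similar points to each other | The Orchestrator forces `DISCUSSION_CLOSE`; convert to Delegation |
++| Turn limit exceeded | `maxDiscussionTurns` reached | The Orchestrator summarizes and decides: end / hand over to a human / split into subtasks |
++| Platform does not support Discussion | Feishu; Discord while #11199 is unfixed | Degrade to Delegation mode. Use `sessions_send` to collect the Agents' opinions one after another |
++
++### 6.3 Cross-platform degradation strategy
++
++```
++Discussion Mode available?
++├── YES (Slack) → drive the discussion with @mentions
++├── BLOCKED (Discord) → degrade to a Delegation chain
++│   └── The Orchestrator asks each Agent for its opinion via sessions_send
++│       → each Agent's reply is posted in the shared thread (visibility anchor)
++│       → the Orchestrator consolidates and posts the conclusion
++└── NO (Feishu) → same Delegation chain approach as above
++```
++
++---
++
++## 7. 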
平台配置片段 Platform Config Snippets ++ ++### 7.1 Slack 配置(多账户 + Discussion 模式) ++ ++```json ++{ ++ "channels": { ++ "slack": { ++ "accounts": { ++ "default": { ++ "botToken": "xoxb-cos-...", ++ "appToken": "xapp-cos-...", ++ "name": "CoS" ++ }, ++ "cto": { ++ "botToken": "xoxb-cto-...", ++ "appToken": "xapp-cto-...", ++ "name": "CTO" ++ }, ++ "builder": { ++ "botToken": "xoxb-bld-...", ++ "appToken": "xapp-bld-...", ++ "name": "Builder" ++ } ++ }, ++ "channels": { ++ "<HQ_CHANNEL_ID>": { ++ "allow": true, ++ "requireMention": false, ++ "allowBots": false ++ }, ++ "<CTO_CHANNEL_ID>": { ++ "allow": true, ++ "requireMention": false, ++ "allowBots": false ++ }, ++ "<BUILD_CHANNEL_ID>": { ++ "allow": true, ++ "requireMention": false, ++ "allowBots": false ++ }, ++ "<COLLAB_CHANNEL_ID>": { ++ "allow": true, ++ "requireMention": true, ++ "allowBots": true ++ } ++ }, ++ "thread": { ++ "historyScope": "thread", ++ "inheritParent": true, ++ "initialHistoryLimit": 50 ++ } ++ } ++ }, ++ "bindings": [ ++ { "agentId": "cos", "match": { "channel": "slack", "accountId": "default" } }, ++ { "agentId": "cto", "match": { "channel": "slack", "accountId": "cto" } }, ++ { "agentId": "builder", "match": { "channel": "slack", "accountId": "builder" } } ++ ] ++} ++``` ++ ++**配置要点**: ++- 每个 Agent 专属频道:`requireMention: false`(该频道只有一个 bot 响应,无需 mention 门控) ++- 共享频道 #collab:`requireMention: true` + `allowBots: true`(多 bot 安全协作) ++- `thread.initialHistoryLimit: 50` -- Discussion 模式需要较大的历史窗口 ++- 每个 Slack app 需要 Bot Token Scopes: `channels:history`, `channels:read`, `chat:write`, `users:read` ++- Event Subscriptions: `message.channels`, `app_mention` ++- Socket Mode: 每个 app 需要独立的 app-level token (`xapp-`) ++ ++### 7.2 Feishu 配置(多账户 + Topic 隔离) ++ ++```json ++{ ++ "channels": { ++ "feishu": { ++ "domain": "feishu", ++ "connectionMode": "websocket", ++ "groupSessionScope": "group_topic", ++ "accounts": { ++ "cos-bot": { ++ "name": "CoS", ++ "appId": "cli_cos_xxxxx", ++ "appSecret": "your-cos-secret", 
++ "enabled": true ++ }, ++ "cto-bot": { ++ "name": "CTO", ++ "appId": "cli_cto_xxxxx", ++ "appSecret": "your-cto-secret", ++ "enabled": true ++ }, ++ "builder-bot": { ++ "name": "Builder", ++ "appId": "cli_build_xxxxx", ++ "appSecret": "your-builder-secret", ++ "enabled": true ++ } ++ } ++ } ++ }, ++ "bindings": [ ++ { ++ "agentId": "cos", ++ "match": { ++ "channel": "feishu", ++ "accountId": "cos-bot", ++ "peer": { "kind": "group", "id": "<FEISHU_GROUP_ID_HQ>" } ++ } ++ }, ++ { ++ "agentId": "cto", ++ "match": { ++ "channel": "feishu", ++ "accountId": "cto-bot", ++ "peer": { "kind": "group", "id": "<FEISHU_GROUP_ID_CTO>" } ++ } ++ }, ++ { ++ "agentId": "builder", ++ "match": { ++ "channel": "feishu", ++ "accountId": "builder-bot", ++ "peer": { "kind": "group", "id": "<FEISHU_GROUP_ID_BUILD>" } ++ } ++ } ++ ] ++} ++``` ++ ++**配置要点**: ++- `groupSessionScope: "group_topic"` -- 核心配置,实现 topic 级 session 隔离 ++- 每个群组只拉入对应的 bot(避免跨账户 dedup 竞争) ++- 建议使用"话题群"(topic group) 类型 -- 强制所有消息必须属于 topic,更契合 OpenCrew 工作流 ++- **已知问题**:Issue #47436 -- 第二个账户使用 SecretRef 时 plugin crash。PR #47652 已提交修复。在合并前使用明文 secret 或等待 patch ++ ++**Feishu 无法使用 Discussion 模式**: ++- 原因:`im.message.receive_v1` 仅对用户消息触发,bot 消息对其他 bot 不可见 ++- 替代方案:所有跨 Agent 协作使用 Delegation(`sessions_send`) ++ ++### 7.3 Discord 配置(多账户 + 频道权限隔离) ++ ++```json ++{ ++ "channels": { ++ "discord": { ++ "accounts": { ++ "default": { ++ "token": "BOT_TOKEN_COS" ++ }, ++ "cto": { ++ "token": "BOT_TOKEN_CTO" ++ }, ++ "builder": { ++ "token": "BOT_TOKEN_BUILDER" ++ } ++ } ++ } ++ }, ++ "bindings": [ ++ { ++ "agentId": "cos", ++ "match": { ++ "channel": "discord", ++ "accountId": "default", ++ "guildId": "<GUILD_ID>", ++ "peer": { "kind": "channel", "id": "<CHANNEL_ID_HQ>" } ++ } ++ }, ++ { ++ "agentId": "cto", ++ "match": { ++ "channel": "discord", ++ "accountId": "cto", ++ "guildId": "<GUILD_ID>", ++ "peer": { "kind": "channel", "id": "<CHANNEL_ID_CTO>" } ++ } ++ }, ++ { ++ "agentId": "builder", ++ "match": { ++ "channel": "discord", 
++ "accountId": "builder", ++ "guildId": "<GUILD_ID>", ++ "peer": { "kind": "channel", "id": "<CHANNEL_ID_BUILD>" } ++ } ++ } ++ ] ++} ++``` ++ ++**Discord 频道权限隔离(必须配置)**: ++ ++由于单 bot 模式下 Issue #34 的教训(cos/ops 对话混淆),**必须**配置频道权限隔离: ++ ++1. 每个 bot 创建独立 Role(如 "CoS Bot", "CTO Bot", "Builder Bot") ++2. Server 级权限:仅授予 View Channels + Read Message History,**不授予 Send Messages** ++3. 逐频道授权: ++ - #hq: "CoS Bot" role -> Allow Send Messages + Send Messages in Threads ++ - #cto: "CTO Bot" role -> Allow Send Messages + Send Messages in Threads ++ - #build: "Builder Bot" role -> Allow Send Messages + Send Messages in Threads ++4. **确保 bot role 没有 Administrator 权限**(否则所有 channel-level override 失效) ++ ++**Thread 注意事项**: ++- Discord thread 会自动归档(默认 24h 无活动) ++- Bot 需要 Manage Threads 权限以 unarchive ++- 已完成任务的 thread 自然归档是可接受的行为 ++ ++**当前限制**: ++- Issue #11199 未修复前,Discussion 模式不可用 ++- Issue #45300: `requireMention` 在多账户配置下可能失效 ++- 所有跨 Agent 协作使用 Delegation(`sessions_send`) ++ ++--- ++ ++## 8. 迁移指南 Migration Guide ++ ++### 8.1 通用原则 ++ ++- **增量迁移**:一次只改一个 Agent,验证后再继续 ++- **保留单 bot**:原有单 bot 配置作为 `default` 账户保留,新 bot 逐个添加 ++- **可回滚**:每步都能通过恢复配置 + 重启 gateway 回退 ++- **Session 注意**:多账户迁移可能导致 session key 格式变化,旧 session 可能孤立 ++ ++### 8.2 Slack 迁移:单 bot -> 多 bot ++ ++**Phase 0 -- 准备(不影响运行)** ++ ++1. 为每个需要独立身份的 Agent 创建新的 Slack app(参照 SLACK_SETUP.md) ++ - 建议先创建 3 个核心 app: CoS, CTO, Builder ++ - 配置 Bot Token Scopes, Event Subscriptions, Socket Mode ++2. 将新 bot 邀请进对应频道 ++3. 备份当前 `openclaw.json` ++ ++**Phase 1 -- 切换到多账户配置** ++ ++1. 修改 `channels.slack` 从单 token 改为 `accounts` 格式: ++ ```json ++ // Before: ++ { "channels": { "slack": { "botToken": "xoxb-...", "appToken": "xapp-..." } } } ++ ++ // After: ++ { "channels": { "slack": { "accounts": { "default": { "botToken": "xoxb-...", "appToken": "xapp-..." } } } } } ++ ``` ++2. 添加 `accountId` 到 bindings ++3. 重启 gateway 验证:原有功能不受影响 ++ ++**Phase 2 -- 添加新 bot 账户** ++ ++1. 逐个添加新 Agent 的账户到 `accounts` ++2. 更新对应 binding 的 `accountId` ++3. 
每添加一个,重启验证 ++ ++**Phase 3 -- 启用 Discussion 模式** ++ ++1. 创建 #collab 频道,邀请所有参与 bot ++2. 配置 #collab: `requireMention: true` + `allowBots: true` ++3. 为 #collab 频道添加每个 Agent 的 binding ++4. 测试:人类在 #collab 发帖,@mention 不同 Agent,验证各 Agent 独立响应 ++ ++**回滚**:任何阶段恢复备份 `openclaw.json` + 重启 gateway 即可回到单 bot 模式。 ++ ++### 8.3 Feishu 迁移:单 app -> 多 app + Topic 隔离 ++ ++**Phase 0 -- 启用 Topic 隔离(独立于多 app,可先做)** ++ ++1. 在 `openclaw.json` 中添加 `groupSessionScope: "group_topic"` ++2. 重启 gateway ++3. 验证:在群组中创建 topic 发消息,检查 session key 包含 `:topic:` 后缀 ++4. 注意:主线(非 topic)消息仍使用群组级 session key,向后兼容 ++ ++**Phase 1 -- 创建新飞书应用** ++ ++1. 在飞书开放平台创建新应用(每个 Agent 一个) ++2. 配置事件订阅:`im.message.receive_v1` ++3. 启用 WebSocket 连接模式 ++4. **注意 Bug #47436**:在 PR #47652 合并前,避免使用 SecretRef,改用明文 secret ++ ++**Phase 2 -- 切换到多账户配置** ++ ++1. 保留原 app 为 `legacy` 账户 ++2. 逐个添加新账户 + 更新 binding ++3. 将新 bot 拉入对应群组(每个群组只需拉入对应 bot) ++4. 验证每个 Agent 在其专属群组正常响应 ++ ++**回滚**:恢复配置 + 重启。原 bot 保留在群组中,随时可切回。 ++ ++### 8.4 Discord 迁移:单 bot -> 多 bot + 权限隔离 ++ ++**Phase 0 -- 修复 Issue #34(单 bot 下也应做)** ++ ++1. 创建 bot-specific role(如 "OpenCrew Bot") ++2. Server 级:授予 View Channels + Read Message History,不授予 Send Messages ++3. 逐频道授权 Send Messages ++4. 验证:bot 只能在授权频道发送消息 ++ ++**Phase 1 -- 创建新 Discord bot** ++ ++1. 在 Discord Developer Portal 创建新 Application(每个 Agent 一个) ++2. 启用 Message Content Intent ++3. 生成 bot token,邀请 bot 进 server ++4. 为每个 bot 创建独立 role 并配置频道权限 ++ ++**Phase 2 -- 切换到多账户配置** ++ ++1. 修改 `channels.discord` 为 `accounts` 格式 ++2. 原 token 作为 `default` 账户 ++3. 逐个添加新账户 + binding ++4. 验证每个 Agent 在正确频道响应 ++ ++**Phase 3 -- 等待 Discussion 模式解锁** ++ ++- 跟踪 Issue #11199 修复状态(PR #11644, #22611, #35479) ++- 修复合并后,配置 `allowBots: true` + `requireMention: true` 在共享频道 ++- 测试跨 bot 可见性 ++ ++**回滚**:恢复配置 + 重启。移除新 bot 的频道权限即可。 ++ ++--- ++ ++## 9. 
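Architecture Decision Records
++
++### ADR-001: The two modes coexist rather than one replacing the other
++
++**Context**: A2A v1 uses the two-step `sessions_send` trigger. Slack multi-bot supports a one-step @mention trigger. The question is whether v2 should replace v1.
++
++**Decision**: The two modes coexist. Delegation (v1) handles structured task delegation; Discussion (v2) handles multi-party discussion.
++
++**Consequences**:
++- (+) Backward compatible: single-bot users are unaffected and keep using Delegation
++- (+) Progressive enhancement: multi-bot users unlock Discussion as an additional capability
++- (+) Full platform coverage: Feishu can only use Delegation but is not left out
++- (-) Cognitive load: Agents must understand when each mode applies
++- (-) More protocol complexity: SOUL.md/AGENTS.md need more rules
++
++**Grounding**: The Slack research confirmed that Discussion mode needs only configuration changes (no code changes). The Feishu research confirmed that a hard platform limit makes Discussion impossible. The Discord research confirmed that Discussion is blocked by a bug but usable once it is fixed. These capability differences across the three platforms mean a single mode cannot cover every scenario.
++
++### ADR-002: The Orchestrator paces the discussion instead of free-form discussion
++
++**Context**: In Discussion mode, an Agent can respond passively (wait for an @mention) or participate actively (speak whenever it sees a relevant message). The discussion style has to be decided.
++
++**Decision**: Use Orchestrator control. A human or a designated Agent @mentions the next speaker each time. Agents respond only when @mentioned.
++
++**Consequences**:
++- (+) Prevents runaway Agent discussion loops
++- (+) A human can step in at any time to control the pace
++- (+) `requireMention: true` enforces this at system level
++- (+) The turn count stays controllable (`maxDiscussionTurns`)
++- (-) Fully autonomous Agent round-table discussion is not possible
++- (-) The Orchestrator becomes a bottleneck (every turn waits for its decision on the next step)
++
++**Grounding**: The Slack research notes that Agent-orchestrated turn management requires verifying mention-parsing reliability (Open Question #2). Human control is the safest starting mode (Phase 1). `maxPingPongTurns` has already shown the value of turn limits for preventing loops.
++
++### ADR-003: Discussion happens in shared channels, not DMs
++
++**Context**: Multi-Agent discussion could take place in a shared channel thread (such as #collab) or over DMs. The venue has to be decided.
++
++**Decision**: Discussion must take place in a thread of a shared channel. DM discussion between Agents is not allowed.
++
++**Consequences**:
++- (+) User visibility: every discussion is transparent to the user
++- (+) Audit friendly: the thread is the audit log
++- (+) Consistent with SYSTEM_RULES: "visible, traceable, no context cross-talk"
++- (-) Extra shared channels (such as #collab) must be created
++- (-) Every bot has to join the shared channel, which adds configuration work
++
++**Grounding**: SYSTEM_RULES.md requires "evolution through structured artifacts rather than masses of conversation". A2A_PROTOCOL v1 requires that "the user must be able to see it in the channel". Discussion in a shared channel satisfies both by construction.
++
++### ADR-004: `requireMention: true` as the Discussion safety valve
++
++**Context**: When several bots share a channel, `allowBots: true` means each bot can see the other bots' messages. Without gating, all bots might respond to the same message at once.
++
++**Decision**: Shared channels must set `requireMention: true`. An Agent responds only to messages that @mention it.
++
++**Consequences**:
++- (+) System-level loop protection (does not rely on Agent self-discipline)
++- (+) The user/Orchestrator controls precisely which Agent participates
++- (+) The system stays safe even if an Agent ignores its SOUL.md rules
++- (-) The advanced mode where "the Agent decides for itself whether to participate" is not possible
++- (-) Discord Issue #45300 reports that `requireMention` may not take effect under multi-account
++
++**Grounding**: The Slack research confirmed that `allowBots: true` + `requireMention: true` is the officially recommended safe combination. The OpenClaw documentation explicitly recommends this combination for multi-Agent scenarios.
++
++### ADR-005: Feishu uses `groupSessionScope: "group_topic"` for session isolation
++
++**Context**: A Feishu group shares a single session by default (a P0 problem). An isolation approach has to be chosen.
++
++**Decision**: Use `groupSessionScope: "group_topic"` so that each topic thread gets its own session key.
++
++**Consequences**:
++- (+) Directly fixes the P0 session-sharing problem
++- (+) Backward compatible: non-topic messages still use the group-level session
++- (+) Aligns with Slack's "thread = task = session" model
++- (+) Pure configuration change, no code change
++- (-) The session key format changes: `sessions_send` must include the topic suffix
++- (-) Requires OpenClaw >= 2026.3.1 (PR #29791)
++
++**Grounding**: The Feishu research confirmed that the `buildFeishuConversationId` function generates session keys of the form `chatId:topic:topicId` in `group_topic` mode. PR #29791 has been merged and the feature is available.
++
++### ADR-006: Discord channel permission isolation is mandatory
++
++**Context**: Issue #34 exposed conversation mix-ups in single-bot mode caused by missing channel permission isolation.
++
++**Decision**: Every Discord deployment must configure channel-level Send Messages permission isolation, whether single-bot or multi-bot.
++
++**Consequences**:
++- (+) Fixes Issue #34 at the root
++- (+) In multi-bot mode each bot is isolated by construction (it can send only in its own channel)
++- (+) Even if an OpenClaw binding hits an edge case, the Discord permission layer acts as a backstop
++- (-) More configuration steps
++- (-) The bot role must not have the Administrator permission (otherwise the overrides have no effect)
++
++**Grounding**: The Discord research confirmed that the root cause of Issue #34 is "single-bot + missing channel permission overrides". The reporter confirmed the fix themselves.
++
++---
++
++## Appendix A: Phased rollout roadmap for Discussion mode
++
++```
++Phase 1 -- Human Orchestrated (NOW, Slack only)
++├── A human @mentions Agents in a #collab thread
++├── After an Agent responds, the human decides the next step
++├── Safest option, zero protocol risk
++└── Checkpoints: each Agent responds independently; thread history loads correctly
++
++Phase 2 -- Agent Orchestrated (NEAR, Slack only)
++├── CTO/CoS is authorized as Orchestrator in SOUL.md
++├── The Orchestrator Agent may @mention other Agents
++├── maxDiscussionTurns = 8 as the safety valve
++└── Checkpoint: the target Agent correctly recognizes the Orchestrator's mention
++
++Phase 3 -- Cross-Platform (FUTURE, Slack + Discord)
++├── Enabled once Discord Issue #11199 is fixed
++├── One Discussion protocol across Slack and Discord
++└── Feishu stays Delegation-only (platform limit)
++
++Phase 4 -- Proactive Mode (EXPLORATION)
++├── Agents judge relevance and participate without an @mention
++├── Requires upgrading allowBots: "mentions" → allowBots: true
++├── Requires relevance filtering on the Agent side
++└── Reference: SlackAgents (EMNLP 2025) proactive mode
++```
++
++## Appendix B: Comparison with the Anthropic Harness pattern
++
++| Dimension | Harness (file-based collaboration) | OpenCrew Delegation (message-based collaboration) | OpenCrew Discussion (discussion-based collaboration) |
++|------|--------------------|-----------------------------|-------------------------------|
++| Communication medium | Files on disk | `sessions_send` internal messages + thread anchor | Thread messages (directly in the chat UI) |
++| Persistence | Traceable in Git | Session memory + thread log | Thread log |
++| Structure | High (sprint contract, spec files) | High (task package template, closeout template) | Medium (natural language + format conventions) |
++| Latency | ~0 (local file system) | ~1-3s (internal RPC + platform API) | ~1-3s (platform API) |
++| Human visibility | Files must be checked actively | Visible in threads, but several channels must be followed | **Visible by construction** (the discussion is in the UI) |
++| Context window | Full file contents | Session history | Thread history (`initialHistoryLimit`) |
++| Turn management | Controlled by Harness code | `maxPingPongTurns` | Orchestrator + `maxDiscussionTurns` |
++| Adversarial review | Generator vs Evaluator | Not built in (manual review by upstream) | **Supported by construction** (several Agents debate in the same thread) |
++
++**Conclusion**: Delegation suits execution tasks (Builder writing code); Discussion suits decision tasks (architecture review, proposal alignment). Both complement the Harness pattern rather than compete with it -- Harness suits fully automated CI pipelines, while OpenCrew suits organizational collaboration that needs human participation and visibility.
++
++## Appendix C: Configuration Quick Reference
++
++### Minimal configuration (single bot, Delegation only)
++
++```json
++{
++  "channels": {
++    "slack": { "botToken": "xoxb-...", "appToken": "xapp-..." 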
}
++  }
++}
++```
++
++### Recommended configuration (multi-bot, Delegation + Discussion)
++
++See the per-platform configuration snippets in Section 7.
++
++### Key configuration parameters at a glance
++
++| Parameter | Values | Purpose | Platforms |
++|------|-----|------|---------|
++| `allowBots` | `false` / `true` / `"mentions"` | Controls whether bot messages are processed | Slack, Discord |
++| `requireMention` | `true` / `false` | Requires an @mention to trigger the Agent | Slack, Discord |
++| `thread.historyScope` | `"thread"` | Thread-level history isolation | Slack |
++| `thread.inheritParent` | `true` / `false` | Whether a thread inherits the root message context | Slack |
++| `thread.initialHistoryLimit` | number | Number of history messages the Agent loads | Slack |
++| `groupSessionScope` | `"group"` / `"group_topic"` / `"group_sender"` / `"group_topic_sender"` | Granularity of group session isolation | Feishu |
++| `maxPingPongTurns` | number | Maximum Delegation A2A round trips | All platforms |
++| `maxDiscussionTurns` | 5 (Level 1) / 8 (Level 2) (protocol convention, not a system parameter) | Maximum number of Agent responses in a Discussion | Slack (Discord future) |
diff --git a/.harness/reports/qa_a2a_research_r1.md b/.harness/reports/qa_a2a_research_r1.md
new file mode 100644
index 0000000..f0d1dc8
--- /dev/null
+++ b/.harness/reports/qa_a2a_research_r1.md
@@ -0,0 +1,237 @@
+# QA Report: A2A Research Verification
+
+**QA Agent**: Claude Opus 4.6 (1M context)
+**Date**: 2026-03-27
+**Scope**: Verify 6 critical claims from research reports against official docs and source evidence
+**Method**: Web search, WebFetch of official docs, GitHub CLI issue/PR inspection
+
+## Overall: NEEDS-WORK
+
+Three of the six claims are verified (Claims 2, 4, 5) and three are only partially verified (Claims 1, 3, 6). Two of the partially verified claims have significant accuracy issues that could mislead implementation decisions. The Slack self-loop filter claim (Claim 1) has an important nuance the reports gloss over, and the Discord #11199 status (Claim 3) is stale -- the issue was auto-closed, not fixed.
+
+---
+
+## Claim 1: Slack multi-account enables true cross-bot communication
+
+**Report says**: `allowBots: true` + `requireMention: true` + multi-account = Bot-A's messages visible to Bot-B. Self-loop filter is per-bot-user-ID, so multi-account naturally bypasses it.
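To make the claim concrete, here is a minimal sketch of the gating behavior it assumes. Every name in it (`ChannelConfig`, `InboundMessage`, `shouldProcess`) is hypothetical and chosen for illustration; this is not OpenClaw's actual code, and it treats `allowBots: "mentions"` as available, which this report only confirms for Discord. The one platform fact it relies on is that Slack encodes a user mention as `<@USERID>` in message text.

```typescript
// Hypothetical types for illustration only; not OpenClaw's real API.
type AllowBots = boolean | "mentions";

interface ChannelConfig {
  allowBots: AllowBots;
  requireMention: boolean;
}

interface InboundMessage {
  user: string;   // Slack user ID of the sender
  isBot: boolean; // true when the sender is a bot
  text: string;
}

// Slack encodes a user mention as <@USERID> inside the message text.
function mentionsUser(text: string, userId: string): boolean {
  return text.includes(`<@${userId}>`);
}

// Decide whether the account identified by botUserId should handle msg.
function shouldProcess(
  msg: InboundMessage,
  botUserId: string,
  cfg: ChannelConfig
): boolean {
  // Self-loop filter, assumed to be per account: drop only our own messages.
  if (msg.user === botUserId) return false;

  const mentioned = mentionsUser(msg.text, botUserId);

  if (msg.isBot) {
    // allowBots gate: false drops every bot message,
    // "mentions" keeps only bot messages that mention this account.
    if (cfg.allowBots === false) return false;
    if (cfg.allowBots === "mentions" && !mentioned) return false;
  }

  // Mention gate applies to human and bot senders alike.
  if (cfg.requireMention && !mentioned) return false;
  return true;
}
```

Under these assumptions Bot-B would process Bot-A's message only when it carries `<@BotB>`, and would never process its own output. Whether the real filter behaves per account is what the verification below examines.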
+ +### Verified: PARTIALLY + +### Evidence + +**What IS confirmed (HIGH confidence)**: + +1. **Slack Events API delivers cross-bot messages**: Confirmed via [Slack message.channels event docs](https://docs.slack.dev/reference/events/message.channels/) and [Slack Events API docs](https://docs.slack.dev/apis/events-api/). All apps subscribed to `message.channels` receive events for all messages in channels they've joined, including messages from other bots. + +2. **OpenClaw multi-account Slack support exists**: Confirmed via [OpenClaw Slack docs](https://docs.openclaw.ai/channels/slack) and [community gist](https://gist.github.com/rafaelquintanilha/9ca5ae6173cd0682026754cfefe26d3f). The `channels.slack.accounts` configuration with per-account `botToken`/`appToken` is documented. + +3. **`allowBots` exists as a per-channel config**: Confirmed in official Slack docs as a per-channel control under `channels.slack.channels.<id>`. + +4. **`requireMention: true` exists**: Confirmed. Official docs say "Channel messages are mention-gated by default." + +5. **`initialHistoryLimit` exists and defaults to 20**: Confirmed. Official docs: "controls how many existing thread messages are fetched when a new thread session starts (default 20; set 0 to disable)." + +6. **`thread.historyScope` defaults to `"thread"`**: Confirmed in official docs. + +7. **`thread.inheritParent` defaults to `false`**: Confirmed in official docs. + +**What is UNCERTAIN (MEDIUM confidence)**: + +8. **Self-loop filter is per-bot-user-ID on Slack**: The report states this with HIGH confidence, but the evidence is indirect. Issue [#15836](https://github.com/openclaw/openclaw/issues/15836) shows the Slack filter code as `if (message.user === botUserId)`, which IS per-bot-user-ID. However, this issue was about single-bot mode. In multi-account mode, each account has its own `botUserId` fetched via `client.fetchUser("@me")`, so the filter SHOULD be per-account. 
But there is NO confirmed end-to-end test of multi-account Slack bot-to-bot communication in the evidence. The gist author's "verification checklist" describes human-mediated `@mention` workflows, not autonomous bot-to-bot. A user in the gist comments reported duplicate reply issues that required "deleted all slack apps and reset configs" to resolve. + +9. **`allowBots: "mentions"` as a Slack value**: The report (Slack research, Section 1.2) lists three modes for `allowBots`: `false`, `true`, and `"mentions"`. The `"mentions"` value is confirmed for **Discord** via [OpenClaw configuration reference](https://github.com/openclaw/openclaw/blob/main/docs/gateway/configuration-reference.md): "use `allowBots: "mentions"` to only accept bot messages that mention the bot." However, the Slack docs do NOT document `allowBots: "mentions"` -- only `allowBots` appears as a listed per-channel control without value specification. The report itself rates this as MEDIUM confidence and notes it was found via "DeepWiki source-level documentation" -- but it may only be a Discord feature. + +**What is UNVERIFIED**: + +10. **End-to-end Slack multi-account cross-bot messaging**: No first-party evidence of a successful test where Bot-A posts a message, Bot-B's OpenClaw handler receives it via `allowBots: true`, and Bot-B responds. The community gist describes the setup pattern but does not demonstrate confirmed autonomous bot-to-bot message delivery. Issue #15836 (Slack agent-to-agent routing) was closed as NOT_PLANNED, and the two fix PRs (#15863, #15946) were CLOSED without merging. + +### Issues + +- **RISK**: The report presents Slack multi-account cross-bot communication as "technically feasible today" and "NOW" achievable, but no one has demonstrated it working end-to-end. The architecture report builds five collaboration patterns on this assumption. +- **INACCURACY**: `allowBots: "mentions"` may not exist for Slack. 
The report should use `allowBots: true` + `requireMention: true` as the recommended Slack config (which is what the config snippets actually show). +- **MISSING CAVEAT**: Issue #15836 was closed NOT_PLANNED, suggesting the OpenClaw maintainers may consider `sessions_send` the canonical A2A mechanism, with channel messages reserved for human-agent interaction only. + +### Implementation Recommendation + +Before implementing multi-account Discussion mode, the team MUST run a proof-of-concept test: +1. Create 2 Slack apps with separate tokens +2. Configure OpenClaw multi-account with `allowBots: true` + `requireMention: true` +3. Have Bot-A post a message @mentioning Bot-B in a shared channel +4. Verify Bot-B's OpenClaw session receives and responds to the message + +--- + +## Claim 2: Feishu bot message invisibility + +**Report says**: `im.message.receive_v1` only fires for user messages. Bot messages are invisible to other bots. This is a Feishu platform limitation. + +### Verified: YES + +### Evidence + +**Feishu official documentation** at [open.feishu.cn/document/server-docs/im-v1/message/events/receive](https://open.feishu.cn/document/server-docs/im-v1/message/events/receive) explicitly states: + +1. `sender_type` field: "目前只支持用户(user)发送的消息" -- **"Currently only supports messages sent by users"** + +2. Group chat behavior: "可接收与机器人所在群聊会话中用户发送的所有消息(不包含机器人发送的消息)" -- **"Can receive all messages sent by users in group chats where the bot participates, excluding messages sent by the bot"** + +This confirms the report's claim with direct official documentation. The limitation is at the Feishu platform level and cannot be worked around by any OpenClaw configuration. + +### Issues + +None. The report accurately characterizes this limitation and correctly concludes that `sessions_send` remains necessary for Feishu cross-agent triggering. 
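The practical consequence of Claims 1 and 2 is the degradation rule in Section 6.3 of the protocol: use Discussion where cross-bot messages are visible, and fall back to a Delegation chain elsewhere. A minimal sketch of that rule follows. The `Platform`, `Mode`, `STATUS`, and `pickMode` names are hypothetical, and the Slack entry encodes an assumption that still needs the proof of concept recommended above.

```typescript
// Hypothetical names for illustration; not part of OpenClaw.
type Platform = "slack" | "discord" | "feishu";
type Mode = "discussion" | "delegation-chain";

interface PlatformStatus {
  crossBotVisible: boolean; // can one bot's handler see another bot's message?
  reason: string;
}

// Status as described in this patch. Slack is an assumption pending the POC.
const STATUS: Record<Platform, PlatformStatus> = {
  slack: { crossBotVisible: true, reason: "assumed; pending POC verification" },
  discord: { crossBotVisible: false, reason: "blocked by OpenClaw issue #11199" },
  feishu: {
    crossBotVisible: false,
    reason: "im.message.receive_v1 delivers user-sent messages only",
  },
};

// Discussion where cross-bot messages are visible, otherwise a chain of
// sessions_send calls with replies posted to a shared thread.
function pickMode(platform: Platform): Mode {
  return STATUS[platform].crossBotVisible ? "discussion" : "delegation-chain";
}
```

Keeping the reason next to each flag makes it easy to flip an entry when a blocker is resolved, for example if the Discord issues are fixed.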
+ +--- + +## Claim 3: Discord OpenClaw Issue #11199 blocks cross-bot messaging + +**Report says**: OpenClaw's bot filter treats ALL configured bots as "self." Related fix PRs: #11644, #22611, #35479. + +### Verified: PARTIALLY -- Status is STALE, not actively blocked + +### Evidence + +**Issue #11199 confirmed**: The [issue](https://github.com/openclaw/openclaw/issues/11199) exists and accurately describes the problem. The bug report includes detailed reproduction steps and code analysis showing the mention detection failure. + +**However, the report's characterization is incomplete**: + +1. **Issue was auto-closed on 2026-03-08** due to inactivity (stale bot), NOT because it was fixed. The closure message: "Closing due to inactivity. If this is still an issue, please retry on the latest OpenClaw release." + +2. **All three fix PRs were CLOSED without merging**: + - PR #11644 ("fix: bypass bot filter and mention gate for sibling Discord bots") -- CLOSED, not merged + - PR #22611 ("fix(discord): allow messages from other instance bots in multi-account setups") -- CLOSED, not merged + - PR #35479 ("fix(discord): add allowBotIds config to selectively allow bot messages") -- CLOSED, not merged + +3. **A community workaround exists**: A [comment by @garibong-labs](https://github.com/openclaw/openclaw/issues/11199#issuecomment-3904716720) provides a working config using `allowBots: true` + `requireMention: false` + per-channel `users` whitelist. However, this requires disabling mention gating entirely. + +4. **Additional blocker**: Issue [#45300](https://github.com/openclaw/openclaw/issues/45300) -- `requireMention: true` is broken in multi-account Discord config (still OPEN). This means even if #11199 were fixed, mention-gated bot-to-bot communication still would not work. + +### Issues + +- **STALE DATA**: The report says "status of these PRs could not be confirmed." I can confirm: all three PRs are CLOSED without merge. 
The issue itself is closed-as-stale, not closed-as-fixed. +- **WORKAROUND NOT MENTIONED**: The `requireMention: false` + `users` whitelist workaround exists and is confirmed working by multiple users, but the report does not mention it. +- **DOUBLE BLOCKER**: Even if #11199 is reopened and fixed, #45300 (`requireMention` broken in multi-account) is an independent blocker for the recommended `allowBots: true` + `requireMention: true` pattern. + +### Implementation Recommendation + +Discord Discussion mode should be classified as **BLOCKED (two independent issues)** rather than "BLOCKED (one issue)." The architecture report should document the `requireMention: false` workaround as an interim option with appropriate warnings about loop risk. + +--- + +## Claim 4: groupSessionScope: "group_topic" creates per-topic sessions + +### Verified: YES (previously verified in qa_docs_official_r1.md) + +The previous QA round confirmed this against OpenClaw v2026.3.1 release notes. The version requirement was corrected from "2026.2" to "2026.3.1." + +No contradicting information found in this verification round. + +--- + +## Claim 5: A2A v2 Protocol design is backward compatible + +**Report says**: Delegation (v1) + Discussion (v2) coexist. Single-bot users are unaffected. The `allowBots` + `requireMention` config on shared channels does not conflict with existing single-bot config. + +### Verified: YES + +### Evidence + +1. **Current A2A_PROTOCOL.md** (`shared/A2A_PROTOCOL.md`) uses only Delegation mode: anchor message + `sessions_send`. It explicitly notes: "Slack 中所有 Agent 共用同一个 bot 身份" and "bot 自己发到别的频道的消息,默认不会触发对方 Agent 自动运行." + +2. **v2 protocol design** adds Discussion mode as a NEW capability alongside Delegation. 
Key design decisions that preserve compatibility: + - Agent-dedicated channels (e.g., #hq, #cto, #build) keep `allowBots: false` and `requireMention: false` -- identical to current behavior + - Only the NEW `#collab` channel uses `allowBots: true` + `requireMention: true` + - All Delegation workflows (two-step anchor + `sessions_send`) remain unchanged + - Multi-account is additive: the `default` account can map to the existing single bot + +3. **No config conflicts**: The v2 config snippets show dedicated channels retaining their current settings. The `accounts` block is an extension of the existing `channels.slack` structure, not a replacement. + +4. **Permission matrix unchanged**: v2 adds "Discussion 中的 @mention 也必须遵守权限矩阵" as a supplement, not a modification. + +### Issues + +- **Minor**: The v2 protocol references `maxDiscussionTurns` as an AGENTS.md-level instruction, not a system config key. This should be clearly documented as a convention, not a config parameter, to avoid confusion. +- **Minor**: The v2 `thread.inheritParent: true` recommendation for shared channels differs from the current default of `false`. Implementers should be warned this changes behavior for ALL threads in configured channels, not just Discussion threads. + +--- + +## Claim 6: Collaboration patterns are mechanically feasible + +**Report says**: 5 patterns (Architecture Review, Strategic Alignment, Code Review, Incident Response, Knowledge Synthesis) work via @mention -> thread history loading -> response. + +### Verified: PARTIALLY + +### What is mechanically sound + +1. **@mention triggers agent response**: With `requireMention: true`, a bot only processes messages where it is @mentioned. This is confirmed behavior in OpenClaw for both Slack and Discord. + +2. **Thread history loading via `initialHistoryLimit`**: Confirmed. When an agent starts a new session in a thread, it loads the last N messages (default 20, configurable). 
This is the mechanism by which Agent-B would "see" Agent-A's earlier messages. + +3. **`initialHistoryLimit` exists and is configurable**: Confirmed at `channels.slack.thread.initialHistoryLimit`. The architecture report's recommendations of 50-100 depending on pattern are reasonable. + +4. **Visual identity**: With multi-account, each bot app has its own profile (name, avatar). Issue [#27080](https://github.com/openclaw/openclaw/issues/27080) (Slack agent identity fix) is CLOSED (resolved on 2026-03-01). + +5. **Turn management via @mention**: The Level 1 (human-orchestrated) pattern is mechanically straightforward -- human @mentions agents in sequence, each agent loads thread history and responds. + +### What is uncertain + +6. **Agent generating @mentions**: Level 2 (agent-orchestrated) requires the orchestrator agent to produce `<@BOT_USER_ID>` in its messages. The report acknowledges this as an open question (#2 in Slack Open Questions): "When Agent-CTO posts '@Builder what do you think?', does OpenClaw's Slack plugin reliably detect this as a mention?" This is unverified. + +7. **Bot-to-bot message delivery** (the foundation of all patterns): As noted in Claim 1, there is no confirmed end-to-end test of multi-account Slack bot-to-bot message delivery via `allowBots: true`. All five patterns depend on this working. + +8. **Thread history completeness**: With `initialHistoryLimit: 50`, an agent joining a long discussion late will only see the last 50 messages. The architecture report addresses this with "context anchor" and "orchestrator summary" strategies, which is sound design but adds implementation complexity. + +### Issues + +- **CRITICAL DEPENDENCY**: All 5 patterns depend on Claim 1 (Slack multi-account cross-bot messaging) being true. If bot-to-bot message delivery fails in practice, all Discussion-mode patterns are blocked. +- **Level 2 feasibility uncertain**: Agent-generated @mentions have not been validated. 
If agents produce `@CTO` as plain text rather than `<@U123BOT>` in Slack's mention format, the mention will not be detected. +- **Pattern descriptions are thorough and well-designed**: The step-by-step mechanics, guard rails (maxDiscussionTurns, context anchors, escalation), and degradation strategies (Feishu/Discord fallback to Delegation chains) are architecturally sound regardless of implementation verification. + +--- + +## Implementation Recommendations + +### Must-Do Before Implementation + +1. **RUN A PROOF OF CONCEPT**: Set up 2 Slack apps, configure multi-account OpenClaw with `allowBots: true` + `requireMention: true`, and verify bot-to-bot message delivery end-to-end. This is the single most important validation step. Everything else depends on this. + +2. **Verify `allowBots: "mentions"` for Slack**: The config reference confirms this value exists for Discord. Confirm whether it also works for Slack, or if the Slack implementation only accepts `true`/`false`. If Slack does not support `"mentions"`, remove all references to it from Slack-specific documentation. + +3. **Test agent-generated @mentions**: Have an agent produce a message containing `<@BOT_USER_ID>` and verify the receiving bot's OpenClaw instance recognizes it as a mention. + +### Implementation Cautions + +4. **Discord has TWO blockers, not one**: Issue #11199 (closed-stale, unfixed) AND Issue #45300 (`requireMention` broken in multi-account). Both must be resolved for Discussion mode on Discord. + +5. **Feishu SecretRef crash (Issue #47436)**: Still OPEN. PR #47652 (fix) is OPEN but not merged. Use plaintext secrets for multi-account Feishu until this is resolved. + +6. **`thread.inheritParent: true`**: The v2 config changes this from the default `false`. This affects all threads in configured channels, not just Discussion threads. Test for regressions in existing Delegation workflows. + +7. 
**Issue #15836 closure**: The OpenClaw maintainers closed the Slack agent-to-agent routing issue as NOT_PLANNED, which may signal that channel-based A2A is not an officially supported pattern. The `sessions_send` approach should remain the primary A2A mechanism, with Discussion mode as an enhancement for Slack-capable deployments. + +--- + +## Blocking Issues + +1. **NO CONFIRMED END-TO-END TEST** of Slack multi-account bot-to-bot communication via `allowBots: true`. The entire Discussion mode architecture rests on this assumption. A proof-of-concept MUST succeed before any implementation work begins. + +2. **Discord Discussion mode has two independent blockers** (Issues #11199 and #45300), both unresolved. Implementation for Discord should be deferred. + +--- + +*Report generated: 2026-03-27* +*Verification method: WebSearch, WebFetch of official documentation (docs.openclaw.ai, open.feishu.cn, docs.slack.dev), GitHub CLI (gh issue view, gh pr view), OpenCrew repository inspection* + +Sources: +- [OpenClaw Slack Documentation](https://docs.openclaw.ai/channels/slack) +- [OpenClaw Multi-Agent Routing](https://docs.openclaw.ai/concepts/multi-agent) +- [OpenClaw Configuration Reference](https://github.com/openclaw/openclaw/blob/main/docs/gateway/configuration-reference.md) +- [Feishu Receive Message Event](https://open.feishu.cn/document/server-docs/im-v1/message/events/receive) +- [Slack Events API](https://docs.slack.dev/apis/events-api/) +- [Slack message.channels Event](https://docs.slack.dev/reference/events/message.channels/) +- [Issue #11199: Discord bot-to-bot filter](https://github.com/openclaw/openclaw/issues/11199) +- [Issue #15836: Slack agent-to-agent routing](https://github.com/openclaw/openclaw/issues/15836) +- [Issue #45300: requireMention broken in multi-account Discord](https://github.com/openclaw/openclaw/issues/45300) +- [Issue #27080: Slack agent identity fix](https://github.com/openclaw/openclaw/issues/27080) +- [Issue #47436: Feishu SecretRef 
crash](https://github.com/openclaw/openclaw/issues/47436) +- [Community Gist: Multi-Agent Slack Setup](https://gist.github.com/rafaelquintanilha/9ca5ae6173cd0682026754cfefe26d3f) diff --git a/.harness/reports/qa_docs_official_r1.md b/.harness/reports/qa_docs_official_r1.md new file mode 100644 index 0000000..8dc2f54 --- /dev/null +++ b/.harness/reports/qa_docs_official_r1.md @@ -0,0 +1,217 @@ +# QA Report: Documentation Review Against Official Sources + +## Overall Verdict: NEEDS-WORK + +Three factual inaccuracies were found that require correction. The documentation is generally well-written and clear, but the version claims and one PR reference are verifiably wrong. + +--- + +## 1. Discord Permission Isolation + +### Accuracy vs official docs: PASS (with minor note) + +The permission setup flow described in Step 5b (Create role -> server-level deny Send Messages -> per-channel Allow) is **correct** and matches Discord's documented permission hierarchy. + +According to Discord's official developer documentation at [docs.discord.com/developers/topics/permissions](https://docs.discord.com/developers/topics/permissions): + +1. Server-level role permissions provide the base. +2. Channel-level overrides (green check / red X / gray slash) override the server-level defaults. +3. If a role lacks "Send Messages" at the server level (not granted), and a channel override sets it to Allow (green checkmark), the bot **will** be able to send in that specific channel. 
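The three-step hierarchy above can be illustrated with a small sketch. This is a simplified model of the overwrite resolution order described in Discord's developer docs, not code from OpenCrew or OpenClaw. The function name and the `ALL_PERMISSIONS` mask are assumptions made for illustration, and member-specific overwrites are left out for brevity. The bit values for `ADMINISTRATOR`, `VIEW_CHANNEL`, and `SEND_MESSAGES` follow Discord's published permission flags.

```python
# Simplified sketch of Discord's permission resolution order (illustrative only).
ADMINISTRATOR = 1 << 3
VIEW_CHANNEL = 1 << 10
SEND_MESSAGES = 1 << 11
ALL_PERMISSIONS = (1 << 53) - 1  # assumption: a stand-in "everything" mask for this sketch


def effective_permissions(base, everyone_overwrite, role_overwrites):
    """Resolve channel-level permissions for a bot.

    base: OR of the server-level permissions granted by the bot's roles.
    everyone_overwrite: (allow, deny) bitmasks for @everyone in this channel.
    role_overwrites: list of (allow, deny) bitmasks for the bot's roles in this channel.
    """
    # Administrator short-circuits every channel overwrite.
    if base & ADMINISTRATOR:
        return ALL_PERMISSIONS

    perms = base

    # 1. Apply the @everyone overwrite: deny first, then allow.
    allow, deny = everyone_overwrite
    perms = (perms & ~deny) | allow

    # 2. Aggregate all role overwrites, then apply deny before allow.
    role_allow = 0
    role_deny = 0
    for allow, deny in role_overwrites:
        role_allow |= allow
        role_deny |= deny
    perms = (perms & ~role_deny) | role_allow

    return perms


# Server-level role grants View Channels only; one channel allows Send Messages.
base = VIEW_CHANNEL
in_other_channel = effective_permissions(base, (0, 0), [])
in_agent_channel = effective_permissions(base, (0, 0), [(SEND_MESSAGES, 0)])
print(bool(in_other_channel & SEND_MESSAGES), bool(in_agent_channel & SEND_MESSAGES))
```

With no server-level Send Messages grant, the bot can post only where a channel overwrite allows it, which is the isolation Step 5b relies on. If the role carried Administrator, the early return would ignore every overwrite.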
+ +### Permission names correct: PASS + +All permission names used in the docs match Discord's official names: +- "Send Messages" -- correct (API: `SEND_MESSAGES`) +- "View Channels" -- correct (API: `VIEW_CHANNEL`, UI displays as "View Channels") +- "Read Message History" -- correct (API: `READ_MESSAGE_HISTORY`) +- "Send Messages in Threads" -- correct (API: `SEND_MESSAGES_IN_THREADS`) +- "Create Public Threads" -- correct (API: `CREATE_PUBLIC_THREADS`) +- "Create Private Threads" -- correct (API: `CREATE_PRIVATE_THREADS`) +- "Manage Threads" -- correct (API: `MANAGE_THREADS`) +- "Add Reactions" -- correct (API: `ADD_REACTIONS`) +- "Mention Everyone" -- correct (API: `MENTION_EVERYONE`) +- "Manage Channels" -- correct (API: `MANAGE_CHANNELS`) + +### Permission hierarchy correct: PASS + +The statement "Do NOT grant Administrator -- Administrator bypasses all channel overrides" is **factually correct**. Discord's official dev docs state: "ADMINISTRATOR overrides any potential permission overwrites, so there is nothing to do here." The Administrator permission short-circuits all channel override calculations. + +### Role-per-bot approach: PASS + +The multi-bot setup suggesting a separate role per bot with per-channel Send Messages overrides is a valid and recommended pattern. + +### Minor note on "View Channel" vs "View Channels" + +Line 128 of the EN Discord doc uses "View Channel" (singular) while lines 60 and 143 use "View Channels" (plural). The API flag is `VIEW_CHANNEL` (singular), but the Discord UI shows "View Channels" (plural). Both are understood, but consistency within the doc would be better. + +### 50-bot limit: PASS + +The claim "Discord servers allow up to 50 bots" is correct. Discord imposes a 50-bot limit on servers. + +### 100-server vs 75-server threshold: MINOR INACCURACY + +The EN doc (line 45) states "Bots in fewer than 100 servers do not need review."
The official threshold is actually 75 servers -- the application form for Privileged Intents appears once a bot reaches 75 servers. The actual enforcement kicks in at 100 servers. The CN doc says the same (in translation: "bots in fewer than 100 servers do not need review"). This is a simplification that could mislead users with bots approaching 75 servers. + +The Multi-Bot section (line 209 EN) correctly says "Bots in more than 75 servers require a separate Message Content Intent approval" -- this contradicts the earlier 100-server claim at line 45. + +--- + +## 2. Feishu groupSessionScope + +### Config key verified: PASS + +`groupSessionScope` is a valid configuration key. The OpenClaw v2026.3.1 release notes confirm: "add configurable group session scopes (`group`, `group_sender`, `group_topic`, `group_topic_sender`)". The value `"group_topic"` is confirmed as one of the four valid options. + +### Version requirement verified: FAIL -- INCORRECT VERSION + +**Both EN and CN docs claim `groupSessionScope` requires "OpenClaw >= 2026.2". This is wrong.** + +Evidence: +- OpenClaw v2026.3.1 release notes (https://github.com/openclaw/openclaw/releases/tag/v2026.3.1) explicitly list "Feishu/Group session routing: add configurable group session scopes" as a new feature. +- The feature is also referenced in Issue #29791 (opened Feb 28, 2026, closed via PR #29788, merged March 2, 2026) -- well after the 2026.2 release. +- The official Feishu channel docs at docs.openclaw.ai/channels/feishu do NOT mention `groupSessionScope` by name, suggesting it may be documented under a different naming convention or is very recent. 
+ +**Correct version**: OpenClaw >= 2026.3.1 + +This error appears in 4 files: +- `docs/en/FEISHU_SETUP.md` line 25: heading says "OpenClaw >= 2026.2" +- `docs/FEISHU_SETUP.md` line 25: heading says "OpenClaw >= 2026.2" +- `docs/en/KNOWN_ISSUES.md` lines 74 and 82: says "OpenClaw >= 2026.2" +- `docs/KNOWN_ISSUES.md` lines 67 and 75: says "OpenClaw >= 2026.2" + +### YAML config example syntax: PASS + +The YAML example is syntactically correct. The structure with `channels.feishu.groupSessionScope: "group_topic"` alongside `domain`, `connectionMode`, `appId`, `appSecret` is properly formatted. + +### SecretRef bug (Issue #47436): PASS + +- The issue exists and is OPEN: "[Bug] Feishu multi-account (accounts.*) appSecret SecretRef fails to resolve, crashes feishu plugin after ~3 minutes" +- It confirms that SecretRef in multi-account Feishu mode causes the plugin to crash after ~3 minutes, taking down the primary bot as well. +- The workaround described in the docs ("use plaintext secrets" or "restart the gateway twice") is a reasonable approximation, though the issue itself does not explicitly state "restart twice" as a workaround -- a fix PR (#47652) has been submitted with per-account error isolation. +- The docs' description of the bug is substantively accurate. + +### Issue #10242 reference: PARTIALLY INACCURATE + +The docs reference Issue #10242 as evidence of the group chat thread isolation limitation. However, Issue #10242 is actually titled "[Feature Request] Restore 'New Thread' capability for Feishu (Lark) Channel in **DMs**" -- it is about DM thread capability, not group chat session isolation. While the broader point about Feishu lacking thread isolation is valid, this specific issue is not the right citation. Issue #29791 ("[Feature]: Support thread-based replies in Feishu plugin") would be a more accurate reference for the group chat topic. + +--- + +## 3. 
Deployment Order + +### Logical correctness: PASS + +The 9-step deployment order is logically sound: +1. Create bot/app on platform +2. Configure permissions/intents +3. Invite bot to server/workspace/groups +4. Create agent channels/groups +5. Connect platform to OpenClaw (`openclaw channels add`) +6. Deploy OpenCrew files (shared protocols + workspaces) +7. Write OpenClaw config (agent bindings, channel IDs) +8. Restart gateway +9. Verify + +This order correctly places platform-side setup (steps 1-4) before OpenClaw-side configuration (steps 5-8), which is necessary because `openclaw channels add` requires bot tokens that only exist after step 1. + +### Matches OpenClaw workflow: PASS + +You cannot run `openclaw channels add` without a bot token (Discord) or app credentials (Feishu), so the platform setup must come first. The deployment order correctly captures this dependency. + +### Common Mistakes section: PASS + +All 5 common mistakes are realistic and would genuinely be encountered by new users: +1. Wrong deployment order -- logical error new users would make +2. Skipping channel permission isolation -- addresses Issue #34 +3. Forgetting to restart gateway -- a standard "gotcha" +4. Channel ID mismatch -- common copy-paste error +5. Bot not invited to channels -- frequently missed step + +### PR #3672 reference: FAIL -- PR WAS NOT MERGED + +Both EN and CN Discord docs (line 210/211) state: "OpenClaw multi-account support was introduced in [PR #3672](https://github.com/openclaw/openclaw/pull/3672) (merged January 2026)." + +**This is factually incorrect.** PR #3672 was **CLOSED without merging** on 2026-02-01 (`mergedAt: null`, `mergeCommit: null`, `state: CLOSED`). The PR itself references `moltbot/moltbot` in its "Fixes" line, suggesting it predates a repo rename. Multi-account Discord support does exist in current OpenClaw versions (as evidenced by multiple issues referencing it), but it was NOT delivered via PR #3672. 
The correct PR that shipped this feature needs to be identified, or the reference should be removed. + +--- + +## 4. Known Issues Update + +### Factual accuracy: PASS (with version caveat) + +The Feishu P1 entry accurately describes: +- The symptom: "all conversations within a single group are flat -- there is no thread-level session isolation" +- The resolution: `groupSessionScope: "group_topic"` enabling per-topic session isolation +- The distinction between built-in and community plugins + +However, the version requirement "OpenClaw >= 2026.2" is incorrect (should be >= 2026.3.1), as noted in Section 2. + +### Link integrity: PASS (with style note) + +- CN KNOWN_ISSUES.md links to `FEISHU_SETUP.md#更新groupsessionscopeopenclaw--20262` -- this anchor matches the CN heading and will resolve correctly on GitHub. +- EN KNOWN_ISSUES.md links to `../en/FEISHU_SETUP.md#update-groupsessionscope-openclaw--20262` -- this anchor matches the EN heading and will resolve correctly. However, since both files are in `docs/en/`, the `../en/` prefix is unnecessarily verbose. `FEISHU_SETUP.md#update-groupsessionscope-openclaw--20262` would be cleaner and less fragile. + +--- + +## 5. Cross-file Consistency + +### EN/CN match: PASS + +- Discord permission names are consistent across EN and CN. The EN version uses the English Discord permission names ("Send Messages", "View Channels", etc.), while the CN version uses the same English names in the tables (appropriate since Discord's UI is in English) with Chinese descriptions. +- Feishu config keys (`groupSessionScope`, `appId`, `appSecret`, `connectionMode`, `accounts`) are identical in both EN and CN versions. +- The YAML config examples are structurally identical between EN and CN, differing only in placeholder values for appSecret (EN: "your_app_secret", CN: "你的AppSecret"). + +### Org name consistency: PASS + +No references to the wrong org name `open-claw/open-claw` were found anywhere in the docs directory. 
All references use the correct `openclaw/openclaw` format. + +### Broken links: NO BROKEN LINKS DETECTED + +All internal markdown links have matching anchors in the target files. External links to GitHub issues (#10242, #47436, #3306) and PRs (#3672) are valid URLs (though #3672 is closed/not-merged and #10242 is not the ideal reference, the links themselves work). + +--- + +## 6. Beginner Readability + +### Clarity assessment: GOOD + +The documentation is well-structured and would be followable by a beginner: +- Step numbering is clear and sequential +- Each step includes specific UI navigation paths (e.g., "Server Settings -> Roles -> Create Role") +- Warning/important callouts are used appropriately to flag critical steps +- Time estimates are helpful for setting expectations +- The "What you will have when you are done" sections provide clear success criteria + +### Missing steps: MINOR GAP + +In the Discord Step 5b, the docs say "Do NOT grant Send Messages at the server level" but do not explicitly say what to do if the bot already has Send Messages from the invite (Step 3). Since Step 3's permission table includes "Send Messages", a beginner might be confused about why they granted it in Step 3 only to revoke it in Step 5b. A clarifying note such as "The permissions you selected in Step 3 set the OAuth2 invite scope, but the role-based permissions in Step 5b take precedence for channel-level control" would help. + +### Confusing sections: MINOR + +The Feishu docs mention both "OpenClaw >= 2026.2" and reference features that actually require 2026.3.1. A beginner running OpenClaw 2026.2.x would follow these instructions, find that `groupSessionScope` does not work, and be stuck without understanding why. + +--- + +## Must-Fix Issues + +1. **INCORRECT VERSION**: Change `groupSessionScope` version requirement from "OpenClaw >= 2026.2" to "OpenClaw >= 2026.3.1" in all 4 files (EN/CN FEISHU_SETUP.md and EN/CN KNOWN_ISSUES.md). 
The feature was introduced in the v2026.3.1 release, as confirmed by the official release notes. + +2. **PR #3672 NOT MERGED**: Remove or correct the claim that "PR #3672 (merged January 2026)" introduced multi-account Discord support. PR #3672 was closed without merging (`mergedAt: null`). Either find the correct PR that shipped this feature, or remove the PR reference and simply state that multi-account support is available in current OpenClaw versions. + +3. **ISSUE #10242 MISCHARACTERIZED**: Issue #10242 is about DM thread capability, not group chat session isolation. Consider replacing the reference with Issue #29791 ("[Feature]: Support thread-based replies in Feishu plugin") which more accurately describes the group chat thread isolation problem, or add both references with proper context. + +## Recommendations + +1. **Reconcile 100-server vs 75-server threshold**: Line 45 of both EN/CN Discord docs says "fewer than 100 servers" while the Multi-Bot section correctly says "more than 75 servers." Discord's actual threshold for Privileged Intent application is 75 servers. Recommend updating line 45 to say "fewer than 75 servers" for consistency. + +2. **Normalize "View Channel" vs "View Channels"**: Line 128 of EN DISCORD_SETUP.md uses "View Channel" (singular) while lines 60 and 143 use "View Channels" (plural). Pick one and be consistent (recommend "View Channels" to match the Discord UI). + +3. **Simplify EN KNOWN_ISSUES link path**: Change `../en/FEISHU_SETUP.md#update-groupsessionscope-openclaw--20262` to `FEISHU_SETUP.md#update-groupsessionscope-openclaw--20262` since both files are in the same `docs/en/` directory. + +4. **Add clarifying note between Step 3 and Step 5b in Discord docs**: Explain the relationship between OAuth2 invite permissions (Step 3) and role-based channel permissions (Step 5b) to avoid beginner confusion. + +5. **SecretRef workaround accuracy**: The "restart the gateway twice" workaround is not explicitly documented in Issue #47436. 
Consider softening to "restart the gateway after credential changes; a fix is tracked in Issue #47436" or referencing PR #47652 which implements per-account error isolation. + +--- + +*Report generated: 2026-03-27* +*Verification method: Web search, official Discord developer docs, OpenClaw GitHub releases/issues/PRs via gh CLI, DeepWiki* diff --git a/.harness/reports/research_autonomous_slack_r1.md b/.harness/reports/research_autonomous_slack_r1.md new file mode 100644 index 0000000..bb722fc --- /dev/null +++ b/.harness/reports/research_autonomous_slack_r1.md @@ -0,0 +1,567 @@ +# Research: Autonomous Multi-Bot Slack Collaboration + +> Researcher: Claude Opus 4.6 | Date: 2026-03-27 | Scope: Can multiple independent Slack bots autonomously drive a multi-round discussion without human intervention at every step? + +--- + +## Executive Summary + +**Yes, this is technically feasible -- but with significant caveats.** + +Multiple independent Slack Apps (each running as a separate bot with its own token, app token, and bot user ID), all present in the same Slack channel/thread, can autonomously drive a multi-round discussion without human intervention at every step. The architecture requires: + +1. **OpenClaw multi-account mode** (`channels.slack.accounts`) -- one Slack App per participating agent, all managed by a single OpenClaw gateway instance. +2. **`allowBots: true` + `requireMention: true`** on shared channels -- this allows bots to see each other's messages while preventing uncontrolled loops. +3. **An orchestrator agent (e.g., CTO)** whose AGENTS.md/SOUL.md instructs it to drive discussions by @mentioning other agents using Slack's `<@BOT_USER_ID>` format. +4. **A soft turn limit** enforced by agent instructions (not system-level enforcement for Discussion mode). + +**Critical finding**: OpenClaw's self-loop filter on Slack is **per-bot-user-ID**, not global. 
In multi-account mode, Bot-CTO's messages are NOT filtered by Bot-Builder's handler because they have different bot user IDs. This is the key enabler. However, this behavior was the subject of bug fixes (Issue #15836, fixed via PRs #15863/#15946 with session origin tracking), confirming that the current codebase intentionally supports inter-agent message routing. + +**Confidence level**: MEDIUM-HIGH for Architecture A (single gateway, multi-account). The primitives all exist and are documented. No end-to-end production validation of fully autonomous (zero human intervention) multi-round discussions has been publicly reported. + +--- + +## 1. Architecture Comparison + +### 1a. Single Gateway Multi-Account (Architecture A) -- RECOMMENDED + +**How it works**: One OpenClaw gateway instance manages multiple Slack Apps via `channels.slack.accounts`. Each agent is bound to its specific account via `bindings[].match.accountId`. + +```json +{ + "channels": { + "slack": { + "accounts": { + "default": { "botToken": "xoxb-cos-...", "appToken": "xapp-cos-..." }, + "cto": { "botToken": "xoxb-cto-...", "appToken": "xapp-cto-..." }, + "builder": { "botToken": "xoxb-bld-...", "appToken": "xapp-bld-..." } + }, + "channels": { + "<COLLAB_CHANNEL_ID>": { + "allow": true, + "requireMention": true, + "allowBots": true + } + }, + "thread": { + "historyScope": "thread", + "inheritParent": true, + "initialHistoryLimit": 50 + } + } + }, + "bindings": [ + { "agentId": "cos", "match": { "channel": "slack", "accountId": "default" } }, + { "agentId": "cto", "match": { "channel": "slack", "accountId": "cto" } }, + { "agentId": "builder", "match": { "channel": "slack", "accountId": "builder" } } + ] +} +``` + +**When Bot-CTO posts in a shared channel, does Bot-Builder's agent receive it?** + +Yes, under these conditions: +- `allowBots: true` on the channel -- without this, all bot messages are ignored. 
+- The message must @mention Bot-Builder (`<@BUILDER_BOT_USER_ID>`) if `requireMention: true` is set. +- OpenClaw's self-loop filter only blocks messages from the **same** bot user ID. Since Bot-CTO and Bot-Builder have different user IDs, Bot-CTO's messages pass through Bot-Builder's filter. + +**What does `allowBots: true` actually do?** + +At the code level, `allowBots` controls whether the OpenClaw Slack plugin processes messages with the `bot_message` subtype (or messages where `message.bot_id` is set). Three behaviors: + +| Value | Behavior | +|-------|----------| +| `false` (default) | All bot-authored messages are dropped before reaching agent routing. This is the current OpenCrew default. | +| `true` | All bot-authored messages are accepted as inbound. Combined with `requireMention: true`, only messages that @mention the receiving bot are processed. | +| `"mentions"` | Bot messages are accepted **only** if they contain an @mention of the receiving bot. This is functionally equivalent to `true` + `requireMention: true` but applies specifically to bot messages. (Confidence: MEDIUM -- referenced in source-level analysis and community reports but not prominently featured in official docs.) | + +**Self-loop filter mechanics**: OpenClaw checks `message.user === account.botUserId`. With multi-account, each account has a distinct `botUserId`, so Bot-CTO's messages (`user = U_CTO`) are not filtered by Bot-Builder's handler (which only filters `user = U_BUILDER`). This was confirmed by Issue #15836's fix (PRs #15863/#15946), which added session origin tracking to refine the filtering -- the fix allows routing bot messages to bound sessions EXCEPT the originating session. 
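The interaction between `allowBots`, the self-loop filter, and `requireMention` described above can be sketched as a single decision function. This is a hypothetical model of the behavior this report describes, not OpenClaw's actual implementation. The function name, the message dict shape, and the plain substring check for `<@USER_ID>` are all assumptions made for illustration.

```python
# Hypothetical model of the inbound filter described in this section.
# It is NOT OpenClaw source code; names and message shape are assumptions.

def should_route(message, bot_user_id, allow_bots, require_mention):
    """Decide whether a Slack message reaches the agent bound to bot_user_id.

    message: dict with 'user', 'text', and optionally 'bot_id' / 'subtype'.
    allow_bots: False, True, or the string "mentions".
    require_mention: bool, the channel-level requireMention setting.
    """
    text = message.get("text", "")
    is_bot = bool(message.get("bot_id")) or message.get("subtype") == "bot_message"
    mentioned = f"<@{bot_user_id}>" in text  # assumption: simple text-based detection

    # allowBots gate: applies only to bot-authored messages.
    if is_bot:
        if allow_bots is False:
            return False
        if allow_bots == "mentions" and not mentioned:
            return False

    # Self-loop filter: per bot user ID, so other bots' messages pass.
    if message.get("user") == bot_user_id:
        return False

    # requireMention gate: applies to humans and bots alike.
    if require_mention and not mentioned:
        return False

    return True


cto_post = {"user": "U_CTO", "bot_id": "B_CTO", "text": "<@U_BUILDER> please review feasibility."}
print(should_route(cto_post, "U_BUILDER", True, True))   # Builder's handler
print(should_route(cto_post, "U_CTO", True, True))       # CTO's own handler
```

Under this model, the same post from Bot-CTO is routed by Bot-Builder's handler and dropped by Bot-CTO's own handler, which is the per-bot-user-ID behavior the report treats as the key enabler.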
+ +**Advantages**: +- Single process, single config, centralized management +- All session keys, A2A tools, and routing work within one gateway +- Existing OpenCrew A2A protocol (sessions_send) and Discussion mode (@mention) coexist +- Official OpenClaw documentation and community guides describe this exact pattern + +**Disadvantages**: +- Requires creating 3-7 separate Slack Apps (one per participating agent) +- All agents share one process -- if the gateway crashes, all agents go down +- Socket Mode: 3-7 persistent WebSocket connections from one process (OpenClaw docs recommend HTTP mode with distinct `webhookPath` per account for multi-account setups) + +### 1b. Multiple Independent Instances (Architecture B) + +**How it works**: Each agent runs its own OpenClaw gateway instance (separate process, separate `openclaw.json`). Each connects to Slack with its own App. They share a Slack channel. + +**Is this possible?** Technically yes, but with major complications: + +- Each OpenClaw instance would need its own `openclaw.json` with one agent definition. +- The `allowBots: true` + `requireMention: true` pattern works the same way on each instance -- each instance's self-loop filter only blocks its own bot user ID. +- Slack delivers events to all subscribed apps regardless of which instance runs them. + +**Problems**: +- **No shared A2A infrastructure**: `sessions_send` cannot cross process boundaries. The orchestrator agent cannot use OpenClaw's built-in A2A tools to trigger other agents -- it can only @mention them in Slack and hope the other instance processes the mention. +- **No shared session management**: Each instance tracks sessions independently. There is no coordination, no shared `maxPingPongTurns`, no shared session keys. +- **Operational complexity**: 3-7 separate processes, configs, logs, restarts. +- **No unified routing**: Each instance independently processes all incoming messages from its channel. Binding isolation must be configured per-instance. 
+ +**Verdict**: Architecture B is technically possible but provides no advantage over Architecture A while adding significant complexity. The only real advantage would be complete process isolation (one crash does not affect others), which is a minor benefit compared to the operational cost. + +### 1c. Recommendation + +**Architecture A (single gateway, multi-account) is clearly superior.** It leverages OpenClaw's built-in multi-account routing, keeps all A2A infrastructure unified, and is the pattern documented by OpenClaw's official docs and community guides. All subsequent sections assume Architecture A. + +--- + +## 2. Autonomous Orchestration Mechanics + +### Event Flow + +Here is the complete event flow for an autonomous multi-round discussion: + +``` +SETUP: Human starts a thread in #collab + "Let's discuss the architecture for feature X. @CTO please kick off." + +ROUND 1: CTO responds (triggered by human's @mention) + CTO reads thread history, proposes architecture. + CTO's response includes: "<@BUILDER_BOT_USER_ID> please review feasibility." + + Event flow: + 1. CTO agent produces response text containing <@U_BUILDER> + 2. OpenClaw's Slack plugin posts this as Bot-CTO in the thread + 3. Slack Events API delivers message event to ALL subscribed apps in the channel + 4. Bot-Builder's app receives the message event + 5. OpenClaw checks: is this from a bot? Yes. Is allowBots enabled? Yes. + 6. OpenClaw checks: is this from OUR bot user ID? No (U_CTO != U_BUILDER). Pass. + 7. OpenClaw checks: does this message @mention our bot? Yes (<@U_BUILDER>). Pass. + 8. OpenClaw routes to Builder agent (matched by accountId binding) + 9. Builder agent's session is created/resumed for this thread + +ROUND 2: Builder responds (triggered by CTO's @mention) + Builder reads thread history (sees human's prompt + CTO's proposal). + Builder posts feasibility analysis. + Builder does NOT @mention anyone (it's not an orchestrator). + +ROUND 3: CTO sees Builder's response (how?) 
+ THIS IS THE CRITICAL QUESTION. + + Option A -- CTO is also listening via allowBots: + If CTO's channel config has allowBots: true + requireMention: true, + CTO only activates when explicitly @mentioned. + Builder's response does NOT @mention CTO, so CTO does NOT auto-activate. + --> PROBLEM: CTO cannot autonomously continue the discussion. + + Option B -- CTO uses requireMention: false on the shared channel: + CTO activates on ALL messages in the thread (including Builder's). + --> PROBLEM: Every bot activates on every message --> infinite loop. + + Option C -- CTO is the orchestrator with special config: + CTO's binding for #collab has allowBots: true + requireMention: false. + All OTHER agents have allowBots: true + requireMention: true. + CTO sees all messages. Others only respond when @mentioned. + --> THIS IS THE KEY ARCHITECTURE INSIGHT. + + Option D -- Builder @mentions CTO back: + Builder's AGENTS.md instructs: "After responding, @mention the + orchestrator: <@U_CTO> I've posted my analysis." + --> Works but creates a tight loop. Needs turn counting. +``` + +### The Orchestrator Pattern (Option C -- Recommended) + +The orchestrator agent (CTO) needs a **different configuration** from other agents on the shared channel: + +```json +{ + "channels": { + "slack": { + "channels": { + "<COLLAB_CHANNEL_ID>": { + "allow": true, + "requireMention": true, + "allowBots": true + } + } + } + } +} +``` + +**The problem**: OpenClaw's `channels.slack.channels` config is **global across all agents** in the same gateway instance. You cannot set `requireMention: false` for CTO and `requireMention: true` for Builder on the same channel in the same `openclaw.json`. + +**Workarounds**: + +1. **Option D (Explicit @mention-back)**: Builder's instructions say "always end your response with `<@U_CTO>`". This triggers CTO to read the thread and decide the next step. This is the simplest approach and works within existing config constraints. + +2. 
**Per-account channel overrides**: If OpenClaw supports per-account channel configuration (e.g., `accounts.cto.channels.<ID>.requireMention = false`), this would solve it cleanly. **Status: UNVERIFIED** -- the official docs mention that "named accounts inherit from global config but can override any setting," but whether per-account `channels` overrides are supported at the channel level is not confirmed. + +3. **Dedicated orchestrator channel**: The orchestrator monitors its own dedicated channel (#cto) where `requireMention: false`. Other agents post summaries to #cto after responding in #collab. This fragments the discussion across channels, which is undesirable. + +4. **Hybrid: @mention + sessions_send**: Builder @mentions CTO in the thread AND does a `sessions_send` to CTO's session. This provides both visibility and a reliable trigger. But Builder needs `agentToAgent.allow` permission, which current config restricts. + +### @mention Rendering + +**Can an agent produce `<@BOT_USER_ID>` in its Slack message, and does it render as a proper mention?** + +Yes. When an agent's response text contains `<@U0XXXXX>`, OpenClaw's Slack plugin posts this verbatim to Slack. Slack's rendering engine converts `<@U0XXXXX>` into a clickable @mention with the bot's display name. This is standard Slack message formatting -- there is nothing special about bot-authored messages vs human-authored messages in this regard. + +**Does the mentioned bot receive an event?** + +The `app_mention` event documentation does not explicitly confirm or deny whether bot-authored mentions trigger `app_mention` for the mentioned app. However, OpenClaw's Slack plugin primarily listens on `message.channels` events (not just `app_mention`), which delivers ALL messages in channels the bot has joined, regardless of sender. OpenClaw then applies its own mention-detection logic by parsing the message text for `<@botUserId>` patterns. 
Therefore: + +- Bot-CTO posts `<@U_BUILDER> review this` in #collab thread +- Bot-Builder's app receives the `message.channels` event (Slack delivers all channel messages to all member apps) +- OpenClaw checks `allowBots: true` -- pass +- OpenClaw checks `requireMention: true` -- scans message text for `<@U_BUILDER>` -- found -- pass +- Message is routed to Builder agent + +**Confidence: HIGH** that this works. The `message.channels` subscription is the primary event listener, and OpenClaw's mention detection is text-based parsing of `<@userId>`, not reliance on Slack's `app_mention` event type. + +### Binding/Routing with Multiple Accounts + +With multi-account, routing uses the binding specificity hierarchy: + +1. Peer match (exact channel ID) +2. Account ID match +3. Channel-level match +4. Fallback to default + +When a message arrives on a Slack account (e.g., the "cto" account receives an event), OpenClaw matches it against bindings. The binding `{ "agentId": "cto", "match": { "channel": "slack", "accountId": "cto" } }` routes all messages received by the CTO's Slack app to the CTO agent. + +**Key insight**: In multi-account mode, each Slack app independently receives events. The OpenClaw gateway maintains separate Socket Mode connections for each account. When Bot-CTO posts a message, Bot-Builder's Slack app independently receives the event via its own connection. OpenClaw processes each account's events through its own binding chain. + +--- + +## 3. 
Loop Prevention + +### Available Mechanisms + +| Mechanism | Type | Scope | Enforcement | +|-----------|------|-------|-------------| +| `requireMention: true` | Config | Per-channel, global | System-enforced by OpenClaw | +| `allowBots: false` | Config | Per-channel, global | System-enforced -- blocks all bot messages | +| `allowBots: "mentions"` | Config | Per-channel, global | System-enforced -- bot messages only when bot is mentioned | +| `maxPingPongTurns` (0-5) | Config | Per A2A `sessions_send` | System-enforced by OpenClaw session manager | +| `maxTurns` (default: 80) | Config | Per agent session | System-enforced -- max model calls per session | +| `timeoutSeconds` (default: 172800) | Config | Per agent | System-enforced -- 48-hour abort timer | +| `maxDiscussionTurns` | Protocol | Per discussion thread | Agent-self-enforced via AGENTS.md instructions | +| Self-loop filter | Code | Per bot user ID | System-enforced -- ignores own messages | +| Permission matrix | Protocol | Per agent role | Agent-self-enforced via SOUL.md/AGENTS.md | +| Agent instructions (WAIT discipline) | Protocol | Per agent | Agent-self-enforced | + +### What Applies to Autonomous Discussion? + +**`maxPingPongTurns` does NOT directly apply** to @mention-driven discussions. This parameter governs the `sessions_send` reply-back loop specifically. In Discussion mode, there is no `sessions_send` -- agents respond to @mentions in Slack threads. The ping-pong counter is not incremented. + +**`maxTurns` provides a backstop** but is too coarse. At 80 turns per session, an agent could send many messages before hitting this limit. 
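To make the layering of these mechanisms concrete, here is a minimal Python sketch of how the config-enforced gates could compose for one inbound Slack message. It is an illustration only, not OpenClaw's source: `ChannelPolicy`, `extract_mentions`, and `should_route` are hypothetical names, and the gate order (self-loop filter, then `allowBots`, then `requireMention`) is an assumption based on the checks described in this report.

```python
from __future__ import annotations

import re
from dataclasses import dataclass
from typing import Union

# Slack encodes user mentions in message text as <@U...> (optionally <@U...|label>).
MENTION_RE = re.compile(r"<@([UW][A-Z0-9]+)(?:\|[^>]*)?>")


@dataclass(frozen=True)
class ChannelPolicy:
    """Hypothetical mirror of the per-channel settings listed in the table."""
    require_mention: bool = True
    allow_bots: Union[bool, str] = False  # False, True, or "mentions"


def extract_mentions(text: str) -> set[str]:
    """Return the set of user IDs mentioned in a Slack message text."""
    return set(MENTION_RE.findall(text))


def should_route(text: str, sender_id: str, sender_is_bot: bool,
                 own_bot_id: str, policy: ChannelPolicy) -> bool:
    """Decide whether a message should reach the agent behind own_bot_id."""
    mentioned = own_bot_id in extract_mentions(text)
    # Gate 1: self-loop filter, a bot never processes its own messages.
    if sender_id == own_bot_id:
        return False
    # Gate 2: allowBots decides whether bot-authored messages are considered.
    if sender_is_bot:
        if policy.allow_bots is False:
            return False
        if policy.allow_bots == "mentions" and not mentioned:
            return False
    # Gate 3: requireMention drops anything that does not mention this bot.
    if policy.require_mention and not mentioned:
        return False
    return True


if __name__ == "__main__":
    shared = ChannelPolicy(require_mention=True, allow_bots=True)
    # CTO (U0CTO1234) mentions Builder (U0BLD5678) in the shared channel.
    print(should_route("<@U0BLD5678> review this", "U0CTO1234", True,
                       "U0BLD5678", shared))
```

Under these assumptions, a bot reply that does not contain the receiver's `<@...>` token never reaches the receiving agent, which is the property the mention-driven chain depends on. A display-name mention such as `@Builder` does not match the pattern and is therefore treated as no mention at all.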
+ +**`requireMention: true` is the primary loop breaker.** If all agents require mentions, and agents only @mention the next speaker (never broadcasting), then: +- Agent responds only when mentioned +- Agent mentions at most one other agent +- That other agent responds, mentions the orchestrator back +- Orchestrator decides next step + +This creates a **controlled chain**, not an unbounded loop. The chain only continues as long as the orchestrator keeps @mentioning agents. + +### Recommended Approach: Multi-Layer Defense + +**Layer 1 -- Config-enforced (reliable)**: +- `requireMention: true` on ALL agents in shared channels +- `allowBots: true` (or `"mentions"`) on shared channels only +- `maxTurns` per agent as an absolute backstop (e.g., 20 for discussion participants) +- Agent-specific `timeoutSeconds` (e.g., 600 for discussion sessions) + +**Layer 2 -- Protocol-enforced (agent instructions)**: +- Orchestrator AGENTS.md: "You may run at most `MAX_ROUNDS` discussion rounds. After `MAX_ROUNDS`, you MUST post `DISCUSSION_CLOSE` and stop @mentioning other agents." +- Participant AGENTS.md: "You ONLY respond when @mentioned. You NEVER @mention another agent unless explicitly instructed. After responding, you STOP." +- Exception: The `@mention-back` pattern where participants mention the orchestrator after responding. This is controlled because only the orchestrator decides whether to continue. + +**Layer 3 -- External monitoring**: +- A cron job or heartbeat that checks thread message count and kills sessions if a thread exceeds N messages. +- Ops agent periodic audit of thread lengths. + +### The "47 replies in 12 seconds" Cautionary Tale + +A production incident documented by the community: enabling `allowBots: true` without `requireMention: true` in a channel with another AI bot caused 47 replies in 12 seconds before manual process kill. This underscores that `requireMention: true` is **non-negotiable** when `allowBots` is enabled. + +--- + +## 4. 
Implementation Path + +### 4.1 Config Changes + +**Step 1: Create Slack Apps** (human-manual, one-time) + +Create one Slack App per participating agent. Minimum 3 (CoS, CTO, Builder). Each app needs: +- Socket Mode enabled +- App-Level Token (`xapp-`) with `connections:write` scope +- Bot Token (`xoxb-`) with scopes: `channels:history`, `channels:read`, `chat:write`, `users:read`, `reactions:read`, `reactions:write` +- Event Subscriptions: `message.channels`, `app_mention` +- Bot user configured with distinct name and icon + +**Step 2: Create shared discussion channel** (human-manual) + +Create `#collab` (or similar). Invite ALL agent bots to this channel. + +**Step 3: Update openclaw.json** (agent-executable) + +Add multi-account config: +```json +{ + "channels": { + "slack": { + "accounts": { + "default": { "botToken": "xoxb-cos-...", "appToken": "xapp-cos-...", "name": "CoS" }, + "cto": { "botToken": "xoxb-cto-...", "appToken": "xapp-cto-...", "name": "CTO" }, + "builder": { "botToken": "xoxb-bld-...", "appToken": "xapp-bld-...", "name": "Builder" } + }, + "channels": { + "<COLLAB_CHANNEL_ID>": { + "allow": true, + "requireMention": true, + "allowBots": true + } + }, + "thread": { + "historyScope": "thread", + "inheritParent": true, + "initialHistoryLimit": 50 + } + } + }, + "bindings": [ + { "agentId": "cos", "match": { "channel": "slack", "accountId": "default" } }, + { "agentId": "cto", "match": { "channel": "slack", "accountId": "cto" } }, + { "agentId": "builder", "match": { "channel": "slack", "accountId": "builder" } } + ] +} +``` + +Keep existing per-agent channel bindings (each agent's dedicated channel stays `requireMention: false`, `allowBots: false`). + +**Step 4: Record bot user IDs** (agent-executable) + +For each Slack App, obtain the Bot User ID (e.g., `U0CTO1234`, `U0BLD5678`). These are needed in agent instructions for @mention formatting. 
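Before restarting the gateway with the new config, a quick consistency check can catch two of the mistakes this report warns about: a Slack account with no binding, and a channel that enables `allowBots` without `requireMention`. The sketch below is a hypothetical helper (`audit_slack_config` is not an OpenClaw command) and assumes the `openclaw.json` shape shown in Step 3; the sample channel ID is a placeholder.

```python
from __future__ import annotations

import json


def audit_slack_config(config: dict) -> list[str]:
    """Return human-readable problems found in a multi-account Slack config."""
    problems: list[str] = []
    slack = config.get("channels", {}).get("slack", {})
    accounts = set(slack.get("accounts", {}))

    # Collect account IDs referenced by Slack bindings (peer-only bindings have none).
    bound = set()
    for binding in config.get("bindings", []):
        match = binding.get("match", {})
        if match.get("channel") == "slack" and match.get("accountId"):
            bound.add(match["accountId"])

    for account in sorted(accounts - bound):
        problems.append(f"account '{account}' has no binding")
    for account in sorted(bound - accounts):
        problems.append(f"binding references unknown account '{account}'")

    # allowBots without requireMention is the combination behind runaway loops.
    for channel_id, settings in sorted(slack.get("channels", {}).items()):
        if settings.get("allowBots") and not settings.get("requireMention"):
            problems.append(
                f"channel '{channel_id}' sets allowBots without requireMention"
            )
    return problems


if __name__ == "__main__":
    sample = json.loads("""
    {
      "channels": {"slack": {
        "accounts": {"default": {}, "cto": {}},
        "channels": {"C0COLLAB": {"allow": true, "requireMention": false, "allowBots": true}}
      }},
      "bindings": [
        {"agentId": "cos", "match": {"channel": "slack", "accountId": "default"}}
      ]
    }
    """)
    for problem in audit_slack_config(sample):
        print(problem)
```

The check is deliberately narrow: it does not validate tokens or scopes, and it only reads the global `channels.slack.channels` block, since per-account overrides are unverified in this report.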
+ +### 4.2 Agent Instructions + +**CTO AGENTS.md -- Add Orchestrator Section**: + +````markdown +## Discussion Orchestration (Multi-Agent Threads in #collab) + +When driving a multi-agent discussion in #collab: + +### Setup +- You are the ORCHESTRATOR. You control who speaks next. +- Bot User IDs: CTO=<@U_CTO>, Builder=<@U_BUILDER>, CoS=<@U_COS> +- Maximum rounds: 5. After 5 rounds, you MUST close the discussion. + +### Each Round +1. Read the full thread history (all prior messages) +2. Analyze the latest response +3. Decide: (a) Ask another agent for input, (b) Ask the same agent to clarify, (c) Close discussion +4. If continuing: Post your analysis + @mention the next agent + Format: "[CTO] <your analysis>. <@U_NEXT_AGENT> <your question/request>" +5. If closing: Post DISCUSSION_CLOSE summary + +### Round Counting +- Maintain a round counter in your responses: "Round N/5" +- After Round 5, you MUST close regardless of convergence + +### DISCUSSION_CLOSE Format +``` +DISCUSSION_CLOSE +Topic: <topic> +Rounds: N/5 +Consensus: <achieved | not achieved> +Decision: <what was decided> +Actions: <next steps, including any A2A delegation tasks> +Participants: CTO, Builder, ... +``` + +### Safety Rules +- NEVER @mention more than one agent per message +- NEVER skip round counting +- If you receive a response that is clearly a loop (repeating prior content), immediately close +```` + +**Builder AGENTS.md -- Add Discussion Participant Section**: + +```markdown +## Discussion Participation (Multi-Agent Threads in #collab) + +When @mentioned in a #collab discussion thread: + +1. Read the full thread history +2. Respond from your domain perspective (feasibility, implementation, effort) +3. End your response with: "<@U_CTO> I've posted my analysis." + (This notifies the orchestrator to continue the discussion) +4. Do NOT @mention any agent other than the orchestrator (<@U_CTO>) +5. Do NOT continue working after posting -- WAIT for the next @mention +6. 
If you have nothing to add: respond "[Builder] PASS: <reason>" +``` + +**CoS AGENTS.md -- Similar participant pattern, mentioning CTO back after responding.** + +### 4.3 The @Mention-Back Problem and Solutions + +The fundamental challenge is: **how does the orchestrator know when a participant has responded?** + +**Solution 1 -- Participant @mentions orchestrator back** (Recommended): +- Participant ends response with `<@U_CTO>` +- CTO receives the `message.channels` event with its mention +- CTO reads thread, continues orchestration +- Pro: Simple, works within existing config +- Con: CTO activates on every participant response, even partial ones + +**Solution 2 -- Orchestrator polls thread** (Not recommended): +- CTO uses a timer/heartbeat to periodically check the thread +- Pro: No mention-back needed +- Con: OpenClaw agents don't have native polling/timer capabilities for thread monitoring + +**Solution 3 -- Hybrid: @mention-back + sessions_send**: +- Participant @mentions CTO AND does sessions_send to CTO's session +- Pro: Belt-and-suspenders reliability +- Con: Requires Builder to have sessions_send permission (currently restricted) + +**Solution 1 is recommended** as the simplest path that works within existing constraints. + +### 4.4 Failure Modes + +| Failure Mode | Cause | Mitigation | +|-------------|-------|------------| +| **Infinite loop** | Agent A mentions Agent B mentions Agent A... | `requireMention: true` + only orchestrator decides next speaker + round counter | +| **Silent failure** | Agent doesn't respond to @mention | Orchestrator waits N seconds, then posts "ping" + re-mentions. After 2 failures, closes discussion. | +| **Context overflow** | Thread gets too long for `initialHistoryLimit` | Set `initialHistoryLimit >= 50`. Instruct agents to keep responses under 500 words. | +| **Wrong agent responds** | Binding misconfiguration | Test with Round0 handshake before production discussions. 
| +| **All agents respond simultaneously** | `requireMention: false` accidentally set | Config audit: `requireMention: true` on ALL shared channels. | +| **Orchestrator never closes** | Round counter not maintained | `maxTurns` per session as absolute backstop. External monitoring. | +| **Bot mentions not parsed** | Agent outputs `@Builder` instead of `<@U_BUILDER>` | AGENTS.md must contain exact bot user IDs, not display names. | +| **Self-loop filter blocks legitimate messages** | OpenClaw bug / regression | Monitor Issue #15836 fix status. Test with Round0 handshake. | +| **Socket Mode connection limits** | 5-7 WebSocket connections from one process | OpenClaw docs recommend HTTP mode for multi-account. Test Socket Mode first; switch to HTTP if stability issues arise. | +| **Gateway crash kills all agents** | Single process architecture | Standard process management (systemd, pm2). Restart automatically. | + +--- + +## 5. Comparison with Claude Code Agent Teams + +| Dimension | Claude Code Agent Teams | OpenCrew Slack Discussion | +|-----------|------------------------|--------------------------| +| **Communication** | Mailbox system (in-memory message passing) | Slack thread messages | +| **Shared state** | Task list files (`~/.claude/tasks/`) | Thread history (via `initialHistoryLimit`) | +| **Orchestration** | Lead agent creates team, assigns tasks | CTO/CoS @mentions agents in thread | +| **Context** | Each teammate has own context window | Each agent has own session (thread-scoped) | +| **Human visibility** | Terminal output, requires split-pane/tmux | Slack UI -- real-time, mobile, searchable | +| **Turn control** | Task completion triggers, idle hooks | @mention triggers, round counter in instructions | +| **Loop prevention** | Task dependency system, lead controls | `requireMention` + round counter + `maxTurns` | +| **Persistence** | Session-scoped (lost on restart) | Slack thread history (persists, searchable) | +| **Cost model** | Token-based, per 
context window | Token-based + Slack API calls | +| **Inter-agent debate** | Teammates message each other directly, challenge findings | Agents @mention each other, challenge in shared thread | + +**Key parallel**: Both systems use an orchestrator that decides task decomposition and agent assignment. Both allow agents to challenge each other. Both have context isolation per agent. + +**Key difference**: Claude Code Agent Teams use file-based task lists for coordination and in-memory mailboxes for messages. OpenCrew uses Slack threads as both the communication channel and the shared context. The Slack approach provides superior human visibility but higher latency (~1-3s per message vs near-zero for file I/O). + +**Mapping to OpenCrew**: +- **Team lead** = CTO (or CoS for strategic discussions) +- **Teammates** = Builder, CIO, Ops (responding when called) +- **Task list** = The orchestrator's round-by-round plan (maintained in agent instructions, not a shared file) +- **Mailbox** = @mentions in the Slack thread +- **Shared resources** = Thread history (`initialHistoryLimit`) + +--- + +## 6. Confidence Assessment + +| Finding | Confidence | Evidence | +|---------|-----------|---------| +| Slack Events API delivers Bot-A's messages to Bot-B's app | **HIGH** | Slack API docs: apps receive `message.channels` for all messages in joined channels. No sender-type filtering at platform level. | +| OpenClaw multi-account supports separate bot tokens per agent | **HIGH** | Official docs (`channels.slack.accounts`), community gist, tutorial sites all confirm. | +| Self-loop filter is per-bot-user-ID in multi-account mode | **HIGH** | Issue #15836 confirms the filter checks `message.user === botUserId`. Fix (PRs #15863/#15946) added origin tracking to refine this. | +| `allowBots: true` enables processing of other bots' messages | **HIGH** | Official docs, community guide, production incident report (47 replies in 12s) all confirm. 
| +| `requireMention: true` prevents uncontrolled bot loops | **HIGH** | Official docs explicitly recommend this combination. Community confirms. | +| Agent can produce `<@BOT_USER_ID>` and it renders as a mention | **HIGH** | Standard Slack message formatting. No special handling needed for bot-authored messages. | +| OpenClaw parses `<@userId>` in message text for mention detection | **HIGH** | Official docs: "Mention sources: explicit app mention (`<@botId>`), mention regex patterns." | +| Autonomous multi-round discussion without ANY human intervention | **MEDIUM** | All primitives verified. The @mention-back pattern (participant mentions orchestrator) is the key mechanism. Not yet validated end-to-end in production. | +| `allowBots: "mentions"` as a third option beyond true/false | **MEDIUM** | Referenced in source-level analysis (DeepWiki) and prior research. Not prominently in official docs. | +| Per-account channel overrides (different `requireMention` per bot) | **LOW** | Docs say "named accounts can override any setting" but don't explicitly show per-account `channels.<ID>` overrides. Unverified. | +| `maxPingPongTurns` applies to @mention-driven discussions | **LOW (likely NO)** | This parameter governs `sessions_send` reply-back loops specifically, not @mention-driven thread interactions. | + +--- + +## 7. Open Questions + +1. **Per-account channel config**: Can `channels.slack.accounts.cto.channels.<COLLAB_ID>.requireMention` be set to `false` while keeping the global `channels.slack.channels.<COLLAB_ID>.requireMention = true`? This would allow the orchestrator to see all messages without requiring @mention-back. Needs empirical testing against OpenClaw source. + +2. **app_mention vs message.channels**: The Slack docs do not explicitly confirm whether `app_mention` events fire when a bot (not a human) mentions another bot. 
OpenClaw's primary listener is `message.channels` with text-based mention parsing, so this likely doesn't matter -- but confirmation would increase confidence. + +3. **Socket Mode scalability**: With 5-7 Slack Apps all using Socket Mode, what is the resource impact on the OpenClaw gateway? Are there Slack-side rate limits on concurrent WebSocket connections from the same server? OpenClaw docs recommend HTTP mode for multi-account -- is this a strong recommendation or just an option? + +4. **Thread history limits**: When `initialHistoryLimit = 50` and a discussion exceeds 50 messages, does each agent see the full thread or only the most recent 50? Does the limit count thread messages only, or does it also include parent channel context? This determines whether agents can maintain discussion continuity. + +5. **Concurrent @mentions**: What happens if the orchestrator @mentions two agents simultaneously in one message (e.g., `<@U_BUILDER> and <@U_OPS> please review`)? Do both agents respond? In what order? Can this cause race conditions in the thread? + +6. **Discussion session lifecycle**: When does a thread-scoped session expire? If a discussion spans hours (with gaps between rounds), does the session survive? Does each @mention create a new session or resume the existing one? + +7. **Cost estimation**: Each discussion round involves: (a) agent reading full thread history, (b) agent producing a response, (c) OpenClaw posting to Slack. For a 5-round discussion with 3 agents, how many API tokens are consumed? Is this comparable to Claude Code Agent Teams or significantly more/less? + +8. **Empirical validation**: Nobody has publicly documented a fully autonomous (zero human intervention after initial prompt) multi-round OpenCrew discussion. The first implementation should be treated as an experiment with careful monitoring. 
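For question 7, a rough upper-level estimate can be sketched before any measurement exists. The Python below is a back-of-the-envelope model, not a benchmark: `estimate_discussion_tokens` is a hypothetical helper, the average message size is a placeholder input rather than observed data, and the model assumes every agent turn re-reads the visible thread, that each round is one orchestrator message plus one participant reply, and that the orchestrator posts one final closing message. System prompts, AGENTS.md instructions, and tool calls are ignored.

```python
from __future__ import annotations


def estimate_discussion_tokens(rounds: int, avg_message_tokens: int,
                               history_limit: int = 50) -> dict[str, int]:
    """Estimate token usage for an orchestrated discussion thread.

    Assumed turn structure: the human posts one opening message, then each
    round adds an orchestrator message and a participant reply, and the
    orchestrator ends with a single closing message.
    """
    agent_turns = 2 * rounds + 1
    input_tokens = 0
    for turn in range(1, agent_turns + 1):
        # Before agent turn t the thread holds t messages:
        # the human prompt plus (t - 1) earlier agent messages.
        visible_messages = min(turn, history_limit)
        input_tokens += visible_messages * avg_message_tokens
    output_tokens = agent_turns * avg_message_tokens
    return {
        "agent_turns": agent_turns,
        "input_tokens": input_tokens,
        "output_tokens": output_tokens,
    }


if __name__ == "__main__":
    # 300 tokens per message is an arbitrary placeholder, not a measurement.
    print(estimate_discussion_tokens(rounds=5, avg_message_tokens=300))
```

The point of the model is the shape rather than the numbers: input tokens grow roughly quadratically with the number of turns until the history cap is reached, so response length limits and the round cap both matter for cost. How `initialHistoryLimit` actually counts messages is itself open (question 4), so the `history_limit` handling here is an assumption.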
+ +--- + +## Appendix A: Step-by-Step Validation Plan + +Before deploying autonomous discussions, validate each component: + +**Test 1: Multi-account basic message delivery** +- Configure 2 accounts (CTO + Builder) on a shared channel +- Human @mentions CTO in a thread +- Verify CTO responds +- CTO's response includes `<@U_BUILDER>` +- Verify Builder receives the event and responds +- Expected: Both agents respond in the same thread + +**Test 2: @mention-back pattern** +- Builder's response includes `<@U_CTO>` +- Verify CTO receives Builder's message and can read thread history +- Expected: CTO sees all prior messages and can continue + +**Test 3: Round counter enforcement** +- Set max rounds to 3 in CTO's AGENTS.md +- Start a discussion +- Verify CTO closes discussion after round 3 with DISCUSSION_CLOSE +- Expected: Discussion terminates cleanly + +**Test 4: Loop prevention** +- Remove round counter from CTO's instructions (dangerous -- test in isolated channel) +- Start a discussion +- Verify `maxTurns` per session catches any runaway loop +- Expected: Session terminates at maxTurns limit + +**Test 5: Failure recovery** +- Start a discussion, then manually kill Builder's session mid-discussion +- Verify CTO detects non-response and closes or escalates +- Expected: CTO posts timeout message and either retries or closes + +--- + +## Appendix B: Key References + +- [OpenClaw Slack Plugin Docs](https://docs.openclaw.ai/channels/slack) +- [OpenClaw Multi-Agent Routing Docs](https://docs.openclaw.ai/concepts/multi-agent) +- [OpenClaw Session Tools Docs](https://docs.openclaw.ai/concepts/session-tool) +- [OpenClaw Agent Loop Docs](https://docs.openclaw.ai/concepts/agent-loop) +- [Running Multiple AI Agents as Slack Teammates (GitHub Gist)](https://gist.github.com/rafaelquintanilha/9ca5ae6173cd0682026754cfefe26d3f) +- [OpenClaw Issue #15836: Agent-to-agent Slack routing (FIXED)](https://github.com/openclaw/openclaw/issues/15836) +- [OpenClaw Issue #11199: Discord 
multi-bot filtering (FIXED)](https://github.com/openclaw/openclaw/issues/11199) +- [OpenClaw Issue #45450: Matrix bot-to-bot visibility](https://github.com/openclaw/openclaw/issues/45450) +- [OpenClaw Issue #9912: maxTurns/maxToolCalls config](https://github.com/openclaw/openclaw/issues/9912) +- [OpenClaw Slack Setup Best Practices (Macaron)](https://macaron.im/blog/openclaw-slack-setup) +- [Claude Code Agent Teams Documentation](https://code.claude.com/docs/en/agent-teams) +- [Slack Events API Documentation](https://docs.slack.dev/apis/events-api/) +- [Slack Message Event Reference](https://docs.slack.dev/reference/events/message/) +- [Slack app_mention Event Reference](https://docs.slack.dev/reference/events/app_mention) +- [Prior OpenCrew Research: research_slack_r1.md](.harness/reports/research_slack_r1.md) +- [Prior OpenCrew Architecture: architecture_protocol_r1.md](.harness/reports/architecture_protocol_r1.md) +- [Prior OpenCrew Architecture: architecture_collab_r1.md](.harness/reports/architecture_collab_r1.md) diff --git a/.harness/reports/research_discord_r1.md b/.harness/reports/research_discord_r1.md new file mode 100644 index 0000000..4ed752e --- /dev/null +++ b/.harness/reports/research_discord_r1.md @@ -0,0 +1,382 @@ +commit 7e825263db36aef68792a050c324daef598b4c56 +Author: Alex's Mac <alexmac@AlexsdeMac-mini-2.local> +Date: Sat Mar 28 17:38:48 2026 +0800 + + feat: add A2A v2 research harness, architecture, and agent definitions + + Multi-agent harness for researching and designing A2A v2 protocol: + + Research reports (Phase 1): + - Slack: true multi-agent collaboration via multi-account + @mention + - Feishu: groupSessionScope + platform limitation analysis + - Discord: multi-bot routing + Issue #11199 blocker analysis + + Architecture designs (Phase 2): + - A2A v2 Protocol: Delegation (v1) + Discussion (v2) dual-mode + - 5 collaboration patterns: Architecture Review, Strategic Alignment, + Code Review, Incident Response, Knowledge Synthesis + - 
3-level orchestration: Human → Agent → Event-Driven + - Platform configs, migration guides, 6 ADRs + + Agent definitions for Claude Code Agent Teams: + - researcher.md, architect.md, doc-fixer.md, qa.md + + QA verification: all issues resolved, PASS verdict after fixes. + + Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> + +diff --git a/.harness/reports/research_discord_r1.md b/.harness/reports/research_discord_r1.md +new file mode 100644 +index 0000000..9467648 +--- /dev/null ++++ b/.harness/reports/research_discord_r1.md +@@ -0,0 +1,349 @@ ++# Research Report: Discord Multi-Bot Capabilities (U2, Round 1) ++ ++**Date**: 2026-03-27 ++**Researcher**: Claude (automated research agent) ++**Scope**: Multi-bot routing, channel isolation, thread support, and multi-agent collaboration on Discord for OpenCrew ++ ++--- ++ ++## Executive Summary ++ ++Discord fully supports multiple bots coexisting in a single server with distinct identities, and OpenClaw has shipped multi-account Discord support (PR #3672, merged ~Jan 2026). Each bot receives all MESSAGE_CREATE events for channels it can view, enabling cross-bot message visibility at the platform level. However, OpenClaw's internal bot-message filter currently treats all configured bot accounts as "self" and drops their messages (Issue #11199), which blocks visible bot-to-bot collaboration via Discord. The practical workaround is to use OpenClaw's internal `sessions_send` for agent-to-agent communication and restrict Discord to human-to-agent interaction per channel. ++ ++For Issue #34 (cos/ops conversation mixing), the root cause is a single-bot configuration where one bot identity serves all channels, combined with missing or incorrect channel-level permission overrides. 
The fix is either (a) proper Discord channel permission overwrites to restrict Send Messages per-channel on the single bot's role, or (b) migrating to multi-bot with each bot scoped to its designated channel via Discord permission overrides. ++ ++--- ++ ++## 1. Multi-Bot Routing Model ++ ++### 1.1 Multiple Bots in One Server ++ ++**Confidence: HIGH** (based on Discord API documentation and widespread community practice) ++ ++Discord servers support up to 50 bot users. Each bot is a separate Application in the Discord Developer Portal with its own token, avatar, display name, and online status. All bots in a server receive gateway events independently -- when a user posts in a channel, every bot with View Channel permission on that channel receives a `MESSAGE_CREATE` event via its own WebSocket connection. ++ ++Key facts: ++- Each bot requires its own Application, Bot Token, and server invite ++- Each bot must independently enable the **Message Content Intent** privileged gateway intent to read message body text ++- Bots appear as distinct members in the server member list with independent online/offline status ++- Each bot can register its own slash commands (no namespace collision if names differ) ++- Rate limits are per-bot, so multiple bots do not share rate limit buckets ++ ++### 1.2 Cross-Bot Message Visibility ++ ++**Confidence: HIGH** (Discord API behavior) / **MEDIUM** (OpenClaw handling) ++ ++At the Discord platform level, when Bot-A posts a message in #build, Bot-B **does** receive the `MESSAGE_CREATE` gateway event for that message, provided: ++1. Bot-B has View Channel permission on #build ++2. Bot-B has the Message Content Intent enabled ++3. Bot-B is connected to the Gateway using API v9 or above ++ ++The `message.author` object includes a `bot: true` flag, allowing the receiving bot to identify the source as another bot. 
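A receiving bot can use the author fields to separate three cases: its own messages, messages from another bot, and messages from humans. The sketch below is a minimal Python illustration operating on a plain dictionary shaped like the `author` portion of a `MESSAGE_CREATE` payload. `classify_author` is a hypothetical helper, not part of Discord's API or OpenClaw, and the IDs are made up.

```python
def classify_author(message: dict, own_bot_id: str) -> str:
    """Classify a MESSAGE_CREATE payload as 'self', 'other_bot', or 'human'."""
    author = message.get("author", {})
    # Compare against this bot's own user ID only, not every configured bot.
    if author.get("id") == own_bot_id:
        return "self"
    # Discord sets author.bot to true for bot users; the field is absent for humans.
    if author.get("bot", False):
        return "other_bot"
    return "human"


if __name__ == "__main__":
    own_id = "111111111111111111"  # hypothetical snowflake for the receiving bot
    from_sibling = {"author": {"id": "222222222222222222", "bot": True}}
    from_human = {"author": {"id": "333333333333333333"}}
    print(classify_author(from_sibling, own_id))
    print(classify_author(from_human, own_id))
```

Comparing against only the receiving bot's own ID is what keeps sibling bots in the `other_bot` bucket instead of treating them as self, which is the distinction at stake in the filter behavior discussed for OpenClaw.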
++ ++**OpenClaw complication**: OpenClaw's Discord plugin filters out messages authored by bots by default (`allowBots` defaults to `false`). More critically, even when `allowBots` is set to `true` or `"mentions"`, the current implementation (as of Issue [#11199](https://github.com/openclaw/openclaw/issues/11199)) checks the message author ID against **all** configured bot account IDs in the instance, not just the receiving account's own ID. This means Bot-A's message is treated as "own message" by Bot-B's handler and silently dropped. ++ ++**Workaround options**: ++1. Set `allowBots: true` with `requireMention: false` and whitelist sibling bot user IDs in per-channel `users` arrays. This works but disables mention gating. ++2. Use OpenClaw's internal `sessions_send` for all agent-to-agent communication (recommended by the A2A protocol). Discord messages then serve only as "visibility anchors" for human observers. ++ ++**Related PRs addressing #11199**: ++- PR #11644: "bypass bot filter and mention gate for sibling Discord bots" ++- PR #22611: "allow messages from other instance bots in multi-account setups" ++- PR #35479: "add allowBotIds config to selectively allow bot messages" ++ ++(Status of these PRs could not be confirmed in this research round.) ++ ++### 1.3 OpenClaw Multi-Account Support Status ++ ++**Confidence: HIGH** (confirmed via Issue #3306 comments and documentation) ++ ++OpenClaw's Discord plugin now supports multiple bot accounts in a single gateway instance. The feature was introduced via **PR #3672** ("feat: Introduce multi-account support for Discord, ensuring session keys and RPC IDs are account-aware"), which was merged around January 28, 2026. Issue #3306 was the original feature request; a commenter confirmed on February 9, 2026: "Multi-Agent works for current version." 
++ ++Configuration structure: ++```json ++{ ++ "channels": { ++ "discord": { ++ "accounts": { ++ "default": { "token": "BOT_TOKEN_A" }, ++ "coding": { "token": "BOT_TOKEN_B" } ++ } ++ } ++ } ++} ++``` ++ ++Each account gets its own: ++- Bot token (the `default` account falls back to `DISCORD_BOT_TOKEN` env var) ++- Guild and channel allowlists ++- Session key namespace (session keys are account-aware: `agent:<agentId>:discord:<accountId>:channel:<channelId>`) ++ ++Bindings reference accounts via `accountId`: ++```json ++{ ++ "bindings": [ ++ { ++ "agentId": "cos", ++ "match": { ++ "channel": "discord", ++ "accountId": "default", ++ "guildId": "<GUILD_ID>", ++ "peer": { "kind": "channel", "id": "<CHANNEL_ID_HQ>" } ++ } ++ } ++ ] ++} ++``` ++ ++**Known issue**: `requireMention: true` is reportedly broken in multi-account configurations (Issue [#45300](https://github.com/openclaw/openclaw/issues/45300)) -- all guild messages are dropped at the preflight stage with reason "no-mention" even when the bot is explicitly @mentioned. ++ ++--- ++ ++## 2. Channel Permission Isolation ++ ++### 2.1 Root Cause of Issue #34 ++ ++**Confidence: HIGH** (confirmed by reporter FRED-DL's own comment) ++ ++Issue [#34](https://github.com/AlexAnys/opencrew/issues/34) ("routing bug, cos and ops conversations mixed together") was caused by the **single-bot configuration** where all agents share one Discord bot identity. ++ ++The reporter's comment translates to: "The Discord plugin configuration requires each bot to only send in specific channels, so the documentation should not describe them as public channels but rather as manually restricted bot send permissions." ++ ++The root cause chain: ++1. A single bot receives `MESSAGE_CREATE` events for **all** channels it has View Channel permission on ++2. OpenClaw's binding system routes messages by channel ID to the correct agent (e.g., #hq -> CoS, #ops -> Ops) ++3. However, when any agent responds, the **same bot identity** sends the message. 
If the bot has Send Messages permission in channels it should not be active in, or if bindings are misconfigured, responses can leak across channels ++4. Without explicit Discord permission overrides, the single bot can read and write in every channel, creating a surface for context mixing if the OpenClaw routing layer has any edge-case failures ++ ++The fix documented in the issue: manually restrict the bot's Send Messages permission so it can only send in its assigned channel(s). With multi-bot, this becomes natural -- each bot only needs permissions in its own channel. ++ ++### 2.2 Discord Permission Override Configuration ++ ++**Confidence: HIGH** (based on Discord API documentation) ++ ++Discord uses a layered permission system: ++ ++1. **Server-level role permissions** (base) ++2. **Category-level permission overwrites** (inherited by child channels unless overridden) ++3. **Channel-level permission overwrites** (most specific, wins) ++4. **Member-specific overwrites** (highest priority, per-user/bot) ++ ++Permission overwrites use an `allow`/`deny` bitfield model: ++- `allow` explicitly grants a permission at the channel level ++- `deny` explicitly revokes a permission at the channel level ++- Unset bits inherit from the parent level ++ ++Key permission bits for bot isolation: ++ ++| Permission | Bit | Hex Value | ++|---|---|---| ++| VIEW_CHANNEL | `1 << 10` | `0x0000000000000400` | ++| SEND_MESSAGES | `1 << 11` | `0x0000000000000800` | ++| SEND_MESSAGES_IN_THREADS | `1 << 38` | `0x0000004000000000` | ++| READ_MESSAGE_HISTORY | `1 << 16` | `0x0000000000010000` | ++ ++**Critical note**: If a bot's role has the **Administrator** permission, all channel-level overrides are bypassed. Ensure bot roles do NOT have Administrator. ++ ++### 2.3 Step-by-Step Isolation Setup ++ ++**Confidence: HIGH** (testable in any Discord server) ++ ++#### For single-bot setup (restrict one bot to specific channels): ++ ++1. 
**Create a bot-specific role** (e.g., "OpenCrew Bot") -- do NOT use Administrator permission ++2. **At the server level**, grant the role: View Channels, Read Message History ++3. **At the server level**, do NOT grant: Send Messages, Send Messages in Threads ++4. **For each agent channel** (e.g., #hq, #cto, #build): ++ - Right-click the channel -> Edit Channel -> Permissions ++ - Click "+" next to Roles/Members, add the "OpenCrew Bot" role ++ - Set "Send Messages" to **Allow** (green checkmark) ++ - Set "Send Messages in Threads" to **Allow** (green checkmark) ++5. **Verify**: The bot can now only send messages in channels where you explicitly allowed it ++ ++#### For multi-bot setup (each bot restricted to its own channel): ++ ++1. **Create a role per bot** (e.g., "CoS Bot", "CTO Bot", "Builder Bot") ++2. **At the server level**, grant each role: View Channels, Read Message History -- do NOT grant Send Messages ++3. **For each bot**, add a channel-level overwrite on its designated channel: ++ - #hq: "CoS Bot" role -> Allow Send Messages, Allow Send Messages in Threads ++ - #cto: "CTO Bot" role -> Allow Send Messages, Allow Send Messages in Threads ++ - #build: "Builder Bot" role -> Allow Send Messages, Allow Send Messages in Threads ++4. **Optional hardening**: On channels a bot should NOT access at all, add a channel-level overwrite denying View Channel for that bot's role ++ ++#### Programmatic approach (via Discord API): ++ ++``` ++PUT /channels/{channel_id}/permissions/{role_or_user_id} ++{ ++ "allow": "2048", // SEND_MESSAGES (1 << 11) ++ "deny": "0", ++ "type": 0 // 0 = role overwrite ++} ++``` ++ ++To deny Send Messages on a channel: ++``` ++PUT /channels/{channel_id}/permissions/{role_or_user_id} ++{ ++ "allow": "0", ++ "deny": "2048", // SEND_MESSAGES denied ++ "type": 0 ++} ++``` ++ ++--- ++ ++## 3. 
Thread Support ++ ++### 3.1 Discord Thread Model ++ ++**Confidence: HIGH** (based on Discord API documentation) ++ ++Discord threads are lightweight sub-channels that live under a parent text channel. Key properties: ++ ++- **Types**: Public threads (visible to anyone who can view the parent channel), Private threads (invite-only, or visible to those with Manage Threads permission) ++- **Auto-archive**: Threads automatically archive after a configurable period of inactivity: 1 hour, 24 hours, 3 days, or 7 days (higher values require server boost level). "Activity" means sending a message, unarchiving, or changing the auto-archive duration ++- **Archived threads**: Can still be viewed and searched, but no new messages can be added until unarchived. Locked threads can only be unarchived by users with Manage Threads permission ++- **Member limit**: Threads support up to 1,000 members ++- **Thread metadata**: Includes `archived`, `archive_timestamp`, `auto_archive_duration`, `locked`, `owner_id`, `parent_id` ++ ++### 3.2 Bot Access to Threads ++ ++**Confidence: HIGH** ++ ++- Bots **must** use API v9 or above to receive thread-related gateway events (MESSAGE_CREATE, THREAD_CREATE, etc.) ++- Threads **inherit** all parent channel permissions. The relevant permission for posting in threads is `SEND_MESSAGES_IN_THREADS` (not `SEND_MESSAGES`) ++- Bots with View Channel on the parent automatically see public threads; private threads require membership or Manage Threads permission ++- The `THREAD_LIST_SYNC` event synchronizes active threads when a bot gains access to a channel ++ ++**OpenClaw thread handling**: Discord threads are routed as channel sessions. Thread configuration inherits parent channel config unless a thread-specific entry exists. OpenClaw supports binding threads to specific agents or sessions via `/focus` and `/unfocus` commands. 
The OpenCrew config document confirms: "Discord threads automatically inherit the configuration of their parent channel (agent binding, requireMention, etc.) unless you configure a specific thread ID separately." ++ ++### 3.3 Comparison with Slack Thread Behavior ++ ++**Confidence: MEDIUM** (based on documented behavior, not direct testing) ++ ++| Aspect | Slack | Discord | ++|---|---|---| ++| **Thread creation** | Any message can become a thread parent by replying to it | Threads are created explicitly from a message or via API; Forum channels auto-create threads | ++| **Persistence** | Threads persist indefinitely (searchable, no expiry) | Threads auto-archive after inactivity (1h to 7d) | ++| **Visibility** | Thread replies can optionally be broadcast to the channel | Thread messages stay in the thread only | ++| **Session key** | `agent:<agentId>:slack:channel:<channelId>:thread:<root_ts>` | `agent:<agentId>:discord:channel:<channelId>` (thread inherits parent; thread-specific session may append thread ID) | ++| **Bot trigger** | Bot-authored messages in other channels don't auto-trigger agents (same single-bot limitation) | Same behavior -- OpenClaw ignores bot-authored inbound by default | ++| **A2A pattern** | Two-step: Slack root message (anchor) + `sessions_send` (trigger) | Same two-step pattern applies to Discord | ++| **OpenClaw session isolation** | `historyScope = "thread"` + `inheritParent=false` isolates thread context | Thread config inherits parent channel unless overridden; `/focus` and `/unfocus` provide explicit binding | ++ ++**Key difference**: Discord's auto-archive is the most significant operational difference. Slack threads never expire, so long-running tasks can span days without concern. 
Discord threads auto-archive after inactivity, which leaves three options: ++- Grant the bot Manage Threads permission so it can unarchive ++- Send a periodic "keep-alive" message (not recommended; adds noise) ++- Accept that completed task threads will archive naturally (acceptable for most workflows) ++ ++--- ++ ++## 4. Multi-Agent Collaboration Potential ++ ++### 4.1 Multiple Bots in Same Thread ++ ++**Confidence: HIGH** (Discord platform) / **MEDIUM** (OpenClaw implementation) ++ ++At the Discord level, multiple bots can participate in the same thread with distinct identities -- each bot appears with its own name, avatar, and online status, so every message is attributed to the bot that sent it. Any bot with `SEND_MESSAGES_IN_THREADS` permission on the parent channel can post in threads under that channel. ++ ++**OpenClaw limitation**: The bot-to-bot filtering issue (Issue #11199) means that even if Bot-A and Bot-B are both in the same thread, Bot-B's OpenClaw instance will drop Bot-A's messages as "own-bot" messages. This prevents a pattern where Bot-A mentions Bot-B in a thread to trigger a response. ++ ++**Practical pattern today**: Agent-to-agent collaboration in a shared thread must use `sessions_send` internally. The Discord thread serves as a shared audit log where both agents post their outputs for human visibility, but the actual trigger mechanism is internal to OpenClaw.
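The filtering behavior described in section 4.1 can be modeled in a few lines. This is an illustrative sketch only, not OpenClaw's code: `InboundMessage`, `dropsAsOwnBotGlobal`, and `dropsAsOwnBotPerAccount` are hypothetical names, and the "global" variant assumes the filter compares the author against every bot configured in the instance, which is one reading of Issue #11199.

```typescript
// Illustrative model only; these types and functions are hypothetical, not OpenClaw APIs.
interface InboundMessage {
  authorId: string;
  authorIsBot: boolean;
  threadId: string;
}

// Assumed buggy behavior: any bot configured in the instance counts as "self".
function dropsAsOwnBotGlobal(msg: InboundMessage, allBotIds: ReadonlySet<string>): boolean {
  return msg.authorIsBot && allBotIds.has(msg.authorId);
}

// Per-account scoping: only the receiving account's own ID counts as "self".
function dropsAsOwnBotPerAccount(msg: InboundMessage, receivingBotId: string): boolean {
  return msg.authorId === receivingBotId;
}

const configuredBotIds = new Set<string>(["bot-a", "bot-b"]);
const fromBotA: InboundMessage = { authorId: "bot-a", authorIsBot: true, threadId: "thread-1" };

// Bot-B receives Bot-A's message in a shared thread.
const droppedWithGlobalFilter = dropsAsOwnBotGlobal(fromBotA, configuredBotIds);
const droppedWithScopedFilter = dropsAsOwnBotPerAccount(fromBotA, "bot-b");
console.log({ droppedWithGlobalFilter, droppedWithScopedFilter });
```

Under the scoped check a bot still ignores its own messages, so self-loop protection is preserved while cross-bot messages pass through.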
++ ++### 4.2 Discussion/Review/Brainstorm Patterns ++ ++**Confidence: MEDIUM** (conceptual; not tested in production) ++ ++With the current OpenClaw architecture, several collaboration patterns are achievable: ++ ++**Pattern 1: Delegated Execution (works today)** ++- CTO posts a task root message in #build ++- CTO uses `sessions_send` to trigger Builder in that thread ++- Builder executes and posts results in the thread ++- CTO monitors the thread and posts checkpoint summaries in #cto ++- Both agents' messages appear in the #build thread with distinct identities (if multi-bot) ++ ++**Pattern 2: Sequential Review (works today with orchestration)** ++- CoS creates a review request thread ++- CoS triggers CTO via `sessions_send` with the review brief ++- CTO posts analysis in the thread, then triggers Builder if implementation needed ++- Each agent's contribution is visible in the thread, attributed to its bot identity ++ ++**Pattern 3: Multi-Agent Discussion (partially blocked)** ++- Requires multiple agents to read each other's thread messages and respond ++- Currently blocked by Issue #11199 (bot-to-bot filtering) ++- Workaround: An orchestrator agent uses `sessions_send` to relay context between agents, posting summaries in a shared thread ++- True "round-table" discussion where agents directly read and respond to each other's Discord messages is not yet supported ++ ++**Pattern 4: Human-in-the-Loop Brainstorm (works today)** ++- Human posts a question in a channel ++- Bound agent responds ++- Human can @mention a different bot to bring another agent into the conversation (if multi-bot and `requireMention` is configured per-bot) ++- Each agent responds with its own identity ++ ++### 4.3 Orchestration Options ++ ++**Confidence: MEDIUM** ++ ++Three orchestration approaches exist for multi-agent Discord collaboration: ++ ++1. **OpenClaw A2A Protocol (recommended)**: Uses `sessions_send` for agent-to-agent triggering. Discord messages are "visibility anchors." 
This is the documented approach in OpenCrew's `A2A_PROTOCOL.md` and works with both single-bot and multi-bot configurations. It does not depend on Discord's message delivery for inter-agent communication. ++ ++2. **Discord-native orchestration (blocked)**: Would rely on bots reading each other's Discord messages and responding. Currently blocked by Issue #11199. If/when fixed, this would enable more natural multi-agent threads where agents directly react to each other's messages. Requires `allowBots` configuration and careful loop prevention. ++ ++3. **Hybrid orchestration**: Uses `sessions_send` for triggering but has agents post structured outputs in shared Discord threads. A "coordinator" agent (e.g., CTO) reads thread history via OpenClaw's message history and synthesizes responses. This works today and provides the best human visibility. ++ ++--- ++ ++## 5. Confidence Assessment ++ ++| Finding | Confidence | Basis | ++|---|---|---| ++| Multiple bots coexist in one Discord server | HIGH | Discord API docs, widespread practice | ++| Cross-bot MESSAGE_CREATE visibility | HIGH | Discord API v9+ documented behavior | ++| OpenClaw multi-account support shipped (PR #3672) | HIGH | Issue #3306 confirmation, official docs | ++| Bot-to-bot filtering bug (Issue #11199) | HIGH | Issue report with reproduction steps | ++| Channel permission override isolation method | HIGH | Discord API docs, testable | ++| Issue #34 root cause (single-bot + missing permission overrides) | HIGH | Reporter's own comment confirms | ++| Thread auto-archive behavior | HIGH | Discord API docs | ++| Thread permission inheritance from parent | HIGH | Discord API docs | ++| `requireMention` broken in multi-account (Issue #45300) | MEDIUM | Single issue report, not independently verified | ++| Multi-agent discussion pattern feasibility | MEDIUM | Conceptual; depends on #11199 resolution | ++| Session key format for Discord threads | MEDIUM | Partially documented; thread-specific key format not 
fully confirmed | ++ ++--- ++ ++## 6. Open Questions ++ ++1. **Issue #11199 fix status**: What is the merge status of PRs #11644, #22611, and #35479? If any are merged, bot-to-bot Discord messaging would be unblocked, enabling richer collaboration patterns. ++ ++2. **`requireMention` in multi-account**: Issue #45300 reports this is broken. Is there a workaround or fix? This is important for noise reduction in multi-bot setups. ++ ++3. **Thread session key format**: The exact session key format for Discord threads (as opposed to channels) needs confirmation. Does OpenClaw append a thread ID to the channel session key, or does it use the parent channel key? ++ ++4. **Auto-archive impact on long tasks**: If an OpenCrew task thread auto-archives mid-execution (e.g., agent is processing a long task), does the agent's next message automatically unarchive the thread, or does it fail silently? ++ ++5. **Rate limits with many bots**: With 7 agents each having their own bot, are there aggregate rate limit concerns at the guild level? Discord's per-guild rate limits may be stricter than per-bot limits for certain operations. ++ ++6. **Webhook relay vs. multi-bot trade-offs**: The DISCORD_SETUP.md mentions webhook relay as a middle-ground option. Has anyone in the OpenCrew community tested this approach? Webhooks can send with custom names/avatars but cannot receive messages, which limits their utility for agent routing. ++ ++7. **PR #3672 compatibility with current OpenClaw version**: The OpenCrew docs reference PR #3672 as "still in development," but Issue #3306 comments suggest it works. Which OpenClaw version is required for multi-account Discord support? 
++ ++--- ++ ++## Sources ++ ++- [Discord Threads API Documentation](https://docs.discord.com/developers/topics/threads) ++- [Discord Permissions API Documentation](https://docs.discord.com/developers/topics/permissions) ++- [OpenClaw Discord Channel Documentation](https://docs.openclaw.ai/channels/discord) ++- [OpenClaw Multi-Agent Routing Documentation](https://docs.openclaw.ai/concepts/multi-agent) ++- [OpenClaw Issue #3306: Support multiple Discord accounts](https://github.com/openclaw/openclaw/issues/3306) ++- [OpenClaw Issue #11199: Multiple agent bots filtered when talking to each other](https://github.com/openclaw/openclaw/issues/11199) ++- [OpenClaw Issue #28479: Support Multiple Discord Bot Accounts](https://github.com/openclaw/openclaw/issues/28479) ++- [OpenClaw Issue #45300: requireMention broken in multi-account Discord config](https://github.com/openclaw/openclaw/issues/45300) ++- [OpenCrew Issue #34: Routing bug, cos and ops conversations mixed](https://github.com/AlexAnys/opencrew/issues/34) ++- [OpenCrew DISCORD_SETUP.md](../../docs/en/DISCORD_SETUP.md) ++- [OpenCrew CONFIG_SNIPPET_DISCORD.md](../../docs/en/CONFIG_SNIPPET_DISCORD.md) ++- [OpenCrew A2A_PROTOCOL.md](../../shared/A2A_PROTOCOL.md) ++- [OpenCrew KNOWN_ISSUES.md](../../docs/KNOWN_ISSUES.md) diff --git a/.harness/reports/research_feishu_r1.md b/.harness/reports/research_feishu_r1.md new file mode 100644 index 0000000..e8e5e93 --- /dev/null +++ b/.harness/reports/research_feishu_r1.md @@ -0,0 +1,433 @@ +commit 7e825263db36aef68792a050c324daef598b4c56 +Author: Alex's Mac <alexmac@AlexsdeMac-mini-2.local> +Date: Sat Mar 28 17:38:48 2026 +0800 + + feat: add A2A v2 research harness, architecture, and agent definitions + + Multi-agent harness for researching and designing A2A v2 protocol: + + Research reports (Phase 1): + - Slack: true multi-agent collaboration via multi-account + @mention + - Feishu: groupSessionScope + platform limitation analysis + - Discord: multi-bot routing + Issue #11199 
blocker analysis + + Architecture designs (Phase 2): + - A2A v2 Protocol: Delegation (v1) + Discussion (v2) dual-mode + - 5 collaboration patterns: Architecture Review, Strategic Alignment, + Code Review, Incident Response, Knowledge Synthesis + - 3-level orchestration: Human → Agent → Event-Driven + - Platform configs, migration guides, 6 ADRs + + Agent definitions for Claude Code Agent Teams: + - researcher.md, architect.md, doc-fixer.md, qa.md + + QA verification: all issues resolved, PASS verdict after fixes. + + Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> + +diff --git a/.harness/reports/research_feishu_r1.md b/.harness/reports/research_feishu_r1.md +new file mode 100644 +index 0000000..548ab4c +--- /dev/null ++++ b/.harness/reports/research_feishu_r1.md +@@ -0,0 +1,400 @@ ++# Research Report: Feishu Multi-Agent Capabilities (U1, Round 1) ++ ++## Executive Summary ++ ++The Feishu integration for OpenCrew is poised for a significant upgrade. Two independent developments converge to address the project's top limitations: ++ ++1. **Per-topic session isolation** is now available via the built-in OpenClaw Feishu plugin's `groupSessionScope: "group_topic"` config (the official replacement for the legacy `topicSessionMode` / the `threadSession` shorthand referenced in Issue #31). This directly solves the P0 session-sharing problem -- each Feishu topic thread gets its own session key, enabling parallel tasks without context intermingling. ++ ++2. **Multi-account (multi-bot) routing** is supported by both the built-in plugin and the community plugin, allowing each Agent to run as a distinct Feishu app with its own identity, API quota, and permissions. Combined with `accountId`-based bindings, messages route deterministically to the correct Agent. ++ ++However, the A2A two-step trigger **cannot be reduced to one step** via cross-bot messaging alone. 
Feishu's `im.message.receive_v1` event only fires for user-sent messages; bot-sent messages are invisible to other bots' event subscriptions. The `sessions_send` internal routing mechanism remains necessary for cross-agent triggering. ++ ++--- ++ ++## 1. threadSession Analysis ++ ++### 1.1 What It Does ++ ++**Confidence: HIGH** (backed by OpenClaw source code analysis via DeepWiki and PR #29791) ++ ++The term `threadSession` as used in Issue #31 (`openclaw config set channels.feishu.threadSession true`) refers to enabling per-topic session isolation in Feishu group chats. At the code level, this maps to the `groupSessionScope` configuration in the built-in OpenClaw Feishu extension (`extensions/feishu/`). ++ ++The built-in plugin supports four session scope modes, defined in `extensions/feishu/src/config-schema.ts`: ++ ++| `groupSessionScope` value | Session key format | Behavior | ++|---|---|---| ++| `"group"` (default) | `chatId` | One session per group chat | ++| `"group_sender"` | `chatId:sender:senderOpenId` | One session per (group + sender) | ++| `"group_topic"` | `chatId:topic:topicId` | One session per topic thread; falls back to `chatId` if no topic | ++| `"group_topic_sender"` | `chatId:topic:topicId:sender:senderOpenId` | One session per (topic + sender); cascading fallback | ++ ++The session key is constructed by the `buildFeishuConversationId` function (in `extensions/feishu/src/bot.ts` or related module): ++ ++```typescript ++function buildFeishuConversationId(params: { ++ chatId: string; ++ scope: FeishuGroupSessionScope; ++ senderOpenId?: string; ++ topicId?: string; ++}): string { ++ // Destructuring added so the bare names in this excerpt resolve ++ const { chatId, senderOpenId, topicId } = params; ++ switch (params.scope) { ++ case "group_topic": ++ return topicId ? `${chatId}:topic:${topicId}` : chatId; ++ case "group_topic_sender": ++ if (topicId && senderOpenId) ++ return `${chatId}:topic:${topicId}:sender:${senderOpenId}`; ++ if (topicId) return `${chatId}:topic:${topicId}`; ++ return senderOpenId ? `${chatId}:sender:${senderOpenId}` : chatId; ++ // ...
++ } ++} ++``` ++ ++The `topicId` is derived from the Feishu message event's `root_id` (preferred) or `thread_id` (fallback). This was implemented in PR #29791 (merged March 2, 2026), which resolved the long-standing feature request for thread-based replies in Feishu. ++ ++**Historical note**: The deprecated `topicSessionMode: "enabled"` config is a legacy predecessor that maps internally to `groupSessionScope: "group_topic"`. The `threadSession = true` shorthand referenced in Issue #31 is likely another alias or community documentation shorthand for the same underlying mechanism. The canonical config key in current OpenClaw versions (2026.2+) is `groupSessionScope`. ++ ++### 1.2 How It Solves Session Sharing ++ ++**Confidence: HIGH** ++ ++This directly addresses OpenCrew's P0 issue ("Slack channel root messages share one session -- context pollution"). With `groupSessionScope: "group_topic"`: ++ ++- Each Feishu topic thread within a group gets a distinct session key (e.g., `oc_xxx:topic:om_root_123`) ++- Non-topic messages in the group mainline fall back to the group-level session (`oc_xxx`) ++- Different tasks running in different topics within the same group will have **fully isolated conversation context** ++- This mirrors the Slack model where "thread = task = session" ++ ++**Practical impact for OpenCrew**: In the CTO group, multiple A2A tasks can now run in parallel as separate topics, each with its own session. No more context intermingling. ++ ++### 1.3 Interaction with OpenCrew's Group Chat Model ++ ++**Confidence: MEDIUM** (theoretical analysis, not tested) ++ ++OpenCrew's model is "group chat = role" (each group is bound to one Agent). Adding topic-level session isolation is additive and non-breaking: ++ ++- **Routing**: The Agent binding still matches on `chatId` (group level). The `groupSessionScope` only affects the session key, not routing. Messages in any topic within the CTO group still route to the CTO Agent. 
++- **A2A visibility**: Task root messages posted as Feishu topic starters become natural "anchors" (equivalent to Slack root messages). All follow-up conversation stays within the topic. ++- **Session key for A2A**: When using `sessions_send`, the session key format changes from `agent:cto:feishu:group:oc_xxx` to `agent:cto:feishu:group:oc_xxx:topic:om_root_yyy`. Existing A2A protocol session key construction logic will need to account for the topic suffix. ++ ++**Config to enable**: ++ ++```json ++{ ++ "channels": { ++ "feishu": { ++ "groupSessionScope": "group_topic" ++ } ++ } ++} ++``` ++ ++Or per-group override: ++ ++```json ++{ ++ "channels": { ++ "feishu": { ++ "groups": { ++ "oc_xxx": { ++ "groupSessionScope": "group_topic" ++ } ++ } ++ } ++ } ++} ++``` ++ ++--- ++ ++## 2. Multi-Account A2A Impact ++ ++### 2.1 Cross-App Message Routing ++ ++**Confidence: HIGH** (backed by OpenClaw source code and Feishu platform documentation) ++ ++In multi-account mode, each Feishu app (bot) runs as a separate account under `channels.feishu.accounts`. Each account establishes its own WebSocket connection to Feishu Cloud using its own `appId`/`appSecret`. The `startFeishuProvider` function in the built-in plugin creates a separate provider per enabled account. ++ ++When a user sends a message in a group where multiple bots are present, each bot receives an independent `im.message.receive_v1` event. OpenClaw handles this through cross-account broadcast deduplication: ++ ++1. The first account to claim the `messageId` in a shared "broadcast" namespace processes the message ++2. Subsequent accounts skip dispatch for that message ++3. The `tryRecordMessagePersistent` function enforces first-claim-wins semantics ++ ++With `accountId`-based bindings, the routing priority is: ++1. Exact `peer` match (specific group/DM ID) ++2. `parentPeer` match (thread inheritance) ++3. `accountId` match ++4. Channel-level fallback (`accountId: "*"`) ++5. 
Default agent fallback ++ ++**Recommended setup**: In "one bot per group" mode, add only the corresponding bot to each group. This avoids deduplication contention entirely since only one bot receives events per group. ++ ++### 2.2 Self-Loop Filter Bypass ++ ++**Confidence: HIGH** (backed by Feishu platform documentation) ++ ++**Critical finding**: The Feishu `im.message.receive_v1` event **only fires for user-sent messages**. The official Feishu documentation states: ++ ++> "Currently only supports messages sent by users" (`sender_type: "user"`) ++> "In group scenarios, you receive all messages sent by users (not including messages sent by the bot)" ++ ++This means: ++- When Bot-CTO posts a message in the Builder group, **Bot-Builder does NOT receive an `im.message.receive_v1` event** ++- This is a Feishu platform constraint, not an OpenClaw filter ++- The "self-loop filter bypass" question is moot -- there is nothing to bypass because bot messages simply do not generate inbound events for other bots ++ ++**Implication**: Cross-bot messaging via Feishu API cannot trigger another Agent. The only way to trigger Agent-B from Agent-A remains `sessions_send` (OpenClaw's internal A2A mechanism). ++ ++### 2.3 Implications for Two-Step Trigger ++ ++**Confidence: HIGH** ++ ++The current A2A two-step trigger cannot be simplified to one step via cross-bot messaging: ++ ++| Step | Current (single-bot) | Multi-bot mode | Change? 
| ++|------|---------------------|----------------|---------| ++| Step 1: Post visible root message in target channel | Bot posts in target group | Bot-A posts in Bot-B's group | **Same** (visibility anchor) | ++| Step 2: `sessions_send` to trigger target agent | Required (bot self-messages are ignored) | **Still required** (Feishu does not deliver bot messages to other bots) | **No change** | ++ ++However, multi-bot mode does provide these improvements: ++- **Visual clarity**: Each Agent's messages appear under a distinct bot name/avatar, making A2A exchanges easier to follow ++- **API quota independence**: Each bot has its own rate limits, preventing a chatty Agent from starving others ++- **Permission isolation**: Different Agents can have different Feishu permission scopes ++ ++The `sessions_send` mechanism works via OpenClaw's `INTERNAL_MESSAGE_CHANNEL` with `deliver: false`, meaning it routes entirely within the OpenClaw runtime without touching Feishu APIs. This is efficient and reliable regardless of bot configuration. 
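The reasoning in section 2.3 can be condensed into a small dry-run model. Everything in this sketch is hypothetical: `deliversReceiveEvent`, `Transport`, `postAnchor`, and `sessionsSend` are stand-ins supplied by the caller, not real OpenClaw or Feishu SDK calls, and the session key only mirrors the `agent:<agentId>:feishu:group:<chatId>` shape used elsewhere in this report. It shows that the visible post alone never reaches the target agent, so the internal trigger stays mandatory.

```typescript
// Hypothetical model of the two-step trigger; none of these names are real OpenClaw or Feishu APIs.
type SenderType = "user" | "bot";

// Platform rule from section 2.2: im.message.receive_v1 fires only for user-sent messages.
function deliversReceiveEvent(senderType: SenderType): boolean {
  return senderType === "user";
}

interface Transport {
  postAnchor(groupId: string, text: string): string; // returns the root message ID
  sessionsSend(sessionKey: string, text: string): void; // stand-in for internal routing
}

function delegate(t: Transport, targetAgent: string, groupId: string, task: string): string[] {
  const log: string[] = [];
  // Step 1: visible anchor. It is posted by a bot, so no inbound event reaches the target bot.
  const rootId = t.postAnchor(groupId, task);
  log.push(`anchor:${rootId}:event=${deliversReceiveEvent("bot")}`);
  // Step 2: internal trigger, which is what actually wakes the target agent.
  const sessionKey = `agent:${targetAgent}:feishu:group:${groupId}`;
  t.sessionsSend(sessionKey, task);
  log.push(`trigger:${sessionKey}`);
  return log;
}

// In-memory stand-in transport for a dry run.
const sentSessionKeys: string[] = [];
const fakeTransport: Transport = {
  postAnchor: (_groupId, _text) => "om_root_1",
  sessionsSend: (sessionKey, _text) => {
    sentSessionKeys.push(sessionKey);
  },
};

const steps = delegate(fakeTransport, "builder", "oc_build", "Implement feature X");
console.log(steps);
```

Injecting the transport keeps the model testable without a Feishu app; a real integration would replace both stand-ins with whatever the runtime actually exposes.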
++ ++### 2.4 Config Examples ++ ++Multi-account Feishu config with topic session isolation: ++ ++```json ++{ ++ "channels": { ++ "feishu": { ++ "domain": "feishu", ++ "connectionMode": "websocket", ++ "groupSessionScope": "group_topic", ++ "accounts": { ++ "cos-bot": { ++ "name": "CoS Chief of Staff", ++ "appId": "cli_cos_xxxxx", ++ "appSecret": "your-cos-secret", ++ "enabled": true ++ }, ++ "cto-bot": { ++ "name": "CTO Tech Partner", ++ "appId": "cli_cto_xxxxx", ++ "appSecret": "your-cto-secret", ++ "enabled": true ++ }, ++ "builder-bot": { ++ "name": "Builder Executor", ++ "appId": "cli_build_xxxxx", ++ "appSecret": "your-builder-secret", ++ "enabled": true ++ } ++ } ++ } ++ }, ++ "bindings": [ ++ { ++ "agentId": "cos", ++ "match": { ++ "channel": "feishu", ++ "accountId": "cos-bot", ++ "peer": { "kind": "group", "id": "<FEISHU_GROUP_ID_HQ>" } ++ } ++ }, ++ { ++ "agentId": "cto", ++ "match": { ++ "channel": "feishu", ++ "accountId": "cto-bot", ++ "peer": { "kind": "group", "id": "<FEISHU_GROUP_ID_CTO>" } ++ } ++ }, ++ { ++ "agentId": "builder", ++ "match": { ++ "channel": "feishu", ++ "accountId": "builder-bot", ++ "peer": { "kind": "group", "id": "<FEISHU_GROUP_ID_BUILD>" } ++ } ++ } ++ ] ++} ++``` ++ ++**Important caveat**: A known bug (Issue #47436) in OpenClaw 2026.3.13 causes the Feishu plugin to crash when a second account uses `SecretRef` for `appSecret`. A fix has been submitted in PR #47652 (wraps per-account errors in try-catch). Until this is merged, use plaintext secrets or wait for the patch. ++ ++--- ++ ++## 3. Multi-Agent Collaboration Potential ++ ++### 3.1 Multiple Bots in Same Group/Topic ++ ++**Confidence: MEDIUM** ++ ++Multiple Feishu bots CAN coexist in the same group. The behaviors: ++ ++- **User messages**: All bots receive the event. OpenClaw's cross-account dedup ensures only one processes it (first-claim-wins). With `requireMention: true`, bots only respond when explicitly @mentioned, which is the cleanest pattern. 
++- **Bot messages**: No bot receives events from other bots (Feishu platform limitation). Cross-bot conversation within a group is therefore not possible via Feishu events alone. ++- **Recommended pattern**: Each Agent's group should contain only its own bot. If multi-Agent collaboration is needed within a single group, use `sessions_send` for triggering and Feishu API messages for visibility only. ++ ++### 3.2 Discussion Patterns in Feishu ++ ++**Confidence: MEDIUM** ++ ++With `groupSessionScope: "group_topic"`, Feishu topics enable a workflow analogous to Slack threads: ++ ++1. **Task initiation**: Human or Agent creates a new topic in the Agent's group (this becomes the session root) ++2. **Execution**: Agent works within the topic, maintaining isolated context ++3. **A2A delegation**: Agent-A posts a topic in Agent-B's group (visibility anchor), then uses `sessions_send` to trigger Agent-B in that topic's session ++4. **Parallel tasks**: Multiple topics in the same group run independently ++ ++The key difference from Slack: Feishu topics in standard groups are less prominent in the UI than Slack threads. Feishu "topic groups" (话题群) are a special group type where all messages must belong to a topic -- this would be the ideal group type for OpenCrew's use case, as it enforces topic-based organization. 
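The parallel-task step in section 3.2 depends on each topic resolving to a different session key. A minimal sketch of that mapping, derived from the scope table in section 1.1 rather than from the plugin's actual source: `sketchConversationId` and `KeyParams` are hypothetical names, and the `group_sender` fallback to `chatId` when no sender is known is an assumption.

```typescript
// Sketch derived from the groupSessionScope table; not the plugin's actual implementation.
type GroupSessionScope = "group" | "group_sender" | "group_topic" | "group_topic_sender";

interface KeyParams {
  chatId: string;
  scope: GroupSessionScope;
  senderOpenId?: string;
  topicId?: string;
}

function sketchConversationId({ chatId, scope, senderOpenId, topicId }: KeyParams): string {
  switch (scope) {
    case "group":
      return chatId;
    case "group_sender":
      // Assumption: fall back to the group key when the sender is unknown.
      return senderOpenId ? `${chatId}:sender:${senderOpenId}` : chatId;
    case "group_topic":
      return topicId ? `${chatId}:topic:${topicId}` : chatId;
    case "group_topic_sender":
      if (topicId && senderOpenId) return `${chatId}:topic:${topicId}:sender:${senderOpenId}`;
      if (topicId) return `${chatId}:topic:${topicId}`;
      return senderOpenId ? `${chatId}:sender:${senderOpenId}` : chatId;
  }
}

// Two tasks in the same group, each in its own topic, plus a mainline message.
const taskA = sketchConversationId({ chatId: "oc_cto", scope: "group_topic", topicId: "om_root_1" });
const taskB = sketchConversationId({ chatId: "oc_cto", scope: "group_topic", topicId: "om_root_2" });
const mainline = sketchConversationId({ chatId: "oc_cto", scope: "group_topic" });
console.log(taskA, taskB, mainline);
```

The two topic keys differ, so the tasks keep separate context, while the mainline message keeps the plain group key and therefore any session that existed before topics were enabled.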
++ ++### 3.3 Comparison with Slack Capabilities ++ ++| Capability | Slack | Feishu (with groupSessionScope) | ++|---|---|---| ++| Thread/topic isolation | Native (thread = session) | Now available via `group_topic` scope | ++| Bot self-loop filter | Bot ignores own messages (configurable) | Platform-level: bot events only for user messages | ++| Cross-bot triggering | Not possible (single bot identity) | Not possible (bot messages invisible to other bots) | ++| A2A trigger mechanism | `sessions_send` (required) | `sessions_send` (required) | ++| Visual identity | Single bot, shared name | Multi-bot, distinct names/avatars | ++| Thread UI prominence | High (native threading) | Medium (topic groups better than standard groups) | ++ ++--- ++ ++## 4. Migration Path ++ ++### 4.1 Single-App to Multi-App ++ ++**Confidence: MEDIUM** (logical analysis, not tested end-to-end) ++ ++Migration steps: ++ ++1. **Create new Feishu apps** for each Agent (follow existing FEISHU_SETUP.md Steps 1-3 for each) ++2. **Update openclaw.json** to use `accounts` format instead of top-level `appId`/`appSecret`: ++ ++ Before (single-app): ++ ```json ++ { ++ "channels": { ++ "feishu": { ++ "appId": "cli_original_xxx", ++ "appSecret": "original-secret" ++ } ++ } ++ } ++ ``` ++ ++ After (multi-app): ++ ```json ++ { ++ "channels": { ++ "feishu": { ++ "accounts": { ++ "legacy": { ++ "appId": "cli_original_xxx", ++ "appSecret": "original-secret", ++ "enabled": true ++ }, ++ "cto-bot": { ++ "appId": "cli_cto_xxx", ++ "appSecret": "cto-secret", ++ "enabled": true ++ } ++ } ++ } ++ } ++ } ++ ``` ++ ++3. **Add `accountId` to bindings** incrementally -- start with one Agent, verify, then expand ++4. **Add each new bot to its group** in Feishu settings ++5. 
**Enable topic sessions** with `groupSessionScope: "group_topic"` (can be done independently of multi-app migration) ++ ++### 4.2 Session Key Compatibility ++ ++**Confidence: LOW** (requires testing) ++ ++When migrating from single-app to multi-app, session keys may change format: ++ ++- **Old format**: `agent:cto:feishu:group:oc_xxx` (no accountId component) ++- **New format**: Potentially `agent:cto:feishu:cto-bot:group:oc_xxx` (with accountId) ++ ++If the session key changes, existing conversation history associated with old session keys becomes orphaned. The Agent starts with a fresh session in the new key. ++ ++**Mitigation strategies**: ++- Keep the original app as the `legacy` account and migrate agents one at a time ++- Use `session.resetByType` to explicitly reset group sessions during migration (treat it as a clean-slate moment) ++- Back up `~/.openclaw/sessions/` before migration ++ ++When adding `groupSessionScope: "group_topic"`: ++- Messages in the group mainline (no topic) continue using the base `chatId` key -- unchanged ++- Only messages within topics get the new `chatId:topic:topicId` key ++- This is backward-compatible: existing mainline sessions are unaffected ++ ++### 4.3 Rollback Strategy ++ ++1. **Config rollback**: Restore from backup (`openclaw.json.bak.<timestamp>`) ++2. **Bot rollback**: Remove new bots from groups; the original bot remains functional ++3. **Session rollback**: Session data for the old key format is preserved -- reverting config restores old routing ++4. **Gateway restart**: `openclaw gateway restart` applies all changes ++ ++The migration is designed to be incremental and reversible at each step. ++ ++--- ++ ++## 5. 
Confidence Assessment ++ ++| Finding | Confidence | Source | ++|---|---|---| ++| `groupSessionScope: "group_topic"` creates per-topic sessions | **HIGH** | OpenClaw source code (`buildFeishuConversationId`), PR #29791, DeepWiki analysis | ++| `threadSession` is a shorthand/alias for topic session isolation | **MEDIUM** | Issue #31 comment + correlation with `topicSessionMode` legacy config; exact alias mechanism not found in source | ++| Feishu `im.message.receive_v1` only fires for user messages | **HIGH** | Official Feishu Open Platform documentation | ++| Bot-to-bot messages do NOT trigger other bots | **HIGH** | Feishu platform documentation: "sender_type currently only supports user" | ++| A2A two-step trigger cannot become one-step | **HIGH** | Combination of Feishu platform constraint + OpenClaw `sessions_send` architecture | ++| Multi-account config format with `accounts` block | **HIGH** | OpenClaw source code, config schema, DeepWiki analysis | ++| Cross-account dedup via broadcast namespace | **HIGH** | OpenClaw source code, test cases documented in DeepWiki | ++| Multi-account SecretRef crash bug (Issue #47436) | **HIGH** | GitHub issue with reproduction steps and submitted fix | ++| Session key format change during migration | **LOW** | Theoretical analysis; needs empirical testing | ++| `groupSessionScope` can be set per-group | **MEDIUM** | Config schema supports it; not tested in practice | ++ ++--- ++ ++## 6. Open Questions ++ ++1. **Exact `threadSession` config path**: The Issue #31 comment references `openclaw config set channels.feishu.threadSession true`, but the canonical config key found in OpenClaw source is `groupSessionScope`. Is `threadSession` a CLI shorthand that resolves to `groupSessionScope: "group_topic"`? Or is it specific to the community plugin (`AlexAnys/feishu-openclaw`)? This needs verification against the actual CLI behavior. ++ ++2. 
**Session key migration**: When adding `accountId` to bindings, does the session key incorporate the accountId? If so, what happens to existing sessions? This needs empirical testing. ++ ++3. **Topic group type**: Feishu distinguishes between standard groups (topics optional) and "topic groups" (话题群, topics mandatory). Which type works better with OpenCrew? Does `groupSessionScope` work identically in both? ++ ++4. **Announce step behavior**: When Agent-A uses `sessions_send` to trigger Agent-B, the "announce step" posts a summary to the target channel. With multi-bot mode, which bot identity is used for the announce post -- Agent-B's bot (correct) or a shared bot? This affects visual clarity of A2A exchanges. ++ ++5. **Rate limiting with many accounts**: Each Feishu app has independent API quotas. However, the health check ping interval (noted in `docs/api-quota-fix.md`) consumes API calls per account. With 7 agents = 7 apps, health check overhead may be significant. What is the optimal ping interval? ++ ++6. **Community plugin vs built-in**: The `AlexAnys/feishu-openclaw` community plugin does NOT support `groupSessionScope` (it uses a simpler `chatId`-only session key with `threading.resolveReplyToMode: "off"`). OpenCrew's current setup guide references this community plugin. Is the project already using the built-in plugin (OpenClaw >= 2026.2), or does it need to migrate? ++ ++7. **Cross-account broadcast in shared groups**: If a "shared collaboration group" with multiple bots is desired (e.g., a "war room"), how should the broadcast dedup be configured? Should one bot be designated as the "listener" with others set to `requireMention: true`? 
++ ++--- ++ ++## Sources ++ ++- [OpenClaw Feishu documentation](https://docs.openclaw.ai/channels/feishu) ++- [OpenClaw GitHub - Feishu docs](https://github.com/openclaw/openclaw/blob/main/docs/channels/feishu.md) ++- [AlexAnys/openclaw-feishu (community plugin)](https://github.com/AlexAnys/openclaw-feishu) ++- [AlexAnys/feishu-openclaw (bridge)](https://github.com/AlexAnys/feishu-openclaw) ++- [Issue #29791: Thread-based replies in Feishu](https://github.com/openclaw/openclaw/issues/29791) -- closed, resolved via PR #29788 ++- [Issue #8692: Multi-bot routing issues](https://github.com/openclaw/openclaw/issues/8692) ++- [Issue #47436: Multi-account SecretRef crash](https://github.com/openclaw/openclaw/issues/47436) ++- [Feishu Open Platform - Receive message event](https://open.feishu.cn/document/server-docs/im-v1/message/events/receive) ++- [DeepWiki - OpenClaw Session Management](https://deepwiki.com/openclaw/openclaw/2.4-session-management) ++- [DeepWiki - AlexAnys/openclaw-feishu](https://deepwiki.com/AlexAnys/openclaw-feishu) ++- [OpenCrew Issue #31: Feishu multi-agent bot routing](https://github.com/AlexAnys/opencrew/issues/31) diff --git a/.harness/reports/research_platform_limitations_r1.md b/.harness/reports/research_platform_limitations_r1.md new file mode 100644 index 0000000..fee64d8 --- /dev/null +++ b/.harness/reports/research_platform_limitations_r1.md @@ -0,0 +1,109 @@ +# Why Discord and Feishu Cannot Match Slack's Cross-Bot Collaboration + +**Date**: 2026-03-27 +**Purpose**: Definitive answer to why Slack supports cross-bot collaboration but Discord and Feishu do not. + +--- + +## How Slack Works (the baseline) + +Slack's architecture enables cross-bot collaboration through five properties working together: + +1. **Independent bot identities**: Each agent gets its own Slack App with a unique bot user ID. +2. **Per-bot self-loop filter**: OpenClaw checks `message.user === ctx.botUserId` per-account. 
Bot-CTO (user ID `U_CTO`) is NOT filtered by Bot-Builder's handler (which only filters `U_BUILDER`). +3. **`allowBots: true`**: Enables processing messages authored by other bots. +4. **Per-account channel config**: Each account can have different `requireMention` settings, preventing uncontrolled loops while allowing targeted @mention-based triggering. +5. **Platform-level event delivery**: Slack's Events API delivers all channel messages to all subscribed apps, including messages from other bots. + +The result: Bot-CTO can @mention Bot-Builder in a thread, Bot-Builder's OpenClaw handler receives it as a legitimate inbound message, processes it, and responds -- all visible to humans as a natural conversation between distinct identities. + +--- + +## Discord: Blocked by TWO OpenClaw Code Bugs + +### The blocker is NOT a Discord platform limitation. + +Discord's platform fully supports cross-bot messaging. When Bot-A posts in a channel, Bot-B receives the `MESSAGE_CREATE` gateway event (provided Bot-B has View Channel permission and Message Content Intent enabled). The `message.author.bot` flag identifies it as a bot message. This is identical to Slack's behavior. + +### Bug 1: Issue #11199 -- Bot filter treats all configured bots as "self" + +**What happens at the code level**: OpenClaw's Discord plugin checks the message author ID against ALL configured bot account IDs in the instance, not just the receiving account's own ID. When Bot-A posts a message, Bot-B's handler sees Bot-A's user ID in its "known bot IDs" list and drops the message as if it were a self-loop. + +**Contrast with Slack**: The Slack plugin checks `message.user === ctx.botUserId` where `ctx.botUserId` is the specific bot user ID of THAT account. Different accounts have different `botUserId` values, so cross-bot messages pass through. The Discord plugin lacks this per-account scoping. + +**Fix status**: Three PRs were submitted to fix this (#11644, #22611, #35479). 
All three were CLOSED without merging. The issue itself was auto-closed on 2026-03-08 by a stale bot due to inactivity -- it was NOT fixed. + +**Workaround**: A community workaround exists: set `allowBots: true` + `requireMention: false` + per-channel `users` whitelist. This works but requires disabling mention gating entirely, which removes the primary loop-prevention mechanism. + +### Bug 2: Issue #45300 -- `requireMention` broken in multi-account Discord + +**What happens**: When multiple Discord bot accounts are configured, the `requireMention: true` check drops ALL guild messages at the preflight stage with reason "no-mention" -- even when the bot IS explicitly @mentioned. The mention detection logic fails to correctly resolve mentions against the receiving bot's user ID in multi-account configurations. + +**Why this matters**: Even if #11199 were fixed, the recommended safe pattern (`allowBots: true` + `requireMention: true`) would still not work. Without `requireMention`, every bot message in the channel triggers every other bot, creating infinite loops. + +**Status**: Issue is still OPEN. No fix PR identified. + +### What would need to change + +1. Fix the self-loop filter to be per-account (check author ID against only the receiving account's bot user ID, not all configured bot IDs). +2. Fix mention detection in multi-account mode to correctly identify @mentions of the receiving bot. +3. Both fixes are straightforward code changes -- they align Discord's behavior with Slack's existing implementation. + +### Timeline + +**Uncertain.** All three fix PRs for #11199 were closed without merge, and the issue was auto-closed as stale. The OpenClaw maintainers have not signaled intent to prioritize this. Given that `sessions_send` (internal A2A routing) is the officially recommended pattern, channel-based cross-bot communication may not be considered a priority. 
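To make the contrast in Bug 1 concrete, here is a minimal TypeScript sketch of the two filter styles. The type and function names (`InboundMessage`, `AccountContext`, `isSelfLoopPerAccount`, `isSelfLoopGlobal`) are hypothetical and chosen for illustration; they are not taken from the OpenClaw source. Only the `message.user === ctx.botUserId` comparison comes from this report.

```typescript
// Hypothetical shapes for illustration; not OpenClaw's actual types.
interface InboundMessage {
  authorId: string; // user ID of whoever posted the message
}

interface AccountContext {
  botUserId: string; // bot user ID of the account that received the event
}

// Per-account filter (the Slack behaviour described above):
// drop a message only if the receiving account itself authored it.
function isSelfLoopPerAccount(msg: InboundMessage, ctx: AccountContext): boolean {
  return msg.authorId === ctx.botUserId;
}

// Global filter (the behaviour reported in Issue #11199):
// any configured bot ID is treated as "self", so cross-bot messages are dropped.
function isSelfLoopGlobal(msg: InboundMessage, allConfiguredBotIds: string[]): boolean {
  return allConfiguredBotIds.some((id) => id === msg.authorId);
}

// Bot-CTO posts a message; Bot-Builder's handler decides whether to drop it.
const fromCto: InboundMessage = { authorId: "U_CTO" };
const builderCtx: AccountContext = { botUserId: "U_BUILDER" };
const configuredBotIds = ["U_CTO", "U_BUILDER"];

const droppedPerAccount = isSelfLoopPerAccount(fromCto, builderCtx); // false: message passes
const droppedGlobal = isSelfLoopGlobal(fromCto, configuredBotIds);   // true: message is dropped
```

In this model the first item under "What would need to change" amounts to replacing the second function with the first.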
+ +--- + +## Feishu: Blocked by a Platform-Level API Limitation + +### The blocker IS a Feishu platform limitation. It cannot be fixed by OpenClaw. + +### The technical constraint + +Feishu's `im.message.receive_v1` event -- the only event type for receiving chat messages -- explicitly delivers ONLY user-sent messages. The official documentation states: + +> "目前只支持用户(user)发送的消息" +> ("Currently only supports messages sent by users") + +> "可接收与机器人所在群聊会话中用户发送的所有消息(不包含机器人发送的消息)" +> ("Can receive all messages sent by users in group chats where the bot participates, excluding messages sent by the bot") + +When Bot-CTO posts a message in a group, Bot-Builder does NOT receive any event. The message is simply invisible to other bots at the API level. There is no `allowBots` flag or configuration that can change this -- the events are never generated by Feishu's servers. + +### Are there alternative APIs? + +No viable alternative exists: + +- **`im.message.receive_v1`** is the only message reception event. There is no `im.message.receive_bot_v1` or equivalent. +- **Message list API** (`GET /im/v1/messages`): Could theoretically poll for messages, but this is a REST endpoint, not a real-time event. Polling introduces latency, complexity, and API quota consumption. It also cannot distinguish which messages have already been processed. +- **Feishu's event system** has no event type for "bot message posted in group." The platform was designed with a user-to-bot interaction model, not a bot-to-bot model. +- Searching for alternative approaches (e.g., "飞书 机器人消息 其他机器人接收") confirms this is a well-known and accepted limitation of the Feishu platform with no documented workaround. + +### What would need to change + +Feishu (ByteDance/Lark) would need to add a new event type or extend `im.message.receive_v1` to include bot-sent messages with an opt-in flag. There is no public indication this is planned. 
+ +--- + +## Summary Table + +| Dimension | Slack | Discord | Feishu | +|-----------|-------|---------|--------| +| Platform delivers cross-bot messages? | YES | YES | **NO** | +| OpenClaw processes cross-bot messages? | YES (per-account self-loop filter) | **NO** (global bot filter bug #11199) | N/A (no events to process) | +| Mention gating works in multi-account? | YES | **NO** (broken, #45300) | N/A | +| Blocker type | None | **Code bugs** (fixable) | **Platform limitation** (unfixable by us) | +| Fix complexity | N/A | Low (align with Slack's implementation) | Requires Feishu platform change | +| Fix timeline | N/A | Uncertain (PRs closed, issue stale) | No indication from Feishu | +| Current workaround | N/A | `allowBots: true` + `requireMention: false` + `users` whitelist (loop risk) | `sessions_send` only | + +### The bottom line + +- **Discord** could work exactly like Slack if two code bugs in OpenClaw were fixed. The Discord platform itself is fully capable. The fixes are straightforward but have not been prioritized by OpenClaw maintainers. +- **Feishu** cannot work like Slack regardless of any code changes. The limitation is baked into Feishu's event delivery architecture. Only `sessions_send` (OpenClaw's internal routing) can achieve cross-agent triggering on Feishu. +- **Both platforms** fully support the Delegation pattern (anchor message + `sessions_send`). Only the Discussion pattern (autonomous cross-bot conversation visible in chat) is blocked. 
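The summary table can also be read as a small decision function. The following TypeScript sketch encodes the three capability rows and derives which collaboration modes are available per platform. All names (`PlatformCapability`, `capabilities`, `availableModes`) are hypothetical, and the "N/A" cells for Feishu are modelled as `false` for simplicity. Slack Discussion mode is listed as available here because this report treats it as working, although elsewhere in this patch it is still marked as pending POC verification.

```typescript
type Platform = "slack" | "discord" | "feishu";
type Mode = "delegation" | "discussion";

// One flag per row of the summary table (hypothetical field names).
interface PlatformCapability {
  platformDeliversBotMessages: boolean;
  openclawProcessesBotMessages: boolean;
  mentionGatingWorks: boolean;
}

const capabilities: Record<Platform, PlatformCapability> = {
  slack: { platformDeliversBotMessages: true, openclawProcessesBotMessages: true, mentionGatingWorks: true },
  // Blocked by Issues #11199 and #45300 (code bugs, not a platform limit).
  discord: { platformDeliversBotMessages: true, openclawProcessesBotMessages: false, mentionGatingWorks: false },
  // "N/A" in the table: no bot-message events exist to process, modelled as false.
  feishu: { platformDeliversBotMessages: false, openclawProcessesBotMessages: false, mentionGatingWorks: false },
};

function availableModes(platform: Platform): Mode[] {
  const c = capabilities[platform];
  // Delegation (anchor message + sessions_send) is supported on every platform.
  const modes: Mode[] = ["delegation"];
  if (c.platformDeliversBotMessages && c.openclawProcessesBotMessages && c.mentionGatingWorks) {
    modes.push("discussion");
  }
  return modes;
}

const slackModes = availableModes("slack");     // ["delegation", "discussion"]
const discordModes = availableModes("discord"); // ["delegation"]
const feishuModes = availableModes("feishu");   // ["delegation"]
```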
+ +--- + +*Sources: OpenClaw Issues #11199, #45300, #15836; PRs #11644, #22611, #35479; Feishu Open Platform docs (open.feishu.cn); OpenClaw source code verification (verify_source_code_r1.md); QA verification (qa_a2a_research_r1.md)* diff --git a/.harness/reports/research_selective_agents_r1.md b/.harness/reports/research_selective_agents_r1.md new file mode 100644 index 0000000..0739e21 --- /dev/null +++ b/.harness/reports/research_selective_agents_r1.md @@ -0,0 +1,501 @@ +# Research: Selective Independent Agents Architecture + +> Researcher: Claude Opus 4.6 | Date: 2026-03-27 | Scope: Three architectural questions -- orchestrator role, hybrid bot architecture, instance vs workspace independence + +--- + +## Executive Summary + +**Question 1 (Orchestrator)**: CoS is the correct orchestrator, not CTO. In Anthropic's Harness Design, the orchestrator is an **external script** (not a participant agent). In OpenCrew's Slack-native context, CoS maps most naturally to this role: it represents the user's intent, drives strategy forward, and does not do execution work. CTO should be a Generator/participant, not the orchestrator. + +**Question 2 (Hybrid Architecture)**: The proposed hybrid model -- single bot for execution agents + independent CoS bot + independent QA bot -- is technically feasible within a single OpenClaw gateway using multi-account mode. The key insight: one account CAN serve multiple agents via peer binding (channel-to-agent), while other accounts each serve one agent via account binding. Two bots CAN coexist in the same channel when `allowBots: true` + `requireMention: true` is set. The @mention-back flow works without changing existing channel configs, provided `allowBots: true` is added to channels where cross-bot interaction is needed. + +**Question 3 (Instance vs Workspace)**: A single OpenClaw gateway with multi-account is strongly recommended over separate instances. Multi-account means multiple Slack Apps managed by one gateway process. 
This preserves `sessions_send` interoperability, shared config, and single-process management. Separate instances would break A2A tools across process boundaries. + +--- + +## 1. Orchestration: CoS vs CTO vs External Harness + +### Anthropic Harness Design Analysis + +The Anthropic Harness Design methodology (as implemented in Claude Code Agent Teams and documented in the harness-design skill) follows a clear separation: + +| Component | Role | Is an Agent? | +|-----------|------|--------------| +| **Harness** (script) | Orchestrator -- decides what runs next, parses outputs, manages flow | NO -- it is external code | +| **Planner** | Analyzes requirements, produces spec/plan | YES -- spawned by harness | +| **Builder/Generator** | Executes the plan, produces artifacts | YES -- spawned by harness | +| **QA/Evaluator** | Reviews outputs, challenges quality, catches issues | YES -- spawned by harness | + +Key architectural principle: **No single agent is both a participant AND the orchestrator.** The harness is not an LLM agent -- it is a deterministic script that reads outputs and decides the next step. This prevents: +- Orchestrator bias (an agent-orchestrator favors its own perspective) +- Context pollution (orchestration logic competes with domain reasoning) +- Role confusion (is the agent thinking about the problem or about who to call next?) + +### Mapping to OpenCrew Roles + +The current A2A_PROTOCOL.md and architecture_collab_r1.md designate CTO as the discussion orchestrator for technical discussions. But this creates a problem identified in the Harness Design: + +**CTO as orchestrator = CTO is both participant and controller.** When CTO drives an architecture review, it is simultaneously: +1. Proposing the technical approach (Generator role) +2. Deciding who speaks next (Orchestrator role) +3. Evaluating Builder's response (Evaluator role) + +This triple-hat violates the Harness Design's core separation. 
CTO's technical opinions will bias which agents it calls and how it frames questions. + +**CoS maps naturally to the Harness's orchestrator role:** + +| Harness Concept | CoS Mapping | Evidence | +|----------------|-------------|---------| +| External to generation | CoS does NOT do technical implementation | SOUL.md: "you are not a gateway, not a doer" | +| Represents user intent | CoS's core role is "deep intent alignment" | SOUL.md: "strategic partner who drives things forward when user is away" | +| Decides what runs next | CoS determines priorities and delegation | AGENTS.md: "strategic tradeoff + pacing + coordination" | +| Reads outputs, routes decisions | CoS synthesizes and routes to CTO/CIO | ARCHITECTURE.md: "CoS evaluates/delegates to CTO/CIO" | +| Does not participate in generation | CoS does not write code, do research, or build | SOUL.md: "push main thread, lower cognitive load" | + +The user's insight is correct: CoS's SOUL.md description -- "strategic partner who drives things forward when the user is away" -- is almost word-for-word the harness's role description. + +**But CoS is not a pure external script -- it is an LLM agent.** This is a key difference from the Harness Design. In OpenCrew's Slack-native architecture, a non-agent orchestrator would be a Slack bot with hardcoded routing logic, which loses the strategic judgment that makes CoS valuable. The pragmatic solution: CoS acts as orchestrator but with strict role discipline: + +- CoS **decides** who to engage and what to ask (orchestrator hat) +- CoS **does not** propose technical solutions or challenge technical details (no generator/evaluator hat) +- CoS **synthesizes** outcomes and aligns with user intent (unique CoS value-add) + +### Recommendation + +**CoS should be the orchestrator. CTO should be a participant/generator.** + +This means: +1. **Discussion mode**: CoS @mentions CTO, Builder, CIO as needed. CTO responds with technical input but does not decide who speaks next. +2. 
**Delegation mode**: CoS delegates to CTO via A2A. CTO then orchestrates within its execution scope (CTO-to-Builder), which is fine -- this is scoped orchestration of subordinates, not strategic orchestration. +3. **QA as independent evaluator**: QA reviews outputs without being called by the generator (CTO/Builder). This mirrors the Harness's Evaluator independence. + +The existing Permission Matrix (CoS -> CTO only, CTO -> Builder/Research/KO/Ops) already supports this. The change is conceptual: CTO stops being the Discussion orchestrator and becomes a discussion participant. + +--- + +## 2. Hybrid Architecture: Single Bot + Selective Independent Agents + +### 2.1 Config Feasibility + +**Proposed model:** +- `accounts.default` (single Slack App) -- serves CTO, CIO, Builder, KO, Ops, Research via peer binding per channel +- `accounts.cos` (independent Slack App) -- serves CoS only via account binding +- `accounts.qa` (independent Slack App) -- serves QA only via account binding + +**Does this work?** Yes. OpenClaw's binding system supports mixing peer-match and account-match bindings in the same config. The binding resolution order is: + +1. **Peer match** (most specific): `match.peer.kind = "channel", match.peer.id = "<CHANNEL_ID>"` -- routes messages from a specific Slack channel to a specific agent +2. **Account match**: `match.accountId = "cos"` -- routes ALL messages received by the CoS Slack App to the CoS agent +3. **Fallback**: unmatched messages go to the default agent + +The current OpenCrew config (CONFIG_SNIPPET_2026.2.9.md) uses peer binding exclusively -- each agent is bound to its channel via `match.peer`. This binding method is **account-agnostic** -- it works on whichever Slack App receives the event. In the current single-bot setup, all events come through one bot, and peer binding routes them to the correct agent by channel. 
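The resolution order above can be sketched in TypeScript as follows. This is an illustrative model, not OpenClaw's implementation: the `Binding` and `InboundEvent` types, the `resolveAgent` function, and the `"main"` fallback agent ID are hypothetical. It also assumes that a peer binding without an `accountId` applies only to events from the `default` account, which is how this report describes the hybrid setup.

```typescript
// Hypothetical types; field names mirror the config keys used in this report.
interface Binding {
  agentId: string;
  match: {
    channel: string;
    accountId?: string;
    peer?: { kind: string; id: string };
  };
}

interface InboundEvent {
  channel: string;   // e.g. "slack"
  accountId: string; // which Slack App received the event
  peerId: string;    // Slack channel ID the message was posted in
}

function resolveAgent(bindings: Binding[], ev: InboundEvent, fallbackAgentId: string): string {
  // Assumption: bindings without an accountId belong to the "default" account.
  const sameAccount = bindings.filter(
    (b) => b.match.channel === ev.channel && (b.match.accountId || "default") === ev.accountId
  );
  // 1. Peer match (most specific): channel ID maps to an agent.
  for (const b of sameAccount) {
    if (b.match.peer && b.match.peer.kind === "channel" && b.match.peer.id === ev.peerId) {
      return b.agentId;
    }
  }
  // 2. Account match: every event from this account goes to one agent.
  for (const b of sameAccount) {
    if (!b.match.peer && b.match.accountId !== undefined) {
      return b.agentId;
    }
  }
  // 3. Fallback: unmatched events go to the default agent.
  return fallbackAgentId;
}

const sampleBindings: Binding[] = [
  { agentId: "cos", match: { channel: "slack", accountId: "cos" } },
  { agentId: "cto", match: { channel: "slack", peer: { kind: "channel", id: "C_CTO" } } },
  { agentId: "builder", match: { channel: "slack", peer: { kind: "channel", id: "C_BUILD" } } },
];

// Same channel, different receiving account, different agent.
const viaDefault = resolveAgent(sampleBindings, { channel: "slack", accountId: "default", peerId: "C_CTO" }, "main"); // "cto"
const viaCos = resolveAgent(sampleBindings, { channel: "slack", accountId: "cos", peerId: "C_CTO" }, "main");         // "cos"
const unmatched = resolveAgent(sampleBindings, { channel: "slack", accountId: "default", peerId: "C_OTHER" }, "main"); // "main"
```

The second call shows why two bots in the same channel can route to different agents: the receiving account, not only the channel, selects the binding chain.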
+ +**The hybrid config would look like:** + +```jsonc +{ + "channels": { + "slack": { + "accounts": { + // Single bot for execution agents (existing App) + "default": { + "botToken": "${SLACK_BOT_TOKEN_DEFAULT}", + "appToken": "${SLACK_APP_TOKEN_DEFAULT}" + }, + // Independent CoS bot (new App) + "cos": { + "botToken": "${SLACK_BOT_TOKEN_COS}", + "appToken": "${SLACK_APP_TOKEN_COS}" + }, + // Independent QA bot (new App) + "qa": { + "botToken": "${SLACK_BOT_TOKEN_QA}", + "appToken": "${SLACK_APP_TOKEN_QA}" + } + }, + "channels": { + // Existing agent channels -- unchanged + "<HQ_CHANNEL_ID>": { "allow": true, "requireMention": false }, + "<CTO_CHANNEL_ID>": { "allow": true, "requireMention": false, "allowBots": true }, + "<BUILD_CHANNEL_ID>": { "allow": true, "requireMention": false, "allowBots": true }, + "<INVEST_CHANNEL_ID>": { "allow": true, "requireMention": false }, + "<KNOW_CHANNEL_ID>": { "allow": true, "requireMention": true }, + "<OPS_CHANNEL_ID>": { "allow": true, "requireMention": true }, + "<RESEARCH_CHANNEL_ID>": { "allow": true, "requireMention": false } + } + } + }, + "bindings": [ + // CoS: account-level binding (all events from CoS App -> CoS agent) + { "agentId": "cos", "match": { "channel": "slack", "accountId": "cos" } }, + + // QA: account-level binding (all events from QA App -> QA agent) + { "agentId": "qa", "match": { "channel": "slack", "accountId": "qa" } }, + + // Execution agents: peer binding on default account (unchanged from current) + { "agentId": "cto", "match": { "channel": "slack", "peer": { "kind": "channel", "id": "<CTO_CHANNEL_ID>" } } }, + { "agentId": "builder", "match": { "channel": "slack", "peer": { "kind": "channel", "id": "<BUILD_CHANNEL_ID>" } } }, + { "agentId": "cio", "match": { "channel": "slack", "peer": { "kind": "channel", "id": "<INVEST_CHANNEL_ID>" } } }, + { "agentId": "ko", "match": { "channel": "slack", "peer": { "kind": "channel", "id": "<KNOW_CHANNEL_ID>" } } }, + { "agentId": "ops", "match": { "channel": 
"slack", "peer": { "kind": "channel", "id": "<OPS_CHANNEL_ID>" } } }, + { "agentId": "research", "match": { "channel": "slack", "peer": { "kind": "channel", "id": "<RESEARCH_CHANNEL_ID>" } } } + ] +} +``` + +**Key insight**: The peer bindings for CTO/Builder/etc. are processed against the default account's events. The account bindings for CoS/QA are processed against their respective accounts' events. These don't conflict -- OpenClaw processes each account's event stream independently through its own binding chain. + +### 2.2 Channel Coexistence (Two Bots in Same Channel) + +**Scenario**: CoS-Bot and Default-Bot (bound to CTO) are both in #cto. Someone posts in #cto. + +**What happens:** + +1. Slack delivers the `message.channels` event to ALL apps that have joined #cto +2. Default-Bot's app receives the event -> OpenClaw checks bindings -> peer match on `<CTO_CHANNEL_ID>` -> routes to CTO agent +3. CoS-Bot's app receives the event -> OpenClaw checks bindings -> account match on "cos" -> routes to CoS agent + +**Without any guards, BOTH agents would respond.** This is the core coexistence question. + +**Solution: `requireMention: true` on the CoS and QA accounts' channel configs.** + +There is a subtlety here. OpenClaw's channel config (`channels.slack.channels`) is **global across all accounts** in the same gateway. You cannot set `requireMention: false` for CTO on #cto and `requireMention: true` for CoS on #cto in the same channel config block. + +**However**, the binding model handles this naturally: + +- **Default-Bot in #cto**: The peer binding for CTO matches on channel ID, not on mention. The channel config for #cto has `requireMention: false`, so CTO responds to all messages. This is the existing behavior. +- **CoS-Bot in #cto**: CoS-Bot is NOT the "native" bot for #cto. CoS's binding is account-level (`accountId: cos`), not peer-level on #cto. When CoS-Bot receives a message from #cto, the `requireMention` check applies. 
If `requireMention: true` is set on #cto's channel config, CoS only activates when @mentioned. + +**The problem**: Setting `requireMention: true` on #cto globally would also require CTO to be @mentioned -- breaking its current behavior. + +**Resolution approaches:** + +1. **Option A -- Per-account channel overrides**: If OpenClaw supports `accounts.cos.channels.<CTO_CHANNEL_ID>.requireMention = true` while global remains `false`, this solves it cleanly. **Status: UNVERIFIED** in prior research. The docs say named accounts "can override any setting" but this has not been confirmed at the per-channel level. + +2. **Option B -- CoS-Bot uses `requireMention: true` natively**: CoS-Bot joins #cto but ONLY responds when @mentioned (`<@U_COS>`). The global `requireMention: false` on #cto applies to the default bot (CTO), while CoS-Bot's agent instructions enforce mention-only behavior. **Problem**: This relies on agent-level self-discipline, not config-level enforcement. If `requireMention: false` is the channel setting, OpenClaw WILL trigger CoS's session for non-mention messages. + +3. **Option C -- CoS does NOT join agent channels by default**: CoS-Bot only joins #hq (its home channel). When CoS needs to interact with #cto, it uses `sessions_send` (A2A delegation) rather than direct @mention. CoS only joins other channels for active Discussion mode sessions. **This is the cleanest solution that preserves existing channel behavior.** + +4. **Option D -- Separate "discussion" threads**: CoS-Bot joins #cto only for specific discussion threads (not the channel at large). The bot is invited to the channel but only activates in threads where it is @mentioned. With `requireMention: true` set globally and CTO's channel using `requireMention: false`, CTO auto-responds to channel messages while CoS only responds in threads where it is mentioned. **Problem**: Same global config conflict as Option B. 
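The options above hinge on how the mention gate combines with the other inbound checks. Below is a hedged TypeScript sketch of that gating sequence under stated assumptions: `allowBots` defaults to off, the self-loop filter is per-account, and `requireMentionOverride` stands in for the per-account channel override from Option A, which is UNVERIFIED in this report. All type and function names are hypothetical.

```typescript
interface ChannelConfig {
  allow: boolean;
  requireMention: boolean;
  allowBots?: boolean; // assumption: treated as false when omitted
}

interface ChatMessage {
  authorId: string;
  authorIsBot: boolean;
  mentionedUserIds: string[];
}

interface BotAccount {
  botUserId: string;
  // Hypothetical per-account override (Option A); not confirmed to exist in OpenClaw.
  requireMentionOverride?: boolean;
}

function shouldProcess(msg: ChatMessage, account: BotAccount, cfg: ChannelConfig): boolean {
  if (!cfg.allow) return false;
  // Bot gate: messages from other bots need allowBots.
  if (msg.authorIsBot && !cfg.allowBots) return false;
  // Per-account self-loop filter: drop only our own messages.
  if (msg.authorId === account.botUserId) return false;
  // Mention gate: per-account override wins over the channel-wide setting.
  const requireMention =
    account.requireMentionOverride !== undefined ? account.requireMentionOverride : cfg.requireMention;
  if (requireMention) {
    return msg.mentionedUserIds.some((id) => id === account.botUserId);
  }
  return true;
}

// #cto as configured in the hybrid config above.
const ctoChannel: ChannelConfig = { allow: true, requireMention: false, allowBots: true };
const defaultBot: BotAccount = { botUserId: "U_DEFAULT" };
const cosBot: BotAccount = { botUserId: "U_COS" };
const cosBotGated: BotAccount = { botUserId: "U_COS", requireMentionOverride: true };

const humanPost: ChatMessage = { authorId: "U_HUMAN", authorIsBot: false, mentionedUserIds: [] };

const ctoResponds = shouldProcess(humanPost, defaultBot, ctoChannel);       // true
const cosAlsoResponds = shouldProcess(humanPost, cosBot, ctoChannel);       // true: both agents fire
const cosGatedResponds = shouldProcess(humanPost, cosBotGated, ctoChannel); // false: needs @mention
```

Without the override, a plain human post in #cto triggers both agents, which is the coexistence problem described above. With it, only an explicit `<@U_COS>` mention reaches CoS.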
+ +**Recommended approach: Option C with selective @mention for discussions.** + +CoS-Bot stays in #hq as its home. For orchestration: +- **Delegation**: CoS uses `sessions_send` to trigger CTO (existing A2A v1 flow, proven) +- **Discussion**: CoS @mentions CTO in a dedicated #collab or #hq thread where `allowBots: true` + `requireMention: true` are both set. CTO joins this discussion via @mention trigger. +- **Progress checking**: CoS @mentions CTO in #cto threads (CTO's channel has `allowBots: true`, CTO responds because it is @mentioned) + +This avoids the global `requireMention` conflict entirely. + +### 2.3 @Mention Interaction Patterns + +**Pattern A: CoS checks CTO's progress in #cto** + +``` +Precondition: + - #cto has allowBots: true (MUST ADD -- currently false) + - #cto has requireMention: false (existing) + - CoS-Bot has been invited to #cto + +Flow: + 1. CTO (default bot) is working in #cto thread on a task + 2. CoS-Bot joins #cto and posts in the thread: "@CTO what's the status on X?" + 3. Default-Bot's app receives CoS-Bot's message in #cto + 4. OpenClaw checks: is this from a bot? Yes. Is allowBots: true? Yes. + 5. OpenClaw checks: is this from our bot user ID? No (U_COS != U_DEFAULT). Pass. + 6. OpenClaw checks requireMention: false on #cto -> pass (no mention needed) + 7. Peer binding matches <CTO_CHANNEL_ID> -> routes to CTO agent + 8. CTO reads thread, sees CoS's question, responds + +Problem: Step 6 means CTO responds to ALL of CoS-Bot's messages, not just @mentions. +This is actually DESIRABLE for this pattern -- CoS posting in CTO's thread IS an interaction. + +Counter-problem: CoS-Bot also receives CTO's response (via its own app's event stream). 
+ - CoS's account binding routes ALL events from CoS-Bot to CoS agent + - CoS receives CTO's response in the thread -> CoS might auto-respond + - This creates potential ping-pong + +Mitigation: CoS's AGENTS.md must include explicit WAIT discipline: + "After posting a progress check in another agent's channel, WAIT for response. + Do not auto-respond to the reply unless you have a specific follow-up." +``` + +**Pattern B: QA reviews Builder's output in #build** + +``` +Precondition: + - #build has allowBots: true (MUST ADD -- currently false) + - QA-Bot has been invited to #build + +Flow: + 1. Builder (default bot) posts closeout in #build thread + 2. QA-Bot reads the thread and posts review: "@Builder three issues found..." + 3. Default-Bot's app receives QA-Bot's message + 4. allowBots: true -> pass + 5. Self-loop filter: U_QA != U_DEFAULT -> pass + 6. Peer binding matches <BUILD_CHANNEL_ID> -> routes to Builder agent + 7. Builder reads QA's review and responds with fixes/clarifications + +This pattern works. The key config change: allowBots: true on #build. +``` + +**Pattern C: CoS orchestrates discussion in #hq thread** + +``` +Precondition: + - #hq has allowBots: true + requireMention: true + - CTO-equivalent needs to join #hq -- but CTO uses the default bot + - Default-Bot is already in #hq (it's the shared bot for all execution agents) + +Flow: + 1. CoS creates thread in #hq: "Strategic discussion: should we add QA agent?" + 2. CoS @mentions CTO: "<@U_DEFAULT_BOT> CTO, what's the technical feasibility?" + + PROBLEM: The default bot has ONE user ID shared across CTO, Builder, KO, etc. + When CoS @mentions the default bot, the peer binding for #hq routes to CoS + (because #hq is CoS's channel). The CTO peer binding only matches on + <CTO_CHANNEL_ID>, not on #hq. + + This means: the default bot receiving @mention in #hq routes to CoS, not CTO. + CTO never sees it. 
+``` + +**This is a critical discovery.** In the hybrid model where the default bot serves multiple agents via peer binding, you CANNOT use @mention to reach a specific agent through the default bot in a channel that is not that agent's home channel. The peer binding is by channel, so the bot always routes to whichever agent owns that channel. + +**The fix**: For Discussion mode, CoS uses `sessions_send` to trigger CTO on a specific thread, not @mention. Or, discussions happen in a dedicated #collab channel where binding can be configured differently. + +**Alternative fix**: CTO gets its own Slack App (promoting it from default-bot to independent). This would mean the "default" account serves fewer agents (Builder, CIO, KO, Ops, Research) while CoS and CTO each get independent apps. This is a graduated approach -- start with CoS independent, add QA, then consider CTO. + +### 2.4 Recommended Config + +Given the analysis, the cleanest hybrid architecture is: + +**Phase 1 (Minimal -- CoS independent only):** +- `accounts.default` -- CTO, Builder, CIO, KO, Ops, Research (peer binding, existing model) +- `accounts.cos` -- CoS only (account binding, independent Slack App) +- CoS uses `sessions_send` for delegation (existing A2A v1 mechanism) +- CoS uses #hq as its home channel (existing) +- Add `allowBots: true` to #cto and #build so CoS can post progress checks +- No Discussion mode yet -- defer to Phase 2 + +**Phase 2 (Add QA):** +- `accounts.qa` -- QA agent (account binding, independent Slack App) +- QA-Bot joins #build, #cto, #know with `allowBots: true` +- QA auto-reviews closeouts and @mentions the producing agent for feedback +- QA's AGENTS.md defines review triggers and quality criteria + +**Phase 3 (Full Discussion mode):** +- Create #collab channel with `allowBots: true` + `requireMention: true` +- CoS orchestrates discussions by @mentioning agents in #collab threads +- If default-bot's peer binding is ambiguous in #collab (which agent?), consider promoting CTO 
to independent account + +--- + +## 3. Instance vs Workspace Independence + +### 3.1 Same Gateway with Multi-Account + +**What multi-account means**: One OpenClaw gateway process manages multiple Slack Apps. Each App has its own `botToken`, `appToken`, and maintains its own Socket Mode WebSocket connection (or HTTP webhook endpoint). The gateway runs a single event loop that dispatches events from each account through the binding chain. + +``` + OpenClaw Gateway (single process) + / | \ + Socket Mode Socket Mode Socket Mode + | | | + [CoS App] [Default App] [QA App] + (Slack) (Slack) (Slack) +``` + +**Is workspace independence sufficient?** Yes, combined with account binding: +- Each agent already has its own workspace (`~/.openclaw/workspace-cos/`, etc.) with SOUL.md, AGENTS.md, MEMORY.md, etc. +- Each agent already has isolated sessions (thread-level session keys) +- Adding an independent Slack App via `accounts.cos` gives CoS its own bot identity (distinct name, avatar, bot user ID) +- The account-level binding (`accountId: cos`) ensures all events from CoS's App route exclusively to the CoS agent + +This provides **logical independence** (own identity, own workspace, own session space) within **shared infrastructure** (one gateway, shared A2A tools, shared config). + +### 3.2 Separate Gateway Instances + +**What this means**: Each independent agent runs its own OpenClaw process with its own `openclaw.json`. + +``` + [Gateway 1: CoS] [Gateway 2: Default] [Gateway 3: QA] + openclaw-cos.json openclaw.json openclaw-qa.json + | | | + [CoS App] [Default App] [QA App] +``` + +**Problems:** + +| Issue | Impact | +|-------|--------| +| `sessions_send` cannot cross processes | CoS cannot delegate to CTO via A2A -- must rely entirely on @mention in Slack, losing the reliable two-step trigger | +| No shared session management | `maxPingPongTurns`, session timeouts, and other safety constraints are per-instance. 
No unified loop prevention | +| No shared agent registry | Gateway 1 does not know Gateway 2's agents exist. `sessions_list`, `sessions_send`, agent ID resolution all fail cross-process | +| Triple operational burden | Three processes to monitor, restart, log-manage, and configure | +| Config duplication | Channel configs, thread settings, tool permissions must be maintained in three separate files | +| Heartbeat isolation | CoS's heartbeat cannot check on CTO's status via internal APIs | + +**The ONLY advantage**: Complete process isolation. If Gateway 2 crashes, CoS (Gateway 1) and QA (Gateway 3) continue operating. But this is a minor benefit -- a single gateway restart takes seconds, and process managers (pm2, systemd) handle automatic restarts. + +### 3.3 Recommendation + +**Same gateway with multi-account is the clear winner.** + +| Dimension | Same Gateway | Separate Gateways | +|-----------|-------------|-------------------| +| A2A delegation (`sessions_send`) | Works natively | BROKEN | +| Session management | Unified | Fragmented | +| Config management | Single file | Three files | +| Process management | One process | Three processes | +| Process isolation | No (single failure point) | Yes | +| Operational complexity | Low | High | +| Bot identity independence | Yes (multi-account) | Yes | +| Workspace independence | Yes (existing) | Yes | + +The only scenario where separate gateways make sense is if you need to run agents on different physical machines (e.g., CoS on a cloud server for uptime, Builder on a local machine with code access). This is not the current requirement. + +--- + +## 4. 
Proposed Architecture Diagram + +``` + +-----------------------+ + | Slack Workspace | + +-----------------------+ + | | + +----------+ +----------+ +----------+ | + | CoS App | |Default | | QA App | | + | (Bot-CoS)| |App (Bot) | | (Bot-QA) | | + +----+-----+ +----+-----+ +----+-----+ | + | | | | + | +-----------+-----------+ | | + | | Channels: | | | + | | #hq #cto #build | | | + | | #invest #know #ops | | | + | | #research | | | + | +-----------------------+ | | + +-----------------------+------+----------+ + | + +-----------------+------------------+ + | OpenClaw Gateway (single) | + +-------------------------------------+ + | | + | accounts: | + | cos: CoS App tokens | + | default: Default App tokens | + | qa: QA App tokens | + | | + | bindings: | + | cos <- accountId: cos | + | qa <- accountId: qa | + | cto <- peer: #cto channel | + | builder <- peer: #build channel | + | cio <- peer: #invest channel | + | ko <- peer: #know channel | + | ops <- peer: #ops channel | + | research<- peer: #research channel| + | | + | A2A tools: sessions_send, | + | sessions_list (shared) | + +-------------------------------------+ + + Interaction patterns: + + CoS (independent) ---sessions_send---> CTO (default bot) + CoS (independent) ---@mention in #cto thread---> CTO (needs allowBots:true on #cto) + QA (independent) ---@mention in #build thread--> Builder (needs allowBots:true on #build) + CTO (default bot) ---sessions_send---> Builder (existing, unchanged) + + Harness mapping: + CoS = Orchestrator/Planner (drives what to do) + QA = Evaluator (challenges what was done) + CTO/Builder/CIO = Generator (does the work) + KO/Ops = System maintenance (unchanged) +``` + +--- + +## 5. Implementation Path + +### What Changes in OpenCrew + +**Config changes (openclaw.json):** + +1. Add `accounts` block with `cos` and `qa` account entries (new Slack App tokens) +2. Keep `default` account (existing single bot -- rename from implicit default) +3. 
Add account-level bindings for CoS and QA +4. Keep peer-level bindings for CTO, Builder, CIO, KO, Ops, Research (unchanged) +5. Add `allowBots: true` to #cto and #build channel configs (enables cross-bot interaction) +6. Optionally add `requireMention: true` to #cto and #build (if you want CoS/QA to only respond when @mentioned in those channels -- recommended) + +**Agent workspace changes:** + +7. Create QA workspace (`~/.openclaw/workspace-qa/`) with SOUL.md, AGENTS.md +8. Define QA's role: review closeouts, challenge quality, check DoD compliance +9. Update CoS AGENTS.md: add orchestration responsibilities, Discussion mode instructions +10. Update CTO AGENTS.md: clarify CTO is a participant in discussions (not orchestrator), add `allowBots` interaction patterns + +**Protocol changes (shared/):** + +11. Update A2A_PROTOCOL.md: clarify that CoS is the Discussion orchestrator for strategic discussions; CTO orchestrates only within its execution scope (CTO->Builder) +12. Add QA agent to Permission Matrix: QA can review any agent's closeout, QA cannot delegate execution tasks + +**Slack setup (human-manual):** + +13. Create CoS Slack App (bot token, app token, Socket Mode) +14. Create QA Slack App (bot token, app token, Socket Mode) +15. Invite CoS-Bot to #hq, #cto, #build (and any other channels CoS should monitor) +16. Invite QA-Bot to #build, #cto, #know (channels where QA should review) +17. Record bot user IDs for @mention formatting + +### What Stays the Same + +- All existing agent workspaces (CTO, Builder, CIO, KO, Ops, Research) -- unchanged +- All existing peer bindings -- unchanged +- The default Slack App (single bot for execution agents) -- unchanged +- A2A Delegation mode (sessions_send) -- unchanged, still the primary mechanism +- Task types (Q/A/P/S), Closeout protocol, Autonomy Ladder -- unchanged +- Channel structure (#hq, #cto, #build, etc.) -- unchanged +- Thread-level session isolation -- unchanged + +--- + +## 6. 
Confidence Assessment + +| Finding | Confidence | Basis | +|---------|-----------|-------| +| One account can serve multiple agents via peer binding | **HIGH** | This is OpenCrew's current production model (CONFIG_SNIPPET_2026.2.9.md) | +| Multi-account supports mixing peer and account bindings | **HIGH** | OpenClaw docs confirm binding specificity hierarchy: peer > accountId > channel > fallback | +| Two bots in same channel both receive events | **HIGH** | Slack Events API delivers to ALL subscribed apps. Confirmed in research_autonomous_slack_r1.md | +| `allowBots: true` enables cross-bot message processing | **HIGH** | Confirmed by docs, community reports, and the "47 replies in 12 seconds" incident | +| Self-loop filter is per-bot-user-ID | **HIGH** | Confirmed by Issue #15836 fix analysis | +| CoS as orchestrator maps to Harness Design | **HIGH** | Role analysis against SOUL.md and Harness methodology | +| Default-bot @mention in non-home channel routes correctly | **LOW** | Default bot's peer binding routes by channel, not by @mention target. @mentioning default bot in #hq routes to CoS (not CTO). This is a limitation. | +| Per-account channel config overrides | **LOW** | Docs say "named accounts can override" but per-channel overrides within accounts are unverified | +| QA as independent evaluator adds net value | **MEDIUM** | Conceptually sound (Harness Design validates the pattern), but no empirical data on QA agent quality in Slack-based review | +| Socket Mode with 3 apps is stable | **MEDIUM** | Within normal range per OpenClaw docs. 5+ apps recommended to switch to HTTP mode | + +--- + +## 7. Open Questions + +1. **Per-account channel overrides**: Can `accounts.cos.channels.<CTO_CH>.requireMention` override the global channel setting? If yes, this solves the coexistence problem cleanly. Needs empirical testing. + +2. 
**Default bot @mention routing in foreign channels**: When CoS @mentions the default bot in #hq (CoS's channel), does the peer binding route to CoS or CTO? Initial analysis says CoS (because #hq's peer binding maps to CoS). This means Discussion mode via @mention in #hq cannot reach CTO through the default bot. Needs testing. + +3. **QA agent scope definition**: What exactly should QA review? Options: + - All closeouts (comprehensive but noisy) + - Only S-type and P-type closeouts (high-signal) + - Only when explicitly triggered by CoS or user + +4. **Bot display names in thread history**: When CTO (default bot) and CoS (CoS-Bot) both post in a thread, do agents loading thread history see distinct sender identities? Or do they see generic "bot" labels? This affects discussion context quality. + +5. **CoS heartbeat as orchestration trigger**: CoS already has a 12-hour heartbeat. Could this heartbeat serve as the "harness loop" -- checking agent status, driving pending tasks forward, synthesizing overnight progress? This would make CoS a proactive orchestrator without requiring external cron jobs. + +6. **Graduated independence**: The analysis assumes CoS and QA both need independence. Should CTO also eventually become independent (own Slack App)? This would solve the @mention routing problem (Question 2) but adds a fourth Slack App. What is the right graduation path? + +7. **Cost impact**: Three Slack Apps means three Socket Mode connections and potentially 3x the event processing for shared channels. What is the token cost of CoS and QA processing events from channels they monitor but do not act on (filtered by `requireMention`)? + +8. **A2A protocol for QA**: The current A2A_PROTOCOL.md does not define a QA role. What is QA's permission in the matrix? Can QA send tasks back to Builder? Can QA escalate to CTO? Does QA participate in the closeout flow or sit outside it? 
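
Question 2 turns on the binding specificity order quoted in Section 6 (peer > accountId > channel > fallback). The following is a minimal TypeScript sketch of that order, not OpenClaw's actual resolver: the `Match`, `Binding`, and `InboundEvent` shapes, the `resolveAgent` function, the channel IDs, and the rule that a binding without `accountId` belongs to the `default` account are all assumptions made for illustration.

```typescript
// Hypothetical shapes for illustration only; not OpenClaw's real types.
type Match = {
  channel: "slack";
  accountId?: string; // which Slack App received the event
  peer?: string;      // Slack channel ID, e.g. "C_HQ" (made-up ID)
};

type Binding = { agentId: string; match: Match };

type InboundEvent = {
  channel: "slack";
  accountId: string;
  peer: string;
  mentionedBotUserIds: string[]; // carried along, but never consulted below
};

// Specificity follows the quoted order: peer > accountId > channel.
function specificity(m: Match): number {
  if (m.peer !== undefined) return 3;
  if (m.accountId !== undefined) return 2;
  return 1;
}

function matches(m: Match, e: InboundEvent): boolean {
  if (m.channel !== e.channel) return false;
  // Assumption: a binding with no accountId is scoped to the "default" account.
  if ((m.accountId ?? "default") !== e.accountId) return false;
  if (m.peer !== undefined && m.peer !== e.peer) return false;
  return true;
}

// Pick the most specific matching binding, or the fallback agent if none match.
function resolveAgent(bindings: Binding[], e: InboundEvent, fallback: string): string {
  let best: Binding | undefined;
  for (const b of bindings) {
    if (!matches(b.match, e)) continue;
    if (best === undefined || specificity(b.match) > specificity(best.match)) {
      best = b;
    }
  }
  return best ? best.agentId : fallback;
}

const bindings: Binding[] = [
  { agentId: "cos", match: { channel: "slack", accountId: "cos" } }, // account-level
  { agentId: "cos", match: { channel: "slack", peer: "C_HQ" } },     // #hq peer binding
  { agentId: "cto", match: { channel: "slack", peer: "C_CTO" } },    // #cto peer binding
];

// CoS-Bot @mentions the default bot in #hq, hoping to reach CTO.
const routed = resolveAgent(
  bindings,
  { channel: "slack", accountId: "default", peer: "C_HQ", mentionedBotUserIds: ["U_DEFAULT_BOT"] },
  "cos",
);
```

Under these assumptions the mention target never enters the decision, so the event in #hq resolves to `cos`, which matches the initial analysis in Question 2. Whether the real resolver behaves this way is what the proposed test would establish.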
diff --git a/.harness/reports/research_slack_r1.md b/.harness/reports/research_slack_r1.md new file mode 100644 index 0000000..bf24687 --- /dev/null +++ b/.harness/reports/research_slack_r1.md @@ -0,0 +1,382 @@ +commit 7e825263db36aef68792a050c324daef598b4c56 +Author: Alex's Mac <alexmac@AlexsdeMac-mini-2.local> +Date: Sat Mar 28 17:38:48 2026 +0800 + + feat: add A2A v2 research harness, architecture, and agent definitions + + Multi-agent harness for researching and designing A2A v2 protocol: + + Research reports (Phase 1): + - Slack: true multi-agent collaboration via multi-account + @mention + - Feishu: groupSessionScope + platform limitation analysis + - Discord: multi-bot routing + Issue #11199 blocker analysis + + Architecture designs (Phase 2): + - A2A v2 Protocol: Delegation (v1) + Discussion (v2) dual-mode + - 5 collaboration patterns: Architecture Review, Strategic Alignment, + Code Review, Incident Response, Knowledge Synthesis + - 3-level orchestration: Human → Agent → Event-Driven + - Platform configs, migration guides, 6 ADRs + + Agent definitions for Claude Code Agent Teams: + - researcher.md, architect.md, doc-fixer.md, qa.md + + QA verification: all issues resolved, PASS verdict after fixes. + + Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> + +diff --git a/.harness/reports/research_slack_r1.md b/.harness/reports/research_slack_r1.md +new file mode 100644 +index 0000000..16022c5 +--- /dev/null ++++ b/.harness/reports/research_slack_r1.md +@@ -0,0 +1,349 @@ ++# Research Report: Slack True Multi-Agent Collaboration (U0, Round 1) ++ ++> Researcher: Claude Opus 4.6 | Date: 2026-03-27 | Contract: `.harness/contracts/research-slack.md` ++ ++--- ++ ++## Executive Summary ++ ++True multi-agent collaboration on Slack -- where multiple agents participate in the same thread as distinct identities, each bringing independent judgment -- is **technically feasible today** using OpenClaw's multi-account Slack support. 
The key enabler is `channels.slack.accounts`: each agent gets its own Slack app/bot, its own `xoxb-` token, and a binding to a specific OpenClaw agent. With `allowBots: true` + `requireMention: true` on shared channels, Bot-B can see Bot-A's messages as context and respond when explicitly @mentioned. This eliminates the self-loop problem that forces today's two-step `sessions_send` workaround. However, the current OpenCrew deployment uses single-bot mode, so migrating requires creating 6-7 Slack apps and reconfiguring bindings -- a significant but well-documented path. ++ ++--- ++ ++## 1. Platform Capability Assessment ++ ++### 1.1 Slack Multi-Bot in Threads ++ ++**Confidence: HIGH** (verified against Slack API docs and practical testing reports) ++ ++Slack's Events API delivers message events to all apps that are members of a channel, regardless of who posted the message. Specifically: ++ ++- **Bot-A posts in a thread; Bot-B receives the event**: Yes. Each Slack app subscribed to `message.channels` (or equivalent) receives events for all messages in channels it has joined, including messages posted by other bots. The `bot_message` subtype identifies these. ([Slack Events API docs](https://docs.slack.dev/apis/events-api/), [Slack message event reference](https://docs.slack.dev/reference/events/message/)) ++ ++- **Self-loop prevention is app-side, not platform-side**: Slack itself does NOT filter out a bot's own messages from event delivery. Frameworks like Bolt implement `ignoring_self` as an application-level guard. This means each app must decide whether to ignore its own messages. ([Slack bot interactions docs](https://api.slack.com/bot-users)) ++ ++- **Thread participation**: Any bot that is a member of a channel can post to any thread in that channel using the `thread_ts` parameter. No special permissions needed beyond `chat:write`. 
([Slack threading blog](https://medium.com/slack-developer-blog/bringing-your-bot-into-threaded-messages-cd272a42924f)) ++ ++- **Visual identity**: Each Slack app has its own name, icon, and bot user ID. Messages from different bots are visually distinct in threads -- this is the key advantage over single-bot mode where all agents look like the same entity. ++ ++**Key implication**: The Slack platform fully supports multiple bots having a real-time conversation in a thread. There is no platform-level barrier. ++ ++### 1.2 OpenClaw Slack Plugin Current State ++ ++**Confidence: HIGH** (verified against OpenClaw official docs, GitHub gist, and DeepWiki source analysis) ++ ++OpenClaw's Slack plugin already supports multi-account mode: ++ ++**Multi-account configuration** (`channels.slack.accounts`): ++```json ++{ ++ "channels": { ++ "slack": { ++ "accounts": { ++ "default": { "botToken": "xoxb-cos-...", "appToken": "xapp-cos-..." }, ++ "cto": { "botToken": "xoxb-cto-...", "appToken": "xapp-cto-...", "name": "CTO" }, ++ "builder": { "botToken": "xoxb-bld-...", "appToken": "xapp-bld-...", "name": "Builder" } ++ } ++ } ++ } ++} ++``` ++ ++**Binding per account**: ++```json ++{ ++ "bindings": [ ++ { "agentId": "cos", "match": { "channel": "slack", "accountId": "default" } }, ++ { "agentId": "cto", "match": { "channel": "slack", "accountId": "cto" } }, ++ { "agentId": "builder", "match": { "channel": "slack", "accountId": "builder" } } ++ ] ++} ++``` ++ ++**Bot message handling** -- three modes for `allowBots`: ++- `false` (default): All bot messages ignored. Current OpenCrew behavior. ++- `true`: All bot messages accepted as inbound. Requires loop prevention. ++- `"mentions"`: Bot messages accepted only if they @mention this bot. Safest for multi-agent. ++ ++**Self-loop prevention**: OpenClaw ignores messages from the same bot user ID (`message.user === botUserId`). 
With multi-account, each account has a different `botUserId`, so Bot-CTO's messages are NOT filtered by Bot-Builder's agent -- they are treated as real inbound. ([OpenClaw Slack docs](https://docs.openclaw.ai/channels/slack), [GitHub gist](https://gist.github.com/rafaelquintanilha/9ca5ae6173cd0682026754cfefe26d3f)) ++ ++**Agent identity** (visual differentiation): OpenClaw supports `chat:write.customize` scope for per-agent name/icon override. Issue #27080 (identity not applied on inbound-triggered replies) was fixed in PR #27134. With multi-account, each bot has its native identity, so `chat:write.customize` is not even needed -- each app's profile serves as the identity. ([GitHub issue #27080](https://github.com/openclaw/openclaw/issues/27080)) ++ ++**Thread session isolation**: `thread.historyScope = "thread"` + `inheritParent = false` ensures each thread is an independent session. `initialHistoryLimit` controls how many prior messages load when a new session starts in an existing thread. 
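
Returning to the three `allowBots` modes and the self-loop check described earlier in this section, the combined decision can be sketched as one function. This is a minimal TypeScript illustration, not the plugin's code: `SlackMessage`, `shouldAccept`, and the user and bot IDs are hypothetical, and detecting a mention by searching for Slack's native `<@USER_ID>` form is an assumption. The sketch also ignores `requireMention`, which is a separate setting.

```typescript
// allowBots modes as described above: false (default), true, or "mentions".
type AllowBots = boolean | "mentions";

// Simplified message shape; only the fields this sketch needs.
type SlackMessage = {
  user?: string;   // author's user ID
  bot_id?: string; // present when a bot authored the message
  text: string;
};

function shouldAccept(msg: SlackMessage, botUserId: string, allowBots: AllowBots): boolean {
  const isBotMessage = Boolean(msg.bot_id);
  // Human-authored messages are not affected by allowBots.
  if (!isBotMessage) return true;
  // Self-loop guard: a bot never processes its own messages, whatever the mode.
  if (msg.user === botUserId) return false;
  if (allowBots === true) return true;
  if (allowBots === "mentions") {
    // Assumption: a mention appears in the text as Slack's native <@USER_ID> form.
    return msg.text.indexOf(`<@${botUserId}>`) !== -1;
  }
  return false; // allowBots === false: all other bot messages are ignored
}

// Made-up IDs: Bot-CTO posts in a thread and mentions Bot-Builder.
const fromCto: SlackMessage = {
  user: "U_CTO",
  bot_id: "B_CTO",
  text: "<@U_BUILDER> is this feasible?",
};

const builderDefaultMode = shouldAccept(fromCto, "U_BUILDER", false);
const builderAllowAll = shouldAccept(fromCto, "U_BUILDER", true);
const builderMentionsOnly = shouldAccept(fromCto, "U_BUILDER", "mentions");
const ctoSeesOwnMessage = shouldAccept(fromCto, "U_CTO", true);
```

With separate accounts, Bot-Builder's check compares against `U_BUILDER`, so Bot-CTO's message only reaches it once `allowBots` is enabled, while Bot-CTO still drops its own message in every mode.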
++ ++### 1.3 Gap Analysis ++ ++| Capability | Slack Platform | OpenClaw Plugin | OpenCrew Config | Gap | ++|-----------|---------------|----------------|----------------|-----| ++| Multiple bots in one channel | YES | YES (multi-account) | NO (single bot) | **OpenCrew config change** | ++| Bot-B sees Bot-A's messages | YES (Events API) | YES (`allowBots`) | NO (`allowBots: false`) | **OpenCrew config change** | ++| Visual identity per agent | YES (separate apps) | YES (multi-account or `chat:write.customize`) | NO (shared identity) | **OpenCrew config change** | ++| Thread-level session isolation | YES (native threads) | YES (`historyScope: "thread"`) | YES (already configured) | None | ++| Loop prevention | N/A (app-side) | YES (`requireMention`, `allowBots: "mentions"`) | N/A | **OpenCrew config change** | ++| Orchestrated turn-taking | N/A | Partial (`sessions_send` for explicit trigger) | YES (two-step A2A) | **New orchestration logic needed** | ++| Agent sees full thread history | YES | YES (`initialHistoryLimit`) | Partial | **Config tuning** | ++ ++**Bottom line**: The platform (Slack) and middleware (OpenClaw) already support everything needed. The gap is entirely at the OpenCrew configuration and protocol layer. No upstream code changes are required. ++ ++--- ++ ++## 2. Collaboration Patterns Catalog ++ ++### 2.1 Discussion Pattern ++ ++**Description**: Multiple agents participate in a single Slack thread, each contributing from their domain perspective. Example: CTO proposes architecture, Builder critiques feasibility, QA identifies risks -- they iterate to convergence. ++ ++**Mechanics**: ++1. Human (or CoS) posts a topic in a shared channel (e.g., `#cto`) or a dedicated `#collab` channel. ++2. CTO is bound to that channel and responds first with an architecture proposal. ++3. Human (or orchestrator agent) @mentions `@Builder` in the thread: "What do you think about feasibility?" ++4. 
Builder's bot receives the thread message (because `allowBots: true` + it was @mentioned), loads thread history via `initialHistoryLimit`, and responds with feasibility analysis. ++5. Human @mentions `@CTO` again: "How do you respond to Builder's concerns?" ++6. CTO sees Builder's messages in thread history and refines the proposal. ++7. Repeat until convergence. ++ ++**Requirements**: ++- Multi-account Slack setup (one app per participating agent) ++- `allowBots: true` or `"mentions"` on the shared channel ++- `requireMention: true` on shared channels (loop prevention) ++- `thread.historyScope: "thread"` + `initialHistoryLimit >= 50` (so each agent sees the full discussion) ++- `inheritParent: true` on shared channels (so thread participants inherit the root message context) ++ ++**Feasibility**: **NOW** -- achievable with config changes only. No code changes to OpenClaw or OpenCrew needed. ++ ++### 2.2 Review Pattern ++ ++**Description**: One agent produces work, multiple agents review it in the same thread. Example: Builder submits a design doc, CTO reviews architecture soundness, QA reviews correctness, KO checks knowledge consistency. ++ ++**Mechanics**: ++1. Builder posts deliverable in `#build` thread (the existing A2A closeout thread). ++2. CTO is @mentioned in the thread: "@CTO please review architecture." ++3. CTO's bot receives the message, loads thread history (seeing Builder's full output), and posts architecture review. ++4. QA is @mentioned: "@QA please review correctness." ++5. QA reads both Builder's output and CTO's review, posts correctness assessment. ++6. Builder is @mentioned with consolidated feedback: "@Builder please address these items." ++ ++**Requirements**: ++- Same multi-account setup as Discussion Pattern. ++- Reviewers' bots must be invited to the channel where the review thread lives. ++- `initialHistoryLimit` must be high enough to capture the full deliverable (possibly 80-100 for large outputs). 
++- Each reviewer agent needs workspace instructions (SOUL.md/AGENTS.md) that define their review perspective. ++ ++**Feasibility**: **NOW** -- identical infrastructure to Discussion Pattern. The only addition is role-specific review instructions in each agent's workspace. ++ ++### 2.3 Brainstorm Pattern ++ ++**Description**: Agents take turns building on each other's ideas in a free-form exploration. Example: CoS states a goal, CTO proposes technical approaches, CIO adds domain constraints, Builder estimates effort -- they converge on a plan. ++ ++**Mechanics**: ++1. Human posts a brainstorm prompt in a shared channel: "How should we approach X?" ++2. Multiple agents are @mentioned (or an orchestrator agent manages turn order). ++3. Each agent reads the full thread history before contributing. ++4. **Turn management option A -- Human orchestrated**: Human @mentions the next agent after each response. ++5. **Turn management option B -- Agent orchestrated**: A designated orchestrator agent (CoS or CTO) reads each response and decides who to call next, posting "@Builder what's your take?" or "@CIO any domain constraints?" ++6. **Turn management option C -- Round-robin**: A lightweight script or orchestrator sends @mentions in a fixed order with a configurable delay. ++ ++**Requirements**: ++- All participating agents need bots in the shared channel. ++- Higher `initialHistoryLimit` (brainstorms can get long). ++- Clear termination criteria (who decides the brainstorm is "done"?). ++- For Option B (agent-orchestrated): The orchestrator agent needs `allowBots: true` to see other agents' messages AND the ability to @mention other bots in its responses. ++ ++**Feasibility**: **NEAR** -- The infrastructure is the same as Discussion Pattern (NOW). However, agent-orchestrated turn management (Option B) requires that agents can reliably @mention other bots in their messages and that the mentioned bot's Slack app correctly recognizes the mention. This needs validation. 
Human-orchestrated (Option A) works NOW. ++ ++--- ++ ++## 3. Comparison with Harness Design ++ ++### 3.1 File-based Blackboard vs Chat-based Collaboration ++ ++Anthropic's harness design for long-running applications uses a **file-based Blackboard pattern**: agents write files, other agents read them. The Planner writes a spec file, the Generator reads it and writes code, the Evaluator reads the code and writes a review file. Communication is asynchronous, persistent, and structured. ([Anthropic engineering blog](https://www.anthropic.com/engineering/harness-design-long-running-apps)) ++ ++| Dimension | Harness (File-based) | OpenCrew Slack (Chat-based) | ++|-----------|---------------------|---------------------------| ++| **Communication medium** | Files on disk | Slack thread messages | ++| **Persistence** | Git-trackable files | Slack thread history (ephemeral on free plan) | ++| **Structure** | Highly structured (sprint contracts, spec files) | Semi-structured (thread messages with conventions) | ++| **Latency** | Near-zero (local filesystem) | ~1-3s per message (Slack API round-trip) | ++| **Human visibility** | Requires explicit file inspection | Built-in (Slack UI) | ++| **Context window** | Full file contents per agent | Thread history limited by `initialHistoryLimit` | ++| **Turn management** | Explicit (harness orchestrator) | @mention-based or human-driven | ++| **Adversarial review** | Generator vs Evaluator (GAN-inspired) | Any agent vs any agent (same mechanism) | ++ ++The harness pattern's key strength is **deterministic orchestration**: the harness code decides exactly when each agent runs and what context it receives. The Slack pattern's strength is **human-in-the-loop visibility and intervention**: any human can read the thread, jump in, redirect, or override at any point. 
++ ++### 3.2 Unique Value of Real-time Multi-Agent Chat ++ ++**Confidence: MEDIUM** (inference based on architecture comparison, not empirical measurement) ++ ++The Slack-based approach offers several advantages the file-based harness cannot: ++ ++1. **Real-time human oversight**: The user watches the discussion unfold in real-time and can intervene ("Actually, ignore that constraint -- we changed requirements"). File-based harnesses require the user to inspect files after-the-fact. ++ ++2. **Natural escalation**: If agents get stuck or disagree, the human is already in the thread and can break the tie. In a harness, you need explicit escalation mechanisms. ++ ++3. **Organizational memory**: Slack threads persist as searchable organizational history. Future agents (or humans) can search for "that architecture discussion we had about X" and find the full multi-agent deliberation. ++ ++4. **Progressive trust building**: Users can start with human-orchestrated discussions (manually @mentioning agents) and gradually move to agent-orchestrated as trust builds. The harness pattern is all-or-nothing autonomous. ++ ++5. **Cross-domain collaboration**: A Slack thread can include agents from different "layers" (CoS + CTO + Builder) that wouldn't interact in a harness's rigid pipeline. ++ ++**Trade-off**: Slack-based collaboration is slower (network latency, message rendering) and less structured than file-based. For pure software generation tasks, the harness pattern is likely more efficient. For strategic decisions, design reviews, and cross-functional alignment, Slack-based collaboration is superior. ++ ++--- ++ ++## 4. Recommended Architecture ++ ++### 4.1 Multi-Bot Configuration ++ ++**Recommended approach**: Create separate Slack apps for each "conversational" agent. Not all 7 agents need their own app -- only those that participate in multi-agent discussions. 
++ ++**Tier 1 -- Own Slack App** (agents that discuss): ++- CoS, CTO, Builder (core discussion participants) ++ ++**Tier 2 -- Own Slack App if needed** (agents that may review): ++- CIO (domain specialist, participates in strategic discussions) ++- KO (participates in knowledge reviews) ++ ++**Tier 3 -- Shared or no Slack App** (agents that don't discuss): ++- Ops (audit only, doesn't need to participate in discussions) ++- Research (ephemeral worker, spawned via `sessions_spawn`) ++ ++**Configuration skeleton**: ++```json ++{ ++ "channels": { ++ "slack": { ++ "accounts": { ++ "default": { "botToken": "xoxb-cos-...", "appToken": "xapp-cos-..." }, ++ "cto": { "botToken": "xoxb-cto-...", "appToken": "xapp-cto-..." }, ++ "builder": { "botToken": "xoxb-bld-...", "appToken": "xapp-bld-..." } ++ }, ++ "channels": { ++ "<COLLAB_CHANNEL_ID>": { ++ "allow": true, ++ "requireMention": true, ++ "allowBots": true ++ } ++ }, ++ "thread": { ++ "historyScope": "thread", ++ "inheritParent": true, ++ "initialHistoryLimit": 50 ++ } ++ } ++ }, ++ "bindings": [ ++ { "agentId": "cos", "match": { "channel": "slack", "accountId": "default" } }, ++ { "agentId": "cto", "match": { "channel": "slack", "accountId": "cto" } }, ++ { "agentId": "builder", "match": { "channel": "slack", "accountId": "builder" } } ++ ] ++} ++``` ++ ++**Each Slack app requires**: Bot Token Scopes: `channels:history`, `channels:read`, `chat:write`, `chat:write.customize`, `users:read`. Event Subscriptions: `message.channels`, `app_mention`. Socket Mode enabled with `connections:write` scope on the app-level token. ++ ++### 4.2 Orchestration Model ++ ++**Recommended: Hybrid human + agent orchestration** (phased rollout) ++ ++**Phase 1 -- Human Orchestrated (NOW)**: ++- User @mentions agents in threads to drive discussion. ++- All agents have `requireMention: true` + `allowBots: true`. ++- User controls pace, topic, and turn order. ++- This is the safest starting point and requires zero protocol changes. 
++ ++**Phase 2 -- Agent Orchestrated (NEAR)**: ++- Designate CTO (or CoS) as the orchestrator for technical discussions. ++- Orchestrator agent's AGENTS.md includes instructions: "After receiving input, decide which specialist to consult next and @mention them in the thread." ++- Add guardrails: max 3 agent-to-agent turns per discussion before requiring human input. ++- `maxPingPongTurns` from A2A protocol can be repurposed as a discussion round limit. ++ ++**Phase 3 -- Event-Driven with Guardrails (FUTURE)**: ++- Agents proactively respond when they detect messages relevant to their domain (similar to SlackAgents' proactive mode from [EMNLP 2025](https://aclanthology.org/2025.emnlp-demos.76.pdf)). ++- Requires sophisticated relevance filtering to avoid noise. ++- Use `allowBots: "mentions"` as the safety valve. ++ ++### 4.3 Integration with Existing A2A Protocol ++ ++The multi-bot architecture does NOT replace the existing A2A protocol -- it extends it with a new mode: ++ ++**A2A v1 (existing -- single-bot delegation)**: ++- Two-step trigger: visible anchor + `sessions_send` ++- Use case: Structured task delegation (CTO assigns Builder a specific task) ++- Keep for: All existing delegation workflows, task tracking, closeout flows ++ ++**A2A v2 (new -- multi-bot discussion)**: ++- One-step trigger: @mention in a shared thread ++- Use case: Multi-party discussion, review, brainstorm ++- Session: Each agent's session is the thread itself (`thread:<threadTs>`) ++- Context: Thread history serves as the shared context (Blackboard equivalent) ++ ++**Coexistence**: Both modes can coexist. A Discussion (v2) in `#collab` can result in a Delegation (v1) where CTO creates a task thread in `#build` and `sessions_send`s Builder. The discussion thread serves as the "why" record; the task thread serves as the "what" execution record. ++ ++**Migration path**: No breaking changes. Add multi-bot accounts alongside existing single-bot. 
Existing bindings continue to work for channels that don't need multi-agent discussion. New `#collab` or shared channels use multi-bot + `allowBots` + `requireMention`. ++ ++--- ++ ++## 5. Confidence Assessment ++ ++| Finding | Confidence | Evidence | ++|---------|-----------|---------| ++| Slack Events API delivers Bot-A's messages to Bot-B | **HIGH** | Slack API docs confirm apps receive all message events in channels they've joined. Community testing confirms. | ++| OpenClaw `channels.slack.accounts` supports multi-bot | **HIGH** | Official docs, GitHub gist with working config, DeepWiki source analysis all confirm. | ++| `allowBots: true` + `requireMention: true` prevents loops | **HIGH** | Official OpenClaw docs explicitly recommend this combination. Community reports confirm. | ++| Self-loop filter is per-bot-user-ID (not global) | **HIGH** | OpenClaw Slack docs: "ignores messages from the same bot user ID." Multi-account = different user IDs. Discord issue #11199 confirms the same logic was fixed for Discord with sibling bot registry. | ++| Agent identity fix (PR #27134) enables visual differentiation | **HIGH** | GitHub issue #27080 closed with fix. With multi-account, native app profiles provide identity without needing `chat:write.customize`. | ++| Discussion/Review patterns work NOW with config changes | **MEDIUM** | All required primitives exist (multi-account, allowBots, thread history). Not yet tested end-to-end in an OpenCrew deployment. | ++| Agent-orchestrated turn management works | **MEDIUM** | Requires agents to @mention other bots in their messages. OpenClaw's message tool can include @mentions, but reliable mention-parsing by receiving bot needs validation. | ++| `allowBots: "mentions"` mode exists | **MEDIUM** | DeepWiki analysis mentions this as a supported value. Not found in the main official docs page, but referenced in source-level documentation. 
| ++| Brainstorm pattern with automatic turn-taking | **LOW** | Conceptually sound but no existing implementation. Requires custom orchestration logic not yet built. | ++| Thread history is sufficient as Blackboard replacement | **LOW** | Depends on thread length, `initialHistoryLimit` setting, and whether agents can parse unstructured thread content as effectively as structured files. | ++ ++--- ++ ++## 6. Open Questions ++ ++1. **Slack free plan thread history**: Does the free Slack plan's message history limit (90 days) affect thread-based collaboration? For long-running projects, will old discussion threads become inaccessible? ++ ++2. **Mention parsing reliability**: When Agent-CTO posts "@Builder what do you think?", does OpenClaw's Slack plugin reliably detect this as a mention of the Builder bot and route it to the Builder agent? Or does it require Slack's native mention format (`<@BOT_USER_ID>`)? This needs empirical testing. ++ ++3. **Socket Mode connection limits**: With 5-7 separate Slack apps all using Socket Mode, does this create issues with Slack's connection limits or rate limits? The OpenClaw docs recommend HTTP mode for multi-account: "Give each account a distinct `webhookPath` so registrations do not collide." ++ ++4. **Session isolation in shared channels**: When CTO and Builder both participate in a thread in `#collab`, they each have their own session (`agent:cto:slack:channel:COLLAB:thread:TS` and `agent:builder:slack:channel:COLLAB:thread:TS`). Do these sessions conflict? Can both write to the same thread without routing issues? ++ ++5. **`maxPingPongTurns` applicability**: The existing A2A protocol uses `maxPingPongTurns = 4` for `sessions_send` loops. In the new discussion pattern, is there an equivalent limit for @mention-driven discussions? Without one, an agent-orchestrated discussion could theoretically run indefinitely. ++ ++6. **Cost implications**: Each Slack app consumes one Socket Mode connection. 
With 5-7 apps, the OpenClaw gateway maintains 5-7 persistent WebSocket connections. What is the resource impact? Is HTTP Events API mode more appropriate for this scale? ++ ++7. **Discord/Feishu parity**: The Discord plugin had a similar global bot-filter bug (#11199) that was fixed with a sibling bot registry (PRs #11644, #22611, #35479). Has the Slack plugin received an equivalent fix, or does it still use the simpler per-bot-user-ID check? The Slack issue #15836 (agent-to-agent routing) was closed as NOT_PLANNED -- does multi-account mode make that issue moot? ++ ++8. **SlackAgents (EMNLP 2025) proactive mode**: The research paper describes agents that listen to threads without being mentioned and proactively contribute. Could this "proactive mode" be adapted for OpenCrew? What relevance filtering would prevent noise? ++ ++--- ++ ++## Appendix A: Key References ++ ++- [OpenClaw Slack Plugin Docs](https://docs.openclaw.ai/channels/slack) ++- [Running Multiple AI Agents as Slack Teammates (GitHub Gist)](https://gist.github.com/rafaelquintanilha/9ca5ae6173cd0682026754cfefe26d3f) ++- [OpenClaw Multi-Agent Routing Docs](https://docs.openclaw.ai/concepts/multi-agent) ++- [OpenClaw Issue #15836: Agent-to-agent Slack routing](https://github.com/openclaw/openclaw/issues/15836) ++- [OpenClaw Issue #27080: Slack agent identity fix](https://github.com/openclaw/openclaw/issues/27080) ++- [OpenClaw Issue #11199: Discord multi-bot filtering](https://github.com/openclaw/openclaw/issues/11199) ++- [Anthropic Harness Design Blog Post](https://www.anthropic.com/engineering/harness-design-long-running-apps) ++- [SlackAgents: EMNLP 2025 Demo Paper](https://aclanthology.org/2025.emnlp-demos.76.pdf) ++- [Slack Events API Documentation](https://docs.slack.dev/apis/events-api/) ++- [Slack Message Event Reference](https://docs.slack.dev/reference/events/message/) ++- [OpenClaw Slack Setup Best Practices (Macaron)](https://macaron.im/blog/openclaw-slack-setup) ++- [OpenClaw Multi-Agent 
Setup Tutorial (LumaDock)](https://lumadock.com/tutorials/openclaw-multi-agent-setup) ++ ++## Appendix B: Glossary ++ ++- **A2A**: Agent-to-Agent protocol used in OpenCrew for task delegation ++- **allowBots**: OpenClaw config setting controlling whether bot-authored messages are processed ++- **Blackboard pattern**: Communication pattern where agents read/write shared files (used in Anthropic's harness) ++- **Multi-account**: OpenClaw feature allowing multiple Slack apps (each with own bot token) in one gateway ++- **requireMention**: OpenClaw config setting requiring explicit @mention for agent activation ++- **sessions_send**: OpenClaw tool for sending messages to another agent's session (the "real trigger" in A2A v1) ++- **Socket Mode**: Slack connection mode using persistent WebSocket (default for OpenClaw) ++- **Two-step trigger**: Current OpenCrew A2A mechanism: visible anchor message + sessions_send diff --git a/.harness/reports/verify_source_code_r1.md b/.harness/reports/verify_source_code_r1.md new file mode 100644 index 0000000..7625a91 --- /dev/null +++ b/.harness/reports/verify_source_code_r1.md @@ -0,0 +1,389 @@ +# Source Code Verification: Cross-Bot Message Routing + +**Repo**: `openclaw/openclaw` (GitHub, accessed 2026-03-27) +**Method**: Direct source code reads via `gh api` against the `openclaw/openclaw` repository. + +--- + +## 1. allowBots Filter + +### Source Files +- **Config type definition**: `src/config/types.slack.ts` (lines 41, 115) +- **Channel-level config resolution**: `extensions/slack/src/monitor/channel-config.ts` +- **Actual filter logic**: `extensions/slack/src/monitor/message-handler/prepare.ts` + +### Code Snippet (from `prepare.ts`, `resolveSlackConversationContext`) + +```typescript +const allowBots = + channelConfig?.allowBots ?? + account.config?.allowBots ?? + cfg.channels?.slack?.allowBots ?? + false; +``` + +### Behavior + +The `allowBots` flag is resolved with a **three-tier fallback**: + +1. 
**Per-channel config** (`channels.slack.channels.<channelId>.allowBots`) -- highest priority +2. **Per-account config** (`channels.slack.accounts.<accountId>.allowBots`) -- middle priority +3. **Global Slack config** (`channels.slack.allowBots`) -- lowest priority +4. **Default**: `false` + +When `allowBots` is `false` (the default), all messages with a `bot_id` field are silently dropped. The check occurs in `authorizeSlackInboundMessage`: + +```typescript +if (isBotMessage) { + if (message.user && ctx.botUserId && message.user === ctx.botUserId) { + return null; // self-loop filter (always blocks own messages) + } + if (!allowBots) { + logVerbose(`slack: drop bot message ${message.bot_id ?? "unknown"} (allowBots=false)`); + return null; + } +} +``` + +--- + +## 2. Self-Loop Filter + +### Source File +- `extensions/slack/src/monitor/message-handler/prepare.ts`, function `authorizeSlackInboundMessage` + +### Code Snippet + +```typescript +if (isBotMessage) { + if (message.user && ctx.botUserId && message.user === ctx.botUserId) { + return null; + } + if (!allowBots) { + logVerbose(`slack: drop bot message ${message.bot_id ?? "unknown"} (allowBots=false)`); + return null; + } +} +``` + +Where `isBotMessage` is defined as: + +```typescript +isBotMessage: Boolean(message.bot_id), +``` + +And `ctx.botUserId` is set per-account from `auth.test`: + +```typescript +const auth = await app.client.auth.test({ token: botToken }); +botUserId = auth.user_id ?? ""; +``` + +### Per-Account or Global? + +**Per-account.** Each Slack account (`monitorSlackProvider`) creates its own `SlackMonitorContext` with its own `botUserId`. The self-loop check compares `message.user === ctx.botUserId`, which is the bot user ID **of that specific Slack App/account**. 
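
The per-account check can be sketched in isolation as follows. This is a simplified model, not OpenClaw's actual code: `AccountContext`, `resolveAllowBots`, `authorize`, the `strict` account, and the `B_COS` bot ID are illustrative names, and `allowBots` is assumed to be a plain boolean.

```typescript
// Hypothetical, simplified types: not the real OpenClaw definitions.
interface InboundMessage {
  user?: string;
  bot_id?: string;
  text?: string;
}

interface AccountContext {
  accountId: string;
  botUserId: string; // resolved per account (auth.test in the real code)
  channelAllowBots?: boolean;
  accountAllowBots?: boolean;
  globalAllowBots?: boolean;
}

type Verdict = "accept" | "drop:self" | "drop:allowBots=false";

// Three-tier fallback: channel -> account -> global -> false.
function resolveAllowBots(ctx: AccountContext): boolean {
  return ctx.channelAllowBots ?? ctx.accountAllowBots ?? ctx.globalAllowBots ?? false;
}

function authorize(ctx: AccountContext, message: InboundMessage): Verdict {
  const isBotMessage = Boolean(message.bot_id);
  if (isBotMessage) {
    // Self-loop filter: compares against THIS account's bot user ID only.
    if (message.user && ctx.botUserId && message.user === ctx.botUserId) {
      return "drop:self";
    }
    if (!resolveAllowBots(ctx)) {
      return "drop:allowBots=false";
    }
  }
  return "accept";
}

const defaultBot: AccountContext = { accountId: "default", botUserId: "U_DEFAULT", channelAllowBots: true };
const cosBot: AccountContext = { accountId: "cos", botUserId: "U_COS", channelAllowBots: true };
const strictBot: AccountContext = { accountId: "strict", botUserId: "U_STRICT" }; // no allowBots set anywhere

// A message authored by CoS-Bot (B_COS is a made-up bot ID).
const fromCos: InboundMessage = { user: "U_COS", bot_id: "B_COS", text: "status?" };

console.log(authorize(defaultBot, fromCos)); // accept
console.log(authorize(cosBot, fromCos));     // drop:self
console.log(authorize(strictBot, fromCos));  // drop:allowBots=false
```

The same message yields three different verdicts because each context carries its own `botUserId` and its own resolved `allowBots` value.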
+ +This means: +- **Default-Bot** (account `default`, botUserId = `U_DEFAULT`) will only drop messages where `message.user === "U_DEFAULT"` +- **CoS-Bot** (account `cos`, botUserId = `U_COS`) will only drop messages where `message.user === "U_COS"` + +When CoS-Bot posts a message, Default-Bot's self-loop check (`message.user === U_COS`) does NOT match `U_DEFAULT`, so it passes through (assuming `allowBots=true`). + +--- + +## 3. Multi-Account Event Dispatch + +### Source Files +- **Gateway orchestration**: `src/gateway/server-channels.ts` +- **Per-account provider boot**: `extensions/slack/src/monitor/provider.ts` +- **Channel plugin gateway hook**: `extensions/slack/src/channel.ts` (`gateway.startAccount`) +- **Event registration**: `extensions/slack/src/monitor/events/messages.ts` + +### How Events Are Routed + +Each Slack account gets **its own independent Bolt `App` instance** with its own socket connection or HTTP receiver: + +From `provider.ts`: +```typescript +const app = new App( + slackMode === "socket" + ? { token: botToken, appToken, socketMode: true, clientOptions } + : { token: botToken, receiver: receiver ?? undefined, clientOptions }, +); +``` + +From `channel.ts` (`gateway.startAccount`): +```typescript +startAccount: async (ctx) => { + const account = ctx.account; + const botToken = account.botToken?.trim(); + const appToken = account.appToken?.trim(); + ctx.log?.info(`[${account.accountId}] starting provider`); + return getSlackRuntime().channel.slack.monitorSlackProvider({ + botToken: botToken ?? "", + appToken: appToken ?? "", + accountId: account.accountId, + config: ctx.cfg, + // ... + }); +}, +``` + +From `server-channels.ts` (`startChannelInternal`): +```typescript +const accountIds = accountId ? [accountId] : plugin.config.listAccountIds(cfg); +// ... 
+await Promise.all( + accountIds.map(async (id) => { + // each account gets its own startAccount call + }), +); +``` + +**Each account runs a completely separate Slack Bolt App** with its own WebSocket connection to Slack. Events from Slack are delivered directly to the Bolt App that owns that bot token. + +### Critical Architectural Point: No Cross-Account Event Delivery + +Unlike Feishu (where all bots in a group receive every message from every member, including other bots), **Slack delivers events only to the App that owns the relevant subscription**. Each Slack App gets its own events stream. + +### Event Isolation via `shouldDropMismatchedSlackEvent` + +Each account's context also includes `shouldDropMismatchedSlackEvent`, which checks `api_app_id` and `team_id`: + +```typescript +const shouldDropMismatchedSlackEvent = (body: unknown) => { + // ... + if (params.apiAppId && incomingApiAppId && incomingApiAppId !== params.apiAppId) { + logVerbose(`slack: drop event with api_app_id=${incomingApiAppId} (expected ${params.apiAppId})`); + return true; + } + if (params.teamId && incomingTeamId && incomingTeamId !== params.teamId) { + logVerbose(`slack: drop event with team_id=${incomingTeamId} (expected ${params.teamId})`); + return true; + } + return false; +}; +``` + +This is a safety net for HTTP mode where requests might be shared, but in socket mode each App has its own connection so this rarely fires. + +### Dedup Behavior + +**Slack has NO cross-account broadcast deduplication** like Feishu does. The reason is architectural: + +- **Feishu**: All bots in a group receive the same event via a shared webhook. Feishu explicitly deduplicates with `tryRecordMessagePersistent(ctx.messageId, "broadcast")` using a shared namespace. +- **Slack**: Each bot App has its own event subscription stream. There is no shared event delivery mechanism, so no cross-account dedup is needed. 
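
One consequence of independent per-account streams is that a single human post can be handled once per account. The sketch below models that under stated assumptions: `AccountStream`, `receive`, and the `U_HUMAN` user ID are hypothetical names, and each stream is assumed to get its own copy of the event because its bot is a member of the channel.

```typescript
// Hypothetical model of per-account event streams; not OpenClaw's actual types.
interface SlackEvent {
  channel: string;
  ts: string;
  user: string;
  botId?: string;
}

interface AccountStream {
  accountId: string;
  botUserId: string;
  processed: string[]; // per-account state, never shared with other accounts
}

function makeStream(accountId: string, botUserId: string): AccountStream {
  return { accountId, botUserId, processed: [] };
}

// Each account consults only its own state when an event arrives.
function receive(stream: AccountStream, event: SlackEvent): void {
  if (event.botId && event.user === stream.botUserId) {
    return; // self-loop filter, scoped to this account
  }
  stream.processed.push(`${stream.accountId}|${event.channel}|${event.ts}`);
}

const streams: AccountStream[] = [
  makeStream("default", "U_DEFAULT"),
  makeStream("cos", "U_COS"),
];

const humanPost: SlackEvent = { channel: "C_CTO", ts: "1700000000.000100", user: "U_HUMAN" };

// Assumption: both bots are in the channel, so each stream sees the event.
for (const stream of streams) {
  receive(stream, humanPost);
}

const handledBy = streams.filter((s) => s.processed.length > 0).map((s) => s.accountId);
console.log(handledBy); // [ 'default', 'cos' ]
```

With no shared state between the two streams, nothing stops both accounts from acting on the same post, so gating has to come from configuration such as `requireMention`.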
+ +Within a single account, there is a `markMessageSeen` dedup to handle the `message` vs `app_mention` race: + +```typescript +const seenMessages = createDedupeCache({ ttlMs: 60_000, maxSize: 500 }); +const markMessageSeen = (channelId: string | undefined, ts?: string) => { + if (!channelId || !ts) { return false; } + return seenMessages.check(`${channelId}:${ts}`); +}; +``` + +This dedup is **per-account** (each `SlackMonitorContext` has its own `seenMessages` cache). It exists to prevent the same message from being processed twice within ONE account (e.g., when both `message` and `app_mention` events fire for the same message). + +There is also a global `inbound-dedupe.ts` that runs at the agent dispatch layer (`shouldSkipDuplicateInbound`), but its key includes `AccountId`: + +```typescript +return [provider, accountId, sessionScope, peerId, threadId, messageId].filter(Boolean).join("|"); +``` + +Since `accountId` is part of the key, the same physical message delivered to two different accounts generates **different dedup keys** and is processed independently by both. + +--- + +## 4. Binding Resolution with Two Bots in Same Channel + +### Source File +- `src/routing/resolve-route.ts` + +### How It Works + +When a message arrives at a specific account, `resolveAgentRoute` is called with: +```typescript +const route = resolveAgentRoute({ + cfg: ctx.cfg, + channel: "slack", + accountId: account.accountId, // <-- per-account + teamId: ctx.teamId || undefined, + peer: { + kind: isDirectMessage ? "direct" : isRoom ? "channel" : "group", + id: isDirectMessage ? (message.user ?? "unknown") : message.channel, + }, +}); +``` + +The binding resolution uses a tiered priority system: +1. `binding.peer` -- exact channel/peer match +2. `binding.peer.parent` -- parent thread match +3. `binding.guild+roles` -- Discord-specific +4. `binding.guild` -- Discord-specific +5. `binding.team` -- team-level binding +6. `binding.account` -- account-level binding +7. 
`binding.channel` -- channel-level (wildcard account) binding +8. `default` -- uses `resolveDefaultAgentId(cfg)` + +Bindings are filtered by both `channel` AND `accountId`: + +```typescript +function getEvaluatedBindingsForChannelAccount( + cfg: OpenClawConfig, + channel: string, + accountId: string, +): EvaluatedBinding[] { +``` + +### Which Binding Wins? + +Each account resolves its **own** route independently. There is no conflict because: + +- Default-Bot (account `default`) + binding `{match: {channel: "slack", accountId: "default", peer: {kind: "channel", id: "C_CTO"}}, agentId: "cto"}` resolves to agent `cto` +- CoS-Bot (account `cos`) + binding `{match: {channel: "slack", accountId: "cos"}, agentId: "cos"}` resolves to agent `cos` + +The two accounts process events in parallel; each resolves its own agentId based on its own bindings. + +--- + +## 5. requireMention Scope + +### Source Files +- `extensions/slack/src/monitor/channel-config.ts` +- `extensions/slack/src/monitor/message-handler/prepare.ts` + +### Code Snippet + +```typescript +const shouldRequireMention = isRoom + ? (channelConfig?.requireMention ?? ctx.defaultRequireMention) + : false; +``` + +Where `channelConfig` is resolved per-channel: +```typescript +const channelConfig = isRoom + ? resolveSlackChannelConfig({ + channelId: message.channel, + channelName, + channels: ctx.channelsConfig, + channelKeys: ctx.channelsConfigKeys, + defaultRequireMention: ctx.defaultRequireMention, + allowNameMatching: ctx.allowNameMatching, + }) + : null; +``` + +And `ctx.channelsConfig` comes from the **per-account merged config**: + +```typescript +channelsConfig: slackCfg.channels, +``` + +Where `slackCfg = account.config` (merged from account-specific + global config). + +### Per-Account or Global? + +**Per-account.** Each account's `SlackMonitorContext` has its own `channelsConfig` and `defaultRequireMention` derived from its merged config. 
However, the channel config entries themselves (`channels.slack.channels.*`) are typically shared globally in the config file unless explicitly overridden in `channels.slack.accounts.<id>.channels.*`. + +In practice: +- If `channels.slack.channels.C_CTO.requireMention: true` is set globally, **both** Default-Bot and CoS-Bot accounts will see the same `requireMention: true` for channel `C_CTO` +- To give CoS-Bot different mention requirements, you would need `channels.slack.accounts.cos.channels.C_CTO.requireMention: false` (if supported by the merge logic in `resolveMergedAccountConfig`) + +The `mergeSlackAccountConfig` function in `accounts.ts` does merge account-level config over global: + +```typescript +return resolveMergedAccountConfig<SlackAccountConfig>({ + channelConfig: cfg.channels?.slack as SlackAccountConfig | undefined, + accounts: cfg.channels?.slack?.accounts as Record<string, Partial<SlackAccountConfig>> | undefined, + accountId, +}); +``` + +So **per-account channel config overrides are supported** via the `accounts.<id>.channels` namespace. + +--- + +## 6. Verdict + +**Can CoS-Bot and CTO (via Default-Bot) have a conversation in #cto?** + +### PARTIALLY -- with important caveats + +### Evidence For (YES, it can work): + +1. **Self-loop filter is per-account**: Each account only filters out messages from its own `botUserId`. CoS-Bot's messages won't be filtered by Default-Bot's self-loop check, and vice versa. + +2. **allowBots enables cross-bot reception**: Setting `allowBots: true` on the `#cto` channel config (or per-account) will allow each bot to receive messages from the other bot. + +3. **Bindings route correctly**: Account-scoped bindings ensure Default-Bot's events route to the CTO agent and CoS-Bot's events route to the CoS agent. + +4. **No cross-account dedup**: Unlike Feishu, Slack accounts have fully independent event streams. Both accounts will process events independently without interfering with each other. + +5. 
**Independent Bolt Apps**: Each account runs its own Slack Bolt App with its own WebSocket connection, so there's no event contention. + +### Evidence For Caveats (PARTIALLY): + +1. **Both bots see ALL messages in #cto**: When a human user posts in #cto, BOTH Default-Bot and CoS-Bot receive the event independently (from their own Slack event streams). This means both CTO agent AND CoS agent will process the message and potentially respond, unless properly gated. The inbound dedup at `inbound-dedupe.ts` includes `accountId` in its key, so the same physical message will NOT be deduped across accounts. + +2. **requireMention is the primary gating mechanism**: To prevent both agents from responding to every message, `requireMention` must be configured. But if CoS-Bot specifically @mentions Default-Bot's agent name, that mention is resolved by Default-Bot's context using its own `botUserId` and `mentionRegexes`. The cross-bot mention resolution works correctly because `explicitlyMentioned` checks `message.text?.includes(<@${ctx.botUserId}>)` using each account's own bot user ID. + +3. **Infinite loop risk**: If `allowBots: true` is set and `requireMention` is not configured, the CTO agent responding will trigger CoS agent to respond (because CoS-Bot sees the CTO agent's reply as a bot message in #cto), and vice versa. The code has NO built-in loop-breaker beyond requireMention and behavioral instructions. The docs explicitly warn: "if you allow replying to other bots (allowBots=true), use requireMention, users allowlist, and/or explicit guardrails in AGENTS.md and SOUL.md to prevent bot reply loops." + +4. **Thread behavior**: When CoS-Bot posts in #cto, and CTO agent replies (via Default-Bot), the reply goes to the channel or thread. 
If CTO agent responds in-thread, CoS-Bot will receive the thread reply event, and the thread participation check (`hasSlackThreadParticipation`) means CoS agent may get an **implicit mention** for subsequent thread messages, bypassing `requireMention`. + +5. **Session isolation**: Each agent's conversation is tracked in separate sessions (because `accountId` and `agentId` differ). This means neither agent sees the other's conversation history natively -- they only see the raw Slack messages. Cross-agent context must be inferred from the Slack message text. + +### Required Configuration for the Claimed Scenario: + +```yaml +channels: + slack: + accounts: + default: + botToken: xoxb-DEFAULT-BOT-TOKEN + appToken: xapp-DEFAULT-APP-TOKEN + channels: + C_CTO: + allowBots: true # Required: accept messages from CoS-Bot + requireMention: true # Recommended: prevent responding to everything + cos: + botToken: xoxb-COS-BOT-TOKEN + appToken: xapp-COS-APP-TOKEN + channels: + C_CTO: + allowBots: true # Required: accept messages from Default-Bot + requireMention: true # Recommended: prevent responding to everything + +bindings: + - match: { channel: slack, accountId: default, peer: { kind: channel, id: C_CTO } } + agentId: cto + - match: { channel: slack, accountId: cos } + agentId: cos +``` + +### Summary + +The original claim is **technically accurate** -- the code supports it -- but the claim omits critical operational details: + +| Claim | Verified? | Notes | +|-------|-----------|-------| +| "CoS-Bot posts in #cto, CTO agent receives it" | YES | Requires `allowBots: true` on Default-Bot's channel config | +| "CTO responds, CoS agent sees the response" | YES | Requires `allowBots: true` on CoS-Bot's channel config | +| "They can have a back-and-forth conversation" | PARTIALLY | Works but requires careful loop prevention; no built-in cross-bot conversation termination | + +--- + +## 7. What I Could NOT Find + +1. 
**Explicit cross-account dedup for Slack**: I confirmed Feishu has `tryRecordMessagePersistent(ctx.messageId, "broadcast")` for cross-account dedup. Slack has NO equivalent. I could not find any code that deduplicates the same physical human message across two Slack accounts. This means when a human posts in #cto, both CTO agent and CoS agent WILL process it independently. + +2. **Loop detection/circuit-breaker**: I found no automatic bot-to-bot loop detection beyond the self-loop filter. The protection is entirely behavioral (requireMention + AGENTS.md guardrails). + +3. **Thread participation cross-account behavior**: I found `hasSlackThreadParticipation` which tracks sent messages per `accountId + channel + threadTs`, but I could not find whether this tracking affects the OTHER account's implicit mention resolution. Each account has its own `sent-thread-cache`, so if Default-Bot posts in a thread, CoS-Bot's `hasSlackThreadParticipation` for that thread would be `false` (unless CoS-Bot has also posted there). This means **implicit mention via thread participation does NOT cross account boundaries**, which is good for loop prevention but means CoS agent won't automatically follow thread conversations started by CTO agent unless explicitly @mentioned. + +4. **`resolveMergedAccountConfig` deep merge behavior**: I read the function signature but did not retrieve its full implementation to confirm whether nested `channels.*` config in `accounts.<id>` truly deep-merges or shallow-replaces. The merge behavior affects whether per-account channel config overrides work as expected. + +5. **Rate limiting or throttling between bot accounts**: I found no evidence of cross-account rate limiting that would prevent rapid back-and-forth between two bots. diff --git a/CLAUDE.md b/CLAUDE.md new file mode 100644 index 0000000..ed74c07 --- /dev/null +++ b/CLAUDE.md @@ -0,0 +1,98 @@ +# OpenCrew — Project Context + +> This file is local development context for Claude Code. 
Do NOT commit to the repo. + +## Core Philosophy + +OpenCrew is a multi-agent operating system where **agents are the primary operators, humans are strategic decision-makers**. Everything flows from three principles: + +### 1. Minimize Human Intervention + +Users should be able to quickly identify the absolute minimum manual steps, then delegate everything else. + +**What MUST be human:** +- Creating platform apps/bots (Slack/Discord/Feishu) — requires browser auth, OAuth consent +- Granting privileged intents / permissions — platform-level human verification +- Providing credentials (tokens, secrets) — security boundary +- L3 decisions (irreversible: deploy, delete, trade) — Autonomy Ladder enforcement + +**What agents handle:** +- Reading DEPLOY.md and executing deployment steps +- Copying workspace files, creating symlinks, directory structure +- Fetching channel/group IDs via API +- Merging config snippets into openclaw.json (incremental, bounded) +- Restarting gateway and verifying connectivity +- All L1/L2 operations (reversible work, impactful-but-rollbackable changes) + +### 2. Agent-First Configuration + +Remaining setup is done by Tool Agents — but with strict boundaries: + +- **Incremental patches, not rewrites**: CONFIG_SNIPPET files define minimal additions to merge, never full config replacements. Agent must not touch auth/models/gateway sections. +- **Config enforces what docs can't**: A2A allowlist, maxPingPongTurns, subagent deny — all in openclaw.json, not just documentation. If a rule can be a config constraint, it must be. +- **Bounded tool use**: Agents modify specific config sections, not broad file rewrites. `rsync --ignore-existing` for workspace files, targeted JSON merges for config. + +### 3. Reliable Instruction Following Across Models + +Protocols are structured so different LLMs can follow them step-by-step: + +- **SOUL.md first**: Identity anchors all decisions. Read before workflow. 
+- **Fixed read order**: SOUL → AGENTS → USER → MEMORY → shared protocols +- **Explicit state machines**: AGENTS.md specifies exact flowcharts (receive input → classify QAPS → branch), not fuzzy guidelines +- **Fixed-format templates**: Closeout, Checkpoint, Subagent Packet — all have mandatory fields, not suggestions +- **Numerical signals**: Signal score 0-3 in closeouts removes subjective judgment from KO filtering +- **Config > docs**: Hard constraints in openclaw.json trump soft constraints in .md files + +## Document Audience Map + +| Audience | Documents | +|----------|-----------| +| **Human only** | README, CONCEPTS, ARCHITECTURE, FAQ, KNOWN_ISSUES, JOURNEY, CUSTOMIZATION | +| **Agent only** | SOUL.md, AGENTS.md, USER.md, MEMORY.md, shared/*.md, AGENT_ONBOARDING | +| **Both (bridge)** | DEPLOY.md (human: prerequisites + credentials; agent: execution steps), GETTING_STARTED (human guide referencing agent-executable setup), CONFIG_SNIPPET_*.md (agent reads during deploy, human reviews) | +| **Human → Agent handoff** | Platform setup docs (SLACK_SETUP, DISCORD_SETUP, FEISHU_SETUP): human does platform-side steps, then hands credentials to agent for OpenClaw-side config | + +## Architecture Quick Reference + +``` +Layer 1: Intent Alignment — YOU + CoS (strategic partner, not gateway) +Layer 2: Execution — CTO → Builder, CIO (swappable domain), Research (spawn-only) +Layer 3: Maintenance — KO (knowledge distillation), Ops (audit, governance) +``` + +- Channel = Role, Thread = Task +- Autonomy Ladder: L0 suggest → L1 reversible → L2 impactful → L3 irreversible +- Task types: Q (query) → A (artifact) → P (project) → S (system change) +- Knowledge pipeline: Raw chat → Closeout (25x compression) → KO extraction + +## Working With This Repo + +### What belongs in the repo +- Agent-facing protocols (shared/*.md, workspaces/*/SOUL.md etc.) 
+- Human-facing documentation (docs/*, README, DEPLOY) +- Platform setup guides and config snippets + +### What does NOT belong in the repo +- `.claude/` (local Claude Code config and agents) +- `.harness/` (local development harness artifacts) +- `CLAUDE.md` (this file — local project context) +- Research reports, QA reports, architecture drafts (local harness outputs) + +### When editing docs +- Platform setup instructions must be verified against official docs (Discord API, Feishu Open Platform, Slack API) +- Config keys must be verified against actual OpenClaw releases — never fabricate +- Update both Chinese and English versions +- Remember: setup docs are a human-agent bridge. Mark clearly which steps are human-manual vs agent-automated +- YAML/JSON examples in code blocks must use ASCII quotes, never typographic quotes + +### When proposing architecture changes +- Distinguish config-layer constraints (reliable, system-enforced) from doc-layer constraints (soft, agent-voluntary) +- Favor pushing rules into openclaw.json over adding them to .md files +- Changes to shared/*.md affect ALL agents — review impact across roles +- v2-lite direction: fewer files, more config constraints, simpler protocols + +## Active Context + +- **Issues**: #31 (Feishu multi-agent), #33 (doc ordering), #34 (Discord routing) — addressed in PR #36 +- **A2A Evolution**: Research completed on Slack/Feishu/Discord multi-bot capabilities. Architecture reports in local `.harness/reports/`. Key finding: Slack supports true multi-agent discussion NOW via multi-account + @mention; Feishu limited by platform (bot messages invisible to other bots); Discord blocked by OpenClaw #11199. +- **PR #36**: Documentation fixes only. Local harness/agent files stay local. 
diff --git a/docs/A2A_SETUP_GUIDE.md b/docs/A2A_SETUP_GUIDE.md index 46de8c9..7c5a3d0 100644 --- a/docs/A2A_SETUP_GUIDE.md +++ b/docs/A2A_SETUP_GUIDE.md @@ -1,266 +1,324 @@ -**中文** | [English](en/A2A_SETUP_GUIDE.md) +# Discussion Mode 实操指南 — 独立 Bot 引入 + 协作机制 -# A2A 跑通指南 — 给 Agent 读的设置文档 - -> **这个文档的读者是 Agent(推荐由 Ops 执行),不是人。** -> 目标:用最小必要变更,让你的 OpenCrew 的 A2A 闭环真正跑通。 -> 如果你是人类用户,可以直接把 [README 里的 prompt](#快速-prompt) 发给你的 Ops / CTO / 任一 Agent。 +> 供审计用。基于 2026-03-30 ~ 2026-04-02 实战测试。 +> PR: https://github.com/AlexAnys/opencrew/pull/38 --- -## 0. 前置条件 - -- OpenCrew 已部署(最小可用:CoS + CTO + Builder,各自能在 Slack 频道正常回复) -- 如果部署了 Ops,Ops 可以执行本指南;否则可由 CTO 或任一 Agent 执行 -- `openclaw.json` 中 `tools.agentToAgent.enabled: true` -- Agent 间的 Slack 频道绑定(bindings)已配好 - ---- +## 第一部分:如何引入独立 Bot -## 1. 关键认知:为什么 A2A 需要"设置" +### 背景 -OpenCrew 部署完成后,每个 Agent 都能独立回答问题。但 **跨 Agent 协作(A2A)** 需要额外条件: +OpenCrew 默认所有 Agent 共享一个 Slack App(一个 bot user)。这意味着 Bot A 发到 Bot B 频道的消息会被 self-reply filter 忽略(同一个 bot user)。 -| 问题 | 原因 | 解法 | -|------|------|------| -| CTO 在 #build 发消息,Builder 不响应 | 同一个 bot 的消息默认被忽略(防自循环) | 两步触发:Slack 锚点 + `sessions_send` | -| `sessions_send` 返回 timeout | OpenClaw 默认超时较短 | timeout ≠ 失败;需要 thread 兜底消息 | -| Builder 做完了但用户不知道 | 结果只在 A2A 内部流转 | 双通道:A2A reply + Slack thread 留痕 | -| 任务做完但没人汇报 | 没有闭环规则 | DoD 硬规则:回发起频道汇报 | - ---- +**Discussion Mode** 的前提是"选择性独立化":让少数高价值 Agent(如 Orchestrator)拥有独立 Slack App,然后拉进其他 Agent 的频道直接对话。 -## 2. 
配置层变更(openclaw.json) +### Step 1:创建独立 Slack App -以下是跑通 A2A 的**最小必要配置变更**。用 `config.patch` 或手动编辑均可。 - -### 2.1 agentToAgent.allow — 谁能发起 A2A +在 [api.slack.com/apps](https://api.slack.com/apps) → Create New App → **From manifest**: ```json { - "tools": { - "agentToAgent": { - "enabled": true, - "allow": ["cos", "cto", "ops", "ko", "builder"] + "display_information": { + "name": "Ali-Bot", + "description": "OpenClaw Discussion Mode agent" + }, + "features": { + "bot_user": { + "display_name": "Ali-Bot", + "always_online": true + }, + "app_home": { + "messages_tab_enabled": true, + "messages_tab_read_only_enabled": false + } + }, + "oauth_config": { + "scopes": { + "bot": [ + "chat:write", + "im:write", + "channels:history", + "groups:history", + "groups:read", + "im:history", + "im:read", + "mpim:history", + "mpim:read", + "channels:read", + "users:read", + "app_mentions:read", + "assistant:write", + "reactions:read", + "reactions:write", + "pins:read", + "pins:write", + "emoji:read", + "files:read", + "files:write" + ] + } + }, + "settings": { + "event_subscriptions": { + "bot_events": [ + "app_mention", + "message.channels", + "message.groups", + "message.im", + "message.mpim", + "reaction_added", + "reaction_removed", + "member_joined_channel", + "member_left_channel", + "channel_rename", + "pin_added", + "pin_removed" + ] + }, + "socket_mode_enabled": true, + "org_deploy_enabled": false, + "is_hosted": false, + "token_rotation_enabled": false } - } } ``` -> **为什么 builder 也在里面?** Builder 需要能用 `sessions_send` 回复 CTO 的多轮指导。如果你的 Builder 仅通过 Slack thread 回复(不走 A2A reply),可以不加 builder。 +创建后: +1. Basic Information → App-Level Tokens → Generate Token(scope: `connections:write`)→ 拿到 `xapp-...` +2. Install to Workspace → 拿到 `xoxb-...` +3. 
记录 Bot User ID(Settings → Basic Info 或在 Slack 中查看 bot profile) -### 2.2 maxPingPongTurns — 多轮迭代上限 +### Step 2:配置 OpenClaw 多账号 -```json -{ - "session": { - "agentToAgent": { - "maxPingPongTurns": 5 - } - } -} -``` +> ⚠️ **硬性要求:必须同时声明 `accounts.default`** +> +> `account-helpers.ts:listAccountIds()` 的逻辑:一旦 `accounts` 对象存在且有任何 key,OpenClaw **只启动显式声明的账号**,不再隐式创建 default。 +> +> 如果只添加 `accounts.ali-bot` 而不添加 `accounts.default`,主 bot 的 provider 不会启动,所有现有 Agent 的 Slack 连接将断开。 +> +> 这不是 bug,是设计如此。 -> 默认 4 轮。建议 5 轮(给 Round0 握手 + 3-4 轮实际工作留空间)。 +**正确配置**(增量修改 `openclaw.json`): -### 2.3 subagents 防无限扇出(通常已默认配好) - -```json +```jsonc { - "tools": { - "subagents": { - "tools": { - "deny": ["group:sessions"] + "channels": { + "slack": { + // 顶层 token 保留为 fallback + "botToken": "xoxb-main-...", + "appToken": "xapp-main-...", + + // ★ 关键:显式声明 accounts.default + "accounts": { + "default": { + "botToken": "xoxb-main-...", // 与顶层相同 + "appToken": "xapp-main-..." // 与顶层相同 + }, + "ali-bot": { + "botToken": "xoxb-ali-...", + "appToken": "xapp-ali-...", + "channels": { + "<TARGET_CHANNEL_ID>": { + "allow": true, + "requireMention": true, // 只响应显式 @mention + "allowBots": true // 能看到其他 bot 的消息 + } + } + } + }, + + // 目标频道开启 allowBots(让原有 Agent 看到 Ali-Bot 的消息) + "channels": { + "<TARGET_CHANNEL_ID>": { + "allow": true, + "requireMention": true, // 建议改为 true(见第二部分) + "allowBots": true + } + } + } + }, + + // Ali-Bot 的路由绑定 + "bindings": [ + // ★ 放在目标频道的 peer binding 之前(更具体的匹配优先) + { + "agentId": "main", + "match": { + "channel": "slack", + "accountId": "ali-bot", + "peer": { "kind": "channel", "id": "<TARGET_CHANNEL_ID>" } + } + }, + // 现有 binding 不变 + { + "agentId": "ops", + "match": { + "channel": "slack", + "peer": { "kind": "channel", "id": "<TARGET_CHANNEL_ID>" } } } - } + ] } ``` ---- - -## 3. 
Workspace 文件变更(最小必要) - -以下是需要在各 Agent 的 `AGENTS.md` 中**追加**的内容。不需要重写整个文件,只需在合适位置加入对应 section。 - -### 3.1 CTO 的 AGENTS.md — 追加 A2A 派单 section +### Step 3:邀请 Bot 并验证 -在 CTO 的 `AGENTS.md` 中,找到任务处理流程之后,追加: +1. 在目标频道执行 `/invite @Ali-Bot` +2. 写入 config 后**等待热重载**(不要立即 SIGTERM,热重载会自动检测变更) +3. 检查 gateway 日志确认两个 provider 都启动: -```markdown -## A2A 派单(主流程:跨频道 thread) - -当进入实施阶段: -- 在 **#build**(或 #research)创建任务 root message(锚点): - `A2A CTO→Builder | <TITLE> | TID:<...>` -- 正文给完整任务包(建议 `shared/SUBAGENT_PACKET_TEMPLATE.md`)。 -- ⚠️ 不要依赖 Slack 的"看到消息就自动触发"(bot-authored inbound 默认会被忽略,避免自循环)。 -- 必须用 **sessions_send** 把任务真正触发到目标 thread sessionKey: - `agent:builder:slack:channel:<#build_id>:thread:<root_ts>` - -执行期间(CTO 负责到底): -- **#build thread 留痕**:每轮 ping-pong 中,用 `message(send, channel=slack, target=<#build_id>, threadId=<root_ts>)` 把你这轮的指令/反馈贴到 #build thread,格式 `[CTO] 内容...`。Builder 也会在 thread 里贴它的进展。 -- **#cto checkpoint**:每次收到 Builder 的 checkpoint/结果后,在 #cto 的对应协调 thread 同步一条 checkpoint(让用户不用去 #build 捞信息)。 - -sessions_send timeout 容错: -- `sessions_send` 返回 timeout **≠ 没送达**。 -- 规避:在 thread 里补发一条兜底消息("已通过 A2A 发送,如未收到可在此查看全文")。 - -完成后(DoD 硬规则,缺一不可): -1. 在 Builder thread 贴 closeout(产物路径 + 验证命令)。 -2. **CTO 本机复核**(CLI-first):至少执行关键命令 + 贴 exit code,确认产出可用。 -3. **回 #cto 汇报**:在 #cto 发起 thread 同步最终结果 + 如何验证 + 风险遗留。**这是闭环关键,不做视为任务未完成**。 +``` +[slack] [default] starting provider ✅ +[slack] [ali-bot] starting provider ✅ +channels resolved: ...(无 missing_scope) ✅ +socket mode connected ✅(出现两次) ``` -### 3.2 Builder 的 AGENTS.md — 追加 A2A 协作 section - -在 Builder 的 `AGENTS.md` 中,执行流程之后追加: - -```markdown -## A2A 协作(收到 sessions_send 任务时) - -当通过 `sessions_send` 收到来自 CTO 的 A2A 任务: - -1. **识别 thread_id**:A2A 消息中会包含 `#build thread_id`(message_id) -2. **多轮 WAIT 纪律**: - - 每轮只聚焦 1-2 个改动点,完成后**必须 WAIT** - - 输出格式固定:`Done: ... / Run: ... / Output: ... / WAIT` - - **禁止一次性做完所有步骤**——等 CTO 下一轮指令后再继续 -3. 
**贴进展到 thread**:每轮回复前,先用 `message(send, channel=slack, target=<#build_id>, threadId=<thread_id>)` 把本轮进展/结果贴到 #build thread,格式 `[Builder] 内容...` -4. **返回 A2A reply**:正常返回结果给 CTO(sessions_send 的 ping-pong 机制) -5. **最终轮**:贴 closeout 到 thread(产物路径 + 验证命令),A2A reply 中回复 `REPLY_SKIP` 表示完成 +### 回滚 -> ⚠️ thread 留痕是为了用户能在 #build 直接看到完整过程。A2A reply 是给 CTO 的结构化回复。两者都要做。 +```bash +cp ~/.openclaw/openclaw.json.bak-before-xxx ~/.openclaw/openclaw.json +# 等待热重载,或: +launchctl kill SIGTERM gui/501/ai.openclaw.gateway ``` -### 3.3 CoS 的 AGENTS.md — 追加 A2A 指派 section +### 已知陷阱 -CoS 是用户的主入口,向 CTO/CIO 指派任务是常见路径: +| 陷阱 | 后果 | 防范 | +|------|------|------| +| 只加 `accounts.ali-bot` 不加 `accounts.default` | 主 bot 断连,所有 Agent 失联 | 必须同时声明 default | +| config 写入后立即 SIGTERM | 热重载的 in-memory 修复被杀掉 | 等热重载完成再验证 | +| Ali-Bot Slack App 缺 scope | `channels resolved` 报 `missing_scope` | 用完整 manifest 创建 | +| Binding 顺序错误 | ali-bot 的消息路由到错误的 agent | accountId+peer binding 放在 peer-only binding 之前 | -```markdown -## A2A 指派(CoS → CTO / CIO) +--- -当需要让 CTO 处理技术任务: -- 在 **#cto** 创建任务 root message: - `A2A CoS→CTO | <TITLE> | TID:<...>` -- 正文给完整任务包(建议 `shared/SUBAGENT_PACKET_TEMPLATE.md`)。 -- 用 `sessions_send` 触发 CTO: - `agent:cto:slack:channel:<#cto_id>:thread:<root_ts>` -- 等待 CTO 在 #cto 的结果汇报,收到后同步到 #hq 向用户汇报。 +## 第二部分:引入后的协作机制 -当需要让 CIO 处理领域任务(如投资分析): -- 同理在 **#invest** 创建 root message 并用 `sessions_send` 触发。 +### 核心挑战 -sessions_send timeout 容错:同 CTO(timeout ≠ 失败,需 thread 兜底)。 -``` +两个 bot 在同一 Slack thread 中,会遇到三个问题: +1. **双响应**:人类发一条消息,两个 bot 都回复 +2. **循环**:Bot A 回复 → Bot B 被触发也回复 → Bot A 又被触发 → ∞ +3. **无路由**:没有机制决定"谁该回复、谁该沉默" ---- +### 为什么 Config 不够(源码验证) -## 4. 
验证步骤(跑通检查清单) +| Config 选项 | 预期 | 实际 | +|---|---|---| +| `requireMention: true` | Thread 内只响应显式 @mention | ❌ 一旦 bot 在 thread 中回复过,`implicitMention` 永远为 true,绕过 requireMention | +| `allowBots: "mentions"` | 只处理显式 @mention 自己的 bot 消息 | ❌ Slack provider 只做 truthy/falsy 检查,`"mentions"` 等同于 `true`(仅 Discord 有效) | -### 4.1 配置验证 +**源码证据**(`resolveMentionGating`): -```bash -# 检查 agentToAgent 是否启用 -# 在 openclaw.json 中确认: -# tools.agentToAgent.enabled = true -# tools.agentToAgent.allow 包含 cos, cto, ops -# session.agentToAgent.maxPingPongTurns >= 5 +```js +implicitMention = !isDirectMessage && botUserId && message.thread_ts && + (message.parent_user_id === botUserId || hasSlackThreadParticipation(...)) +// → 一旦 bot 参与过 thread,implicitMention 永远 true +// → requireMention: true 被永久绕过 ``` -### 4.2 CTO → Builder 闭环验证 +### 解决方案:两层防线 -在 `#cto` 频道告诉 CTO: +#### 第 1 层:Config — `requireMention: true`(Channel 级生效) -``` -请执行一次 A2A 闭环测试: -1. 在 #build 创建一个测试任务(让 Builder 执行 `pwd && ls -la | head` 并回报) -2. 用 sessions_send 触发 Builder -3. 确认 Builder 在 Slack thread 里回复了(Round0 握手) -4. 完成后回 #cto 汇报结果 +```jsonc +"<CHANNEL_ID>": { + "allow": true, + "requireMention": true, // 对 channel 根消息有效 + "allowBots": true +} ``` -**验收标准**: -- ✅ #build 出现了 root message(A2A CTO→Builder | ...) -- ✅ Builder 在该 thread 里回复了(`[Builder] Done: ... / WAIT`) -- ✅ CTO 回到 #cto 汇报了结果 +**效果**:Channel 根消息必须显式 @ 才触发 → 只有被 @ 的 bot 进入 thread。 +**局限**:Thread 内无效(implicitMention 绕过)。 -### 4.3 CoS → CTO 闭环验证 +#### 第 2 层:Prompt 规则 — 显式 @mention 协议(Thread 级生效) -在 `#hq` 频道告诉 CoS: +在每个参与 Discussion Mode 的 Agent 的 workspace 文件中加入: -``` -请执行一次 A2A 闭环测试: -1. 在 #cto 创建一个任务 root message(让 CTO 检查当前 workspace 目录结构并回报) -2. 用 sessions_send 触发 CTO -3. 确认 CTO 在 Slack thread 里回复了 -4. CTO 完成后,回 #hq 向我汇报结果 -``` - -**验收标准**: -- ✅ #cto 出现了 root message(A2A CoS→CTO | ...) -- ✅ CTO 在该 thread 里回复了 -- ✅ CoS 回到 #hq 汇报了结果 +```markdown +## Multi-Agent Thread 协作规则 ---- +在 Slack thread 中如果有其他 bot 也在参与: -## 5. 已验证的关键模式(从实战中沉淀) +1. 
**显式 @mention 检查**:检查消息文本是否包含 `<@你的BotID>`。 + 如果没有 → 整条回复只输出 `NO_REPLY`,不解释、不叙述、不加任何其他文字。 -这些模式经过 CTO↔Builder 和 CTO↔Ops 的真实 A2A 闭环验证: +2. **发送消息时必须 @mention 目标**:`<@目标BotID>` 显式 mention 目标 bot。 + 不 @ 任何 bot = 对话终止信号。 -### 5.1 Round0 审计握手 +3. **角色分工**: + - Coordinator(发起方):选择 @Worker / @Human / 不@(结束) + - Worker(执行方):每次回复必须 @ Coordinator -在正式任务前,先做一个**极小的真实动作**验证审计链路: -- 要求目标 Agent 执行 `pwd` 并把结果贴到 Slack thread -- **看不到 Round0 回传就停止**——说明目标 Agent 的 session 可能没绑定 Slack +4. **终止**:说"完毕/done/结论"后不再发送,除非被重新 @。 -### 5.2 多轮 WAIT 纪律 +5. **轮次上限**:同一 thread 内最多 8 轮回复,超过后暂停并向人类汇报。 -每轮只做 1-2 个改动点,输出后 WAIT: -``` -[Builder] Round 1/3 -Done: 创建了 xxx 文件 -Run: node xxx.js -Output: exit 0 -WAIT: 等待 CTO 指令 +Bot ID 速查: +- Ali-Bot: U0AP8JFFD7Z +- Default Bot (Ops/CTO/Builder 等): U0AD60Q0EKU ``` -**禁止一次性做完所有步骤**——这会跳过指导链路,失去多轮迭代的审计和纠错价值。 +### 协作流程(文件 + 频道) -### 5.3 sessions_send timeout 容错 +借鉴 Anthropic Harness Design 的**双角色架构**: -`sessions_send` 返回 timeout **不等于失败**。消息可能已送达。 -规避:在 Slack thread 里补发一条兜底消息。 - -### 5.4 闭环 DoD(Definition of Done) +``` +Alex → @Orchestrator: "调查 XXX" -任务完成的标准不是"Builder 做完了",而是: -1. Builder thread closeout ✅ -2. CTO 本机复核(CLI-first)✅ -3. **回 #cto 汇报**(这一步是闭环关键)✅ -4. 通知 KO 沉淀(可选)✅ +Orchestrator 输出 DISCUSSION SPEC(Phase 0): + → 📁 discussions/<topic>/spec.md(目标 + 验收标准 + 终止条件) + → Thread 消息:「展开了 spec,N 条验收标准。@Worker 请先...」 -### 5.5 SessionKey 注意事项 +Round 1: + Orchestrator → @Worker: 具体问题 + Worker → @Orchestrator: 回复 + 📁 round-1.md(详细分析) + Thread 消息只放摘要 -- 不要手打 sessionKey,从 `sessions_list` 复制 -- 注意 channel ID 大小写一致性(大小写不一致可能导致 session 路由到 webchat) +Round 2: + Orchestrator 评估 → 📁 review-1.md → @Worker 反馈 + ... ---- +终止(三选一): + ✅ 达成共识 → DISCUSSION_CLOSE + ⚠️ 达到最大轮次 → 请人类介入 + 🔄 连续 2 轮无进展 → 请人类介入 +``` -## 6. 常见问题 +**关键原则**: +1. **文件是主通信通道**——Slack thread 只放摘要和 @mention 路由,详细分析/方案写文件 +2. **先 spec 再讨论**——Orchestrator 第一条消息必须定义验收标准和终止条件 +3. **自评失效,必须分离**——Orchestrator 不生成方案,只协调和评估 +4. 
**正式 A2A 任务走 `sessions_send`**——有硬性 `maxPingPongTurns` (0-5),Slack thread 仅做留痕 -**Q:配置改完需要重启吗?** -A:用 `config.patch` 会自动重启。手动编辑需要 `openclaw gateway restart`。⚠️ **注意**:gateway 重启会中断所有 Agent 的当前会话。首次 A2A 设置时这是预期行为(一次性),重启后 Agent 会自动恢复。如果 Agent 在设置过程中"突然停止工作",大概率是重启导致的——等恢复后重新发起验证即可。 +### 终止机制 -**Q:Builder 在 thread 里没回复怎么办?** -A:检查 `bindings` 是否正确、频道是否 `allow: true`、Builder 的 session 是否绑定了 Slack(而非 webchat)。 +| 层级 | 机制 | 类型 | +|------|------|------| +| Prompt | @mention 协议 + Round N/M 计数 + DISCUSSION_CLOSE | 软约束(指令遵从) | +| Config | `requireMention: true`(Channel 根消息) | 硬约束(系统级) | +| Config | `loopDetection.pingPong: true` | 硬约束(tool-call 级别) | +| A2A | `maxPingPongTurns` (sessions_send) | 硬约束(系统级) | -**Q:可以让 Builder 直接给 CTO 发 A2A 吗?** -A:可以,但组织纪律上 Builder 是"只接单执行"。如果 Builder 需要澄清,建议在 Slack thread 里直接提问(CTO 能看到)。 +### 已知局限 -**Q:CIO 也需要 A2A 吗?** -A:取决于你的使用场景。CIO 通常独立运作,只在需要 Research 时用 spawn。如果需要 CoS→CIO 派单,按同样模式配即可。 +1. **Input token 仍消耗**——消息被送达所有 bot,只是 agent 选择 NO_REPLY;无法从 config 阻止送达 +2. **Prompt 是软约束**——LLM 可能偶尔违反(特别在复杂上下文中) +3. **`allowBots: "mentions"` 仅 Discord 可用**——Slack 需要 OpenClaw 代码改动 +4. 
**`requireMention: true` 在 thread 内被绕过**——需要 OpenClaw 增加 `thread.requireExplicitMention` 才能系统级解决 --- -> 📖 相关文档 → [A2A 协议](../shared/A2A_PROTOCOL.md) · [核心概念](CONCEPTS.md) · [Agent 入职指南](AGENT_ONBOARDING.md) · [自定义指南](CUSTOMIZATION.md) +## 平台能力对比 + +| 能力 | Slack | Discord | Feishu | +|------|-------|---------|--------| +| Delegation (sessions_send) | ✅ | ✅ | ✅ | +| Discussion (跨 bot 对话) | ✅ 已验证 | ❌ (OpenClaw bug) | ❌ (平台限制) | +| `allowBots: "mentions"` | ❌ (不支持) | ✅ | N/A | +| `requireMention` 在 thread | ❌ (implicitMention 绕过) | ❌ (同理) | N/A | +| Multi-Account | ✅ | ✅ | ✅ | +| 硬性轮次控制 | 仅 sessions_send | 仅 sessions_send | 仅 sessions_send | diff --git a/docs/CONCEPTS.md b/docs/CONCEPTS.md index b474930..7b42211 100644 --- a/docs/CONCEPTS.md +++ b/docs/CONCEPTS.md @@ -123,29 +123,31 @@ CIO → 独立运作(必要时与 CoS 同步) 上面描述的两步触发是 **Delegation(委派)** 模式——一个 Agent 通过 `sessions_send` 把结构化任务交给另一个 Agent。这是 A2A 的基础,所有平台都支持,流程清晰、单向可控。 -v2 引入了第二种模式:**Discussion(讨论)**。少数高价值 Agent(如 CoS)拥有独立 Slack App,直接进入其他 Agent 的频道进行实时讨论。 [待 POC 验证] +v2 引入了第二种模式:**Discussion(讨论)**。少数高价值 Agent(如 Orchestrator)拥有独立 Slack App,直接进入其他 Agent 的频道进行实时讨论。 -**核心思路:选择性独立化。** 不需要每个 Agent 都有独立 App——执行层(CTO、Builder、CIO 等)继续共享一个 Slack App。只让 CoS(代表用户推进方向)等需要跨域协作的 Agent 拥有独立 App,然后把它拉进目标频道就能直接对话。 +**核心思路:选择性独立化。** 不需要每个 Agent 都有独立 App——执行层(CTO、Builder、CIO 等)继续共享一个 Slack App。只让需要跨域协作的 Agent(如 Orchestrator)拥有独立 App,把它拉进目标频道就能直接对话。 + +这里借鉴了 **Anthropic Harness Design** 的核心洞察:当一个 AI 既做执行又做 QA 时,它倾向于宽容自己的错误;既做规划又做执行时,倾向于投机取巧。将"想"和"做"分给不同 Agent——Orchestrator 负责规划和评估,执行层 Agent 负责实现——是解决 AI 自评失效最有效的杠杆。 **什么时候用哪个?** -| | Delegation(委派) | Discussion(讨论)[待 POC 验证] | -|--|-------------------|-------------------------------| -| 场景 | "CTO 给 Builder 派一个具体任务" | "CoS 进 #cto 跟 CTO 讨论方案,然后去 #build 跟 Builder 确认可行性" | -| 触发方式 | `sessions_send` | @mention / 直接发消息 | +| | Delegation(委派) | Discussion(讨论) | +|--|-------------------|-------------------| +| 场景 | "CTO 给 Builder 派一个具体任务" | "Orchestrator 进 #cto 跟 CTO 讨论方案,然后去 #build 跟 Builder 确认可行性" 
| +| 触发方式 | `sessions_send` | 显式 @mention | | 方向性 | 单向,一对一 | 多向,多对多 | | 平台 | Slack / Discord / Feishu | 仅 Slack(需独立 App) | 两种模式共存,不互相替代。Delegation 是"给任务",Discussion 是"一起想"。 -**为什么需要 Discussion?** Delegation 是任务分发:CTO 说做什么,Builder 照做。但真实团队不只是派活——他们讨论。CoS 走进 CTO 的办公室说"这个方向你怎么看?",CTO 说"技术上可以但成本高",CoS 又去问 Builder"你觉得多久能做完?"。Discussion 模式让 Agent 也能这样协作——CoS-Bot 被拉进 #cto 频道,直接在 CTO 的地盘上对话,你可以实时旁观、随时插话。 +**为什么需要 Discussion?** Delegation 是任务分发:CTO 说做什么,Builder 照做。但真实团队不只是派活——他们讨论。Orchestrator 走进 CTO 的办公室说"这个方向你怎么看?",CTO 说"技术上可以但成本高",Orchestrator 又去问 Builder"你觉得多久能做完?"。Discussion 模式让 Agent 也能这样协作——Orchestrator-Bot 被拉进 #cto 频道,直接在 CTO 的地盘上对话,你可以实时旁观、随时插话。 ### 平台能力对比 | 能力 | Slack | Discord | Feishu | |------|-------|---------|--------| | Delegation(sessions_send) | ✅ | ✅ | ✅ | -| Discussion(跨 bot 对话)| 待 POC 验证 | 不支持(OpenClaw 代码层 bug) | 不支持(飞书平台限制) | +| Discussion(跨 bot 对话)| ✅ 已验证 | 不支持(OpenClaw 代码层 bug) | 不支持(飞书平台限制) | | Thread / Topic 隔离 | 原生 thread | Thread(自动归档) | groupSessionScope(>= 2026.3.1) | **为什么 Discord 和 Feishu 不支持 Discussion?** @@ -155,14 +157,20 @@ v2 引入了第二种模式:**Discussion(讨论)**。少数高价值 Agent ### Discussion 模式的当前状态 -坦率地说:Discussion 模式还没有被端到端验证过。 +Discussion 模式已通过端到端验证(2026-04-02)。两个独立 bot 可以在同一频道互相看到消息并进行结构化讨论。 + +通过 OpenClaw 源码验证 + 实测确认: +- Self-loop 过滤是 per-account 的(不同 Slack App 不互相过滤)✅ +- `allowBots` 支持三级 fallback(per-channel > per-account > global)✅ +- Per-account channel config 可以给同一频道的不同 bot 设置不同的 `requireMention` ✅ +- 多账号配置需要**同时声明 `accounts.default`**(否则主 bot 断连)⚠️ -通过 OpenClaw 源码验证(`extensions/slack/src/monitor/message-handler/prepare.ts`),以下机制已确认: -- Self-loop 过滤是 per-account 的(不同 Slack App 不互相过滤) -- `allowBots` 支持三级 fallback(per-channel > per-account > global) -- Per-account channel config 可以给同一频道的不同 bot 设置不同的 `requireMention` +**实战发现的限制**: +- `requireMention: true` 在 Thread 内被 `implicitMention` 绕过——需要 Prompt 规则配合 +- `allowBots: "mentions"` 在 Slack 无效(仅 Discord 支持) +- Input token 无法避免——所有消息送达所有 bot,只能让 agent 回复 NO_REPLY -但完整链路——CoS-Bot 在 #cto 发消息,CTO 
收到并回复,CoS 看到回复并继续对话——还需要实际 POC 验证。详见 `shared/A2A_PROTOCOL.md` 附录 B。 +详细配置指南和陷阱清单见 [Discussion Mode 实操指南](A2A_SETUP_GUIDE.md)。完整协议见 `shared/A2A_PROTOCOL.md`。 --- @@ -317,7 +325,7 @@ Layer 2: 抽象知识 │ ├─ Agent 之间通过 A2A 协作 │ ├─ Delegation:两步触发 + 权限矩阵 - │ └─ Discussion:@mention 多 Agent 讨论 [待 POC 验证] + │ └─ Discussion:@mention 多 Agent 讨论 [已验证] │ └─ 知识通过三层沉淀积累 └─ 对话 → Closeout → KO 抽象知识 diff --git a/docs/en/CONCEPTS.md b/docs/en/CONCEPTS.md index b50005a..372646e 100644 --- a/docs/en/CONCEPTS.md +++ b/docs/en/CONCEPTS.md @@ -123,29 +123,31 @@ CIO -> Operates independently (syncs with CoS when needed) The two-step trigger described above is the **Delegation** mode -- one Agent hands a structured task to another via `sessions_send`. This is the foundation of A2A, supported on all platforms, with a clear one-directional flow. -v2 introduces a second mode: **Discussion**. A small number of high-value Agents (e.g., CoS) get their own independent Slack App, then join other Agents' channels to collaborate in real-time. [Pending POC Verification] +v2 introduces a second mode: **Discussion**. A small number of high-value Agents (e.g., Orchestrator) get their own independent Slack App, then join other Agents' channels to collaborate in real-time. -**Core idea: selective independence.** Not every Agent needs its own App -- execution-layer Agents (CTO, Builder, CIO, etc.) keep sharing one Slack App. Only Agents that need cross-domain collaboration (like CoS, who represents the user and drives strategy) get their own App, then get invited into target channels for direct conversation. +**Core idea: selective independence.** Not every Agent needs its own App -- execution-layer Agents (CTO, Builder, CIO, etc.) keep sharing one Slack App. Only Agents that need cross-domain collaboration (like the Orchestrator) get their own App, then get invited into target channels for direct conversation. 
+ +This borrows a core insight from **Anthropic's Harness Design**: when an AI both executes and does QA, it tends to go easy on its own mistakes; when it both plans and executes, it tends to cut corners. Separating "thinking" and "doing" into different Agents -- Orchestrator handles planning and evaluation, execution-layer Agents handle implementation -- is the most effective lever for solving AI self-evaluation failure. **When to use which?** -| | Delegation | Discussion [Pending POC Verification] | -|--|------------|---------------------------------------| -| Scenario | "CTO assigns a specific task to Builder" | "CoS walks into #cto to discuss the approach with CTO, then checks with Builder in #build" | -| Trigger | `sessions_send` | @mention / direct message | +| | Delegation | Discussion | +|--|------------|------------| +| Scenario | "CTO assigns a specific task to Builder" | "Orchestrator walks into #cto to discuss the approach with CTO, then checks feasibility with Builder in #build" | +| Trigger | `sessions_send` | Explicit @mention | | Directionality | One-way, one-to-one | Multi-directional, many-to-many | | Platform | Slack / Discord / Feishu | Slack only (requires independent App) | The two modes coexist -- they do not replace each other. Delegation is for "assign the work." Discussion is for "think it through together." -**Why Discussion?** Delegation is task distribution: CTO says what to do, Builder executes. But real teams don't just assign tasks -- they discuss. CoS walks into CTO's office and asks "what do you think about this direction?", CTO says "technically feasible but expensive", CoS then asks Builder "how long would this take?". Discussion mode lets Agents work the same way -- CoS-Bot gets invited into #cto and talks directly on CTO's home turf. You can watch in real-time and intervene at any point. +**Why Discussion?** Delegation is task distribution: CTO says what to do, Builder executes. 
But real teams don't just assign tasks -- they discuss. The Orchestrator walks into CTO's office and asks "what do you think about this direction?", CTO says "technically feasible but expensive", then the Orchestrator asks Builder "how long would this take?". Discussion mode lets Agents work the same way -- the Orchestrator-Bot gets invited into #cto and talks directly on CTO's home turf. You can watch in real-time and intervene at any point. ### Platform Capability Comparison | Capability | Slack | Discord | Feishu | |------------|-------|---------|--------| | Delegation (sessions_send) | YES | YES | YES | -| Discussion (cross-bot) | Pending POC Verification | Not supported (OpenClaw code-level bug) | Not supported (platform limitation) | +| Discussion (cross-bot) | ✅ Verified | Not supported (OpenClaw code-level bug) | Not supported (platform limitation) | | Thread / Topic isolation | Native thread | Thread (auto-archive) | groupSessionScope (>= 2026.3.1) | **Why can't Discord and Feishu support Discussion?** @@ -155,14 +157,20 @@ The two modes coexist -- they do not replace each other. Delegation is for "assi ### Current Status of Discussion Mode -To be candid: Discussion mode has not been verified end-to-end. +Discussion mode has been verified end-to-end (2026-04-02). Two independent bots can see each other's messages in the same channel and conduct structured discussions. 
+ +Through OpenClaw source code verification + live testing: +- Self-loop filtering is per-account (different Slack Apps don't filter each other) ✅ +- `allowBots` supports three-tier fallback (per-channel > per-account > global) ✅ +- Per-account channel config can give different bots different `requireMention` settings ✅ +- Multi-account config requires **explicit `accounts.default` declaration** (otherwise main bot disconnects) ⚠️ -Through OpenClaw source code verification (`extensions/slack/src/monitor/message-handler/prepare.ts`), the following mechanisms are confirmed: -- Self-loop filtering is per-account (different Slack Apps don't filter each other) -- `allowBots` supports three-tier fallback (per-channel > per-account > global) -- Per-account channel config can give different bots different `requireMention` settings on the same channel +**Limitations discovered in practice**: +- `requireMention: true` is bypassed in threads by `implicitMention` -- requires prompt rules to compensate +- `allowBots: "mentions"` does not work on Slack (Discord only) +- Input token cost is unavoidable -- all messages reach all bots; agents can only reply NO_REPLY -But the complete chain -- CoS-Bot posts in #cto, CTO receives and replies, CoS sees the reply and continues the conversation -- still needs a real POC test. See `shared/A2A_PROTOCOL.md` Appendix B for verification steps. +For detailed setup guide and pitfall checklist, see [Discussion Mode Setup Guide](A2A_SETUP_GUIDE.md). Full protocol in `shared/A2A_PROTOCOL.md`. 
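Reviewer note on the limitation listed above (`requireMention: true` bypassed in threads by `implicitMention`): the gating behaviour can be modelled with a small sketch. This is a simplified, hypothetical model written for illustration, not OpenClaw's actual `resolveMentionGating` code. The `GateInput` type, the `shouldRespond` function, and the `requireExplicitMentionInThread` flag are all assumptions; the flag stands in for the `thread.requireExplicitMention` option that the protocol doc says would need to be added and that does not exist today.

```typescript
// Hypothetical, simplified model of Slack mention gating.
// All names here are illustrative assumptions, not OpenClaw APIs.
interface GateInput {
  requireMention: boolean;          // per-channel config value
  explicitMention: boolean;         // message text contains <@botUserId>
  isThreadReply: boolean;           // message carries a thread_ts
  botParticipatedInThread: boolean; // bot authored the parent or already replied
  // ASSUMPTION: models the proposed thread-level option; not a real setting.
  requireExplicitMentionInThread?: boolean;
}

function shouldRespond(input: GateInput): boolean {
  // Without requireMention, every allowed message triggers the bot.
  if (!input.requireMention) return true;

  // An explicit @mention always satisfies the gate.
  if (input.explicitMention) return true;

  // With the proposed strict option, thread participation no longer counts.
  if (input.requireExplicitMentionInThread) return false;

  // Behaviour described in the docs: once the bot has taken part in a
  // thread, later replies in that thread count as an implicit mention.
  return input.isThreadReply && input.botParticipatedInThread;
}

// Channel root message without a mention: the gate holds.
const rootMessage = shouldRespond({
  requireMention: true,
  explicitMention: false,
  isThreadReply: false,
  botParticipatedInThread: false,
});

// Thread reply after the bot has participated: the gate is bypassed.
const threadReply = shouldRespond({
  requireMention: true,
  explicitMention: false,
  isThreadReply: true,
  botParticipatedInThread: true,
});

// Same thread reply with the proposed strict option: the gate holds again.
const strictThreadReply = shouldRespond({
  requireMention: true,
  explicitMention: false,
  isThreadReply: true,
  botParticipatedInThread: true,
  requireExplicitMentionInThread: true,
});

console.log({ rootMessage, threadReply, strictThreadReply });
```

Under this model, the prompt-level explicit @mention rule is what keeps a participating bot silent inside a thread until a system-level option of this kind exists.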
--- @@ -317,7 +325,7 @@ You (decision-maker) | |-- Agents collaborate via A2A | |-- Delegation: two-step trigger + permission matrix - | +-- Discussion: @mention multi-agent deliberation [Pending POC Verification] + | +-- Discussion: @mention multi-agent deliberation [Verified] | +-- Knowledge accumulates through three-layer distillation +-- Conversation -> Closeout -> KO abstract knowledge diff --git a/shared/A2A_PROTOCOL.md b/shared/A2A_PROTOCOL.md index f3c4ebd..95510e1 100644 --- a/shared/A2A_PROTOCOL.md +++ b/shared/A2A_PROTOCOL.md @@ -6,7 +6,9 @@ > - 不串上下文(thread 级隔离 + 任务包完整) > > v2 覆盖平台:Slack / Feishu / Discord -> v2 协作模式:**Delegation**(全平台)+ **Discussion**(Slack 多 Bot)[待 POC 验证] +> v2 协作模式:**Delegation**(全平台)+ **Discussion**(Slack 多 Bot)[已验证] +> +> 配置指南:[Discussion Mode 实操指南](../docs/A2A_SETUP_GUIDE.md) --- @@ -94,64 +96,115 @@ A 读取 root message 的 message id(ts),拼出 thread sessionKey: --- -## 2b) Discussion Mode(讨论模式 — Slack 多 Bot)[待 POC 验证] +## 2b) Discussion Mode(讨论模式 — Slack 多 Bot)[已验证] > Discussion 是 Delegation 的增强,不是替代。适用于需要多方实时讨论的场景。 > 仅 Slack 平台支持。原因见 §7。 +> 完整配置指南见 [A2A_SETUP_GUIDE.md](../docs/A2A_SETUP_GUIDE.md)。 ### 核心思路:选择性独立化 -不需要每个 Agent 都有独立 Slack App。只需让**少数高价值的横向 Agent**(如 CoS、QA)拥有独立 App,然后**把它们拉进现有 Agent 的频道**进行协作: +不需要每个 Agent 都有独立 Slack App。只需让**少数高价值的横向 Agent**(如 Orchestrator)拥有独立 App,然后**把它们拉进现有 Agent 的频道**进行协作: ``` 独立 Slack App 共享 Slack App (现有) - ┌─────────┐ ┌─────────────────┐ - │ CoS-Bot │ │ Default-Bot │ - └────┬────┘ └───┬───┬───┬─────┘ - │ │ │ │ - 频道: #hq(home) #cto #build #cto #build #invest ... + ┌──────────────┐ ┌─────────────────┐ + │ Orchestrator │ │ Default-Bot │ + └──────┬───────┘ └───┬───┬───┬─────┘ + │ │ │ │ + 频道: #hq(home) #cto #build #cto #build #invest ... ────────────────────────────────────────────────────── - Agent: CoS ← 进入协作 → CTO Builder CIO ... + Agent: Orchestrator ← 进入协作 → CTO Builder CIO ... 
``` -**CoS-Bot 被拉进 #cto** → 直接在 CTO 的地盘对话 → 像两个人在同一间办公室讨论。 +这里的 Orchestrator 融合了 Anthropic Harness Design 中 **Planner + Evaluator** 的职责: +- 展开需求为验收标准(Planner) +- 评估参与者的产出(Evaluator) +- 控制讨论节奏和终止(Orchestrator) + +执行层 Agent(CTO/Builder 等)是 **Generator**:执行具体工作,不自评通过。 + +> **为什么要分离?** 当一个 AI 既做执行又做 QA 时,它倾向于宽容自己的错误;既做规划又做执行时,倾向于投机取巧。将"想"和"做"分给不同 Agent,是解决 AI 自评失效最有效的杠杆。 ### 技术原理(源码验证) 1. **Self-loop 按 account 隔离**:每个 Slack App 有独立 `botUserId`,OpenClaw 只过滤来自自己的消息(`message.user === ctx.botUserId`),不同 App 之间不互相过滤。 2. **`allowBots: true`**:允许处理其他 bot 的消息。须在目标频道的 channel config 中开启。 -3. **Per-account channel config**:同一频道可以给不同 account 设置不同的 `requireMention`。例如 #cto 频道:CTO 的 account → `requireMention: false`(照常响应所有消息);CoS 的 account → `requireMention: true`(只在被 @mention 时响应)。 -4. **Thread participation 隐式 mention**:CoS-Bot 一旦在某个 thread 中发过消息,后续该 thread 的消息会触发 CoS 的隐式 mention,让对话可以持续进行。 +3. **Per-account channel config**:同一频道可以给不同 account 设置不同的 `requireMention`。 -### 协作流程 +### ⚠️ Thread 内的隐式触发问题(实战发现) +`requireMention: true` 只在 **Channel 根消息**(非 thread)有效。一旦 bot 在 thread 中回复过,`implicitMention` 永远为 true,绕过 `requireMention`。 + +源码证据(`resolveMentionGating`): + +```js +implicitMention = !isDirectMessage && botUserId && message.thread_ts && + (message.parent_user_id === botUserId || hasSlackThreadParticipation(...)) ``` -用户在 #cto: "@CoS 请协调评审 X 功能的架构方案" -CoS-Bot 收到 @mention → 进入 #cto thread - → CoS: "好的,我来协调。@CTO 请先提出你的方案。" +**影响**:Thread 内所有消息都会触发已参与的 bot,可能导致双响应或循环。 + +**解决:两层防线** + +| 层级 | 机制 | 效果 | 类型 | +|------|------|------|------| +| Config | `requireMention: true` | 防止 Channel 根消息双触发 | 硬约束 | +| Prompt | 显式 @mention 协议(见下文) | 防止 Thread 内双响应和循环 | 软约束 | + +### 显式 @mention 协议(Multi-Agent Thread 规则) + +每个参与 Discussion Mode 的 Agent 必须在 workspace 文件中包含此规则: + +```markdown +## Multi-Agent Thread 协作规则 + +在 Slack thread 中如果有其他 bot 也在参与: -CTO (Default-Bot) 收到消息(requireMention: false,在自己频道) - → CTO: "方案如下:... 建议用方案 A。" +1. 
**显式 @mention 检查**:检查消息文本是否包含 `<@你的BotID>`。 + 如果没有 → 整条回复只输出 `NO_REPLY`,不解释、不叙述。 -CoS-Bot 收到(thread participation 隐式 mention) - → CoS: "@CTO 方案 A 的成本如何?另外 @Builder 请评估可行性。" - (CoS 决定下一个参与者——编排者角色) +2. **发送消息时必须 @mention 目标**:`<@目标BotID>` 显式 mention。 + 不 @ 任何 bot = 对话终止信号。 -Builder (Default-Bot) 在 #cto 收到 @mention → 加载 thread 历史 - → Builder: "方案 A 工期约 2 周,有一个依赖需要先解决。" +3. **角色分工**: + - Orchestrator(编排者):选择 @Worker / @Human / 不@(结束) + - Worker(执行方):每次回复必须 @ Orchestrator + +4. **终止**:说"完毕/done/结论"后不再发送,除非被重新 @。 + +5. **轮次上限**:同一 thread 内最多 8 轮,超过后暂停并向人类汇报。 +``` + +### 协作流程(融合 Anthropic Harness Design) -CoS-Bot 收到 → 综合意见 - → CoS: "DISCUSSION_CLOSE | 共识:采用方案 A,Builder 先处理依赖。" - → CoS 用 Delegation (sessions_send) 给 CTO 派正式任务 ``` +用户 → @Orchestrator: "讨论 X 议题" + +Phase 0(Orchestrator 展开 spec): + → 📁 discussions/<topic>/spec.md(目标 + 验收标准 + 终止条件) + → Thread: 「DISCUSSION SPEC: 目标...验收标准 N 条。@Worker 请先...」 -### 关键约束 +Round 1/M: + Orchestrator → @Worker: 具体问题 + Worker → @Orchestrator: 摘要 + 📁 round-1.md(详细分析) -- **CoS 是编排者**:决定讨论节奏,谁下一个发言。CTO/Builder 是参与者,不主动 @mention 其他 Agent。 -- **轮次上限**:CoS 的 AGENTS.md 写明 `maxDiscussionTurns: 5`。到限后必须结束。 -- **Thread 隔离**:每个讨论 = 一个 thread。 -- **讨论后转 Delegation**:Discussion 的 Action Item 通过 Delegation 执行。 +Round 2/M: + Orchestrator 评估 → 📁 review-1.md → @Worker 反馈 + ... + +终止(三选一): + ✅ 所有验收标准满足 → DISCUSSION_CLOSE + ⚠️ 达到最大轮次 → WARNING,请人类介入 + 🔄 连续 2 轮无进展 → 请人类介入 +``` + +**关键原则**(源自 Anthropic Harness Design): +1. **先 spec 再讨论**——Phase 0 定义验收标准,不能跳过 +2. **文件是主通信通道**——Thread 只放摘要和 @mention 路由,详细内容写文件 +3. **自评失效,必须分离**——Orchestrator 不生成方案,只协调和评估 +4. **Orchestrator 默认太宽松**——需刻意严格,逐条对照标准判断 +5. **正式任务走 `sessions_send`**——Discussion 的 Action Item 通过 Delegation 执行 ### Discussion 终止协议 @@ -160,9 +213,13 @@ CoS-Bot 收到 → 综合意见 ``` DISCUSSION_CLOSE Topic: <讨论主题> -Consensus: <共识 / "未达成共识"> -Actions: <后续 Delegation 任务列表,含 TID> -Participants: <参与 Agent 列表> +Consensus: <共识 / "未达成共识,原因:..."> +Criteria Status: + 1. ✅/❌ <标准 1>: <状态> + 2. 
✅/❌ <标准 2>: <状态> +Actions: <后续 Delegation 任务列表,含负责人> +Participants: <参与 Agent> +Rounds Used: N/M ``` --- @@ -235,59 +292,67 @@ Discussion 模式下,CoS-Bot 进入其他 Agent 的频道(如 #cto、#build --- -## 7) 已知限制与待验证 +## 7) 已知限制 -1. **Slack Discussion [待 POC 验证]**:各组件已通过源码验证(self-loop per-account、allowBots 三级 fallback、per-account channel config),但端到端链路尚无实测记录。 +1. **Slack Discussion [已验证]**:端到端链路已通过实测验证(2026-04-02)。两个 bot 可在同一频道互相看到消息并进行结构化讨论。但 Thread 内的隐式触发需要 Prompt 规则配合(见 §2b)。 2. **Discord Discussion [NO]**:OpenClaw Issues #11199(bot filter 全局化)+ #45300(requireMention 多账户失效),均已关闭未修复。 3. **Feishu Discussion [NO]**:飞书 `im.message.receive_v1` 仅投递用户消息(平台限制,非 OpenClaw bug)。 4. **Issue #15836**:OpenClaw 关闭了 Slack A2A routing 请求(NOT_PLANNED)。`sessions_send` 仍是官方推荐方式。Discussion 作为增强,非替代。 +5. **`allowBots: "mentions"` 仅 Discord 可用**:Slack provider 只做 truthy/falsy 检查,`"mentions"` 等同于 `true`。需要 OpenClaw 代码改动才能在 Slack 支持。 +6. **`requireMention: true` 在 Thread 内被绕过**:`implicitMention`(thread participation)会永久绕过 `requireMention`。需要 OpenClaw 增加 `thread.requireExplicitMention` 选项才能从系统层解决。 +7. **Input token 无法避免**:Thread 中所有消息都会送达所有 bot,Prompt 规则只让 agent 回复 NO_REPLY,但 input token 消耗不可避免。 +8. **多账号 `accounts.default` 必须显式声明**:详见附录 A 的警告。实战中因遗漏导致过 ~13h 全 Agent 断连。 --- ## 附录 A:Discussion Mode 配置指南(Slack) -> 以下配置可由你的 OpenClaw agent 协助完成。人工操作仅需创建 Slack App 和邀请 bot。 +> 详细实操指南见 [A2A_SETUP_GUIDE.md](../docs/A2A_SETUP_GUIDE.md),包含完整 manifest、陷阱清单和回滚方式。 +> 以下是精简版。 ### 人工操作(一次性) -1. **创建独立 Slack App**(如 CoS-Bot): - - 前往 [api.slack.com/apps](https://api.slack.com/apps) → Create New App - - 启用 Socket Mode,获取 App Token (`xapp-`) - - 添加 Bot Token Scopes: `channels:history`, `channels:read`, `chat:write`, `users:read` - - 添加 Event Subscriptions: `message.channels`, `app_mention` - - 获取 Bot Token (`xoxb-`) - - 记录 Bot User ID(Settings → Basic Info → App Credentials,或在 Slack 中查看 bot 的 profile) +1. 
**创建独立 Slack App**:前往 [api.slack.com/apps](https://api.slack.com/apps) → Create New App → **From manifest** + - 使用 [A2A_SETUP_GUIDE.md](../docs/A2A_SETUP_GUIDE.md) 中的完整 manifest(含所有必要 scope 和 event) + - 创建后:Basic Information → App-Level Tokens → Generate(scope: `connections:write`)→ 拿到 `xapp-` + - Install to Workspace → 拿到 `xoxb-` + +2. **邀请 Bot 到目标频道**:`/invite @Bot-Name` -2. **邀请 CoS-Bot 到目标频道**:在 #cto、#build 等频道中运行 `/invite @CoS-Bot` +### Agent 可执行的配置 -### Agent 可执行的配置(发给你的 OpenClaw) +> ⚠️ **硬性要求:必须同时声明 `accounts.default`** +> +> OpenClaw 的 `account-helpers.ts:listAccountIds()` 逻辑:一旦 `accounts` 对象存在且有任何 key, +> 只启动显式声明的账号。漏掉 `accounts.default` = 主 bot 断连,所有现有 Agent 失联。 +> +> 这是设计如此,不是 bug。实战中因此导致过约 13 小时的全 Agent 断连事故。 -> 以下是给 OpenClaw agent 的执行提示。将凭证替换为实际值后,发送给你的 agent。 +将以下配置提示发给你的 OpenClaw agent(替换凭证): ``` 请帮我配置 Discussion Mode。 -CoS-Bot 凭证(写入配置,不要回显): -- Bot Token: xoxb-cos-xxx -- App Token: xapp-cos-xxx +Bot 凭证(写入配置,不要回显): +- Bot Token: xoxb-xxx +- App Token: xapp-xxx -请在 openclaw.json 中执行以下增量修改(不要覆盖现有配置): +请在 openclaw.json 中执行以下增量修改: -1. 在 channels.slack 下添加 accounts 块: - - 将现有的 botToken/appToken 移入 accounts.default - - 添加 accounts.cos(使用上面的凭证) +1. 在 channels.slack.accounts 下: + ★ 必须同时声明 accounts.default(用现有顶层 token) + - accounts.default = { botToken: 现有, appToken: 现有 } + - accounts.<new-bot> = { botToken: 新的, appToken: 新的, channels: {...} } -2. 添加 CoS 的 account binding: - { "agentId": "cos", "match": { "channel": "slack", "accountId": "cos" } } +2. 在目标频道开启双向 allowBots: + - 全局 channel config: allowBots: true(让原有 Agent 看到新 bot) + - 新 account channel config: allowBots: true, requireMention: true -3. 在需要跨 bot 协作的频道配置中添加 allowBots: true: - channels.slack.accounts.default.channels.<CTO_CHANNEL_ID>.allowBots = true - channels.slack.accounts.cos.channels.<CTO_CHANNEL_ID>.requireMention = true - channels.slack.accounts.cos.channels.<CTO_CHANNEL_ID>.allowBots = true +3. 添加 binding(accountId + peer,放在现有 peer binding 之前) -4. 
不要修改现有的 agent bindings、models、auth、gateway 配置。 +4. 等待热重载,不要立即 SIGTERM -5. 重启 gateway 并验证 CoS-Bot 在 #cto 中可以被 @mention 触发。 +5. 验证日志:两个 provider 都 starting + channels resolved 无 missing_scope ``` ### 配置结构参考 From 31b62fe8b2411f90f8237ca26a2988ad78b04b18 Mon Sep 17 00:00:00 2001 From: Alex's Mac <alexmac@AlexsdeMac-mini-2.local> Date: Thu, 2 Apr 2026 22:25:41 +0800 Subject: [PATCH 3/4] chore: gitignore .harness/ and CLAUDE.md (local agent work files) --- .gitignore | 2 + .harness/reports/architecture_collab_r1.md | 1099 ----------------- .harness/reports/architecture_protocol_r1.md | 918 -------------- .harness/reports/qa_a2a_research_r1.md | 237 ---- .harness/reports/qa_docs_official_r1.md | 217 ---- .../reports/research_autonomous_slack_r1.md | 567 --------- .harness/reports/research_discord_r1.md | 382 ------ .harness/reports/research_feishu_r1.md | 433 ------- .../research_platform_limitations_r1.md | 109 -- .../reports/research_selective_agents_r1.md | 501 -------- .harness/reports/research_slack_r1.md | 382 ------ .harness/reports/verify_source_code_r1.md | 389 ------ CLAUDE.md | 98 -- 13 files changed, 2 insertions(+), 5332 deletions(-) delete mode 100644 .harness/reports/architecture_collab_r1.md delete mode 100644 .harness/reports/architecture_protocol_r1.md delete mode 100644 .harness/reports/qa_a2a_research_r1.md delete mode 100644 .harness/reports/qa_docs_official_r1.md delete mode 100644 .harness/reports/research_autonomous_slack_r1.md delete mode 100644 .harness/reports/research_discord_r1.md delete mode 100644 .harness/reports/research_feishu_r1.md delete mode 100644 .harness/reports/research_platform_limitations_r1.md delete mode 100644 .harness/reports/research_selective_agents_r1.md delete mode 100644 .harness/reports/research_slack_r1.md delete mode 100644 .harness/reports/verify_source_code_r1.md delete mode 100644 CLAUDE.md diff --git a/.gitignore b/.gitignore index 55316c4..1f11cc5 100644 --- a/.gitignore +++ b/.gitignore @@ -16,3 +16,5 @@ 
Thumbs.db *.env secrets/ credentials/ +.harness/ +CLAUDE.md diff --git a/.harness/reports/architecture_collab_r1.md b/.harness/reports/architecture_collab_r1.md deleted file mode 100644 index 65bf3f0..0000000 --- a/.harness/reports/architecture_collab_r1.md +++ /dev/null @@ -1,1099 +0,0 @@ -commit 7e825263db36aef68792a050c324daef598b4c56 -Author: Alex's Mac <alexmac@AlexsdeMac-mini-2.local> -Date: Sat Mar 28 17:38:48 2026 +0800 - - feat: add A2A v2 research harness, architecture, and agent definitions - - Multi-agent harness for researching and designing A2A v2 protocol: - - Research reports (Phase 1): - - Slack: true multi-agent collaboration via multi-account + @mention - - Feishu: groupSessionScope + platform limitation analysis - - Discord: multi-bot routing + Issue #11199 blocker analysis - - Architecture designs (Phase 2): - - A2A v2 Protocol: Delegation (v1) + Discussion (v2) dual-mode - - 5 collaboration patterns: Architecture Review, Strategic Alignment, - Code Review, Incident Response, Knowledge Synthesis - - 3-level orchestration: Human → Agent → Event-Driven - - Platform configs, migration guides, 6 ADRs - - Agent definitions for Claude Code Agent Teams: - - researcher.md, architect.md, doc-fixer.md, qa.md - - QA verification: all issues resolved, PASS verdict after fixes. - - Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> - -diff --git a/.harness/reports/architecture_collab_r1.md b/.harness/reports/architecture_collab_r1.md -new file mode 100644 -index 0000000..8ff71a1 ---- /dev/null -+++ b/.harness/reports/architecture_collab_r1.md -@@ -0,0 +1,1066 @@ -+# Architecture Report: Multi-Agent Collaboration Patterns (U0/U3, Round 1) -+ -+> Architect: Claude Opus 4.6 | Date: 2026-03-27 | Contract: `.harness/contracts/architecture-a2a.md` -+ -+--- -+ -+## Executive Summary -+ -+本报告定义了 OpenCrew 从"委派式 A2A"演化为"协作式 A2A"的完整架构设计。核心洞察:当前 A2A 是**单向委派**(A 发任务包给 B,B 执行后汇报),我们需要的是**多方协作**(多个 Agent 在同一 thread 中各自带入领域判断,互相挑战和完善)。 -+ -+设计产出五个部分: -+1. 
**协作模式目录** -- 5 种可落地的协作模式,每种含完整机制描述 -+2. **编排模型** -- 3 级编排层次,从人工驱动到事件驱动 -+3. **共享 Thread 协议** -- 多 Agent thread 的命名、轮次、终止、升级规范 -+4. **Harness 集成** -- 文件式(harness)与聊天式(OpenCrew)协作的映射关系 -+5. **配置模板** -- 可直接使用的 Slack 多账号配置片段 -+ -+**平台支持矩阵**(贯穿全文): -+ -+| 平台 | 多方协作 | 阻塞因素 | 替代方案 | -+|------|---------|---------|---------| -+| **Slack** | NOW | 无 -- `allowBots` + `requireMention` + multi-account 已就绪 | -- | -+| **Discord** | BLOCKED | Issue #11199(同实例 bot 消息互相过滤) | `sessions_send` 委派模式可用 | -+| **Feishu** | NOT POSSIBLE | 平台限制:`im.message.receive_v1` 仅投递用户消息,bot 消息对其他 bot 不可见 | `sessions_send` + 话题隔离可用 | -+ -+--- -+ -+## 1. Collaboration Patterns Catalog -+ -+### Pattern 1: Architecture Review(架构评审) -+ -+**描述**:CTO 提出技术方案,Builder 从可行性角度质疑,QA/Ops 识别风险,迭代至收敛。这是"提案-挑战-精炼"循环。 -+ -+**适用场景**: -+- 实现重大功能前的技术方案评审 -+- 引入新依赖或架构变更 -+- 跨系统集成方案论证 -+ -+**参与者**: -+| Agent | 角色 | 贡献 | -+|-------|------|------| -+| CTO | 提案者 | 提出架构方案、回应质疑、迭代设计 | -+| Builder | 可行性评审者 | 从实现角度评估复杂度、工期、技术债 | -+| Ops | 风险评审者 | 评估运维影响、安全风险、回滚策略 | -+| 用户 | 决策者 | 定目标、验收、在分歧时拍板 | -+ -+**机制(step-by-step)**: -+ -+``` -+Step 1: 用户(或 CoS)在 #collab 频道发帖 -+ "我们需要评审 X 功能的技术方案" -+ → CTO 作为频道绑定 Agent 自动收到(或被 @mention) -+ -+Step 2: CTO 发布架构提案(在 thread 中) -+ 格式:[CTO] Proposal: <标题> -+ 内容:目标、方案、技术选型、风险评估 -+ -+Step 3: 用户 @mention Builder -+ "@Builder 请从实现角度评估这个方案" -+ → Builder 的 bot 收到 mention event -+ → Builder 加载 thread 历史(通过 initialHistoryLimit) -+ → Builder 发布可行性评估 -+ -+Step 4: 用户 @mention Ops(如需要) -+ "@Ops 请评估运维和安全影响" -+ → Ops 加载 thread 历史,看到 CTO 提案 + Builder 评估 -+ → Ops 发布风险评估 -+ -+Step 5: 用户 @mention CTO -+ "@CTO 请回应 Builder 和 Ops 的反馈" -+ → CTO 看到所有历史,发布修订方案或反驳 -+ -+Step 6: 重复 Step 3-5 直到收敛 -+ 收敛标志:所有参与者表示"无进一步反对"或用户拍板 -+ -+Step 7: CTO 发布最终决议总结(thread 内) -+ 格式:[CTO] DECISION: <结论> -+ 内容:最终方案、遗留风险、下一步 action -+``` -+ -+**平台支持**: -+- **Slack**: NOW -- 多账号 + `allowBots: true` + `requireMention: true` -+- **Discord**: AFTER #11199 -- 修复后机制完全相同 -+- **Feishu**: NOT SUPPORTED -- bot 消息对其他 bot 不可见;替代:用户手动转述或 `sessions_send` 逐步委派 
-+ -+**防护栏**: -+- **循环防护**:设定 `maxDiscussionTurns = 5`(AGENTS.md 指令层)。超过 5 轮未收敛,必须升级到用户拍板 -+- **噪音防护**:所有共享频道 `requireMention: true`,Agent 不被 @mention 就不响应 -+- **上下文溢出防护**:每位 Agent 的发言不超过 500 字。超过时拆分为"摘要 + 详细附件链接" -+- **发散防护**:每轮必须聚焦 1 个议题。CTO 作为提案者负责"下一轮聚焦什么" -+ -+**示例场景**: -+ -+> 用户想给 OpenCrew 增加"自动 PR 创建"功能。 -+> -+> 1. 用户在 #collab: "评审自动 PR 创建功能的技术方案" -+> 2. CTO: "建议用 GitHub API + Builder Agent 的 bash tool 实现。架构:CoS 收到用户请求 → 转 CTO 拆解 → Builder 执行 git/gh 命令 → Builder closeout 含 PR URL" -+> 3. @Builder: "git 操作需要 repo 写权限。当前 Builder 的 tool 配置没有 gh auth。需要增加 GITHUB_TOKEN 环境变量。工期估计:配置 30 分钟,测试 1 小时" -+> 4. @Ops: "风险:GITHUB_TOKEN 泄露到 closeout 日志。建议:使用 SecretRef 而非明文。另外需要 L3 用户确认(创建 PR 是可回滚但影响较大的动作)" -+> 5. @CTO: "接受两点反馈。修订:使用 SecretRef 管理 token;PR 创建前在 thread 里 WAIT 用户确认(L2→L3 升级)" -+> 6. 用户: "APPROVED。CTO 请拆解任务给 Builder" -+> 7. CTO 发布 DECISION 总结 → 转入 A2A v1 委派流程给 Builder -+ -+--- -+ -+### Pattern 2: Strategic Alignment(战略对齐) -+ -+**描述**:用户表达高层目标,CoS 解读深层意图,CTO 规划技术路径,CIO 补充领域约束,多方收敛出可执行计划。 -+ -+**适用场景**: -+- 新项目/新方向启动 -+- 季度 OKR 拆解 -+- 重大方向转变(pivot) -+ -+**参与者**: -+| Agent | 角色 | 贡献 | -+|-------|------|------| -+| CoS | 意图解读者 | 澄清用户真实意图、优先级、价值判断 | -+| CTO | 技术规划者 | 将意图转化为技术路径、评估可行性 | -+| CIO | 领域约束者 | 补充领域知识、市场约束、合规要求 | -+| 用户 | 决策者 | 定方向、校正偏差 | -+ -+**机制(step-by-step)**: -+ -+``` -+Step 1: 用户在 #hq 频道发布目标 -+ "我想做 X,因为 Y"(可以是模糊的一句话) -+ → CoS 作为 #hq 绑定 Agent 首先响应 -+ -+Step 2: CoS 解读意图 -+ 格式:[CoS] Intent Alignment -+ 内容:我理解的目标、隐含假设、需要确认的点 -+ → 用户确认/校正 -+ -+Step 3: CoS @mention CTO(在同一 thread) -+ "@CTO 基于以上对齐的目标,请规划技术路径" -+ → CTO 加载 thread(看到用户原始目标 + CoS 解读 + 用户确认) -+ → CTO 发布技术方案草案 -+ -+Step 4: CoS @mention CIO(如涉及领域) -+ "@CIO 请补充领域约束和市场视角" -+ → CIO 加载 thread,看到全部上下文 -+ → CIO 发布领域分析 -+ -+Step 5: CoS 综合所有输入,产出对齐摘要 -+ 格式:[CoS] Alignment Summary -+ 内容:确认的目标、技术路径、领域约束、优先级排序、下一步 -+ → 用户最终确认 -+ -+Step 6: 确认后,CoS 通过 A2A v1 委派给 CTO 执行拆解 -+``` -+ -+**关键设计决策**:CoS 作为编排者而非信息中转。CoS 的价值不是"传话",而是在每一轮中加入自己的判断——"用户说了 X,但我认为真正的需求是 Y"。这与 ARCHITECTURE.md 中"CoS 是战略伙伴不是秘书"的定位一致。 -+ 
-+**平台支持**: -+- **Slack**: NOW -+- **Discord**: AFTER #11199 -+- **Feishu**: NOT SUPPORTED(替代:CoS 在各 Agent 群组间用 `sessions_send` 逐步推进,手动综合) -+ -+**防护栏**: -+- **意图漂移防护**:CoS 每轮输出必须包含"与用户原始目标的对齐度"评估 -+- **过度规划防护**:战略对齐最多 3 轮。3 轮后必须产出可执行的 next action -+- **CTO/CIO 范围防护**:CTO 只谈技术、CIO 只谈领域。越界时 CoS 有权引导回正题 -+ -+**示例场景**: -+ -+> 用户: "我想让 OpenCrew 支持自动处理 GitHub Issues" -+> -+> 1. CoS: "理解目标。我的解读:你想让 Agent 自动 triage issues、分类、分配。但我想确认——是全自动(Agent 直接处理)还是半自动(Agent 分析后等你确认)?另外范围是所有 repo 还是特定 repo?" -+> 2. 用户: "先做半自动,只针对 opencrew repo" -+> 3. CoS @CTO: "目标确认:半自动 GitHub Issues triage for opencrew repo。请规划技术路径。" -+> 4. CTO: "方案:用 GitHub Webhook → OpenClaw channel → Ops Agent 接收。Ops 分析 issue 后产出建议(分类 + 优先级 + 建议处理者),发到 #ops 等用户确认。技术需要:新增 GitHub channel 配置、Ops AGENTS.md 增加 triage 指令。" -+> 5. CoS @CIO: "从项目管理视角,有什么分类标准建议?" -+> 6. CIO: "建议分类:bug/feature/docs/question。优先级用 impact x urgency 矩阵。注意:外部贡献者的 issue 应该比内部的优先响应(社区建设)" -+> 7. CoS 综合: "Alignment Summary: 目标=半自动 issue triage for opencrew。路径=GitHub Webhook + Ops Agent。分类=bug/feature/docs/question。优先级=impact x urgency。社区 issue 优先。下一步:CTO 拆解任务。" → 用户确认 → 委派 -+ -+--- -+ -+### Pattern 3: Code/Design Review(代码/设计评审) -+ -+**描述**:Builder 产出代码或设计文档,CTO 评审架构合理性,QA/Ops 评审正确性和安全性,KO 检查知识一致性。这是"生产-多维评审-修订"循环。 -+ -+**适用场景**: -+- PR 合并前评审 -+- 设计文档评审 -+- 配置变更评审 -+- 知识库内容评审 -+ -+**参与者**: -+| Agent | 角色 | 贡献 | -+|-------|------|------| -+| Builder | 生产者 | 产出代码/文档,根据反馈修订 | -+| CTO | 架构评审者 | 评审架构合理性、设计模式、长期可维护性 | -+| Ops | 正确性/安全评审者 | 评审安全风险、运维影响、合规性 | -+| KO | 知识一致性评审者 | 检查与已有知识(principles/patterns/scars)的一致性 | -+ -+**机制(step-by-step)**: -+ -+``` -+Step 1: Builder 在 #build thread 完成实现,产出 closeout -+ closeout 包含:产出物路径、变更摘要、验证命令 -+ -+Step 2: CTO 在 #build thread 中 @mention(或 CTO 主动发起评审 thread) -+ → 评审在一个"评审 thread"中进行(可以在 #collab 或 #cto) -+ → CTO 发布架构评审 -+ -+Step 3: @Ops 评审安全/运维 -+ → Ops 看到 Builder 的 closeout + CTO 的评审 -+ → 发布安全/运维评估 -+ -+Step 4: @KO 评审知识一致性(如涉及系统变更) -+ → KO 看到全部上下文 -+ → 检查是否与 principles.md/patterns.md/scars.md 冲突 -+ → 如有冲突则指出 -+ 
-+Step 5: CTO 综合所有评审意见 -+ 格式:[CTO] Review Summary -+ 状态:APPROVED / NEEDS_REVISION / REJECTED -+ 如 NEEDS_REVISION:列出具体修改项 → 回到 Builder -+ -+Step 6: Builder 修订后重新提交(thread 内) -+ → 重复 Step 2-5(通常 1-2 轮即可收敛) -+``` -+ -+**与 Harness Evaluator 的对比**: -+ -+Anthropic harness 设计中的 Evaluator 是单一评审者,采用"Generator vs Evaluator"对抗模式。OpenCrew 的评审是**多维度评审**——不同 Agent 从不同维度审查同一产出。这更接近真实团队中"架构师看设计、安全团队看漏洞、文档团队看一致性"的工作方式。 -+ -+**平台支持**: -+- **Slack**: NOW -+- **Discord**: AFTER #11199 -+- **Feishu**: NOT SUPPORTED(替代:CTO 手动在各群组间转述评审结论,或人工触发 `sessions_send`) -+ -+**防护栏**: -+- **评审范围限定**:每位评审者只评审自己领域。CTO 不评审安全,Ops 不评审架构 -+- **评审轮次限制**:最多 3 轮修订。3 轮后要么 APPROVED 要么升级到用户决策 -+- **上下文容量管理**:`initialHistoryLimit` 建议设为 80-100(评审 thread 内容较多) -+- **评审格式标准化**:每条评审必须包含 `severity: [BLOCKER|MAJOR|MINOR|NITPICK]` + `具体修改建议` -+ -+**示例场景**: -+ -+> Builder 完成了 A2A_PROTOCOL.md v2 的草案。 -+> -+> 1. Builder closeout: "A2A_PROTOCOL_V2.md 已产出,路径 shared/A2A_PROTOCOL_V2.md。变更:新增多 bot 模式、简化触发流程、新增协作模式引用。" -+> 2. @CTO: "架构评审:新增的 multi-bot 模式逻辑清晰。但第 3 节协作模式引用缺少 Feishu 的降级方案。MAJOR:请补充。另外命名建议:A2A v1/v2 改为 delegation-mode/discussion-mode,避免暗示 v1 被废弃。MINOR。" -+> 3. @Ops: "安全评审:multi-bot 配置示例中 botToken 是明文。BLOCKER:必须改为 SecretRef 或环境变量引用。另外 allowBots: true 的安全含义需要在文档中明确警告。MAJOR。" -+> 4. @KO: "知识一致性检查:与 scars.md 中'sessions_send timeout 不等于未送达'的记录一致。与 patterns.md 中'一个任务 = 一个 thread = 一个 session'的原则需要更新——discussion-mode 中一个 thread 可能对应多个 Agent 的 session。MAJOR:建议更新 patterns.md。" -+> 5. CTO Review Summary: "NEEDS_REVISION。3 项需修改:Feishu 降级方案、token 安全化、patterns.md 更新。" -+> 6. 
Builder 修订 → 重新提交 → 第二轮评审 → APPROVED -+ -+--- -+ -+### Pattern 4: Incident Response(事件响应) -+ -+**描述**:Ops 检测到异常,CTO 诊断根因,Builder 提出修复方案,QA 验证修复。快速迭代直至解决。 -+ -+**适用场景**: -+- 生产环境异常(Agent 无响应、消息路由错误、配置漂移) -+- A2A 通信故障 -+- 安全事件 -+ -+**参与者**: -+| Agent | 角色 | 贡献 | -+|-------|------|------| -+| Ops | 检测 + 初步分析 | 发现问题、收集初步证据、触发响应流程 | -+| CTO | 诊断者 | 分析根因、确定影响范围、决定修复策略 | -+| Builder | 修复者 | 实施修复、验证修复 | -+| 用户 | 审批者 | L3 动作审批(如需要) | -+ -+**机制(step-by-step)**: -+ -+``` -+Step 1: Ops 在 #ops 检测到异常(手动或自动) -+ → Ops 在 #collab 创建事件响应 thread -+ 格式:[Ops] INCIDENT: <标题> | Severity: P1/P2/P3 -+ 内容:症状描述、初步证据(日志/错误信息)、影响范围 -+ -+Step 2: @CTO 诊断 -+ → CTO 加载 thread,分析 Ops 提供的证据 -+ → 发布诊断结果:根因假设、影响范围评估、建议修复策略 -+ → 如需更多信息:@Ops "请补充 X 日志" -+ -+Step 3: @Builder 修复(如诊断明确) -+ → CTO 在 thread 中 @Builder 并附带修复方案 -+ → Builder 实施修复、在 thread 中报告修复步骤和验证结果 -+ -+Step 4: @Ops 验证 -+ → Ops 验证修复后系统状态 -+ → 发布验证结果:修复有效 / 部分有效 / 无效 -+ -+Step 5: 如修复无效 → 回到 Step 2 -+ 如修复有效 → CTO 发布事件总结 -+ -+Step 6: CTO 发布 INCIDENT RESOLVED -+ 格式:[CTO] INCIDENT RESOLVED: <标题> -+ 内容:根因、修复措施、遗留风险、防复发建议 -+ → 同步到 KO(signal ≥ 2,记入 scars.md) -+``` -+ -+**紧急性设计**:事件响应与其他协作模式的关键区别是**时间压力**。机制设计体现为: -+- **快速启动**:Ops 直接 @mention CTO,不需要 CoS 中转 -+- **并行诊断**:CTO 可以同时 @mention Builder 准备修复环境 -+- **简化格式**:允许短消息,不强制完整的任务包格式 -+- **L3 快速通道**:P1 事件中,用户可以预授权"Builder 可以执行通常需要确认的操作" -+ -+**平台支持**: -+- **Slack**: NOW -+- **Discord**: AFTER #11199(事件响应对实时性要求最高,Discord 解决后应优先支持此模式) -+- **Feishu**: PARTIAL -- Ops 可以在各群组间用 `sessions_send` 分步协调,但缺乏所有参与者共同可见的统一 thread -+ -+**防护栏**: -+- **升级时限**:P1 事件如果 3 轮内未解决(约 15 分钟),自动升级到用户 -+- **操作审计**:所有修复操作必须在 thread 中明文记录(命令 + 输出) -+- **回滚预案**:Builder 每次修复前必须声明回滚步骤 -+- **事后复盘**:INCIDENT RESOLVED 后 24 小时内必须完成 postmortem(可由 KO 协助) -+ -+**示例场景**: -+ -+> Ops 发现 Builder Agent 在 #build 频道不响应。 -+> -+> 1. Ops: "INCIDENT: Builder Agent 无响应 | Severity: P2。症状:#build 频道 @Builder 无反应已 30 分钟。初步检查:gateway 进程正常、Slack WebSocket 连接正常。" -+> 2. 
@CTO: "诊断:可能原因:(1) Builder 的 session 卡在长任务中 (2) binding 配置问题 (3) Slack app token 过期。请 @Ops 检查 sessions_list 中 Builder 的活跃 session 数量和最近活动时间。" -+> 3. @Ops: "sessions_list 结果:Builder 有 3 个活跃 session,最近一个 45 分钟前创建,状态 active。看起来是卡在长任务。" -+> 4. CTO: "确认根因:Builder session 卡在长任务。修复策略:(A) 等待当前任务完成 (B) 重置卡住的 session。建议 A,如果 30 分钟后仍无响应再执行 B。@Builder 你当前在执行什么任务?" -+> 5. Builder(恢复后): "刚完成一个大文件的 git 操作,耗时 40 分钟。已恢复正常。" -+> 6. CTO: "INCIDENT RESOLVED: Builder 因长任务阻塞导致短暂无响应。根因:大文件 git 操作超过预期耗时。防复发:在 Builder AGENTS.md 中增加'长任务必须每 10 分钟发 checkpoint'的指令。" -+ -+--- -+ -+### Pattern 5: Knowledge Synthesis(知识综合) -+ -+**描述**:KO 呈现提炼后的知识(从 closeout 中提取),CTO 验证技术准确性,CIO 验证领域准确性,CoS 评估战略相关性。 -+ -+**适用场景**: -+- 周期性知识复盘(每周/每月) -+- 新知识条目入库前的交叉验证 -+- 知识库重大更新 -+ -+**参与者**: -+| Agent | 角色 | 贡献 | -+|-------|------|------| -+| KO | 呈现者 | 提炼知识、组织结构、提出入库建议 | -+| CTO | 技术验证者 | 验证技术内容的准确性和时效性 | -+| CIO | 领域验证者 | 验证领域知识的准确性和适用性 | -+| CoS | 战略评估者 | 评估知识的战略相关性和优先级 | -+ -+**机制(step-by-step)**: -+ -+``` -+Step 1: KO 在 #know 或 #collab 创建知识综合 thread -+ 格式:[KO] Knowledge Synthesis: <主题/周期> -+ 内容: -+ - 新增原则(candidates for principles.md) -+ - 新增模式(candidates for patterns.md) -+ - 新增教训(candidates for scars.md) -+ - 建议变更(对现有条目的更新) -+ -+Step 2: @CTO 技术验证 -+ → CTO 逐条验证技术内容 -+ → 标注:ACCURATE / OUTDATED / NEEDS_CONTEXT / INCORRECT -+ → 对 NEEDS_CONTEXT 和 INCORRECT 提供修正建议 -+ -+Step 3: @CIO 领域验证(如涉及领域知识) -+ → CIO 验证领域内容 -+ → 同样标注 + 修正建议 -+ -+Step 4: @CoS 战略评估 -+ → CoS 评估每条知识的战略权重 -+ → 标注:HIGH_VALUE / USEFUL / LOW_VALUE / IRRELEVANT -+ → 对 HIGH_VALUE 建议"升级为原则"或"影响后续规划" -+ -+Step 5: KO 综合所有反馈 -+ → 发布最终版本:哪些入库、哪些修改、哪些丢弃 -+ → 执行知识库更新 -+ → 发布 closeout(signal ≥ 2) -+``` -+ -+**与现有知识管道的集成**: -+ -+当前 KNOWLEDGE_PIPELINE.md 定义了 closeout → KO ingest 的单向流。Knowledge Synthesis 模式将其扩展为**双向验证**:KO 不仅是接收者,还是综合者,主动发起交叉验证。这增加了知识库的可靠性。 -+ -+**平台支持**: -+- **Slack**: NOW -+- **Discord**: AFTER #11199 -+- **Feishu**: NOT SUPPORTED(替代:KO 在各 Agent 群组逐一发送验证请求,手动综合结果) -+ -+**防护栏**: -+- **验证粒度**:每次综合不超过 10 条知识条目(避免评审者认知过载) -+- **频率控制**:每周最多 1 
次全面综合。临时入库可以走简化流程(KO 自行决定,signal < 2 不需要交叉验证) -+- **否决权**:CTO 对技术内容有否决权,CIO 对领域内容有否决权。被否决的条目不入库 -+ -+**示例场景**: -+ -+> KO 每周五执行知识综合。 -+> -+> 1. KO: "Knowledge Synthesis: Week 12。新增候选条目:(1) Principle: 'A2A sessions_send timeout 不等于失败,必须有兜底消息' (2) Pattern: '多 Agent 评审用 severity 标签分级' (3) Scar: 'Feishu bot 消息对其他 bot 不可见,跨 bot 触发必须走 sessions_send'" -+> 2. @CTO: "(1) ACCURATE,已在实战中多次验证。(2) ACCURATE,建议补充 severity 定义。(3) ACCURATE,这是 Feishu 平台限制非 OpenClaw bug。" -+> 3. @CIO: "本周无领域相关条目,SKIP。" -+> 4. @CoS: "(1) HIGH_VALUE — 影响所有 A2A 流程设计。(2) USEFUL — 评审效率提升。(3) HIGH_VALUE — 影响 Feishu 多 Agent 架构决策。" -+> 5. KO: "综合结果:3 条全部入库。(1) → principles.md (2) → patterns.md,补充 severity 定义后入库 (3) → scars.md。已更新。" -+ -+--- -+ -+## 2. Orchestration Model -+ -+### Level 1: Human Orchestrated(人工编排) -+ -+**描述**:用户手动 @mention Agent 驱动讨论。用户完全控制节奏、话题、参与者顺序。 -+ -+**机制**: -+1. 用户在共享频道/thread 中发帖 -+2. 用户通过 @mention 指定下一位发言的 Agent -+3. Agent 响应后 WAIT——不主动 @mention 其他 Agent -+4. 用户阅读响应后,决定下一步:@mention 另一位 Agent、要求当前 Agent 深入、或结束讨论 -+ -+**配置要求**: -+```json -+{ -+ "channels": { -+ "slack": { -+ "channels": { -+ "<COLLAB_CHANNEL_ID>": { -+ "allow": true, -+ "requireMention": true, -+ "allowBots": true -+ } -+ } -+ } -+ } -+} -+``` -+ -+**防护栏**: -+- 所有 Agent 的 AGENTS.md 中增加指令:`"在协作 thread 中,响应后 WAIT。不要主动 @mention 其他 Agent,除非处于 Level 2 编排模式。"` -+- `requireMention: true` 是硬约束——即使 Agent "想"响应也触发不了(无 mention = 无 event) -+ -+**推荐成熟度**:初始部署阶段。建立信任期。用户需要理解每位 Agent 的判断质量和领域能力。 -+ -+**优势**:完全可控、零风险无限循环、用户随时可以重定向讨论 -+**劣势**:用户成为瓶颈——每一步都需要人工输入 -+ -+--- -+ -+### Level 2: Agent Orchestrated(Agent 编排) -+ -+**描述**:指定一位"编排 Agent"(CTO 负责技术讨论、CoS 负责战略讨论),该 Agent 有权 @mention 其他 Agent 并管理讨论节奏。 -+ -+**机制**: -+1. 用户启动讨论并指定编排者:"@CTO 请主持这次架构评审,涉及 @Builder 和 @Ops" -+2. 编排 Agent(CTO)分析需求,决定第一步找谁 -+3. CTO @mention Builder: "请评估可行性" -+4. Builder 响应后(`allowBots: true` 使 CTO 能看到 Builder 的消息),CTO 决定下一步 -+5. CTO @mention Ops: "请评估安全影响" -+6. CTO 综合后发布结论或继续迭代 -+7. 
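If a user decision is needed, the CTO @mentions the user
-+
-+**Configuration requirements**:
-+```markdown
-+<!-- Add the following section to the orchestrator agent's AGENTS.md: -->
-+## Collaboration orchestration mode (Level 2)
-+When the user designates you as the orchestrator:
-+1. Analyze the participant list and the goal of the discussion
-+2. @mention participants in a logical order
-+3. After each participant responds, decide: is more input needed? Has the discussion converged? Is there a disagreement the user must settle?
-+4. After at most {maxOrchestratedRounds} rounds, produce a conclusion or escalate to the user
-+5. @mention at most 1 agent per round (to avoid concurrency conflicts)
-+```
-+
-+Key implementation point: how the orchestrator sees other agents' responses:
-+- The orchestrator's bot is in the shared channel, and `allowBots: true` lets it receive messages from other bots
-+- `requireMention: true` ensures the orchestrator does not respond to unrelated messages
-+- The instructions in the orchestrator's AGENTS.md decide when it @mentions the next participant
-+
-+**Guardrails**:
-+- **Hard round limit**: `maxOrchestratedRounds = 8` (enforced at the AGENTS.md instruction layer). Once exceeded, the orchestrator must produce "current best conclusion + remaining disagreements"
-+- **Silence detection**: if an @mentioned agent does not respond within 30 seconds, the orchestrator should note `[TIMEOUT: <Agent> did not respond]` in the thread and continue
-+- **User intervention**: the user can speak in the thread at any time; on seeing a user message, all agents should pause automatic orchestration and wait for the user's instruction
-+- **No self-appointed orchestrators**: the orchestrator role is assigned by the user; an agent cannot promote itself to orchestrator
-+
-+**Recommended maturity**: after Level 1 has validated the quality of the agents' judgment. Suited to highly repetitive collaboration patterns (such as a weekly review).
-+
-+**Pros**: the user is involved less often, and agents move the discussion forward on their own
-+**Cons**: the orchestrator may introduce bias (for example, always asking the same agent first), and the discussion may drift from what the user expected
-+
-+---
-+
-+### Level 3: Event-Driven
-+
-+**Description**: Agents decide on their own whether to join a discussion, based on a "relevance signal". They do not need to be @mentioned; they contribute when they detect content related to their own domain.
-+
-+**Mechanism**:
-+1. A user or an agent posts in the thread
-+2. Every agent in the channel receives the message (`allowBots: true` + `requireMention: false`)
-+3. Each agent internally evaluates "is this message relevant to my domain?"
-+4. If relevance is above the threshold, the agent speaks up
-+5. If relevance is low, the agent stays silent
-+
-+**Configuration requirements**:
-+```jsonc
-+{
-+  "channels": {
-+    "slack": {
-+      "channels": {
-+        "<COLLAB_CHANNEL_ID>": {
-+          "allow": true,
-+          "requireMention": false, // key setting: triggers without a mention
-+          "allowBots": true
-+        }
-+      }
-+    }
-+  }
-+}
-+
-+// Add to each agent's AGENTS.md:
-+// ## Event-driven participation (Level 3)
-+// When you receive a message in a collaboration channel that does not @mention you:
-+// 1. Rate its relevance to your domain (0-10)
-+// 2. Relevance < 7: do not respond
-+// 3. Relevance >= 7 and you have a distinct perspective: respond, prefixed with "[proactive]"
-+// 4. Relevance >= 7 but another agent has already covered it: do not respond
-+// 5. Speak proactively at most 2 times per thread (to avoid noise)
-+```
-+
-+**Why this is marked FUTURE**:
-+
-+The core challenge of Event-Driven mode is the **reliability of the relevance judgment**. Current LLMs are not stable enough at the meta-judgment "should I speak?": they may over-participate (noise) or miss a key moment (silent failure). The following prerequisites need to mature before this mode is enabled:
-+1. Accumulate enough real-world data through Level 1/2 on when an agent should speak
-+2. Distill a "trigger condition checklist" for each agent in AGENTS.md
-+3. 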
建立"proactive 发言质量"的评估机制 -+ -+**防护栏**: -+- **频率限制**:每位 Agent 在每个 thread 中最多主动发言 2 次 -+- **冷却期**:同一 Agent 在同一 thread 中两次主动发言间隔至少 3 分钟 -+- **[proactive] 前缀**:主动发言必须标注,方便用户区分"被要求的"和"自主的" -+- **用户静音**:用户可以在 thread 中发 `@Agent MUTE` 让特定 Agent 在该 thread 中保持沉默 -+- **紧急刹车**:如果一个 thread 中 Agent 消息数超过 20 且无用户参与,所有 Agent 自动进入 WAIT -+ -+**推荐成熟度**:远期目标。需要 Level 2 运行稳定 + 相关性判断经过充分验证。 -+ -+**优势**:最接近"真实团队"的讨论体验,Agent 主动贡献洞察 -+**劣势**:噪音风险最高,调试困难,用户可能感到"Agent 在自说自话" -+ -+--- -+ -+## 3. Shared Thread Protocol -+ -+### 3.1 Thread 命名规范 -+ -+多 Agent 协作 thread 的 root message 必须包含以下前缀: -+ -+``` -+COLLAB <TYPE> | <TITLE> | <DATE> -+``` -+ -+其中 TYPE 对应协作模式: -+ -+| TYPE | 对应模式 | 示例 | -+|------|---------|------| -+| `REVIEW` | Architecture Review / Code Review | `COLLAB REVIEW \| A2A v2 协议草案评审 \| 2026-03-27` | -+| `ALIGN` | Strategic Alignment | `COLLAB ALIGN \| GitHub Issues 自动化方向 \| 2026-03-27` | -+| `INCIDENT` | Incident Response | `COLLAB INCIDENT \| Builder 无响应 P2 \| 2026-03-27` | -+| `SYNTH` | Knowledge Synthesis | `COLLAB SYNTH \| Week 12 知识综合 \| 2026-03-27` | -+| `DISCUSS` | 通用讨论(不匹配以上模式) | `COLLAB DISCUSS \| 是否引入 MCP 支持 \| 2026-03-27` | -+ -+**与现有 A2A 前缀的共存**: -+- 委派式 A2A 继续使用 `A2A <FROM>→<TO> | <TITLE> | TID:<timestamp>` 前缀 -+- 协作式 thread 使用 `COLLAB <TYPE>` 前缀 -+- 两种前缀可以在同一频道共存——人和 Agent 都能快速区分"委派任务"和"协作讨论" -+ -+### 3.2 Turn Structure(轮次格式) -+ -+每位 Agent 在协作 thread 中的发言遵循以下格式: -+ -+``` -+[<角色>] <动作类型> -+<内容> -+[STATUS: <状态>] -+``` -+ -+**动作类型**: -+ -+| 动作 | 含义 | 使用者 | -+|------|------|-------| -+| `Proposal` | 提出方案 | 任何提案者 | -+| `Review` | 评审意见 | 评审者 | -+| `Response` | 回应他人意见 | 被评审者 | -+| `Diagnosis` | 诊断分析 | 技术角色 | -+| `Fix` | 修复方案 | 实施者 | -+| `Synthesis` | 综合总结 | 编排者/KO | -+| `Escalation` | 升级到用户 | 任何 Agent | -+| `[proactive]` | 主动发言 | Level 3 模式 | -+ -+**状态标签**: -+ -+| STATUS | 含义 | -+|--------|------| -+| `WAIT` | 等待下一步指令 | -+| `NEEDS_INPUT:<Agent/User>` | 需要特定方的输入 | -+| `CONVERGED` | 认为讨论已收敛 | -+| `BLOCKED:<原因>` | 被阻塞 | -+| `DECISION:<结论>` | 最终决策(通常由编排者/用户发出) | -+ -+**示例**: -+ 
-+``` -+[CTO] Proposal -+建议使用 multi-account Slack 配置实现多 Agent 协作。 -+核心变更:3 个 Slack app(CoS/CTO/Builder),共享 #collab 频道,allowBots + requireMention。 -+[STATUS: NEEDS_INPUT:Builder] -+``` -+ -+``` -+[Builder] Review -+可行性评估:multi-account 配置本身简单(已有文档),但需要创建 3 个 Slack app + 配置 OAuth。 -+工期估计:2 小时配置 + 1 小时验证。 -+风险:Socket Mode 下 3 个 app 的连接稳定性未验证。 -+severity: MINOR(风险可控) -+[STATUS: WAIT] -+``` -+ -+### 3.3 Context Management(上下文管理) -+ -+**Thread 历史控制**: -+ -+| 场景 | `initialHistoryLimit` 建议值 | 理由 | -+|------|---------------------------|------| -+| Architecture Review | 50 | 方案文本通常较长 | -+| Strategic Alignment | 30 | 对话式,轮次多但每轮短 | -+| Code Review | 80-100 | 代码/配置内容占用大量 token | -+| Incident Response | 30 | 快速迭代,每轮短 | -+| Knowledge Synthesis | 50 | 条目列表 + 评审意见 | -+ -+**上下文压缩策略**: -+ -+当 thread 超过 `initialHistoryLimit` 时,后加入的 Agent 只能看到最近 N 条消息。为此: -+ -+1. **编排者摘要责任**:每 5 轮,编排者发布一条 `[<角色>] Synthesis: Thread Summary`,概括到目前为止的关键决策和未决问题 -+2. **"到此为止"锚点**:编排者可以发布 `--- CONTEXT ANCHOR ---`,后续 Agent 只需要从这个锚点开始阅读 -+3. **附件而非内联**:超过 200 字的内容(代码块、配置文件、日志)应该放在 Slack 文件附件或外部链接中,而非内联到 thread 消息 -+ -+### 3.4 Termination Criteria(终止条件) -+ -+讨论"完成"的判定标准: -+ -+**显式终止**(推荐): -+1. **用户宣布**:用户在 thread 中发布 `RESOLVED` 或 `APPROVED` -+2. **编排者宣布**:编排者发布 `[<角色>] DECISION: <结论>` + `[STATUS: CONVERGED]` -+3. **所有参与者同意**:每位参与 Agent 发布 `[STATUS: CONVERGED]` -+ -+**隐式终止**(防护栏触发): -+1. **轮次上限**:达到 `maxDiscussionTurns`(默认 5 轮 for Level 1, 8 轮 for Level 2) -+2. **时间上限**:thread 最后一条消息超过 4 小时无新内容 -+3. **Agent 消息上限**:thread 中 Agent 消息总数超过 20 条(Level 3 的紧急刹车) -+ -+**终止后动作**: -+1. 编排者(或最后发言的 Agent)发布 discussion closeout(精简版 CLOSEOUT_TEMPLATE) -+2. 如果讨论产生了可执行任务 → 转入 A2A v1 委派流程 -+3. 如果讨论产生了知识 → 同步到 #know -+4. 
Thread 保留为可搜索的组织记忆 -+ -+### 3.5 Escalation(升级到人类) -+ -+以下情况 Agent 必须升级到用户: -+ -+| 触发条件 | 升级方式 | 升级消息格式 | -+|---------|---------|------------| -+| Agent 间分歧无法收敛(2+ 轮) | Thread 内 @mention 用户 | `[<角色>] Escalation: <Agent-A> 认为 X,<Agent-B> 认为 Y。需要你裁决。` | -+| 涉及 L3 动作 | Thread 内 @mention 用户 | `[<角色>] Escalation: 需要 L3 审批——<具体动作>。` | -+| 讨论偏离原始目标 | Thread 内 @mention 用户 | `[<角色>] Escalation: 讨论已偏离 "<原始目标>"。请确认是否调整方向。` | -+| 编排 Agent 不确定下一步 | Thread 内 @mention 用户 | `[<角色>] Escalation: 不确定应该咨询哪位 Agent 或是否可以结论。` | -+ -+### 3.6 与现有模板的集成 -+ -+**Discussion Closeout**(讨论收尾,基于 CLOSEOUT_TEMPLATE 精简版): -+ -+``` -+## Discussion Closeout -+- Thread: [COLLAB <TYPE> | <TITLE> | <DATE>] -+- Participants: [Agent 列表] -+- Rounds: [轮次数] -+ -+## Decisions -+1. ... -+2. ... -+ -+## Dissent(未达成共识的点) -+- ... -+ -+## Next Actions -+| Action | Owner | Type | -+|--------|-------|------| -+| ... | ... | A2A v1 委派 / 用户确认 / 知识入库 | -+ -+## Signal Score -+- [0-3] -+``` -+ -+**Discussion Checkpoint**(讨论中间切割,当讨论跨天或上下文膨胀时): -+ -+``` -+## Discussion Checkpoint #N -+- Thread: [COLLAB <TYPE> | <TITLE> | <DATE>] -+- Current Round: [M/max] -+ -+## 到目前为止的决策 -+- ... -+ -+## 未决问题 -+- ... -+ -+## 下一步 -+- 继续讨论:@<Agent> <问题> -+- 或:升级到用户 -+``` -+ -+--- -+ -+## 4. Integration with Harness Design -+ -+### 4.1 模式映射 -+ -+Anthropic 的 harness 设计使用 Planner → Builder → QA 的流水线。OpenCrew 的协作模式可以映射到 harness 的角色: -+ -+| Harness 角色 | OpenCrew 对应 | 协作模式中的体现 | -+|-------------|-------------|----------------| -+| **Planner** | CoS + CTO | Strategic Alignment (Pattern 2) 中 CoS 解读意图、CTO 规划路径 | -+| **Builder/Generator** | Builder | 所有模式中作为实施者/生产者 | -+| **Evaluator/QA** | CTO + Ops + KO | Code Review (Pattern 3) 中的多维评审 | -+| **Orchestrator/Harness** | CoS (Level 2) / 用户 (Level 1) | 编排模型中的编排者角色 | -+ -+**关键差异**: -+ -+1. **Harness 的 Evaluator 是单一的,OpenCrew 的评审是多维的**。Harness 的 GAN-inspired 模式是"Generator 产出 → Evaluator 挑战 → 迭代"。OpenCrew 的评审是"Builder 产出 → CTO 评架构 → Ops 评安全 → KO 评知识一致性"。多维评审能发现单一评审者的盲区。 -+ -+2. 
**Harness 的 Planner 是预先规划的,OpenCrew 的对齐是协商的**。Harness 的 Planner 独立写 spec,其他 Agent 执行。OpenCrew 的 Strategic Alignment 中,CTO 和 CIO 可以挑战 CoS 的解读——方案是讨论出来的,不是单方面决定的。 -+ -+3. **Harness 是批处理的,OpenCrew 是流式的**。Harness 中 Planner 写完 spec 文件再由 Builder 读取。OpenCrew 中 Agent 可以在 thread 中实时看到其他 Agent 的思路演化,实时调整自己的判断。 -+ -+### 4.2 Slack Thread 作为"Live Blackboard" -+ -+Harness 设计中的核心通信模式是 **Blackboard Pattern**:Agent 写文件到共享空间,其他 Agent 读取。 -+ -+Slack thread 可以视为**实时版 Blackboard**: -+ -+| Blackboard 概念 | 文件式(Harness) | 聊天式(OpenCrew Slack) | -+|----------------|-----------------|------------------------| -+| 写入 | Agent 写文件到 `output/` | Agent 在 thread 中发消息 | -+| 读取 | Agent 读取 `output/` 中的文件 | Agent 加载 thread 历史(`initialHistoryLimit`) | -+| 结构 | 文件名 + 文件内容 | 消息前缀 `[角色] 动作类型` + 内容 | -+| 持久化 | Git 仓库 | Slack 搜索(免费版 90 天) | -+| 版本控制 | Git diff | Thread 消息时间线 | -+| 访问控制 | 文件系统权限 | Slack 频道权限 + `requireMention` | -+ -+**优势**: -+- **人可读**:Slack thread 是人类自然使用的界面。不需要"打开文件查看 Agent 在做什么" -+- **可介入**:人在 thread 中发消息等于"实时写入 Blackboard",Agent 立即可见 -+- **可搜索**:Slack 的全文搜索等于 Blackboard 的历史索引 -+ -+**劣势**: -+- **结构松散**:文件可以有精确的 schema,thread 消息是自由文本 -+- **上下文窗口受限**:文件可以无限大,thread 受 `initialHistoryLimit` 限制 -+- **持久性不足**:Slack 免费版 90 天历史限制(Harness 的 Git 是永久的) -+ -+### 4.3 文件式 vs 聊天式:何时用哪个 -+ -+**用文件式(Harness)**: -+- 纯代码生成/修改任务(Builder 在仓库中工作) -+- 需要精确结构的产出物(配置文件、协议文档) -+- 需要 Git 版本控制的内容 -+- 长期运行的自动化流水线(无人值守) -+ -+**用聊天式(OpenCrew Slack)**: -+- 需要多方判断的决策(架构评审、战略对齐) -+- 需要实时人类参与的场景(事件响应、方向校正) -+- 需要跨领域视角的综合(知识综合) -+- 需要渐进式信任建设的场景(先 Level 1 再 Level 2) -+ -+**混合使用**: -+- 一次 Strategic Alignment(聊天式)产出结论后 → CTO 创建 Harness spec(文件式)→ Builder 在 Harness 中执行 → 产出通过 Code Review(聊天式)评审 -+- 这是"讨论决定做什么,Harness 执行怎么做,讨论验证做对了"的闭环 -+ -+### 4.4 Harness Evaluator 与 OpenCrew 多维评审的融合 -+ -+Harness 设计的最强概念之一是"Generator-Evaluator 对抗循环":Generator 倾向于创造,Evaluator 倾向于挑战,两者对抗产出高质量结果。 -+ -+OpenCrew 可以将这个概念泛化: -+ -+``` -+[Builder 产出] -+ ↓ -+[CTO 评架构] ← 对抗维度 1:设计合理性 -+ ↓ -+[Ops 评安全] ← 对抗维度 2:运维风险 -+ ↓ -+[KO 评一致性] ← 对抗维度 3:知识冲突 -+ ↓ 
-+[综合反馈 → Builder 修订] -+ ↓ -+[重复直到收敛] -+``` -+ -+这不是简单的"一个 Evaluator 说好或不好",而是"多个 Evaluator 从不同角度挑战,Builder 必须同时满足所有维度"。质量上限更高,但收敛时间更长——这就是为什么防护栏中设了评审轮次上限。 -+ -+--- -+ -+## 5. Config Template -+ -+### 5.1 Multi-Account Slack 配置(3 核心 Agent) -+ -+以下是启用多 Agent 协作的完整配置片段。基于 research_slack_r1.md 中确认的 OpenClaw 能力。 -+ -+```jsonc -+{ -+ "channels": { -+ "slack": { -+ // ===== 多账号配置 ===== -+ // 每个 Agent 一个独立的 Slack App,拥有独立的 bot identity -+ "accounts": { -+ "cos": { -+ "botToken": "${SLACK_BOT_TOKEN_COS}", // 环境变量引用,不明文 -+ "appToken": "${SLACK_APP_TOKEN_COS}", -+ "name": "CoS" -+ }, -+ "cto": { -+ "botToken": "${SLACK_BOT_TOKEN_CTO}", -+ "appToken": "${SLACK_APP_TOKEN_CTO}", -+ "name": "CTO" -+ }, -+ "builder": { -+ "botToken": "${SLACK_BOT_TOKEN_BUILDER}", -+ "appToken": "${SLACK_APP_TOKEN_BUILDER}", -+ "name": "Builder" -+ } -+ }, -+ -+ // ===== 频道配置 ===== -+ "channels": { -+ // --- 协作频道(多 Agent 讨论发生在这里) --- -+ "<COLLAB_CHANNEL_ID>": { -+ "allow": true, -+ "requireMention": true, // 必须 @mention 才触发(循环防护) -+ "allowBots": true // 允许处理其他 bot 的消息(协作核心) -+ }, -+ -+ // --- 各 Agent 专属频道(保持现有行为) --- -+ "<HQ_CHANNEL_ID>": { -+ "allow": true, -+ "requireMention": false, // CoS 在自己频道不需要 mention -+ "allowBots": false // 专属频道不接受 bot 消息 -+ }, -+ "<CTO_CHANNEL_ID>": { -+ "allow": true, -+ "requireMention": false, -+ "allowBots": false -+ }, -+ "<BUILD_CHANNEL_ID>": { -+ "allow": true, -+ "requireMention": false, -+ "allowBots": false -+ } -+ }, -+ -+ // ===== Thread 配置 ===== -+ "thread": { -+ "historyScope": "thread", // 每个 thread 独立 session -+ "inheritParent": true, // thread 继承 root message 上下文 -+ "initialHistoryLimit": 50 // 加入 thread 时加载最近 50 条消息 -+ } -+ } -+ }, -+ -+ // ===== Agent 绑定 ===== -+ "bindings": [ -+ // CoS: 绑定到 cos 账号 -+ { -+ "agentId": "cos", -+ "match": { -+ "channel": "slack", -+ "accountId": "cos" -+ } -+ }, -+ // CTO: 绑定到 cto 账号 -+ { -+ "agentId": "cto", -+ "match": { -+ "channel": "slack", -+ "accountId": "cto" -+ } -+ }, -+ // Builder: 绑定到 builder 账号 -+ { -+ 
"agentId": "builder", -+ "match": { -+ "channel": "slack", -+ "accountId": "builder" -+ } -+ } -+ ] -+} -+``` -+ -+### 5.2 Slack App 创建清单(每个 Agent 重复) -+ -+每个 Slack App 需要以下配置: -+ -+**Bot Token Scopes**(OAuth & Permissions): -+- `channels:history` -- 读取频道历史 -+- `channels:read` -- 读取频道信息 -+- `chat:write` -- 发送消息 -+- `chat:write.customize` -- 自定义发送者名称/图标(可选,multi-account 下不需要) -+- `users:read` -- 读取用户信息 -+ -+**Event Subscriptions**: -+- `message.channels` -- 接收频道消息事件 -+- `app_mention` -- 接收 @mention 事件 -+ -+**Socket Mode**: -+- 启用 Socket Mode -+- App-level token scope: `connections:write` -+ -+**注意事项**: -+- 每个 App 必须被邀请进所有需要参与的频道(专属频道 + 共享协作频道) -+- Socket Mode 连接数:3 个 App = 3 个持久 WebSocket 连接。研究报告指出这在正常范围内,但如果扩展到 5-7 个 App,建议考虑切换到 HTTP Events API 模式(每个 account 配置不同的 `webhookPath`) -+ -+### 5.3 AGENTS.md 协作指令段(追加到现有 AGENTS.md) -+ -+以下指令段应追加到参与协作的 Agent 的 AGENTS.md 中: -+ -+```markdown -+## 多 Agent 协作协议(追加段) -+ -+### 识别协作 Thread -+- 以 `COLLAB <TYPE>` 开头的 thread 是协作讨论 -+- 以 `A2A <FROM>→<TO>` 开头的 thread 是委派任务 -+- 在协作 thread 中,你是讨论参与者,不是任务执行者 -+ -+### 发言格式 -+- 每条发言以 `[<你的角色>] <动作类型>` 开头 -+- 结尾标注 `[STATUS: <状态>]` -+- 动作类型:Proposal / Review / Response / Diagnosis / Fix / Synthesis / Escalation -+- 状态:WAIT / NEEDS_INPUT:<谁> / CONVERGED / BLOCKED:<原因> / DECISION:<结论> -+ -+### 编排纪律 -+- Level 1(默认):响应后 WAIT。不主动 @mention 其他 Agent -+- Level 2(用户指定你为编排者时):可以 @mention 其他 Agent 推进讨论。最多 8 轮后必须收敛或升级 -+- 未被 @mention 的协作 thread 消息:不响应(由 requireMention 硬约束保证) -+ -+### 防护栏 -+- 每条发言不超过 500 字。超过时拆分为"摘要 + 链接" -+- 每个 thread 最多参与 5 轮讨论。超过时发布 `[STATUS: CONVERGED]` 或 `Escalation` -+- 如果发现讨论偏离原始目标,发布 Escalation 提醒用户 -+``` -+ -+### 5.4 迁移策略(从单 bot 到多 bot) -+ -+``` -+Phase 0: 准备(无风险) -+ - 创建 3 个 Slack App(CoS/CTO/Builder) -+ - 获取 token,配置环境变量 -+ - 不修改 openclaw 配置 -+ -+Phase 1: 创建协作频道(低风险) -+ - 创建 #collab 频道 -+ - 邀请 3 个 bot 进入 #collab -+ - 在 openclaw 配置中添加 #collab 的频道配置(allowBots + requireMention) -+ - 原有频道配置不变 -+ -+Phase 2: 切换 Agent 绑定(中风险,可回滚) -+ - 修改 bindings 为 multi-account 模式 -+ - 保留原 bot 作为 
fallback(如果用了新 App Token 后出问题) -+ - 逐个 Agent 切换:先 CTO → 验证 → 再 Builder → 验证 → 最后 CoS -+ - 每步验证:在专属频道和 #collab 分别测试响应 -+ -+Phase 3: 验证协作模式(低风险) -+ - 在 #collab 中手动测试 Architecture Review 模式 -+ - 验证:Agent A 的消息 Agent B 是否能看到 -+ - 验证:requireMention 是否正确过滤非相关消息 -+ - 验证:thread 历史加载是否完整 -+ -+Phase 4: 启用 Level 2 编排(中风险) -+ - 在 CTO/CoS 的 AGENTS.md 中追加协作指令段 -+ - 测试 Agent 编排模式 -+ - 观察是否有无限循环或噪音问题 -+``` -+ -+--- -+ -+## 6. Open Questions & Risks -+ -+### 确认度高的发现 -+ -+1. **Slack multi-account 支持一切所需能力** -- `allowBots: true` + `requireMention: true` + `initialHistoryLimit` 组合已被 OpenClaw 官方文档和社区实践确认 -+2. **Feishu 无法支持实时多 Agent 讨论** -- 平台限制(bot 消息不触发其他 bot 事件),这是 Feishu Open Platform 的设计决策,非 bug -+3. **Discord 等待 #11199 修复** -- 修复后机制与 Slack 类似 -+ -+### 需要验证的假设 -+ -+1. **Agent @mention 其他 bot 的可靠性**:当 CTO 的 LLM 在消息中输出 "@Builder" 时,OpenClaw 是否会将其转换为 Slack 的原生 mention 格式(`<@BOT_USER_ID>`)?如果只是纯文本 "@Builder",接收方的 bot 可能不会识别为 mention。**这是 Level 2 编排的关键前提,需要实测。** -+ -+2. **Thread 历史中 bot 消息的呈现**:当 Agent 加载 thread 历史时,其他 bot 的消息是否以可理解的方式呈现(包含发送者身份)?还是所有 bot 消息都显示为同一个"bot"? -+ -+3. **Socket Mode 多连接稳定性**:3 个 Slack App 各自维持 WebSocket 连接。在网络波动时是否存在重连竞争或消息丢失?研究报告建议 5+ App 时切换 HTTP 模式。 -+ -+4. **`allowBots: "mentions"` 模式的可用性**:DeepWiki 分析提到这个值但官方文档未明确列出。需要确认是否可用——如果可用,它比 `allowBots: true` 更安全(只处理 @mention 自己的 bot 消息)。 -+ -+### 架构风险 -+ -+| 风险 | 影响 | 缓解 | -+|------|------|------| -+| Agent 讨论质量不够(废话多、不收敛) | 用户体验差,噪音 > 信号 | 严格的发言格式 + 轮次限制 + 持续迭代 AGENTS.md | -+| 上下文窗口不够(长讨论后 Agent 忘记早期内容) | 讨论循环、重复、遗漏 | Context Anchor 机制 + 编排者摘要 | -+| Slack 免费版 90 天历史限制 | 历史讨论不可回溯 | 重要讨论的结论同步到 Git(closeout → 仓库) | -+| 多 bot 配置复杂度高 | 上手门槛增加 | 分阶段迁移 + 详细配置文档 | -+| Level 2 编排者偏见 | 讨论结论受编排者立场影响 | 多种编排者可选(CoS 主持战略、CTO 主持技术)+ 用户随时可介入 | -+ -+--- -+ -+## Appendix A: Quick Reference Card -+ -+### 选择协作模式 -+ -+``` -+需要做什么? 
-+├─ 评审技术方案 → Pattern 1: Architecture Review -+├─ 对齐新方向 → Pattern 2: Strategic Alignment -+├─ 评审已完成的工作 → Pattern 3: Code/Design Review -+├─ 处理紧急问题 → Pattern 4: Incident Response -+├─ 验证知识准确性 → Pattern 5: Knowledge Synthesis -+└─ 以上都不是 → COLLAB DISCUSS(通用讨论) -+``` -+ -+### 选择编排级别 -+ -+``` -+对 Agent 判断质量有信心吗? -+├─ 不确定 → Level 1: Human Orchestrated -+├─ 有信心,但想保持监督 → Level 2: Agent Orchestrated -+└─ 完全信任 + 已验证 → Level 3: Event-Driven (FUTURE) -+``` -+ -+### 讨论何时结束? -+ -+``` -+是否达到以下任一条件? -+├─ 用户说 RESOLVED/APPROVED → 结束 -+├─ 编排者发布 DECISION + CONVERGED → 结束 -+├─ 所有参与者都 CONVERGED → 结束 -+├─ 超过 maxDiscussionTurns → 强制结束,产出当前最佳结论 -+├─ 超过 4 小时无新消息 → 隐式结束 -+└─ 以上都没有 → 继续讨论 -+``` -+ -+## Appendix B: Collaboration vs Delegation Decision Matrix -+ -+| 维度 | 用 Delegation (A2A v1) | 用 Collaboration (A2A v2) | -+|------|----------------------|--------------------------| -+| 任务清晰度 | 高(DoD 明确) | 低(需要讨论才能明确) | -+| 参与者数量 | 2(指派方 + 执行方) | 3+(多方讨论) | -+| 是否需要对抗/挑战 | 否(执行即可) | 是(需要不同视角) | -+| 人类参与需求 | 低(启动 + 验收) | 高(引导讨论 + 裁决分歧) | -+| 适用任务类型 | A/P(执行型) | P/S(决策型、评审型) | -+| 平台要求 | 所有平台均支持 | Slack NOW / Discord AFTER #11199 / Feishu NOT SUPPORTED | -+ -+--- -+ -+> 本报告基于 research_slack_r1.md、research_discord_r1.md、research_feishu_r1.md 的发现,以及 ARCHITECTURE.md、A2A_PROTOCOL.md、SYSTEM_RULES.md 的现有架构设计。所有协作模式在 Slack 上仅需配置变更即可实现,无需上游代码修改。 diff --git a/.harness/reports/architecture_protocol_r1.md b/.harness/reports/architecture_protocol_r1.md deleted file mode 100644 index a31024f..0000000 --- a/.harness/reports/architecture_protocol_r1.md +++ /dev/null @@ -1,918 +0,0 @@ -commit 7e825263db36aef68792a050c324daef598b4c56 -Author: Alex's Mac <alexmac@AlexsdeMac-mini-2.local> -Date: Sat Mar 28 17:38:48 2026 +0800 - - feat: add A2A v2 research harness, architecture, and agent definitions - - Multi-agent harness for researching and designing A2A v2 protocol: - - Research reports (Phase 1): - - Slack: true multi-agent collaboration via multi-account + @mention - - Feishu: groupSessionScope + platform 
limitation analysis
-    - Discord: multi-bot routing + Issue #11199 blocker analysis
-
-    Architecture designs (Phase 2):
-    - A2A v2 Protocol: Delegation (v1) + Discussion (v2) dual-mode
-    - 5 collaboration patterns: Architecture Review, Strategic Alignment,
-      Code Review, Incident Response, Knowledge Synthesis
-    - 3-level orchestration: Human → Agent → Event-Driven
-    - Platform configs, migration guides, 6 ADRs
-
-    Agent definitions for Claude Code Agent Teams:
-    - researcher.md, architect.md, doc-fixer.md, qa.md
-
-    QA verification: all issues resolved, PASS verdict after fixes.
-
-    Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
-
-diff --git a/.harness/reports/architecture_protocol_r1.md b/.harness/reports/architecture_protocol_r1.md
-new file mode 100644
-index 0000000..6afaa80
---- /dev/null
-+++ b/.harness/reports/architecture_protocol_r1.md
-@@ -0,0 +1,885 @@
-+# A2A Collaboration Protocol v2 (Cross-Platform Multi-Agent)
-+
-+> **Version**: v2.0-draft | **Date**: 2026-03-27 | **Author**: Architecture Agent
-+>
-+> Goal: unify agent-to-agent collaboration across Slack, Feishu, and Discord while supporting both:
-+> - **Delegation**: a structured task packet triggered by `sessions_send` (inherited from v1, available on all platforms)
-+> - **Discussion**: @mention-driven multi-party conversation (new in v2, available on Slack now, pending a fix on Discord)
-+>
-+> Design principles:
-+> 1. Single-bot users are not broken (backward compatible)
-+> 2. Multi-bot setups unlock new capabilities (progressive enhancement)
-+> 3. Capabilities differ by platform (platform-aware)
-+> 4. The two modes coexist and complement each other; neither replaces the other
-+
-+---
-+
-+## 0. 
术语 Terminology -+ -+| 术语 | 定义 | -+|------|------| -+| **A2A** | Agent-to-Agent 协作流程总称,包含 Delegation 和 Discussion 两种模式 | -+| **Delegation(委派)** | v1 模式。Agent-A 构造完整任务包,通过 `sessions_send` 触发 Agent-B 执行。单向、结构化、全平台可用 | -+| **Discussion(讨论)** | v2 模式。多个 Agent 在同一 thread/topic 中通过 @mention 参与多方对话。多向、自然语言、平台受限 | -+| **Task Thread** | 在目标 Agent 的频道/群组/channel 里创建的任务线程。该线程即该任务的独立 Session | -+| **Anchor Message(锚点消息)** | 在目标频道发布的可见 root message,作为任务的人类审计入口 | -+| **Multi-Account(多账户)** | OpenClaw 特性:每个 Agent 使用独立的 bot app(独立 token/identity/quota) | -+| **Cross-Bot Visibility(跨 bot 可见性)** | 平台层面:Bot-A 的消息是否能被 Bot-B 的事件处理器接收 | -+| **Self-Loop Filter(自循环过滤)** | OpenClaw 层面:忽略"自己发出的消息"以防止无限循环 | -+| **Session Key** | OpenClaw 用于标识一次对话会话的唯一键,格式因平台和配置而异 | -+| **Turn** | 一次 Agent 响应周期。Discussion 模式中由 @mention 触发,Delegation 模式中由 `sessions_send` 触发 | -+| **Orchestrator** | 控制讨论节奏的角色(人类或指定 Agent),决定下一个发言者 | -+ -+--- -+ -+## 1. 权限矩阵 Permission Matrix(必须遵守) -+ -+> 本节与 v1 完全一致。技术上 bot 可以给任意频道发消息,但这是组织纪律,不遵守视为 bug。 -+ -+| 角色 | 可派单/发起讨论 | 约束 | -+|------|----------------|------| -+| CoS | CTO(默认不直达 Builder) | 方向对齐、战略决策 | -+| CTO | Builder / Research / KO / Ops | 技术决策、任务分解 | -+| Builder | 不主动派单;需澄清时回 CTO thread | 执行、实现 | -+| CIO | 尽量独立;必要时与 CoS/KO 同步 | 领域专长 | -+| KO/Ops | 通常不主动派单 | 审计、沉淀 | -+ -+**v2 补充(Discussion 模式)**: -+- Discussion 中的 @mention 也必须遵守权限矩阵。Builder 不应 @mention CoS 发起战略讨论。 -+- Orchestrator(通常是发起讨论的角色或 CTO)控制 @mention 顺序,间接执行权限。 -+- Agent SOUL.md/AGENTS.md 中须写明:"在 Discussion 中只回应自己被 @mention 的消息,不主动跨域发言。" -+ -+--- -+ -+## 2. 触发模式 Trigger Modes -+ -+### 2a. 
Delegation Mode(委派模式 -- v1 继承) -+ -+> 全平台可用。适用于:结构化任务委派、需要完整任务包的执行型工作。 -+ -+**触发流程(三步)**: -+ -+#### Step 1 -- 创建可见锚点消息 Anchor Message -+ -+A 在 B 的频道/群组创建 root message,第一行固定前缀: -+ -+``` -+A2A <FROM>-><TO> | <TITLE> | TID:<YYYYMMDD-HHMM>-<short> -+``` -+ -+正文必须是完整任务包(使用 `SUBAGENT_PACKET_TEMPLATE.md`): -+- Objective / DoD / Inputs / Constraints / Output format / CC -+ -+> 前置条件:bot 必须已加入目标频道/群组。Slack 报 `not_in_channel`;Feishu 需手动拉 bot 进群;Discord 需 bot 有 View Channel 权限。 -+ -+#### Step 2 -- `sessions_send` 触发目标 Agent -+ -+A 读取 root message 的消息 ID,拼出 thread/topic sessionKey: -+ -+| 平台 | Session Key 格式 | -+|------|-----------------| -+| Slack | `agent:<B>:slack:channel:<channelId>:thread:<root_ts>` | -+| Feishu | `agent:<B>:feishu:group:<chatId>:topic:<root_id>` (需 `groupSessionScope: "group_topic"`) | -+| Discord | `agent:<B>:discord:channel:<channelId>` (thread 继承父 channel) | -+ -+然后 A 用 `sessions_send(sessionKey=..., message=<完整任务包>)` 触发 B。 -+ -+> **timeout 容错**:`sessions_send` 返回 timeout 不等于没送达。消息可能已送达并被处理。 -+> 规避:在 thread 里补发兜底消息。 -+> -+> **SessionKey 注意**:不要手打。优先从 `sessions_list` 复制 `deliveryContext` 匹配的 key。注意大小写一致性。 -+ -+#### Step 3 -- 执行与汇报 -+ -+- B 的执行与产出都留在该 thread/topic。 -+- 上游(如 CTO)在自己的协调 thread 里同步 checkpoint/closeout。 -+- 完成后必须 closeout(见第 4 节)。 -+ -+### 2b. Discussion Mode(讨论模式 -- v2 新增) -+ -+> 仅在支持跨 bot 可见性的平台可用。适用于:多方讨论、设计评审、头脑风暴。 -+ -+**前提条件**: -+1. Multi-Account 已配置(每个参与讨论的 Agent 有独立 bot app) -+2. 共享频道配置了 `allowBots: true`(或 `"mentions"`)+ `requireMention: true` -+3. 
平台支持跨 bot 消息可见性(见 2c 平台能力矩阵) -+ -+**触发流程(单步)**: -+ -+#### Step 1 -- 在共享 thread 中 @mention 目标 Agent -+ -+Orchestrator(人类或指定 Agent)在 thread 中发送包含 @mention 的消息: -+ -+``` -+@CTO 这个架构方案的可行性如何?请从技术角度评估。 -+``` -+ -+目标 Agent 的 bot 收到 mention 事件,加载 thread history 作为上下文,然后响应。 -+ -+#### Step 2 -- 多轮迭代 -+ -+Orchestrator 根据回复决定下一步: -+- @mention 另一个 Agent 获取不同视角 -+- @mention 同一个 Agent 追问 -+- 人类直接介入修正方向 -+- 达成共识后总结并结束讨论 -+ -+**关键配置**: -+- `thread.historyScope: "thread"` -- 确保 Agent 看到完整 thread 历史 -+- `thread.initialHistoryLimit >= 50` -- 讨论可能较长,需要足够历史 -+- `thread.inheritParent: true` -- thread 参与者继承 root message 上下文 -+ -+**Discussion 专用规则**: -+- 每个 Agent 只在被 @mention 时响应(`requireMention: true` 强制) -+- 响应格式:`[角色] 内容...`(与 Delegation 模式保持一致的审计格式) -+- Agent 不得在 Discussion 中自行 @mention 其他 Agent,除非其 SOUL.md 明确授权为 Orchestrator -+- 讨论必须有明确的发起者和终止者(通常是同一个角色) -+ -+### 2c. 平台能力矩阵 Platform Capability Matrix -+ -+| 能力 | Slack | Discord | Feishu | -+|------|-------|---------|--------| -+| **Multi-Account** | YES | YES (PR #3672) | YES (已知 bug #47436) | -+| **跨 bot 消息可见性** | YES (Events API) | YES (MESSAGE_CREATE) 但 OpenClaw 过滤 (#11199) | NO (平台限制) | -+| **Delegation Mode** | YES | YES | YES | -+| **Discussion Mode** | **YES (NOW)** | **BLOCKED** (待 #11199 修复) | **NO** (平台不支持) | -+| **Self-loop 隔离** | 每 bot 独立 user ID | 全局过滤所有配置 bot (bug) | N/A (bot 消息不触发事件) | -+| **Thread/Topic 隔离** | `historyScope: "thread"` | Thread 继承 parent channel | `groupSessionScope: "group_topic"` | -+| **视觉身份** | 多 bot = 多身份 | 多 bot = 多身份 | 多 bot = 多身份 | -+ -+**平台特性详解**: -+ -+**Slack(能力最强)**: -+- Slack Events API 将所有频道消息投递给所有已加入的 bot app,不区分来源 -+- OpenClaw 的 self-loop 过滤是 per-bot-user-ID:Bot-CTO 的消息不会被 Bot-Builder 过滤 -+- `allowBots: true` + `requireMention: true` 即可实现安全的跨 bot 讨论 -+- **一步触发 @mention 现在就能用**,无需代码修改 -+ -+**Discord(待修复后接近 Slack)**: -+- Discord MESSAGE_CREATE 事件在平台层面支持跨 bot 可见 -+- **BLOCKER**: OpenClaw Issue #11199 -- bot 消息过滤器将所有已配置的 bot 视为"自己",Bot-A 的消息被 Bot-B 的 handler 丢弃 -+- 相关修复 PR: #11644, #22611, 
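#35479 (status to be confirmed)
-+- Also Issue #45300: `requireMention` may fail under a multi-account configuration
-+- **After the fix**: Discord's Discussion capability will be close to Slack's
-+
-+**Feishu (Delegation only)**:
-+- **Hard platform limit**: the `im.message.receive_v1` event fires only for user-sent messages; bot messages are invisible to other bots
-+- This is not an OpenClaw problem; it is how the Feishu platform is designed
-+- The two-step trigger (anchor + `sessions_send`) cannot be simplified
-+- What Multi-Account still provides: visual identity, independent API quotas, permission isolation
-+
-+### 2d. Mode Selection Guide
-+
-+| Scenario | Recommended mode | Reason |
-+|------|---------|------|
-+| CTO assigns Builder a specific task | Delegation | Structured task packet, explicit DoD |
-+| Multi-party architecture review | Discussion (Slack) / Delegation chain (Feishu, Discord) | Brings multiple viewpoints together |
-+| Urgent incident coordination | Discussion (Slack) / Human-in-loop (all) | Needs real-time interaction |
-+| Long-running, multi-phase delivery | Delegation | Needs the full checkpoint/closeout chain |
-+| Knowledge curation, retrospectives | Delegation (KO) | Structured output |
-+
-+---
-+
-+## 3. Visibility Contract
-+
-+> The user must be able to see all key information in the chat UI. Internal Agent-to-Agent communication (`sessions_send`) is invisible to the user, so it must be paired with visibility guarantees.
-+
-+### 3.1 Baseline visibility (all modes, all platforms)
-+
-+1. **Task root message is visible**: every task's anchor message must be visible in the target channel
-+2. **Key checkpoints are visible**: start / blocked / done are each posted to the thread at least once
-+3. **Upstream owns it to the end**: whoever dispatched the task follows up in their own coordination thread (so the user does not have to fish for information across channels)
-+4.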
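**Dual-channel record**:
-+   - A2A reply (structured reply to the upstream) -- visible only to the upstream
-+   - Thread/topic message (audit log visible to the user) -- visible to the user
-+   - **Both are required**
-+
-+### 3.2 Delegation mode visibility
-+
-+- The anchor message from Step 1 is the user's only entry point
-+- All execution happens inside the anchor message's thread/topic
-+- A closeout is mandatory on completion (see the closeout rules in Section 4)
-+- After triggering `sessions_send`, the sender should wait and verify that a reply appears in the thread
-+- If no reply arrives, mark it `failed-delivery` and escalate
-+
-+### 3.3 Discussion mode visibility (new in v2)
-+
-+- Discussion is visible by nature -- the whole conversation takes place in a shared thread that the user can read directly
-+- Each Agent's messages carry a visual identity (in multi-bot mode, each bot has its own avatar/name)
-+- **Visibility advantages of Discussion**:
-+  - The user sees the discussion in real time and can step in at any point
-+  - The Delegation problem of "A2A replies are invisible to the user" does not exist
-+  - The thread itself is the audit log
-+- **Visibility requirements for Discussion**:
-+  - When the discussion ends, the Orchestrator must post a summary of conclusions in the thread
-+  - If the Discussion produced follow-up Delegation tasks, the summary must reference their TIDs
-+
-+### 3.4 Multi-Bot visual identity
-+
-+| Platform | Single-bot mode | Multi-bot mode |
-+|------|-----------|-----------|
-+| Slack | All Agents share one bot name/avatar | Each Agent has its own Slack app identity |
-+| Feishu | All Agents share one Feishu app identity | Each Agent has its own Feishu app identity |
-+| Discord | All Agents share one bot identity | Each Agent has its own Discord bot identity |
-+
-+Visual identity in multi-bot mode needs no extra configuration -- each bot app's profile (name, avatar) is the Agent's identity. Slack additionally supports `chat:write.customize` for runtime identity overrides, but multi-bot mode does not need it.
-+
-+---
-+
-+## 4.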
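Multi-Round Discipline
-+
-+> This section applies to both Delegation and Discussion modes.
-+
-+### 4.1 Multi-round rules for Delegation mode (inherited from v1)
-+
-+When a Delegation task needs several rounds of iteration:
-+
-+- **Each round focuses on only 1-2 changes**, and the Agent **must WAIT** once they are done
-+- **Do not complete every step in one go** -- continue only after the upstream issues the next round's instruction
-+- Each round uses a fixed output format:
-+  ```
-+  [<Role>] Round N/M
-+  Done: <what was done>
-+  Run: <commands executed>
-+  Output: <key output, truncation allowed>
-+  WAIT: waiting for upstream instruction
-+  ```
-+- In the final round, post the closeout to the thread and reply `REPLY_SKIP` in the A2A reply to signal completion
-+
-+#### Round0 audit handshake (recommended)
-+
-+Before the real Round1, perform one tiny real action to verify the audit chain:
-+- Ask the target Agent to run a side-effect-free command (such as `pwd`) and post the result to the thread
-+- **Stop if the Round0 reply does not appear** -- the session may not be bound to the correct deliveryContext
-+
-+### 4.2 Multi-round rules for Discussion mode (new in v2)
-+
-+Multi-round control matters even more in Discussion mode -- an uncontrolled Agent discussion can loop forever.
-+
-+**Orchestrator control principles**:
-+- @mention only one Agent at a time (avoids conflicting concurrent responses)
-+- After a reply arrives, the Orchestrator decides the next step: continue / switch Agent / end
-+- The Orchestrator can be a human or a designated Agent (e.g. CTO chairing a technical discussion)
-+
-+**Discussion turn limit**:
-+- `maxDiscussionTurns`: suggested value = 5 (Level 1, human-orchestrated) / 8 (Level 2, Agent-orchestrated)
-+- On reaching the limit, the Orchestrator must summarize the current state and decide: end / convert to Delegation / ask a human to step in
-+- The Orchestrator enforces this limit on itself through SOUL.md/AGENTS.md (it is not enforced by the system)
-+
-+> Like Delegation's `maxPingPongTurns = 4`, the Discussion turn limit prevents runaway loops.
-+> Delegation's `maxPingPongTurns` is an OpenClaw system-level parameter; Discussion's `maxDiscussionTurns` is a protocol-level convention that Agents enforce on themselves.
-+
-+**Agent response rules**:
-+- Respond only when @mentioned (enforced by `requireMention: true`)
-+- A response must contain a clear opinion or recommendation (empty replies such as "I agree" are not allowed)
-+- If an Agent has nothing valuable to add, it should reply `[Role] PASS: <one-sentence reason>`
-+- If an Agent believes consensus has been reached, it should end its reply with `CONSENSUS: <one-sentence consensus>`
-+
-+**Discussion closing protocol**:
-+- The Orchestrator posts `DISCUSSION_CLOSE`:
-+  ```
-+  DISCUSSION_CLOSE
-+  Topic: <discussion topic>
-+  Consensus: <consensus / "no consensus reached">
-+  Actions: <list of follow-up Delegation tasks, with TIDs>
-+  Participants: <list of participating Agents>
-+  ```
-+
-+### 4.3 Closeout rules (all modes)
-+
-+A closeout is mandatory on completion (hard DoD rule; every item is required):
-+1. Post the closeout in the target Agent's thread/topic (artifact paths + verification commands)
-+2. **Upstream re-verifies locally** (CLI-first): run at least the key commands and post the exit codes
-+3. **Report back in the initiator's channel**: sync the final result + how to verify it + remaining risks. **If this is skipped, the task counts as unfinished**
-+4.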
Notify KO for knowledge capture (default: sync to #know / the KO group + trigger KO ingest)
-+
-+The Discussion-mode equivalent of a closeout is the `DISCUSSION_CLOSE` summary. If the discussion produced Delegation tasks, each Delegation task gets its own closeout.
-+
-+---
-+
-+## 5. Channel & Group Mapping
-+
-+### 5.1 Standard mapping
-+
-+| Role | Slack Channel | Feishu Group | Discord Channel |
-+|------|--------------|-------------|----------------|
-+| CoS | #hq | HQ group | #hq |
-+| CTO | #cto | CTO group | #cto |
-+| Builder | #build | Build group | #build |
-+| CIO | #invest | Invest group | #invest |
-+| KO | #know | Know group | #know |
-+| Ops | #ops | Ops group | #ops |
-+| Research | #research | Research group | #research |
-+
-+### 5.2 Dedicated Discussion channels (new in v2, optional)
-+
-+For multi-bot Discussion mode, adding shared channels is recommended:
-+
-+| Channel | Purpose | Participating bots |
-+|------|------|---------|
-+| #collab / Collab group / #collab | Cross-domain discussion (architecture review, proposal alignment) | CoS, CTO, Builder, CIO |
-+| #war-room / War Room group / #war-room | Urgent incident coordination | All |
-+
-+**Shared channel configuration notes**:
-+- Every participating bot must join the channel/group
-+- `requireMention: true` -- prevents all bots from responding at once
-+- `allowBots: true` or `"mentions"` -- lets a bot see other bots' messages
-+- Each Agent's binding for the channel must be configured explicitly
-+
-+### 5.3 Session Key format overview
-+
-+**Slack**:
-+```
-+# Channel level
-+agent:<agentId>:slack:channel:<channelId>
-+# Thread level (recommended)
-+agent:<agentId>:slack:channel:<channelId>:thread:<root_ts>
-+# Multi-account (accountId does not affect the session key format)
-+```
-+
-+**Feishu**:
-+```
-+# Group level (default)
-+agent:<agentId>:feishu:group:<chatId>
-+# Topic level (with groupSessionScope: "group_topic" enabled)
-+agent:<agentId>:feishu:group:<chatId>:topic:<root_id>
-+```
-+
-+**Discord**:
-+```
-+# Channel level
-+agent:<agentId>:discord:channel:<channelId>
-+# Multi-account (session key includes accountId)
-+agent:<agentId>:discord:<accountId>:channel:<channelId>
-+```
-+
-+### 5.4 Parallelism rules
-+
-+- **One task = one thread/topic = one session**
-+- A single channel can run several task threads/topics in parallel; do not mix multiple tasks in the channel's main timeline
-+- Discussion and Delegation can run in parallel in different threads of the same channel
-+
-+---
-+
-+## 6.
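Failure & Fallback
-+
-+### 6.1 Delegation failure fallback
-+
-+| Failure | Symptom | Fallback |
-+|------|------|---------|
-+| `sessions_send` timeout | The tool returns a timeout | Not necessarily a failure. Post a fallback message in the thread; wait and check for a reply |
-+| Target Agent does not respond | No Round0 reply in the thread | Stop the remaining steps; mark `failed-delivery`; check the session key and deliveryContext |
-+| Session routed to webchat | The Agent is running but the thread shows no output | The Round0 audit handshake catches this early; re-check the letter case of the session key |
-+| Bot has not joined the channel | `not_in_channel` / send fails | Manually invite the bot to the channel/group |
-+| Abnormal thread behavior | Messages land in the wrong thread or outside any thread | Fall back to "one task in the channel's main timeline" mode; or send /new to reset the session |
-+
-+### 6.2 Discussion failure fallback
-+
-+| Failure | Symptom | Fallback |
-+|------|------|---------|
-+| Agent does not respond to an @mention | No reply in the thread | Check the `allowBots` / `requireMention` configuration; check that the bot has joined the channel |
-+| Agent responds in the wrong thread | The reply appears in an unexpected place | Check the `thread.historyScope` and `inheritParent` configuration |
-+| Discussion enters an endless loop | Agents keep repeating similar points to each other | The Orchestrator forces `DISCUSSION_CLOSE`; convert to Delegation |
-+| Turn limit exceeded | `maxDiscussionTurns` reached | The Orchestrator summarizes and decides: end / human takeover / split into subtasks |
-+| Platform does not support Discussion | Feishu; Discord with #11199 unfixed | Degrade to Delegation mode. Use `sessions_send` to collect opinions from multiple Agents in sequence |
-+
-+### 6.3 Cross-platform degradation strategy
-+
-+```
-+Is Discussion Mode available?
-+├── YES (Slack) → use @mention-driven discussion
-+├── BLOCKED (Discord) → degrade to a Delegation chain
-+│   └── The Orchestrator uses sessions_send to ask each Agent for its opinion, one at a time
-+│       → each Agent's reply is posted in the shared thread (visibility anchor)
-+│       → the Orchestrator consolidates and posts the conclusion
-+└── NO (Feishu) → same Delegation chain approach as above
-+```
-+
-+---
-+
-+## 7.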
Platform Config Snippets
-+
-+### 7.1 Slack configuration (multi-account + Discussion mode)
-+
-+```json
-+{
-+  "channels": {
-+    "slack": {
-+      "accounts": {
-+        "default": {
-+          "botToken": "xoxb-cos-...",
-+          "appToken": "xapp-cos-...",
-+          "name": "CoS"
-+        },
-+        "cto": {
-+          "botToken": "xoxb-cto-...",
-+          "appToken": "xapp-cto-...",
-+          "name": "CTO"
-+        },
-+        "builder": {
-+          "botToken": "xoxb-bld-...",
-+          "appToken": "xapp-bld-...",
-+          "name": "Builder"
-+        }
-+      },
-+      "channels": {
-+        "<HQ_CHANNEL_ID>": {
-+          "allow": true,
-+          "requireMention": false,
-+          "allowBots": false
-+        },
-+        "<CTO_CHANNEL_ID>": {
-+          "allow": true,
-+          "requireMention": false,
-+          "allowBots": false
-+        },
-+        "<BUILD_CHANNEL_ID>": {
-+          "allow": true,
-+          "requireMention": false,
-+          "allowBots": false
-+        },
-+        "<COLLAB_CHANNEL_ID>": {
-+          "allow": true,
-+          "requireMention": true,
-+          "allowBots": true
-+        }
-+      },
-+      "thread": {
-+        "historyScope": "thread",
-+        "inheritParent": true,
-+        "initialHistoryLimit": 50
-+      }
-+    }
-+  },
-+  "bindings": [
-+    { "agentId": "cos", "match": { "channel": "slack", "accountId": "default" } },
-+    { "agentId": "cto", "match": { "channel": "slack", "accountId": "cto" } },
-+    { "agentId": "builder", "match": { "channel": "slack", "accountId": "builder" } }
-+  ]
-+}
-+```
-+
-+**Configuration notes**:
-+- Each Agent's dedicated channel: `requireMention: false` (only one bot responds in that channel, so no mention gating is needed)
-+- Shared channel #collab: `requireMention: true` + `allowBots: true` (safe multi-bot collaboration)
-+- `thread.initialHistoryLimit: 50` -- Discussion mode needs a larger history window
-+- Each Slack app needs Bot Token Scopes: `channels:history`, `channels:read`, `chat:write`, `users:read`
-+- Event Subscriptions: `message.channels`, `app_mention`
-+- Socket Mode: each app needs its own app-level token (`xapp-`)
-+
-+### 7.2 Feishu configuration (multi-account + Topic isolation)
-+
-+```json
-+{
-+  "channels": {
-+    "feishu": {
-+      "domain": "feishu",
-+      "connectionMode": "websocket",
-+      "groupSessionScope": "group_topic",
-+      "accounts": {
-+        "cos-bot": {
-+          "name": "CoS",
-+          "appId": "cli_cos_xxxxx",
-+          "appSecret": "your-cos-secret",
-+          "enabled": true
-+        },
-+        "cto-bot": {
-+          "name": "CTO",
-+          "appId": "cli_cto_xxxxx",
-+          "appSecret": "your-cto-secret",
-+          "enabled": true
-+        },
-+        "builder-bot": {
-+          "name": "Builder",
-+          "appId": "cli_build_xxxxx",
-+          "appSecret": "your-builder-secret",
-+          "enabled": true
-+        }
-+      }
-+    }
-+  },
-+  "bindings": [
-+    {
-+      "agentId": "cos",
-+      "match": {
-+        "channel": "feishu",
-+        "accountId": "cos-bot",
-+        "peer": { "kind": "group", "id": "<FEISHU_GROUP_ID_HQ>" }
-+      }
-+    },
-+    {
-+      "agentId": "cto",
-+      "match": {
-+        "channel": "feishu",
-+        "accountId": "cto-bot",
-+        "peer": { "kind": "group", "id": "<FEISHU_GROUP_ID_CTO>" }
-+      }
-+    },
-+    {
-+      "agentId": "builder",
-+      "match": {
-+        "channel": "feishu",
-+        "accountId": "builder-bot",
-+        "peer": { "kind": "group", "id": "<FEISHU_GROUP_ID_BUILD>" }
-+      }
-+    }
-+  ]
-+}
-+```
-+
-+**Configuration notes**:
-+- `groupSessionScope: "group_topic"` -- the core setting; it provides topic-level session isolation
-+- Add only the corresponding bot to each group (avoids cross-account dedup races)
-+- The "topic group" type is recommended -- it forces every message to belong to a topic, which fits the OpenCrew workflow better
-+- **Known issue**: Issue #47436 -- the plugin crashes when a second account uses SecretRef. PR #47652 has been submitted as a fix. Until it is merged, use a plaintext secret or wait for the patch
-+
-+**Feishu cannot use Discussion mode**:
-+- Reason: `im.message.receive_v1` fires only for user messages; bot messages are invisible to other bots
-+- Alternative: use Delegation (`sessions_send`) for all cross-Agent collaboration
-+
-+### 7.3 Discord configuration (multi-account + channel permission isolation)
-+
-+```json
-+{
-+  "channels": {
-+    "discord": {
-+      "accounts": {
-+        "default": {
-+          "token": "BOT_TOKEN_COS"
-+        },
-+        "cto": {
-+          "token": "BOT_TOKEN_CTO"
-+        },
-+        "builder": {
-+          "token": "BOT_TOKEN_BUILDER"
-+        }
-+      }
-+    }
-+  },
-+  "bindings": [
-+    {
-+      "agentId": "cos",
-+      "match": {
-+        "channel": "discord",
-+        "accountId": "default",
-+        "guildId": "<GUILD_ID>",
-+        "peer": { "kind": "channel", "id": "<CHANNEL_ID_HQ>" }
-+      }
-+    },
-+    {
-+      "agentId": "cto",
-+      "match": {
-+        "channel": "discord",
-+        "accountId": "cto",
-+        "guildId": "<GUILD_ID>",
-+        "peer": { "kind": "channel", "id": "<CHANNEL_ID_CTO>" }
-+      }
-+    },
-+    {
-+      "agentId": "builder",
-+      "match": {
-+        "channel": "discord",
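-+        "accountId": "builder",
-+        "guildId": "<GUILD_ID>",
-+        "peer": { "kind": "channel", "id": "<CHANNEL_ID_BUILD>" }
-+      }
-+    }
-+  ]
-+}
-+```
-+
-+**Discord channel permission isolation (mandatory)**:
-+
-+Given the lesson of Issue #34 in single-bot mode (cos/ops conversations getting mixed up), channel permission isolation **must** be configured:
-+
-+1. Create a separate Role for each bot (e.g. "CoS Bot", "CTO Bot", "Builder Bot")
-+2. Server-level permissions: grant only View Channels + Read Message History; **do not grant Send Messages**
-+3. Grant permissions channel by channel:
-+   - #hq: "CoS Bot" role -> Allow Send Messages + Send Messages in Threads
-+   - #cto: "CTO Bot" role -> Allow Send Messages + Send Messages in Threads
-+   - #build: "Builder Bot" role -> Allow Send Messages + Send Messages in Threads
-+4. **Make sure the bot role does not have the Administrator permission** (otherwise every channel-level override is void)
-+
-+**Thread notes**:
-+- Discord threads are archived automatically (by default after 24h without activity)
-+- The bot needs the Manage Threads permission to unarchive
-+- Letting the threads of finished tasks archive naturally is acceptable
-+
-+**Current limitations**:
-+- Discussion mode is unavailable until Issue #11199 is fixed
-+- Issue #45300: `requireMention` may fail under a multi-account configuration
-+- Use Delegation (`sessions_send`) for all cross-Agent collaboration
-+
-+---
-+
-+## 8. Migration Guide
-+
-+### 8.1 General principles
-+
-+- **Incremental migration**: change one Agent at a time and verify before continuing
-+- **Keep the single bot**: retain the original single-bot configuration as the `default` account and add new bots one by one
-+- **Reversible**: every step can be rolled back by restoring the configuration and restarting the gateway
-+- **Session caveat**: a multi-account migration may change the session key format, which can orphan old sessions
-+
-+### 8.2 Slack migration: single bot -> multi-bot
-+
-+**Phase 0 -- Preparation (does not affect the running system)**
-+
-+1. Create a new Slack app for each Agent that needs its own identity (see SLACK_SETUP.md)
-+   - Start with 3 core apps: CoS, CTO, Builder
-+   - Configure Bot Token Scopes, Event Subscriptions, Socket Mode
-+2. Invite the new bots to their channels
-+3. Back up the current `openclaw.json`
-+
-+**Phase 1 -- Switch to the multi-account configuration**
-+
-+1. Change `channels.slack` from the single-token form to the `accounts` form:
-+   ```json
-+   // Before:
-+   { "channels": { "slack": { "botToken": "xoxb-...", "appToken": "xapp-..." } } }
-+
-+   // After:
-+   { "channels": { "slack": { "accounts": { "default": { "botToken": "xoxb-...", "appToken": "xapp-..." } } } } }
-+   ```
-+2. Add `accountId` to the bindings
-+3. Restart the gateway and verify that existing functionality is unaffected
-+
-+**Phase 2 -- Add new bot accounts**
-+
-+1. Add each new Agent's account to `accounts`, one at a time
-+2. Update the `accountId` of the corresponding binding
-+3.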
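Restart and verify after each addition
-+
-+**Phase 3 -- Enable Discussion mode**
-+
-+1. Create the #collab channel and invite every participating bot
-+2. Configure #collab: `requireMention: true` + `allowBots: true`
-+3. Add a binding for the #collab channel for each Agent
-+4. Test: a human posts in #collab and @mentions different Agents; verify that each Agent responds independently
-+
-+**Rollback**: at any phase, restoring the backed-up `openclaw.json` and restarting the gateway returns to single-bot mode.
-+
-+### 8.3 Feishu migration: single app -> multi-app + Topic isolation
-+
-+**Phase 0 -- Enable Topic isolation (independent of multi-app; can be done first)**
-+
-+1. Add `groupSessionScope: "group_topic"` to `openclaw.json`
-+2. Restart the gateway
-+3. Verify: create a topic in a group, post a message, and check that the session key contains the `:topic:` suffix
-+4. Note: main-timeline (non-topic) messages still use the group-level session key, so this is backward compatible
-+
-+**Phase 1 -- Create new Feishu apps**
-+
-+1. Create new apps on the Feishu Open Platform (one per Agent)
-+2. Configure the event subscription: `im.message.receive_v1`
-+3. Enable WebSocket connection mode
-+4. **Mind Bug #47436**: until PR #47652 is merged, avoid SecretRef and use a plaintext secret
-+
-+**Phase 2 -- Switch to the multi-account configuration**
-+
-+1. Keep the original app as the `legacy` account
-+2. Add new accounts one at a time and update the bindings
-+3. Add each new bot to its group (each group needs only its own bot)
-+4. Verify that every Agent responds normally in its dedicated group
-+
-+**Rollback**: restore the configuration and restart. The original bot stays in the groups, so switching back is possible at any time.
-+
-+### 8.4 Discord migration: single bot -> multi-bot + permission isolation
-+
-+**Phase 0 -- Fix Issue #34 (worth doing even with a single bot)**
-+
-+1. Create a bot-specific role (e.g. "OpenCrew Bot")
-+2. Server level: grant View Channels + Read Message History; do not grant Send Messages
-+3. Grant Send Messages channel by channel
-+4. Verify: the bot can send messages only in authorized channels
-+
-+**Phase 1 -- Create new Discord bots**
-+
-+1. Create a new Application in the Discord Developer Portal (one per Agent)
-+2. Enable the Message Content Intent
-+3. Generate a bot token and invite the bot to the server
-+4. Create a separate role for each bot and configure channel permissions
-+
-+**Phase 2 -- Switch to the multi-account configuration**
-+
-+1. Change `channels.discord` to the `accounts` form
-+2. Use the original token as the `default` account
-+3. Add new accounts and bindings one at a time
-+4. Verify that every Agent responds in the correct channel
-+
-+**Phase 3 -- Wait for Discussion mode to be unblocked**
-+
-+- Track the fix status of Issue #11199 (PR #11644, #22611, #35479)
-+- Once a fix is merged, configure `allowBots: true` + `requireMention: true` on the shared channel
-+- Test cross-bot visibility
-+
-+**Rollback**: restore the configuration and restart. Removing the new bots' channel permissions is enough.
-+
-+---
-+
-+## 9.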
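Architecture Decision Records
-+
-+### ADR-001: The two modes coexist rather than one replacing the other
-+
-+**Context**: A2A v1 uses the two-step `sessions_send` trigger. Slack multi-bot supports a one-step @mention trigger. A decision is needed on whether v2 replaces v1.
-+
-+**Decision**: The two modes coexist. Delegation (v1) is for structured task delegation; Discussion (v2) is for multi-party discussion.
-+
-+**Consequences**:
-+- (+) Backward compatible: single-bot users are unaffected and keep using Delegation
-+- (+) Progressive enhancement: multi-bot users unlock Discussion as an extra capability
-+- (+) Full platform coverage: Feishu can only use Delegation and is not left out
-+- (-) Cognitive load: Agents need to understand when each mode applies
-+- (-) More protocol complexity: SOUL.md/AGENTS.md need more rules
-+
-+**Grounding**: The Slack research confirms that Discussion mode needs only configuration changes (no code changes). The Feishu research confirms that a hard platform limit makes Discussion impossible. The Discord research confirms that Discussion is blocked by a bug but usable once fixed. These capability differences across the three platforms mean a single mode cannot cover every scenario.
-+
-+### ADR-002: The Orchestrator controls the pace of discussion instead of free-form discussion
-+
-+**Context**: In Discussion mode, an Agent can respond passively (wait for an @mention) or participate proactively (speak whenever it sees a relevant message). A decision is needed on the discussion style.
-+
-+**Decision**: Adopt Orchestrator control. Each time, a human or a designated Agent @mentions the next speaker. Agents respond only when @mentioned.
-+
-+**Consequences**:
-+- (+) Prevents runaway Agent discussion loops
-+- (+) A human can step in at any time to control the pace
-+- (+) `requireMention: true` enforces this at the system level
-+- (+) The turn count is controllable (`maxDiscussionTurns`)
-+- (-) A fully autonomous Agent round table is not possible
-+- (-) The Orchestrator becomes a bottleneck (every round waits for it to decide the next step)
-+
-+**Grounding**: The Slack research notes that Agent-orchestrated turn management requires validating mention-parsing reliability (Open Question #2). Human control is the safest starting mode (Phase 1). `maxPingPongTurns` has already shown the value of turn limits in preventing loops.
-+
-+### ADR-003: Discussion happens in shared channels, not DMs
-+
-+**Context**: A multi-Agent discussion can take place in a shared channel thread (such as #collab) or over DMs. A decision is needed on where discussion happens.
-+
-+**Decision**: Discussion must take place in a thread of a shared channel. DM discussion between Agents is not allowed.
-+
-+**Consequences**:
-+- (+) User visibility: every discussion is transparent to the user
-+- (+) Audit-friendly: the thread is the audit log
-+- (+) Consistent with SYSTEM_RULES: "visible, traceable, no cross-context leakage"
-+- (-) Extra shared channels (such as #collab) must be created
-+- (-) Every bot must join the shared channel, which adds configuration work
-+
-+**Grounding**: SYSTEM_RULES.md requires "evolution through structured artifacts rather than masses of conversation". A2A_PROTOCOL v1 requires that "the user must be able to see it in the channel". Discussion in a shared channel meets these requirements by nature.
-+
-+### ADR-004: `requireMention: true` as the safety valve for Discussion
-+
-+**Context**: With several bots in a shared channel, `allowBots: true` means every bot can see the other bots' messages. Without gating, all bots might respond to the same message at once.
-+
-+**Decision**: Shared channels must be configured with `requireMention: true`. An Agent responds only to messages that @mention it.
-+
-+**Consequences**:
-+- (+) System-level loop protection (does not rely on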
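Agent self-discipline)
-+- (+) The user/Orchestrator controls exactly which Agent takes part
-+- (+) The system stays safe even if an Agent's SOUL.md rules are ignored
-+- (-) The advanced mode of "the Agent decides for itself whether to participate" is not possible
-+- (-) Discord Issue #45300 reports that `requireMention` may fail under multi-account
-+
-+**Grounding**: The Slack research confirms that `allowBots: true` + `requireMention: true` is the officially recommended safe combination. The OpenClaw documentation explicitly recommends this combination for multi-Agent scenarios.
-+
-+### ADR-005: Feishu uses `groupSessionScope: "group_topic"` for session isolation
-+
-+**Context**: A Feishu group shares a single session by default (a P0 issue). A decision is needed on the isolation approach.
-+
-+**Decision**: Use `groupSessionScope: "group_topic"`, so that each topic thread gets its own session key.
-+
-+**Consequences**:
-+- (+) Directly solves the P0 session-sharing issue
-+- (+) Backward compatible: non-topic messages still use the group-level session
-+- (+) Aligns with the Slack "thread = task = session" model
-+- (+) Pure configuration change, no code changes
-+- (-) The session key format changes: `sessions_send` must include the topic suffix
-+- (-) Requires OpenClaw >= 2026.2 (PR #29791)
-+
-+**Grounding**: The Feishu research confirms that the `buildFeishuConversationId` function produces session keys of the form `chatId:topic:topicId` in `group_topic` mode. PR #29791 has been merged and the feature is available.
-+
-+### ADR-006: Discord channel permission isolation is mandatory
-+
-+**Context**: Issue #34 exposed that missing channel permission isolation in single-bot mode caused conversations to get mixed up.
-+
-+**Decision**: A Discord deployment must configure channel-level Send Messages permission isolation, whether it runs a single bot or multiple bots.
-+
-+**Consequences**:
-+- (+) Fixes Issue #34 at the root
-+- (+) In multi-bot mode each bot is isolated by nature (it has send permission only in its own channel)
-+- (+) Even if OpenClaw bindings hit an edge case, the Discord permission layer acts as a backstop
-+- (-) More configuration steps
-+- (-) The bot role cannot have the Administrator permission (otherwise overrides are void)
-+
-+**Grounding**: The Discord research confirms that the root cause of Issue #34 is "single-bot + missing channel permission overrides". The reporter confirmed the solution themselves.
-+
-+---
-+
-+## Appendix A: Phased rollout roadmap for Discussion mode
-+
-+```
-+Phase 1 -- Human Orchestrated (NOW, Slack only)
-+├── A human @mentions Agents in a #collab thread
-+├── After an Agent responds, the human decides the next step
-+├── Safest option, zero protocol risk
-+└── Checkpoints: each Agent responds independently; thread history loads correctly
-+
-+Phase 2 -- Agent Orchestrated (NEAR, Slack only)
-+├── CTO/CoS is authorized as Orchestrator in SOUL.md
-+├── The Orchestrator Agent may @mention other Agents
-+├── maxDiscussionTurns = 8 as the safety valve
-+└── Checkpoint: the Orchestrator's mention is correctly recognized by the target Agent
-+
-+Phase 3 -- Cross-Platform (FUTURE, Slack + Discord)
-+├── Enabled once Discord Issue #11199 is fixed
-+├── Unified Discussion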
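protocol across Slack and Discord
-+└── Feishu stays Delegation-only (platform limitation)
-+
-+Phase 4 -- Proactive Mode (EXPLORATION)
-+├── Agents judge relevance and participate without needing an @mention
-+├── Requires upgrading allowBots: "mentions" → allowBots: true
-+├── Requires relevance filtering on the Agent side
-+└── Reference: SlackAgents (EMNLP 2025) proactive mode
-+```
-+
-+## Appendix B: Comparison with the Anthropic Harness pattern
-+
-+| Dimension | Harness (file-based collaboration) | OpenCrew Delegation (message-based collaboration) | OpenCrew Discussion (discussion-based collaboration) |
-+|------|--------------------|-----------------------------|-------------------------------|
-+| Communication medium | Files on disk | `sessions_send` internal messages + thread anchor | Thread messages (directly in the chat UI) |
-+| Persistence | Trackable in Git | Session memory + thread log | Thread log |
-+| Structure | High (sprint contract, spec files) | High (task packet template, closeout template) | Medium (natural language + format conventions) |
-+| Latency | ~0 (local file system) | ~1-3s (internal RPC + platform API) | ~1-3s (platform API) |
-+| Human visibility | Files must be checked actively | Threads are visible but span multiple channels | **Visible by nature** (the discussion happens in the UI) |
-+| Context window | Full file contents | Session history | Thread history (`initialHistoryLimit`) |
-+| Turn management | Controlled by Harness code | `maxPingPongTurns` | Orchestrator + `maxDiscussionTurns` |
-+| Adversarial review | Generator vs Evaluator | Not built in (reviewed manually by the upstream) | **Supported by nature** (multiple Agents debate in the same thread) |
-+
-+**Conclusion**: Delegation suits execution tasks (Builder writing code); Discussion suits decision tasks (architecture review, proposal alignment). Both complement the Harness pattern rather than compete with it -- Harness suits fully automated CI pipelines, while OpenCrew suits organizational collaboration that needs human participation and visibility.
-+
-+## Appendix C: Configuration Quick Reference
-+
-+### Minimal configuration (single bot, Delegation only)
-+
-+```json
-+{
-+  "channels": {
-+    "slack": { "botToken": "xoxb-...", "appToken": "xapp-..."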
}
-+  }
-+}
-+```
-+
-+### Recommended configuration (multi-bot, Delegation + Discussion)
-+
-+See the per-platform config snippets in Section 7.
-+
-+### Key configuration parameters at a glance
-+
-+| Parameter | Value | Purpose | Platforms |
-+|------|-----|------|---------|
-+| `allowBots` | `false` / `true` / `"mentions"` | Controls whether bot messages are processed | Slack, Discord |
-+| `requireMention` | `true` / `false` | Requires an @mention to trigger the Agent | Slack, Discord |
-+| `thread.historyScope` | `"thread"` | Thread-level history isolation | Slack |
-+| `thread.inheritParent` | `true` / `false` | Whether a thread inherits the root message context | Slack |
-+| `thread.initialHistoryLimit` | number | Number of history messages the Agent loads | Slack |
-+| `groupSessionScope` | `"group"` / `"group_topic"` / `"group_sender"` / `"group_topic_sender"` | Granularity of group session isolation | Feishu |
-+| `maxPingPongTurns` | number | Maximum number of Delegation A2A round trips | All platforms |
-+| `maxDiscussionTurns` | 5 (Level 1) / 8 (Level 2) (protocol convention, not a system parameter) | Maximum number of Agent responses in a Discussion | Slack (Discord future) |
diff --git a/.harness/reports/qa_a2a_research_r1.md b/.harness/reports/qa_a2a_research_r1.md
deleted file mode 100644
index f0d1dc8..0000000
--- a/.harness/reports/qa_a2a_research_r1.md
+++ /dev/null
@@ -1,237 +0,0 @@
-# QA Report: A2A Research Verification
-
-**QA Agent**: Claude Opus 4.6 (1M context)
-**Date**: 2026-03-27
-**Scope**: Verify 6 critical claims from research reports against official docs and source evidence
-**Method**: Web search, WebFetch of official docs, GitHub CLI issue/PR inspection
-
-## Overall: NEEDS-WORK
-
-Four of six claims verified or partially verified. Two claims have significant accuracy issues that could mislead implementation decisions. The Slack self-loop filter claim (Claim 1) has an important nuance the reports gloss over, and the Discord #11199 status (Claim 3) is stale -- the issue was auto-closed, not fixed.
-
----
-
-## Claim 1: Slack multi-account enables true cross-bot communication
-
-**Report says**: `allowBots: true` + `requireMention: true` + multi-account = Bot-A's messages visible to Bot-B. Self-loop filter is per-bot-user-ID, so multi-account naturally bypasses it.
-
-### Verified: PARTIALLY
-
-### Evidence
-
-**What IS confirmed (HIGH confidence)**:
-
-1. **Slack Events API delivers cross-bot messages**: Confirmed via [Slack message.channels event docs](https://docs.slack.dev/reference/events/message.channels/) and [Slack Events API docs](https://docs.slack.dev/apis/events-api/). All apps subscribed to `message.channels` receive events for all messages in channels they've joined, including messages from other bots.
-
-2. **OpenClaw multi-account Slack support exists**: Confirmed via [OpenClaw Slack docs](https://docs.openclaw.ai/channels/slack) and [community gist](https://gist.github.com/rafaelquintanilha/9ca5ae6173cd0682026754cfefe26d3f). The `channels.slack.accounts` configuration with per-account `botToken`/`appToken` is documented.
-
-3. **`allowBots` exists as a per-channel config**: Confirmed in official Slack docs as a per-channel control under `channels.slack.channels.<id>`.
-
-4. **`requireMention: true` exists**: Confirmed. Official docs say "Channel messages are mention-gated by default."
-
-5. **`initialHistoryLimit` exists and defaults to 20**: Confirmed. Official docs: "controls how many existing thread messages are fetched when a new thread session starts (default 20; set 0 to disable)."
-
-6. **`thread.historyScope` defaults to `"thread"`**: Confirmed in official docs.
-
-7. **`thread.inheritParent` defaults to `false`**: Confirmed in official docs.
-
-**What is UNCERTAIN (MEDIUM confidence)**:
-
-8. **Self-loop filter is per-bot-user-ID on Slack**: The report states this with HIGH confidence, but the evidence is indirect. Issue [#15836](https://github.com/openclaw/openclaw/issues/15836) shows the Slack filter code as `if (message.user === botUserId)`, which IS per-bot-user-ID. However, this issue was about single-bot mode. In multi-account mode, each account has its own `botUserId` fetched via `client.fetchUser("@me")`, so the filter SHOULD be per-account.
But there is NO confirmed end-to-end test of multi-account Slack bot-to-bot communication in the evidence. The gist author's "verification checklist" describes human-mediated `@mention` workflows, not autonomous bot-to-bot. A user in the gist comments reported duplicate reply issues that required "deleted all slack apps and reset configs" to resolve.
-
-9. **`allowBots: "mentions"` as a Slack value**: The report (Slack research, Section 1.2) lists three modes for `allowBots`: `false`, `true`, and `"mentions"`. The `"mentions"` value is confirmed for **Discord** via [OpenClaw configuration reference](https://github.com/openclaw/openclaw/blob/main/docs/gateway/configuration-reference.md): "use `allowBots: "mentions"` to only accept bot messages that mention the bot." However, the Slack docs do NOT document `allowBots: "mentions"` -- only `allowBots` appears as a listed per-channel control without value specification. The report itself rates this as MEDIUM confidence and notes it was found via "DeepWiki source-level documentation" -- but it may only be a Discord feature.
-
-**What is UNVERIFIED**:
-
-10. **End-to-end Slack multi-account cross-bot messaging**: No first-party evidence of a successful test where Bot-A posts a message, Bot-B's OpenClaw handler receives it via `allowBots: true`, and Bot-B responds. The community gist describes the setup pattern but does not demonstrate confirmed autonomous bot-to-bot message delivery. Issue #15836 (Slack agent-to-agent routing) was closed as NOT_PLANNED, and the two fix PRs (#15863, #15946) were CLOSED without merging.
-
-### Issues
-
-- **RISK**: The report presents Slack multi-account cross-bot communication as "technically feasible today" and "NOW" achievable, but no one has demonstrated it working end-to-end. The architecture report builds five collaboration patterns on this assumption.
-- **INACCURACY**: `allowBots: "mentions"` may not exist for Slack.
The report should use `allowBots: true` + `requireMention: true` as the recommended Slack config (which is what the config snippets actually show).
-- **MISSING CAVEAT**: Issue #15836 was closed NOT_PLANNED, suggesting the OpenClaw maintainers may consider `sessions_send` the canonical A2A mechanism, with channel messages reserved for human-agent interaction only.
-
-### Implementation Recommendation
-
-Before implementing multi-account Discussion mode, the team MUST run a proof-of-concept test:
-1. Create 2 Slack apps with separate tokens
-2. Configure OpenClaw multi-account with `allowBots: true` + `requireMention: true`
-3. Have Bot-A post a message @mentioning Bot-B in a shared channel
-4. Verify Bot-B's OpenClaw session receives and responds to the message
-
----
-
-## Claim 2: Feishu bot message invisibility
-
-**Report says**: `im.message.receive_v1` only fires for user messages. Bot messages are invisible to other bots. This is a Feishu platform limitation.
-
-### Verified: YES
-
-### Evidence
-
-**Feishu official documentation** at [open.feishu.cn/document/server-docs/im-v1/message/events/receive](https://open.feishu.cn/document/server-docs/im-v1/message/events/receive) explicitly states:
-
-1. `sender_type` field: "目前只支持用户(user)发送的消息" -- **"Currently only supports messages sent by users"**
-
-2. Group chat behavior: "可接收与机器人所在群聊会话中用户发送的所有消息(不包含机器人发送的消息)" -- **"Can receive all messages sent by users in group chats where the bot participates, excluding messages sent by the bot"**
-
-This confirms the report's claim with direct official documentation. The limitation is at the Feishu platform level and cannot be worked around by any OpenClaw configuration.
-
-### Issues
-
-None. The report accurately characterizes this limitation and correctly concludes that `sessions_send` remains necessary for Feishu cross-agent triggering.
-
----
-
-## Claim 3: Discord OpenClaw Issue #11199 blocks cross-bot messaging
-
-**Report says**: OpenClaw's bot filter treats ALL configured bots as "self." Related fix PRs: #11644, #22611, #35479.
-
-### Verified: PARTIALLY -- Status is STALE, not actively blocked
-
-### Evidence
-
-**Issue #11199 confirmed**: The [issue](https://github.com/openclaw/openclaw/issues/11199) exists and accurately describes the problem. The bug report includes detailed reproduction steps and code analysis showing the mention detection failure.
-
-**However, the report's characterization is incomplete**:
-
-1. **Issue was auto-closed on 2026-03-08** due to inactivity (stale bot), NOT because it was fixed. The closure message: "Closing due to inactivity. If this is still an issue, please retry on the latest OpenClaw release."
-
-2. **All three fix PRs were CLOSED without merging**:
-   - PR #11644 ("fix: bypass bot filter and mention gate for sibling Discord bots") -- CLOSED, not merged
-   - PR #22611 ("fix(discord): allow messages from other instance bots in multi-account setups") -- CLOSED, not merged
-   - PR #35479 ("fix(discord): add allowBotIds config to selectively allow bot messages") -- CLOSED, not merged
-
-3. **A community workaround exists**: A [comment by @garibong-labs](https://github.com/openclaw/openclaw/issues/11199#issuecomment-3904716720) provides a working config using `allowBots: true` + `requireMention: false` + per-channel `users` whitelist. However, this requires disabling mention gating entirely.
-
-4. **Additional blocker**: Issue [#45300](https://github.com/openclaw/openclaw/issues/45300) -- `requireMention: true` is broken in multi-account Discord config (still OPEN). This means even if #11199 were fixed, mention-gated bot-to-bot communication still would not work.
-
-### Issues
-
-- **STALE DATA**: The report says "status of these PRs could not be confirmed." I can confirm: all three PRs are CLOSED without merge.
The issue itself is closed-as-stale, not closed-as-fixed.
-- **WORKAROUND NOT MENTIONED**: The `requireMention: false` + `users` whitelist workaround exists and is confirmed working by multiple users, but the report does not mention it.
-- **DOUBLE BLOCKER**: Even if #11199 is reopened and fixed, #45300 (`requireMention` broken in multi-account) is an independent blocker for the recommended `allowBots: true` + `requireMention: true` pattern.
-
-### Implementation Recommendation
-
-Discord Discussion mode should be classified as **BLOCKED (two independent issues)** rather than "BLOCKED (one issue)." The architecture report should document the `requireMention: false` workaround as an interim option with appropriate warnings about loop risk.
-
----
-
-## Claim 4: groupSessionScope: "group_topic" creates per-topic sessions
-
-### Verified: YES (previously verified in qa_docs_official_r1.md)
-
-The previous QA round confirmed this against OpenClaw v2026.3.1 release notes. The version requirement was corrected from "2026.2" to "2026.3.1."
-
-No contradicting information found in this verification round.
-
----
-
-## Claim 5: A2A v2 Protocol design is backward compatible
-
-**Report says**: Delegation (v1) + Discussion (v2) coexist. Single-bot users are unaffected. The `allowBots` + `requireMention` config on shared channels does not conflict with existing single-bot config.
-
-### Verified: YES
-
-### Evidence
-
-1. **Current A2A_PROTOCOL.md** (`shared/A2A_PROTOCOL.md`) uses only Delegation mode: anchor message + `sessions_send`. It explicitly notes: "Slack 中所有 Agent 共用同一个 bot 身份" ("all Agents in Slack share the same bot identity") and "bot 自己发到别的频道的消息,默认不会触发对方 Agent 自动运行" ("by default, messages the bot itself posts to another channel do not trigger the other Agent to run automatically").
-
-2. **v2 protocol design** adds Discussion mode as a NEW capability alongside Delegation.
Key design decisions that preserve compatibility:
-   - Agent-dedicated channels (e.g., #hq, #cto, #build) keep `allowBots: false` and `requireMention: false` -- identical to current behavior
-   - Only the NEW `#collab` channel uses `allowBots: true` + `requireMention: true`
-   - All Delegation workflows (two-step anchor + `sessions_send`) remain unchanged
-   - Multi-account is additive: the `default` account can map to the existing single bot
-
-3. **No config conflicts**: The v2 config snippets show dedicated channels retaining their current settings. The `accounts` block is an extension of the existing `channels.slack` structure, not a replacement.
-
-4. **Permission matrix unchanged**: v2 adds "Discussion 中的 @mention 也必须遵守权限矩阵" ("@mentions in a Discussion must also follow the permission matrix") as a supplement, not a modification.
-
-### Issues
-
-- **Minor**: The v2 protocol references `maxDiscussionTurns` as an AGENTS.md-level instruction, not a system config key. This should be clearly documented as a convention, not a config parameter, to avoid confusion.
-- **Minor**: The v2 `thread.inheritParent: true` recommendation for shared channels differs from the current default of `false`. Implementers should be warned this changes behavior for ALL threads in configured channels, not just Discussion threads.
-
----
-
-## Claim 6: Collaboration patterns are mechanically feasible
-
-**Report says**: 5 patterns (Architecture Review, Strategic Alignment, Code Review, Incident Response, Knowledge Synthesis) work via @mention -> thread history loading -> response.
-
-### Verified: PARTIALLY
-
-### What is mechanically sound
-
-1. **@mention triggers agent response**: With `requireMention: true`, a bot only processes messages where it is @mentioned. This is confirmed behavior in OpenClaw for both Slack and Discord.
-
-2. **Thread history loading via `initialHistoryLimit`**: Confirmed. When an agent starts a new session in a thread, it loads the last N messages (default 20, configurable).
This is the mechanism by which Agent-B would "see" Agent-A's earlier messages. - -3. **`initialHistoryLimit` exists and is configurable**: Confirmed at `channels.slack.thread.initialHistoryLimit`. The architecture report's recommendations of 50-100 depending on pattern are reasonable. - -4. **Visual identity**: With multi-account, each bot app has its own profile (name, avatar). Issue [#27080](https://github.com/openclaw/openclaw/issues/27080) (Slack agent identity fix) is CLOSED (resolved on 2026-03-01). - -5. **Turn management via @mention**: The Level 1 (human-orchestrated) pattern is mechanically straightforward -- human @mentions agents in sequence, each agent loads thread history and responds. - -### What is uncertain - -6. **Agent generating @mentions**: Level 2 (agent-orchestrated) requires the orchestrator agent to produce `<@BOT_USER_ID>` in its messages. The report acknowledges this as an open question (#2 in Slack Open Questions): "When Agent-CTO posts '@Builder what do you think?', does OpenClaw's Slack plugin reliably detect this as a mention?" This is unverified. - -7. **Bot-to-bot message delivery** (the foundation of all patterns): As noted in Claim 1, there is no confirmed end-to-end test of multi-account Slack bot-to-bot message delivery via `allowBots: true`. All five patterns depend on this working. - -8. **Thread history completeness**: With `initialHistoryLimit: 50`, an agent joining a long discussion late will only see the last 50 messages. The architecture report addresses this with "context anchor" and "orchestrator summary" strategies, which is sound design but adds implementation complexity. - -### Issues - -- **CRITICAL DEPENDENCY**: All 5 patterns depend on Claim 1 (Slack multi-account cross-bot messaging) being true. If bot-to-bot message delivery fails in practice, all Discussion-mode patterns are blocked. -- **Level 2 feasibility uncertain**: Agent-generated @mentions have not been validated. 
If agents produce `@CTO` as plain text rather than `<@U123BOT>` in Slack's mention format, the mention will not be detected. -- **Pattern descriptions are thorough and well-designed**: The step-by-step mechanics, guard rails (maxDiscussionTurns, context anchors, escalation), and degradation strategies (Feishu/Discord fallback to Delegation chains) are architecturally sound regardless of implementation verification. - ---- - -## Implementation Recommendations - -### Must-Do Before Implementation - -1. **RUN A PROOF OF CONCEPT**: Set up 2 Slack apps, configure multi-account OpenClaw with `allowBots: true` + `requireMention: true`, and verify bot-to-bot message delivery end-to-end. This is the single most important validation step. Everything else depends on this. - -2. **Verify `allowBots: "mentions"` for Slack**: The config reference confirms this value exists for Discord. Confirm whether it also works for Slack, or if the Slack implementation only accepts `true`/`false`. If Slack does not support `"mentions"`, remove all references to it from Slack-specific documentation. - -3. **Test agent-generated @mentions**: Have an agent produce a message containing `<@BOT_USER_ID>` and verify the receiving bot's OpenClaw instance recognizes it as a mention. - -### Implementation Cautions - -4. **Discord has TWO blockers, not one**: Issue #11199 (closed-stale, unfixed) AND Issue #45300 (`requireMention` broken in multi-account). Both must be resolved for Discussion mode on Discord. - -5. **Feishu SecretRef crash (Issue #47436)**: Still OPEN. PR #47652 (fix) is OPEN but not merged. Use plaintext secrets for multi-account Feishu until this is resolved. - -6. **`thread.inheritParent: true`**: The v2 config changes this from the default `false`. This affects all threads in configured channels, not just Discussion threads. Test for regressions in existing Delegation workflows. - -7. 
**Issue #15836 closure**: The OpenClaw maintainers closed the Slack agent-to-agent routing issue as NOT_PLANNED, which may signal that channel-based A2A is not an officially supported pattern. The `sessions_send` approach should remain the primary A2A mechanism, with Discussion mode as an enhancement for Slack-capable deployments. - ---- - -## Blocking Issues - -1. **NO CONFIRMED END-TO-END TEST** of Slack multi-account bot-to-bot communication via `allowBots: true`. The entire Discussion mode architecture rests on this assumption. A proof-of-concept MUST succeed before any implementation work begins. - -2. **Discord Discussion mode has two independent blockers** (Issues #11199 and #45300), both unresolved. Implementation for Discord should be deferred. - ---- - -*Report generated: 2026-03-27* -*Verification method: WebSearch, WebFetch of official documentation (docs.openclaw.ai, open.feishu.cn, docs.slack.dev), GitHub CLI (gh issue view, gh pr view), OpenCrew repository inspection* - -Sources: -- [OpenClaw Slack Documentation](https://docs.openclaw.ai/channels/slack) -- [OpenClaw Multi-Agent Routing](https://docs.openclaw.ai/concepts/multi-agent) -- [OpenClaw Configuration Reference](https://github.com/openclaw/openclaw/blob/main/docs/gateway/configuration-reference.md) -- [Feishu Receive Message Event](https://open.feishu.cn/document/server-docs/im-v1/message/events/receive) -- [Slack Events API](https://docs.slack.dev/apis/events-api/) -- [Slack message.channels Event](https://docs.slack.dev/reference/events/message.channels/) -- [Issue #11199: Discord bot-to-bot filter](https://github.com/openclaw/openclaw/issues/11199) -- [Issue #15836: Slack agent-to-agent routing](https://github.com/openclaw/openclaw/issues/15836) -- [Issue #45300: requireMention broken in multi-account Discord](https://github.com/openclaw/openclaw/issues/45300) -- [Issue #27080: Slack agent identity fix](https://github.com/openclaw/openclaw/issues/27080) -- [Issue #47436: Feishu SecretRef 
crash](https://github.com/openclaw/openclaw/issues/47436) -- [Community Gist: Multi-Agent Slack Setup](https://gist.github.com/rafaelquintanilha/9ca5ae6173cd0682026754cfefe26d3f) diff --git a/.harness/reports/qa_docs_official_r1.md b/.harness/reports/qa_docs_official_r1.md deleted file mode 100644 index 8dc2f54..0000000 --- a/.harness/reports/qa_docs_official_r1.md +++ /dev/null @@ -1,217 +0,0 @@ -# QA Report: Documentation Review Against Official Sources - -## Overall Verdict: NEEDS-WORK - -Three factual inaccuracies were found that require correction. The documentation is generally well-written and clear, but the version claims and one PR reference are verifiably wrong. - ---- - -## 1. Discord Permission Isolation - -### Accuracy vs official docs: PASS (with minor note) - -The permission setup flow described in Step 5b (Create role -> server-level deny Send Messages -> per-channel Allow) is **correct** and matches Discord's documented permission hierarchy. - -According to Discord's official developer documentation at [docs.discord.com/developers/topics/permissions](https://docs.discord.com/developers/topics/permissions): - -1. Server-level role permissions provide the base. -2. Channel-level overrides (green check / red X / gray slash) override the server-level defaults. -3. If a role lacks "Send Messages" at the server level (not granted), and a channel override sets it to Allow (green checkmark), the bot **will** be able to send in that specific channel. 
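The three-step hierarchy above can be illustrated with a minimal sketch of how a channel override layers on top of server-level role permissions. It assumes a bot with one dedicated role and ignores member-specific overwrites; the bit values follow Discord's published permission flags, while `Overwrite` and `effective_permissions` are hypothetical names used only for illustration.

```python
from dataclasses import dataclass

# Permission bit flags as published in Discord's developer documentation.
ADMINISTRATOR = 1 << 3
VIEW_CHANNEL = 1 << 10
SEND_MESSAGES = 1 << 11

# Sentinel with every bit set, used when ADMINISTRATOR short-circuits the check.
ALL_PERMISSIONS = ~0


@dataclass(frozen=True)
class Overwrite:
    """A channel-level override: bits explicitly allowed or denied."""
    allow: int = 0
    deny: int = 0


def effective_permissions(everyone_base: int, role_base: int,
                          everyone_ow: Overwrite, role_ow: Overwrite) -> int:
    """Resolve permissions for a bot that holds exactly one extra role.

    Simplified: member-specific overwrites are not modelled here.
    """
    base = everyone_base | role_base
    if base & ADMINISTRATOR:
        # Administrator bypasses every channel override.
        return ALL_PERMISSIONS
    perms = base
    # @everyone overwrite first: deny, then allow.
    perms &= ~everyone_ow.deny
    perms |= everyone_ow.allow
    # Role overwrite second: deny, then allow.
    perms &= ~role_ow.deny
    perms |= role_ow.allow
    return perms


def can_send(perms: int) -> bool:
    return bool(perms & SEND_MESSAGES)
```

With a role that has no server-level "Send Messages", only channels carrying an explicit Allow override resolve to a sendable state, which is the isolation Step 5b relies on.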
- -### Permission names correct: PASS - -All permission names used in the docs match Discord's official names: -- "Send Messages" -- correct (API: `SEND_MESSAGES`) -- "View Channels" -- correct (API: `VIEW_CHANNEL`, UI displays as "View Channels") -- "Read Message History" -- correct (API: `READ_MESSAGE_HISTORY`) -- "Send Messages in Threads" -- correct (API: `SEND_MESSAGES_IN_THREADS`) -- "Create Public Threads" -- correct (API: `CREATE_PUBLIC_THREADS`) -- "Create Private Threads" -- correct (API: `CREATE_PRIVATE_THREADS`) -- "Manage Threads" -- correct (API: `MANAGE_THREADS`) -- "Add Reactions" -- correct (API: `ADD_REACTIONS`) -- "Mention Everyone" -- correct (API: `MENTION_EVERYONE`) -- "Manage Channels" -- correct (API: `MANAGE_CHANNELS`) - -### Permission hierarchy correct: PASS - -The statement "Do NOT grant Administrator -- Administrator bypasses all channel overrides" is **factually correct**. Discord's official dev docs state: "ADMINISTRATOR overrides any potential permission overwrites, so there is nothing to do here." The Administrator permission short-circuits all channel override calculations. - -### Role-per-bot approach: PASS - -The multi-bot setup suggesting a separate role per bot with per-channel Send Messages overrides is a valid and recommended pattern. - -### Minor note on "View Channel" vs "View Channels" - -Line 128 of the EN Discord doc uses "View Channel" (singular) while line 60 and 143 use "View Channels" (plural). The API flag is `VIEW_CHANNEL` (singular), but the Discord UI shows "View Channels" (plural). Both are understood, but consistency within the doc would be better. - -### 50-bot limit: PASS - -The claim "Discord servers allow up to 50 bots" is correct. Discord imposes a 50-bot limit on servers. - -### 100-server vs 75-server threshold: MINOR INACCURACY - -The EN doc (line 45) states "Bots in fewer than 100 servers do not need review." 
The official threshold is actually 75 servers -- the application form for Privileged Intents appears once a bot reaches 75 servers. The actual enforcement kicks in at 100 servers. The CN doc says the same ("小于 100 个服务器的 bot 无需审核"). This is a simplification that could mislead users with bots approaching 75 servers. - -The Multi-Bot section (line 209 EN) correctly says "Bots in more than 75 servers require a separate Message Content Intent approval" -- this contradicts the earlier 100-server claim at line 45. - ---- - -## 2. Feishu groupSessionScope - -### Config key verified: PASS - -`groupSessionScope` is a valid configuration key. The OpenClaw v2026.3.1 release notes confirm: "add configurable group session scopes (`group`, `group_sender`, `group_topic`, `group_topic_sender`)". The value `"group_topic"` is confirmed as one of the four valid options. - -### Version requirement verified: FAIL -- INCORRECT VERSION - -**Both EN and CN docs claim `groupSessionScope` requires "OpenClaw >= 2026.2". This is wrong.** - -Evidence: -- OpenClaw v2026.3.1 release notes (https://github.com/openclaw/openclaw/releases/tag/v2026.3.1) explicitly list "Feishu/Group session routing: add configurable group session scopes" as a new feature. -- The feature is also referenced in Issue #29791 (opened Feb 28, 2026, closed via PR #29788, merged March 2, 2026) -- well after the 2026.2 release. -- The official Feishu channel docs at docs.openclaw.ai/channels/feishu do NOT mention `groupSessionScope` by name, suggesting it may be documented under a different naming convention or is very recent. 
- -**Correct version**: OpenClaw >= 2026.3.1 - -This error appears in 4 locations: -- `docs/en/FEISHU_SETUP.md` line 25: heading says "OpenClaw >= 2026.2" -- `docs/FEISHU_SETUP.md` line 25: heading says "OpenClaw >= 2026.2" -- `docs/en/KNOWN_ISSUES.md` line 74 and 82: says "OpenClaw >= 2026.2" -- `docs/KNOWN_ISSUES.md` line 67 and 75: says "OpenClaw >= 2026.2" - -### YAML config example syntax: PASS - -The YAML example is syntactically correct. The structure with `channels.feishu.groupSessionScope: "group_topic"` alongside `domain`, `connectionMode`, `appId`, `appSecret` is properly formatted. - -### SecretRef bug (Issue #47436): PASS - -- The issue exists and is OPEN: "[Bug] Feishu multi-account (accounts.*) appSecret SecretRef fails to resolve, crashes feishu plugin after ~3 minutes" -- It confirms that SecretRef in multi-account Feishu mode causes the plugin to crash after ~3 minutes, taking down the primary bot as well. -- The workaround described in the docs ("use plaintext secrets" or "restart the gateway twice") is a reasonable approximation, though the issue itself does not explicitly state "restart twice" as a workaround -- a fix PR (#47652) has been submitted with per-account error isolation. -- The docs' description of the bug is substantively accurate. - -### Issue #10242 reference: PARTIALLY INACCURATE - -The docs reference Issue #10242 as evidence of the group chat thread isolation limitation. However, Issue #10242 is actually titled "[Feature Request] Restore 'New Thread' capability for Feishu (Lark) Channel in **DMs**" -- it is about DM thread capability, not group chat session isolation. While the broader point about Feishu lacking thread isolation is valid, this specific issue is not the right citation. Issue #29791 ("[Feature]: Support thread-based replies in Feishu plugin") would be a more accurate reference for the group chat topic. - ---- - -## 3. 
Deployment Order - -### Logical correctness: PASS - -The 9-step deployment order is logically sound: -1. Create bot/app on platform -2. Configure permissions/intents -3. Invite bot to server/workspace/groups -4. Create agent channels/groups -5. Connect platform to OpenClaw (`openclaw channels add`) -6. Deploy OpenCrew files (shared protocols + workspaces) -7. Write OpenClaw config (agent bindings, channel IDs) -8. Restart gateway -9. Verify - -This order correctly places platform-side setup (steps 1-4) before OpenClaw-side configuration (steps 5-8), which is necessary because `openclaw channels add` requires bot tokens that only exist after step 1. - -### Matches OpenClaw workflow: PASS - -You cannot run `openclaw channels add` without a bot token (Discord) or app credentials (Feishu), so the platform setup must come first. The deployment order correctly captures this dependency. - -### Common Mistakes section: PASS - -All 5 common mistakes are realistic and would genuinely be encountered by new users: -1. Wrong deployment order -- logical error new users would make -2. Skipping channel permission isolation -- addresses Issue #34 -3. Forgetting to restart gateway -- a standard "gotcha" -4. Channel ID mismatch -- common copy-paste error -5. Bot not invited to channels -- frequently missed step - -### PR #3672 reference: FAIL -- PR WAS NOT MERGED - -Both EN and CN Discord docs (line 210/211) state: "OpenClaw multi-account support was introduced in [PR #3672](https://github.com/openclaw/openclaw/pull/3672) (merged January 2026)." - -**This is factually incorrect.** PR #3672 was **CLOSED without merging** on 2026-02-01 (`mergedAt: null`, `mergeCommit: null`, `state: CLOSED`). The PR itself references `moltbot/moltbot` in its "Fixes" line, suggesting it predates a repo rename. Multi-account Discord support does exist in current OpenClaw versions (as evidenced by multiple issues referencing it), but it was NOT delivered via PR #3672. 
The correct PR that shipped this feature needs to be identified, or the reference should be removed. - ---- - -## 4. Known Issues Update - -### Factual accuracy: PASS (with version caveat) - -The Feishu P1 entry accurately describes: -- The symptom: "all conversations within a single group are flat -- there is no thread-level session isolation" -- The resolution: `groupSessionScope: "group_topic"` enabling per-topic session isolation -- The distinction between built-in and community plugins - -However, the version requirement "OpenClaw >= 2026.2" is incorrect (should be >= 2026.3.1), as noted in Section 2. - -### Link integrity: PASS (with style note) - -- CN KNOWN_ISSUES.md links to `FEISHU_SETUP.md#更新groupsessionscopeopenclaw--20262` -- this anchor matches the CN heading and will resolve correctly on GitHub. -- EN KNOWN_ISSUES.md links to `../en/FEISHU_SETUP.md#update-groupsessionscope-openclaw--20262` -- this anchor matches the EN heading and will resolve correctly. However, since both files are in `docs/en/`, the `../en/` prefix is unnecessarily verbose. `FEISHU_SETUP.md#update-groupsessionscope-openclaw--20262` would be cleaner and less fragile. - ---- - -## 5. Cross-file Consistency - -### EN/CN match: PASS - -- Discord permission names are consistent across EN and CN. The EN version uses the English Discord permission names ("Send Messages", "View Channels", etc.), while the CN version uses the same English names in the tables (appropriate since Discord's UI is in English) with Chinese descriptions. -- Feishu config keys (`groupSessionScope`, `appId`, `appSecret`, `connectionMode`, `accounts`) are identical in both EN and CN versions. -- The YAML config examples are structurally identical between EN and CN, differing only in placeholder values for appSecret (EN: "your_app_secret", CN: "你的AppSecret"). - -### Org name consistency: PASS - -No references to the wrong org name `open-claw/open-claw` were found anywhere in the docs directory. 
All references use the correct `openclaw/openclaw` format. - -### Broken links: NO BROKEN LINKS DETECTED - -All internal markdown links have matching anchors in the target files. External links to GitHub issues (#10242, #47436, #3306) and PRs (#3672) are valid URLs (though #3672 is closed/not-merged and #10242 is not the ideal reference, the links themselves work). - ---- - -## 6. Beginner Readability - -### Clarity assessment: GOOD - -The documentation is well-structured and would be followable by a beginner: -- Step numbering is clear and sequential -- Each step includes specific UI navigation paths (e.g., "Server Settings -> Roles -> Create Role") -- Warning/important callouts are used appropriately to flag critical steps -- Time estimates are helpful for setting expectations -- The "What you will have when you are done" sections provide clear success criteria - -### Missing steps: MINOR GAP - -In the Discord Step 5b, the docs say "Do NOT grant Send Messages at the server level" but do not explicitly say what to do if the bot already has Send Messages from the invite (Step 3). Since Step 3's permission table includes "Send Messages", a beginner might be confused about why they granted it in Step 3 only to revoke it in Step 5b. A clarifying note such as "The permissions you selected in Step 3 set the OAuth2 invite scope, but the role-based permissions in Step 5b take precedence for channel-level control" would help. - -### Confusing sections: MINOR - -The Feishu docs mention both "OpenClaw >= 2026.2" and reference features that actually require 2026.3.1. A beginner running OpenClaw 2026.2.x would follow these instructions, find that `groupSessionScope` does not work, and be stuck without understanding why. - ---- - -## Must-Fix Issues - -1. **INCORRECT VERSION**: Change `groupSessionScope` version requirement from "OpenClaw >= 2026.2" to "OpenClaw >= 2026.3.1" in all 4 files (EN/CN FEISHU_SETUP.md and EN/CN KNOWN_ISSUES.md). 
The feature was introduced in the v2026.3.1 release, as confirmed by the official release notes. - -2. **PR #3672 NOT MERGED**: Remove or correct the claim that "PR #3672 (merged January 2026)" introduced multi-account Discord support. PR #3672 was closed without merging (`mergedAt: null`). Either find the correct PR that shipped this feature, or remove the PR reference and simply state that multi-account support is available in current OpenClaw versions. - -3. **ISSUE #10242 MISCHARACTERIZED**: Issue #10242 is about DM thread capability, not group chat session isolation. Consider replacing the reference with Issue #29791 ("[Feature]: Support thread-based replies in Feishu plugin") which more accurately describes the group chat thread isolation problem, or add both references with proper context. - -## Recommendations - -1. **Reconcile 100-server vs 75-server threshold**: Line 45 of both EN/CN Discord docs says "fewer than 100 servers" while the Multi-Bot section correctly says "more than 75 servers." Discord's actual threshold for Privileged Intent application is 75 servers. Recommend updating line 45 to say "fewer than 75 servers" for consistency. - -2. **Normalize "View Channel" vs "View Channels"**: Line 128 of EN DISCORD_SETUP.md uses "View Channel" (singular) while lines 60 and 143 use "View Channels" (plural). Pick one and be consistent (recommend "View Channels" to match the Discord UI). - -3. **Simplify EN KNOWN_ISSUES link path**: Change `../en/FEISHU_SETUP.md#update-groupsessionscope-openclaw--20262` to `FEISHU_SETUP.md#update-groupsessionscope-openclaw--20262` since both files are in the same `docs/en/` directory. - -4. **Add clarifying note between Step 3 and Step 5b in Discord docs**: Explain the relationship between OAuth2 invite permissions (Step 3) and role-based channel permissions (Step 5b) to avoid beginner confusion. - -5. **SecretRef workaround accuracy**: The "restart the gateway twice" workaround is not explicitly documented in Issue #47436. 
Consider softening to "restart the gateway after credential changes; a fix is tracked in Issue #47436" or referencing PR #47652 which implements per-account error isolation. - ---- - -*Report generated: 2026-03-27* -*Verification method: Web search, official Discord developer docs, OpenClaw GitHub releases/issues/PRs via gh CLI, DeepWiki* diff --git a/.harness/reports/research_autonomous_slack_r1.md b/.harness/reports/research_autonomous_slack_r1.md deleted file mode 100644 index bb722fc..0000000 --- a/.harness/reports/research_autonomous_slack_r1.md +++ /dev/null @@ -1,567 +0,0 @@ -# Research: Autonomous Multi-Bot Slack Collaboration - -> Researcher: Claude Opus 4.6 | Date: 2026-03-27 | Scope: Can multiple independent Slack bots autonomously drive a multi-round discussion without human intervention at every step? - ---- - -## Executive Summary - -**Yes, this is technically feasible -- but with significant caveats.** - -Multiple independent Slack Apps (each running as a separate bot with its own token, app token, and bot user ID), all present in the same Slack channel/thread, can autonomously drive a multi-round discussion without human intervention at every step. The architecture requires: - -1. **OpenClaw multi-account mode** (`channels.slack.accounts`) -- one Slack App per participating agent, all managed by a single OpenClaw gateway instance. -2. **`allowBots: true` + `requireMention: true`** on shared channels -- this allows bots to see each other's messages while preventing uncontrolled loops. -3. **An orchestrator agent (e.g., CTO)** whose AGENTS.md/SOUL.md instructs it to drive discussions by @mentioning other agents using Slack's `<@BOT_USER_ID>` format. -4. **A soft turn limit** enforced by agent instructions (not system-level enforcement for Discussion mode). - -**Critical finding**: OpenClaw's self-loop filter on Slack is **per-bot-user-ID**, not global. 
In multi-account mode, Bot-CTO's messages are NOT filtered by Bot-Builder's handler because they have different bot user IDs. This is the key enabler. However, this behavior was the subject of bug fixes (Issue #15836, fixed via PRs #15863/#15946 with session origin tracking), confirming that the current codebase intentionally supports inter-agent message routing. - -**Confidence level**: MEDIUM-HIGH for Architecture A (single gateway, multi-account). The primitives all exist and are documented. No end-to-end production validation of fully autonomous (zero human intervention) multi-round discussions has been publicly reported. - ---- - -## 1. Architecture Comparison - -### 1a. Single Gateway Multi-Account (Architecture A) -- RECOMMENDED - -**How it works**: One OpenClaw gateway instance manages multiple Slack Apps via `channels.slack.accounts`. Each agent is bound to its specific account via `bindings[].match.accountId`. - -```json -{ - "channels": { - "slack": { - "accounts": { - "default": { "botToken": "xoxb-cos-...", "appToken": "xapp-cos-..." }, - "cto": { "botToken": "xoxb-cto-...", "appToken": "xapp-cto-..." }, - "builder": { "botToken": "xoxb-bld-...", "appToken": "xapp-bld-..." } - }, - "channels": { - "<COLLAB_CHANNEL_ID>": { - "allow": true, - "requireMention": true, - "allowBots": true - } - }, - "thread": { - "historyScope": "thread", - "inheritParent": true, - "initialHistoryLimit": 50 - } - } - }, - "bindings": [ - { "agentId": "cos", "match": { "channel": "slack", "accountId": "default" } }, - { "agentId": "cto", "match": { "channel": "slack", "accountId": "cto" } }, - { "agentId": "builder", "match": { "channel": "slack", "accountId": "builder" } } - ] -} -``` - -**When Bot-CTO posts in a shared channel, does Bot-Builder's agent receive it?** - -Yes, under these conditions: -- `allowBots: true` on the channel -- without this, all bot messages are ignored. 
-- The message must @mention Bot-Builder (`<@BUILDER_BOT_USER_ID>`) if `requireMention: true` is set. -- OpenClaw's self-loop filter only blocks messages from the **same** bot user ID. Since Bot-CTO and Bot-Builder have different user IDs, Bot-CTO's messages pass through Bot-Builder's filter. - -**What does `allowBots: true` actually do?** - -At the code level, `allowBots` controls whether the OpenClaw Slack plugin processes messages with the `bot_message` subtype (or messages where `message.bot_id` is set). Three behaviors: - -| Value | Behavior | -|-------|----------| -| `false` (default) | All bot-authored messages are dropped before reaching agent routing. This is the current OpenCrew default. | -| `true` | All bot-authored messages are accepted as inbound. Combined with `requireMention: true`, only messages that @mention the receiving bot are processed. | -| `"mentions"` | Bot messages are accepted **only** if they contain an @mention of the receiving bot. This is functionally equivalent to `true` + `requireMention: true` but applies specifically to bot messages. (Confidence: MEDIUM -- referenced in source-level analysis and community reports but not prominently featured in official docs.) | - -**Self-loop filter mechanics**: OpenClaw checks `message.user === account.botUserId`. With multi-account, each account has a distinct `botUserId`, so Bot-CTO's messages (`user = U_CTO`) are not filtered by Bot-Builder's handler (which only filters `user = U_BUILDER`). This was confirmed by Issue #15836's fix (PRs #15863/#15946), which added session origin tracking to refine the filtering -- the fix allows routing bot messages to bound sessions EXCEPT the originating session. 
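The gating described above (bot check, per-account self-loop check, mention check) can be sketched as a single predicate. This is an illustrative model only, not OpenClaw's actual source: `ChannelConfig` and `should_process` are hypothetical names, the `"mentions"` value of `allowBots` is left out, and session origin tracking is not modelled.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class ChannelConfig:
    """Hypothetical mirror of the per-channel flags discussed above."""
    allow_bots: bool = False
    require_mention: bool = True


def is_bot_message(message: dict) -> bool:
    # A message counts as bot-authored if it carries a bot_id or the bot_message subtype.
    return bool(message.get("bot_id")) or message.get("subtype") == "bot_message"


def mentions(message: dict, bot_user_id: str) -> bool:
    # Text-based detection of Slack's <@USERID> or <@USERID|label> mention syntax.
    pattern = r"<@" + re.escape(bot_user_id) + r"(\|[^>]*)?>"
    return re.search(pattern, message.get("text", "")) is not None


def should_process(message: dict, own_bot_user_id: str, config: ChannelConfig) -> bool:
    """Decide whether one account's handler should route this message to its agent."""
    if is_bot_message(message) and not config.allow_bots:
        return False  # bot-authored messages are dropped when allowBots is off
    if message.get("user") == own_bot_user_id:
        return False  # self-loop filter: compares against this account's bot user ID only
    if config.require_mention and not mentions(message, own_bot_user_id):
        return False  # mention gate keeps uninvolved bots silent
    return True


# Example: Bot-CTO posts in the shared channel and mentions Bot-Builder.
cto_post = {"user": "U_CTO", "bot_id": "B_CTO", "text": "<@U_BUILDER> please review feasibility."}
collab = ChannelConfig(allow_bots=True, require_mention=True)
builder_sees_it = should_process(cto_post, "U_BUILDER", collab)
cto_sees_it = should_process(cto_post, "U_CTO", collab)
```

Under these assumptions Builder's handler accepts the post while CTO's own handler drops it. That asymmetry is what the per-bot-user-ID filter provides in multi-account mode.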
- -**Advantages**: -- Single process, single config, centralized management -- All session keys, A2A tools, and routing work within one gateway -- Existing OpenCrew A2A protocol (sessions_send) and Discussion mode (@mention) coexist -- Official OpenClaw documentation and community guides describe this exact pattern - -**Disadvantages**: -- Requires creating 3-7 separate Slack Apps (one per participating agent) -- All agents share one process -- if the gateway crashes, all agents go down -- Socket Mode: 3-7 persistent WebSocket connections from one process (OpenClaw docs recommend HTTP mode with distinct `webhookPath` per account for multi-account setups) - -### 1b. Multiple Independent Instances (Architecture B) - -**How it works**: Each agent runs its own OpenClaw gateway instance (separate process, separate `openclaw.json`). Each connects to Slack with its own App. They share a Slack channel. - -**Is this possible?** Technically yes, but with major complications: - -- Each OpenClaw instance would need its own `openclaw.json` with one agent definition. -- The `allowBots: true` + `requireMention: true` pattern works the same way on each instance -- each instance's self-loop filter only blocks its own bot user ID. -- Slack delivers events to all subscribed apps regardless of which instance runs them. - -**Problems**: -- **No shared A2A infrastructure**: `sessions_send` cannot cross process boundaries. The orchestrator agent cannot use OpenClaw's built-in A2A tools to trigger other agents -- it can only @mention them in Slack and hope the other instance processes the mention. -- **No shared session management**: Each instance tracks sessions independently. There is no coordination, no shared `maxPingPongTurns`, no shared session keys. -- **Operational complexity**: 3-7 separate processes, configs, logs, restarts. -- **No unified routing**: Each instance independently processes all incoming messages from its channel. Binding isolation must be configured per-instance. 
- -**Verdict**: Architecture B is technically possible but provides no advantage over Architecture A while adding significant complexity. The only real advantage would be complete process isolation (one crash does not affect others), which is a minor benefit compared to the operational cost. - -### 1c. Recommendation - -**Architecture A (single gateway, multi-account) is clearly superior.** It leverages OpenClaw's built-in multi-account routing, keeps all A2A infrastructure unified, and is the pattern documented by OpenClaw's official docs and community guides. All subsequent sections assume Architecture A. - ---- - -## 2. Autonomous Orchestration Mechanics - -### Event Flow - -Here is the complete event flow for an autonomous multi-round discussion: - -``` -SETUP: Human starts a thread in #collab - "Let's discuss the architecture for feature X. @CTO please kick off." - -ROUND 1: CTO responds (triggered by human's @mention) - CTO reads thread history, proposes architecture. - CTO's response includes: "<@BUILDER_BOT_USER_ID> please review feasibility." - - Event flow: - 1. CTO agent produces response text containing <@U_BUILDER> - 2. OpenClaw's Slack plugin posts this as Bot-CTO in the thread - 3. Slack Events API delivers message event to ALL subscribed apps in the channel - 4. Bot-Builder's app receives the message event - 5. OpenClaw checks: is this from a bot? Yes. Is allowBots enabled? Yes. - 6. OpenClaw checks: is this from OUR bot user ID? No (U_CTO != U_BUILDER). Pass. - 7. OpenClaw checks: does this message @mention our bot? Yes (<@U_BUILDER>). Pass. - 8. OpenClaw routes to Builder agent (matched by accountId binding) - 9. Builder agent's session is created/resumed for this thread - -ROUND 2: Builder responds (triggered by CTO's @mention) - Builder reads thread history (sees human's prompt + CTO's proposal). - Builder posts feasibility analysis. - Builder does NOT @mention anyone (it's not an orchestrator). - -ROUND 3: CTO sees Builder's response (how?) 
- THIS IS THE CRITICAL QUESTION. - - Option A -- CTO is also listening via allowBots: - If CTO's channel config has allowBots: true + requireMention: true, - CTO only activates when explicitly @mentioned. - Builder's response does NOT @mention CTO, so CTO does NOT auto-activate. - --> PROBLEM: CTO cannot autonomously continue the discussion. - - Option B -- CTO uses requireMention: false on the shared channel: - CTO activates on ALL messages in the thread (including Builder's). - --> PROBLEM: Every bot activates on every message --> infinite loop. - - Option C -- CTO is the orchestrator with special config: - CTO's binding for #collab has allowBots: true + requireMention: false. - All OTHER agents have allowBots: true + requireMention: true. - CTO sees all messages. Others only respond when @mentioned. - --> THIS IS THE KEY ARCHITECTURE INSIGHT. - - Option D -- Builder @mentions CTO back: - Builder's AGENTS.md instructs: "After responding, @mention the - orchestrator: <@U_CTO> I've posted my analysis." - --> Works but creates a tight loop. Needs turn counting. -``` - -### The Orchestrator Pattern (Option C -- Recommended) - -The orchestrator agent (CTO) needs a **different configuration** from other agents on the shared channel: - -```json -{ - "channels": { - "slack": { - "channels": { - "<COLLAB_CHANNEL_ID>": { - "allow": true, - "requireMention": true, - "allowBots": true - } - } - } - } -} -``` - -**The problem**: OpenClaw's `channels.slack.channels` config is **global across all agents** in the same gateway instance. You cannot set `requireMention: false` for CTO and `requireMention: true` for Builder on the same channel in the same `openclaw.json`. - -**Workarounds**: - -1. **Option D (Explicit @mention-back)**: Builder's instructions say "always end your response with `<@U_CTO>`". This triggers CTO to read the thread and decide the next step. This is the simplest approach and works within existing config constraints. - -2. 
**Per-account channel overrides**: If OpenClaw supports per-account channel configuration (e.g., `accounts.cto.channels.<ID>.requireMention = false`), this would solve it cleanly. **Status: UNVERIFIED** -- the official docs mention that "named accounts inherit from global config but can override any setting," but whether per-account `channels` overrides are supported at the channel level is not confirmed. - -3. **Dedicated orchestrator channel**: The orchestrator monitors its own dedicated channel (#cto) where `requireMention: false`. Other agents post summaries to #cto after responding in #collab. This fragments the discussion across channels, which is undesirable. - -4. **Hybrid: @mention + sessions_send**: Builder @mentions CTO in the thread AND does a `sessions_send` to CTO's session. This provides both visibility and a reliable trigger. But Builder needs `agentToAgent.allow` permission, which current config restricts. - -### @mention Rendering - -**Can an agent produce `<@BOT_USER_ID>` in its Slack message, and does it render as a proper mention?** - -Yes. When an agent's response text contains `<@U0XXXXX>`, OpenClaw's Slack plugin posts this verbatim to Slack. Slack's rendering engine converts `<@U0XXXXX>` into a clickable @mention with the bot's display name. This is standard Slack message formatting -- there is nothing special about bot-authored messages vs human-authored messages in this regard. - -**Does the mentioned bot receive an event?** - -The `app_mention` event documentation does not explicitly confirm or deny whether bot-authored mentions trigger `app_mention` for the mentioned app. However, OpenClaw's Slack plugin primarily listens on `message.channels` events (not just `app_mention`), which delivers ALL messages in channels the bot has joined, regardless of sender. OpenClaw then applies its own mention-detection logic by parsing the message text for `<@botUserId>` patterns. 
Therefore: - -- Bot-CTO posts `<@U_BUILDER> review this` in #collab thread -- Bot-Builder's app receives the `message.channels` event (Slack delivers all channel messages to all member apps) -- OpenClaw checks `allowBots: true` -- pass -- OpenClaw checks `requireMention: true` -- scans message text for `<@U_BUILDER>` -- found -- pass -- Message is routed to Builder agent - -**Confidence: HIGH** that this works. The `message.channels` subscription is the primary event listener, and OpenClaw's mention detection is text-based parsing of `<@userId>`, not reliance on Slack's `app_mention` event type. - -### Binding/Routing with Multiple Accounts - -With multi-account, routing uses the binding specificity hierarchy: - -1. Peer match (exact channel ID) -2. Account ID match -3. Channel-level match -4. Fallback to default - -When a message arrives on a Slack account (e.g., the "cto" account receives an event), OpenClaw matches it against bindings. The binding `{ "agentId": "cto", "match": { "channel": "slack", "accountId": "cto" } }` routes all messages received by the CTO's Slack app to the CTO agent. - -**Key insight**: In multi-account mode, each Slack app independently receives events. The OpenClaw gateway maintains separate Socket Mode connections for each account. When Bot-CTO posts a message, Bot-Builder's Slack app independently receives the event via its own connection. OpenClaw processes each account's events through its own binding chain. - ---- - -## 3. 
Loop Prevention - -### Available Mechanisms - -| Mechanism | Type | Scope | Enforcement | -|-----------|------|-------|-------------| -| `requireMention: true` | Config | Per-channel, global | System-enforced by OpenClaw | -| `allowBots: false` | Config | Per-channel, global | System-enforced -- blocks all bot messages | -| `allowBots: "mentions"` | Config | Per-channel, global | System-enforced -- bot messages only when bot is mentioned | -| `maxPingPongTurns` (0-5) | Config | Per A2A `sessions_send` | System-enforced by OpenClaw session manager | -| `maxTurns` (default: 80) | Config | Per agent session | System-enforced -- max model calls per session | -| `timeoutSeconds` (default: 172800) | Config | Per agent | System-enforced -- 48-hour abort timer | -| `maxDiscussionTurns` | Protocol | Per discussion thread | Agent-self-enforced via AGENTS.md instructions | -| Self-loop filter | Code | Per bot user ID | System-enforced -- ignores own messages | -| Permission matrix | Protocol | Per agent role | Agent-self-enforced via SOUL.md/AGENTS.md | -| Agent instructions (WAIT discipline) | Protocol | Per agent | Agent-self-enforced | - -### What Applies to Autonomous Discussion? - -**`maxPingPongTurns` does NOT directly apply** to @mention-driven discussions. This parameter governs the `sessions_send` reply-back loop specifically. In Discussion mode, there is no `sessions_send` -- agents respond to @mentions in Slack threads. The ping-pong counter is not incremented. - -**`maxTurns` provides a backstop** but is too coarse. At 80 turns per session, an agent could send many messages before hitting this limit. 
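The config-enforced rows of the table above can be read as a chain of gates applied to each incoming message. The following is a minimal sketch of that reading, not OpenClaw's actual implementation: `ChannelConfig`, `Message`, and `should_process` are hypothetical names introduced here, and the order of the checks is an assumption based on the event flow described in Section 2.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class ChannelConfig:
    """Hypothetical mirror of the per-channel settings discussed above."""
    require_mention: bool = True
    allow_bots: Union[bool, str] = False  # False, True, or "mentions"


@dataclass
class Message:
    author_id: str  # Slack user ID of the sender
    is_bot: bool    # True when the sender is a bot user
    text: str


def should_process(msg: Message, cfg: ChannelConfig, own_bot_id: str) -> bool:
    """Return True if the receiving bot would hand this message to its agent."""
    # Self-loop filter: a bot never reacts to its own posts.
    if msg.author_id == own_bot_id:
        return False
    mentioned = f"<@{own_bot_id}>" in msg.text
    if msg.is_bot:
        # allowBots: false blocks every bot-authored message.
        if cfg.allow_bots is False:
            return False
        # allowBots: "mentions" admits bot messages only when they mention us.
        if cfg.allow_bots == "mentions" and not mentioned:
            return False
    # requireMention applies to humans and bots alike.
    if cfg.require_mention and not mentioned:
        return False
    return True


# Builder's view of a message posted by Bot-CTO in the shared channel.
shared = ChannelConfig(require_mention=True, allow_bots=True)
print(should_process(Message("U0CTO1234", True, "<@U0BLD5678> review this"), shared, "U0BLD5678"))
```

With `require_mention=False` and `allow_bots=True`, every bot-authored message passes the gate, which is the configuration behind the runaway-reply scenario this section warns about.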
- -**`requireMention: true` is the primary loop breaker.** If all agents require mentions, and agents only @mention the next speaker (never broadcasting), then: -- Agent responds only when mentioned -- Agent mentions at most one other agent -- That other agent responds, mentions the orchestrator back -- Orchestrator decides next step - -This creates a **controlled chain**, not an unbounded loop. The chain only continues as long as the orchestrator keeps @mentioning agents. - -### Recommended Approach: Multi-Layer Defense - -**Layer 1 -- Config-enforced (reliable)**: -- `requireMention: true` on ALL agents in shared channels -- `allowBots: true` (or `"mentions"`) on shared channels only -- `maxTurns` per agent as an absolute backstop (e.g., 20 for discussion participants) -- Agent-specific `timeoutSeconds` (e.g., 600 for discussion sessions) - -**Layer 2 -- Protocol-enforced (agent instructions)**: -- Orchestrator AGENTS.md: "You may run at most `MAX_ROUNDS` discussion rounds. After `MAX_ROUNDS`, you MUST post `DISCUSSION_CLOSE` and stop @mentioning other agents." -- Participant AGENTS.md: "You ONLY respond when @mentioned. You NEVER @mention another agent unless explicitly instructed. After responding, you STOP." -- Exception: The `@mention-back` pattern where participants mention the orchestrator after responding. This is controlled because only the orchestrator decides whether to continue. - -**Layer 3 -- External monitoring**: -- A cron job or heartbeat that checks thread message count and kills sessions if a thread exceeds N messages. -- Ops agent periodic audit of thread lengths. - -### The "47 replies in 12 seconds" Cautionary Tale - -A production incident documented by the community: enabling `allowBots: true` without `requireMention: true` in a channel with another AI bot caused 47 replies in 12 seconds before manual process kill. This underscores that `requireMention: true` is **non-negotiable** when `allowBots` is enabled. - ---- - -## 4. 
Implementation Path - -### 4.1 Config Changes - -**Step 1: Create Slack Apps** (human-manual, one-time) - -Create one Slack App per participating agent. Minimum 3 (CoS, CTO, Builder). Each app needs: -- Socket Mode enabled -- App-Level Token (`xapp-`) with `connections:write` scope -- Bot Token (`xoxb-`) with scopes: `channels:history`, `channels:read`, `chat:write`, `users:read`, `reactions:read`, `reactions:write` -- Event Subscriptions: `message.channels`, `app_mention` -- Bot user configured with distinct name and icon - -**Step 2: Create shared discussion channel** (human-manual) - -Create `#collab` (or similar). Invite ALL agent bots to this channel. - -**Step 3: Update openclaw.json** (agent-executable) - -Add multi-account config: -```json -{ - "channels": { - "slack": { - "accounts": { - "default": { "botToken": "xoxb-cos-...", "appToken": "xapp-cos-...", "name": "CoS" }, - "cto": { "botToken": "xoxb-cto-...", "appToken": "xapp-cto-...", "name": "CTO" }, - "builder": { "botToken": "xoxb-bld-...", "appToken": "xapp-bld-...", "name": "Builder" } - }, - "channels": { - "<COLLAB_CHANNEL_ID>": { - "allow": true, - "requireMention": true, - "allowBots": true - } - }, - "thread": { - "historyScope": "thread", - "inheritParent": true, - "initialHistoryLimit": 50 - } - } - }, - "bindings": [ - { "agentId": "cos", "match": { "channel": "slack", "accountId": "default" } }, - { "agentId": "cto", "match": { "channel": "slack", "accountId": "cto" } }, - { "agentId": "builder", "match": { "channel": "slack", "accountId": "builder" } } - ] -} -``` - -Keep existing per-agent channel bindings (each agent's dedicated channel stays `requireMention: false`, `allowBots: false`). - -**Step 4: Record bot user IDs** (agent-executable) - -For each Slack App, obtain the Bot User ID (e.g., `U0CTO1234`, `U0BLD5678`). These are needed in agent instructions for @mention formatting. 
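As a rough illustration of why the recorded IDs matter, the sketch below formats and detects Slack mentions using the `<@USER_ID>` syntax. The `BOT_IDS` mapping reuses the example IDs from Step 4 as placeholders, and `format_mention` and `agents_mentioned` are hypothetical helpers, not part of OpenClaw or the Slack SDK.

```python
import re
from typing import Dict, List

# Placeholder mapping; real values come from each Slack App's bot user.
BOT_IDS: Dict[str, str] = {"cto": "U0CTO1234", "builder": "U0BLD5678"}

# Slack user IDs start with "U" or "W", followed by uppercase alphanumerics.
MENTION_RE = re.compile(r"<@([UW][A-Z0-9]+)>")


def format_mention(agent: str) -> str:
    """Build the mention token an agent must emit to address another agent."""
    return f"<@{BOT_IDS[agent]}>"


def agents_mentioned(text: str) -> List[str]:
    """Return the known agents mentioned in a message, in order of appearance."""
    id_to_agent = {user_id: name for name, user_id in BOT_IDS.items()}
    return [id_to_agent[uid] for uid in MENTION_RE.findall(text) if uid in id_to_agent]


print(format_mention("builder"))                         # mention token for Builder
print(agents_mentioned("@Builder please review"))        # display name only: nothing detected
print(agents_mentioned("<@U0BLD5678> please review"))    # proper mention: Builder detected
```

A display-name string such as `@Builder` yields no match, which is why the agent instructions need the exact bot user IDs rather than names.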
- -### 4.2 Agent Instructions - -**CTO AGENTS.md -- Add Orchestrator Section**: - -```markdown -## Discussion Orchestration (Multi-Agent Threads in #collab) - -When driving a multi-agent discussion in #collab: - -### Setup -- You are the ORCHESTRATOR. You control who speaks next. -- Bot User IDs: CTO=<@U_CTO>, Builder=<@U_BUILDER>, CoS=<@U_COS> -- Maximum rounds: 5. After 5 rounds, you MUST close the discussion. - -### Each Round -1. Read the full thread history (all prior messages) -2. Analyze the latest response -3. Decide: (a) Ask another agent for input, (b) Ask the same agent to clarify, (c) Close discussion -4. If continuing: Post your analysis + @mention the next agent - Format: "[CTO] <your analysis>. <@U_NEXT_AGENT> <your question/request>" -5. If closing: Post DISCUSSION_CLOSE summary - -### Round Counting -- Maintain a round counter in your responses: "Round N/5" -- After Round 5, you MUST close regardless of convergence - -### DISCUSSION_CLOSE Format -~~~ -DISCUSSION_CLOSE -Topic: <topic> -Rounds: N/5 -Consensus: <achieved | not achieved> -Decision: <what was decided> -Actions: <next steps, including any A2A delegation tasks> -Participants: CTO, Builder, ... -~~~ - -### Safety Rules -- NEVER @mention more than one agent per message -- NEVER skip round counting -- If you receive a response that is clearly a loop (repeating prior content), immediately close -``` - -**Builder AGENTS.md -- Add Discussion Participant Section**: - -```markdown -## Discussion Participation (Multi-Agent Threads in #collab) - -When @mentioned in a #collab discussion thread: - -1. Read the full thread history -2. Respond from your domain perspective (feasibility, implementation, effort) -3. End your response with: "<@U_CTO> I've posted my analysis." - (This notifies the orchestrator to continue the discussion) -4. Do NOT @mention any agent other than the orchestrator (<@U_CTO>) -5. Do NOT continue working after posting -- WAIT for the next @mention -6. 
If you have nothing to add: respond "[Builder] PASS: <reason>" -``` - -**CoS AGENTS.md -- Similar participant pattern, mentioning CTO back after responding.** - -### 4.3 The @Mention-Back Problem and Solutions - -The fundamental challenge is: **how does the orchestrator know when a participant has responded?** - -**Solution 1 -- Participant @mentions orchestrator back** (Recommended): -- Participant ends response with `<@U_CTO>` -- CTO receives the `message.channels` event with its mention -- CTO reads thread, continues orchestration -- Pro: Simple, works within existing config -- Con: CTO activates on every participant response, even partial ones - -**Solution 2 -- Orchestrator polls thread** (Not recommended): -- CTO uses a timer/heartbeat to periodically check the thread -- Pro: No mention-back needed -- Con: OpenClaw agents don't have native polling/timer capabilities for thread monitoring - -**Solution 3 -- Hybrid: @mention-back + sessions_send**: -- Participant @mentions CTO AND does sessions_send to CTO's session -- Pro: Belt-and-suspenders reliability -- Con: Requires Builder to have sessions_send permission (currently restricted) - -**Solution 1 is recommended** as the simplest path that works within existing constraints. - -### 4.4 Failure Modes - -| Failure Mode | Cause | Mitigation | -|-------------|-------|------------| -| **Infinite loop** | Agent A mentions Agent B mentions Agent A... | `requireMention: true` + only orchestrator decides next speaker + round counter | -| **Silent failure** | Agent doesn't respond to @mention | Orchestrator waits N seconds, then posts "ping" + re-mentions. After 2 failures, closes discussion. | -| **Context overflow** | Thread gets too long for `initialHistoryLimit` | Set `initialHistoryLimit >= 50`. Instruct agents to keep responses under 500 words. | -| **Wrong agent responds** | Binding misconfiguration | Test with Round0 handshake before production discussions. 
| -| **All agents respond simultaneously** | `requireMention: false` accidentally set | Config audit: `requireMention: true` on ALL shared channels. | -| **Orchestrator never closes** | Round counter not maintained | `maxTurns` per session as absolute backstop. External monitoring. | -| **Bot mentions not parsed** | Agent outputs `@Builder` instead of `<@U_BUILDER>` | AGENTS.md must contain exact bot user IDs, not display names. | -| **Self-loop filter blocks legitimate messages** | OpenClaw bug / regression | Monitor Issue #15836 fix status. Test with Round0 handshake. | -| **Socket Mode connection limits** | 5-7 WebSocket connections from one process | OpenClaw docs recommend HTTP mode for multi-account. Test Socket Mode first; switch to HTTP if stability issues arise. | -| **Gateway crash kills all agents** | Single process architecture | Standard process management (systemd, pm2). Restart automatically. | - ---- - -## 5. Comparison with Claude Code Agent Teams - -| Dimension | Claude Code Agent Teams | OpenCrew Slack Discussion | -|-----------|------------------------|--------------------------| -| **Communication** | Mailbox system (in-memory message passing) | Slack thread messages | -| **Shared state** | Task list files (`~/.claude/tasks/`) | Thread history (via `initialHistoryLimit`) | -| **Orchestration** | Lead agent creates team, assigns tasks | CTO/CoS @mentions agents in thread | -| **Context** | Each teammate has own context window | Each agent has own session (thread-scoped) | -| **Human visibility** | Terminal output, requires split-pane/tmux | Slack UI -- real-time, mobile, searchable | -| **Turn control** | Task completion triggers, idle hooks | @mention triggers, round counter in instructions | -| **Loop prevention** | Task dependency system, lead controls | `requireMention` + round counter + `maxTurns` | -| **Persistence** | Session-scoped (lost on restart) | Slack thread history (persists, searchable) | -| **Cost model** | Token-based, per 
context window | Token-based + Slack API calls | -| **Inter-agent debate** | Teammates message each other directly, challenge findings | Agents @mention each other, challenge in shared thread | - -**Key parallel**: Both systems use an orchestrator that decides task decomposition and agent assignment. Both allow agents to challenge each other. Both have context isolation per agent. - -**Key difference**: Claude Code Agent Teams use file-based task lists for coordination and in-memory mailboxes for messages. OpenCrew uses Slack threads as both the communication channel and the shared context. The Slack approach provides superior human visibility but higher latency (~1-3s per message vs near-zero for file I/O). - -**Mapping to OpenCrew**: -- **Team lead** = CTO (or CoS for strategic discussions) -- **Teammates** = Builder, CIO, Ops (responding when called) -- **Task list** = The orchestrator's round-by-round plan (maintained in agent instructions, not a shared file) -- **Mailbox** = @mentions in the Slack thread -- **Shared resources** = Thread history (`initialHistoryLimit`) - ---- - -## 6. Confidence Assessment - -| Finding | Confidence | Evidence | -|---------|-----------|---------| -| Slack Events API delivers Bot-A's messages to Bot-B's app | **HIGH** | Slack API docs: apps receive `message.channels` for all messages in joined channels. No sender-type filtering at platform level. | -| OpenClaw multi-account supports separate bot tokens per agent | **HIGH** | Official docs (`channels.slack.accounts`), community gist, tutorial sites all confirm. | -| Self-loop filter is per-bot-user-ID in multi-account mode | **HIGH** | Issue #15836 confirms the filter checks `message.user === botUserId`. Fix (PRs #15863/#15946) added origin tracking to refine this. | -| `allowBots: true` enables processing of other bots' messages | **HIGH** | Official docs, community guide, production incident report (47 replies in 12s) all confirm. 
| -| `requireMention: true` prevents uncontrolled bot loops | **HIGH** | Official docs explicitly recommend this combination. Community confirms. | -| Agent can produce `<@BOT_USER_ID>` and it renders as a mention | **HIGH** | Standard Slack message formatting. No special handling needed for bot-authored messages. | -| OpenClaw parses `<@userId>` in message text for mention detection | **HIGH** | Official docs: "Mention sources: explicit app mention (`<@botId>`), mention regex patterns." | -| Autonomous multi-round discussion without ANY human intervention | **MEDIUM** | All primitives verified. The @mention-back pattern (participant mentions orchestrator) is the key mechanism. Not yet validated end-to-end in production. | -| `allowBots: "mentions"` as a third option beyond true/false | **MEDIUM** | Referenced in source-level analysis (DeepWiki) and prior research. Not prominently in official docs. | -| Per-account channel overrides (different `requireMention` per bot) | **LOW** | Docs say "named accounts can override any setting" but don't explicitly show per-account `channels.<ID>` overrides. Unverified. | -| `maxPingPongTurns` applies to @mention-driven discussions | **LOW (likely NO)** | This parameter governs `sessions_send` reply-back loops specifically, not @mention-driven thread interactions. | - ---- - -## 7. Open Questions - -1. **Per-account channel config**: Can `channels.slack.accounts.cto.channels.<COLLAB_ID>.requireMention` be set to `false` while keeping the global `channels.slack.channels.<COLLAB_ID>.requireMention = true`? This would allow the orchestrator to see all messages without requiring @mention-back. Needs empirical testing against OpenClaw source. - -2. **app_mention vs message.channels**: The Slack docs do not explicitly confirm whether `app_mention` events fire when a bot (not a human) mentions another bot. 
OpenClaw's primary listener is `message.channels` with text-based mention parsing, so this likely doesn't matter -- but confirmation would increase confidence. - -3. **Socket Mode scalability**: With 5-7 Slack Apps all using Socket Mode, what is the resource impact on the OpenClaw gateway? Are there Slack-side rate limits on concurrent WebSocket connections from the same server? OpenClaw docs recommend HTTP mode for multi-account -- is this a strong recommendation or just an option? - -4. **Thread history limits**: When `initialHistoryLimit = 50` and a discussion exceeds 50 messages, does each agent see the full thread or only the most recent 50? Does the limit count only thread messages, or does it include parent channel context? This determines whether agents can maintain discussion continuity. - -5. **Concurrent @mentions**: What happens if the orchestrator @mentions two agents simultaneously in one message (e.g., `<@U_BUILDER> and <@U_OPS> please review`)? Do both agents respond? In what order? Can this cause race conditions in the thread? - -6. **Discussion session lifecycle**: When does a thread-scoped session expire? If a discussion spans hours (with gaps between rounds), does the session survive? Does each @mention create a new session or resume the existing one? - -7. **Cost estimation**: Each discussion round involves: (a) agent reading full thread history, (b) agent producing a response, (c) OpenClaw posting to Slack. For a 5-round discussion with 3 agents, how many model tokens are consumed? Is this comparable to Claude Code Agent Teams or significantly more/less? - -8. **Empirical validation**: This research found no public documentation of a fully autonomous (zero human intervention after initial prompt) multi-round OpenCrew discussion. The first implementation should be treated as an experiment with careful monitoring. 
- ---- - -## Appendix A: Step-by-Step Validation Plan - -Before deploying autonomous discussions, validate each component: - -**Test 1: Multi-account basic message delivery** -- Configure 2 accounts (CTO + Builder) on a shared channel -- Human @mentions CTO in a thread -- Verify CTO responds -- CTO's response includes `<@U_BUILDER>` -- Verify Builder receives the event and responds -- Expected: Both agents respond in the same thread - -**Test 2: @mention-back pattern** -- Builder's response includes `<@U_CTO>` -- Verify CTO receives Builder's message and can read thread history -- Expected: CTO sees all prior messages and can continue - -**Test 3: Round counter enforcement** -- Set max rounds to 3 in CTO's AGENTS.md -- Start a discussion -- Verify CTO closes discussion after round 3 with DISCUSSION_CLOSE -- Expected: Discussion terminates cleanly - -**Test 4: Loop prevention** -- Remove round counter from CTO's instructions (dangerous -- test in isolated channel) -- Start a discussion -- Verify `maxTurns` per session catches any runaway loop -- Expected: Session terminates at maxTurns limit - -**Test 5: Failure recovery** -- Start a discussion, then manually kill Builder's session mid-discussion -- Verify CTO detects non-response and closes or escalates -- Expected: CTO posts timeout message and either retries or closes - ---- - -## Appendix B: Key References - -- [OpenClaw Slack Plugin Docs](https://docs.openclaw.ai/channels/slack) -- [OpenClaw Multi-Agent Routing Docs](https://docs.openclaw.ai/concepts/multi-agent) -- [OpenClaw Session Tools Docs](https://docs.openclaw.ai/concepts/session-tool) -- [OpenClaw Agent Loop Docs](https://docs.openclaw.ai/concepts/agent-loop) -- [Running Multiple AI Agents as Slack Teammates (GitHub Gist)](https://gist.github.com/rafaelquintanilha/9ca5ae6173cd0682026754cfefe26d3f) -- [OpenClaw Issue #15836: Agent-to-agent Slack routing (FIXED)](https://github.com/openclaw/openclaw/issues/15836) -- [OpenClaw Issue #11199: Discord 
multi-bot filtering (FIXED)](https://github.com/openclaw/openclaw/issues/11199) -- [OpenClaw Issue #45450: Matrix bot-to-bot visibility](https://github.com/openclaw/openclaw/issues/45450) -- [OpenClaw Issue #9912: maxTurns/maxToolCalls config](https://github.com/openclaw/openclaw/issues/9912) -- [OpenClaw Slack Setup Best Practices (Macaron)](https://macaron.im/blog/openclaw-slack-setup) -- [Claude Code Agent Teams Documentation](https://code.claude.com/docs/en/agent-teams) -- [Slack Events API Documentation](https://docs.slack.dev/apis/events-api/) -- [Slack Message Event Reference](https://docs.slack.dev/reference/events/message/) -- [Slack app_mention Event Reference](https://docs.slack.dev/reference/events/app_mention) -- [Prior OpenCrew Research: research_slack_r1.md](.harness/reports/research_slack_r1.md) -- [Prior OpenCrew Architecture: architecture_protocol_r1.md](.harness/reports/architecture_protocol_r1.md) -- [Prior OpenCrew Architecture: architecture_collab_r1.md](.harness/reports/architecture_collab_r1.md) diff --git a/.harness/reports/research_discord_r1.md b/.harness/reports/research_discord_r1.md deleted file mode 100644 index 4ed752e..0000000 --- a/.harness/reports/research_discord_r1.md +++ /dev/null @@ -1,382 +0,0 @@ -commit 7e825263db36aef68792a050c324daef598b4c56 -Author: Alex's Mac <alexmac@AlexsdeMac-mini-2.local> -Date: Sat Mar 28 17:38:48 2026 +0800 - - feat: add A2A v2 research harness, architecture, and agent definitions - - Multi-agent harness for researching and designing A2A v2 protocol: - - Research reports (Phase 1): - - Slack: true multi-agent collaboration via multi-account + @mention - - Feishu: groupSessionScope + platform limitation analysis - - Discord: multi-bot routing + Issue #11199 blocker analysis - - Architecture designs (Phase 2): - - A2A v2 Protocol: Delegation (v1) + Discussion (v2) dual-mode - - 5 collaboration patterns: Architecture Review, Strategic Alignment, - Code Review, Incident Response, Knowledge Synthesis - 
- 3-level orchestration: Human → Agent → Event-Driven - - Platform configs, migration guides, 6 ADRs - - Agent definitions for Claude Code Agent Teams: - - researcher.md, architect.md, doc-fixer.md, qa.md - - QA verification: all issues resolved, PASS verdict after fixes. - - Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> - -diff --git a/.harness/reports/research_discord_r1.md b/.harness/reports/research_discord_r1.md -new file mode 100644 -index 0000000..9467648 ---- /dev/null -+++ b/.harness/reports/research_discord_r1.md -@@ -0,0 +1,349 @@ -+# Research Report: Discord Multi-Bot Capabilities (U2, Round 1) -+ -+**Date**: 2026-03-27 -+**Researcher**: Claude (automated research agent) -+**Scope**: Multi-bot routing, channel isolation, thread support, and multi-agent collaboration on Discord for OpenCrew -+ -+--- -+ -+## Executive Summary -+ -+Discord fully supports multiple bots coexisting in a single server with distinct identities, and OpenClaw has shipped multi-account Discord support (PR #3672, merged ~Jan 2026). Each bot receives all MESSAGE_CREATE events for channels it can view, enabling cross-bot message visibility at the platform level. However, OpenClaw's internal bot-message filter currently treats all configured bot accounts as "self" and drops their messages (Issue #11199), which blocks visible bot-to-bot collaboration via Discord. The practical workaround is to use OpenClaw's internal `sessions_send` for agent-to-agent communication and restrict Discord to human-to-agent interaction per channel. -+ -+For Issue #34 (cos/ops conversation mixing), the root cause is a single-bot configuration where one bot identity serves all channels, combined with missing or incorrect channel-level permission overrides. 
The fix is either (a) proper Discord channel permission overwrites to restrict Send Messages per-channel on the single bot's role, or (b) migrating to multi-bot with each bot scoped to its designated channel via Discord permission overrides. -+ -+--- -+ -+## 1. Multi-Bot Routing Model -+ -+### 1.1 Multiple Bots in One Server -+ -+**Confidence: HIGH** (based on Discord API documentation and widespread community practice) -+ -+Discord servers support up to 50 bot users. Each bot is a separate Application in the Discord Developer Portal with its own token, avatar, display name, and online status. All bots in a server receive gateway events independently -- when a user posts in a channel, every bot with View Channel permission on that channel receives a `MESSAGE_CREATE` event via its own WebSocket connection. -+ -+Key facts: -+- Each bot requires its own Application, Bot Token, and server invite -+- Each bot must independently enable the **Message Content Intent** privileged gateway intent to read message body text -+- Bots appear as distinct members in the server member list with independent online/offline status -+- Each bot can register its own slash commands (no namespace collision if names differ) -+- Rate limits are per-bot, so multiple bots do not share rate limit buckets -+ -+### 1.2 Cross-Bot Message Visibility -+ -+**Confidence: HIGH** (Discord API behavior) / **MEDIUM** (OpenClaw handling) -+ -+At the Discord platform level, when Bot-A posts a message in #build, Bot-B **does** receive the `MESSAGE_CREATE` gateway event for that message, provided: -+1. Bot-B has View Channel permission on #build -+2. Bot-B has the Message Content Intent enabled -+3. Bot-B is connected to the Gateway using API v9 or above -+ -+The `message.author` object includes a `bot: true` flag, allowing the receiving bot to identify the source as another bot. 
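To make the role of the `bot` flag concrete, here is a minimal sketch of how a receiving bot could classify the author of a `MESSAGE_CREATE` payload. It assumes the payload is already parsed into a dictionary with the Discord message shape (`author.id`, optional `author.bot`); `classify_author` and the sample IDs are hypothetical, and this is not OpenClaw's filter code.

```python
from typing import Any, Mapping


def classify_author(message: Mapping[str, Any], own_user_id: str) -> str:
    """Classify a message author as 'self', 'other_bot', or 'human'."""
    author = message["author"]
    # Compare only against the receiving account's own ID.
    if author["id"] == own_user_id:
        return "self"
    # Discord omits the "bot" field for regular users, so default to False.
    if author.get("bot", False):
        return "other_bot"
    return "human"


# Made-up snowflake IDs for illustration only.
BOT_B_ID = "200000000000000002"
from_bot_a = {"author": {"id": "100000000000000001", "bot": True}, "content": "build done"}
from_human = {"author": {"id": "300000000000000003"}, "content": "status?"}

print(classify_author(from_bot_a, BOT_B_ID))  # other_bot
print(classify_author(from_human, BOT_B_ID))  # human
```

The design point is that the "self" check uses a single ID, so a sibling bot's message is classified as `other_bot` and can then be subjected to whatever bot-message policy is configured.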
-+ -+**OpenClaw complication**: OpenClaw's Discord plugin filters out messages authored by bots by default (`allowBots` defaults to `false`). More critically, even when `allowBots` is set to `true` or `"mentions"`, the current implementation (as of Issue [#11199](https://github.com/openclaw/openclaw/issues/11199)) checks the message author ID against **all** configured bot account IDs in the instance, not just the receiving account's own ID. This means Bot-A's message is treated as "own message" by Bot-B's handler and silently dropped. -+ -+**Workaround options**: -+1. Set `allowBots: true` with `requireMention: false` and whitelist sibling bot user IDs in per-channel `users` arrays. This works but disables mention gating. -+2. Use OpenClaw's internal `sessions_send` for all agent-to-agent communication (recommended by the A2A protocol). Discord messages then serve only as "visibility anchors" for human observers. -+ -+**Related PRs addressing #11199**: -+- PR #11644: "bypass bot filter and mention gate for sibling Discord bots" -+- PR #22611: "allow messages from other instance bots in multi-account setups" -+- PR #35479: "add allowBotIds config to selectively allow bot messages" -+ -+(Status of these PRs could not be confirmed in this research round.) -+ -+### 1.3 OpenClaw Multi-Account Support Status -+ -+**Confidence: HIGH** (confirmed via Issue #3306 comments and documentation) -+ -+OpenClaw's Discord plugin now supports multiple bot accounts in a single gateway instance. The feature was introduced via **PR #3672** ("feat: Introduce multi-account support for Discord, ensuring session keys and RPC IDs are account-aware"), which was merged around January 28, 2026. Issue #3306 was the original feature request; a commenter confirmed on February 9, 2026: "Multi-Agent works for current version." 
-+ -+Configuration structure: -+```json -+{ -+ "channels": { -+ "discord": { -+ "accounts": { -+ "default": { "token": "BOT_TOKEN_A" }, -+ "coding": { "token": "BOT_TOKEN_B" } -+ } -+ } -+ } -+} -+``` -+ -+Each account gets its own: -+- Bot token (the `default` account falls back to `DISCORD_BOT_TOKEN` env var) -+- Guild and channel allowlists -+- Session key namespace (session keys are account-aware: `agent:<agentId>:discord:<accountId>:channel:<channelId>`) -+ -+Bindings reference accounts via `accountId`: -+```json -+{ -+ "bindings": [ -+ { -+ "agentId": "cos", -+ "match": { -+ "channel": "discord", -+ "accountId": "default", -+ "guildId": "<GUILD_ID>", -+ "peer": { "kind": "channel", "id": "<CHANNEL_ID_HQ>" } -+ } -+ } -+ ] -+} -+``` -+ -+**Known issue**: `requireMention: true` is reportedly broken in multi-account configurations (Issue [#45300](https://github.com/openclaw/openclaw/issues/45300)) -- all guild messages are dropped at the preflight stage with reason "no-mention" even when the bot is explicitly @mentioned. -+ -+--- -+ -+## 2. Channel Permission Isolation -+ -+### 2.1 Root Cause of Issue #34 -+ -+**Confidence: HIGH** (confirmed by reporter FRED-DL's own comment) -+ -+Issue [#34](https://github.com/AlexAnys/opencrew/issues/34) ("routing bug, cos and ops conversations mixed together") was caused by the **single-bot configuration** where all agents share one Discord bot identity. -+ -+The reporter's comment translates to: "The Discord plugin configuration requires each bot to only send in specific channels, so the documentation should not describe them as public channels but rather as manually restricted bot send permissions." -+ -+The root cause chain: -+1. A single bot receives `MESSAGE_CREATE` events for **all** channels it has View Channel permission on -+2. OpenClaw's binding system routes messages by channel ID to the correct agent (e.g., #hq -> CoS, #ops -> Ops) -+3. However, when any agent responds, the **same bot identity** sends the message. 
If the bot has Send Messages permission in channels it should not be active in, or if bindings are misconfigured, responses can leak across channels -+4. Without explicit Discord permission overrides, the single bot can read and write in every channel, creating a surface for context mixing if the OpenClaw routing layer has any edge-case failures -+ -+The fix documented in the issue: manually restrict the bot's Send Messages permission so it can only send in its assigned channel(s). With multi-bot, this becomes natural -- each bot only needs permissions in its own channel. -+ -+### 2.2 Discord Permission Override Configuration -+ -+**Confidence: HIGH** (based on Discord API documentation) -+ -+Discord uses a layered permission system: -+ -+1. **Server-level role permissions** (base) -+2. **Category-level permission overwrites** (inherited by child channels unless overridden) -+3. **Channel-level permission overwrites** (most specific, wins) -+4. **Member-specific overwrites** (highest priority, per-user/bot) -+ -+Permission overwrites use an `allow`/`deny` bitfield model: -+- `allow` explicitly grants a permission at the channel level -+- `deny` explicitly revokes a permission at the channel level -+- Unset bits inherit from the parent level -+ -+Key permission bits for bot isolation: -+ -+| Permission | Bit | Hex Value | -+|---|---|---| -+| VIEW_CHANNEL | `1 << 10` | `0x0000000000000400` | -+| SEND_MESSAGES | `1 << 11` | `0x0000000000000800` | -+| SEND_MESSAGES_IN_THREADS | `1 << 38` | `0x0000004000000000` | -+| READ_MESSAGE_HISTORY | `1 << 16` | `0x0000000000010000` | -+ -+**Critical note**: If a bot's role has the **Administrator** permission, all channel-level overrides are bypassed. Ensure bot roles do NOT have Administrator. -+ -+### 2.3 Step-by-Step Isolation Setup -+ -+**Confidence: HIGH** (testable in any Discord server) -+ -+#### For single-bot setup (restrict one bot to specific channels): -+ -+1. 
**Create a bot-specific role** (e.g., "OpenCrew Bot") -- do NOT use Administrator permission -+2. **At the server level**, grant the role: View Channels, Read Message History -+3. **At the server level**, do NOT grant: Send Messages, Send Messages in Threads -+4. **For each agent channel** (e.g., #hq, #cto, #build): -+ - Right-click the channel -> Edit Channel -> Permissions -+ - Click "+" next to Roles/Members, add the "OpenCrew Bot" role -+ - Set "Send Messages" to **Allow** (green checkmark) -+ - Set "Send Messages in Threads" to **Allow** (green checkmark) -+5. **Verify**: The bot can now only send messages in channels where you explicitly allowed it -+ -+#### For multi-bot setup (each bot restricted to its own channel): -+ -+1. **Create a role per bot** (e.g., "CoS Bot", "CTO Bot", "Builder Bot") -+2. **At the server level**, grant each role: View Channels, Read Message History -- do NOT grant Send Messages -+3. **For each bot**, add a channel-level overwrite on its designated channel: -+ - #hq: "CoS Bot" role -> Allow Send Messages, Allow Send Messages in Threads -+ - #cto: "CTO Bot" role -> Allow Send Messages, Allow Send Messages in Threads -+ - #build: "Builder Bot" role -> Allow Send Messages, Allow Send Messages in Threads -+4. **Optional hardening**: On channels a bot should NOT access at all, add a channel-level overwrite denying View Channel for that bot's role -+ -+#### Programmatic approach (via Discord API): -+ -+``` -+PUT /channels/{channel_id}/permissions/{role_or_user_id} -+{ -+ "allow": "2048", // SEND_MESSAGES (1 << 11) -+ "deny": "0", -+ "type": 0 // 0 = role overwrite -+} -+``` -+ -+To deny Send Messages on a channel: -+``` -+PUT /channels/{channel_id}/permissions/{role_or_user_id} -+{ -+ "allow": "0", -+ "deny": "2048", // SEND_MESSAGES denied -+ "type": 0 -+} -+``` -+ -+--- -+ -+## 3. 
Thread Support -+ -+### 3.1 Discord Thread Model -+ -+**Confidence: HIGH** (based on Discord API documentation) -+ -+Discord threads are lightweight sub-channels that live under a parent text channel. Key properties: -+ -+- **Types**: Public threads (visible to anyone who can view the parent channel), Private threads (invite-only, or visible to those with Manage Threads permission) -+- **Auto-archive**: Threads automatically archive after a configurable period of inactivity: 1 hour, 24 hours, 3 days, or 7 days (the 3-day and 7-day options were formerly gated behind server boost levels). "Activity" means sending a message, unarchiving, or changing the auto-archive duration -+- **Archived threads**: Can still be viewed and searched, but no new messages can be added until unarchived. Locked threads can only be unarchived by users with Manage Threads permission -+- **Member limit**: Threads support up to 1,000 members -+- **Thread metadata**: Includes `archived`, `archive_timestamp`, `auto_archive_duration`, `locked`, `owner_id`, `parent_id` -+ -+### 3.2 Bot Access to Threads -+ -+**Confidence: HIGH** -+ -+- Bots **must** use API v9 or above to receive thread-related gateway events (MESSAGE_CREATE, THREAD_CREATE, etc.) -+- Threads **inherit** all parent channel permissions. The relevant permission for posting in threads is `SEND_MESSAGES_IN_THREADS` (not `SEND_MESSAGES`) -+- Bots with View Channel on the parent automatically see public threads; private threads require membership or Manage Threads permission -+- The `THREAD_LIST_SYNC` event synchronizes active threads when a bot gains access to a channel -+ -+**OpenClaw thread handling**: Discord threads are routed as channel sessions. Thread configuration inherits parent channel config unless a thread-specific entry exists. OpenClaw supports binding threads to specific agents or sessions via `/focus` and `/unfocus` commands.
The OpenCrew config document confirms: "Discord threads automatically inherit the configuration of their parent channel (agent binding, requireMention, etc.) unless you configure a specific thread ID separately." -+ -+### 3.3 Comparison with Slack Thread Behavior -+ -+**Confidence: MEDIUM** (based on documented behavior, not direct testing) -+ -+| Aspect | Slack | Discord | -+|---|---|---| -+| **Thread creation** | Any message can become a thread parent by replying to it | Threads are created explicitly from a message or via API; Forum channels auto-create threads | -+| **Persistence** | Threads persist indefinitely (searchable, no expiry) | Threads auto-archive after inactivity (1h to 7d) | -+| **Visibility** | Thread replies can optionally be broadcast to the channel | Thread messages stay in the thread only | -+| **Session key** | `agent:<agentId>:slack:channel:<channelId>:thread:<root_ts>` | `agent:<agentId>:discord:channel:<channelId>` (thread inherits parent; thread-specific session may append thread ID) | -+| **Bot trigger** | Bot-authored messages in other channels don't auto-trigger agents (same single-bot limitation) | Same behavior -- OpenClaw ignores bot-authored inbound by default | -+| **A2A pattern** | Two-step: Slack root message (anchor) + `sessions_send` (trigger) | Same two-step pattern applies to Discord | -+| **OpenClaw session isolation** | `historyScope = "thread"` + `inheritParent=false` isolates thread context | Thread config inherits parent channel unless overridden; `/focus` and `/unfocus` provide explicit binding | -+ -+**Key difference**: Discord's auto-archive is the most significant operational difference. Slack threads never expire, so long-running tasks can span days without concern. 
Discord threads will auto-archive after inactivity, requiring either: -+- The bot to have Manage Threads permission to unarchive -+- A periodic "keep-alive" message (not recommended; adds noise) -+- Accepting that completed task threads will archive naturally (acceptable for most workflows) -+ -+--- -+ -+## 4. Multi-Agent Collaboration Potential -+ -+### 4.1 Multiple Bots in Same Thread -+ -+**Confidence: HIGH** (Discord platform) / **MEDIUM** (OpenClaw implementation) -+ -+At the Discord level, multiple bots can absolutely participate in the same thread with distinct identities -- each bot appears with its own name, avatar, and online status. Any bot with `SEND_MESSAGES_IN_THREADS` permission on the parent channel can post in threads under that channel. Each bot's messages are clearly attributed to its own identity. -+ -+**OpenClaw limitation**: The bot-to-bot filtering issue (Issue #11199) means that even if Bot-A and Bot-B are both in the same thread, Bot-B's OpenClaw instance will drop Bot-A's messages as "own-bot" messages. This prevents a pattern where Bot-A mentions Bot-B in a thread to trigger a response. -+ -+**Practical pattern today**: Agent-to-agent collaboration in a shared thread must use `sessions_send` internally. The Discord thread serves as a shared audit log where both agents post their outputs for human visibility, but the actual trigger mechanism is internal to OpenClaw. 
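The permission requirement described in 4.1 can be sketched as a small bitfield check. This is a minimal illustration that uses the bit positions listed in section 2.2; the helper names `hasPermission` and `toPermissionString` are hypothetical and are not part of OpenClaw or of any Discord library. Discord serializes permission bitfields as decimal strings, so the sketch parses them with `BigInt` (bit 38 does not fit in a 32-bit bitwise operation).

```typescript
// Bit positions taken from the table in section 2.2.
const VIEW_CHANNEL = 1n << 10n;
const SEND_MESSAGES = 1n << 11n;
const SEND_MESSAGES_IN_THREADS = 1n << 38n;

// Hypothetical helper: true when every bit in `required` is set in `granted`.
// `granted` is a decimal string, the form used in permission overwrite payloads.
function hasPermission(granted: string, required: bigint): boolean {
  return (BigInt(granted) & required) === required;
}

// Hypothetical helper: OR several permission bits together and return the
// decimal string form.
function toPermissionString(...bits: bigint[]): string {
  return bits.reduce((acc, bit) => acc | bit, 0n).toString();
}

// A bot that may view the parent channel and post in its threads.
const threadPoster = toPermissionString(VIEW_CHANNEL, SEND_MESSAGES_IN_THREADS);

console.log(hasPermission(threadPoster, SEND_MESSAGES_IN_THREADS)); // true
// "2048" is SEND_MESSAGES only, as in the API example in section 2.3.
console.log(hasPermission("2048", SEND_MESSAGES_IN_THREADS)); // false
```

Under these assumptions, an overwrite that allows only `2048` would let a bot post in the channel but would not by itself grant posting in threads, which is why the manual steps in section 2.3 allow both permissions.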
-+ -+### 4.2 Discussion/Review/Brainstorm Patterns -+ -+**Confidence: MEDIUM** (conceptual; not tested in production) -+ -+With the current OpenClaw architecture, several collaboration patterns are achievable: -+ -+**Pattern 1: Delegated Execution (works today)** -+- CTO posts a task root message in #build -+- CTO uses `sessions_send` to trigger Builder in that thread -+- Builder executes and posts results in the thread -+- CTO monitors the thread and posts checkpoint summaries in #cto -+- Both agents' messages appear in the #build thread with distinct identities (if multi-bot) -+ -+**Pattern 2: Sequential Review (works today with orchestration)** -+- CoS creates a review request thread -+- CoS triggers CTO via `sessions_send` with the review brief -+- CTO posts analysis in the thread, then triggers Builder if implementation needed -+- Each agent's contribution is visible in the thread, attributed to its bot identity -+ -+**Pattern 3: Multi-Agent Discussion (partially blocked)** -+- Requires multiple agents to read each other's thread messages and respond -+- Currently blocked by Issue #11199 (bot-to-bot filtering) -+- Workaround: An orchestrator agent uses `sessions_send` to relay context between agents, posting summaries in a shared thread -+- True "round-table" discussion where agents directly read and respond to each other's Discord messages is not yet supported -+ -+**Pattern 4: Human-in-the-Loop Brainstorm (works today)** -+- Human posts a question in a channel -+- Bound agent responds -+- Human can @mention a different bot to bring another agent into the conversation (if multi-bot and `requireMention` is configured per-bot) -+- Each agent responds with its own identity -+ -+### 4.3 Orchestration Options -+ -+**Confidence: MEDIUM** -+ -+Three orchestration approaches exist for multi-agent Discord collaboration: -+ -+1. **OpenClaw A2A Protocol (recommended)**: Uses `sessions_send` for agent-to-agent triggering. Discord messages are "visibility anchors." 
This is the documented approach in OpenCrew's `A2A_PROTOCOL.md` and works with both single-bot and multi-bot configurations. It does not depend on Discord's message delivery for inter-agent communication. -+ -+2. **Discord-native orchestration (blocked)**: Would rely on bots reading each other's Discord messages and responding. Currently blocked by Issue #11199. If/when fixed, this would enable more natural multi-agent threads where agents directly react to each other's messages. Requires `allowBots` configuration and careful loop prevention. -+ -+3. **Hybrid orchestration**: Uses `sessions_send` for triggering but has agents post structured outputs in shared Discord threads. A "coordinator" agent (e.g., CTO) reads thread history via OpenClaw's message history and synthesizes responses. This works today and provides the best human visibility. -+ -+--- -+ -+## 5. Confidence Assessment -+ -+| Finding | Confidence | Basis | -+|---|---|---| -+| Multiple bots coexist in one Discord server | HIGH | Discord API docs, widespread practice | -+| Cross-bot MESSAGE_CREATE visibility | HIGH | Discord API v9+ documented behavior | -+| OpenClaw multi-account support shipped (PR #3672) | HIGH | Issue #3306 confirmation, official docs | -+| Bot-to-bot filtering bug (Issue #11199) | HIGH | Issue report with reproduction steps | -+| Channel permission override isolation method | HIGH | Discord API docs, testable | -+| Issue #34 root cause (single-bot + missing permission overrides) | HIGH | Reporter's own comment confirms | -+| Thread auto-archive behavior | HIGH | Discord API docs | -+| Thread permission inheritance from parent | HIGH | Discord API docs | -+| `requireMention` broken in multi-account (Issue #45300) | MEDIUM | Single issue report, not independently verified | -+| Multi-agent discussion pattern feasibility | MEDIUM | Conceptual; depends on #11199 resolution | -+| Session key format for Discord threads | MEDIUM | Partially documented; thread-specific key format not 
fully confirmed | -+ -+--- -+ -+## 6. Open Questions -+ -+1. **Issue #11199 fix status**: What is the merge status of PRs #11644, #22611, and #35479? If any are merged, bot-to-bot Discord messaging would be unblocked, enabling richer collaboration patterns. -+ -+2. **`requireMention` in multi-account**: Issue #45300 reports this is broken. Is there a workaround or fix? This is important for noise reduction in multi-bot setups. -+ -+3. **Thread session key format**: The exact session key format for Discord threads (as opposed to channels) needs confirmation. Does OpenClaw append a thread ID to the channel session key, or does it use the parent channel key? -+ -+4. **Auto-archive impact on long tasks**: If an OpenCrew task thread auto-archives mid-execution (e.g., agent is processing a long task), does the agent's next message automatically unarchive the thread, or does it fail silently? -+ -+5. **Rate limits with many bots**: With 7 agents each having their own bot, are there aggregate rate limit concerns at the guild level? Discord's per-guild rate limits may be stricter than per-bot limits for certain operations. -+ -+6. **Webhook relay vs. multi-bot trade-offs**: The DISCORD_SETUP.md mentions webhook relay as a middle-ground option. Has anyone in the OpenCrew community tested this approach? Webhooks can send with custom names/avatars but cannot receive messages, which limits their utility for agent routing. -+ -+7. **PR #3672 compatibility with current OpenClaw version**: The OpenCrew docs reference PR #3672 as "still in development," but Issue #3306 comments suggest it works. Which OpenClaw version is required for multi-account Discord support? 
-+ -+--- -+ -+## Sources -+ -+- [Discord Threads API Documentation](https://docs.discord.com/developers/topics/threads) -+- [Discord Permissions API Documentation](https://docs.discord.com/developers/topics/permissions) -+- [OpenClaw Discord Channel Documentation](https://docs.openclaw.ai/channels/discord) -+- [OpenClaw Multi-Agent Routing Documentation](https://docs.openclaw.ai/concepts/multi-agent) -+- [OpenClaw Issue #3306: Support multiple Discord accounts](https://github.com/openclaw/openclaw/issues/3306) -+- [OpenClaw Issue #11199: Multiple agent bots filtered when talking to each other](https://github.com/openclaw/openclaw/issues/11199) -+- [OpenClaw Issue #28479: Support Multiple Discord Bot Accounts](https://github.com/openclaw/openclaw/issues/28479) -+- [OpenClaw Issue #45300: requireMention broken in multi-account Discord config](https://github.com/openclaw/openclaw/issues/45300) -+- [OpenCrew Issue #34: Routing bug, cos and ops conversations mixed](https://github.com/AlexAnys/opencrew/issues/34) -+- [OpenCrew DISCORD_SETUP.md](../../docs/en/DISCORD_SETUP.md) -+- [OpenCrew CONFIG_SNIPPET_DISCORD.md](../../docs/en/CONFIG_SNIPPET_DISCORD.md) -+- [OpenCrew A2A_PROTOCOL.md](../../shared/A2A_PROTOCOL.md) -+- [OpenCrew KNOWN_ISSUES.md](../../docs/KNOWN_ISSUES.md) diff --git a/.harness/reports/research_feishu_r1.md b/.harness/reports/research_feishu_r1.md deleted file mode 100644 index e8e5e93..0000000 --- a/.harness/reports/research_feishu_r1.md +++ /dev/null @@ -1,433 +0,0 @@ -commit 7e825263db36aef68792a050c324daef598b4c56 -Author: Alex's Mac <alexmac@AlexsdeMac-mini-2.local> -Date: Sat Mar 28 17:38:48 2026 +0800 - - feat: add A2A v2 research harness, architecture, and agent definitions - - Multi-agent harness for researching and designing A2A v2 protocol: - - Research reports (Phase 1): - - Slack: true multi-agent collaboration via multi-account + @mention - - Feishu: groupSessionScope + platform limitation analysis - - Discord: multi-bot routing + Issue 
#11199 blocker analysis - - Architecture designs (Phase 2): - - A2A v2 Protocol: Delegation (v1) + Discussion (v2) dual-mode - - 5 collaboration patterns: Architecture Review, Strategic Alignment, - Code Review, Incident Response, Knowledge Synthesis - - 3-level orchestration: Human → Agent → Event-Driven - - Platform configs, migration guides, 6 ADRs - - Agent definitions for Claude Code Agent Teams: - - researcher.md, architect.md, doc-fixer.md, qa.md - - QA verification: all issues resolved, PASS verdict after fixes. - - Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> - -diff --git a/.harness/reports/research_feishu_r1.md b/.harness/reports/research_feishu_r1.md -new file mode 100644 -index 0000000..548ab4c ---- /dev/null -+++ b/.harness/reports/research_feishu_r1.md -@@ -0,0 +1,400 @@ -+# Research Report: Feishu Multi-Agent Capabilities (U1, Round 1) -+ -+## Executive Summary -+ -+The Feishu integration for OpenCrew is poised for a significant upgrade. Two independent developments converge to address the project's top limitations: -+ -+1. **Per-topic session isolation** is now available via the built-in OpenClaw Feishu plugin's `groupSessionScope: "group_topic"` config (the official replacement for the legacy `topicSessionMode` / the `threadSession` shorthand referenced in Issue #31). This directly solves the P0 session-sharing problem -- each Feishu topic thread gets its own session key, enabling parallel tasks without context intermingling. -+ -+2. **Multi-account (multi-bot) routing** is supported by both the built-in plugin and the community plugin, allowing each Agent to run as a distinct Feishu app with its own identity, API quota, and permissions. Combined with `accountId`-based bindings, messages route deterministically to the correct Agent. -+ -+However, the A2A two-step trigger **cannot be reduced to one step** via cross-bot messaging alone. 
Feishu's `im.message.receive_v1` event only fires for user-sent messages; bot-sent messages are invisible to other bots' event subscriptions. The `sessions_send` internal routing mechanism remains necessary for cross-agent triggering. -+ -+--- -+ -+## 1. threadSession Analysis -+ -+### 1.1 What It Does -+ -+**Confidence: HIGH** (backed by OpenClaw source code analysis via DeepWiki and PR #29791) -+ -+The term `threadSession` as used in Issue #31 (`openclaw config set channels.feishu.threadSession true`) refers to enabling per-topic session isolation in Feishu group chats. At the code level, this maps to the `groupSessionScope` configuration in the built-in OpenClaw Feishu extension (`extensions/feishu/`). -+ -+The built-in plugin supports four session scope modes, defined in `extensions/feishu/src/config-schema.ts`: -+ -+| `groupSessionScope` value | Session key format | Behavior | -+|---|---|---| -+| `"group"` (default) | `chatId` | One session per group chat | -+| `"group_sender"` | `chatId:sender:senderOpenId` | One session per (group + sender) | -+| `"group_topic"` | `chatId:topic:topicId` | One session per topic thread; falls back to `chatId` if no topic | -+| `"group_topic_sender"` | `chatId:topic:topicId:sender:senderOpenId` | One session per (topic + sender); cascading fallback | -+ -+The session key is constructed by the `buildFeishuConversationId` function (in `extensions/feishu/src/bot.ts` or related module): -+ -+```typescript -+function buildFeishuConversationId(params: { -+ chatId: string; -+ scope: FeishuGroupSessionScope; -+ senderOpenId?: string; -+ topicId?: string; -+}): string { -+ switch (params.scope) { -+ case "group_topic": -+ return topicId ? `${chatId}:topic:${topicId}` : chatId; -+ case "group_topic_sender": -+ if (topicId && senderOpenId) -+ return `${chatId}:topic:${topicId}:sender:${senderOpenId}`; -+ if (topicId) return `${chatId}:topic:${topicId}`; -+ return senderOpenId ? `${chatId}:sender:${senderOpenId}` : chatId; -+ // ... 
-+ } -+} -+``` -+ -+The `topicId` is derived from the Feishu message event's `root_id` (preferred) or `thread_id` (fallback). This was implemented in PR #29791 (merged March 2, 2026), which resolved the long-standing feature request for thread-based replies in Feishu. -+ -+**Historical note**: The deprecated `topicSessionMode: "enabled"` config is a legacy predecessor that maps internally to `groupSessionScope: "group_topic"`. The `threadSession = true` shorthand referenced in Issue #31 is likely another alias or community documentation shorthand for the same underlying mechanism. The canonical config key in current OpenClaw versions (2026.2+) is `groupSessionScope`. -+ -+### 1.2 How It Solves Session Sharing -+ -+**Confidence: HIGH** -+ -+This directly addresses OpenCrew's P0 issue ("Slack channel root messages share one session -- context pollution"). With `groupSessionScope: "group_topic"`: -+ -+- Each Feishu topic thread within a group gets a distinct session key (e.g., `oc_xxx:topic:om_root_123`) -+- Non-topic messages in the group mainline fall back to the group-level session (`oc_xxx`) -+- Different tasks running in different topics within the same group will have **fully isolated conversation context** -+- This mirrors the Slack model where "thread = task = session" -+ -+**Practical impact for OpenCrew**: In the CTO group, multiple A2A tasks can now run in parallel as separate topics, each with its own session. No more context intermingling. -+ -+### 1.3 Interaction with OpenCrew's Group Chat Model -+ -+**Confidence: MEDIUM** (theoretical analysis, not tested) -+ -+OpenCrew's model is "group chat = role" (each group is bound to one Agent). Adding topic-level session isolation is additive and non-breaking: -+ -+- **Routing**: The Agent binding still matches on `chatId` (group level). The `groupSessionScope` only affects the session key, not routing. Messages in any topic within the CTO group still route to the CTO Agent. 
-+- **A2A visibility**: Task root messages posted as Feishu topic starters become natural "anchors" (equivalent to Slack root messages). All follow-up conversation stays within the topic. -+- **Session key for A2A**: When using `sessions_send`, the session key format changes from `agent:cto:feishu:group:oc_xxx` to `agent:cto:feishu:group:oc_xxx:topic:om_root_yyy`. Existing A2A protocol session key construction logic will need to account for the topic suffix. -+ -+**Config to enable**: -+ -+```json -+{ -+ "channels": { -+ "feishu": { -+ "groupSessionScope": "group_topic" -+ } -+ } -+} -+``` -+ -+Or per-group override: -+ -+```json -+{ -+ "channels": { -+ "feishu": { -+ "groups": { -+ "oc_xxx": { -+ "groupSessionScope": "group_topic" -+ } -+ } -+ } -+ } -+} -+``` -+ -+--- -+ -+## 2. Multi-Account A2A Impact -+ -+### 2.1 Cross-App Message Routing -+ -+**Confidence: HIGH** (backed by OpenClaw source code and Feishu platform documentation) -+ -+In multi-account mode, each Feishu app (bot) runs as a separate account under `channels.feishu.accounts`. Each account establishes its own WebSocket connection to Feishu Cloud using its own `appId`/`appSecret`. The `startFeishuProvider` function in the built-in plugin creates a separate provider per enabled account. -+ -+When a user sends a message in a group where multiple bots are present, each bot receives an independent `im.message.receive_v1` event. OpenClaw handles this through cross-account broadcast deduplication: -+ -+1. The first account to claim the `messageId` in a shared "broadcast" namespace processes the message -+2. Subsequent accounts skip dispatch for that message -+3. The `tryRecordMessagePersistent` function enforces first-claim-wins semantics -+ -+With `accountId`-based bindings, the routing priority is: -+1. Exact `peer` match (specific group/DM ID) -+2. `parentPeer` match (thread inheritance) -+3. `accountId` match -+4. Channel-level fallback (`accountId: "*"`) -+5. 
Default agent fallback -+ -+**Recommended setup**: In "one bot per group" mode, add only the corresponding bot to each group. This avoids deduplication contention entirely since only one bot receives events per group. -+ -+### 2.2 Self-Loop Filter Bypass -+ -+**Confidence: HIGH** (backed by Feishu platform documentation) -+ -+**Critical finding**: The Feishu `im.message.receive_v1` event **only fires for user-sent messages**. The official Feishu documentation states: -+ -+> "Currently only supports messages sent by users" (`sender_type: "user"`) -+> "In group scenarios, you receive all messages sent by users (not including messages sent by the bot)" -+ -+This means: -+- When Bot-CTO posts a message in the Builder group, **Bot-Builder does NOT receive an `im.message.receive_v1` event** -+- This is a Feishu platform constraint, not an OpenClaw filter -+- The "self-loop filter bypass" question is moot -- there is nothing to bypass because bot messages simply do not generate inbound events for other bots -+ -+**Implication**: Cross-bot messaging via Feishu API cannot trigger another Agent. The only way to trigger Agent-B from Agent-A remains `sessions_send` (OpenClaw's internal A2A mechanism). -+ -+### 2.3 Implications for Two-Step Trigger -+ -+**Confidence: HIGH** -+ -+The current A2A two-step trigger cannot be simplified to one step via cross-bot messaging: -+ -+| Step | Current (single-bot) | Multi-bot mode | Change? 
| -+|------|---------------------|----------------|---------| -+| Step 1: Post visible root message in target channel | Bot posts in target group | Bot-A posts in Bot-B's group | **Same** (visibility anchor) | -+| Step 2: `sessions_send` to trigger target agent | Required (bot self-messages are ignored) | **Still required** (Feishu does not deliver bot messages to other bots) | **No change** | -+ -+However, multi-bot mode does provide these improvements: -+- **Visual clarity**: Each Agent's messages appear under a distinct bot name/avatar, making A2A exchanges easier to follow -+- **API quota independence**: Each bot has its own rate limits, preventing a chatty Agent from starving others -+- **Permission isolation**: Different Agents can have different Feishu permission scopes -+ -+The `sessions_send` mechanism works via OpenClaw's `INTERNAL_MESSAGE_CHANNEL` with `deliver: false`, meaning it routes entirely within the OpenClaw runtime without touching Feishu APIs. This is efficient and reliable regardless of bot configuration. 
-+ -+### 2.4 Config Examples -+ -+Multi-account Feishu config with topic session isolation: -+ -+```json -+{ -+ "channels": { -+ "feishu": { -+ "domain": "feishu", -+ "connectionMode": "websocket", -+ "groupSessionScope": "group_topic", -+ "accounts": { -+ "cos-bot": { -+ "name": "CoS Chief of Staff", -+ "appId": "cli_cos_xxxxx", -+ "appSecret": "your-cos-secret", -+ "enabled": true -+ }, -+ "cto-bot": { -+ "name": "CTO Tech Partner", -+ "appId": "cli_cto_xxxxx", -+ "appSecret": "your-cto-secret", -+ "enabled": true -+ }, -+ "builder-bot": { -+ "name": "Builder Executor", -+ "appId": "cli_build_xxxxx", -+ "appSecret": "your-builder-secret", -+ "enabled": true -+ } -+ } -+ } -+ }, -+ "bindings": [ -+ { -+ "agentId": "cos", -+ "match": { -+ "channel": "feishu", -+ "accountId": "cos-bot", -+ "peer": { "kind": "group", "id": "<FEISHU_GROUP_ID_HQ>" } -+ } -+ }, -+ { -+ "agentId": "cto", -+ "match": { -+ "channel": "feishu", -+ "accountId": "cto-bot", -+ "peer": { "kind": "group", "id": "<FEISHU_GROUP_ID_CTO>" } -+ } -+ }, -+ { -+ "agentId": "builder", -+ "match": { -+ "channel": "feishu", -+ "accountId": "builder-bot", -+ "peer": { "kind": "group", "id": "<FEISHU_GROUP_ID_BUILD>" } -+ } -+ } -+ ] -+} -+``` -+ -+**Important caveat**: A known bug (Issue #47436) in OpenClaw 2026.3.13 causes the Feishu plugin to crash when a second account uses `SecretRef` for `appSecret`. A fix has been submitted in PR #47652 (wraps per-account errors in try-catch). Until this is merged, use plaintext secrets or wait for the patch. -+ -+--- -+ -+## 3. Multi-Agent Collaboration Potential -+ -+### 3.1 Multiple Bots in Same Group/Topic -+ -+**Confidence: MEDIUM** -+ -+Multiple Feishu bots CAN coexist in the same group. The behaviors: -+ -+- **User messages**: All bots receive the event. OpenClaw's cross-account dedup ensures only one processes it (first-claim-wins). With `requireMention: true`, bots only respond when explicitly @mentioned, which is the cleanest pattern. 
-+- **Bot messages**: No bot receives events from other bots (Feishu platform limitation). Cross-bot conversation within a group is therefore not possible via Feishu events alone. -+- **Recommended pattern**: Each Agent's group should contain only its own bot. If multi-Agent collaboration is needed within a single group, use `sessions_send` for triggering and Feishu API messages for visibility only. -+ -+### 3.2 Discussion Patterns in Feishu -+ -+**Confidence: MEDIUM** -+ -+With `groupSessionScope: "group_topic"`, Feishu topics enable a workflow analogous to Slack threads: -+ -+1. **Task initiation**: Human or Agent creates a new topic in the Agent's group (this becomes the session root) -+2. **Execution**: Agent works within the topic, maintaining isolated context -+3. **A2A delegation**: Agent-A posts a topic in Agent-B's group (visibility anchor), then uses `sessions_send` to trigger Agent-B in that topic's session -+4. **Parallel tasks**: Multiple topics in the same group run independently -+ -+The key difference from Slack: Feishu topics in standard groups are less prominent in the UI than Slack threads. Feishu "topic groups" (话题群) are a special group type where all messages must belong to a topic -- this would be the ideal group type for OpenCrew's use case, as it enforces topic-based organization. 
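The isolation that makes parallel topics independent can be illustrated with a small self-contained sketch of the key formats from the table in section 1.1. It is a model under stated assumptions, not OpenClaw's actual implementation: the function name `sketchConversationId` and the type names are hypothetical, and the `group` and `group_sender` branches are inferred from the table rather than taken from source code.

```typescript
// Scope values as listed in the section 1.1 table.
type SessionScope = "group" | "group_sender" | "group_topic" | "group_topic_sender";

interface KeyParams {
  chatId: string;
  scope: SessionScope;
  senderOpenId?: string;
  topicId?: string;
}

// Hypothetical model of the documented key formats; not the plugin's real code.
function sketchConversationId({ chatId, scope, senderOpenId, topicId }: KeyParams): string {
  // Assumption: without a sender ID, sender-scoped keys fall back to the group key.
  const senderKey = senderOpenId ? `${chatId}:sender:${senderOpenId}` : chatId;
  switch (scope) {
    case "group":
      return chatId;
    case "group_sender":
      return senderKey;
    case "group_topic":
      // Falls back to the group-level key when the message has no topic.
      return topicId ? `${chatId}:topic:${topicId}` : chatId;
    case "group_topic_sender":
      // Cascading fallback: topic+sender, then topic, then sender, then group.
      if (!topicId) return senderKey;
      return senderOpenId
        ? `${chatId}:topic:${topicId}:sender:${senderOpenId}`
        : `${chatId}:topic:${topicId}`;
  }
}

// Two topics in the same group map to different sessions...
const taskA = sketchConversationId({ chatId: "oc_xxx", scope: "group_topic", topicId: "om_root_123" });
const taskB = sketchConversationId({ chatId: "oc_xxx", scope: "group_topic", topicId: "om_root_456" });
// ...while a mainline message (no topic) keeps the group-level key.
const mainline = sketchConversationId({ chatId: "oc_xxx", scope: "group_topic" });

console.log(taskA); // oc_xxx:topic:om_root_123
console.log(taskB); // oc_xxx:topic:om_root_456
console.log(mainline); // oc_xxx
```

In this model, a `sessions_send` call aimed at a delegated task would need the topic-suffixed key, while existing mainline sessions keep their original key.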
-+ -+### 3.3 Comparison with Slack Capabilities -+ -+| Capability | Slack | Feishu (with groupSessionScope) | -+|---|---|---| -+| Thread/topic isolation | Native (thread = session) | Now available via `group_topic` scope | -+| Bot self-loop filter | Bot ignores own messages (configurable) | Platform-level: bot events only for user messages | -+| Cross-bot triggering | Not possible (single bot identity) | Not possible (bot messages invisible to other bots) | -+| A2A trigger mechanism | `sessions_send` (required) | `sessions_send` (required) | -+| Visual identity | Single bot, shared name | Multi-bot, distinct names/avatars | -+| Thread UI prominence | High (native threading) | Medium (topic groups better than standard groups) | -+ -+--- -+ -+## 4. Migration Path -+ -+### 4.1 Single-App to Multi-App -+ -+**Confidence: MEDIUM** (logical analysis, not tested end-to-end) -+ -+Migration steps: -+ -+1. **Create new Feishu apps** for each Agent (follow existing FEISHU_SETUP.md Steps 1-3 for each) -+2. **Update openclaw.json** to use `accounts` format instead of top-level `appId`/`appSecret`: -+ -+ Before (single-app): -+ ```json -+ { -+ "channels": { -+ "feishu": { -+ "appId": "cli_original_xxx", -+ "appSecret": "original-secret" -+ } -+ } -+ } -+ ``` -+ -+ After (multi-app): -+ ```json -+ { -+ "channels": { -+ "feishu": { -+ "accounts": { -+ "legacy": { -+ "appId": "cli_original_xxx", -+ "appSecret": "original-secret", -+ "enabled": true -+ }, -+ "cto-bot": { -+ "appId": "cli_cto_xxx", -+ "appSecret": "cto-secret", -+ "enabled": true -+ } -+ } -+ } -+ } -+ } -+ ``` -+ -+3. **Add `accountId` to bindings** incrementally -- start with one Agent, verify, then expand -+4. **Add each new bot to its group** in Feishu settings -+5. 
**Enable topic sessions** with `groupSessionScope: "group_topic"` (can be done independently of multi-app migration) -+ -+### 4.2 Session Key Compatibility -+ -+**Confidence: LOW** (requires testing) -+ -+When migrating from single-app to multi-app, session keys may change format: -+ -+- **Old format**: `agent:cto:feishu:group:oc_xxx` (no accountId component) -+- **New format**: Potentially `agent:cto:feishu:cto-bot:group:oc_xxx` (with accountId) -+ -+If the session key changes, existing conversation history associated with old session keys becomes orphaned. The Agent starts with a fresh session in the new key. -+ -+**Mitigation strategies**: -+- Keep the original app as the `legacy` account and migrate agents one at a time -+- Use `session.resetByType` to explicitly reset group sessions during migration (treat it as a clean-slate moment) -+- Back up `~/.openclaw/sessions/` before migration -+ -+When adding `groupSessionScope: "group_topic"`: -+- Messages in the group mainline (no topic) continue using the base `chatId` key -- unchanged -+- Only messages within topics get the new `chatId:topic:topicId` key -+- This is backward-compatible: existing mainline sessions are unaffected -+ -+### 4.3 Rollback Strategy -+ -+1. **Config rollback**: Restore from backup (`openclaw.json.bak.<timestamp>`) -+2. **Bot rollback**: Remove new bots from groups; the original bot remains functional -+3. **Session rollback**: Session data for the old key format is preserved -- reverting config restores old routing -+4. **Gateway restart**: `openclaw gateway restart` applies all changes -+ -+The migration is designed to be incremental and reversible at each step. -+ -+--- -+ -+## 5. 
Confidence Assessment -+ -+| Finding | Confidence | Source | -+|---|---|---| -+| `groupSessionScope: "group_topic"` creates per-topic sessions | **HIGH** | OpenClaw source code (`buildFeishuConversationId`), PR #29791, DeepWiki analysis | -+| `threadSession` is a shorthand/alias for topic session isolation | **MEDIUM** | Issue #31 comment + correlation with `topicSessionMode` legacy config; exact alias mechanism not found in source | -+| Feishu `im.message.receive_v1` only fires for user messages | **HIGH** | Official Feishu Open Platform documentation | -+| Bot-to-bot messages do NOT trigger other bots | **HIGH** | Feishu platform documentation: "sender_type currently only supports user" | -+| A2A two-step trigger cannot become one-step | **HIGH** | Combination of Feishu platform constraint + OpenClaw `sessions_send` architecture | -+| Multi-account config format with `accounts` block | **HIGH** | OpenClaw source code, config schema, DeepWiki analysis | -+| Cross-account dedup via broadcast namespace | **HIGH** | OpenClaw source code, test cases documented in DeepWiki | -+| Multi-account SecretRef crash bug (Issue #47436) | **HIGH** | GitHub issue with reproduction steps and submitted fix | -+| Session key format change during migration | **LOW** | Theoretical analysis; needs empirical testing | -+| `groupSessionScope` can be set per-group | **MEDIUM** | Config schema supports it; not tested in practice | -+ -+--- -+ -+## 6. Open Questions -+ -+1. **Exact `threadSession` config path**: The Issue #31 comment references `openclaw config set channels.feishu.threadSession true`, but the canonical config key found in OpenClaw source is `groupSessionScope`. Is `threadSession` a CLI shorthand that resolves to `groupSessionScope: "group_topic"`? Or is it specific to the community plugin (`AlexAnys/feishu-openclaw`)? This needs verification against the actual CLI behavior. -+ -+2. 
**Session key migration**: When adding `accountId` to bindings, does the session key incorporate the accountId? If so, what happens to existing sessions? This needs empirical testing. -+ -+3. **Topic group type**: Feishu distinguishes between standard groups (topics optional) and "topic groups" (话题群, topics mandatory). Which type works better with OpenCrew? Does `groupSessionScope` work identically in both? -+ -+4. **Announce step behavior**: When Agent-A uses `sessions_send` to trigger Agent-B, the "announce step" posts a summary to the target channel. With multi-bot mode, which bot identity is used for the announce post -- Agent-B's bot (correct) or a shared bot? This affects visual clarity of A2A exchanges. -+ -+5. **Rate limiting with many accounts**: Each Feishu app has independent API quotas. However, the health check ping interval (noted in `docs/api-quota-fix.md`) consumes API calls per account. With 7 agents = 7 apps, health check overhead may be significant. What is the optimal ping interval? -+ -+6. **Community plugin vs built-in**: The `AlexAnys/feishu-openclaw` community plugin does NOT support `groupSessionScope` (it uses a simpler `chatId`-only session key with `threading.resolveReplyToMode: "off"`). OpenCrew's current setup guide references this community plugin. Is the project already using the built-in plugin (OpenClaw >= 2026.2), or does it need to migrate? -+ -+7. **Cross-account broadcast in shared groups**: If a "shared collaboration group" with multiple bots is desired (e.g., a "war room"), how should the broadcast dedup be configured? Should one bot be designated as the "listener" with others set to `requireMention: true`? 
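
Open question 2 turns on whether the account ID becomes part of the session key. The two candidate shapes from section 4.2 can be compared with a minimal sketch. It assumes the key is a plain colon-joined string and that a topic is appended as `chatId:topic:topicId`; `buildSessionKey` and `SessionKeyParts` are hypothetical names for illustration, not OpenClaw's API, and the accountId segment is the LOW-confidence hypothesis rather than confirmed behavior.

```typescript
// Hypothetical model of the session key shapes discussed in section 4.2.
// None of these names come from OpenClaw; they exist only for this sketch.
interface SessionKeyParts {
  agentId: string;
  platform: string;
  accountId?: string; // present only in the hypothesised multi-app format
  chatId: string;
  topicId?: string; // present only for messages inside a topic
}

function buildSessionKey(parts: SessionKeyParts): string {
  // Mainline messages keep the bare chatId; topic messages get chatId:topic:topicId.
  const conversationId = parts.topicId
    ? `${parts.chatId}:topic:${parts.topicId}`
    : parts.chatId;

  const segments = ["agent", parts.agentId, parts.platform];
  if (parts.accountId) {
    segments.push(parts.accountId); // assumption: accountId sits before "group"
  }
  segments.push("group", conversationId);
  return segments.join(":");
}

const oldKey = buildSessionKey({ agentId: "cto", platform: "feishu", chatId: "oc_xxx" });
// oldKey === "agent:cto:feishu:group:oc_xxx"

const newKey = buildSessionKey({
  agentId: "cto",
  platform: "feishu",
  accountId: "cto-bot",
  chatId: "oc_xxx",
});
// newKey === "agent:cto:feishu:cto-bot:group:oc_xxx"

// If the two keys differ, history stored under oldKey is not found under newKey.
const historyOrphaned = oldKey !== newKey;
```

If empirical testing shows the key does not include the account ID, `oldKey` and `newKey` would be identical and the orphaning concern would not apply.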
-+ -+--- -+ -+## Sources -+ -+- [OpenClaw Feishu documentation](https://docs.openclaw.ai/channels/feishu) -+- [OpenClaw GitHub - Feishu docs](https://github.com/openclaw/openclaw/blob/main/docs/channels/feishu.md) -+- [AlexAnys/openclaw-feishu (community plugin)](https://github.com/AlexAnys/openclaw-feishu) -+- [AlexAnys/feishu-openclaw (bridge)](https://github.com/AlexAnys/feishu-openclaw) -+- [Issue #29791: Thread-based replies in Feishu](https://github.com/openclaw/openclaw/issues/29791) -- closed, resolved via PR #29788 -+- [Issue #8692: Multi-bot routing issues](https://github.com/openclaw/openclaw/issues/8692) -+- [Issue #47436: Multi-account SecretRef crash](https://github.com/openclaw/openclaw/issues/47436) -+- [Feishu Open Platform - Receive message event](https://open.feishu.cn/document/server-docs/im-v1/message/events/receive) -+- [DeepWiki - OpenClaw Session Management](https://deepwiki.com/openclaw/openclaw/2.4-session-management) -+- [DeepWiki - AlexAnys/openclaw-feishu](https://deepwiki.com/AlexAnys/openclaw-feishu) -+- [OpenCrew Issue #31: Feishu multi-agent bot routing](https://github.com/AlexAnys/opencrew/issues/31) diff --git a/.harness/reports/research_platform_limitations_r1.md b/.harness/reports/research_platform_limitations_r1.md deleted file mode 100644 index fee64d8..0000000 --- a/.harness/reports/research_platform_limitations_r1.md +++ /dev/null @@ -1,109 +0,0 @@ -# Why Discord and Feishu Cannot Match Slack's Cross-Bot Collaboration - -**Date**: 2026-03-27 -**Purpose**: Definitive answer to why Slack supports cross-bot collaboration but Discord and Feishu do not. - ---- - -## How Slack Works (the baseline) - -Slack's architecture enables cross-bot collaboration through five properties working together: - -1. **Independent bot identities**: Each agent gets its own Slack App with a unique bot user ID. -2. **Per-bot self-loop filter**: OpenClaw checks `message.user === ctx.botUserId` per-account. 
Bot-CTO (user ID `U_CTO`) is NOT filtered by Bot-Builder's handler (which only filters `U_BUILDER`). -3. **`allowBots: true`**: Enables processing messages authored by other bots. -4. **Per-account channel config**: Each account can have different `requireMention` settings, preventing uncontrolled loops while allowing targeted @mention-based triggering. -5. **Platform-level event delivery**: Slack's Events API delivers all channel messages to all subscribed apps, including messages from other bots. - -The result: Bot-CTO can @mention Bot-Builder in a thread, Bot-Builder's OpenClaw handler receives it as a legitimate inbound message, processes it, and responds -- all visible to humans as a natural conversation between distinct identities. - ---- - -## Discord: Blocked by TWO OpenClaw Code Bugs - -### The blocker is NOT a Discord platform limitation. - -Discord's platform fully supports cross-bot messaging. When Bot-A posts in a channel, Bot-B receives the `MESSAGE_CREATE` gateway event (provided Bot-B has View Channel permission and Message Content Intent enabled). The `message.author.bot` flag identifies it as a bot message. This is identical to Slack's behavior. - -### Bug 1: Issue #11199 -- Bot filter treats all configured bots as "self" - -**What happens at the code level**: OpenClaw's Discord plugin checks the message author ID against ALL configured bot account IDs in the instance, not just the receiving account's own ID. When Bot-A posts a message, Bot-B's handler sees Bot-A's user ID in its "known bot IDs" list and drops the message as if it were a self-loop. - -**Contrast with Slack**: The Slack plugin checks `message.user === ctx.botUserId` where `ctx.botUserId` is the specific bot user ID of THAT account. Different accounts have different `botUserId` values, so cross-bot messages pass through. The Discord plugin lacks this per-account scoping. - -**Fix status**: Three PRs were submitted to fix this (#11644, #22611, #35479). 
All three were CLOSED without merging. The issue itself was auto-closed on 2026-03-08 by a stale bot due to inactivity -- it was NOT fixed. - -**Workaround**: A community workaround exists: set `allowBots: true` + `requireMention: false` + per-channel `users` whitelist. This works but requires disabling mention gating entirely, which removes the primary loop-prevention mechanism. - -### Bug 2: Issue #45300 -- `requireMention` broken in multi-account Discord - -**What happens**: When multiple Discord bot accounts are configured, the `requireMention: true` check drops ALL guild messages at the preflight stage with reason "no-mention" -- even when the bot IS explicitly @mentioned. The mention detection logic fails to correctly resolve mentions against the receiving bot's user ID in multi-account configurations. - -**Why this matters**: Even if #11199 were fixed, the recommended safe pattern (`allowBots: true` + `requireMention: true`) would still not work. Without `requireMention`, every bot message in the channel triggers every other bot, creating infinite loops. - -**Status**: Issue is still OPEN. No fix PR identified. - -### What would need to change - -1. Fix the self-loop filter to be per-account (check author ID against only the receiving account's bot user ID, not all configured bot IDs). -2. Fix mention detection in multi-account mode to correctly identify @mentions of the receiving bot. -3. Both fixes are straightforward code changes -- they align Discord's behavior with Slack's existing implementation. - -### Timeline - -**Uncertain.** All three fix PRs for #11199 were closed without merge, and the issue was auto-closed as stale. The OpenClaw maintainers have not signaled intent to prioritize this. Given that `sessions_send` (internal A2A routing) is the officially recommended pattern, channel-based cross-bot communication may not be considered a priority. 
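
The difference between the per-account filter described for Slack and the global filter described in Issue #11199 can be illustrated with a small model. This is a sketch under stated assumptions, not OpenClaw's source: `InboundMessage`, `AccountContext`, and both filter functions are hypothetical names, and the user IDs are placeholders.

```typescript
// Hypothetical model of the two self-loop filter behaviours described above.
interface InboundMessage {
  authorId: string;
  authorIsBot: boolean;
}

interface AccountContext {
  botUserId: string; // the receiving account's own bot user ID
  allowBots: boolean;
}

// Per-account filter: only messages authored by THIS account's bot are dropped.
function passesPerAccountFilter(msg: InboundMessage, ctx: AccountContext): boolean {
  if (msg.authorId === ctx.botUserId) return false; // genuine self-loop
  if (msg.authorIsBot && !ctx.allowBots) return false; // other bots need allowBots
  return true;
}

// Global filter: any configured bot counts as "self", so cross-bot messages are dropped.
function passesGlobalFilter(
  msg: InboundMessage,
  ctx: AccountContext,
  allConfiguredBotIds: string[],
): boolean {
  if (allConfiguredBotIds.some((id) => id === msg.authorId)) return false;
  if (msg.authorIsBot && !ctx.allowBots) return false;
  return true;
}

// Bot-CTO posts; Bot-Builder's handler decides whether to process the message.
const messageFromCto: InboundMessage = { authorId: "U_CTO", authorIsBot: true };
const builderContext: AccountContext = { botUserId: "U_BUILDER", allowBots: true };
const configuredBots = ["U_CTO", "U_BUILDER"];

const perAccountResult = passesPerAccountFilter(messageFromCto, builderContext); // true
const globalResult = passesGlobalFilter(messageFromCto, builderContext, configuredBots); // false
```

In this model the only change needed to move from the second behaviour to the first is the scope of the ID comparison, which matches the report's description of the fix as aligning Discord with the Slack implementation.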
- ---- - -## Feishu: Blocked by a Platform-Level API Limitation - -### The blocker IS a Feishu platform limitation. It cannot be fixed by OpenClaw. - -### The technical constraint - -Feishu's `im.message.receive_v1` event -- the only event type for receiving chat messages -- explicitly delivers ONLY user-sent messages. The official documentation states: - -> "目前只支持用户(user)发送的消息" -> ("Currently only supports messages sent by users") - -> "可接收与机器人所在群聊会话中用户发送的所有消息(不包含机器人发送的消息)" -> ("Can receive all messages sent by users in group chats where the bot participates, excluding messages sent by the bot") - -When Bot-CTO posts a message in a group, Bot-Builder does NOT receive any event. The message is simply invisible to other bots at the API level. There is no `allowBots` flag or configuration that can change this -- the events are never generated by Feishu's servers. - -### Are there alternative APIs? - -No viable alternative exists: - -- **`im.message.receive_v1`** is the only message reception event. There is no `im.message.receive_bot_v1` or equivalent. -- **Message list API** (`GET /im/v1/messages`): Could theoretically poll for messages, but this is a REST endpoint, not a real-time event. Polling introduces latency, complexity, and API quota consumption. It also cannot distinguish which messages have already been processed. -- **Feishu's event system** has no event type for "bot message posted in group." The platform was designed with a user-to-bot interaction model, not a bot-to-bot model. -- Searching for alternative approaches (e.g., "飞书 机器人消息 其他机器人接收") confirms this is a well-known and accepted limitation of the Feishu platform with no documented workaround. - -### What would need to change - -Feishu (ByteDance/Lark) would need to add a new event type or extend `im.message.receive_v1` to include bot-sent messages with an opt-in flag. There is no public indication this is planned. 
- ---- - -## Summary Table - -| Dimension | Slack | Discord | Feishu | -|-----------|-------|---------|--------| -| Platform delivers cross-bot messages? | YES | YES | **NO** | -| OpenClaw processes cross-bot messages? | YES (per-account self-loop filter) | **NO** (global bot filter bug #11199) | N/A (no events to process) | -| Mention gating works in multi-account? | YES | **NO** (broken, #45300) | N/A | -| Blocker type | None | **Code bugs** (fixable) | **Platform limitation** (unfixable by us) | -| Fix complexity | N/A | Low (align with Slack's implementation) | Requires Feishu platform change | -| Fix timeline | N/A | Uncertain (PRs closed, issue stale) | No indication from Feishu | -| Current workaround | N/A | `allowBots: true` + `requireMention: false` + `users` whitelist (loop risk) | `sessions_send` only | - -### The bottom line - -- **Discord** could work exactly like Slack if two code bugs in OpenClaw were fixed. The Discord platform itself is fully capable. The fixes are straightforward but have not been prioritized by OpenClaw maintainers. -- **Feishu** cannot work like Slack regardless of any code changes. The limitation is baked into Feishu's event delivery architecture. Only `sessions_send` (OpenClaw's internal routing) can achieve cross-agent triggering on Feishu. -- **Both platforms** fully support the Delegation pattern (anchor message + `sessions_send`). Only the Discussion pattern (autonomous cross-bot conversation visible in chat) is blocked. 
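
The summary table can also be read as a decision rule. The sketch below encodes the three capability rows as data and derives which A2A modes are usable on each platform. It is an illustration only: the type and function names are hypothetical, the table's "N/A" cells for Feishu are modelled as `false`, and the Slack result reflects this report's table rather than a completed POC.

```typescript
// Hypothetical encoding of the summary table; names are illustrative only.
type ChatPlatform = "slack" | "discord" | "feishu";
type A2AMode = "delegation" | "discussion";

interface CrossBotCapability {
  platformDeliversCrossBot: boolean; // does the platform deliver bot-authored messages?
  openclawProcessesCrossBot: boolean; // does OpenClaw accept them (per-account filter)?
  mentionGatingWorks: boolean; // does requireMention work with multiple accounts?
}

const capabilityTable: Record<ChatPlatform, CrossBotCapability> = {
  slack: { platformDeliversCrossBot: true, openclawProcessesCrossBot: true, mentionGatingWorks: true },
  discord: { platformDeliversCrossBot: true, openclawProcessesCrossBot: false, mentionGatingWorks: false },
  // "N/A" in the table: with no events delivered, the later checks cannot succeed.
  feishu: { platformDeliversCrossBot: false, openclawProcessesCrossBot: false, mentionGatingWorks: false },
};

function availableModes(platform: ChatPlatform): A2AMode[] {
  const cap = capabilityTable[platform];
  // Delegation relies on sessions_send, which the report says works on all three.
  const modes: A2AMode[] = ["delegation"];
  if (cap.platformDeliversCrossBot && cap.openclawProcessesCrossBot && cap.mentionGatingWorks) {
    modes.push("discussion");
  }
  return modes;
}

const slackModes = availableModes("slack"); // ["delegation", "discussion"]
const discordModes = availableModes("discord"); // ["delegation"]
const feishuModes = availableModes("feishu"); // ["delegation"]
```

Flipping the two Discord flags to `true` in this model yields the same result as Slack, while no OpenClaw-side flag change affects Feishu, which is the distinction the bottom line draws.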
- ---- - -*Sources: OpenClaw Issues #11199, #45300, #15836; PRs #11644, #22611, #35479; Feishu Open Platform docs (open.feishu.cn); OpenClaw source code verification (verify_source_code_r1.md); QA verification (qa_a2a_research_r1.md)* diff --git a/.harness/reports/research_selective_agents_r1.md b/.harness/reports/research_selective_agents_r1.md deleted file mode 100644 index 0739e21..0000000 --- a/.harness/reports/research_selective_agents_r1.md +++ /dev/null @@ -1,501 +0,0 @@ -# Research: Selective Independent Agents Architecture - -> Researcher: Claude Opus 4.6 | Date: 2026-03-27 | Scope: Three architectural questions -- orchestrator role, hybrid bot architecture, instance vs workspace independence - ---- - -## Executive Summary - -**Question 1 (Orchestrator)**: CoS is the correct orchestrator, not CTO. In Anthropic's Harness Design, the orchestrator is an **external script** (not a participant agent). In OpenCrew's Slack-native context, CoS maps most naturally to this role: it represents the user's intent, drives strategy forward, and does not do execution work. CTO should be a Generator/participant, not the orchestrator. - -**Question 2 (Hybrid Architecture)**: The proposed hybrid model -- single bot for execution agents + independent CoS bot + independent QA bot -- is technically feasible within a single OpenClaw gateway using multi-account mode. The key insight: one account CAN serve multiple agents via peer binding (channel-to-agent), while other accounts each serve one agent via account binding. Two bots CAN coexist in the same channel when `allowBots: true` + `requireMention: true` is set. The @mention-back flow works without changing existing channel configs, provided `allowBots: true` is added to channels where cross-bot interaction is needed. - -**Question 3 (Instance vs Workspace)**: A single OpenClaw gateway with multi-account is strongly recommended over separate instances. Multi-account means multiple Slack Apps managed by one gateway process. 
This preserves `sessions_send` interoperability, shared config, and single-process management. Separate instances would break A2A tools across process boundaries. - ---- - -## 1. Orchestration: CoS vs CTO vs External Harness - -### Anthropic Harness Design Analysis - -The Anthropic Harness Design methodology (as implemented in Claude Code Agent Teams and documented in the harness-design skill) follows a clear separation: - -| Component | Role | Is an Agent? | -|-----------|------|--------------| -| **Harness** (script) | Orchestrator -- decides what runs next, parses outputs, manages flow | NO -- it is external code | -| **Planner** | Analyzes requirements, produces spec/plan | YES -- spawned by harness | -| **Builder/Generator** | Executes the plan, produces artifacts | YES -- spawned by harness | -| **QA/Evaluator** | Reviews outputs, challenges quality, catches issues | YES -- spawned by harness | - -Key architectural principle: **No single agent is both a participant AND the orchestrator.** The harness is not an LLM agent -- it is a deterministic script that reads outputs and decides the next step. This prevents: -- Orchestrator bias (an agent-orchestrator favors its own perspective) -- Context pollution (orchestration logic competes with domain reasoning) -- Role confusion (is the agent thinking about the problem or about who to call next?) - -### Mapping to OpenCrew Roles - -The current A2A_PROTOCOL.md and architecture_collab_r1.md designate CTO as the discussion orchestrator for technical discussions. But this creates a problem identified in the Harness Design: - -**CTO as orchestrator = CTO is both participant and controller.** When CTO drives an architecture review, it is simultaneously: -1. Proposing the technical approach (Generator role) -2. Deciding who speaks next (Orchestrator role) -3. Evaluating Builder's response (Evaluator role) - -This triple-hat violates the Harness Design's core separation. 
CTO's technical opinions will bias which agents it calls and how it frames questions. - -**CoS maps naturally to the Harness's orchestrator role:** - -| Harness Concept | CoS Mapping | Evidence | -|----------------|-------------|---------| -| External to generation | CoS does NOT do technical implementation | SOUL.md: "you are not a gateway, not a doer" | -| Represents user intent | CoS's core role is "deep intent alignment" | SOUL.md: "strategic partner who drives things forward when user is away" | -| Decides what runs next | CoS determines priorities and delegation | AGENTS.md: "strategic tradeoff + pacing + coordination" | -| Reads outputs, routes decisions | CoS synthesizes and routes to CTO/CIO | ARCHITECTURE.md: "CoS evaluates/delegates to CTO/CIO" | -| Does not participate in generation | CoS does not write code, do research, or build | SOUL.md: "push main thread, lower cognitive load" | - -The user's insight is correct: CoS's SOUL.md description -- "strategic partner who drives things forward when the user is away" -- is almost word-for-word the harness's role description. - -**But CoS is not a pure external script -- it is an LLM agent.** This is a key difference from the Harness Design. In OpenCrew's Slack-native architecture, a non-agent orchestrator would be a Slack bot with hardcoded routing logic, which loses the strategic judgment that makes CoS valuable. The pragmatic solution: CoS acts as orchestrator but with strict role discipline: - -- CoS **decides** who to engage and what to ask (orchestrator hat) -- CoS **does not** propose technical solutions or challenge technical details (no generator/evaluator hat) -- CoS **synthesizes** outcomes and aligns with user intent (unique CoS value-add) - -### Recommendation - -**CoS should be the orchestrator. CTO should be a participant/generator.** - -This means: -1. **Discussion mode**: CoS @mentions CTO, Builder, CIO as needed. CTO responds with technical input but does not decide who speaks next. -2. 
**Delegation mode**: CoS delegates to CTO via A2A. CTO then orchestrates within its execution scope (CTO-to-Builder), which is fine -- this is scoped orchestration of subordinates, not strategic orchestration. -3. **QA as independent evaluator**: QA reviews outputs without being called by the generator (CTO/Builder). This mirrors the Harness's Evaluator independence. - -The existing Permission Matrix (CoS -> CTO only, CTO -> Builder/Research/KO/Ops) already supports this. The change is conceptual: CTO stops being the Discussion orchestrator and becomes a discussion participant. - ---- - -## 2. Hybrid Architecture: Single Bot + Selective Independent Agents - -### 2.1 Config Feasibility - -**Proposed model:** -- `accounts.default` (single Slack App) -- serves CTO, CIO, Builder, KO, Ops, Research via peer binding per channel -- `accounts.cos` (independent Slack App) -- serves CoS only via account binding -- `accounts.qa` (independent Slack App) -- serves QA only via account binding - -**Does this work?** Yes. OpenClaw's binding system supports mixing peer-match and account-match bindings in the same config. The binding resolution order is: - -1. **Peer match** (most specific): `match.peer.kind = "channel", match.peer.id = "<CHANNEL_ID>"` -- routes messages from a specific Slack channel to a specific agent -2. **Account match**: `match.accountId = "cos"` -- routes ALL messages received by the CoS Slack App to the CoS agent -3. **Fallback**: unmatched messages go to the default agent - -The current OpenCrew config (CONFIG_SNIPPET_2026.2.9.md) uses peer binding exclusively -- each agent is bound to its channel via `match.peer`. This binding method is **account-agnostic** -- it works on whichever Slack App receives the event. In the current single-bot setup, all events come through one bot, and peer binding routes them to the correct agent by channel. 
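
The resolution order above can be sketched as a small function. This is a simplified model under stated assumptions, not OpenClaw's resolver: `Binding`, `SlackEvent`, `resolveAgent`, the channel IDs, and the `main` fallback agent are all hypothetical. It applies the three-step order literally and does not model how each account's event stream is processed through its own binding chain.

```typescript
// Hypothetical model of the binding resolution order: peer, then account, then fallback.
interface Binding {
  agentId: string;
  peerChannelId?: string; // models match.peer.id with kind "channel"
  accountId?: string; // models match.accountId
}

interface SlackEvent {
  accountId: string; // which Slack App received the event
  channelId: string; // which channel the message was posted in
}

function resolveAgent(evt: SlackEvent, bindings: Binding[], fallbackAgentId: string): string {
  // 1. Peer match (most specific): the channel ID decides the agent.
  for (const binding of bindings) {
    if (binding.peerChannelId !== undefined && binding.peerChannelId === evt.channelId) {
      return binding.agentId;
    }
  }
  // 2. Account match: everything received by this account goes to one agent.
  for (const binding of bindings) {
    if (binding.peerChannelId === undefined && binding.accountId === evt.accountId) {
      return binding.agentId;
    }
  }
  // 3. Fallback: unmatched events go to the default agent.
  return fallbackAgentId;
}

const exampleBindings: Binding[] = [
  { agentId: "cos", accountId: "cos" },
  { agentId: "cto", peerChannelId: "C_CTO" },
  { agentId: "builder", peerChannelId: "C_BUILD" },
];

const viaPeer = resolveAgent({ accountId: "default", channelId: "C_CTO" }, exampleBindings, "main"); // "cto"
const viaAccount = resolveAgent({ accountId: "cos", channelId: "C_HQ" }, exampleBindings, "main"); // "cos"
const viaFallback = resolveAgent({ accountId: "default", channelId: "C_OTHER" }, exampleBindings, "main"); // "main"
```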
- -**The hybrid config would look like:** - -```jsonc -{ - "channels": { - "slack": { - "accounts": { - // Single bot for execution agents (existing App) - "default": { - "botToken": "${SLACK_BOT_TOKEN_DEFAULT}", - "appToken": "${SLACK_APP_TOKEN_DEFAULT}" - }, - // Independent CoS bot (new App) - "cos": { - "botToken": "${SLACK_BOT_TOKEN_COS}", - "appToken": "${SLACK_APP_TOKEN_COS}" - }, - // Independent QA bot (new App) - "qa": { - "botToken": "${SLACK_BOT_TOKEN_QA}", - "appToken": "${SLACK_APP_TOKEN_QA}" - } - }, - "channels": { - // Existing agent channels -- unchanged - "<HQ_CHANNEL_ID>": { "allow": true, "requireMention": false }, - "<CTO_CHANNEL_ID>": { "allow": true, "requireMention": false, "allowBots": true }, - "<BUILD_CHANNEL_ID>": { "allow": true, "requireMention": false, "allowBots": true }, - "<INVEST_CHANNEL_ID>": { "allow": true, "requireMention": false }, - "<KNOW_CHANNEL_ID>": { "allow": true, "requireMention": true }, - "<OPS_CHANNEL_ID>": { "allow": true, "requireMention": true }, - "<RESEARCH_CHANNEL_ID>": { "allow": true, "requireMention": false } - } - } - }, - "bindings": [ - // CoS: account-level binding (all events from CoS App -> CoS agent) - { "agentId": "cos", "match": { "channel": "slack", "accountId": "cos" } }, - - // QA: account-level binding (all events from QA App -> QA agent) - { "agentId": "qa", "match": { "channel": "slack", "accountId": "qa" } }, - - // Execution agents: peer binding on default account (unchanged from current) - { "agentId": "cto", "match": { "channel": "slack", "peer": { "kind": "channel", "id": "<CTO_CHANNEL_ID>" } } }, - { "agentId": "builder", "match": { "channel": "slack", "peer": { "kind": "channel", "id": "<BUILD_CHANNEL_ID>" } } }, - { "agentId": "cio", "match": { "channel": "slack", "peer": { "kind": "channel", "id": "<INVEST_CHANNEL_ID>" } } }, - { "agentId": "ko", "match": { "channel": "slack", "peer": { "kind": "channel", "id": "<KNOW_CHANNEL_ID>" } } }, - { "agentId": "ops", "match": { "channel": 
"slack", "peer": { "kind": "channel", "id": "<OPS_CHANNEL_ID>" } } }, - { "agentId": "research", "match": { "channel": "slack", "peer": { "kind": "channel", "id": "<RESEARCH_CHANNEL_ID>" } } } - ] -} -``` - -**Key insight**: The peer bindings for CTO/Builder/etc. are processed against the default account's events. The account bindings for CoS/QA are processed against their respective accounts' events. These don't conflict -- OpenClaw processes each account's event stream independently through its own binding chain. - -### 2.2 Channel Coexistence (Two Bots in Same Channel) - -**Scenario**: CoS-Bot and Default-Bot (bound to CTO) are both in #cto. Someone posts in #cto. - -**What happens:** - -1. Slack delivers the `message.channels` event to ALL apps that have joined #cto -2. Default-Bot's app receives the event -> OpenClaw checks bindings -> peer match on `<CTO_CHANNEL_ID>` -> routes to CTO agent -3. CoS-Bot's app receives the event -> OpenClaw checks bindings -> account match on "cos" -> routes to CoS agent - -**Without any guards, BOTH agents would respond.** This is the core coexistence question. - -**Solution: `requireMention: true` on the CoS and QA accounts' channel configs.** - -There is a subtlety here. OpenClaw's channel config (`channels.slack.channels`) is **global across all accounts** in the same gateway. You cannot set `requireMention: false` for CTO on #cto and `requireMention: true` for CoS on #cto in the same channel config block. - -**However**, the binding model handles this naturally: - -- **Default-Bot in #cto**: The peer binding for CTO matches on channel ID, not on mention. The channel config for #cto has `requireMention: false`, so CTO responds to all messages. This is the existing behavior. -- **CoS-Bot in #cto**: CoS-Bot is NOT the "native" bot for #cto. CoS's binding is account-level (`accountId: cos`), not peer-level on #cto. When CoS-Bot receives a message from #cto, the `requireMention` check applies. 
If `requireMention: true` is set on #cto's channel config, CoS only activates when @mentioned. - -**The problem**: Setting `requireMention: true` on #cto globally would also require CTO to be @mentioned -- breaking its current behavior. - -**Resolution approaches:** - -1. **Option A -- Per-account channel overrides**: If OpenClaw supports `accounts.cos.channels.<CTO_CHANNEL_ID>.requireMention = true` while global remains `false`, this solves it cleanly. **Status: UNVERIFIED** in prior research. The docs say named accounts "can override any setting" but this has not been confirmed at the per-channel level. - -2. **Option B -- CoS-Bot uses `requireMention: true` natively**: CoS-Bot joins #cto but ONLY responds when @mentioned (`<@U_COS>`). The global `requireMention: false` on #cto applies to the default bot (CTO), while CoS-Bot's agent instructions enforce mention-only behavior. **Problem**: This relies on agent-level self-discipline, not config-level enforcement. If `requireMention: false` is the channel setting, OpenClaw WILL trigger CoS's session for non-mention messages. - -3. **Option C -- CoS does NOT join agent channels by default**: CoS-Bot only joins #hq (its home channel). When CoS needs to interact with #cto, it uses `sessions_send` (A2A delegation) rather than direct @mention. CoS only joins other channels for active Discussion mode sessions. **This is the cleanest solution that preserves existing channel behavior.** - -4. **Option D -- Separate "discussion" threads**: CoS-Bot joins #cto only for specific discussion threads (not the channel at large). The bot is invited to the channel but only activates in threads where it is @mentioned. With `requireMention: true` set globally and CTO's channel using `requireMention: false`, CTO auto-responds to channel messages while CoS only responds in threads where it is mentioned. **Problem**: Same global config conflict as Option B. 
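
Option A depends on a lookup order that the report marks as UNVERIFIED. The sketch below shows what that lookup would look like if it existed: a per-account channel override wins, otherwise the global channel setting applies. Every name here (`SlackConfig`, `effectiveRequireMention`, the `assumedDefault` parameter) is hypothetical, and the config shape is an assumption modelled on the `accounts.cos.channels.<CTO_CHANNEL_ID>.requireMention` path mentioned above.

```typescript
// Hypothetical model of Option A: per-account channel overrides with global fallback.
interface ChannelSettings {
  requireMention?: boolean;
  allowBots?: boolean;
}

interface SlackConfig {
  channels: Record<string, ChannelSettings>; // global, shared by all accounts
  accounts: Record<string, { channels?: Record<string, ChannelSettings> }>;
}

function effectiveRequireMention(
  config: SlackConfig,
  accountId: string,
  channelId: string,
  assumedDefault: boolean = true, // assumption: value used when nothing is configured
): boolean {
  const accountChannels = config.accounts[accountId]?.channels;
  const override = accountChannels ? accountChannels[channelId]?.requireMention : undefined;
  if (override !== undefined) return override; // per-account setting wins

  const globalValue = config.channels[channelId]?.requireMention;
  return globalValue !== undefined ? globalValue : assumedDefault;
}

const hybridConfig: SlackConfig = {
  channels: {
    C_CTO: { requireMention: false, allowBots: true }, // CTO answers everything in #cto
  },
  accounts: {
    default: {},
    cos: { channels: { C_CTO: { requireMention: true } } }, // CoS-Bot answers only when mentioned
  },
};

const ctoNeedsMention = effectiveRequireMention(hybridConfig, "default", "C_CTO"); // false
const cosNeedsMention = effectiveRequireMention(hybridConfig, "cos", "C_CTO"); // true
```

If OpenClaw turns out to read only the global block, both calls would effectively return `false` and CoS would be triggered by every message in #cto, which is the conflict Options B and D run into.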
- -**Recommended approach: Option C with selective @mention for discussions.** - -CoS-Bot stays in #hq as its home. For orchestration: -- **Delegation**: CoS uses `sessions_send` to trigger CTO (existing A2A v1 flow, proven) -- **Discussion**: CoS @mentions CTO in a dedicated #collab or #hq thread where `allowBots: true` + `requireMention: true` are both set. CTO joins this discussion via @mention trigger. -- **Progress checking**: CoS @mentions CTO in #cto threads (CTO's channel has `allowBots: true`, CTO responds because it is @mentioned) - -This avoids the global `requireMention` conflict entirely. - -### 2.3 @Mention Interaction Patterns - -**Pattern A: CoS checks CTO's progress in #cto** - -``` -Precondition: - - #cto has allowBots: true (MUST ADD -- currently false) - - #cto has requireMention: false (existing) - - CoS-Bot has been invited to #cto - -Flow: - 1. CTO (default bot) is working in #cto thread on a task - 2. CoS-Bot joins #cto and posts in the thread: "@CTO what's the status on X?" - 3. Default-Bot's app receives CoS-Bot's message in #cto - 4. OpenClaw checks: is this from a bot? Yes. Is allowBots: true? Yes. - 5. OpenClaw checks: is this from our bot user ID? No (U_COS != U_DEFAULT). Pass. - 6. OpenClaw checks requireMention: false on #cto -> pass (no mention needed) - 7. Peer binding matches <CTO_CHANNEL_ID> -> routes to CTO agent - 8. CTO reads thread, sees CoS's question, responds - -Problem: Step 6 means CTO responds to ALL of CoS-Bot's messages, not just @mentions. -This is actually DESIRABLE for this pattern -- CoS posting in CTO's thread IS an interaction. - -Counter-problem: CoS-Bot also receives CTO's response (via its own app's event stream). 
- - CoS's account binding routes ALL events from CoS-Bot to CoS agent - - CoS receives CTO's response in the thread -> CoS might auto-respond - - This creates potential ping-pong - -Mitigation: CoS's AGENTS.md must include explicit WAIT discipline: - "After posting a progress check in another agent's channel, WAIT for response. - Do not auto-respond to the reply unless you have a specific follow-up." -``` - -**Pattern B: QA reviews Builder's output in #build** - -``` -Precondition: - - #build has allowBots: true (MUST ADD -- currently false) - - QA-Bot has been invited to #build - -Flow: - 1. Builder (default bot) posts closeout in #build thread - 2. QA-Bot reads the thread and posts review: "@Builder three issues found..." - 3. Default-Bot's app receives QA-Bot's message - 4. allowBots: true -> pass - 5. Self-loop filter: U_QA != U_DEFAULT -> pass - 6. Peer binding matches <BUILD_CHANNEL_ID> -> routes to Builder agent - 7. Builder reads QA's review and responds with fixes/clarifications - -This pattern works. The key config change: allowBots: true on #build. -``` - -**Pattern C: CoS orchestrates discussion in #hq thread** - -``` -Precondition: - - #hq has allowBots: true + requireMention: true - - CTO-equivalent needs to join #hq -- but CTO uses the default bot - - Default-Bot is already in #hq (it's the shared bot for all execution agents) - -Flow: - 1. CoS creates thread in #hq: "Strategic discussion: should we add QA agent?" - 2. CoS @mentions CTO: "<@U_DEFAULT_BOT> CTO, what's the technical feasibility?" - - PROBLEM: The default bot has ONE user ID shared across CTO, Builder, KO, etc. - When CoS @mentions the default bot, the peer binding for #hq routes to CoS - (because #hq is CoS's channel). The CTO peer binding only matches on - <CTO_CHANNEL_ID>, not on #hq. - - This means: the default bot receiving @mention in #hq routes to CoS, not CTO. - CTO never sees it. 
-``` - -**This is a critical discovery.** In the hybrid model where the default bot serves multiple agents via peer binding, you CANNOT use @mention to reach a specific agent through the default bot in a channel that is not that agent's home channel. The peer binding is by channel, so the bot always routes to whichever agent owns that channel. - -**The fix**: For Discussion mode, CoS uses `sessions_send` to trigger CTO on a specific thread, not @mention. Or, discussions happen in a dedicated #collab channel where binding can be configured differently. - -**Alternative fix**: CTO gets its own Slack App (promoting it from default-bot to independent). This would mean the "default" account serves fewer agents (Builder, CIO, KO, Ops, Research) while CoS and CTO each get independent apps. This is a graduated approach -- start with CoS independent, add QA, then consider CTO. - -### 2.4 Recommended Config - -Given the analysis, the cleanest hybrid architecture is: - -**Phase 1 (Minimal -- CoS independent only):** -- `accounts.default` -- CTO, Builder, CIO, KO, Ops, Research (peer binding, existing model) -- `accounts.cos` -- CoS only (account binding, independent Slack App) -- CoS uses `sessions_send` for delegation (existing A2A v1 mechanism) -- CoS uses #hq as its home channel (existing) -- Add `allowBots: true` to #cto and #build so CoS can post progress checks -- No Discussion mode yet -- defer to Phase 2 - -**Phase 2 (Add QA):** -- `accounts.qa` -- QA agent (account binding, independent Slack App) -- QA-Bot joins #build, #cto, #know with `allowBots: true` -- QA auto-reviews closeouts and @mentions the producing agent for feedback -- QA's AGENTS.md defines review triggers and quality criteria - -**Phase 3 (Full Discussion mode):** -- Create #collab channel with `allowBots: true` + `requireMention: true` -- CoS orchestrates discussions by @mentioning agents in #collab threads -- If default-bot's peer binding is ambiguous in #collab (which agent?), consider promoting CTO 
to independent account - ---- - -## 3. Instance vs Workspace Independence - -### 3.1 Same Gateway with Multi-Account - -**What multi-account means**: One OpenClaw gateway process manages multiple Slack Apps. Each App has its own `botToken`, `appToken`, and maintains its own Socket Mode WebSocket connection (or HTTP webhook endpoint). The gateway runs a single event loop that dispatches events from each account through the binding chain. - -``` - OpenClaw Gateway (single process) - / | \ - Socket Mode Socket Mode Socket Mode - | | | - [CoS App] [Default App] [QA App] - (Slack) (Slack) (Slack) -``` - -**Is workspace independence sufficient?** Yes, combined with account binding: -- Each agent already has its own workspace (`~/.openclaw/workspace-cos/`, etc.) with SOUL.md, AGENTS.md, MEMORY.md, etc. -- Each agent already has isolated sessions (thread-level session keys) -- Adding an independent Slack App via `accounts.cos` gives CoS its own bot identity (distinct name, avatar, bot user ID) -- The account-level binding (`accountId: cos`) ensures all events from CoS's App route exclusively to the CoS agent - -This provides **logical independence** (own identity, own workspace, own session space) within **shared infrastructure** (one gateway, shared A2A tools, shared config). - -### 3.2 Separate Gateway Instances - -**What this means**: Each independent agent runs its own OpenClaw process with its own `openclaw.json`. - -``` - [Gateway 1: CoS] [Gateway 2: Default] [Gateway 3: QA] - openclaw-cos.json openclaw.json openclaw-qa.json - | | | - [CoS App] [Default App] [QA App] -``` - -**Problems:** - -| Issue | Impact | -|-------|--------| -| `sessions_send` cannot cross processes | CoS cannot delegate to CTO via A2A -- must rely entirely on @mention in Slack, losing the reliable two-step trigger | -| No shared session management | `maxPingPongTurns`, session timeouts, and other safety constraints are per-instance. 
No unified loop prevention | -| No shared agent registry | Gateway 1 does not know Gateway 2's agents exist. `sessions_list`, `sessions_send`, agent ID resolution all fail cross-process | -| Triple operational burden | Three processes to monitor, restart, log-manage, and configure | -| Config duplication | Channel configs, thread settings, tool permissions must be maintained in three separate files | -| Heartbeat isolation | CoS's heartbeat cannot check on CTO's status via internal APIs | - -**The ONLY advantage**: Complete process isolation. If Gateway 2 crashes, CoS (Gateway 1) and QA (Gateway 3) continue operating. But this is a minor benefit -- a single gateway restart takes seconds, and process managers (pm2, systemd) handle automatic restarts. - -### 3.3 Recommendation - -**Same gateway with multi-account is the clear winner.** - -| Dimension | Same Gateway | Separate Gateways | -|-----------|-------------|-------------------| -| A2A delegation (`sessions_send`) | Works natively | BROKEN | -| Session management | Unified | Fragmented | -| Config management | Single file | Three files | -| Process management | One process | Three processes | -| Process isolation | No (single failure point) | Yes | -| Operational complexity | Low | High | -| Bot identity independence | Yes (multi-account) | Yes | -| Workspace independence | Yes (existing) | Yes | - -The only scenario where separate gateways make sense is if you need to run agents on different physical machines (e.g., CoS on a cloud server for uptime, Builder on a local machine with code access). This is not the current requirement. - ---- - -## 4. 
Proposed Architecture Diagram - -``` - +-----------------------+ - | Slack Workspace | - +-----------------------+ - | | - +----------+ +----------+ +----------+ | - | CoS App | |Default | | QA App | | - | (Bot-CoS)| |App (Bot) | | (Bot-QA) | | - +----+-----+ +----+-----+ +----+-----+ | - | | | | - | +-----------+-----------+ | | - | | Channels: | | | - | | #hq #cto #build | | | - | | #invest #know #ops | | | - | | #research | | | - | +-----------------------+ | | - +-----------------------+------+----------+ - | - +-----------------+------------------+ - | OpenClaw Gateway (single) | - +-------------------------------------+ - | | - | accounts: | - | cos: CoS App tokens | - | default: Default App tokens | - | qa: QA App tokens | - | | - | bindings: | - | cos <- accountId: cos | - | qa <- accountId: qa | - | cto <- peer: #cto channel | - | builder <- peer: #build channel | - | cio <- peer: #invest channel | - | ko <- peer: #know channel | - | ops <- peer: #ops channel | - | research<- peer: #research channel| - | | - | A2A tools: sessions_send, | - | sessions_list (shared) | - +-------------------------------------+ - - Interaction patterns: - - CoS (independent) ---sessions_send---> CTO (default bot) - CoS (independent) ---@mention in #cto thread---> CTO (needs allowBots:true on #cto) - QA (independent) ---@mention in #build thread--> Builder (needs allowBots:true on #build) - CTO (default bot) ---sessions_send---> Builder (existing, unchanged) - - Harness mapping: - CoS = Orchestrator/Planner (drives what to do) - QA = Evaluator (challenges what was done) - CTO/Builder/CIO = Generator (does the work) - KO/Ops = System maintenance (unchanged) -``` - ---- - -## 5. Implementation Path - -### What Changes in OpenCrew - -**Config changes (openclaw.json):** - -1. Add `accounts` block with `cos` and `qa` account entries (new Slack App tokens) -2. Keep `default` account (existing single bot -- rename from implicit default) -3. 
Add account-level bindings for CoS and QA -4. Keep peer-level bindings for CTO, Builder, CIO, KO, Ops, Research (unchanged) -5. Add `allowBots: true` to #cto and #build channel configs (enables cross-bot interaction) -6. Optionally add `requireMention: true` to #cto and #build (if you want CoS/QA to only respond when @mentioned in those channels -- recommended) - -**Agent workspace changes:** - -7. Create QA workspace (`~/.openclaw/workspace-qa/`) with SOUL.md, AGENTS.md -8. Define QA's role: review closeouts, challenge quality, check DoD compliance -9. Update CoS AGENTS.md: add orchestration responsibilities, Discussion mode instructions -10. Update CTO AGENTS.md: clarify CTO is a participant in discussions (not orchestrator), add `allowBots` interaction patterns - -**Protocol changes (shared/):** - -11. Update A2A_PROTOCOL.md: clarify that CoS is the Discussion orchestrator for strategic discussions; CTO orchestrates only within its execution scope (CTO->Builder) -12. Add QA agent to Permission Matrix: QA can review any agent's closeout, QA cannot delegate execution tasks - -**Slack setup (human-manual):** - -13. Create CoS Slack App (bot token, app token, Socket Mode) -14. Create QA Slack App (bot token, app token, Socket Mode) -15. Invite CoS-Bot to #hq, #cto, #build (and any other channels CoS should monitor) -16. Invite QA-Bot to #build, #cto, #know (channels where QA should review) -17. Record bot user IDs for @mention formatting - -### What Stays the Same - -- All existing agent workspaces (CTO, Builder, CIO, KO, Ops, Research) -- unchanged -- All existing peer bindings -- unchanged -- The default Slack App (single bot for execution agents) -- unchanged -- A2A Delegation mode (sessions_send) -- unchanged, still the primary mechanism -- Task types (Q/A/P/S), Closeout protocol, Autonomy Ladder -- unchanged -- Channel structure (#hq, #cto, #build, etc.) -- unchanged -- Thread-level session isolation -- unchanged - ---- - -## 6. 
Confidence Assessment - -| Finding | Confidence | Basis | -|---------|-----------|-------| -| One account can serve multiple agents via peer binding | **HIGH** | This is OpenCrew's current production model (CONFIG_SNIPPET_2026.2.9.md) | -| Multi-account supports mixing peer and account bindings | **HIGH** | OpenClaw docs confirm binding specificity hierarchy: peer > accountId > channel > fallback | -| Two bots in same channel both receive events | **HIGH** | Slack Events API delivers to ALL subscribed apps. Confirmed in research_autonomous_slack_r1.md | -| `allowBots: true` enables cross-bot message processing | **HIGH** | Confirmed by docs, community reports, and the "47 replies in 12 seconds" incident | -| Self-loop filter is per-bot-user-ID | **HIGH** | Confirmed by Issue #15836 fix analysis | -| CoS as orchestrator maps to Harness Design | **HIGH** | Role analysis against SOUL.md and Harness methodology | -| Default-bot @mention in non-home channel routes correctly | **LOW** | Default bot's peer binding routes by channel, not by @mention target. @mentioning default bot in #hq routes to CoS (not CTO). This is a limitation. | -| Per-account channel config overrides | **LOW** | Docs say "named accounts can override" but per-channel overrides within accounts are unverified | -| QA as independent evaluator adds net value | **MEDIUM** | Conceptually sound (Harness Design validates the pattern), but no empirical data on QA agent quality in Slack-based review | -| Socket Mode with 3 apps is stable | **MEDIUM** | Within normal range per OpenClaw docs. 5+ apps recommended to switch to HTTP mode | - ---- - -## 7. Open Questions - -1. **Per-account channel overrides**: Can `accounts.cos.channels.<CTO_CH>.requireMention` override the global channel setting? If yes, this solves the coexistence problem cleanly. Needs empirical testing. - -2. 
**Default bot @mention routing in foreign channels**: When CoS @mentions the default bot in #hq (CoS's channel), does the peer binding route to CoS or CTO? Initial analysis says CoS (because #hq's peer binding maps to CoS). This means Discussion mode via @mention in #hq cannot reach CTO through the default bot. Needs testing. - -3. **QA agent scope definition**: What exactly should QA review? Options: - - All closeouts (comprehensive but noisy) - - Only S-type and P-type closeouts (high-signal) - - Only when explicitly triggered by CoS or user - -4. **Bot display names in thread history**: When CTO (default bot) and CoS (CoS-Bot) both post in a thread, do agents loading thread history see distinct sender identities? Or do they see generic "bot" labels? This affects discussion context quality. - -5. **CoS heartbeat as orchestration trigger**: CoS already has a 12-hour heartbeat. Could this heartbeat serve as the "harness loop" -- checking agent status, driving pending tasks forward, synthesizing overnight progress? This would make CoS a proactive orchestrator without requiring external cron jobs. - -6. **Graduated independence**: The analysis assumes CoS and QA both need independence. Should CTO also eventually become independent (own Slack App)? This would solve the @mention routing problem (Question 2) but adds a fourth Slack App. What is the right graduation path? - -7. **Cost impact**: Three Slack Apps means three Socket Mode connections and potentially 3x the event processing for shared channels. What is the token cost of CoS and QA processing events from channels they monitor but do not act on (filtered by `requireMention`)? - -8. **A2A protocol for QA**: The current A2A_PROTOCOL.md does not define a QA role. What is QA's permission in the matrix? Can QA send tasks back to Builder? Can QA escalate to CTO? Does QA participate in the closeout flow or sit outside it? 
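For reference when testing Questions 1 and 2, the binding precedence cited in the confidence table (peer > accountId > channel > fallback) can be sketched roughly as below. This is an illustration under stated assumptions, not OpenClaw's resolver: the `peerId` field, the `InboundEvent` shape, and the `resolveAgent` helper are all hypothetical names, and scoping a peer binding to one account is itself an assumption that the open questions above would need to confirm.

```typescript
// Hypothetical shapes for illustration only; the real binding schema may differ.
interface BindingMatch {
  channel: string;     // platform name, e.g. "slack"
  accountId?: string;  // which Slack App received the event
  peerId?: string;     // assumed field: Slack channel ID for a peer binding
}

interface Binding {
  agentId: string;
  match: BindingMatch;
}

interface InboundEvent {
  channel: string;
  accountId: string;
  peerId: string;
}

// Rank a binding by the most specific field it declares.
function specificity(m: BindingMatch): number {
  if (m.peerId !== undefined) return 3;    // peer binding
  if (m.accountId !== undefined) return 2; // account binding
  return 1;                                // platform-only binding
}

// A binding matches when every field it declares equals the event's value.
function matches(m: BindingMatch, e: InboundEvent): boolean {
  return (
    m.channel === e.channel &&
    (m.accountId === undefined || m.accountId === e.accountId) &&
    (m.peerId === undefined || m.peerId === e.peerId)
  );
}

// Pick the most specific matching binding, else the fallback agent.
function resolveAgent(bindings: Binding[], e: InboundEvent, fallback: string): string {
  let best: Binding | undefined;
  for (const b of bindings) {
    if (!matches(b.match, e)) continue;
    if (best === undefined || specificity(b.match) > specificity(best.match)) {
      best = b;
    }
  }
  return best !== undefined ? best.agentId : fallback;
}
```

Under these assumptions, an event received by the CoS App in #cto resolves to CoS rather than CTO only because the peer binding is scoped to the `default` account. Whether real peer bindings behave this way is exactly what the POC should verify.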
diff --git a/.harness/reports/research_slack_r1.md b/.harness/reports/research_slack_r1.md deleted file mode 100644 index bf24687..0000000 --- a/.harness/reports/research_slack_r1.md +++ /dev/null @@ -1,382 +0,0 @@ -commit 7e825263db36aef68792a050c324daef598b4c56 -Author: Alex's Mac <alexmac@AlexsdeMac-mini-2.local> -Date: Sat Mar 28 17:38:48 2026 +0800 - - feat: add A2A v2 research harness, architecture, and agent definitions - - Multi-agent harness for researching and designing A2A v2 protocol: - - Research reports (Phase 1): - - Slack: true multi-agent collaboration via multi-account + @mention - - Feishu: groupSessionScope + platform limitation analysis - - Discord: multi-bot routing + Issue #11199 blocker analysis - - Architecture designs (Phase 2): - - A2A v2 Protocol: Delegation (v1) + Discussion (v2) dual-mode - - 5 collaboration patterns: Architecture Review, Strategic Alignment, - Code Review, Incident Response, Knowledge Synthesis - - 3-level orchestration: Human → Agent → Event-Driven - - Platform configs, migration guides, 6 ADRs - - Agent definitions for Claude Code Agent Teams: - - researcher.md, architect.md, doc-fixer.md, qa.md - - QA verification: all issues resolved, PASS verdict after fixes. - - Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> - -diff --git a/.harness/reports/research_slack_r1.md b/.harness/reports/research_slack_r1.md -new file mode 100644 -index 0000000..16022c5 ---- /dev/null -+++ b/.harness/reports/research_slack_r1.md -@@ -0,0 +1,349 @@ -+# Research Report: Slack True Multi-Agent Collaboration (U0, Round 1) -+ -+> Researcher: Claude Opus 4.6 | Date: 2026-03-27 | Contract: `.harness/contracts/research-slack.md` -+ -+--- -+ -+## Executive Summary -+ -+True multi-agent collaboration on Slack -- where multiple agents participate in the same thread as distinct identities, each bringing independent judgment -- is **technically feasible today** using OpenClaw's multi-account Slack support. 
The key enabler is `channels.slack.accounts`: each agent gets its own Slack app/bot, its own `xoxb-` token, and a binding to a specific OpenClaw agent. With `allowBots: true` + `requireMention: true` on shared channels, Bot-B can see Bot-A's messages as context and respond when explicitly @mentioned. This eliminates the self-loop problem that forces today's two-step `sessions_send` workaround. However, the current OpenCrew deployment uses single-bot mode, so migrating requires creating 6-7 Slack apps and reconfiguring bindings -- a significant but well-documented path. -+ -+--- -+ -+## 1. Platform Capability Assessment -+ -+### 1.1 Slack Multi-Bot in Threads -+ -+**Confidence: HIGH** (verified against Slack API docs and practical testing reports) -+ -+Slack's Events API delivers message events to all apps that are members of a channel, regardless of who posted the message. Specifically: -+ -+- **Bot-A posts in a thread; Bot-B receives the event**: Yes. Each Slack app subscribed to `message.channels` (or equivalent) receives events for all messages in channels it has joined, including messages posted by other bots. The `bot_message` subtype identifies these. ([Slack Events API docs](https://docs.slack.dev/apis/events-api/), [Slack message event reference](https://docs.slack.dev/reference/events/message/)) -+ -+- **Self-loop prevention is app-side, not platform-side**: Slack itself does NOT filter out a bot's own messages from event delivery. Frameworks like Bolt implement `ignoring_self` as an application-level guard. This means each app must decide whether to ignore its own messages. ([Slack bot interactions docs](https://api.slack.com/bot-users)) -+ -+- **Thread participation**: Any bot that is a member of a channel can post to any thread in that channel using the `thread_ts` parameter. No special permissions needed beyond `chat:write`. 
([Slack threading blog](https://medium.com/slack-developer-blog/bringing-your-bot-into-threaded-messages-cd272a42924f)) -+ -+- **Visual identity**: Each Slack app has its own name, icon, and bot user ID. Messages from different bots are visually distinct in threads -- this is the key advantage over single-bot mode where all agents look like the same entity. -+ -+**Key implication**: The Slack platform fully supports multiple bots having a real-time conversation in a thread. There is no platform-level barrier. -+ -+### 1.2 OpenClaw Slack Plugin Current State -+ -+**Confidence: HIGH** (verified against OpenClaw official docs, GitHub gist, and DeepWiki source analysis) -+ -+OpenClaw's Slack plugin already supports multi-account mode: -+ -+**Multi-account configuration** (`channels.slack.accounts`): -+```json -+{ -+ "channels": { -+ "slack": { -+ "accounts": { -+ "default": { "botToken": "xoxb-cos-...", "appToken": "xapp-cos-..." }, -+ "cto": { "botToken": "xoxb-cto-...", "appToken": "xapp-cto-...", "name": "CTO" }, -+ "builder": { "botToken": "xoxb-bld-...", "appToken": "xapp-bld-...", "name": "Builder" } -+ } -+ } -+ } -+} -+``` -+ -+**Binding per account**: -+```json -+{ -+ "bindings": [ -+ { "agentId": "cos", "match": { "channel": "slack", "accountId": "default" } }, -+ { "agentId": "cto", "match": { "channel": "slack", "accountId": "cto" } }, -+ { "agentId": "builder", "match": { "channel": "slack", "accountId": "builder" } } -+ ] -+} -+``` -+ -+**Bot message handling** -- three modes for `allowBots`: -+- `false` (default): All bot messages ignored. Current OpenCrew behavior. -+- `true`: All bot messages accepted as inbound. Requires loop prevention. -+- `"mentions"`: Bot messages accepted only if they @mention this bot. Safest for multi-agent. -+ -+**Self-loop prevention**: OpenClaw ignores messages from the same bot user ID (`message.user === botUserId`). 
With multi-account, each account has a different `botUserId`, so Bot-CTO's messages are NOT filtered by Bot-Builder's agent -- they are treated as real inbound. ([OpenClaw Slack docs](https://docs.openclaw.ai/channels/slack), [GitHub gist](https://gist.github.com/rafaelquintanilha/9ca5ae6173cd0682026754cfefe26d3f)) -+ -+**Agent identity** (visual differentiation): OpenClaw supports `chat:write.customize` scope for per-agent name/icon override. Issue #27080 (identity not applied on inbound-triggered replies) was fixed in PR #27134. With multi-account, each bot has its native identity, so `chat:write.customize` is not even needed -- each app's profile serves as the identity. ([GitHub issue #27080](https://github.com/openclaw/openclaw/issues/27080)) -+ -+**Thread session isolation**: `thread.historyScope = "thread"` + `inheritParent = false` ensures each thread is an independent session. `initialHistoryLimit` controls how many prior messages load when a new session starts in an existing thread. 
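The filtering rules described above (self-loop check plus the three `allowBots` modes) can be sketched as follows. This is a minimal illustration, not OpenClaw's actual code: the `SlackInbound` shape and the `acceptsMessage` helper are hypothetical, the mention check assumes Slack's native `<@BOT_USER_ID>` text encoding, and `requireMention` gating for human messages is deliberately not modeled.

```typescript
// Hypothetical types for illustration; not taken from the OpenClaw source.
type AllowBots = boolean | "mentions";

interface SlackInbound {
  user: string;     // Slack user ID of the sender
  fromBot: boolean; // true when the sender is a bot
  text: string;     // raw message text, mentions encoded as <@USERID>
}

// Decide whether the account owning `selfBotUserId` should process `msg`.
function acceptsMessage(
  msg: SlackInbound,
  selfBotUserId: string,
  allowBots: AllowBots,
): boolean {
  // Self-loop prevention: never process our own messages.
  if (msg.user === selfBotUserId) return false;
  // Human messages are not affected by allowBots.
  if (!msg.fromBot) return true;
  // allowBots: true accepts every other bot's message.
  if (allowBots === true) return true;
  // allowBots: "mentions" accepts bot messages only when they mention us.
  if (allowBots === "mentions") return msg.text.includes(`<@${selfBotUserId}>`);
  // allowBots: false (the default) drops all bot messages.
  return false;
}
```

Because the self-check compares against one specific bot user ID, a message from Bot-CTO passes the self-loop check for Bot-Builder's account and is then subject only to the `allowBots` setting.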
-+ -+### 1.3 Gap Analysis -+ -+| Capability | Slack Platform | OpenClaw Plugin | OpenCrew Config | Gap | -+|-----------|---------------|----------------|----------------|-----| -+| Multiple bots in one channel | YES | YES (multi-account) | NO (single bot) | **OpenCrew config change** | -+| Bot-B sees Bot-A's messages | YES (Events API) | YES (`allowBots`) | NO (`allowBots: false`) | **OpenCrew config change** | -+| Visual identity per agent | YES (separate apps) | YES (multi-account or `chat:write.customize`) | NO (shared identity) | **OpenCrew config change** | -+| Thread-level session isolation | YES (native threads) | YES (`historyScope: "thread"`) | YES (already configured) | None | -+| Loop prevention | N/A (app-side) | YES (`requireMention`, `allowBots: "mentions"`) | N/A | **OpenCrew config change** | -+| Orchestrated turn-taking | N/A | Partial (`sessions_send` for explicit trigger) | YES (two-step A2A) | **New orchestration logic needed** | -+| Agent sees full thread history | YES | YES (`initialHistoryLimit`) | Partial | **Config tuning** | -+ -+**Bottom line**: The platform (Slack) and middleware (OpenClaw) already support everything needed. The gap is entirely at the OpenCrew configuration and protocol layer. No upstream code changes are required. -+ -+--- -+ -+## 2. Collaboration Patterns Catalog -+ -+### 2.1 Discussion Pattern -+ -+**Description**: Multiple agents participate in a single Slack thread, each contributing from their domain perspective. Example: CTO proposes architecture, Builder critiques feasibility, QA identifies risks -- they iterate to convergence. -+ -+**Mechanics**: -+1. Human (or CoS) posts a topic in a shared channel (e.g., `#cto`) or a dedicated `#collab` channel. -+2. CTO is bound to that channel and responds first with an architecture proposal. -+3. Human (or orchestrator agent) @mentions `@Builder` in the thread: "What do you think about feasibility?" -+4. 
Builder's bot receives the thread message (because `allowBots: true` + it was @mentioned), loads thread history via `initialHistoryLimit`, and responds with feasibility analysis. -+5. Human @mentions `@CTO` again: "How do you respond to Builder's concerns?" -+6. CTO sees Builder's messages in thread history and refines the proposal. -+7. Repeat until convergence. -+ -+**Requirements**: -+- Multi-account Slack setup (one app per participating agent) -+- `allowBots: true` or `"mentions"` on the shared channel -+- `requireMention: true` on shared channels (loop prevention) -+- `thread.historyScope: "thread"` + `initialHistoryLimit >= 50` (so each agent sees the full discussion) -+- `inheritParent: true` on shared channels (so thread participants inherit the root message context) -+ -+**Feasibility**: **NOW** -- achievable with config changes only. No code changes to OpenClaw or OpenCrew needed. -+ -+### 2.2 Review Pattern -+ -+**Description**: One agent produces work, multiple agents review it in the same thread. Example: Builder submits a design doc, CTO reviews architecture soundness, QA reviews correctness, KO checks knowledge consistency. -+ -+**Mechanics**: -+1. Builder posts deliverable in `#build` thread (the existing A2A closeout thread). -+2. CTO is @mentioned in the thread: "@CTO please review architecture." -+3. CTO's bot receives the message, loads thread history (seeing Builder's full output), and posts architecture review. -+4. QA is @mentioned: "@QA please review correctness." -+5. QA reads both Builder's output and CTO's review, posts correctness assessment. -+6. Builder is @mentioned with consolidated feedback: "@Builder please address these items." -+ -+**Requirements**: -+- Same multi-account setup as Discussion Pattern. -+- Reviewers' bots must be invited to the channel where the review thread lives. -+- `initialHistoryLimit` must be high enough to capture the full deliverable (possibly 80-100 for large outputs). 
-+- Each reviewer agent needs workspace instructions (SOUL.md/AGENTS.md) that define their review perspective. -+ -+**Feasibility**: **NOW** -- identical infrastructure to Discussion Pattern. The only addition is role-specific review instructions in each agent's workspace. -+ -+### 2.3 Brainstorm Pattern -+ -+**Description**: Agents take turns building on each other's ideas in a free-form exploration. Example: CoS states a goal, CTO proposes technical approaches, CIO adds domain constraints, Builder estimates effort -- they converge on a plan. -+ -+**Mechanics**: -+1. Human posts a brainstorm prompt in a shared channel: "How should we approach X?" -+2. Multiple agents are @mentioned (or an orchestrator agent manages turn order). -+3. Each agent reads the full thread history before contributing. -+4. **Turn management option A -- Human orchestrated**: Human @mentions the next agent after each response. -+5. **Turn management option B -- Agent orchestrated**: A designated orchestrator agent (CoS or CTO) reads each response and decides who to call next, posting "@Builder what's your take?" or "@CIO any domain constraints?" -+6. **Turn management option C -- Round-robin**: A lightweight script or orchestrator sends @mentions in a fixed order with a configurable delay. -+ -+**Requirements**: -+- All participating agents need bots in the shared channel. -+- Higher `initialHistoryLimit` (brainstorms can get long). -+- Clear termination criteria (who decides the brainstorm is "done"?). -+- For Option B (agent-orchestrated): The orchestrator agent needs `allowBots: true` to see other agents' messages AND the ability to @mention other bots in its responses. -+ -+**Feasibility**: **NEAR** -- The infrastructure is the same as Discussion Pattern (NOW). However, agent-orchestrated turn management (Option B) requires that agents can reliably @mention other bots in their messages and that the mentioned bot's Slack app correctly recognizes the mention. This needs validation. 
Human-orchestrated (Option A) works NOW. -+ -+--- -+ -+## 3. Comparison with Harness Design -+ -+### 3.1 File-based Blackboard vs Chat-based Collaboration -+ -+Anthropic's harness design for long-running applications uses a **file-based Blackboard pattern**: agents write files, other agents read them. The Planner writes a spec file, the Generator reads it and writes code, the Evaluator reads the code and writes a review file. Communication is asynchronous, persistent, and structured. ([Anthropic engineering blog](https://www.anthropic.com/engineering/harness-design-long-running-apps)) -+ -+| Dimension | Harness (File-based) | OpenCrew Slack (Chat-based) | -+|-----------|---------------------|---------------------------| -+| **Communication medium** | Files on disk | Slack thread messages | -+| **Persistence** | Git-trackable files | Slack thread history (ephemeral on free plan) | -+| **Structure** | Highly structured (sprint contracts, spec files) | Semi-structured (thread messages with conventions) | -+| **Latency** | Near-zero (local filesystem) | ~1-3s per message (Slack API round-trip) | -+| **Human visibility** | Requires explicit file inspection | Built-in (Slack UI) | -+| **Context window** | Full file contents per agent | Thread history limited by `initialHistoryLimit` | -+| **Turn management** | Explicit (harness orchestrator) | @mention-based or human-driven | -+| **Adversarial review** | Generator vs Evaluator (GAN-inspired) | Any agent vs any agent (same mechanism) | -+ -+The harness pattern's key strength is **deterministic orchestration**: the harness code decides exactly when each agent runs and what context it receives. The Slack pattern's strength is **human-in-the-loop visibility and intervention**: any human can read the thread, jump in, redirect, or override at any point. 
-+ -+### 3.2 Unique Value of Real-time Multi-Agent Chat -+ -+**Confidence: MEDIUM** (inference based on architecture comparison, not empirical measurement) -+ -+The Slack-based approach offers several advantages the file-based harness cannot: -+ -+1. **Real-time human oversight**: The user watches the discussion unfold in real-time and can intervene ("Actually, ignore that constraint -- we changed requirements"). File-based harnesses require the user to inspect files after-the-fact. -+ -+2. **Natural escalation**: If agents get stuck or disagree, the human is already in the thread and can break the tie. In a harness, you need explicit escalation mechanisms. -+ -+3. **Organizational memory**: Slack threads persist as searchable organizational history. Future agents (or humans) can search for "that architecture discussion we had about X" and find the full multi-agent deliberation. -+ -+4. **Progressive trust building**: Users can start with human-orchestrated discussions (manually @mentioning agents) and gradually move to agent-orchestrated as trust builds. The harness pattern is all-or-nothing autonomous. -+ -+5. **Cross-domain collaboration**: A Slack thread can include agents from different "layers" (CoS + CTO + Builder) that wouldn't interact in a harness's rigid pipeline. -+ -+**Trade-off**: Slack-based collaboration is slower (network latency, message rendering) and less structured than file-based. For pure software generation tasks, the harness pattern is likely more efficient. For strategic decisions, design reviews, and cross-functional alignment, Slack-based collaboration is superior. -+ -+--- -+ -+## 4. Recommended Architecture -+ -+### 4.1 Multi-Bot Configuration -+ -+**Recommended approach**: Create separate Slack apps for each "conversational" agent. Not all 7 agents need their own app -- only those that participate in multi-agent discussions. 
-+ -+**Tier 1 -- Own Slack App** (agents that discuss): -+- CoS, CTO, Builder (core discussion participants) -+ -+**Tier 2 -- Own Slack App if needed** (agents that may review): -+- CIO (domain specialist, participates in strategic discussions) -+- KO (participates in knowledge reviews) -+ -+**Tier 3 -- Shared or no Slack App** (agents that don't discuss): -+- Ops (audit only, doesn't need to participate in discussions) -+- Research (ephemeral worker, spawned via `sessions_spawn`) -+ -+**Configuration skeleton**: -+```json -+{ -+ "channels": { -+ "slack": { -+ "accounts": { -+ "default": { "botToken": "xoxb-cos-...", "appToken": "xapp-cos-..." }, -+ "cto": { "botToken": "xoxb-cto-...", "appToken": "xapp-cto-..." }, -+ "builder": { "botToken": "xoxb-bld-...", "appToken": "xapp-bld-..." } -+ }, -+ "channels": { -+ "<COLLAB_CHANNEL_ID>": { -+ "allow": true, -+ "requireMention": true, -+ "allowBots": true -+ } -+ }, -+ "thread": { -+ "historyScope": "thread", -+ "inheritParent": true, -+ "initialHistoryLimit": 50 -+ } -+ } -+ }, -+ "bindings": [ -+ { "agentId": "cos", "match": { "channel": "slack", "accountId": "default" } }, -+ { "agentId": "cto", "match": { "channel": "slack", "accountId": "cto" } }, -+ { "agentId": "builder", "match": { "channel": "slack", "accountId": "builder" } } -+ ] -+} -+``` -+ -+**Each Slack app requires**: Bot Token Scopes: `channels:history`, `channels:read`, `chat:write`, `chat:write.customize`, `users:read`. Event Subscriptions: `message.channels`, `app_mention`. Socket Mode enabled with `connections:write` scope on the app-level token. -+ -+### 4.2 Orchestration Model -+ -+**Recommended: Hybrid human + agent orchestration** (phased rollout) -+ -+**Phase 1 -- Human Orchestrated (NOW)**: -+- User @mentions agents in threads to drive discussion. -+- All agents have `requireMention: true` + `allowBots: true`. -+- User controls pace, topic, and turn order. -+- This is the safest starting point and requires zero protocol changes. 
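Since Phase 1 relies on every shared channel pairing `allowBots: true` with `requireMention: true`, a small pre-deployment check can catch channels that enable bot messages without the mention gate. The sketch below assumes the per-channel shape shown in the configuration skeleton above; the `ChannelConfig` type and the `findLoopRiskChannels` helper are hypothetical names, not part of OpenClaw.

```typescript
// Assumed per-channel shape, mirroring the configuration skeleton above.
interface ChannelConfig {
  allow?: boolean;
  requireMention?: boolean;
  allowBots?: boolean | "mentions";
}

// Return the IDs of channels that accept all bot messages without
// requiring an @mention, i.e. the combination that risks reply loops.
// Channels using allowBots: "mentions" are treated as already gated.
function findLoopRiskChannels(channels: Record<string, ChannelConfig>): string[] {
  return Object.entries(channels)
    .filter(([, cfg]) => cfg.allowBots === true && cfg.requireMention !== true)
    .map(([id]) => id);
}
```

Running this over the `channels` block before enabling multi-bot accounts gives a list of channel IDs to fix. An empty list means every bot-enabled channel is mention-gated.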
-+ -+**Phase 2 -- Agent Orchestrated (NEAR)**: -+- Designate CTO (or CoS) as the orchestrator for technical discussions. -+- Orchestrator agent's AGENTS.md includes instructions: "After receiving input, decide which specialist to consult next and @mention them in the thread." -+- Add guardrails: max 3 agent-to-agent turns per discussion before requiring human input. -+- `maxPingPongTurns` from A2A protocol can be repurposed as a discussion round limit. -+ -+**Phase 3 -- Event-Driven with Guardrails (FUTURE)**: -+- Agents proactively respond when they detect messages relevant to their domain (similar to SlackAgents' proactive mode from [EMNLP 2025](https://aclanthology.org/2025.emnlp-demos.76.pdf)). -+- Requires sophisticated relevance filtering to avoid noise. -+- Use `allowBots: "mentions"` as the safety valve. -+ -+### 4.3 Integration with Existing A2A Protocol -+ -+The multi-bot architecture does NOT replace the existing A2A protocol -- it extends it with a new mode: -+ -+**A2A v1 (existing -- single-bot delegation)**: -+- Two-step trigger: visible anchor + `sessions_send` -+- Use case: Structured task delegation (CTO assigns Builder a specific task) -+- Keep for: All existing delegation workflows, task tracking, closeout flows -+ -+**A2A v2 (new -- multi-bot discussion)**: -+- One-step trigger: @mention in a shared thread -+- Use case: Multi-party discussion, review, brainstorm -+- Session: Each agent's session is the thread itself (`thread:<threadTs>`) -+- Context: Thread history serves as the shared context (Blackboard equivalent) -+ -+**Coexistence**: Both modes can coexist. A Discussion (v2) in `#collab` can result in a Delegation (v1) where CTO creates a task thread in `#build` and `sessions_send`s Builder. The discussion thread serves as the "why" record; the task thread serves as the "what" execution record. -+ -+**Migration path**: No breaking changes. Add multi-bot accounts alongside existing single-bot. 
Existing bindings continue to work for channels that don't need multi-agent discussion. New `#collab` or shared channels use multi-bot + `allowBots` + `requireMention`. -+ -+--- -+ -+## 5. Confidence Assessment -+ -+| Finding | Confidence | Evidence | -+|---------|-----------|---------| -+| Slack Events API delivers Bot-A's messages to Bot-B | **HIGH** | Slack API docs confirm apps receive all message events in channels they've joined. Community testing confirms. | -+| OpenClaw `channels.slack.accounts` supports multi-bot | **HIGH** | Official docs, GitHub gist with working config, DeepWiki source analysis all confirm. | -+| `allowBots: true` + `requireMention: true` prevents loops | **HIGH** | Official OpenClaw docs explicitly recommend this combination. Community reports confirm. | -+| Self-loop filter is per-bot-user-ID (not global) | **HIGH** | OpenClaw Slack docs: "ignores messages from the same bot user ID." Multi-account = different user IDs. Discord issue #11199 confirms the same logic was fixed for Discord with sibling bot registry. | -+| Agent identity fix (PR #27134) enables visual differentiation | **HIGH** | GitHub issue #27080 closed with fix. With multi-account, native app profiles provide identity without needing `chat:write.customize`. | -+| Discussion/Review patterns work NOW with config changes | **MEDIUM** | All required primitives exist (multi-account, allowBots, thread history). Not yet tested end-to-end in an OpenCrew deployment. | -+| Agent-orchestrated turn management works | **MEDIUM** | Requires agents to @mention other bots in their messages. OpenClaw's message tool can include @mentions, but reliable mention-parsing by receiving bot needs validation. | -+| `allowBots: "mentions"` mode exists | **MEDIUM** | DeepWiki analysis mentions this as a supported value. Not found in the main official docs page, but referenced in source-level documentation. 
| -+| Brainstorm pattern with automatic turn-taking | **LOW** | Conceptually sound but no existing implementation. Requires custom orchestration logic not yet built. | -+| Thread history is sufficient as Blackboard replacement | **LOW** | Depends on thread length, `initialHistoryLimit` setting, and whether agents can parse unstructured thread content as effectively as structured files. | -+ -+--- -+ -+## 6. Open Questions -+ -+1. **Slack free plan thread history**: Does the free Slack plan's message history limit (90 days) affect thread-based collaboration? For long-running projects, will old discussion threads become inaccessible? -+ -+2. **Mention parsing reliability**: When Agent-CTO posts "@Builder what do you think?", does OpenClaw's Slack plugin reliably detect this as a mention of the Builder bot and route it to the Builder agent? Or does it require Slack's native mention format (`<@BOT_USER_ID>`)? This needs empirical testing. -+ -+3. **Socket Mode connection limits**: With 5-7 separate Slack apps all using Socket Mode, does this create issues with Slack's connection limits or rate limits? The OpenClaw docs recommend HTTP mode for multi-account: "Give each account a distinct `webhookPath` so registrations do not collide." -+ -+4. **Session isolation in shared channels**: When CTO and Builder both participate in a thread in `#collab`, they each have their own session (`agent:cto:slack:channel:COLLAB:thread:TS` and `agent:builder:slack:channel:COLLAB:thread:TS`). Do these sessions conflict? Can both write to the same thread without routing issues? -+ -+5. **`maxPingPongTurns` applicability**: The existing A2A protocol uses `maxPingPongTurns = 4` for `sessions_send` loops. In the new discussion pattern, is there an equivalent limit for @mention-driven discussions? Without one, an agent-orchestrated discussion could theoretically run indefinitely. -+ -+6. **Cost implications**: Each Slack app consumes one Socket Mode connection. 
With 5-7 apps, the OpenClaw gateway maintains 5-7 persistent WebSocket connections. What is the resource impact? Is HTTP Events API mode more appropriate for this scale? -+ -+7. **Discord/Feishu parity**: The Discord plugin had a similar global bot-filter bug (#11199) that was fixed with a sibling bot registry (PRs #11644, #22611, #35479). Has the Slack plugin received an equivalent fix, or does it still use the simpler per-bot-user-ID check? The Slack issue #15836 (agent-to-agent routing) was closed as NOT_PLANNED -- does multi-account mode make that issue moot? -+ -+8. **SlackAgents (EMNLP 2025) proactive mode**: The research paper describes agents that listen to threads without being mentioned and proactively contribute. Could this "proactive mode" be adapted for OpenCrew? What relevance filtering would prevent noise? -+ -+--- -+ -+## Appendix A: Key References -+ -+- [OpenClaw Slack Plugin Docs](https://docs.openclaw.ai/channels/slack) -+- [Running Multiple AI Agents as Slack Teammates (GitHub Gist)](https://gist.github.com/rafaelquintanilha/9ca5ae6173cd0682026754cfefe26d3f) -+- [OpenClaw Multi-Agent Routing Docs](https://docs.openclaw.ai/concepts/multi-agent) -+- [OpenClaw Issue #15836: Agent-to-agent Slack routing](https://github.com/openclaw/openclaw/issues/15836) -+- [OpenClaw Issue #27080: Slack agent identity fix](https://github.com/openclaw/openclaw/issues/27080) -+- [OpenClaw Issue #11199: Discord multi-bot filtering](https://github.com/openclaw/openclaw/issues/11199) -+- [Anthropic Harness Design Blog Post](https://www.anthropic.com/engineering/harness-design-long-running-apps) -+- [SlackAgents: EMNLP 2025 Demo Paper](https://aclanthology.org/2025.emnlp-demos.76.pdf) -+- [Slack Events API Documentation](https://docs.slack.dev/apis/events-api/) -+- [Slack Message Event Reference](https://docs.slack.dev/reference/events/message/) -+- [OpenClaw Slack Setup Best Practices (Macaron)](https://macaron.im/blog/openclaw-slack-setup) -+- [OpenClaw Multi-Agent 
Setup Tutorial (LumaDock)](https://lumadock.com/tutorials/openclaw-multi-agent-setup) -+ -+## Appendix B: Glossary -+ -+- **A2A**: Agent-to-Agent protocol used in OpenCrew for task delegation -+- **allowBots**: OpenClaw config setting controlling whether bot-authored messages are processed -+- **Blackboard pattern**: Communication pattern where agents read/write shared files (used in Anthropic's harness) -+- **Multi-account**: OpenClaw feature allowing multiple Slack apps (each with own bot token) in one gateway -+- **requireMention**: OpenClaw config setting requiring explicit @mention for agent activation -+- **sessions_send**: OpenClaw tool for sending messages to another agent's session (the "real trigger" in A2A v1) -+- **Socket Mode**: Slack connection mode using persistent WebSocket (default for OpenClaw) -+- **Two-step trigger**: Current OpenCrew A2A mechanism: visible anchor message + sessions_send diff --git a/.harness/reports/verify_source_code_r1.md b/.harness/reports/verify_source_code_r1.md deleted file mode 100644 index 7625a91..0000000 --- a/.harness/reports/verify_source_code_r1.md +++ /dev/null @@ -1,389 +0,0 @@ -# Source Code Verification: Cross-Bot Message Routing - -**Repo**: `openclaw/openclaw` (GitHub, accessed 2026-03-27) -**Method**: Direct source code reads via `gh api` against the `openclaw/openclaw` repository. - ---- - -## 1. allowBots Filter - -### Source Files -- **Config type definition**: `src/config/types.slack.ts` (lines 41, 115) -- **Channel-level config resolution**: `extensions/slack/src/monitor/channel-config.ts` -- **Actual filter logic**: `extensions/slack/src/monitor/message-handler/prepare.ts` - -### Code Snippet (from `prepare.ts`, `resolveSlackConversationContext`) - -```typescript -const allowBots = - channelConfig?.allowBots ?? - account.config?.allowBots ?? - cfg.channels?.slack?.allowBots ?? - false; -``` - -### Behavior - -The `allowBots` flag is resolved with a **three-tier fallback**: - -1. 
**Per-channel config** (`channels.slack.channels.<channelId>.allowBots`) -- highest priority -2. **Per-account config** (`channels.slack.accounts.<accountId>.allowBots`) -- middle priority -3. **Global Slack config** (`channels.slack.allowBots`) -- lowest priority -4. **Default**: `false` - -When `allowBots` is `false` (the default), all messages with a `bot_id` field are silently dropped. The check occurs in `authorizeSlackInboundMessage`: - -```typescript -if (isBotMessage) { - if (message.user && ctx.botUserId && message.user === ctx.botUserId) { - return null; // self-loop filter (always blocks own messages) - } - if (!allowBots) { - logVerbose(`slack: drop bot message ${message.bot_id ?? "unknown"} (allowBots=false)`); - return null; - } -} -``` - ---- - -## 2. Self-Loop Filter - -### Source File -- `extensions/slack/src/monitor/message-handler/prepare.ts`, function `authorizeSlackInboundMessage` - -### Code Snippet - -```typescript -if (isBotMessage) { - if (message.user && ctx.botUserId && message.user === ctx.botUserId) { - return null; - } - if (!allowBots) { - logVerbose(`slack: drop bot message ${message.bot_id ?? "unknown"} (allowBots=false)`); - return null; - } -} -``` - -Where `isBotMessage` is defined as: - -```typescript -isBotMessage: Boolean(message.bot_id), -``` - -And `ctx.botUserId` is set per-account from `auth.test`: - -```typescript -const auth = await app.client.auth.test({ token: botToken }); -botUserId = auth.user_id ?? ""; -``` - -### Per-Account or Global? - -**Per-account.** Each Slack account (`monitorSlackProvider`) creates its own `SlackMonitorContext` with its own `botUserId`. The self-loop check compares `message.user === ctx.botUserId`, which is the bot user ID **of that specific Slack App/account**. 
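The two checks quoted above can be condensed into a small, self-contained sketch. It is only an illustration under stated assumptions: the `InboundMessage`, `AccountContext`, and `Verdict` types and the `gateBotMessage` function are hypothetical names, not OpenClaw's API, and the real handler does considerably more. The one thing it keeps from the quoted code is the order of checks (self-loop first, then `allowBots`).

```typescript
// Hypothetical, simplified types; the real OpenClaw types carry more fields.
interface InboundMessage {
  user?: string;   // Slack user ID of the author
  bot_id?: string; // present when the author is a bot
}

interface AccountContext {
  accountId: string;
  botUserId: string; // resolved separately for each account
}

type Verdict = "process" | "drop:self" | "drop:bots-disabled";

// Same order of checks as the quoted snippet: self-loop first, then allowBots.
function gateBotMessage(
  message: InboundMessage,
  ctx: AccountContext,
  allowBots: boolean,
): Verdict {
  const isBotMessage = Boolean(message.bot_id);
  if (!isBotMessage) return "process";
  if (message.user && ctx.botUserId && message.user === ctx.botUserId) {
    return "drop:self"; // an account never processes its own bot's messages
  }
  return allowBots ? "process" : "drop:bots-disabled";
}

const defaultBot: AccountContext = { accountId: "default", botUserId: "U_DEFAULT" };
const cosBot: AccountContext = { accountId: "cos", botUserId: "U_COS" };

// One message authored by CoS-Bot, evaluated by each account's own context.
const fromCos: InboundMessage = { user: "U_COS", bot_id: "B_COS" };

const seenByDefault = gateBotMessage(fromCos, defaultBot, true);
const seenByCos = gateBotMessage(fromCos, cosBot, true);
const seenByDefaultBotsOff = gateBotMessage(fromCos, defaultBot, false);
```

Under these assumptions the same message gets a different verdict depending on which account evaluates it, which is the practical meaning of a per-account self-loop filter.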
- -This means: -- **Default-Bot** (account `default`, botUserId = `U_DEFAULT`) will only drop messages where `message.user === "U_DEFAULT"` -- **CoS-Bot** (account `cos`, botUserId = `U_COS`) will only drop messages where `message.user === "U_COS"` - -When CoS-Bot posts a message, Default-Bot's self-loop check (`message.user === U_COS`) does NOT match `U_DEFAULT`, so it passes through (assuming `allowBots=true`). - ---- - -## 3. Multi-Account Event Dispatch - -### Source Files -- **Gateway orchestration**: `src/gateway/server-channels.ts` -- **Per-account provider boot**: `extensions/slack/src/monitor/provider.ts` -- **Channel plugin gateway hook**: `extensions/slack/src/channel.ts` (`gateway.startAccount`) -- **Event registration**: `extensions/slack/src/monitor/events/messages.ts` - -### How Events Are Routed - -Each Slack account gets **its own independent Bolt `App` instance** with its own socket connection or HTTP receiver: - -From `provider.ts`: -```typescript -const app = new App( - slackMode === "socket" - ? { token: botToken, appToken, socketMode: true, clientOptions } - : { token: botToken, receiver: receiver ?? undefined, clientOptions }, -); -``` - -From `channel.ts` (`gateway.startAccount`): -```typescript -startAccount: async (ctx) => { - const account = ctx.account; - const botToken = account.botToken?.trim(); - const appToken = account.appToken?.trim(); - ctx.log?.info(`[${account.accountId}] starting provider`); - return getSlackRuntime().channel.slack.monitorSlackProvider({ - botToken: botToken ?? "", - appToken: appToken ?? "", - accountId: account.accountId, - config: ctx.cfg, - // ... - }); -}, -``` - -From `server-channels.ts` (`startChannelInternal`): -```typescript -const accountIds = accountId ? [accountId] : plugin.config.listAccountIds(cfg); -// ... 
-await Promise.all( - accountIds.map(async (id) => { - // each account gets its own startAccount call - }), -); -``` - -**Each account runs a completely separate Slack Bolt App** with its own WebSocket connection to Slack. Events from Slack are delivered directly to the Bolt App that owns that bot token. - -### Critical Architectural Point: No Cross-Account Event Delivery - -Unlike Feishu (where all bots in a group receive every message from every member, including other bots), **Slack delivers events only to the App that owns the relevant subscription**. Each Slack App gets its own events stream. - -### Event Isolation via `shouldDropMismatchedSlackEvent` - -Each account's context also includes `shouldDropMismatchedSlackEvent`, which checks `api_app_id` and `team_id`: - -```typescript -const shouldDropMismatchedSlackEvent = (body: unknown) => { - // ... - if (params.apiAppId && incomingApiAppId && incomingApiAppId !== params.apiAppId) { - logVerbose(`slack: drop event with api_app_id=${incomingApiAppId} (expected ${params.apiAppId})`); - return true; - } - if (params.teamId && incomingTeamId && incomingTeamId !== params.teamId) { - logVerbose(`slack: drop event with team_id=${incomingTeamId} (expected ${params.teamId})`); - return true; - } - return false; -}; -``` - -This is a safety net for HTTP mode where requests might be shared, but in socket mode each App has its own connection so this rarely fires. - -### Dedup Behavior - -**Slack has NO cross-account broadcast deduplication** like Feishu does. The reason is architectural: - -- **Feishu**: All bots in a group receive the same event via a shared webhook. Feishu explicitly deduplicates with `tryRecordMessagePersistent(ctx.messageId, "broadcast")` using a shared namespace. -- **Slack**: Each bot App has its own event subscription stream. There is no shared event delivery mechanism, so no cross-account dedup is needed. 
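To make the "no cross-account dedup" point concrete, here is a minimal sketch, assuming a dedup key that includes the account ID, in the same join-with-`|` shape as the dispatch-layer key this report cites from `inbound-dedupe.ts`. The `InboundKeyParts` type and the `buildDedupeKey` and `makeDedupe` helpers are hypothetical names used only for illustration; they are not OpenClaw functions.

```typescript
// Hypothetical shape of the fields that feed a dedup key.
interface InboundKeyParts {
  provider: string;
  accountId: string;
  sessionScope?: string;
  peerId: string;
  threadId?: string;
  messageId: string;
}

// Joins the defined parts with "|"; missing parts are skipped.
function buildDedupeKey(p: InboundKeyParts): string {
  return [p.provider, p.accountId, p.sessionScope, p.peerId, p.threadId, p.messageId]
    .filter(Boolean)
    .join("|");
}

// Returns a checker that reports true when a key has been seen before.
function makeDedupe(): (key: string) => boolean {
  const seen = new Set<string>();
  return (key) => {
    if (seen.has(key)) return true;
    seen.add(key);
    return false;
  };
}

const isDuplicate = makeDedupe();

// One physical human post in #cto, delivered to two accounts.
const humanPost = { provider: "slack", peerId: "C_CTO", messageId: "1711540000.000100" };

const keyDefault = buildDedupeKey({ ...humanPost, accountId: "default" });
const keyCos = buildDedupeKey({ ...humanPost, accountId: "cos" });

const firstDefault = isDuplicate(keyDefault);  // first delivery to account "default"
const firstCos = isDuplicate(keyCos);          // same message, account "cos"
const repeatDefault = isDuplicate(keyDefault); // redelivery to account "default"
```

Because the account ID is part of the key, the two deliveries never collide: each account processes the post once, and only a redelivery to the same account is skipped. That is why gating has to come from `requireMention` or similar configuration rather than from dedup.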
- -Within a single account, there is a `markMessageSeen` dedup to handle the `message` vs `app_mention` race: - -```typescript -const seenMessages = createDedupeCache({ ttlMs: 60_000, maxSize: 500 }); -const markMessageSeen = (channelId: string | undefined, ts?: string) => { - if (!channelId || !ts) { return false; } - return seenMessages.check(`${channelId}:${ts}`); -}; -``` - -This dedup is **per-account** (each `SlackMonitorContext` has its own `seenMessages` cache). It exists to prevent the same message from being processed twice within ONE account (e.g., when both `message` and `app_mention` events fire for the same message). - -There is also a global `inbound-dedupe.ts` that runs at the agent dispatch layer (`shouldSkipDuplicateInbound`), but its key includes `AccountId`: - -```typescript -return [provider, accountId, sessionScope, peerId, threadId, messageId].filter(Boolean).join("|"); -``` - -Since `accountId` is part of the key, the same physical message delivered to two different accounts generates **different dedup keys** and is processed independently by both. - ---- - -## 4. Binding Resolution with Two Bots in Same Channel - -### Source File -- `src/routing/resolve-route.ts` - -### How It Works - -When a message arrives at a specific account, `resolveAgentRoute` is called with: -```typescript -const route = resolveAgentRoute({ - cfg: ctx.cfg, - channel: "slack", - accountId: account.accountId, // <-- per-account - teamId: ctx.teamId || undefined, - peer: { - kind: isDirectMessage ? "direct" : isRoom ? "channel" : "group", - id: isDirectMessage ? (message.user ?? "unknown") : message.channel, - }, -}); -``` - -The binding resolution uses a tiered priority system: -1. `binding.peer` -- exact channel/peer match -2. `binding.peer.parent` -- parent thread match -3. `binding.guild+roles` -- Discord-specific -4. `binding.guild` -- Discord-specific -5. `binding.team` -- team-level binding -6. `binding.account` -- account-level binding -7. 
`binding.channel` -- channel-level (wildcard account) binding -8. `default` -- uses `resolveDefaultAgentId(cfg)` - -Bindings are filtered by both `channel` AND `accountId`: - -```typescript -function getEvaluatedBindingsForChannelAccount( - cfg: OpenClawConfig, - channel: string, - accountId: string, -): EvaluatedBinding[] { -``` - -### Which Binding Wins? - -Each account resolves its **own** route independently. There is no conflict because: - -- Default-Bot (account `default`) + binding `{match: {channel: "slack", accountId: "default", peer: {kind: "channel", id: "C_CTO"}}, agentId: "cto"}` resolves to agent `cto` -- CoS-Bot (account `cos`) + binding `{match: {channel: "slack", accountId: "cos"}, agentId: "cos"}` resolves to agent `cos` - -The two accounts process events in parallel; each resolves its own agentId based on its own bindings. - ---- - -## 5. requireMention Scope - -### Source Files -- `extensions/slack/src/monitor/channel-config.ts` -- `extensions/slack/src/monitor/message-handler/prepare.ts` - -### Code Snippet - -```typescript -const shouldRequireMention = isRoom - ? (channelConfig?.requireMention ?? ctx.defaultRequireMention) - : false; -``` - -Where `channelConfig` is resolved per-channel: -```typescript -const channelConfig = isRoom - ? resolveSlackChannelConfig({ - channelId: message.channel, - channelName, - channels: ctx.channelsConfig, - channelKeys: ctx.channelsConfigKeys, - defaultRequireMention: ctx.defaultRequireMention, - allowNameMatching: ctx.allowNameMatching, - }) - : null; -``` - -And `ctx.channelsConfig` comes from the **per-account merged config**: - -```typescript -channelsConfig: slackCfg.channels, -``` - -Where `slackCfg = account.config` (merged from account-specific + global config). - -### Per-Account or Global? - -**Per-account.** Each account's `SlackMonitorContext` has its own `channelsConfig` and `defaultRequireMention` derived from its merged config. 
However, the channel config entries themselves (`channels.slack.channels.*`) are typically shared globally in the config file unless explicitly overridden in `channels.slack.accounts.<id>.channels.*`. - -In practice: -- If `channels.slack.channels.C_CTO.requireMention: true` is set globally, **both** Default-Bot and CoS-Bot accounts will see the same `requireMention: true` for channel `C_CTO` -- To give CoS-Bot different mention requirements, you would need `channels.slack.accounts.cos.channels.C_CTO.requireMention: false` (if supported by the merge logic in `resolveMergedAccountConfig`) - -The `mergeSlackAccountConfig` function in `accounts.ts` does merge account-level config over global: - -```typescript -return resolveMergedAccountConfig<SlackAccountConfig>({ - channelConfig: cfg.channels?.slack as SlackAccountConfig | undefined, - accounts: cfg.channels?.slack?.accounts as Record<string, Partial<SlackAccountConfig>> | undefined, - accountId, -}); -``` - -So **per-account channel config overrides are supported** via the `accounts.<id>.channels` namespace. - ---- - -## 6. Verdict - -**Can CoS-Bot and CTO (via Default-Bot) have a conversation in #cto?** - -### PARTIALLY -- with important caveats - -### Evidence For (YES, it can work): - -1. **Self-loop filter is per-account**: Each account only filters out messages from its own `botUserId`. CoS-Bot's messages won't be filtered by Default-Bot's self-loop check, and vice versa. - -2. **allowBots enables cross-bot reception**: Setting `allowBots: true` on the `#cto` channel config (or per-account) will allow each bot to receive messages from the other bot. - -3. **Bindings route correctly**: Account-scoped bindings ensure Default-Bot's events route to the CTO agent and CoS-Bot's events route to the CoS agent. - -4. **No cross-account dedup**: Unlike Feishu, Slack accounts have fully independent event streams. Both accounts will process events independently without interfering with each other. - -5. 
**Independent Bolt Apps**: Each account runs its own Slack Bolt App with its own WebSocket connection, so there's no event contention. - -### Evidence For Caveats (PARTIALLY): - -1. **Both bots see ALL messages in #cto**: When a human user posts in #cto, BOTH Default-Bot and CoS-Bot receive the event independently (from their own Slack event streams). This means both CTO agent AND CoS agent will process the message and potentially respond, unless properly gated. The inbound dedup at `inbound-dedupe.ts` includes `accountId` in its key, so the same physical message will NOT be deduped across accounts. - -2. **requireMention is the primary gating mechanism**: To prevent both agents from responding to every message, `requireMention` must be configured. But if CoS-Bot specifically @mentions Default-Bot's agent name, that mention is resolved by Default-Bot's context using its own `botUserId` and `mentionRegexes`. The cross-bot mention resolution works correctly because `explicitlyMentioned` checks `message.text?.includes(<@${ctx.botUserId}>)` using each account's own bot user ID. - -3. **Infinite loop risk**: If `allowBots: true` is set and `requireMention` is not configured, the CTO agent responding will trigger CoS agent to respond (because CoS-Bot sees the CTO agent's reply as a bot message in #cto), and vice versa. The code has NO built-in loop-breaker beyond requireMention and behavioral instructions. The docs explicitly warn: "if you allow replying to other bots (allowBots=true), use requireMention, users allowlist, and/or explicit guardrails in AGENTS.md and SOUL.md to prevent bot reply loops." - -4. **Thread behavior**: When CoS-Bot posts in #cto, and CTO agent replies (via Default-Bot), the reply goes to the channel or thread. 
If CTO agent responds in-thread, CoS-Bot will receive the thread reply event, and the thread participation check (`hasSlackThreadParticipation`) means CoS agent may get an **implicit mention** for subsequent thread messages, bypassing `requireMention`. - -5. **Session isolation**: Each agent's conversation is tracked in separate sessions (because `accountId` and `agentId` differ). This means neither agent sees the other's conversation history natively -- they only see the raw Slack messages. Cross-agent context must be inferred from the Slack message text. - -### Required Configuration for the Claimed Scenario: - -```yaml -channels: - slack: - accounts: - default: - botToken: xoxb-DEFAULT-BOT-TOKEN - appToken: xapp-DEFAULT-APP-TOKEN - channels: - C_CTO: - allowBots: true # Required: accept messages from CoS-Bot - requireMention: true # Recommended: prevent responding to everything - cos: - botToken: xoxb-COS-BOT-TOKEN - appToken: xapp-COS-APP-TOKEN - channels: - C_CTO: - allowBots: true # Required: accept messages from Default-Bot - requireMention: true # Recommended: prevent responding to everything - -bindings: - - match: { channel: slack, accountId: default, peer: { kind: channel, id: C_CTO } } - agentId: cto - - match: { channel: slack, accountId: cos } - agentId: cos -``` - -### Summary - -The original claim is **technically accurate** -- the code supports it -- but the claim omits critical operational details: - -| Claim | Verified? | Notes | -|-------|-----------|-------| -| "CoS-Bot posts in #cto, CTO agent receives it" | YES | Requires `allowBots: true` on Default-Bot's channel config | -| "CTO responds, CoS agent sees the response" | YES | Requires `allowBots: true` on CoS-Bot's channel config | -| "They can have a back-and-forth conversation" | PARTIALLY | Works but requires careful loop prevention; no built-in cross-bot conversation termination | - ---- - -## 7. What I Could NOT Find - -1. 
**Explicit cross-account dedup for Slack**: I confirmed Feishu has `tryRecordMessagePersistent(ctx.messageId, "broadcast")` for cross-account dedup. Slack has NO equivalent. I could not find any code that deduplicates the same physical human message across two Slack accounts. This means when a human posts in #cto, both CTO agent and CoS agent WILL process it independently. - -2. **Loop detection/circuit-breaker**: I found no automatic bot-to-bot loop detection beyond the self-loop filter. The protection is entirely behavioral (requireMention + AGENTS.md guardrails). - -3. **Thread participation cross-account behavior**: I found `hasSlackThreadParticipation` which tracks sent messages per `accountId + channel + threadTs`, but I could not find whether this tracking affects the OTHER account's implicit mention resolution. Each account has its own `sent-thread-cache`, so if Default-Bot posts in a thread, CoS-Bot's `hasSlackThreadParticipation` for that thread would be `false` (unless CoS-Bot has also posted there). This means **implicit mention via thread participation does NOT cross account boundaries**, which is good for loop prevention but means CoS agent won't automatically follow thread conversations started by CTO agent unless explicitly @mentioned. - -4. **`resolveMergedAccountConfig` deep merge behavior**: I read the function signature but did not retrieve its full implementation to confirm whether nested `channels.*` config in `accounts.<id>` truly deep-merges or shallow-replaces. The merge behavior affects whether per-account channel config overrides work as expected. - -5. **Rate limiting or throttling between bot accounts**: I found no evidence of cross-account rate limiting that would prevent rapid back-and-forth between two bots. diff --git a/CLAUDE.md b/CLAUDE.md deleted file mode 100644 index ed74c07..0000000 --- a/CLAUDE.md +++ /dev/null @@ -1,98 +0,0 @@ -# OpenCrew — Project Context - -> This file is local development context for Claude Code. 
Do NOT commit to the repo. - -## Core Philosophy - -OpenCrew is a multi-agent operating system where **agents are the primary operators, humans are strategic decision-makers**. Everything flows from three principles: - -### 1. Minimize Human Intervention - -Users should be able to quickly identify the absolute minimum manual steps, then delegate everything else. - -**What MUST be human:** -- Creating platform apps/bots (Slack/Discord/Feishu) — requires browser auth, OAuth consent -- Granting privileged intents / permissions — platform-level human verification -- Providing credentials (tokens, secrets) — security boundary -- L3 decisions (irreversible: deploy, delete, trade) — Autonomy Ladder enforcement - -**What agents handle:** -- Reading DEPLOY.md and executing deployment steps -- Copying workspace files, creating symlinks, directory structure -- Fetching channel/group IDs via API -- Merging config snippets into openclaw.json (incremental, bounded) -- Restarting gateway and verifying connectivity -- All L1/L2 operations (reversible work, impactful-but-rollbackable changes) - -### 2. Agent-First Configuration - -Remaining setup is done by Tool Agents — but with strict boundaries: - -- **Incremental patches, not rewrites**: CONFIG_SNIPPET files define minimal additions to merge, never full config replacements. Agent must not touch auth/models/gateway sections. -- **Config enforces what docs can't**: A2A allowlist, maxPingPongTurns, subagent deny — all in openclaw.json, not just documentation. If a rule can be a config constraint, it must be. -- **Bounded tool use**: Agents modify specific config sections, not broad file rewrites. `rsync --ignore-existing` for workspace files, targeted JSON merges for config. - -### 3. Reliable Instruction Following Across Models - -Protocols are structured so different LLMs can follow them step-by-step: - -- **SOUL.md first**: Identity anchors all decisions. Read before workflow. 
-- **Fixed read order**: SOUL → AGENTS → USER → MEMORY → shared protocols -- **Explicit state machines**: AGENTS.md specifies exact flowcharts (receive input → classify QAPS → branch), not fuzzy guidelines -- **Fixed-format templates**: Closeout, Checkpoint, Subagent Packet — all have mandatory fields, not suggestions -- **Numerical signals**: Signal score 0-3 in closeouts removes subjective judgment from KO filtering -- **Config > docs**: Hard constraints in openclaw.json trump soft constraints in .md files - -## Document Audience Map - -| Audience | Documents | -|----------|-----------| -| **Human only** | README, CONCEPTS, ARCHITECTURE, FAQ, KNOWN_ISSUES, JOURNEY, CUSTOMIZATION | -| **Agent only** | SOUL.md, AGENTS.md, USER.md, MEMORY.md, shared/*.md, AGENT_ONBOARDING | -| **Both (bridge)** | DEPLOY.md (human: prerequisites + credentials; agent: execution steps), GETTING_STARTED (human guide referencing agent-executable setup), CONFIG_SNIPPET_*.md (agent reads during deploy, human reviews) | -| **Human → Agent handoff** | Platform setup docs (SLACK_SETUP, DISCORD_SETUP, FEISHU_SETUP): human does platform-side steps, then hands credentials to agent for OpenClaw-side config | - -## Architecture Quick Reference - -``` -Layer 1: Intent Alignment — YOU + CoS (strategic partner, not gateway) -Layer 2: Execution — CTO → Builder, CIO (swappable domain), Research (spawn-only) -Layer 3: Maintenance — KO (knowledge distillation), Ops (audit, governance) -``` - -- Channel = Role, Thread = Task -- Autonomy Ladder: L0 suggest → L1 reversible → L2 impactful → L3 irreversible -- Task types: Q (query) → A (artifact) → P (project) → S (system change) -- Knowledge pipeline: Raw chat → Closeout (25x compression) → KO extraction - -## Working With This Repo - -### What belongs in the repo -- Agent-facing protocols (shared/*.md, workspaces/*/SOUL.md etc.) 
-- Human-facing documentation (docs/*, README, DEPLOY) -- Platform setup guides and config snippets - -### What does NOT belong in the repo -- `.claude/` (local Claude Code config and agents) -- `.harness/` (local development harness artifacts) -- `CLAUDE.md` (this file — local project context) -- Research reports, QA reports, architecture drafts (local harness outputs) - -### When editing docs -- Platform setup instructions must be verified against official docs (Discord API, Feishu Open Platform, Slack API) -- Config keys must be verified against actual OpenClaw releases — never fabricate -- Update both Chinese and English versions -- Remember: setup docs are a human-agent bridge. Mark clearly which steps are human-manual vs agent-automated -- YAML/JSON examples in code blocks must use ASCII quotes, never typographic quotes - -### When proposing architecture changes -- Distinguish config-layer constraints (reliable, system-enforced) from doc-layer constraints (soft, agent-voluntary) -- Favor pushing rules into openclaw.json over adding them to .md files -- Changes to shared/*.md affect ALL agents — review impact across roles -- v2-lite direction: fewer files, more config constraints, simpler protocols - -## Active Context - -- **Issues**: #31 (Feishu multi-agent), #33 (doc ordering), #34 (Discord routing) — addressed in PR #36 -- **A2A Evolution**: Research completed on Slack/Feishu/Discord multi-bot capabilities. Architecture reports in local `.harness/reports/`. Key finding: Slack supports true multi-agent discussion NOW via multi-account + @mention; Feishu limited by platform (bot messages invisible to other bots); Discord blocked by OpenClaw #11199. -- **PR #36**: Documentation fixes only. Local harness/agent files stay local. 
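From 400cd2c0c08bb3fa32b7c591e63c26ba42cae892 Mon Sep 17 00:00:00 2001 From: Alex's Mac <alexmac@AlexsdeMac-mini-2.local> Date: Fri, 3 Apr 2026 07:54:43 +0800 Subject: [PATCH 4/4] docs: update README for A2A v2 + fix protocol status inconsistencies MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit README: - Replace March status update with April A2A v2 announcement - Add A2A v2 architecture diagram reference and three-step setup summary - Explain "selective independence" and Harness Design connection - Update core concepts: A2A two-step → A2A two modes (table) - Add Discussion mode explanation in A2A section - Update "stable vs exploring": Discussion moves to stable Protocol: - Fix two leftover [Pending POC Verification] → [Verified] in terminology and platform matrix (§2b and §7 were already updated) Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com> --- README.md | 44 ++++++++++++++++++++++++++++-------------- shared/A2A_PROTOCOL.md | 4 ++-- 2 files changed, 32 insertions(+), 16 deletions(-) diff --git a/README.md b/README.md index 8f60b35..6969e59 100644 --- a/README.md +++ b/README.md @@ -14,19 +14,23 @@ --- -## 📢 Status Update (March 19, 2026) +## 📢 Status Update (April 2026) -Thanks to the growing number of people following OpenCrew. The project has gone a while without updates, but this is the architecture I use myself every day: **the project has not stopped, and the direction has not changed.** Updates will continue when the time is right, and you are welcome to open Issues with problems and requests. +### A2A v2: Agents can now truly discuss with each other -OpenCrew is still early and many implementations are not yet efficient, but the core goal has always been clear: **let everyone manage a multi-Agent team well, with organic collaboration and stable iteration.** +We have removed the biggest architectural limitation OpenCrew has had since its creation: **Agents can now discuss problems in the same Slack channel the way people do**, rather than only delegating tasks one way. -What I am working on now: merging kanban and chat-interface project management, smart Agent Onboarding (distilling a selection methodology from the vast pool of open-source Agents and Skills so the system can bring in new Agents automatically), architecture simplification (A2A currently relies on a patch-based approach; I am tracking system-level support in upstream OpenClaw), and exploring a memory management system better suited to multi-Agent architectures. New techniques are being tested in every direction, but I do not want to ship unverified stopgaps. The open-source ecosystem is changing fast, and I will ship a quality update when the time is ripe. +**Before**: all Agents share one Slack bot → a bot cannot trigger itself → Agents can only collaborate through one-way `sessions_send` delegation. -In my own experience, **Slack is still the best option for managing multiple Agents.** I manage 17 Agents across two devices in a single Workspace, and the experience is already smooth. Slack's newly launched Activity page (similar to an email inbox) is well suited to processing Agent notifications in batches. +**Now**: create an independent Slack App for a few key Agents (such as your Chief of Staff / orchestrator) → pull it into any Agent's channel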
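→ the two Agents talk directly, and you can watch in real time. -I believe most people following this repository already have deep experience collaborating with Agents and AI, and want to use multiple Agents to create more value in work and life. Building on that, many may also be thinking about a further question: **how to truly use the capabilities of Coding Agents to build production-grade applications, rather than just making a demo with tools like Lovable.** Another project I have been working on recently, **[Agent-First Development](https://github.com/AlexAnys/agent-first-dev)**, centers on exactly this direction. It is aimed at builders from non-technical or full-stack backgrounds, its content framework draws on related courses from Stanford (with permission) and Chicago Booth, and it is also under active development. You are welcome to follow it. +![A2A v2 architecture](docs/OpenCrew-A2A-V2-架构图.svg) -**Thank you to every follower for your patience. The next update is not far off.** +This architecture borrows the core insight of [Anthropic Harness Design](https://www.anthropic.com/engineering/harness-design-long-running-apps): **the generator (execution) and the evaluator (QC) must be separated**, because when the same AI both does the work and checks it, it tends to go easy on itself. In OpenCrew, the orchestrator handles planning and quality control, execution-layer Agents do the work, and the two collaborate naturally in Slack channels through @mention. + +**Setup takes just three steps**: create an independent Slack App → configure multi-account → pull the bot into the target channel. See → [Discussion Mode Setup Guide](docs/A2A_SETUP_GUIDE.md) + +> More technical details and lessons from real-world pitfalls → [A2A Protocol v2](shared/A2A_PROTOCOL.md) · [Core Concepts](docs/CONCEPTS.md#a2a-的两种模式delegation-与-discussion) --- @@ -226,9 +230,14 @@ OpenCrew runs on a few key mechanisms. Below is a 30-second overview; for details | P | Project (multi-step, multi-day) | Required + Checkpoint | | S | System change | Required + Ops audit | -**A2A two-step trigger**: how Agents collaborate +**A2A two modes**: how Agents collaborate + +| Mode | Use case | Mechanism | Platform | +|------|---------|------|------| +| **Delegation** | Assigning a specific task | `sessions_send` two-step trigger | Slack / Discord / Feishu | +| **Discussion** | Multi-party discussion, review, negotiation | Independent Bot @mention conversation | Slack | -Because all Agents share one Slack bot, messages the bot posts do not trigger itself. So cross-Agent collaboration takes two steps: first post a visible message (the anchor) in the target channel, then use `sessions_send` to actually trigger the other Agent. Details → [A2A Protocol](shared/A2A_PROTOCOL.md) +Delegation is the foundation: CTO assigns work to Builder. Discussion is the enhancement: the orchestrator walks into CTO's channel, the two Agents discuss the plan directly, and you watch from the side. Details → [A2A Protocol v2](shared/A2A_PROTOCOL.md) **Three-layer knowledge capture**: how experience turns from chat logs into organizational assets @@ -243,12 +252,18 @@ Layer 2: abstract knowledge distilled by KO (principles / patterns / pitfall notes) ## Getting the A2A closed loop running > After deployment, each Agent being able to reply to messages ≠ Agents being able to collaborate. -> The A2A (Agent-to-Agent) closed loop requires extra configuration and verification. +> A2A (Agent-to-Agent) requires extra configuration. OpenCrew supports two modes: -### What is the A2A closed loop?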
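+### Delegation: assigning tasks You give CTO a development task in `#cto` → CTO automatically dispatches the work to Builder in `#build` → Builder executes in rounds inside the thread → each round's progress is visible in Slack → CTO returns to `#cto` to report the result. **All you need to do throughout is watch Slack.** +### Discussion: multi-Agent collaboration `NEW` + +Pick one Agent as your orchestrator (such as CoS) and create an independent Slack App for it → pull it into CTO's and Builder's channels → the orchestrator @mentions an execution Agent in the channel to start a discussion → the execution Agent replies → the orchestrator evaluates, follows up, or wraps up → **what you see is two Agents discussing a problem in Slack like colleagues.** + +> Why separate them? Borrowing from Anthropic Harness Design: when the same AI both executes and self-checks, it tends to go easy on itself. The orchestrator focuses on planning and QC while execution Agents focus on doing the work. This division of labor is the key to making AI collaboration actually effective. + ### Let your Agent complete A2A setup automatically > ⚠️ **First-time setup note**: during the A2A closed-loop flow, the Agent checks and fills in the A2A configuration in `openclaw.json` (such as `agentToAgent.allow` and `maxPingPongTurns`). Configuration changes **automatically trigger an OpenClaw gateway restart**, which briefly interrupts every Agent's current session. This is a normal one-time setup step: Agents recover automatically once the restart completes, and you only need to start the verification steps again. @@ -288,7 +303,7 @@ Layer 2: abstract knowledge distilled by KO (principles / patterns / pitfall notes) | **[Complete Getting Started Guide](docs/GETTING_STARTED.md)** | Detailed steps from zero to a working setup + common questions | First deployment | | **[Core Concepts in Depth](docs/CONCEPTS.md)** | Full explanation of autonomy levels, QAPS, A2A, and knowledge capture | Want to understand the system in depth | | **[Architecture Design](docs/ARCHITECTURE.md)** | Three-layer architecture, design trade-offs, and why it is built this way | Want to understand the design thinking | -| **[A2A Setup Guide](docs/A2A_SETUP_GUIDE.md)** | A2A configuration, workspace patches, verification steps | Enabling collaboration between Agents | +| **[A2A Setup Guide](docs/A2A_SETUP_GUIDE.md)** | Delegation + Discussion configuration, multi-account setup, verification steps | Enabling collaboration between Agents | | **[Customization Guide](docs/CUSTOMIZATION.md)** | Adding, removing, and modifying Agents; swapping domain experts | Want to adjust the team setup | | **[Known Issues](docs/KNOWN_ISSUES.md)** | The system's real boundaries and current best practices | When you hit odd behavior | | **[Development Journey](docs/JOURNEY.md)** | From one person's pain point to a virtual team | Want the backstory | @@ -312,7 +327,8 @@ Layer 2: abstract knowledge distilled by KO (principles / patterns / pitfall notes) ### ✅ Running stably - Multi-Agent domain division + channel binding (Slack / Feishu / Discord) -- A2A two-step trigger (Slack visible anchor + sessions_send) +- A2A Delegation (two-step trigger: Slack visible anchor + `sessions_send`) +- A2A Discussion (independent Bot @mention collaboration, verified on Slack) `NEW` - A2A closed loop (multi-round WAIT discipline + dual-channel audit trail + closed-loop DoD) - Closeout / Checkpoint enforced structured artifacts - Autonomy Ladder (L0-L3) @@ -323,7 +339,7 @@ Layer 2: abstract knowledge distilled by KO (principles / patterns / pitfall notes) - Better knowledge system (cross-session semantic retrieval) - Lighter architecture (v2-lite: 7 Agents → 5, 9 shared files → 3) -- A more stable approach to independent sessions for Slack root messages +- Discord Discussion mode (blocked by an OpenClaw code-level bug, not a platform limitation) --- diff --git a/shared/A2A_PROTOCOL.md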
b/shared/A2A_PROTOCOL.md index 95510e1..538ac4e 100644 --- a/shared/A2A_PROTOCOL.md +++ b/shared/A2A_PROTOCOL.md @@ -17,7 +17,7 @@ - **A2A**: umbrella term for the Agent-to-Agent collaboration flow, covering two modes: Delegation and Discussion. - **Task Thread**: a task thread created in the target Agent's channel; that thread is the task's independent Session. - **Delegation**: structured task delegation triggered by `sessions_send`, available on all platforms. -- **Discussion**: real-time multi-Agent discussion triggered by @mention, Slack multi-Bot only [Pending POC Verification]. +- **Discussion**: real-time multi-Agent discussion triggered by @mention, Slack multi-Bot only [Verified]. - **Multi-Account**: each Agent uses an independent Slack App (its own bot token / app token / bot user ID). - **Orchestrator**: the role that controls the pace of a discussion. Defaults to CoS (driving things forward on the user's behalf), but can also be a human. @@ -229,7 +229,7 @@ Rounds Used: N/M | Capability | Slack | Discord | Feishu | |------|-------|---------|--------| | Delegation | YES | YES | YES | -| Discussion | Pending POC verification | NO (blocked at the OpenClaw code level) | NO (Feishu platform limitation) | +| Discussion | ✅ Verified | NO (blocked at the OpenClaw code level) | NO (Feishu platform limitation) | | Multi-Account | YES | YES | YES (note #47436) | | Thread/Topic isolation | YES (native) | YES (auto-archive) | YES (groupSessionScope >= 2026.3.1) |