v0.5.4 - Fix workflow issues and enhance task complexity handling #4

Open
namos2502 wants to merge 9 commits into main from dev

Conversation

@namos2502
Owner

@namos2502 namos2502 commented Mar 21, 2026

This pull request introduces a new "task complexity" system to the orchestration skill, refines delegation logic, and clarifies reporting and prompt instructions for both the orchestration and agent skills. The changes aim to improve task routing, ensure appropriate spec detail, and optimize agent interactions—especially for complex or high-risk tasks.

Task Complexity and Delegation Logic:

  • Introduced a task complexity scale (Simple / Standard / Complex) in skills/orchestration/SKILL.md, with clear signals and delegation styles for each level. This classification determines the level of detail in prompts and whether a Q&A phase is triggered before execution.
  • Updated the orchestration decision tree: now checks task complexity before type, explicitly prefers native subagents for simple tasks, and clarifies when cross-CLI delegation is justified.
  • Added an effort threshold: simple, single-operation, or read-only tasks are handled inline or with native subagents to avoid unnecessary overhead.
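
A minimal sketch of the complexity-first routing, following the decision tree quoted in the review comments further down this thread. The `Task` fields and the return labels are hypothetical names chosen for illustration; the skill itself expresses these signals in prose, not code.

```python
from dataclasses import dataclass
from enum import Enum


class Complexity(Enum):
    SIMPLE = "simple"
    STANDARD = "standard"
    COMPLEX = "complex"


@dataclass
class Task:
    # Hypothetical signal flags; the skill describes these in prose.
    complexity: Complexity
    native_subagent_available: bool = False
    platform_specific: bool = False
    needs_context_isolation: bool = False
    needs_different_model: bool = False


def route(task: Task) -> str:
    """Return 'inline', 'native-subagent', or 'cross-cli'."""
    # Complexity gate comes first: simple work never pays cross-CLI overhead.
    if task.complexity is Complexity.SIMPLE:
        return "inline"
    # A capable host-native subagent is preferred over any cross-CLI hop.
    if task.native_subagent_available:
        return "native-subagent"
    # Cross-CLI needs an explicit reason.
    if (task.platform_specific
            or task.needs_context_isolation
            or task.needs_different_model):
        return "cross-cli"
    return "inline"


def spec_style(task: Task) -> str:
    """Spec detail for a task that has already been routed cross-CLI."""
    if task.complexity is Complexity.COMPLEX:
        return "full spec + Q&A turn before execution"
    return "full spec (problem + acceptance criteria)"
```

Keeping `route` and `spec_style` separate mirrors the later commit note that complexity governs spec detail rather than the routing decision itself.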

Prompt and Reporting Protocols:

  • For complex tasks, added a mandatory Q&A phase: the control center appends a question block to the prompt, the agent surfaces ambiguities, and execution only proceeds after clarification.
  • Standardized report format instructions—now require plain text labels (no bold or headers) and more explicit handling of status semantics for analysis tasks.
  • For copilot-cli, delegation prompts must now always include a critical instruction to return only the structured report, with no extraneous reasoning or output.
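
One way the control center might assemble a delegation prompt under these rules, as a rough sketch. `build_prompt` and its parameters are hypothetical. The `[Task]`, `[Context]`, `[Report format]` and `[Before executing]` labels come from the templates quoted in this PR, but the Q&A wording is adapted and `CRITICAL_LINE` is a stand-in, since the PR only shows an abbreviated form of that instruction.

```python
# Adapted from the Q&A template quoted in the review; not the exact skill text.
QA_BLOCK = (
    "[Before executing]: This task is complex. In your first response, list any "
    "questions or ambiguities (1 turn only). Do not perform any actions until I confirm."
)

# Stand-in wording: the PR elides part of the real instruction.
CRITICAL_LINE = "Critical: Return ONLY the structured report, with no reasoning or preamble."


def build_prompt(task: str, context: str, *,
                 complex_task: bool = False,
                 agent: str = "claude-cli") -> str:
    """Assemble a plain-text delegation prompt for one agent."""
    lines = [
        f"[Task]: {task}",
        f"[Context]: {context}",
        "[Report format]: plain text labels only, no bold or headers",
    ]
    # copilot-cli prompts always carry the report-only instruction.
    if agent == "copilot-cli":
        lines.append(CRITICAL_LINE)
    # Complex tasks get the question block appended last, so the agent
    # surfaces ambiguities before doing any work.
    if complex_task:
        lines.append(QA_BLOCK)
    return "\n".join(lines)
```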

Control Center and Synthesis Improvements:

  • Added guidance to avoid duplicate fetches: control center should fetch shared context (e.g., PR diffs) once and pass it to agents, rather than letting each agent fetch independently.
  • Introduced a synthesis step: after collecting agent reports, the control center now produces a consolidated summary for the user, never surfacing raw agent output directly.
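
The synthesis step could look roughly like the sketch below. `AgentReport`, its fields, and the verdict strings are assumptions made for illustration; the only behaviour taken from this PR is that the user sees one consolidated summary rather than raw agent output, and that a failed subtask surfaces as a blocker and holds the verdict.

```python
from dataclasses import dataclass, field


@dataclass
class AgentReport:
    # Hypothetical shape of a parsed structured report.
    agent: str
    status: str                      # "verified", "partial", or "failed"
    summary: str
    issues: list[str] = field(default_factory=list)


def synthesize(title: str, reports: list[AgentReport]) -> str:
    """Merge agent reports into one user-facing summary."""
    failed = [r for r in reports if r.status == "failed"]
    if failed:
        # Partial failure: surface the blocker and hold the verdict.
        blockers = "; ".join(f"{r.agent}: {r.summary}" for r in failed)
        return f"{title}\nVerdict: on hold\nBlocker: {blockers}"
    # Lead with the issues list; agent attribution is left out by default.
    issues = [issue for r in reports for issue in r.issues]
    verdict = "issues found" if issues else "no issues found"
    lines = [title, f"Verdict: {verdict}"]
    lines += [f"- {issue}" for issue in issues]
    return "\n".join(lines)
```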

Documentation and Release:

  • Updated CHANGELOG.md for version 0.5.4, summarizing the new features and protocol changes.
  • Bumped plugin version to 0.5.4 in .claude-plugin/plugin.json.

Summary by CodeRabbit

  • New Features

    • Task complexity tiers (Simple/Standard/Complex) for smarter routing
    • Q&A step for Complex tasks and a final Synthesize step to produce a single user-facing verdict
    • Cross-agent rule to avoid duplicate data fetches
  • Documentation

    • Added status semantics for analysis results (Verified/Partial/Failed)
    • Tightened delegation/report formatting and an explicit constraint requiring structured-report-only output
    • Noted known cleanup/session behavior
  • Chores

    • Version bumped to 0.5.4

namos2502 and others added 2 commits March 21, 2026 11:10
- Suppress intermediate reasoning: add no-reasoning Critical note to copilot-cli delegation prompt
- Avoid duplicate fetches: control center fetches shared data once, passes as context
- Enforce plain text report format: ban bold/headers in delegation template
- Fix ⚠️ semantics in claude-cli: static analysis complete = ✅, not ⚠️
- Add Synthesis Step template for consolidated control center output

Issues 3 (model routing) and 7 (background delegation) parked to ~/Projects/xflow-plans/

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Add Simple/Standard/Complex scale to orchestration skill
- Simple tasks stay inline — cross-CLI overhead not justified
- Updated decision tree checks complexity before task type
- Native subagent tier made explicit in routing
- Q&A phase for Complex tasks: 1-turn clarification before execution
- Bump version to 0.5.4

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
@namos2502 namos2502 self-assigned this Mar 21, 2026
@coderabbitai

coderabbitai bot commented Mar 21, 2026

📝 Walkthrough

Bumps plugin manifest version to 0.5.4 and updates documentation: adds task complexity classification for orchestration routing, a Q&A phase for Complex tasks, status semantics for analysis tasks, an enforced structured-report constraint for Copilot CLI delegation, duplicate-fetch avoidance, and a final synthesis step.

Changes

| Cohort / File(s) | Summary |
|---|---|
| **Version & Changelog**<br>`.claude-plugin/plugin.json`, `CHANGELOG.md` | Manifest version incremented from 0.5.3 → 0.5.4. Changelog entry added describing orchestration additions: task complexity scale, effort thresholds, Complex Q&A phase, and updated decision ordering. |
| **Claude CLI skill docs**<br>`skills/agents/claude-cli/SKILL.md` | Adds "Status semantics for analysis tasks" with three explicit status levels (✅ Verified, ⚠️ Partial, ❌ Failed) and a constraint forbidding using ⚠️ solely due to inability to execute code. |
| **Copilot CLI skill docs**<br>`skills/agents/copilot-cli/SKILL.md` | Adds mandatory constraint: delegated Copilot CLI must emit an exact "⛔ Critical: … Return ONLY the structured report…" line and output only the structured report (no reasoning/preamble). |
| **Orchestration skill docs**<br>`skills/orchestration/SKILL.md` | Introduces Task Complexity (Simple/Standard/Complex) driving routing and spec requirements; Simple tasks kept inline/native, Standard/Complex may delegate cross-CLI; Complex requires a pre-execution Q&A turn via [Before executing]; adds [Report format] plain-text labels only, "Avoid duplicate fetches" rule, and a final Synthesize step consolidating agent outputs into one verdict. |
| **Misc docs**<br>`commands/cleanup.md` | Adds note about xFlow uninstall quirk: commands may persist until session restart and /reload-plugins won't remove already-loaded plugins. |
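
The status semantics for analysis tasks can be illustrated with a small sketch. The three levels and the rule that ⚠️ must not be used solely because code could not be executed come from the summary above; the function name, its parameters, and the exact conditions for Partial and Failed are assumptions, since the PR does not spell them out.

```python
from enum import Enum


class Status(Enum):
    VERIFIED = "✅"
    PARTIAL = "⚠️"
    FAILED = "❌"


def analysis_status(completed: bool, open_gaps: list[str], could_execute: bool) -> Status:
    """Pick a report status for an analysis task (assumed conditions)."""
    # could_execute is deliberately ignored: being unable to run the code
    # is not, on its own, a reason to downgrade a completed analysis.
    if not completed:
        return Status.FAILED
    if open_gaps:
        return Status.PARTIAL
    return Status.VERIFIED
```

Under these assumptions a completed static analysis with no open gaps reports ✅ even when the agent had no way to execute anything, which matches the commit note "static analysis complete = ✅, not ⚠️".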

Sequence Diagram(s)

sequenceDiagram
  participant User as User
  participant CC as Control Center
  participant Agent as Delegated Agent
  participant Data as Shared Data Source

  User->>CC: Submit task
  CC->>CC: Classify task (Simple/Standard/Complex)
  alt Simple
    CC->>CC: Handle inline or use native subagent
    CC-->>User: Synthesize & respond
  else Standard
    CC->>Data: Fetch shared data (once) 
    CC->>Agent: Delegate with full spec + [Context]
    Agent-->>CC: Structured report (plain-text labels)
    CC->>CC: Synthesize results & verdict
    CC-->>User: Respond
  else Complex
    CC->>Data: Fetch shared data (once)
    CC->>Agent: Delegate with full spec + [Before executing] Q&A block
    Agent-->>CC: Answer Q&A
    CC->>Agent: Refine delegation (if needed)
    Agent-->>CC: Structured report
    CC->>CC: Synthesize results & verdict (resolve ❌ subtasks first)
    CC-->>User: Respond
  end

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~10 minutes

Possibly related PRs

Suggested labels

bug

Poem

🐰 A hop, a bump, from .3 to .4,

Tasks sorted neat, complexity at the core,
Simple zips along, Complex asks and learns,
Agents report cleanly; the control center turns,
A synthesis carrot—bright verdict in store. 🥕

🚥 Pre-merge checks | ✅ 3
✅ Passed checks (3 passed)
| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title directly aligns with the main changeset: v0.5.4 version bump, workflow improvements (synthesis, duplicate fetch avoidance), and the new task complexity system (Simple/Standard/Complex) with enhanced routing logic. |
| Docstring Coverage | ✅ Passed | No functions found in the changed files to evaluate docstring coverage. Skipping docstring coverage check. |


@namos2502 namos2502 changed the title Fix workflow issues and enhance task complexity handling v0.5.4 - Fix workflow issues and enhance task complexity handling Mar 21, 2026
Complexity-gates first (Simple stays inline), then routes by reason:
platform-specific, context isolation, different model.
Complexity level (Standard/Complex) now governs spec detail,
not the routing decision itself.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

@coderabbitai coderabbitai bot left a comment


🧹 Nitpick comments (1)
skills/orchestration/SKILL.md (1)

64-64: Consider adding language specifiers to fenced code blocks.

Three fenced code blocks are missing language specifiers (lines 64, 101, 175), triggering markdownlint MD040 warnings. While these are templates rather than executable code, adding text or markdown identifiers would improve linter compliance and rendering consistency.

📝 Proposed fix for linter compliance

Line 64 (decision tree):

-```
+```text
 New task →

Line 101 (Q&A phase template):

-```
+```text
 [Before executing]: This task is complex. In your first response, list any questions

Line 175 (synthesis step template):

-```
+```text
 **[Task / PR title]**

Also applies to: 101-101, 175-175
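
For checking this locally without running markdownlint, a small sketch that lists opening fences with no language tag. `untagged_fences` is a hypothetical helper, not part of the repository, and it is deliberately simplified: it only recognises three-backtick fences and treats every second fence line as a closer, so tilde fences and longer nested fences are not handled.

```python
FENCE = "`" * 3  # built this way to keep a literal fence out of the example


def untagged_fences(markdown: str) -> list[int]:
    """Return 1-based line numbers of opening fences that lack a language tag."""
    hits: list[int] = []
    in_block = False
    for number, line in enumerate(markdown.splitlines(), start=1):
        stripped = line.strip()
        if not stripped.startswith(FENCE):
            continue
        if in_block:
            # Closing fence: it never carries a language tag.
            in_block = False
        else:
            in_block = True
            if stripped == FENCE:
                hits.append(number)
    return hits
```

Running it over `skills/orchestration/SKILL.md` (for example via `pathlib.Path(...).read_text()`) should return the same line numbers the linter flags, as long as the file only uses plain three-backtick fences.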

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@skills/orchestration/SKILL.md` at line 64, Add language specifiers to the
three fenced code blocks in SKILL.md that are currently untyped: the
decision-tree block (the "New task →" template), the Q&A phase template block
(the block starting with "[Before executing]: This task is complex..."), and the
synthesis step template block (the block starting with "**[Task / PR title]**");
update each opening fence to include a language token such as text or markdown
(e.g., ```text) so the linter (MD040) stops flagging them and rendering is
consistent.

ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: 5cdcd561-f37c-40a9-848c-3d3f953b23e1

📥 Commits

Reviewing files that changed from the base of the PR and between bd2af7d and d8b352c.

📒 Files selected for processing (5)
  • .claude-plugin/plugin.json
  • CHANGELOG.md
  • skills/agents/claude-cli/SKILL.md
  • skills/agents/copilot-cli/SKILL.md
  • skills/orchestration/SKILL.md

namos2502 and others added 6 commits March 21, 2026 11:56
The table template forced a per-agent breakdown even for single-agent
flows. Prose guidance lets the control center lead with what matters
(issues list) and only add agent attribution when it's useful.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Removes standalone Synthesis Step section — guidance now lives as
step 6 in the protocol where it belongs in the workflow sequence.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
- Clarify 'rewrite in your own words' vs surface raw output
- Define severity tiers inline with what they mean
- Add per-agent attribution guidance (only when useful)
- Add partial failure handling: surface blocker, hold verdict

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
/reload-plugins does not remove already-loaded plugins — user must
restart the session for the uninstall to take full effect.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

@coderabbitai coderabbitai bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (1)
skills/orchestration/SKILL.md (1)

54-62: Clarify precedence between “Standard/Complex” and native-subagent routing.

Line 54 can be read as “always cross-CLI for Standard/Complex,” while Line 67 says to prefer host-native subagents first. Please make precedence explicit in prose to match the decision tree.

Suggested wording tweak
-4. Task is **Standard or Complex** — the work inside the agent justifies the delegation overhead
+4. Task is **Standard or Complex** — *after* checking for a suitable host-native subagent, the work may justify cross-CLI delegation

Also applies to: 63-76

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@skills/orchestration/SKILL.md` around lines 54 - 62, The guidance is
ambiguous between the rule "Task is **Standard or Complex** — the work inside
the agent justifies the delegation overhead" and the rule "The host has a native
subagent that can handle it"; update the prose in the SKILL.md section
containing these rules so it states explicit precedence: always prefer the
host-native subagent when available and capable, and only choose cross-CLI agent
delegation for Standard/Complex work if no appropriate native subagent exists or
if the delegation provides clear additional capabilities; revise the sentences
around the "Task is **Standard or Complex**" and the "Do NOT delegate when: ...
The host has a native subagent that can handle it" lines (and the adjacent 63-76
rules) to make that precedence sentence explicit and unambiguous.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.

Inline comments:
In `@skills/orchestration/SKILL.md`:
- Around line 64-76: The three fenced code blocks in SKILL.md (the block
starting with "New task →", the block starting with "[Task]: <one sentence —
what needs to be done>", and the block starting with "[Before executing]: This
task is complex...") are missing language identifiers; update each opening
triple-backtick to include a language tag (e.g., ```text) so markdownlint MD040
is satisfied, leaving the block contents unchanged and keeping the closing ```
as-is.
- Line 174: Update the Agent SKILL docs to explicitly prohibit re-fetching when
source data is provided in [Context]: for each SKILL (e.g., copilot-cli SKILL
and claude-cli SKILL) add a clear sentence such as "If the required source data
is present in [Context], do not re-fetch it; use the [Context] payload as the
single source of truth," and include a short example delegation prompt showing
passing [Context] and avoiding additional fetch calls; ensure the guidance
mentions both not issuing HTTP/file fetches and not embedding raw re-fetched
output in the delegation prompt.


ℹ️ Review info
⚙️ Run configuration

Configuration used: defaults

Review profile: CHILL

Plan: Pro

Run ID: b166f721-f206-47ac-a1e4-14fd8dd55ee5

📥 Commits

Reviewing files that changed from the base of the PR and between d8b352c and 326f55c.

📒 Files selected for processing (2)
  • commands/cleanup.md
  • skills/orchestration/SKILL.md
✅ Files skipped from review due to trivial changes (1)
  • commands/cleanup.md

Comment on lines 64 to 76
```
New task →
Platform-specific? YES → cross-CLI
Context isolation / different model? YES → cross-CLI
Host native subagent available? YES → use it (faster, no auth needed)
Default → handle in current context
Simple? → handle inline (no cross-CLI)
Host native subagent available? → use it (faster, no auth needed)
Platform-specific (GitHub, Anthropic API, etc.)? → cross-CLI
Context isolation needed (verbose output, long subtask)? → cross-CLI
Different model needed? → cross-CLI
Default → handle inline

When going cross-CLI:
Standard → full spec (problem + acceptance criteria)
Complex → full spec + Q&A turn before execution
```

⚠️ Potential issue | 🟡 Minor

Add language identifiers to fenced code blocks (markdownlint MD040).

The fences starting at Line 64, Line 89, and Line 106 are missing language tags.

Lint-compliant patch
-```
+```text
 New task →
   Simple? → handle inline (no cross-CLI)
   Host native subagent available? → use it (faster, no auth needed)
@@
-```
+```

-```
+```text
 [Task]: <one sentence — what needs to be done>
 [Context]: <only what the agent strictly needs>
@@
-```
+```

-```
+```text
 [Before executing]: This task is complex. In your first response, list any questions
 or ambiguities — 1 turn only. Do not perform any actions until I confirm.
-```
+```

Also applies to: 89-100, 106-109

🧰 Tools
🪛 markdownlint-cli2 (0.21.0)

[warning] 64-64: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@skills/orchestration/SKILL.md` around lines 64 - 76, The three fenced code
blocks in SKILL.md (the block starting with "New task →", the block starting
with "[Task]: <one sentence — what needs to be done>", and the block starting
with "[Before executing]: This task is complex...") are missing language
identifiers; update each opening triple-backtick to include a language tag
(e.g., ```text) so markdownlint MD040 is satisfied, leaving the block contents
unchanged and keeping the closing ``` as-is.


Fan out to multiple Agents in parallel when tasks are independent. Pass a report excerpt (SUMMARY + STEPS) as context into the next delegation prompt — never raw output. Agents do not chain to each other; all coordination happens at the control center.

**Avoid duplicate fetches:** When multiple agents need the same source data (e.g. a PR diff, a file's contents), fetch it once in the control center and pass it as `[Context]` in each delegation prompt. Do not let each agent re-fetch the same data independently.

⚠️ Potential issue | 🟠 Major

🧩 Analysis chain

🏁 Script executed:

#!/bin/bash
# Verify whether agent SKILL docs explicitly enforce context-first / no-refetch behavior.
rg -n -C2 -g 'skills/agents/**/SKILL.md' \
  -e 'Context' \
  -e 're-fetch|refetch|fetch.*once|avoid duplicate|do not fetch'

Repository: namos2502/xFlow

Length of output: 1032


Agent SKILL docs need explicit no-refetch guidance when source data is in [Context].

Agent SKILL docs (e.g., skills/agents/copilot-cli/SKILL.md:113 and skills/agents/claude-cli/SKILL.md:144) instruct passing [Context] in the next delegation prompt and not passing raw output, but they do not explicitly state that agents should not refetch source data when it is already available in [Context]. This gap creates ambiguity—agents may still redundantly fetch the same data without clear directive to avoid it.

🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.

In `@skills/orchestration/SKILL.md` at line 174, Update the Agent SKILL docs to
explicitly prohibit re-fetching when source data is provided in [Context]: for
each SKILL (e.g., copilot-cli SKILL and claude-cli SKILL) add a clear sentence
such as "If the required source data is present in [Context], do not re-fetch
it; use the [Context] payload as the single source of truth," and include a
short example delegation prompt showing passing [Context] and avoiding
additional fetch calls; ensure the guidance mentions both not issuing HTTP/file
fetches and not embedding raw re-fetched output in the delegation prompt.
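
As an illustration of the fetch-once rule discussed in this comment, the sketch below fetches shared data a single time in the control center and passes it as `[Context]` to every agent in parallel. `fan_out`, the `fetch` callable, and the agent callables are hypothetical stand-ins; how xFlow actually invokes agents is not shown in this PR.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def fan_out(fetch: Callable[[], str],
            agents: dict[str, Callable[[str], str]],
            task: str) -> dict[str, str]:
    """Fetch shared data once, then delegate to independent agents in parallel."""
    shared = fetch()  # the only fetch; agents never re-fetch this data

    def delegate(item: tuple[str, Callable[[str], str]]) -> tuple[str, str]:
        name, run = item
        prompt = (
            f"[Task]: {task}\n"
            f"[Context]: {shared}\n"
            "Use [Context] as the single source of truth; do not re-fetch it."
        )
        return name, run(prompt)

    # Agents do not chain to each other; results come back to the caller.
    with ThreadPoolExecutor() as pool:
        return dict(pool.map(delegate, agents.items()))
```

The returned mapping is what a synthesis step would consume; passing only a report excerpt into any follow-up delegation is left to the caller.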

