docs(skill): list_recent_decisions hardening [skip-runtime-e2e] by saurabhjain1592 · Pull Request #61 · getaxonflow/axonflow-codex-plugin

saurabhjain1592 · 2026-05-08T13:28:56Z

Adds prescriptive language to the V1.1 list_recent_decisions surface that prevents the agent-runtime hallucinations observed in the 2026-05-08 Cursor IDE GUI evidence (axonflow-cursor-plugin#62).

What changed

Two prescriptive additions to the SKILL.md / tool description:

"Invoke the tool directly — do not pre-flight-check." During the Cursor IDE GUI evidence run, the LLM agent invented an "Authentication required (-32001)" error path before the MCP tool ever returned. A direct curl reproduction with the same headers Cursor sends returns the correct V1.1 upgrade envelope. The MCP server is authoritative; agents shouldn't speculate about auth or descriptor-file state before making the call.
"Reporting integrity." The Cursor agent wrote a SMOKE_RESULT marker for an upgrade envelope when the tool actually returned a decisions array, and vice versa. If the user prompt asks for a structured output marker, derive it from the actual tool response — never substitute one shape for the other.
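The reporting-integrity rule can be illustrated with a minimal sketch that derives the marker strictly from whatever the tool returned. The field names `decisions` and `upgrade`, the marker strings, and the helper name `smokeResultFor` are all assumptions made for illustration; the actual V1.1 wire shape and marker format are not shown in this PR.

```typescript
// Hypothetical sketch: the field names (`decisions`, `upgrade`) and the marker
// strings are assumptions for illustration, not the documented V1.1 contract.
type SmokeMarker =
  | "SMOKE_RESULT: decisions"
  | "SMOKE_RESULT: upgrade_envelope";

// Derive the marker from the actual tool response, never from a guess.
function smokeResultFor(response: unknown): SmokeMarker {
  if (typeof response !== "object" || response === null) {
    throw new Error("tool response is not an object; refusing to guess a marker");
  }
  const body = response as Record<string, unknown>;

  // A decisions array was returned: report decisions, even if the array is empty.
  if (Array.isArray(body.decisions)) {
    return "SMOKE_RESULT: decisions";
  }

  // An upgrade envelope was returned: report the envelope.
  if (typeof body.upgrade === "object" && body.upgrade !== null) {
    return "SMOKE_RESULT: upgrade_envelope";
  }

  // Neither shape matched: surface the mismatch instead of substituting a shape.
  throw new Error("unrecognized response shape; refusing to guess a marker");
}
```

Throwing on an unrecognized shape is a deliberate choice in this sketch: a harness that reads the marker then sees a hard failure rather than a plausible but wrong value.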

Three-rule DOD

Rule 0 — Runtime proof: ride-along on the existing list_recent_decisions runtime-e2e — no behavior change in the platform or the plugin runtime, only agent-side framing.
Rule 1 — Self-review: small textual change, both sentences read cleanly. No code path affected.
Rule 2 — No deferred bugs: none in path.

Reference

axonflow-cursor-plugin/runtime-e2e/list-recent-decisions/EVIDENCE.md — captured Cursor IDE LLM-narration gap that motivated this hardening.

Skip-runtime-e2e justification

This PR only modifies the agent-side prose / TS-string description that the LLM agent reads when deciding how to interact with the list_recent_decisions MCP tool. No platform code, no SDK code, no plugin runtime path is changed — the existing runtime-e2e/list-recent-decisions/test.sh continues to pass against the live stack (verified yesterday during the V1.1 DoD sweep — see axonflow-enterprise/runtime-e2e/v1_1_full_dod_sweep/plugin_host_runtime/).

The hardening is to prevent agent-runtime hallucinations (Cursor agent inventing an "auth required" path before invoking the tool) — that is an agent-prompt concern, not a runtime-wire concern.

Two prescriptive additions to prevent agent-runtime hallucinations observed in 2026-05-08 Cursor IDE GUI evidence (#1982):

1. "Invoke the tool directly — do not pre-flight-check." The Cursor agent invented an "auth required" error path before the MCP tool ever returned. The MCP server's response is authoritative; agents shouldn't speculate about auth or descriptor-file state before making the call.
2. "Reporting integrity." If the user prompt asks for a SMOKE_RESULT marker, derive it from the actual tool response, never from a guess at the response shape. The Cursor agent in the captured run wrote a SMOKE_RESULT for an upgrade envelope when the tool actually returned decisions, and vice versa. Substituting one shape for the other corrupts both UX integrity and any release-prep test harness reading the marker.

Signed-off-by: Saurabh Jain <saurabhjain1592@gmail.com>

…runtime-e2e] Trims the v1.4.0 telemetry section to the project's standard 2-bullet / 5-line compact format at end of release entry. Same wire-shape contract documented; just shorter prose. Sibling to axonflow-openclaw-plugin#120 + axonflow-claude-plugin#74. Signed-off-by: Saurabh Jain <saurabhjain1592@gmail.com>


saurabhjain1592 added 2 commits May 8, 2026 15:26

saurabhjain1592 changed the title ~~docs(skill): tighten list_recent_decisions agent guidance~~ docs(skill): list_recent_decisions hardening [skip-runtime-e2e] May 8, 2026

chore: re-trigger CI after PR title/body update [skip-runtime-e2e]

12c4132

Signed-off-by: Saurabh Jain <saurabhjain1592@gmail.com>

saurabhjain1592 merged commit 3c990f6 into main May 8, 2026
8 checks passed

saurabhjain1592 deleted the docs/skill-list-decisions-llm-hardening branch May 8, 2026 14:21

saurabhjain1592 merged 3 commits into main from docs/skill-list-decisions-llm-hardening
