
Add Cursor agent provider #424

Open
dichioniccolo wants to merge 5 commits into mattpocock:main from dichioniccolo:main

@dichioniccolo dichioniccolo commented Apr 22, 2026

Hey Matt,

This PR brings Cursor in as a first-class provider, which makes Sandcastle a lot more usable for teams that want to use Cursor instead of the current agent setup.


Summary

This PR adds a built-in cursor() agent provider with full scaffold and runtime support.

import * as sandcastle from "@ai-hero/sandcastle";

await sandcastle.run({
  agent: sandcastle.cursor(),
  ...
});

Out of the box, this gives you:

  • Cursor as a selectable provider during setup
  • Structured, streamable output (JSON events)
  • Support for both print and interactive modes
  • Env-based auth (no secrets in code)
  • A clean public API you can import directly

What’s in here

Provider

  • A cursor() factory that follows the existing AgentProvider interface
  • Print mode outputs structured JSON events (so parsing is reliable)

Scaffold

  • Cursor is available during project initialization
  • Generated projects come preconfigured—no manual bootstrapping needed
  • .env.example includes clear auth setup for Cursor
  • A provider-specific Dockerfile installs everything needed
  • Default metadata (model, import, etc.) lives in the agent registry

Exports

  • cursor() is exported from the package
  • Cursor-specific option types are also exported

Registry

  • Cursor is added to the agent catalog for consistent discovery
  • Defaults are centralized so updates are low-risk

Testing

Nothing fancy here—just sticking to the patterns already in the repo and focusing on behavior over implementation details.


Happy to tweak anything if something feels off.

…nteractive args, env support, and exports), adds Cursor to the init agent registry/scaffolding with a dedicated Dockerfile and CURSOR_API_KEY env template, and expands tests to cover provider behavior plus scaffold output

vercel Bot commented Apr 22, 2026

@dichioniccolo is attempting to deploy a commit to the Matt Pocock's projects Team on Vercel.

A member of the Team first needs to authorize it.

@dichioniccolo dichioniccolo marked this pull request as draft April 22, 2026 17:23
… based on permissions in buildPrintCommand and buildInteractiveArgs. Update tests to reflect changes in command construction and default model. Improve robustness of planner output parsing in templates.
@dichioniccolo dichioniccolo marked this pull request as ready for review April 22, 2026 17:50

oferRounds commented May 2, 2026

@dichioniccolo, same as you, I’d also love to have Cursor support!

I asked Cursor to review your PR (using /grill-with-docs and verifying against the latest Cursor docs), and this was the result. Could you take a look? Happy to help further to make it happen.

cc: @mattpocock

Review Document for PR #424 — Add Cursor Agent Provider

Constructive review. Critical + Major findings are merge-blocking ("request changes"). Minor findings are non-blocking suggestions.

Findings verified against Cursor CLI Parameters and Output Format docs.


Blocking findings (request changes)

1. CRITICAL -- Stream parser assumes Anthropic event shapes, but Cursor's format differs

File: src/AgentProvider.ts (parseStreamJsonLine, shared with claudeCode)

Problem: The Cursor provider reuses parseStreamJsonLine, which was written for Claude Code's stream format. After checking Cursor's stream-json docs, the formats are similar but not identical:

| Concern | What parseStreamJsonLine expects | What Cursor CLI emits (per docs) |
| --- | --- | --- |
| Text events | type: "assistant" with message.content[].type === "text" | Same shape -- likely compatible |
| Tool calls | tool_use blocks inside assistant.message.content[] with name: "Bash" and input.command | Top-level type: "tool_call" events with subtype: "started" / "completed" and Cursor-specific nesting (readToolCall, writeToolCall, etc.) -- not tool_use blocks |
| Result | type: "result" with result: string | type: "result" with subtype: "success" and result field -- likely compatible |
| System/init | type: "system", subtype: "init", session_id | Same shape -- likely compatible |

The tool call path is the biggest concern. The test at AgentProvider.test.ts ("parseStreamLine extracts tool_use block (Bash -> command arg)") uses Anthropic-style tool_use inside message.content -- if Cursor emits tool calls as top-level type: "tool_call" events instead, tool invocations will be silently dropped (parser returns [] for unrecognized shapes).

Ask: Could you verify by running a small agent --print --output-format stream-json job and sharing the raw NDJSON output? If tool calls use Cursor's top-level format, the provider needs either a dedicated parseCursorStreamLine function, or an adapter layer that normalizes Cursor events into the shape parseStreamJsonLine expects.


2. MAJOR -- -p flag is used incorrectly in the command string

File: src/AgentProvider.ts (cursor buildPrintCommand)

Problem: The constructed command is:

agent --print --output-format stream-json ${trustFlag} --model ${model} -p ${shellEscape(prompt)}

Per Cursor CLI docs, -p is the short form of --print (non-interactive mode flag). It does not accept a prompt argument. The prompt is a positional argument:

Arguments: prompt — Initial prompt for the agent

So the command is effectively passing --print twice and relying on the CLI to treat the trailing string as a positional argument by accident. The correct form should be:

agent --print --output-format stream-json ${trustFlag} --model ${model} ${shellEscape(prompt)}

This may work today if the CLI is lenient, but it's fragile and could break if flag parsing is tightened.

Ask: Please change -p ${shellEscape(prompt)} to just ${shellEscape(prompt)} (positional argument). Update the related test assertions accordingly.
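A minimal sketch of the positional-argument form, for illustration only. The `shellEscape` body and the `PrintCommandOptions` shape below are assumptions (the PR's real helpers may differ); only the flag order is taken from the command quoted above.

```typescript
// Hypothetical stand-in for the repo's shellEscape: POSIX single-quote escaping.
function shellEscape(value: string): string {
  return "'" + value.replace(/'/g, "'\\''") + "'";
}

// Hypothetical options shape, used only for this sketch.
interface PrintCommandOptions {
  model: string;
  prompt: string;
  /** Optional permissions flag, passed through unchanged. */
  trustFlag?: string;
}

function buildPrintCommand(options: PrintCommandOptions): string {
  const parts = ["agent", "--print", "--output-format", "stream-json"];
  if (options.trustFlag) parts.push(options.trustFlag);
  parts.push("--model", options.model);
  // The prompt is the final positional argument; no `-p` precedes it.
  parts.push(shellEscape(options.prompt));
  return parts.join(" ");
}
```

Putting the prompt last keeps it unambiguous as the positional argument, and a test can assert both that the command ends with the escaped prompt and that it does not contain " -p ".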


3. MAJOR -- PR description and test name contradict the implementation

Files: PR body, src/AgentProvider.test.ts (line ~2192)

Problem: Three user-facing inconsistencies stem from docs not being updated after the implementation changed:

  1. PR body says "Prompts are passed via stdin in print mode" -- code passes the prompt on argv.
  2. Test name is "buildPrintCommand delivers prompt via stdin, not argv" but asserts stdin === undefined (the exact opposite of what the name says).
  3. PR body shows import { cursor } from "@ai-hero/sandcastle/agents/cursor" but package.json has no ./agents/cursor export entry -- cursor is exported from the root (src/index.ts).

Ask: Please align all three with the actual implementation:

  • Fix the test name to match the behavior (e.g. "buildPrintCommand passes prompt as positional argument").
  • Update the PR description to remove the stdin claim.
  • For the import path: either add a "./agents/cursor" subpath export to package.json, or update the example to import { cursor } from "@ai-hero/sandcastle".

4. MAJOR -- Print-mode prompt on argv without a size guard

File: src/AgentProvider.ts (cursor buildPrintCommand)

Problem: The prompt is passed on the command line. Every other provider (claudeCode, codex) uses stdin, specifically because the PrintCommand type's JSDoc warns about the Linux ~128 KiB per-argument limit. The PromptPreprocessor expands shell expressions inline (e.g. file contents), so prompts can realistically exceed that limit.

Nuance: The Cursor CLI docs document the prompt as a positional argument only -- stdin delivery is not documented. So argv is likely the only option, which is a CLI limitation.

Ask: Since stdin doesn't appear to be supported, please add:

  1. A size guard that throws a clear error when the prompt exceeds a safe threshold (e.g. 120 KB), so users get a meaningful message instead of a cryptic E2BIG from the OS.
  2. A code comment explaining why this provider uses argv instead of stdin (Cursor CLI limitation), so future maintainers don't file it as a bug.
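The size guard requested above could look roughly like the sketch below. The function name, the constant, and the reading of "120 KB" as 120 * 1024 bytes are all assumptions made for illustration; the check counts UTF-8 bytes rather than characters, since multi-byte characters take more than one byte on argv.

```typescript
// Assumed threshold: the review suggests "e.g. 120 KB", interpreted here as
// 120 * 1024 bytes, which stays under the ~128 KiB per-argument limit on Linux.
const MAX_ARGV_PROMPT_BYTES = 120 * 1024;

// Hypothetical guard: throws before the command is spawned, so the user sees
// a descriptive error instead of E2BIG from the OS.
function assertPromptFitsArgv(
  prompt: string,
  limit: number = MAX_ARGV_PROMPT_BYTES,
): void {
  // Measure encoded bytes, not string length.
  const bytes = new TextEncoder().encode(prompt).length;
  if (bytes > limit) {
    throw new Error(
      `Cursor print-mode prompt is ${bytes} bytes, above the ${limit}-byte limit. ` +
        "The prompt is passed as a command-line argument because the Cursor CLI " +
        "docs describe it as positional only (stdin delivery is not documented). " +
        "Shorten the prompt or inline less file content.",
    );
  }
}
```

Calling the guard at the top of the print-command builder keeps the failure close to its cause.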

Non-blocking suggestions

5. MINOR -- Greedy regex fallback in extractPlanIssues

Files: src/templates/parallel-planner/main.mts, src/templates/parallel-planner-with-review/main.mts

The last-resort regex /\{[\s\S]*\}/ is greedy -- it captures from the first { to the last } in stdout. The Array.isArray(parsed.issues) validation reduces false-positive risk significantly, and a one-character change makes the match non-greedy, so it stops at the first } instead (note that this still truncates objects that contain nested braces):

const firstObjectMatch = stdout.match(/\{[\s\S]*?\}/); // non-greedy

Also: this function is duplicated identically across both templates. Consider extracting it to a shared utility to prevent drift.
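One caveat: on an input such as {"issues":[{"id":1}]}, the non-greedy pattern stops at the first closing brace and yields {"issues":[{"id":1}, which JSON.parse rejects. The sketch below is one possible alternative, not the PR's implementation. It scans for a balanced object and keeps the Array.isArray check; the function name is hypothetical.

```typescript
// Hypothetical helper: returns the `issues` array from the first balanced
// JSON object in `stdout` that has one, or undefined if none is found.
function extractFirstIssuesObject(stdout: string): unknown[] | undefined {
  for (
    let start = stdout.indexOf("{");
    start !== -1;
    start = stdout.indexOf("{", start + 1)
  ) {
    let depth = 0;
    let inString = false;
    let escaped = false;
    for (let i = start; i < stdout.length; i++) {
      const ch = stdout[i];
      if (inString) {
        // Inside a JSON string, braces do not count; track escapes and the closing quote.
        if (escaped) escaped = false;
        else if (ch === "\\") escaped = true;
        else if (ch === '"') inString = false;
        continue;
      }
      if (ch === '"') {
        inString = true;
      } else if (ch === "{") {
        depth++;
      } else if (ch === "}") {
        depth--;
        if (depth === 0) {
          try {
            const parsed = JSON.parse(stdout.slice(start, i + 1));
            if (Array.isArray(parsed?.issues)) return parsed.issues as unknown[];
          } catch {
            // Not valid JSON; move on to the next opening brace.
          }
          break;
        }
      }
    }
  }
  return undefined;
}
```

This costs more than a regex on large outputs, so it arguably fits best as a last-resort fallback, which is where the regex sits today.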

6. MINOR -- Unpinned Cursor CLI install in Dockerfile

File: src/InitService.ts (CURSOR_DOCKERFILE)

curl https://cursor.com/install | bash installs whatever version is latest at build time. Pinning to a specific version would improve reproducibility. Even if other providers in the repo use a similar pattern, a comment noting the intentional "latest" policy would help.

7. MINOR -- --trust vs --yolo test coverage story

The tests cover dangerouslySkipPermissions: false (producing --trust), but Orchestrator.invokeAgent hardcodes true for all print-mode calls. A brief code comment in the test noting that the false / --trust branch is only exercised via interactive mode would prevent confusion for future readers.


Overall impression

Nice work bringing Cursor in as a first-class provider. The AgentProvider interface integration is clean, permission mapping is coherent (--yolo / --trust), interactive mode uses safe argv delivery, and the env/API-key handling avoids leakage. The scaffold, registry, and test coverage are solid additions.

The main risk is the stream parser compatibility -- the shared parseStreamJsonLine was designed for Claude Code's event shapes, and Cursor's tool call events appear to use a different structure. Verifying this with real CLI output (and potentially adding a Cursor-specific parser) would make this a confident merge.

The -p flag issue and the docs contradictions are straightforward fixes. The argv size guard is a defensive addition worth having given the PromptPreprocessor's ability to expand prompts beyond argv limits.

@mattpocock (Owner)

@oferRounds Really freaking useful, thank you!

…orce, and update tests to reflect changes in prompt handling and argument limits. Introduce error handling for oversized prompts in print mode and enhance stream line parsing for tool_call events.
@dichioniccolo (Author)

Thanks for the thorough review; it was really useful.
I’ve made the suggested changes. Let me know if anything else comes up!

@oferRounds

Amazing, @dichioniccolo!
Please see below what my agent ( :) ) said:


All four blocking findings from the initial review have been addressed. Here's the breakdown:


Previously blocking findings -- all resolved

1. Stream parser format mismatch (was CRITICAL) -- Resolved

Added a dedicated parseCursorStreamLine that intercepts top-level type: "tool_call" events and routes them to parseCursorToolCallStarted, then falls back to the shared parseStreamJsonLine for text/result/system events. Handles readToolCall, writeToolCall, and generic function-style tool calls. Only processes subtype: "started" (ignores "completed" -- tested). Clean composition.
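The composition described above can be sketched roughly as follows. This is not the PR's code: the `ParsedEvent` type, the injected `fallback` parameter, the "Read" tool name, and the `function: { name, arguments }` shape for generic tools are assumptions for illustration. The tool_call envelope and the "Write" name follow the test example in this thread.

```typescript
// Hypothetical normalized event type for this sketch.
type ParsedEvent =
  | { type: "text"; text: string }
  | { type: "tool_call"; name: string; args: string };

// Maps a Cursor "started" tool_call event onto the normalized shape.
function parseCursorToolCallStarted(event: Record<string, unknown>): ParsedEvent[] {
  const call = event.tool_call as Record<string, any> | undefined;
  if (!call) return [];
  if (call.readToolCall) {
    return [{ type: "tool_call", name: "Read", args: String(call.readToolCall.args?.path ?? "") }];
  }
  if (call.writeToolCall) {
    return [{ type: "tool_call", name: "Write", args: String(call.writeToolCall.args?.path ?? "") }];
  }
  if (call.function) {
    // Assumed generic shape: { function: { name, arguments } }.
    return [{
      type: "tool_call",
      name: String(call.function.name ?? "unknown"),
      args: String(call.function.arguments ?? ""),
    }];
  }
  return [];
}

// Intercepts top-level tool_call events; everything else goes to the shared parser.
function parseCursorStreamLine(
  line: string,
  fallback: (line: string) => ParsedEvent[],
): ParsedEvent[] {
  let event: unknown;
  try {
    event = JSON.parse(line);
  } catch {
    return []; // Non-JSON lines are ignored in this sketch.
  }
  if (typeof event !== "object" || event === null) return [];
  const record = event as Record<string, unknown>;
  if (record.type === "tool_call") {
    // Only "started" events are surfaced; "completed" events are dropped.
    return record.subtype === "started" ? parseCursorToolCallStarted(record) : [];
  }
  return fallback(line);
}
```

Passing the shared parser in as `fallback` keeps the sketch self-contained; in the provider itself the delegation target would be the existing parseStreamJsonLine.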

2. -p flag misuse (was MAJOR) -- Resolved

Prompt is now a positional argument at the end of the command string. Test asserts command.endsWith(...) and not.toContain(" -p ").

3. PR description + test name contradictions (was MAJOR) -- Resolved

  • PR description updated: stdin claim removed, import example now uses sandcastle.cursor() from root.
  • Test renamed to "buildPrintCommand passes prompt as a positional shell-escaped argument".
  • --yolo replaced with --force throughout (matches Cursor CLI docs).

4. Prompt size guard (was MAJOR) -- Resolved

Added assertCursorPrintPromptFitsArgv with a 120 KB limit, clear error message referencing the Cursor CLI limitation, and a well-documented JSDoc. New test covers the rejection path.


Previously minor findings

  • Greedy regex -- Fixed to non-greedy \{[\s\S]*?\} in both templates.
  • --trust / --yolo test story -- --yolo replaced with --force, --trust removed from print mode. Test for the false branch now has a comment explaining the Orchestrator context.
  • Unpinned Dockerfile -- Unchanged (acceptable; consistent with other providers).

Remaining non-blocking suggestions

1. Add a writeToolCall test

There's a test for readToolCall parsing and for completed event filtering, but no test for writeToolCall. Given it's a near-identical code path the risk is low, but it would round out coverage:

it("parseStreamLine extracts top-level tool_call writeToolCall (Cursor stream-json)", () => {
  const provider = cursor("claude-sonnet-4-6");
  const line = JSON.stringify({
    type: "tool_call",
    subtype: "started",
    call_id: "toolu_vrtx_02",
    tool_call: {
      writeToolCall: { args: { path: "src/index.ts" } },
    },
    session_id: "c6b62c6f-7ead-4fd6-9922-e952131177ff",
  });
  expect(provider.parseStreamLine(line)).toEqual([
    { type: "tool_call", name: "Write", args: "src/index.ts" },
  ]);
});

2. Extract shared extractPlanIssues

The function is still duplicated across parallel-planner/main.mts and parallel-planner-with-review/main.mts. Consider extracting to a shared template utility to prevent drift. Fine as a follow-up.


Overall

Clean, well-structured follow-up. The Cursor-specific stream parser is nicely composed (intercept tool calls, delegate the rest), the size guard has a helpful error message, and the flag usage now matches the CLI docs. LGTM -- approving.

…l writeToolCall events and updating the buildPrintCommand test to ensure it includes the --resume flag when resumeSession is set.
@dichioniccolo (Author)

@oferRounds I’ve implemented point 1.

Regarding point 2, I’d like to clarify the preferred direction before proceeding. My initial thought was to extract the shared utility into the main package and expose it, something along the lines of:

import { extractPlanIssues } from "@ai-hero/sandcastle/utils";

However, I’m not entirely sure if promoting it to a public/shared utility is the right approach here, or if you’d prefer keeping it internal to the templates (or organizing it differently).

What do you think is the best place for it?

@mattpocock (Owner)

I've written an ADR and pushed it to main describing this. Essentially, all templates should be self-contained with no shared utilities.
