
feat: add max_size and force params to screenshot tool #63

Open
saen-ai wants to merge 7 commits into joshuayoes:main from saen-ai:feat/screenshot-maxsize-force

Contributor

@saen-ai saen-ai commented Apr 21, 2026

Summary

  • max_size — resizes the screenshot proportionally using sips when the image exceeds the given pixel dimension (width or height). Solves the Claude 2000px API limit issue.
  • force — prevents silent overwrites by throwing an error when the output file already exists. Defaults to false.

Fixes #42, Fixes #19

Changes

  • Added max_size: z.number().int().positive().optional() to screenshot schema
  • Added force: z.boolean().optional() to screenshot schema
  • Added file-exists guard before capture (respects force)
  • Added sips --resampleHeightWidthMax resize step after capture (when max_size provided)
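
A minimal sketch of how the guard and the resize step listed above could fit together. This is not the PR's actual code: `ScreenshotOptions`, `assertCanWrite`, and `sipsResizeArgs` are hypothetical names, and the file-existence check is passed in as a function (in the server it would presumably be something like `fs.existsSync`) so the sketch has no dependencies.

```typescript
// Hypothetical option shape mirroring the two new schema fields.
interface ScreenshotOptions {
  outputPath: string;
  max_size?: number;
  force?: boolean;
}

// File-exists guard: refuse to overwrite unless force is true.
// `exists` is injected so the sketch stays dependency-free.
function assertCanWrite(
  opts: ScreenshotOptions,
  exists: (path: string) => boolean,
): void {
  if (!opts.force && exists(opts.outputPath)) {
    throw new Error(
      `File already exists: ${opts.outputPath} (pass force: true to overwrite)`,
    );
  }
}

// Arguments for the post-capture resize, or null when max_size is omitted.
function sipsResizeArgs(opts: ScreenshotOptions): string[] | null {
  if (opts.max_size === undefined) return null;
  return ["--resampleHeightWidthMax", String(opts.max_size), opts.outputPath];
}
```

The returned array would be handed to whatever process helper the server already uses to run `sips` once the capture has been written.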

Test plan

  • Take screenshot without specifying a path → works as before
  • Take screenshot to a path that already exists without force → should throw an error
  • Take screenshot to a path that already exists with force: true → should overwrite successfully
  • Take screenshot with max_size: 1000 → verify the saved image's largest dimension is ≤ 1000px
  • Take screenshot without max_size → original resolution preserved
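
For the `max_size` checks in the test plan, it can help to know what dimensions to expect. The sketch below computes a proportional cap under two assumptions: images at or below the limit are left untouched (as the summary describes), and results are rounded to the nearest pixel, which may differ from `sips` by a pixel. `cappedSize` is a hypothetical helper, and the 1290×2796 input is only an illustrative screenshot size.

```typescript
interface Size {
  width: number;
  height: number;
}

// Scale so the largest side equals maxSize, keeping the aspect ratio.
// Images already within the limit are returned unchanged.
function cappedSize(original: Size, maxSize?: number): Size {
  const largest = Math.max(original.width, original.height);
  if (maxSize === undefined || largest <= maxSize) {
    return { ...original };
  }
  return {
    width: Math.round((original.width * maxSize) / largest),
    height: Math.round((original.height * maxSize) / largest),
  };
}

// Illustrative input: a 1290x2796 portrait screenshot capped at 1000.
const capped = cappedSize({ width: 1290, height: 2796 }, 1000);
console.log(capped); // { width: 461, height: 1000 }
```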

saen-ai and others added 7 commits April 21, 2026 10:41
…dep CVEs

- record_video: was hardcoded to "booted", ignoring the udid param — now
  uses getBootedDeviceId() consistently with all other tools; also adds
  udid to the tool schema so callers can target a specific simulator

- ui_view: JSON.parse on idb output had no error handling — server would
  crash on malformed output; wrapped in try/catch with a clear error
  message; also validates frame dimensions are positive numbers before use

- ui_view: temp PNG/JPEG files now deleted immediately after reading
  instead of accumulating until server exit; file names include a random
  suffix to prevent collisions on rapid successive calls

- record_video: improved start detection — now rejects properly if the
  process exits early, increased timeout from 3s to 5s, tracks resolved
  state to avoid double-settling the promise

- deps: updated @modelcontextprotocol/sdk to latest (fixes CVE ReDoS,
  cross-client data leak, DNS rebinding — all high severity); ran
  npm audit fix for 6 additional moderate/low vulns in ajv, body-parser,
  minimatch, path-to-regexp, qs, diff
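
The `ui_view` hardening described above could look roughly like the following. This is a sketch under assumptions, not the commit's code: `parseIdbJson` and `hasPositiveDimensions` are hypothetical names, and the frame shape (`width`/`height` numbers) is assumed.

```typescript
// Parse idb output, turning a raw SyntaxError into a clearer message
// instead of letting it crash the server.
function parseIdbJson(raw: string): unknown {
  try {
    return JSON.parse(raw);
  } catch (err) {
    const reason = err instanceof Error ? err.message : String(err);
    throw new Error(`idb returned malformed JSON: ${reason}`);
  }
}

// True only when the value carries finite, positive width and height.
function hasPositiveDimensions(value: unknown): boolean {
  if (typeof value !== "object" || value === null) return false;
  const { width, height } = value as { width?: unknown; height?: unknown };
  return (
    typeof width === "number" &&
    typeof height === "number" &&
    Number.isFinite(width) &&
    Number.isFinite(height) &&
    width > 0 &&
    height > 0
  );
}
```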

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

idb ui text only supports ASCII keycodes and throws 'No keycode found'
for any emoji or non-ASCII character. This adds a new ui_paste tool that
works around the limitation using the macOS pasteboard:

1. Copies text to the Mac clipboard via pbcopy
2. Syncs it to the simulator pasteboard via xcrun simctl pbsync
3. Long-presses at the given coordinates to trigger the paste menu
4. Finds the Paste button in the accessibility tree and taps it
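
The first three steps above can be sketched as a list of commands to run in order. This is an illustration, not the commit's implementation: `Command` and `pastePlan` are hypothetical, the `--udid` and `--duration` flags for `idb ui tap` and the `host` source for `simctl pbsync` reflect my understanding of those CLIs and should be checked against their help output, and the 0.8 second press and integer rounding are taken from the later commits in this PR. Step 4 (locating and tapping Paste) is left to the caller.

```typescript
// One external command, with optional text piped to its stdin.
interface Command {
  bin: string;
  args: string[];
  stdin?: string;
}

// Build the command sequence for steps 1-3 of the paste workaround.
function pastePlan(text: string, x: number, y: number, udid: string): Command[] {
  // idb expects integer coordinates, so round before formatting.
  const px = String(Math.round(x));
  const py = String(Math.round(y));
  return [
    // 1. Put the text on the Mac clipboard.
    { bin: "pbcopy", args: [], stdin: text },
    // 2. Sync the host pasteboard to the simulator.
    { bin: "xcrun", args: ["simctl", "pbsync", "host", udid] },
    // 3. Long-press at the target to bring up the paste menu.
    {
      bin: "idb",
      args: ["ui", "tap", "--udid", udid, "--duration", "0.8", px, py],
    },
  ];
}
```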

This enables typing emoji, Arabic, Chinese, and any Unicode text into
simulator inputs — essential for testing apps with international users
or emoji-heavy content.

ui_type is unchanged and remains the right tool for ASCII text.

A 1.5s long-press was triggering iOS system gestures (app switcher / home screen),
dismissing the app before the paste menu appeared. 0.8s is long enough
to trigger the contextual paste menu without conflicting with system gestures.

idb ui tap requires integer x/y values; passing floats like 55.166...
causes 'invalid int value' error. Round the calculated center coordinates.
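
A small sketch of the rounding fix, applied to the step that finds the Paste button. The element shape is an assumption: I am assuming the accessibility tree entries expose a label field (called `AXLabel` here) and a `frame` with `x`, `y`, `width`, and `height`. `UiElement` and `tapPointFor` are hypothetical names.

```typescript
// Assumed shape of one accessibility-tree entry.
interface UiElement {
  AXLabel?: string | null;
  frame: { x: number; y: number; width: number; height: number };
}

// Find the element with the given label and return its center,
// rounded to integers so it is safe to pass to idb ui tap.
function tapPointFor(
  elements: UiElement[],
  label: string,
): { x: number; y: number } | null {
  const match = elements.find((el) => el.AXLabel === label);
  if (!match) return null;
  const { x, y, width, height } = match.frame;
  return {
    x: Math.round(x + width / 2),
    y: Math.round(y + height / 2),
  };
}
```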
terminate_app: kills a running app by bundle ID without having to
relaunch it — useful for testing cold-start flows and crash recovery

open_url: opens any URL or deep link in the simulator — essential for
testing universal links, custom URL schemes, and OAuth redirect flows

list_apps: lists all installed apps with their bundle IDs and display
names, sorted alphabetically — removes the need to manually look up
bundle IDs before calling launch_app or terminate_app
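
As a rough illustration of the three tools above, here is how their command lines might be assembled. Whether the commit uses `xcrun simctl` or idb is not stated, so the `simctl terminate`, `simctl openurl`, and `simctl listapps` subcommands are an assumption, as is sorting by display name. All helper names and the `AppInfo` shape are hypothetical.

```typescript
// Assumed minimal record for one installed app.
interface AppInfo {
  bundleId: string;
  displayName: string;
}

// Argument lists for `xcrun`, one per tool.
const terminateArgs = (udid: string, bundleId: string): string[] =>
  ["simctl", "terminate", udid, bundleId];

const openUrlArgs = (udid: string, url: string): string[] =>
  ["simctl", "openurl", udid, url];

const listAppsArgs = (udid: string): string[] =>
  ["simctl", "listapps", udid];

// Return a copy sorted alphabetically by display name.
function sortApps(apps: AppInfo[]): AppInfo[] {
  return [...apps].sort((a, b) => a.displayName.localeCompare(b.displayName));
}
```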
- max_size: resizes screenshot proportionally using sips when the image
  exceeds the given pixel dimension (width or height). Solves the Claude
  2000px API limit issue (joshuayoes#42).
- force: prevents silent overwrites by erroring when the output file
  already exists. Defaults to false (joshuayoes#19).

Fixes joshuayoes#42, Fixes joshuayoes#19
@joshuayoes
Owner

Hey — the screenshot-specific change in your commit 913a71bb (adding max_size via sips --resampleHeightWidthMax + a force file-exists guard) is clean and exactly what #42 and #19 need. Happy to land it.

But as submitted, this PR isn't scoped to its title. The branch is stacked on top of feat/unicode-support (#60) and feat/new-tools (#61), so the diff includes all of their content plus fa6f816e, which has now been squash-merged as #59: six ancestor commits plus one merge commit before the screenshot work. The only change that matches this PR's title is the single commit 913a71bb.

Could you rebase this branch onto the current joshuayoes/main with just 913a71bb? Something like:

git fetch upstream
git checkout -B feat/screenshot-maxsize-force upstream/main
git cherry-pick 913a71bb
git push --force-with-lease origin feat/screenshot-maxsize-force

Same stacking applies to #64, #65, #66, #67 — each sits on top of the previous, so merging any of them as-is would pull the whole chain. If you want to keep a stacked workflow, a tool like Graphite or git-spice handles the auto-rebase-when-parent-merges problem. Otherwise cherry-picking the leaf commit onto a fresh branch from upstream/main works fine for each.

Also, same ask as on #60: please include a short ## Demo section with a screen recording of max_size (e.g. a screenshot at native resolution, then capped at 1000px) and force (attempting an overwrite without and with the flag). See Releases and #51 for examples.
