
feat: unicode support, recording fixes, screenshot improvements, device rotation & docs #68

Closed
saen-ai wants to merge 12 commits into joshuayoes:main from saen-ai:main

Conversation

@saen-ai
Contributor

@saen-ai saen-ai commented Apr 21, 2026

Overview

A batch of fixes and features from real-world usage. Each change addresses an open issue.


Changes

fix: security & stability improvements

  • record_video now routes through getBootedDeviceId() instead of hardcoded "booted" — fixes multi-simulator setups
  • ui_view JSON.parse wrapped in try/catch with dimension validation
  • Temp files cleaned immediately after use with random suffixes to prevent collisions
  • record_video improved with exit listener + resolved state tracking to catch silent failures
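
As a rough illustration of the ui_view hardening above, here is a minimal sketch of parsing tool output defensively and checking dimensions before use. The `parseFrame` name and the `{ frame: { width, height } }` shape are assumptions made for this example, not the code in this PR.

```typescript
// Result type: either validated dimensions or a readable error message.
type ParseResult =
  | { ok: true; width: number; height: number }
  | { ok: false; error: string };

// Parse raw output and validate the frame before anything else touches it.
// The { frame: { width, height } } shape is an assumption for illustration.
function parseFrame(raw: string): ParseResult {
  let data: unknown;
  try {
    data = JSON.parse(raw);
  } catch (err) {
    // Malformed output becomes an error value instead of crashing the server.
    return { ok: false, error: `Could not parse output as JSON: ${(err as Error).message}` };
  }
  const frame = (data as { frame?: { width?: unknown; height?: unknown } } | null)?.frame;
  const width = frame?.width;
  const height = frame?.height;
  if (
    typeof width !== "number" || typeof height !== "number" ||
    !Number.isFinite(width) || !Number.isFinite(height) ||
    width <= 0 || height <= 0
  ) {
    return { ok: false, error: "Frame dimensions must be positive numbers" };
  }
  return { ok: true, width, height };
}
```

Returning an error value rather than throwing keeps the decision about how to report the failure with the caller.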

feat: ui_paste — Unicode and emoji text input

idb ui text can't handle emoji or non-ASCII characters (no keycodes exist). The new ui_paste tool injects text via the system clipboard (pbcopy → pbsync → long-press → Paste), enabling full Unicode support including emoji, CJK, Arabic, and RTL text.

feat: terminate_app, open_url, list_apps

Standard simctl wrappers that were missing from the toolset.
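
For reference, a sketch of the argument lists such wrappers would hand to `xcrun`. The helper names are hypothetical; `terminate`, `openurl`, and `listapps` are existing simctl subcommands.

```typescript
// Each helper returns the argv for `xcrun`; actually running it is left to the caller.
// Helper names are hypothetical and chosen only for this example.
function terminateAppArgs(udid: string, bundleId: string): string[] {
  return ["simctl", "terminate", udid, bundleId];
}

function openUrlArgs(udid: string, url: string): string[] {
  return ["simctl", "openurl", udid, url];
}

function listAppsArgs(udid: string): string[] {
  return ["simctl", "listapps", udid];
}
```

Passing these as an array to something like `execFile("xcrun", args)` avoids shell interpolation of URLs and bundle IDs.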

fix: AZERTY keyboard layout (ui_type)

idb ui text sends QWERTY keycodes that get remapped by the simulator's keyboard layout, breaking input on AZERTY/QWERTZ systems. Rewrote ui_type to use the same clipboard injection as ui_paste, with auto-detection of the focused AXTextField/AXTextArea.

feat: screenshot max_size and force params (fixes #42, fixes #19)

  • max_size: resizes proportionally via sips when the image exceeds the given pixel dimension — solves the Claude 2000px API limit
  • force: prevents silent overwrites by erroring when the output file already exists
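
A minimal sketch of how the two params could behave, assuming the image dimensions and file existence are already known. The function names are hypothetical; `sips -Z` is the existing flag that caps the longer side while preserving aspect ratio.

```typescript
// Returns the sips arguments needed to shrink the image, or null if it already fits.
function sipsResizeArgs(width: number, height: number, maxSize: number, file: string): string[] | null {
  if (Math.max(width, height) <= maxSize) return null;
  // `sips -Z N` resamples so that neither side exceeds N, keeping proportions.
  return ["-Z", String(maxSize), file];
}

// Predicted dimensions after a proportional resize, handy for logging.
function scaledSize(width: number, height: number, maxSize: number): { width: number; height: number } {
  const scale = Math.min(1, maxSize / Math.max(width, height));
  return { width: Math.round(width * scale), height: Math.round(height * scale) };
}

// Throws if the output file exists and force was not set.
// How the PR wires `force` is not shown in this thread; this reading follows
// "Defaults to false" and is an assumption.
function assertCanWrite(fileExists: boolean, force: boolean, path: string): void {
  if (fileExists && !force) {
    throw new Error(`Refusing to overwrite existing file: ${path}`);
  }
}
```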

fix: recording PID tracking (fixes #20)

stop_recording previously used pkill -f simctl.*recordVideo which killed every recording process on the machine. Now tracks each ChildProcess in a Map<udid, ChildProcess> and kills by specific PID. Falls back to pkill if no tracked process is found.
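
The tracking idea can be sketched roughly as follows. `RecordingRegistry` and `Killable` are hypothetical names for this example; a real `ChildProcess` from Node's `child_process` module exposes a compatible `kill` method.

```typescript
// The only part of a child process the registry needs.
interface Killable {
  kill(signal?: string): boolean;
}

// Tracks one recording process per simulator UDID.
class RecordingRegistry {
  private readonly active = new Map<string, Killable>();

  track(udid: string, proc: Killable): void {
    this.active.set(udid, proc);
  }

  // Signals only the process recorded for this UDID.
  // Returns false when nothing is tracked, so the caller can fall back to pkill.
  stop(udid: string): boolean {
    const proc = this.active.get(udid);
    if (!proc) return false;
    this.active.delete(udid);
    return proc.kill("SIGINT");
  }
}
```

A `false` return is the cue for the pkill fallback, for example after a server restart in the middle of a recording.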

feat: record_video timeout param (fixes #5)

Optional timeout (1–3600s) auto-stops recording after the given duration using the same targeted PID kill. No need for a separate stop_recording call.
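
A small sketch of the validation and scheduling involved. The helper names are hypothetical, and requiring a whole number of seconds is an assumption of this example rather than something stated in the PR.

```typescript
// Validates the optional timeout and converts it to milliseconds.
// Requiring an integer is an assumption of this sketch.
function timeoutToMs(seconds: number | undefined): number | undefined {
  if (seconds === undefined) return undefined; // no timeout: record until stop_recording
  if (!Number.isInteger(seconds) || seconds < 1 || seconds > 3600) {
    throw new RangeError("timeout must be between 1 and 3600 seconds");
  }
  return seconds * 1000;
}

// Schedules the stop callback and returns a function that cancels it,
// so a manual stop_recording can clear the pending timer.
function scheduleAutoStop(timeoutMs: number, stop: () => void): () => void {
  const timer = setTimeout(stop, timeoutMs);
  return () => clearTimeout(timer);
}
```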

feat: rotate_device (contributes to #49)

Rotates the simulator left/right via osascript keyboard shortcuts (Cmd+Left / Cmd+Right). Supports optional times (1–3) for multi-increment rotations in one call.
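
One way such a call could be assembled, shown as a sketch. The AppleScript text and function names are assumptions for illustration; 123 and 124 are the macOS virtual key codes for the left and right arrow keys.

```typescript
type Direction = "left" | "right";

// macOS virtual key codes for the arrow keys.
const ARROW_KEY_CODE: Record<Direction, number> = { left: 123, right: 124 };

// Builds the osascript argv for one rotation step (Cmd+Left or Cmd+Right).
function rotateArgs(direction: Direction): string[] {
  return [
    "-e", 'tell application "Simulator" to activate',
    "-e", `tell application "System Events" to key code ${ARROW_KEY_CODE[direction]} using command down`,
  ];
}

// Expands `times` (1 to 3) into one argv per step; the caller runs them
// in order with a short pause between steps.
function rotationPlan(direction: Direction, times: number = 1): string[][] {
  if (!Number.isInteger(times) || times < 1 || times > 3) {
    throw new RangeError("times must be an integer between 1 and 3");
  }
  return Array.from({ length: times }, () => rotateArgs(direction));
}
```

Sending keystrokes through System Events generally requires the Accessibility permission for the calling process.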

docs: rewrite use cases section (fixes #40)

Replaced the generic prompt list with five real developer workflows: bug repro recording, post-implementation validation, React Native Redbox debugging, Unicode text testing, and landscape/portrait layout checks. Also added missing tool entries for all new tools.


Issues addressed

Fixes #5, Fixes #19, Fixes #20, Fixes #40, Fixes #42, Contributes to #49

saen-ai and others added 12 commits April 21, 2026 10:41
…dep CVEs

- record_video: was hardcoded to "booted", ignoring the udid param — now
  uses getBootedDeviceId() consistently with all other tools; also adds
  udid to the tool schema so callers can target a specific simulator

- ui_view: JSON.parse on idb output had no error handling — server would
  crash on malformed output; wrapped in try/catch with a clear error
  message; also validates frame dimensions are positive numbers before use

- ui_view: temp PNG/JPEG files now deleted immediately after reading
  instead of accumulating until server exit; file names include a random
  suffix to prevent collisions on rapid successive calls

- record_video: improved start detection — now rejects properly if the
  process exits early, increased timeout from 3s to 5s, tracks resolved
  state to avoid double-settling the promise

- deps: updated @modelcontextprotocol/sdk to latest (fixes CVE ReDoS,
  cross-client data leak, DNS rebinding — all high severity); ran
  npm audit fix for 6 additional moderate/low vulns in ajv, body-parser,
  minimatch, path-to-regexp, qs, diff

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

idb ui text only supports ASCII keycodes and throws 'No keycode found'
for any emoji or non-ASCII character. This adds a new ui_paste tool that
works around the limitation using the macOS pasteboard:

1. Copies text to the Mac clipboard via pbcopy
2. Syncs it to the simulator pasteboard via xcrun simctl pbsync
3. Long-presses at the given coordinates to trigger the paste menu
4. Finds the Paste button in the accessibility tree and taps it

This enables typing emoji, Arabic, Chinese, and any Unicode text into
simulator inputs — essential for testing apps with international users
or emoji-heavy content.
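
The steps above can be sketched as argument builders plus the center calculation for the Paste button. Everything here is illustrative: the function names are hypothetical, and the exact idb flags (`--udid`, `--duration`) should be checked against the installed idb version. Coordinates are rounded because idb ui tap rejects non-integer values.

```typescript
interface Frame { x: number; y: number; width: number; height: number }

// Center of an accessibility element, rounded to integers for idb ui tap.
function centerOf(frame: Frame): { x: number; y: number } {
  return {
    x: Math.round(frame.x + frame.width / 2),
    y: Math.round(frame.y + frame.height / 2),
  };
}

// Step 2: sync the host pasteboard to the simulator (argv for `xcrun`).
function pbsyncArgs(udid: string): string[] {
  return ["simctl", "pbsync", "host", udid];
}

// Step 3: long-press at the target coordinates (argv for `idb`).
// Flag names are assumptions of this sketch.
function longPressArgs(udid: string, x: number, y: number, seconds: number): string[] {
  return ["ui", "tap", "--udid", udid, "--duration", String(seconds), String(x), String(y)];
}
```

Step 1 amounts to writing the text to the stdin of `pbcopy`, and step 4 is a plain tap at `centerOf(pasteButtonFrame)`.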

ui_type is unchanged and remains the right tool for ASCII text.

A 1.5s long-press was triggering iOS system gestures (app switcher / home screen),
dismissing the app before the paste menu appeared. 0.8s is long enough
to trigger the contextual paste menu without conflicting with system gestures.

idb ui tap requires integer x/y values — passing floats like 55.166...
causes 'invalid int value' error. Round the calculated center coordinates.

terminate_app: kills a running app by bundle ID without having to
relaunch it — useful for testing cold-start flows and crash recovery

open_url: opens any URL or deep link in the simulator — essential for
testing universal links, custom URL schemes, and OAuth redirect flows

list_apps: lists all installed apps with their bundle IDs and display
names, sorted alphabetically — removes the need to manually look up
bundle IDs before calling launch_app or terminate_app

- max_size: resizes screenshot proportionally using sips when the image
  exceeds the given pixel dimension (width or height). Solves the Claude
  2000px API limit issue (joshuayoes#42).
- force: prevents silent overwrites by erroring when the output file
  already exists. Defaults to false (joshuayoes#19).

Fixes joshuayoes#42, Fixes joshuayoes#19

Instead of pkill-ing all simctl recordVideo processes, we now store
each recording's ChildProcess in a Map keyed by UDID. stop_recording
sends SIGINT to that specific PID, leaving any other simulators or
concurrent idb operations untouched.

Falls back to pkill if no tracked process is found (e.g. server
restarted mid-recording) so behaviour is never worse than before.

Also adds an optional udid param to stop_recording for multi-simulator
setups.

Fixes joshuayoes#20

Adds an optional timeout (1–3600 seconds) to record_video. When set,
a setTimeout fires after the given duration and sends SIGINT to the
tracked recording process — same targeted kill used by stop_recording.

If omitted, behaviour is unchanged: recording runs until stop_recording
is called.

Fixes joshuayoes#5

Adds a rotate_device tool that rotates the iOS Simulator left
(counter-clockwise) or right (clockwise) using the Simulator app's
built-in keyboard shortcuts via osascript.

Supports an optional `times` param (1–3) for multi-increment rotations
without multiple tool calls. A 500ms delay between increments lets the
simulator animate each step.

Contributes to joshuayoes#49

Replaces the generic prompt list with five concrete end-to-end flows
that reflect how the tool is actually used: bug reproduction recording,
post-implementation feature validation, React Native Redbox debugging,
Unicode/emoji text input testing, and rotation-based layout checks.

Also adds missing tool entries to the Tools section for ui_paste,
rotate_device, terminate_app, open_url, and list_apps.

Fixes joshuayoes#40

push_notification: sends a simulated APNs push to any app via
xcrun simctl push. Accepts title, body, badge, and optional custom
data payload. Writes payload to a temp file (cleaned up after use)
and validates the 4096-byte limit before sending.
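
A sketch of the payload construction and size check, assuming the standard `aps` alert shape. `buildPushPayload` and `PushInput` are hypothetical names, not the PR's code. The limit is measured in UTF-8 bytes, not characters, which matters for emoji and other non-ASCII text.

```typescript
interface PushInput {
  title: string;
  body: string;
  badge?: number;
  data?: Record<string, unknown>; // optional custom keys placed next to `aps`
}

const MAX_PAYLOAD_BYTES = 4096;

// Builds the JSON payload and rejects it if it exceeds the size limit.
function buildPushPayload(input: PushInput): string {
  const payload = {
    ...input.data,
    aps: {
      alert: { title: input.title, body: input.body },
      ...(input.badge !== undefined ? { badge: input.badge } : {}),
    },
  };
  const json = JSON.stringify(payload);
  // Measure encoded bytes: multi-byte characters count more than once.
  const bytes = new TextEncoder().encode(json).length;
  if (bytes > MAX_PAYLOAD_BYTES) {
    throw new RangeError(`Payload is ${bytes} bytes; the limit is ${MAX_PAYLOAD_BYTES}`);
  }
  return json;
}
```

The returned string would then be written to a temp file and passed to `xcrun simctl push <udid> <bundle-id> <file>`.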

set_permission: grants, revokes, or resets any iOS privacy permission
via xcrun simctl privacy. Supports all 13 services (camera, photos,
location, microphone, etc). Validates that bundle_id is provided for
grant/revoke actions.
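
The bundle_id check can be sketched like this; `privacyArgs` is a hypothetical helper, and the argument order follows `simctl privacy <device> <action> <service> [<bundle id>]`.

```typescript
type PrivacyAction = "grant" | "revoke" | "reset";

// Builds the argv for `xcrun simctl privacy`, requiring a bundle ID
// for grant and revoke but leaving it optional for reset.
function privacyArgs(udid: string, action: PrivacyAction, service: string, bundleId?: string): string[] {
  if (action !== "reset" && !bundleId) {
    throw new Error(`bundle_id is required for the "${action}" action`);
  }
  const args = ["simctl", "privacy", udid, action, service];
  if (bundleId) args.push(bundleId);
  return args;
}
```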

get_clipboard: reads the simulator clipboard via xcrun simctl pbpaste.
Useful for verifying copy behaviour in apps like snippet managers
where clipboard correctness is a core feature.

@saen-ai
Contributor Author

saen-ai commented Apr 21, 2026

Update: Three more tools have been added to this branch since the original description:

  • push_notification — send simulated APNs push notifications via xcrun simctl push
  • set_permission — grant/revoke/reset any iOS privacy permission via xcrun simctl privacy (covers notifications, photos, microphone, location, etc.)
  • get_clipboard — read the simulator clipboard via xcrun simctl pbpaste (useful for testing copy behaviour in apps)

These are covered individually in PR #69 if you prefer to review them separately.

@joshuayoes
Owner

Closing this in favor of the individual PRs I'm already reviewing (#60, #61, #63, #64, #65, #66, #67). Details below.

Why the bundled PR won't merge as-is

This branch is saen-ai:main → joshuayoes:main: your fork's main is being used as a feature branch. Three problems with that:

  1. Review surface. 1117 additions covering 9 distinct changes (unicode input, three simctl wrappers, screenshot params, recording fixes, recording timeout, rotation, docs, plus three more tools in commit 8109dd2f) can't be reviewed as one unit against the "keep it simple / real use cases only" principle in CLAUDE.md. The individual PRs you've already opened are the right shape — I've left actionable feedback on each.

  2. Conflict with main. Merging would overwrite joshuayoes:main with your fork's divergent history, including commits that have already been squash-merged (e.g. fa6f816e is now on main as the merge commit of #59, "fix: record_video ignores udid, ui_view JSON crash, temp file leaks, dep CVEs"). GitHub marks this PR as CONFLICTING for exactly that reason.

  3. Future hygiene. Using your fork's main as a branch makes subsequent contributions awkward — every new PR will drag stale history along. Recommend resetting your fork's main to match upstream:

    git fetch upstream
    git checkout main
    git reset --hard upstream/main
    git push --force-with-lease origin main
    

    Then branch fresh per PR: git checkout -b feat/some-thing upstream/main.

Next actions

Thanks for the work — closing here, but actively working through the individual PRs.



2 participants