Minimal computer use implementation for macOS (local!) using the latest GPT-5.4 native computer use support. To my knowledge, no other such implementation yet exists.
Here's an example command:

```sh
uv run python -m cua_mac run "Open Firefox and navigate to Hacker News"
```

Demo (sending an email):
cua_demo_fast.mp4
The following is an AI-generated README:
- macOS
- `uv`
- Python 3.14, available through `uv`
- An OpenAI API key
- Screen Recording permission for the app launching `uv`
- Accessibility permission for the app launching `uv`
Install dependencies:

```sh
uv sync
```

Create a `.env` file in the repo root:

```
OPENAI_API_KEY=your_api_key_here
```

Grant permissions in macOS:
- Grant Screen Recording to the terminal app you use to launch `uv`.
- Grant Accessibility to that same app.
If you run through Ghostty, grant permissions to Ghostty.
If you run through Terminal, grant permissions to Terminal.
Capture a manual screenshot:

```sh
uv run python -m cua_mac screenshot
```

Run a task:

```sh
uv run python -m cua_mac run "Open Calculator and compute 17 * 24"
```

The runner prints progress lines like:
```
Responses turn 1: resp_...
Turn 1.1 actions: click :: [{"button":"left","type":"click","x":1854,"y":2118}]
```
When the model finishes, it prints the final assistant message.
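The payload after `::` in each action line is plain JSON, so the log can be parsed back into structured data for offline inspection. A minimal sketch, assuming only the log format shown above (the helper name is illustrative, not part of the project):

```python
import json
import re

# Matches runner progress lines like:
#   Turn 1.1 actions: click :: [{"button":"left","type":"click","x":1854,"y":2118}]
LINE_RE = re.compile(r"Turn (\d+)\.(\d+) actions: (\w+) :: (\[.*\])")

def parse_action_line(line: str):
    """Extract (turn, call, action_type, actions) from a progress line, or None."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    turn, call, action_type, payload = m.groups()
    return int(turn), int(call), action_type, json.loads(payload)

line = 'Turn 1.1 actions: click :: [{"button":"left","type":"click","x":1854,"y":2118}]'
print(parse_action_line(line))
# (1, 1, 'click', [{'button': 'left', 'type': 'click', 'x': 1854, 'y': 2118}])
```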
Open Calculator and do a simple computation:

```sh
uv run python -m cua_mac run "Open Calculator and compute 17 * 24"
```

Open Notes and create a short note:

```sh
uv run python -m cua_mac run "Open Notes and create a note titled test with the body hello from cua_mac"
```

Open Firefox and go to a page:

```sh
uv run python -m cua_mac run "Open Firefox and navigate to https://news.ycombinator.com"
```

Draft an email:

```sh
uv run python -m cua_mac run "Open Mail and draft an email to example@example.com with subject test and body hello"
```

Choose a model:

```sh
uv run python -m cua_mac run \
  --model gpt-5.4 \
  "Open Calculator and compute 17 * 24"
```

Slow down local UI actions:

```sh
uv run python -m cua_mac run \
  --action-delay-ms 250 \
  "Open Notes and create a note titled slower run"
```

Set an explicit turn limit:

```sh
uv run python -m cua_mac run \
  --max-turns 50 \
  "Open Firefox and search for OpenAI"
```

By default there is no implicit turn cap. If you omit `--max-turns`, the loop runs until the model returns a final response or an error occurs.
Write artifacts to a specific directory:

```sh
uv run python -m cua_mac \
  --artifact-dir artifacts/manual-test \
  run "Open Calculator and compute 17 * 24"
```

Each run writes screenshots under:

```
artifacts/<timestamp>/
```

Typical files:

```
turn-000-initial.png
turn-001-call-01.png
turn-002-call-01.png
```
These are the exact screenshots the model is acting on after each loop step.
- Capture the current desktop.
- Send the prompt plus screenshot to the Responses API with `tools=[{"type": "computer"}]`.
- Receive a `computer_call` with `actions[]`.
- Execute those actions locally on macOS.
- Capture a new screenshot.
- Send it back as a `computer_call_output`.
- Repeat until the model returns a final message.
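The control flow of that loop can be sketched in pure Python. The real runner talks to the OpenAI Responses API; here the model call, screenshot capture, and action executor are injected as plain functions so the loop itself is visible on its own. All names and the response shape are illustrative assumptions, not the project's actual code:

```python
from typing import Callable, Optional

def run_loop(
    prompt: str,
    call_model: Callable[[dict], dict],    # stand-in for the Responses API call
    take_screenshot: Callable[[], bytes],  # stand-in for desktop capture
    execute: Callable[[dict], None],       # stand-in for the local action executor
    max_turns: Optional[int] = None,       # None mirrors "no implicit turn cap"
) -> str:
    # Capture the current desktop and send it with the prompt.
    request = {"prompt": prompt, "screenshot": take_screenshot()}
    turn = 0
    while max_turns is None or turn < max_turns:
        turn += 1
        response = call_model(request)
        # If the model returned a final message, we are done.
        if response.get("final_message") is not None:
            return response["final_message"]
        # Otherwise execute each returned action locally, capture a fresh
        # screenshot, and send it back as the computer_call_output.
        for action in response["actions"]:
            execute(action)
        request = {"computer_call_output": take_screenshot()}
    raise RuntimeError("turn limit reached without a final message")

# Stub model: click once, then finish.
script = iter([
    {"actions": [{"type": "click", "x": 10, "y": 20}], "final_message": None},
    {"actions": [], "final_message": "done"},
])
executed = []
result = run_loop(
    "Open Calculator",
    call_model=lambda req: next(script),
    take_screenshot=lambda: b"png-bytes",
    execute=executed.append,
)
print(result)  # done
```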
This is still an MVP. Current behavior and limits:
- Supports exactly one connected display
- Uses screenshot-pixel coordinates for mouse actions
- Supports `click`, `double_click`, `move`, `drag`, `scroll`, `type`, `keypress`, `wait`, and `screenshot`
- Uses real key events for normal typing
- Falls back to Unicode event injection only for unsupported characters
- Prints action logs to stdout
- Stores screenshots locally for inspection
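On Retina displays the screenshot is captured at pixel resolution while macOS mouse events use point coordinates, so screenshot-pixel coordinates must be divided by the display's backing scale factor before an event is posted. A minimal sketch, assuming a uniform scale factor (2.0 on typical Retina panels); the helper is illustrative, not the project's actual code:

```python
def pixels_to_points(x_px: float, y_px: float, scale: float = 2.0) -> tuple[float, float]:
    """Convert screenshot-pixel coordinates to screen points.

    `scale` is the display's backing scale factor: 2.0 on typical Retina
    panels, 1.0 on non-Retina displays. Illustrative helper only.
    """
    return (x_px / scale, y_px / scale)

# The click at (1854, 2118) from the log above lands at point (927.0, 1059.0)
# on a 2x Retina display.
print(pixels_to_points(1854, 2118))  # (927.0, 1059.0)
```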
If you see a screenshot error, confirm Screen Recording is granted to the terminal app launching `uv`.
If the runner says Accessibility is not granted, enable Accessibility for the terminal app launching `uv`, then fully relaunch that app.
If typed text lands in the wrong place, the intended field is usually not actually focused.
Try:

- making the prompt tell the model to click the input field before typing
- increasing `--action-delay-ms`
- checking the latest artifact screenshots to confirm focus
Inspect the logged action payloads and the saved screenshots together. The runner now interprets click coordinates in screenshot-pixel space, which should match the model's view of the screen.
Top-level CLI help:

```sh
uv run python -m cua_mac --help
```

`run` command help:

```sh
uv run python -m cua_mac run --help
```
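The CLI surface shown throughout this README can be mirrored with a small `argparse` sketch. This is an illustration of the flags documented above, not the project's actual parser, and the defaults are assumptions:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Mirrors the flags shown in this README; illustrative, not the real parser.
    parser = argparse.ArgumentParser(prog="cua_mac")
    parser.add_argument("--artifact-dir", default=None,
                        help="directory for per-run screenshots")
    sub = parser.add_subparsers(dest="command", required=True)

    sub.add_parser("screenshot", help="capture a manual screenshot")

    run = sub.add_parser("run", help="run a natural-language task")
    run.add_argument("task")
    run.add_argument("--model", default="gpt-5.4")
    run.add_argument("--max-turns", type=int, default=None,
                     help="omit for no implicit turn cap")
    run.add_argument("--action-delay-ms", type=int, default=0)
    return parser

args = build_parser().parse_args(
    ["--artifact-dir", "artifacts/manual-test",
     "run", "--max-turns", "50", "Open Firefox and search for OpenAI"]
)
print(args.command, args.max_turns)  # run 50
```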