A FastMCP server that runs on the client machine and exposes screenshot tools to a host MCP. It supports both direct screenshot capture and session-based chunked transfers so an LLM can consume images reliably.
Official FastMCP documentation: gofastmcp.com/getting-started/welcome
- list_monitors: returns detected monitors (index and dimensions)
- capture_screenshot: captures a screen image with hybrid mode (base64 for non-vision, native MCP image for vision)
- capture_timeline: captures a timed screen sequence (ordered frames with timestamps)
- start_timeline_capture: starts a timeline session and returns a timeline_id
- get_timeline_manifest: returns chunked timeline metadata
- get_timeline_chunk: retrieves a timeline JSON chunk
- release_timeline_capture: explicitly releases a timeline session
- start_screenshot_capture: starts a screenshot session and returns a capture_id
- get_screenshot_manifest: returns metadata plus ASCII preview for non-vision LLMs
- get_screenshot_chunk: returns a chunk of base64 image data
- release_screenshot_capture: releases the screenshot session and frees memory
- Need available monitor info: list_monitors
- Need a fast single screenshot with moderate payload: capture_screenshot
- Need a more robust single screenshot with chunking: start_screenshot_capture -> get_screenshot_manifest -> get_screenshot_chunk (0..N-1) -> release_screenshot_capture
- Need a short timeline in one call: capture_timeline
- Need a robust timeline for large payloads: start_timeline_capture -> get_timeline_manifest -> get_timeline_chunk (0..N-1) -> release_timeline_capture
Best practices:
- Always concatenate chunks in ascending chunk_index order.
- Always call release_* after reading session data to free memory.
- For non-vision models, consume preview_text from the manifest before loading the full payload.
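
The ordering rule above can be illustrated with a minimal sketch. It assumes each chunk arrives as a dictionary with a `chunk_index` and a `data` field holding a slice of one base64 string; `data` is a hypothetical field name, so check the actual response of get_screenshot_chunk before relying on it.

```python
import base64


def reassemble_chunks(chunks):
    """Join base64 chunks in ascending chunk_index order and decode them.

    Assumes the server split a single base64 string into slices, so the
    slices must be concatenated before decoding ("data" is a hypothetical
    field name).
    """
    ordered = sorted(chunks, key=lambda c: c["chunk_index"])
    indices = [c["chunk_index"] for c in ordered]
    # Fail early on gaps or duplicates instead of decoding a corrupt image.
    if indices != list(range(len(ordered))):
        raise ValueError(f"missing or duplicate chunk indices: {indices}")
    encoded = "".join(c["data"] for c in ordered)
    return base64.b64decode(encoded)


if __name__ == "__main__":
    payload = bytes(range(256)) * 4
    text = base64.b64encode(payload).decode("ascii")
    # Simulate chunks that arrive out of order.
    parts = [
        {"chunk_index": i, "data": text[start:start + 100]}
        for i, start in enumerate(range(0, len(text), 100))
    ]
    print(reassemble_chunks(list(reversed(parts))) == payload)
```

Sorting before joining matters because slices of a base64 string cannot, in general, be decoded one at a time unless every slice length is a multiple of 4.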
- Linux with an active graphical session (X11/Wayland capture support)
- DISPLAY environment variable available to the server process (mss requires it on Linux)
- Python 3.10+
uv sync
Or via Taskfile:
task setup

task server
This task starts the server using mcpm run screen-mcp through uvx.
It also registers or updates the local MCP server automatically when needed.
Display-related environment variables are propagated during registration: DISPLAY, WAYLAND_DISPLAY, XAUTHORITY, XDG_RUNTIME_DIR.
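
As a quick diagnostic before starting the server, the following sketch reports which of the variables listed above are unset in the current process. The helper name `missing_display_vars` is hypothetical and not part of this project. A missing variable is not necessarily an error: a pure X11 session, for example, typically has no WAYLAND_DISPLAY.

```python
import os

# Variables the README says are propagated during registration.
DISPLAY_VARS = ("DISPLAY", "WAYLAND_DISPLAY", "XAUTHORITY", "XDG_RUNTIME_DIR")


def missing_display_vars(environ=None):
    """Return the display-related variables that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in DISPLAY_VARS if not env.get(name)]


if __name__ == "__main__":
    missing = missing_display_vars()
    if missing:
        print("Unset display variables:", ", ".join(missing))
    else:
        print("All display variables are set.")
```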
task client
The smoke-test script is located in scripts/smoke_client.py and exercises:
- list_monitors
- start_screenshot_capture
- get_screenshot_manifest
- get_screenshot_chunk
- release_screenshot_capture
It writes a verification image to artifacts/smoke_capture.jpg.
You can also run a specific action via --action:
uv run python scripts/smoke_client.py --action list-monitors
uv run python scripts/smoke_client.py --action capture-screenshot --monitor-index 0 --output artifacts/capture.jpg
uv run python scripts/smoke_client.py --action capture-timeline --duration-seconds 6 --output artifacts/timeline.json
uv run python scripts/smoke_client.py --action capture-timeline-session --duration-seconds 6 --chunk-size 120000 --output artifacts/timeline_session.json

task inspector
This launches the MCP Inspector against the mcpm run screen-mcp server.
- Open this project folder in VS Code.
- Add a servers configuration.
- Create a .vscode/mcp.json file and add one of the examples below.
Recommended local example for a cloned repo (unpublished package):
{
  "servers": {
    "screen-mcp": {
      "type": "stdio",
      "command": "uv",
      "args": ["run", "--project", "/absolute/path/to/screen-mcp", "screen-mcp"]
    }
  }
}
Example for running directly from a Git repo without global installation:
{
  "servers": {
    "screen-mcp": {
      "type": "stdio",
      "command": "uvx",
      "args": ["--from", "git+https://github.com/<owner>/screen-mcp.git", "screen-mcp"]
    }
  }
}
Alternative via MCPM:
{
  "servers": {
    "screen-mcp": {
      "type": "stdio",
      "command": "uvx",
      "args": ["mcpm", "run", "screen-mcp"]
    }
  }
}

- list_monitors()
- capture_screenshot(monitor_index=0, image_format="jpeg", max_width=1600, quality=80)
- capture_screenshot(monitor_index=0, image_format="jpeg", max_width=1600, quality=80, response_mode="image")
- capture_timeline(duration_seconds=10, monitor_index=0, image_format="jpeg", max_width=900, quality=70)
- start_timeline_capture(duration_seconds=10, monitor_index=0, image_format="jpeg", max_width=900, quality=70, chunk_size=120000)
- get_timeline_manifest(timeline_id)
- get_timeline_chunk(timeline_id, chunk_index)
- release_timeline_capture(timeline_id)
Timeline behavior in capture_timeline:
- fixed cadence: TIMELINE_FPS (default 2 images/s, configurable in source)
- maximum duration: TIMELINE_MAX_DURATION_SECONDS (default 30s, configurable in source)
- each frame includes: frame_index, t_offset_ms, captured_at, preview_text, image_sha256, image_size_bytes
- temporal_hint makes chronological order explicit for an LLM
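
The per-frame fields above make it possible to check each frame on the client side. The sketch below is illustrative only and rests on assumptions not confirmed by this README: that a frame carries its image in an `image_base64` field, that `image_sha256` is the hex SHA-256 digest of the decoded bytes, and that `image_size_bytes` counts decoded bytes.

```python
import base64
import hashlib


def verify_frame(frame):
    """Check a frame's decoded image against its declared size and hash.

    Assumption: the image travels as base64 in frame["image_base64"] and the
    hash/size fields describe the decoded bytes.
    """
    image_bytes = base64.b64decode(frame["image_base64"])
    return (
        len(image_bytes) == frame["image_size_bytes"]
        and hashlib.sha256(image_bytes).hexdigest() == frame["image_sha256"]
    )


def order_frames(frames):
    """Return frames in chronological order using frame_index."""
    return sorted(frames, key=lambda f: f["frame_index"])


if __name__ == "__main__":
    raw = b"not-a-real-jpeg"
    frame = {
        "frame_index": 0,
        "t_offset_ms": 0,
        "image_base64": base64.b64encode(raw).decode("ascii"),
        "image_sha256": hashlib.sha256(raw).hexdigest(),
        "image_size_bytes": len(raw),
    }
    print(verify_frame(frame))
```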
Robust flow recommendation:
- start_screenshot_capture(...) -> obtain capture_id
- get_screenshot_manifest(capture_id) -> metadata + preview_text
- get_screenshot_chunk(capture_id, chunk_index) -> reassemble chunks
- release_screenshot_capture(capture_id)
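
The four steps above could be driven from client code roughly as follows. This is a sketch under stated assumptions, not the project's client: `call_tool` is a hypothetical callable that invokes an MCP tool by name and returns its result as a dictionary, and the `chunk_count` and `data` field names are guesses at the manifest and chunk layout. The `try`/`finally` ensures the session is released even if a chunk request fails.

```python
import base64


def fetch_screenshot(call_tool, **capture_args):
    """Run start -> manifest -> chunks -> release and return (manifest, bytes).

    call_tool(name, arguments) is a hypothetical wrapper around an MCP client;
    "chunk_count" and "data" are assumed field names.
    """
    started = call_tool("start_screenshot_capture", capture_args)
    capture_id = started["capture_id"]
    try:
        manifest = call_tool("get_screenshot_manifest", {"capture_id": capture_id})
        parts = []
        # Request chunks in ascending chunk_index order (0..N-1).
        for index in range(manifest["chunk_count"]):
            chunk = call_tool(
                "get_screenshot_chunk",
                {"capture_id": capture_id, "chunk_index": index},
            )
            parts.append(chunk["data"])
        return manifest, base64.b64decode("".join(parts))
    finally:
        # Always free server-side memory, even on errors.
        call_tool("release_screenshot_capture", {"capture_id": capture_id})
```

Passing `call_tool` in as a parameter keeps the flow independent of any particular MCP client library and makes it easy to exercise with an in-memory fake.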
- For multi-client MCP, base64 is the most interoperable format: simple, JSON-friendly, compatible with vision and non-vision clients.
- Tradeoff: larger payload (~33%) and risk of single-block truncation.
- This project uses session-based chunked base64 transfer (capture_id) to make large exchanges reliable.
- For non-vision LLMs, prefer get_screenshot_manifest (metadata + ASCII preview) before downloading the full image.
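
To make the size tradeoff concrete: base64 encodes every 3 input bytes as 4 output characters, which is where the roughly 33% overhead comes from. The sketch below estimates the encoded size and the number of chunks for a given image size. It assumes chunk_size counts base64 characters, which this README does not state, and uses the 120000 value from the examples in this document.

```python
import base64
import math


def base64_length(raw_size):
    """Length of the padded base64 text for raw_size input bytes."""
    return 4 * math.ceil(raw_size / 3)


def chunk_count(raw_size, chunk_size=120000):
    """Number of chunks needed, assuming chunk_size counts base64 characters."""
    return math.ceil(base64_length(raw_size) / chunk_size)


if __name__ == "__main__":
    raw_size = 300_000  # e.g. a 300 kB JPEG
    print(base64_length(raw_size))  # 400000 characters
    print(chunk_count(raw_size))    # 4 chunks of up to 120000 characters
    # Cross-check the formula against the standard library.
    print(len(base64.b64encode(b"\0" * raw_size)) == base64_length(raw_size))
```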
Hybrid mode in capture_screenshot:
- response_mode="base64" (default): legacy behavior, JSON output with image_base64.
- response_mode="image": native MCP image output for vision models, with metadata in structured_content.
- response_mode="auto": reads SCREEN_MCP_CAPTURE_RESPONSE_MODE (base64 or image) and chooses automatically based on the client/host.
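
One way to picture the mode resolution described above is the following sketch. It is not the server's implementation: it models only the environment-variable lookup, leaves out any client/host detection, and assumes a fallback to base64 when the variable is unset or invalid. The function name `resolve_response_mode` is hypothetical.

```python
import os

ENV_VAR = "SCREEN_MCP_CAPTURE_RESPONSE_MODE"
VALID_MODES = ("base64", "image")


def resolve_response_mode(requested="base64", environ=None):
    """Resolve a response_mode argument to "base64" or "image".

    Assumption: in "auto" mode, an unset or invalid environment value falls
    back to "base64"; the real server may decide differently.
    """
    env = os.environ if environ is None else environ
    if requested in VALID_MODES:
        # An explicit mode always wins over the environment.
        return requested
    if requested != "auto":
        raise ValueError(f"unsupported response_mode: {requested!r}")
    configured = env.get(ENV_VAR, "").strip().lower()
    return configured if configured in VALID_MODES else "base64"


if __name__ == "__main__":
    print(resolve_response_mode("auto", {ENV_VAR: "image"}))
    print(resolve_response_mode("auto", {}))
```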
Screen captures may contain sensitive data. Add an explicit client-side policy for production use (consent, masking, window whitelisting, etc.).