Sandboxed Python execution for AI agents. Scripts run in ephemeral, isolated environments with inline dependencies (PEP 723) -- zero host pollution, zero leftover venvs, zero package conflicts.
Every coding agent can already run Python on your host. The problem is what happens next: packages accumulate, venvs sprawl, and a rogue pip install breaks your system. mcp-python-exec-sandbox eliminates this:
- Scripts execute in a sandbox (bubblewrap on Linux, Docker on macOS/other platforms)
- Dependencies are declared inline and resolved ephemerally via uv
- Nothing touches your host's Python, site-packages, or virtualenvs
- Each execution is isolated and disposable
- Sandboxed execution -- platform-specific isolation prevents host filesystem access
- PEP 723 inline metadata -- declare dependencies directly in scripts with `# /// script` blocks
- Multi-version Python -- run scripts on Python 3.13, 3.14, or 3.15 (uv downloads the right version automatically)
- Ephemeral environments -- dependencies are resolved per-execution, never persisted
- Package caching -- uv's global cache makes repeat installs near-instant
- Timeout enforcement -- configurable per-execution timeouts
- Output truncation -- prevents runaway output from overwhelming the agent
All setups require:
- Python 3.13+ -- to run the MCP server process
- uv -- manages script execution, dependency resolution, and Python version downloads. Also provides `uvx` for running the server without installing it globally.
Additional requirements depend on your chosen sandbox backend:
| Setup | Additional requirements | Install |
|---|---|---|
| Native sandbox (Linux) | bubblewrap | sudo apt install bubblewrap |
| Docker sandbox (macOS, any) | Docker Engine | See Docker docs |
| No sandbox | None | -- |
Host Python vs. execution Python: These are independent. Python 3.13+ is needed to run the server process itself. The `--python-version` flag controls which Python version your scripts execute on -- uv downloads the target version automatically. You do not need to install Python 3.14 or 3.15 on your host to run scripts on those versions.
Claude Code, with the default sandbox backend (native on Linux, Docker on macOS/other platforms):

    claude mcp add python-sandbox -- uvx mcp-python-exec-sandbox

The Docker sandbox image is pulled automatically from GHCR on first use. No manual build required.

Claude Code, without a sandbox:

    claude mcp add python-sandbox -- uvx mcp-python-exec-sandbox --sandbox-backend none

Cursor: add to .cursor/mcp.json (project-level) or ~/.cursor/mcp.json (global):

    {
      "mcpServers": {
        "python-sandbox": {
          "command": "uvx",
          "args": ["mcp-python-exec-sandbox"]
        }
      }
    }

Codex CLI:

    codex mcp add python-sandbox -- uvx mcp-python-exec-sandbox

Or add to .codex/config.toml:

    [mcp_servers.python-sandbox]
    command = "uvx"
    args = ["mcp-python-exec-sandbox"]

Any client that supports the MCP stdio transport can use this server:

    {
      "mcpServers": {
        "python-sandbox": {
          "command": "uvx",
          "args": ["mcp-python-exec-sandbox"]
        }
      }
    }

Use `--python-version` to target a specific Python version. uv downloads it automatically -- no manual install needed.

    # Python 3.13 (default)
    uvx mcp-python-exec-sandbox --python-version 3.13
    # Python 3.14
    uvx mcp-python-exec-sandbox --python-version 3.14
    # Python 3.15
    uvx mcp-python-exec-sandbox --python-version 3.15

This works across all sandbox backends. The Docker sandbox uses uv inside the container to manage Python versions, so the same `--python-version` flag applies.
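
How the server translates `--python-version` into a uv invocation is not spelled out here. As a rough sketch, assuming the executor shells out to `uv run` with uv's `--python` and `--script` options, the command line could be assembled as below. `build_uv_command` is a hypothetical helper for illustration, not a function from this package.

```python
from pathlib import Path


def build_uv_command(
    script_path: Path,
    python_version: str = "3.13",
    uv_path: str = "uv",
) -> list[str]:
    """Build an argv list asking uv to run a script on a given Python version.

    Hypothetical helper: the real executor may pass different or extra flags.
    """
    return [
        uv_path,
        "run",
        "--python", python_version,    # uv downloads this interpreter if missing
        "--script", str(script_path),  # treat the file as a PEP 723 script
    ]


if __name__ == "__main__":
    print(build_uv_command(Path("example.py"), python_version="3.14"))
```

Returning an argv list rather than a shell string keeps the call safe to hand to `subprocess.run` without shell quoting.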
execute_python -- Execute a Python script with automatic dependency management.
| Parameter | Type | Default | Description |
|---|---|---|---|
| script | str | required | Python source code, may include PEP 723 inline metadata |
| dependencies | list[str] | [] | Extra PEP 508 dependency specifiers to merge |
| timeout_seconds | int | 30 | Maximum execution time in seconds (1--300) |
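
The `dependencies` parameter is merged with whatever the script declares inline. The exact merge rule is not documented here. The sketch below assumes that entries are keyed by normalized package name and that an extra specifier overrides an inline one for the same package. `normalized_name` and `merge_dependencies` are hypothetical names.

```python
import re

# Leading package name of a PEP 508 specifier (before extras, versions, markers).
_NAME_RE = re.compile(r"^\s*([A-Za-z0-9][A-Za-z0-9._-]*)")


def normalized_name(specifier: str) -> str:
    """Return the package name, lowercased with runs of -, _, . collapsed to -."""
    match = _NAME_RE.match(specifier)
    if match is None:
        raise ValueError(f"not a dependency specifier: {specifier!r}")
    return re.sub(r"[-_.]+", "-", match.group(1)).lower()


def merge_dependencies(inline: list[str], extra: list[str]) -> list[str]:
    """Merge two specifier lists; assumption: extras win on a name clash."""
    merged: dict[str, str] = {}
    for spec in [*inline, *extra]:
        # Re-assigning an existing key keeps its original position in the dict.
        merged[normalized_name(spec)] = spec.strip()
    return list(merged.values())


if __name__ == "__main__":
    print(merge_dependencies(["pandas", "requests>=2"], ["Requests==2.31", "rich"]))
```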

    # Simple script
    execute_python(script="print('hello world')")
    # Script with dependencies
    execute_python(
        script="import requests; print(requests.get('https://httpbin.org/get').status_code)",
        dependencies=["requests"]
    )
    # Script with inline PEP 723 metadata
    execute_python(script="""
    # /// script
    # dependencies = ["pandas", "matplotlib"]
    # ///
    import pandas as pd
    print(pd.DataFrame({'a': [1,2,3]}).describe())
    """)

Returns information about the execution environment: Python version, uv version, platform, sandbox status, and configuration.
Validates a script's PEP 723 metadata and dependencies without executing it.
| Parameter | Type | Default | Description |
|---|---|---|---|
| script | str | required | Python source code to validate |
| dependencies | list[str] | [] | Extra dependency specifiers to validate |
| Backend | Platform | Tool | Notes |
|---|---|---|---|
| native | Linux | bubblewrap | Namespace isolation, network allowed |
| docker | Any | Docker | Container isolation, resource limits |
| none | Any | -- | No sandboxing (not recommended) |
The default backend is native (bubblewrap) on Linux and docker on macOS/other platforms. Specifying --sandbox-backend native on macOS automatically redirects to Docker. If the sandbox tool is unavailable, the server falls back to none with a warning.
The Docker sandbox image is published to GHCR and pulled automatically when the server starts. No manual setup is needed.
To build locally for development:

    docker build -t ghcr.io/lu-zhengda/mcp-python-exec-sandbox profiles/

Command-line options:

    mcp-python-exec-sandbox [OPTIONS]
    Options:
      --python-version TEXT    Python version for execution (default: 3.13)
      --sandbox-backend TEXT   native | docker | none (default: native on Linux, docker on macOS)
      --max-timeout INT        Maximum allowed timeout in seconds (default: 300)
      --default-timeout INT    Default timeout in seconds (default: 30)
      --max-output-bytes INT   Maximum output size in bytes (default: 102400)
      --no-warm-cache          Skip cache warming on startup
      --uv-path TEXT           Path to uv binary (default: uv)

    git clone https://github.com/lu-zhengda/mcp-python-exec-sandbox.git
    cd mcp-python-exec-sandbox
    uv sync --dev

Project layout:

    src/mcp_python_exec_sandbox/   # Package source
      server.py                    # FastMCP server + tool definitions
      executor.py                  # uv subprocess orchestration
      script.py                    # PEP 723 metadata parsing/merging
      sandbox.py                   # Sandbox ABC + factory
      sandbox_linux.py             # bubblewrap sandbox (Linux)
      sandbox_docker.py            # Docker sandbox (macOS/any)
      config.py, cache.py, output.py, errors.py
    tests/                         # Unit + integration tests (mocked or local uv)
    e2e_tests/                     # End-to-end tests (require uv + network)
    profiles/                      # Dockerfile, warmup packages
    .devcontainer/                 # Devcontainer for Linux sandbox testing from macOS

Unit and integration tests -- fast, run everywhere:

    uv run pytest tests/ -v

E2E tests -- require uv and network access. These exercise real script execution, package installation, MCP protocol flow, and sandbox enforcement:

    uv run pytest e2e_tests/ -v

The Docker E2E tests (e2e_tests/test_docker_sandbox.py) verify execution, dependency installation, read-only filesystem enforcement, host isolation, and timeout handling through the Docker backend.
Prerequisites:
- Docker must be installed and running
- Build the sandbox image:

      docker build -t ghcr.io/lu-zhengda/mcp-python-exec-sandbox profiles/

Then run:

    uv run pytest e2e_tests/test_docker_sandbox.py -v

These tests are automatically skipped if Docker is unavailable or the image hasn't been built.
The Linux sandbox tests (e2e_tests/test_sandbox_enforcement.py::test_linux_sandbox_blocks_etc_shadow) use bubblewrap (bwrap) for namespace isolation. They are skipped on macOS because bwrap is Linux-only.
To run them from macOS, use the included devcontainer, which provides Ubuntu 24.04 with bwrap pre-installed:
VS Code:
- Install the Dev Containers extension
- Open the project and select Reopen in Container
- In the integrated terminal:

      uv run pytest e2e_tests/test_sandbox_enforcement.py -v

CLI:

    # Install the devcontainer CLI (once)
    npm install -g @devcontainers/cli
    # Build and start the container
    devcontainer up --workspace-folder .
    # Run the Linux sandbox tests inside the container
    devcontainer exec --workspace-folder . uv run pytest e2e_tests/test_sandbox_enforcement.py -v

| Test suite | Command | Requirements |
|---|---|---|
| Unit tests | `uv run pytest tests/ -v` | uv |
| Integration tests | `uv run pytest tests/test_integration.py -v` | uv |
| E2E (general) | `uv run pytest e2e_tests/ -v` | uv, network |
| E2E (Docker sandbox) | `uv run pytest e2e_tests/test_docker_sandbox.py -v` | uv, Docker, sandbox image |
| E2E (Linux/bwrap sandbox) | `uv run pytest e2e_tests/test_sandbox_enforcement.py -v` | uv, Linux with bwrap (or devcontainer) |
- One logical change per commit. Descriptive commit message (imperative mood).
- Run `uv run pytest tests/ -v` before committing -- all tests must pass.
- Add tests for new functionality: unit tests in `tests/`, E2E in `e2e_tests/` if it needs real execution.
- Keep dependencies minimal. Do not add runtime deps without strong justification.
- Tool docstrings in `server.py` are user-facing MCP tool descriptions. Write them for an LLM audience.
- Sandbox backends must degrade gracefully: if the required tool (bwrap, docker) is missing, fall back to `NoopSandbox` with a warning.
MIT