[Bug] Synchronous blocking in _wait_for_sandbox_ready crashes single-worker uvicorn server #620
Describe the bug
When creating a sandbox via `POST /v1/sandboxes`, the server synchronously blocks the asyncio event loop in `_wait_for_sandbox_ready`, causing liveness probe failures and pod restarts.
Root Cause
- `time.sleep()` instead of `await asyncio.sleep()` in `kubernetes_service.py` line 185 — when the workload is not yet visible in the K8s API, the code hits `time.sleep(poll_interval_seconds)`, which blocks the entire event loop. Line 212 in the same method correctly uses `await asyncio.sleep()`.
- Synchronous K8s client calls — `get_workload()`, `get_status()`, and `create_workload()` all use the synchronous `kubernetes` Python client. Each API call blocks the event loop for the duration of the network round-trip.
- Single uvicorn worker — `cli.py` calls `uvicorn.run()` without a `workers` parameter, defaulting to 1 process with 1 event loop.
Combined, a single `POST /v1/sandboxes` request can block the event loop for up to 60 seconds (`sandbox_create_timeout_seconds`). During this time, all other requests, including `/health` liveness probes, go unanswered. Kubernetes kills the pod after enough missed probes.
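The failure mode above can be reproduced without any K8s dependency. The sketch below (all names are illustrative, not from the OpenSandbox codebase) schedules a "health check" task behind a waiter coroutine and measures how long the health task is starved by `time.sleep()` versus `await asyncio.sleep()`:

```python
import asyncio
import time

POLL_INTERVAL_SECONDS = 0.5  # stand-in for poll_interval_seconds

async def blocking_wait() -> None:
    # Mirrors the bug: time.sleep() stalls the whole event loop.
    time.sleep(POLL_INTERVAL_SECONDS)

async def cooperative_wait() -> None:
    # The fix: await asyncio.sleep() yields control back to the loop.
    await asyncio.sleep(POLL_INTERVAL_SECONDS)

async def health_check() -> str:
    # Stand-in for the /health handler; should answer immediately.
    return "ok"

async def probe_latency(wait_factory) -> float:
    waiter = asyncio.create_task(wait_factory())
    start = time.monotonic()
    # The health "request" cannot run until the loop regains control.
    await asyncio.create_task(health_check())
    elapsed = time.monotonic() - start
    await waiter
    return elapsed

blocked = asyncio.run(probe_latency(blocking_wait))
cooperative = asyncio.run(probe_latency(cooperative_wait))
print(f"health latency with time.sleep():    {blocked:.2f}s")
print(f"health latency with asyncio.sleep(): {cooperative:.2f}s")
```

With the blocking variant, the health task is delayed for the full poll interval; with the cooperative variant it completes almost instantly, which is why only the `time.sleep()` path trips the liveness probe.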
To Reproduce
- Deploy OpenSandbox server on Kubernetes with default Helm values (single replica, default liveness probe)
- Create a sandbox with an image that hasn't been pulled yet on the target node
- Observe server logs: sandbox stays `Pending` for the full 60s timeout
- Observe pod restarts due to liveness probe failures
Suggested Fix
- Immediate: Replace `time.sleep()` on line 185 with `await asyncio.sleep()`.
- Proper: Wrap synchronous K8s client calls in `loop.run_in_executor()`, or switch to an async K8s client.
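The "proper" fix could look like the following minimal sketch. It assumes a small `run_sync` helper (not an existing OpenSandbox function) that offloads any blocking call to a dedicated thread pool so the event loop stays responsive:

```python
import asyncio
from concurrent.futures import ThreadPoolExecutor
from functools import partial

# Dedicated pool so slow K8s round-trips never occupy the event loop
# or starve the loop's default executor. Sizing is an assumption; tune
# max_workers to the expected number of concurrent sandbox operations.
_k8s_executor = ThreadPoolExecutor(max_workers=8, thread_name_prefix="k8s")

async def run_sync(fn, *args, **kwargs):
    """Run a blocking call (e.g. a synchronous kubernetes-client method)
    in a worker thread and await the result without stalling the loop."""
    loop = asyncio.get_running_loop()
    return await loop.run_in_executor(_k8s_executor, partial(fn, *args, **kwargs))
```

A call site such as `get_status()` would then become something like `status = await run_sync(self._client.get_status, sandbox_id)` (names hypothetical). On Python 3.9+, `asyncio.to_thread()` offers the same effect with less ceremony when the default executor is acceptable.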
Environment
- OpenSandbox Server: v0.1.4 (Helm chart 0.1.0)
- Kubernetes: v1.28.15
- Runtime: containerd 1.6.36