[Feature Request] Auto-create backing storage (PVC / Docker volume) when volume is requested #660
Description
Problem
OpenSandbox accepts volume mounts in Sandbox.create(), but treats them as references to
pre-existing storage. If the backing storage doesn't exist — a PVC in Kubernetes, or a
named volume in Docker — the sandbox fails at scheduling time.
This means every caller must independently manage storage lifecycle outside of OpenSandbox,
which defeats the purpose of having a unified sandbox API. The caller has to know the
runtime backend (K8s vs Docker), have the right client/credentials, and implement
ensure-or-create logic themselves.
Current behavior
The volume_helper adds volume references to the pod/container spec, but no provider
actually creates the backing storage. The sandbox simply fails if it doesn't already exist.
- Kubernetes: persistentvolumeclaim "xxx" not found → pod stuck in Pending
- Docker: equivalent failure when referencing a non-existent named volume
The server-side PVC model (api/schema.py) already describes itself as a
"runtime-neutral abstraction" covering both K8s PVCs and Docker named volumes, but only
contains claim_name — no way to express provisioning intent.
Expected behavior
When a sandbox requests a volume, OpenSandbox should ensure the backing storage exists
before creating the workload — create if missing, skip if already present. This is
analogous to docker run -v mydata:/data which auto-creates the mydata volume.
Why this matters
Volumes are essential for sandbox use cases like:
- Persistent user workspaces that survive sandbox restarts
- Shared data across multiple sandbox sessions
- Pre-populated datasets mounted into sandboxes
Without auto-ensure, callers must couple to the runtime backend and duplicate storage
management logic — exactly the abstraction leak that OpenSandbox is designed to eliminate.
Proposed changes
1. Extend PVC model with optional provisioning hints
Currently PVC only has claim_name. Add optional fields so the caller can declaratively
describe what they need. These are only used during auto-creation and ignored if the volume
already exists:
# server: api/schema.py
from pydantic import BaseModel, Field

class PVC(BaseModel):
    claim_name: str = Field(..., alias="claimName")
    # Provisioning hints: used when auto-creating, ignored if volume already exists
    storage_class: str | None = Field(None, alias="storageClass")  # None = platform default
    storage: str | None = Field(None)  # e.g. "1Gi", "10Gi"
    access_modes: list[str] | None = Field(None, alias="accessModes")  # e.g. ["ReadWriteOnce"]

SDK models (models/sandboxes.py, api/lifecycle/models/pvc.py) and other language SDKs
should be updated accordingly.
2. Each runtime provider gains an ensure_volumes() step before workload creation
- Kubernetes: for each PVC volume, check if the PVC exists via CoreV1Api; if not, create it
  using the provisioning hints
- Docker: check/create named volume via Docker API (provisioning hints like storage may not
  apply)
3. Server-side config: only a safety switch, not defaults
[storage]
volume_auto_create = true # safety switch to disable auto-creation entirely
All sizing/class decisions come from the caller's PVC declaration, not from server config.
The server only controls whether auto-creation is allowed.