[Feature Request] Auto-create backing storage (PVC / Docker volume) when volume is requested #660

@xfgong

Problem

OpenSandbox accepts volume mounts in Sandbox.create(), but treats them as references to
pre-existing storage. If the backing storage doesn't exist (a PVC in Kubernetes, or a
named volume in Docker), the sandbox fails at scheduling time.

This means every caller must independently manage storage lifecycle outside of OpenSandbox,
which defeats the purpose of having a unified sandbox API. The caller has to know the
runtime backend (K8s vs Docker), have the right client/credentials, and implement
ensure-or-create logic themselves.

Current behavior

The volume_helper adds volume references to the pod/container spec, but no provider
actually creates the backing storage. The sandbox simply fails if it doesn't already exist.

  • Kubernetes: persistentvolumeclaim "xxx" not found → pod stuck in Pending
  • Docker: equivalent failure when referencing a non-existent named volume

The server-side PVC model (api/schema.py) already describes itself as a
"runtime-neutral abstraction" covering both K8s PVCs and Docker named volumes, but only
contains claim_name — no way to express provisioning intent.

Expected behavior

When a sandbox requests a volume, OpenSandbox should ensure the backing storage exists
before creating the workload — create if missing, skip if already present. This is
analogous to docker run -v mydata:/data which auto-creates the mydata volume.

Why this matters

Volumes are essential for sandbox use cases like:

  • Persistent user workspaces that survive sandbox restarts
  • Shared data across multiple sandbox sessions
  • Pre-populated datasets mounted into sandboxes

Without auto-ensure, callers must couple to the runtime backend and duplicate storage
management logic — exactly the abstraction leak that OpenSandbox is designed to eliminate.

Proposed changes

1. Extend PVC model with optional provisioning hints

Currently PVC only has claim_name. Add optional fields so the caller can declaratively
describe what they need. These are only used during auto-creation and ignored if the volume
already exists:

# server: api/schema.py
from pydantic import BaseModel, Field


class PVC(BaseModel):
    claim_name: str = Field(..., alias="claimName")
    # Provisioning hints: used when auto-creating, ignored if the volume already exists
    storage_class: str | None = Field(None, alias="storageClass")  # None = platform default
    storage: str | None = Field(None)  # e.g. "1Gi", "10Gi"
    access_modes: list[str] | None = Field(None, alias="accessModes")  # e.g. ["ReadWriteOnce"]

SDK models (models/sandboxes.py, api/lifecycle/models/pvc.py) and other language SDKs
should be updated accordingly.
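
To illustrate how these hints could translate into a Kubernetes object, here is a minimal sketch that builds a PersistentVolumeClaim manifest from them. PVCHints and to_k8s_manifest are hypothetical names, not part of OpenSandbox, and a stdlib dataclass stands in for the pydantic model so the example runs on its own. Two policies in it are assumptions: a missing storage size is rejected rather than guessed, and access_modes falls back to ["ReadWriteOnce"].

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class PVCHints:
    """Stand-in for the proposed PVC model (hypothetical, stdlib only)."""
    claim_name: str
    storage_class: Optional[str] = None
    storage: Optional[str] = None
    access_modes: Optional[list[str]] = None


def to_k8s_manifest(pvc: PVCHints) -> dict:
    """Build a PersistentVolumeClaim manifest dict from the provisioning hints."""
    if pvc.storage is None:
        # Kubernetes needs a storage request on a PVC.
        # Assumed policy: fail clearly instead of inventing a size.
        raise ValueError(f"storage is required to auto-create PVC {pvc.claim_name!r}")
    spec = {
        # Assumed fallback when the caller gives no access modes.
        "accessModes": pvc.access_modes or ["ReadWriteOnce"],
        "resources": {"requests": {"storage": pvc.storage}},
    }
    if pvc.storage_class is not None:
        # Leaving storageClassName out lets the cluster default class apply.
        spec["storageClassName"] = pvc.storage_class
    return {
        "apiVersion": "v1",
        "kind": "PersistentVolumeClaim",
        "metadata": {"name": pvc.claim_name},
        "spec": spec,
    }
```

Calling to_k8s_manifest(PVCHints("workspace", storage="1Gi")) yields a manifest with no storageClassName, which matches the "None = platform default" intent of the storage_class field.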

2. Each runtime provider gains an ensure_volumes() step before workload creation

  • Kubernetes: for each PVC volume, check if the PVC exists via CoreV1Api; if not, create it
    using the provisioning hints
  • Docker: check/create named volume via Docker API (provisioning hints like storage may not
    apply)
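
The ensure step described above could look roughly like the following sketch. Only the name ensure_volumes comes from this proposal; VolumeBackend, VolumeAlreadyExists, and InMemoryBackend are hypothetical helpers introduced for illustration, and the in-memory backend replaces a real Kubernetes or Docker client so the example is runnable. It also assumes that losing a creation race to another caller should count as success.

```python
from typing import Protocol


class VolumeAlreadyExists(Exception):
    """Raised by a backend when create() finds the volume already present."""


class VolumeBackend(Protocol):
    """Hypothetical per-runtime interface (Kubernetes, Docker, ...)."""
    def exists(self, name: str) -> bool: ...
    def create(self, name: str, hints: dict) -> None: ...


def ensure_volumes(backend: VolumeBackend, volumes: list[dict]) -> list[str]:
    """Create missing volumes and return the names that were created."""
    created = []
    for vol in volumes:
        name = vol["claim_name"]
        if backend.exists(name):
            continue  # already present: provisioning hints are ignored
        # Pass along only the hints the caller actually set.
        hints = {k: v for k, v in vol.items() if k != "claim_name" and v is not None}
        try:
            backend.create(name, hints)
        except VolumeAlreadyExists:
            continue  # created concurrently by someone else: treat as success
        created.append(name)
    return created


class InMemoryBackend:
    """Fake backend used here in place of a real runtime client."""
    def __init__(self, existing=()):
        self.volumes = {name: {} for name in existing}

    def exists(self, name: str) -> bool:
        return name in self.volumes

    def create(self, name: str, hints: dict) -> None:
        if name in self.volumes:
            raise VolumeAlreadyExists(name)
        self.volumes[name] = hints
```

A Kubernetes backend would presumably wrap CoreV1Api.read_namespaced_persistent_volume_claim (a 404 meaning "missing") and create_namespaced_persistent_volume_claim (a 409 meaning "already exists"), while a Docker backend could wrap client.volumes.get and client.volumes.create from the Docker SDK for Python. How OpenSandbox providers would wire these in is not specified in this issue.
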

3. Server-side config: only a safety switch, not defaults

[storage]
volume_auto_create = true   # safety switch to disable auto-creation entirely

All sizing/class decisions come from the caller's PVC declaration, not from server config.
The server only controls whether auto-creation is allowed.
