feat(lambda): add Lambda Cloud provider #337
Add the built-in Lambda provider registration, non-secret config defaults and env overrides, config-show output, API client/types, and read-only doctor readiness checks. Keep billable lifecycle, provider-side SSH key orchestration, docs, generated provider matrices, and live smoke scripts deferred to later Lambda plans.
Implement Lambda direct SSH lease acquisition, provider-side SSH key reconciliation, launch polling, release, cleanup, recovery claims, and local Tailscale/touch metadata handling on top of the provider foundation. Add fake-client coverage for launch request shape, ownership filtering, key reuse/deletion, cleanup safety, ambiguous mutation recovery, and non-mutating doctor behavior.
Document the direct Lambda SSH lease provider, add it to the generated provider matrix metadata, and provide a guarded live smoke script with deterministic blocker classifications. The smoke script remains opt-in, redacts Lambda secrets, distinguishes external account blockers from validation failures, and verifies cleanup across instances, provider SSH keys, and local testbox keys.
Teach the guarded Lambda live smoke cleanup path to recognize the backend's actual instance-not-found message so ambiguous-create recovery can prove a clean account state instead of reporting a false cleanup failure.
Keep launch-time expiry metadata on Lambda provider tags so provider-only cleanup can eventually reclaim billable orphan instances, while local claims continue to carry fresh touch state. Decode Lambda instance type responses from both array and map-keyed API shapes so doctor can validate capacity against the current API response.
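The dual-shape decoding described above could look roughly like the following. This is a minimal sketch, not the code from this PR: the `instanceType` struct, its `name` field, and the `decodeInstanceTypes` helper are hypothetical names, and falling back to the map key when an entry carries no name is an assumption made for illustration.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"sort"
)

// instanceType is a hypothetical, trimmed-down shape; the real Lambda
// payload carries more fields than this sketch models.
type instanceType struct {
	Name string `json:"name"`
}

// decodeInstanceTypes accepts either a JSON array of entries or a JSON
// object keyed by instance type name, and returns the entries sorted by name.
func decodeInstanceTypes(raw []byte) ([]instanceType, error) {
	trimmed := bytes.TrimSpace(raw)
	if len(trimmed) == 0 {
		return nil, fmt.Errorf("empty instance type payload")
	}
	var out []instanceType
	switch trimmed[0] {
	case '[':
		if err := json.Unmarshal(trimmed, &out); err != nil {
			return nil, fmt.Errorf("decode array shape: %w", err)
		}
	case '{':
		keyed := map[string]instanceType{}
		if err := json.Unmarshal(trimmed, &keyed); err != nil {
			return nil, fmt.Errorf("decode map shape: %w", err)
		}
		for key, it := range keyed {
			if it.Name == "" {
				it.Name = key // assumption: the map key is the type name
			}
			out = append(out, it)
		}
	default:
		return nil, fmt.Errorf("unexpected JSON token %q", trimmed[0])
	}
	// Map iteration order is random, so sort for deterministic results.
	sort.Slice(out, func(i, j int) bool { return out[i].Name < out[j].Name })
	return out, nil
}

func main() {
	for _, payload := range []string{
		`[{"name":"gpu_1x_a10"}]`,
		`{"gpu_1x_a10":{}}`,
	} {
		types, err := decodeInstanceTypes([]byte(payload))
		fmt.Println(types, err)
	}
}
```

Branching on the first non-space byte keeps both shapes on one code path, so a doctor check built on it would not need to know which shape the API returned.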
Remove unsupported launch name/tag fields from the Lambda create payload and rely on Crabbox local lease claims for ownership, list, resolve, stop, and cleanup when provider tags are not persisted. Also allow ambiguous SSH-key create recovery to clean up by owned key name, while retaining support for complete provider-tagged Lambda instances as a compatibility cleanup path.
Use the per-lease Lambda SSH key name to match an untagged instance back to an ambiguous launch recovery claim when the launch response was lost. This gives stop and cleanup a concrete billable-resource recovery path even without provider-side launch tags.
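As a rough illustration of that key-name matching, the sketch below pairs a recovery claim with an untagged instance. The `instance` and `claim` types and the `matchLostLaunch` helper are hypothetical and do not come from this PR. Refusing to match when more than one instance carries the key is an assumption chosen here to avoid stopping the wrong billable resource.

```go
package main

import "fmt"

// instance and claim are hypothetical shapes used only for this sketch.
type instance struct {
	ID          string
	SSHKeyNames []string
}

type claim struct {
	LeaseID    string
	SSHKeyName string // per-lease key name recorded before launch
	InstanceID string // empty when the launch response was lost
}

// matchLostLaunch returns the single instance that carries the claim's
// per-lease SSH key name. It reports false when no instance matches or
// when the match is ambiguous.
func matchLostLaunch(c claim, instances []instance) (instance, bool) {
	if c.SSHKeyName == "" {
		return instance{}, false
	}
	var found []instance
	for _, inst := range instances {
		for _, key := range inst.SSHKeyNames {
			if key == c.SSHKeyName {
				found = append(found, inst)
				break
			}
		}
	}
	if len(found) != 1 {
		return instance{}, false
	}
	return found[0], true
}

func main() {
	c := claim{LeaseID: "lease-1", SSHKeyName: "crabbox-lease-1"}
	instances := []instance{
		{ID: "i-aaa", SSHKeyNames: []string{"someone-else"}},
		{ID: "i-bbb", SSHKeyNames: []string{"crabbox-lease-1"}},
	}
	inst, ok := matchLostLaunch(c, instances)
	fmt.Println(inst.ID, ok)
}
```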
Invoke the acquire callback before local Lambda claim side effects so controller acknowledgement failures roll back billable resources. Harden Lambda API error redaction for multi-line user_data and let the image-family env override clear lower-precedence exact image config.
Codex review: needs real behavior proof before merge. Reviewed June 13, 2026, 11:26 PM ET / 03:26 UTC.
Summary
Reproducibility: yes, for the PR defects. Comparing the proposed request/response structs with Lambda's live OpenAPI schema shows mismatched JSON shapes. I did not run a live Lambda mutation because the PR itself reports that live gates were not enabled.
Review metrics: 2 noteworthy metrics.
Merge readiness
Overall follows the weaker of proof and patch quality, so missing proof can cap an otherwise strong patch.
Rank-up moves:
Proof guidance:
Risk before merge
Maintainer options:
Next step before merge
Security Review findings
Review details
Best possible solution: Fix the Lambda request/response schemas against the live OpenAPI contract, keep credentials env-only, then merge only after redacted live terminal output proves doctor, launch, run/list, stop, and cleanup behavior or maintainers explicitly accept an external-account blocker.
Do we have a high-confidence way to reproduce the issue? Yes, for the PR defects: comparing the proposed request/response structs with Lambda's live OpenAPI schema shows mismatched JSON shapes. I did not run a live Lambda mutation because the PR itself reports that live gates were not enabled.
Is this the best way to solve the issue? No: the provider direction fits Crabbox's direct SSH-lease model, but this implementation needs schema fixes and real Lambda behavior proof before it is the maintainable merge path.
Full review comments:
Overall correctness: patch is incorrect
AGENTS.md: found and applied where relevant.
Codex review notes: model internal, reasoning high; reviewed against 7763eecdf759.
Label changes:
Label justifications:
Evidence reviewed
What I checked:
Likely related people:
What the crustacean ranks mean
Shiny media proof means a screenshot, video, or linked artifact directly shows the changed behavior. Runtime, network, CSP, and security claims still need visible diagnostics.
How this review workflow works
Closes #336
Summary
Adds a built-in lambda provider for Lambda Cloud On-Demand GPU instances.
This implements a direct Linux SSH lease provider with Lambda API auth, provider defaults/config/env overrides, non-mutating doctor checks, instance launch/resolve/list/stop/cleanup behavior, local claim-backed recovery, Tailscale cloud-init support, provider docs, generated provider metadata, and a guarded live smoke script.
Notes
CRABBOX_LIVE=1, CRABBOX_LIVE_PROVIDERS=lambda, and LAMBDA_API_KEY.
Verification
go test -race ./...
go vet ./...
go build -trimpath -o bin/crabbox ./cmd/crabbox
npm test --prefix worker
npm run format:check --prefix worker
npm run lint --prefix worker
npm run check --prefix worker
node scripts/check-docs-links.mjs
bash scripts/check-docs.sh
bash -n scripts/live-lambda-smoke.sh
node --test scripts/live-lambda-smoke.test.js
scripts/live-lambda-smoke.sh -> classification=environment_blocked reason=CRABBOX_LIVE_not_enabled
~/.agents/skills/autoreview/scripts/autoreview --mode branch --base origin/main -> clean