ops: PowerShell operator scripts for AWS EC2 dev-loop #13
Merged
Adds scripts/ops/ with four parameterized PowerShell helpers for running
the gateway on a single EC2 GPU host without paying for idle time:
- setup-ssh.ps1: one-time-per-laptop key bootstrap (ed25519 +
  transient SG :22 inbound /32 + EC2 Instance Connect for
  first-connect key push)
- fix-and-start.ps1: start instance, disable idle alarm/cron,
  sed-fix systemd unit for compose v2 compat, wait for
  "Application startup complete.", smoke /health /ready
  /v1/chat/completions
- restore-idle-protection.ps1: re-enable alarm + cron; -StopNow to
  lock in savings
- teardown-ssh.ps1: revoke the SG :22 rule when done
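
For readers who want to see the "wait for startup" step in isolation, here is a minimal Python sketch of polling log output for the ready marker. It is not the PowerShell from `fix-and-start.ps1`: the function name, the `read_logs` callable, and the timeout and poll defaults are all assumptions made for illustration. Only the marker string comes from the description above.

```python
import time
from typing import Callable

# Marker string quoted in the script description.
READY_MARKER = "Application startup complete."


def wait_for_marker(
    read_logs: Callable[[], str],
    marker: str = READY_MARKER,
    timeout_s: float = 600.0,   # hypothetical default, not taken from the script
    poll_s: float = 5.0,        # hypothetical default, not taken from the script
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> bool:
    """Poll read_logs() until marker appears; return False once the timeout passes."""
    deadline = clock() + timeout_s
    while True:
        if marker in read_logs():
            return True
        if clock() >= deadline:
            return False
        sleep(poll_s)
```

`read_logs` would wrap whatever fetches the service log over SSH. Injecting `sleep` and `clock` keeps the loop testable without real waiting.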
Tag-based discovery (tag:application=vllm-serving +
tag:environment=<env>) means no hardcoded instance IDs/EIPs - script
params override when needed. checkip.amazonaws.com response is
IPv4-validated before authoring SG rules.
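
The validation step matters because anything other than a bare IPv4 address (an HTML error page, an IPv6 address, an empty body) must never reach the security-group rule. A minimal Python sketch of the idea follows. The actual check lives in `setup-ssh.ps1` and is not shown here, so the pattern, the octet range check, and the name `to_host_cidr` are assumptions.

```python
import re

# Assumed shape check: four dot-separated groups of 1-3 digits.
_IPV4_SHAPE = re.compile(r"(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})")


def to_host_cidr(raw_response: str) -> str:
    """Return '<ip>/32' for a checkip-style response body, or raise ValueError."""
    candidate = raw_response.strip()  # the service response may end with a newline
    match = _IPV4_SHAPE.fullmatch(candidate)
    # Reject anything that is not four in-range octets before building the CIDR.
    if match is None or any(int(octet) > 255 for octet in match.groups()):
        raise ValueError(f"not an IPv4 address: {candidate!r}")
    return f"{candidate}/32"
```

Failing loudly here means a bad response aborts the run instead of producing a malformed or overly broad inbound rule.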
First .ps1 files in the repo; CI is Python-only so no new lint surface.
Used in production by the convilyn dev-loop, externalized here so any
operator running llm-gateway on EC2 can pick them up.
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Drive-by fix to unblock CI on this PR. The previous commit on main (2c2c48d "fix(schemas): cap messages, tools, and per-message content length") landed pre-formatted lines that black 26.3.1 wants collapsed into single lines under the configured line-length=100. Pure formatting, no semantic change.

Verified: poetry run black --check llm_gateway/ tests/ now reports "52 files would be left unchanged."

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
Adds `scripts/ops/` with four parameterized PowerShell helpers that turn an EC2-hosted llm-gateway box into a low-friction, near-zero-fixed-cost dev environment. All scripts are idempotent, use tag-based instance discovery, and respect the existing `llm-gateway-bootstrap` / systemd / idle-cron design.

- `setup-ssh.ps1`: one-time-per-laptop ed25519 key bootstrap. Opens a transient SG :22 inbound rule scoped to the operator's `/32` (validates the IP via `checkip.amazonaws.com` against an IPv4 regex before authoring the rule), pushes the public key via EC2 Instance Connect (60s TTL), persists into `~ubuntu/.ssh/authorized_keys` for ongoing use.
- `fix-and-start.ps1`: start instance, disable idle alarm/cron, `sed`-patch the systemd unit (`docker compose --no-color` → `--ansi never` for Compose v2.x compat; safe no-op if already correct), wait for `Application startup complete.`, then smoke `/health` `/ready` `/v1/chat/completions` from inside the box.
- `restore-idle-protection.ps1`: re-enable the CW alarm action + idle cron; optional `-StopNow` to stop the instance immediately and lock in savings.
- `teardown-ssh.ps1`: revoke the transient SG :22 rule when done with the dev box for a while.

`scripts/ops/README.md` documents prereqs, tag-based discovery (`tag:application=vllm-serving` + `tag:environment=<env>`), the daily flow, and the operator's required IAM permission set.

Top-level `README.md` gets a short "Ops scripts" section linking to `scripts/ops/README.md` (right before "License").

Why
These helpers originally lived in a downstream consumer (convilyn). They have zero coupling to any consumer's package layout or domain logic: they only know how to start/stop the gateway box, fix a known systemd-unit issue, and manage the cost guardrails. They are externalized here so any operator running llm-gateway on EC2 can adopt them.
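
As an illustration of the systemd-unit fix mentioned above, the sketch below shows the flag swap as an idempotent text transform in Python. The real script does this with `sed` on the box. The function name and the sample `ExecStart` line in the usage note are hypothetical; only the two flag spellings come from this PR.

```python
def patch_compose_flags(unit_text: str) -> tuple[str, bool]:
    """Swap the old flag for the Compose v2 spelling; report whether anything changed."""
    patched = unit_text.replace(
        "docker compose --no-color",
        "docker compose --ansi never",
    )
    # A second run finds nothing to replace, so the patch is a safe no-op.
    return patched, patched != unit_text
```

For example, calling it on a hypothetical unit line `ExecStart=/usr/bin/docker compose --no-color up` returns the patched text and `True`. Calling it again on the result returns the same text and `False`, which is the "safe no-op if already correct" behaviour.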
Notes
- First `.ps1` files in the repo. CI is Python-only (ruff/black/pyright/pytest), so the donation does not gate CI.
- … `deploy/scripts/*.sh`), strict mode (`$ErrorActionPreference = 'Stop'`).
- `-Environment dev` (default) drives tag discovery. `-InstanceId`/`-Eip`/`-Region` override for explicit pinning.
- `-AlarmNameContains VLLMIdleBackstop` is overridable.
- … `scripts/ops/README.md`.

Test plan
- `Get-ChildItem scripts/ops/*.ps1 | ForEach-Object { Test-Path $_.FullName }` lists all 4 scripts
- `.\scripts\ops\setup-ssh.ps1 -Environment dev` succeeds against a real `vllm-serving` instance and pushes an SSH key
- `.\scripts\ops\fix-and-start.ps1 -Environment dev` brings the gateway from stopped → ready and the smoke test returns `ok`
- `.\scripts\ops\restore-idle-protection.ps1 -Environment dev -StopNow` re-arms the alarm + cron and stops the instance
- `.\scripts\ops\teardown-ssh.ps1 -Environment dev` revokes the SG :22 rule

🤖 Generated with Claude Code