On-demand GitHub Actions self-hosted runners using AWS Lambda (Go) + EC2 spot instances
- Documentation: docs/ - Configuration, deployment, and development guides.
- Releases: github.com/devopsfactory-io/jit-runners/releases
- Infrastructure (OpenTofu/Terraform): infra/terraform/ - HCL modules for all AWS resources.
- Infrastructure (CloudFormation): infra/cloudformation/ - CloudFormation template (template.yaml).
- Infrastructure (Packer): infra/packer/ - Packer template for the pre-baked runner AMI.
- Getting started (Terraform): docs/getting-started-terraform.md
- Getting started (CloudFormation): docs/getting-started-cloudformation.md
- GitHub App setup: docs/github-app-setup.md - Create and configure the GitHub App that sends `workflow_job` webhooks.
- Troubleshooting: docs/troubleshooting.md - Common operational issues, diagnosis commands, and resolutions.
- Contributing: CLAUDE.md for AI and contributor guidance.
jit-runners provisions on-demand GitHub Actions self-hosted runners that launch EC2 spot instances as ephemeral JIT (Just-In-Time) runners. Three AWS Lambda functions written in Go handle webhook reception, instance provisioning, and cleanup. There are no long-running servers — the entire control plane runs on serverless infrastructure.
```mermaid
graph LR
    A[GitHub webhook<br>workflow_job] --> B[API Gateway]
    B --> C[Webhook Lambda]
    C --> D[SQS Queue]
    D --> E[Scale-Up Lambda]
    E --> F[EC2 Spot<br>JIT Runner]
    G[EventBridge<br>every 5min] --> H[Scale-Down Lambda]
    H -->|cleanup| F
```
The three Lambda functions share code via lambda/internal/:
- `webhook` - Validates the GitHub webhook signature, parses the `workflow_job` event, and enqueues a message to SQS.
- `scaleup` - Processes SQS messages, generates a JIT runner token via the GitHub API, and launches an EC2 spot instance with a user-data script that registers and runs the ephemeral runner.
- scaledown - Runs on a schedule to clean up stale or orphaned instances and deregisters any runners that have not self-terminated.
- A GitHub App sends `workflow_job` webhooks to an API Gateway endpoint when a workflow job is queued.
- The Webhook Lambda validates the HMAC signature, parses the event, and enqueues a message to SQS with a 30-second delivery delay, which provides a deduplication window to absorb duplicate webhook deliveries before an instance is launched.
- The Scale-Up Lambda processes the SQS message, calls the GitHub API to generate a JIT runner registration token, and launches an EC2 spot instance (falling back to on-demand automatically if spot capacity is unavailable). The instance user-data script configures the runner agent (installing it on stock AMIs, or reusing the pre-baked binary on pre-baked AMIs), registers it using the JIT config, and immediately starts accepting jobs.
- After the job completes, the runner agent self-deregisters from GitHub and the instance self-terminates — no manual cleanup needed.
- The Scale-Down Lambda fires every 5 minutes via an EventBridge Scheduler. It queries DynamoDB for runner state and terminates any instances that are stale, orphaned, or whose runners have already deregistered.
- Up to 90% cost savings - EC2 spot instances cost a fraction of GitHub-hosted runners for equivalent compute.
- No idle infrastructure - Runners launch on demand and terminate after use; you pay only for the seconds a job is running.
- Private network access - Runners launch inside your VPC and can reach private resources (RDS, EKS API, internal registries) that GitHub-hosted runners cannot.
- Custom hardware - Configure instance types and sizes per workflow label (e.g. `runs-on: [self-hosted, c6i.4xlarge]`). The default instance type when no label matches is `t3.large`.
- Single-use ephemeral runners - Each job gets a clean environment with no shared state, no credential leakage, and no leftover artifacts from previous runs.
- Serverless control plane - No servers to maintain or patch. The entire orchestration layer is Lambda, SQS, DynamoDB, and EventBridge.
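The label-to-instance-type mapping described above can be sketched as a simple resolver. The matching regex and function name are illustrative assumptions, with `t3.large` as the documented default:

```go
package main

import (
	"fmt"
	"regexp"
)

// instanceTypeRe loosely matches EC2 instance type names such as
// c6i.4xlarge or t3.large (illustrative, not the project's parser).
var instanceTypeRe = regexp.MustCompile(`^[a-z][a-z0-9-]*\.(nano|micro|small|medium|large|[0-9]*xlarge|metal)$`)

// resolveInstanceType returns the first job label that looks like an
// EC2 instance type, or the documented default t3.large.
func resolveInstanceType(labels []string) string {
	for _, l := range labels {
		if instanceTypeRe.MatchString(l) {
			return l
		}
	}
	return "t3.large"
}

func main() {
	fmt.Println(resolveInstanceType([]string{"self-hosted", "c6i.4xlarge"})) // c6i.4xlarge
	fmt.Println(resolveInstanceType([]string{"self-hosted"}))               // t3.large
}
```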
jit-runners ships a pre-baked Amazon Linux 2023 AMI with an ubuntu-latest-like toolchain pre-installed. Using the pre-baked AMI eliminates the per-job dependency installation step, reducing cold-start time.
The AMI is built with Packer from infra/packer/. It:
- Installs system libraries, build tools (`gcc`, `g++`, `cmake`), and common utilities.
- Installs Docker CE, Docker Compose v2, and Docker Buildx.
- Installs Python 3 + pip, Node.js 22 LTS + npm, and Go 1.23.x.
- Installs cloud tools: AWS CLI v2, kubectl, and Helm 3.
- Installs CLI tools: `gh`, `jq`, `yq`, `git-lfs`, `yamllint`, and more.
- Creates a dedicated `runner` OS user and downloads the GitHub Actions runner agent to `/home/runner/actions-runner/`.
- Writes a version marker at `/opt/jit-runner-prebaked` and a tool manifest at `/opt/jit-runner-manifest.txt`.
At instance launch, the user-data script checks for /opt/jit-runner-prebaked. If the file exists and the version matches the requested runner version, dependency installation and user creation are skipped. If the version differs, only the runner binary is re-downloaded. Stock AMIs (no marker file) still work as before.
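That boot-time check reduces to a three-way decision. A minimal Go sketch, where the action names and function are illustrative (the real logic lives in the user-data shell script):

```go
package main

import "fmt"

// prebakeAction decides what the user-data script does at boot.
// markerVersion is the contents of /opt/jit-runner-prebaked, or ""
// when the marker file is absent (stock AMI).
func prebakeAction(markerVersion, wantedRunnerVersion string) string {
	switch {
	case markerVersion == "":
		return "full-install" // stock AMI: install deps, create user, fetch runner
	case markerVersion == wantedRunnerVersion:
		return "skip" // pre-baked and current: reuse everything
	default:
		return "redownload-runner" // version drift: refresh the runner binary only
	}
}

func main() {
	fmt.Println(prebakeAction("", "2.332.0"))        // full-install
	fmt.Println(prebakeAction("2.332.0", "2.332.0")) // skip
	fmt.Println(prebakeAction("2.320.0", "2.332.0")) // redownload-runner
}
```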
The AMI is published publicly to the AWS Community AMI catalog (ami_groups = ["all"]) with name pattern jit-runner-{jit_runners_version}-runner{runner_version}-{timestamp} (example: jit-runner-v0.3.0-runner2.332.0-1773472793). It can be distributed to multiple regions: us-east-1, us-west-1, us-west-2, eu-west-1, eu-west-2, eu-west-3, eu-central-1, eu-north-1, sa-east-1.
```sh
# Validate Packer template
make ami.validate

# Build public AMI in us-east-2 only (version auto-detected from git)
make ami.build

# Build private test AMI (not published to Community AMI catalog)
make ami.build-test

# Build and copy to all distribution regions (US, EU, SA)
make ami.build-distribute

# Copy an existing AMI to all distribution regions
make ami.copy AMI_ID=ami-xxxxxxxx
```
You can also trigger an AMI build from GitHub Actions via the ami-build.yml workflow. Inputs: `runner_version`, `go_version`, `node_major_version`, `jit_runners_version` (auto-detected from git tags if empty), `extra_script`, `distribute`. The workflow uses OIDC (the `AMI_BUILD_ROLE_ARN` secret), auto-triggers on pushes to infra/packer/**, and also runs on pull requests targeting infra/packer/**; PR builds create private, single-region AMIs that are automatically cleaned up after the build. The workflow runs on GitHub-hosted runners (ubuntu-latest) because the self-hosted runner security group blocks SSH egress (port 22), which Packer requires to reach the build instance, and because using self-hosted runners would create a circular dependency on the infrastructure being built.
Choose the deployment path that matches your tooling:
- OpenTofu / Terraform: Follow docs/getting-started-terraform.md to deploy the full AWS stack with HCL modules in infra/terraform/.
- CloudFormation: Follow docs/getting-started-cloudformation.md to deploy using the SAM/CloudFormation template in infra/cloudformation/template.yaml.
Both guides assume a GitHub App is already configured. If you have not set one up yet, start with docs/github-app-setup.md.
