diff --git a/CHANGELOG.md b/CHANGELOG.md index 7dd9d44..c8a7370 100644 --- a/CHANGELOG.md +++ b/CHANGELOG.md @@ -17,7 +17,25 @@ pins that tag, so this file is the human-readable answer to "what's in v0.2.0?". ## [Unreleased] -_Nothing yet._ +### Changed +- **`compute-engine` now manages many VMs, and no longer hard-codes environment identity.** + - **Multi-VM:** the single hard-coded `google_compute_instance` is replaced by an `instances` + map fanned out with `for_each` (mirrors `github`'s `repositories` pattern) — add a map key to + add a VM. VM name is `-` (the map key is the name, env-prefixed — + e.g. `postiz` → `dev-postiz`). Per-VM spec (`machine_type`, `boot_image`, `boot_disk_size_gb`, `zone`, + `assign_public_ip`, `startup_script`, `network_tags`) moved from top-level vars into each map + entry, all `optional(...)` with cost-safe defaults. IAM grants fan out **member × VM** via + `setproduct` on stable keys (no reindex churn). Outputs collapse to a single `instances` map + keyed by VM key (the map key, not the full `-` VM name) — each value carries `name`, + `instance_id`, `internal_ip`, `zone`, `ssh_command`. + - **Env identity out of the module:** `project_id` lost its dev-project default and is now + **required** (a forgotten value fails loudly instead of silently provisioning into the wrong + project); `access_members` now defaults to `[]` (no SSH) instead of two named engineers. A + reusable module should know *how* to build a VM, not *where* or *who* — that belongs at the + call site. Cost-safe `how` defaults (`e2-micro`, `debian-12`, 20 GB) are unchanged. + - ⚠️ **Breaking:** existing state re-keys `google_compute_instance.this` → `this[""]` (IAM + members too) — consumers must `terragrunt state mv` or recreate. Consumers must now set + `project_id` explicitly and list `access_members` (the empty default grants no access). 
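The re-keying described in the breaking note can also be expressed declaratively. The following is a minimal sketch, not the migration path this change prescribes: it swaps Terraform `moved` blocks in for the `terragrunt state mv` route, injected through Terragrunt's `generate` block. The generated file name `moved.tf` and the member `user:alice@example.com` are hypothetical. The key `compute-engine` is chosen deliberately: the old VM name was `${var.global.environment_name}-compute-engine` and the new one is `${var.global.environment_name}-${each.key}`, so any other key renames the VM, which would be expected to force a replacement even after the state address is moved.

```hcl
# terragrunt.hcl of the consuming compute-engine unit (sketch; values are hypothetical).
# Writes moved.tf into the module's working directory so Terraform re-addresses
# existing resources instead of destroying and recreating them.
generate "moved" {
  path      = "moved.tf"
  if_exists = "overwrite"
  contents  = <<-EOF
    # Old single-VM address -> new for_each address.
    moved {
      from = google_compute_instance.this
      to   = google_compute_instance.this["compute-engine"]
    }

    # IAM keys change from "<member>" to "<vm key>:<member>".
    # Repeat this pair of blocks once per entry in access_members.
    moved {
      from = google_compute_instance_iam_member.os_login["user:alice@example.com"]
      to   = google_compute_instance_iam_member.os_login["compute-engine:user:alice@example.com"]
    }

    moved {
      from = google_iap_tunnel_instance_iam_member.tunnel["user:alice@example.com"]
      to   = google_iap_tunnel_instance_iam_member.tunnel["compute-engine:user:alice@example.com"]
    }
  EOF
}

inputs = {
  # project_id, network, subnetwork, and global are omitted here for brevity.
  access_members = ["user:alice@example.com"]

  instances = {
    # Empty object: every per-VM field falls back to the module defaults.
    "compute-engine" = {}
  }
}
```

Once a plan shows the moves applied and no replacements, the `generate` block can be deleted. If the previous deployment overrode any of the old top-level variables (`machine_type`, `zone`, and so on), those values would need to be carried into the map entry to keep the plan free of diffs.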
## [0.4.0] - 2026-06-15 diff --git a/README.md b/README.md index f6e3112..4444164 100644 --- a/README.md +++ b/README.md @@ -28,7 +28,7 @@ and **[CHANGELOG.md](./CHANGELOG.md)** for what changed in each tagged version. | Component | Cloud | Purpose | Key outputs | | ---------------- | ------ | ------------------------------------------------ | ----------------------------------------------- | | `network` | GCP | Network foundation — wraps CFT network + cloud-router modules | `network_self_link`, `subnetwork_self_link`, `ssh_tag` | -| `compute-engine` | GCP | VM (bootstrap-agnostic); OS Login + IAP access, no public IP | `instance_name`, `internal_ip`, `ssh_command` | +| `compute-engine` | GCP | One or more VMs (`instances` map, bootstrap-agnostic); OS Login + IAP access, no public IP | `instances` (map keyed by VM key) | | `github` | GitHub | GitHub repositories as code (repo factory) | `repository_names`, `repository_urls` | `network` and `compute-engine` form a dependency chain: diff --git a/compute-engine/README.md b/compute-engine/README.md index 23e79bd..4badb57 100644 --- a/compute-engine/README.md +++ b/compute-engine/README.md @@ -1,21 +1,28 @@ # compute-engine -A **GCP Compute Engine VM with Docker** installed on first boot. The VM has **no external IP** — -access follows the same model as AWS SSM Session Manager: **OS Login + IAP TCP forwarding**, where -you connect through Google's infrastructure with your own identity and short-lived, -IAM-governed keys (no public SSH port, no key files to manage). +**One or more GCP Compute Engine VMs**, defined as a map (`instances`) and fanned out with +`for_each` — add a key to add a VM. Each VM has **no external IP** by default; access follows the +same model as AWS SSM Session Manager: **OS Login + IAP TCP forwarding**, where you connect through +Google's infrastructure with your own identity and short-lived, IAM-governed keys (no public SSH +port, no key files to manage). 
The module is **bootstrap-agnostic** — it runs whatever +`startup_script` each VM passes (e.g. a Docker install), but owns no userdata itself. ## What it creates +For each entry in `instances` (keyed by a short name — that key also keys the `instances` **output**; the VM's GCP name is `-`, so a `postiz` key → output `instances["postiz"]` whose `.name` is `dev-postiz`): + - `google_compute_instance` — the VM. No external IP by default; `enable-oslogin = TRUE` so SSH - access is governed by IAM, not project metadata keys. Runs the caller-supplied `startup_script` - on first boot (empty string = no bootstrap). **This module is bootstrap-agnostic** — the actual - first-boot script (e.g. installing Docker) is **userdata owned by the consuming environment**, - not baked into the module. See "Bootstrap / userdata" below. -- `google_compute_instance_iam_member` (per member) — grants **`roles/compute.osLogin`** to each - `access_members` principal, letting them log in over SSH. -- `google_iap_tunnel_instance_iam_member` (per member) — grants **`roles/iap.tunnelResourceAccessor`** - to each `access_members` principal, letting them open an IAP tunnel to the VM. + access is governed by IAM, not project metadata keys. Runs that entry's `startup_script` on first + boot (empty string = no bootstrap). **This module is bootstrap-agnostic** — the actual first-boot + script (e.g. installing Docker) is **userdata owned by the consuming environment**, not baked into + the module. See "Bootstrap / userdata" below. +- `google_compute_instance_iam_member` — grants **`roles/compute.osLogin`** to each `access_members` + principal on every VM, letting them log in over SSH. +- `google_iap_tunnel_instance_iam_member` — grants **`roles/iap.tunnelResourceAccessor`** to each + `access_members` principal on every VM, letting them open an IAP tunnel to it. + +IAM grants fan out as **member × VM** (via `setproduct`) on stable keys, so adding a VM or a member +never reindexes existing grants. 
Inbound SSH from the IAP range is allowed by the **`network` component's** `allow-iap-ssh` firewall rule, and outbound internet (for the Docker install) comes from the `network`'s Cloud NAT — so this module @@ -37,7 +44,7 @@ gcloud compute ssh \ --zone --tunnel-through-iap ``` -The `ssh_command` output prints this for you. To grant someone access, add their principal to +Each `instances[].ssh_command` output prints this for you. To grant someone access, add their principal to `access_members` (e.g. `user:alice@officialdad.com`, `serviceAccount:ci@.iam.gserviceaccount.com`) and re-apply — no keys to distribute, and access is revoked the moment IAM is removed. @@ -48,21 +55,28 @@ caller passes, on first boot, as the instance's `metadata_startup_script`. Pass for a plain VM. The **consuming environment owns the bootstrap**. In `infra-environments-dev` the Docker install -lives as a script file (e.g. `compute-engine/userdata/docker-bootstrap.sh`) and a switch in the -unit's `terragrunt.hcl` decides whether to pass it: +lives as a script file (`compute-engine/userdata/docker-bootstrap.sh`), read with `file()` and +passed as that VM's `startup_script` inside the `instances` map: ```hcl -locals { - install_docker = true -} inputs = { - startup_script = local.install_docker ? file("${get_terragrunt_dir()}/userdata/docker-bootstrap.sh") : "" + network = dependency.network.outputs.network_self_link + subnetwork = dependency.network.outputs.subnetwork_self_link + + instances = { + dev = { + machine_type = "e2-micro" + network_tags = [dependency.network.outputs.ssh_tag] + startup_script = file("${get_terragrunt_dir()}/userdata/docker-bootstrap.sh") + } + } } ``` -This keeps the cookbook module generic (any VM, any bootstrap) and puts the "what runs on boot" -decision where environment-specific choices belong. A Docker bootstrap needs outbound internet -(apt + docker.com), which the `network`'s Cloud NAT provides to this otherwise-private VM. 
+Leave `startup_script` unset (or `""`) for a plain VM. This keeps the cookbook module generic (any +VM, any bootstrap) and puts the "what runs on boot" decision where environment-specific choices +belong. A Docker bootstrap needs outbound internet (apt + docker.com), which the `network`'s Cloud +NAT provides to these otherwise-private VMs. ## Auth @@ -73,29 +87,39 @@ must be enabled on the project. ## Inputs -| Name | Type | Default | Description | -| ------------------- | ------------ | ------------------------ | ------------------------------------------------------------------------ | -| `global` | object | — | Env-wide context (`environment_name`, `deploy_region`, `tags`). | -| `project_id` | string | — | GCP project the VM is created in. | -| `network` | string | — | Network self link / name (from `network.network_self_link`). | -| `subnetwork` | string | — | Subnetwork self link / name (from `network.subnetwork_self_link`). | -| `zone` | string | `""` | Zone for the VM. Empty → `"-a"`. | -| `machine_type` | string | `e2-micro` | Machine type. | -| `boot_image` | string | `debian-cloud/debian-12` | Boot image (`project/family` or full self link). | -| `boot_disk_size_gb` | number | `20` | Boot disk size in GB. | -| `startup_script` | string | `""` | First-boot script (userdata). Empty = no bootstrap. Supplied by the env. | -| `assign_public_ip` | bool | `false` | Attach an ephemeral external IP. Leave `false` for the IAP-only model. | -| `access_members` | list(string) | `[]` | IAM principals granted OS Login + IAP tunnel access (see access model). | +| Name | Type | Default | Description | +| ---------------- | ------------ | ------- | ------------------------------------------------------------------------------------------------ | +| `global` | object | — | Env-wide context (`environment_name`, `deploy_region`, `tags`). | +| `project_id` | string | — | **Required.** GCP project the VMs are created in. Set per environment (see note below). 
| +| `network` | string | — | Network self link / name (from `network.network_self_link`). Shared by all VMs. | +| `subnetwork` | string | — | Subnetwork self link / name (from `network.subnetwork_self_link`). Shared by all VMs. | +| `access_members` | list(string) | `[]` | IAM principals granted OS Login + IAP access on **every** VM. Empty = no SSH access (see note). | +| `instances` | map(object) | `{}` | VMs to create, keyed by short name. Per-VM fields below; each entry overrides only what it needs. | + +Per-VM fields inside each `instances` entry (all optional): + +| Field | Default | Description | +| ------------------- | ------------------------ | ----------------------------------------------------------------------- | +| `machine_type` | `e2-micro` | Machine type. | +| `boot_image` | `debian-cloud/debian-12` | Boot image (`project/family` or full self link). | +| `boot_disk_size_gb` | `20` | Boot disk size in GB. | +| `zone` | `""` | Zone. Empty → `"-a"`. | +| `assign_public_ip` | `false` | Attach an ephemeral external IP. Leave `false` for the IAP-only model. | +| `startup_script` | `""` | First-boot script (userdata). Empty = no bootstrap. Supplied by the env. | +| `network_tags` | `[]` | Firewall tags (e.g. `[network.ssh_tag]`). Empty = no tag-scoped inbound. | + +> **Why `project_id` is required and `access_members` defaults to `[]`:** a reusable module should +> know *how* to build a VM, never *where* or *who* — that's environment identity, owned by the call +> site. `project_id` has no safe default (defaulting to one env's project risks another env silently +> provisioning into it), so it's required and fails loudly when unset. `access_members` defaults to +> "no access" (safe when forgotten) rather than a hard-coded person. The cost-safe *how* knobs +> (`e2-micro`, `debian-12`, 20 GB) keep defaults, since a forgotten value there is harmless. 
## Outputs -| Name | Description | -| --------------- | ---------------------------------------------------------------- | -| `instance_name` | The VM name (`-compute-engine`). | -| `instance_id` | The instance ID. | -| `internal_ip` | The VM's internal IP. | -| `zone` | The zone the VM runs in. | -| `ssh_command` | Ready-to-run `gcloud compute ssh ... --tunnel-through-iap` line. | +| Name | Type | Description | +| ----------- | ----------- | ------------------------------------------------------------------------------------------------------------------------------------------------------------------------ | +| `instances` | map(object) | Per-VM details keyed by instance key. Each value has `name`, `instance_id`, `internal_ip`, `zone`, and a ready-to-run `ssh_command` (`gcloud compute ssh … --tunnel-through-iap`). | ## Dependencies diff --git a/compute-engine/terraform/main.tf b/compute-engine/terraform/main.tf index 435896b..f39894b 100644 --- a/compute-engine/terraform/main.tf +++ b/compute-engine/terraform/main.tf @@ -4,24 +4,28 @@ provider "google" { } locals { - name_prefix = "${var.global.environment_name}-compute-engine" - - # Default to zone "a" in the deploy region unless the caller pins one. - zone = var.zone != "" ? var.zone : "${var.global.deploy_region}-a" - # GCP labels must be lowercase; only this subset of the tags convention maps cleanly. labels = { environment = lower(var.global.environment_name) managed_by = "terraform" } + + # member x instance -> one stable key per pair, so for_each never reindexes + # when a VM or a member is added/removed. 
+ vm_access = { + for pair in setproduct(keys(var.instances), var.access_members) : + "${pair[0]}:${pair[1]}" => { instance = pair[0], member = pair[1] } + } } resource "google_compute_instance" "this" { - name = local.name_prefix - machine_type = var.machine_type - zone = local.zone + for_each = var.instances + + name = "${var.global.environment_name}-${each.key}" + machine_type = each.value.machine_type + zone = each.value.zone != "" ? each.value.zone : "${var.global.deploy_region}-a" labels = local.labels - tags = var.network_tags + tags = each.value.network_tags # Destroy-friendly: no accidental lock, and changing machine_type stops the VM # instead of forcing a full recreate. @@ -31,8 +35,8 @@ resource "google_compute_instance" "this" { boot_disk { auto_delete = true # disk is deleted with the VM -> no orphaned disk cost initialize_params { - image = var.boot_image - size = var.boot_disk_size_gb + image = each.value.boot_image + size = each.value.boot_disk_size_gb } } @@ -42,7 +46,7 @@ resource "google_compute_instance" "this" { # Emitting access_config = an external IP. Omit it (default) = no public IP. dynamic "access_config" { - for_each = var.assign_public_ip ? [1] : [] + for_each = each.value.assign_public_ip ? [1] : [] content {} } } @@ -52,25 +56,25 @@ resource "google_compute_instance" "this" { } # Caller-supplied userdata; null (unset) when empty. - metadata_startup_script = var.startup_script != "" ? var.startup_script : null + metadata_startup_script = each.value.startup_script != "" ? each.value.startup_script : null } -# OS Login: who may SSH in (identity-based). +# OS Login: who may SSH in (identity-based). Every member on every VM. 
resource "google_compute_instance_iam_member" "os_login" { - for_each = toset(var.access_members) + for_each = local.vm_access project = var.project_id - zone = local.zone - instance_name = google_compute_instance.this.name + zone = google_compute_instance.this[each.value.instance].zone + instance_name = google_compute_instance.this[each.value.instance].name role = "roles/compute.osLogin" - member = each.value + member = each.value.member } -# IAP: who may open the tunnel that reaches the (private) VM's SSH port. +# IAP: who may open the tunnel that reaches each (private) VM's SSH port. resource "google_iap_tunnel_instance_iam_member" "tunnel" { - for_each = toset(var.access_members) + for_each = local.vm_access project = var.project_id - zone = local.zone - instance = google_compute_instance.this.name + zone = google_compute_instance.this[each.value.instance].zone + instance = google_compute_instance.this[each.value.instance].name role = "roles/iap.tunnelResourceAccessor" - member = each.value + member = each.value.member } diff --git a/compute-engine/terraform/outputs.tf b/compute-engine/terraform/outputs.tf index accd214..5603b2e 100644 --- a/compute-engine/terraform/outputs.tf +++ b/compute-engine/terraform/outputs.tf @@ -1,24 +1,12 @@ -output "instance_name" { - value = google_compute_instance.this.name - description = "The VM name." -} - -output "instance_id" { - value = google_compute_instance.this.instance_id - description = "The instance ID." -} - -output "internal_ip" { - value = google_compute_instance.this.network_interface[0].network_ip - description = "The VM's internal IP." -} - -output "zone" { - value = google_compute_instance.this.zone - description = "The zone the VM runs in." -} - -output "ssh_command" { - value = "gcloud compute ssh ${google_compute_instance.this.name} --zone ${google_compute_instance.this.zone} --project ${var.project_id} --tunnel-through-iap" - description = "Ready-to-run IAP SSH command." 
+output "instances" { + value = { + for k, vm in google_compute_instance.this : k => { + name = vm.name + instance_id = vm.instance_id + internal_ip = vm.network_interface[0].network_ip + zone = vm.zone + ssh_command = "gcloud compute ssh ${vm.name} --zone ${vm.zone} --project ${var.project_id} --tunnel-through-iap" + } + } + description = "Per-instance details keyed by instance key: name, instance_id, internal_ip, zone, and a ready-to-run IAP ssh_command." } diff --git a/compute-engine/terraform/variables.tf b/compute-engine/terraform/variables.tf index 070873d..edae0be 100644 --- a/compute-engine/terraform/variables.tf +++ b/compute-engine/terraform/variables.tf @@ -9,63 +9,40 @@ variable "global" { variable "project_id" { type = string - description = "GCP project the VM is created in." + description = "GCP project the VMs are created in. Required — set per environment so a forgotten value fails loudly instead of silently landing resources in the wrong project (e.g. prod into dev)." } variable "network" { type = string - description = "Network self link or name (from network.network_self_link)." + description = "Network self link or name (from network.network_self_link). Shared by all instances." } variable "subnetwork" { type = string - description = "Subnetwork self link or name (from network.subnetwork_self_link)." -} - -variable "zone" { - type = string - description = "Zone for the VM. Empty string -> \"-a\"." - default = "" -} - -variable "machine_type" { - type = string - description = "Machine type." - default = "e2-micro" -} - -variable "boot_image" { - type = string - description = "Boot image as project/family or a full self link." - default = "debian-cloud/debian-12" -} - -variable "boot_disk_size_gb" { - type = number - description = "Boot disk size in GB." - default = 20 -} - -variable "startup_script" { - type = string - description = "First-boot script (userdata) run via metadata_startup_script. \"\" = no bootstrap." 
- default = "" -} - -variable "assign_public_ip" { - type = bool - description = "Attach an ephemeral external IP. Leave false for the IAP-only model." - default = false + description = "Subnetwork self link or name (from network.subnetwork_self_link). Shared by all instances." } variable "access_members" { type = list(string) - description = "IAM principals granted OS Login + IAP tunnel access (e.g. user:me@x.com)." + description = "IAM principals (user:/group:/serviceAccount:) granted OS Login + IAP tunnel access on EVERY VM. Empty = no SSH access; each environment opts in its own people." default = [] } -variable "network_tags" { - type = list(string) - description = "Network tags applied to the VM. Each tag opts the VM into the VPC firewall rules that target it (e.g. [module.network.ssh_tag] to allow IAP SSH). Empty = no tag-scoped inbound." - default = [] +variable "instances" { + type = map(object({ + machine_type = optional(string, "e2-micro") + boot_image = optional(string, "debian-cloud/debian-12") + boot_disk_size_gb = optional(number, 20) + zone = optional(string, "") + assign_public_ip = optional(bool, false) + startup_script = optional(string, "") + network_tags = optional(list(string), []) + })) + description = "VMs to create, keyed by short name. Each entry overrides only the fields it needs; the rest take module defaults. VM name = \"-\"." + default = {} + + validation { + condition = alltrue([for k in keys(var.instances) : can(regex("^[a-z][a-z0-9-]{0,61}[a-z0-9]$|^[a-z]$", k))]) + error_message = "Each instances key must be RFC1035: lowercase letter first, then lowercase/digits/hyphens, no trailing hyphen, ≤63 chars." + } }
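Because the five scalar outputs collapse into a single `instances` map, downstream units that read `instance_name`, `internal_ip`, or `ssh_command` need to index by VM key instead. The following is a minimal sketch of a hypothetical downstream `terragrunt.hcl`: the dependency label `compute_engine`, the relative `config_path`, and the input names `app_host` and `vm_names` are assumptions for illustration, and the `postiz` key is borrowed from the changelog example.

```hcl
# Hypothetical downstream unit that consumes the compute-engine outputs.
dependency "compute_engine" {
  config_path = "../compute-engine" # assumed location of the compute-engine unit
}

inputs = {
  # Before this change: dependency.compute_engine.outputs.internal_ip
  # After: index the map by VM key (the map key, not the env-prefixed VM name).
  app_host = dependency.compute_engine.outputs.instances["postiz"].internal_ip

  # Collect the env-prefixed GCP names of every VM the unit created.
  vm_names = [for key, vm in dependency.compute_engine.outputs.instances : vm.name]
}
```

Indexing with a key that is absent from `instances` fails at evaluation time, so a renamed or removed VM surfaces as an error in the consuming unit rather than as an empty value.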