3 changes: 2 additions & 1 deletion README.md
@@ -99,7 +99,7 @@ See [Performance](docs/performance.md) for detailed timings and cost breakdowns.
 - **EventBridge scope**: Captures all EC2 state changes in the account; Lambda filters by VPC ID.
 - **Startup delay**: First workload in an idle AZ waits ~10 seconds for internet. Design scripts to retry outbound connections.
 - **Dual ENI**: Persistent public + private ENIs survive stop/start cycles.
-- **DLQ**: Failed Lambda invocations go to an SQS dead letter queue.
+- **Retries**: Failed Lambda invocations are retried up to 2 times by EventBridge.
 - **Clean destroy**: A cleanup action terminates NAT instances before `terraform destroy` removes ENIs.
 - **Config versioning**: Changing AMI or instance type auto-replaces NAT instances on next workload event.
 - **EC2 events only**: Currently nat-zero responds only to EC2 instance state changes. If you have a use case for other event sources (ECS tasks, Lambda, etc.), PRs are welcome.
@@ -163,6 +163,7 @@ No modules.
 | <a name="input_custom_ami_name_pattern"></a> [custom\_ami\_name\_pattern](#input\_custom\_ami\_name\_pattern) | AMI name pattern when use\_fck\_nat\_ami is false | `string` | `null` | no |
 | <a name="input_custom_ami_owner"></a> [custom\_ami\_owner](#input\_custom\_ami\_owner) | AMI owner account ID when use\_fck\_nat\_ami is false | `string` | `null` | no |
 | <a name="input_enable_logging"></a> [enable\_logging](#input\_enable\_logging) | Create a CloudWatch log group for the Lambda function | `bool` | `true` | no |
+| <a name="input_encrypt_root_volume"></a> [encrypt\_root\_volume](#input\_encrypt\_root\_volume) | Encrypt the root EBS volume. | `bool` | `true` | no |
 | <a name="input_ignore_tag_key"></a> [ignore\_tag\_key](#input\_ignore\_tag\_key) | Tag key used to mark instances the Lambda should ignore | `string` | `"nat-zero:ignore"` | no |
 | <a name="input_ignore_tag_value"></a> [ignore\_tag\_value](#input\_ignore\_tag\_value) | Tag value used to mark instances the Lambda should ignore | `string` | `"true"` | no |
 | <a name="input_instance_type"></a> [instance\_type](#input\_instance\_type) | Instance type for the NAT instance | `string` | `"t4g.nano"` | no |
2 changes: 1 addition & 1 deletion docs/architecture.md
@@ -156,7 +156,7 @@ Each NAT instance uses two ENIs to separate public and private traffic:

 ## Config Versioning

-The Lambda tags each NAT instance with a `ConfigVersion` hash derived from AMI, instance type, market type, and volume size.
+The Lambda tags each NAT instance with a `ConfigVersion` hash derived from AMI, instance type, market type, volume size, and encryption setting.

 When the reconciler detects an outdated NAT, replacement takes two events (following the "one action per invocation" pattern):

14 changes: 14 additions & 0 deletions docs/examples.md
@@ -118,6 +118,20 @@ module "nat_zero" {
 }
 ```

+## Disable Root Volume Encryption
+
+The root EBS volume is encrypted by default. To disable encryption (e.g., for environments without compliance requirements):
+
+```hcl
+module "nat_zero" {
+  source = "github.com/MachineDotDev/nat-zero"
+
+  # ... required variables ...
+
+  encrypt_root_volume = false
+}
+```
+
 ## Building Lambda Locally

 For development or if you want to build from source:
1 change: 1 addition & 0 deletions docs/reference.md
@@ -56,6 +56,7 @@ No modules.
 | <a name="input_custom_ami_name_pattern"></a> [custom\_ami\_name\_pattern](#input\_custom\_ami\_name\_pattern) | AMI name pattern when use\_fck\_nat\_ami is false | `string` | `null` | no |
 | <a name="input_custom_ami_owner"></a> [custom\_ami\_owner](#input\_custom\_ami\_owner) | AMI owner account ID when use\_fck\_nat\_ami is false | `string` | `null` | no |
 | <a name="input_enable_logging"></a> [enable\_logging](#input\_enable\_logging) | Create a CloudWatch log group for the Lambda function | `bool` | `true` | no |
+| <a name="input_encrypt_root_volume"></a> [encrypt\_root\_volume](#input\_encrypt\_root\_volume) | Encrypt the root EBS volume. | `bool` | `true` | no |
 | <a name="input_ignore_tag_key"></a> [ignore\_tag\_key](#input\_ignore\_tag\_key) | Tag key used to mark instances the Lambda should ignore | `string` | `"nat-zero:ignore"` | no |
 | <a name="input_ignore_tag_value"></a> [ignore\_tag\_value](#input\_ignore\_tag\_value) | Tag value used to mark instances the Lambda should ignore | `string` | `"true"` | no |
 | <a name="input_instance_type"></a> [instance\_type](#input\_instance\_type) | Instance type for the NAT instance | `string` | `"t4g.nano"` | no |
10 changes: 5 additions & 5 deletions docs/testing.md
@@ -22,22 +22,22 @@ The test uses [Terratest](https://terratest.gruntwork.io/) with a single `terraf

 1. Deploy fixture (private subnet + nat-zero module in default VPC)
 2. Launch workload instance in private subnet
-3. Invoke Lambda → creates NAT instance
+3. EventBridge fires workload state change → Lambda creates NAT instance
 4. Wait for NAT running with EIP attached
 5. Verify workload's egress IP matches NAT's EIP

 ### Phase 2: Scale-Down

 1. Terminate workload
-2. Invoke Lambda → stops NAT
+2. EventBridge fires workload terminated → Lambda stops NAT
 3. Wait for NAT stopped
-4. Invoke Lambda → releases EIP
+4. EventBridge fires NAT stopped → Lambda releases EIP
 5. Verify no EIPs remain

 ### Phase 3: Restart

 1. Launch new workload
-2. Invoke Lambda → restarts stopped NAT
+2. EventBridge fires workload state change → Lambda restarts stopped NAT
 3. Wait for NAT running with new EIP
 4. Verify connectivity
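
The three phases above suggest a reconciler that looks at the current state and takes at most one action per event. The sketch below models that decision in Go under simplifying assumptions: the type names, state names, and the `decide` function are hypothetical and are not taken from the nat-zero Lambda source. Transitional states (pending, stopping) and EIP attachment on start are deliberately left out.

```go
package main

import "fmt"

// NATState and Action are illustrative types, not nat-zero's real ones.
type NATState string
type Action string

const (
	NATNone    NATState = "none"
	NATRunning NATState = "running"
	NATStopped NATState = "stopped"

	ActionNone       Action = "none"
	ActionCreateNAT  Action = "create-nat"
	ActionStartNAT   Action = "start-nat"
	ActionStopNAT    Action = "stop-nat"
	ActionReleaseEIP Action = "release-eip"
)

// decide returns at most one action for the observed state, mirroring the
// "one action per invocation" pattern described in the docs.
func decide(workloads int, nat NATState, eipAllocated bool) Action {
	switch {
	case workloads > 0 && nat == NATNone:
		return ActionCreateNAT // Phase 1: first workload appears
	case workloads > 0 && nat == NATStopped:
		return ActionStartNAT // Phase 3: workload returns, NAT is stopped
	case workloads == 0 && nat == NATRunning:
		return ActionStopNAT // Phase 2: last workload terminated
	case workloads == 0 && nat == NATStopped && eipAllocated:
		return ActionReleaseEIP // Phase 2: NAT stopped event
	default:
		return ActionNone
	}
}

func main() {
	fmt.Println(decide(1, NATNone, false))    // scale-up
	fmt.Println(decide(0, NATRunning, true))  // scale-down, first event
	fmt.Println(decide(0, NATStopped, true))  // scale-down, second event
	fmt.Println(decide(1, NATStopped, false)) // restart
}
```

Because each call returns a single action, scale-down needs two events (stop, then release), which matches steps 2 and 4 of Phase 2.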

@@ -64,4 +64,4 @@ Integration tests run in GitHub Actions when the `integration-test` label is add

 ## Config Version Replacement

-The Lambda tags NAT instances with a `ConfigVersion` hash (AMI + instance type + market type + volume size). When the config changes and a workload triggers reconciliation, the Lambda terminates the outdated NAT and creates a replacement. The integration test doesn't exercise this path directly, but it's covered by unit tests.
+The Lambda tags NAT instances with a `ConfigVersion` hash (AMI + instance type + market type + volume size + encryption). When the config changes and a workload triggers reconciliation, the Lambda terminates the outdated NAT and creates a replacement. The integration test doesn't exercise this path directly, but it's covered by unit tests.
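
To illustrate why adding the encryption flag to the hash inputs forces a replacement, here is a minimal Go sketch of a config-version comparison. It assumes SHA-256 over a comma-joined list of fields; the diff only shows that a hash is derived from these fields, so the hash function, the separator, the struct, and both function names are assumptions. The AMI ID and volume size are placeholder values.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strconv"
	"strings"
)

// natConfig is a hypothetical struct holding the fields that the docs say
// feed the ConfigVersion hash.
type natConfig struct {
	AMI               string
	InstanceType      string
	MarketType        string
	VolumeSizeGB      int
	EncryptRootVolume bool
}

// configVersion hashes the fields in a fixed order. SHA-256 and the ","
// separator are assumptions made for this sketch.
func configVersion(c natConfig) string {
	parts := []string{
		c.AMI,
		c.InstanceType,
		c.MarketType,
		strconv.Itoa(c.VolumeSizeGB),
		strconv.FormatBool(c.EncryptRootVolume),
	}
	sum := sha256.Sum256([]byte(strings.Join(parts, ",")))
	return hex.EncodeToString(sum[:])
}

// needsReplacement reports whether the version tagged on a running NAT
// differs from the version of the desired config.
func needsReplacement(taggedVersion string, desired natConfig) bool {
	return taggedVersion != configVersion(desired)
}

func main() {
	current := natConfig{
		AMI:               "ami-0123456789abcdef0", // placeholder
		InstanceType:      "t4g.nano",
		MarketType:        "on-demand",
		VolumeSizeGB:      2, // placeholder
		EncryptRootVolume: true,
	}
	tagged := configVersion(current)

	desired := current
	desired.EncryptRootVolume = false

	fmt.Println(needsReplacement(tagged, current)) // false
	fmt.Println(needsReplacement(tagged, desired)) // true
}
```

Without the encryption field in the hash, toggling `encrypt_root_volume` would leave the tag unchanged and the existing NAT would keep its old volume settings.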
1 change: 1 addition & 0 deletions lambda.tf
@@ -80,6 +80,7 @@ resource "aws_lambda_function" "nat_zero" {
         var.instance_type,
         var.market_type,
         tostring(var.block_device_size),
+        tostring(var.encrypt_root_volume),
       ]))
     }
   }
2 changes: 1 addition & 1 deletion launch_template.tf
@@ -25,7 +25,7 @@ resource "aws_launch_template" "nat_launch_template" {
       volume_type = "gp3"
       iops = 3000
       throughput = 250
-      encrypted = true
+      encrypted = var.encrypt_root_volume
     }
   }

14 changes: 12 additions & 2 deletions tests/integration/fixture/main.tf
@@ -66,6 +66,11 @@ variable "nat_instance_type" {
   default = "t4g.nano"
 }

+variable "encrypt_root_volume" {
+  type    = bool
+  default = true
+}
+
 module "nat_zero" {
   source = "../../../"

@@ -78,8 +83,9 @@ module "nat_zero" {
   private_route_table_ids = [aws_route_table.private.id]
   private_subnets_cidr_blocks = [aws_subnet.private.cidr_block]

-  instance_type = var.nat_instance_type
-  market_type   = "on-demand"
+  instance_type       = var.nat_instance_type
+  market_type         = "on-demand"
+  encrypt_root_volume = var.encrypt_root_volume
 }

 output "vpc_id" {
@@ -97,3 +103,7 @@ output "lambda_function_name" {
 output "nat_security_group_ids" {
   value = module.nat_zero.nat_security_group_ids
 }
+
+output "encrypt_root_volume" {
+  value = var.encrypt_root_volume
+}
8 changes: 8 additions & 0 deletions tests/integration/nat_zero_test.go
@@ -71,9 +71,16 @@ func TestNatZero(t *testing.T) {
 		phases = append(phases, phase{name, d})
 		t.Logf("[TIMER] %-45s %s", name, d.Round(time.Millisecond))
 	}
+	var encryptRootVolume string
 	defer func() {
 		t.Log("")
 		t.Log("=== TIMING SUMMARY ===")
+		encryptLabel := "enabled"
+		if encryptRootVolume == "false" {
+			encryptLabel = "disabled"
+		}
+		t.Logf(" EBS Encryption: %s", encryptLabel)
+		t.Log("")
 		t.Logf(" %-45s %s", "PHASE", "DURATION")
 		t.Log(" " + strings.Repeat("-", 60))
 		var total time.Duration
@@ -121,6 +128,7 @@ func TestNatZero(t *testing.T) {
 	vpcID := terraform.Output(t, opts, "vpc_id")
 	privateSubnet := terraform.Output(t, opts, "private_subnet_id")
 	lambdaName := terraform.Output(t, opts, "lambda_function_name")
+	encryptRootVolume = terraform.Output(t, opts, "encrypt_root_volume")
 	t.Logf("VPC: %s, private subnet: %s, Lambda: %s", vpcID, privateSubnet, lambdaName)

 	// Terminate test workload instances before terraform destroy.
6 changes: 6 additions & 0 deletions variables.tf
@@ -62,6 +62,12 @@ variable "block_device_size" {
   description = "Size in GB of the root EBS volume"
 }

+variable "encrypt_root_volume" {
+  type        = bool
+  default     = true
+  description = "Encrypt the root EBS volume."
+}
+
 # AMI configuration
 variable "use_fck_nat_ami" {
   type = bool