Security

The Netlix Platform implements defense-in-depth security across infrastructure, secrets, network, and workload layers. This page documents all security controls and their rationale.

Credential Management

No static credentials are stored in code. All authentication uses OIDC workload identity, dynamic secrets, or encrypted variable sets.

Credential	Source	Delivery	Rotation
AWS access	OIDC identity token	`assume_role_with_web_identity`	Per-run (short-lived JWT)
HCP service principal	Variable set `netlix-hcp`	Ephemeral store reference	Manual
Vault authentication	OIDC identity token	`auth_login_jwt` (mount: `jwt-tfc`)	Per-run
Database credentials	Vault database engine	Dynamic secrets via VSO	Automatic (renewed at 67% of TTL)
TLS certificates	Vault PKI engine	VaultPKISecret CRD via VSO	Automatic (before expiry)
GitHub PAT	Deployment input	Sensitive variable	Manual
Datadog API key	Kubernetes secret	Manual `kubectl create secret`	Manual

OIDC Workload Identity Flow

HCP Terraform Run
    │
    ├── identity_token "aws"   ─── JWT (aud: aws.workload.identity)
    │       │
    │       └── AWS STS AssumeRoleWithWebIdentity
    │               │
    │               └── Temporary AWS credentials (scoped IAM role)
    │
    └── identity_token "vault" ─── JWT (aud: vault.workload.identity)
            │
            └── Vault JWT Auth (mount: jwt-tfc, role: tfc-stacks)
                    │
                    └── Vault token (TTL: 20min, max: 1hr)
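
The audience check at each branch of the flow above can be illustrated with a minimal sketch. It builds an unsigned, made-up token and reads its claims without verifying the signature, which AWS STS and Vault would never do in practice. The helper names and the `sub` value are hypothetical and exist only for this illustration.

```python
import base64
import json


def b64url_decode(segment: str) -> bytes:
    """Decode a base64url segment, restoring the padding that JWTs strip."""
    padding = "=" * (-len(segment) % 4)
    return base64.urlsafe_b64decode(segment + padding)


def unverified_claims(jwt: str) -> dict:
    """Return the payload claims of a JWT WITHOUT verifying its signature."""
    _header_b64, payload_b64, _signature_b64 = jwt.split(".")
    return json.loads(b64url_decode(payload_b64))


def audience_matches(claims: dict, expected: str) -> bool:
    """The aud claim may be a single string or a list of strings."""
    aud = claims.get("aud")
    return expected in aud if isinstance(aud, list) else aud == expected


def make_demo_token(claims: dict) -> str:
    """Build an unsigned token purely for demonstration (hypothetical helper)."""
    def enc(obj: dict) -> str:
        raw = json.dumps(obj).encode()
        return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()
    return f"{enc({'alg': 'none'})}.{enc(claims)}."


# Hypothetical claims: only the aud value comes from the flow above.
token = make_demo_token({"aud": "aws.workload.identity", "sub": "hypothetical-subject"})
claims = unverified_claims(token)
```

A token minted for the `aws` audience would pass the AWS check but not the Vault one, which is why the run requests two separate identity tokens.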

IAM Security

Bootstrap IAM Roles

The bootstrap creates per-environment IAM roles with scoped inline policies (not AdministratorAccess).

Allowed services:

SID	Actions	Purpose
Networking	`ec2:*`, `elasticloadbalancing:*`	VPC, subnets, NAT, security groups, ALBs
EKS	`eks:*`	Cluster, node groups, addons, OIDC
RDS	`rds:*`	Database instances, subnet groups
DNS	`route53:*`, `acm:*`	Hosted zones, records, certificates
IAM	`iam:*`	Roles, policies, OIDC providers (IRSA)
KMS	`kms:*`	Encryption keys for EKS/RDS
Observability	`logs:*`, `cloudwatch:*`, `sns:*`	Log groups, alarms, notifications
STS	`sts:GetCallerIdentity`, `sts:AssumeRole`	Identity verification

Not allowed: Lambda, DynamoDB, SQS, Redshift, S3 (except via EKS addons), and all other AWS services.

Role Trust Policy

Each role is locked to a specific TFC organization, project, and stack:

{
  "Condition": {
    "StringEquals": { "app.terraform.io:aud": "aws.workload.identity" },
    "StringLike": { "app.terraform.io:sub": "organization:tim-krebs-org:project:netlix-platform:stack:netlix-platform-dev:*" }
  }
}
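
To show how the two conditions combine, here is a rough Python model of the trust policy check. It is a sketch, not how IAM evaluates policies internally: `fnmatchcase` stands in for `StringLike`, which is close enough for a pattern that only uses `*`. The subject suffixes in the usage lines are hypothetical.

```python
from fnmatch import fnmatchcase

# Values copied from the trust policy condition above.
EXPECTED_AUD = "aws.workload.identity"
SUB_PATTERN = (
    "organization:tim-krebs-org:project:netlix-platform:"
    "stack:netlix-platform-dev:*"
)


def trust_conditions_met(aud: str, sub: str) -> bool:
    """Both conditions must hold: exact audience and wildcard subject match."""
    return aud == EXPECTED_AUD and fnmatchcase(sub, SUB_PATTERN)


# Hypothetical subjects: everything after the stack name is made up.
dev_sub = (
    "organization:tim-krebs-org:project:netlix-platform:"
    "stack:netlix-platform-dev:hypothetical-suffix"
)
staging_sub = dev_sub.replace("netlix-platform-dev", "netlix-platform-staging")
```

With these inputs, `trust_conditions_met(EXPECTED_AUD, dev_sub)` holds while the staging subject is rejected, so a staging run cannot assume the dev role.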

IRSA (IAM Roles for Service Accounts)

Kubernetes workloads that need AWS API access use IRSA:

Service Account	IAM Role	Permissions
ALB Controller	`netlix-{env}-lb-controller`	ELB, EC2 (target groups, listeners)
ExternalDNS	`netlix-{env}-external-dns`	Route53 (record sets)
EBS CSI Driver	`netlix-{env}-ebs-csi`	EC2 (volumes, snapshots)

Vault Security

Namespace Isolation

HCP Vault uses hierarchical namespaces to isolate environments:

admin/                          # Root admin namespace
├── dev/                        # Dev environment
│   ├── kubernetes auth         # K8s auth for dev cluster
│   ├── pki + pki_int          # PKI CAs for dev
│   ├── database               # Dynamic DB creds for dev
│   └── secret                 # KV secrets for dev
├── staging/                    # Staging environment
│   ├── kubernetes auth
│   ├── pki + pki_int
│   ├── database
│   └── secret
├── jwt-tfc auth               # Shared: TFC OIDC auth
└── userpass auth               # Shared: bootstrap admin

Vault Policies

Policy	Scope	Capabilities	Purpose
`vso-policy`	Per-environment	Read/list on `secret/*`, `database/*`, `pki_int/*`	VSO secret injection
`app-policy`	Per-environment	Read on `secret/data/netlix/*`, `database/creds/netlix-app`	Application pods
`tfc-policy`	Shared (admin)	Full access (`path "*"`)	TFC Stacks provisioning
`admin-policy`	Shared (admin)	Full access (`path "*"`)	Bootstrap admin (dev only)

Note on TFC policy: HCP Vault requires path "*" at the parent namespace for cross-namespace operations. Scoped paths (sys/mounts/*, auth/*, etc.) do not propagate to child namespaces (admin/dev, admin/staging). This is a known HCP Vault limitation, not a misconfiguration. See ADR-005.
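
The difference between the scoped `app-policy` paths and `path "*"` can be sketched with a simplified matcher. It handles only a trailing `*` (a prefix match) and exact paths; Vault's `+` segment wildcard and its rule priority are left out. The request paths in the usage lines are hypothetical.

```python
def path_matches(policy_path: str, request_path: str) -> bool:
    """Trailing '*' means prefix match; anything else must match exactly."""
    if policy_path.endswith("*"):
        return request_path.startswith(policy_path[:-1])
    return request_path == policy_path


# Paths taken from the app-policy row in the table above.
APP_POLICY_PATHS = ["secret/data/netlix/*", "database/creds/netlix-app"]


def app_policy_covers(request_path: str) -> bool:
    """True if any app-policy path covers the requested path."""
    return any(path_matches(p, request_path) for p in APP_POLICY_PATHS)
```

For example, `app_policy_covers("secret/data/netlix/config")` is true while `app_policy_covers("database/creds/other-role")` is false, whereas `path_matches("*", ...)` is true for every path, which is what makes `tfc-policy` and `admin-policy` unrestricted.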

Kubernetes Auth

HCP Vault authenticates Kubernetes service accounts via the TokenReview API:

A dedicated vault-token-reviewer ServiceAccount in kube-system is bound to the system:auth-delegator cluster role
A long-lived token is created for that ServiceAccount (required because Vault runs outside the cluster)
Vault uses this token to call the K8s TokenReview API and validate pod identities

Auth roles:

Role	Bound SAs	Bound Namespaces	Policy
`netlix-vso`	`vault-secrets-operator`, `vault-secrets-operator-controller-manager`	`vault-secrets-operator-system`, `consul`	`vso-policy`
`netlix-app`	`netlix-app`	`netlix`	`app-policy`
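
A login succeeds only when both the service account name and its namespace are bound to the role. The table above can be modeled as follows; this is an illustrative sketch of the binding logic, not Vault's implementation, and the `AuthRole` type and `policy_for` function are names invented here.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class AuthRole:
    bound_service_accounts: frozenset
    bound_namespaces: frozenset
    policy: str


# Bindings copied from the auth roles table above.
ROLES = {
    "netlix-vso": AuthRole(
        frozenset({"vault-secrets-operator",
                   "vault-secrets-operator-controller-manager"}),
        frozenset({"vault-secrets-operator-system", "consul"}),
        "vso-policy",
    ),
    "netlix-app": AuthRole(
        frozenset({"netlix-app"}),
        frozenset({"netlix"}),
        "app-policy",
    ),
}


def policy_for(role_name: str, service_account: str, namespace: str) -> Optional[str]:
    """Return the attached policy if the SA and namespace are both bound."""
    role = ROLES.get(role_name)
    if role is None:
        return None
    if (service_account in role.bound_service_accounts
            and namespace in role.bound_namespaces):
        return role.policy
    return None
```

Here `policy_for("netlix-app", "netlix-app", "netlix")` yields `app-policy`, while the same service account name in any other namespace yields nothing.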

Dynamic Database Credentials

Vault's database secrets engine generates short-lived PostgreSQL credentials:

App Pod → VSO → Vault (database/creds/netlix-app) → Temporary PG credentials

Creation statements create a role with specific grants
Default TTL: 1 hour
Max TTL: 24 hours
VSO renews automatically at 67% of the TTL
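
As a worked example of the numbers above, the renewal point for the default TTL can be computed directly. The sketch assumes renewal fires at exactly 67% of the TTL with no jitter; the function name is hypothetical.

```python
DEFAULT_TTL_SECONDS = 60 * 60        # Default TTL: 1 hour
MAX_TTL_SECONDS = 24 * 60 * 60       # Max TTL: 24 hours


def renewal_offset_seconds(ttl_seconds: int, renewal_percent: int = 67) -> int:
    """Seconds after issuance at which renewal happens (integer arithmetic)."""
    return ttl_seconds * renewal_percent // 100


offset = renewal_offset_seconds(DEFAULT_TTL_SECONDS)
minutes, seconds = divmod(offset, 60)
```

For the 1 hour default this gives 2412 seconds, so a credential is renewed about 40 minutes and 12 seconds after it is issued, leaving roughly 20 minutes of validity as a safety margin.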

Network Security

VPC Flow Logs

All VPC traffic is logged to CloudWatch:

Traffic captured: both ACCEPT and REJECT actions
Retention: configurable
CloudWatch metric filter on REJECT actions
Alarm: > 100 rejected connections in 5 minutes triggers SNS notification
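
The alarm condition above can be approximated in a few lines. This sketch assumes the default (version 2) flow log record format, where the action is the 13th space-separated field, and it counts records within one 5-minute window rather than reproducing CloudWatch metric filter semantics. The sample record values are made up.

```python
REJECT_THRESHOLD = 100   # alarm fires when rejects exceed this in 5 minutes
ACTION_FIELD = 12        # zero-based index of "action" in the default format


def is_reject(record: str) -> bool:
    """True if a default-format flow log record has action REJECT."""
    fields = record.split()
    return len(fields) > ACTION_FIELD and fields[ACTION_FIELD] == "REJECT"


def alarm_should_fire(records_in_window) -> bool:
    """Count REJECT records in one window and compare with the threshold."""
    rejects = sum(1 for record in records_in_window if is_reject(record))
    return rejects > REJECT_THRESHOLD


# Hypothetical sample records (account ID, ENI, and addresses are invented).
REJECTED = ("2 123456789012 eni-0123456789abcdef0 10.0.1.5 10.0.2.9 "
            "49152 5432 6 1 40 1700000000 1700000060 REJECT OK")
ACCEPTED = REJECTED.replace("REJECT", "ACCEPT")
```

Because the comparison is strictly greater than, a window with exactly 100 rejected connections does not trigger the SNS notification; the 101st does.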

Security Groups

Resource	Inbound	Outbound
EKS Control Plane	Worker nodes (443)	Worker nodes (all)
EKS Worker Nodes	Control plane (all), self (all)	All
RDS	EKS SG (5432), HVN CIDR (5432)	None
ALB	Internet (443, 80)	Worker nodes (target ports)

Kubernetes NetworkPolicies

The consul namespace has a default-deny-all policy with explicit allow rules:

Default Deny

spec:
  podSelector: {}         # Applies to ALL pods
  policyTypes:
    - Ingress
    - Egress

Web Pod Policy

Direction	Source/Destination	Ports	Purpose
Ingress	Any (ALB)	TCP/9090	HTTP traffic from ALB
Ingress	Any namespace	TCP/20200	Prometheus metrics scraping
Egress	`app=api` pods	TCP/8080	Upstream API calls
Egress	`kube-system` namespace	UDP+TCP/53	DNS resolution
Egress	`consul` namespace	TCP/8301,8502,20000	Consul control plane
Egress	`vault-secrets-operator-system`	TCP/8200	Vault access
Egress	`datadog` namespace	TCP/8126, UDP/8125	APM traces + DogStatsD
Egress	Same namespace pods	All	Envoy mesh traffic

API Pod Policy

Direction	Source/Destination	Ports	Purpose
Ingress	`app=web` pods	TCP/8080	Traffic from web
Ingress	Any namespace	TCP/20200	Prometheus metrics
Egress	`kube-system`	UDP+TCP/53	DNS resolution
Egress	`consul` namespace	TCP/8301,8502,20000	Consul control plane
Egress	`vault-secrets-operator-system`	TCP/8200	Vault access
Egress	`datadog` namespace	TCP/8126, UDP/8125	APM + DogStatsD
Egress	Same namespace pods	All	Envoy mesh traffic
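
The combination of default deny and explicit allow rules can be modeled as a simple lookup: traffic is dropped unless some rule matches. The sketch below encodes only the two ingress rows of the API pod table; the `IngressRule` type and the `app` label values other than `web` are illustrative assumptions, not the actual NetworkPolicy manifests.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class IngressRule:
    protocol: str
    port: int
    source_app: Optional[str]   # None means any source is accepted


# Ingress rows from the API pod policy table above.
API_INGRESS = (
    IngressRule("TCP", 8080, "web"),     # traffic from web pods
    IngressRule("TCP", 20200, None),     # Prometheus metrics, any namespace
)


def ingress_allowed(rules, source_app: Optional[str], protocol: str, port: int) -> bool:
    """Default deny: allow only when at least one rule matches."""
    return any(
        rule.protocol == protocol
        and rule.port == port
        and rule.source_app in (None, source_app)
        for rule in rules
    )
```

Under this model a web pod reaches the API on TCP/8080, any pod can scrape TCP/20200, and everything else, including a non-web pod on TCP/8080, is denied.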

Pod Security Standards

The consul namespace applies Kubernetes Pod Security Standards:

labels:
  pod-security.kubernetes.io/enforce: baseline     # Blocks privileged containers
  pod-security.kubernetes.io/warn: restricted       # Warns on restricted violations
  pod-security.kubernetes.io/audit: restricted      # Audits restricted violations

Why baseline (not restricted) for enforce? Consul Connect sidecars may require settings that violate the restricted profile (e.g., added capabilities other than NET_BIND_SERVICE, which is the only capability restricted permits adding). The warn and audit labels for restricted surface violations without blocking deployments. Once the workloads are validated against restricted, enforce can be upgraded to restricted.

Container Security Contexts

All application containers enforce:

runAsNonRoot: true
runAsUser: 65532 (nonroot user)
allowPrivilegeEscalation: false
readOnlyRootFilesystem: true
capabilities.drop: ["ALL"]
seccompProfile.type: RuntimeDefault
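
A check like the following could be used to confirm that a container's securityContext carries all six settings listed above. It is a minimal sketch operating on a plain dictionary shaped like the Kubernetes `securityContext` field; the function name is hypothetical and it is not part of the platform's tooling.

```python
from typing import List


def security_context_violations(ctx: dict) -> List[str]:
    """Return the names of required settings that are missing or wrong."""
    problems = []
    if ctx.get("runAsNonRoot") is not True:
        problems.append("runAsNonRoot")
    if ctx.get("runAsUser") != 65532:
        problems.append("runAsUser")
    if ctx.get("allowPrivilegeEscalation") is not False:
        problems.append("allowPrivilegeEscalation")
    if ctx.get("readOnlyRootFilesystem") is not True:
        problems.append("readOnlyRootFilesystem")
    if "ALL" not in ctx.get("capabilities", {}).get("drop", []):
        problems.append("capabilities.drop")
    if ctx.get("seccompProfile", {}).get("type") != "RuntimeDefault":
        problems.append("seccompProfile.type")
    return problems


# A context that satisfies every setting in the list above.
COMPLIANT = {
    "runAsNonRoot": True,
    "runAsUser": 65532,
    "allowPrivilegeEscalation": False,
    "readOnlyRootFilesystem": True,
    "capabilities": {"drop": ["ALL"]},
    "seccompProfile": {"type": "RuntimeDefault"},
}
```

`security_context_violations(COMPLIANT)` returns an empty list, and an empty context reports all six settings. Note that an unset `allowPrivilegeEscalation` counts as a violation, since the setting must be explicitly false.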

Container Images

Base image: gcr.io/distroless/static-debian12:nonroot
No shell, no package manager, no writable filesystem
Trivy scans in CI and CD pipelines (CRITICAL/HIGH severity fails the build)
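
The severity gate can be sketched as a filter over a Trivy JSON report. This assumes the report layout with a top-level `Results` list whose entries carry a `Vulnerabilities` list; the pipelines may equally rely on Trivy's own severity and exit-code options, which this page does not specify. The vulnerability IDs and target name below are made up.

```python
from typing import List

BLOCKING_SEVERITIES = {"CRITICAL", "HIGH"}


def blocking_findings(report: dict) -> List[str]:
    """Collect IDs of vulnerabilities whose severity should fail the build."""
    found = []
    for result in report.get("Results", []):
        # "Vulnerabilities" may be absent or null when a target is clean.
        for vuln in result.get("Vulnerabilities") or []:
            if vuln.get("Severity") in BLOCKING_SEVERITIES:
                found.append(vuln.get("VulnerabilityID"))
    return found


# Hypothetical report with invented IDs, for illustration only.
SAMPLE_REPORT = {
    "Results": [
        {"Target": "hypothetical-image", "Vulnerabilities": [
            {"VulnerabilityID": "CVE-0000-0001", "Severity": "HIGH"},
            {"VulnerabilityID": "CVE-0000-0002", "Severity": "LOW"},
        ]},
        {"Target": "clean-layer", "Vulnerabilities": None},
    ]
}
```

A CI step built on this would exit non-zero whenever `blocking_findings(report)` is non-empty; for the sample report only `CVE-0000-0001` blocks the build.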

EKS Security

Encryption

Feature	Details
Secrets encryption	KMS envelope encryption for etcd secrets
EBS encryption	KMS-encrypted persistent volumes via EBS CSI driver
In-transit	All K8s API communication over TLS

Endpoint Access

Environment	Public Access	CIDR Restriction
Dev	Enabled	`0.0.0.0/0` (open for debugging)
Staging	Disabled	Private only (VPC internal)

Addons

Managed EKS addons with automatic updates:

CoreDNS
kube-proxy
VPC CNI (for pod networking)
EBS CSI driver (for persistent volumes)

RDS Security

Feature	Dev	Staging
Encryption at rest	KMS	KMS
Multi-AZ	No	Yes
Deletion protection	No	Yes
Final snapshot on delete	Skip	Required
Performance insights	Enabled (KMS encrypted)	Enabled (KMS encrypted)
Network access	EKS SG + HVN CIDR only	EKS SG + HVN CIDR only
Admin password	Random (Terraform-generated)	Random (Terraform-generated)
App credentials	Vault dynamic secrets	Vault dynamic secrets

Supply Chain Security

Control	Implementation
Container scanning	Trivy (CRITICAL/HIGH) in CI and CD
Static analysis	`go vet` in CI
Image provenance	GHCR with SHA-pinned tags
Pre-commit hooks	`detect-private-key`, `no-commit-to-branch main`
Code ownership	`CODEOWNERS` file with team rules
Branch protection	Required CI checks before merge
Secrets detection	Pre-commit hook + `.gitignore` patterns
