Deploy WEKA distributed storage on Amazon EKS using the WEKA Operator.
Four deployment models are available:
- weka-dedicated: A WEKA storage cluster runs on dedicated backend instances. EKS worker nodes run WEKA client containers that connect to the backend over the network, and applications access WEKA storage through the CSI plugin and PersistentVolumeClaims.
- weka-axon: WEKA backend and client processes run together on the same EKS nodes. Each node contributes its local NVMe storage to the distributed filesystem while also running application workloads.
- SageMaker HyperPod variant of weka-dedicated: As in weka-dedicated, a standalone WEKA storage cluster serves an EKS cluster that hosts worker nodes and application pods. The client instances, however, are provisioned and managed by SageMaker HyperPod and then added to the EKS cluster as worker nodes.
- SageMaker HyperPod variant of weka-axon: As in weka-axon, the same nodes host both the WEKA cluster and worker pods. The underlying EC2 instances, however, are provisioned by SageMaker HyperPod and then added to the EKS cluster.
Each deployment model is self-contained; see its README for step-by-step instructions. Shared Terraform modules (EKS, weka-backend) live in modules/ and are referenced by each deployment model.
Prerequisites:
- AWS CLI configured with appropriate permissions
- Existing VPC with subnets (private subnets recommended)
- Terraform >= 1.5
- kubectl, Helm 3.x
- WEKA download token from get.weka.io
- Quay.io credentials for WEKA container images (available at get.weka.io)
WEKA integrates with Kubernetes using the standard CSI (Container Storage Interface) pattern:
┌──────────────────────────────────────────────────────────────────────┐
│                          Kubernetes Cluster                          │
│                                                                      │
│  1. WEKA Operator         2. CSI Plugin            3. Your Pods      │
│  ┌──────────────────┐     ┌──────────────────┐     ┌──────────────┐  │
│  │ Deploys WEKA     │     │ Provisions PVs   │     │ Mount WEKA   │  │
│  │ client containers│ ──▶ │ from WEKA        │ ──▶ │ via PVC      │  │
│  │ on selected nodes│     │ filesystem       │     │              │  │
│  └──────────────────┘     └──────────────────┘     └──────────────┘  │
│           │                                                          │
│           ▼                                                          │
│  ┌──────────────────┐                                                │
│  │ WekaClient CRD   │  Runs WEKA client process on nodes with        │
│  │                  │  label: weka.io/supports-clients=true          │
│  └──────────────────┘                                                │
└──────────────────────────────────────────────────────────────────────┘
The general flow for a deployment is:
- Deploy Infrastructure
  - A `terraform.tfvars.example` is provided as a starting point
  - Terraform builds the WEKA backend (dedicated) and/or the EKS cluster, depending on the deployment model
  - Assumes existing infrastructure (e.g. VPC, subnets)
- Deploy the WEKA Operator
  - A Helm chart installs the operator
- Deploy WEKA resources
  - `WekaCluster` (axon) and `WekaClient` CRs
  - Core manifests are provided
- Install the CSI plugin
  - Bundled with the operator in axon, installed separately in dedicated
- Test with a PVC and pod
  - Examples are provided for creating a `StorageClass` and `PVC` that application pods can use
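
As a rough illustration of the final step, testing with a PVC and pod, the manifest below is a minimal sketch and not a copy of the provided examples. It assumes a StorageClass backed by the WEKA CSI plugin already exists; the names `weka-storageclass`, `weka-test-pvc`, and `weka-test-pod`, the requested size, and the access mode are all hypothetical placeholders. The node selector reuses the `weka.io/supports-clients=true` label so that the pod is scheduled onto a node running a WEKA client.

```yaml
# Sketch only: every name below is a placeholder, not taken from the provided examples.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: weka-test-pvc
spec:
  accessModes:
    - ReadWriteMany                      # assumption: shared access across pods is wanted
  storageClassName: weka-storageclass    # hypothetical; must match your WEKA-backed StorageClass
  resources:
    requests:
      storage: 10Gi                      # arbitrary size for a smoke test
---
apiVersion: v1
kind: Pod
metadata:
  name: weka-test-pod
spec:
  nodeSelector:
    weka.io/supports-clients: "true"     # schedule only on nodes that run a WEKA client
  containers:
    - name: writer
      image: busybox:1.36
      # Write a file to the mounted volume, then stay alive for inspection.
      command: ["sh", "-c", "echo hello > /data/hello.txt && sleep 3600"]
      volumeMounts:
        - name: weka-volume
          mountPath: /data
  volumes:
    - name: weka-volume
      persistentVolumeClaim:
        claimName: weka-test-pvc         # binds the pod to the claim defined above
```

If the claim reaches the `Bound` state and the pod starts, the CSI plugin provisioned a volume and the node's WEKA client mounted it. Treat the examples shipped with each deployment model as the authoritative versions.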
