20 commits
1cf96f0
feat(crd): add AzureFlexNodeClass v1alpha1 CRD
chokevin Apr 22, 2026
6c4786d
chore(plugin): bump flexNodeVersion to v0.0.18
chokevin Apr 22, 2026
6bdb3a2
feat(plugin): add azure/flexvm agent pool service
chokevin Apr 22, 2026
170a427
feat(karpenter): add azure cross-region cloudprovider
chokevin Apr 22, 2026
43f1b52
feat(karpenter): add azure nodeclass status+termination controllers
chokevin Apr 22, 2026
100ca7f
feat(karpenter): wire azure cloudprovider into controller main
chokevin Apr 22, 2026
e026782
docs(karpenter): add azure example NodeClass+NodePool
chokevin Apr 22, 2026
c4b32b6
fix(azure): address P0/P1 review findings
Apr 22, 2026
6e7b5b9
fix(charts): grant rbac for azureflexnodeclasses
Apr 22, 2026
d17174b
chore(karpenter): go mod tidy
Apr 22, 2026
a4d892b
feat(catalog): add Standard_ND96isr_H100_v5 (8x H100 SKU)
Apr 22, 2026
0f6173f
userdata: regenerate containerd v2 config before aks-flex-node apply
Apr 22, 2026
244839e
userdata: write v3-schema containerd config AFTER aks-flex-node apply
Apr 22, 2026
16e0483
userdata: fix conf.d/99-nvidia.toml bin_dir override
Apr 22, 2026
9a8fdf5
flexvm: garbage-collect orphan NICs after failed VM creation
Apr 23, 2026
0a0bfb2
address copilot review comments
chokevin Apr 23, 2026
e68df85
fix(azure-flex): handle mismatched agentpool types in GC paths
chokevin Apr 24, 2026
460b21b
chore(karpenter): tidy protobuf module classification
chokevin Apr 24, 2026
3ec4f79
fix(karpenter): harden azure h200 provisioning
chokevin May 16, 2026
85ca16f
chore(karpenter): drop obsolete provider patches
chokevin May 16, 2026
4 changes: 3 additions & 1 deletion cli/internal/config/karpenter/karpenter.go
@@ -44,9 +44,11 @@ podLabels:

controller:
nebiusCredentials:
- enabled: true
{{- if .NebiusCredentialsJSON }}
+ enabled: true
credentialsJSON: {{ .NebiusCredentialsJSON }}
+ {{- else }}
+ enabled: false
{{- end }}
image:
digest: ""
27 changes: 22 additions & 5 deletions docs/usages/karpenter.md
@@ -46,7 +46,9 @@ $ kubectl create namespace karpenter

### 2. Locate your Nebius credentials file

- The karpenter controller needs Nebius API credentials to provision VMs. The credentials file is a JSON file generated by the Nebius console (see the [Nebius authorized keys documentation](https://docs.nebius.com/iam/service-accounts/authorized-keys)).
+ This step is only needed if you plan to provision Nebius nodes. Azure and Azure Flex H200 nodes do not require Nebius credentials.
+
+ For Nebius, the karpenter controller needs Nebius API credentials to provision VMs. The credentials file is a JSON file generated by the Nebius console (see the [Nebius authorized keys documentation](https://docs.nebius.com/iam/service-accounts/authorized-keys)).

Note the local path to this file — you will pass it to the CLI in step 4 via `--nebius-credentials-file`. The chart will create the `nebius-credentials` Secret in the `karpenter` namespace automatically during `helm upgrade --install`; no separate `kubectl create secret` step is needed.

@@ -65,15 +67,16 @@ The template also creates a **federated identity credential** that pairs the man

### 4. Generate the Helm values file and install

- Use the CLI to generate a `karpenter_values.yaml` file with all required values pre-populated. Pass `--nebius-credentials-file` to have the chart create the `nebius-credentials` Secret automatically, and `--ssh-public-key-file` to embed the SSH public key used when bootstrapping provisioned nodes:
+ Use the CLI to generate a `karpenter_values.yaml` file with all required values pre-populated. Pass `--ssh-public-key-file` to embed the SSH public key used when bootstrapping provisioned nodes:

```bash
$ aks-flex-cli config karpenter helm \
- --nebius-credentials-file ~/.nebius/credentials.json \
--ssh-public-key-file ~/.ssh/id_ed25519.pub
```

- The command reads both files, embeds their contents into `karpenter_values.yaml`, and prints the install command to stdout:
+ If you also use Nebius, add `--nebius-credentials-file ~/.nebius/credentials.json` so the chart creates and mounts the Nebius credentials Secret. For Azure-only H200 clusters, omit it; the generated values will keep `controller.nebiusCredentials.enabled: false`.
+
+ The command reads the files, embeds their contents into `karpenter_values.yaml`, and prints the install command to stdout:

```
helm upgrade --install karpenter charts/karpenter \
@@ -101,7 +104,7 @@ podLabels:

controller:
nebiusCredentials:
- enabled: true
+ enabled: false
image:
digest: ""
env:
@@ -246,6 +249,20 @@ azure-cpu-nodepool-6rhlk aks-azure-cpu-nodepo
> aks-flex-cli aks deploy --nvidia-dra-driver --skip-arm
> ```

+ ### Creating an Azure Flex H200 NodePool
+
+ For cross-region Azure H200 nodes, apply one Azure Flex NodeClass and NodePool per Azure region, then deploy a GPU workload with a matching toleration and node affinity:
+
+ ```bash
+ $ kubectl apply -f examples/azure/azureflexnodeclass-h200-eastus2.yaml
+ $ kubectl apply -f examples/azure/nodepool-h200.yaml
+ $ kubectl apply -f examples/azure/azureflexnodeclass-h200-eastus2euap.yaml
+ $ kubectl apply -f examples/azure/nodepool-h200-eastus2euap.yaml
+ $ kubectl apply -f examples/azure/h200_deployment.yaml
+ ```
+
+ Each H200 NodePool references exactly one `AzureFlexNodeClass`, so use separate NodePools when trying the same `Standard_ND96isr_H200_v5` SKU in both `eastus2` and `eastus2euap`. The H200 NodePools must have a non-zero `limits.nvidia.com/gpu` value. Set each one to `8` for one node in that region, or a higher multiple of 8 for more nodes.
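
To make the one-NodePool-per-region and GPU-limit guidance concrete, a NodePool along these lines could be used. This is a minimal sketch, not the content of `examples/azure/nodepool-h200.yaml`: it assumes the `karpenter.sh/v1` NodePool API, and the resource names `azure-h200-eastus2` and `h200-eastus2` are hypothetical. Taints, disruption settings, and other fields the real example may carry are left out.

```yaml
# Hypothetical sketch; names are placeholders, not taken from the PR.
apiVersion: karpenter.sh/v1            # assumption: v1 NodePool API
kind: NodePool
metadata:
  name: azure-h200-eastus2             # hypothetical name
spec:
  template:
    spec:
      # Each NodePool points at exactly one AzureFlexNodeClass (one region).
      nodeClassRef:
        group: flex.aks.azure.com
        kind: AzureFlexNodeClass
        name: h200-eastus2             # hypothetical NodeClass name
      requirements:
        - key: node.kubernetes.io/instance-type
          operator: In
          values: ["Standard_ND96isr_H200_v5"]
  limits:
    # Must be non-zero; 8 GPUs corresponds to one node of this SKU.
    nvidia.com/gpu: "8"
```

A second NodePool for `eastus2euap` would differ only in its name and in the `nodeClassRef.name` it points to.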

## Creating Nodes on Nebius via Karpenter

With the karpenter controller running, you can define a `NebiusNodeClass` and `NodePool` to tell Karpenter how and when to provision Nebius nodes.
@@ -0,0 +1,214 @@
---
apiVersion: apiextensions.k8s.io/v1
kind: CustomResourceDefinition
metadata:
annotations:
controller-gen.kubebuilder.io/version: v0.20.1
name: azureflexnodeclasses.flex.aks.azure.com
spec:
group: flex.aks.azure.com
names:
categories:
- karpenter
- nap
kind: AzureFlexNodeClass
listKind: AzureFlexNodeClassList
plural: azureflexnodeclasses
shortNames:
- afnc
- afncs
singular: azureflexnodeclass
scope: Cluster
versions:
- additionalPrinterColumns:
- jsonPath: .status.conditions[?(@.type=='Ready')].status
name: Ready
type: string
- jsonPath: .metadata.creationTimestamp
name: Age
type: date
name: v1alpha1
schema:
openAPIV3Schema:
description: |-
AzureFlexNodeClass is the Schema for the AzureFlexNodeClass API.

It enables a NodePool in an AKS cluster to auto-provision external Azure VMs in a
(potentially different) Azure region than the AKS cluster's own region. Each node
is a single VM (not VMSS) so that cross-region placement is straightforward.
properties:
apiVersion:
description: |-
APIVersion defines the versioned schema of this representation of an object.
Servers should convert recognized schemas to the latest internal value, and
may reject unrecognized values.
More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#resources
type: string
kind:
description: |-
Kind is a string value representing the REST resource this object represents.
Servers may infer this from the endpoint the client submits requests to.
Cannot be updated.
In CamelCase.
More info: https://git.k8s.io/community/contributors/devel/sig-architecture/api-conventions.md#types-kinds
type: string
metadata:
type: object
spec:
description: |-
AzureFlexNodeClassSpec is the spec for AzureFlexNodeClass.

Phase 1 scope (issue #63): single region per NodeClass, no spot, no zones,
no identity/UAMI per-NodeClass (the controller MI is assumed to have
Contributor on the target subscription/RG/subnet), no quota preflight,
no PPG/capacity reservation, no spot, no WireGuard.
properties:
allocateNodePublicIP:
default: false
description: AllocateNodePublicIP controls whether each node receives
a public IP.
type: boolean
imageID:
description: ImageID is a SIG / community gallery image resource ID.
Mutually exclusive with ImageReference.
type: string
imageReference:
description: |-
ImageReference selects an Azure Marketplace image. Mutually exclusive with ImageID.
If neither is set, defaults to microsoft-dsvm/ubuntu-hpc/2204/latest.
properties:
offer:
type: string
publisher:
type: string
sku:
type: string
version:
default: latest
type: string
required:
- offer
- publisher
- sku
type: object
location:
description: Location is the Azure region (e.g. "eastus2"). May differ
from the AKS cluster region.
type: string
maxPodsPerNode:
default: 110
description: MaxPodsPerNode is advertised in the node's capacity and
affects Karpenter scheduling.
format: int32
type: integer
osDiskSizeGB:
default: 128
description: OSDiskSizeGB is the size of the OS disk in GB.
format: int32
type: integer
resourceGroup:
description: |-
ResourceGroup is the resource group where VMs, NICs, and OS disks land.
Must already exist.
type: string
securityType:
default: Standard
description: |-
SecurityType selects the VM security profile. Currently only "Standard" is supported.
TrustedLaunch is deferred — it has been observed to break the DSVM image.
enum:
- Standard
type: string
sshPublicKeys:
description: SSHPublicKeys is the list of SSH public keys to install
on each node.
items:
type: string
type: array
subnetID:
description: |-
SubnetID is the full ARM resource ID of the subnet (must already exist
and be reachable from the AKS cluster).
type: string
subscriptionID:
description: SubscriptionID is the Azure subscription where VMs will
be created.
type: string
tags:
additionalProperties:
type: string
description: Tags are applied to every Azure resource (VM, NIC, OS
disk) created from this NodeClass.
type: object
required:
- location
- resourceGroup
- subnetID
- subscriptionID
type: object
status:
description: status contains the resolved state of the AzureFlexNodeClass.
properties:
conditions:
description: conditions contains signals for health and readiness
items:
description: Condition aliases the upstream type and adds additional
helper methods
properties:
lastTransitionTime:
description: |-
lastTransitionTime is the last time the condition transitioned from one status to another.
This should be when the underlying condition changed. If that is not known, then using the time when the API field changed is acceptable.
format: date-time
type: string
message:
description: |-
message is a human readable message indicating details about the transition.
This may be an empty string.
maxLength: 32768
type: string
observedGeneration:
description: |-
observedGeneration represents the .metadata.generation that the condition was set based upon.
For instance, if .metadata.generation is currently 12, but the .status.conditions[x].observedGeneration is 9, the condition is out of date
with respect to the current state of the instance.
format: int64
minimum: 0
type: integer
reason:
description: |-
reason contains a programmatic identifier indicating the reason for the condition's last transition.
Producers of specific condition types may define expected values and meanings for this field,
and whether the values are considered a guaranteed API.
The value should be a CamelCase string.
This field may not be empty.
maxLength: 1024
minLength: 1
pattern: ^[A-Za-z]([A-Za-z0-9_,:]*[A-Za-z0-9_])?$
type: string
status:
description: status of the condition, one of True, False, Unknown.
enum:
- "True"
- "False"
- Unknown
type: string
type:
description: type of condition in CamelCase or in foo.example.com/CamelCase.
maxLength: 316
pattern: ^([a-z0-9]([-a-z0-9]*[a-z0-9])?(\.[a-z0-9]([-a-z0-9]*[a-z0-9])?)*/)?(([A-Za-z0-9][-A-Za-z0-9_.]*)?[A-Za-z0-9])$
type: string
required:
- lastTransitionTime
- message
- reason
- status
- type
type: object
type: array
type: object
type: object
served: true
storage: true
subresources:
status: {}
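
For illustration, a manifest that satisfies this schema might look like the following. It is a minimal sketch: the subscription ID, resource group, subnet, SSH key, tag, and resource name are placeholder assumptions rather than values from the PR's example files. Only `location`, `resourceGroup`, `subnetID`, and `subscriptionID` are required; the other fields are shown with the defaults the schema declares.

```yaml
# Hypothetical sketch; all values are placeholders.
apiVersion: flex.aks.azure.com/v1alpha1
kind: AzureFlexNodeClass
metadata:
  name: h200-eastus2                   # cluster-scoped, so no namespace
spec:
  # Required fields
  subscriptionID: 00000000-0000-0000-0000-000000000000
  resourceGroup: my-flex-rg            # must already exist
  location: eastus2                    # may differ from the AKS cluster region
  subnetID: /subscriptions/00000000-0000-0000-0000-000000000000/resourceGroups/my-flex-rg/providers/Microsoft.Network/virtualNetworks/my-vnet/subnets/my-subnet
  # Optional fields, set to the schema defaults
  osDiskSizeGB: 128
  maxPodsPerNode: 110
  securityType: Standard               # the only value the enum allows
  allocateNodePublicIP: false
  sshPublicKeys:
    - ssh-ed25519 AAAA... user@example # placeholder key
  tags:
    purpose: karpenter-flex            # hypothetical tag
```

Because `imageID` and `imageReference` are both omitted here, the schema description says the image defaults to `microsoft-dsvm/ubuntu-hpc/2204/latest`.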
2 changes: 1 addition & 1 deletion karpenter/charts/karpenter/templates/clusterrole-core.yaml
@@ -71,7 +71,7 @@ rules:
{{- if .Values.webhook.enabled }}
- apiGroups: ["apiextensions.k8s.io"]
resources: ["customresourcedefinitions/status"]
- resourceNames: ["aksnodeclasses.karpenter.azure.com", "nodepools.karpenter.sh", "nodeclaims.karpenter.sh", "nebiusnodeclasses.flex.aks.azure.com"]
+ resourceNames: ["aksnodeclasses.karpenter.azure.com", "nodepools.karpenter.sh", "nodeclaims.karpenter.sh", "nebiusnodeclasses.flex.aks.azure.com", "azureflexnodeclasses.flex.aks.azure.com", "kaitonodeclasses.kaito.sh"]
verbs: ["patch"]
- apiGroups: ["apiextensions.k8s.io"]
resources: ["customresourcedefinitions"]
4 changes: 2 additions & 2 deletions karpenter/charts/karpenter/templates/clusterrole.yaml
@@ -33,7 +33,7 @@ rules:
resources: ["aksnodeclasses"]
verbs: ["get", "list", "watch"]
- apiGroups: ["flex.aks.azure.com"]
- resources: ["nebiusnodeclasses"]
+ resources: ["nebiusnodeclasses", "azureflexnodeclasses"]
verbs: ["get", "list", "watch"]
- apiGroups: ["kaito.sh"]
resources: ["kaitonodeclasses"]
@@ -43,7 +43,7 @@ rules:
resources: ["aksnodeclasses", "aksnodeclasses/status"]
verbs: ["patch", "update"]
- apiGroups: ["flex.aks.azure.com"]
- resources: ["nebiusnodeclasses", "nebiusnodeclasses/status"]
+ resources: ["nebiusnodeclasses", "nebiusnodeclasses/status", "azureflexnodeclasses", "azureflexnodeclasses/status"]
verbs: ["patch", "update"]
- apiGroups: ["kaito.sh"]
resources: ["kaitonodeclasses", "kaitonodeclasses/status"]