config node-bootstrap: add --node-label and --taint flags #62

Open
chokevin wants to merge 2 commits into Azure:main from chokevin:chokevin/node-bootstrap-flags

@chokevin
Contributor

Problem

config node-bootstrap flex|ubuntu populates the kubeadm Config from the live AKS cluster (cluster MC name, kubernetes.azure.com/managed=false, aks.azure.com/stretch-managed=true) but offers no way for the operator to add extra labels or any taints.

This matters because:

  • Karpenter / NodePool partitioning: typical use is to label nodes by SKU, region, GPU model, capacity-pool name, etc. for nodeAffinity/topology spread. Without --node-label you have to hand-edit the rendered cloud-init.
  • GPU workloads: nvidia.com/gpu=present:NoSchedule is the standard taint to keep general workloads off GPU nodes; without --taint the same hand-edit is needed.

The proto (kubeadm.Config) already supports both node_labels (map) and register_with_taints ([]Taint), and helpers AddNodeLabels / AddK8SRegisterTaints already exist on the generated type — this PR is purely surfacing them in the CLI.

Change

Two new flags on config node-bootstrap:

--node-label key=value      (repeatable)
--taint key[=value]:Effect  (repeatable; Effect ∈ {NoSchedule,PreferNoSchedule,NoExecute})

Parsed in a small flags.go (with unit tests) and merged into the kubeadm Config returned by DefaultKubeadmConfig for both the flex and ubuntu writers.
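
For readers without the diff open, here is a minimal sketch of what parsing these two flag formats could look like. It is not the code in flags.go: the names parseNodeLabels, parseTaint and the local Taint struct are hypothetical stand-ins for the generated proto type, and rejecting duplicate label keys is an assumption (the description only says duplicates are covered by tests, not how they are handled).

```go
package main

import (
	"fmt"
	"strings"
)

// Taint is a hypothetical stand-in for the proto Taint message.
type Taint struct {
	Key, Value, Effect string
}

// validEffects lists the three effects named in the PR description.
var validEffects = map[string]bool{
	"NoSchedule":       true,
	"PreferNoSchedule": true,
	"NoExecute":        true,
}

// parseNodeLabels turns repeated "key=value" strings into a map.
// It rejects a missing "=", an empty key, and (by assumption) duplicate keys.
// An empty value such as "key=" is accepted.
func parseNodeLabels(raw []string) (map[string]string, error) {
	labels := make(map[string]string, len(raw))
	for _, s := range raw {
		key, value, ok := strings.Cut(s, "=")
		if !ok {
			return nil, fmt.Errorf("invalid node label %q: expected key=value", s)
		}
		if key == "" {
			return nil, fmt.Errorf("invalid node label %q: empty key", s)
		}
		if _, dup := labels[key]; dup {
			return nil, fmt.Errorf("duplicate node label key %q", key)
		}
		labels[key] = value
	}
	return labels, nil
}

// parseTaint parses "key[=value]:Effect". The effect is taken from the text
// after the last ":" so that the key/value part is split only on "=".
func parseTaint(s string) (Taint, error) {
	i := strings.LastIndex(s, ":")
	if i < 0 {
		return Taint{}, fmt.Errorf("invalid taint %q: expected key[=value]:Effect", s)
	}
	kv, effect := s[:i], s[i+1:]
	if !validEffects[effect] {
		return Taint{}, fmt.Errorf("invalid taint %q: unknown effect %q", s, effect)
	}
	// A key-only taint has no "="; Cut then returns the whole string as key.
	key, value, _ := strings.Cut(kv, "=")
	if key == "" {
		return Taint{}, fmt.Errorf("invalid taint %q: empty key", s)
	}
	return Taint{Key: key, Value: value, Effect: effect}, nil
}

func main() {
	labels, err := parseNodeLabels([]string{"nvidia.com/gpu.product=H200", "kvinodu/region=eastus2"})
	fmt.Println(labels, err)

	taint, err := parseTaint("nvidia.com/gpu=present:NoSchedule")
	fmt.Printf("%+v %v\n", taint, err)
}
```

Keeping the parsers as pure functions over strings is what makes the table-style unit tests listed under Validation cheap to write.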

Validation

  • go build ./... && go test ./... clean
  • Unit tests cover key=value, empty value, domain-style keys, missing =, empty key, duplicates, all three taint effects, key-only taints, and rejection of bogus effects.
  • Smoke-tested against a real cluster:
$ config node-bootstrap flex --variant script \
    --node-label nvidia.com/gpu.product=H200 \
    --node-label kvinodu/region=eastus2 \
    --taint nvidia.com/gpu=present:NoSchedule

renders both into the kubeadm join JoinConfiguration.NodeRegistration.

Context

Found while joining ND96isr_H200_v5 nodes from eastus2 into a westeurope-based aks-flex cluster. Without these flags I had to bypass config node-bootstrap entirely and hand-build the spec in Python — see the upstream gaps doc that motivated this PR alongside #61.

Copilot AI review requested due to automatic review settings April 22, 2026 04:01
Contributor

Copilot AI left a comment

Pull request overview

Adds CLI support to apply additional Kubernetes node labels and taints when generating config node-bootstrap output, by parsing new flags and merging them into the kubeadm join configuration derived from the live AKS cluster (or placeholders).

Changes:

  • Add repeatable --node-label key=value and --taint key[=value]:Effect flags to config node-bootstrap.
  • Merge parsed labels/taints into the generated kubeadm Config for both flex and ubuntu node-bootstrap writers.
  • Introduce a small flag-parsing helper with unit tests.

Reviewed changes

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

| File | Description |
| --- | --- |
| cli/internal/config/nodebootstrap/nodebootstrap.go | Adds new CLI flags and merges parsed values into the kubeadm config used for userdata rendering. |
| cli/internal/config/nodebootstrap/flags.go | Implements parsing for --node-label and --taint. |
| cli/internal/config/nodebootstrap/flags_test.go | Unit tests for label/taint parsing behavior and error cases. |

Comment thread cli/internal/config/nodebootstrap/flags.go Outdated
Kevin Choi and others added 2 commits April 22, 2026 10:49
Allow callers of 'config node-bootstrap flex|ubuntu' to register the
node with extra labels and taints in addition to the AKS-derived defaults.

The proto already supports node_labels and register_with_taints; this
just exposes them as CLI flags and merges them into the kubeadm config
returned by configcmd.DefaultKubeadmConfig.

Without these flags, anyone who needs a custom label (e.g. SKU/region
partitioning for Karpenter) or the standard nvidia.com/gpu=present:NoSchedule
taint has to hand-edit the rendered cloud-init or fork the CLI.
Co-authored-by: Copilot <175728472+Copilot@users.noreply.github.com>