AWS provider's `accelerator` label has reservation-scoped semantics, diverging from other MNNVL-aware providers #293

# AWS provider's `accelerator` label has reservation-scoped semantics, diverging from other MNNVL-aware providers

## Context

Topograph's `network.topology.nvidia.com/accelerator` label is defined as an accelerated interconnect domain identifier; in practice, operators treat it as "same value → same NVLink fabric". Four of topograph's five providers that emit this label honor that contract by taking the value from an NVLink clique ID that originates with Fabric Manager:

| Provider | Source of `accelerator` value |
|---|---|
| `dra` | `nvidia.com/gpu.clique` label from the NVIDIA GPU Operator ([`pkg/providers/dra/provider.go`](https://github.com/NVIDIA/topograph/blob/main/pkg/providers/dra/provider.go#L28)) |
| `infiniband-bm` | `ClusterUUID.CliqueId` via `nvidia-smi` ([`pkg/providers/infiniband/bm.go`](https://github.com/NVIDIA/topograph/blob/main/pkg/providers/infiniband/bm.go#L24-L36)) |
| `infiniband-k8s` | `ClusterUUID.CliqueId` from the device plugin's annotations ([`pkg/providers/infiniband/k8s.go`](https://github.com/NVIDIA/topograph/blob/main/pkg/providers/infiniband/k8s.go#L97-L123)) |
| `lambdai` | `NVLink.DomainID.CliqueID` from the Lambda AI API ([`pkg/providers/lambdai/provider.go`](https://github.com/NVIDIA/topograph/blob/main/pkg/providers/lambdai/provider.go#L71)) |

The AWS provider is the exception — it derives `accelerator` from AWS's `CapacityBlockId` attribute:

```go
// pkg/providers/aws/instance_topology.go:110-111
if inst.CapacityBlockId != nil {
    topo.AcceleratorID = *inst.CapacityBlockId
}
```

Per the [AWS EC2 API reference for `InstanceTopology`](https://docs.aws.amazon.com/AWSEC2/latest/APIReference/API_InstanceTopology.html), on UltraServer instances `CapacityBlockId` "identifies instances within the UltraServer domain" — it is a **reservation-scoped identifier**, not an NVLink-partition identifier. AWS's explicit "same NVLink domain" label is [`topology.k8s.aws/ultraserver-id`](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-hyperpod-topology.html).

## Why this matters

Per the Run:ai [canonical definition](https://run-ai-docs.nvidia.com/saas/platform-management/aiinitiatives/resources/using-gb200), a clique is "a logical split of the MNNVL into smaller domains". An UltraServer can contain multiple cliques (e.g., an x72 split into two x36 halves), and the clique ID can be absent when NVIDIA Fabric Manager has not completed initialization (NVML must report `NVML_GPU_FABRIC_STATE_COMPLETED` before the label is written).

The practical consequence is that two nodes with the same `network.topology.nvidia.com/accelerator` value can mean:

- **On DRA / InfiniBand / Lambda AI providers**: same NVLink fabric (operator's likely mental model)
- **On the AWS provider**: same UltraServer reservation, which is co-extensive with the UltraServer-level MNNVL domain on P6e-GB200 but may contain multiple cliques — so "same accelerator" is coarser than "same NVLink partition"

Empirical data from an NVIDIA-internal cluster confirms the N-cliques-per-CapacityBlock case in production: multiple distinct `nvidia.com/gpu.clique` values were observed within a single `topology.k8s.aws/capacity-block-id`, with some Capacity Blocks having no clique ID at all.

A downstream scheduler's CEL rule or `podAffinity` expression that assumes `accelerator` equality implies NVLink reachability will therefore be accurate on DRA/IB/Lambda AI but can over-colocate on AWS.
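The over-colocation risk can be checked against a cluster's labels. The following is a minimal sketch, not part of topograph: `unsafeAccelerators` is a hypothetical helper, and the node names and label values in `main` are made up for illustration. It flags every `accelerator` value whose nodes do not all share one non-empty `nvidia.com/gpu.clique` value, which is the situation described above for AWS.

```go
package main

import (
	"fmt"
	"sort"
)

// Label keys as named in this issue.
const (
	acceleratorLabel = "network.topology.nvidia.com/accelerator"
	cliqueLabel      = "nvidia.com/gpu.clique"
)

// unsafeAccelerators (hypothetical helper) returns, in sorted order, the
// accelerator values for which label equality does not guarantee a shared
// NVLink clique: the nodes carry more than one clique value, or at least
// one node has no clique label at all.
func unsafeAccelerators(nodes map[string]map[string]string) []string {
	// accelerator value -> set of clique values seen ("" means missing)
	cliques := make(map[string]map[string]struct{})
	for _, labels := range nodes {
		acc := labels[acceleratorLabel]
		if acc == "" {
			continue // node is not part of any accelerator domain
		}
		if cliques[acc] == nil {
			cliques[acc] = make(map[string]struct{})
		}
		cliques[acc][labels[cliqueLabel]] = struct{}{}
	}

	var unsafe []string
	for acc, set := range cliques {
		_, missing := set[""]
		if len(set) > 1 || missing {
			unsafe = append(unsafe, acc)
		}
	}
	sort.Strings(unsafe)
	return unsafe
}

func main() {
	// Illustrative data only: node names and label values are invented.
	nodes := map[string]map[string]string{
		"node-a": {acceleratorLabel: "cb-1", cliqueLabel: "uuid-x.1"},
		"node-b": {acceleratorLabel: "cb-1", cliqueLabel: "uuid-x.2"},
		"node-c": {acceleratorLabel: "cb-2", cliqueLabel: "uuid-y.1"},
		"node-d": {acceleratorLabel: "cb-2", cliqueLabel: "uuid-y.1"},
		"node-e": {acceleratorLabel: "cb-3"}, // Fabric Manager not ready
	}
	fmt.Println(unsafeAccelerators(nodes)) // prints [cb-1 cb-3]
}
```

In this sample, `cb-1` spans two cliques and `cb-3` has a node with no clique, so an affinity rule keyed on `accelerator` alone would treat both groups as NVLink-reachable when they may not be.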

## Options

1. **Keep AWS as-is, document.** The docs in [#289](https://github.com/NVIDIA/topograph/pull/289) already clarify the semantic difference. Schedulers that need true NVLink-partition granularity can consume `nvidia.com/gpu.clique` or `topology.k8s.aws/ultraserver-id` directly on AWS nodes.
2. **AWS provider prefers `topology.k8s.aws/ultraserver-id` when present, falls back to `CapacityBlockId`.** This is the cleanest semantic fix for the UltraServer case, since `ultraserver-id` is AWS's documented "same NVLink domain" identifier. It still does not capture within-UltraServer cliques on partitioned hardware.
3. **AWS provider prefers `nvidia.com/gpu.clique` (from the GPU Operator on MNNVL nodes) when present, falls back to `CapacityBlockId`.** This aligns most closely with the other providers' semantics. It requires the AWS provider to read node labels (it already runs as a Kubernetes-aware workload) and to handle the non-MNNVL case where no clique label exists.
4. **Emit two distinct labels**, e.g., `accelerator` (broad — UltraServer / Capacity Block level) and `accelerator-partition` (fine-grained — NVLink clique level). Breaking change for current consumers.
5. **Change the default on AWS** to `ultraserver-id`, keep `CapacityBlockId` behind an engineParams flag for operators who still want reservation-level grouping.
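Options 2 and 3 share the same shape: try a finer-grained identifier first, then fall back to `CapacityBlockId`. The sketch below illustrates that fallback under stated assumptions. `acceleratorID` and its `preferred` parameter are hypothetical and do not exist in topograph, and the sketch assumes the identifiers are available as node labels, whereas a real change might read them from another source.

```go
package main

import "fmt"

// Label keys as named in this issue.
const (
	cliqueLabel      = "nvidia.com/gpu.clique"
	ultraserverLabel = "topology.k8s.aws/ultraserver-id"
)

// acceleratorID (hypothetical helper) returns the first non-empty value
// among the preferred node labels, in order. If none is set it falls back
// to the Capacity Block ID, mirroring the current AWS provider behavior.
// It returns "" when no identifier is available at all.
func acceleratorID(nodeLabels map[string]string, capacityBlockID *string, preferred []string) string {
	for _, key := range preferred {
		if v := nodeLabels[key]; v != "" {
			return v
		}
	}
	if capacityBlockID != nil {
		return *capacityBlockID
	}
	return ""
}

func main() {
	// Illustrative values only.
	cb := "cb-1"
	labels := map[string]string{
		ultraserverLabel: "us-1",
		cliqueLabel:      "uuid-x.1",
	}

	// Option 2: prefer the UltraServer ID.
	fmt.Println(acceleratorID(labels, &cb, []string{ultraserverLabel}))
	// Option 3: prefer the NVLink clique.
	fmt.Println(acceleratorID(labels, &cb, []string{cliqueLabel}))
	// Non-MNNVL node with no labels: falls back to the Capacity Block ID.
	fmt.Println(acceleratorID(nil, &cb, []string{cliqueLabel}))
}
```

Passing the preference order as a parameter keeps the fallback logic identical across options, so the choice between them reduces to which keys are listed and in what order.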

## Ask

@dmitsh and @ravisoundar, could you assess the options above?

## Related

- [#289](https://github.com/NVIDIA/topograph/pull/289) — documentation refinement surfacing this distinction
- NVIDIA/NVSentinel#1205 — cross-repo integration discussion that raised the question

