Question: how will Grove support intra-node NUMA-aware scheduling? #644

@yankay

Hi maintainers, thanks for GREP-244.

I noticed TopologyDomainNuma = "numa" is defined in the API and Story 3 (NUMA-Aware GPU Benchmarking) is listed as a motivating user story, but I cannot find any controller/scheduler logic that consumes it.

The current ClusterTopologyBinding model requires each domain to map to a Node label key. That works for rack/zone/host (one Node → one label value), but NUMA nodes (sockets) live inside a Node and are never exposed as Node labels. So Story 3 ("2 GPUs from an 8-GPU node on the same NUMA node") does not seem reachable through node-label packing alone.
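
To make the gap concrete, here is a minimal sketch of label-based grouping. It is not Grove code: the node names, the label values, and the DOMAIN_LABEL_KEYS mapping are hypothetical, and only the two label keys are standard Kubernetes ones. The point is that a domain without a Node label key has nothing to group by.

```python
from collections import defaultdict

# Hypothetical node inventory: node name -> Node labels.
# The label keys are well-known Kubernetes keys; names and values are made up.
NODES = {
    "node-a": {"topology.kubernetes.io/zone": "z1", "kubernetes.io/hostname": "node-a"},
    "node-b": {"topology.kubernetes.io/zone": "z1", "kubernetes.io/hostname": "node-b"},
    "node-c": {"topology.kubernetes.io/zone": "z2", "kubernetes.io/hostname": "node-c"},
}

# Hypothetical domain -> label-key mapping, in the spirit of a topology binding.
# No Node label describes NUMA placement, so "numa" is left unmapped (assumption).
DOMAIN_LABEL_KEYS = {
    "zone": "topology.kubernetes.io/zone",
    "host": "kubernetes.io/hostname",
    "numa": None,
}


def group_nodes_by_domain(domain):
    """Group node names by the label value that represents `domain`."""
    key = DOMAIN_LABEL_KEYS.get(domain)
    if key is None:
        raise LookupError(f"domain {domain!r} has no Node label key to group by")
    groups = defaultdict(list)
    for name, labels in NODES.items():
        if key in labels:
            groups[labels[key]].append(name)
    return dict(groups)
```

Grouping by "zone" or "host" works because every Node carries exactly one value for that key. Grouping by "numa" fails in this sketch because the information would have to come from inside the node (devices, CPUs), not from the Node object.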

Related context:

Question: which direction is planned?

  1. DRA path: rely on upstream DRA (pcieRoot/NUMA attributes); Grove stays at the node-domain layer and possibly injects ResourceClaim selectors (similar to auto-MNNVL).
  2. Scheduler path: push intra-node NUMA awareness into the scheduler backend (e.g. KAI, see kai-scheduler/KAI-Scheduler#1598, or Volcano numa-aware).
  3. Something else.
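
Whichever path is chosen, the constraint from Story 3 is the same: pick N free GPUs on one node that all share a NUMA node. A minimal sketch of that check follows. It uses neither the DRA API nor any scheduler's plugin interface; the (name, numa_node) device list and the best-fit tie-break are assumptions for illustration only.

```python
from collections import defaultdict


def pick_same_numa(devices, count):
    """Return `count` GPU names that share one NUMA node, or None if impossible.

    `devices` is a list of (name, numa_node) pairs describing the free GPUs
    on a single Kubernetes node (hypothetical input shape).
    """
    by_numa = defaultdict(list)
    for name, numa in devices:
        by_numa[numa].append(name)

    # Best fit (assumed policy): among NUMA nodes with enough free GPUs,
    # prefer the one with the fewest, to limit fragmentation.
    candidates = [
        (len(names), numa) for numa, names in by_numa.items() if len(names) >= count
    ]
    if not candidates:
        return None
    _, numa = min(candidates)
    return sorted(by_numa[numa])[:count]


# Free GPUs on a hypothetical 8-GPU node with two NUMA nodes.
free = [("gpu0", 0), ("gpu1", 0), ("gpu2", 0), ("gpu5", 1), ("gpu6", 1)]
print(pick_same_numa(free, 2))
print(pick_same_numa(free, 4))
```

In the DRA path this selection would be expressed through device attributes and claim selectors; in the scheduler path it would live in the backend's placement logic. The sketch only shows the per-node information that has to be available in either case.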

Happy to help with a prototype or follow-up GREP once the direction is clear. Thanks!
