feat: ability to "decommission" (or drain) a cluster #1969

@ajf

Description

Is this a new feature, an enhancement, or a change to existing functionality?

New Feature

How would you describe the priority of this feature request

Medium

Please provide a clear description of problem this feature solves

We should probably have a way to decommission a site that resets all the hardware to factory defaults and prevents re-ingestion of machines while they are being decommissioned.

The way I see this working is:

  • Every node gets a "prevent allocations" health alert.
  • Add a flag that prevents new machines from becoming managed hosts (we probably still want to issue them DHCP addresses). I think there's already a create_machines config value for this.
  • For any machine in the ready state, reset the Host BMC and DPU BMC to factory defaults and reboot them.
  • Remove the managed host and credentials for each host, or somehow label each host with a "decommissioned" state and do nothing else with it.

We need to make sure that the machines have been scrubbed (deprovisioned) first, which is why I'd only act on machines in the ready state.
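
To make the proposed flow concrete, here is a minimal Python sketch of one pass over a site. It is an illustration only, not NICo code: the `Machine`, `Site`, and `State` types, the `prevent-allocations` alert string, and the action names are all hypothetical, and `create_machines` is modeled as a plain boolean standing in for the config value mentioned above.

```python
from dataclasses import dataclass, field
from enum import Enum


class State(Enum):
    # Hypothetical states; the real state machine has more of them.
    READY = "ready"
    ASSIGNED = "assigned"
    DECOMMISSIONED = "decommissioned"


@dataclass
class Machine:
    machine_id: str
    state: State
    health_alerts: set = field(default_factory=set)
    credentials: dict = field(default_factory=dict)
    actions: list = field(default_factory=list)  # log of BMC actions requested


@dataclass
class Site:
    machines: list
    create_machines: bool = True  # assumed stand-in for the existing config value


def decommission_step(site):
    """Run one pass of the decommission flow and return the IDs still pending."""
    # Stop new machines from becoming managed hosts (DHCP is out of scope here).
    site.create_machines = False
    pending = []
    for m in site.machines:
        if m.state is State.DECOMMISSIONED:
            continue
        # Every node gets the alert, so nothing new is allocated to it.
        m.health_alerts.add("prevent-allocations")
        if m.state is not State.READY:
            # Not scrubbed yet: leave it alone until it reaches the ready state.
            pending.append(m.machine_id)
            continue
        # Ready means deprovisioned, so it is safe to reset and forget the host.
        m.actions += ["host_bmc_factory_reset", "dpu_bmc_factory_reset", "reboot"]
        m.credentials.clear()
        m.state = State.DECOMMISSIONED
    return pending
```

Calling `decommission_step` repeatedly until it returns an empty list would drain the site gradually: machines that are still in use only receive the alert, and are reset on a later pass once they have been deprovisioned and returned to the ready state.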

Feature Description

As a site administrator, I want to be able to re-install an older version of NICo or have continuous deployment pipelines reset sites to zero. I need to be able to reset the hardware to a known good state so I can set up the cluster again from scratch.

As a site administrator, I want to uninstall NICo because I'm done testing it and want to manage the hardware in some other way. I need to be able to reset machines to a known good state so another provisioning system can take over.

Describe your ideal solution

No response

Describe any alternatives you have considered

No response

Additional context

No response

Code of Conduct

  • I agree to follow NCX Infra Controller's Code of Conduct
  • I have searched the open feature requests and have found no duplicates for this feature request

Metadata

Assignees

Labels

feature: Feature (deprecated - use issue type, but it's needed for reporting now)
interest/dsx
roadmap: Roadmap item with program-level tracking

Projects

Status

Backlog

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests
