EAI_5684 rancher disk #232

Merged
brownzebra merged 19 commits into main from EAI_5684_rancher_disk on May 4, 2026

@woojae-siloai woojae-siloai commented Apr 27, 2026

Summary
Add complete support for the RANCHER_DISK parameter, which dedicates separate storage to the /var/lib/rancher directory. It is designed primarily for GPU worker nodes with intensive workloads.

What's New
RANCHER_DISK Parameter:

  • Dedicates separate disk for /var/lib/rancher (kubelet and container runtime data)
  • Handles raw device paths the same way CLUSTER_DISKS does (automatic formatting and mounting)
  • Highly recommended for GPU worker nodes with heavy workloads
  • Optional for control plane nodes and CPU worker nodes
  • Can be combined with CLUSTER_DISKS and CLUSTER_PREMOUNTED_DISKS
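
To illustrate the kind of check the parameter rules above imply, here is a minimal sketch in Python. It is not the PR's schema validation (that lives in the project's own config code). The function name `validate_disks`, the comma-separated form of CLUSTER_DISKS, and the rule that one device must not appear under both parameters are all assumptions made for this example.

```python
import re

# Assumption: raw block devices are given as absolute /dev/... paths.
_DEVICE_RE = re.compile(r"^/dev/[A-Za-z0-9_./-]+$")


def validate_disks(config: dict) -> list[str]:
    """Return a list of human-readable problems; an empty list means no issue was found."""
    problems = []
    rancher = (config.get("RANCHER_DISK") or "").strip()
    # Assumption: CLUSTER_DISKS is a comma-separated string of device paths.
    cluster = [d.strip() for d in (config.get("CLUSTER_DISKS") or "").split(",") if d.strip()]

    # RANCHER_DISK is optional, so only check it when it is set.
    if rancher and not _DEVICE_RE.match(rancher):
        problems.append(f"RANCHER_DISK is not a device path: {rancher!r}")
    # Assumption: the same device should not serve both purposes.
    if rancher and rancher in cluster:
        problems.append(f"{rancher} is listed in both RANCHER_DISK and CLUSTER_DISKS")
    return problems
```

With the GPU worker values from the example usage (`/dev/nvme0n1` for CLUSTER_DISKS, `/dev/nvme2n1` for RANCHER_DISK), the sketch returns an empty list.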

Key Features

  • Automatic Setup: Device formatting, mounting at /var/lib/rancher, fstab persistence
  • Complete Cleanup: Unmounting, fstab cleanup, clean directory restoration
  • Auto-Discovery: Cleanup command automatically finds all bloom storage without config
  • Consistent Previews: Same cleanup information whether using config file or not
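
The fstab persistence and fstab cleanup steps listed above can be pictured as a pair of text transformations. The sketch below is only an illustration in Python, not the PR's Ansible or Go code. The filesystem type (`ext4`), the mount options (`defaults,nofail`), and the function names are assumptions chosen for the example.

```python
RANCHER_MOUNT = "/var/lib/rancher"


def _mount_point(line: str):
    """Return the mount point (2nd field) of an fstab line, or None for comments/blanks."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return None
    fields = stripped.split()
    return fields[1] if len(fields) >= 2 else None


def add_rancher_entry(fstab: str, device: str, fstype: str = "ext4") -> str:
    """Append an entry for RANCHER_MOUNT unless one already exists (idempotent)."""
    if any(_mount_point(line) == RANCHER_MOUNT for line in fstab.splitlines()):
        return fstab
    # Assumed options; the real playbook may use different ones.
    entry = f"{device} {RANCHER_MOUNT} {fstype} defaults,nofail 0 2"
    body = fstab if fstab.endswith("\n") or not fstab else fstab + "\n"
    return body + entry + "\n"


def remove_rancher_entry(fstab: str) -> str:
    """Drop every entry mounted at RANCHER_MOUNT; keep all other lines untouched."""
    kept = [line for line in fstab.splitlines() if _mount_point(line) != RANCHER_MOUNT]
    return "\n".join(kept) + ("\n" if kept else "")
```

Because `add_rancher_entry` checks for an existing entry first, a node whose fstab was already edited by hand is left alone instead of getting a duplicate line.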

Implementation
Core Components:

  • Configuration schema validation and Go config parsing
  • Ansible playbook for device setup (rancher_storage.yaml)
  • Cleanup functions with proper device management
  • Auto-discovery system for consistent cleanup previews
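
One way an auto-discovery step could work without a config file is to read the kernel's mount table and look for known mount points. The Python sketch below assumes exactly that. How the PR's cleanup command actually discovers bloom storage is not described here, and `find_mounts` and its `targets` parameter are hypothetical names.

```python
def find_mounts(proc_mounts: str, targets=("/var/lib/rancher",)) -> dict:
    """Map each target mount point found in /proc/mounts-style text to its device."""
    found = {}
    for line in proc_mounts.splitlines():
        # Each line is: device mountpoint fstype options dump pass
        fields = line.split()
        if len(fields) >= 2 and fields[1] in targets:
            found[fields[1]] = fields[0]
    return found
```

On a Linux node this could be fed the contents of `/proc/mounts`, and the resulting mapping could drive the same cleanup preview whether or not a config file is present.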

Documentation:

  • Complete parameter documentation and examples
  • GPU worker node guidance and recommendations

Example Usage

GPU Worker Node (Recommended)

CLUSTER_DISKS: /dev/nvme0n1          # Application storage
RANCHER_DISK: /dev/nvme2n1           # Dedicated /var/lib/rancher

Control Plane (Optional)

RANCHER_DISK: /dev/nvme1n1           # Optional dedicated RKE2 storage

@woojae-siloai woojae-siloai marked this pull request as ready for review April 27, 2026 12:42
Q-Dub previously approved these changes Apr 28, 2026

@Q-Dub Q-Dub left a comment

LGTM!

woojae-siloai commented Apr 29, 2026

@Q-Dub
I updated it a bit to cover the legacy node case, where a node was bloomed with the /var/lib/rancher mount and its fstab entry configured manually. Tested with medium and large sizes, including a mock legacy case, on pd3.

@woojae-siloai woojae-siloai requested a review from Q-Dub April 29, 2026 12:02

@brownzebra brownzebra left a comment

LGTM

@brownzebra brownzebra merged commit 5cd4c72 into main May 4, 2026
3 checks passed
@brownzebra brownzebra deleted the EAI_5684_rancher_disk branch May 4, 2026 07:51

3 participants