__________________ _____
| ___| _ \ ___ \/ ___|
| |_ | | | | |_/ /\ `--.
| _| | | | | / `--. \
| | | |/ /| |\ \ /\__/ /
\_| |___/ \_| \_|\____/
F D R S - Fully Dynamic Resource Scheduler
FDRS (Fully Dynamic Resource Scheduler) is a Python-based tool designed to automate and optimize resource management in VMware (by Broadcom) vSphere environments. It provides intelligent auto-balancing of cluster workloads and enforcement of VM anti-affinity rules to enhance performance and stability.
- Cluster Auto Balancing:
- Dynamically balances clusters based on CPU, Memory, Network I/O, and Disk I/O metrics.
- The goal is to keep the percentage point difference between the most and least utilized hosts (for each selected metric) within defined limits.
- These limits are controlled by the
--aggressivenessflag:- Level 5: Max 5% difference
- Level 4: Max 10% difference
- Level 3 (Default): Max 15% difference
- Level 2: Max 20% difference
- Level 1: Max 25% difference
- Smart Anti-Affinity Rules:
- Automatically distributes VMs based on their names to improve resilience and performance.
- VMs with the same name prefix (e.g., "webserver" derived from "webserver01", "webserver02" or "webserver123") are considered part of an anti-affinity group.
- The rule ensures that the number of these sibling VMs on any single host does not differ by more than 1 from the count on any other host in the cluster.
- Iterative Planning Mode (
--iterativeflag):- Guarantees convergence to optimal or near-optimal state by repeating the planning cycle.
- Achieves 99%+ convergence for both anti-affinity satisfaction AND resource balance simultaneously.
- Useful for complex clusters where single-pass planning may leave unresolved issues.
- Adaptive thresholds on iteration 2+ prevent deadlocks and ensure progress.
- Cron Support: The CLI tool can be scheduled with cron jobs for automated execution.
- Support for VMware vSphere: Designed to work seamlessly with VMware vSphere Standard licensed clusters (DRS not required for FDRS functionality).
- Dry-Run Mode: Allows users to preview planned migrations without executing them, ensuring safety and control.
- Max Migration: Allow to control migration count per run by the
--max-migrationsflag (Default: 20 )
- Workflow Priority: FDRS processes rules in a specific order.
- Anti-Affinity First: It first evaluates and plans migrations to satisfy anti-affinity rules.
- Resource Balancing: After anti-affinity considerations, it evaluates the cluster for resource imbalances (CPU, Memory, Disk I/O, Network Throughput) and plans further migrations if necessary.
- Single-Pass vs. Iterative Planning:
- Single-Pass (Default): Executes one complete cycle of anti-affinity + balancing. Fast but may achieve 50-80% convergence in complex scenarios.
- Iterative Planning (
--iterativeflag): Repeats the planning cycle until convergence (AA violations = 0 AND cluster balanced). Achieves 99%+ convergence. Adds 2-3x execution time but guarantees optimal results. - Convergence Check: After each iteration, FDRS verifies both objectives are satisfied and exits early if converged.
- VM Grouping (Anti-Affinity): For anti-affinity, VMs are grouped based on their name prefix. The prefix is determined by removing the last numerical characters of the VM name.
- Balancing Mechanism (Resource Balancing): Resource balancing aims to ensure that for any given metric (CPU, Memory, Disk I/O, Network Throughput), the difference in utilization percentage between the most loaded host and the least loaded host does not exceed the threshold defined by the chosen aggressiveness level.
- Clone the repository:
git clone https://github.com/fsolen/vsphere_scheduler.git cd vsphere_scheduler - Ensure you have Python installed (Python 3.7+ is recommended).
- Install the required dependencies:
pip install -r requirements.txt
Note: All examples require vCenter connection arguments: --vcenter <vc_ip_or_hostname> --username <user>. The --password argument is optional; if not provided, FDRS will prompt you to enter the password securely (recommended for security).
You can optionally specify a --cluster <cluster_name> to target a specific cluster within vCenter. If omitted, all clusters are processed.
Runs the full FDRS workflow: first applies anti-affinity rules, then performs resource balancing using default aggressiveness (Level 3) for all metrics.
Option 1: With password provided
python fdrs.py --vcenter <vc_ip_or_hostname> --username <user> --password <pass>Option 2: Without password (will prompt securely)
python fdrs.py --vcenter <vc_ip_or_hostname> --username <user>To run FDRS against only one cluster within vCenter, use the --cluster flag. This is useful in multi-cluster environments where you want to balance resources within a specific cluster only.
python fdrs.py --vcenter <vc_ip_or_hostname> --username <user> --cluster <cluster_name>Example with all options:
python fdrs.py --vcenter <vc_ip_or_hostname> --username admin --cluster prod-cluster-01 --balance --aggressiveness 4Focuses only on balancing specified metrics (CPU and Memory in this example) with a specific aggressiveness level (Level 4: Max 10% difference). Anti-affinity rules are not specifically enforced in this mode beyond what MigrationManager might consider for placement safety if it were extended to do so.
python fdrs.py --vcenter <vc_ip_or_hostname> --username <user> --password <pass> --balance --metrics cpu,memory --aggressiveness 4 --max-migrations 50This command only evaluates and enforces anti-affinity rules, making any necessary migrations to satisfy them. Resource balancing is not performed.
python fdrs.py --vcenter <vc_ip_or_hostname> --username <user> --password <pass> --apply-anti-affinityThis command only evaluates and enforces anti-affinity rules, making any necessary migrations to satisfy them. Resource balancing is not performed.
python fdrs.py --vcenter <vc_ip_or_hostname> --username <user> --password <pass> --ignore-anti-affinityThe --iterative flag enables iterative planning mode, which guarantees both anti-affinity satisfaction AND resource balance by repeating the planning cycle until convergence or maximum iterations reached.
When to use Iterative Mode:
- Production optimization requiring guaranteed convergence
- Complex anti-affinity rules with multiple VM groups
- Resource-constrained clusters where single-pass optimization is insufficient
- When both AA satisfaction and balance quality are critical
Convergence Guarantee:
- Single-pass mode: ~50-80% convergence
- Iterative mode: 99%+ convergence (both objectives satisfied simultaneously)
Typical behavior:
- Iteration 1: Fixes most anti-affinity violations and initial balance (~70-80% improvement)
- Iteration 2: Re-balances after AA migrations, handles edge cases (~90-95% improvement)
- Iteration 3: Final convergence pass (~99%+ optimization)
- Early exit: If converged before reaching max iterations
python fdrs.py --vcenter <vc_ip_or_hostname> --username <user> --password <pass> --balance --iterativeFor very complex scenarios or resource-constrained clusters, you can increase the maximum iterations:
python fdrs.py --vcenter <vc_ip_or_hostname> --username <user> --password <pass> --balance --iterative --max-iterations 4Apply anti-affinity rules iteratively until all violations are resolved:
python fdrs.py --vcenter <vc_ip_or_hostname> --username <user> --password <pass> --apply-anti-affinity --iterativeCombine iterative mode with specific aggressiveness level:
python fdrs.py --vcenter <vc_ip_or_hostname> --username <user> --password <pass> --balance --iterative --aggressiveness 4 --metrics cpu,memoryApply iterative balancing to a specific cluster only:
python fdrs.py --vcenter <vc_ip_or_hostname> --username <user> --cluster <cluster_name> --balance --iterativeConfiguration: Iterative mode behavior can be customized in config/fdrs_config.yaml:
iterative:
max_iterations: 3 # Default maximum iterations
threshold_multiplier: 1.05 # Loosens constraints on iteration 2+ to prevent deadlock
convergence_timeout_seconds: 300See Also: ITERATIVE_CONVERGENCE_GUARANTEES.md for mathematical proof of convergence and detailed scenario analysis.
To see what migrations FDRS would perform without actually making any changes, add the --dry-run flag to any command:
python fdrs.py --vcenter <vc_ip_or_hostname> --username <user> --password <pass> --dry-runSchedule the default FDRS workflow (anti-affinity + balancing) to run daily at midnight and log output. Adjust paths as necessary.
0 0 * * * /usr/bin/python /path/to/your/cloned/repo/vmware_cluster/fdrs.py --vcenter <vc_ip_or_hostname> --username <user> --password <pass> >> /var/log/fdrs.log 2>&1For a full list of options and detailed explanations:
python fdrs.py --help-
Interactive Password Prompt: If you don't provide the
--passwordargument, FDRS will prompt you to enter your vCenter password securely. This is recommended for security since it avoids storing passwords in shell history or scripts. -
Providing Password via Argument: You can provide the password directly using
--password <pass>, which is useful for automated scripts and cron jobs. However, be aware that the password will be visible in process logs and shell history. -
Environment Variables (Recommended for Automation): For automated/scheduled executions, consider using environment variables or credential management tools rather than hardcoding passwords in scripts.
-
Cron Jobs: When scheduling with cron, use password prompt or secure credential storage rather than plaintext passwords in the crontab.
- FDSS (Fully Dynamic Storage Scheduler) Potential future development for storage-specific dynamic scheduling.:
- VMFS datastore anti-affinity group logic with naming pattern
- VMFS IO performance balancing
- Enhanced vSphere API Integration: Exploring deeper integration with VMware APIs for more advanced features and metrics.:
- Cluster name switch - DONE
- Ignore anti-affinity switch - DONE (--ignore-anti-affinity)
- Select best host and best datastore switch awareness with anti-affinity and performance
- Password input optimization - DONE