Move model provisioning policies out of the simulator#311

Closed
James-QiuHaoran wants to merge 9 commits into main from hqiu/model-allocator

@James-QiuHaoran
Collaborator

This PR refactors the code to decouple the model provisioning policies (e.g., greedy, MILP) from the simulator. In addition, the StreamWise dashboard now lets the user auto-deploy a workflow with a resource-optimized allocation plan.

Haoran Qiu and others added 3 commits May 15, 2026 13:11
Delete 17 shim files from simulator/ that re-exported from
streamwise.model_provisioner. Update simulator/__init__.py to add
streamwise/ to sys.path so model_provisioner is importable.

Update imports in simulator/provisioning.py, multirequests.py, and
plot_utils.py to use model_provisioner.* prefixed imports.

Update all 19 test files in tests/simulator/ to:
- Pass both 'simulator' and 'streamwise' to temp_sys_path
- Use model_provisioner.* prefixed imports for moved modules
- Fix patch.dict target in test_models.py
- Fix inline import in test_hexgen.py

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Move actions, auto_model_allocator, constants, data_loading, evaluator,
model_allocator, models, sim_types, sim_types_json, utils, and workflows
from streamwise/model_provisioner/ back to simulator/.

Only 6 policy files remain in model_provisioner: greedy, helix, hexgen,
milp, naive_baseline, and policies.

Import changes:
- Moved files use bare imports (from sim_types import ...) instead of
  relative imports (from .sim_types import ...)
- Policy files use bare imports for moved modules and keep relative
  imports for sibling policy modules
- simulator/ and streamwise/allocator_bridge.py updated accordingly
- All test files updated to match new import paths
- Added tests/simulator/conftest.py to set PYTHONPATH for child processes
  spawned by ProcessPoolExecutor

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
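
The PYTHONPATH point in the commit message above can be illustrated with a minimal sketch. It is not the conftest.py from this PR, which is not shown in this conversation; the helper name `export_pythonpath` and the directory layout in the comments are assumptions. The sketch relies only on the fact that child processes inherit environment variables from their parent.

```python
import os
import subprocess
import sys
from pathlib import Path


def export_pythonpath(*dirs: str) -> str:
    """Prepend `dirs` to PYTHONPATH, keeping their order and avoiding duplicates.

    Hypothetical helper: returns the new PYTHONPATH value after exporting it.
    """
    current = [p for p in os.environ.get("PYTHONPATH", "").split(os.pathsep) if p]
    for d in reversed(dirs):
        resolved = str(Path(d).resolve())
        if resolved in current:
            current.remove(resolved)  # move an existing entry to the front
        current.insert(0, resolved)
    value = os.pathsep.join(current)
    os.environ["PYTHONPATH"] = value
    return value


# Possible use from a conftest.py (assumed layout: <repo>/tests/simulator/conftest.py):
#   REPO_ROOT = Path(__file__).resolve().parents[2]
#   export_pythonpath(str(REPO_ROOT / "simulator"), str(REPO_ROOT / "streamwise"))

if __name__ == "__main__":
    export_pythonpath(".")
    # A child interpreter started afterwards sees the exported directory on sys.path.
    child = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.path)"],
        capture_output=True,
        text=True,
        check=True,
    )
    print(str(Path(".").resolve()) in child.stdout)
```

Exporting the variable once at collection time keeps the test files themselves free of path handling, at the cost of mutating the environment of the whole test session.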
@James-QiuHaoran James-QiuHaoran requested a review from Copilot May 15, 2026 23:01
@James-QiuHaoran James-QiuHaoran self-assigned this May 15, 2026
Comment thread tests/streamwise/test_streamwise_auto_deploy.py Fixed

Copilot AI left a comment

Pull request overview

This PR refactors GPU allocation policies (greedy, MILP, HexGen, Helix, naive) out of simulator/ into a new streamwise/model_provisioner/ package, and adds an "Auto Deploy" feature in the StreamWise dashboard that runs the allocator against a user-supplied GPU budget and deploys the resulting plan via pod_manager.

Changes:

  • New streamwise/model_provisioner/ package containing the policies and allocators (greedy/naive/MILP/HexGen/Helix); imports throughout simulator/tests updated to reference the new location.
  • New streamwise/allocator_bridge.py that translates allocator Result objects into DeploymentSpecs for pod_manager.add_pod, plus three new endpoints in streamwise.py (/api/auto_deploy, /api/auto_deploy/confirm, /api/auto_deploy/workflows) and corresponding UI in add_pod.html.
  • Tests added under tests/streamwise/ for the bridge and endpoints; sys.path plumbing added in simulator/__init__.py, simulator/provisioning.py, and tests/simulator/conftest.py so the cross-package imports work in subprocess workers.
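
The bridge described in the second bullet can be pictured with a minimal sketch. Everything in it is hypothetical: `AllocationResult`, `DeploymentSpec`, their fields, and `to_deployment_specs` are stand-ins invented for illustration, because streamwise/allocator_bridge.py is not shown in this conversation. The only details taken from the PR are that each spec carries a `container_name` and that the endpoint reports a ValueError as a 400 response.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class AllocationResult:
    """Hypothetical stand-in for one entry of the allocator's output."""
    model: str
    gpu_type: str
    num_gpus: int   # GPUs per replica
    replicas: int


@dataclass(frozen=True)
class DeploymentSpec:
    """Hypothetical spec; only `container_name` is known from the PR."""
    container_name: str
    gpu_type: str
    num_gpus: int


def to_deployment_specs(results, gpu_budget):
    """Expand each result into one spec per replica, enforcing a total GPU budget."""
    specs = []
    used = 0
    for result in results:
        for index in range(result.replicas):
            used += result.num_gpus
            if used > gpu_budget:
                # A ValueError lets an HTTP layer answer with a client error.
                raise ValueError(f"Plan needs more than {gpu_budget} GPUs")
            specs.append(
                DeploymentSpec(f"{result.model}-{index}", result.gpu_type, result.num_gpus)
            )
    return specs


if __name__ == "__main__":
    plan = [AllocationResult("kokoro", "A100", 1, 2), AllocationResult("flux", "H100", 2, 1)]
    for spec in to_deployment_specs(plan, gpu_budget=4):
        print(asdict(spec))  # plain dicts are easy to return as JSON
```

Treating the budget as a single total is a simplification; a real allocator may track budgets per GPU type.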

Reviewed changes

Copilot reviewed 40 out of 42 changed files in this pull request and generated 1 comment.

Show a summary per file
| File | Description |
| --- | --- |
| streamwise/model_provisioner/__init__.py, policies.py, greedy.py, naive_baseline.py, milp.py, hexgen.py, helix.py | New package containing the relocated allocation policies/allocators with relative imports. |
| streamwise/allocator_bridge.py | New bridge mapping allocator Results to deployment specs and exposing run_allocator/get_available_* helpers. |
| streamwise/streamwise.py | Adds three auto-deploy API endpoints. |
| streamwise/templates/add_pod.html | Adds the Auto Deploy form, results table, and JS to call the new endpoints. |
| simulator/__init__.py, simulator/provisioning.py | Adds sys.path/PYTHONPATH plumbing so model_provisioner is importable in worker processes. |
| simulator/data_loading.py | Switches default data_dir to an absolute path computed from __file__ (path computation appears off by one). |
| simulator/auto_model_allocator.py, actions.py, model_allocator.py, multirequests.py | Updated import paths to model_provisioner.*. |
| tests/simulator/*.py | Adjusted temp_sys_path and import paths to the new package layout. |
| tests/simulator/conftest.py, tests/streamwise/conftest.py | New conftests adding paths so child processes/lazy imports resolve. |
| tests/streamwise/test_allocator_bridge.py, test_streamwise_auto_deploy.py | New tests for bridge logic and auto-deploy endpoints. |
| .gitignore | Ignore .venv/. |
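
The "off by one" remark on simulator/data_loading.py refers to how many directory levels are climbed from `__file__`. The sketch below shows how one extra or missing parent hop changes the result. The helper name `data_dir_for`, the `/repo` layout, and the `data` directory name are assumptions; the summary does not say which direction the actual error goes.

```python
from pathlib import PurePosixPath


def data_dir_for(module_file: str, levels_up: int, name: str = "data") -> PurePosixPath:
    """Return <ancestor>/<name>, where <ancestor> is `levels_up` directories
    above the directory that contains `module_file`."""
    base = PurePosixPath(module_file).parent
    for _ in range(levels_up):
        base = base.parent
    return base / name


if __name__ == "__main__":
    module = "/repo/simulator/data_loading.py"  # hypothetical location
    print(data_dir_for(module, 0))  # /repo/simulator/data
    print(data_dir_for(module, 1))  # /repo/data
```

A unit test that asserts the computed directory exists is one way to catch this kind of mistake early.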

Comment thread simulator/data_loading.py Outdated
Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>
Comment thread tests/streamwise/test_streamwise_auto_deploy.py Fixed
Comment thread simulator/provisioning.py Outdated
Comment thread simulator/data_loading.py Outdated
});
});
}
// Auto-Deploy
Collaborator

Maybe we should split the PR into two and have one just for the module and then one for the setting?

Comment thread tests/streamwise/test_streamwise_auto_deploy.py Dismissed
@github-actions

github-actions Bot commented May 15, 2026

Test Results

1 169 tests (+25): 1 169 ✅ (+25), 0 💤 (±0), 0 ❌ (±0)
13 suites (±0), 13 files (±0)
Duration: 14m 56s ⏱️ (-2s)

Results for commit a0636f8. ± Comparison against base commit c20eb25.

♻️ This comment has been updated with latest results.

@kdh0102 kdh0102 left a comment

Thanks for the PR. Most of the changes look good to me, but I'm currently having issues importing the model_provisioner package inside the simulator. In particular, I've been testing the two Jupyter notebooks (cost_estimator_*.ipynb), and they fail to import model_provisioner properly.
I'm leaving this review first and will do another pass.

  from workflows import PODCAST_WORKFLOW

- from policies import STREAMWISE_POLICY
+ from model_provisioner.policies import STREAMWISE_POLICY

model_provisioner cannot be imported when running inside cost_estimator_multirequests.ipynb.
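
One possible cause, not confirmed in this thread, is that the notebook imports simulator modules directly (as in `from workflows import ...`), so simulator/__init__.py never runs and streamwise/ is never added to sys.path. Under that assumption, a notebook cell could locate the directory itself. The helper name `add_streamwise_to_path` is hypothetical, and this sketch is not the fix adopted in the PR.

```python
import sys
from pathlib import Path


def add_streamwise_to_path(start: Path) -> Path:
    """Walk upward from `start` until a directory containing
    `streamwise/model_provisioner` is found, then put `streamwise/`
    at the front of sys.path and return it."""
    start = start.resolve()
    for candidate in (start, *start.parents):
        pkg_parent = candidate / "streamwise"
        if (pkg_parent / "model_provisioner").is_dir():
            entry = str(pkg_parent)
            if entry not in sys.path:
                sys.path.insert(0, entry)
            return pkg_parent
    raise FileNotFoundError(f"No streamwise/model_provisioner found above {start}")


# Possible use in the first notebook cell:
#   add_streamwise_to_path(Path.cwd())
#   from model_provisioner.policies import STREAMWISE_POLICY
```

Searching upward from the working directory avoids hard-coding how deep the notebook sits in the repository.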

Comment thread streamwise/allocator_bridge.py Dismissed
@github-actions

Lint Results

Check Status
Python
Shell
YAML
JSON
Markdown
Bicep

@github-actions

Mypy Type Checking

✅ No issues found

| Metric | Count |
| --- | --- |
| 📂 Files | 299 |
| ❌ Errors | 0 |
| ⚠️ Warnings | 0 |
| 📝 Notes | 0 |
Full mypy output
Success: no issues found in 299 source files

@github-actions

Diff Coverage

Diff: origin/main...HEAD, staged and unstaged changes

  • simulator/__init__.py (0.0%): Missing lines 7-8,11,14-15
  • simulator/actions.py (100%)
  • simulator/auto_model_allocator.py (100%)
  • simulator/data_loading.py (100%)
  • simulator/model_allocator.py (100%)
  • simulator/multirequests.py (100%)
  • simulator/provisioning.py (100%)
  • streamwise/allocator_bridge.py (100%)
  • streamwise/model_provisioner/__init__.py (100%)
  • streamwise/model_provisioner/greedy.py (100%)
  • streamwise/model_provisioner/helix.py (100%)
  • streamwise/model_provisioner/hexgen.py (100%)
  • streamwise/model_provisioner/milp.py (100%)
  • streamwise/model_provisioner/naive_baseline.py (100%)
  • streamwise/streamwise.py (76.5%): Missing lines 763-765,805-806,821-824,833-835
  • tests/simulator/test_auto_model_allocator.py (100%)
  • tests/simulator/test_data_loading.py (100%)
  • tests/simulator/test_evaluator.py (100%)
  • tests/simulator/test_greedy.py (100%)
  • tests/simulator/test_helix.py (100%)
  • tests/simulator/test_hexgen.py (100%)
  • tests/simulator/test_milp.py (100%)
  • tests/simulator/test_models.py (100%)
  • tests/simulator/test_multirequests_derive.py (100%)
  • tests/simulator/test_simulator.py (100%)
  • tests/simulator/test_simulator_actions.py (100%)
  • tests/simulator/test_simulator_baseline.py (100%)
  • tests/simulator/test_simulator_energy.py (100%)
  • tests/simulator/test_simulator_multirequests.py (100%)
  • tests/simulator/test_simulator_plotutils.py (100%)
  • tests/simulator/test_simulator_policies.py (100%)
  • tests/simulator/test_simulator_provisioning.py (100%)
  • tests/simulator/test_simulator_types.py (100%)
  • tests/simulator/test_simulator_utils.py (100%)
  • tests/simulator/test_workflows.py (100%)
  • tests/streamwise/test_allocator_bridge.py (100%)
  • tests/streamwise/test_streamwise_auto_deploy.py (100%)
  • wrapper/run_httpserver.py (0.0%): Missing lines 1269-1270

Summary

  • Total: 418 lines
  • Missing: 19 lines
  • Coverage: 95%

simulator/__init__.py

Lines 3-16

   3 on top of the model_provisioner allocation policies.
   4 
   5 The allocation policy implementations live in ``streamwise/model_provisioner/``.
   6 """
!  7 import os
!  8 import sys
   9 
  10 # Make model_provisioner importable for simulator modules.
! 11 _STREAMWISE_DIR = os.path.normpath(
  12     os.path.join(os.path.dirname(os.path.abspath(__file__)), "..", "streamwise")
  13 )
! 14 if _STREAMWISE_DIR not in sys.path:
! 15     sys.path.insert(0, _STREAMWISE_DIR)
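
The uncovered lines above are module-level statements, so they only count as covered if the file is executed while coverage is being measured. One way a test could do that, sketched here under the assumption that executing the file directly is acceptable, is to load it with `importlib` and restore sys.path afterwards. The helper names are hypothetical.

```python
import importlib.util
import sys
from pathlib import Path


def exec_file_as_module(name: str, path: Path):
    """Execute the Python file at `path` as a fresh module object named `name`."""
    spec = importlib.util.spec_from_file_location(name, path)
    if spec is None or spec.loader is None:
        raise ImportError(f"Cannot load {path}")
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)  # runs the module body, including sys.path edits
    return module


def run_with_restored_sys_path(name: str, path: Path) -> list[str]:
    """Execute the file, return the sys.path entries it added, then restore sys.path."""
    before = list(sys.path)
    try:
        exec_file_as_module(name, path)
        return [p for p in sys.path if p not in before]
    finally:
        sys.path[:] = before  # keep the test from leaking path changes
```

A test could then assert that the returned list holds exactly the normalized streamwise/ directory.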

streamwise/streamwise.py

Lines 759-769

  759         return jsonify(allocator_bridge.deployment_plan_to_json(plan)), HTTPStatus.OK
  760 
  761     except ValueError as ve:
  762         return jsonify({"error": str(ve)}), HTTPStatus.BAD_REQUEST
! 763     except Exception as ex:
! 764         logging.exception("Error in auto_deploy: %s", ex)
! 765         return jsonify({"error": str(ex)}), HTTPStatus.INTERNAL_SERVER_ERROR
  766 
  767 
  768 @route("/api/auto_deploy/confirm", methods=["POST"])
  769 async def api_auto_deploy_confirm() -> QuartReturn:

Lines 801-810

  801 
  802         for spec in specs:
  803             container_name = spec.get("container_name")
  804             if not container_name:
! 805                 errors.append("Spec missing 'container_name'")
! 806                 continue
  807 
  808             try:
  809                 await pod_manager.add_pod(
  810                     container_name=container_name,

Lines 817-828

  817                     namespace=NAMESPACE,
  818                     k8s_cluster=k8s_cluster,
  819                 )
  820                 deployed.append(container_name)
! 821             except Exception as pod_ex:
! 822                 msg = f"Failed to deploy '{container_name}': {pod_ex}"
! 823                 logging.error(msg)
! 824                 errors.append(msg)
  825 
  826         status = HTTPStatus.OK if not errors else HTTPStatus.MULTI_STATUS
  827         return jsonify({
  828             "deployed": deployed,

Lines 829-839

  829             "errors": errors,
  830             "message": f"Deployed {len(deployed)}/{len(specs)} containers.",
  831         }), status
  832 
! 833     except Exception as ex:
! 834         logging.exception("Error in auto_deploy/confirm: %s", ex)
! 835         return jsonify({"error": str(ex)}), HTTPStatus.INTERNAL_SERVER_ERROR
  836 
  837 
  838 @route("/api/auto_deploy/workflows", methods=["GET"])
  839 async def api_auto_deploy_workflows() -> QuartReturn:
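
The uncovered branches in the confirm endpoint (a spec without `container_name`, and a failing `add_pod` call) follow a partial-success pattern: collect per-item errors and answer 207 Multi-Status if any occurred. The sketch below isolates that pattern from the web framework so it can be exercised with a fake deployer. `deploy_all` and its `add_pod` parameter are hypothetical names, not the PR's code.

```python
import asyncio
from http import HTTPStatus


async def deploy_all(specs, add_pod):
    """Deploy each spec, collecting failures instead of aborting on the first one.

    `add_pod` is any coroutine function accepting `container_name=...`.
    Returns a (body, status) pair.
    """
    deployed, errors = [], []
    for spec in specs:
        container_name = spec.get("container_name")
        if not container_name:
            errors.append("Spec missing 'container_name'")
            continue
        try:
            await add_pod(container_name=container_name)
            deployed.append(container_name)
        except Exception as pod_ex:  # one failed pod must not stop the others
            errors.append(f"Failed to deploy '{container_name}': {pod_ex}")
    status = HTTPStatus.OK if not errors else HTTPStatus.MULTI_STATUS
    body = {
        "deployed": deployed,
        "errors": errors,
        "message": f"Deployed {len(deployed)}/{len(specs)} containers.",
    }
    return body, status


if __name__ == "__main__":
    async def fake_add_pod(container_name):
        if container_name == "bad":
            raise RuntimeError("boom")

    print(asyncio.run(deploy_all([{"container_name": "ok"}, {}, {"container_name": "bad"}], fake_add_pod)))
```

Passing the deployer in as a parameter is a design choice made here for testability; a test only needs a fake coroutine that raises for selected names to reach both error branches.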

wrapper/run_httpserver.py

Lines 1265-1274

  1265     last_ping_time = time.time()
  1266 
  1267     try:
  1268         payload_bytes = await asyncio.to_thread(pickle.dumps, gen_task)
! 1269         payload_buffer = bytearray(payload_bytes)
! 1270         payload_tensor = torch.frombuffer(payload_buffer, dtype=torch.uint8).to("cuda")
  1271         payload_size = torch.tensor([payload_tensor.numel()], dtype=torch.int64, device="cuda")
  1272 
  1273         if payload_size.item() > MAX_PAYLOAD_BYTES:
  1274             logging.error(f"[{rank}] Payload too large: {payload_size.item()} bytes.")

@github-actions

Code Coverage

Package Line Rate Complexity Health
. 80% 0
apps 67% 0
apps.streamanimate 89% 0
apps.streamcast 91% 0
apps.streamchat 88% 0
apps.streamdub 64% 0
apps.streamedit 79% 0
apps.streamlecture 94% 0
apps.streammovie 89% 0
apps.streampersona 64% 0
apps.streamshort 71% 0
simulator 90% 0
streamwise 71% 0
streamwise.model_provisioner 83% 0
tests 99% 0
tests.simulator 100% 0
tests.streamwise 100% 0
tests.streamwise_app 99% 0
wrapper 67% 0
wrapper.4kagent 94% 0
wrapper.bagel 37% 0
wrapper.cogview 45% 0
wrapper.fantasytalking 55% 0
wrapper.flux 74% 0
wrapper.flux2 100% 0
wrapper.flux2klein 99% 0
wrapper.fluxkontext 100% 0
wrapper.fluxkrea 99% 0
wrapper.fluxupscaler 68% 0
wrapper.hidream 88% 0
wrapper.hunyuanavatar 74% 0
wrapper.hunyuanframepack 52% 0
wrapper.hunyuanframepackf1 59% 0
wrapper.hunyuanframepackvae 63% 0
wrapper.hunyuanimage 84% 0
wrapper.imageresize 100% 0
wrapper.januspro 91% 0
wrapper.kokoro 87% 0
wrapper.llamagen 65% 0
wrapper.mock 86% 0
wrapper.podcasttranscript 45% 0
wrapper.qwenimage 89% 0
wrapper.qwenimageedit 88% 0
wrapper.realesrgan 77% 0
wrapper.slidetranscript 59% 0
wrapper.vibevoice 31% 0
wrapper.vibevoice.schedule 50% 0
wrapper.wan 34% 0
wrapper.wan22 75% 0
wrapper.xtts 75% 0
wrapper.yolo 69% 0
Summary 79% (27075 / 34464) 0

@James-QiuHaoran
Collaborator Author

Superseded by two focused PRs:

4 participants