Move model provisioning policies out of the simulator #311
James-QiuHaoran wants to merge 9 commits into
Delete 17 shim files from simulator/ that re-exported from streamwise.model_provisioner.

Update simulator/__init__.py to add streamwise/ to sys.path so model_provisioner is importable.

Update imports in simulator/provisioning.py, multirequests.py, and plot_utils.py to use model_provisioner.* prefixed imports.

Update all 19 test files in tests/simulator/ to:
- Pass both 'simulator' and 'streamwise' to temp_sys_path
- Use model_provisioner.* prefixed imports for moved modules
- Fix patch.dict target in test_models.py
- Fix inline import in test_hexgen.py

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Move actions, auto_model_allocator, constants, data_loading, evaluator, model_allocator, models, sim_types, sim_types_json, utils, and workflows from streamwise/model_provisioner/ back to simulator/. Only 6 policy files remain in model_provisioner: greedy, helix, hexgen, milp, naive_baseline, and policies.

Import changes:
- Moved files use bare imports (from sim_types import ...) instead of relative imports (from .sim_types import ...)
- Policy files use bare imports for moved modules and keep relative imports for sibling policy modules
- simulator/ and streamwise/allocator_bridge.py updated accordingly
- All test files updated to match new import paths
- Added tests/simulator/conftest.py to set PYTHONPATH for child processes spawned by ProcessPoolExecutor

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
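For illustration, the PYTHONPATH step in the last bullet could look roughly like the sketch below. This is not the contents of the actual tests/simulator/conftest.py; the helper name `prepend_to_pythonpath` and the repository layout in the trailing comment are assumptions. It relies only on the fact that child processes inherit the parent's environment variables.

```python
import os
from pathlib import Path


def prepend_to_pythonpath(dirs, env=None):
    """Prepend each directory to PYTHONPATH in `env` (default: os.environ).

    Hypothetical helper: entries are de-duplicated, new directories come
    first, and anything already on PYTHONPATH is kept after them.
    """
    env = os.environ if env is None else env
    existing = [p for p in env.get("PYTHONPATH", "").split(os.pathsep) if p]
    merged = []
    for entry in [str(Path(d)) for d in dirs] + existing:
        if entry not in merged:
            merged.append(entry)
    env["PYTHONPATH"] = os.pathsep.join(merged)
    return env["PYTHONPATH"]


# In a conftest.py this would run at import time, for example
# (assumed layout: <repo>/tests/simulator/conftest.py):
# REPO_ROOT = Path(__file__).resolve().parents[2]
# prepend_to_pythonpath([REPO_ROOT / "simulator", REPO_ROOT / "streamwise"])
```

Passing an explicit `env` dictionary keeps the helper easy to test without mutating the real process environment.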
Pull request overview
This PR refactors GPU allocation policies (greedy, MILP, HexGen, Helix, naive) out of simulator/ into a new streamwise/model_provisioner/ package, and adds an "Auto Deploy" feature in the StreamWise dashboard that runs the allocator against a user-supplied GPU budget and deploys the resulting plan via pod_manager.
Changes:
- New `streamwise/model_provisioner/` package containing the policies and allocators (greedy/naive/MILP/HexGen/Helix); imports throughout simulator/tests updated to reference the new location.
- New `streamwise/allocator_bridge.py` that translates allocator `Result` objects into `DeploymentSpec`s for `pod_manager.add_pod`, plus three new endpoints in `streamwise.py` (`/api/auto_deploy`, `/api/auto_deploy/confirm`, `/api/auto_deploy/workflows`) and corresponding UI in `add_pod.html`.
- Tests added under `tests/streamwise/` for the bridge and endpoints; `sys.path` plumbing added in `simulator/__init__.py`, `simulator/provisioning.py`, and `tests/simulator/conftest.py` so the cross-package imports work in subprocess workers.
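As a rough illustration of the bridge idea in the second bullet, the sketch below expands allocation entries into one deployment spec per replica. The real `Result` and `DeploymentSpec` types are not shown in this PR page, so `Allocation`, the `DeploymentSpec` fields other than `container_name`, and the naming scheme are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Allocation:
    """Hypothetical stand-in for one entry of an allocator result."""
    model: str
    gpu_type: str
    gpus_per_replica: int
    replicas: int


@dataclass(frozen=True)
class DeploymentSpec:
    """Hypothetical spec; only `container_name` is confirmed by the PR."""
    container_name: str
    gpu_type: str
    num_gpus: int


def to_deployment_specs(allocations):
    """Expand each allocation into one spec per replica."""
    specs = []
    for alloc in allocations:
        for index in range(alloc.replicas):
            specs.append(
                DeploymentSpec(
                    # Assumed naming scheme: <model>-<replica index>.
                    container_name=f"{alloc.model}-{index}",
                    gpu_type=alloc.gpu_type,
                    num_gpus=alloc.gpus_per_replica,
                )
            )
    return specs


specs = to_deployment_specs([Allocation("tts", "A100", 2, 2)])
```

Keeping the translation a pure function, separate from the call that actually deploys pods, would let it be unit-tested without a cluster.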
Reviewed changes
Copilot reviewed 40 out of 42 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| streamwise/model_provisioner/__init__.py, policies.py, greedy.py, naive_baseline.py, milp.py, hexgen.py, helix.py | New package containing the relocated allocation policies/allocators with relative imports. |
| streamwise/allocator_bridge.py | New bridge mapping allocator Results to deployment specs and exposing run_allocator/get_available_* helpers. |
| streamwise/streamwise.py | Adds three auto-deploy API endpoints. |
| streamwise/templates/add_pod.html | Adds the Auto Deploy form, results table, and JS to call the new endpoints. |
| simulator/__init__.py, simulator/provisioning.py | Adds sys.path/PYTHONPATH plumbing so model_provisioner is importable in worker processes. |
| simulator/data_loading.py | Switches default data_dir to an absolute path computed from __file__ (path computation appears off by one). |
| simulator/auto_model_allocator.py, actions.py, model_allocator.py, multirequests.py | Updated import paths to model_provisioner.*. |
| tests/simulator/*.py | Adjusted temp_sys_path and import paths to the new package layout. |
| tests/simulator/conftest.py, tests/streamwise/conftest.py | New conftests adding paths so child processes/lazy imports resolve. |
| tests/streamwise/test_allocator_bridge.py, test_streamwise_auto_deploy.py | New tests for bridge logic and auto-deploy endpoints. |
| .gitignore | Ignore .venv/. |
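The off-by-one noted for simulator/data_loading.py is the kind of error that comes from counting parent directories relative to `__file__`. A minimal sketch of the effect, assuming a hypothetical `data` directory name and a made-up repository path (neither is taken from the PR):

```python
import os
from pathlib import Path


def default_data_dir(module_file, levels_up, name="data"):
    """Return <ancestor>/<name>, where <ancestor> is `levels_up` directories
    above the directory that contains `module_file`.

    Hypothetical helper; `name="data"` is an assumed directory name.
    """
    base = Path(os.path.abspath(module_file)).parent
    for _ in range(levels_up):
        base = base.parent
    return base / name


module = "/repo/simulator/data_loading.py"  # made-up path for illustration
next_to_module = default_data_dir(module, 0)
at_repo_root = default_data_dir(module, 1)
```

With `levels_up=0` the directory sits next to the module; with `levels_up=1` it sits at the repository root, so being off by one silently points the loader at a different directory.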
Co-authored-by: Copilot Autofix powered by AI <223894421+github-code-quality[bot]@users.noreply.github.com>
    });
    });
    }
    // Auto-Deploy
Maybe we should split the PR into two and have one just for the module and then one for the setting?
kdh0102 left a comment
Thanks for the PR. Most of the changes look good to me, but I'm currently having issues importing the model_provisioner package inside the simulator. In particular, I've been testing the two Jupyter notebooks (cost_estimator_*.ipynb), and both fail to import model_provisioner.
I'm leaving this review for now and will do another pass.
      from workflows import PODCAST_WORKFLOW

    - from policies import STREAMWISE_POLICY
    + from model_provisioner.policies import STREAMWISE_POLICY
model_provisioner cannot be imported when running inside cost_estimator_multirequests.ipynb.
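One possible cause, stated here as an assumption rather than a diagnosis: the sys.path hook added in simulator/__init__.py only runs when `simulator` is imported as a package, and a notebook that uses bare imports such as `from workflows import ...` may never trigger it. A minimal workaround sketch for the first notebook cell follows; the helper name `add_streamwise_to_path` and the repository root location are hypothetical.

```python
import sys
from pathlib import Path


def add_streamwise_to_path(repo_root):
    """Put <repo_root>/streamwise at the front of sys.path, once.

    Hypothetical helper mirroring what simulator/__init__.py does, for
    contexts where that package initializer is not executed.
    """
    streamwise_dir = str(Path(repo_root).resolve() / "streamwise")
    if streamwise_dir not in sys.path:
        sys.path.insert(0, streamwise_dir)
    return streamwise_dir


# In a notebook located in <repo>/simulator/ (assumed location):
# add_streamwise_to_path(Path.cwd().parent)
# from model_provisioner.policies import STREAMWISE_POLICY
```

Installing the packages in editable mode would avoid path manipulation altogether, but that is a larger change than this PR appears to make.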
Lint Results
Mypy Type Checking: ✅ No issues found
Diff Coverage
Diff: origin/main...HEAD, staged and unstaged changes
Summary
simulator/__init__.py
Lines 3-16
3 on top of the model_provisioner allocation policies.
4
5 The allocation policy implementations live in ``streamwise/model_provisioner/``.
6 """
! 7 import os
! 8 import sys
9
10 # Make model_provisioner importable for simulator modules.
! 11 _STREAMWISE_DIR = os.path.normpath(
12 os.path.join(os.path.dirname(os.path.abspath(__file__)), "..", "streamwise")
13 )
! 14 if _STREAMWISE_DIR not in sys.path:
! 15 sys.path.insert(0, _STREAMWISE_DIR)
streamwise/streamwise.py
Lines 759-769
759 return jsonify(allocator_bridge.deployment_plan_to_json(plan)), HTTPStatus.OK
760
761 except ValueError as ve:
762 return jsonify({"error": str(ve)}), HTTPStatus.BAD_REQUEST
! 763 except Exception as ex:
! 764 logging.exception("Error in auto_deploy: %s", ex)
! 765 return jsonify({"error": str(ex)}), HTTPStatus.INTERNAL_SERVER_ERROR
766
767
768 @route("/api/auto_deploy/confirm", methods=["POST"])
769 async def api_auto_deploy_confirm() -> QuartReturn:
Lines 801-810
801
802 for spec in specs:
803 container_name = spec.get("container_name")
804 if not container_name:
! 805 errors.append("Spec missing 'container_name'")
! 806 continue
807
808 try:
809 await pod_manager.add_pod(
810 container_name=container_name,
Lines 817-828
817 namespace=NAMESPACE,
818 k8s_cluster=k8s_cluster,
819 )
820 deployed.append(container_name)
! 821 except Exception as pod_ex:
! 822 msg = f"Failed to deploy '{container_name}': {pod_ex}"
! 823 logging.error(msg)
! 824 errors.append(msg)
825
826 status = HTTPStatus.OK if not errors else HTTPStatus.MULTI_STATUS
827 return jsonify({
828 "deployed": deployed,
Lines 829-839
829 "errors": errors,
830 "message": f"Deployed {len(deployed)}/{len(specs)} containers.",
831 }), status
832
! 833 except Exception as ex:
! 834 logging.exception("Error in auto_deploy/confirm: %s", ex)
! 835 return jsonify({"error": str(ex)}), HTTPStatus.INTERNAL_SERVER_ERROR
836
837
838 @route("/api/auto_deploy/workflows", methods=["GET"])
839 async def api_auto_deploy_workflows() -> QuartReturn:
wrapper/run_httpserver.py
Lines 1265-1274
1265 last_ping_time = time.time()
1266
1267 try:
1268 payload_bytes = await asyncio.to_thread(pickle.dumps, gen_task)
! 1269 payload_buffer = bytearray(payload_bytes)
! 1270 payload_tensor = torch.frombuffer(payload_buffer, dtype=torch.uint8).to("cuda")
1271 payload_size = torch.tensor([payload_tensor.numel()], dtype=torch.int64, device="cuda")
1272
1273 if payload_size.item() > MAX_PAYLOAD_BYTES:
1274 logging.error(f"[{rank}] Payload too large: {payload_size.item()} bytes.")
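Most of the uncovered lines in streamwise/streamwise.py above are failure branches of the confirm endpoint. The sketch below shows one way those branches might be exercised in isolation. `deploy_specs` is a simplified stand-in written for this example, not the endpoint itself, and `fake_add_pod` is a hypothetical replacement for `pod_manager.add_pod` that takes only `container_name`.

```python
import asyncio
from http import HTTPStatus


async def deploy_specs(specs, add_pod):
    """Simplified stand-in for the per-spec loop in the confirm endpoint."""
    deployed, errors = [], []
    for spec in specs:
        name = spec.get("container_name")
        if not name:
            errors.append("Spec missing 'container_name'")
            continue
        try:
            await add_pod(container_name=name)
            deployed.append(name)
        except Exception as pod_ex:
            errors.append(f"Failed to deploy '{name}': {pod_ex}")
    # Any error downgrades the response to 207, as in the reported code.
    status = HTTPStatus.OK if not errors else HTTPStatus.MULTI_STATUS
    return deployed, errors, status


async def fake_add_pod(container_name):
    """Hypothetical fake: fails for one specific container name."""
    if container_name == "bad":
        raise RuntimeError("no capacity")


deployed, errors, status = asyncio.run(
    deploy_specs([{"container_name": "ok"}, {}, {"container_name": "bad"}], fake_add_pod)
)
```

A single mixed batch like this reaches both the missing-name branch and the per-pod exception branch, which correspond to report lines 805-806 and 821-824.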
Superseded by two focused PRs:
This PR refactors the code to decouple the model provisioning policies (e.g., greedy, MILP) from the simulator. In addition, the StreamWise dashboard now lets the user auto-deploy a workflow with a resource-optimized allocation plan.