Add transfer control-CFG (cfg for control inputs) + smoke-test coverage #68
Merged
Port the control-input CFG feature from i4 commit f11349b into the transfer inference path, reconciling with logic already synced into this repo:

- omni_mot_model.py already carries the velocity_postprocess_builder hook, so no model change was needed.
- transfer.py: add _build_no_control_inference_state and build_control_cfg_postprocess, and wire them through generate_samples_from_batch via velocity_postprocess_builder. Previously transfer.py passed control_guidance/control_guidance_interval directly, which were silently dropped by **kwargs (control-CFG was a no-op).
- args.py: add emphasize_control_in_prompt (TransferDataArgs/Overrides + _TRANSFER_SAMPLE_DEFAULTS) to match the ported prompt-emphasis logic.

Test: extend tests/nano_inference_smoke_test.py to cover transfer inference. The existing throughput run keeps t2vs + policy + forward_dynamics; transfer is run as a SEPARATE latency-preset call. Control-CFG runs an extra control-dropped forward each step, which under throughput (data-parallel over samples, FSDP-sharded) executes on only the transfer rank and deadlocks the cross-rank allgather -- so transfer must use the latency preset (context/CFG parallel, all ranks on one sample), matching the cookbook multi-GPU transfer recipe.

The spec is built inline (_TRANSFER_SPEC, written to a temp file) and pulls the control video from the public NVIDIA/cosmos GitHub raw URL (same file the cookbook edge.json uses), downscaled for a fast smoke run. Validates transfer-specific attributes (edge control_path, control_guidance>1, guidance>1) and a non-degenerate output clip via the new _assert_video_has_content helper.

Verified on a GB200 node: the README Nano edge transfer and the inline smoke spec both generate valid, non-degenerate video; the latency-preset transfer (4 ranks: cfgp=2, cp=2) completes and passes the test assertions, while the throughput+mixed path reproduces the deadlock.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
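For readers unfamiliar with the idea, a velocity postprocess for control-CFG might look roughly like the sketch below. This is not the code in transfer.py: the function name, the callback signature, the interval semantics, and the use of the standard classifier-free-guidance extrapolation are all assumptions made for illustration. It only shows why each step costs one extra control-dropped forward.

```python
from typing import Callable, List, Sequence, Tuple

Velocity = Sequence[float]


def build_control_cfg_postprocess_sketch(
    predict_without_control: Callable[[float], Velocity],  # hypothetical callback
    control_guidance: float,
    control_guidance_interval: Tuple[float, float] = (0.0, 1.0),  # assumed semantics
) -> Callable[[Velocity, float], List[float]]:
    """Return a postprocess(v_with_control, t) closure (illustrative only)."""
    lo, hi = control_guidance_interval

    def postprocess(v_with_control: Velocity, t: float) -> List[float]:
        # Outside the interval, or with a scale of 1.0, guidance is a no-op,
        # so the extra forward pass is skipped.
        if control_guidance == 1.0 or not (lo <= t <= hi):
            return list(v_with_control)
        # The extra forward pass with the control input dropped.
        v_no_control = predict_without_control(t)
        # Standard CFG-style extrapolation away from the control-dropped branch
        # (assumed; the real combination rule is not shown in this PR text).
        return [
            n + control_guidance * (c - n)
            for c, n in zip(v_with_control, v_no_control)
        ]

    return postprocess


# Usage: guidance of 1.5, active only for t in [0.2, 0.8].
post = build_control_cfg_postprocess_sketch(lambda t: [0.0, 2.0], 1.5, (0.2, 0.8))
inside = post([1.0, 2.0], 0.5)
outside = post([1.0, 2.0], 0.9)
```

Because the closure calls the control-dropped predictor inside the sampling loop, every rank taking part in a collective has to reach that call, which is consistent with the deadlock described above when only one rank runs transfer.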
682e0a5 to
7771d1d
yy-code-nv
approved these changes
Jun 29, 2026
Dinghow
approved these changes
Jun 29, 2026
Summary
Ports the control-input CFG feature (from i4 commit `f11349b`) into the transfer inference path, reconciling with logic already synced into this repo, and adds CD smoke-test coverage for transfer inference.

- `omni_mot_model.py` already carries the `velocity_postprocess_builder` hook; no model change needed.
- `transfer.py`: add `_build_no_control_inference_state` and `build_control_cfg_postprocess`, wired through `generate_samples_from_batch` via `velocity_postprocess_builder`. Previously `transfer.py` passed `control_guidance`/`control_guidance_interval` directly, where they were silently dropped by `**kwargs` (control-CFG was a no-op).
- `args.py`: add `emphasize_control_in_prompt` (`TransferDataArgs`/`Overrides` + `_TRANSFER_SAMPLE_DEFAULTS`) to match the ported prompt-emphasis logic.

Test coverage

Extends `tests/nano_inference_smoke_test.py` (the `generator-inference-smoke` CD job) to also run a `video2video` edge transfer with `control_guidance=1.5` in the same Nano inference call:

- The spec is built inline (`_TRANSFER_SPEC`, written to a temp file, not committed under `inputs/`), pulling the control video from the public `NVIDIA/cosmos` GitHub raw URL (same file the cookbook `edge.json` uses), downscaled for a fast smoke run (480p / 10 steps / single 29-frame chunk).
- Validates transfer-specific attributes (edge `control_path`, `control_guidance > 1`, `guidance > 1`) and a non-degenerate output clip via a new `_assert_video_has_content` helper (frame count + pixel variation).
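A "frame count + pixel variation" check can be sketched as follows. This is an illustration, not the helper added in this PR: the function name, the flattened-pixel input format, and the `min_pixel_std` threshold are hypothetical, and the real helper presumably decodes the output video file first.

```python
import statistics
from typing import Sequence


def assert_video_has_content_sketch(
    frames: Sequence[Sequence[int]],  # one flat list of pixel values per frame (assumed)
    expected_frames: int,
    min_pixel_std: float = 1.0,  # hypothetical threshold
) -> float:
    """Fail if the clip has the wrong length or is (near-)constant."""
    assert len(frames) == expected_frames, (
        f"expected {expected_frames} frames, got {len(frames)}"
    )
    # Pool every pixel of every frame and measure the overall spread.
    pixels = [p for frame in frames for p in frame]
    std = statistics.pstdev(pixels)
    # An all-black or single-colour clip has a standard deviation near zero.
    assert std >= min_pixel_std, f"degenerate video: pixel std {std:.3f}"
    return std


# Usage: a 29-frame clip alternating between black and white pixels passes.
good_std = assert_video_has_content_sketch([[0, 255, 0, 255]] * 29, 29)
```

A pooled standard deviation is a cheap guard against blank or constant outputs; it does not say anything about whether the generated content follows the control signal.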
Verified end-to-end on a GB200 node with this repo's `.venv`:

- README Nano edge transfer (`control_guidance=1.5`) → valid 121-frame 720p video; `emphasize_control_in_prompt` and the control-CFG path both run.
- Inline smoke spec: `status: success`, `control_guidance=1.5`, `guidance=3.0`, non-degenerate output (frames=29, pixel std ≈ 68).

🤖 Generated with Claude Code