[HETERO] Fix multi-hop cycle in SubgraphCollector#36126
Open
wgzintel wants to merge 16 commits into
Open
Conversation
3 tasks
Contributor
There was a problem hiding this comment.
Pull request overview
Fixes a Cannot sort subgraphs! assert in HETERO::compile_model when affinity assignment plus Union-Find through a shared graph-input boundary (e.g., a shared Constant) creates a multi-hop cycle in the subgraph DAG that the existing per-node cycle splitter cannot detect. Adds a subgraph-DAG SCC fallback after the per-node fix-point loop in SubgraphCollector::split_cyclic_dependencies().
Changes:
- Add a Kahn-based SCC detection pass over the subgraph DAG; for any cyclic subgraph, promote one same-subgraph non-boundary edge into
_subgraph_inputsand re-run Union-Find until the DAG is acyclic. - Add a regression test
multi_hop_subgraph_scc_cyclesynthesizing the yolo26s-seg 4-subgraph cycle. - Allow
expected_subgraph_count == 0to disable exact subgraph-count assertion in the parameterized harness (partition shape is an implementation detail of the fallback).
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/plugins/hetero/src/subgraph_collector.cpp | Adds <queue> include and a bounded SCC fallback loop that builds the cross-subgraph DAG, peels acyclic subgraphs via Kahn, and promotes one internal edge per iteration until acyclic. |
| src/plugins/hetero/tests/unit/subgraph_collector.cpp | New create_multi_hop_scc_cycle_model + parameterized test entry; loosens the exact expected_subgraph_count assertion when set to 0. |
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Replace the in-band 0 sentinel on SubgraphCollectorTestParam::expected_subgraph_count with std::optional<size_t>. std::nullopt now explicitly opts out of the exact subgraph-count assertion (used by the SCC-fallback test where the partition shape is an implementation detail), so a future test that omits the field can no longer silently disable the check.
Split the subgraph SCC fallback in split_cyclic_dependencies() into three local lambdas: build_subgraph_adjacency, find_non_trivial_scc_members (iterative Tarjan), and find_promotable_internal_edge. The main loop now reads as four labeled steps instead of a single ~130-line block. No behavior change: the helpers are direct extractions and iteration order, asserts, and the edge-budget bound are preserved.
…Adds unit tests and validates end-to-end on yolo26seg.
…b.com/wgzintel/openvino into guozhong/fix-multi-hop-subgraph-cycle
…redundant Union-Find passes Signed-off-by: guozhong <guozhong.wang@intel.com>
…b.com/wgzintel/openvino into guozhong/fix-multi-hop-subgraph-cycle
| subgraph_ids = collect_subgraphs_ids(); | ||
| } | ||
| return subgraph_ids; | ||
| } |
Contributor
There was a problem hiding this comment.
Just want to know will there be a significant change in performance after adding subgraph-level SCC fallback?
WeldonWangwang
approved these changes
Jun 2, 2026
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Details
Fixes
OPENVINO_ASSERT("Cannot sort subgraphs!")inov::hetero::SubgraphCollector::run()on real workloads (reproduced on yolo26seg withHETERO:GPU,CPU, 736 ops / 643 → GPU / 93 → CPU).Root cause
The existing per-node cyclic-dependency check in
split_cyclic_dependencies()(the bit-scan Phase 4a/4b) only catches cycles whose re-entry consumer is itself a node whosecyc_depbitset is non-empty (same-subgraph data flowing back through one foreign subgraph into that node's inputs).Two classes of subgraph-DAG cycles fall outside its scope, and both occur naturally in valid
ov::Models:sg_A → sg_B → sg_C → sg_D → sg_A. Producer and re-entry consumer are several subgraphs apart; no single node sees its own sg on the cycle.Constant(or other graph input) fans out to multiple consumers that Union-Find fuses into one subgraph; that fused subgraph then both produces and consumes data on the same neighbor subgraph. The cut edge is an input of the foreign-sg node, not of the same-sg node whosecyc_depis non-empty, so the existing Phase 4b cannot promote it by construction.Neither is a real data cycle in the
ov::Model(which is always a DAG); both are artifacts of subgraph fusion that the topological sort inrun()cannot resolve, leading to the assert.Fix
Adds a subgraph-DAG SCC fallback that runs after the existing per-node heuristic. Each iteration:
build_subgraph_adjacency— derives the cross-subgraph DAG from_subgraph_inputs. Parallel edges between the same subgraph pair are de-duplicated; self-edges (producer_sg == owner_sg) are filtered.find_non_trivial_scc_members— runs iterative Tarjan and returns subgraphs that belong to any SCC of size > 1. An exact SCC algorithm is required: a two-peel (forward + reverse Kahn) approximation also flags acyclic bridges between disjoint cycles, which would either waste promotion on acyclic subgraphs or trip the "no internal edge" assert.isolate_one_scc_node— picks one SCC-member node with the fewest same-sg non-boundary inputs (tie-break: smaller topo index) and promotes all its same-sg inputs into_subgraph_inputs. Single-edge cuts diverge in practice — the chosen node re-merges via its other same-sg inputs in the nextcollect_subgraphs_idsround; full-dissolution explodes the partition. The "isolate one node" cut is the minimum that strictly reduces the non-singleton mass of SCC members each iteration.Convergence is bounded by total node-input edges; two strict-progress asserts inside the loop catch any future strategy regression that fails to make progress:
OPENVINO_ASSERT(promoted > 0, ...)— helper invariant.OPENVINO_ASSERT(_subgraph_inputs.size() > inputs_before_step, ...)— defensive againstinsert()returning all-duplicates.Constant-sourced edges are eligible promotion targets (no
is_graph_input_nodefilter on candidates). Without this, the SCC fallback would trip "no internal edge to promote" on shared-Constant cycles.The rationale and convergence argument are documented inline above
isolate_one_scc_node. The fallback's header comment explicitly classifies cases (a) and (b) above as first-class in-scope cases rather than exceptional safety-net cases.Tests
New parametric cases under
INSTANTIATE_TEST_SUITE_P(SubgraphCollectorParamTest, ...):split_cyclic_dependency_diamond— minimal diamond cycle requiring the existing per-node pass.split_cyclic_dependency_nested— stacked cycles, multi-iteration convergence of the existing pass.shared_const_indirect_bridge_cycle,shared_const_cross_device_fanout_cycle,shared_const_as_cycle_source— three shared-Constant cycle shapes the SCC fallback must resolve.shared_const_no_cycle_no_split,shared_const_indirect_bridge_no_cycle— negative cases (shared-Constant topology without an actual cycle must not be split).multi_hop_subgraph_scc_cycle— minimal synthesis of the multi-hop SCC observed on yolo26seg.expected_subgraph_countisstd::optional<size_t>set tostd::nullopthere because the exact partition is a property of the promotion strategy, not the correctness contract; the assertion under test is "run()does not assertCannot sort subgraphs!" and "merge round-trip succeeds".New standalone regression tests:
SubgraphCollectorBridgeBetweenCyclesTest.bridge_subgraph_not_split— pins the "exact SCC, not two-peel approximation" invariant; an acyclic bridge between two disjoint cycles must stay un-split.SubgraphCollectorSharedConstSccTest.scc_with_only_constant_sourced_edges_converges— pins the Constant-source-filter removal: when every promotable candidate has a Constant producer, the SCC fallback must still converge instead of asserting "no internal edge to promote".SubgraphCollectorTestParam::expected_subgraph_countwas changed tostd::optional<size_t>so opting out of an exact submodel-count assertion is explicit at the test definition site.Validation
End-to-end on yolo26seg,
HETERO:GPU,CPU(736 ops, 643 → GPU, 93 → CPU): SCC fallback converges in 145 iterations into 151 submodels; all submodels compile successfully andcompile_modelreturns OK. Existing hetero unit and functional tests pass.Impact
No public API or ABI change. Hetero plugin only; no change to bindings, frontends, or other plugins.
Tickets
AI Assistance: