Skip to content

[HETERO] Fix multi-hop cycle in SubgraphCollector#36126

Open
wgzintel wants to merge 16 commits into
openvinotoolkit:masterfrom
wgzintel:guozhong/fix-multi-hop-subgraph-cycle
Open

[HETERO] Fix multi-hop cycle in SubgraphCollector#36126
wgzintel wants to merge 16 commits into
openvinotoolkit:masterfrom
wgzintel:guozhong/fix-multi-hop-subgraph-cycle

Conversation

@wgzintel
Copy link
Copy Markdown
Contributor

@wgzintel wgzintel commented May 29, 2026

Details

Fixes OPENVINO_ASSERT("Cannot sort subgraphs!") in ov::hetero::SubgraphCollector::run() on real workloads (reproduced on yolo26seg with HETERO:GPU,CPU, 736 ops / 643 → GPU / 93 → CPU).

Root cause

The existing per-node cyclic-dependency check in split_cyclic_dependencies() (the bit-scan Phase 4a/4b) only catches cycles whose re-entry consumer is itself a node whose cyc_dep bitset is non-empty (same-subgraph data flowing back through one foreign subgraph into that node's inputs).

Two classes of subgraph-DAG cycles fall outside its scope, and both occur naturally in valid ov::Models:

  1. Multi-hop subgraph-DAG cyclessg_A → sg_B → sg_C → sg_D → sg_A. Producer and re-entry consumer are several subgraphs apart; no single node sees its own sg on the cycle.
  2. Shared-graph-input cycles — a Constant (or other graph input) fans out to multiple consumers that Union-Find fuses into one subgraph; that fused subgraph then both produces and consumes data on the same neighbor subgraph. The cut edge is an input of the foreign-sg node, not of the same-sg node whose cyc_dep is non-empty, so the existing Phase 4b cannot promote it by construction.

Neither is a real data cycle in the ov::Model (which is always a DAG); both are artifacts of subgraph fusion that the topological sort in run() cannot resolve, leading to the assert.

Fix

Adds a subgraph-DAG SCC fallback that runs after the existing per-node heuristic. Each iteration:

  1. build_subgraph_adjacency — derives the cross-subgraph DAG from _subgraph_inputs. Parallel edges between the same subgraph pair are de-duplicated; self-edges (producer_sg == owner_sg) are filtered.
  2. find_non_trivial_scc_members — runs iterative Tarjan and returns subgraphs that belong to any SCC of size > 1. An exact SCC algorithm is required: a two-peel (forward + reverse Kahn) approximation also flags acyclic bridges between disjoint cycles, which would either waste promotion on acyclic subgraphs or trip the "no internal edge" assert.
  3. isolate_one_scc_node — picks one SCC-member node with the fewest same-sg non-boundary inputs (tie-break: smaller topo index) and promotes all its same-sg inputs into _subgraph_inputs. Single-edge cuts diverge in practice — the chosen node re-merges via its other same-sg inputs in the next collect_subgraphs_ids round; full-dissolution explodes the partition. The "isolate one node" cut is the minimum that strictly reduces the non-singleton mass of SCC members each iteration.

Convergence is bounded by total node-input edges; two strict-progress asserts inside the loop catch any future strategy regression that fails to make progress:

  • OPENVINO_ASSERT(promoted > 0, ...) — helper invariant.
  • OPENVINO_ASSERT(_subgraph_inputs.size() > inputs_before_step, ...) — defensive against insert() returning all-duplicates.

Constant-sourced edges are eligible promotion targets (no is_graph_input_node filter on candidates). Without this, the SCC fallback would trip "no internal edge to promote" on shared-Constant cycles.

The rationale and convergence argument are documented inline above isolate_one_scc_node. The fallback's header comment explicitly classifies cases (a) and (b) above as first-class in-scope cases rather than exceptional safety-net cases.

Tests

New parametric cases under INSTANTIATE_TEST_SUITE_P(SubgraphCollectorParamTest, ...):

  • split_cyclic_dependency_diamond — minimal diamond cycle requiring the existing per-node pass.
  • split_cyclic_dependency_nested — stacked cycles, multi-iteration convergence of the existing pass.
  • shared_const_indirect_bridge_cycle, shared_const_cross_device_fanout_cycle, shared_const_as_cycle_source — three shared-Constant cycle shapes the SCC fallback must resolve.
  • shared_const_no_cycle_no_split, shared_const_indirect_bridge_no_cycle — negative cases (shared-Constant topology without an actual cycle must not be split).
  • multi_hop_subgraph_scc_cycle — minimal synthesis of the multi-hop SCC observed on yolo26seg. expected_subgraph_count is std::optional<size_t> set to std::nullopt here because the exact partition is a property of the promotion strategy, not the correctness contract; the assertion under test is "run() does not assert Cannot sort subgraphs!" and "merge round-trip succeeds".

New standalone regression tests:

  • SubgraphCollectorBridgeBetweenCyclesTest.bridge_subgraph_not_split — pins the "exact SCC, not two-peel approximation" invariant; an acyclic bridge between two disjoint cycles must stay un-split.
  • SubgraphCollectorSharedConstSccTest.scc_with_only_constant_sourced_edges_converges — pins the Constant-source-filter removal: when every promotable candidate has a Constant producer, the SCC fallback must still converge instead of asserting "no internal edge to promote".

SubgraphCollectorTestParam::expected_subgraph_count was changed to std::optional<size_t> so opting out of an exact submodel-count assertion is explicit at the test definition site.

Validation

End-to-end on yolo26seg, HETERO:GPU,CPU (736 ops, 643 → GPU, 93 → CPU): SCC fallback converges in 145 iterations into 151 submodels; all submodels compile successfully and compile_model returns OK. Existing hetero unit and functional tests pass.

Impact

No public API or ABI change. Hetero plugin only; no change to bindings, frontends, or other plugins.

Tickets

AI Assistance:

  • AI assistance used: yes

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Fixes a Cannot sort subgraphs! assert in HETERO::compile_model when affinity assignment plus Union-Find through a shared graph-input boundary (e.g., a shared Constant) creates a multi-hop cycle in the subgraph DAG that the existing per-node cycle splitter cannot detect. Adds a subgraph-DAG SCC fallback after the per-node fix-point loop in SubgraphCollector::split_cyclic_dependencies().

Changes:

  • Add a Kahn-based SCC detection pass over the subgraph DAG; for any cyclic subgraph, promote one same-subgraph non-boundary edge into _subgraph_inputs and re-run Union-Find until the DAG is acyclic.
  • Add a regression test multi_hop_subgraph_scc_cycle synthesizing the yolo26s-seg 4-subgraph cycle.
  • Allow expected_subgraph_count == 0 to disable exact subgraph-count assertion in the parameterized harness (partition shape is an implementation detail of the fallback).

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
src/plugins/hetero/src/subgraph_collector.cpp Adds <queue> include and a bounded SCC fallback loop that builds the cross-subgraph DAG, peels acyclic subgraphs via Kahn, and promotes one internal edge per iteration until acyclic.
src/plugins/hetero/tests/unit/subgraph_collector.cpp New create_multi_hop_scc_cycle_model + parameterized test entry; loosens the exact expected_subgraph_count assertion when set to 0.

Comment thread src/plugins/hetero/src/subgraph_collector.cpp Outdated
Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

Comment thread src/plugins/hetero/src/subgraph_collector.cpp Outdated
Replace the in-band 0 sentinel on SubgraphCollectorTestParam::expected_subgraph_count with std::optional<size_t>. std::nullopt now explicitly opts out of the exact subgraph-count assertion (used by the SCC-fallback test where the partition shape is an implementation detail), so a future test that omits the field can no longer silently disable the check.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.

Split the subgraph SCC fallback in split_cyclic_dependencies() into three local lambdas: build_subgraph_adjacency, find_non_trivial_scc_members (iterative Tarjan), and find_promotable_internal_edge. The main loop now reads as four labeled steps instead of a single ~130-line block. No behavior change: the helpers are direct extractions and iteration order, asserts, and the edge-budget bound are preserved.
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.

@wgzintel wgzintel requested a review from Copilot June 1, 2026 15:15
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

Comment thread src/plugins/hetero/src/subgraph_collector.cpp
…redundant Union-Find passes

Signed-off-by: guozhong <guozhong.wang@intel.com>
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated 1 comment.

Comment thread src/plugins/hetero/src/subgraph_collector.cpp
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Copilot reviewed 3 out of 3 changed files in this pull request and generated no new comments.

subgraph_ids = collect_subgraphs_ids();
}
return subgraph_ids;
}
Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just want to know will there be a significant change in performance after adding subgraph-level SCC fallback?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

category: HETERO OpenVINO HETERO plugin

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants