Limit graph heuristic config queries by yeliu-oss · Pull Request #279 · NVIDIA/cudnn-frontend

yeliu-oss · 2026-06-04T16:40:54Z

Summary by CodeRabbit

New Features
- Added capability to limit the number of engine configurations evaluated during graph execution plan creation. Users can specify a maximum via the new max_engine_configs parameter (default unlimited).
- Graph optimization with heuristics-based policies now defaults to evaluating a single engine configuration for faster compilation.

coderabbitai · 2026-06-04T16:41:08Z

📝 Walkthrough

This PR introduces an optional max_engine_configs parameter to control engine configuration enumeration during cuDNN heuristics queries. The parameter threads through the public Graph API, internal heuristics implementations, and Python bindings, with conditional policy-based logic that automatically uses 1 when the build policy is HEURISTICS_CHOICE.

Changes

Engine config limiting feature

| Layer / File(s) | Summary |
|---|---|
| Graph API and conditional build logic: `include/cudnn_frontend/graph_interface.h` | `Graph::create_execution_plans` now accepts an optional `max_engine_configs` parameter (default `-1`). Both `Graph::build` overloads pass `max_engine_configs = 1` for the `HEURISTICS_CHOICE` policy and `-1` otherwise, and forward the parameter into `query_cudnn_heuristics_impl`. |
| Heuristics implementation cascade: `include/cudnn_frontend/plans.h`, `include/cudnn_frontend_Heuristics.h` | `query_cudnn_heuristics_impl`, `get_heuristics_list`, and `get_heuristics_list_impl` are extended to accept and propagate `max_engine_configs`. `get_heuristics_list_impl` queries either the specified limit (when positive) or the full engine-config count (when zero or negative), with logging updated to reflect the behavior. |
| Python bindings and wrapper: `python/pygraph/pygraph.h`, `python/pygraph/pygraph.cpp`, `python/cudnn/wrapper.py` | The `PyGraph::create_execution_plans` method and its pybind11 binding are extended with the new parameter. The Python `Graph.__exit__` method explicitly passes `max_engine_configs=1` when creating execution plans during graph compilation. |
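
The parameter flow summarized above can be sketched as a small Python model. This is an illustration only, not the cudnn-frontend code: the function names `limit_for_policy` and `resolve_num_config` and the policy strings are hypothetical stand-ins, and the logic is inferred from the walkthrough text (pass `1` for `HEURISTICS_CHOICE`, otherwise `-1`; use the limit when positive, otherwise the backend's full count).

```python
def limit_for_policy(policy):
    """Hypothetical model of the value Graph::build is described as passing.

    The walkthrough says HEURISTICS_CHOICE uses 1 and every other policy
    uses -1, which means "no limit".
    """
    return 1 if policy == "HEURISTICS_CHOICE" else -1


def resolve_num_config(max_engine_configs, backend_count):
    """Hypothetical model of how many engine configs get requested.

    A positive limit is used as-is (it is not clamped); zero or a negative
    value falls back to the full count reported by the backend.
    """
    if max_engine_configs > 0:
        return max_engine_configs
    return backend_count


if __name__ == "__main__":
    for policy in ("HEURISTICS_CHOICE", "ALL"):
        limit = limit_for_policy(policy)
        print(policy, "->", resolve_num_config(limit, backend_count=7))
```

Under these assumptions, a `HEURISTICS_CHOICE` build requests a single configuration, while any other policy requests whatever count the backend reports.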

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes

🐰 A parameter hops through the stack,
From Graph down to heuristic tracks,
When HEURISTICS calls, it's quick and light—
Just one config at a time done right! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 6.25%, which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
|---|---|---|
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately reflects the main change: adding a max_engine_configs parameter to limit heuristic engine-config enumeration across the cuDNN graph API. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |

✏️ Tip: You can configure your own custom pre-merge checks in the settings.

Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 Infer (1.2.0)

python/pygraph/pygraph.cpp

python/pygraph/pygraph.cpp:6:10: fatal error: 'dlpack/dlpack.h' file not found
6 | #include "dlpack/dlpack.h"
| ^~~~~~~~~~~~~~~~~
1 error generated.
Aborting translation of method 'cudnn_frontend::python_bindings::PyGraph::tensor' in file 'python/pygraph/pygraph.cpp': "Assert_failure src/clang/cAst_utils.ml:249:53"
Uncaught Internal Error: "Assert_failure src/clang/cAst_utils.ml:249:53"
Error backtrace:
Raised at ClangFrontend__CAst_utils.get_decl_from_typ_ptr in file "src/clang/cAst_utils.ml", line 249, characters 53-65
Called from ClangFrontend__CTrans.CTrans_funct.get_destructor_decl_ref in file "src/clang/cTrans.ml", line 658, characters 12-59
Called from ClangFrontend__CTrans.CTrans_funct.destructor_calls.(fun) in file "src/clang/cTrans.ml", line 2048, characters 12-69
Called from Base__List.rev_filter_map.loop in file "src/list.ml", line 944, characters 13-17
Called from Base__List.filter_map in file "src/list.ml" (inlined), line 951, characters 26-47
Called fr

... [truncated 2200 characters] ...

Frontend_decl.CFrontend_decl_funct.process_method_decl.add_method_if_create_procdesc in file "src/clang/cFrontend_decl.ml" (inlined), line 123, characters 16-158
Called from ClangFrontend__CFrontend_decl.CFrontend_decl_funct.process_method_decl in file "src/clang/cFrontend_decl.ml", line 126, characters 17-97
Called from ClangFrontend__CFrontend_decl.CFrontend_decl_funct.process_methods in file "src/clang/cFrontend_decl.ml" (inlined), line 270, characters 8-122
Called from Stdlib__List.iter in file "list.ml" (inlined), line 110, characters 12-15
Called from Stdlib__List.iter in file "list.ml" (inlined), line 108, characters 13-64
Called from Base__List0.iter in file "src/list0.ml" (inlined), line 25, characters 16-35
Called from ClangFrontend__CFrontend_decl.CFrontend_decl_funct.process_me

Warning

⚠️ This pull request might be slop. It has been flagged by CodeRabbit slop detection and should be reviewed carefully.

coderabbitai

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)

python/cudnn/wrapper.py (1)
271-279: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Document the new default max_engine_configs=1 compile behavior.

Graph.__exit__ now enforces a single engine-config query, but the method docstring still describes a generic “Creates execution plans” step. Please make this explicit in the wrapper docs to avoid user confusion.
✏️ Suggested doc update
-        3. Creates execution plans
+        3. Creates execution plans (currently limited with max_engine_configs=1)
As per coding guidelines, python/cudnn/**: - Focus on documentation.

Also applies to: 294-294
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@python/cudnn/wrapper.py` around lines 271 - 279, Update the Graph.__exit__
docstring to state that the compile step now defaults to max_engine_configs=1
and thus performs a single engine-config query when creating execution plans;
explicitly replace the generic "Creates execution plans" line with language that
documents the single-engine-config behavior, mentions the default parameter name
max_engine_configs and its value 1, and note any user-impact (e.g., fewer
candidate engine configs are queried by default) so readers of Graph.__exit__
(around the current docstring and the nearby comment at the other location)
understand this new default behavior.

ℹ️ Review info

⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: e5b93f10-9298-4094-9406-11b492385998

📥 Commits

Reviewing files that changed from the base of the PR and between 1a2799b and 702015c.

📒 Files selected for processing (6)

include/cudnn_frontend/graph_interface.h
include/cudnn_frontend/plans.h
include/cudnn_frontend_Heuristics.h
python/cudnn/wrapper.py
python/pygraph/pygraph.cpp
python/pygraph/pygraph.h

coderabbitai · 2026-06-04T16:51:14Z

+                         std::shared_ptr<const DeviceProperties> device_properties = nullptr,
+                         int64_t max_engine_configs                              = -1) {


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Clamp max_engine_configs to available backend configs.

Line 374 currently trusts caller input directly when positive. Very large values can force excessive descriptor creation in getEngineConfig(num_config), causing unnecessary memory/time overhead. Clamp to getEngineConfigCount() first.

Proposed fix

-    int64_t num_config = max_engine_configs;
-    if (num_config <= 0) {
-        num_config = heuristics.getEngineConfigCount();
-        NV_CUDNN_RETURN_IF_ERROR(heuristics);
-        CUDNN_FE_LOG_LABEL_ENDL("Heuristic query for mode " << heur_mode << " has " << num_config
-                                << " configurations.");
-    } else {
-        CUDNN_FE_LOG_LABEL_ENDL("Heuristic query for mode " << heur_mode << " requesting up to " << num_config
-                                << " configurations.");
-    }
+    int64_t total_configs = heuristics.getEngineConfigCount();
+    NV_CUDNN_RETURN_IF_ERROR(heuristics);
+    int64_t num_config = (max_engine_configs > 0) ? std::min(max_engine_configs, total_configs) : total_configs;
+    if (max_engine_configs > 0) {
+        CUDNN_FE_LOG_LABEL_ENDL("Heuristic query for mode " << heur_mode << " requesting up to " << max_engine_configs
+                                << " configurations (capped to " << num_config << ").");
+    } else {
+        CUDNN_FE_LOG_LABEL_ENDL("Heuristic query for mode " << heur_mode << " has " << num_config
+                                << " configurations.");
+    }

As per coding guidelines, `include/**`: "Make sure the cpp code has good quality".
Also applies to: 374-385

🤖 Prompt for AI Agents

Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@include/cudnn_frontend_Heuristics.h` around lines 365 - 366, Clamp the positive max_engine_configs to the backend's available configs before it's used: if max_engine_configs > 0, set max_engine_configs = std::min(max_engine_configs, getEngineConfigCount()), so callers cannot request more than getEngineConfigCount(); update the logic in the function that accepts max_engine_configs (referencing the parameter max_engine_configs and the calls to getEngineConfigCount() and getEngineConfig(num_config)) to use the clamped value when iterating/creating descriptors to avoid excessive descriptor creation.

yeliu-oss

No, we intended to keep the existing behavior.

coderabbitai

@yeliu-oss, understood. Keeping the existing behavior makes sense here, since getEngineConfig already resizes the result to the actual count returned by the backend, so passing a value larger than what's available is safe. Acknowledged, and I'll drop this suggestion.

(´･ω･`)

✏️ Learnings added

Learnt from: yeliu-oss
Repo: NVIDIA/cudnn-frontend
PR: 279
File: include/cudnn_frontend_Heuristics.h:365-366
Timestamp: 2026-06-04T18:23:21.984Z
Learning: In `include/cudnn_frontend_Heuristics.h`, the `get_heuristics_list_impl` function intentionally does NOT clamp `max_engine_configs` to `getEngineConfigCount()` before passing it to `getEngineConfig(num_config)`. This is by design: `getEngineConfig` internally resizes the result to the actual backend-returned count, so over-requesting is safe. Do not flag missing clamping here as a bug.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
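
The exchange above turns on what happens when the caller asks for more configurations than the backend has. The following Python model is a rough sketch of that behavior as described in the thread, not the actual `getEngineConfig` implementation: the function name `get_engine_configs`, the up-front slot allocation, and the string placeholders for descriptors are all assumptions made for illustration.

```python
def get_engine_configs(available, requested):
    """Hypothetical model of an over-requesting heuristics query.

    `available` stands in for the configs the backend can return and
    `requested` for a positive num_config. Slots are created for every
    requested entry, the backend fills as many as it has, and the result
    is resized to the count actually returned.
    """
    slots = [None] * requested                  # one placeholder per requested config
    returned = min(requested, len(available))   # backend cannot return more than it has
    slots[:returned] = available[:returned]     # fill the slots the backend populated
    del slots[returned:]                        # resize to the actual returned count
    return slots


if __name__ == "__main__":
    backend = ["cfg0", "cfg1", "cfg2"]
    print(get_engine_configs(backend, 100))     # over-request: only three come back
    print(get_engine_configs(backend, 1))       # limit of one, as with HEURISTICS_CHOICE
```

In this model the returned list is correct either way, which matches the author's position. The reviewer's concern corresponds to the temporary `slots` list, whose size grows with the requested value before it is trimmed.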

vedaanta · 2026-06-04T18:21:08Z

@cudnn-ci-bot run

cudnn-ci-bot · 2026-06-04T18:22:17Z

🚀 Running mirror pipeline

Branch: cudnn-gh/pr-279-702015c
Pipeline: 53688519

Limit graph heuristic config queries

702015c

coderabbitai Bot reviewed Jun 4, 2026

vedaanta requested review from Anerudhan and vedaanta June 4, 2026 18:20

vedaanta approved these changes Jun 4, 2026

yeliu-oss wants to merge 1 commit into NVIDIA:develop from yeliu-oss:yeliu/bug_5880337_graph_api
