Limit graph heuristic config queries #279

Open
yeliu-oss wants to merge 1 commit into NVIDIA:develop from yeliu-oss:yeliu/bug_5880337_graph_api

Conversation

yeliu-oss commented Jun 4, 2026

Summary by CodeRabbit

  • New Features
    • Added capability to limit the number of engine configurations evaluated during graph execution plan creation. Users can specify a maximum via the new max_engine_configs parameter (default unlimited).
    • Graph optimization with heuristics-based policies now defaults to evaluating a single engine configuration for faster compilation.

@coderabbitai

coderabbitai Bot commented Jun 4, 2026

Contributor

📝 Walkthrough

This PR introduces an optional max_engine_configs parameter to control engine configuration enumeration during cuDNN heuristics queries. The parameter threads through the public Graph API, internal heuristics implementations, and Python bindings, with conditional policy-based logic that automatically uses 1 when the build policy is HEURISTICS_CHOICE.
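The policy-dependent default described above can be sketched as a small pure function. This is a minimal illustration, not the PR's code: `BuildPolicy`, `kUnlimitedEngineConfigs`, and `engine_config_limit_for` are hypothetical names, and only the `HEURISTICS_CHOICE` policy name and the values 1 and -1 come from the summary.

```cpp
#include <cstdint>

// Hypothetical stand-in for the frontend's build policy enum. Only the
// HEURISTICS_CHOICE name is taken from the PR summary; ALL is assumed here
// to represent "any other policy".
enum class BuildPolicy { HEURISTICS_CHOICE, ALL };

// Sentinel meaning "no limit", matching the default of -1 described above.
constexpr std::int64_t kUnlimitedEngineConfigs = -1;

// Returns the limit a build step would forward to plan creation:
// 1 for HEURISTICS_CHOICE, otherwise the "unlimited" sentinel.
constexpr std::int64_t engine_config_limit_for(BuildPolicy policy) {
    return policy == BuildPolicy::HEURISTICS_CHOICE ? 1 : kUnlimitedEngineConfigs;
}
```

Under these assumptions, `engine_config_limit_for(BuildPolicy::HEURISTICS_CHOICE)` yields 1 and any other policy yields -1, which the heuristics layer treats as "query everything".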

Changes

Engine config limiting feature

| Layer / File(s) | Summary |
| --- | --- |
| Graph API and conditional build logic: include/cudnn_frontend/graph_interface.h | Graph::create_execution_plans now accepts an optional max_engine_configs parameter (default -1). Both Graph::build overloads conditionally pass max_engine_configs = 1 for HEURISTICS_CHOICE policy, otherwise -1, and forward the parameter into query_cudnn_heuristics_impl. |
| Heuristics implementation cascade: include/cudnn_frontend/plans.h, include/cudnn_frontend_Heuristics.h | query_cudnn_heuristics_impl, get_heuristics_list, and get_heuristics_list_impl are extended to accept and propagate max_engine_configs. get_heuristics_list_impl uses conditional logic to query either the specified limit (when positive) or the full engine-config count (when zero or negative), with updated logging to reflect the behavior. |
| Python bindings and wrapper: python/pygraph/pygraph.h, python/pygraph/pygraph.cpp, python/cudnn/wrapper.py | PyGraph::create_execution_plans method and its pybind11 binding are extended with the new parameter. The Python Graph.__exit__ method explicitly passes max_engine_configs=1 when creating execution plans during graph compilation. |

Estimated code review effort

🎯 2 (Simple) | ⏱️ ~12 minutes


🐰 A parameter hops through the stack,
From Graph down to heuristic tracks,
When HEURISTICS calls, it's quick and light—
Just one config at a time done right! ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 6.25% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately reflects the main change: adding a max_engine_configs parameter to limit heuristic engine-config enumeration across the cuDNN graph API. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


Warning

There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure.

🔧 Infer (1.2.0)
python/pygraph/pygraph.cpp

python/pygraph/pygraph.cpp:6:10: fatal error: 'dlpack/dlpack.h' file not found
6 | #include "dlpack/dlpack.h"
| ^~~~~~~~~~~~~~~~~
1 error generated.
Aborting translation of method 'cudnn_frontend::python_bindings::PyGraph::tensor' in file 'python/pygraph/pygraph.cpp': "Assert_failure src/clang/cAst_utils.ml:249:53"
Uncaught Internal Error: "Assert_failure src/clang/cAst_utils.ml:249:53"
Error backtrace:
Raised at ClangFrontend__CAst_utils.get_decl_from_typ_ptr in file "src/clang/cAst_utils.ml", line 249, characters 53-65
Called from ClangFrontend__CTrans.CTrans_funct.get_destructor_decl_ref in file "src/clang/cTrans.ml", line 658, characters 12-59
Called from ClangFrontend__CTrans.CTrans_funct.destructor_calls.(fun) in file "src/clang/cTrans.ml", line 2048, characters 12-69
Called from Base__List.rev_filter_map.loop in file "src/list.ml", line 944, characters 13-17
Called from Base__List.filter_map in file "src/list.ml" (inlined), line 951, characters 26-47
Called fr

... [truncated 2200 characters] ...

Frontend_decl.CFrontend_decl_funct.process_method_decl.add_method_if_create_procdesc in file "src/clang/cFrontend_decl.ml" (inlined), line 123, characters 16-158
Called from ClangFrontend__CFrontend_decl.CFrontend_decl_funct.process_method_decl in file "src/clang/cFrontend_decl.ml", line 126, characters 17-97
Called from ClangFrontend__CFrontend_decl.CFrontend_decl_funct.process_methods in file "src/clang/cFrontend_decl.ml" (inlined), line 270, characters 8-122
Called from Stdlib__List.iter in file "list.ml" (inlined), line 110, characters 12-15
Called from Stdlib__List.iter in file "list.ml" (inlined), line 108, characters 13-64
Called from Base__List0.iter in file "src/list0.ml" (inlined), line 25, characters 16-35
Called from ClangFrontend__CFrontend_decl.CFrontend_decl_funct.process_me


Warning

⚠️ This pull request might be slop. It has been flagged by CodeRabbit slop detection and should be reviewed carefully.

@coderabbitai coderabbitai Bot left a comment

Contributor

Actionable comments posted: 1

Caution

Some comments are outside the diff and can’t be posted inline due to platform limitations.

⚠️ Outside diff range comments (1)
python/cudnn/wrapper.py (1)

271-279: ⚠️ Potential issue | 🟡 Minor | ⚡ Quick win

Document the new default max_engine_configs=1 compile behavior.

Graph.__exit__ now enforces a single engine-config query, but the method docstring still describes a generic “Creates execution plans” step. Please make this explicit in the wrapper docs to avoid user confusion.

✏️ Suggested doc update
-        3. Creates execution plans
+        3. Creates execution plans (currently limited with max_engine_configs=1)

As per coding guidelines, python/cudnn/**: - Focus on documentation.

Also applies to: 294-294

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@python/cudnn/wrapper.py` around lines 271 - 279, Update the Graph.__exit__
docstring to state that the compile step now defaults to max_engine_configs=1
and thus performs a single engine-config query when creating execution plans;
explicitly replace the generic "Creates execution plans" line with language that
documents the single-engine-config behavior, mentions the default parameter name
max_engine_configs and its value 1, and note any user-impact (e.g., fewer
candidate engine configs are queried by default) so readers of Graph.__exit__
(around the current docstring and the nearby comment at the other location)
understand this new default behavior.
ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Enterprise

Run ID: e5b93f10-9298-4094-9406-11b492385998

📥 Commits

Reviewing files that changed from the base of the PR and between 1a2799b and 702015c.

📒 Files selected for processing (6)
  • include/cudnn_frontend/graph_interface.h
  • include/cudnn_frontend/plans.h
  • include/cudnn_frontend_Heuristics.h
  • python/cudnn/wrapper.py
  • python/pygraph/pygraph.cpp
  • python/pygraph/pygraph.h

Comment on lines +365 to +366
std::shared_ptr<const DeviceProperties> device_properties = nullptr,
int64_t max_engine_configs = -1) {

@coderabbitai coderabbitai Bot Jun 4, 2026

Contributor

⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Clamp max_engine_configs to available backend configs.

Line 374 currently trusts caller input directly when positive. Very large values can force excessive descriptor creation in getEngineConfig(num_config), causing unnecessary memory/time overhead. Clamp to getEngineConfigCount() first.

Proposed fix
-    int64_t num_config = max_engine_configs;
-    if (num_config <= 0) {
-        num_config = heuristics.getEngineConfigCount();
-        NV_CUDNN_RETURN_IF_ERROR(heuristics);
-        CUDNN_FE_LOG_LABEL_ENDL("Heuristic query for mode " << heur_mode << " has " << num_config
-                                                            << " configurations.");
-    } else {
-        CUDNN_FE_LOG_LABEL_ENDL("Heuristic query for mode " << heur_mode << " requesting up to " << num_config
-                                                            << " configurations.");
-    }
+    int64_t total_configs = heuristics.getEngineConfigCount();
+    NV_CUDNN_RETURN_IF_ERROR(heuristics);
+    int64_t num_config = (max_engine_configs > 0) ? std::min(max_engine_configs, total_configs) : total_configs;
+    if (max_engine_configs > 0) {
+        CUDNN_FE_LOG_LABEL_ENDL("Heuristic query for mode " << heur_mode << " requesting up to " << max_engine_configs
+                                                            << " configurations (capped to " << num_config << ").");
+    } else {
+        CUDNN_FE_LOG_LABEL_ENDL("Heuristic query for mode " << heur_mode << " has " << num_config
+                                                            << " configurations.");
+    }
As per coding guidelines, `include/**`: "Make sure the cpp code has good quality".

Also applies to: 374-385
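To make the difference between the two variants in the diff concrete, here is a minimal sketch that isolates just the count selection. The function names are hypothetical, and the total count is passed in as a plain argument for illustration, whereas the PR's code only queries getEngineConfigCount() when the limit is zero or negative.

```cpp
#include <algorithm>
#include <cstdint>

// Mirrors the removed ("-") lines of the diff: a positive limit is trusted
// as-is, and a zero or negative limit falls back to the full count.
inline std::int64_t requested_configs_unclamped(std::int64_t max_engine_configs,
                                                std::int64_t total_configs) {
    return max_engine_configs > 0 ? max_engine_configs : total_configs;
}

// Mirrors the added ("+") lines of the diff: a positive limit is capped at
// the number of configs the backend reports.
inline std::int64_t requested_configs_clamped(std::int64_t max_engine_configs,
                                              std::int64_t total_configs) {
    return max_engine_configs > 0 ? std::min(max_engine_configs, total_configs)
                                  : total_configs;
}
```

The two functions agree whenever the limit is non-positive or no larger than the total; they differ only when a caller asks for more configs than exist, which is the case the review comment is concerned with.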

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@include/cudnn_frontend_Heuristics.h` around lines 365 - 366, Clamp the
positive max_engine_configs to the backend's available configs before it's used:
if max_engine_configs > 0, set max_engine_configs = std::min(max_engine_configs,
getEngineConfigCount()), so callers cannot request more than
getEngineConfigCount(); update the logic in the function that accepts
max_engine_configs (referencing the parameter max_engine_configs and the calls
to getEngineConfigCount() and getEngineConfig(num_config)) to use the clamped
value when iterating/creating descriptors to avoid excessive descriptor
creation.

Author

No, we intended to keep existing behavior

Contributor

@yeliu-oss, understood — keeping the existing behavior makes sense here since getEngineConfig already resizes the result to the actual count returned by the backend, so passing a value larger than what's available is safe. Acknowledged, and I'll drop this suggestion.
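The "resize to the actual count" argument can be illustrated with a toy model. This is a sketch under stated assumptions, not the cuDNN frontend implementation: `FakeBackend`, `fill`, and `get_engine_configs` are hypothetical, and the only behavior taken from the thread is that the result is shrunk to the number of configs the backend actually returns.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical backend that owns `available` configs and writes at most as
// many ids as the caller provided slots for.
struct FakeBackend {
    std::int64_t available;

    // Fills `out` from the front and reports how many elements were written.
    std::int64_t fill(std::vector<int>& out) const {
        const std::int64_t n =
            std::min<std::int64_t>(available, static_cast<std::int64_t>(out.size()));
        for (std::int64_t i = 0; i < n; ++i) {
            out[static_cast<std::size_t>(i)] = static_cast<int>(i);
        }
        return n;
    }
};

// Allocates `requested` slots up front, then shrinks the result to the count
// the backend reported, so over-requesting never yields phantom entries.
inline std::vector<int> get_engine_configs(const FakeBackend& backend,
                                           std::int64_t requested) {
    std::vector<int> configs(static_cast<std::size_t>(std::max<std::int64_t>(requested, 0)));
    const std::int64_t returned = backend.fill(configs);
    configs.resize(static_cast<std::size_t>(returned));
    return configs;
}
```

In this model, requesting 10 configs from a backend that has 3 returns exactly 3, so the result is correct. The up-front allocation of 10 slots still happens, which is the overhead the original suggestion pointed at rather than a correctness problem.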

(´・ω・`)


✏️ Learnings added
Learnt from: yeliu-oss
Repo: NVIDIA/cudnn-frontend PR: 279
File: include/cudnn_frontend_Heuristics.h:365-366
Timestamp: 2026-06-04T18:23:21.984Z
Learning: In `include/cudnn_frontend_Heuristics.h`, the `get_heuristics_list_impl` function intentionally does NOT clamp `max_engine_configs` to `getEngineConfigCount()` before passing it to `getEngineConfig(num_config)`. This is by design: `getEngineConfig` internally resizes the result to the actual backend-returned count, so over-requesting is safe. Do not flag missing clamping here as a bug.

Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.

@vedaanta vedaanta requested review from Anerudhan and vedaanta June 4, 2026 18:20
@vedaanta

vedaanta commented Jun 4, 2026

Collaborator

@cudnn-ci-bot run

@cudnn-ci-bot

🚀 Running mirror pipeline

Branch: cudnn-gh/pr-279-702015c
Pipeline: 53688519

3 participants