Limit graph heuristic config queries#279
Conversation
📝 WalkthroughWalkthroughThis PR introduces an optional ChangesEngine config limiting feature
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Warning There were issues while running some tools. Please review the errors and either fix the tool's configuration or disable the tool if it's a critical failure. 🔧 Infer (1.2.0)python/pygraph/pygraph.cpppython/pygraph/pygraph.cpp:6:10: fatal error: 'dlpack/dlpack.h' file not found ... [truncated 2200 characters] ... Frontend_decl.CFrontend_decl_funct.process_method_decl.add_method_if_create_procdesc in file "src/clang/cFrontend_decl.ml" (inlined), line 123, characters 16-158 Comment Warning |
There was a problem hiding this comment.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
python/cudnn/wrapper.py (1)
271-279:⚠️ Potential issue | 🟡 Minor | ⚡ Quick winDocument the new default
max_engine_configs=1compile behavior.
Graph.__exit__now enforces a single engine-config query, but the method docstring still describes a generic “Creates execution plans” step. Please make this explicit in the wrapper docs to avoid user confusion.✏️ Suggested doc update
- 3. Creates execution plans + 3. Creates execution plans (currently limited with max_engine_configs=1)As per coding guidelines,
python/cudnn/**: - Focus on documentation.Also applies to: 294-294
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@python/cudnn/wrapper.py` around lines 271 - 279, Update the Graph.__exit__ docstring to state that the compile step now defaults to max_engine_configs=1 and thus performs a single engine-config query when creating execution plans; explicitly replace the generic "Creates execution plans" line with language that documents the single-engine-config behavior, mentions the default parameter name max_engine_configs and its value 1, and note any user-impact (e.g., fewer candidate engine configs are queried by default) so readers of Graph.__exit__ (around the current docstring and the nearby comment at the other location) understand this new default behavior.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@include/cudnn_frontend_Heuristics.h`:
- Around line 365-366: Clamp the positive max_engine_configs to the backend's
available configs before it's used: if max_engine_configs > 0, set
max_engine_configs = std::min(max_engine_configs, getEngineConfigCount()), so
callers cannot request more than getEngineConfigCount(); update the logic in the
function that accepts max_engine_configs (referencing the parameter
max_engine_configs and the calls to getEngineConfigCount() and
getEngineConfig(num_config)) to use the clamped value when iterating/creating
descriptors to avoid excessive descriptor creation.
---
Outside diff comments:
In `@python/cudnn/wrapper.py`:
- Around line 271-279: Update the Graph.__exit__ docstring to state that the
compile step now defaults to max_engine_configs=1 and thus performs a single
engine-config query when creating execution plans; explicitly replace the
generic "Creates execution plans" line with language that documents the
single-engine-config behavior, mentions the default parameter name
max_engine_configs and its value 1, and note any user-impact (e.g., fewer
candidate engine configs are queried by default) so readers of Graph.__exit__
(around the current docstring and the nearby comment at the other location)
understand this new default behavior.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Enterprise
Run ID: e5b93f10-9298-4094-9406-11b492385998
📒 Files selected for processing (6)
include/cudnn_frontend/graph_interface.hinclude/cudnn_frontend/plans.hinclude/cudnn_frontend_Heuristics.hpython/cudnn/wrapper.pypython/pygraph/pygraph.cpppython/pygraph/pygraph.h
| std::shared_ptr<const DeviceProperties> device_properties = nullptr, | ||
| int64_t max_engine_configs = -1) { |
There was a problem hiding this comment.
Clamp max_engine_configs to available backend configs.
Line 374 currently trusts caller input directly when positive. Very large values can force excessive descriptor creation in getEngineConfig(num_config), causing unnecessary memory/time overhead. Clamp to getEngineConfigCount() first.
Proposed fix
- int64_t num_config = max_engine_configs;
- if (num_config <= 0) {
- num_config = heuristics.getEngineConfigCount();
- NV_CUDNN_RETURN_IF_ERROR(heuristics);
- CUDNN_FE_LOG_LABEL_ENDL("Heuristic query for mode " << heur_mode << " has " << num_config
- << " configurations.");
- } else {
- CUDNN_FE_LOG_LABEL_ENDL("Heuristic query for mode " << heur_mode << " requesting up to " << num_config
- << " configurations.");
- }
+ int64_t total_configs = heuristics.getEngineConfigCount();
+ NV_CUDNN_RETURN_IF_ERROR(heuristics);
+ int64_t num_config = (max_engine_configs > 0) ? std::min(max_engine_configs, total_configs) : total_configs;
+ if (max_engine_configs > 0) {
+ CUDNN_FE_LOG_LABEL_ENDL("Heuristic query for mode " << heur_mode << " requesting up to " << max_engine_configs
+ << " configurations (capped to " << num_config << ").");
+ } else {
+ CUDNN_FE_LOG_LABEL_ENDL("Heuristic query for mode " << heur_mode << " has " << num_config
+ << " configurations.");
+ }Also applies to: 374-385
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
In `@include/cudnn_frontend_Heuristics.h` around lines 365 - 366, Clamp the
positive max_engine_configs to the backend's available configs before it's used:
if max_engine_configs > 0, set max_engine_configs = std::min(max_engine_configs,
getEngineConfigCount()), so callers cannot request more than
getEngineConfigCount(); update the logic in the function that accepts
max_engine_configs (referencing the parameter max_engine_configs and the calls
to getEngineConfigCount() and getEngineConfig(num_config)) to use the clamped
value when iterating/creating descriptors to avoid excessive descriptor
creation.
There was a problem hiding this comment.
No, we intended to keep existing behavior
There was a problem hiding this comment.
@yeliu-oss, understood — keeping the existing behavior makes sense here since getEngineConfig already resizes the result to the actual count returned by the backend, so passing a value larger than what's available is safe. Acknowledged, and I'll drop this suggestion.
(´・ω・`)
✏️ Learnings added
Learnt from: yeliu-oss
Repo: NVIDIA/cudnn-frontend PR: 279
File: include/cudnn_frontend_Heuristics.h:365-366
Timestamp: 2026-06-04T18:23:21.984Z
Learning: In `include/cudnn_frontend_Heuristics.h`, the `get_heuristics_list_impl` function intentionally does NOT clamp `max_engine_configs` to `getEngineConfigCount()` before passing it to `getEngineConfig(num_config)`. This is by design: `getEngineConfig` internally resizes the result to the actual backend-returned count, so over-requesting is safe. Do not flag missing clamping here as a bug.
Note: Learnings are effective only in the context of similar code segments. To apply general coding guidelines, please configure review instructions. You can manage existing learnings in the app.
|
@cudnn-ci-bot run |
|
🚀 Running mirror pipeline Branch: cudnn-gh/pr-279-702015c |
Summary by CodeRabbit
max_engine_configsparameter (default unlimited).