[QNN EP] Added sustained high performance mode implementation#38

Open
qti-monumeen wants to merge 36 commits into main from dev/qti-monumeen/AISW-149444

@qti-monumeen

Collaborator

Description

Added sustained high performance mode

Motivation and Context

@qti-monumeen qti-monumeen force-pushed the dev/qti-monumeen/AISW-149444 branch from eac469f to c0d67c0 Compare March 2, 2026 10:38
@qti-monumeen qti-monumeen force-pushed the dev/qti-monumeen/AISW-149444 branch 6 times, most recently from e027b0c to b083fde Compare March 9, 2026 10:13
@qti-monumeen qti-monumeen force-pushed the dev/qti-monumeen/AISW-149444 branch from 23f4603 to b28e043 Compare March 10, 2026 06:48
Comment thread onnxruntime/core/providers/qnn/qnn_execution_provider.cc Outdated
@qti-monumeen qti-monumeen force-pushed the dev/qti-monumeen/AISW-149444 branch 2 times, most recently from 5f838ca to 4067092 Compare March 13, 2026 04:36
Comment thread onnxruntime/core/providers/qnn/qnn_execution_provider.cc Outdated
Comment thread onnxruntime/test/providers/qnn/qnn_basic_test.cc Outdated
@qti-monumeen qti-monumeen force-pushed the dev/qti-monumeen/AISW-149444 branch 5 times, most recently from 387cec2 to 01b8803 Compare March 17, 2026 06:05
@qti-monumeen qti-monumeen force-pushed the dev/qti-monumeen/AISW-149444 branch 2 times, most recently from b7b84c1 to 8615732 Compare March 18, 2026 10:37
@qti-jkilpatrick

Collaborator

It looks like this needs to be rebased on main and CI isn't clean yet. Please consider converting this to a draft until it's ready so reviewers can see they should wait. Thanks!

@qti-monumeen qti-monumeen marked this pull request as draft March 25, 2026 05:38
@qti-monumeen qti-monumeen force-pushed the dev/qti-monumeen/AISW-149444 branch from 8848a3b to 2acfa11 Compare March 25, 2026 09:19
@qti-monumeen qti-monumeen force-pushed the dev/qti-monumeen/AISW-149444 branch from 3670430 to 13cf72e Compare June 5, 2026 06:03
Comment thread onnxruntime/core/providers/qnn/builder/qnn_htp_power_state_guard.h Outdated
Comment thread onnxruntime/core/providers/qnn/builder/qnn_backend_manager.cc
Comment thread onnxruntime/core/providers/qnn/builder/qnn_backend_manager.cc
Comment thread onnxruntime/core/providers/qnn/builder/timer.h
Comment thread onnxruntime/core/providers/qnn/builder/qnn_htp_power_config_manager.h Outdated
Comment thread onnxruntime/core/providers/qnn/builder/qnn_backend_manager.cc
constexpr const uint32_t kMaxRpcPolling = 9999;

// performance timer timeout value is in microseconds
static constexpr uint64_t kDefaultTimerTimeoutUs = 300000;

Collaborator

Please also fix this

Comment thread onnxruntime/core/providers/qnn/qnn_execution_provider.h Outdated
Comment on lines +25 to +28
try {
thread_status_ = threadState::IDLE;
cv_.notify_all();
} catch (const std::system_error& e) {

Collaborator

[N-2] (Minor) try/catch in BkgTimer() is dead code: cv_.notify_all() is noexcept

In the initialization block of BkgTimer() (timer.cc lines 25–32), cv_.notify_all() is wrapped in a try { ... } catch (const std::system_error& e). The C++11 standard specifies that std::condition_variable::notify_all() is noexcept and can never throw std::system_error. The catch block is therefore dead code, and the thread_status_ = FAILED path can never execute.

Impact: The FAILED error-handling path in Initialize() can never be triggered, making the error handling misleading.

Suggested fix:

Remove the try/catch and write directly: thread_status_ = threadState::IDLE; cv_.notify_all();. To catch thread-startup errors, handle std::system_error around the std::thread constructor call instead.
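A minimal sketch of the suggested shape, not the PR's actual timer.cc: `TimerSketch`, `ThreadState`, and every member name below are hypothetical stand-ins. The only points it illustrates are that `notify_all()` is called without a try/catch, and that `std::system_error` is handled around the `std::thread` constructor, which is the call that can actually throw.

```cpp
#include <condition_variable>
#include <mutex>
#include <system_error>
#include <thread>

// Hypothetical stand-in for the PR's thread-state enum.
enum class ThreadState { kUninitialized, kIdle, kFailed, kStopped };

// Hypothetical stand-in for the PR's background timer class.
class TimerSketch {
 public:
  ~TimerSketch() { Stop(); }

  // Starts the worker thread. Returns false if the thread could not be created.
  bool Initialize() {
    try {
      // The std::thread constructor may throw std::system_error,
      // so this is where the FAILED path is reachable.
      worker_ = std::thread([this] { Run(); });
    } catch (const std::system_error&) {
      std::lock_guard<std::mutex> lock(mutex_);
      state_ = ThreadState::kFailed;
      return false;
    }
    // Block until the worker has reported its initial state.
    std::unique_lock<std::mutex> lock(mutex_);
    cv_.wait(lock, [this] { return state_ != ThreadState::kUninitialized; });
    return state_ == ThreadState::kIdle;
  }

  // Asks the worker to exit and joins it. Safe to call more than once.
  void Stop() {
    {
      std::lock_guard<std::mutex> lock(mutex_);
      stop_requested_ = true;
    }
    cv_.notify_all();
    if (worker_.joinable()) worker_.join();
  }

  ThreadState state() {
    std::lock_guard<std::mutex> lock(mutex_);
    return state_;
  }

 private:
  void Run() {
    std::unique_lock<std::mutex> lock(mutex_);
    state_ = ThreadState::kIdle;
    cv_.notify_all();  // noexcept, so no try/catch is needed here
    cv_.wait(lock, [this] { return stop_requested_; });
    state_ = ThreadState::kStopped;
  }

  std::mutex mutex_;
  std::condition_variable cv_;
  ThreadState state_ = ThreadState::kUninitialized;
  bool stop_requested_ = false;
  std::thread worker_;
};
```

With this layout, a caller that sees `Initialize()` return false knows thread creation itself failed, rather than relying on a catch block that cannot fire.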

// Explicitly sets HTP performance after work is done and returns its status.
// After this call the destructor will not invoke SetState again.
Ort::Status SetPostRunHtpPerf() {
finalized_ = true;

Collaborator

[N-3] (Minor) SetPostRunHtpPerf() sets finalized_=true before calling SetState — destructor fallback skipped on failure

In HtpPowerStateGuard::SetPostRunHtpPerf() (lines 65–70), finalized_ = true is set before calling power_manager_->SetState(done_state_, ...).

If SetState(done_state_) fails and returns an error, finalized_ is already true, so the destructor's if (pre_run_called_ && !finalized_) condition is false and the destructor will not retry the done-state transition. The HTP power config is left in start_state (high-performance) until the next call.

Impact: In the edge case where SetPostRunHtpPerf() fails (power-config API error), HTP stays in high-performance mode instead of being relaxed.

Suggested fix:

Move finalized_ = true to after the SetState(done_state_) call, setting it only on success. Alternatively, keep the current behavior (always set finalized_) but log a WARNING on failure to indicate that the HTP power state may not have been correctly relaxed.
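A rough sketch of the first option, assuming a simplified guard: `Status`, `PowerState`, `FakePowerManager`, and `PowerStateGuardSketch` are hypothetical stand-ins for `Ort::Status` and the PR's power-manager and guard classes, and the real `SetState` signature is not reproduced here. The point is only the ordering: `finalized_` is set after `SetState` succeeds, so the destructor still retries on failure.

```cpp
#include <string>

// Hypothetical stand-in for Ort::Status.
struct Status {
  bool ok;
  std::string message;
};

// Hypothetical power states; the PR's actual states are not shown in this thread.
enum class PowerState { kRelaxed, kHighPerformance };

// Hypothetical fake manager that can be told to fail its next SetState call.
class FakePowerManager {
 public:
  bool fail_next = false;
  PowerState current = PowerState::kRelaxed;
  int calls = 0;

  Status SetState(PowerState target) {
    ++calls;
    if (fail_next) {
      fail_next = false;
      return {false, "simulated power-config failure"};
    }
    current = target;
    return {true, ""};
  }
};

class PowerStateGuardSketch {
 public:
  PowerStateGuardSketch(FakePowerManager& manager, PowerState start, PowerState done)
      : manager_(manager), start_state_(start), done_state_(done) {}

  // Fallback: retry the done-state transition if it was never completed.
  ~PowerStateGuardSketch() {
    if (pre_run_called_ && !finalized_) {
      manager_.SetState(done_state_);  // best effort; result intentionally ignored
    }
  }

  Status SetPreRun() {
    pre_run_called_ = true;
    return manager_.SetState(start_state_);
  }

  Status SetPostRun() {
    Status status = manager_.SetState(done_state_);
    if (status.ok) {
      finalized_ = true;  // set only on success, so the destructor can retry
    }
    return status;
  }

 private:
  FakePowerManager& manager_;
  PowerState start_state_;
  PowerState done_state_;
  bool pre_run_called_ = false;
  bool finalized_ = false;
};
```

If the post-run call fails in this sketch, the guard leaves `finalized_` false and the destructor makes one more attempt to relax the power state. Whether a retry in the destructor is desirable for the real HTP power-config API is a judgment call for the PR author.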

@huaychou huaychou left a comment

Collaborator

Just wondering, what is the benefit of this PR? Should it reduce the init/deinit time of each execution?

@qti-monumeen

Collaborator Author
