[QNN EP] Added sustained high performance mode implementation#38
[QNN EP] Added sustained high performance mode implementation#38qti-monumeen wants to merge 36 commits into
Conversation
eac469f to
c0d67c0
Compare
e027b0c to
b083fde
Compare
23f4603 to
b28e043
Compare
5f838ca to
4067092
Compare
387cec2 to
01b8803
Compare
b7b84c1 to
8615732
Compare
|
It looks like this needs to be rebased on |
8848a3b to
2acfa11
Compare
3670430 to
13cf72e
Compare
| constexpr const uint32_t kMaxRpcPolling = 9999; | ||
|
|
||
| // performance timer timeout value is in microseconds | ||
| static constexpr uint64_t kDefaultTimerTimeoutUs = 300000; |
| try { | ||
| thread_status_ = threadState::IDLE; | ||
| cv_.notify_all(); | ||
| } catch (const std::system_error& e) { |
There was a problem hiding this comment.
[N-2] (Minor) try/catch in BkgTimer() is dead code: cv_.notify_all() is noexcept
In the initialization block of BkgTimer() (timer.cc lines 25–32), cv_.notify_all() is wrapped in a try { ... } catch (const std::system_error& e). The C++11 standard specifies that std::condition_variable::notify_all() is noexcept and can never throw std::system_error. The catch block is therefore dead code, and the thread_status_ = FAILED path can never execute.
Impact: The FAILED error-handling path in Initialize() can never be triggered, making the error handling misleading.
Suggested fix:
Remove the try/catch and write directly: thread_status_ = threadState::IDLE; cv_.notify_all();. To catch thread-startup errors, handle std::system_error around the std::thread constructor call instead.
| // Explicitly sets HTP performance after work is done and returns its status. | ||
| // After this call the destructor will not invoke SetState again. | ||
| Ort::Status SetPostRunHtpPerf() { | ||
| finalized_ = true; |
There was a problem hiding this comment.
[N-3] (Minor) SetPostRunHtpPerf() sets finalized_=true before calling SetState — destructor fallback skipped on failure
In HtpPowerStateGuard::SetPostRunHtpPerf() (lines 65–70), finalized_ = true is set before calling power_manager_->SetState(done_state_, ...).
If SetState(done_state_) fails and returns an error, finalized_ is already true, so the destructor's if (pre_run_called_ && !finalized_) condition is false and the destructor will not retry the done-state transition. The HTP power config is left in start_state (high-performance) until the next call.
Impact: In the edge case where SetPostRunHtpPerf() fails (power-config API error), HTP stays in high-performance mode instead of being relaxed.
Suggested fix:
Move finalized_ = true to after the SetState(done_state_) call, setting it only on success. Alternatively, keep the current behavior (always set finalized_) but log a WARNING on failure to indicate that the HTP power state may not have been correctly relaxed.
huaychou
left a comment
There was a problem hiding this comment.
Just wondering, what is the benefit of this PR? Should it reduce the init/deinit time of each execution?
|
Refer to this page for reason to add this change: |
Description
Added sustained high performance mode
Motivation and Context