feat(OCR): global singleton OCR queue, serialize GPU requests #2043
Walkthrough
Removes the GPU async branch from step_ocr and adds a new standalone single-threaded …
Changes
Sequence Diagram(s)
sequenceDiagram
participant Caller as Caller
participant OnnxOcr as OnnxOcrMatcher
participant Executor as ocr_executor
participant ThreadPool as od_ocr thread pool
Caller->>OnnxOcr: run_ocr(image)
alt GPU enabled
OnnxOcr->>Executor: run_sync/_submit(_ocr_impl, image)
Executor->>ThreadPool: submit task (marked as OCR thread)
ThreadPool->>ThreadPool: run OCR model inference (det/rec/cls)
ThreadPool-->>Executor: return result (Future completed)
Executor-->>OnnxOcr: return the resolved result
else GPU not enabled
OnnxOcr->>OnnxOcr: call the local implementation _ocr_impl/_run_ocr_impl directly
end
OnnxOcr-->>Caller: return OCR result
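The dispatch in the diagram can be sketched roughly as follows. This is a minimal illustration, not the PR's actual code: `SketchMatcher`, the string-based `_ocr_impl`, and the thread-name check used to detect re-entrant calls are all assumptions made for the example; only the names `run_sync`, `_ocr_impl`, `run_ocr` and the `od_ocr` pool name come from the discussion.

```python
from __future__ import annotations

import threading
from concurrent.futures import Future, ThreadPoolExecutor
from typing import Callable, TypeVar

T = TypeVar("T")
_PREFIX = "od_ocr"

# A single worker thread, so queued GPU jobs run strictly one at a time.
_executor = ThreadPoolExecutor(max_workers=1, thread_name_prefix=_PREFIX)


def _on_ocr_thread() -> bool:
    # Assumption: worker threads are recognised by their name prefix.
    return threading.current_thread().name.startswith(_PREFIX)


def submit(fn: Callable[..., T], *args: object) -> Future[T]:
    """Queue fn on the single OCR worker and return its Future."""
    return _executor.submit(fn, *args)


def run_sync(fn: Callable[..., T], *args: object) -> T:
    """Queue fn and block until it finishes."""
    if _on_ocr_thread():
        # Already on the worker: run inline, otherwise the single
        # worker would wait on itself forever.
        return fn(*args)
    return submit(fn, *args).result()


class SketchMatcher:
    """Hypothetical stand-in for OnnxOcrMatcher; only the dispatch is shown."""

    def __init__(self, use_gpu: bool) -> None:
        self.use_gpu = use_gpu

    def _ocr_impl(self, image: str) -> str:
        # Placeholder for det/rec/cls inference.
        return image.upper()

    def run_ocr(self, image: str) -> str:
        if self.use_gpu:
            return run_sync(self._ocr_impl, image)  # serialized path
        return self._ocr_impl(image)  # direct local call
```

Both branches return the same value; only the thread that executes `_ocr_impl` differs.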
Estimated code review effort: 🎯 3 (Moderate) | ⏱️ ~20 minutes
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
Actionable comments posted: 2
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/one_dragon/base/matcher/ocr/onnx_ocr_matcher.py (1)
365-381:⚠️ Potential issue | 🔴 Critical
_ocr_impl lacks the defensive checks for self._model is None and image is None, which can raise AttributeError
_run_ocr_impl (lines 246-250) has the following two guards:
    if image is None:
        log.warning('OCR输入的图片为None')
        return {}
    if self._model is None and not self.init_model():
        return {}
_ocr_impl performs neither check. Line 381 calls self._model.ocr(...) directly and raises AttributeError if _model is None.
🛠️ Suggested fix
 def _ocr_impl(self, image: MatLike, threshold: float = 0, merge_line_distance: float = -1) -> list[OcrMatchResult]:
     ...
     start_time = time.time()
+    if image is None:
+        log.warning('OCR输入的图片为None')
+        return []
+    if self._model is None and not self.init_model():
+        return []
     ocr_result_list: list[OcrMatchResult] = []
     scan_result_list: list = self._model.ocr(image, cls=False)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/one_dragon/base/matcher/ocr/onnx_ocr_matcher.py` around lines 365 - 381, The _ocr_impl method calls self._model.ocr(image, ...) without defensive checks and can raise AttributeError if image or self._model is None; mirror the guards used in _run_ocr_impl by first checking if image is None (log warning and return empty list) and then ensure the model is available (if self._model is None and not self.init_model(): return empty list), so modify _ocr_impl to validate image and initialize/check self._model before calling self._model.ocr.
🧹 Nitpick comments (1)
src/one_dragon/utils/ocr_executor.py (1)
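45-45: Ruff TRY003: keep exception messages inside the exception class
Ruff recommends wrapping longer exception messages in a custom exception class instead of passing a string directly at the raise site. Since the message is currently simple, this can be ignored or suppressed with # noqa: TRY003.
🤖 Prompt for AI Agents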
Verify each finding against the current code and only fix it if needed. In `@src/one_dragon/utils/ocr_executor.py` at line 45, The current raise statement uses a long message directly in raise TimeoutError(...) which triggers Ruff TRY003; either define and use a custom exception (e.g., class OCRTimeoutError(TimeoutError): pass) and raise OCRTimeoutError(f"OCR task timed out after {timeout} seconds") from e (update any callers accordingly), or keep the built-in TimeoutError but suppress the lint rule by appending a noqa (e.g., the raise line: raise TimeoutError(...) # noqa: TRY003); locate the raise TimeoutError(...) in ocr_executor.py (the OCR task executor function) and apply one of these two fixes.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/one_dragon/utils/ocr_executor.py`:
- Line 26: The functions _submit_internal, submit, and run_sync lack proper type
annotations for the fn parameter and the run_sync return type; add a TypeVar
(e.g., T), annotate fn as Callable[..., T] (or Callable[..., Awaitable[T]] if
appropriate), update _submit_internal and submit to return Future[T] and
annotate their fn parameter accordingly, and set run_sync to return T (or
Awaitable[T]→T if it awaits); also import TypeVar, Callable, Awaitable, and
Future from typing/asyncio as needed to satisfy the annotations and keep
signatures consistent (refer to _submit_internal, submit, and run_sync in
ocr_executor.py).
- Around line 40-44: When catching FutureTimeoutError from the future returned
by _submit_internal (the variable f), add a warning log that the future timed
out and may still be running on GPU (include f.cancel() result and f.state() or
str(f) / repr(f) plus the timeout value) so ops can detect leaked GPU work; keep
the existing f.cancel() call but log that cancel may fail for RUNNING futures
and that the underlying GPU task could still occupy resources, using the
module/logger used elsewhere in this file to emit the warning.
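The two inline comments above (typed signatures, plus a warning when the wait times out) could be combined along these lines. This is a sketch under assumptions: the logger name, the 60-second default, and the exact log wording are illustrative rather than taken from the repository.

```python
from __future__ import annotations

import logging
from concurrent.futures import Future, ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeoutError
from typing import Callable, TypeVar

T = TypeVar("T")
log = logging.getLogger("ocr_executor_sketch")  # hypothetical logger name
_executor = ThreadPoolExecutor(max_workers=1)


def submit(fn: Callable[..., T], *args: object, **kwargs: object) -> Future[T]:
    """Queue fn on the single worker; the Future carries fn's return type."""
    return _executor.submit(fn, *args, **kwargs)


def run_sync(
    fn: Callable[..., T],
    *args: object,
    timeout: float | None = 60.0,
    **kwargs: object,
) -> T:
    """Block for the result; log enough detail to spot leaked GPU work."""
    f = submit(fn, *args, **kwargs)
    try:
        return f.result(timeout=timeout)
    except FutureTimeoutError as e:
        # cancel() only succeeds while the task is still queued; a RUNNING
        # task keeps going and may still hold the GPU.
        cancelled = f.cancel()
        log.warning(
            f"OCR task timed out after {timeout}s; cancel={cancelled} "
            f"running={f.running()} done={f.done()} future={f!r}"
        )
        raise TimeoutError(f"OCR task timed out after {timeout} seconds") from e
```

When `cancel=False` and `running=True` appear together in the log, the task was already executing and its GPU work may outlive the caller's timeout.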
ℹ️ Review info
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
src/one_dragon/base/cv_process/steps/step_ocr.py
src/one_dragon/base/matcher/ocr/onnx_ocr_matcher.py
src/one_dragon/base/operation/one_dragon_context.py
src/one_dragon/utils/ocr_executor.py
Actionable comments posted: 3
🧹 Nitpick comments (1)
src/one_dragon/utils/ocr_executor.py (1)
14-16: The timeout message can be consolidated inside OCRTimeoutError to reduce duplicated strings. This also removes the TRY003 warning and keeps the exception semantics in one place.
♻️ Suggested change
 class OCRTimeoutError(TimeoutError):
     """OCR executor timed out while waiting for result."""
-    pass
+    def __init__(self, timeout: float | None):
+        super().__init__(f"OCR task timed out after {timeout} seconds")
 ...
-    raise OCRTimeoutError(f"OCR task timed out after {timeout} seconds") from e
+    raise OCRTimeoutError(timeout) from e
Also applies to: 61-61
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/one_dragon/utils/ocr_executor.py` around lines 14 - 16, Replace repeated timeout strings by centralizing them in the OCRTimeoutError class: add a default message (e.g. DEFAULT_MESSAGE or via __init__ that sets a default message like "OCR executor timed out while waiting for result.") and allow an optional custom message, then update all code that raises OCRTimeoutError (references around the raise sites that previously passed the literal string) to raise OCRTimeoutError() (or OCRTimeoutError() without duplicating the text). This collapses duplicated literals into the OCRTimeoutError class and removes the TRY003 warning.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/one_dragon/base/matcher/ocr/onnx_ocr_matcher.py`:
- Around line 220-223: The GPU branch currently calls ocr_executor.run_sync(...)
without specifying timeout, causing the executor's default (60s) to count queue
wait time and produce OCRTimeoutError for requests that haven't started; to fix,
explicitly pass timeout=None to ocr_executor.run_sync when delegating to the GPU
path (e.g., where is_use_gpu() leads to
ocr_executor.run_sync(self._run_ocr_single_line_impl, ...)) so the executor
won't apply the enqueue wait timeout, and apply the same change to the other
GPU-run_sync call sites mentioned (the two similar GPU branches around the other
public OCR methods); leave existing OCRTimeoutError handling unchanged.
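- Around line 224-229: When strict_one_line=True, _run_ocr_single_line_impl calls
_run_ocr_without_det directly with no guard against empty images, which triggers
a low-level exception; at the start of _run_ocr_single_line_impl, reuse the same
empty-input check as _run_ocr_impl/run_ocr (e.g., image is None or has size 0),
return an empty string when an empty image is detected instead of calling
_run_ocr_without_det, and keep the behavior consistent with the non-strict branch.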
In `@src/one_dragon/utils/ocr_executor.py`:
- Around line 64-65: The shutdown function is missing a return type annotation;
update the function signature for shutdown to include an explicit "-> None"
return type (i.e., def shutdown(wait: bool = True) -> None:) so it complies with
the project's type hinting rules; ensure you modify the definition that calls
_executor.shutdown(wait=wait) and keep the existing behavior.
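The first inline comment above rests on the fact that `Future.result(timeout=...)` starts counting as soon as the caller waits, including time the task spends queued behind other work. A small self-contained sketch of that behaviour (the job functions and the `release` event are illustrative, not from the PR):

```python
import threading
from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeoutError

executor = ThreadPoolExecutor(max_workers=1)
release = threading.Event()


def slow_job() -> str:
    # Holds the only worker until the main thread releases it.
    release.wait(timeout=5)
    return "slow"


def quick_job() -> str:
    return "quick"


first = executor.submit(slow_job)    # occupies the single worker
second = executor.submit(quick_job)  # sits in the queue behind it

try:
    second.result(timeout=0.1)
    timed_out = False
except FutureTimeoutError:
    # quick_job never started, yet the wait already timed out.
    timed_out = True

release.set()
# With timeout=None the caller simply waits for its turn in the queue.
result_without_timeout = second.result(timeout=None)
executor.shutdown(wait=True)
```

Passing `timeout=None` trades the spurious timeout for an unbounded wait, so a hung GPU task would then block every caller behind it.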
ℹ️ Review info
📒 Files selected for processing (2)
src/one_dragon/base/matcher/ocr/onnx_ocr_matcher.py
src/one_dragon/utils/ocr_executor.py
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/one_dragon/base/matcher/ocr/onnx_ocr_matcher.py`:
- Around line 388-393: In _ocr_impl, the guard only checks "if image is None" so
empty image arrays (size == 0) still reach self._model.ocr; update the check in
the _ocr_impl method to mirror the earlier protection by treating images with
zero pixels as invalid (e.g., check image is None or getattr(image, "size",
None) == 0) and return [] before attempting to init_model or call
self._model.ocr, keeping the existing init_model() / self._model flow intact.
In `@src/one_dragon/utils/ocr_executor.py`:
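- Around line 50-63: After catching FutureTimeoutError, do not blindly convert
every timeout-like exception into OCRTimeoutError; check whether the future f
has completed (f.done()): if f.done() is true, the exception came from inside
the task, so re-raise the original exception (raise e) to preserve the task's own
TimeoutError; only if f.done() is false treat it as a genuine wait timeout and,
as now, call f.cancel(), log, and raise OCRTimeoutError(timeout); keep the
existing f.result(timeout=timeout) call and branch as described inside the
except FutureTimeoutError as e handler.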
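The branching described in that comment might look like the sketch below. It assumes the `OCRTimeoutError(timeout)` constructor proposed earlier in the review; the rest of the module is reduced to the minimum needed to run.

```python
from __future__ import annotations

from concurrent.futures import ThreadPoolExecutor
from concurrent.futures import TimeoutError as FutureTimeoutError
from typing import Callable, TypeVar

T = TypeVar("T")
_executor = ThreadPoolExecutor(max_workers=1)


class OCRTimeoutError(TimeoutError):
    """The caller gave up waiting for the OCR result."""

    def __init__(self, timeout: float | None) -> None:
        super().__init__(f"OCR task timed out after {timeout} seconds")


def run_sync(fn: Callable[..., T], *args: object, timeout: float | None = 60.0) -> T:
    f = _executor.submit(fn, *args)
    try:
        return f.result(timeout=timeout)
    except FutureTimeoutError as e:
        if f.done():
            # The task itself finished by raising a timeout-like error
            # (on Python 3.11+ concurrent.futures.TimeoutError is the
            # built-in TimeoutError), so pass it through unchanged.
            raise
        # The task is still queued or running: a genuine wait timeout.
        f.cancel()
        raise OCRTimeoutError(timeout) from e
```

A narrow race remains: if the task completes between the timed-out wait and the `f.done()` check, the wait timeout is re-raised as-is rather than wrapped.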
ℹ️ Review info
📒 Files selected for processing (2)
src/one_dragon/base/matcher/ocr/onnx_ocr_matcher.py
src/one_dragon/utils/ocr_executor.py
bc81ca9 to b840ff7
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
src/one_dragon/base/matcher/ocr/onnx_ocr_matcher.py (1)
246-257:⚠️ Potential issue | 🟠 Major
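_run_ocr_impl still lacks empty-image protection, so the run_ocr path is exposed to invalid input
Line 255 currently only checks image is None. When the input is an empty array (size == 0), execution still proceeds into the model call, which is inconsistent with the guards on the other OCR paths in this file.
🛠️ Suggested change
-    if image is None:
-        log.warning('OCR输入的图片为None')
+    if image is None or getattr(image, 'size', 0) == 0:
+        log.warning('OCR输入的图片为None或空图')
         return {}
🤖 Prompt for AI Agents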
Verify each finding against the current code and only fix it if needed. In `@src/one_dragon/base/matcher/ocr/onnx_ocr_matcher.py` around lines 246 - 257, The _run_ocr_impl function currently only checks for image is None and should also guard against empty images to match other OCR paths; update _run_ocr_impl (and ensure consistency with run_ocr call sites) to detect an empty image (e.g., image size == 0 or equivalent for MatLike) and return an empty dict/log a warning like the existing None branch instead of proceeding to model calls; reference the _run_ocr_impl function and run_ocr pathways when making the change so all OCR entry points have the same empty-image protection.
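One way to keep every OCR entry point consistent is a single shared predicate. The sketch below is an assumption-laden illustration: `is_empty_image`, `FakeImage`, and `run_ocr_guarded` are hypothetical names, and `FakeImage` only mimics the `size` attribute that array-like images expose.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Any


def is_empty_image(image: Any) -> bool:
    """True for None and for array-likes whose `size` attribute is 0."""
    # Default of None (not 0): an object without `size` is not treated as
    # empty here, unlike the getattr(image, 'size', 0) variant in the diff.
    return image is None or getattr(image, "size", None) == 0


@dataclass
class FakeImage:
    """Hypothetical stand-in for an image array; only `size` matters."""

    size: int


def run_ocr_guarded(image: Any) -> dict[str, list[str]]:
    if is_empty_image(image):
        return {}  # same early return as the existing None branch
    # Placeholder for the real model call.
    return {"text": ["..."]}
```

Calling the same helper from `_run_ocr_impl`, `_ocr_impl`, and `_run_ocr_single_line_impl` would keep the three guards from drifting apart again.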
🧹 Nitpick comments (1)
src/one_dragon/utils/ocr_executor.py (1)
56-63: The log string format can be aligned with the repository convention by using an f-string
Line 57 currently uses a % template; the placeholder arguments can be converted directly to an f-string to stay consistent with the repository convention.
♻️ Suggested change
 log.warning(
-    "OCR task timeout after %.2fs; cancel=%s running=%s done=%s future=%r",
-    timeout if timeout is not None else -1.0,
-    cancelled,
-    f.running(),
-    f.done(),
-    f,
+    f"OCR task timeout after {(timeout if timeout is not None else -1.0):.2f}s; "
+    f"cancel={cancelled} running={f.running()} done={f.done()} future={f!r}"
 )
As per coding guidelines: Use f-strings for string formatting instead of other formatting methods.
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/one_dragon/utils/ocr_executor.py` around lines 56 - 63, Replace the percent-style formatted log.warning call with an f-string: in the OCR timeout logging site (the log.warning that references timeout, cancelled, f.running(), f.done(), and f), build a single f-string that interpolates timeout (use -1.0 if None), cancelled, f.running(), f.done(), and the future object f directly, and use that f-string as the first argument to log.warning (keeping the same log level and message semantics).
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/one_dragon/utils/ocr_executor.py`:
- Around line 16-17: The constructor OCRTimeoutError.__init__ is missing its
return type annotation; update the signature of OCRTimeoutError.__init__(self,
timeout: float | None) to include the explicit return type -> None to match
project typing conventions and other constructors in the codebase.
ℹ️ Review info
📒 Files selected for processing (2)
src/one_dragon/base/matcher/ocr/onnx_ocr_matcher.py
src/one_dragon/utils/ocr_executor.py
def __init__(self, timeout: float | None):
    super().__init__(f"OCR task timed out after {timeout} seconds")
🧩 Analysis chain
🏁 Script executed:
# Check if the file exists and read the relevant section
if [ -f "src/one_dragon/utils/ocr_executor.py" ]; then
echo "=== File content around lines 16-17 ==="
sed -n '10,25p' "src/one_dragon/utils/ocr_executor.py" | cat -n
else
echo "File not found, searching for it..."
find . -name "ocr_executor.py" -type f
fi
Repository: OneDragon-Anything/ZenlessZoneZero-OneDragon
Length of output: 671
🏁 Script executed:
# Check other __init__ methods in Python files to see if -> None is consistently used
echo "=== Checking __init__ return type patterns ==="
rg "def __init__\(" --type py -A 1 | head -30
Repository: OneDragon-Anything/ZenlessZoneZero-OneDragon
Length of output: 1963
🏁 Script executed:
# Check specifically for __init__ methods with and without -> None in the codebase
echo "=== Count of __init__ with -> None ==="
rg "def __init__\([^)]*\)\s*->\s*None:" --type py | wc -l
echo "=== Count of __init__ without -> None ==="
rg "def __init__\([^)]*\):" --type py | grep -v "-> None" | wc -l
Repository: OneDragon-Anything/ZenlessZoneZero-OneDragon
Length of output: 301
Add the missing return type annotation to OCRTimeoutError.__init__
__init__ on line 16 is missing -> None, which is inconsistent with the repository's current typing rules and the existing patterns in the codebase.
🛠️ Suggested change
 class OCRTimeoutError(TimeoutError):
     """OCR executor timed out while waiting for result."""
-    def __init__(self, timeout: float | None):
+    def __init__(self, timeout: float | None) -> None:
         super().__init__(f"OCR task timed out after {timeout} seconds")
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/one_dragon/utils/ocr_executor.py` around lines 16 - 17, The constructor
OCRTimeoutError.__init__ is missing its return type annotation; update the
signature of OCRTimeoutError.__init__(self, timeout: float | None) to include
the explicit return type -> None to match project typing conventions and other
constructors in the codebase.
DoctorReid
left a comment
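I no longer quite remember why this wasn't handled uniformly inside the OCR matcher in the first place (a lesson in the importance of leaving documentation).
Looking at the code, though, I suspect it is because concurrent calls across multiple ONNX models can also cause problems. You could write a fairly simple test script that runs the OCR model and the flash model (or some other model) together for a minute and see what happens.
If that is the case, everything still has to go through a single gpu_executor, although that can be handled uniformly inside each matcher.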
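A harness for the kind of test suggested here could be sketched as follows. It is not the stress script mentioned later in the thread: `FakeDevice` merely counts overlapping calls and stands in for real ONNX sessions, and the names `stress`, `run_direct`, and `run_serialized` are made up for the example. A real run would replace `FakeDevice.infer` with actual model inference.

```python
import threading
import time
from concurrent.futures import ThreadPoolExecutor


class FakeDevice:
    """Stands in for a shared GPU; counts inferences that overlap in time."""

    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._active = 0
        self.overlaps = 0

    def infer(self, name: str) -> str:
        with self._lock:
            self._active += 1
            if self._active > 1:
                self.overlaps += 1
        time.sleep(0.001)  # pretend to run a model
        with self._lock:
            self._active -= 1
        return name


def run_direct(fn, *args):
    """Call the model on the caller's own thread (no serialization)."""
    return fn(*args)


gpu_executor = ThreadPoolExecutor(max_workers=1)


def run_serialized(fn, *args):
    """Route every call through one shared single-worker executor."""
    return gpu_executor.submit(fn, *args).result()


def stress(device: FakeDevice, run, seconds: float, names=("ocr", "flash")) -> int:
    """Hammer the device from one thread per model; return overlap count."""
    deadline = time.monotonic() + seconds

    def loop(name: str) -> None:
        while time.monotonic() < deadline:
            run(device.infer, name)

    threads = [threading.Thread(target=loop, args=(n,)) for n in names]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return device.overlaps
```

For the one-minute run, `stress(FakeDevice(), run_direct, 60.0)` can be compared with `stress(FakeDevice(), run_serialized, 60.0)`; the serialized variant should report no overlapping calls.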
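I wrote a stress-test script: stress_test_directdml.py
The problem is probably still …
I noticed that some of the low-level YOLO code does not go through gpu_executor.py:
ZenlessZoneZero-OneDragon/src/one_dragon/yolo/yolov8_onnx_det.py Lines 109 to 116 in 21882d1
ZenlessZoneZero-OneDragon/src/one_dragon/yolo/yolov8_onnx_cls.py Lines 119 to 126 in 21882d1
This may be what caused YOLO itself to crash. The previous …
Summary: this PR may be heading in the wrong direction (there is no need to maintain a separate mechanism just for OCR); instead, all of the …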
The stress test could be packaged as a plugin app: plugins\README.md
Changes
Verification
Summary by CodeRabbit
Release notes