Skip to content

Model improvements#326

Closed
EHxuban11 wants to merge 3 commits into
devfrom
model-improvements
Closed

Model improvements#326
EHxuban11 wants to merge 3 commits into
devfrom
model-improvements

Conversation

@EHxuban11
Copy link
Copy Markdown
Contributor

No description provided.

Copy link
Copy Markdown

@cursor cursor Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cursor Bugbot has reviewed your changes and found 4 potential issues.

Fix All in Cursor

❌ Bugbot Autofix is OFF. To automatically fix reported issues with cloud agents, enable autofix in the Cursor dashboard.

Comment @cursor review or bugbot run to trigger another review on this PR

Reviewed by Cursor Bugbot for commit 55bcb79. Configure here.

)
validator_cls = (
SegmentationValidator if self.task == "segment" else DetectionValidator
)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Export val ignores pose task

Medium Severity

BaseBackend.val picks only SegmentationValidator or DetectionValidator, so pose backends (e.g. exported LibreEC with task="pose") run detection metrics instead of keypoint validation, unlike BaseModel.val.

Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 55bcb79. Configure here.

iou_thres=iou,
num_workers=workers,
allow_download_scripts=allow_download_scripts,
device=validation_device,
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bare GPU index breaks export val

High Severity

BaseBackend.val forwards a user device such as "0" into ValidationConfig unchanged. BaseValidator then calls torch.device("0"), which raises RuntimeError: Invalid device string instead of normalizing to cuda:0.

Fix in Cursor Fix in Web

Triggered by learned rule: Normalise bare integer device strings before torch.device()

Reviewed by Cursor Bugbot for commit 55bcb79. Configure here.

else:
resolved_device = device

map_location = torch.device(resolved_device)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TorchScript rejects bare GPU index

High Severity

TorchScriptBackend passes device straight to torch.device() without mapping "0" / "1" to cuda:<n>, so loading with a numeric GPU id fails immediately.

Fix in Cursor Fix in Web

Triggered by learned rule: Normalise bare integer device strings before torch.device()

Reviewed by Cursor Bugbot for commit 55bcb79. Configure here.

input_names = ["in0"]
if not output_names:
output_names = ["out0"]
return input_names, output_names
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

NCNN misses second output

Medium Severity

_discover_blob_names only parses Input lines and defaults outputs to ["out0"]. When the Python binding does not expose output_names(), multi-output exports (e.g. YOLO-NAS) only run through one blob and parsing breaks.

Additional Locations (1)
Fix in Cursor Fix in Web

Reviewed by Cursor Bugbot for commit 55bcb79. Configure here.

@EHxuban11 EHxuban11 force-pushed the model-improvements branch from 55bcb79 to f125731 Compare June 4, 2026 20:53
@chatgpt-codex-connector
Copy link
Copy Markdown

💡 Codex Review

libreyolo = ["assets/*.jpg"]

P2 Badge Include bundled YAML configs in package data

In wheel/sdist installs, only assets/*.jpg is declared as package data, while runtime loaders look up bundled YAML files under libreyolo/config/datasets, libreyolo/config/export, and libreyolo/training/train_config.yaml via paths relative to the installed package. Since those YAML files are not included here or in MANIFEST.in, PyPI users will hit FileNotFoundError for built-ins such as load_data_config("coco8") or load_export_config("tensorrt_default.yaml") even though they work from a source checkout.


missing, unexpected = self.model.load_state_dict(loaded, strict=False)

P1 Badge Unwrap RF-DETR trainer checkpoints before loading

When loading a LibreYOLO RF-DETR checkpoint produced by training (best.pt/last.pt), wrap_libreyolo_checkpoint() stores the actual state dict under the model key, but this code passes the entire metadata wrapper to load_state_dict(). That makes freshly trained RF-DETR runs fail when train() reloads best_ckpt, and prevents later inference/resume from those checkpoints, because keys such as model, schema_version, and names are reported as unexpected instead of loading loaded["model"] and rebuilding for nc as needed.


nn_model = copy.deepcopy(nn_model)

P2 Badge Restore the original DEIMv2 model after export

For DEIMv2 exports where the requested export device differs from the model's current device, the original model is moved at line 297 before this deep copy is made; after that, the finally block only moves the copied wrapper back. A CPU-loaded DEIMv2 model exported with device="cuda" is therefore left with parameters on CUDA while self.device still says CPU, so the next prediction sends CPU inputs into a CUDA model and fails with a device mismatch.


keep = _nms_numpy(boxes, max_scores, iou)

P2 Badge Use class-aware NMS for exported backends

For exported YOLO-style backends, this applies _nms_numpy using only boxes and scores, so overlapping boxes from different classes suppress each other. Native postprocessing in the model families uses class-aware batched_nms, so ONNX/TensorRT/OpenVINO/NCNN/CoreML inference can drop valid detections and report lower mAP whenever two different objects/classes overlap with high IoU.


if int8 and data is None and self.supports_int8:

P2 Badge Reject INT8 for exporters that cannot quantize

When int8=True is passed to formats with supports_int8=False (ONNX, TorchScript, NCNN, or CoreML), validation falls through instead of failing, and the export path either ignores the flag or only records misleading precision='int8' metadata. That lets commands such as libreyolo export format=onnx int8=true data=... succeed while producing an unquantized model, which is especially easy to miss because calibration data may still be loaded.

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

@EHxuban11 EHxuban11 force-pushed the model-improvements branch from effdd27 to 3ce1c9e Compare June 4, 2026 21:06
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 3ce1c9e673

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

img_chw, ratio, pad = preprocess_numpy(np.array(img), input_size)
img_tensor = torch.from_numpy(img_chw).unsqueeze(0)
return img_tensor, original_img, original_size
return img_tensor, original_img, original_size, ratio, pad
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 Badge Preserve YOLO9 backend preprocessing arity

When an exported/backend YOLO9 or YOLO9-E2E model runs inference, BaseBackend._preprocess_image still calls this helper as tensor, img, size = preprocess_image(...) and then supplies its own ratio. Returning five values here raises ValueError: too many values to unpack before the backend can execute any prediction, so ONNX/TensorRT/OpenVINO-style YOLO9 backends are broken until the backend is updated or this helper keeps the old arity for that path.

Useful? React with 👍 / 👎.


arr = np.ascontiguousarray(padded, dtype=np.float32) / 255.0
return arr.transpose(2, 0, 1), ratio
return arr.transpose(2, 0, 1), ratio, (float(left), float(top))
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Keep calibration preprocess functions two-valued

For INT8 export calibration, CalibrationDataLoader._preprocess obtains model._get_preprocess_numpy() and unpacks it as result, _ = self._preprocess_fn(...). This new three-value return makes every YOLO9 calibration image hit ValueError: too many values to unpack and get skipped, which breaks or empties INT8 calibration for YOLO9-family exports.

Useful? React with 👍 / 👎.

padded[:new_h, :new_w] = resized
top = int(round(dh - 0.1))
left = int(round(dw - 0.1))
padded[top : top + new_h, left : left + new_w] = resized
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Align YOLO9 validation with centered letterbox

For non-square YOLO9/YOLO9-E2E images, prediction now center-pads here, but YOLO9ValPreprocessor and the YOLO9 training preproc still put the resized image at the top-left and validation calls _postprocess without a pad. That means validation metrics and locally trained checkpoints are using a different input geometry than model.predict, so reported mAP no longer measures the inference path users actually run unless the train/val preprocessors are updated too.

Useful? React with 👍 / 👎.

Comment thread libreyolo/models/yolo9/utils.py Outdated
# multi_label only helps when masks are not requested (segmentation uses
# one coeff vector per anchor, so stick to the best-class path there).
if multi_label and coeffs_all is None:
anchor_idx, class_ids = (scores > conf_thres).nonzero(as_tuple=True)
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Bound YOLO9 multi-label candidates before NMS

With validation's default low conf_thres=0.001, this path expands every class above threshold for every anchor before any pruning; for a typical 640 YOLO9 head that can create roughly 8400 * 80 candidates and feed hundreds of thousands of boxes into batched_nms for one image. This makes standard YOLO9 validation or low-conf prediction much slower and potentially memory-heavy compared with the previous argmax path, so add a pre-NMS cap/top-k or keep multi-label opt-in for low-threshold evaluation.

Useful? React with 👍 / 👎.

Centered letterbox (aspect-preserving, padding split evenly) + multi_label
NMS (one detection per above-conf class, as val.py does at conf 0.001).
Together these bring yolov9 t/s/m/c to within +/-0.1 of the paper on COCO
val2017 (yolov9s 46.8, yolov9m 51.2). Pad offset threaded through predict,
validation, ONNX/TRT/CoreML backend postprocess, INT8 calibration, mask
cropping, and the yolo9_e2e scale-back; backward-compatible with the legacy
scalar ratio (pad=None).

Signature changes and the regressions they would otherwise cause:

* preprocess_numpy now returns (chw, ratio, pad); preprocess_image returns
  (tensor, img, size, ratio, pad). All callers updated.
* backends/base.py: the generic (yolo9) preprocess branch unpacks the new
  5-tuple and packs (ratio, pad_w, pad_h) into the shared ratio slot;
  _parse_yolo9 subtracts pad before the ratio divide (and threads it into
  mask processing) so exported yolo9 / yolo9_e2e / yolo9-seg inference no
  longer ValueErrors and produces correctly-placed boxes on non-square input.
* export/calibration.py: tolerate the 3-tuple preprocess return (INT8
  calibration no longer crashes); docstring updated.
* validation: YOLO9ValPreprocessor now centers the letterbox (wants_unresized
  so its int(round) rounding matches predict exactly) and exposes
  letterbox_pad(); DetectionValidator threads (ratio, pad_w, pad_h) into
  _postprocess so predict and val share identical geometry. YOLONAS opts out
  (handles its own 636 center-pad internally).
* postprocess multi_label: a pre-topk candidate guard (filter anchors whose
  best class beats conf, then expand, cap to 30000 by score) bounds the
  ~600k-row nonzero() at conf=0.001 without changing results.
* _process_masks crops with int(round(orig*ratio)) to match the centered
  letterbox placement (was int(), 1px off).

Tests: centered-pad math == preprocess_numpy, pad round-trip in postprocess,
legacy scalar-ratio (pad=None) back-compat, multi_label emits >1 det/anchor
(=1 when False), R5 candidate-guard equivalence, seg mask crop with pad,
predict==val geometry, picodet cv2 output != PIL, yolox BN eps==1e-3 on
t/s/m/l/x.
@EHxuban11 EHxuban11 force-pushed the model-improvements branch from 3ce1c9e to 70677ec Compare June 4, 2026 21:46
Copy link
Copy Markdown

@chatgpt-codex-connector chatgpt-codex-connector Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 70677ec615

ℹ️ About Codex in GitHub

Codex has been enabled to automatically review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

When you sign up for Codex through ChatGPT, Codex can also answer questions or update the PR, like "@codex address that feedback".

padded[:new_h, :new_w] = resized
top = int(round(dh - 0.1))
left = int(round(dw - 0.1))
padded[top : top + new_h, left : left + new_w] = resized
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Align training letterbox with centered inference

For non-square YOLO9 training images, this centered placement is now the inference/validation geometry, but the current training path still uses libreyolo/models/yolo9/transforms.py:41 to paste the resized image at the top-left and only scales labels by r without adding the new pad. Fresh evidence in this revision is that validation was updated to center-pad, while the training transform was not; locally trained YOLO9/YOLO9-seg checkpoints will learn boxes/masks in a different input canvas than predict/export uses, causing systematic offsets on non-square images.

Useful? React with 👍 / 👎.

max_det=max_det,
letterbox=kwargs.get("letterbox", True),
pad=pad,
multi_label=kwargs.get("multi_label", True),
Copy link
Copy Markdown

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P2 Badge Preserve backend multi-label parity

With this new default, eager YOLO9 emits one detection for every class above conf_thres, but exported runtimes still go through BaseBackend._parse_yolo9, which uses np.max/np.argmax and keeps only one class per anchor. In the same scenario covered by the new multi-label unit test (one anchor with two above-threshold classes), native predict/validation returns both detections while ONNX/TensorRT/OpenVINO-style backends return only the max class, so backend results no longer round-trip the model semantics.

Useful? React with 👍 / 👎.

@EHxuban11 EHxuban11 closed this Jun 5, 2026
@EHxuban11 EHxuban11 deleted the model-improvements branch June 5, 2026 00:20
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant