rusty-cv includes a small postprocessing stack aimed at object-detection-style inference pipelines in Rust and Python.
The functions in this guide are the “harder” CV pieces in the crate today. They are also the functions most closely aligned with the library's goal of moving slow Python-side postprocessing into Rust.
Hard NMS functions:
- nms(...)
- batched_nms(...)
- multiclass_nms(...)
Use hard NMS when overlapping boxes should be fully removed once they exceed the IoU threshold.
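The rule above can be illustrated with a minimal greedy sketch in plain Python. This is not the rusty-cv implementation: the helper names `iou` and `greedy_nms` are hypothetical, boxes are assumed to be in (x1, y1, x2, y2) form, and a box is assumed to be dropped when its IoU with an already kept box is strictly greater than the threshold.

```python
def iou(a, b):
    """Intersection over union of two (x1, y1, x2, y2) boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0.0 else 0.0


def greedy_nms(boxes, scores, iou_threshold):
    """Return indices of kept boxes, highest score first (hypothetical helper)."""
    # Visit boxes by descending score; equal scores fall back to the lower index.
    order = sorted(range(len(boxes)), key=lambda i: (-scores[i], i))
    keep = []
    for i in order:
        # Keep the box only if it does not exceed the threshold against any kept box.
        if all(iou(boxes[i], boxes[k]) <= iou_threshold for k in keep):
            keep.append(i)
    return keep


boxes = [(0.0, 0.0, 10.0, 10.0), (1.0, 1.0, 11.0, 11.0), (20.0, 20.0, 30.0, 30.0)]
scores = [0.9, 0.8, 0.7]
kept = greedy_nms(boxes, scores, 0.5)
# kept == [0, 2]: box 1 overlaps box 0 with IoU 81/119 (about 0.68), so it is removed.
```

With these three boxes, only the second box is suppressed; the third box does not overlap anything and survives regardless of the threshold.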
Soft NMS functions:
- soft_nms(...)
- batched_soft_nms(...)
- multiclass_soft_nms(...)
Use soft NMS when overlapping boxes should stay alive with decayed scores instead of being removed outright.
nms(...) and soft_nms(...) operate on:
- boxes: N x 4
- scores: N
batched_nms(...) and batched_soft_nms(...) operate on:
- boxes: N x 4
- scores: N
- class_ids: N
Only boxes from the same class interact with one another.
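As a rough illustration of that class isolation, the sketch below groups box indices by class id and lists the only index pairs a class-aware pass would ever compare. The helper names `group_by_class` and `interacting_pairs` are hypothetical, and the grouping strategy is an assumption for illustration, not a description of how rusty-cv is implemented internally.

```python
from itertools import combinations


def group_by_class(class_ids):
    """Map each class id to the box indices that carry it (hypothetical helper)."""
    groups = {}
    for index, class_id in enumerate(class_ids):
        groups.setdefault(class_id, []).append(index)
    return groups


def interacting_pairs(class_ids):
    """List the index pairs that share a class and may therefore suppress each other."""
    pairs = []
    for indices in group_by_class(class_ids).values():
        pairs.extend(combinations(indices, 2))
    return pairs


class_ids = [0, 0, 1]
groups = group_by_class(class_ids)    # {0: [0, 1], 1: [2]}
pairs = interacting_pairs(class_ids)  # [(0, 1)]
```

Box 2 never appears in a pair here, so it cannot be suppressed by boxes 0 or 1 no matter how much they overlap.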
multiclass_nms(...) and multiclass_soft_nms(...) operate on:
- boxes: N x 4
- class_scores: N x C
Each class is expanded into its own candidate stream, processed independently, then merged back into one globally sorted detection list.
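The expand, process, and merge flow can be sketched in plain Python. Everything here is illustrative: `expand_streams` and `merge_streams` are hypothetical names, the N x C scores are assumed to be stored flat in row-major order (one row per box), and the per-class NMS step is replaced by a simple score filter so the example stays short.

```python
def expand_streams(class_scores, num_classes):
    """Split a flat row-major N x C score matrix into one candidate stream per class."""
    num_boxes = len(class_scores) // num_classes
    return {
        class_id: [
            (box_index, class_scores[box_index * num_classes + class_id])
            for box_index in range(num_boxes)
        ]
        for class_id in range(num_classes)
    }


def merge_streams(streams):
    """Merge per-class survivors into one list of (box_index, class_id, score)."""
    merged = [
        (box_index, class_id, score)
        for class_id, stream in streams.items()
        for box_index, score in stream
    ]
    # Global order: score descending, then box index, then class id.
    merged.sort(key=lambda d: (-d[2], d[0], d[1]))
    return merged


class_scores = [0.9, 0.1, 0.8, 0.75, 0.1, 0.7]  # 3 boxes x 2 classes
streams = expand_streams(class_scores, 2)
# Stand-in for the independent per-class NMS pass: keep scores of at least 0.5.
survivors = {c: [(i, s) for i, s in stream if s >= 0.5] for c, stream in streams.items()}
detections = merge_streams(survivors)
```

Note that box 1 appears twice in the merged list, once per class, because each class stream is treated as a separate set of candidates.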
Hard NMS uses:
- iou_threshold
- score_threshold
- pre_nms_top_k
- max_detections
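This guide does not spell out how the filtering options are applied, so the sketch below shows one common reading, stated as an assumption: score_threshold and pre_nms_top_k limit which candidates enter suppression, and max_detections truncates the result afterwards. The function names, the inclusive `>=` comparison, and the ordering of the steps are all hypothetical.

```python
def select_candidates(scores, score_threshold=0.0, pre_nms_top_k=None):
    """Pick the candidate indices that would enter NMS (assumed semantics)."""
    # Sort by descending score, lower index first on ties.
    order = sorted(range(len(scores)), key=lambda i: (-scores[i], i))
    # Assumption: candidates below the threshold are dropped before suppression.
    order = [i for i in order if scores[i] >= score_threshold]
    # Assumption: None means "no cap".
    return order if pre_nms_top_k is None else order[:pre_nms_top_k]


def cap_detections(kept, max_detections=None):
    """Truncate the already sorted survivors to at most max_detections entries."""
    return kept if max_detections is None else kept[:max_detections]


scores = [0.9, 0.2, 0.7, 0.8]
candidates = select_candidates(scores, score_threshold=0.5, pre_nms_top_k=2)
final = cap_detections([0, 3, 2], max_detections=2)
```

Capping before suppression bounds the quadratic pairwise work, while capping afterwards bounds only the output size; the two options are therefore not interchangeable.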
Soft NMS uses:
- method
- iou_threshold
- score_threshold
- sigma
- pre_nms_top_k
- max_detections
method can be:
- Linear
- Gaussian
Behavior:
- Linear: scores are reduced only when IoU is above the threshold
- Gaussian: scores are always decayed as a smooth function of overlap
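The two behaviors match the decay rules commonly used for Soft-NMS, sketched below in plain Python. Whether rusty-cv uses exactly these formulas is an assumption; the function names `linear_decay` and `gaussian_decay` are hypothetical, and `sigma` is assumed to play the role of the Gaussian width.

```python
import math


def linear_decay(score, iou, iou_threshold):
    """Scale the score by (1 - IoU), but only when IoU is above the threshold."""
    return score * (1.0 - iou) if iou > iou_threshold else score


def gaussian_decay(score, iou, sigma):
    """Scale the score by exp(-IoU^2 / sigma); any non-zero overlap reduces it."""
    return score * math.exp(-(iou * iou) / sigma)


# Below the threshold the linear rule leaves the score untouched...
unchanged = linear_decay(0.8, 0.4, 0.5)
# ...while the Gaussian rule already reduces it.
softened = gaussian_decay(0.8, 0.4, 0.5)
# Above the threshold the linear rule applies: 0.8 * (1 - 0.75).
reduced = linear_decay(0.8, 0.75, 0.5)
```

After decay, a box typically survives only if its new score still clears score_threshold, which is why soft NMS options carry both thresholds.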
Rust advanced APIs return Vec<Detection> where each item contains:
- box_index
- class_id
- score
Python bindings expose the same information as dictionaries of NumPy arrays:
- indices
- class_ids
- scores
Output ordering:
- outputs are sorted by score, descending
- ties are broken by the lower original box index
- merged multiclass outputs use class id as a further tie-break after the box index
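These ordering rules amount to a composite sort key. The sketch below mirrors the three `Detection` fields in a hypothetical Python dataclass; it assumes that the class id tie-break prefers the lower class id, which the guide does not state explicitly.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Detection:
    """Hypothetical Python mirror of the fields listed for the Rust Detection."""
    box_index: int
    class_id: int
    score: float


def sort_detections(detections):
    """Score descending, then lower box index, then (assumed) lower class id."""
    return sorted(detections, key=lambda d: (-d.score, d.box_index, d.class_id))


unsorted = [
    Detection(box_index=2, class_id=0, score=0.7),
    Detection(box_index=1, class_id=1, score=0.7),
    Detection(box_index=1, class_id=0, score=0.7),
    Detection(box_index=0, class_id=1, score=0.9),
]
ordered = sort_detections(unsorted)
order_keys = [(d.box_index, d.class_id) for d in ordered]
# order_keys == [(0, 1), (1, 0), (1, 1), (2, 0)]
```

A fully specified key like this makes the output deterministic, which matters when comparing results against a reference implementation.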
Rust example, hard NMS:
use rusty_cv::{batched_nms, multiclass_nms, nms, BBoxXYXY, NmsOptions};
let boxes = [
BBoxXYXY { x1: 0.0, y1: 0.0, x2: 10.0, y2: 10.0 },
BBoxXYXY { x1: 1.0, y1: 1.0, x2: 11.0, y2: 11.0 },
BBoxXYXY { x1: 20.0, y1: 20.0, x2: 30.0, y2: 30.0 },
];
let scores = [0.9, 0.8, 0.7];
let class_ids = [0usize, 0usize, 1usize];
let class_scores = [
0.9, 0.1,
0.8, 0.75,
0.1, 0.7,
];
let keep = nms(&boxes, &scores, 0.5)?;
let batched = batched_nms(&boxes, &scores, &class_ids, &NmsOptions::default())?;
let multi = multiclass_nms(&boxes, &class_scores, 2, &NmsOptions::default())?;
# let _ = (keep, batched, multi);
# Ok::<(), rusty_cv::BBoxError>(())

Rust example, soft NMS:
use rusty_cv::{
batched_soft_nms, multiclass_soft_nms, soft_nms, BBoxXYXY, SoftNmsMethod, SoftNmsOptions,
};
let boxes = [
BBoxXYXY { x1: 0.0, y1: 0.0, x2: 10.0, y2: 10.0 },
BBoxXYXY { x1: 1.0, y1: 1.0, x2: 11.0, y2: 11.0 },
BBoxXYXY { x1: 20.0, y1: 20.0, x2: 30.0, y2: 30.0 },
];
let scores = [0.9, 0.8, 0.7];
let class_ids = [0usize, 0usize, 1usize];
let class_scores = [
0.9, 0.1,
0.8, 0.75,
0.1, 0.7,
];
let options = SoftNmsOptions {
method: SoftNmsMethod::Linear,
iou_threshold: 0.5,
score_threshold: 0.25,
sigma: 0.5,
pre_nms_top_k: None,
max_detections: Some(4),
};
let soft = soft_nms(&boxes, &scores, &options)?;
let batched_soft = batched_soft_nms(&boxes, &scores, &class_ids, &options)?;
let multi_soft = multiclass_soft_nms(&boxes, &class_scores, 2, &options)?;
# let _ = (soft, batched_soft, multi_soft);
# Ok::<(), rusty_cv::BBoxError>(())

Python example, hard NMS:
import numpy as np
import rusty_cv
boxes = np.array(
[
[0.0, 0.0, 10.0, 10.0],
[1.0, 1.0, 11.0, 11.0],
[20.0, 20.0, 30.0, 30.0],
],
dtype=np.float32,
)
scores = np.array([0.9, 0.8, 0.7], dtype=np.float32)
class_ids = np.array([0, 0, 1], dtype=np.int64)
class_scores = np.array(
[
[0.9, 0.1],
[0.8, 0.75],
[0.1, 0.7],
],
dtype=np.float32,
)
keep = rusty_cv.nms(boxes, scores, iou_threshold=0.5)
batched = rusty_cv.batched_nms(boxes, scores, class_ids, iou_threshold=0.5)
multi = rusty_cv.multiclass_nms(
boxes,
class_scores,
iou_threshold=0.5,
score_threshold=0.7,
max_detections=3,
)

Python example, soft NMS (reuses the arrays defined above):
soft = rusty_cv.soft_nms(
boxes,
scores,
method="linear",
iou_threshold=0.5,
score_threshold=0.25,
)
soft_batched = rusty_cv.batched_soft_nms(
boxes,
scores,
class_ids,
method="linear",
iou_threshold=0.5,
score_threshold=0.25,
)
soft_multi = rusty_cv.multiclass_soft_nms(
boxes,
class_scores,
method="linear",
iou_threshold=0.5,
score_threshold=0.25,
max_detections=4,
)

Reference script:
python3 scripts/nms_ref.py

Comparison script:
.venv/bin/python scripts/bench_nms.py 64 20 0.5 4 0.25

The benchmark script checks parity and measures:
- single hard NMS
- batched hard NMS
- multiclass hard NMS
- single soft NMS
- batched soft NMS
- multiclass soft NMS
Example output observed in this repository:
[single]
rust_us_per_iter=114.01
python_us_per_iter=921.99
speedup=8.09x
[batched]
rust_us_per_iter=81.77
python_us_per_iter=233.61
speedup=2.86x
[multiclass]
rust_us_per_iter=368.89
python_us_per_iter=3389.74
speedup=9.19x
[soft]
rust_us_per_iter=129.45
python_us_per_iter=1116.96
speedup=8.63x
[soft_batched]
rust_us_per_iter=68.73
python_us_per_iter=317.48
speedup=4.62x
[soft_multiclass]
rust_us_per_iter=483.40
python_us_per_iter=4697.54
speedup=9.72x
These are local measurements on this machine, not portability claims.
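The reported speedup values are consistent with dividing python_us_per_iter by rust_us_per_iter and rounding to two decimals. The short check below recomputes them from the numbers quoted above; the dictionary layout is only for illustration and is not taken from the benchmark script.

```python
# (rust_us_per_iter, python_us_per_iter) copied from the example output above.
timings_us = {
    "single": (114.01, 921.99),
    "batched": (81.77, 233.61),
    "multiclass": (368.89, 3389.74),
    "soft": (129.45, 1116.96),
    "soft_batched": (68.73, 317.48),
    "soft_multiclass": (483.40, 4697.54),
}

# Speedup is the Python time divided by the Rust time for the same workload.
speedups = {
    name: round(python_us / rust_us, 2)
    for name, (rust_us, python_us) in timings_us.items()
}

for name, value in speedups.items():
    print(f"[{name}] speedup={value:.2f}x")
```

The batched variants show the smallest ratios in this run (2.86x and 4.62x), while the multiclass variants show the largest.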