AI-Based AprilTag Pipeline Acceleration by DoctorFogarty · Pull Request #2410 · PhotonVision/photonvision

DoctorFogarty · 2026-03-26T04:49:56Z

Description

Adds a two-stage hybrid ML/traditional AprilTag detection pipeline that uses NPU hardware to accelerate tag detection. A YOLO v11 model identifies AprilTag regions of interest (ROIs) on the NPU, then the traditional WPILib AprilTag detector decodes only the cropped sub-images to produce accurate tag IDs and poses. Narrowing the search space this way reduces the per-frame computational load on the CPU. When ML detection finds no tags, the pipeline falls back to full-frame traditional detection if the fallback setting is enabled.

Two-Stage Hybrid Pipeline

Stage 1 (ML ROI detection): AprilTagROIDetectionPipe runs a YOLO v11 model on the NPU to produce bounding boxes around candidate tags
Stage 2 (Traditional decode): AprilTagROIDecodePipe extracts each ROI sub-image, runs the WPILib AprilTag detector on it, and maps corners + homography back to full-frame coordinates
Fallback: When ML detection finds zero tags and mlFallbackToTraditional is enabled (default), the pipeline falls back to full-frame traditional detection
Visualization: DrawMLROIPipe renders cyan bounding boxes around ML-detected ROIs on the output stream for tuning
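
The control flow described above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: `HybridFlowSketch`, `Roi`, `Tag`, and the `decodeRegion` callback are hypothetical stand-ins for the real pipes and WPILib types, and stage 1 (the NPU model) is assumed to have already produced the list of ROIs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

/** Hypothetical sketch of the two-stage flow; none of these types are PhotonVision classes. */
final class HybridFlowSketch {
    /** Axis-aligned region of interest in full-frame pixel coordinates. */
    record Roi(int x, int y, int width, int height) {}

    /** A decoded tag, reduced to the fields this illustration needs. */
    record Tag(int id, double decisionMargin) {}

    /**
     * Stage 2 plus fallback. {@code rois} is the stage 1 (ML) output; {@code decodeRegion}
     * stands in for running the traditional detector on one region of the frame.
     */
    static List<Tag> detect(
            List<Roi> rois,
            Roi fullFrame,
            boolean fallbackToTraditional,
            Function<Roi, List<Tag>> decodeRegion) {
        if (rois.isEmpty()) {
            // ML found nothing: either decode the whole frame or report no tags.
            return fallbackToTraditional ? decodeRegion.apply(fullFrame) : List.of();
        }
        List<Tag> tags = new ArrayList<>();
        for (Roi roi : rois) {
            // Only the cropped sub-image is searched, which is where the CPU savings come from.
            tags.addAll(decodeRegion.apply(roi));
        }
        return tags;
    }
}
```

With fallback disabled and an empty ROI list, the sketch returns no tags; with fallback enabled it decodes the full frame exactly once.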

Homography Coordinate Transformation

transformHomography() applies translation-only mapping (ROI offset to full frame)
transformHomographyWithScale() applies combined inverse-scaling and translation for ATR-resized ROIs
Mathematics derived from UMich AprilTag library homography conventions (row-major 3×3)
Deduplication by tag ID (keeps highest decision margin) when overlapping ROIs detect the same tag
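
As an illustration of the mapping and deduplication described above, here is a minimal sketch. It assumes the homography is a row-major 3×3 stored as a `double[9]` that maps tag coordinates to image pixels, and that an ROI resized by a factor `scale` maps back to the full frame as `p / scale + offset`. The class and method names (`RoiMappingSketch`, `toFullFrame`, `project`, `dedupe`) are hypothetical and are not the PR's `transformHomography()` or `transformHomographyWithScale()` implementations.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch; not the PR's implementation. */
final class RoiMappingSketch {
    record Tag(int id, double decisionMargin) {}

    /**
     * Returns M * H, where M first scales image coordinates by 1/scale and then
     * translates them by the ROI offset. Pass scale = 1.0 for the translation-only case.
     */
    static double[] toFullFrame(double[] h, double offsetX, double offsetY, double scale) {
        double inv = 1.0 / scale;
        return new double[] {
            inv * h[0] + offsetX * h[6], inv * h[1] + offsetX * h[7], inv * h[2] + offsetX * h[8],
            inv * h[3] + offsetY * h[6], inv * h[4] + offsetY * h[7], inv * h[5] + offsetY * h[8],
            h[6], h[7], h[8]
        };
    }

    /** Projects a tag-space point through a row-major homography to pixel coordinates. */
    static double[] project(double[] h, double x, double y) {
        double w = h[6] * x + h[7] * y + h[8];
        return new double[] {(h[0] * x + h[1] * y + h[2]) / w, (h[3] * x + h[4] * y + h[5]) / w};
    }

    /** Keeps one detection per tag ID, preferring the highest decision margin. */
    static List<Tag> dedupe(List<Tag> tags) {
        Map<Integer, Tag> best = new LinkedHashMap<>();
        for (Tag t : tags) {
            best.merge(t.id(), t, (a, b) -> a.decisionMargin() >= b.decisionMargin() ? a : b);
        }
        return new ArrayList<>(best.values());
    }
}
```

For example, a point that projects to (12, 22) inside an ROI that was downscaled by 0.5 and cropped at offset (100, 50) maps to (124, 94) in the full frame, which is what projecting through the transformed homography gives.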

Adaptive Tag Resizing (ATR)

Downscales large/close tags within each ROI to a target pixel dimension before detection, improving decode performance for near-field tags
atrEnabled (default: true), atrTargetDimension (default: 200 px), atrMinScaleFactor (default: 0.25, caps at 4× downscale)
Coordinates and homography are correctly inverse-scaled back to full-frame space after detection
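
One plausible way to pick the ATR scale factor is sketched below. The description gives only the settings and their defaults, so the exact formula here is an assumption: the sketch uses the larger ROI side, never upscales, and clamps at `atrMinScaleFactor`. `AtrSketch` and `scaleFor` are hypothetical names.

```java
/** Hypothetical sketch of choosing an ATR scale factor; the PR's exact formula may differ. */
final class AtrSketch {
    /**
     * Returns the factor to resize an ROI by before decoding. 1.0 means "leave as is";
     * values below 1.0 downscale. The result never drops below minScaleFactor.
     */
    static double scaleFor(int roiWidth, int roiHeight, int targetDimension, double minScaleFactor) {
        int largest = Math.max(roiWidth, roiHeight);
        if (largest <= targetDimension) {
            return 1.0; // small/far tags are already at or below the target size
        }
        double scale = (double) targetDimension / largest;
        return Math.max(scale, minScaleFactor); // e.g. 0.25 caps the downscale at 4x
    }
}
```

Under these assumptions and the default settings, a 400×400 px ROI is scaled by 0.5, while a 1600×1000 px ROI hits the 0.25 floor instead of the 0.125 that the target dimension alone would suggest.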

New AprilTag Pipeline Settings

ML detection: useMLDetection, mlConfidenceThreshold (0.5), mlNmsThreshold (0.45), mlRoiPaddingPixels (40), mlFallbackToTraditional (true), mlModelName, showDetectionBoxes (true)
ATR: atrEnabled (true), atrTargetDimension (200), atrMinScaleFactor (0.25)
Multi-tag ambiguity filtering: multiTagAmbiguityThreshold (0.2) — filters high-ambiguity single-tag poses before multi-tag PNP estimation
Configurable max targets: Replaced boolean outputShowMultipleTargets with numeric outputMaximumTargets (default: 20, max: 127). Backward-compatible deserialization via @JsonAnySetter migration
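
The multi-tag ambiguity filter can be pictured as a simple pre-filter on single-tag results. The sketch below assumes tags whose ambiguity is above the threshold are excluded (so a tag exactly at the threshold is kept); `AmbiguityFilterSketch` and `TagPose` are hypothetical names, not PhotonVision types.

```java
import java.util.List;

/** Hypothetical sketch of filtering tags before multi-tag PNP estimation. */
final class AmbiguityFilterSketch {
    /** A single-tag result, reduced to its ID and pose ambiguity (0 = unambiguous). */
    record TagPose(int id, double ambiguity) {}

    /** Keeps only tags whose pose ambiguity does not exceed the threshold. */
    static List<TagPose> filter(List<TagPose> tags, double threshold) {
        return tags.stream().filter(t -> t.ambiguity() <= threshold).toList();
    }
}
```

With the default threshold of 0.2, tags with ambiguity 0.05 and 0.2 would be passed to the multi-tag estimate and a tag with ambiguity 0.6 would be dropped.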

Model Management

Added apriltagV4-yolo11.rknn (RK3588) and apriltagV4-yolo11.tflite (QCS6490/Rubik Pi 3)
NeuralNetworkModelManager handles platform-aware model discovery and loading

Frontend / UI

AprilTagTab.vue: New "AI-Assisted Detection (NPU)" section with model selector, confidence/NMS/padding sliders, fallback toggle, and ROI box visualization toggle. Conditionally rendered only when supportedBackends is non-empty
OutputTab.vue: New "Max Allowed Ambiguity" slider for multi-tag filtering. Replaced "Show Multiple Targets" toggle with "Maximum Targets" numeric slider
TypeScript type updates in PipelineTypes.ts and SettingTypes.ts

Bug Fixes & Improvements

Fixed stale dashboard UI components persisting when activating cameras
CVMat memory management improvements (release processed Focus mat, fix refcounting)
Additive pixel padding strategy for ROI expansion (naturally adaptive: small/far tags get proportionally more expansion)
Thread pooling configuration defaults (4 threads)
Frame.java: Added mlDetectionRois field to carry ROI bounding boxes through the pipeline for visualization
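
To illustrate why the additive padding mentioned above is "naturally adaptive", here is a minimal sketch. It assumes axis-aligned ROIs clamped to the frame bounds, whereas the PR carries ROIs as `RotatedRect`; `RoiPaddingSketch`, `Roi`, and `pad` are hypothetical names.

```java
/** Hypothetical sketch of additive ROI padding; the PR's types and clamping may differ. */
final class RoiPaddingSketch {
    /** Axis-aligned region of interest in full-frame pixel coordinates. */
    record Roi(int x, int y, int width, int height) {}

    /** Grows the ROI by a fixed number of pixels on every side, clamped to the frame. */
    static Roi pad(Roi roi, int paddingPx, int frameWidth, int frameHeight) {
        int x0 = Math.max(0, roi.x() - paddingPx);
        int y0 = Math.max(0, roi.y() - paddingPx);
        int x1 = Math.min(frameWidth, roi.x() + roi.width() + paddingPx);
        int y1 = Math.min(frameHeight, roi.y() + roi.height() + paddingPx);
        return new Roi(x0, y0, x1 - x0, y1 - y0);
    }
}
```

With the default 40 px padding and no clamping, a 20 px ROI grows to 100 px (5× its size), while a 400 px ROI grows to 480 px (1.2×), so small or distant tags receive proportionally more margin.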

Meta

Merge checklist:

Pull Request title is short, imperative summary of proposed changes
The description documents the what and why, including events that led to this PR
If this PR changes behavior or adds a feature, user documentation is updated
If this PR touches photon-serde, all messages have been regenerated and hashes have not changed unexpectedly
If this PR touches configuration, this is backwards compatible with all settings going back to the previous season's last release (seasons end after champs ends)
If this PR touches pipeline settings or anything related to data exchange, the frontend typing is updated
If this PR addresses a bug, a regression test for it is added
If this PR adds a dependency, the license has been checked for compatibility and steps taken to follow it

mcm001 · 2026-03-26T04:57:40Z

    public final FrameStaticProperties frameStaticProperties;

+    /** Optional ML detection ROI bounding boxes for visualization. Set by ML-assisted pipelines. */
+    public List<RotatedRect> mlDetectionRois = List.of();


Frame isn't the right place to maintain this state. Can it move to the pipeline result?

mcm001 · 2026-03-26T04:58:29Z

    }

+    /** Result container for ML hybrid detection */
+    private static class MLDetectionResult {


Let's refactor this to not live as an inner class

Should be resolved in DoctorFogarty@0dac7b3

mcm001 · 2026-03-26T04:58:48Z

+     * Performs ML-assisted hybrid AprilTag detection. Stage 1: ML model detects ROIs Stage 2:
+     * Traditional detector decodes tags within ROIs
+     */
+    private MLDetectionResult processMLHybrid(Frame frame) {


This logic feels like it wants to be a Pipe

spacey-sooty · 2026-03-27T17:05:07Z

+    <pv-slider
+      v-if="
+        (currentPipelineSettings.pipelineType === PipelineType.AprilTag ||
+          currentPipelineSettings.pipelineType === PipelineType.Aruco) &&
+        useCameraSettingsStore().isCurrentVideoFormatCalibrated &&
+        useCameraSettingsStore().currentPipelineSettings.solvePNPEnabled &&
+        currentPipelineSettings.doMultiTarget
+      "
+      v-model="currentPipelineSettings.multiTagAmbiguityThreshold"
+      label="Max Allowed Ambiguity"
+      tooltip="Tags with pose ambiguity above this value are excluded from multi-tag estimation. Lower = stricter. 0 = only unambiguous tags. 1 = include all (disabled)."
+      :min="0"
+      :max="1"
+      :step="0.05"
+      :switch-cols="interactiveCols"
+      @update:modelValue="
+        (value) => useCameraSettingsStore().changeCurrentPipelineSetting({ multiTagAmbiguityThreshold: value }, false)
+      "
+    />


This feature should be split to a separate PR


me-it-is · 2026-04-17T00:26:54Z

What are the performance benefits of this like?

srimanachanta · 2026-04-18T03:07:14Z

What are the performance benefits of this like?

Ditto, I'm curious to see the performance benefits of doing this: quad fitting versus ROI cropping, which still requires either a DMA transfer or a mem-copy to the NPU.

I'd also want to see the performance benefits of being able to reduce decimate in just those areas, given that fewer pixels are being searched to begin with (increased range for the same baseline latency addition from using ML).

DoctorFogarty · 2026-04-19T01:28:21Z

OV9281
1280x800
AI DETECTION OFF - Decimate 1

DoctorFogarty · 2026-04-19T01:28:53Z

OV9281
1280x800
Decimate 1 Hardcoded for ROI Frames
AI DETECTOR ON

DoctorFogarty · 2026-04-19T01:32:39Z

ThriftyCam 2MP Camera
1600x1304 YUYV
AI DETECTOR OFF
Decimate 1

DoctorFogarty · 2026-04-19T01:33:23Z

ThriftyCam 2MP Camera
1600x1304 YUYV
AI DETECTOR ON
Decimate 1 Hardcoded for ROI Frames

DoctorFogarty · 2026-04-19T01:34:31Z

@me-it-is @srimanachanta see above.

srimanachanta · 2026-04-19T08:17:06Z

@me-it-is @srimanachanta see above.

Insanely cool. Good work.

mcm001 · 2026-04-23T04:39:28Z

I think this is worth a design doc in the developer section of our website + some extra words added to our normal user docs as well before we merge. There's a lotta brains and thinking going on here and I want to support both future devs and users confused about why the tags have a bounding box now

DoctorFogarty requested a review from a team as a code owner March 26, 2026 04:49

github-actions bot added the frontend and backend labels Mar 26, 2026

mcm001 reviewed Mar 26, 2026

DoctorFogarty force-pushed the apriltag-ml-experimental-sync branch from 5f60a17 to 93c2d80 Compare March 26, 2026 05:12

github-actions bot added the documentation and photonlib labels Mar 26, 2026

DoctorFogarty force-pushed the apriltag-ml-experimental-sync branch from 93c2d80 to ca0e47b Compare March 26, 2026 05:18

github-actions bot removed the documentation and photonlib labels Mar 26, 2026

spacey-sooty reviewed Mar 27, 2026

DoctorFogarty and others added 19 commits March 30, 2026 10:57

Created Apriltag ML assisted Apriltag settings

490181a

Setup Basic Tests Included Roboflow model tflite yolov8n trained

Small change to stop spamming logs every time a pipe setting check occurs

f95fa72

remove duplicated .tflite, remove tool-versions.yaml

e790e0c

Homography transform added to requirements

5b4df53

Match existing pattern for settingsStore

1906547

Minor comment cleanup, added source umich documentation to where matrix type is cited from

8d5fbb4

Unit testing homography transformation

5ef08dc

ROI Decimate should always be 1

215a0ea

testing enhancements to ROI size fallback

a85ded9

Thread pooling

597bae2

Revert "testing enhancements to ROI size fallback"

e033c97

This reverts commit e40f174.

Adaptive Tag Resizing to fix poor near field performance

3bc64f6

AprilTags are Y down, UI selector

894af9c

Removed Subpix refinement as it was not well informed. Removed auto packaged V8 model as I will replace it soon.

4893221

feat: Model selector on AprilTag screen after choosing AI Acceleration

019263e

Slightly higher atr

afcd2e6

AprilTag Pipeline ROI box viewer

84d3a1e

Additive Pixel Padding strategy change

def86fe

Synced Changes from 2026.3.2

2db59b6

judsonjames and others added 3 commits March 30, 2026 10:57

Add TFLite and RKNN models to source for review

ab06e25

Removed old model weights.

44cd48e

Removed old V8 model for AprilTags. Added entries for current V11 AprilTag Models for Rubik and OPi5

Added a configurable multi-tag estimate ambiguity filter

f088b09

samfreund force-pushed the apriltag-ml-experimental-sync branch from c703924 to fd791a8 Compare March 30, 2026 15:57

lint

076b6ba

samfreund force-pushed the apriltag-ml-experimental-sync branch from fd791a8 to 076b6ba Compare March 30, 2026 16:04

DoctorFogarty and others added 4 commits March 30, 2026 15:47

chore: Fix merge text artifact

e4ca2bb

Move MLDetectionResult to standalone record

0dac7b3

WPIFormat on MLDetectionResult

27b2604

Merge branch 'PhotonVision:main' into apriltag-ml-experimental-sync

00a6270


7 participants
