[ONNX FE] Support com.microsoft::DynamicQuantizeLSTM by mvafin · Pull Request #36207 · openvinotoolkit/openvino

mvafin · 2026-06-03T11:39:02Z

Details:

- Adds a com.microsoft::DynamicQuantizeLSTM (opset 1) translator to the ONNX Frontend.
- Dequantizes W/R using ov::decomposition::low_precision_dequantize so that MarkDequantization can fire and the CPU/GPU quantized kernel runs when the weights are graph constants.
- Rejects the unsupported peephole input P with a clear error; adds a TODO to support it via LSTMCell unrolling.
- Extracts shared recurrent utilities (normalize_tensor_rank, LSTMDimensions, default optional-input helpers) into utils/recurrent.hpp/.cpp, used by both the new translator and lstm.cpp.
- Adds two tests: a runtime-input variant (Unsqueeze alignment path) and a graph-constant variant (MarkDequantization path).
- Updates the add-fe-op/onnx.md agent skill with lessons from this implementation.
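The weight dequantization mentioned above follows the usual affine formula `real = (q - zero_point) * scale`. The following is a minimal standalone sketch of that arithmetic, assuming uint8 weights and one scale and zero point per output channel. The function `dequantize_per_channel` is hypothetical and only illustrates the math; it is not the `ov::decomposition::low_precision_dequantize` API, which builds graph nodes rather than computing values.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical helper (not part of OpenVINO): dequantize a row-major
// [rows, channels] uint8 matrix with one scale and one zero point per channel,
// using real = (q - zero_point) * scale.
std::vector<float> dequantize_per_channel(const std::vector<std::uint8_t>& q,
                                          std::size_t rows,
                                          std::size_t channels,
                                          const std::vector<float>& scale,
                                          const std::vector<std::uint8_t>& zero_point) {
    assert(q.size() == rows * channels);
    assert(scale.size() == channels && zero_point.size() == channels);

    std::vector<float> out(q.size());
    for (std::size_t r = 0; r < rows; ++r) {
        for (std::size_t c = 0; c < channels; ++c) {
            const std::size_t i = r * channels + c;
            // Subtract in a signed type so values below the zero point become negative.
            const int centered = static_cast<int>(q[i]) - static_cast<int>(zero_point[c]);
            out[i] = static_cast<float>(centered) * scale[c];
        }
    }
    return out;
}
```

For example, with scales `{0.5f, 0.25f}` and zero points `{128, 0}`, the quantized value 130 in channel 0 maps to 1.0 and the value 255 in channel 1 maps to 63.75.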

Tickets:

CVS-183376

AI Assistance:

AI assistance used: yes
Claude reviewed PR [EXPERIMENT][WIP] [ONNX FE] Add DynamicQuantizeLSTM operator support #35680 (original DynamicQuantizeLSTM implementation), identified correctness gaps and duplication, implemented fixes and refactoring.

…ain)

Implements the DynamicQuantizeLSTM contrib operator from the com.microsoft ONNX domain. The translator dequantizes the quantized W and R tensors, normalizes their layout to the standard ONNX LSTM gate ordering, and feeds them into LSTMSequence. Validated by building openvino_onnx_frontend, loading a standalone DynamicQuantizeLSTM model extracted from KittenML/kitten-tts-mini-0.8, and comparing OpenVINO outputs against ONNX Runtime.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Copilot

Pull request overview

Adds ONNX Frontend support for com.microsoft::DynamicQuantizeLSTM (opset 1) by lowering it to ov::op::v5::LSTMSequence, including weight dequantization via ov::decomposition::low_precision_dequantize. The PR also refactors shared recurrent/LSTM utilities, adds regression coverage for both runtime-parameter and constant-weight paths, and updates related documentation.

Changes:

- Add a DynamicQuantizeLSTM translator under src/frontends/onnx/frontend/src/op/com.microsoft/, including scale/zero-point alignment and explicit rejection of the peephole input P.
- Extract shared recurrent helpers (normalize_tensor_rank, LSTMDimensions, default optional-input builders) into utils/recurrent.hpp/.cpp and reuse them from the existing LSTM translator.
- Add two new ONNX FE tests and prototxt models covering the runtime-input alignment and constant-weight MarkDequantization patterns; update the supported-ops doc and the internal agent skill notes.

Reviewed changes

Copilot reviewed 9 out of 9 changed files in this pull request and generated 2 comments.


| File | Description |
| --- | --- |
| `src/frontends/onnx/tests/onnx_import_com_microsoft.in.cpp` | Adds two regression tests for `DynamicQuantizeLSTM` (runtime inputs vs const weights) sharing common inputs/expected outputs. |
| `src/frontends/onnx/tests/models/com.microsoft/dynamic_quantize_lstm.prototxt` | New prototxt model exercising runtime-provided W/R/scales/zps. |
| `src/frontends/onnx/tests/models/com.microsoft/dynamic_quantize_lstm_const_weights.prototxt` | New prototxt model with W/R/scales/zps as initializers (const weights path). |
| `src/frontends/onnx/frontend/src/utils/recurrent.hpp` | Introduces shared recurrent utilities declarations (`normalize_tensor_rank`, `LSTMDimensions`, default optional input builders). |
| `src/frontends/onnx/frontend/src/utils/recurrent.cpp` | Implements the new recurrent utilities and refactors existing default-input construction to use them. |
| `src/frontends/onnx/frontend/src/op/lstm.cpp` | Switches LSTM translator to use the shared recurrent utilities instead of local helpers/duplicated logic. |
| `src/frontends/onnx/frontend/src/op/com.microsoft/dynamic_quantize_lstm.cpp` | New translator implementation for `com.microsoft::DynamicQuantizeLSTM`. |
| `src/frontends/onnx/docs/supported_ops.md` | Marks `DynamicQuantizeLSTM` as supported and documents the peephole `P` limitation. |
| `.github/agents-prototype/skills/add-fe-op/onnx.md` | Updates internal guidance with lessons learned (LPT dequant pattern + axis alignment + testing expectations). |

+// Runtime dimension values extracted from OV-layout X [batch, seq, input]
+// and R [num_dir, gates*hidden, hidden]. Each member is a rank-1 i32 node.
+struct LSTMDimensions {


+    const auto gate_axis_size = 4 * hidden_size;
+    const auto& shape = weights.get_partial_shape();
+    const auto dim1_matches = shape[1].is_static() && shape[1].get_length() == gate_axis_size;
+    const auto dim2_matches = shape[2].is_static() && shape[2].get_length() == gate_axis_size;
+
+    CHECK_VALID_NODE(node,
+                     dim1_matches || dim2_matches,
+                     "DynamicQuantizeLSTM input '",
+                     input_name,
+                     "' must have either axis 1 or axis 2 equal to 4*hidden_size (",
+                     gate_axis_size,
+                     "). Got shape: ",
+                     shape);
+
+    if (!dim1_matches && dim2_matches) {
+        return ov::op::util::reorder_axes(weights, {0, 2, 1});
+    }
+    return weights;
+}
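
The axis check in the fragment above can be mirrored on plain static shapes. This is a hedged, standalone sketch: `Shape3`, `needs_gate_axis_swap`, and `swap_last_two_axes` are hypothetical names introduced here for illustration, and the real translator works on `ov::PartialShape` and graph nodes rather than on raw buffers.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

using Shape3 = std::array<std::size_t, 3>;

// Hypothetical analogue of the check above for fully static rank-3 shapes.
// Returns true when axes 1 and 2 must be swapped so that the 4*hidden_size
// axis ends up on axis 1. Throws if neither axis has that size.
bool needs_gate_axis_swap(const Shape3& shape, std::size_t hidden_size) {
    const std::size_t gate_axis_size = 4 * hidden_size;
    const bool dim1_matches = shape[1] == gate_axis_size;
    const bool dim2_matches = shape[2] == gate_axis_size;
    if (!dim1_matches && !dim2_matches) {
        throw std::invalid_argument("neither axis 1 nor axis 2 equals 4*hidden_size");
    }
    // If both axes match, axis 1 wins and the tensor is left untouched.
    return !dim1_matches && dim2_matches;
}

// Applies the axis permutation {0, 2, 1} to a row-major [d0, d1, d2] buffer,
// producing a row-major [d0, d2, d1] buffer.
std::vector<float> swap_last_two_axes(const std::vector<float>& data, const Shape3& shape) {
    assert(data.size() == shape[0] * shape[1] * shape[2]);
    std::vector<float> out(data.size());
    for (std::size_t d = 0; d < shape[0]; ++d) {
        for (std::size_t i = 0; i < shape[1]; ++i) {
            for (std::size_t j = 0; j < shape[2]; ++j) {
                out[(d * shape[2] + j) * shape[1] + i] = data[(d * shape[1] + i) * shape[2] + j];
            }
        }
    }
    return out;
}
```

One consequence of the `!dim1_matches && dim2_matches` condition is that a shape in which both axes equal `4*hidden_size` is treated as already being in the expected layout, so the ambiguous case is never transposed.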


Review-driven improvements on top of the DynamicQuantizeLSTM PR:

- Fix the test that could never pass. ONNX Runtime's DynamicQuantizeLSTM dynamically quantizes the activations (X and the recurrent hidden state) and runs integer matmuls, whereas this translator only dequantizes the W/R weights and runs a float LSTMSequence. The two differ by the activation-quantization noise (~1.5e-3 here), so the original 1e-6 tolerance failed against the ORT-generated expected values. Relax the tolerance to 0.0055 (matching the sibling DynamicQuantizeMatMul test) and document the approximation with a TODO to model activation quantization (OpenVINO CPU/GPU plugins support dynamic quantization).
- Drop the speculative rank-2 (num_directions-omitted) weight handling. The com.microsoft spec defines W/R as rank-3 only; the rank-2 branch was dead code that would also have emitted outputs with an extra num_directions dimension (no squeeze-back, no bidirectional guard, unlike lstm.cpp). Require rank-3 and remove the now-dead original_rank plumbing.
- Reject the unsupported peephole input P instead of silently dropping it.
- Reduce duplication with the standard LSTM translator: extract normalize_tensor_rank and the default optional-input fabrication (dimension extraction + default bias / sequence_lens / initial state) into shared recurrent utils, used by both lstm.cpp and the dynamic translator. Data-bearing edges (X, W, R, provided initial states) stay explicit in each translator so future activation-quantization insertion is unobstructed.
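
To see why a float LSTM cannot match an activation-quantizing reference to 1e-6, it helps to look at the round-trip error of dynamic 8-bit quantization. The sketch below assumes a common asymmetric uint8 scheme (range extended to include zero, `scale = (max - min) / 255`); it is an illustration only and is not claimed to reproduce ONNX Runtime's exact kernel. `QuantParams`, `choose_params`, `quantize_dequantize`, and `max_round_trip_error` are hypothetical names.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <cstdint>
#include <vector>

struct QuantParams {
    float scale;
    std::uint8_t zero_point;
};

// Picks per-tensor parameters from the data itself ("dynamic" quantization).
// The range is widened to contain 0 so that zero is exactly representable.
QuantParams choose_params(const std::vector<float>& x) {
    assert(!x.empty());
    const float lo = std::min(0.0f, *std::min_element(x.begin(), x.end()));
    const float hi = std::max(0.0f, *std::max_element(x.begin(), x.end()));
    float scale = (hi - lo) / 255.0f;
    if (scale == 0.0f) {
        scale = 1.0f;  // all-zero input: any positive scale works
    }
    const float zp = std::min(255.0f, std::max(0.0f, std::round(-lo / scale)));
    return {scale, static_cast<std::uint8_t>(zp)};
}

// Quantizes one value to the uint8 grid and maps it back to float.
float quantize_dequantize(float v, const QuantParams& p) {
    float q = std::round(v / p.scale) + p.zero_point;
    q = std::min(255.0f, std::max(0.0f, q));  // saturate to the uint8 range
    return (q - p.zero_point) * p.scale;
}

// Largest absolute difference between the input and its quantized round trip.
float max_round_trip_error(const std::vector<float>& x) {
    const QuantParams p = choose_params(x);
    float worst = 0.0f;
    for (float v : x) {
        worst = std::max(worst, std::fabs(v - quantize_dequantize(v, p)));
    }
    return worst;
}
```

For activations spanning roughly [-1, 1], the scale is about 2/255 ≈ 7.8e-3, so each element can be off by a few thousandths before any matmul is applied. That is the same order of magnitude as the ~1.5e-3 output difference reported above and far larger than a 1e-6 tolerance.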

mlukasze and others added 2 commits June 3, 2026 12:07

Add DynamicQuantizeLSTM test

31c5fcc

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

mvafin requested a review from Copilot June 3, 2026 11:39

mvafin requested review from a team as code owners June 3, 2026 11:39

mvafin requested review from tsavina and removed request for a team June 3, 2026 11:39

Copilot started reviewing on behalf of mvafin June 3, 2026 11:39

github-actions Bot added the category: CI, category: docs, and category: ONNX FE labels Jun 3, 2026

Copilot AI reviewed Jun 3, 2026

mvafin wants to merge 3 commits into openvinotoolkit:master from mvafin:mvafin/onnx/dynamic-quantize-lstm-improvements
