Skip to content

Add ONNX Runtime GenAI generation comparison in OnnxDiscrepancyCheck#2487

Open
xadupre wants to merge 9 commits into
mainfrom
xadupre/genai
Open

Add ONNX Runtime GenAI generation comparison in OnnxDiscrepancyCheck#2487
xadupre wants to merge 9 commits into
mainfrom
xadupre/genai

Conversation

@xadupre
Copy link
Copy Markdown
Member

@xadupre xadupre commented Jun 1, 2026

Describe your changes

Extended OnnxDiscrepancyCheck to optionally run ONNX Runtime GenAI generation and compare it with the reference transformers generation using the longest common prefix of token IDs.

Also incorporated follow-up fixes from review feedback:

  • Updated tests to correctly mock the function-local onnxruntime_genai import via patch.dict(sys.modules, ...).
  • Flattened nested with usage in tests to satisfy linting.
  • Set generate_max_new_tokens default to 32 so GenAI generation comparison runs with a practical default.
  • Switched the GenAI generation path to the latest onnxruntime-genai API by using generator.append_tokens(...) and removing deprecated params.input_ids / compute_logits() usage.
  • Added a clear optional-dependency error message when onnxruntime-genai is not installed.
  • Updated tests to validate append_tokens(...) is invoked with the expected prompt token IDs.

Checklist before requesting a review

  • Add unit tests for this change.
  • Make sure all tests can pass.
  • Update documents if necessary.
  • Lint and apply fixes to your code by running lintrunner -a
  • Is this a user-facing change? If yes, give a description of this change to be included in the release notes.

(Optional) Issue link

Comment thread test/passes/onnx/test_discrepancy_check.py Fixed
Co-authored-by: Copilot Autofix powered by AI <62310815+github-advanced-security[bot]@users.noreply.github.com>
Comment thread test/passes/onnx/test_discrepancy_check.py Fixed
Comment thread test/passes/onnx/test_discrepancy_check.py Fixed
Comment thread test/passes/onnx/test_discrepancy_check.py Fixed
Comment thread test/passes/onnx/test_discrepancy_check.py Fixed
@xadupre xadupre marked this pull request as ready for review June 1, 2026 16:40
Copilot AI review requested due to automatic review settings June 1, 2026 16:40
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Extends the OnnxDiscrepancyCheck pass with an optional comparison between the reference HuggingFace model's generate() output and ONNX Runtime GenAI's generation output, reporting (and optionally enforcing a threshold on) the length of the longest matching token prefix. Adds unit tests for the new helper and the _compare_generation method.

Changes:

  • Adds _longest_common_token_sequence helper and a new _compare_generation method that runs both transformers and ORT-GenAI generation.
  • Adds genai_model_path, generate_prompt, generate_max_new_tokens, and min_longest_common_tokens config parameters, with a failure when the common prefix is below the configured minimum.
  • Adds test/passes/onnx/test_discrepancy_check.py covering the helper and the generation comparison path with mocks.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 3 comments.

File Description
olive/passes/onnx/discrepancy_check.py Adds GenAI-based token-sequence comparison logic, new config params, and failure threshold.
test/passes/onnx/test_discrepancy_check.py Unit tests for the new helper and _compare_generation, using mocks for og and tokenizers.

Comment thread olive/passes/onnx/discrepancy_check.py
Comment thread test/passes/onnx/test_discrepancy_check.py Outdated
Comment thread olive/passes/onnx/discrepancy_check.py Outdated
@xadupre
Copy link
Copy Markdown
Member Author

xadupre commented Jun 2, 2026

@copilot fix test_compare_generation_returns_common_prefix_length,

opt/hostedtoolcache/Python/3.12.13/x64/lib/python3.12/unittest/mock.py:1396: in patched
return func(*newargs, **newkeywargs)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
E TypeError: TestCompareGeneration.test_compare_generation_fully_matching() takes 1 positional argument but 2 were given

Copy link
Copy Markdown
Contributor

Copilot AI commented Jun 2, 2026

@copilot fix test_compare_generation_returns_common_prefix_length,

opt/hostedtoolcache/Python/3.12.13/x64/lib/python3.12/unittest/mock.py:1396: in patched
...

Fixed in c0ea083. I removed the extra @patch(...) decorators on the TestCompareGeneration test methods so unittest no longer injects an unexpected positional mock argument.

@xadupre
Copy link
Copy Markdown
Member Author

xadupre commented Jun 2, 2026

@copilot test/passes/onnx/test_discrepancy_check.py:98: in test_compare_generation_returns_common_prefix_length
patch("olive.passes.onnx.discrepancy_check.og", mock_og),
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
/opt/hostedtoolcache/Python/3.12.13/x64/lib/python3.12/unittest/mock.py:1467: in enter
original, local = self.get_original()
^^^^^^^^^^^^^^^^^^^
/opt/hostedtoolcache/Python/3.12.13/x64/lib/python3.12/unittest/mock.py:1437: in get_original
raise AttributeError(
E AttributeError: <module 'olive.passes.onnx.discrepancy_check' from '/mnt/vss/_work/1/s/olive/passes/onnx/discrepancy_check.py'> does not have the attribute 'og'

Copy link
Copy Markdown
Contributor

Copilot AI commented Jun 2, 2026

@copilot test/passes/onnx/test_discrepancy_check.py:98: in test_compare_generation_returns_common_prefix_length
patch("olive.passes.onnx.discrepancy_check.og", mock_og),
^^^^^^^^^^^^^^^^^^^^...

Fixed in 4e82f66. The tests now mock the function-local import onnxruntime_genai as og via patch.dict(sys.modules, {"onnxruntime_genai": mock_og}) instead of patching olive.passes.onnx.discrepancy_check.og, and the remaining nested with was flattened.

Comment thread olive/passes/onnx/discrepancy_check.py Outdated
Copilot AI changed the title add onnxruntime-genai in OnnxDiscrepancyCheck to measure the first token different Add ONNX Runtime GenAI generation comparison in OnnxDiscrepancyCheck Jun 2, 2026
@xadupre
Copy link
Copy Markdown
Member Author

xadupre commented Jun 2, 2026

@copilot can you switch the latest API for onnxruntime-genai?

Copy link
Copy Markdown
Contributor

Copilot AI commented Jun 2, 2026

@copilot can you switch the latest API for onnxruntime-genai?

Done in 38aa991. I switched OnnxDiscrepancyCheck.compare_generation to the latest onnxruntime-genai API by using generator.append_tokens([genai_input_ids]) and removing deprecated params.input_ids / compute_logits() usage. I also added a clear ImportError message when onnxruntime-genai is not installed and updated tests to assert append_tokens is called.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants