Merged
63 commits
ef42e47
Initial plan
Copilot May 11, 2026
00571f0
feat: add CLI test-model config for HF inputs
Copilot May 11, 2026
485dfbf
test: broaden HF test-model coverage
Copilot May 11, 2026
a6fa34a
chore: polish test model config handling
Copilot May 11, 2026
273850c
fix: fail fast for HF test model loading
Copilot May 11, 2026
318fcbe
refactor: remove nested try from HF test loading
Copilot May 11, 2026
40b0740
test: cover trust_remote_code helper behavior
Copilot May 11, 2026
386ff01
feat: persist reusable HF test model path
Copilot May 11, 2026
09fac8c
fix: tighten HF test model path handling
Copilot May 11, 2026
09df0a7
refactor: simplify test model path handling
Copilot May 11, 2026
d4ebad5
lintrunner
xadupre May 11, 2026
6321d32
lint
xadupre May 11, 2026
6709852
docs: add phi test conversion how-to
Copilot May 11, 2026
3f8f8fc
feat: support run command test models
Copilot May 11, 2026
189289a
chore: address review nits for run test support
Copilot May 11, 2026
c90c520
chore: simplify run test override handling
Copilot May 11, 2026
5cec47f
chore: polish run test support follow-up
Copilot May 11, 2026
eaf0a16
fix: use saved test checkpoint in model builder
Copilot May 11, 2026
17cb075
chore: tidy model builder test fixture
Copilot May 11, 2026
996f633
chore: clarify model builder test model errors
Copilot May 11, 2026
00732b7
fix dtype: auto
xadupre May 11, 2026
76ee4ef
update documentation
xadupre May 11, 2026
9ba38c7
docs: clarify phi smoke test output path
Copilot May 11, 2026
fa6bee7
docs: switch smoke test how-to to qwen
Copilot May 11, 2026
ffda0dc
test: cover documented llm smoke flow
Copilot May 11, 2026
d0f868f
test: polish documented smoke flow test
Copilot May 11, 2026
a408b63
test: rename smoke flow cli test
Copilot May 11, 2026
e272c2a
test: refine smoke flow workflow stubs
Copilot May 11, 2026
c901b63
test: tidy smoke flow helper names
Copilot May 11, 2026
36410cd
test: clarify smoke flow mocks
Copilot May 11, 2026
e16cb82
test: polish documented smoke flow test naming
Copilot May 11, 2026
7507604
test: lift smoke flow imports and mock defaults
Copilot May 11, 2026
ac7840f
fix: keep qwen test layer types in sync
Copilot May 11, 2026
f165dda
Merge origin/main and resolve model builder test conflict
Copilot May 12, 2026
5daba5d
Merge origin/main into copilot/fr-add-model-to-config-json
Copilot May 12, 2026
8941efb
Potential fix for pull request finding
xadupre May 12, 2026
be35ef4
Merge branch 'main' into copilot/fr-add-model-to-config-json
xadupre May 13, 2026
1620644
Merge branch 'main' into copilot/fr-add-model-to-config-json
xadupre May 25, 2026
b9d6d32
Add Ubuntu CI smoke test job for --test flow
Copilot May 25, 2026
98ff554
Isolate Ubuntu smoke test flow
Copilot May 25, 2026
3f635f5
Move smoke CI to GitHub Actions
Copilot May 25, 2026
935a970
test: remove mocks from smoke flow
Copilot May 25, 2026
36cfda8
test: refactor smoke test model coverage
Copilot May 25, 2026
5b4a957
try
xadupre May 25, 2026
6a1067c
fix
xadupre May 25, 2026
6e999f3
more comments
xadupre May 25, 2026
a831161
Add smoke artifact size assertions
Copilot May 25, 2026
052db98
add mistralai
xadupre May 25, 2026
6cdfedb
rename a file
xadupre May 25, 2026
c8be9f5
Tighten smoke test artifact assertions
Copilot May 25, 2026
6de206e
Move smoke test expectations to use site
Copilot May 25, 2026
acd8c8a
rename
xadupre May 25, 2026
1239db7
rename
xadupre May 25, 2026
03d7528
Protect test output reuse
Copilot May 25, 2026
9d5f85b
fix remaining unit tests
xadupre May 26, 2026
70625b9
fix lint issues
xadupre May 26, 2026
6f876df
Potential fix for pull request finding 'CodeQL / Wrong number of argu…
xadupre May 26, 2026
2cb2f41
Potential fix for pull request finding
xadupre May 26, 2026
9a73e09
Relax smoke test output file assertions
Copilot May 26, 2026
4408a05
Fix --test dry-run output marking
Copilot May 26, 2026
e0e15ee
Potential fix for pull request finding
xadupre May 26, 2026
8040341
comment
xadupre May 26, 2026
f16bc10
Merge branch 'copilot/fr-add-model-to-config-json' of https://github.…
xadupre May 26, 2026
39 changes: 39 additions & 0 deletions .github/workflows/test-model-fast.yml
@@ -0,0 +1,39 @@
# This job is run as a github action.
# It checks Olive works on random and small models.
name: Test model fast

on:
  workflow_dispatch:
  push:
    branches:
      - main
  pull_request:
    branches:
      - main

jobs:
  ubuntu-test-model-fast:
    name: Ubuntu test model fast
    runs-on: ubuntu-latest
    permissions:
      contents: read
    steps:
      - uses: actions/checkout@v4

      - name: Setup Python
        uses: actions/setup-python@v5
        with:
          python-version: "3.12"

      - name: Install dependencies
        run: |
          python -m pip install -r requirements.txt
          python -m pip install -r test/requirements-test-cpu.txt

      - name: pip freeze
        run: |
          python -m pip freeze

      - name: Run fast test
        run: |
          python -m pytest -v -s -p no:warnings --disable-warnings --log-cli-level=WARNING test/cli/test_cli_test_model_smoke.py
77 changes: 77 additions & 0 deletions docs/source/how-to/cli/cli-fast-test.md
@@ -0,0 +1,77 @@
# How to convert a Qwen model with a quick `--test` smoke check

If you are converting a large language model, it is often useful to validate the Olive command, environment, and conversion recipe on a much smaller model before spending time on the full checkpoint.

The `--test` option does that for Hugging Face models. Olive keeps the same model architecture, reduces it to a random 2-layer test model, saves it to the folder you provide, and reuses that folder on later runs.

This example uses [`Qwen/Qwen3-0.6B`](https://huggingface.co/Qwen/Qwen3-0.6B), but the same pattern works for other supported Hugging Face LLMs.

## Step 1: generate the workflow config

Start by generating the config that Olive will run for the Qwen conversion.

```bash
olive optimize \
    --model_name_or_path Qwen/Qwen3-0.6B \
    --device cpu \
    --provider CPUExecutionProvider \
    --precision int4 \
    --output_path out/qwen \
    --dry_run
```

This creates `out/qwen/config.json` without launching the full conversion yet.
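
Before moving on, it can help to confirm that the generated config describes a Hugging Face input model, because `olive run --test` rejects any other input type. The sketch below is one way to check. `supports_test_flag` and `load_config` are hypothetical helpers, not part of Olive, and the sketch assumes the config stores the model under an `input_model` key with a `type` field.

```python
import json
from pathlib import Path


def load_config(config_path: str) -> dict:
    """Read a workflow config such as out/qwen/config.json."""
    return json.loads(Path(config_path).read_text())


def supports_test_flag(config: dict) -> bool:
    """Return True when input_model is a dict whose type is "hfmodel" (case-insensitive).

    Assumption: this mirrors the check that `olive run --test` applies to the run config.
    """
    input_model = config.get("input_model")
    return isinstance(input_model, dict) and str(input_model.get("type", "")).lower() == "hfmodel"
```

For example, `supports_test_flag(load_config("out/qwen/config.json"))` should be true for the config generated above if it declares a Hugging Face input model.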

## Step 2: run a fast smoke test with `olive run --test`

Use the generated config with `olive run` and pass `--test` so Olive swaps in a reduced random Qwen model.

```bash
olive run \
    --config out/qwen/config.json \
    --test out/qwen-test-model \
    --output_path out/qwen-test-run
```

What this does:

- `--test out/qwen-test-model` creates a reduced random Qwen model and saves it in `out/qwen-test-model`
- later runs reuse the same saved test model instead of recreating it
- `--output_path out/qwen-test-run` gives the smoke test its own output folder, so the generated ONNX artifacts are easy to find
- Olive marks that output folder as a test-only run by writing an `olive_test_output.json` marker file, and refuses to run `--test` against an existing, non-empty folder that lacks that marker
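
The output-folder guard can be pictured with a small standalone sketch. It is a simplified illustration, not Olive's implementation: `can_use_for_test_run` and `is_marked_test_dir` are hypothetical names, and the sketch returns `False` where Olive raises an error.

```python
import json
from pathlib import Path

MARKER_FILE = "olive_test_output.json"  # marker file name used for test output folders
MARKER_TYPE = "olive_hf_test_output"    # value stored under "type" in the marker


def is_marked_test_dir(folder: Path) -> bool:
    """Return True when the folder holds a readable marker with the expected type."""
    marker = folder / MARKER_FILE
    if not marker.is_file():
        return False
    try:
        data = json.loads(marker.read_text())
    except (OSError, ValueError):
        return False
    return isinstance(data, dict) and data.get("type") == MARKER_TYPE


def can_use_for_test_run(folder: Path) -> bool:
    """Missing or empty folders are acceptable; non-empty ones need the marker."""
    if not folder.exists():
        return True
    if not folder.is_dir():
        return False
    return not any(folder.iterdir()) or is_marked_test_dir(folder)
```

In practice this means `out/qwen-test-run` can be reused across `--test` runs, while pointing `--test` at a folder that already holds a full conversion is rejected.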

After the smoke test finishes, look under `out/qwen-test-run` for the exported ONNX model and related files.

This is a quick way to confirm that:

- Olive can load the source model
- the selected optimization recipe is valid for your setup
- the conversion path completes before you run the full model

If you omit the folder and just pass `--test`, `olive run` will save the reduced model under `<output_path>/test_model`.
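
The two forms of `--test` can be illustrated with a short argparse sketch. It uses a demo parser rather than the real Olive CLI, and `build_parser` and `resolve_test_model_path` are hypothetical names. The argument shape (`nargs="?"` with `const=True`) and the `test_model` default follow the behavior described above.

```python
import argparse
from pathlib import Path
from typing import Optional, Union


def build_parser() -> argparse.ArgumentParser:
    """Demo parser with only the two options needed for this illustration."""
    parser = argparse.ArgumentParser(prog="demo")
    parser.add_argument("--output_path", type=str, default=None)
    # Bare `--test` yields True, `--test DIR` yields "DIR", and omitting it yields None.
    parser.add_argument("--test", type=str, nargs="?", const=True)
    return parser


def resolve_test_model_path(test_value: Union[None, bool, str], output_path: Optional[str]) -> Optional[str]:
    """Return the folder for the reduced model, or None when --test is not set."""
    if test_value in (None, False):
        return None
    if test_value is True:
        if not output_path:
            raise ValueError("--test requires an explicit folder when output_path is not available.")
        return str(Path(output_path) / "test_model")
    return test_value
```

With this sketch, `--test --output_path out/qwen-test-run` resolves to the `test_model` subfolder of `out/qwen-test-run`, and `--test out/qwen-test-model` resolves to the folder you passed.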

## Step 3: run the full conversion

Once the smoke test succeeds, rerun the conversion on the full Qwen checkpoint by removing `--test`.

```bash
olive run \
    --config out/qwen/config.json \
    --output_path out/qwen-full
```

At this point you know the Olive command and the conversion recipe already worked on the lightweight test model, so you can focus on the full-model run instead of debugging both at once.

## Why keep the test model folder?

The saved test model is useful beyond the first smoke test:

- you can rerun the reduced conversion quickly while iterating on options
- you can reuse the same HF test model later when comparing the Hugging Face model against the exported ONNX model
- you avoid recreating a new random test checkpoint every time

## Related docs

- [How to use the `olive optimize` command to optimize a PyTorch model](cli-optimize)
- [How to write a new workflow from scratch](../configure-workflows/build-workflow)
- [CLI reference](../../reference/cli)
2 changes: 2 additions & 0 deletions docs/source/how-to/index.md
@@ -12,6 +12,7 @@ The Olive CLI provides a set of primitives such as `quantize`, `finetune`, `onnx
- [how to use the `olive finetune` command to create (Q)LoRA adapters](cli/cli-finetune)
- [How to use the `olive quantize` command to quantize your model with different precisions and techniques such as AWQ](cli/cli-quantize)
- [How to use the `olive run` command to execute an Olive workflow.](cli/cli-run)
- [How to convert a Qwen model with a quick `--test` smoke check](cli/cli-fast-test)

# Olive Python API

@@ -43,6 +44,7 @@ The Olive CLI provides a set of primitives such as `quantize`, `finetune`, `onnx

installation
cli/cli-optimize
cli/cli-fast-test
cli/cli-auto-opt
cli/cli-finetune
cli/cli-quantize
78 changes: 77 additions & 1 deletion olive/cli/base.py
@@ -17,6 +17,52 @@
from olive.hardware.constants import DEVICE_TO_EXECUTION_PROVIDERS
from olive.resource_path import OLIVE_RESOURCE_ANNOTATIONS

TEST_OUTPUT_MARKER_FILE = "olive_test_output.json"


def _get_test_output_marker_path(output_path: str) -> Path:
    return Path(output_path) / TEST_OUTPUT_MARKER_FILE


def is_test_output_dir(output_path: str) -> bool:
    marker_path = _get_test_output_marker_path(output_path)
    if not marker_path.is_file():
        return False

    try:
        marker = json.loads(marker_path.read_text())
    except (OSError, TypeError, ValueError):
        return False

    return marker.get("type") == "olive_hf_test_output"


def validate_test_output_path(output_path: Optional[str], test_value) -> None:
    if test_value in (None, False) or not output_path:
        return

    output_dir = Path(output_path)
    if not output_dir.exists():
        return
    if not output_dir.is_dir():
        raise ValueError(f"--output_path {output_path} must be a directory.")
    if any(output_dir.iterdir()) and not is_test_output_dir(output_path):
        raise ValueError(
            f"--output_path {output_path} already exists and is not marked as an Olive test output directory. "
            "Use a dedicated output folder for --test runs."
        )


def mark_test_output_path(output_path: Optional[str]) -> None:
    if not output_path:
        return

    output_dir = Path(output_path)
    if not output_dir.is_dir():
        return

    _get_test_output_marker_path(output_path).write_text(json.dumps({"type": "olive_hf_test_output"}, indent=2))


class BaseOliveCLICommand(ABC):
    allow_unknown_args: ClassVar[bool] = False
@@ -33,16 +79,21 @@ def _run_workflow(self):

        from olive.workflows import run as olive_run

        validate_test_output_path(self.args.output_path, getattr(self.args, "test", None))
        Path(self.args.output_path).mkdir(parents=True, exist_ok=True)

        with tempfile.TemporaryDirectory(prefix="olive-cli-tmp-", dir=self.args.output_path) as tempdir:
            run_config = self._get_run_config(tempdir)
            if self.args.save_config_file or self.args.dry_run:
                self._save_config_file(run_config)
            if self.args.dry_run:
                if getattr(self.args, "test", None) not in (None, False):
                    mark_test_output_path(self.args.output_path)
                print("Dry run mode enabled. Configuration file is generated but no optimization is performed.")
                return None
            workflow_output = olive_run(run_config)
            if getattr(self.args, "test", None) not in (None, False):
                mark_test_output_path(self.args.output_path)
            if not workflow_output.has_output_model():
                print("No output model produced. Please check the log for details.")
            else:
@@ -82,6 +133,21 @@ def run(self):
        raise NotImplementedError


def add_hf_test_model_config(input_model: dict, test_value, output_path: Optional[str] = None) -> dict:
    if test_value in (None, False):
        return input_model

    test_model_output_path = test_value
    # Use 2 layers to keep the test model fast and lightweight while preserving the original architecture family.
    input_model["test_model_config"] = {"hidden_layers": 2}
    if test_model_output_path is True:
        if not output_path:
            raise ValueError("--test requires an explicit folder when output_path is not available.")
        test_model_output_path = str(Path(output_path) / "test_model")
    input_model["test_model_path"] = test_model_output_path
    return input_model


def _get_hf_input_model(args: Namespace, model_path: OLIVE_RESOURCE_ANNOTATIONS) -> dict:
    """Get the input model config for HuggingFace model.

@@ -105,7 +171,7 @@ def _get_hf_input_model(args: Namespace, model_path: OLIVE_RESOURCE_ANNOTATIONS)
        input_model["adapter_path"] = args.adapter_path
    if getattr(args, "trust_remote_code", None) is not None:
        input_model["load_kwargs"]["trust_remote_code"] = args.trust_remote_code
-    return input_model
+    return add_hf_test_model_config(input_model, getattr(args, "test", None), getattr(args, "output_path", None))


def _get_onnx_input_model(args: Namespace, model_path: str) -> dict:
@@ -371,6 +437,16 @@ def add_input_model_options(
        model_group.add_argument(
            "--trust_remote_code", action="store_true", help="Trust remote code when loading a huggingface model."
        )
        model_group.add_argument(
            "--test",
            type=str,
            nargs="?",
            const=True,
            help=(
                "Use a randomly initialized test model with the same Hugging Face architecture and 2 hidden layers. "
                "Optionally provide a folder where the generated test model should be saved and reused."
            ),
        )

    if enable_hf_adapter:
        assert enable_hf, "enable_hf must be True when enable_hf_adapter is True."
15 changes: 15 additions & 0 deletions olive/cli/run.py
@@ -6,10 +6,13 @@

from olive.cli.base import (
    BaseOliveCLICommand,
    add_hf_test_model_config,
    add_input_model_options,
    add_logging_options,
    add_telemetry_options,
    get_input_model_config,
    mark_test_output_path,
    validate_test_output_path,
)
from olive.telemetry import action

@@ -59,6 +62,14 @@ def run(self):
        if input_model_config := get_input_model_config(self.args, required=False):
            print("Replacing input model config in run config")
            run_config["input_model"] = input_model_config
        elif self.args.test not in (None, False):
            input_model = run_config.get("input_model")
            if not isinstance(input_model, dict) or input_model.get("type", "").lower() != "hfmodel":
                raise ValueError("--test for olive run requires a Hugging Face input_model in the run config.")
            output_path = (
                self.args.output_path or run_config.get("output_dir") or run_config.get("engine", {}).get("output_dir")
            )
            run_config["input_model"] = add_hf_test_model_config(input_model, self.args.test, output_path)

        for arg_key, rc_key in [("output_path", "output_dir"), ("log_level", "log_severity_level")]:
            if (arg_value := getattr(self.args, arg_key)) is not None:
@@ -68,12 +79,16 @@ def run(self):
                    # add value to run config directly
                    run_config[rc_key] = arg_value

        output_path = run_config.get("output_dir") or run_config.get("engine", {}).get("output_dir")
        validate_test_output_path(output_path, self.args.test)
        workflow_output = olive_run(
            run_config,
            list_required_packages=self.args.list_required_packages,
            tempdir=self.args.tempdir,
            package_config=self.args.package_config,
        )
        if self.args.test not in (None, False):
            mark_test_output_path(output_path)

        if self.args.list_required_packages is True:
            print("Required packages listed!")
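
Review note: when `olive run --test` has to derive the default test model folder, the diff above resolves the output folder in a fixed order: the `--output_path` argument first, then a top-level `output_dir` in the run config, then `engine.output_dir`. A minimal sketch of that precedence follows. `resolve_output_dir` is a hypothetical name used only for illustration and does not exist in the PR.

```python
from typing import Optional


def resolve_output_dir(cli_output_path: Optional[str], run_config: dict) -> Optional[str]:
    """Pick the first non-empty value: CLI argument, top-level output_dir, engine.output_dir."""
    return cli_output_path or run_config.get("output_dir") or run_config.get("engine", {}).get("output_dir")
```

If all three are missing, the result is `None`, and a bare `--test` (with no folder) then fails with the "requires an explicit folder" error raised in `add_hf_test_model_config`.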
8 changes: 6 additions & 2 deletions olive/common/hf/model_io.py
@@ -27,6 +27,7 @@ def get_model_io_config(
    model_name: str,
    task: str,
    model: Optional["PreTrainedModel"] = None,
    test_model_config: Optional[dict[str, Any]] = None,
    **kwargs,
) -> Optional[dict[str, Any]]:
    """Get the input/output config for the model and task.
@@ -35,6 +36,7 @@ def get_model_io_config(
        model_name: The model name or path.
        task: The task type (e.g., "text-generation", "text-classification").
        model: Optional loaded model for input signature inspection.
        test_model_config: Optional overrides for creating a lightweight random test model from the same config.
        **kwargs: Additional arguments including use_cache.

    Returns:
@@ -68,7 +70,7 @@ def get_model_io_config(
        return None

    # Get model config
-    model_config = get_model_config(model_name, **kwargs)
+    model_config = get_model_config(model_name, test_model_config=test_model_config, **kwargs)

    # Handle PEFT models
    actual_model = model
@@ -92,6 +94,7 @@ def get_model_dummy_input(
    model_name: str,
    task: str,
    model: Optional["PreTrainedModel"] = None,
    test_model_config: Optional[dict[str, Any]] = None,
    **kwargs,
) -> Optional[dict[str, Any]]:
    """Get dummy inputs for the model and task.
@@ -100,6 +103,7 @@ def get_model_dummy_input(
        model_name: The model name or path.
        task: The task type.
        model: Optional loaded model for input signature inspection.
        test_model_config: Optional overrides for creating a lightweight random test model from the same config.
        **kwargs: Additional arguments including use_cache, batch_size, sequence_length.

    Returns:
@@ -133,7 +137,7 @@ def get_model_dummy_input(
        return None

    # Get model config (handles MLflow paths)
-    model_config = get_model_config(model_name, **kwargs)
+    model_config = get_model_config(model_name, test_model_config=test_model_config, **kwargs)

    # Handle PEFT models
    actual_model = model