microsoft · xadupre · May 28, 2026 · May 11, 2026 · May 11, 2026 · May 11, 2026
diff --git a/.github/workflows/test-model-fast.yml b/.github/workflows/test-model-fast.yml
@@ -0,0 +1,39 @@
+# This workflow runs as a GitHub Actions job.
+# It checks that Olive works on small, randomly initialized models.
+name: Test model fast
+
+on:
+  workflow_dispatch:
+  push:
+    branches:
+      - main
+  pull_request:
+    branches:
+      - main
+
+jobs:
+  ubuntu-test-model-fast:
+    name: Ubuntu test model fast
+    runs-on: ubuntu-latest
+    permissions:
+      contents: read
+    steps:
+      - uses: actions/checkout@v4
+
+      - name: Setup Python
+        uses: actions/setup-python@v5
+        with:
+          python-version: "3.12"
+
+      - name: Install dependencies
+        run: |
+          python -m pip install -r requirements.txt
+          python -m pip install -r test/requirements-test-cpu.txt
+
+      - name: pip freeze
+        run: |
+          python -m pip freeze
+
+      - name: Run fast test
+        run: |
+          python -m pytest -v -s -p no:warnings --disable-warnings --log-cli-level=WARNING test/cli/test_cli_test_model_smoke.py
diff --git a/docs/source/how-to/cli/cli-fast-test.md b/docs/source/how-to/cli/cli-fast-test.md
@@ -0,0 +1,77 @@
+# How to convert a Qwen model with a quick `--test` smoke check
+
+If you are converting a large language model, it is often useful to validate the Olive command, environment, and conversion recipe on a much smaller model before spending time on the full checkpoint.
+
+The `--test` option does that for Hugging Face models. Olive keeps the same model architecture, replaces the checkpoint with a randomly initialized test model that has 2 hidden layers, saves it to the folder you provide, and reuses that folder on later runs.
+
+This example uses [`Qwen/Qwen3-0.6B`](https://huggingface.co/Qwen/Qwen3-0.6B), but the same pattern works for other supported Hugging Face LLMs.
+
+## Step 1: generate the workflow config
+
+Start by generating the config that Olive will run for the Qwen conversion.
+
+```bash
+olive optimize \
+    --model_name_or_path Qwen/Qwen3-0.6B \
+    --device cpu \
+    --provider CPUExecutionProvider \
+    --precision int4 \
+    --output_path out/qwen \
+    --dry_run
+```
+
+This creates `out/qwen/config.json` without launching the full conversion yet.
+
+## Step 2: run a fast smoke test with `olive run --test`
+
+Use the generated config with `olive run` and pass `--test` so Olive swaps in a reduced random Qwen model.
+
+```bash
+olive run \
+    --config out/qwen/config.json \
+    --test out/qwen-test-model \
+    --output_path out/qwen-test-run
+```
+
+What this does:
+
+- `--test out/qwen-test-model` creates a reduced random Qwen model and saves it in `out/qwen-test-model`
+- later runs reuse the same saved test model instead of recreating it
+- `--output_path out/qwen-test-run` gives the smoke test its own output folder, so the generated ONNX artifacts are easy to find
+- Olive marks that output folder as a test-only run by writing an `olive_test_output.json` marker file, and refuses to run `--test` against an existing non-empty folder that lacks this marker
+
+After the smoke test finishes, look under `out/qwen-test-run` for the exported ONNX model and related files.
+
+This is a quick way to confirm that:
+
+- Olive can load the source model
+- the selected optimization recipe is valid for your setup
+- the conversion path completes before you run the full model
+
+If you omit the folder and just pass `--test`, `olive run` will save the reduced model under `<output_path>/test_model`.
+
+## Step 3: run the full conversion
+
+Once the smoke test succeeds, rerun the conversion on the full Qwen checkpoint by removing `--test`.
+
+```bash
+olive run \
+    --config out/qwen/config.json \
+    --output_path out/qwen-full
+```
+
+At this point you know the Olive command and the conversion recipe already worked on the lightweight test model, so you can focus on the full-model run instead of debugging both at once.
+
+## Why keep the test model folder?
+
+The saved test model is useful beyond the first smoke test:
+
+- you can rerun the reduced conversion quickly while iterating on options
+- you can reuse the same HF test model later when comparing the Hugging Face model against the exported ONNX model
+- you avoid recreating a new random test checkpoint every time
+
+## Related docs
+
+- [How to use the `olive optimize` command to optimize a PyTorch model](cli-optimize)
+- [How to write a new workflow from scratch](../configure-workflows/build-workflow)
+- [CLI reference](../../reference/cli)
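
Reviewer note: the page above describes two ways to pass `--test` (with a folder, or bare). The sketch below restates the folder resolution in isolation, modeled on `add_hf_test_model_config` from this PR. The helper name `resolve_test_model_path` is hypothetical and exists only for this illustration; it is not part of Olive.

```python
from pathlib import Path
from typing import Optional, Union


def resolve_test_model_path(test_value: Union[None, bool, str], output_path: Optional[str]) -> Optional[str]:
    """Return the folder for the reduced test model, or None when --test is not set."""
    if test_value in (None, False):
        return None  # --test was not passed
    if test_value is True:
        # Bare --test: fall back to <output_path>/test_model
        if not output_path:
            raise ValueError("--test requires an explicit folder when output_path is not available.")
        return str(Path(output_path) / "test_model")
    return test_value  # explicit folder given on the command line


# Explicit folder, bare flag, and flag absent
print(resolve_test_model_path("out/qwen-test-model", "out/qwen-test-run"))
print(resolve_test_model_path(True, "out/qwen-test-run"))
print(resolve_test_model_path(None, "out/qwen-test-run"))
```

An explicit folder is returned unchanged, so the same test model can be shared by several output folders, while a bare `--test` ties the test model to the run's own output folder.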
diff --git a/docs/source/how-to/index.md b/docs/source/how-to/index.md
@@ -12,6 +12,7 @@ The Olive CLI provides a set of primitives such as `quantize`, `finetune`, `onnx
 - [how to use the `olive finetune` command to create (Q)LoRA adapters](cli/cli-finetune)
 - [How to use the `olive quantize` command to quantize your model with different precisions and techniques such as AWQ](cli/cli-quantize)
 - [How to use the `olive run` command to execute an Olive workflow.](cli/cli-run)
+- [How to convert a Qwen model with a quick `--test` smoke check](cli/cli-fast-test)
 
 # Olive Python API
 
@@ -43,6 +44,7 @@ The Olive CLI provides a set of primitives such as `quantize`, `finetune`, `onnx
 
 installation
 cli/cli-optimize
+cli/cli-fast-test
 cli/cli-auto-opt
 cli/cli-finetune
 cli/cli-quantize

diff --git a/olive/cli/base.py b/olive/cli/base.py
@@ -17,6 +17,52 @@
 from olive.hardware.constants import DEVICE_TO_EXECUTION_PROVIDERS
 from olive.resource_path import OLIVE_RESOURCE_ANNOTATIONS
 
+TEST_OUTPUT_MARKER_FILE = "olive_test_output.json"
+
+
+def _get_test_output_marker_path(output_path: str) -> Path:
+    return Path(output_path) / TEST_OUTPUT_MARKER_FILE
+
+
+def is_test_output_dir(output_path: str) -> bool:
+    marker_path = _get_test_output_marker_path(output_path)
+    if not marker_path.is_file():
+        return False
+
+    try:
+        marker = json.loads(marker_path.read_text())
+    except (OSError, TypeError, ValueError):
+        return False
+
+    return isinstance(marker, dict) and marker.get("type") == "olive_hf_test_output"
+
+
+def validate_test_output_path(output_path: Optional[str], test_value) -> None:
+    if test_value in (None, False) or not output_path:
+        return
+
+    output_dir = Path(output_path)
+    if not output_dir.exists():
+        return
+    if not output_dir.is_dir():
+        raise ValueError(f"--output_path {output_path} must be a directory.")
+    if any(output_dir.iterdir()) and not is_test_output_dir(output_path):
+        raise ValueError(
+            f"--output_path {output_path} already exists and is not marked as an Olive test output directory. "
+            "Use a dedicated output folder for --test runs."
+        )
+
+
+def mark_test_output_path(output_path: Optional[str]) -> None:
+    if not output_path:
+        return
+
+    output_dir = Path(output_path)
+    if not output_dir.is_dir():
+        return
+
+    _get_test_output_marker_path(output_path).write_text(json.dumps({"type": "olive_hf_test_output"}, indent=2))
+
 
 class BaseOliveCLICommand(ABC):
     allow_unknown_args: ClassVar[bool] = False
@@ -33,16 +79,21 @@ def _run_workflow(self):
 
         from olive.workflows import run as olive_run
 
+        validate_test_output_path(self.args.output_path, getattr(self.args, "test", None))
         Path(self.args.output_path).mkdir(parents=True, exist_ok=True)
 
         with tempfile.TemporaryDirectory(prefix="olive-cli-tmp-", dir=self.args.output_path) as tempdir:
             run_config = self._get_run_config(tempdir)
             if self.args.save_config_file or self.args.dry_run:
                 self._save_config_file(run_config)
             if self.args.dry_run:
+                if getattr(self.args, "test", None) not in (None, False):
+                    mark_test_output_path(self.args.output_path)
                 print("Dry run mode enabled. Configuration file is generated but no optimization is performed.")
                 return None
             workflow_output = olive_run(run_config)
+            if getattr(self.args, "test", None) not in (None, False):
+                mark_test_output_path(self.args.output_path)
             if not workflow_output.has_output_model():
                 print("No output model produced. Please check the log for details.")
             else:
@@ -82,6 +133,21 @@ def run(self):
         raise NotImplementedError
 
 
+def add_hf_test_model_config(input_model: dict, test_value, output_path: Optional[str] = None) -> dict:
+    if test_value in (None, False):
+        return input_model
+
+    test_model_output_path = test_value
+    # Use 2 layers to keep the test model fast and lightweight while preserving the original architecture family.
+    input_model["test_model_config"] = {"hidden_layers": 2}
+    if test_model_output_path is True:
+        if not output_path:
+            raise ValueError("--test requires an explicit folder when output_path is not available.")
+        test_model_output_path = str(Path(output_path) / "test_model")
+    input_model["test_model_path"] = test_model_output_path
+    return input_model
+
+
 def _get_hf_input_model(args: Namespace, model_path: OLIVE_RESOURCE_ANNOTATIONS) -> dict:
     """Get the input model config for HuggingFace model.
 
@@ -105,7 +171,7 @@ def _get_hf_input_model(args: Namespace, model_path: OLIVE_RESOURCE_ANNOTATIONS)
         input_model["adapter_path"] = args.adapter_path
     if getattr(args, "trust_remote_code", None) is not None:
         input_model["load_kwargs"]["trust_remote_code"] = args.trust_remote_code
-    return input_model
+    return add_hf_test_model_config(input_model, getattr(args, "test", None), getattr(args, "output_path", None))
 
 
 def _get_onnx_input_model(args: Namespace, model_path: str) -> dict:
@@ -371,6 +437,16 @@ def add_input_model_options(
         model_group.add_argument(
             "--trust_remote_code", action="store_true", help="Trust remote code when loading a huggingface model."
         )
+        model_group.add_argument(
+            "--test",
+            type=str,
+            nargs="?",
+            const=True,
+            help=(
+                "Use a randomly initialized test model with the same Hugging Face architecture and 2 hidden layers. "
+                "Optionally provide a folder where the generated test model should be saved and reused."
+            ),
+        )
 
     if enable_hf_adapter:
         assert enable_hf, "enable_hf must be True when enable_hf_adapter is True."
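
Reviewer note: the `--test` option combines `type=str`, `nargs="?"`, and `const=True`, which gives it three distinct states. The standalone sketch below uses only `argparse` from the standard library and reproduces the option shape from the hunk above; the parser itself is a throwaway, not Olive's real parser.

```python
import argparse

parser = argparse.ArgumentParser()
# Same option shape as in the diff: an optional value, with True as the "flag only" sentinel.
parser.add_argument("--test", type=str, nargs="?", const=True)

absent = parser.parse_args([]).test  # option not given: default None
bare = parser.parse_args(["--test"]).test  # flag only: const, left as the bool True
explicit = parser.parse_args(["--test", "out/qwen-test-model"]).test  # folder as str

print(repr(absent), repr(bare), repr(explicit))
# None True 'out/qwen-test-model'
```

This is why the rest of the change tests `test_value in (None, False)` and `test_model_output_path is True` rather than plain truthiness. Note that with `nargs="?"`, a non-option token that directly follows `--test` is consumed as the folder.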

diff --git a/olive/cli/run.py b/olive/cli/run.py
@@ -6,10 +6,13 @@
 
 from olive.cli.base import (
     BaseOliveCLICommand,
+    add_hf_test_model_config,
     add_input_model_options,
     add_logging_options,
     add_telemetry_options,
     get_input_model_config,
+    mark_test_output_path,
+    validate_test_output_path,
 )
 from olive.telemetry import action
 
@@ -59,6 +62,14 @@ def run(self):
         if input_model_config := get_input_model_config(self.args, required=False):
             print("Replacing input model config in run config")
             run_config["input_model"] = input_model_config
+        elif self.args.test not in (None, False):
+            input_model = run_config.get("input_model")
+            if not isinstance(input_model, dict) or input_model.get("type", "").lower() != "hfmodel":
+                raise ValueError("--test for olive run requires a Hugging Face input_model in the run config.")
+            output_path = (
+                self.args.output_path or run_config.get("output_dir") or run_config.get("engine", {}).get("output_dir")
+            )
+            run_config["input_model"] = add_hf_test_model_config(input_model, self.args.test, output_path)
 
         for arg_key, rc_key in [("output_path", "output_dir"), ("log_level", "log_severity_level")]:
             if (arg_value := getattr(self.args, arg_key)) is not None:
@@ -68,12 +79,16 @@ def run(self):
                 # add value to run config directly
                 run_config[rc_key] = arg_value
 
+        output_path = run_config.get("output_dir") or run_config.get("engine", {}).get("output_dir")
+        validate_test_output_path(output_path, self.args.test)
         workflow_output = olive_run(
             run_config,
             list_required_packages=self.args.list_required_packages,
             tempdir=self.args.tempdir,
             package_config=self.args.package_config,
         )
+        if self.args.test not in (None, False):
+            mark_test_output_path(output_path)
 
         if self.args.list_required_packages is True:
             print("Required packages listed!")
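
Reviewer note: `run()` now validates the output folder before the workflow and marks it afterwards. The sketch below exercises that validate-then-mark sequence in a temporary directory. It is a simplified re-implementation for illustration, assuming the marker file name and type string shown in `olive/cli/base.py`; the helper names `is_marked`, `check_reusable`, and `mark` are hypothetical, and the file `model.onnx` is only a stand-in artifact.

```python
import json
import tempfile
from pathlib import Path

MARKER_FILE = "olive_test_output.json"  # same file name as TEST_OUTPUT_MARKER_FILE in the diff
MARKER_TYPE = "olive_hf_test_output"


def is_marked(output_dir: Path) -> bool:
    """True when the folder carries a readable marker of the expected type."""
    marker_path = output_dir / MARKER_FILE
    if not marker_path.is_file():
        return False
    try:
        marker = json.loads(marker_path.read_text())
    except (OSError, ValueError):
        return False
    return isinstance(marker, dict) and marker.get("type") == MARKER_TYPE


def check_reusable(output_dir: Path) -> None:
    """Raise if a non-empty folder was not produced by an earlier test run."""
    if output_dir.is_dir() and any(output_dir.iterdir()) and not is_marked(output_dir):
        raise ValueError(f"{output_dir} is not marked as a test output directory.")


def mark(output_dir: Path) -> None:
    """Write the marker file that identifies a test-only output folder."""
    (output_dir / MARKER_FILE).write_text(json.dumps({"type": MARKER_TYPE}, indent=2))


with tempfile.TemporaryDirectory() as tmp:
    full_run = Path(tmp) / "qwen-full"
    full_run.mkdir()
    (full_run / "model.onnx").write_bytes(b"")  # stand-in for a real conversion artifact

    test_run = Path(tmp) / "qwen-test-run"
    test_run.mkdir()

    check_reusable(test_run)  # empty folder: accepted
    mark(test_run)  # mirrors the marking step after a --test run
    check_reusable(test_run)  # marked folder: accepted again on the next run

    try:
        check_reusable(full_run)
        rejected = False
    except ValueError:
        rejected = True
    print("full-run folder rejected:", rejected)
```

The marker is only written after the workflow call returns, so a folder left behind by a `--test` run that crashed midway would be non-empty but unmarked, and a later `--test` run against it would be refused.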

diff --git a/olive/common/hf/model_io.py b/olive/common/hf/model_io.py
@@ -27,6 +27,7 @@ def get_model_io_config(
     model_name: str,
     task: str,
     model: Optional["PreTrainedModel"] = None,
+    test_model_config: Optional[dict[str, Any]] = None,
     **kwargs,
 ) -> Optional[dict[str, Any]]:
     """Get the input/output config for the model and task.
@@ -35,6 +36,7 @@ def get_model_io_config(
         model_name: The model name or path.
         task: The task type (e.g., "text-generation", "text-classification").
         model: Optional loaded model for input signature inspection.
+        test_model_config: Optional overrides for creating a lightweight random test model from the same config.
         **kwargs: Additional arguments including use_cache.
 
     Returns:
@@ -68,7 +70,7 @@ def get_model_io_config(
         return None
 
     # Get model config
-    model_config = get_model_config(model_name, **kwargs)
+    model_config = get_model_config(model_name, test_model_config=test_model_config, **kwargs)
 
     # Handle PEFT models
     actual_model = model
@@ -92,6 +94,7 @@ def get_model_dummy_input(
     model_name: str,
     task: str,
     model: Optional["PreTrainedModel"] = None,
+    test_model_config: Optional[dict[str, Any]] = None,
     **kwargs,
 ) -> Optional[dict[str, Any]]:
     """Get dummy inputs for the model and task.
@@ -100,6 +103,7 @@ def get_model_dummy_input(
         model_name: The model name or path.
         task: The task type.
         model: Optional loaded model for input signature inspection.
+        test_model_config: Optional overrides for creating a lightweight random test model from the same config.
         **kwargs: Additional arguments including use_cache, batch_size, sequence_length.
 
     Returns:
@@ -133,7 +137,7 @@ def get_model_dummy_input(
         return None
 
     # Get model config (handles MLflow paths)
-    model_config = get_model_config(model_name, **kwargs)
+    model_config = get_model_config(model_name, test_model_config=test_model_config, **kwargs)
 
     # Handle PEFT models
     actual_model = model