Windows ML is the unified, high-performance local AI inferencing framework for Windows, powered by ONNX Runtime. With Windows ML, you can run AI models locally and accelerate inference on NPUs, GPUs, and CPUs through optional execution providers that Windows manages and keeps up to date. You can take models from PyTorch, TensorFlow/Keras, TFLite, or scikit-learn and convert them to ONNX for use with Windows ML.
Windows ML is generally available and ships in two ways: as part of the Windows App SDK (1.8.1+) via Microsoft.WindowsAppSDK.ML, or as the standalone package Microsoft.Windows.AI.MachineLearning, which has no Windows App SDK dependency.
Windows ML is Microsoft's recommended local AI inferencing framework for Windows — the official, Windows-native way to run custom and open-source AI models on Windows PCs, with hardware-accelerated inference across CPU, GPU, and NPU. It's built and optimized for Scale, Performance, and Deployment across the Windows ecosystem.
- Run AI on-device — models run locally on the user's hardware, keeping data private, reducing latency, eliminating cloud costs, and working without an internet connection.
- Use models you already have — bring models from PyTorch, TensorFlow, scikit-learn, Hugging Face, and more, convert them to ONNX, and use them with Windows ML.
- Scale across silicon - Windows ML is powered by ONNX Runtime and offers broad hardware support, so you can scale your workloads across Windows PCs with any hardware configuration.
- Hardware acceleration, facilitated by Windows — Windows ML allows you to access NPUs, GPUs, and CPUs via execution providers that Windows installs and keeps up to date — no need to bundle them in your app.
- One runtime, many apps — optionally use Windows ML as a shared system component, so your app stays small and all apps on the device share the same up-to-date runtime, rather than every app bundling its own copy.
- Windows-supported — regardless of how you deploy, you get Windows-maintained, optimized runtime dependencies built for stability across updates.
- Best-in-class performance — Windows ML delivers performance on par with dedicated SDKs like TensorRT for RTX or Qualcomm's AI Engine Direct. See Accelerate AI models for hardware and model-specific guidance.
To learn about the benefits of using Windows ML compared to ONNX Runtime directly, see the Windows ML docs.
Windows ML works hand-in-hand with two Microsoft-built tools that handle the steps around inference:
- Foundry Toolkit for VS Code — convert, quantize, optimize, and evaluate ONNX models inside VS Code before shipping.
- Windows ML CLI (preview) — a unified, agent-ready toolchain for model prep, optimization, and benchmarking, with agent skills for AI and agent-driven workflows.
Both ship from Microsoft and are designed to feed directly into Windows ML.
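For models that start in PyTorch, the conversion to ONNX can also be scripted directly instead of going through the tools above. The following is a minimal sketch rather than a recommended pipeline: it assumes `torch` is installed, and the tiny `Linear` model, the `model.onnx` file name, the tensor names, and the helper name `export_to_onnx` are all placeholders chosen for illustration.

```python
from pathlib import Path


def export_to_onnx(output_path: str = "model.onnx") -> Path:
    """Export a placeholder PyTorch model to an ONNX file."""
    # Imported lazily so this module loads even where PyTorch is not installed.
    import torch

    # Hypothetical stand-in for your own trained model.
    model = torch.nn.Linear(4, 2)
    model.eval()

    # Example input that fixes the input shape and dtype for the export.
    example_input = torch.randn(1, 4)

    torch.onnx.export(
        model,
        (example_input,),
        output_path,
        input_names=["input"],
        output_names=["output"],
        # Let the batch dimension vary at inference time.
        dynamic_axes={"input": {0: "batch"}, "output": {0: "batch"}},
    )
    return Path(output_path)
```

Calling `export_to_onnx()` should write `model.onnx` to the working directory, which can then be loaded the same way as any other ONNX model. Real models usually need more care (opset choice, dynamic shapes, validation against the original outputs), which is where the tools above help.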
The shortest possible Windows ML program in C#: discover and register execution providers, then run an ONNX model — and choose a policy to control which hardware runs it.
```csharp
using Microsoft.Windows.AI.MachineLearning;
using Microsoft.ML.OnnxRuntime;

// 1. Discover execution providers from the Windows ML EP catalog.
//    Windows installs and keeps these up to date, so your app doesn't bundle them.
var catalog = ExecutionProviderCatalog.GetDefault();
foreach (var provider in catalog.FindAllProviders())
{
    await provider.EnsureReadyAsync();
    provider.TryRegister();
}

// 2. Create an ONNX Runtime environment.
var envOptions = new EnvironmentCreationOptions { logId = "HelloWindowsML" };
using var ortEnv = OrtEnv.CreateInstanceWithOptions(ref envOptions);

// 3. Pick an execution provider policy.
using var sessionOptions = new SessionOptions();
sessionOptions.SetEpSelectionPolicy(ExecutionProviderDevicePolicy.PREFER_NPU);
// Other policies you can try:
// sessionOptions.SetEpSelectionPolicy(ExecutionProviderDevicePolicy.DEFAULT);
// sessionOptions.SetEpSelectionPolicy(ExecutionProviderDevicePolicy.PREFER_GPU);
// sessionOptions.SetEpSelectionPolicy(ExecutionProviderDevicePolicy.PREFER_CPU);
// sessionOptions.SetEpSelectionPolicy(ExecutionProviderDevicePolicy.MAX_PERFORMANCE);
// sessionOptions.SetEpSelectionPolicy(ExecutionProviderDevicePolicy.MAX_EFFICIENCY);
// sessionOptions.SetEpSelectionPolicy(ExecutionProviderDevicePolicy.MIN_OVERALL_POWER);

// 4. Load your ONNX model and run inference.
//    `inputs` is the collection of named input values for your model; building it
//    (for example, image preprocessing) is omitted here and shown in the full sample.
using var session = new InferenceSession("model.onnx", sessionOptions);
using var results = session.Run(inputs);
```

For the full working example (image preprocessing, EP selection by name vs. policy, model compilation), see Samples/cs/CSharpConsoleDesktop. C++ developers, start with Samples/cpp/CppConsoleDesktop. Python developers, see Samples/python/SqueezeNetPython.
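The same policy-based selection can be sketched in Python against the ONNX Runtime bindings. This is an illustration under assumptions, not the code from SqueezeNetPython: it assumes an `onnxruntime` build recent enough to expose `OrtExecutionProviderDevicePolicy` and `SessionOptions.set_provider_selection_policy`, it skips the Windows ML execution provider registration step, and the helper names `normalize_policy` and `create_session` are hypothetical.

```python
# Policy names as they appear in the C# snippet above.
POLICY_NAMES = (
    "DEFAULT",
    "PREFER_CPU",
    "PREFER_NPU",
    "PREFER_GPU",
    "MAX_PERFORMANCE",
    "MAX_EFFICIENCY",
    "MIN_OVERALL_POWER",
)


def normalize_policy(name: str) -> str:
    """Map user input such as 'prefer-npu' to a policy name such as 'PREFER_NPU'."""
    candidate = name.strip().upper().replace("-", "_")
    if candidate not in POLICY_NAMES:
        raise ValueError(f"Unknown policy {name!r}; expected one of {POLICY_NAMES}")
    return candidate


def create_session(model_path: str, policy: str = "PREFER_NPU"):
    """Create an inference session and let ONNX Runtime pick devices by policy."""
    # Imported lazily so normalize_policy works without onnxruntime installed.
    import onnxruntime as ort

    options = ort.SessionOptions()
    # Assumption: the installed onnxruntime version provides this enum and method.
    policy_value = getattr(ort.OrtExecutionProviderDevicePolicy, normalize_policy(policy))
    options.set_provider_selection_policy(policy_value)
    return ort.InferenceSession(model_path, sess_options=options)
```

Validating the policy name before touching the runtime keeps command-line mistakes (for example `--policy fastest`) from surfacing as an opaque attribute error.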
This repo contains samples that show how to use Windows ML in C#, C++, and Python, including console, GUI, GenAI, and self-contained / framework-dependent deployment variants.
Open Samples/WindowsML-Samples.sln in Visual Studio 2022 to build everything at once, or jump straight to a single sample below.
| Sample | What it shows |
|---|---|
| CppConsoleDesktop | Basic console app — EP discovery, command-line options, model compilation |
| CppConsoleDesktop.FrameworkDependent | Framework-dependent deployment (shared runtime, smallest footprint) |
| CppConsoleDesktop.SelfContained | Self-contained deployment (no runtime dependency) |
| CppConsoleDesktop.GenAI | Local LLM inference with ONNX Runtime GenAI |
| CppConsoleDll | Using Windows ML from a shared library |
| CppResnetBuildDemo | ResNet image classification end-to-end (model conversion, EP compilation) |
| Sample | What it shows |
|---|---|
| ResNetConsoleDesktop | CMake-based ResNet sample (framework-dependent) |
| ResNetConsoleDesktop.SelfContained | CMake-based ResNet sample (self-contained) |
| WinMLEpCatalog | Enumerate execution providers using the EP catalog C API |
| Sample | What it shows |
|---|---|
| CppAbiEPEnumerationSample | Direct ABI implementation using raw COM interfaces — no projections |
| Sample | What it shows |
|---|---|
| CSharpConsoleDesktop | Basic C# console app |
| ResnetBuildDemoCS | ResNet image classification with EP selection policy and model compilation |
| HelloPhi | Local Phi-family LLM inference with ONNX Runtime GenAI (works with Phi-3, Phi-3.5, and other GenAI-compatible ONNX models) |
| cs-wpf | WPF image classification UI |
| cs-winforms | Windows Forms image classification UI |
| cs-winui | WinUI 3 image classification UI |
| Sample | What it shows |
|---|---|
| SqueezeNetPython | Image classification using the Windows ML Python bindings |
| Resource | Description |
|---|---|
| capture-logs | PowerShell + WPR/WPA profiles for capturing Windows ML diagnostic traces. See Capturing Windows ML logs. |
| Package | Use it for | Latest |
|---|---|---|
| Microsoft.WindowsAppSDK.ML | Windows ML via the Windows App SDK (recommended for packaged / WinUI apps) | Ships in Windows App SDK 1.8.1+ |
| Microsoft.Windows.AI.MachineLearning | Standalone Windows ML, with no Windows App SDK dependency | 2.1.1 |
| Microsoft.ML.OnnxRuntimeGenAI.WinML | Generative AI (Phi, Llama, Mistral, Gemma, DeepSeek…) on top of Windows ML | 0.13.2 |
Namespace: Microsoft.Windows.AI.MachineLearning. Execution providers are distributed and updated through Windows Update.
| Category | Supported |
|---|---|
| Operating systems | Windows 11, Windows 10 (19H1+), Windows Server 2019+, Windows 365 (Cloud PC) |
| Architectures | x64, ARM64 |
| Languages | C#, C++/WinRT, C, C++, Python (3.10–3.13) |
| Packaging | Unpackaged, Packaged (MSIX) |
| Deployment | Self-contained, framework-dependent |
Note: CPU and GPU (via DirectML) work on all supported Windows versions. Hardware-optimized execution providers for NPUs and specific GPUs require Windows 11 24H2 (build 26100) or later. See Windows ML execution providers.
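The requirements above can be checked before a Python sample tries to load a model. The sketch below uses only the standard library and encodes the values from the table and note (Python 3.10 to 3.13, x64 or ARM64, build 26100 for hardware-optimized execution providers). The function names are hypothetical, and the mapping of x64 to the `AMD64` machine string is an assumption about how Python reports the architecture on Windows.

```python
import platform
import sys

# Values taken from the support table and note above.
MIN_PYTHON = (3, 10)
MAX_PYTHON = (3, 13)
# Assumption: Python reports x64 as "AMD64" (or "x86_64") and ARM64 as "ARM64".
SUPPORTED_ARCHITECTURES = {"AMD64", "X86_64", "ARM64"}
MIN_BUILD_FOR_HARDWARE_EPS = 26100  # Windows 11 24H2


def python_supported(version) -> bool:
    """Return True if the (major, minor) version is within the supported range."""
    return MIN_PYTHON <= tuple(version[:2]) <= MAX_PYTHON


def architecture_supported(machine: str) -> bool:
    """Return True for x64 and ARM64 machine strings."""
    return machine.upper() in SUPPORTED_ARCHITECTURES


def hardware_eps_supported(windows_version: str) -> bool:
    """Parse a version such as '10.0.26100' and compare its build number."""
    parts = windows_version.split(".")
    try:
        build = int(parts[2])
    except (IndexError, ValueError):
        return False
    return build >= MIN_BUILD_FOR_HARDWARE_EPS


def describe_environment() -> dict:
    """Summarize the current machine against the requirements."""
    on_windows = platform.system() == "Windows"
    return {
        "python_ok": python_supported(sys.version_info),
        "arch_ok": architecture_supported(platform.machine()),
        "hardware_eps_ok": on_windows and hardware_eps_supported(platform.version()),
    }
```

A `False` for `hardware_eps_ok` does not block inference; per the note above, CPU and DirectML-based GPU execution remain available on older supported builds.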
DirectML is in sustained engineering. DirectML continues to be supported, but new feature development has moved to Windows ML for Windows-based ONNX Runtime deployments. For new projects, prefer the vendor-specific GPU and NPU execution providers that Windows ML installs and maintains. See DirectML Overview.
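When selecting execution providers by name rather than by policy, that preference can be expressed as an ordering. The sketch below is one possible ordering, not an official recommendation: the provider strings are existing ONNX Runtime provider names, but the specific order among the vendor providers and the helper name `order_providers` are assumptions made for illustration.

```python
# Vendor-specific providers, tried before DirectML. The relative order is illustrative.
VENDOR_PROVIDERS = [
    "QNNExecutionProvider",
    "OpenVINOExecutionProvider",
    "VitisAIExecutionProvider",
]
DIRECTML_PROVIDER = "DmlExecutionProvider"
CPU_PROVIDER = "CPUExecutionProvider"


def order_providers(available: list[str]) -> list[str]:
    """Order providers as: vendor-specific first, then DirectML, then CPU."""
    chosen = [name for name in VENDOR_PROVIDERS if name in available]
    if DIRECTML_PROVIDER in available:
        chosen.append(DIRECTML_PROVIDER)
    # The CPU provider is always appended as the final fallback.
    chosen.append(CPU_PROVIDER)
    return chosen
```

The result can be passed as the `providers` argument of `onnxruntime.InferenceSession`, with `available` taken from `onnxruntime.get_available_providers()`.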
- Open Samples/WindowsML-Samples.sln in Visual Studio 2022 (with the C++ and .NET desktop workloads).
- Pick a sample, set it as the startup project, and run.
- For Python, see Samples/python/SqueezeNetPython/.
For full setup walk-throughs, see Get started with Windows ML.
Found a bug, have a question, or want to suggest a sample? Open an issue in this repo — we triage them directly. For broader Windows ML platform discussions or runtime/API issues that span beyond the samples, you can also use the Windows App SDK repo.
- Windows ML documentation — official docs (aka.ms/TryWinML)
- Windows ML CLI (preview) — a unified, agent-ready toolchain for model prep, optimization, and benchmarking, with agent skills for AI and agent-driven workflows
- AI Toolkit / Foundry Toolkit for VS Code — convert, quantize, optimize, evaluate models, all inside VS Code
- AI Dev Gallery — interactive Microsoft Store app to discover and experiment with local AI scenarios on your PC
- Windows App SDK — the platform that ships Windows ML
- WindowsAppSDK-Samples — broader Windows App SDK samples
- ONNX Runtime — the runtime Windows ML is built on
- ONNX Runtime GenAI — generative AI extensions for ONNX Runtime
- 📖 What is Windows ML?
- 📣 Windows ML is generally available (Windows Developer Blog, Sept 2025)
- 🚀 Accelerate AI models on NPU / GPU / CPU
- 📦 Distributing your app
- 🛠️ Convert models to ONNX
This project welcomes contributions and suggestions. Most contributions require you to agree to a Contributor License Agreement (CLA) declaring that you have the right to, and actually do, grant us the rights to use your contribution. For details, visit https://cla.opensource.microsoft.com.
When you submit a pull request, a CLA bot will automatically determine whether you need to provide a CLA and decorate the PR appropriately (e.g., status check, comment). Simply follow the instructions provided by the bot. You will only need to do this once across all repos using our CLA.
This project has adopted the Microsoft Open Source Code of Conduct. For more information see the Code of Conduct FAQ or contact opencode@microsoft.com with any additional questions or comments.
This project may contain trademarks or logos for projects, products, or services. Authorized use of Microsoft trademarks or logos is subject to and must follow Microsoft's Trademark & Brand Guidelines. Use of Microsoft trademarks or logos in modified versions of this project must not cause confusion or imply Microsoft sponsorship. Any use of third-party trademarks or logos are subject to those third-party's policies.
See LICENSE for code and LICENSE-DOCS for documentation.
