
turintech/openvino-desktop


OpenVINO Desktop

A Windows desktop application for managing and serving OpenVINO models via OpenVINO Model Server (OVMS).

OpenVINO Desktop Demo

Features

  • Download (pull) or export models from Hugging Face
  • Manage installed models (view, delete)
  • Start/stop the OVMS inference server
  • Detect available OpenVINO devices (CPU, GPU, NPU)
  • Auto-start with Windows
  • Local REST API for external integrations

Requirements

  • Windows 10/11 x64
  • Internet connection (for first-time setup and model downloads)

Getting Started

  1. Launch the application. On first run, it downloads OVMS and sets up the Python export environment automatically.
  2. Go to the Models tab, search for a model on Hugging Face, select a target device, and click Pull or Export.
  3. The OVMS server starts automatically once models are available.

Settings

| Setting | Description |
| --- | --- |
| Setup Folder | Directory where OVMS and models are installed (default: ~/openvino-desktop) |
| OVMS Download URL | URL to the OVMS Windows zip archive |
| uv Download URL | URL to uv.exe used for the export environment |
| REST API Port | Port for the local REST API (default: 3333) |
| Search Limit | Max number of models returned by Hugging Face search |

Changing the REST API port requires a restart to take effect.

REST API

The app exposes a local HTTP API on port 3333 (configurable in Settings).

Endpoints

GET /status

Returns server state and available devices.

```json
{
  "running": true,
  "deps_ready": true,
  "ovms_ready": true,
  "version": "2026.1",
  "available_devices": ["CPU"]
}
```
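A client might use this payload to gate requests until first-time setup has finished. A minimal Python sketch; the `is_ready` helper and its notion of "ready" are illustrative, not part of the app, but the field names match the example response above:

```python
# Hypothetical readiness check over the /status payload: the server must be
# running and both setup steps (dependencies, OVMS download) must be done.

def is_ready(status: dict) -> bool:
    """True when the server is running and setup has completed."""
    return bool(
        status.get("running")
        and status.get("deps_ready")
        and status.get("ovms_ready")
    )

sample = {
    "running": True,
    "deps_ready": True,
    "ovms_ready": True,
    "version": "2026.1",
    "available_devices": ["CPU"],
}
print(is_ready(sample))                                  # True
print(is_ready({"running": True, "deps_ready": False}))  # False
```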

GET /models

Returns the list of installed models.

```json
[
  {
    "name": "OpenVINO/Qwen3-Embedding-0.6B-int8-ov",
    "base_path": "C:\\Users\\user\\openvino-desktop\\models\\OpenVINO\\Qwen3-Embedding-0.6B-int8-ov",
    "target_device": "CPU",
    "task": "feature-extraction"
  }
]
```

The task field is the Hugging Face pipeline tag ("text-generation", "feature-extraction", or "sentence-similarity") stored at pull/export time.
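Since the task field is stored per model, a client can group the /models response by it, for example to decide which models to route chat versus embedding traffic to. An illustrative sketch (the `models_by_task` helper is this example's own; field names match the response above):

```python
# Group installed models by their stored pipeline tag.
from collections import defaultdict

def models_by_task(models: list[dict]) -> dict[str, list[str]]:
    grouped: dict[str, list[str]] = defaultdict(list)
    for m in models:
        grouped[m["task"]].append(m["name"])
    return dict(grouped)

models = [
    {"name": "OpenVINO/Qwen3-Embedding-0.6B-int8-ov", "task": "feature-extraction"},
    {"name": "Qwen/Qwen2.5-0.5B-Instruct", "task": "text-generation"},
]
print(models_by_task(models))
# {'feature-extraction': ['OpenVINO/Qwen3-Embedding-0.6B-int8-ov'],
#  'text-generation': ['Qwen/Qwen2.5-0.5B-Instruct']}
```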

GET /models/test

Sends a short prompt to every installed model via the OVMS inference API and returns the results. Useful for verifying all models are loaded and responding.

  • "text-generation" models are tested via POST /v3/chat/completions with "hello"
  • "feature-extraction" and "sentence-similarity" models are tested via POST /v3/embeddings with "hello"

```shell
curl http://localhost:3333/models/test
```

```json
[
  {
    "model": "Qwen/Qwen2.5-0.5B-Instruct",
    "type": "text-generation",
    "status": 200,
    "response": { ... }
  },
  {
    "model": "BAAI/bge-small-en-v1.5",
    "type": "embeddings",
    "status": 200,
    "response": { ... }
  }
]
```
| Field | Description |
| --- | --- |
| model | Model name |
| type | "text-generation" or "embeddings" |
| status | HTTP status code returned by OVMS |
| response | Raw OVMS response body |
| error | Set if the request to OVMS failed entirely |

POST /models/pull

Pulls a pre-converted OpenVINO model from Hugging Face. Returns 202 Accepted immediately; the operation runs in the background.

```shell
curl -X POST http://localhost:3333/models/pull \
  -H "Content-Type: application/json" \
  -d '{"model_id": "OpenVINO/Qwen3-Embedding-0.6B-int8-ov", "target_device": "CPU", "pipeline_tag": "feature-extraction"}'
```
| Field | Values |
| --- | --- |
| pipeline_tag | "text-generation", "feature-extraction", or "sentence-similarity" |
| target_device | "CPU", "GPU", "NPU", "AUTO" |
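A client can validate a pull request against the allowed values above before sending it. A minimal sketch; the `pull_payload` helper is illustrative, but the field names and allowed values mirror the table:

```python
# Build and sanity-check a /models/pull request body.
import json

ALLOWED_TAGS = {"text-generation", "feature-extraction", "sentence-similarity"}
ALLOWED_DEVICES = {"CPU", "GPU", "NPU", "AUTO"}

def pull_payload(model_id: str, target_device: str, pipeline_tag: str) -> str:
    if pipeline_tag not in ALLOWED_TAGS:
        raise ValueError(f"unsupported pipeline_tag: {pipeline_tag}")
    if target_device not in ALLOWED_DEVICES:
        raise ValueError(f"unsupported target_device: {target_device}")
    return json.dumps({
        "model_id": model_id,
        "target_device": target_device,
        "pipeline_tag": pipeline_tag,
    })

body = pull_payload(
    "OpenVINO/Qwen3-Embedding-0.6B-int8-ov", "CPU", "feature-extraction"
)
```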

POST /models/export

Exports and converts a model from Hugging Face using export_model.py. Returns 202 Accepted immediately.

```shell
curl -X POST http://localhost:3333/models/export \
  -H "Content-Type: application/json" \
  -d '{"model_id": "Qwen/Qwen3-Embedding-0.6B", "target_device": "CPU", "task": "sentence-similarity", "extra_opts": {"weight-format": "fp16", "extra_quantization_params": "--library sentence_transformers"}}'
```
| Field | Values |
| --- | --- |
| task | "text-generation", "feature-extraction", or "sentence-similarity" |
| target_device | "CPU", "GPU", "NPU", "AUTO" |
| extra_opts | Optional key/value pairs passed as CLI flags to export_model.py |
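The description above says extra_opts becomes CLI flags for export_model.py; one plausible mapping is each key prefixed with "--" followed by its value. This is an assumption based on that description, not the app's confirmed behavior:

```python
# Hypothetical extra_opts -> CLI flag translation, assuming "--key value".

def extra_opts_to_flags(extra_opts: dict[str, str]) -> list[str]:
    flags: list[str] = []
    for key, value in extra_opts.items():
        flags.append(f"--{key}")
        flags.append(value)
    return flags

print(extra_opts_to_flags({"weight-format": "fp16"}))
# ['--weight-format', 'fp16']
```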

GET /job

Poll the status of the current or last pull/export job.

```json
{ "busy": false, "last_ok": true, "last_error": "" }
```

Only one pull/export can run at a time. A second request while busy returns 409 Conflict.
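Since pull/export returns 202 and runs in the background, a client typically polls /job until busy goes false. A sketch of that loop; the fetch function is injected (stubbed here) so the logic can be shown without a running server, and the field names match the example above:

```python
# Poll /job until the current pull/export finishes or a timeout elapses.
import time

def wait_for_job(fetch_status, poll_interval: float = 1.0,
                 timeout: float = 600.0) -> dict:
    """Call fetch_status() until "busy" is false; return the final status."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        status = fetch_status()
        if not status["busy"]:
            return status
        time.sleep(poll_interval)
    raise TimeoutError("job did not finish in time")

# Stubbed fetcher: reports busy twice, then done.
responses = iter([
    {"busy": True, "last_ok": False, "last_error": ""},
    {"busy": True, "last_ok": False, "last_error": ""},
    {"busy": False, "last_ok": True, "last_error": ""},
])
final = wait_for_job(lambda: next(responses), poll_interval=0.01)
print(final["last_ok"])  # True
```

In a real client, fetch_status would GET http://localhost:3333/job and parse the JSON body.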

Development

This project uses Task as its task runner. Install it with go install github.com/go-task/task/v3/cmd/task@latest or see the install docs.

```shell
# List available tasks
task

# Install dependencies
task deps

# Run in dev mode (hot-reload for frontend)
task ui-dev

# Build production binary
task ui-build
```
