5 changes: 5 additions & 0 deletions .gitignore
@@ -113,3 +113,8 @@ omlx/admin/tailwindcss-*

# Git worktrees
.worktrees/
# claude
.claude/
.omc/
.omx/
outputs/
9 changes: 8 additions & 1 deletion README.ja.md
@@ -76,6 +76,7 @@ git clone https://github.com/jundot/omlx.git
cd omlx
pip install -e .           # Core only
pip install -e ".[mcp]"    # With MCP (Model Context Protocol) support
pip install -e ".[image]"  # With image generation support (requires mflux)
```

Requires Python 3.10+ and Apple Silicon (M1/M2/M3/M4).
@@ -118,7 +119,7 @@ brew services info omlx   # Check status

## Features

Supports text LLMs, vision-language models (VLM), OCR models, embeddings, and rerankers on Apple Silicon.
Supports text LLMs, vision-language models (VLM), OCR models, embeddings, rerankers, and image generation models on Apple Silicon.

### Admin Dashboard

@@ -132,6 +133,10 @@

Runs VLMs on the same continuous batching and tiered KV cache stack as text LLMs. Supports multi-image chat, base64/URL/file image inputs, and tool calling with vision context. OCR models (DeepSeek-OCR, DOTS-OCR, GLM-OCR) are auto-detected and given optimized prompts.

### Image Generation

Generates images from text prompts (Text-to-Image) or transforms existing images (Image-to-Image) using diffusion models via mflux. Supports FLUX.1/FLUX.2, Z-Image, FIBO, SeedVR2, and Qwen Image models, and provides an OpenAI-compatible `/v1/images/generations` API.

### Tiered KV Cache (Hot + Cold)

Block-based KV cache management inspired by vLLM, with prefix sharing and Copy-on-Write. The cache operates across two tiers:
@@ -216,6 +221,7 @@ A drop-in replacement for the OpenAI and Anthropic APIs. Streaming
| `POST /v1/messages` | Anthropic Messages API |
| `POST /v1/embeddings` | Text embeddings |
| `POST /v1/rerank` | Document reranking |
| `POST /v1/images/generations` | Image generation (T2I & I2I) |
| `GET /v1/models` | List available models |

### Tool Calling & Structured Output
@@ -257,6 +263,7 @@ All function-calling formats available in mlx-lm, JSON schema
| OCR | DeepSeek-OCR, DOTS-OCR, GLM-OCR |
| Embedding | BERT, BGE-M3, ModernBERT |
| Reranker | ModernBERT, XLM-RoBERTa |
| Image | FLUX.1/FLUX.2, Z-Image, FIBO, SeedVR2, Qwen Image (via [mflux](https://github.com/mflux-ai/mflux)) |

## CLI Configuration

9 changes: 8 additions & 1 deletion README.ko.md
@@ -76,6 +76,7 @@ git clone https://github.com/jundot/omlx.git
cd omlx
pip install -e .           # Core only
pip install -e ".[mcp]"    # With MCP (Model Context Protocol) support
pip install -e ".[image]"  # With image generation support (requires mflux)
```

Requires Python 3.10+ and Apple Silicon (M1/M2/M3/M4).
@@ -118,7 +119,7 @@ brew services info omlx   # Check status

## Features

Supports text LLMs, vision-language models (VLM), OCR models, embedding models, and rerankers on Apple Silicon.
Supports text LLMs, vision-language models (VLM), OCR models, embedding models, rerankers, and image generation models on Apple Silicon.

### Admin Dashboard

@@ -132,6 +133,10 @@

Runs VLMs on the same continuous batching and tiered KV cache stack as text LLMs. Supports multi-image chat, base64/URL/file image inputs, and tool calling with vision context. OCR models (DeepSeek-OCR, DOTS-OCR, GLM-OCR) are auto-detected and given optimized prompts.

### Image Generation

Generates images from text prompts (Text-to-Image) or transforms existing images (Image-to-Image) using diffusion models via mflux. Supports FLUX.1/FLUX.2, Z-Image, FIBO, SeedVR2, and Qwen Image models, and provides an OpenAI-compatible `/v1/images/generations` API.

### Tiered KV Cache (Hot + Cold)

Block-based KV cache management inspired by vLLM, with prefix sharing and Copy-on-Write. The cache operates across two tiers:
@@ -216,6 +221,7 @@ A drop-in replacement for the OpenAI and Anthropic APIs. Streaming usage
| `POST /v1/messages` | Anthropic Messages API |
| `POST /v1/embeddings` | Text embeddings |
| `POST /v1/rerank` | Document reranking |
| `POST /v1/images/generations` | Image generation (T2I & I2I) |
| `GET /v1/models` | List available models |

### Tool Calling & Structured Output
@@ -257,6 +263,7 @@ All function-call formats available in mlx-lm, JSON schema validation
| OCR | DeepSeek-OCR, DOTS-OCR, GLM-OCR |
| Embedding | BERT, BGE-M3, ModernBERT |
| Reranker | ModernBERT, XLM-RoBERTa |
| Image | FLUX.1/FLUX.2, Z-Image, FIBO, SeedVR2, Qwen Image (via [mflux](https://github.com/mflux-ai/mflux)) |

## CLI Configuration

9 changes: 8 additions & 1 deletion README.md
@@ -79,6 +79,7 @@ git clone https://github.com/jundot/omlx.git
cd omlx
pip install -e . # Core only
pip install -e ".[mcp]" # With MCP (Model Context Protocol) support
pip install -e ".[image]" # With image generation support (requires mflux)
```

Requires macOS 15.0+ (Sequoia), Python 3.10+, and Apple Silicon (M1/M2/M3/M4).
@@ -121,7 +122,7 @@ Logs are written to two locations:

## Features

Supports text LLMs, vision-language models (VLM), OCR models, embeddings, and rerankers on Apple Silicon.
Supports text LLMs, vision-language models (VLM), OCR models, embeddings, rerankers, and image generation models on Apple Silicon.

### Admin Dashboard

@@ -135,6 +136,10 @@ Web UI at `/admin` for real-time monitoring, model management, chat, benchmark,

Run VLMs with the same continuous batching and tiered KV cache stack as text LLMs. Supports multi-image chat, base64/URL/file image inputs, and tool calling with vision context. OCR models (DeepSeek-OCR, DOTS-OCR, GLM-OCR) are auto-detected with optimized prompts.

### Image Generation

Generate images from text prompts (Text-to-Image) or transform existing images (Image-to-Image) using diffusion models via mflux. Supports FLUX.1/FLUX.2, Z-Image, FIBO, SeedVR2, and Qwen Image models with an OpenAI-compatible `/v1/images/generations` API.
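As a sketch of how a client might call this endpoint — the model id, port, and `b64_json` handling here are assumptions, not values from this PR; substitute a model from your own local list:

```python
import base64
import json
from urllib import request

# Assumed local server URL and model id -- adjust to your setup.
url = "http://localhost:8000/v1/images/generations"
payload = {
    "model": "flux.1-schnell",      # hypothetical model id
    "prompt": "a watercolor fox in a snowy forest",
    "n": 1,
    "size": "1024x1024",
    "response_format": "b64_json",  # standard OpenAI Images option
}

# Build the request; the POST itself needs a running omlx server.
req = request.Request(
    url,
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# resp = request.urlopen(req)      # uncomment with a server running
# b64 = json.loads(resp.read())["data"][0]["b64_json"]
# open("fox.png", "wb").write(base64.b64decode(b64))
```

The same payload works with any OpenAI SDK pointed at the local base URL.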

### Tiered KV Cache (Hot + Cold)

Block-based KV cache management inspired by vLLM, with prefix sharing and Copy-on-Write. The cache operates across two tiers:
@@ -227,6 +232,7 @@ Drop-in replacement for OpenAI and Anthropic APIs. Supports streaming usage stat
| `POST /v1/messages` | Anthropic Messages API |
| `POST /v1/embeddings` | Text embeddings |
| `POST /v1/rerank` | Document reranking |
| `POST /v1/images/generations` | Image generation (T2I & I2I) |
| `GET /v1/models` | List available models |

### Tool Calling & Structured Output
@@ -268,6 +274,7 @@ Models are auto-detected by type. You can also download models directly from the
| OCR | DeepSeek-OCR, DOTS-OCR, GLM-OCR |
| Embedding | BERT, BGE-M3, ModernBERT |
| Reranker | ModernBERT, XLM-RoBERTa |
| Image | FLUX.1/FLUX.2, Z-Image, FIBO, SeedVR2, Qwen Image (via [mflux](https://github.com/mflux-ai/mflux)) |

## CLI Configuration

9 changes: 8 additions & 1 deletion README.zh.md
@@ -76,6 +76,7 @@ git clone https://github.com/jundot/omlx.git
cd omlx
pip install -e .           # Core only
pip install -e ".[mcp]"    # With MCP (Model Context Protocol) support
pip install -e ".[image]"  # With image generation support (requires mflux)
```

Requires macOS 15.0+ (Sequoia), Python 3.10+, and Apple Silicon (M1/M2/M3/M4).
@@ -118,7 +119,7 @@ brew services info omlx   # Check status

## Features

Supports text LLMs, vision-language models (VLM), OCR models, embedding models, and rerankers on Apple Silicon.
Supports text LLMs, vision-language models (VLM), OCR models, embedding models, rerankers, and image generation models on Apple Silicon.

### Admin Dashboard

@@ -132,6 +133,10 @@ brew services info omlx   # Check status

Runs VLMs on the same continuous batching and tiered KV cache stack as text LLMs. Supports multi-image chat, base64/URL/file image inputs, and tool calling with vision context. OCR models (DeepSeek-OCR, DOTS-OCR, GLM-OCR) are auto-detected and given optimized prompts.

### Image Generation

Generates images from text prompts (Text-to-Image) or transforms existing images (Image-to-Image) using diffusion models via mflux. Supports FLUX.1/FLUX.2, Z-Image, FIBO, SeedVR2, and Qwen Image models, and provides an OpenAI-compatible `/v1/images/generations` API.

### Tiered KV Cache (Hot + Cold)

Block-based KV cache management inspired by vLLM, with prefix sharing and Copy-on-Write. The cache operates across two tiers:
@@ -224,6 +229,7 @@ A drop-in replacement for the OpenAI and Anthropic APIs. Supports streaming usage stats (`stream
| `POST /v1/messages` | Anthropic Messages API |
| `POST /v1/embeddings` | Text embeddings |
| `POST /v1/rerank` | Document reranking |
| `POST /v1/images/generations` | Image generation (T2I & I2I) |
| `GET /v1/models` | List available models |

### Tool Calling & Structured Output
@@ -265,6 +271,7 @@ A drop-in replacement for the OpenAI and Anthropic APIs. Supports streaming usage stats (`stream
| OCR | DeepSeek-OCR, DOTS-OCR, GLM-OCR |
| Embedding | BERT, BGE-M3, ModernBERT |
| Reranker | ModernBERT, XLM-RoBERTa |
| Image | FLUX.1/FLUX.2, Z-Image, FIBO, SeedVR2, Qwen Image (via [mflux](https://github.com/mflux-ai/mflux)) |

## CLI Configuration

8 changes: 5 additions & 3 deletions omlx/admin/hf_downloader.py
@@ -613,9 +613,11 @@ async def _run_download(self, task_id: str, hf_token: str) -> None:

api, endpoint = _get_hf_api()

# Always ignore macOS metadata files and other unneeded files
ignore_patterns = ["*.DS_Store", ".DS_Store"]

# Skip pytorch format when safetensors exist to
# avoid downloading redundant weight files.
ignore_patterns = None
try:
model_info = await asyncio.wait_for(
asyncio.to_thread(
@@ -629,11 +631,11 @@ async def _run_download(self, task_id: str, hf_token: str) -> None:
if model_info.safetensors and model_info.safetensors.get(
"parameters"
):
ignore_patterns = [
ignore_patterns.extend([
"*.bin",
"original/**",
"consolidated.*.pth",
]
])
except Exception as e:
logger.warning(
f"Could not fetch repo info for {task.repo_id}: {e}"
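The key point of this hunk is that the base ignore list is now *extended* rather than replaced: macOS metadata is always skipped, and the PyTorch-format weights are additionally skipped only when the repo ships safetensors. A minimal sketch of that merge logic — the helper name is ours, not part of the module:

```python
def build_ignore_patterns(has_safetensors: bool) -> list[str]:
    """Mirror the diff: always drop macOS metadata files, and drop
    redundant PyTorch-format weights when safetensors are present."""
    patterns = ["*.DS_Store", ".DS_Store"]
    if has_safetensors:
        # Same patterns the hunk appends via ignore_patterns.extend(...)
        patterns.extend(["*.bin", "original/**", "consolidated.*.pth"])
    return patterns

print(build_ignore_patterns(False))  # ['*.DS_Store', '.DS_Store']
print(build_ignore_patterns(True))
```

The resulting list is what gets handed to the Hugging Face download call as its `ignore_patterns` argument.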
25 changes: 22 additions & 3 deletions omlx/admin/routes.py
@@ -1551,7 +1551,7 @@ async def update_model_settings(
)
current_settings.model_alias = alias_value
if "model_type_override" in sent:
valid_types = {"llm", "vlm", "embedding", "reranker", "audio_stt", "audio_tts", "audio_sts"}
valid_types = {"llm", "vlm", "embedding", "reranker", "audio_stt", "audio_tts", "audio_sts", "image_t2i"}
# Treat empty string as None (auto-detect)
override_value = request.model_type_override or None
if override_value is not None and override_value not in valid_types:
@@ -1569,6 +1569,7 @@
"audio_stt": "audio_stt",
"audio_tts": "audio_tts",
"audio_sts": "audio_sts",
"image_t2i": "image",
}
if override_value:
entry.model_type = override_value
@@ -3117,6 +3118,14 @@ def _build_active_models_data() -> dict:
if sched is not None:
waiting_requests = len(getattr(sched, "waiting", []))

# Image engine: read progress from ImageProgressTracker
image_progress = []
if model_info.get("engine_type") == "image":
from ..image_progress import get_image_progress_tracker
img_tracker = get_image_progress_tracker()
image_progress = img_tracker.get_model_progress(model_id)
active_requests = len(image_progress)

prefilling = tracker.get_model_progress(model_id)
prefilling_ids = {p["request_id"] for p in prefilling}

@@ -3138,6 +3147,7 @@
"waiting_requests": waiting_requests,
"prefilling": prefilling,
"generating": generating,
"image_progress": image_progress,
})

total_active += active_requests
@@ -3617,6 +3627,15 @@ def _add_model(model_path: Path, model_name: str) -> None:
"size_formatted": format_size(total_size),
}
)
def is_model_dir(path: Path) -> bool:
"""Check if directory is a valid model (config.json or diffusers-style image model)."""
if (path / "config.json").exists():
return True
# Diffusers-style image models (Flux, etc.)
has_transformer = (path / "transformer").is_dir()
has_vae = (path / "vae").is_dir()
has_text_encoder = (path / "text_encoder").is_dir() or (path / "text_encoder_2").is_dir()
return has_transformer and (has_vae or has_text_encoder)
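The new helper can be exercised against a throwaway directory tree; this sketch copies its body verbatim and fabricates a diffusers-style layout to show which combinations pass:

```python
import tempfile
from pathlib import Path

def is_model_dir(path: Path) -> bool:
    """Copy of the helper added in the hunk above."""
    if (path / "config.json").exists():
        return True
    # Diffusers-style image models (Flux, etc.)
    has_transformer = (path / "transformer").is_dir()
    has_vae = (path / "vae").is_dir()
    has_text_encoder = (path / "text_encoder").is_dir() or (path / "text_encoder_2").is_dir()
    return has_transformer and (has_vae or has_text_encoder)

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "transformer").mkdir()
    print(is_model_dir(root))  # transformer alone is not enough -> False
    (root / "vae").mkdir()
    print(is_model_dir(root))  # transformer + vae -> True
```

A `transformer/` directory alone is deliberately insufficient, so a partially downloaded model is not listed as usable.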

models = []
seen_names: set[str] = set()
@@ -3627,7 +3646,7 @@ def _add_model(model_path: Path, model_name: str) -> None:
if not subdir.is_dir() or subdir.name.startswith("."):
continue

if (subdir / "config.json").exists():
if is_model_dir(subdir):
# Level 1: direct model folder
_add_model(subdir, subdir.name)
else:
@@ -3643,7 +3662,7 @@ def _add_model(model_path: Path, model_name: str) -> None:
for child in sorted(subdir.iterdir()):
if not child.is_dir() or child.name.startswith("."):
continue
if (child / "config.json").exists():
if is_model_dir(child):
_add_model(child, child.name)

return {"models": models}
2 changes: 1 addition & 1 deletion omlx/admin/static/js/dashboard.js
@@ -1575,7 +1575,7 @@
},

get llmModels() {
return this.models.filter(m => m.model_type === 'llm' || m.model_type === 'vlm' || !m.model_type);
return this.models.filter(m => m.model_type === 'llm' || m.model_type === 'vlm' || m.model_type === 'image_t2i' || !m.model_type);
},

shellQuote(value) {
1 change: 1 addition & 0 deletions omlx/admin/templates/dashboard/_modal_model_settings.html
@@ -222,6 +222,7 @@ <h3 class="text-xs font-bold uppercase tracking-widest text-neutral-400 mb-5">{{
<option value="audio_stt">Audio STT</option>
<option value="audio_tts">Audio TTS</option>
<option value="audio_sts">Audio STS</option>
<option value="image_t2i">Image T2I</option>
</select>
</div>
<div x-show="reasoningParsers.length > 0">