5 changes: 5 additions & 0 deletions .gitignore
@@ -113,3 +113,8 @@ omlx/admin/tailwindcss-*

# Git worktrees
.worktrees/
# claude
.claude/
.omc/
.omx/
outputs/
9 changes: 8 additions & 1 deletion README.ja.md
@@ -76,6 +76,7 @@ git clone https://github.com/jundot/omlx.git
cd omlx
pip install -e .           # Core only
pip install -e ".[mcp]"    # With MCP (Model Context Protocol) support
pip install -e ".[image]"  # With image generation support (requires mflux)
```

Requires Python 3.10+ and Apple Silicon (M1/M2/M3/M4).
@@ -118,7 +119,7 @@ brew services info omlx   # Check status

## Features

Supports text LLMs, vision-language models (VLM), OCR models, embeddings, and rerankers on Apple Silicon.
Supports text LLMs, vision-language models (VLM), OCR models, embeddings, rerankers, and image generation models on Apple Silicon.

### Admin Dashboard

@@ -132,6 +133,10 @@

Runs VLMs on the same continuous batching and tiered KV cache stack as text LLMs. Supports multi-image chat, base64/URL/file image inputs, and tool calling with vision context. OCR models (DeepSeek-OCR, DOTS-OCR, GLM-OCR) are auto-detected and given optimized prompts.

### Image Generation

Generates images from text prompts (Text-to-Image) or transforms existing images (Image-to-Image) using diffusion models via mflux. Supports FLUX.1/FLUX.2, Z-Image, FIBO, SeedVR2, and Qwen Image models, and provides an OpenAI-compatible `/v1/images/generations` API.

### Tiered KV Cache (Hot + Cold)

Block-based KV cache management inspired by vLLM, with prefix sharing and Copy-on-Write. The cache operates across two tiers:
@@ -216,6 +221,7 @@ A drop-in replacement for the OpenAI and Anthropic APIs. Streaming
| `POST /v1/messages` | Anthropic Messages API |
| `POST /v1/embeddings` | Text embeddings |
| `POST /v1/rerank` | Document reranking |
| `POST /v1/images/generations` | Image generation (T2I & I2I) |
| `GET /v1/models` | List available models |

### Tool Calling & Structured Output
@@ -257,6 +263,7 @@ All function-calling formats available in mlx-lm, JSON schema
| OCR | DeepSeek-OCR, DOTS-OCR, GLM-OCR |
| Embedding | BERT, BGE-M3, ModernBERT |
| Reranker | ModernBERT, XLM-RoBERTa |
| Image | FLUX.1/FLUX.2, Z-Image, FIBO, SeedVR2, Qwen Image (via [mflux](https://github.com/mflux-ai/mflux)) |

## CLI Configuration

9 changes: 8 additions & 1 deletion README.ko.md
@@ -76,6 +76,7 @@ git clone https://github.com/jundot/omlx.git
cd omlx
pip install -e .           # Core only
pip install -e ".[mcp]"    # With MCP (Model Context Protocol) support
pip install -e ".[image]"  # With image generation support (requires mflux)
```

Requires Python 3.10+ and Apple Silicon (M1/M2/M3/M4).
@@ -118,7 +119,7 @@ brew services info omlx   # Check status

## Features

Supports text LLMs, vision-language models (VLM), OCR models, embedding models, and rerankers on Apple Silicon.
Supports text LLMs, vision-language models (VLM), OCR models, embedding models, rerankers, and image generation models on Apple Silicon.

### Admin Dashboard

@@ -132,6 +133,10 @@

Runs VLMs on the same continuous batching and tiered KV cache stack as text LLMs. Supports multi-image chat, base64/URL/file image inputs, and tool calling with vision context. OCR models (DeepSeek-OCR, DOTS-OCR, GLM-OCR) are auto-detected and given optimized prompts.

### Image Generation

Generates images from text prompts (Text-to-Image) or transforms existing images (Image-to-Image) using diffusion models via mflux. Supports FLUX.1/FLUX.2, Z-Image, FIBO, SeedVR2, and Qwen Image models, and provides an OpenAI-compatible `/v1/images/generations` API.

### Tiered KV Cache (Hot + Cold)

Block-based KV cache management inspired by vLLM, with prefix sharing and Copy-on-Write. The cache operates across two tiers:
@@ -216,6 +221,7 @@ A drop-in replacement for the OpenAI and Anthropic APIs. Streaming usage
| `POST /v1/messages` | Anthropic Messages API |
| `POST /v1/embeddings` | Text embeddings |
| `POST /v1/rerank` | Document reranking |
| `POST /v1/images/generations` | Image generation (T2I & I2I) |
| `GET /v1/models` | List available models |

### Tool Calling & Structured Output
@@ -257,6 +263,7 @@ All function-call formats available in mlx-lm, JSON schema validation
| OCR | DeepSeek-OCR, DOTS-OCR, GLM-OCR |
| Embedding | BERT, BGE-M3, ModernBERT |
| Reranker | ModernBERT, XLM-RoBERTa |
| Image | FLUX.1/FLUX.2, Z-Image, FIBO, SeedVR2, Qwen Image (via [mflux](https://github.com/mflux-ai/mflux)) |

## CLI Configuration

9 changes: 8 additions & 1 deletion README.md
@@ -79,6 +79,7 @@ git clone https://github.com/jundot/omlx.git
cd omlx
pip install -e . # Core only
pip install -e ".[mcp]" # With MCP (Model Context Protocol) support
pip install -e ".[image]" # With image generation support (requires mflux)
```

Requires macOS 15.0+ (Sequoia), Python 3.10+, and Apple Silicon (M1/M2/M3/M4).
@@ -121,7 +122,7 @@ Logs are written to two locations:

## Features

Supports text LLMs, vision-language models (VLM), OCR models, embeddings, and rerankers on Apple Silicon.
Supports text LLMs, vision-language models (VLM), OCR models, embeddings, rerankers, and image generation models on Apple Silicon.

### Admin Dashboard

@@ -135,6 +136,10 @@ Web UI at `/admin` for real-time monitoring, model management, chat, benchmark,

Run VLMs with the same continuous batching and tiered KV cache stack as text LLMs. Supports multi-image chat, base64/URL/file image inputs, and tool calling with vision context. OCR models (DeepSeek-OCR, DOTS-OCR, GLM-OCR) are auto-detected with optimized prompts.

### Image Generation

Generate images from text prompts (Text-to-Image) or transform existing images (Image-to-Image) using diffusion models via mflux. Supports FLUX.1/FLUX.2, Z-Image, FIBO, SeedVR2, and Qwen Image models with an OpenAI-compatible `/v1/images/generations` API.
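As a sketch of how a client might call this endpoint — the model id, port, and `b64_json` handling here are assumptions, not values from this PR; substitute a model from your own local list:

```python
import base64
import json
from urllib import request

# Assumed local server URL and model id -- adjust to your setup.
url = "http://localhost:8000/v1/images/generations"
payload = {
    "model": "flux.1-schnell",      # hypothetical model id
    "prompt": "a watercolor fox in a snowy forest",
    "n": 1,
    "size": "1024x1024",
    "response_format": "b64_json",  # standard OpenAI Images option
}

# Build the request; the POST itself needs a running omlx server.
req = request.Request(
    url,
    data=json.dumps(payload).encode(),
    headers={"Content-Type": "application/json"},
)
# resp = request.urlopen(req)      # uncomment with a server running
# b64 = json.loads(resp.read())["data"][0]["b64_json"]
# open("fox.png", "wb").write(base64.b64decode(b64))
```

The same payload works with any OpenAI SDK pointed at the local base URL.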

### Tiered KV Cache (Hot + Cold)

Block-based KV cache management inspired by vLLM, with prefix sharing and Copy-on-Write. The cache operates across two tiers:
@@ -227,6 +232,7 @@ Drop-in replacement for OpenAI and Anthropic APIs. Supports streaming usage stat
| `POST /v1/messages` | Anthropic Messages API |
| `POST /v1/embeddings` | Text embeddings |
| `POST /v1/rerank` | Document reranking |
| `POST /v1/images/generations` | Image generation (T2I & I2I) |
| `GET /v1/models` | List available models |

### Tool Calling & Structured Output
@@ -268,6 +274,7 @@ Models are auto-detected by type. You can also download models directly from the
| OCR | DeepSeek-OCR, DOTS-OCR, GLM-OCR |
| Embedding | BERT, BGE-M3, ModernBERT |
| Reranker | ModernBERT, XLM-RoBERTa |
| Image | FLUX.1/FLUX.2, Z-Image, FIBO, SeedVR2, Qwen Image (via [mflux](https://github.com/mflux-ai/mflux)) |

## CLI Configuration

9 changes: 8 additions & 1 deletion README.zh.md
@@ -76,6 +76,7 @@ git clone https://github.com/jundot/omlx.git
cd omlx
pip install -e .           # Core only
pip install -e ".[mcp]"    # With MCP (Model Context Protocol) support
pip install -e ".[image]"  # With image generation support (requires mflux)
```

Requires macOS 15.0+ (Sequoia), Python 3.10+, and Apple Silicon (M1/M2/M3/M4).
@@ -118,7 +119,7 @@ brew services info omlx   # Check status

## Features

Supports text LLMs, vision-language models (VLM), OCR models, embedding models, and rerankers on Apple Silicon.
Supports text LLMs, vision-language models (VLM), OCR models, embedding models, rerankers, and image generation models on Apple Silicon.

### Admin Dashboard

@@ -132,6 +133,10 @@ brew services info omlx   # Check status

Runs VLMs on the same continuous batching and tiered KV cache stack as text LLMs. Supports multi-image chat, base64/URL/file image inputs, and tool calling with vision context. OCR models (DeepSeek-OCR, DOTS-OCR, GLM-OCR) are auto-detected and given optimized prompts.

### Image Generation

Generates images from text prompts (Text-to-Image) or transforms existing images (Image-to-Image) using diffusion models via mflux. Supports FLUX.1/FLUX.2, Z-Image, FIBO, SeedVR2, and Qwen Image models, and provides an OpenAI-compatible `/v1/images/generations` API.

### Tiered KV Cache (Hot + Cold)

Block-based KV cache management inspired by vLLM, with prefix sharing and Copy-on-Write. The cache operates across two tiers:
@@ -224,6 +229,7 @@ A drop-in replacement for the OpenAI and Anthropic APIs. Supports streaming usage stats (`stream
| `POST /v1/messages` | Anthropic Messages API |
| `POST /v1/embeddings` | Text embeddings |
| `POST /v1/rerank` | Document reranking |
| `POST /v1/images/generations` | Image generation (T2I & I2I) |
| `GET /v1/models` | List available models |

### Tool Calling & Structured Output
@@ -265,6 +271,7 @@ A drop-in replacement for the OpenAI and Anthropic APIs. Supports streaming usage stats (`stream
| OCR | DeepSeek-OCR, DOTS-OCR, GLM-OCR |
| Embedding | BERT, BGE-M3, ModernBERT |
| Reranker | ModernBERT, XLM-RoBERTa |
| Image | FLUX.1/FLUX.2, Z-Image, FIBO, SeedVR2, Qwen Image (via [mflux](https://github.com/mflux-ai/mflux)) |

## CLI Configuration

8 changes: 5 additions & 3 deletions omlx/admin/hf_downloader.py
@@ -613,9 +613,11 @@ async def _run_download(self, task_id: str, hf_token: str) -> None:

api, endpoint = _get_hf_api()

# Always ignore macOS metadata files and other unneeded files
ignore_patterns = ["*.DS_Store", ".DS_Store"]

# Skip pytorch format when safetensors exist to
# avoid downloading redundant weight files.
ignore_patterns = None
try:
model_info = await asyncio.wait_for(
asyncio.to_thread(
@@ -629,11 +631,11 @@ async def _run_download(self, task_id: str, hf_token: str) -> None:
if model_info.safetensors and model_info.safetensors.get(
"parameters"
):
ignore_patterns = [
ignore_patterns.extend([
"*.bin",
"original/**",
"consolidated.*.pth",
]
])
except Exception as e:
logger.warning(
f"Could not fetch repo info for {task.repo_id}: {e}"
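The key point of this hunk is that the base ignore list is now *extended* rather than replaced: macOS metadata is always skipped, and the PyTorch-format weights are additionally skipped only when the repo ships safetensors. A minimal sketch of that merge logic — the helper name is ours, not part of the module:

```python
def build_ignore_patterns(has_safetensors: bool) -> list[str]:
    """Mirror the diff: always drop macOS metadata files, and drop
    redundant PyTorch-format weights when safetensors are present."""
    patterns = ["*.DS_Store", ".DS_Store"]
    if has_safetensors:
        # Same patterns the hunk appends via ignore_patterns.extend(...)
        patterns.extend(["*.bin", "original/**", "consolidated.*.pth"])
    return patterns

print(build_ignore_patterns(False))  # ['*.DS_Store', '.DS_Store']
print(build_ignore_patterns(True))
```

The resulting list is what gets handed to the Hugging Face download call as its `ignore_patterns` argument.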
25 changes: 22 additions & 3 deletions omlx/admin/routes.py
@@ -1551,7 +1551,7 @@ async def update_model_settings(
)
current_settings.model_alias = alias_value
if "model_type_override" in sent:
valid_types = {"llm", "vlm", "embedding", "reranker", "audio_stt", "audio_tts", "audio_sts"}
valid_types = {"llm", "vlm", "embedding", "reranker", "audio_stt", "audio_tts", "audio_sts", "image_t2i"}
# Treat empty string as None (auto-detect)
override_value = request.model_type_override or None
if override_value is not None and override_value not in valid_types:
@@ -1569,6 +1569,7 @@
"audio_stt": "audio_stt",
"audio_tts": "audio_tts",
"audio_sts": "audio_sts",
"image_t2i": "image",
}
if override_value:
entry.model_type = override_value
@@ -3117,6 +3118,14 @@ def _build_active_models_data() -> dict:
if sched is not None:
waiting_requests = len(getattr(sched, "waiting", []))

# Image engine: read progress from ImageProgressTracker
image_progress = []
if model_info.get("engine_type") == "image":
from ..image_progress import get_image_progress_tracker
img_tracker = get_image_progress_tracker()
image_progress = img_tracker.get_model_progress(model_id)
active_requests = len(image_progress)

prefilling = tracker.get_model_progress(model_id)
prefilling_ids = {p["request_id"] for p in prefilling}

@@ -3138,6 +3147,7 @@
"waiting_requests": waiting_requests,
"prefilling": prefilling,
"generating": generating,
"image_progress": image_progress,
})

total_active += active_requests
@@ -3617,6 +3627,15 @@ def _add_model(model_path: Path, model_name: str) -> None:
"size_formatted": format_size(total_size),
}
)
def is_model_dir(path: Path) -> bool:
"""Check if directory is a valid model (config.json or diffusers-style image model)."""
if (path / "config.json").exists():
return True
# Diffusers-style image models (Flux, etc.)
has_transformer = (path / "transformer").is_dir()
has_vae = (path / "vae").is_dir()
has_text_encoder = (path / "text_encoder").is_dir() or (path / "text_encoder_2").is_dir()
return has_transformer and (has_vae or has_text_encoder)
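The new helper can be exercised against a throwaway directory tree; this sketch copies its body verbatim and fabricates a diffusers-style layout to show which combinations pass:

```python
import tempfile
from pathlib import Path

def is_model_dir(path: Path) -> bool:
    """Copy of the helper added in the hunk above."""
    if (path / "config.json").exists():
        return True
    # Diffusers-style image models (Flux, etc.)
    has_transformer = (path / "transformer").is_dir()
    has_vae = (path / "vae").is_dir()
    has_text_encoder = (path / "text_encoder").is_dir() or (path / "text_encoder_2").is_dir()
    return has_transformer and (has_vae or has_text_encoder)

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "transformer").mkdir()
    print(is_model_dir(root))  # transformer alone is not enough -> False
    (root / "vae").mkdir()
    print(is_model_dir(root))  # transformer + vae -> True
```

A `transformer/` directory alone is deliberately insufficient, so a partially downloaded model is not listed as usable.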

models = []
seen_names: set[str] = set()
@@ -3627,7 +3646,7 @@ def _add_model(model_path: Path, model_name: str) -> None:
if not subdir.is_dir() or subdir.name.startswith("."):
continue

if (subdir / "config.json").exists():
if is_model_dir(subdir):
# Level 1: direct model folder
_add_model(subdir, subdir.name)
else:
@@ -3643,7 +3662,7 @@ def _add_model(model_path: Path, model_name: str) -> None:
for child in sorted(subdir.iterdir()):
if not child.is_dir() or child.name.startswith("."):
continue
if (child / "config.json").exists():
if is_model_dir(child):
_add_model(child, child.name)

return {"models": models}
2 changes: 1 addition & 1 deletion omlx/admin/static/js/dashboard.js
@@ -1575,7 +1575,7 @@
},

get llmModels() {
return this.models.filter(m => m.model_type === 'llm' || m.model_type === 'vlm' || !m.model_type);
return this.models.filter(m => m.model_type === 'llm' || m.model_type === 'vlm' || m.model_type === 'image_t2i' || !m.model_type);
},

shellQuote(value) {
1 change: 1 addition & 0 deletions omlx/admin/templates/dashboard/_modal_model_settings.html
@@ -222,6 +222,7 @@ <h3 class="text-xs font-bold uppercase tracking-widest text-neutral-400 mb-5">{{
<option value="audio_stt">Audio STT</option>
<option value="audio_tts">Audio TTS</option>
<option value="audio_sts">Audio STS</option>
<option value="image_t2i">Image T2I</option>
</select>
</div>
<div x-show="reasoningParsers.length > 0">