feat(api): Add OpenAI-compatible Image Generation API (Text-to-Image & Image-to-Image)#855

Open
iazrael wants to merge 4 commits into jundot:main from iazrael:feature/image-generation-api

@iazrael iazrael commented Apr 19, 2026

Summary

Implements comprehensive image generation support for oMLX via mflux, enabling Text-to-Image (T2I) and Image-to-Image (I2I) capabilities with OpenAI-compatible API endpoints.

Closes #477

Features

Core Implementation

  • ImageEngine (omlx/engine/image.py): Manages diffusion model loading and inference, running both through the MLX executor for Metal command buffer safety
  • Model Discovery: Auto-detects 20+ image generation models from model_index.json (FLUX.1/FLUX.2, Z-Image, FIBO, SeedVR2, Qwen Image)
  • API Layer: OpenAI-compatible /v1/images/generations endpoint with Pydantic validation
  • Admin Integration: Model downloader now filters macOS-compatible models and includes image generation models
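The model discovery bullet can be pictured with a minimal sketch. It assumes a diffusers-style `model_index.json` whose `_class_name` field names the pipeline class. The function name `looks_like_image_model` and the hint list are hypothetical illustrations and are not taken from the PR's `model_discovery.py`.

```python
import json
from pathlib import Path

# Hypothetical substrings for illustration only; the PR's real detection
# rules live in model_discovery.py and are not reproduced here.
IMAGE_PIPELINE_HINTS = ("flux", "qwenimage", "zimage")


def looks_like_image_model(model_dir: str) -> bool:
    """Return True if model_dir has a model_index.json naming an image pipeline."""
    index_path = Path(model_dir) / "model_index.json"
    if not index_path.is_file():
        return False
    try:
        config = json.loads(index_path.read_text(encoding="utf-8"))
    except (OSError, json.JSONDecodeError):
        # Unreadable or malformed config: treat as "not an image model".
        return False
    if not isinstance(config, dict):
        return False
    class_name = str(config.get("_class_name", "")).lower()
    return any(hint in class_name for hint in IMAGE_PIPELINE_HINTS)
```

Keeping the check in a small, side-effect-free function like this is one way to make detection easy to unit test against temporary directories.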

Supported Models

| Model Family | Examples |
| --- | --- |
| FLUX.1/FLUX.2 | FLUX.1-dev, FLUX.1-schnell, FLUX.2-klein |
| Z-Image | Z-Image-Base, Z-Image-Variants |
| FIBO | FIBO-Base |
| SeedVR2 | SeedVR2-Base |
| Qwen Image | Qwen2-VL-Image (via mflux) |

API Compatibility

OpenAI DALL-E compatible request/response format:

POST /v1/images/generations
{
  "model": "FLUX.1-dev",
  "prompt": "a sunset over mountains",
  "size": "1024x1024",
  "n": 1
}
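
For callers, the request above could be sent from Python roughly as follows. This is a sketch using only the standard library. The base URL `http://localhost:8000` is an assumption about where the server listens, and the helper names `build_image_request` and `generate` are hypothetical.

```python
import json
import urllib.request

# Assumed local server address; adjust to wherever oMLX is listening.
BASE_URL = "http://localhost:8000"


def build_image_request(prompt, model="FLUX.1-dev", size="1024x1024", n=1):
    """Build (but do not send) a POST request for /v1/images/generations."""
    payload = {"model": model, "prompt": prompt, "size": size, "n": n}
    return urllib.request.Request(
        f"{BASE_URL}/v1/images/generations",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt, **kwargs):
    """Send the request and return the parsed JSON response body."""
    with urllib.request.urlopen(build_image_request(prompt, **kwargs)) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

Calling `generate("a sunset over mountains")` needs a running server. If the response follows the OpenAI image format, the generated images would be expected under a top-level `data` list, but the exact fields should be checked against the PR's response model.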

Changes

  • 20 files changed, 1701 insertions(+), 22 deletions(-)
  • New: omlx/api/image_models.py, omlx/api/image_routes.py, omlx/engine/image.py
  • Updated: model_discovery.py (refactored nested image detection into helper)
  • Updated: engine_pool.py (ImageEngine integration)
  • Updated: All README files (EN/ZH/KO/JA) with image generation documentation
  • Updated: pyproject.toml (optional [image] dependency with mflux>=0.17.0)

Tests

  • test_image_engine.py: Unit tests for ImageEngine (lazy import, model resolution, generation)
  • test_image_models.py: Pydantic validation tests
  • test_image_gen.py: Integration test script

Installation

pip install -e ".[image]"  # Requires mflux>=0.17.0
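
Because mflux is an optional extra, code that needs it typically imports it on first use and reports a helpful error when it is absent. The sketch below shows that pattern in general terms; `require_optional` is a hypothetical helper and not a function from this PR.

```python
import importlib


def require_optional(module_name: str, extra: str):
    """Import module_name on first use, pointing at the pip extra if it is missing."""
    try:
        return importlib.import_module(module_name)
    except ImportError as exc:
        raise ImportError(
            f"'{module_name}' is required for this feature; "
            f'install it with: pip install -e ".[{extra}]"'
        ) from exc
```

With this shape, a call such as `require_optional("mflux", "image")` would only pay the import cost when image generation is actually requested.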

Architecture Notes

  • Uses MLX executor to prevent Metal command buffer race conditions
  • Lazy mflux import — only loaded when ImageEngine starts
  • Model alias mapping supports 20+ model name variants
  • I2I temp file cleanup handled in finally block
  • Refactored _check_image_model_from_config() from deeply nested logic in detect_model_type()
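
Two of the notes above, serialized execution and temp file cleanup in a `finally` block, can be sketched together. This is an illustration under stated assumptions, not the PR's code: a single-worker `ThreadPoolExecutor` stands in for the MLX executor, and `image_to_image` and its `transform` argument are hypothetical names.

```python
import os
import tempfile
from concurrent.futures import ThreadPoolExecutor

# One worker thread means submitted jobs run strictly one at a time,
# which is the property the serialized executor is meant to provide.
_executor = ThreadPoolExecutor(max_workers=1)


def image_to_image(image_bytes: bytes, transform):
    """Write the input to a temp file, run transform(path) on the executor,
    and remove the temp file whether the job succeeds or fails."""
    fd, tmp_path = tempfile.mkstemp(suffix=".png")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(image_bytes)
        # .result() blocks until the job finishes and re-raises its exception.
        return _executor.submit(transform, tmp_path).result()
    finally:
        # Runs on success and on error, so the temp file does not leak.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
```

Routing every GPU-touching call through one worker trades throughput for safety: concurrent HTTP requests queue up instead of issuing overlapping work.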

Related Issue: #477

@deepsweet

deepsweet commented Apr 19, 2026

iazrael and others added 3 commits April 19, 2026 22:19
…model detection

Add comprehensive Text-to-Image (T2I) and Image-to-Image (I2I) support via mflux:
- Add ImageEngine with support for FLUX.1/FLUX.2, Z-Image, FIBO, SeedVR2, Qwen Image
- Implement OpenAI-compatible /v1/images/generations endpoint
- Add ImageRequest/ImageResponse Pydantic models with validation
- Integrate image models into EnginePool with LRU eviction support
- Refactor model_discovery.py: extract _check_image_model_from_config() helper
  to reduce nesting and improve testability of image model detection
- Add .worktrees/ and outputs/ to .gitignore

Supported features:
- Batch generation (n=1-4 images)
- Configurable size, quality, guidance scale, inference steps
- Negative prompts and seed control
- Image-to-image transformation with strength parameter

Dependencies:
- Add mflux>=0.17.0 for image generation

Tests:
- test_image_engine.py: ImageEngine unit tests with mocks
- test_image_models.py: Pydantic model validation tests
- test_image_gen.py: Integration test script

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>

Add denoising step progress display in Active Models card for image
generation requests. Shows step/total with speed metrics and per-request
progress bars when generating images via mflux.

- Add ImageProgressTracker singleton for thread-safe progress state
- Add ImageProgressCallback implementing mflux duck-typed protocols
- Integrate callback registration/cleanup in ImageEngine.generate_image()
- Update admin stats API to read image_progress for image engines
- Update dashboard template to show step progress with green indicator

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
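
A thread-safe progress singleton of the kind this commit describes might look like the sketch below. The class name `ProgressTracker` and its methods are hypothetical and are not the PR's `ImageProgressTracker` API; the sketch only illustrates the locking pattern.

```python
import threading


class ProgressTracker:
    """Process-wide, lock-protected store of step progress keyed by request id."""

    _instance = None
    _instance_lock = threading.Lock()

    def __new__(cls):
        # Guard creation so two threads cannot build two instances.
        with cls._instance_lock:
            if cls._instance is None:
                inst = super().__new__(cls)
                inst._lock = threading.Lock()
                inst._progress = {}
                cls._instance = inst
        return cls._instance

    def update(self, request_id, step, total):
        with self._lock:
            self._progress[request_id] = (step, total)

    def clear(self, request_id):
        with self._lock:
            self._progress.pop(request_id, None)

    def snapshot(self):
        # Return a copy so readers never iterate a dict that is being mutated.
        with self._lock:
            return dict(self._progress)
```

In this arrangement the generation thread would call `update` from its step callback, while a stats endpoint reads `snapshot()` without blocking generation for long.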
@iazrael iazrael force-pushed the feature/image-generation-api branch from 9cee9b1 to 5f4d60b Compare April 19, 2026 14:19

@iazrael iazrael changed the title from "feat: Add OpenAI-compatible Image Generation API (Text-to-Image & Image-to-Image)" to "feat(api): Add OpenAI-compatible Image Generation API (Text-to-Image & Image-to-Image)" Apr 20, 2026

Remove strict Literal type constraint on image size parameter. Instead
accept any valid WxH format string and validate dynamically. This gives
API callers full control over resolution while providing suggested values
in documentation.

- Change size field from Literal to str with WxH format validation
- Add field_validator to ensure positive integer dimensions
- Document suggested sizes including new 720x1280 portrait option
- Update ImageEngine and model discovery for flexible resolution handling

Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
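
The WxH check described in this commit can be illustrated with a plain function. This is a stand-in written for clarity, not the PR's Pydantic `field_validator`; the name `parse_size`, the case-insensitive `x`, and the error wording are assumptions.

```python
import re

# Width and height as ASCII digits separated by a single "x".
_SIZE_RE = re.compile(r"([0-9]+)x([0-9]+)")


def parse_size(size: str) -> tuple[int, int]:
    """Parse a 'WxH' string into (width, height), requiring positive integers."""
    match = _SIZE_RE.fullmatch(size.strip().lower())
    if match is None:
        raise ValueError(f"size must look like 'WIDTHxHEIGHT', got {size!r}")
    width, height = int(match.group(1)), int(match.group(2))
    if width <= 0 or height <= 0:
        raise ValueError(f"size dimensions must be positive, got {size!r}")
    return width, height
```

Inside a Pydantic model, the same logic would sit in a validator on the `size` field so that malformed values are rejected before they reach the engine.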
@0xAlcibiades

Big up on this, great idea and was going to implement the same myself.

Development

Successfully merging this pull request may close these issues.

Feature Request: Image Generation API (Text-to-Image / Image-to-Image)

3 participants