fix: support OpenAI image dynamic resolution #57

Merged
ShellMonster merged 3 commits into main from fix/openai-image-dynamic-resolution on Apr 25, 2026

Conversation

@ShellMonster (Owner) commented Apr 25, 2026

User description

Summary

  • fix OpenAI Images timeout lookup so openai-image uses its own provider timeout
  • enable reference images and add gpt-image-2 option for OpenAI Images
  • route gpt-image-2 models through aspect ratio + 1K/2K/4K resolution controls, matching banana-slides dynamic size behavior
  • bump desktop version to 2.8.2
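
The conversation does not show how an aspect ratio and a 1K/2K/4K level are turned into pixel dimensions. A minimal sketch of one way to do it follows, under stated assumptions: `resolveSize`, the long-edge targets (1024/2048/4096) and the rounding step of 16 are all hypothetical and are not taken from this PR or from banana-slides.

```typescript
// Hypothetical sketch: none of these names or values come from the PR itself.
type ResolutionLevel = "1K" | "2K" | "4K";

interface PixelSize {
  width: number;
  height: number;
}

// Assumed long-edge targets for each level (illustrative only).
const LONG_EDGE: Record<ResolutionLevel, number> = {
  "1K": 1024,
  "2K": 2048,
  "4K": 4096,
};

// Round to the nearest multiple of `step`, never going below one step.
const roundTo = (value: number, step: number): number =>
  Math.max(step, Math.round(value / step) * step);

// Convert an aspect ratio such as "16:9" plus a level into pixel dimensions.
function resolveSize(aspectRatio: string, level: ResolutionLevel): PixelSize {
  const parts = aspectRatio.split(":");
  const w = Number(parts[0]);
  const h = Number(parts[1]);
  if (parts.length !== 2 || !Number.isFinite(w) || !Number.isFinite(h) || w <= 0 || h <= 0) {
    throw new Error(`invalid aspect ratio: ${aspectRatio}`);
  }
  const longEdge = LONG_EDGE[level];
  // The longer side gets the full long-edge budget; the other side is scaled.
  return w >= h
    ? { width: longEdge, height: roundTo((longEdge * h) / w, 16) }
    : { width: roundTo((longEdge * w) / h, 16), height: longEdge };
}

const landscape2K = resolveSize("16:9", "2K"); // { width: 2048, height: 1152 }
const portrait1K = resolveSize("3:4", "1K");   // { width: 768, height: 1024 }
```

Keeping the ratio and the level as separate inputs means the UI can offer the same two controls for every dynamic-size model, and only this one function needs to know the pixel limits.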

Validation

  • cd backend && go test ./...
  • cd desktop && npm run type-check
  • cd desktop && npm run tauri:build:latest (DMG produced; updater signing failed because TAURI_SIGNING_PRIVATE_KEY is not set)
  • hdiutil verify desktop/src-tauri/target/release/bundle/dmg/大香蕉 AI_2.8.2_aarch64.dmg

CodeAnt-AI Description

Support dynamic OpenAI image sizing and keep OpenAI image settings separate

What Changed

  • OpenAI image generation now supports the gpt-image-2 model with aspect ratio and 1K/2K/4K size controls
  • Reference images are allowed for OpenAI image generation
  • Quality settings now show for OpenAI image generation whenever they are supported, instead of only for one size mode
  • OpenAI image requests now keep their own timeout settings, so they no longer reuse the standard OpenAI timeout by mistake
  • App version was bumped to 2.8.2
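
The timeout fix itself lives in the Go backend and is not shown in this conversation. As a rough model of the intended behavior only, the sketch below does an exact-match lookup so that `openai-image` can never resolve to the `openai` entry. `timeoutFor`, `TimeoutTable`, the default of 150 seconds and the table values are all hypothetical.

```typescript
// Hypothetical model of a per-provider timeout lookup. The real backend is Go
// and its code is not shown in this conversation.
type TimeoutTable = Record<string, number>;

// Assumed fallback value, not taken from the PR.
const DEFAULT_TIMEOUT_SECONDS = 150;

function timeoutFor(provider: string, table: TimeoutTable): number {
  const key = provider.trim().toLowerCase();
  // Exact match only: "openai-image" must never resolve to the "openai" entry.
  const configured = Object.prototype.hasOwnProperty.call(table, key)
    ? table[key]
    : undefined;
  return typeof configured === "number" && configured > 0
    ? configured
    : DEFAULT_TIMEOUT_SECONDS;
}

// Example table with made-up values.
const timeouts: TimeoutTable = { openai: 120, "openai-image": 600 };
const imageTimeout = timeoutFor("openai-image", timeouts); // 600
const chatTimeout = timeoutFor("openai", timeouts);        // 120
const otherTimeout = timeoutFor("unknown", timeouts);      // 150 (default)
```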

Impact

✅ Fewer OpenAI image generation timeouts
✅ More OpenAI image model options
✅ Clearer OpenAI image sizing controls

codeant-ai Bot commented Apr 25, 2026

CodeAnt AI is reviewing your PR.


@gemini-code-assist

Summary of Changes

Hello, I'm Gemini Code Assist! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request enhances the OpenAI image generation capabilities by introducing support for the 'gpt-image-2' model and enabling dynamic resolution and aspect ratio controls. It also fixes a timeout reconciliation issue to ensure 'openai-image' tasks receive the correct configuration and enables reference image support across the board. The desktop application version has been bumped to 2.8.2 to reflect these changes.

Highlights

  • OpenAI Image Timeout Fix: Updated the backend provider timeout logic to correctly identify and apply specific timeouts for 'openai-image' instead of falling back to general OpenAI settings.
  • Model Support and Dynamic Resolution: Added support for the 'gpt-image-2' model and implemented dynamic resolution controls (aspect ratio) for OpenAI image generation, aligning it with existing dynamic size behaviors.
  • Feature Enablement: Enabled reference images for OpenAI providers and updated the UI to conditionally display quality and aspect ratio controls based on the selected model.
Ignored Files
  • Ignored by pattern: **/*.json (3)
    • desktop/package-lock.json
    • desktop/package.json
    • desktop/src-tauri/tauri.conf.json
  • Ignored by pattern: **/*.lock (1)
    • desktop/src-tauri/Cargo.lock
  • Ignored by pattern: **/*.toml (1)
    • desktop/src-tauri/Cargo.toml

codeant-ai Bot added the size:L label (This PR changes 100-499 lines, ignoring generated files) on Apr 25, 2026
Comment on lines +460 to +462
if (useQuality) {
formData.append('quality', config.imageQuality);
}

🟠 Architect Review — HIGH

Multipart image-to-image requests now include a quality field from the frontend, but the backend multipart parsing (ParseGenerateRequestFromMultipart) and GenerateWithImagesHandler never read or forward quality into taskParams, so OpenAI Images quality settings are ignored for reference-image flows.

Suggestion: Extend the multipart contract so that quality is parsed into MultipartRequest and included in taskParams in GenerateWithImagesHandler, and add a test covering an OpenAI Images reference-image request to verify that changing quality affects the upstream call.
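
A minimal sketch of the forwarding this suggestion asks for, written in TypeScript for consistency with the frontend excerpts even though the real handler is Go. `buildTaskParams` and its types are hypothetical. The form field and task-param names follow those listed in the review, but pairing `imageSize` with `resolution_level` is an assumption.

```typescript
// Hypothetical model of the multipart-to-taskParams step; the real code is Go.
type MultipartFields = Record<string, string | undefined>;
type TaskParams = Record<string, string | number>;

function buildTaskParams(fields: MultipartFields): TaskParams {
  const params: TaskParams = {
    prompt: fields["prompt"] ?? "",
    provider: fields["provider"] ?? "",
    model_id: fields["model_id"] ?? "",
    aspect_ratio: fields["aspectRatio"] ?? "",
    // Assumed mapping from the form field to the task parameter.
    resolution_level: fields["imageSize"] ?? "",
    count: Number(fields["count"] ?? "1") || 1,
  };
  // Forward quality only when the client sent a non-empty value, so the key
  // stays absent for requests that never set it.
  const quality = (fields["quality"] ?? "").trim();
  if (quality !== "") {
    params["quality"] = quality;
  }
  return params;
}

const withQuality = buildTaskParams({
  prompt: "a banana",
  provider: "openai-image",
  quality: "high",
});
const withoutQuality = buildTaskParams({
  prompt: "a banana",
  provider: "openai-image",
});
```

A regression test along the lines the review suggests would assert that `withQuality` carries `quality: "high"` while `withoutQuality` has no `quality` key at all.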

**Path:** desktop/src/hooks/useGenerate.ts
**Line:** 460:462

Comment thread desktop/src/store/configStore.ts Outdated
export const usesDynamicOpenAIImageSize = (provider: string, model?: string): boolean => (
provider === 'openai-image' && String(model || '').toLowerCase().includes('gpt-image-2')
);
export const supportsReferenceImages = (_provider: string): boolean => true;

🟠 Architect Review — HIGH

supportsReferenceImages now always returns true, so the UI enables reference images for openai-image and accepts JPG/WebP uploads, but OpenAIImageProvider.collectOpenAIImageReferences rejects any non-PNG reference; typical JPEG/WebP reference uploads with OpenAI Images now fail at generation time instead of being prevented or normalized.

Suggestion: Either gate reference-image support per provider/model and format (disallow non-PNG refs for openai-image in the UI), or normalize reference images to PNG before sending and/or surface a clear pre-submit validation error when the format is incompatible.
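
One of the options above, a pre-submit validation error, could look roughly like the sketch below. It assumes the PNG-only restriction described in this comment. `findUnsupportedReferences`, `RefFile` and `PNG_ONLY_PROVIDERS` are hypothetical names that do not exist in the codebase, and `other-provider` is a placeholder id.

```typescript
// Hypothetical pre-submit check; names are illustrative, not from the codebase.
interface RefFile {
  name: string;
  type: string; // MIME type as reported by the browser, e.g. "image/png"
}

// Providers assumed to accept only PNG references, per the review comment.
const PNG_ONLY_PROVIDERS = new Set<string>(["openai-image"]);

// Return the names of the files the selected provider would reject.
function findUnsupportedReferences(provider: string, files: RefFile[]): string[] {
  if (!PNG_ONLY_PROVIDERS.has(provider)) {
    return [];
  }
  return files
    .filter((file) => file.type.toLowerCase() !== "image/png")
    .map((file) => file.name);
}

const uploads: RefFile[] = [
  { name: "logo.png", type: "image/png" },
  { name: "photo.jpg", type: "image/jpeg" },
  { name: "sketch.webp", type: "image/webp" },
];
const rejected = findUnsupportedReferences("openai-image", uploads); // ["photo.jpg", "sketch.webp"]
```

Returning the offending file names rather than a boolean lets the UI name the exact files to convert or remove before the request is sent.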

**Path:** desktop/src/store/configStore.ts
**Line:** 86:86

Comment on lines +460 to +461
if (useQuality) {
formData.append('quality', config.imageQuality);

Suggestion: The multipart flow now sends a quality field, but the backend multipart parser does not read or forward quality into task params, so quality control is silently ignored whenever reference images are used. This creates inconsistent behavior between JSON generation and image-to-image generation. Update the multipart parsing/handler path to include quality (and pass it into provider params) so this new field actually takes effect. [logic error]

Severity Level: Major ⚠️
- ⚠️ OpenAI image-to-image ignores selected quality parameter.
- ⚠️ Quality dropdown ineffective whenever reference images are used.
- ⚠️ Behavior diverges between /tasks/generate and /tasks/generate-with-images.
Steps to Reproduce ✅
1. In the desktop app, open the ConfigPanel
(desktop/src/components/ConfigPanel/index.tsx:4-10) and select the image provider
`openai-image`, which enables the quality dropdown in BatchSettings
(desktop/src/components/ConfigPanel/BatchSettings.tsx:67-75) because
`supportsQualityControl(imageProvider)` returns true for `openai-image`
(desktop/src/store/configStore.ts:12-19). Choose a non-default `imageQuality` (e.g.
"high") and add at least one reference image so that `refFiles.length > 0` (configStore
fields `imageQuality` and `refFiles` are defined in
desktop/src/store/configStore.ts:31-37).

2. Click the "Generate" button, whose onClick calls `generate` from `useGenerate()`
(desktop/src/components/ConfigPanel/index.tsx:34-41). Inside `useGenerate`, `useQuality`
is set from `supportsQualityControl(config.imageProvider)`
(desktop/src/hooks/useGenerate.ts:413) and `allowReferenceImages` is true with
`config.refFiles.length > 0`, so `submitSingleGenerate` or the single-task branch builds a
`FormData` and appends `quality` when `useQuality` is true
(desktop/src/hooks/useGenerate.ts:460-461 and 634-635). The hook then calls
`generateBatchWithImages(formData)` (desktop/src/hooks/useGenerate.ts:101 and 480), which
POSTs this multipart body to `/tasks/generate-with-images`
(desktop/src/services/generateApi.ts:13-17).

3. On the backend, the route `POST /v1/tasks/generate-with-images` is wired to
`GenerateWithImagesHandler` (backend/cmd/server/main.go:276-277).
`GenerateWithImagesHandler` (backend/internal/api/handlers.go:521-524) immediately calls
`ParseGenerateRequestFromMultipart(c)` to parse the multipart form.
`ParseGenerateRequestFromMultipart` and its fallback `parseWithStandardLibrary`
(backend/internal/api/multipart_helper.go:21-35, 37-52, 95-115, and 128-151) register
handlers for fields like `provider`, `model_id`, `prompt`, `aspectRatio`, `imageSize`,
`count`, `verbose_logging`, `prompt_optimize_*`, `refPaths`, and `refImages`, but there is
no registration or parsing of a `quality` field (confirmed by no `"quality"` matches in
multipart_helper.go via Grep).

4. After parsing, `GenerateWithImagesHandler` constructs `taskParams` containing `prompt`,
`provider`, `model_id`, `aspect_ratio`, `resolution_level`, `count`, and
`reference_images` (backend/internal/api/handlers.go:74-88) and passes this map to the
provider. The `OpenAIImageProvider` expects an optional `quality` parameter and validates
it (backend/internal/provider/openai_image.go:16-21) and then forwards it into the OpenAI
request body (`buildImagesGenerationRequestBody` sets `body.Quality` from
`params["quality"]` at backend/internal/provider/openai_image.go:107-115, and
`openAIImageEditFields` includes it in the multipart fields at 334-338). Because
`taskParams` from the multipart path never includes `quality`, `params["quality"]` is
empty, so `body.Quality` stays unset and no `quality` field is sent to OpenAI for image
edits. In contrast, the JSON text-to-image path `/tasks/generate` uses `GenerateHandler`
(backend/internal/api/handlers.go:399-419), which passes `req.Params` (including `quality`
from `buildImageParams` in desktop/src/hooks/useGenerate.ts:415-417 and 51-60) directly to
the provider, and `buildConfigSnapshot` records `params["quality"]`
(backend/internal/api/handlers.go:104-108), so the selected quality only takes effect when
no reference images are used.

**Path:** desktop/src/hooks/useGenerate.ts
**Line:** 460:461

codeant-ai Bot commented Apr 25, 2026

CodeAnt AI finished reviewing your PR.

codacy-production Bot commented Apr 25, 2026

Up to standards ✅

🟢 Issues 0 issues

Results:
0 new issues

🟢 Metrics 34 complexity · 2 duplication

Metric Results
Complexity 34
Duplication 2


@gemini-code-assist gemini-code-assist Bot left a comment

Code Review

This pull request introduces support for the 'gpt-image-2' model in the 'openai-image' provider, including backend logic updates to handle provider timeouts and frontend configuration changes to support dynamic image sizes and quality control. The review comments suggest renaming several boolean helper functions in 'desktop/src/store/configStore.ts' to follow the project's naming convention of prefixing boolean variables and functions with 'is/has/should'.

Comment thread desktop/src/store/configStore.ts Outdated
Comment on lines +83 to +85
export const usesDynamicOpenAIImageSize = (provider: string, model?: string): boolean => (
provider === 'openai-image' && String(model || '').toLowerCase().includes('gpt-image-2')
);

medium

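Per rule 106 of the project's code review guidelines, boolean variables and functions should start with is/has/should. Consider renaming usesDynamicOpenAIImageSize to isUsingDynamicOpenAIImageSize to follow the naming convention.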

Suggested change
- export const usesDynamicOpenAIImageSize = (provider: string, model?: string): boolean => (
- provider === 'openai-image' && String(model || '').toLowerCase().includes('gpt-image-2')
- );
+ export const isUsingDynamicOpenAIImageSize = (provider: string, model?: string): boolean => (
+ provider === 'openai-image' && String(model || '').toLowerCase().includes('gpt-image-2')
+ );
References
  1. React/TypeScript naming convention: booleans start with is/has/should (link)

Comment thread desktop/src/store/configStore.ts Outdated
export const usesDynamicOpenAIImageSize = (provider: string, model?: string): boolean => (
provider === 'openai-image' && String(model || '').toLowerCase().includes('gpt-image-2')
);
export const supportsReferenceImages = (_provider: string): boolean => true;

medium

Per rule 106 of the project's code review guidelines, boolean variables and functions should start with is/has/should. Consider renaming supportsReferenceImages to isReferenceImageSupported.

Suggested change
- export const supportsReferenceImages = (_provider: string): boolean => true;
+ export const isReferenceImageSupported = (_provider: string): boolean => true;
References
  1. React/TypeScript naming convention: booleans start with is/has/should (link)

Comment thread desktop/src/store/configStore.ts Outdated
Comment on lines +87 to +89
export const usesNativeImageSize = (provider: string, model?: string): boolean => (
provider === 'openai-image' && !usesDynamicOpenAIImageSize(provider, model)
);

medium

Per rule 106 of the project's code review guidelines, boolean variables and functions should start with is/has/should. Consider renaming usesNativeImageSize to isUsingNativeImageSize.

Suggested change
- export const usesNativeImageSize = (provider: string, model?: string): boolean => (
- provider === 'openai-image' && !usesDynamicOpenAIImageSize(provider, model)
- );
+ export const isUsingNativeImageSize = (provider: string, model?: string): boolean => (
+ provider === 'openai-image' && !isUsingDynamicOpenAIImageSize(provider, model)
+ );
References
  1. React/TypeScript naming convention: booleans start with is/has/should (link)

Comment thread desktop/src/store/configStore.ts Outdated
export const usesNativeImageSize = (provider: string, model?: string): boolean => (
provider === 'openai-image' && !usesDynamicOpenAIImageSize(provider, model)
);
export const supportsQualityControl = (provider: string): boolean => provider === 'openai-image';

medium

Per rule 106 of the project's code review guidelines, boolean variables and functions should start with is/has/should. Consider renaming supportsQualityControl to isQualityControlSupported.

Suggested change
- export const supportsQualityControl = (provider: string): boolean => provider === 'openai-image';
+ export const isQualityControlSupported = (provider: string): boolean => provider === 'openai-image';
References
  1. React/TypeScript naming convention: booleans start with is/has/should (link)
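
For reference, the four helpers discussed in these threads can be exercised as follows. The bodies are copied from the configStore.ts excerpts quoted in the comments, using the names as they appear there, with `export` dropped so the snippet stands alone. The model ids other than `gpt-image-2` and the provider id `some-provider` are placeholders chosen for illustration.

```typescript
// Bodies copied from the configStore.ts excerpts quoted in this conversation.
const usesDynamicOpenAIImageSize = (provider: string, model?: string): boolean => (
  provider === 'openai-image' && String(model || '').toLowerCase().includes('gpt-image-2')
);
const supportsReferenceImages = (_provider: string): boolean => true;
const usesNativeImageSize = (provider: string, model?: string): boolean => (
  provider === 'openai-image' && !usesDynamicOpenAIImageSize(provider, model)
);
const supportsQualityControl = (provider: string): boolean => provider === 'openai-image';

// Placeholder inputs and the value each helper returns for them.
const dynamicForGptImage2 = usesDynamicOpenAIImageSize('openai-image', 'gpt-image-2');      // true
const dynamicIgnoresCase = usesDynamicOpenAIImageSize('openai-image', 'GPT-Image-2');       // true
const dynamicForOtherModel = usesDynamicOpenAIImageSize('openai-image', 'gpt-image-1');     // false
const dynamicForOtherProvider = usesDynamicOpenAIImageSize('some-provider', 'gpt-image-2'); // false
const nativeWithoutModel = usesNativeImageSize('openai-image');                             // true
const nativeForGptImage2 = usesNativeImageSize('openai-image', 'gpt-image-2');              // false
const qualityForOpenAIImage = supportsQualityControl('openai-image');                       // true
const refsForAnyProvider = supportsReferenceImages('some-provider');                        // true
```

Because the model check is a case-insensitive substring match, any model id that contains `gpt-image-2` takes the dynamic-size path, and every other `openai-image` model falls through to the native size mode.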

ShellMonster merged commit 2ca13c7 into main on Apr 25, 2026
5 checks passed
ShellMonster deleted the fix/openai-image-dynamic-resolution branch on April 25, 2026 11:44
ShellMonster added a commit that referenced this pull request Apr 27, 2026
…olution

fix: support OpenAI image dynamic resolution