feat: support OpenAI image edits #56
Summary of Changes
This pull request expands the capabilities of the OpenAI image provider by enabling support for image editing workflows. It introduces logic to handle reference images via multipart requests and adds size resolution to ensure compatibility with various OpenAI models and user-defined aspect ratios.
Up to standards ✅🟢 Issues
| Metric | Results |
|---|---|
| Complexity | 116 |
| Duplication | 14 |
Code Review
This pull request enhances the OpenAI image provider by adding support for image editing (edits) and dynamic resolution scaling based on aspect ratios. It introduces multipart form handling for reference images and includes comprehensive unit tests. Key feedback points include the use of non-standard API parameters that may cause errors, incorrect multipart field naming for image masks, and the need for stricter PNG format validation to comply with OpenAI's requirements. Additionally, recommendations were made to optimize memory usage for large file uploads using streaming and to improve code documentation for complex logic as per the style guide.
| if body.Quality != "" { | ||
| fields["quality"] = body.Quality | ||
| } | ||
| if body.InputFidelity != "" { | ||
| fields["input_fidelity"] = body.InputFidelity | ||
| } | ||
| if body.Background != "" { | ||
| fields["background"] = body.Background | ||
| } | ||
| if body.OutputFormat != "" { | ||
| fields["output_format"] = body.OutputFormat | ||
| } | ||
| if body.Moderation != "" { | ||
| fields["moderation"] = body.Moderation | ||
| } |
| for _, ref := range refs { | ||
| header := make(textproto.MIMEHeader) | ||
| header.Set("Content-Disposition", fmt.Sprintf(`form-data; name="image"; filename="%s"`, escapeMultipartFilename(ref.Name))) | ||
| header.Set("Content-Type", ref.MIME) | ||
| part, err := writer.CreatePart(header) | ||
| if err != nil { | ||
| return nil, nil, fmt.Errorf("构建 OpenAI Images Edit 图片字段失败: %w", err) | ||
| } | ||
| if _, err := part.Write(ref.Content); err != nil { | ||
| return nil, nil, fmt.Errorf("写入 OpenAI Images Edit 图片字段失败: %w", err) | ||
| } | ||
| } |
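OpenAI's /v1/images/edits endpoint requires the primary image field to be named image and the optional mask field to be named mask. The current logic names every reference image image, which does not conform to the official specification. If multiple reference images are provided, the first should generally be used as image and the second as mask, with any additional images ignored.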
| for _, ref := range refs { | |
| header := make(textproto.MIMEHeader) | |
| header.Set("Content-Disposition", fmt.Sprintf(`form-data; name="image"; filename="%s"`, escapeMultipartFilename(ref.Name))) | |
| header.Set("Content-Type", ref.MIME) | |
| part, err := writer.CreatePart(header) | |
| if err != nil { | |
| return nil, nil, fmt.Errorf("构建 OpenAI Images Edit 图片字段失败: %w", err) | |
| } | |
| if _, err := part.Write(ref.Content); err != nil { | |
| return nil, nil, fmt.Errorf("写入 OpenAI Images Edit 图片字段失败: %w", err) | |
| } | |
| } | |
| for i, ref := range refs { | |
| fieldName := "image" | |
| if i == 1 { | |
| fieldName = "mask" | |
| } else if i > 1 { | |
| break // OpenAI Edit supports only one image and one mask | |
| } | |
| header := make(textproto.MIMEHeader) | |
| header.Set("Content-Disposition", fmt.Sprintf(`form-data; name="%s"; filename="%s"`, fieldName, escapeMultipartFilename(ref.Name))) | |
| header.Set("Content-Type", ref.MIME) | |
| part, err := writer.CreatePart(header) | |
| if err != nil { | |
| return nil, nil, fmt.Errorf("构建 OpenAI Images Edit 图片字段失败: %w", err) | |
| } | |
| if _, err := part.Write(ref.Content); err != nil { | |
| return nil, nil, fmt.Errorf("写入 OpenAI Images Edit 图片字段失败: %w", err) | |
| } | |
| } |
| if !strings.HasPrefix(mimeType, "image/") { | ||
| return nil, fmt.Errorf("第 %d 张参考图不是有效图片", idx+1) | ||
| } |
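OpenAI's image editing endpoint (Edits) explicitly requires input images to be in PNG format. The current code only checks for the image/ prefix, which allows formats such as JPEG and WebP and will cause the API call to fail. Strictly validating that the MIME type is image/png here is recommended.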
| if !strings.HasPrefix(mimeType, "image/") { | |
| return nil, fmt.Errorf("第 %d 张参考图不是有效图片", idx+1) | |
| } | |
| if mimeType != "image/png" { | |
| return nil, fmt.Errorf("第 %d 张参考图不是有效的 PNG 图片,OpenAI Edit 仅支持 PNG 格式", idx+1) | |
| } |
| var payload bytes.Buffer | ||
| writer := multipart.NewWriter(&payload) |
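Per item 68 of the repository guidelines, large-file operations should be mindful of memory usage, and streaming is recommended. The current implementation buffers the entire multipart payload in a bytes.Buffer, which causes memory usage to spike when handling multiple large images (each limited to 20MB). Consider switching to io.Pipe with a goroutine for streaming writes.
References
- Large-file operations should be mindful of memory usage (streaming) (link)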
| req.Header.Set("Content-Type", writer.FormDataContentType()) | ||
| req.Header.Set("Accept", "application/json") | ||
| req.Header.Set("Authorization", "Bearer "+strings.TrimSpace(p.config.APIKey)) | ||
| req.Header.Set("Connection", "close") |
| if size, _ := params["size"].(string); strings.TrimSpace(size) != "" { | ||
| return strings.TrimSpace(strings.ToLower(size)) |
Suggestion: Explicit size values are returned as-is without checking model compatibility, so size="auto" (or any unsupported explicit size) will be sent to models like dall-e-2/dall-e-3 that require fixed dimensions, causing avoidable OpenAI 400 failures. Validate explicit sizes against the selected model or normalize invalid values before building the request body. [logic error]
Severity Level: Major ⚠️
- ❌ OpenAI-image GenerateHandler requests can fail on unsupported sizes.
- ⚠️ Image-edit requests may 400 when size conflicts with model.
Steps of Reproduction ✅
1. A client calls the JSON image-generation endpoint `GenerateHandler` in
`backend/internal/api/handlers.go:1-7`, setting `provider` to `"openai-image"` and
including `params` with `"prompt": "..."` and an explicit `"size": "auto"` (or any
arbitrary dimension) as part of the request body.
2. Inside `GenerateHandler`, the provider is resolved
(`provider.GetProvider(req.Provider)` at `handlers.go:8-13`), `req.Params` is passed into
`ResolveModelID` for image purpose (`handlers.go:29-35`), and the resolved `model_id` is
added back into `req.Params["model_id"]` if non-empty (`handlers.go:36-38`).
3. `GenerateHandler` then calls `p.ValidateParams(req.Params)` (`handlers.go:40-43`). For
`OpenAIImageProvider`, `ValidateParams` in
`backend/internal/provider/openai_image.go:56-89` loads `size, _ :=
params["size"].(string)` (`openai_image.go:70`) and accepts `"auto"` or any `宽x高` (width x height) value
because `isValidOpenAIImageSize` returns true for `"auto"` and general
`^[1-9][0-9]{1,4}x[1-9][0-9]{1,4}$` (`openai_image.go:63-69`), so the request passes
validation regardless of the actual model's supported sizes.
4. When the worker later executes the task, it calls `OpenAIImageProvider.Generate`
(`backend/internal/provider/openai_image.go:92-164`), which builds the body via
`buildImagesGenerationRequestBody` (`openai_image.go:166-199`).
`buildImagesGenerationRequestBody` calls `resolveOpenAIImageSize(modelID, params)`
(`openai_image.go:176 & 71-89`), and because `params["size"]` is non-empty,
`resolveOpenAIImageSize` immediately returns the explicit size unchanged
(`openai_image.go:467-468`). This `Size` field is then sent to `/images/generations` or
`/images/edits` (`openai_image.go:201-207` and `openai_image.go:280-285`), so if the
configured OpenAI model rejects `"auto"` or the arbitrary dimensions for that endpoint,
the API responds with HTTP 400 and the user's image request fails, even though the
provider could have normalized or constrained the size for that model.
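One way to act on this finding is to check an explicit size against a per-model allow-list before the request body is built. The sketch below is illustrative only: normalizeExplicitSize, supportedSizes, and fallbackSize are hypothetical names, and the size lists are assumptions that should be verified against the current OpenAI documentation. Models that are not listed pass through unchanged, so models that accept `auto` or flexible dimensions are unaffected.

```go
package main

import (
	"fmt"
	"strings"
)

// fallbackSize is assumed to be accepted by every model listed below.
const fallbackSize = "1024x1024"

// supportedSizes is an assumed allow-list for fixed-dimension models.
// Verify these values against the OpenAI documentation before relying on them.
var supportedSizes = map[string][]string{
	"dall-e-2": {"256x256", "512x512", "1024x1024"},
	"dall-e-3": {"1024x1024", "1792x1024", "1024x1792"},
}

// normalizeExplicitSize returns the size to send and whether the caller's
// explicit value was kept. Unknown models are passed through untouched.
func normalizeExplicitSize(modelID, size string) (string, bool) {
	size = strings.ToLower(strings.TrimSpace(size))
	allowed, known := supportedSizes[strings.ToLower(strings.TrimSpace(modelID))]
	if !known {
		return size, true
	}
	for _, candidate := range allowed {
		if candidate == size {
			return size, true
		}
	}
	return fallbackSize, false
}

func main() {
	cases := [][2]string{
		{"dall-e-3", "auto"},
		{"dall-e-2", "512x512"},
		{"some-other-model", "auto"},
	}
	for _, tc := range cases {
		size, kept := normalizeExplicitSize(tc[0], tc[1])
		fmt.Printf("model=%s requested=%s send=%s kept=%t\n", tc[0], tc[1], size, kept)
	}
}
```

Returning an error from ValidateParams instead of silently falling back is an equally reasonable design; the fallback shown here favors a successful request over strictness.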
Handled the review feedback in 89142ab:
Validation: go test ./...
Handled the latest CodeAnt size-compatibility finding in 726309d:
Validation: go test ./...
Handled the remaining active review thread by adding a function-level Chinese comment for dynamic OpenAI image size calculation. Verified with |
User description
Summary
Tests
Notes
CodeAnt-AI Description
Support OpenAI image edits and flexible image sizing
What Changed
`widthxheight` sizes and `input_fidelity` values of low or high
Impact
✅ Image edits with reference photos
✅ Fewer invalid size errors
✅ Correct image sizing for more model and aspect ratio combinations