feat: add openai flex processing toggle#1465

Open
tasoo-oos wants to merge 3 commits into
kwaroran:mainfrom
tasoo-oos:feat-openai-batching

Contributor

@tasoo-oos tasoo-oos commented May 30, 2026

PR Checklist

  • Required Checks
    • Have you added type definitions?
    • Have you tested your changes?
    • Have you checked that it won't break any existing features?
  • If your PR uses models [1], check the following:
    • Have you checked if it works normally in all models?
    • Have you checked if it works normally in all web, local, and node-hosted versions? If it doesn't, have you blocked it in those versions?
  • If your PR is highly AI generated [2], check the following:
    • Have you understood what the code does?
    • Have you cleaned up any unnecessary or redundant code?
    • Is it not a huge change?
      • We currently do not accept highly AI generated PRs that are large changes.

Summary

Add an Advanced Settings toggle for OpenAI Flex Processing on official OpenAI Chat Completions requests.

Related Issues

None.

Changes

  • Added persisted openAIFlexProcessing setting.
  • Added the Advanced Settings toggle with localized labels and help text.
  • Sends service_tier: flex only for official OpenAI Chat Completions requests when enabled.
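The last change above can be pictured with a minimal sketch. It is not the code from this PR: the type names, the `withFlexTier` helper, and the hostname check are all hypothetical, and the real implementation may decide what counts as an "official" endpoint differently (for example, to cover the risu-official GPT endpoint). Only the `openAIFlexProcessing` setting name and the `service_tier: "flex"` field come from the description.

```typescript
// Hypothetical request shape; the project's real types are not shown in this PR description.
interface ChatCompletionBody {
  model: string;
  messages: { role: string; content: string }[];
  service_tier?: "auto" | "default" | "flex";
}

// Hypothetical bundle of the inputs the decision depends on.
interface FlexOptions {
  openAIFlexProcessing: boolean; // the persisted setting named in this PR
  endpointUrl: string;           // where the request will be sent
  useResponsesApi: boolean;      // Responses API requests are left unchanged
}

// Assumption: "official" is approximated here by the api.openai.com hostname.
function isOfficialOpenAI(url: string): boolean {
  try {
    return new URL(url).hostname === "api.openai.com";
  } catch {
    return false; // an unparsable URL is never treated as official
  }
}

// Returns a copy of the body with service_tier set to "flex" only when
// every condition holds; otherwise returns the body untouched.
function withFlexTier(body: ChatCompletionBody, opts: FlexOptions): ChatCompletionBody {
  if (!opts.openAIFlexProcessing || opts.useResponsesApi || !isOfficialOpenAI(opts.endpointUrl)) {
    return body;
  }
  return { ...body, service_tier: "flex" };
}
```

Returning a copy rather than mutating the body keeps the non-flex path byte-for-byte identical to the existing behavior, which matches the claim that other providers are unaffected.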

Impact

Users can opt into lower-cost OpenAI Flex responses, which may be slower than regular responses.
Requests to other providers, such as OpenRouter, and Responses API requests are unchanged.

Additional Notes

Tested with pnpm check, and manually against the risu-official GPT endpoint and a Custom API pointed at https://api.openai.com/v1.

Also, as described in the pricing documentation, only o3, o4-mini, and the GPT-5 model family (except for the -Chat variant) can use this feature.

I could have implemented a per-model guard with LLMFlags, but I chose not to.
Without a guard, a user who enables flex with an unsupported model gets a clear, explicit error. With an LLMFlags-based guard, a user whose model is missing the flag by mistake (for example, a Custom API pointing at an unregistered OpenAI model) would silently fall back to regular processing and pay double the price without noticing.
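The trade-off can be made concrete with a small sketch. Both functions are hypothetical and neither is taken from this PR; `supportsFlexFlag` stands in for whatever an LLMFlags-based check would return, with `undefined` modeling a model that was never registered with the flag.

```typescript
type Decision = "send-flex" | "send-standard";

// Hypothetical flag-based guard (the approach this PR deliberately avoids).
// A missing or false flag quietly downgrades the request to standard pricing.
function guardedDecision(enabled: boolean, supportsFlexFlag: boolean | undefined): Decision {
  return enabled && supportsFlexFlag === true ? "send-flex" : "send-standard";
}

// Approach described in this PR: honor the toggle as-is and let the API
// reject unsupported models with an explicit error.
function unguardedDecision(enabled: boolean): Decision {
  return enabled ? "send-flex" : "send-standard";
}

// An unregistered custom model with the toggle on:
const guarded = guardedDecision(true, undefined); // "send-standard", with no visible sign
const unguarded = unguardedDecision(true);        // "send-flex", or a visible API error
```

In the guarded case the user sees a normal response and has no indication that flex pricing was not applied, which is the silent failure the description warns about.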

If you have a better idea, I am open to suggestions!

image

UI

image
image

Footnotes

  1. Modifies the behavior of prompting, requesting, or handling responses from AI models.

  2. Over 80% of the code is AI generated.

@tasoo-oos tasoo-oos changed the title feat(openai): add flex processing toggle feat: add flex processing toggle May 30, 2026
@tasoo-oos tasoo-oos changed the title feat: add flex processing toggle feat: add openai flex processing toggle May 30, 2026
@tasoo-oos tasoo-oos marked this pull request as ready for review May 30, 2026 19:35
@tasoo-oos tasoo-oos marked this pull request as draft May 30, 2026 19:41
@tasoo-oos tasoo-oos marked this pull request as ready for review May 30, 2026 20:17