=== patch_openai.py ===
(You just need to download the Python file; you don't have to 'git' or do anything fancy if you don't know how: just click on its name here, use the 'Raw' button and save the displayed file.)
This code (in the Python file) monkey-patches an OpenAI client, adding a responses.create() method that converts calls into chat.completions.create() calls, so that LLMs that don't yet support the responses endpoint will still work.
(So it does GATHER -> DESPATCH -> SCATTER, in effect.) You use a single line of Python code (well, two if you count the import) to add it, and if and when llama.cpp adds the responses endpoint you can just comment it out so that OpenAI's own module code will be used.
The single line is: monkey_patch_responses_api(llm_client).
It supports all the calls and parameters... well, you can pass them all; I've not tested them all, and I only use a few myself. It's rough and ready but should be useful here and there where there's no support for 'responses'.
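To give a feel for what the patch does internally, here is a minimal sketch of the GATHER -> DESPATCH -> SCATTER idea. This is my own illustration, not the actual code in patch_openai.py (which handles far more parameters and return fields); the names inside `create` are assumptions for the sketch:

```python
from types import SimpleNamespace

def monkey_patch_responses_api(client, role_name_swaps=None):
    """Simplified sketch: bolt a responses.create() onto a chat-only client."""
    role_name_swaps = role_name_swaps or {}

    def create(model, input, **kwargs):
        # GATHER: normalise the responses-style 'input' into chat-style messages.
        if isinstance(input, str):
            messages = [{"role": "user", "content": input}]
        else:
            messages = [{**m, "role": role_name_swaps.get(m["role"], m["role"])}
                        for m in input]
        # DESPATCH: forward the call to the endpoint the server actually supports.
        chat = client.chat.completions.create(model=model, messages=messages, **kwargs)
        # SCATTER: repackage the reply under the attribute name responses.create() uses.
        return SimpleNamespace(output_text=chat.choices[0].message.content)

    client.responses = SimpleNamespace(create=create)
    return client
```

The real file does the same dance but covers more of the parameter surface; this sketch just shows the shape of the trick.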
I use it like this:
from openai import OpenAI
from patch_openai import monkey_patch_responses_api
llm_client = OpenAI(base_url="http://example.com:8080", api_key=LLAMA_API_KEY)
monkey_patch_responses_api(llm_client)
Now llm_client has a working 'responses.create' function. For example:
response = llm_client.responses.create(model=LLAMA_MODEL, temperature=2, input="In one sentence, tell me about Stan Laurel")
print(response.output_text)
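As well as a plain string, the real responses API accepts a list of messages as 'input', so any patch like this has to normalise both forms into the messages list that chat.completions.create() wants. A sketch of that conversion (normalise_input is an illustrative name of mine, not one from patch_openai.py):

```python
def normalise_input(input):
    """Turn a responses-style 'input' (string or message list) into chat messages."""
    if isinstance(input, str):
        # A bare string becomes a single user message.
        return [{"role": "user", "content": input}]
    # A list of message dicts is already chat-shaped.
    return list(input)

print(normalise_input("In one sentence, tell me about Stan Laurel"))
# -> [{'role': 'user', 'content': 'In one sentence, tell me about Stan Laurel'}]
```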
You can also have it automatically swap role-names if you pass them in a dictionary, thus:
monkey_patch_responses_api(client=llm_client, role_name_swaps={'developer': 'system'})
This reads as "If you encounter 'developer' as a role-name, swap the role-name to 'system'": my llama.cpp doesn't like 'developer' so it's useful for me.
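To illustrate the effect of the swap outside the patch, here's what that mapping does to a message list (this is just one plausible way to implement it; the actual code in patch_openai.py may differ):

```python
role_name_swaps = {'developer': 'system'}
messages = [{'role': 'developer', 'content': 'Answer briefly.'},
            {'role': 'user', 'content': 'Hello'}]

# Any role found in the dictionary gets replaced; others pass through untouched.
swapped = [{**m, 'role': role_name_swaps.get(m['role'], m['role'])}
           for m in messages]

print([m['role'] for m in swapped])
# -> ['system', 'user']
```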
Remember: it's a fake endpoint, so you are not really talking to responses.create() but to chat.completions.create(). It should mostly work. Mostly. Availability of some things will depend upon the model (some don't support tool use, for example). I tend to use Qwen2.5 1.5B Instruct (https://huggingface.co/Qwen/Qwen2.5-1.5B) which does (and performs quite well on older kit).
The original code was created by Claude, but I've modified it to work better. Well, to work at all, but it saved a lot of typing initially!