Implement local compatibility APIs#59

Open
CobraSoftware wants to merge 3 commits into jjang-ai:main from CobraSoftware:api-work

@CobraSoftware CobraSoftware commented Apr 9, 2026

Overview

This PR is designed to add more API endpoints for broader compatibility.
It was heavily assisted by AI.

Summary

This PR expands the local compatibility surface for the MLX-backed server while keeping inference local-only.

Main additions:

  • OpenAI compatibility improvements, including broader response/resource coverage and realtime session support
  • Anthropic and Ollama compatibility validation
  • LM Studio native API support under /lmstudio/v1/*
  • Deepgram-compatible local APIs under /deepgram/v1/*
  • Model-family detection for Qwen3 Omni and Voxtral realtime variants
  • Additional local implementations for practical OpenAI-style resource APIs such as files and persisted responses retrieval

What Changed

  • Added LM Studio endpoints:

    • /lmstudio/v1/models
    • /lmstudio/v1/models/load
    • /lmstudio/v1/models/unload
    • /lmstudio/v1/models/download
    • /lmstudio/v1/models/download/status
    • /lmstudio/v1/chat
  • Added Deepgram endpoints:

    • /deepgram/v1/listen
    • /deepgram/v1/speak
    • /deepgram/v1/read
    • /deepgram/v1/models
    • /deepgram/v1/models/{model_id}
  • Expanded OpenAI-compatible surface:

    • Realtime session endpoints
    • Realtime client secrets
    • Realtime transcription sessions
    • Local files API
    • Local responses/{id} retrieval, delete, and input item listing
    • Broader spec alias coverage
    • Local-only 501 not_implemented_local fallback responses for remaining unsupported cloud paths
  • Improved auth compatibility:

    • Authorization: Bearer <key>
    • Authorization: Token <key>
    • x-api-key
    • api-key
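
To illustrate the auth compatibility item above, here is a minimal sketch of how a server could accept all four header forms. It is not taken from this PR's code: the function name extract_api_key is hypothetical, and the precedence (Authorization first, then x-api-key, then api-key) is an assumption.

```python
from typing import Mapping, Optional


def extract_api_key(headers: Mapping[str, str]) -> Optional[str]:
    """Return the API key from any of the four accepted header forms, or None.

    Hypothetical helper for illustration; the precedence order is assumed.
    """
    # HTTP header names are case-insensitive, so normalize before lookup.
    normalized = {name.lower(): value.strip() for name, value in headers.items()}

    # "Authorization: Bearer <key>" and "Authorization: Token <key>".
    authorization = normalized.get("authorization", "")
    scheme, _, credential = authorization.partition(" ")
    if scheme.lower() in ("bearer", "token") and credential.strip():
        return credential.strip()

    # Bare-key headers: "x-api-key" and "api-key".
    for name in ("x-api-key", "api-key"):
        if normalized.get(name):
            return normalized[name]

    return None
```

A caller would pass the request's header mapping and compare the returned key against the configured one. Any other Authorization scheme (for example Basic) is ignored in this sketch and falls through to the bare-key headers.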

Testing

I tested compilation with python3 -m py_compile vmlx_engine/server.py tests/test_deepgram_api.py tests/test_lmstudio_api.py tests/test_openai_spec_surface.py tests/test_realtime_compat.py and got all tests to pass, but I lack the hardware to do larger, in-depth real-world testing with bigger models.
