Ollama DeProxy

A lightweight, feature-rich proxy for Ollama, designed for development, testing, and staging environments. It simplifies access to remote Ollama instances that sit behind another proxy layer. Anthropic- and OpenAI-compatible endpoints are included.

Why Use It?

If you're a developer working locally and need to access a remote Ollama instance that sits behind an application proxy such as OpenWebUI, you may encounter:

Additional authorization requirements
Wrapped or modified HTTP headers
Response compression or transformation
Reverse proxy constraints

Ollama DeProxy provides a clean and simple way to:

Bypass extra authorization layers
Forward requests transparently
Control streaming and decoding behavior
Restore direct API-like access to the upstream Ollama service

It acts as a thin, configurable HTTP bridge between your local tools and the remote Ollama instance.

Features

Transparent Request Forwarding: Acts as a local HTTP server (default port 11434) that forwards all requests to a remote Ollama-compatible API
Authentication Handling: Automatically injects custom authentication headers (JWT, API Keys) to bypass upstream proxy layers
Response Processing: Supports streaming, decompression (Brotli/Gzip), and header filtering
Model Name Correction: Replaces numeric model identifiers with actual model names
Response Caching: Caches responses for specific endpoints with TTL-based eviction
HTTP/2 Support: Supports HTTP/2 for upstream connections
Efficient Decoding: Use DECODE_RESPONSE to choose between automatic decompression (Brotli/Gzip) and raw binary passthrough
Compatible Endpoint Detection: Detects Anthropic- and OpenAI-compatible endpoints and corrects the request path
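
As a rough illustration of the authentication, header-filtering, and decoding features listed above, here is a minimal Python sketch. It is not the project's implementation: the helper names prepare_upstream_headers and decode_body are hypothetical, the Bearer scheme is an assumption (the upstream may expect a different header), and only Gzip is shown because Brotli needs a third-party package.

```python
import gzip

# Headers commonly treated as hop-by-hop; a proxy should not forward them.
HOP_BY_HOP = {
    "connection", "keep-alive", "proxy-authenticate", "proxy-authorization",
    "te", "trailer", "transfer-encoding", "upgrade",
}
# Headers the proxy replaces with its own values (assumption for this sketch).
REPLACED = {"host", "authorization"}


def prepare_upstream_headers(client_headers, auth_token):
    """Drop hop-by-hop headers and inject the upstream credential."""
    headers = {
        name: value
        for name, value in client_headers.items()
        if name.lower() not in HOP_BY_HOP and name.lower() not in REPLACED
    }
    # Assumed scheme: a Bearer token (for example a JWT or an API key).
    headers["Authorization"] = f"Bearer {auth_token}"
    return headers


def decode_body(body, content_encoding, decode_response):
    """Decompress a gzip body when decoding is enabled, else pass it through."""
    if decode_response and content_encoding.lower() == "gzip":
        return gzip.decompress(body)
    return body  # raw binary passthrough
```

Here decode_response plays the role of the DECODE_RESPONSE setting: when it is false, the compressed bytes reach the client untouched.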

Quick Start

UVX

pip install uv
uvx ollama-deproxy -h

UV

pip install uv
uv venv
uv pip install ollama-deproxy
uv run ollama-deproxy -h

PIP

mkdir ollama-deproxy
cd ollama-deproxy
python -m venv venv
venv\Scripts\activate
pip install ollama-deproxy
ollama-deproxy -h
usage: ollama-deproxy [-h] [--remote-url REMOTE_URL] [--remote-auth-token REMOTE_AUTH_TOKEN] [--local-port LOCAL_PORT]
                      [--log-level LOG_LEVEL] [--hash-algorithm HASH_ALGORITHM] [--env_path ENV_PATH] [--version]

Run the Ollama DeProxy application.

options:
  -h, --help            show this help message and exit
  --remote-url REMOTE_URL
                        Override REMOTE_URL environment variable
  --remote-auth-token REMOTE_AUTH_TOKEN
                        Override REMOTE_AUTH_TOKEN environment variable
  --local-port LOCAL_PORT
                        Override local_port environment variable
  --log-level LOG_LEVEL
                        Override log level environment variable, default: INFO
  --hash-algorithm HASH_ALGORITHM
                        Override HASH_ALGORITHM environment variable, default: auto
  --env_path ENV_PATH   Override path to .env file
  --version, -v         Version of the application

Start from repository

Clone the repository:

git clone https://github.com/lexxai/ollama-deproxy.git
cd ollama-deproxy

Configure environment variables:

cp .env.example .env
# Edit `.env` with your configuration

Using Docker Compose

Run the following command in your terminal to start the service:

docker compose up -d

This will launch the container with the specified configuration.

Verifying the Connection

You can monitor the initialization and incoming traffic by checking the service logs:

docker compose logs -f
ollama-deproxy-1  | INFO:     Started server process [1]
ollama-deproxy-1  | INFO:     Waiting for application startup.
ollama-deproxy-1  | INFO:     Application startup complete.
ollama-deproxy-1  | INFO:     Uvicorn running on http://0.0.0.0:11434 (Press CTRL+C to quit)
ollama-deproxy-1  | INFO:     172.21.0.1:60700 - "POST /api/generate HTTP/1.1" 200 OK

Zero-Auth Local Access

Once the container is active, your local applications can communicate with the remote Ollama instance via:

Local Address: http://localhost:11434

Security: The proxy handles all necessary authentication headers upstream, allowing your local tools to connect seamlessly without managing API keys or complex auth logic.
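
As a minimal sketch of what this looks like from a local tool, the Python snippet below queries the proxy without any credentials. It assumes the proxy is listening on the default address and that the upstream returns the standard Ollama /api/tags payload; build_tags_request and list_models are illustrative names, not part of the project.

```python
import json
import urllib.request


def build_tags_request(base_url="http://localhost:11434"):
    """Build a plain GET request for the model list; no auth header is set."""
    url = f"{base_url.rstrip('/')}/api/tags"
    return urllib.request.Request(url, method="GET")


def list_models(base_url="http://localhost:11434"):
    """Return model names reported by the proxy (requires a running proxy)."""
    with urllib.request.urlopen(build_tags_request(base_url), timeout=10) as resp:
        payload = json.load(resp)
    return [model["name"] for model in payload.get("models", [])]


if __name__ == "__main__":
    # Only runs when executed directly, since it needs a live proxy.
    print(list_models())
```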

Installation

Clone the repository:

git clone https://github.com/lexxai/ollama-deproxy.git
cd ollama-deproxy

Option 1 - Using `uv` (recommended)

uv is a blazing-fast Python package installer and resolver, written in Rust.

Install uv (if not already installed):

pip install uv
# or
curl -LsSf https://astral.sh/uv/install.sh | sh

Set up and sync the environment:

uv venv
uv sync

Configure environment variables:

cp .env.example .env
# Edit `.env` with your configuration

Run the server:

uv run -m src.ollama_deproxy.main

Option 2 - Using `pip` (fallback)

If you prefer pip, or uv is unavailable:

Windows

python -m venv .venv && .venv\Scripts\activate

macOS / Linux

python -m venv .venv && source .venv/bin/activate

Install dependencies:

pip install -r requirements.txt

Configure .env:

cp .env.example .env
# Edit `.env` as needed

Run the server:

python -m src.ollama_deproxy.main

If installed as a wheel:

ollama-deproxy

Build as a Package

Build and install as a distributable package:

UV

uv build
# Outputs:
# Successfully built dist/ollama_deproxy-x.y.z.tar.gz
# Successfully built dist/ollama_deproxy-x.y.z-py3-none-any.whl

PIP

Build and install in editable mode (long output shown):

python -m venv .venv
source .venv/bin/activate # or .\.venv\Scripts\activate
pip install -e .
Obtaining file:///C:/.../ollama-deproxy
  Installing build dependencies ... done
  Checking if build backend supports build_editable ... done
  Getting requirements to build editable ... done
  Installing backend dependencies ... done
  Preparing editable metadata (pyproject.toml) ... done
Collecting cachetools>=7.0.2 (from ollama-deproxy==0.4.1)
  Using cached cachetools-7.0.5-py3-none-any.whl.metadata (5.6 kB)
Collecting fastapi>=0.135.1 (from ollama-deproxy==0.4.1)
  Using cached fastapi-0.135.1-py3-none-any.whl.metadata (30 kB)
Collecting httpx>=0.28.1 (from httpx[brotli,http2,zstd]>=0.28.1->ollama-deproxy==0.4.1)
  Using cached httpx-0.28.1-py3-none-any.whl.metadata (7.1 kB)
...
Building wheels for collected packages: ollama-deproxy
  Building editable for ollama-deproxy (pyproject.toml) ... done
  Created wheel for ollama-deproxy: filename=ollama_deproxy-0.4.1-py3-none-any.whl size=2640 sha256=a896df60372b3a000cd802335e23a405b0c21ce96c66c8994a139309ea8c0c56
  Stored in directory: ...\Temp\pip-ephem-wheel-cache-4tfkacrk\wheels\4e\77\b5\f2d22f84a99bda20761e769c4abe4d2465331adcc1a67f21a4
Successfully built ollama-deproxy
Installing collected packages: brotli, zstandard, websockets, typing-extensions, types-cachetools, pyyaml, python-multipart, python-dotenv, idna, hyperframe, httptools, hpack, h11, colorama, certifi, cachetools, annotated-types, annotated-doc, typing-inspection, pydantic-core, httpcore, h2, click, anyio, watchfiles, uvicorn, starlette, pydantic, httpx, fastapi, ollama-deproxy
Successfully installed annotated-doc-0.0.4 annotated-types-0.7.0 anyio-4.12.1 brotli-1.2.0 cachetools-7.0.5 certifi-2026.2.25 click-8.3.1 colorama-0.4.6 fastapi-0.135.1 h11-0.16.0 h2-4.3.0 hpack-4.1.0 httpcore-1.0.9 httptools-0.7.1 httpx-0.28.1 hyperframe-6.1.0 idna-3.11 ollama-deproxy-0.4.1 pydantic-2.12.5 pydantic-core-2.41.5 python-dotenv-1.2.2 python-multipart-0.0.22 pyyaml-6.0.3 starlette-0.52.1 types-cachetools-6.2.0.20251022 typing-extensions-4.15.0 typing-inspection-0.4.2 uvicorn-0.41.0 watchfiles-1.1.1 websockets-16.0 zstandard-0.25.0

Then run the CLI directly:

UV

uv run --no-dev ollama-deproxy

PIP

ollama-deproxy

Expected output:

ollama-deproxy --log-level DEBUG

============================================================
🚀 Ollama DeProxy Server vx.y.z
============================================================

2026-03-13 17:58:29 DEBUG:    Starting Ollama DeProxy with DEBUG logging... DEBUG_REQUEST=False,CACHE_ENABLED=True 
2026-03-13 17:58:30 INFO:     Started server process [46908]
2026-03-13 17:58:30 INFO:     Waiting for application startup.
2026-03-13 17:58:30 INFO:     Cache key hash algorithm selected: blake2b
2026-03-13 17:58:30 INFO:     Application startup complete.
2026-03-13 17:58:30 INFO:     Uvicorn running on http://0.0.0.0:11434 (Press CTRL+C to quit)
2026-03-13 17:58:41 DEBUG:    *** Finished response for /ollama/api/tags in 00:00.6
2026-03-13 17:58:41 DEBUG:    Cache set for key: ollama/api/tags:get...
2026-03-13 17:58:41 INFO:     127.0.0.1:37460 - "GET /api/tags HTTP/1.1" 200

Environment Configuration

Response Caching

The proxy includes a built-in cache that improves performance for frequently accessed endpoints. It is controlled by these environment variables:

CACHE_ENABLED
CACHE_MAXSIZE
CACHE_TTL

HASH_ALGORITHM: Selects the hash algorithm used to generate cache keys. With the default value (auto), the proxy detects at startup which algorithm is optimal for your platform and architecture:

uv run ollama-deproxy
Ollama DeProxy vx.y.z
INFO:     Started server process [29256]
INFO:     Waiting for application startup.
INFO:ollama_deproxy.best_hash:Cache key hash algorithm auto-selection...
INFO:ollama_deproxy.cache_base:Cache key hash algorithm auto-selection complete. Can store it on .env file 'HASH_ALGORITHM=blake2b' for skip autodetection next time.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:11434 (Press CTRL+C to quit)
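
One plausible way to picture the auto-selection is a short benchmark over the algorithms hashlib offers. The sketch below assumes that "optimal" means "fastest on this machine", which the README does not spell out; pick_fastest_hash and its candidate list are hypothetical and do not reflect the project's actual code.

```python
import hashlib
import time


def pick_fastest_hash(
    candidates=("blake2b", "blake2s", "sha256", "sha512"),
    payload=b"x" * 4096,
    rounds=200,
):
    """Time each available algorithm on a sample payload and return the fastest."""
    best_name, best_time = None, float("inf")
    for name in candidates:
        if name not in hashlib.algorithms_available:
            continue  # skip algorithms this Python build does not provide
        start = time.perf_counter()
        for _ in range(rounds):
            hashlib.new(name, payload).hexdigest()
        elapsed = time.perf_counter() - start
        if elapsed < best_time:
            best_name, best_time = name, elapsed
    if best_name is None:
        raise ValueError("no candidate hash algorithm is available")
    return best_name
```

Writing the result to .env (for example HASH_ALGORITHM=blake2b, as the log message suggests) skips this measurement on the next start.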

Cached Endpoints:

/api/tags - Model list
/api/models - Model information
/api/show - Model details

Benefits:

Reduces latency for repeated requests
Decreases load on remote Ollama instances
Improves response times for model metadata queries
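
To make TTL-based eviction concrete, here is a small standard-library sketch of a bounded cache with per-entry expiry. The project itself depends on the cachetools package, so this is only an illustration of the idea; the TTLCache class below is hypothetical, and its maxsize and ttl arguments merely mirror the roles of CACHE_MAXSIZE and CACHE_TTL.

```python
import time
from collections import OrderedDict


class TTLCache:
    """Bounded cache whose entries expire ttl seconds after being stored."""

    def __init__(self, maxsize, ttl, clock=time.monotonic):
        self.maxsize = maxsize
        self.ttl = ttl
        self.clock = clock  # injectable so expiry can be tested without sleeping
        self._items = OrderedDict()  # key -> (expires_at, value)

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        expires_at, value = entry
        if self.clock() >= expires_at:
            del self._items[key]  # expired: evict and report a miss
            return None
        return value

    def set(self, key, value):
        self._items.pop(key, None)  # re-inserting moves the key to the end
        self._items[key] = (self.clock() + self.ttl, value)
        while len(self._items) > self.maxsize:
            self._items.popitem(last=False)  # drop the oldest entry
```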

Error Logging & Diagnostics

When the remote server returns an error (HTTP 400+), the proxy interrupts the stream to capture the full context. This allows you to see exactly why the upstream rejected your request.

Example Failure: If you query a model that doesn't exist on the remote host:

ERROR:ollama_deproxy.handlers:Remote Error [400] on https://openwebui.example.com/ollama/api/show {"name":"qwen2.5-coder:1.5b-base1"} {"detail":"Model 'qwen2.5-coder:1.5b-base1' was not found"}

Where:

Sent Body: {"name":"qwen2.5-coder:1.5b-base1"}
Recv Body: {"detail":"Model 'qwen2.5-coder:1.5b-base1' was not found"}
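
The layout of that log line can be sketched as a small formatting helper. This is an assumption-based illustration that only reproduces the message format shown above; format_remote_error is a hypothetical name, and the real handler may capture the bodies differently.

```python
from typing import Optional


def format_remote_error(
    status: int, url: str, sent_body: str, recv_body: str
) -> Optional[str]:
    """Return a diagnostic line for upstream errors, or None for success codes."""
    if status < 400:
        return None  # nothing to report for 1xx-3xx responses
    # Layout: status, upstream URL, request body sent, response body received.
    return f"Remote Error [{status}] on {url} {sent_body} {recv_body}"
```

A caller would pass the result to a logger at ERROR level whenever it is not None.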

Example Debug Log:

LOG_LEVEL=DEBUG

Ollama DeProxy vx.y.z
2026-03-13 15:34:08 DEBUG:    Starting Ollama DeProxy with DEBUG logging... DEBUG_REQUEST=False,CACHE_ENABLED=True
2026-03-13 15:34:08 INFO:     Started server process [43460]
2026-03-13 15:34:08 INFO:     Waiting for application startup.
2026-03-13 15:34:08 INFO:     Cache key hash algorithm selected: blake2b
2026-03-13 15:34:08 INFO:     Application startup complete.
2026-03-13 15:34:08 INFO:     Uvicorn running on http://0.0.0.0:11434 (Press CTRL+C to quit)
2026-03-13 15:34:57 DEBUG:    *** Finished response for /ollama/api/tags in 00:00.6
2026-03-13 15:34:57 DEBUG:    Cache set for key: ollama/api/tags:get...
2026-03-13 15:34:57 INFO:     127.0.0.1:8327 - "GET /api/tags HTTP/1.1" 200
2026-03-13 15:35:37 DEBUG:    Proxying request corrected to 'api/v1/messages' for Anthropic compatibility
2026-03-13 15:35:37 DEBUG:    *** Handling request for path: /api/v1/messages
2026-03-13 15:36:21 INFO:     127.0.0.1:61399 - "POST /v1/messages?beta=true HTTP/1.1" 200
2026-03-13 15:36:23 DEBUG:    *** Finished up stream for /api/v1/messages in 00:46.1
2026-03-13 15:36:37 DEBUG:    Cache hit for key: ollama/api/tags:get...
2026-03-13 15:36:37 INFO:     127.0.0.1:61402 - "GET /api/tags HTTP/1.1" 200
2026-03-13 15:38:25 DEBUG:    Cache hit for key: ollama/api/tags:get...
2026-03-13 15:38:25 INFO:     127.0.0.1:61408 - "GET /api/tags HTTP/1.1" 200
2026-03-13 15:38:26 DEBUG:    Cache hit for key: ollama/api/tags:get...
2026-03-13 15:38:26 INFO:     127.0.0.1:61411 - "GET /api/tags HTTP/1.1" 200
2026-03-13 15:39:03 DEBUG:    Proxying request corrected to 'ollama/v1/chat/completions' for OpenAI compatibility
2026-03-13 15:39:03 DEBUG:    *** Handling request for path: /ollama/v1/chat/completions
2026-03-13 15:39:13 INFO:     127.0.0.1:61414 - "POST /chat/completions HTTP/1.1" 200
2026-03-13 15:39:13 DEBUG:    *** Finished up stream for /ollama/v1/chat/completions in 00:09.3
2026-03-13 15:41:18 INFO:     Shutting down
2026-03-13 15:41:18 INFO:     Waiting for application shutdown.
2026-03-13 15:41:18 INFO:     Application shutdown complete.
2026-03-13 15:41:18 INFO:     Finished server process [43460]


Sleeping for 10 sec before restarting server. Press Ctrl+C to exit.
Restarting server...
2026-03-13 15:41:28 DEBUG:    Using proactor: IocpProactor
2026-03-13 15:41:28 INFO:     Started server process [43460]
2026-03-13 15:41:28 INFO:     Waiting for application startup.
2026-03-13 15:41:28 INFO:     Cache key hash algorithm selected: blake2b
2026-03-13 15:41:28 INFO:     Application startup complete.
2026-03-13 15:41:28 INFO:     Uvicorn running on http://0.0.0.0:11434 (Press CTRL+C to quit)
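
The path corrections visible in the debug log above can be pictured as a small rewrite step. The rules in this sketch are inferred only from those log lines (for example, /v1/messages becoming api/v1/messages); correct_path is a hypothetical helper, the ollama prefix is assumed to come from the remote URL, and the real detection logic may differ.

```python
def correct_path(path: str) -> str:
    """Map an incoming local path to the upstream path seen in the debug log."""
    clean = path.lstrip("/")
    if clean.startswith("v1/messages"):
        # Anthropic-style request: the log shows it routed under api/.
        return "api/" + clean
    if clean.startswith("chat/completions"):
        # OpenAI-style request without the v1 prefix: routed under ollama/v1/.
        return "ollama/v1/" + clean
    # Native Ollama paths such as api/tags: routed under ollama/.
    return "ollama/" + clean
```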

CLI Usage

In CLI mode, the ollama-deproxy command starts the server. Command-line options can override some environment variables.

uv run ollama-deproxy --help           
usage: ollama-deproxy [-h] [--remote-url REMOTE_URL] [--remote-auth-token REMOTE_AUTH_TOKEN]
                      [--local-port LOCAL_PORT] [--log-level LOG_LEVEL] [--env_path ENV_PATH] [--version]

Run the Ollama DeProxy application.

options:
  -h, --help            show this help message and exit
  --remote-url REMOTE_URL
                        Override REMOTE_URL environment variable
  --remote-auth-token REMOTE_AUTH_TOKEN
                        Override REMOTE_AUTH_TOKEN environment variable
  --local-port LOCAL_PORT
                        Override local_port environment variable
  --log-level LOG_LEVEL
                        Override log level environment variable
  --env_path ENV_PATH   Override path to .env file
  --version, -v         Version of the application
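
The override behavior amounts to a simple precedence rule: a command-line option wins, otherwise the environment variable applies, otherwise a default. The sketch below shows that rule with argparse for three of the options; resolve_settings is a hypothetical helper, and the LOCAL_PORT and LOG_LEVEL variable names are assumptions based on the help text rather than confirmed names.

```python
import argparse
import os


def resolve_settings(argv, env=None):
    """Resolve settings with CLI options taking precedence over the environment."""
    env = os.environ if env is None else env
    parser = argparse.ArgumentParser(prog="ollama-deproxy")
    parser.add_argument("--remote-url")
    parser.add_argument("--local-port", type=int)
    parser.add_argument("--log-level")
    args = parser.parse_args(argv)

    def pick(cli_value, env_name, default=None):
        # An option left off the command line is None, so fall back to env.
        return cli_value if cli_value is not None else env.get(env_name, default)

    return {
        "remote_url": pick(args.remote_url, "REMOTE_URL"),
        "local_port": int(pick(args.local_port, "LOCAL_PORT", "11434")),
        "log_level": str(pick(args.log_level, "LOG_LEVEL", "INFO")).upper(),
    }
```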

Reference

License

MIT License — see the LICENSE file for details.
