
Commit 6bdf124 — Merge pull request #3 from mhmmadazis/main

feat: token optimization

2 parents: 11c241f + 5da6597

16 files changed: 821 additions & 97 deletions


CHANGELOG.md

Lines changed: 34 additions & 0 deletions
@@ -4,6 +4,40 @@ All notable changes to enowX Coder are documented here.

---

## [0.2.6] — 2026-04-25
### Token Optimization — 99% Reduction for Anthropic-format Gateways
- **Prompt caching now works for custom providers**: Added `api_format` field (`openai` | `anthropic`) to providers — custom gateways using Anthropic Messages API now get prompt caching, reducing prompt tokens from ~11,700 to ~0 on cache hits
- **Chat history sliding window**: Chat path now applies the same context trimming as agent path — max 20 message pairs, 32K char budget, per-message truncation, `html:preview` blocks stripped. Prevents token bloat on long sessions
- **`uses_anthropic_format()` method**: Centralized routing logic replaces scattered `provider_type == "anthropic" || provider_type == "enowxlabs"` checks across chat service and agent runner
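A minimal sketch of what such a centralized check could look like — the struct, field, and variant names here are illustrative assumptions, not the project's actual types:

```rust
// Hypothetical sketch of the centralized routing check described above.
// `Provider`, `ApiFormat`, and the field names are assumptions.

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ApiFormat {
    OpenAi,
    Anthropic,
}

pub struct Provider {
    pub provider_type: String,
    pub api_format: ApiFormat,
}

impl Provider {
    /// Built-in Anthropic-style providers always use the Messages API;
    /// custom gateways opt in via the new `api_format` field.
    pub fn uses_anthropic_format(&self) -> bool {
        matches!(self.provider_type.as_str(), "anthropic" | "enowxlabs")
            || self.api_format == ApiFormat::Anthropic
    }
}
```

One method call now replaces every scattered slug comparison, and adding another Anthropic-compatible gateway only requires setting `api_format`.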
### Gateway SSE Compatibility Fix
- **Event-line fallback for SSE parsing**: Some Anthropic-compatible gateways omit the `"type"` field from SSE data payloads. Parser now tracks the preceding `event:` line and uses it as fallback — fixes empty responses from proxies like LiteLLM, Claude Desktop gateway, and enowX Labs gateway
- Applied to both chat SSE parser (`chat_service.rs`) and agent tool SSE parser (`runner.rs`)
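The fallback described above can be sketched roughly as follows — the function name is hypothetical, and a real parser would use proper JSON deserialization rather than substring scanning:

```rust
/// Resolve the effective event type for an SSE `data:` payload.
/// If the JSON body carries a `"type"` field, use it; otherwise fall
/// back to the most recent `event:` line seen on the stream.
/// (Illustrative sketch; the real code would use serde_json.)
fn resolve_event_type(data_json: &str, last_event_line: Option<&str>) -> Option<String> {
    if let Some(idx) = data_json.find("\"type\"") {
        // Naive extraction of the quoted value after `"type":`.
        let rest = &data_json[idx + "\"type\"".len()..];
        let start = rest.find('"')? + 1;
        let end = rest[start..].find('"')? + start;
        return Some(rest[start..end].to_string());
    }
    last_event_line.map(|e| e.trim().to_string())
}
```

Gateways that emit `event: content_block_delta` followed by a typeless `data:` payload are thus handled identically to spec-compliant ones.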
### Non-Streaming Fallback for Unsupported Models
- **Auto-retry without streaming**: When a gateway returns an empty stream (message_start → message_stop with no content blocks), the request is automatically retried with `stream: false` and the full response is parsed synchronously
- Fixes blank responses for models where the gateway doesn't support streaming (e.g. `claude-opus-4.6` on certain proxies)
- Applied to both chat path and agent path (with tool call support)
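The empty-stream detection could look something like this — the helper is an illustrative sketch using Anthropic's SSE event vocabulary, not the project's actual code:

```rust
/// Detect the "empty stream" case: the gateway emitted message_start
/// and message_stop but no content blocks in between. When this
/// returns true, the caller would retry the request with
/// `stream: false` and parse the full response synchronously.
fn stream_was_empty(event_types: &[&str]) -> bool {
    let saw_start = event_types.contains(&"message_start");
    let saw_stop = event_types.contains(&"message_stop");
    let saw_content = event_types.iter().any(|e| e.starts_with("content_block"));
    saw_start && saw_stop && !saw_content
}
```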
### Endpoint Resolution Fix for Custom Gateways
- **Preserve `/v1` path for custom providers**: Previously, all non-Anthropic providers had `/v1` stripped from their base URL when building the Anthropic endpoint, resulting in `host/messages` instead of `host/v1/messages`. Now only the built-in `enowxlabs` provider strips `/v1`; custom gateways keep their full path
- Fixed in chat service, title generation, and agent runner
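The corrected resolution logic amounts to making the `/v1` strip conditional on the provider type — sketched below with assumed function and argument names:

```rust
/// Build the Messages endpoint from a provider's base URL.
/// Only the built-in `enowxlabs` provider strips a trailing `/v1`;
/// custom gateways keep their configured path intact, so a base URL
/// of `https://gw.example.com/v1` resolves to `.../v1/messages`.
fn messages_endpoint(base_url: &str, provider_type: &str) -> String {
    let mut base = base_url.trim_end_matches('/');
    if provider_type == "enowxlabs" {
        base = base.strip_suffix("/v1").unwrap_or(base);
    }
    format!("{base}/messages")
}
```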
### Model Listing for Custom Providers
- **Custom providers can now list models**: Previously, unknown `provider_type` slugs (e.g. user-created `"my-gateway"`) returned "Unknown provider type" error. Now routes by `api_format` — Anthropic-format providers hit `{base_url}/models` with correct auth headers
- `fetch_anthropic_models` now accepts a configurable base URL and auth scheme instead of hardcoding `api.anthropic.com`
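Building the listing request from a configurable base URL and auth scheme might be sketched as follows — the `AuthScheme` enum and `models_request` helper are assumptions for illustration, not the actual `fetch_anthropic_models` signature:

```rust
/// Sketch: build the model-listing URL plus the matching auth header
/// for a configurable gateway, instead of hardcoding api.anthropic.com.
enum AuthScheme {
    XApiKey, // Anthropic-style: `x-api-key: <key>`
    Bearer,  // OpenAI-style: `Authorization: Bearer <key>`
}

fn models_request(base_url: &str, api_key: &str, auth: AuthScheme) -> (String, (String, String)) {
    let url = format!("{}/models", base_url.trim_end_matches('/'));
    let header = match auth {
        AuthScheme::XApiKey => ("x-api-key".to_string(), api_key.to_string()),
        AuthScheme::Bearer => ("authorization".to_string(), format!("Bearer {api_key}")),
    };
    (url, header)
}
```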
### Provider Settings UI
- **API Format selector**: New toggle (OpenAI / Anthropic) in Settings for custom providers — choose Anthropic for Claude-compatible gateways to enable prompt caching and correct message serialization
- Selector shown in both "Add Provider" form and existing provider detail panel
- Built-in providers (`enowxlabs`, `anthropic`) auto-set to Anthropic format
### Database
- **Migration `20260424000_provider_api_format.sql`**: Adds `api_format TEXT NOT NULL DEFAULT 'openai'` column to providers table. Existing `anthropic` and `enowxlabs` providers auto-updated to `'anthropic'` format
---
## [0.2.5] — 2026-04-23

### Excalidraw Canvas — Collaborative Whiteboard

README.md

Lines changed: 18 additions & 0 deletions
@@ -75,6 +75,24 @@

![Settings](screenshots/Providers.png)

### Token Optimization — Before & After
Prompt caching with Anthropic-format routing reduces token usage by **99.87%** on repeated requests.
| | Before | After |
|---|---|---|
| **Total Tokens** | 19,819 | 26 |
| **Prompt Tokens** | 19,779 | 0 (cache hit) |
| **Completion Tokens** | 40 | 26 |
**Before** — Every request sends the full system prompt (~19K prompt tokens):
![Before Optimization](screenshots/before.png)
**After** — Prompt caching enabled via Anthropic Messages API format (0 prompt tokens on cache hit):
![After Optimization](screenshots/after.png)
---

## 🚀 Installation

screenshots/after.png (103 KB)

screenshots/before.png (130 KB)
20260424000_provider_api_format.sql

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
-- Add api_format column to providers.
-- Values: 'openai' (default) or 'anthropic'.
-- This lets custom/gateway providers opt into the Anthropic message format
-- which enables prompt caching and correct content-block serialisation.
ALTER TABLE providers ADD COLUMN api_format TEXT NOT NULL DEFAULT 'openai';

-- Built-in providers that already use Anthropic format
UPDATE providers SET api_format = 'anthropic' WHERE provider_type IN ('anthropic', 'enowxlabs');
