diff --git a/README.md b/README.md
index aa550a3..5199025 100644
--- a/README.md
+++ b/README.md
@@ -1,61 +1,54 @@
 # Gonka OpenAI Proxy
-OpenAI-compatible API proxy for Gonka that provides ChatGPT-like interface with API key authentication.
-
-๐ **Try it here:**
-๐ https://gonka-gateway.mingles.ai/
+OpenAI-compatible API proxy for Gonka that provides a ChatGPT-like interface with API key authentication. Self-hosted, no database required; configured entirely via environment variables.
 ## Features
-- **OpenAI-compatible API**: Compatible with OpenAI Python SDK and other OpenAI-compatible clients
-- **API Key Authentication**: Secure access using API keys (like ChatGPT API)
+- **OpenAI-compatible API**: Drop-in replacement for the OpenAI Python SDK and other OpenAI-compatible clients
+- **API Key Authentication**: Secure access using configurable API keys
 - **Streaming Support**: Supports both streaming and non-streaming responses
+- **Tool Emulation**: Automatic prompt-based tool call emulation for models that don't support native tool calling
+- **Circuit Breaker**: Prevents cascading failures when the Gonka backend is degraded
+- **Retry with Backoff**: Automatic retry with exponential backoff on transient errors
 - **Web Interface**: Built-in web chat interface for testing
-- **Automatic Model Loading**: Loads available models from Gonka API on startup
 - **Docker Support**: Ready-to-use Docker container
-## Configuration
+## Quick Start
-Copy `.env.example` to `.env` and configure the following variables:
+### Running Locally
+1. 
Clone the repository: ```bash -# Gonka API Configuration -GONKA_PRIVATE_KEY=your_hex_private_key_here -GONKA_ADDRESS=your_gonka_address_bech32 -GONKA_ENDPOINT=https://host:port/v1 -GONKA_PROVIDER_ADDRESS=provider_gonka_address_bech32 - -# API Key for external access (like ChatGPT API) -API_KEY=sk-your-secret-api-key-here - -# Server Configuration (optional) -HOST=0.0.0.0 -PORT=8000 +git clone https://github.com/MinglesAI/gonka-proxy.git +cd gonka-proxy ``` -### Configuration Details - -#### GONKA_PROVIDER_ADDRESS - -**What is it?** `GONKA_PROVIDER_ADDRESS` is the provider (host) address in the Gonka network in bech32 format. It is used to sign requests to the Gonka API. - -**Where to get it?** - -1. **From provider documentation**: If you are using a specific Gonka provider, their address should be specified in their documentation or provider page. - -2. **From endpoint metadata**: The provider address is usually associated with the endpoint (`GONKA_ENDPOINT`). The provider should specify their Gonka address in the documentation or during registration. +2. Install dependencies: +```bash +pip install -r requirements.txt +``` -3. **Via Gonka Dashboard**: If you have access to the Gonka Dashboard, the provider address can be found in your connection information or node settings. +3. Create a `.env` file (see [Environment Variables](#environment-variables)): +```bash +cp .env.example .env +# Edit .env with your values +``` -4. **Contact the provider**: If you are using a public Gonka endpoint, contact the endpoint owner or Gonka support to get the provider address. +4. Run the server: +```bash +python -m app.main +``` -**Example**: The address usually looks like `gonka1...` (bech32 format), e.g., `gonka1abc123def456...` +Or with uvicorn directly: +```bash +uvicorn app.main:app --host 0.0.0.0 --port 8000 --reload +``` -**Important**: The provider address is used in the cryptographic signature of each request, so it must be correct for successful authentication. +5. 
Open `http://localhost:8000/` in your browser to use the web chat interface.
-## Running with Docker
+### Running with Docker
-1. Build the Docker image:
+1. Build the image:
```bash
docker build -t gonka-proxy .
```
@@ -65,36 +58,80 @@ docker build -t gonka-proxy .
 docker run -d \
   --name gonka-proxy \
   -p 8000:8000 \
-  --env-file .env \
+  -e GONKA_PRIVATE_KEY=your_hex_private_key \
+  -e GONKA_ADDRESS=your_gonka_address_bech32 \
+  -e GONKA_ENDPOINT=https://host:port/v1 \
+  -e GONKA_PROVIDER_ADDRESS=provider_gonka_address_bech32 \
+  -e API_KEY=sk-your-secret-api-key \
   gonka-proxy
 ```
-## Running Locally
-
-1. Install dependencies:
+Or with a `.env` file:
 ```bash
-pip install -r requirements.txt
+docker run -d \
+  --name gonka-proxy \
+  -p 8000:8000 \
+  --env-file .env \
+  gonka-proxy
 ```
-2. Set environment variables or create `.env` file
+## Environment Variables
-3. Run the server:
-```bash
-python -m app.main
+| Variable | Required | Default | Description |
+|---|---|---|---|
+| `GONKA_PRIVATE_KEY` | Yes | (none) | Your ECDSA private key in hex format (with or without `0x` prefix) |
+| `GONKA_ADDRESS` | Yes | (none) | Your Gonka address in bech32 format (e.g. `gonka1abc...`) |
+| `GONKA_ENDPOINT` | Yes | (none) | Gonka API base URL (e.g. `https://host:port/v1`) |
+| `GONKA_PROVIDER_ADDRESS` | Yes | (none) | Provider's Gonka address in bech32 format; used for request signing |
+| `API_KEY` | Yes | (none) | Secret key clients must send in the `Authorization` header |
+| `HOST` | No | `0.0.0.0` | Server bind address |
+| `PORT` | No | `8000` | Server port |
+| `GONKA_STREAM_READ_TIMEOUT` | No | `300.0` | Max seconds to wait for streaming data from backend |
+
+### Configuration Details
+
+#### GONKA_PRIVATE_KEY
+Your ECDSA private key in hex format. Used to sign every request to the Gonka backend.
+Example: `a1b2c3d4e5f6...` or `0xa1b2c3d4e5f6...`
+
+#### GONKA_ADDRESS
+Your address in the Gonka network (bech32 format). Sent as the `X-Requester-Address` header. 
+Example: `gonka1qyqszqgpqyqszqgpqyqszqgp...` + +#### GONKA_ENDPOINT +The Gonka inference API endpoint. Must include the `/v1` path segment. +Example: `https://my-gonka-node.example.com/v1` + +#### GONKA_PROVIDER_ADDRESS +The **provider's** Gonka address (bech32 format). This is included in the cryptographic signature of every request and must match what the provider expects. Obtain this from your Gonka provider's documentation or contact page. +Example: `gonka1provideraddress...` + +#### API_KEY +The bearer token clients must include in requests to the proxy. +Example: `sk-my-secret-key-123` + +Clients send it as: +``` +Authorization: Bearer sk-my-secret-key-123 ``` -Or with uvicorn directly: +### Example `.env` file + ```bash -uvicorn app.main:app --host 0.0.0.0 --port 8000 +GONKA_PRIVATE_KEY=0xaabbccddeeff... +GONKA_ADDRESS=gonka1youraddress... +GONKA_ENDPOINT=https://my-gonka-node.example.com/v1 +GONKA_PROVIDER_ADDRESS=gonka1provideraddress... +API_KEY=sk-my-secret-api-key ``` ## Usage ### Web Interface -Access the web interface at `http://localhost:8000/` to test the API interactively. +Open `http://localhost:8000/` to access the built-in chat interface. 
-### Using OpenAI Python SDK
+### OpenAI Python SDK

```python
from openai import OpenAI
@@ -114,20 +151,6 @@ response = client.chat.completions.create(
print(response.choices[0].message.content)
```

-### Using curl
-
-```bash
-curl http://localhost:8000/v1/chat/completions \
-  -H "Content-Type: application/json" \
-  -H "Authorization: Bearer sk-your-secret-key" \
-  -d '{
-    "model": "gonka-model",
-    "messages": [
-      {"role": "user", "content": "Hello!"}
-    ]
-  }'
-```
-
 ### Streaming

 ```python
@@ -140,9 +163,7 @@ client = OpenAI(
 stream = client.chat.completions.create(
     model="gonka-model",
-    messages=[
-        {"role": "user", "content": "Tell me a story"}
-    ],
+    messages=[{"role": "user", "content": "Tell me a story"}],
     stream=True
 )
@@ -151,27 +172,41 @@ for chunk in stream:
     print(chunk.choices[0].delta.content, end="")
 ```

+### curl
+
+```bash
+curl http://localhost:8000/v1/chat/completions \
+  -H "Content-Type: application/json" \
+  -H "Authorization: Bearer sk-your-secret-key" \
+  -d '{
+    "model": "gonka-model",
+    "messages": [{"role": "user", "content": "Hello!"}]
+  }'
+```
+
 ## API Endpoints

-- `POST /v1/chat/completions` - Chat completions (OpenAI-compatible)
-- `GET /v1/models` - List available models
-- `GET /api/models` - Get available models (no auth, for web interface)
-- `GET /health` - Health check endpoint (no auth required)
-- `GET /` - Web chat interface
+| Endpoint | Auth | Description |
+|---|---|---|
+| `POST /v1/chat/completions` | Yes | Chat completions (OpenAI-compatible) |
+| `GET /v1/models` | Yes | List available models (OpenAI-compatible) |
+| `GET /api/models` | No | Get available models (for web interface) |
+| `GET /health` | No | Health check (includes circuit breaker state) |
+| `GET /` | No | Web chat interface |

-## Authentication
+## Architecture

-All endpoints except `/health`, `/api/models`, and `/` require authentication using the `Authorization` header:
+### Tool Emulation

-```
-Authorization: Bearer sk-your-secret-key
-```
+If the Gonka 
model doesn't support native tool calling (`tools` + `tool_choice`), the proxy automatically converts tool definitions into a system prompt and parses tool call JSON from the model's text response. This is transparent to the client: it still receives standard OpenAI-format `tool_calls` in the response.

-Or simply:
+### Circuit Breaker

-```
-Authorization: sk-your-secret-key
-```
+Wraps non-streaming Gonka backend calls. After 5 consecutive failures, the circuit opens and requests are rejected immediately with a `503` error for 60 seconds; the breaker then transitions to half-open to test recovery.
+
+### Retry
+
+Non-streaming requests are retried up to 2 times with exponential backoff on transient errors.

 ## License

diff --git a/app/circuit_breaker.py b/app/circuit_breaker.py
new file mode 100644
index 0000000..3bdb261
--- /dev/null
+++ b/app/circuit_breaker.py
@@ -0,0 +1,181 @@
+"""
+Circuit Breaker implementation for resilient backend communication
+"""
+import time
+import logging
+from enum import Enum
+from typing import Callable, Any, Optional
+from functools import wraps
+
+logger = logging.getLogger(__name__)
+
+
+class CircuitState(Enum):
+    """Circuit breaker states"""
+    CLOSED = "closed"        # Normal operation - requests pass through
+    OPEN = "open"            # Failing - reject requests immediately
+    HALF_OPEN = "half_open"  # Testing - allow limited requests to test recovery
+
+
+class CircuitBreakerOpenError(Exception):
+    """Raised when circuit breaker is OPEN"""
+    pass
+
+
+class CircuitBreaker:
+    """
+    Circuit breaker pattern implementation
+
+    Prevents cascading failures by stopping requests to failing services
+    and allowing them to recover. 
+ + States: + - CLOSED: Normal operation, requests pass through + - OPEN: Service is failing, reject requests immediately + - HALF_OPEN: Testing if service recovered, allow limited requests + """ + + def __init__( + self, + name: str, + failure_threshold: int = 5, + recovery_timeout: float = 60.0, + success_threshold: int = 2, + expected_exception: type = Exception + ): + """ + Initialize circuit breaker + + Args: + name: Circuit breaker name (for logging) + failure_threshold: Number of failures before opening circuit + recovery_timeout: Seconds to wait before trying half-open + success_threshold: Number of successes in half-open to close circuit + expected_exception: Exception type that triggers failure + """ + self.name = name + self.failure_threshold = failure_threshold + self.recovery_timeout = recovery_timeout + self.success_threshold = success_threshold + self.expected_exception = expected_exception + + self.failure_count = 0 + self.success_count = 0 + self.last_failure_time: Optional[float] = None + self.state = CircuitState.CLOSED + + async def call(self, func: Callable, *args, **kwargs) -> Any: + """ + Execute function with circuit breaker protection + + Args: + func: Async function to execute + *args, **kwargs: Arguments for function + + Returns: + Function result + + Raises: + CircuitBreakerOpenError: If circuit is OPEN + Exception: Original exception from function + """ + # Check if we should transition from OPEN to HALF_OPEN + if self.state == CircuitState.OPEN: + if self.last_failure_time and \ + time.time() - self.last_failure_time > self.recovery_timeout: + logger.info(f"Circuit breaker {self.name}: OPEN -> HALF_OPEN (testing recovery)") + self.state = CircuitState.HALF_OPEN + self.success_count = 0 + else: + raise CircuitBreakerOpenError( + f"Circuit breaker {self.name} is OPEN. 
" + f"Last failure: {self.last_failure_time}" + ) + + # Execute function + try: + result = await func(*args, **kwargs) + self._on_success() + return result + except self.expected_exception as e: + self._on_failure() + raise + + def _on_success(self): + """Handle successful request""" + if self.state == CircuitState.HALF_OPEN: + self.success_count += 1 + if self.success_count >= self.success_threshold: + logger.info(f"Circuit breaker {self.name}: HALF_OPEN -> CLOSED (recovered)") + self.state = CircuitState.CLOSED + self.failure_count = 0 + self.success_count = 0 + elif self.state == CircuitState.CLOSED: + # Reset failure count on success (gradual recovery) + if self.failure_count > 0: + self.failure_count = max(0, self.failure_count - 1) + + def _on_failure(self): + """Handle failed request""" + self.failure_count += 1 + self.last_failure_time = time.time() + + if self.state == CircuitState.HALF_OPEN: + # Failure in half-open -> back to OPEN + logger.warning(f"Circuit breaker {self.name}: HALF_OPEN -> OPEN (still failing)") + self.state = CircuitState.OPEN + self.success_count = 0 + elif self.state == CircuitState.CLOSED: + if self.failure_count >= self.failure_threshold: + logger.warning( + f"Circuit breaker {self.name}: CLOSED -> OPEN " + f"({self.failure_count} failures >= {self.failure_threshold})" + ) + self.state = CircuitState.OPEN + + def get_state(self) -> dict: + """Get current circuit breaker state""" + return { + "name": self.name, + "state": self.state.value, + "failure_count": self.failure_count, + "last_failure_time": self.last_failure_time, + "failure_threshold": self.failure_threshold, + "recovery_timeout": self.recovery_timeout + } + + def reset(self): + """Manually reset circuit breaker to CLOSED state""" + logger.info(f"Circuit breaker {self.name}: Manual reset") + self.state = CircuitState.CLOSED + self.failure_count = 0 + self.success_count = 0 + self.last_failure_time = None + + +def circuit_breaker_decorator( + name: str, + 
failure_threshold: int = 5, + recovery_timeout: float = 60.0 +): + """ + Decorator for circuit breaker pattern + + Usage: + @circuit_breaker_decorator("gonka_api", failure_threshold=5) + async def call_gonka_api(): + ... + """ + breaker = CircuitBreaker( + name=name, + failure_threshold=failure_threshold, + recovery_timeout=recovery_timeout + ) + + def decorator(func: Callable): + @wraps(func) + async def wrapper(*args, **kwargs): + return await breaker.call(func, *args, **kwargs) + wrapper.breaker = breaker # Attach breaker for access + return wrapper + return decorator diff --git a/app/gonka_client.py b/app/gonka_client.py index 212b3e7..49c525c 100644 --- a/app/gonka_client.py +++ b/app/gonka_client.py @@ -11,34 +11,49 @@ logger = logging.getLogger(__name__) +def encode_with_low_s(r: int, s: int, order: int) -> bytes: + """Encode ECDSA signature with low-S normalization""" + # Normalize s to low-S + if s > order // 2: + s = order - s + + # Convert to bytes (32 bytes each for r and s) + r_bytes = r.to_bytes(32, 'big') + s_bytes = s.to_bytes(32, 'big') + + return r_bytes + s_bytes + + class GonkaClient: """Client for making signed requests to Gonka API""" - + def __init__( self, private_key: str, address: str, endpoint: str, provider_address: str, - timeout: float = 60.0 + timeout: float = 60.0, + stream_read_timeout: float = 300.0, ): self.private_key = private_key self.address = address self.endpoint = endpoint.rstrip('/') self.provider_address = provider_address self.timeout = timeout - + self.stream_read_timeout = stream_read_timeout + # Initialize hybrid timestamp tracking self._wall_base = time.time_ns() self._perf_base = time.perf_counter_ns() - - # HTTP client + + # HTTP client: default timeout for non-streaming; streaming uses per-request timeout self.client = httpx.AsyncClient(timeout=timeout) - + def _hybrid_timestamp_ns(self) -> int: """Generate hybrid timestamp (monotonic + aligned to wall clock)""" return self._wall_base + (time.perf_counter_ns() - 
self._perf_base)
-    
+
     def _sign_payload(
         self,
         payload_bytes: bytes,
@@ -48,41 +63,45 @@ def _sign_payload(
         """Sign payload using ECDSA with SHA-256"""
         # Remove 0x prefix if present
         pk = self.private_key[2:] if self.private_key.startswith('0x') else self.private_key
-        sk = SigningKey.from_string(bytes.fromhex(pk), curve=SECP256k1)
-
-        # Message bytes: payload || timestamp || provider_address
-        msg = payload_bytes + str(timestamp_ns).encode('utf-8') + provider_address.encode('utf-8')
-
-        # Deterministic ECDSA over SHA-256 with low-S normalization
-        sig = sk.sign_deterministic(msg, hashfunc=hashlib.sha256)
-        r, s = sig[:32], sig[32:]
-
-        order = SECP256k1.order
-        s_int = int.from_bytes(s, 'big')
-        if s_int > order // 2:
-            s_int = order - s_int
-        s = s_int.to_bytes(32, 'big')
-
-        return base64.b64encode(r + s).decode('utf-8')
-
+        signing_key = SigningKey.from_string(bytes.fromhex(pk), curve=SECP256k1)
+
+        # Sign the SHA-256 hash of the payload instead of the raw payload
+        payload_hash = hashlib.sha256(payload_bytes).hexdigest()
+
+        # Build signature input: hash + timestamp + provider_address
+        signature_input = payload_hash
+        signature_input += str(timestamp_ns)
+        signature_input += provider_address
+
+        signature_bytes = signature_input.encode('utf-8')
+
+        # Sign the message with deterministic ECDSA using low-S normalization
+        signature = signing_key.sign_deterministic(
+            signature_bytes,
+            hashfunc=hashlib.sha256,
+            sigencode=lambda r, s, order: encode_with_low_s(r, s, order)
+        )
+
+        return base64.b64encode(signature).decode('utf-8')
+
     def _prepare_request(self, payload: Optional[dict]) -> Tuple[bytes, dict]:
         """Prepare request data (payload bytes, headers with signature)"""
         if payload is None:
             payload = {}
-
+
         payload_bytes = json.dumps(payload).encode('utf-8')
         timestamp_ns = self._hybrid_timestamp_ns()
         signature = self._sign_payload(payload_bytes, timestamp_ns, self.provider_address)
-
+
         headers = {
             "Content-Type": "application/json",
             "Authorization": signature,
            
"X-Requester-Address": self.address, "X-Timestamp": str(timestamp_ns), } - + return payload_bytes, headers - + async def get_models(self) -> list: """Get available models from Gonka API""" try: @@ -94,7 +113,7 @@ async def get_models(self) -> list: except Exception as e: logger.warning(f"Failed to load models from Gonka API: {e}") return [] - + async def request( self, method: str, @@ -104,15 +123,8 @@ async def request( """Make a signed request to Gonka API (non-streaming)""" url = f"{self.endpoint}{path}" payload_bytes, headers = self._prepare_request(payload) - - # Log request body before sending - try: - request_body = json.loads(payload_bytes.decode('utf-8')) - logger.info(f"Gonka API Request: {method} {url}") - logger.info(f"Request body: {json.dumps(request_body, indent=2, ensure_ascii=False)}") - except Exception as e: - logger.warning(f"Failed to log request body: {e}") - + + logger.info(f"Gonka API Request: {method} {path}") try: response = await self.client.request( method, @@ -123,18 +135,12 @@ async def request( response.raise_for_status() return response.json() except httpx.HTTPStatusError as e: - # Log error response - try: - error_body = e.response.text - logger.error(f"Gonka API Error Response: {e.response.status_code}") - logger.error(f"Error response body: {error_body}") - except Exception: - logger.error(f"Gonka API Error Response: {e.response.status_code} (failed to read body)") + logger.error(f"Gonka API Error Response: {e.response.status_code}") raise except Exception as e: logger.error(f"Gonka API Request failed: {type(e).__name__}: {str(e)}") raise - + async def request_stream( self, method: str, @@ -144,49 +150,71 @@ async def request_stream( """Make a signed streaming request to Gonka API""" url = f"{self.endpoint}{path}" payload_bytes, headers = self._prepare_request(payload) - - # Log request body before sending - try: - request_body = json.loads(payload_bytes.decode('utf-8')) - logger.info(f"Gonka API Stream Request: {method} {url}") - 
logger.info(f"Request body: {json.dumps(request_body, indent=2, ensure_ascii=False)}")
-        except Exception as e:
-            logger.warning(f"Failed to log request body: {e}")
-
+
+        logger.info(f"Gonka API Stream Request: {method} {path}")
         try:
+            # Use longer read timeout for streaming so long generations don't get cut off
+            stream_timeout = httpx.Timeout(self.timeout, read=self.stream_read_timeout)
             async with self.client.stream(
                 method,
                 url,
                 headers=headers,
-                content=payload_bytes
+                content=payload_bytes,
+                timeout=stream_timeout,
             ) as response:
                 if response.status_code >= 400:
-                    # Read error response body
                     try:
                         error_body = await response.aread()
                         error_text = error_body.decode('utf-8', errors='replace')
                         logger.error(f"Gonka API Stream Error Response: {response.status_code}")
                         logger.error(f"Error response body: {error_text}")
                     except Exception as read_err:
-                        logger.error(f"Gonka API Stream Error Response: {response.status_code} (failed to read body: {read_err})")
+                        logger.error(
+                            f"Gonka API Stream Error Response: {response.status_code} "
+                            f"(failed to read body: {read_err})"
+                        )
                     response.raise_for_status()
-
-                async for chunk in response.aiter_bytes():
-                    yield chunk
+
+                total_bytes = 0
+                chunk_count = 0
+                completed_normally = False
+                try:
+                    async for chunk in response.aiter_bytes():
+                        total_bytes += len(chunk)
+                        chunk_count += 1
+                        yield chunk
+                    completed_normally = True
+                    logger.info(
+                        f"Gonka API Stream completed: {method} {path} "
+                        f"(chunks={chunk_count}, bytes={total_bytes})"
+                    )
+                finally:
+                    if not completed_normally:
+                        logger.info(
+                            f"Gonka API Stream ended without full completion: {method} {path} "
+                            f"(chunks={chunk_count}, bytes={total_bytes}) - client disconnect or stream closed"
+                        )
         except httpx.HTTPStatusError as e:
-            # Log error response (fallback for non-stream errors)
-            try:
-                error_body = e.response.text
-                logger.error(f"Gonka API Stream Error Response: {e.response.status_code}")
-                logger.error(f"Error response body: {error_body}")
-            except Exception:
-                
logger.error(f"Gonka API Stream Error Response: {e.response.status_code} (failed to read body)") + logger.error(f"Gonka API Stream Error Response: {e.response.status_code}") + raise + except httpx.ReadTimeout: + logger.error( + "Gonka API Stream read timeout: backend did not send data within " + "stream_read_timeout; stream ended abruptly" + ) + raise + except httpx.ConnectError as e: + logger.error(f"Gonka API Stream connection error: {e}") + raise + except httpx.WriteTimeout: + logger.error("Gonka API Stream write timeout: request body send timed out") raise except Exception as e: - logger.error(f"Gonka API Stream Request failed: {type(e).__name__}: {str(e)}") + logger.error( + f"Gonka API Stream failed unexpectedly: {type(e).__name__}: {str(e)}" + ) raise - + async def close(self): """Close the HTTP client""" await self.client.aclose() - diff --git a/app/main.py b/app/main.py index 4a55df2..0070d0c 100644 --- a/app/main.py +++ b/app/main.py @@ -10,6 +10,13 @@ from app.gonka_client import GonkaClient from app.auth import verify_api_key +from app.tool_emulation import ( + emulate_tool_choice_auto, + process_response_with_tool_emulation, + process_stream_with_tool_emulation, +) +from app.circuit_breaker import CircuitBreaker, CircuitBreakerOpenError +from app.retry import retry_with_backoff, BACKEND_RETRY_CONFIG # Configure logging @@ -28,14 +35,17 @@ class Settings(BaseSettings): gonka_address: str = "" gonka_endpoint: str = "" gonka_provider_address: str = "" - + # API Key for external access api_key: str = "" - + # Server settings host: str = "0.0.0.0" port: int = 8000 - + + # Streaming read timeout (seconds); increase for slow/long generations + gonka_stream_read_timeout: float = 300.0 + class Config: env_file = ".env" case_sensitive = False @@ -48,16 +58,21 @@ class Config: gonka_client: Optional[GonkaClient] = None available_models: List[Dict] = [] +# Circuit breaker for Gonka backend +gonka_circuit_breaker = CircuitBreaker( + name="gonka_backend", + 
failure_threshold=5, + recovery_timeout=60.0, + success_threshold=2, +) + @asynccontextmanager async def lifespan(app: FastAPI): """Lifespan context manager for startup and shutdown events""" - # Startup global gonka_client, available_models - - # Initialize client and load models + try: - # Check if configuration is complete before loading models client = _create_gonka_client() if client: models = await client.get_models() @@ -69,10 +84,9 @@ async def lifespan(app: FastAPI): except Exception as e: logger.error(f"Failed to load models at startup: {e}") available_models = [] - + yield - - # Shutdown + if gonka_client: await gonka_client.close() @@ -92,7 +106,8 @@ def _create_gonka_client() -> Optional[GonkaClient]: private_key=settings.gonka_private_key, address=settings.gonka_address, endpoint=settings.gonka_endpoint, - provider_address=settings.gonka_provider_address + provider_address=settings.gonka_provider_address, + stream_read_timeout=settings.gonka_stream_read_timeout, ) return gonka_client @@ -109,8 +124,8 @@ def get_gonka_client() -> GonkaClient: if not settings.gonka_endpoint: missing.append("GONKA_ENDPOINT") if not settings.gonka_provider_address: - missing.append("GONKA_PROVIDER_ADDRESS (provider address in bech32 format, get it from the Gonka provider)") - + missing.append("GONKA_PROVIDER_ADDRESS") + raise HTTPException( status_code=500, detail=f"Gonka configuration incomplete. Missing: {', '.join(missing)}. 
" @@ -142,19 +157,17 @@ def get_gonka_client() -> GonkaClient: async def list_models(request: Request, api_key_valid: bool = Depends(verify_api_key)): """List available models (OpenAI-compatible endpoint)""" global available_models - - # Convert Gonka models format to OpenAI format + models_data = [] for model in available_models: model_id = model.get("id", "unknown") models_data.append({ "id": model_id, "object": "model", - "created": 1677610602, # Default timestamp + "created": 1677610602, "owned_by": "gonka" }) - - # If no models loaded, return default + if not models_data: models_data = [{ "id": "gonka-model", @@ -162,21 +175,19 @@ async def list_models(request: Request, api_key_valid: bool = Depends(verify_api "created": 1677610602, "owned_by": "gonka" }] - + return { "object": "list", "data": models_data } + # Models endpoint without auth (for web interface) @app.get("/api/models") async def get_models_no_auth(): """Get available models without authentication (for web interface)""" global available_models - - return { - "models": available_models - } + return {"models": available_models} # Chat completions endpoint @@ -187,34 +198,47 @@ async def chat_completions( ): """Chat completions endpoint (OpenAI-compatible)""" client = get_gonka_client() - + try: body = await request.json() - # Log incoming request body logger.info("Incoming chat completions request") - logger.info(f"Request body: {json.dumps(body, indent=2, ensure_ascii=False)}") except Exception as e: logger.error(f"Failed to parse request JSON: {e}") raise HTTPException(status_code=400, detail=f"Invalid JSON: {str(e)}") - + stream = body.get("stream", False) - + original_tools = body.get("tools") + + # Apply tool emulation if model doesn't support native tool calling + # (emulate_tool_choice_auto is a no-op when no tools are present) + emulated_body = emulate_tool_choice_auto(body) + tools_were_emulated = original_tools and "tools" not in emulated_body + try: if stream: - # Streaming response - 
proxy SSE from Gonka async def generate(): try: - async for chunk in client.request_stream( + raw_stream = client.request_stream( method="POST", path="/chat/completions", - payload=body - ): - # Yield chunk as-is (Gonka should return SSE format) - yield chunk + payload=emulated_body, + ) + if tools_were_emulated: + async for chunk in process_stream_with_tool_emulation( + raw_stream, original_tools + ): + yield chunk + else: + async for chunk in raw_stream: + yield chunk + except CircuitBreakerOpenError as e: + logger.error(f"Circuit breaker open: {e}") + error_payload = json.dumps({"error": {"message": str(e), "type": "service_unavailable"}}) + yield f"data: {error_payload}\n\ndata: [DONE]\n\n".encode() except Exception as e: logger.error(f"Streaming error: {type(e).__name__}: {str(e)}") raise - + return StreamingResponse( generate(), media_type="text/event-stream", @@ -225,13 +249,31 @@ async def generate(): } ) else: - # Non-streaming response - response = await client.request( - method="POST", - path="/chat/completions", - payload=body + # Non-streaming: wrap with circuit breaker + retry + async def do_request(): + return await gonka_circuit_breaker.call( + client.request, + method="POST", + path="/chat/completions", + payload=emulated_body, + ) + + response = await retry_with_backoff( + do_request, + max_retries=BACKEND_RETRY_CONFIG.max_retries, + initial_delay=BACKEND_RETRY_CONFIG.initial_delay, + max_delay=BACKEND_RETRY_CONFIG.max_delay, + exceptions=BACKEND_RETRY_CONFIG.exceptions, ) + + if tools_were_emulated: + response = process_response_with_tool_emulation(response, original_tools) + return response + + except CircuitBreakerOpenError as e: + logger.error(f"Circuit breaker open: {e}") + raise HTTPException(status_code=503, detail=f"Backend temporarily unavailable: {str(e)}") except Exception as e: logger.error(f"Chat completions error: {type(e).__name__}: {str(e)}") raise HTTPException( @@ -246,11 +288,16 @@ async def web_interface(): """Serve web chat 
interface""" return FileResponse("app/static/index.html") + # Health check endpoint (no auth required) @app.get("/health") async def health(): """Health check endpoint""" - return {"status": "ok"} + return { + "status": "ok", + "circuit_breaker": gonka_circuit_breaker.get_state(), + } + # Serve static files (must be last) app.mount("/static", StaticFiles(directory="app/static"), name="static") @@ -264,4 +311,3 @@ async def health(): port=settings.port, reload=False ) - diff --git a/app/retry.py b/app/retry.py new file mode 100644 index 0000000..ed5934a --- /dev/null +++ b/app/retry.py @@ -0,0 +1,173 @@ +""" +Retry utilities with exponential backoff +""" +import asyncio +import random +import logging +from typing import Callable, TypeVar, Optional, Tuple, Any + +logger = logging.getLogger(__name__) + +T = TypeVar('T') + + +async def retry_with_backoff( + func: Callable[[], T], + max_retries: int = 3, + initial_delay: float = 1.0, + max_delay: float = 60.0, + exponential_base: float = 2.0, + jitter: bool = True, + exceptions: Tuple[type, ...] = (Exception,), + on_retry: Optional[Callable[[int, Exception], None]] = None +) -> T: + """ + Retry function with exponential backoff + + Args: + func: Async function to retry (no arguments) + max_retries: Maximum number of retry attempts + initial_delay: Initial delay in seconds + max_delay: Maximum delay in seconds + exponential_base: Base for exponential backoff + jitter: Add random jitter to prevent thundering herd + exceptions: Tuple of exceptions to catch and retry + on_retry: Optional callback called on each retry (attempt, exception) + + Returns: + Function result + + Raises: + Last exception if all retries fail + """ + delay = initial_delay + last_exception = None + + for attempt in range(max_retries + 1): + try: + return await func() + except exceptions as e: + last_exception = e + + if attempt == max_retries: + logger.error( + f"Retry exhausted after {max_retries} attempts. 
" + f"Last error: {type(e).__name__}: {e}" + ) + raise + + # Calculate delay with exponential backoff + if jitter: + # Add random jitter: delay * base + random(0, 1) + delay = min( + delay * exponential_base + random.uniform(0, 1), + max_delay + ) + else: + delay = min(delay * exponential_base, max_delay) + + logger.warning( + f"Attempt {attempt + 1}/{max_retries} failed: {type(e).__name__}: {e}. " + f"Retrying in {delay:.2f}s" + ) + + if on_retry: + try: + on_retry(attempt + 1, e) + except Exception as callback_error: + logger.warning(f"Retry callback error: {callback_error}") + + await asyncio.sleep(delay) + + # Should never reach here, but just in case + if last_exception: + raise last_exception + raise RuntimeError("Retry failed without exception") + + +async def retry_with_backoff_args( + func: Callable[..., T], + *args, + max_retries: int = 3, + initial_delay: float = 1.0, + max_delay: float = 60.0, + exponential_base: float = 2.0, + jitter: bool = True, + exceptions: Tuple[type, ...] 
= (Exception,), + **kwargs +) -> T: + """ + Retry function with exponential backoff (supports arguments) + + Args: + func: Async function to retry + *args: Positional arguments for function + max_retries: Maximum number of retry attempts + initial_delay: Initial delay in seconds + max_delay: Maximum delay in seconds + exponential_base: Base for exponential backoff + jitter: Add random jitter + exceptions: Tuple of exceptions to catch and retry + **kwargs: Keyword arguments for function + + Returns: + Function result + + Raises: + Last exception if all retries fail + """ + async def wrapper(): + return await func(*args, **kwargs) + + return await retry_with_backoff( + wrapper, + max_retries=max_retries, + initial_delay=initial_delay, + max_delay=max_delay, + exponential_base=exponential_base, + jitter=jitter, + exceptions=exceptions + ) + + +class RetryConfig: + """Configuration for retry behavior""" + + def __init__( + self, + max_retries: int = 3, + initial_delay: float = 1.0, + max_delay: float = 60.0, + exponential_base: float = 2.0, + jitter: bool = True, + exceptions: Tuple[type, ...] = (Exception,) + ): + self.max_retries = max_retries + self.initial_delay = initial_delay + self.max_delay = max_delay + self.exponential_base = exponential_base + self.jitter = jitter + self.exceptions = exceptions + + +# Predefined retry configurations +HTTP_RETRY_CONFIG = RetryConfig( + max_retries=3, + initial_delay=1.0, + max_delay=30.0, + exceptions=(Exception,) # Catch all for HTTP errors +) + +DATABASE_RETRY_CONFIG = RetryConfig( + max_retries=3, + initial_delay=0.5, + max_delay=10.0, + exceptions=(Exception,) +) + +BACKEND_RETRY_CONFIG = RetryConfig( + max_retries=2, # Fewer retries for backend calls (circuit breaker handles failures) + initial_delay=1.0, + max_delay=20.0, + exceptions=(Exception,) +) diff --git a/app/static/index.html b/app/static/index.html index 78cc77f..f7c7c9d 100644 --- a/app/static/index.html +++ b/app/static/index.html @@ -3,7 +3,15 @@
-OpenAI-compatible API proxy testing
+OpenAI-compatible LLM proxy
+Start a conversation by sending a message
-Complete guide to using Gonka AI Gateway API
+ +Gonka AI Gateway provides an OpenAI-compatible API. Configure the API_KEY environment variable on the proxy server; clients authenticate using that key.
All API requests require the API key in the Authorization header:
+Authorization: Bearer sk-your-api-key-here
+ In production the gateway is served over HTTPS (proxied via Traefik). When running locally, make API requests to the base URL:
+http://localhost:8000/v1
+ Create a chat completion (OpenAI-compatible)
+POST /chat/completions
+Content-Type: application/json
+Authorization: Bearer sk-your-api-key
+{
+ "model": "Qwen/Qwen3-235B-A22B-Instruct-2507-FP8",
+ "messages": [
+ {"role": "user", "content": "Hello!"}
+ ],
+ "stream": false
+}
+ {
+ "id": "chatcmpl-123",
+ "object": "chat.completion",
+ "created": 1677652288,
+ "choices": [{
+ "index": 0,
+ "message": {
+ "role": "assistant",
+ "content": "Hello! How can I help you?"
+ },
+ "finish_reason": "stop"
+ }],
+ "usage": {
+ "prompt_tokens": 9,
+ "completion_tokens": 12,
+ "total_tokens": 21
+ }
+}
+ To enable streaming, set "stream": true in the request:
{
+ "model": "Qwen/Qwen3-235B-A22B-Instruct-2507-FP8",
+ "messages": [
+ {"role": "user", "content": "Tell me a story"}
+ ],
+ "stream": true
+}
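When streaming is enabled, the response arrives as server-sent events: `data:` lines each carrying a JSON chunk, terminated by `data: [DONE]`, following OpenAI conventions. A minimal consumer sketch using only the Python standard library (the URL, key, and `stream_chat` helper name are placeholders, not part of the proxy):

```python
import json
import urllib.request

def parse_sse_line(line: str):
    """Return the decoded JSON chunk from one SSE 'data:' line, or None."""
    if not line.startswith("data: "):
        return None
    payload = line[len("data: "):]
    if payload == "[DONE]":
        return None
    return json.loads(payload)

def stream_chat(prompt: str,
                base_url: str = "http://localhost:8000/v1",
                api_key: str = "sk-your-api-key-here"):
    """Yield content deltas from a streaming chat completion."""
    body = {
        "model": "Qwen/Qwen3-235B-A22B-Instruct-2507-FP8",
        "messages": [{"role": "user", "content": prompt}],
        "stream": True,
    }
    req = urllib.request.Request(
        f"{base_url}/chat/completions",
        data=json.dumps(body).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    with urllib.request.urlopen(req) as resp:
        for raw in resp:
            chunk = parse_sse_line(raw.decode("utf-8").strip())
            if chunk:
                delta = chunk["choices"][0]["delta"].get("content")
                if delta:
                    yield delta
```

Usage: `for piece in stream_chat("Tell me a story"): print(piece, end="", flush=True)`.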
+ List available models
+GET /models
+Authorization: Bearer sk-your-api-key
+ {
+ "object": "list",
+ "data": [
+ {
+ "id": "Qwen/Qwen3-235B-A22B-Instruct-2507-FP8",
+ "object": "model",
+ "created": 1677610602,
+ "owned_by": "gonka"
+ }
+ ]
+}
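To use this response programmatically, here is a short sketch (stdlib only; the URL, key, and helper names are placeholders) that fetches the list and pulls out the model IDs:

```python
import json
import urllib.request

def extract_model_ids(payload: dict) -> list:
    """Pull the model IDs out of an OpenAI-style model list response."""
    return [m["id"] for m in payload.get("data", [])]

def list_models(base_url: str = "http://localhost:8000/v1",
                api_key: str = "sk-your-api-key-here") -> list:
    """GET /models from the gateway and return the available model IDs."""
    req = urllib.request.Request(
        f"{base_url}/models",
        headers={"Authorization": f"Bearer {api_key}"},
    )
    with urllib.request.urlopen(req) as resp:
        return extract_model_ids(json.load(resp))
```

Given the example response above, `extract_model_ids` would return `["Qwen/Qwen3-235B-A22B-Instruct-2507-FP8"]`.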
+ The API is fully compatible with the OpenAI Python SDK:
+from openai import OpenAI
+client = OpenAI(
+ api_key="sk-your-api-key-here",
+ base_url="http://localhost:8000/v1"
+)
+response = client.chat.completions.create(
+ model="Qwen/Qwen3-235B-A22B-Instruct-2507-FP8",
+ messages=[
+ {"role": "user", "content": "Hello!"}
+ ]
+)
+print(response.choices[0].message.content)
+ Here's a simple example using Python's requests library:
import requests
+url = "http://localhost:8000/v1/chat/completions"
+headers = {
+ "Authorization": "Bearer sk-your-api-key-here",
+ "Content-Type": "application/json"
+}
+data = {
+ "model": "Qwen/Qwen3-235B-A22B-Instruct-2507-FP8",
+ "messages": [
+ {"role": "user", "content": "Hello! How are you?"}
+ ]
+}
+response = requests.post(url, headers=headers, json=data)
+result = response.json()
+print(result["choices"][0]["message"]["content"])
+ curl http://localhost:8000/v1/chat/completions \
+ -H "Content-Type: application/json" \
+ -H "Authorization: Bearer sk-your-api-key" \
+ -d '{
+ "model": "Qwen/Qwen3-235B-A22B-Instruct-2507-FP8",
+ "messages": [
+ {"role": "user", "content": "Hello!"}
+ ]
+ }'
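When a request fails, it helps to surface the HTTP status rather than swallow the exception. A hedged sketch (the status-code meanings below are assumptions based on typical OpenAI-compatible gateways, not confirmed behavior of this proxy; `safe_chat` is a hypothetical helper):

```python
import json
import urllib.error
import urllib.request

def describe_api_error(status: int, body: str = "") -> str:
    """Map an HTTP status to a readable message (assumed, not proxy-specific)."""
    messages = {
        400: "malformed request body",
        401: "invalid or missing API key",
        404: "unknown endpoint or model",
    }
    reason = messages.get(status, "unexpected error")
    return f"HTTP {status}: {reason}" + (f" ({body})" if body else "")

def safe_chat(url: str, api_key: str, payload: dict) -> dict:
    """POST a chat completion and raise a readable error on failure."""
    req = urllib.request.Request(
        url,
        data=json.dumps(payload).encode(),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )
    try:
        with urllib.request.urlopen(req) as resp:
            return json.load(resp)
    except urllib.error.HTTPError as e:
        raise RuntimeError(
            describe_api_error(e.code, e.read().decode(errors="replace"))
        ) from e
```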
+ Currently, there are no rate limits. However, please use the API responsibly.
+Use this guide to connect OpenClaw to the Gonka gateway: configure the provider on your OpenClaw node, then use the OpenClaw Telegram bot commands to switch to the Gonka model and check status.
+ +Configure the Gonka provider on your OpenClaw node/gateway so the agent can use Gonka models. Edit openclaw.json or your models.json (wherever OpenClaw reads models.providers).
Replace:
+http://localhost:8000/v1 → with your gateway base URL, if different.
+sk-.......... → with your API key (from registration or dashboard).
+{
+ "models": {
+ "providers": {
+ "gonka": {
+ "baseUrl": "http://localhost:8000/v1",
+ "apiKey": "sk-..........",
+ "auth": "api-key",
+ "api": "openai-completions",
+ "authHeader": true,
+ "models": [
+ {
+ "id": "Qwen/Qwen3-235B-A22B-Instruct-2507-FP8",
+ "name": "Qwen/Qwen3-235B-A22B-Instruct-2507-FP8",
+ "api": "openai-completions",
+ "reasoning": false,
+ "input": ["text"],
+ "cost": { "input": 0, "output": 0, "cacheRead": 0, "cacheWrite": 0 },
+ "contextWindow": 200000,
+ "maxTokens": 8192
+ }
+ ]
+ }
+ }
+ }
+}
+ After editing: save the file and restart the OpenClaw gateway/node if it does not reload config automatically.
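Before restarting, it can be handy to sanity-check the edited config. A quick sketch (`check_gonka_provider` is a hypothetical helper, not part of OpenClaw; the key names match the provider block above):

```python
import json

# Keys the provider block above relies on.
REQUIRED_KEYS = {"baseUrl", "apiKey", "api", "models"}

def check_gonka_provider(config_text: str) -> list:
    """Return a list of problems found in the gonka provider block (empty = OK)."""
    cfg = json.loads(config_text)
    provider = cfg.get("models", {}).get("providers", {}).get("gonka")
    if provider is None:
        return ["missing models.providers.gonka block"]
    problems = [f"missing key: {k}" for k in sorted(REQUIRED_KEYS - provider.keys())]
    if provider.get("apiKey") == "sk-..........":
        problems.append("apiKey still has the placeholder value")
    return problems
```

Run it on your `openclaw.json` (e.g. `check_gonka_provider(open("openclaw.json").read())`) and fix anything it reports before restarting the node.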
+These commands are used in the OpenClaw Telegram bot chat (not in the gateway config or API). They let you switch to the Gonka model and check that it is active.
+In the OpenClaw Telegram bot, send /status to see the current runtime state, including which model is in use.
🦞 OpenClaw 2026.2.15 (3fe22ea)
+🧠 Model: gonka/Qwen/Qwen3-235B-A22B-Instruct-2507-FP8 · api-key sk-dvg…6GIAPE (models.json)
+🧮 Tokens: 18k in / 137 out
+Context: 18k/200k (9%) · 🧹 Compactions: 0
+🧵 Session: agent:main:main • updated just now
+⚙️ Runtime: direct · Think: off · verbose
+🪢 Queue: collect (depth 0)
+ From this output you can confirm that the Model line shows gonka/Qwen/Qwen3-235B-A22B-Instruct-2507-FP8 when the Gonka model is active.
In the OpenClaw Telegram bot, send /model <provider/model-id> to switch the active model. To use the Gonka Qwen model, send:
/model gonka/Qwen/Qwen3-235B-A22B-Instruct-2507-FP8
+ Example: you send /model gonka/Qwen/Qwen3-235B-A22B-Instruct-2507-FP8, the agent replies Model set to gonka/Qwen/Qwen3-235B-A22B-Instruct-2507-FP8. Then send /status again to verify the Model line shows the Gonka model.
| What | Where | Action |
|---|---|---|
| Add Gonka provider | OpenClaw node → openclaw.json or models config | Put the models.providers.gonka block from section 1 into your config. Set baseUrl and apiKey. |
| Switch to Gonka model | OpenClaw Telegram bot | Send: /model gonka/Qwen/Qwen3-235B-A22B-Instruct-2507-FP8 |
| Check current model | OpenClaw Telegram bot | Send: /status → look at the Model line in the reply. |
So: config = on the node; /status and /model = in the OpenClaw Telegram bot chat.
+