
WebSocket Protocol

Daedalus edited this page Dec 9, 2025 · 1 revision

The WebSocket connection provides real-time bidirectional communication between clients and the Cass backend. It handles chat messages, status updates, tool execution, and audio delivery.

Connection

Endpoint

ws://localhost:8000/ws

Authentication

Two authentication methods are supported:

Query Parameter (preferred):

ws://localhost:8000/ws?token=<jwt_access_token>

First Message:

{
  "type": "auth",
  "token": "<jwt_access_token>"
}

Localhost Bypass: Connections from 127.0.0.1, ::1, or localhost automatically use DEFAULT_LOCALHOST_USER_ID if ALLOW_LOCALHOST_BYPASS=true (default).
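For illustration, both methods can be prepared by small client helpers (a sketch; the helper names are assumptions, only the endpoint and the message shape come from this page):

```python
import json
from urllib.parse import urlencode

BASE = "ws://localhost:8000/ws"

def url_with_token(token):
    """Preferred method: embed the JWT in the query string."""
    return f"{BASE}?{urlencode({'token': token})}"

def auth_message(token):
    """Alternative: connect to the bare endpoint, then send this as the first frame."""
    return json.dumps({"type": "auth", "token": token})
```

A client on 127.0.0.1 with the bypass enabled can skip both and connect to BASE directly.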

Connection Response

On successful connection:

{
  "type": "connected",
  "message": "Cass vessel connected",
  "sdk_mode": true,
  "user_id": "user-uuid-here",
  "timestamp": "2025-12-09T10:00:00"
}

Message Types

Client → Server

chat

Send a message to Cass:

{
  "type": "chat",
  "message": "Hello Cass!",
  "conversation_id": "conv-uuid",
  "image": "base64-encoded-data",
  "image_media_type": "image/png"
}

| Field | Required | Description |
|-------|----------|-------------|
| message | Yes | The user's message text |
| conversation_id | No | Conversation to continue (creates new if omitted) |
| image | No | Base64-encoded image data |
| image_media_type | No | MIME type (e.g., "image/png") |
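A chat frame with these fields can be assembled like this (a sketch; the helper name is illustrative, the field names come from the table above):

```python
import json

def make_chat(message, conversation_id=None, image=None, image_media_type=None):
    """Build a chat frame; optional fields are included only when supplied."""
    frame = {"type": "chat", "message": message}
    if conversation_id is not None:
        frame["conversation_id"] = conversation_id
    if image is not None:
        frame["image"] = image
        frame["image_media_type"] = image_media_type or "image/png"
    return json.dumps(frame)
```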

auth

Authenticate after connection:

{
  "type": "auth",
  "token": "jwt_token_here"
}

onboarding_intro

Trigger onboarding introduction:

{
  "type": "onboarding_intro",
  "user_id": "user-uuid",
  "profile": {
    "display_name": "Name",
    "relationship": "researcher"
  }
}

onboarding_demo

Trigger collaborative demo:

{
  "type": "onboarding_demo",
  "user_id": "user-uuid",
  "message": "optional response",
  "profile": {
    "relationship": "researcher",
    "background": {"context": "..."},
    "communication": {"style": "..."}
  }
}

Server → Client

connected

Connection established:

{
  "type": "connected",
  "message": "Cass vessel connected",
  "sdk_mode": true,
  "user_id": "user-uuid",
  "timestamp": "2025-12-09T10:00:00"
}

auth_success

Authentication successful:

{
  "type": "auth_success",
  "user_id": "user-uuid"
}

auth_error

Authentication failed:

{
  "type": "auth_error",
  "message": "Invalid token"
}

thinking

Status update during processing:

{
  "type": "thinking",
  "status": "Retrieving memories...",
  "memories": {
    "summaries_count": 3,
    "details_count": 5,
    "project_docs_count": 2,
    "user_context_count": 4,
    "wiki_pages_count": 1,
    "has_context": true
  },
  "timestamp": "2025-12-09T10:00:01"
}

Status messages progress through:

  1. "Retrieving memories..."
  2. "Generating response (Claude/OpenAI/local model)..."
  3. "Using tool: [tool_name]..." (if tools used)
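On the client side, these server frames can be routed by their "type" field with a small dispatcher (a sketch; the handler arguments are illustrative):

```python
import json

def dispatch(raw, on_status, on_response):
    """Route one incoming frame by its "type" field."""
    msg = json.loads(raw)
    kind = msg.get("type")
    if kind == "thinking":
        on_status(msg.get("status", ""))        # progress updates listed above
    elif kind == "response":
        on_response(msg["text"])
    elif kind in ("error", "auth_error"):
        raise RuntimeError(msg.get("message", "unknown error"))
    return kind
```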

response

Cass's response:

{
  "type": "response",
  "text": "Hello! <gesture:wave>",
  "animations": [
    {"type": "gesture", "name": "wave", "position": 7}
  ],
  "input_tokens": 1500,
  "output_tokens": 150,
  "provider": "anthropic",
  "model": "claude-sonnet-4-20250514",
  "conversation_id": "conv-uuid",
  "timestamp": "2025-12-09T10:00:02"
}

| Field | Description |
|-------|-------------|
| text | Response text with gesture/emote tags |
| animations | Parsed gesture/emote data |
| input_tokens | Tokens used for input |
| output_tokens | Tokens generated |
| provider | "anthropic", "openai", or "local" |
| model | Specific model used |
| conversation_id | ID of the conversation |
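A client that renders the text can strip the gesture/emote tags and recover the cue positions itself (a sketch; the position semantics are assumed from the example above, where position 7 points at the spot the tag occupied in the cleaned text):

```python
import re

TAG_RE = re.compile(r"<(gesture|emote):(\w+)>")

def split_animations(text):
    """Strip gesture/emote tags and record cue positions in the cleaned text."""
    animations, parts, removed, last = [], [], 0, 0
    for m in TAG_RE.finditer(text):
        parts.append(text[last:m.start()])
        animations.append({"type": m.group(1), "name": m.group(2),
                           "position": m.start() - removed})
        removed += m.end() - m.start()   # offset shrinks as tags are removed
        last = m.end()
    parts.append(text[last:])
    return "".join(parts), animations
```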

audio

TTS audio available:

{
  "type": "audio",
  "url": "/audio/abc123.wav",
  "timestamp": "2025-12-09T10:00:03"
}

system

System message:

{
  "type": "system",
  "message": "Memory summarization complete",
  "timestamp": "2025-12-09T10:00:04"
}

title_update

Conversation title changed:

{
  "type": "title_update",
  "conversation_id": "conv-uuid",
  "title": "New conversation title",
  "timestamp": "2025-12-09T10:00:05"
}

debug

Debug information (development):

{
  "type": "debug",
  "message": "[Tool Loop #1] stop_reason=tool_use, tools=['recall_journal']",
  "timestamp": "2025-12-09T10:00:06"
}

error

Error occurred:

{
  "type": "error",
  "message": "Failed to generate response",
  "timestamp": "2025-12-09T10:00:07"
}

Chat Flow

Standard Flow

Client                          Server
  |                               |
  |-- chat (message) ------------>|
  |                               |
  |<-- thinking (memories) -------|
  |<-- thinking (generating) -----|
  |                               |
  |<-- response ------------------|
  |<-- audio (optional) ----------|

With Tool Use

Client                          Server
  |                               |
  |-- chat (message) ------------>|
  |                               |
  |<-- thinking (memories) -------|
  |<-- thinking (generating) -----|
  |<-- thinking (using tool) -----|
  |<-- debug (tool loop) ---------|
  |<-- thinking (continuing) -----|
  |                               |
  |<-- response ------------------|

Tool Loop

When Cass uses tools, the server loops:

  1. LLM returns stop_reason: "tool_use"
  2. Server executes requested tool(s)
  3. Server sends tool results back to LLM
  4. LLM generates next response (may use more tools)
  5. Repeat until stop_reason: "end_turn"
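The loop above can be sketched as follows (illustrative only; call_llm and execute_tool stand in for the backend's provider call and tool runner, which this page does not name):

```python
def run_tool_loop(call_llm, execute_tool, messages, max_iters=10):
    """Iterate LLM turns until the model stops requesting tools.

    call_llm(messages) -> (stop_reason, tool_calls, text)
    execute_tool(call) -> result
    """
    for _ in range(max_iters):
        stop_reason, tool_calls, text = call_llm(messages)
        if stop_reason == "end_turn":
            return text
        # stop_reason == "tool_use": run each requested tool, feed results back
        results = [execute_tool(call) for call in tool_calls]
        messages.append({"role": "tool", "content": results})
    raise RuntimeError("tool loop did not terminate")
```

The max_iters guard is an assumption; some bound is typically needed so a model that keeps requesting tools cannot loop forever.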

Context Building

For each chat message, the server builds context:

  1. Hierarchical Memory - Summaries and recent details
  2. User Context - Profile and observations for the user
  3. Project Context - Documents if conversation is in a project
  4. Self-Model - Cass's self-observations
  5. Wiki Context - Auto-injected relevant wiki pages

Context is formatted and prepended to the conversation.
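The assembly can be pictured as joining the five sources in the order listed, skipping any that are empty (a sketch; the section headings are illustrative, not the backend's exact template):

```python
def build_context(memory="", user_ctx="", project="", self_model="", wiki=""):
    """Join the five context sources in order; empty sections are dropped."""
    sections = [("Hierarchical Memory", memory),
                ("User Context", user_ctx),
                ("Project Context", project),
                ("Self-Model", self_model),
                ("Wiki Context", wiki)]
    return "\n\n".join(f"# {name}\n{body}" for name, body in sections if body)
```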

Connection Management

ConnectionManager

The server maintains active connections:

class ConnectionManager:
    active_connections: List[WebSocket]
    connection_users: Dict[WebSocket, str]  # websocket -> user_id
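The lifecycle around those two fields can be sketched as follows (method names follow common FastAPI connection-manager patterns and are assumptions, not the backend's verified API):

```python
from typing import Dict, List

class ConnectionManager:
    def __init__(self):
        self.active_connections: List = []
        self.connection_users: Dict = {}   # websocket -> user_id

    def register(self, websocket, user_id):
        """Track a newly accepted, authenticated connection."""
        self.active_connections.append(websocket)
        self.connection_users[websocket] = user_id

    def disconnect(self, websocket):
        """Forget a connection; safe to call twice."""
        if websocket in self.active_connections:
            self.active_connections.remove(websocket)
        self.connection_users.pop(websocket, None)
```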

Disconnect Handling

On WebSocketDisconnect, the connection is removed:

except WebSocketDisconnect:
    manager.disconnect(websocket)

Multi-LLM Support

The WebSocket supports multiple LLM providers:

| Provider | Variable | Description |
|----------|----------|-------------|
| anthropic | Claude | Anthropic Claude models |
| openai | OpenAI | OpenAI GPT models |
| local | Ollama | Local Ollama models |

Provider is selected via:

  • /llm TUI command
  • Ctrl+O TUI shortcut
  • API endpoint

Response includes provider and model fields.

Image Support

Images can be sent with messages:

{
  "type": "chat",
  "message": "What's in this image?",
  "image": "base64-encoded-image-data",
  "image_media_type": "image/png"
}

The image is passed to the LLM for multimodal understanding.
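Encoding the raw bytes for the image field is one line with the standard library (standard base64 rather than the URL-safe alphabet is an assumption):

```python
import base64

def encode_image(image_bytes):
    """Return the base64 string to place in the chat frame's "image" field."""
    return base64.b64encode(image_bytes).decode("ascii")
```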

Error Handling

Errors are sent as error type messages:

{
  "type": "error",
  "message": "Detailed error message"
}

Common errors:

  • Authentication failures
  • LLM API errors
  • Tool execution failures
  • User not found

Key Files

  • backend/main_sdk.py:4799-5830 - WebSocket endpoint
  • backend/main_sdk.py:4799-4830 - ConnectionManager class
  • tui-frontend/tui.py - TUI WebSocket client
  • backend/auth.py - Token handling
