fix(proxy): keep terminal WS open on idle by clearing upstream read timeout #99

Closed
viniciussouzax wants to merge 1 commit into evolution-foundation:main from viniciussouzax:fix/ws-proxy-idle-timeout

@viniciussouzax

Problem

On a deployment where the browser reaches the chat through the Flask
/terminal/ws proxy (public VPS, Cloudflare/Tailscale tunnel, reverse proxy),
an idle chat reconnects on a ~10s loop. The user sees the chat flashing
"connecting", and the browser console logs repeated:

WebSocket connection to '.../terminal/ws' failed: Invalid frame header

Server logs confirm a fresh GET /terminal/ws every ~10s while the tab is open and idle.

Root cause

In register_websocket_proxy, the upstream socket is opened with:

upstream = create_connection(target, timeout=10)

websocket-client's timeout argument is not only a connect timeout: it
also becomes the socket read timeout and persists for the life of the
connection. So upstream.recv() in the _pump_upstream_to_client thread
raises a timeout after 10s with no upstream data, the pump exits, stop is
set, and the client socket is closed. The frontend then reconnects (its ping
is every 25s, so the bridge dies before each cycle).
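
The persistence of a connect-time timeout can be reproduced with the standard library alone. The sketch below is an analogy, not the proxy's code: it uses Python's socket.create_connection (which also keeps its timeout argument as the socket's read timeout) instead of websocket-client, a throwaway local listener, and a 0.2s timeout standing in for the 10s one. The helper name connect_and_report is hypothetical.

```python
import socket


def connect_and_report(clear_read_timeout: bool):
    """Connect to an idle local peer and report (timeout, idle_read_timed_out).

    Stdlib analogy only: socket.create_connection stands in for
    websocket-client's create_connection, and 0.2s stands in for 10s.
    """
    with socket.socket() as server:
        server.bind(("127.0.0.1", 0))  # any free local port
        server.listen(1)
        # The timeout passed here is applied to the socket itself,
        # so it still governs recv() after the connect has finished.
        with socket.create_connection(server.getsockname(), timeout=0.2) as conn:
            peer, _ = server.accept()
            with peer:  # the peer never sends anything: an "idle" upstream
                if clear_read_timeout:
                    conn.settimeout(None)  # back to blocking reads
                    return conn.gettimeout(), False
                try:
                    conn.recv(1)
                    timed_out = False
                except socket.timeout:
                    timed_out = True
                return conn.gettimeout(), timed_out


if __name__ == "__main__":
    print(connect_and_report(clear_read_timeout=False))
    print(connect_and_report(clear_read_timeout=True))
```

In the first call the idle read fails once the timeout elapses, which mirrors the pump thread exiting on an idle chat; in the second the timeout reads back as None, so a read would block until data arrives or the peer closes.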

Fix

Clear the read timeout right after connect so recv() blocks until real data
arrives or the socket closes:

upstream = create_connection(target, timeout=10)
upstream.settimeout(None)

Genuine disconnects are still detected by the client receive loop (its own
30s client_ws.receive(timeout=30)) and the frontend ping/pong heartbeat, so
no half-open sockets are introduced.
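
The effect on the pump thread can be sketched with a stand-in upstream. Everything here is an assumption made for illustration: FakeUpstream, pump_upstream_to_client and run_idle_then_message are hypothetical, the real _pump_upstream_to_client is not shown in this PR, and the timings are scaled down (0.1s read timeout, 0.3s idle period).

```python
import queue
import threading


class FakeUpstream:
    """Hypothetical stand-in for the upstream WebSocket (not websocket-client)."""

    def __init__(self, read_timeout):
        self._frames = queue.Queue()
        self._timeout = read_timeout

    def settimeout(self, value):
        self._timeout = value  # None means block indefinitely

    def feed(self, frame):
        self._frames.put(frame)  # None simulates the upstream closing

    def recv(self):
        try:
            frame = self._frames.get(timeout=self._timeout)
        except queue.Empty:
            raise TimeoutError("upstream read timed out")
        if frame is None:
            raise ConnectionError("upstream closed")
        return frame


def pump_upstream_to_client(upstream, sent, stop):
    """Forward frames until the upstream read fails, then signal stop."""
    try:
        while not stop.is_set():
            sent.append(upstream.recv())
    except (TimeoutError, ConnectionError):
        pass
    finally:
        stop.set()  # in the proxy this is what tears down the bridge


def run_idle_then_message(clear_timeout, idle_s=0.3):
    upstream = FakeUpstream(read_timeout=0.1)
    if clear_timeout:
        upstream.settimeout(None)  # the one-line fix
    sent, stop = [], threading.Event()
    worker = threading.Thread(
        target=pump_upstream_to_client, args=(upstream, sent, stop), daemon=True
    )
    worker.start()
    stop.wait(idle_s)       # idle period; returns early if the pump died
    upstream.feed("hello")  # first real frame after the idle period
    upstream.feed(None)     # then a genuine close
    worker.join(timeout=2)
    return sent


if __name__ == "__main__":
    print(run_idle_then_message(clear_timeout=False))
    print(run_idle_then_message(clear_timeout=True))
```

With the timeout left in place the pump exits during the idle period and the later frame is never forwarded; with it cleared the pump survives the idle period, forwards the frame, and still exits on the genuine close.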

Testing

  • Before: with an idle chat open, docker logs shows GET /terminal/ws every
    ~10s; UI flashes "connecting"; console shows "Invalid frame header".
  • After: a single stable connection; 0 reconnects observed over a 45s idle
    window; no console errors. Streaming responses and the terminal are unaffected.

Related

Complementary to #86, which fixes the client-side idle teardown
(client_ws.receive returning None → break). This PR fixes the
upstream-side read timeout. The two fixes are independent; both are needed
for a fully idle-stable bridge.

…imeout

websocket-client's create_connection(timeout=10) also sets the socket read
timeout, and it persists after connect. As a result upstream.recv() in the
_pump_upstream_to_client thread raised a timeout every 10s on an idle chat and
tore down the /terminal/ws bridge. The frontend then reconnected on a ~10s
loop, visible as the chat flashing "connecting" (and "Invalid frame header"
in the browser console).

Clear the read timeout after connect (upstream.settimeout(None)) so recv()
blocks until real data arrives or the socket closes. Genuine disconnects are
still detected by the client receive loop (its own 30s timeout) and the
frontend ping/pong heartbeat.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>

@sourcery-ai sourcery-ai Bot left a comment

Sorry @viniciussouzax, you have reached your weekly rate limit of 500000 diff characters.

Please try again later or upgrade to continue using Sourcery

@viniciussouzax viniciussouzax closed this by deleting the head repository Jun 9, 2026