feat: surface computer error state on terminal disconnect#6
feat: surface computer error state on terminal disconnect#6aniketmaurya wants to merge 2 commits into
Conversation
When the terminal WebSocket drops unexpectedly, check the computer status via the API. If the computer entered ERROR state (e.g. host was spot-interrupted), show the error message and suggest creating a new computer instead of a silent "Disconnected". Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
📝 WalkthroughWalkthroughWhen a websocket closes non‑normally in Changes
Sequence Diagram(s)sequenceDiagram
participant User
participant SSHClient as SSH Client
participant WS as WebSocket
participant API as API Client
participant DB as Computers API
User->>SSHClient: start ssh_to_computer()
SSHClient->>WS: open websocket
WS-->>SSHClient: normal data flow...
alt non-normal close (rcvd.code != 1000)
WS-->>SSHClient: Connection closed (non-normal)
SSHClient->>SSHClient: print "Connection lost"
SSHClient->>API: _get_client(api_key) and computers.get(resolved_id)
API->>DB: fetch computer record
DB-->>API: returns status (running/error/stopped/...)
API-->>SSHClient: computer record / status
SSHClient->>User: print guidance based on status
end
Estimated code review effort🎯 3 (Moderate) | ⏱️ ~20 minutes Poem
🚥 Pre-merge checks | ✅ 2 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (2 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches📝 Generate docstrings
🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (1)
src/celesto/computer.py (1)
266-271: Consider handling the other states, mate. There's more to this world than "error" and "running".Your
ComputerStatuscontract intypes.pydefines nine possible values:"creating","running","stopping","stopped","starting","deleting","deleted", and"error". You're only handling two of them here.If the computer ends up in
"stopped","stopping","deleted", or any transitional state when the connection drops, the user gets no guidance whatsoever — just silence after the "Connection lost" message.Now, I'm not saying you need to handle every bloody state individually, but an
elsebranch with a sensible fallback would show the user you've thought this through properly.♻️ Proposed enhancement: Add fallback for other states
if comp_status == "error": last_err = info.get("last_error", "unknown") console.print(f"[red]Computer error: {last_err}[/red]") console.print("[dim]Create a new computer with: celesto computer create[/dim]") elif comp_status == "running": console.print(f"[yellow]Computer is still running. Reconnect with: celesto computer ssh {computer_id}[/yellow]") + elif comp_status in ("stopped", "stopping", "deleted", "deleting"): + console.print(f"[yellow]Computer is {comp_status}. Create a new one with: celesto computer create[/yellow]") + else: + console.print(f"[dim]Computer status: {comp_status}[/dim]")🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed. In `@src/celesto/computer.py` around lines 266 - 271, The current comp_status handling only covers "error" and "running" in the block that reads comp_status = info.get("status") (the snippet uses comp_status, info and computer_id); add an else branch that covers all other ComputerStatus values (e.g., "creating", "starting", "stopping", "stopped", "deleting", "deleted") and prints a helpful fallback message to the user (for example: indicate the current state and suggested next steps such as waiting, trying "celesto computer start/stop/delete" or recreating with "celesto computer create", and for terminal states like "deleted" suggest recreating); ensure the new branch uses the existing variables comp_status, info.get("last_error") if present, and computer_id to provide context and avoid leaving the user with no guidance.
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/celesto/computer.py`:
- Around line 272-273: You have a bare except in the try/except around the SDK
get() call that swallows all errors (the block with "except Exception: pass");
replace it with targeted exception handling for CelestoAuthenticationError,
CelestoNotFoundError, CelestoNetworkError (and a generic Exception fallback) and
log a clear, user-friendly error message (including the exception details) via
the module/logger used in this file so failures are visible; for recoverable
cases keep the existing best-effort behavior after logging, and for
fatal/authentication/network errors either return a controlled error response or
re-raise after logging to avoid hiding problems.
---
Nitpick comments:
In `@src/celesto/computer.py`:
- Around line 266-271: The current comp_status handling only covers "error" and
"running" in the block that reads comp_status = info.get("status") (the snippet
uses comp_status, info and computer_id); add an else branch that covers all
other ComputerStatus values (e.g., "creating", "starting", "stopping",
"stopped", "deleting", "deleted") and prints a helpful fallback message to the
user (for example: indicate the current state and suggested next steps such as
waiting, trying "celesto computer start/stop/delete" or recreating with "celesto
computer create", and for terminal states like "deleted" suggest recreating);
ensure the new branch uses the existing variables comp_status,
info.get("last_error") if present, and computer_id to provide context and avoid
leaving the user with no guidance.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: a9748e5e-2b5a-4626-a40c-984489e98835
📒 Files selected for processing (1)
src/celesto/computer.py
…connect - Replace bare `except Exception: pass` with specific handlers for CelestoNotFoundError, CelestoAuthenticationError, CelestoNetworkError so failures are visible instead of silently swallowed - Add branches for all ComputerStatus values (stopped, deleted, creating, starting) with actionable guidance for each Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
There was a problem hiding this comment.
Actionable comments posted: 1
🤖 Prompt for all review comments with AI agents
Verify each finding against the current code and only fix it if needed.
Inline comments:
In `@src/celesto/computer.py`:
- Around line 260-287: The branch only runs the status probe when e.rcvd is
present, but websocket network failures raise ConnectionClosedError (or leave
e.rcvd == None) so we must treat e.rcvd is None as an abnormal disconnect:
update the except handling around the websocket recv to also catch
ConnectionClosedError (or check if e.rcvd is None) and in that path execute the
same status probe using _get_client(resolved_id) /
check_client.computers.get(resolved_id) and the existing status-handling logic
(last_error, "running", "stopped"/"stopping", "deleted"/"deleting",
"creating"/"starting"), and set a local flag (e.g., abnormal_disconnect) so the
later clean "Disconnected." footer is suppressed; keep existing
CelestoNotFoundError, CelestoAuthenticationError, CelestoNetworkError handling
but ensure they are used inside this new path so network/auth errors still print
the dim "Could not check computer status" message.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: Organization UI
Review profile: CHILL
Plan: Pro
Run ID: d0676181-04a7-4018-b54c-c35f5f33ad97
📒 Files selected for processing (1)
src/celesto/computer.py
| if e.rcvd and e.rcvd.code != 1000: | ||
| console.print(f"\n[red]Connection closed: code={e.rcvd.code} reason={e.rcvd.reason}[/red]") | ||
| console.print(f"\n[red]Connection lost (code={e.rcvd.code}).[/red]") | ||
| # Check if computer entered error state (e.g. host spot-interrupted) | ||
| try: | ||
| with _get_client(api_key) as check_client: | ||
| info = check_client.computers.get(resolved_id) | ||
| comp_status = info.get("status") | ||
| if comp_status == "error": | ||
| last_err = info.get("last_error", "unknown") | ||
| console.print(f"[red]Computer error: {last_err}[/red]") | ||
| console.print("[dim]Create a new computer with: celesto computer create[/dim]") | ||
| elif comp_status == "running": | ||
| console.print(f"[yellow]Computer is still running. Reconnect with: celesto computer ssh {computer_id}[/yellow]") | ||
| elif comp_status in ("stopped", "stopping"): | ||
| console.print(f"[yellow]Computer is {comp_status}. Start it with: celesto computer start {computer_id}[/yellow]") | ||
| elif comp_status in ("deleted", "deleting"): | ||
| console.print("[red]Computer has been deleted.[/red]") | ||
| console.print("[dim]Create a new computer with: celesto computer create[/dim]") | ||
| elif comp_status in ("creating", "starting"): | ||
| console.print(f"[yellow]Computer is {comp_status}. Wait a moment and retry: celesto computer ssh {computer_id}[/yellow]") | ||
| else: | ||
| console.print(f"[yellow]Computer status: {comp_status}[/yellow]") | ||
| except (CelestoNotFoundError,): | ||
| console.print("[red]Computer not found — it may have been deleted.[/red]") | ||
| except (CelestoAuthenticationError, CelestoNetworkError) as exc: | ||
| console.print(f"[dim]Could not check computer status: {exc}[/dim]") | ||
| except Exception as exc: | ||
| console.print(f"[dim]Could not check computer status: {exc}[/dim]") |
There was a problem hiding this comment.
Abrupt drops can still slip past this recovery path.
Line 260 only runs the new status probe when e.rcvd exists, but websockets only sets rcvd if a close frame was actually received. For network failures / broken handshakes, recv() raises ConnectionClosedError and rcvd may be None, so this whole branch is skipped and the flow falls through to the clean Disconnected. footer on Line 334 — exactly the case this PR is trying to surface. Catch the error close explicitly (or treat e.rcvd is None as abnormal) and suppress the clean-disconnect footer for that path. (websockets.readthedocs.io)
🔧 One tidy way to close the gap
old_settings = termios.tcgetattr(stdin_fd)
done = threading.Event()
+ clean_disconnect = True
def recv_loop():
+ nonlocal clean_disconnect
try:
while not done.is_set():
try:
msg = ws.recv(timeout=1)
except TimeoutError:
continue
if isinstance(msg, str):
os.write(sys.stdout.fileno(), msg.encode("utf-8"))
elif isinstance(msg, bytes):
os.write(sys.stdout.fileno(), msg)
- except websockets.exceptions.ConnectionClosed as e:
- if e.rcvd and e.rcvd.code != 1000:
- console.print(f"\n[red]Connection lost (code={e.rcvd.code}).[/red]")
- # Check if computer entered error state (e.g. host spot-interrupted)
- try:
- with _get_client(api_key) as check_client:
- info = check_client.computers.get(resolved_id)
- comp_status = info.get("status")
- if comp_status == "error":
- last_err = info.get("last_error", "unknown")
- console.print(f"[red]Computer error: {last_err}[/red]")
- console.print("[dim]Create a new computer with: celesto computer create[/dim]")
- elif comp_status == "running":
- console.print(f"[yellow]Computer is still running. Reconnect with: celesto computer ssh {computer_id}[/yellow]")
- elif comp_status in ("stopped", "stopping"):
- console.print(f"[yellow]Computer is {comp_status}. Start it with: celesto computer start {computer_id}[/yellow]")
- elif comp_status in ("deleted", "deleting"):
- console.print("[red]Computer has been deleted.[/red]")
- console.print("[dim]Create a new computer with: celesto computer create[/dim]")
- elif comp_status in ("creating", "starting"):
- console.print(f"[yellow]Computer is {comp_status}. Wait a moment and retry: celesto computer ssh {computer_id}[/yellow]")
- else:
- console.print(f"[yellow]Computer status: {comp_status}[/yellow]")
+ except websockets.exceptions.ConnectionClosedError as e:
+ clean_disconnect = False
+ close_code = e.rcvd.code if e.rcvd else (e.sent.code if e.sent else "unknown")
+ console.print(f"\n[red]Connection lost (code={close_code}).[/red]")
+ try:
+ with _get_client(api_key) as check_client:
+ info = check_client.computers.get(resolved_id)
+ comp_status = info.get("status")
+ if comp_status == "error":
+ last_err = info.get("last_error", "unknown")
+ console.print(f"[red]Computer error: {last_err}[/red]")
+ console.print("[dim]Create a new computer with: celesto computer create[/dim]")
+ elif comp_status == "running":
+ console.print(f"[yellow]Computer is still running. Reconnect with: celesto computer ssh {computer_id}[/yellow]")
+ elif comp_status in ("stopped", "stopping"):
+ console.print(f"[yellow]Computer is {comp_status}. Start it with: celesto computer start {computer_id}[/yellow]")
+ elif comp_status in ("deleted", "deleting"):
+ console.print("[red]Computer has been deleted.[/red]")
+ console.print("[dim]Create a new computer with: celesto computer create[/dim]")
+ elif comp_status in ("creating", "starting"):
+ console.print(f"[yellow]Computer is {comp_status}. Wait a moment and retry: celesto computer ssh {computer_id}[/yellow]")
+ else:
+ console.print(f"[yellow]Computer status: {comp_status}[/yellow]")
except (CelestoNotFoundError,):
console.print("[red]Computer not found — it may have been deleted.[/red]")
except (CelestoAuthenticationError, CelestoNetworkError) as exc:
console.print(f"[dim]Could not check computer status: {exc}[/dim]")
except Exception as exc:
console.print(f"[dim]Could not check computer status: {exc}[/dim]")
+ except websockets.exceptions.ConnectionClosedOK:
+ pass
except OSError:
pass
finally:
done.set()
@@
- console.print("\n[dim]Disconnected.[/dim]")
+ if clean_disconnect:
+ console.print("\n[dim]Disconnected.[/dim]")🧰 Tools
🪛 Ruff (0.15.9)
[warning] 282-282: A length-one tuple literal is redundant in exception handlers
Replace with except CelestoNotFoundError
(B013)
[warning] 286-286: Do not catch blind exception: Exception
(BLE001)
🤖 Prompt for AI Agents
Verify each finding against the current code and only fix it if needed.
In `@src/celesto/computer.py` around lines 260 - 287, The branch only runs the
status probe when e.rcvd is present, but websocket network failures raise
ConnectionClosedError (or leave e.rcvd == None) so we must treat e.rcvd is None
as an abnormal disconnect: update the except handling around the websocket recv
to also catch ConnectionClosedError (or check if e.rcvd is None) and in that
path execute the same status probe using _get_client(resolved_id) /
check_client.computers.get(resolved_id) and the existing status-handling logic
(last_error, "running", "stopped"/"stopping", "deleted"/"deleting",
"creating"/"starting"), and set a local flag (e.g., abnormal_disconnect) so the
later clean "Disconnected." footer is suppressed; keep existing
CelestoNotFoundError, CelestoAuthenticationError, CelestoNetworkError handling
but ensure they are used inside this new path so network/auth errors still print
the dim "Could not check computer status" message.
Summary
last_errorand suggestcelesto computer createcelesto computer sshCompanion PRs:
Test plan
🤖 Generated with Claude Code
Summary by CodeRabbit