[Bug]: Windows + Cloud backend can fail with LiteLLM InternalServerError wrapping down.app.all-hands.dev outage page #1260

@jamiechicago312

Operating System

Windows

Installation Method

Other: Agent Canvas using a Cloud backend

Agent Canvas Version

Unknown / not provided

Bug Description

When using Agent Canvas with a Cloud backend on Windows, requests can fail with a LiteLLM internal server error that wraps an HTML downtime page from down.app.all-hands.dev.

The reported error begins like this:

litellm.InternalServerError: InternalServerError: Litellm_proxyException - <!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>Service Temporarily Unavailable</title>
  <link rel="icon" href="https://down.app.all-hands.dev/assets/favicon.ico" type="image/x-icon">

This suggests that the request sometimes receives an HTML "Service Temporarily Unavailable" page instead of the expected API/LLM response.
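To make that diagnosis checkable, here is a minimal sketch of how an error string could be tested for an embedded HTML page. The helper names (`contains_html_page`, `mentions_outage_host`) are hypothetical and are not part of LiteLLM or Agent Canvas; the only inputs assumed are the error text and the host name quoted in this report.

```python
import re

# Hypothetical helpers for triage; not part of LiteLLM or Agent Canvas.
# Matches the start of an HTML document anywhere in the error text.
_HTML_START = re.compile(r"<!DOCTYPE html|<html[\s>]", re.IGNORECASE)


def contains_html_page(error_text: str) -> bool:
    """Return True if the error text embeds what looks like an HTML document."""
    return bool(_HTML_START.search(error_text))


def mentions_outage_host(error_text: str, host: str = "down.app.all-hands.dev") -> bool:
    """Return True if the error text references the downtime host from this report."""
    return host in error_text
```

Checking both conditions separately keeps the test useful if the downtime page is later served from a different host.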

Steps to Reproduce

  1. Run Agent Canvas on a Windows machine.
  2. Connect/use the Cloud backend.
  3. Send a prompt that triggers an LLM request.
  4. Observe that the request can fail with a litellm.InternalServerError / Litellm_proxyException containing HTML from down.app.all-hands.dev.

Actual Behavior

Instead of a normal model response or a cleaner backend error, the conversation fails with a LiteLLM internal server error whose payload includes a full HTML downtime page (Service Temporarily Unavailable).

Expected Behavior

  • Cloud-backed requests should not surface a raw HTML downtime page inside the LiteLLM exception.
  • If the upstream service is unavailable, Agent Canvas should show a clearer cloud-backend outage/error state.
  • If this is Windows-specific, that platform-specific trigger should be identified and fixed.
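One possible shape for the clearer error state described above is sketched below. It assumes the client has the exception text as a string; `summarize_backend_error` and the wording of its message are hypothetical, not existing Agent Canvas behavior.

```python
import re

# Hypothetical sketch; the function name and message wording are assumptions.
_TITLE = re.compile(r"<title>(.*?)</title>", re.IGNORECASE | re.DOTALL)


def summarize_backend_error(error_text: str) -> str:
    """Replace an embedded HTML page with a short, readable outage message."""
    lowered = error_text.lower()
    if "<!doctype html" not in lowered and "<html" not in lowered:
        # Not an HTML payload: pass the original error through unchanged.
        return error_text
    match = _TITLE.search(error_text)
    # Collapse whitespace in the page title; fall back if the page was truncated.
    title = " ".join(match.group(1).split()) if match else "unexpected HTML response"
    return f"Cloud backend unavailable ({title}). Please try again later."
```

For the log in this report, the function would return `Cloud backend unavailable (Service Temporarily Unavailable). Please try again later.` Errors without HTML are returned untouched, so ordinary failures are not masked.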

Relevant Logs

litellm.InternalServerError: InternalServerError: Litellm_proxyException - <!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8">
  <meta name="viewport" content="width=device-width, initial-scale=1.0">
  <title>Service Temporarily Unavailable</title>
  <link rel="icon" href="https://down.app.all-hands.dev/assets/favicon.ico" type="image/x-icon">

Additional Context

  • Reported specifically when using the Agent Canvas Cloud backend on Windows.
  • I searched for existing issues using terms including litellm, Litellm_proxyException, down.app.all-hands.dev, cloud backend, InternalServerError, and windows, but did not find an existing issue matching this exact failure mode.
  • Nearest related issue found was [Bug]: Refreshing stale Cloud conversation on Vercel fails #1159, which is about refreshing stale cloud conversations, not this LiteLLM/downtime-page error path.

This issue was created by an AI agent (OpenHands) on behalf of the user.
