Stop hooks cause conversation to stall in 'Working' state after FinishAction #649

@xingyaoww

Bug Description

When a user configures a Stop hook in .openhands/hooks.json, the conversation stalls in the "Working" state after the agent completes (FinishAction). The UI becomes unresponsive: pressing ESC shows "Pausing conversation this may take a few seconds", but the pause never completes.

Steps to Reproduce

  1. Configure a stop hook in ~/.openhands/hooks.json:
{
  "hooks": {
    "Stop": [
      {
        "matcher": "*",
        "hooks": [
          {
            "command": "python ~/.openhands/stop_hook.py",
            "timeout": 30
          }
        ]
      }
    ]
  }
}
  2. Start a conversation and send a message
  3. Wait for the agent to respond with FinishAction
  4. Observe the UI stays at "Working (Ns • ESC: pause)" with the counter increasing indefinitely

Root Cause Analysis

Two issues in how stop hooks interact with the SDK's run loop:

1. Python exit code 2 treated as "block" → infinite loop

The hook executor treats exit code 2 as "block the operation" (a hook protocol convention). However, Python itself exits with code 2 when it can't find a script file:

$ python /nonexistent.py
python3: can't open file '/nonexistent.py': [Errno 2] No such file or directory
$ echo $?
2

This means if the stop hook script doesn't exist (e.g., path issue, different environment), the hook executor interprets this as "deny the stop". The SDK run loop then:

  1. Sets status back to RUNNING
  2. Calls agent.step() → agent calls FinishAction again → FINISHED
  3. Runs stop hook again → denied again → RUNNING
  4. Repeats indefinitely (each iteration involves an LLM call)
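
The exit-code collision and the loop it causes can be reproduced outside the SDK. The sketch below is a toy model, not the SDK's run loop: `run_hook`, `toy_run_loop`, and the `max_iterations` cap are hypothetical names added for illustration, and the agent step is a stub rather than an LLM call.

```python
import subprocess
import sys

BLOCK_EXIT_CODE = 2  # hook protocol convention described above


def run_hook(argv: list[str], timeout: float = 30) -> int:
    """Run a hook command and return its exit code."""
    return subprocess.run(argv, capture_output=True, timeout=timeout).returncode


def toy_run_loop(hook_argv: list[str], max_iterations: int = 5) -> int:
    """Count how often a toy loop re-enters RUNNING because the hook 'blocks'.

    The cap exists only so that this demo terminates.
    """
    reentries = 0
    for _ in range(max_iterations):
        # Stub for agent.step(): the agent finishes immediately (FINISHED).
        if run_hook(hook_argv) != BLOCK_EXIT_CODE:
            break  # stop allowed, conversation ends
        reentries += 1  # stop denied: status goes back to RUNNING
    return reentries


# The script path does not exist, so the interpreter itself exits with 2.
missing = [sys.executable, "/nonexistent/stop_hook.py"]
print(run_hook(missing))      # 2
print(toy_run_loop(missing))  # 5 (every iteration is "blocked")
```

Without the cap, the loop never exits, which matches the ever-increasing "Working" counter.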

2. Stop hooks run while holding the state lock → pause blocked

The SDK's run loop executes stop hooks inside the with self._state: block, that is, while holding the FIFO state lock. This means:

  • pause() cannot acquire the state lock while the hook runs (up to 30s timeout)
  • ESC/pause is effectively blocked during hook execution
  • Combined with the infinite loop above, the UI becomes permanently unresponsive
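
The lock interaction can be illustrated with a small, deterministic model. This is a sketch under assumptions: `state_lock` is a plain non-reentrant `threading.Lock` standing in for the SDK's FIFO lock, and `try_pause` and `run_stop_hook` are hypothetical helpers. In the real SDK, pause() arrives from another thread and waits instead of failing immediately.

```python
import threading
from typing import Callable

state_lock = threading.Lock()  # stand-in for the SDK's state lock


def try_pause() -> bool:
    """Model pause(): succeeds only if the state lock is free right now."""
    if state_lock.acquire(blocking=False):
        state_lock.release()
        return True
    return False


def run_stop_hook(hook: Callable[[], bool], hold_lock: bool) -> bool:
    """Run `hook` inside or outside the state lock and return its result."""
    if hold_lock:
        with state_lock:
            return hook()
    return hook()


# The "hook" here reports whether a pause request would get through while it runs.
print(run_stop_hook(try_pause, hold_lock=True))   # False
print(run_stop_hook(try_pause, hold_lock=False))  # True
```

Running the hook outside the lock is what lets the pause request through in the second call.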

Expected Behavior

  • Stop hooks should run without blocking the ability to pause
  • A failed/errored stop hook should NOT prevent the conversation from finishing
  • Exit code 2 from a missing script should be treated as an error, not a block
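
One way to get this behavior is to classify each hook run as allow, block, or error and to fail open on errors. The following is a minimal sketch, not the hook executor's actual code. `classify_stop_hook` and `stop_allowed` are hypothetical names, the `.py` pre-flight check is a heuristic, and treating exit code 0 as "allow" and other non-2 codes as errors is an assumption.

```python
import os
import shlex
import subprocess
import sys

BLOCK_EXIT_CODE = 2


def classify_stop_hook(command: str, timeout: float = 30) -> str:
    """Return 'allow', 'block', or 'error' for one stop hook command."""
    argv = [os.path.expanduser(arg) for arg in shlex.split(command)]
    # Heuristic pre-flight check: a missing .py argument is an error, not a block.
    for arg in argv[1:]:
        if arg.endswith(".py") and not os.path.isfile(arg):
            return "error"
    try:
        code = subprocess.run(argv, capture_output=True, timeout=timeout).returncode
    except (OSError, subprocess.TimeoutExpired):
        return "error"  # interpreter not found, or the hook hung
    if code == 0:
        return "allow"
    return "block" if code == BLOCK_EXIT_CODE else "error"


def stop_allowed(command: str) -> bool:
    """Only an explicit block prevents finishing; errors fail open."""
    return classify_stop_hook(command) != "block"


# A missing script is reported as an error instead of a block.
print(classify_stop_hook(f"{shlex.quote(sys.executable)} /nonexistent/stop_hook.py"))  # error
```

A hook that deliberately exits with 2 is still honored as a block. Only the missing-file case is reclassified.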

Proposed Fix

Strip stop hooks from the HookConfig before passing to the SDK's Conversation, and handle them at the CLI/TUI level after conversation.run() returns. This avoids the state lock issue and gives the CLI control over error handling.
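
The stripping step could look roughly like the sketch below. It works on the raw dictionary loaded from hooks.json rather than on the SDK's HookConfig object, whose API is not shown in this issue. `split_stop_hooks` is a hypothetical helper name.

```python
import copy
import json


def split_stop_hooks(config: dict) -> tuple[dict, list]:
    """Return (config without Stop hooks, the removed Stop entries)."""
    remaining = copy.deepcopy(config)  # leave the caller's config untouched
    stop_entries = remaining.get("hooks", {}).pop("Stop", [])
    return remaining, stop_entries


raw = json.loads("""
{"hooks": {"Stop": [{"matcher": "*", "hooks": [
    {"command": "python ~/.openhands/stop_hook.py", "timeout": 30}]}]}}
""")
sdk_config, stop_entries = split_stop_hooks(raw)
print(sdk_config)         # {'hooks': {}}
print(len(stop_entries))  # 1
```

The CLI would then build the SDK configuration from `sdk_config`, call conversation.run(), and run the commands in `stop_entries` itself afterwards, outside the state lock.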

Environment

  • CLI version: 1.14.0
  • SDK version: 1.16.1
  • Issue introduced by SDK bump from 1.11.5 → 1.16.x (which added stop hook handling in the run loop)

This issue was created by an AI assistant (OpenHands) on behalf of the user.
