@jwm4
Contributor

@jwm4 jwm4 commented Jan 28, 2026

Summary

Fixes #4754

When making MCP calls through the responses API, the llama-stack server CPU usage could spike to 100% and remain there indefinitely, even after the request completes.

Root Cause

The issue occurs during MCP session cleanup in MCPSessionManager.close_all(). When tasks don't respond to cancellation, anyio's _deliver_cancellation loop can spin indefinitely, causing the CPU spike.

Solution

Added a configurable timeout (default 5 seconds) to the __aexit__ calls using anyio.fail_after(). If cleanup takes longer than the timeout, it's aborted to prevent the CPU spin.
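A minimal sketch of the timeout-bounded cleanup idea, not the code from this PR. The PR uses anyio.fail_after(); this sketch swaps in the standard library's asyncio.wait_for() so it runs without extra dependencies. `SlowSession`, `close_all`, and `CLEANUP_TIMEOUT_SECONDS` are hypothetical names for illustration and do not reflect the real MCPSessionManager internals.

```python
import asyncio

# Assumed default, mirroring the 5 seconds mentioned in the PR description.
CLEANUP_TIMEOUT_SECONDS = 5.0


class SlowSession:
    """Hypothetical stand-in for an MCP session whose cleanup may be slow."""

    def __init__(self, cleanup_delay: float) -> None:
        self.cleanup_delay = cleanup_delay
        self.closed = False

    async def __aenter__(self) -> "SlowSession":
        return self

    async def __aexit__(self, exc_type, exc, tb) -> bool:
        # Simulate cleanup work that takes `cleanup_delay` seconds.
        await asyncio.sleep(self.cleanup_delay)
        self.closed = True
        return False


async def close_all(sessions, timeout: float = CLEANUP_TIMEOUT_SECONDS):
    """Call __aexit__ on each session, abandoning any that exceed `timeout`.

    Returns the sessions whose cleanup timed out.
    """
    timed_out = []
    for session in sessions:
        try:
            # Stdlib analogue of wrapping the call in anyio.fail_after(timeout).
            await asyncio.wait_for(session.__aexit__(None, None, None), timeout)
        except asyncio.TimeoutError:
            # Give up on this session and move on to the next one.
            timed_out.append(session)
    return timed_out


def demo():
    fast, slow = SlowSession(0.0), SlowSession(10.0)
    timed_out = asyncio.run(close_all([fast, slow], timeout=0.05))
    return fast.closed, slow.closed, len(timed_out)
```

Note that asyncio.wait_for() cancels the cleanup coroutine and waits for that cancellation to finish, so this sketch only helps when cleanup responds to cancellation. Whether a timeout alone is enough for tasks that ignore cancellation is the question raised in the review comments.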

Testing

  • Verified that after the fix, CPU usage returns to idle levels after MCP requests complete
  • Existing error handling catches the TimeoutError from fail_after() gracefully

When making MCP calls through the responses API, the llama-stack server
CPU usage could spike to 100% and remain there indefinitely due to
anyio's _deliver_cancellation loop hanging during session cleanup.

This fix adds a configurable timeout (default 5 seconds) to the
__aexit__ calls in MCPSessionManager.close_all() using anyio.fail_after().
If cleanup takes longer than the timeout, it's aborted to prevent the
CPU spin.

Fixes llamastack#4754
@meta-cla meta-cla bot added the CLA Signed label Jan 28, 2026
@jwm4 jwm4 mentioned this pull request Jan 28, 2026
@jwm4 jwm4 changed the title from "Fix MCP CPU spike by adding timeout to session cleanup" to "fix: MCP CPU spike by adding timeout to session cleanup" Jan 28, 2026
mattf previously requested changes Jan 28, 2026
Collaborator

@mattf mattf left a comment

please provide reproduction steps.

i did the following and still see 100% CPU usage -

10:53:24 in llama-stack on  fix/mcp-cpu-spike-timeout [$?] is 📦 0.4.0.dev0 …
➜ uv run llama stack run --providers agents=inline::meta-reference,inference=remote::llama-openai-compat,vector_io=inline::faiss,tool_runtime=inline::rag-runtime,files=inline::localfs
...
INFO     2026-01-28 10:53:34,588 uvicorn.error:216 uncategorized: Uvicorn running on http://['::', '0.0.0.0']:8321      
         (Press CTRL+C to quit)                                                                                         
INFO     2026-01-28 10:53:38,379 uvicorn.access:476 uncategorized: ::1:53190 - "POST /v1/responses HTTP/1.1" 200
10:53:35 in llama-stack on  fix/mcp-cpu-spike-timeout [$?] is 📦 0.4.0.dev0 …
➜ curl http://localhost:8321/v1/responses \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama-openai-compat/Llama-4-Scout-17B-16E-Instruct-FP8",
    "input": "Use the provided tool to say something.",
    "tools": [
      {
        "type": "mcp",
        "server_label": "local-mcp",
        "server_url": "http://localhost:9090"
      }
    ],
    "tool_choice": "auto"
  }'

@derekhiggins
Contributor

Also still seeing the problem when running https://github.com/derekhiggins/rhoai-auth-demo/blob/main/scripts/interactive-demo.py:
python scripts/interactive-demo.py --user admin --tests mcp

@mattf mattf dismissed their stale review January 28, 2026 16:55

proposed alternative change

@derekhiggins
Contributor

lgtm, CPU spike is gone when using MCP
thanks both.

@mergify

mergify bot commented Jan 30, 2026

This pull request has merge conflicts that must be resolved before it can be merged. @jwm4 please rebase it. https://docs.github.com/en/pull-requests/collaborating-with-pull-requests/working-with-forks/syncing-a-fork

@mergify mergify bot added the needs-rebase label Jan 30, 2026

Labels

CLA Signed, needs-rebase

Development

Successfully merging this pull request may close these issues.

MCP CPU Spike

3 participants