@surajsahani surajsahani commented Jan 29, 2026

Description

This PR adds a robust retry mechanism with exponential backoff to help developers handle transient failures when working with LLM APIs and other external services in the any-agent library.

What Changed

  • src/any_agent/utils/retry.py: New retry utility module with retry_with_backoff decorator and RetryError exception
  • src/any_agent/utils/__init__.py: Export new utilities
  • tests/unit/utils/test_retry.py: Comprehensive test suite with 15 test cases

Features

  • ✅ Exponential backoff with configurable parameters
  • ✅ Support for both sync and async functions (auto-detected)
  • ✅ Selective exception handling (retry only specific error types)
  • ✅ Configurable max delay cap
  • ✅ Comprehensive logging
  • ✅ Type-safe (passes mypy strict checks)
  • ✅ 15 comprehensive test cases, all passing
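
To make the backoff and delay-cap behaviour concrete, here is a minimal sketch of how such a delay schedule could be computed. It is not taken from retry.py: the helper name backoff_delays and the parameters backoff_factor and max_delay are assumptions chosen for illustration. Only max_attempts and initial_delay appear in this PR's usage example.

```python
def backoff_delays(
    max_attempts: int,
    initial_delay: float = 1.0,
    backoff_factor: float = 2.0,  # assumed parameter name, not confirmed by the PR
    max_delay: float = 60.0,      # assumed parameter name, not confirmed by the PR
) -> list[float]:
    """Return the waits between attempts for a capped exponential backoff.

    There is one wait fewer than there are attempts, because nothing is
    slept after the final attempt.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    return [
        min(initial_delay * backoff_factor**i, max_delay)
        for i in range(max_attempts - 1)
    ]


if __name__ == "__main__":
    # Delays double each time until they hit the 5-second cap.
    print(backoff_delays(5, initial_delay=1.0, backoff_factor=2.0, max_delay=5.0))
```

With these assumed defaults, the cap only matters once the doubled delay exceeds max_delay; before that point the schedule is purely exponential.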

Why This Matters

When building AI agents that interact with LLM APIs, transient failures are common (rate limiting, network timeouts, temporary service unavailability). This utility provides a production-ready solution that improves reliability and reduces boilerplate error-handling code.

Usage Example

from any_agent.utils import retry_with_backoff

@retry_with_backoff(max_attempts=3, initial_delay=1.0)
async def call_llm_api():
    response = await client.chat.completions.create(...)
    return response
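
For readers who want to see how sync/async auto-detection and selective exception handling can fit together, the following is a self-contained sketch of a decorator in the same spirit. It is not the code added in this PR: retry_sketch is a hypothetical name, the RetryError class here is a local stand-in, and the backoff_factor, max_delay, and exceptions parameters are assumptions.

```python
import asyncio
import functools
import inspect
import time
from typing import Any, Callable


class RetryError(Exception):
    """Local stand-in: raised once every attempt has failed."""


def retry_sketch(
    max_attempts: int = 3,
    initial_delay: float = 1.0,
    backoff_factor: float = 2.0,  # assumed parameter
    max_delay: float = 60.0,      # assumed parameter
    exceptions: tuple[type[BaseException], ...] = (Exception,),  # assumed parameter
) -> Callable[[Callable[..., Any]], Callable[..., Any]]:
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    def delay_after(attempt: int) -> float:
        # attempt is 1-based; the wait grows exponentially up to max_delay.
        return min(initial_delay * backoff_factor ** (attempt - 1), max_delay)

    def decorator(func: Callable[..., Any]) -> Callable[..., Any]:
        if inspect.iscoroutinefunction(func):
            # Async path: await the call and sleep without blocking the loop.
            @functools.wraps(func)
            async def async_wrapper(*args: Any, **kwargs: Any) -> Any:
                for attempt in range(1, max_attempts + 1):
                    try:
                        return await func(*args, **kwargs)
                    except exceptions as exc:
                        if attempt == max_attempts:
                            raise RetryError(f"gave up after {attempt} attempts") from exc
                        await asyncio.sleep(delay_after(attempt))

            return async_wrapper

        # Sync path: same loop, but with a blocking sleep.
        @functools.wraps(func)
        def sync_wrapper(*args: Any, **kwargs: Any) -> Any:
            for attempt in range(1, max_attempts + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions as exc:
                    if attempt == max_attempts:
                        raise RetryError(f"gave up after {attempt} attempts") from exc
                    time.sleep(delay_after(attempt))

        return sync_wrapper

    return decorator
```

In this sketch, exceptions outside the exceptions tuple are not caught and propagate on the first failure, and the original error stays reachable through the RetryError's __cause__.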

Testing

# All tests pass
uv run pytest tests/unit/utils/test_retry.py -v
# Results: 15 passed

# Code quality checks pass
uv run pre-commit run --files src/any_agent/utils/retry.py src/any_agent/utils/__init__.py

Checklist

  • Code follows project style guidelines (ruff, mypy pass)
  • Tests added with comprehensive coverage
  • All tests pass locally
  • No breaking changes to existing API
  • Follows contribution guidelines from CONTRIBUTING.md

Related Issues

This addresses the need for better error handling in agent operations, particularly when dealing with external API calls that may experience transient failures.

- Add retry_with_backoff decorator for handling transient failures
- Support both sync and async functions automatically
- Configurable exponential backoff with max delay cap
- Selective exception handling for specific error types
- Add RetryError exception for exhausted retry attempts
- Include comprehensive test suite with 15 test cases
- Add proper logging and type safety (mypy strict)