[PR 3/7] Add timeout and circuit breaker resilience features #71

Merged: gkamradt merged 4 commits into checkpoint/pr2-local-checkpointing from checkpoint/pr3-cli-integration on Jan 23, 2026
Conversation

@ericc59 (Contributor) commented Jan 22, 2026

Summary

  • Timeout utilities: TaskTimeoutError, request_timeout context manager, task_timeout function
  • Circuit breaker: CLOSED/OPEN/HALF_OPEN states with configurable failure threshold and recovery timeout
  • CircuitBreakerRegistry: manages per-provider circuit breakers
  • CLI integration: --max-task-timeout and --circuit-breaker-threshold flags in run_all.py
  • Provider config: timeout and circuit breaker settings per provider in provider_config.yml
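
To make the state machine in the summary concrete, here is a minimal sketch of a breaker with CLOSED/OPEN/HALF_OPEN states plus a per-provider registry. It is not the code from this PR: the class names `SketchCircuitBreaker` and `SketchRegistry`, the method names, the default values, and the injectable `clock` parameter are all assumptions made for illustration.

```python
import time
from enum import Enum


class State(Enum):
    CLOSED = "closed"        # requests flow normally
    OPEN = "open"            # requests are rejected
    HALF_OPEN = "half_open"  # a trial request is allowed through


class SketchCircuitBreaker:
    """Illustrative breaker; names and defaults are hypothetical."""

    def __init__(self, failure_threshold=3, recovery_timeout=30.0,
                 clock=time.monotonic):
        self.failure_threshold = failure_threshold
        self.recovery_timeout = recovery_timeout
        self._clock = clock  # injectable so the timing can be unit tested
        self._failures = 0
        self._opened_at = None
        self._state = State.CLOSED

    @property
    def state(self):
        # An OPEN breaker becomes HALF_OPEN once the recovery timeout elapses.
        if (self._state is State.OPEN
                and self._clock() - self._opened_at >= self.recovery_timeout):
            self._state = State.HALF_OPEN
        return self._state

    def allow_request(self):
        return self.state is not State.OPEN

    def record_success(self):
        # Any success resets the failure count and closes the breaker.
        self._failures = 0
        self._state = State.CLOSED

    def record_failure(self):
        if self.state is State.HALF_OPEN:
            # A failed trial request re-opens the breaker immediately.
            self._trip()
            return
        self._failures += 1
        if self._failures >= self.failure_threshold:
            self._trip()

    def _trip(self):
        self._state = State.OPEN
        self._opened_at = self._clock()


class SketchRegistry:
    """Hands out one breaker per provider name, created on first use."""

    def __init__(self, **breaker_kwargs):
        self._kwargs = breaker_kwargs
        self._breakers = {}

    def get(self, provider):
        if provider not in self._breakers:
            self._breakers[provider] = SketchCircuitBreaker(**self._kwargs)
        return self._breakers[provider]
```

In this sketch a caller would check `allow_request()` before each API call and report the outcome with `record_success()` or `record_failure()`. Keeping one breaker per provider means failures from one provider do not block requests to another.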

Dependencies

  • Requires [PR 2/7] local checkpointing

Test plan

  • Run pytest src/arc_agi_benchmarking/tests/test_resilience.py (41 tests)
  • Test with real API calls to verify timeout enforcement
  • Verify circuit breaker opens after consecutive failures

@ericc59 ericc59 force-pushed the checkpoint/pr2-local-checkpointing branch from 4902f2d to 3158bfa Compare January 22, 2026 20:51
- Add resilience module with TaskTimeoutError, request_timeout, task_timeout
- Add CircuitBreaker with CLOSED/OPEN/HALF_OPEN states and configurable thresholds
- Add CircuitBreakerRegistry for managing per-provider circuit breakers
- Integrate timeout and circuit breaker into cli/run_all.py
- Add --max-task-timeout and --circuit-breaker-threshold CLI flags
- Add timeout and circuit breaker config to provider_config.yml
- Add 41 tests for resilience module
Timeouts now only apply when --max-task-timeout is explicitly passed.
This prevents accidentally failing long-running reasoning model tasks.
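
One way to get this opt-in behavior is to default the flag to `None` and skip the timeout path entirely in that case. The sketch below is an assumption about how that could look, not the code in `run_all.py`: `build_parser`, `run_task`, `slow_task`, and `SketchTaskTimeoutError` are hypothetical names, and the thread-based timeout is a stand-in technique that may differ from what the PR uses.

```python
import argparse
import concurrent.futures
import time


class SketchTaskTimeoutError(Exception):
    """Hypothetical stand-in for the PR's TaskTimeoutError."""


def build_parser():
    parser = argparse.ArgumentParser()
    # default=None means "no timeout" unless the flag is passed explicitly.
    parser.add_argument("--max-task-timeout", type=float, default=None,
                        help="Per-task timeout in seconds (off by default)")
    return parser


def run_task(fn, max_task_timeout=None):
    """Run fn(), enforcing a timeout only when one was requested."""
    if max_task_timeout is None:
        return fn()  # no flag given: long-running tasks are never cut off
    pool = concurrent.futures.ThreadPoolExecutor(max_workers=1)
    future = pool.submit(fn)
    try:
        return future.result(timeout=max_task_timeout)
    except concurrent.futures.TimeoutError as exc:
        raise SketchTaskTimeoutError(
            f"task exceeded {max_task_timeout}s") from exc
    finally:
        # Do not block on the worker; note that the thread itself keeps
        # running until fn returns, since threads cannot be force-stopped.
        pool.shutdown(wait=False)


def slow_task():
    # Simulates a slow model call for demonstration purposes.
    time.sleep(0.5)
    return "done"
```

With this shape, `run_task(slow_task)` always completes, while `run_task(slow_task, max_task_timeout=0.05)` raises the timeout error. A caveat of the thread-based approach is that the timed-out work is abandoned rather than cancelled.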
@ericc59 ericc59 force-pushed the checkpoint/pr3-cli-integration branch from 3b48102 to 94824ca Compare January 22, 2026 20:51
@gkamradt gkamradt merged commit ae9fb93 into checkpoint/pr2-local-checkpointing Jan 23, 2026
3 checks passed
2 participants