Reduce max_steps from 15 to 10 to improve runtime #17

@apenab

Problem

The current setting of max_steps=15 allows more steps than straightforward samples need, which inflates latency.

Proposal

Reduce max_steps from 15 to 10 (or make it configurable with a lower default) to cut average runtime.

Expected Impact

  • Estimated runtime improvement: ~20%
  • Approximate per-sample runtime: 108.5s -> ~95s
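
As a quick sanity check on the two estimates above, the relative reduction implied by the per-sample figures can be computed directly. This is only a sketch: the helper name `relative_reduction` is hypothetical, and it assumes "improvement" means (before - after) / before.

```python
def relative_reduction(before_s: float, after_s: float) -> float:
    """Fraction of runtime saved when going from before_s to after_s."""
    return (before_s - after_s) / before_s


baseline_s = 108.5   # per-sample runtime quoted in this issue
projected_s = 95.0   # approximate projected runtime quoted in this issue

reduction = relative_reduction(baseline_s, projected_s)
print(f"{reduction:.1%}")  # 12.4%

# Per-sample runtime that an exact 20% reduction would imply.
target_s = baseline_s * (1 - 0.20)
print(f"{target_s:.1f}s")  # 86.8s
```

Under that definition, 108.5s -> ~95s is closer to a 12% reduction than 20%, so the ~20% estimate may rest on a different baseline or measure. The 15 vs 10 benchmark in the acceptance criteria should settle which figure holds.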

Risk

  • Visible quality risk on complex samples that need deeper reasoning and may hit the lower step limit before finishing.

Implementation Considerations

  • Keep max_steps configurable per experiment/profile.
  • Consider dataset- or difficulty-aware overrides.
  • Pair with evaluation guardrails to detect regressions quickly.
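
One way the considerations above could fit together is sketched below. Everything here is an assumption for illustration: `RunConfig`, `PROFILES`, `steps_for`, and the difficulty labels are hypothetical names, not the project's existing configuration API.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

DEFAULT_MAX_STEPS = 10  # proposed default; the current value is 15


@dataclass(frozen=True)
class RunConfig:
    """Hypothetical per-experiment settings (not the project's real config class)."""

    max_steps: int = DEFAULT_MAX_STEPS
    # Optional difficulty-aware overrides, e.g. {"hard": 15}.
    overrides: Dict[str, int] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Reject step limits that could never run a single step.
        limits = [self.max_steps, *self.overrides.values()]
        if any(limit < 1 for limit in limits):
            raise ValueError("max_steps and overrides must be >= 1")

    def steps_for(self, difficulty: Optional[str] = None) -> int:
        """Return the step limit for a sample, falling back to max_steps."""
        return self.overrides.get(difficulty, self.max_steps)


# Hypothetical profiles: a fast default and an opt-in thorough one.
PROFILES = {
    "fast": RunConfig(max_steps=10, overrides={"hard": 15}),
    "thorough": RunConfig(max_steps=15),
}
```

With this shape, `PROFILES["fast"].steps_for("hard")` keeps the old limit of 15 for samples labeled hard, while everything else runs with 10.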

Acceptance Criteria

  • max_steps is configurable and documented.
  • Benchmarks compare 15 vs 10 with:
    • latency
    • token usage
    • correctness/grounding metrics
  • Regression analysis identifies which sample types degrade most.
  • Decision note documents whether 10 becomes default or remains an opt-in profile.
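
The benchmark and regression criteria above could be checked with something like the following sketch. The record layout (`type`, `max_steps`, `latency_s`, `tokens`, `correct`) and both function names are assumptions, and the sample records are made-up placeholders, not measurements from this project.

```python
from collections import defaultdict
from statistics import mean


def summarize(results):
    """Average latency, token usage, and accuracy per max_steps setting."""
    by_steps = defaultdict(list)
    for r in results:
        by_steps[r["max_steps"]].append(r)
    return {
        steps: {
            "latency_s": mean(r["latency_s"] for r in rows),
            "tokens": mean(r["tokens"] for r in rows),
            "accuracy": mean(1.0 if r["correct"] else 0.0 for r in rows),
        }
        for steps, rows in by_steps.items()
    }


def accuracy_drop_by_type(results, baseline=15, candidate=10):
    """List (sample_type, accuracy drop) pairs, largest drop first."""
    acc = defaultdict(lambda: defaultdict(list))
    for r in results:
        acc[r["type"]][r["max_steps"]].append(1.0 if r["correct"] else 0.0)
    drops = {
        sample_type: mean(per_steps[baseline]) - mean(per_steps[candidate])
        for sample_type, per_steps in acc.items()
        # Only compare types that were run under both settings.
        if per_steps[baseline] and per_steps[candidate]
    }
    return sorted(drops.items(), key=lambda kv: kv[1], reverse=True)


# Made-up placeholder records, purely to show the expected input shape.
toy_results = [
    {"type": "simple", "max_steps": 15, "latency_s": 10.0, "tokens": 400, "correct": True},
    {"type": "simple", "max_steps": 10, "latency_s": 8.0, "tokens": 300, "correct": True},
    {"type": "complex", "max_steps": 15, "latency_s": 20.0, "tokens": 600, "correct": True},
    {"type": "complex", "max_steps": 10, "latency_s": 18.0, "tokens": 500, "correct": False},
]

summary = summarize(toy_results)
worst_first = accuracy_drop_by_type(toy_results)
```

Sorting by accuracy drop puts the sample types that degrade most at the top, which is the input the decision note needs.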
