Problem
The step-level prompt in backend/agent.py resends too much page state on every step:
- large system prompt
- browsing history
- error context
- visible text dump
- interactive element dump
Resending all of this on every step causes heavy, repeated token usage.
Proposal
Compress the per-step context:
- replace raw visible text with a page summary
- send only top-k interactive candidates, not the full element list every time
- send delta history instead of repeated history
- send only new errors since last step
- keep invariant prompt blocks much shorter
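A minimal sketch of how the delta and top-k ideas above could fit together. Everything here is hypothetical and not taken from backend/agent.py: the names `StepState`, `score_element`, and `build_step_context`, the `label` key on elements, and the naive keyword scoring are all assumptions. Page summarization is left out, since it depends on how the summarizer is built.

```python
from dataclasses import dataclass


@dataclass
class StepState:
    """Hypothetical record of how much history and how many errors were already sent."""
    history_sent: int = 0
    errors_sent: int = 0


def score_element(element: dict, goal: str) -> int:
    """Naive relevance: count goal words that appear in the element label (assumed key)."""
    label = element.get("label", "").lower()
    return sum(1 for word in goal.lower().split() if word in label)


def build_step_context(goal, history, errors, elements, state, k=5):
    """Build a compact per-step payload: top-k candidates plus history and error deltas."""
    # sorted() is stable, so elements with equal scores keep their page order.
    ranked = sorted(elements, key=lambda e: score_element(e, goal), reverse=True)
    context = {
        "candidates": ranked[:k],
        "history_delta": history[state.history_sent:],
        "new_errors": errors[state.errors_sent:],
    }
    # Remember what was sent so the next step only carries what is new.
    state.history_sent = len(history)
    state.errors_sent = len(errors)
    return context
```

Calling `build_step_context` twice with the same `StepState` returns only the history entries and errors appended between the two calls. A real ranking would likely need something better than keyword overlap.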
Add a rules-first layer before LLM action selection for obvious cases:
- form discovery
- heading checks
- missing alt text
- click target sizing
- dead links
- obvious nav actions
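The rules-first layer could be sketched roughly as below: deterministic checks run in order, and the LLM is called only when none of them fires. The page dictionary shape, the rule names, the returned action format, and the 24 px minimum target size are illustrative assumptions, not existing code in backend/agent.py or backend/browser_utils.py.

```python
from typing import Callable, Optional

# A rule inspects a page snapshot and returns an action dict, or None if it does not apply.
Rule = Callable[[dict], Optional[dict]]


def rule_missing_alt(page: dict) -> Optional[dict]:
    """Flag the first image without alt text (assumes page['images'] is a list of dicts)."""
    for img in page.get("images", []):
        if not img.get("alt"):
            return {"action": "report", "issue": "missing_alt", "target": img.get("src")}
    return None


def rule_small_click_target(page: dict, min_px: int = 24) -> Optional[dict]:
    """Flag the first element smaller than min_px in either dimension (threshold is an assumption)."""
    for el in page.get("elements", []):
        if el.get("width", min_px) < min_px or el.get("height", min_px) < min_px:
            return {"action": "report", "issue": "small_target", "target": el.get("id")}
    return None


RULES: list[Rule] = [rule_missing_alt, rule_small_click_target]


def choose_action(page: dict, llm_fallback: Callable[[dict], dict]) -> dict:
    """Return the first rule hit; call the LLM only when no rule applies."""
    for rule in RULES:
        result = rule(page)
        if result is not None:
            return result
    return llm_fallback(page)
```

Keeping the rules in an ordered list makes it straightforward to add form discovery, heading checks, or dead-link checks later without touching the fallback path.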
Suggested file targets
- backend/agent.py
- backend/browser_utils.py
- maybe a new summarizer/router helper
Acceptance criteria
- step prompt payload is significantly smaller than the current per-step payload
- obvious interaction choices are resolved without an LLM call when possible
- logs show reduced context size per step
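One way to make the logging criterion checkable is to record the serialized payload size at each step. This is only a sketch: the logger name, the function names, and the use of character count as a stand-in for token count are assumptions. A real implementation might count tokens with the model's own tokenizer instead.

```python
import json
import logging

logger = logging.getLogger("agent.context")  # hypothetical logger name


def context_size(payload: dict) -> int:
    """Approximate the prompt size by the length of the JSON-serialized payload."""
    return len(json.dumps(payload, ensure_ascii=False))


def log_context_size(step: int, payload: dict) -> int:
    """Log the per-step context size so before/after runs can be compared."""
    size = context_size(payload)
    logger.info("step=%d context_chars=%d", step, size)
    return size
```

Comparing these log lines before and after the change gives a direct measure of the payload reduction.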