Compress per-step agent prompts and add rules-first action routing #4

@PraneelBhatia

Problem

The step-level prompt in backend/agent.py resends too much page state on every step:

  • large system prompt
  • browsing history
  • error context
  • visible text dump
  • interactive element dump

Resending all of this on every step causes heavy, repeated token usage.

Proposal

Compress the per-step context:

  • replace raw visible text with a page summary
  • send only top-k interactive candidates, not the full element list every time
  • send delta history instead of repeated history
  • send only new errors since last step
  • keep invariant prompt blocks much shorter
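
A minimal sketch of the top-k, delta-history, and new-errors ideas above. Everything named here (`StepState`, `score_element`, `build_step_context`, and the dict shape of an element) is hypothetical and not taken from the current backend/agent.py; the keyword-overlap scoring is only a placeholder for whatever ranking the agent ends up using. Page summarization is left out.

```python
from dataclasses import dataclass


@dataclass
class StepState:
    """Hypothetical carry-over between steps (not an existing type in the repo)."""
    history_sent: int = 0  # how many history entries were already sent
    errors_sent: int = 0   # how many errors were already sent


def score_element(element: dict, goal: str) -> int:
    """Placeholder relevance score: goal words that appear in the element text."""
    label = element.get("text", "").lower()
    return sum(1 for word in set(goal.lower().split()) if word in label)


def build_step_context(goal, elements, history, errors, state, k=5):
    """Return only the top-k candidates plus history/errors added since last step."""
    # sorted() is stable, so elements with equal scores keep their page order.
    ranked = sorted(elements, key=lambda e: score_element(e, goal), reverse=True)
    context = {
        "candidates": ranked[:k],
        "new_history": history[state.history_sent:],
        "new_errors": errors[state.errors_sent:],
    }
    # Remember what was sent so the next step only carries the delta.
    state.history_sent = len(history)
    state.errors_sent = len(errors)
    return context
```

Calling `build_step_context` once per step with the same `StepState` means history and errors are each sent once rather than on every step.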

Add a rules-first layer before LLM action selection for obvious cases:

  • form discovery
  • heading checks
  • missing alt text
  • click target sizing
  • dead links
  • obvious nav actions
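
One possible shape for the rules-first layer, shown for two of the cases above (missing alt text and click target sizing). The page dict layout, the rule names, `route`, and the 24 px default threshold are all assumptions made for illustration; none of them exist in the repo today.

```python
from typing import Callable, Optional

# A rule inspects the page state and returns an action dict, or None to pass.
Rule = Callable[[dict], Optional[dict]]


def rule_missing_alt(page: dict) -> Optional[dict]:
    """Flag the first image that has no (or empty) alt text."""
    for img in page.get("images", []):
        if not img.get("alt"):
            return {"action": "report", "issue": "missing_alt", "target": img.get("src")}
    return None


def rule_small_click_target(page: dict, min_px: int = 24) -> Optional[dict]:
    """Flag the first clickable smaller than min_px in either dimension."""
    for el in page.get("clickables", []):
        if el.get("width", 0) < min_px or el.get("height", 0) < min_px:
            return {"action": "report", "issue": "small_target", "target": el.get("selector")}
    return None


RULES: list = [rule_missing_alt, rule_small_click_target]


def route(page: dict, llm_fallback: Callable[[dict], dict]) -> dict:
    """Try each rule in order; only call the LLM when no rule fires."""
    for rule in RULES:
        result = rule(page)
        if result is not None:
            return result
    return llm_fallback(page)
```

Keeping each rule a plain function that returns `None` when it does not apply makes it easy to add the remaining cases (form discovery, heading checks, dead links, obvious nav actions) without touching `route`.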

Suggested file targets

  • backend/agent.py
  • backend/browser_utils.py
  • possibly a new summarizer/router helper module

Acceptance criteria

  • step prompt payload is significantly smaller than the current baseline, measured per step
  • obvious interaction choices are resolved without an LLM call when possible
  • logs show reduced context size per step
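
To make the last criterion checkable, each step could log the size of the context it sends. The sketch below is an assumption about how that might look: `log_context_size` and the logger name are hypothetical, and serialized character count is used only as a rough stand-in for tokens.

```python
import json
import logging

logger = logging.getLogger("agent.context")  # hypothetical logger name


def log_context_size(step: int, context: dict) -> int:
    """Log and return the serialized size (in characters) of a step's context."""
    size = len(json.dumps(context, ensure_ascii=False))
    logger.info("step=%d context_chars=%d", step, size)
    return size
```

Comparing these log lines before and after the change gives a per-step measure of the reduction.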
