[FEATURE] Add --benchmark flag for performance testing

## Feature Description
Run standardized performance benchmarks on agents to measure speed, cost, and quality.

## Problem/Motivation
Hard to compare different agents or track performance changes over time objectively.

## Proposed Solution
```bash
chat_loop myagent --benchmark

# Runs standard queries and reports:
# - Average latency: 2.3s
# - Tokens/sec: 45
# - Cost per query: $0.015
# - Total time: 23s
# - Total cost: $0.15
```

## Benefits
- Compare agents objectively
- Track performance regressions
- Optimize for speed or cost
- Share benchmarks with team

## Priority
- [ ] Critical
- [ ] High
- [x] Medium
- [ ] Low

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

[FEATURE] Add --benchmark flag for performance testing #15

Feature Description

Problem/Motivation

Proposed Solution

Benefits

Priority

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

[FEATURE] Add --benchmark flag for performance testing #15

Description

Feature Description

Problem/Motivation

Proposed Solution

Benefits

Priority

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions