Unified trace bridge: link game-side TraceStore with framework observability (LangSmith, etc.)

## Problem

Agent Arena has two observability layers that don't talk to each other:

1. **Game-side** (TraceStore): Observations, tool results, scores — keyed by `(agent_id, tick)`
2. **LLM-side** (LangSmith/Anthropic console): Prompts, model responses, token usage, latency

When debugging a bad decision at tick 42, a user has to manually correlate between these systems. There's no way to click on a tick and see the full chain: what the agent saw → what prompt was built → what the LLM returned → what tool was called → what happened in the game.

## Proposed Solution

Add a **trace bridge** that links game ticks to framework trace IDs, creating a unified view.

### How it works

1. **SDK passes tick context to the decide callback:**
   The `decide(observation)` function already receives the tick via `observation.tick`. No change needed.

2. **Framework starters attach Arena metadata to LLM calls:**
   ```python
   # In starters/langchain/agent.py
   result = graph.invoke(
       {"observation": obs},
       config={"metadata": {"arena_tick": obs.tick, "arena_agent": obs.agent_id}}
   )
   ```
   LangSmith records this metadata on the run, so runs can be filtered by `arena_tick` and `arena_agent`.

3. **TraceStore captures the framework trace URL back:**
   ```python
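   # After the LLM call, store the link.
   # `run_id` is the LangSmith run ID for this call (for example, captured via a callback);
   # the URL pattern below is illustrative.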
   trace.add_step("framework_trace", {
       "langsmith_run_id": run_id,
       "langsmith_url": f"https://smith.langchain.com/runs/{run_id}"
   })
   ```

4. **Result: unified per-tick trace**
   ```
   Tick 42:
     observation: {pos: [1,2,3], resources: [{name: "berry", dist: 3.2}]}
     framework_trace: https://smith.langchain.com/runs/abc123  ← click to see LLM details
     decision: {tool: "collect", params: {target: "berry"}}
     tool_result: {success: true, items_collected: 1}
     score: {resources_collected: 5}
   ```
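
The four steps above can be sketched as a small in-memory store keyed by `(agent_id, tick)`. This is only an illustration under assumptions: `InMemoryTraceStore`, `TickTrace`, `trace_for`, and `external_links` are hypothetical names, not the existing TraceStore API; only `add_step` and the `"framework_trace"` step name come from the proposal.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class TickTrace:
    """All steps recorded for one (agent_id, tick) pair, in insertion order."""

    agent_id: str
    tick: int
    steps: list[tuple[str, dict[str, Any]]] = field(default_factory=list)

    def add_step(self, name: str, data: dict[str, Any]) -> None:
        # Mirrors the `trace.add_step(...)` call shown in step 3.
        self.steps.append((name, data))

    def external_links(self) -> list[dict[str, Any]]:
        # Hypothetical helper: return only the framework trace links.
        return [data for name, data in self.steps if name == "framework_trace"]


class InMemoryTraceStore:
    """Hypothetical stand-in for TraceStore, keyed by (agent_id, tick)."""

    def __init__(self) -> None:
        self._traces: dict[tuple[str, int], TickTrace] = {}

    def trace_for(self, agent_id: str, tick: int) -> TickTrace:
        # Create the per-tick trace on first access, then reuse it.
        key = (agent_id, tick)
        if key not in self._traces:
            self._traces[key] = TickTrace(agent_id, tick)
        return self._traces[key]


store = InMemoryTraceStore()
trace = store.trace_for("agent_1", 42)
trace.add_step("observation", {"resources": [{"name": "berry", "dist": 3.2}]})
trace.add_step("framework_trace", {"langsmith_run_id": "abc123"})
trace.add_step("decision", {"tool": "collect", "params": {"target": "berry"}})
```

Storing the link as an ordinary step keeps the per-tick ordering intact, so an inspector could render the link between the observation and the decision without a separate lookup.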

### Framework-agnostic design

The bridge should work with any framework:
- **LangGraph**: LangSmith run metadata + callbacks
- **Claude SDK**: Anthropic console trace IDs
- **OpenAI SDK**: OpenAI dashboard request IDs
- **Custom**: Any string URL/ID the user wants to attach

The SDK provides a simple hook:
```python
def decide(observation: Observation) -> Decision:
    # User's framework code here; it produces `decision` and, optionally,
    # a trace URL or ID (`langsmith_url` in this example).
    observation.trace_metadata["framework_url"] = langsmith_url
    return decision
```
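
One way the SDK side of this hook might work is sketched below: after calling `decide`, the SDK copies whatever the user put in `observation.trace_metadata` into a per-tick record. The `Observation` fields, the `Decision` alias, `run_decide`, and the `links` dict are all assumptions made for illustration; the real SDK types may differ.

```python
from dataclasses import dataclass, field
from typing import Any, Callable


@dataclass
class Observation:
    """Minimal stand-in for the SDK's Observation (fields are assumed)."""

    agent_id: str
    tick: int
    trace_metadata: dict[str, Any] = field(default_factory=dict)


# Stand-in for the SDK's Decision type.
Decision = dict[str, Any]


def run_decide(
    decide: Callable[[Observation], Decision],
    observation: Observation,
    links: dict[tuple[str, int], dict[str, Any]],
) -> Decision:
    """Call the user's decide() and persist any trace metadata it attached."""
    decision = decide(observation)
    if observation.trace_metadata:
        # Copy so later mutation of the observation cannot alter the record.
        links[(observation.agent_id, observation.tick)] = dict(observation.trace_metadata)
    return decision


def decide(observation: Observation) -> Decision:
    # Any string URL or ID works; this one is a made-up placeholder.
    observation.trace_metadata["framework_url"] = "custom://my-tracer/run-1"
    return {"tool": "collect", "params": {"target": "berry"}}


links: dict[tuple[str, int], dict[str, Any]] = {}
decision = run_decide(decide, Observation("agent_1", 42), links)
```

Because the hook only reads a plain dict, the SDK needs no knowledge of LangSmith, Anthropic, or OpenAI; each starter decides what to put in `trace_metadata`.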

## Acceptance Criteria

- [ ] TraceStore supports storing external trace links per `(agent_id, tick)`
- [ ] LangGraph starter attaches `arena_tick` + `arena_agent` as LangSmith run metadata
- [ ] TraceStore captures LangSmith run URL back into the game-side trace
- [ ] A user can go from tick → full prompt/response in LangSmith with one click
- [ ] Design is framework-agnostic (works for Claude SDK, OpenAI, etc.)
- [ ] Documentation shows the debugging workflow end-to-end

## Dependencies

- **Depends on #74** (framework adapter system — need at least one working framework starter)
- Related to #75 (inspector refactor — game-side trace becomes the inspector's data source)

## Estimated Effort

1 day (after #74 is complete)
