Define episode lifecycle protocol (start, end, restart, chaining)

## Problem

With the compiled executable (#77) and cross-episode learning (#76), several episode lifecycle questions are unanswered. Currently:

- **SPACE** starts/stops simulation (goes away with auto-start)
- **R** resets the scene
- Episode boundaries are detected ad hoc, by watching for tick resets in `EpisodeMemoryManager`
- No formal protocol for episode end conditions
- No way to chain episodes automatically

For persistent cross-episode memory (#76) to work, and for the compiled executable (#77) to feel polished, the episode lifecycle needs a clear protocol that both Godot and Python agree on.

## Questions to Answer

### 1. How does an episode END?
Define explicit end conditions per scenario:
- **Foraging**: All resources collected? Health reaches 0? Tick limit (e.g., 500 ticks)?
- **Crafting chain**: All recipes crafted? Tick limit?
- **Team capture**: All points held? Time limit?

Currently `foraging.gd` has `MAX_RESOURCES=7` but no end-on-completion logic.
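
To make the candidate conditions concrete, here is a minimal sketch of an end-condition check. It is written in Python purely for illustration; the real check would presumably live in the Godot scene scripts. `EndRule`, `check_end`, and the reason strings `agent_dead` and `tick_limit` are hypothetical names, not part of the existing codebase.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class EndRule:
    """Hypothetical per-scenario end rule; field names are assumptions."""
    objective_target: int   # e.g. resources to collect (7 in foraging)
    tick_limit: int = 500   # example limit mentioned in this issue


def check_end(rule: EndRule, progress: int, health: float, tick: int) -> Optional[str]:
    """Return an end reason, or None while the episode should keep running."""
    if progress >= rule.objective_target:
        return "objective_complete"
    if health <= 0:
        return "agent_dead"   # hypothetical reason string
    if tick >= rule.tick_limit:
        return "tick_limit"   # hypothetical reason string
    return None


foraging = EndRule(objective_target=7)
print(check_end(foraging, progress=3, health=80.0, tick=120))  # None
print(check_end(foraging, progress=7, health=80.0, tick=247))  # objective_complete
```

Checking the objective before the tick limit means an episode that completes on its final tick is still reported as a success; whether that ordering is desired is an open design choice.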

### 2. How does Godot signal episode end to Python?
Options:
- Special field in observation: `{"episode_ended": true, "reason": "objective_complete", "final_score": 85}`
- Dedicated endpoint: `POST /episode_end` from Godot → Python
- Observations simply stop arriving (implicit, and fragile because Python cannot tell an episode end from a stall or crash)

### 3. How does a new episode START?
- Auto-restart after N seconds?
- Python requests restart via `POST /reset`?
- User presses a key in the game window?
- Configurable: `--episodes 5 --delay-between 3`

### 4. How are episodes numbered/tracked?
For #76's persistent memory, each episode needs an ID. Candidate formats:
- Sequential: `episode_1`, `episode_2`, ...
- Timestamped: `episode_20260223_143022`

Whichever format is chosen, Godot and Python must agree on the current episode ID.
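
The two formats are not mutually exclusive. As a sketch, a sequential integer could travel over IPC while a combined label is used for files on disk. `make_episode_label` and the combined `episode_<n>_<timestamp>` format are assumptions for illustration, not a decided convention.

```python
from datetime import datetime


def make_episode_label(seq: int, started_at: datetime) -> str:
    """Build a label such as 'episode_3_20260223_143022' (format is an assumption)."""
    if seq < 1:
        raise ValueError("episode sequence numbers start at 1")
    return f"episode_{seq}_{started_at:%Y%m%d_%H%M%S}"


label = make_episode_label(3, datetime(2026, 2, 23, 14, 30, 22))
print(label)  # episode_3_20260223_143022
```

If only one side generates the label and sends it to the other, the two processes cannot drift apart due to clock differences.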

### 5. Can episodes chain automatically?
For the learning progression demo (#76), users want to run 5+ episodes and watch improvement:
```bash
python run.py --scenario foraging --episodes 5
```
This requires: auto-restart, episode boundary signaling, and Python-side orchestration.
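
A rough sketch of the Python-side orchestration, assuming one blocking call per episode. `run_episodes` and `fake_run_one` are hypothetical; the stub stands in for the real Godot round trip and returns placeholder data only.

```python
import argparse
from typing import Callable, Dict, List


def run_episodes(n: int, run_one: Callable[[int], Dict]) -> List[Dict]:
    """Run n episodes back to back; run_one blocks until the episode ends."""
    summaries = []
    for episode_id in range(1, n + 1):
        # In a real run: request a reset, stream ticks, wait for the end signal.
        summaries.append(run_one(episode_id))
    return summaries


def fake_run_one(episode_id: int) -> Dict:
    """Stand-in for the Godot round trip (hypothetical, placeholder data)."""
    return {"episode_id": episode_id, "reason": "tick_limit"}


parser = argparse.ArgumentParser()
parser.add_argument("--scenario", default="foraging")
parser.add_argument("--episodes", type=int, default=1)
# An explicit argv list keeps the sketch runnable outside a shell.
args = parser.parse_args(["--scenario", "foraging", "--episodes", "5"])

results = run_episodes(args.episodes, fake_run_one)
print([r["episode_id"] for r in results])  # [1, 2, 3, 4, 5]
```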

## Proposed Protocol

### Episode State Machine
```
WAITING → RUNNING → ENDED → (auto) WAITING
   |         |         |
   |    tick_advanced   |
   |    observations    |
   |    tool_calls      |
   |         |          |
   |    end_condition   |
   |    met             |
   |         |          |
   |         v          |
   |      ENDED --------+--→ save persistent memory
   |         |               generate episode summary
   |         v
   +---- WAITING (reset scene, increment episode_id)
```
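
The diagram can be mirrored on the Python side with a small transition table, so that an out-of-order message fails loudly instead of corrupting state. This is a sketch only; `EpisodeState`, `EpisodeLifecycle`, and `advance` are assumed names.

```python
from enum import Enum


class EpisodeState(Enum):
    WAITING = "waiting"
    RUNNING = "running"
    ENDED = "ended"


# Allowed transitions, read off the diagram above.
TRANSITIONS = {
    EpisodeState.WAITING: {EpisodeState.RUNNING},
    EpisodeState.RUNNING: {EpisodeState.ENDED},
    EpisodeState.ENDED: {EpisodeState.WAITING},
}


class EpisodeLifecycle:
    """Hypothetical tracker; class and method names are assumptions."""

    def __init__(self) -> None:
        self.state = EpisodeState.WAITING
        self.episode_id = 1

    def advance(self, target: EpisodeState) -> None:
        if target not in TRANSITIONS[self.state]:
            raise ValueError(f"illegal transition {self.state.name} -> {target.name}")
        # The scene reset (ENDED -> WAITING) is where episode_id increments.
        if self.state is EpisodeState.ENDED and target is EpisodeState.WAITING:
            self.episode_id += 1
        self.state = target


lc = EpisodeLifecycle()
for step in (EpisodeState.RUNNING, EpisodeState.ENDED, EpisodeState.WAITING):
    lc.advance(step)
print(lc.state.name, lc.episode_id)  # WAITING 2
```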

### IPC Messages

**Godot → Python (episode end)**:
Include in final observation or as a separate message:
```json
{
  "episode_ended": true,
  "episode_id": 3,
  "reason": "objective_complete",
  "final_score": 85,
  "ticks_elapsed": 247,
  "metrics": {
    "resources_collected": 7,
    "damage_taken": 15,
    "distance_traveled": 142.5,
    "exploration_pct": 0.78
  }
}
```
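
On the Python side, a message of this shape could be parsed into a typed record as sketched below. The field names follow the JSON above; `EpisodeEnd` and `parse_episode_end` are hypothetical helpers, not existing SDK functions.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Dict, Optional


@dataclass
class EpisodeEnd:
    episode_id: int
    reason: str
    final_score: float
    ticks_elapsed: int
    metrics: Dict[str, Any] = field(default_factory=dict)


def parse_episode_end(raw: str) -> Optional[EpisodeEnd]:
    """Return an EpisodeEnd if the payload carries the end flag, else None."""
    data = json.loads(raw)
    if not data.get("episode_ended", False):
        return None   # an ordinary observation
    return EpisodeEnd(
        episode_id=int(data["episode_id"]),
        reason=str(data["reason"]),
        final_score=float(data["final_score"]),
        ticks_elapsed=int(data["ticks_elapsed"]),
        metrics=dict(data.get("metrics", {})),
    )


payload = json.dumps({
    "episode_ended": True, "episode_id": 3, "reason": "objective_complete",
    "final_score": 85, "ticks_elapsed": 247,
    "metrics": {"resources_collected": 7},
})
end = parse_episode_end(payload)
print(end.reason, end.ticks_elapsed)      # objective_complete 247
print(parse_episode_end('{"tick": 12}'))  # None
```

Treating a missing `episode_ended` field as "not ended" keeps ordinary observations backward compatible if the flag is embedded in the final observation.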

**Python → Godot (episode control)**:
New endpoints or CLI args:
```
POST /reset          — restart current scenario
POST /configure      — set episode params (tick_limit, auto_restart)
--episodes N         — run N episodes then quit
--tick-limit 500     — max ticks per episode
```
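
As a sketch of what calling the proposed endpoints might look like from Python, the snippet below builds the requests without sending them. The base URL and port are placeholders (this issue does not say where Godot listens), and the JSON body keys simply reuse the parameter names listed above.

```python
import json
import urllib.request


def build_control_request(base_url: str, path: str, body: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST to a proposed control endpoint."""
    return urllib.request.Request(
        url=base_url.rstrip("/") + path,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder address; an assumption, not the project's actual port.
base_url = "http://127.0.0.1:5000"
configure = build_control_request(
    base_url, "/configure", {"tick_limit": 500, "auto_restart": True}
)
reset = build_control_request(base_url, "/reset", {})

print(configure.get_method(), configure.full_url)
# Sending would be: urllib.request.urlopen(reset, timeout=5)
```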

### Godot-Side Changes
- Add end conditions to `base_scene_controller.gd` (configurable per scenario)
- Add `episode_id` tracking (increment on reset)
- Add auto-restart logic (configurable delay)
- Include episode metadata in observations

### Python-Side Changes
- SDK `AgentArena` class gains episode lifecycle hooks:
  - `on_episode_start(episode_id)`
  - `on_episode_end(episode_id, summary)`
- Adapter base class (#74) exposes these hooks to frameworks
- `--episodes N` flag in run.py for batch execution
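
One possible shape for the hooks, sketched as a standalone registry. This is not the actual `AgentArena` API: `EpisodeHooks`, `fire_start`, and `fire_end` are hypothetical, and only the hook names and arguments come from the list above.

```python
from typing import Any, Callable, Dict, List


class EpisodeHooks:
    """Hypothetical hook registry; not the actual AgentArena class."""

    def __init__(self) -> None:
        self._on_start: List[Callable[[int], None]] = []
        self._on_end: List[Callable[[int, Dict[str, Any]], None]] = []

    def on_episode_start(self, fn):
        self._on_start.append(fn)
        return fn   # returning fn allows use as a decorator

    def on_episode_end(self, fn):
        self._on_end.append(fn)
        return fn

    def fire_start(self, episode_id: int) -> None:
        for fn in self._on_start:
            fn(episode_id)

    def fire_end(self, episode_id: int, summary: Dict[str, Any]) -> None:
        for fn in self._on_end:
            fn(episode_id, summary)


hooks = EpisodeHooks()
memory: Dict[int, Dict[str, Any]] = {}


@hooks.on_episode_end
def save_summary(episode_id, summary):
    # Stand-in for writing to the persistent memory from #76.
    memory[episode_id] = summary


hooks.fire_start(1)
hooks.fire_end(1, {"reason": "tick_limit", "ticks_elapsed": 500})
print(memory)
```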

## Acceptance Criteria

- [ ] Foraging scene has explicit end conditions (objective met OR tick limit)
- [ ] Godot signals episode end to Python with score and metrics
- [ ] Python can request scene reset
- [ ] Episodes have unique IDs agreed upon by both sides
- [ ] `--episodes 5` runs 5 consecutive episodes with automatic restarts
- [ ] Episode lifecycle hooks available in SDK for persistent memory (#76)
- [ ] Protocol documented for scenario developers

## Estimated Effort
1-2 days

## Dependencies
- Related to #71 (tool callbacks — similar IPC extension pattern)
- Blocks #76 (persistent memory needs episode boundaries)
- Related to #77 (compiled exe needs auto-start/restart)
