Game Notes

Details on each game's implementation, observation space, action space, and difficulty tuning.


Snake

Core: code/games/snake_core.py
Config: code/conf/game/snake.yaml
Robustness: code/conf/robustness/snake_*.yaml

Mechanics

Tile-based snake on a grid. The snake moves at a configurable tile rate (fps), which increases as it eats apples (fps_per_apple). The episode ends on wall collision, self-collision, or apple-inactivity timeout (~1 minute at base speed).

Observation Space

A local grid window centred on the snake's head, plus global context:

(2 * half_size + 1)² grid cells  +  6 scalars
  └─ default half_size=3 → 7×7 = 49 cells
Scalars: direction (x, y), food direction (x, y), speed, length
Total: 55 floats  (Box, [-10, 10])
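
As a quick check on the arithmetic above, the following sketch computes the flat observation length from half_size. The names obs_size and N_SCALARS are illustrative assumptions and are not taken from snake_core.py.

```python
# Illustrative helper; the names here are assumptions, not the repo's API.
N_SCALARS = 6  # direction (x, y), food direction (x, y), speed, length


def obs_size(half_size: int = 3) -> int:
    """Length of the flat observation: local window cells plus scalars."""
    side = 2 * half_size + 1  # the window is centred on the snake's head
    return side * side + N_SCALARS


if __name__ == "__main__":
    print(obs_size(3))  # 7 * 7 + 6 = 55
    print(obs_size(4))  # 9 * 9 + 6 = 87
```

Changing half_size changes the observation length, so a policy trained with one window size would presumably need retraining for another.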

Action Space

Discrete(4): UP, RIGHT, DOWN, LEFT (a 180° reversal is ignored)
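
One way the reversal rule could work is sketched below. The action indices (taken from the order listed above), the screen-style coordinates, and the names ACTION_TO_DIR and next_direction are all assumptions made for illustration; the actual logic lives in snake_core.py and may differ.

```python
# Assumed mapping: action index follows the order listed in the text, and
# the grid uses screen coordinates (y grows downward). Neither is confirmed.
ACTION_TO_DIR = {
    0: (0, -1),  # UP
    1: (1, 0),   # RIGHT
    2: (0, 1),   # DOWN
    3: (-1, 0),  # LEFT
}


def next_direction(current: tuple[int, int], action: int) -> tuple[int, int]:
    """Return the new heading, keeping the current one on a 180° reversal."""
    requested = ACTION_TO_DIR[action]
    is_reversal = requested[0] == -current[0] and requested[1] == -current[1]
    return current if is_reversal else requested


if __name__ == "__main__":
    heading = (1, 0)                   # moving RIGHT
    print(next_direction(heading, 3))  # LEFT is a reversal, heading is kept
    print(next_direction(heading, 0))  # UP is allowed
```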

Reward

Base reward: +1.0 per apple eaten. Shaped further by the active persona.

Key Config Parameters

Parameter             Default     Hard
width × height        720 × 480   700 × 500
cell                  10          10
fps (base tile rate)  16          20
fps_per_apple         0.5         0.75
max_fps               26          40
penalty_range         1           2
penalty_max           0.01        0.02
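
To make the speed parameters concrete, here is a small sketch of how the tile rate might evolve. It assumes a linear ramp of fps_per_apple per apple, capped at max_fps; the exact update rule in snake_core.py is not stated here, and the function names are hypothetical.

```python
import math


def tile_rate(apples: int, base_fps: float, fps_per_apple: float, max_fps: float) -> float:
    """Tile rate after `apples` apples, ASSUMING a linear ramp capped at max_fps."""
    return min(base_fps + fps_per_apple * apples, max_fps)


def apples_to_cap(base_fps: float, fps_per_apple: float, max_fps: float) -> int:
    """Smallest number of apples at which the assumed ramp reaches max_fps."""
    return math.ceil((max_fps - base_fps) / fps_per_apple)


if __name__ == "__main__":
    # Values from the Default and Hard columns of the table.
    print(apples_to_cap(16, 0.5, 26))   # (26 - 16) / 0.5 = 20 apples
    print(apples_to_cap(20, 0.75, 40))  # 20 / 0.75 = 26.67, rounded up to 27
```

Under this assumption the Hard preset starts faster and keeps accelerating for longer before it reaches the cap.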

Timeout

Snake can stall indefinitely by avoiding the apple. The wrapper applies an apple-inactivity timeout: if no apple is eaten within 960 tile steps (~1 minute at base speed), the episode is truncated. This is tracked via steps_since_apple in the info dict.
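
The timeout can be pictured as a per-step counter. The sketch below is not the wrapper's actual code: the class name AppleTimeout, the update method, and the use of a positive base reward as the "apple eaten" signal are assumptions. Only the 960-step limit and the steps_since_apple info key come from the description above.

```python
class AppleTimeout:
    """Hypothetical tracker for the apple-inactivity timeout."""

    def __init__(self, limit: int = 960):
        self.limit = limit  # 960 tile steps / 16 steps per second = 60 s
        self.steps_since_apple = 0

    def reset(self) -> None:
        self.steps_since_apple = 0

    def update(self, base_reward: float, info: dict) -> bool:
        """Advance one tile step; return True if the episode should be truncated."""
        if base_reward > 0:  # assumption: a positive base reward means an apple
            self.steps_since_apple = 0
        else:
            self.steps_since_apple += 1
        info["steps_since_apple"] = self.steps_since_apple
        return self.steps_since_apple >= self.limit


if __name__ == "__main__":
    tracker, info = AppleTimeout(), {}
    truncated = False
    while not truncated:
        truncated = tracker.update(0.0, info)  # the snake never eats
    print(info["steps_since_apple"])
```

Because the limit is counted in tile steps rather than seconds, the wall-clock timeout shortens as the tile rate rises.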


Flappy Bird

Core: code/games/flappy_core.py
Config: code/conf/game/flappy.yaml
Robustness: code/conf/robustness/flappy_*.yaml

Mechanics

Side-scroller in which a bird flies through the gaps between pipes. The episode ends immediately on collision with a pipe or the ground/ceiling. No time limit is needed because the game always terminates.

Observation Space

Continuous values describing the bird's state relative to upcoming pipes (position, velocity, gap location, distance).

Action Space

Discrete(2): 0 = do nothing, 1 = flap

Reward

Base reward: +1.0 per pipe cleared. Shaped further by the active persona.


Pong

Core: code/games/pong_core.py
Config: code/conf/game/pong.yaml
Robustness: code/conf/robustness/pong_*.yaml

Mechanics

Classic single-player Pong against a rule-based opponent. The episode ends when the ball goes out of bounds. Ball physics naturally resolve every rally, so no time limit is needed.

Observation Space

Continuous: ball position/velocity, paddle positions.

Action Space

Discrete(3): stay, up, down

Reward

Based on rally outcome (scoring/conceding).


Donkey Kong

Core: code/games/dk_core.py
Config: code/conf/game/dk.yaml
Robustness: code/conf/robustness/dk_*.yaml

Mechanics

Mario-style platformer. Player climbs ladders and avoids rolling barrels to reach the top. Episode ends on contact with a barrel or a fall. Multiple level layouts are supported via level_id.

Observation Space

Controlled by obs_mode:

  • state: structured vector (player pos, barrel positions, level info)
  • pixel: downscaled pixel grid (obs_scale controls resolution)

Action Space

Discrete: move left/right, climb up/down, jump

Reward

Progress-based: reward for climbing higher, penalty for being hit.

Key Config Parameters

Parameter   Default
level_id    1
obs_scale   16
obs_mode    state

Adding a New Game

  1. Implement code/games/<game>_core.py with:

    • get_action_space() → Gym space
    • get_observation_space() → Gym space
    • reset(seed=None) → obs
    • step(action, dt=None) → (obs, base_reward, terminated, info)
    • render(surface, blit_only=False)
    • WIDTH, HEIGHT class attributes
  2. Add code/conf/game/<game>.yaml with _target_: code.games.<game>_core.<GameClass>

  3. Add three robustness configs: code/conf/robustness/<game>_default.yaml, <game>_easy.yaml, and <game>_hard.yaml

  4. Add a persona: code/conf/reward/<game>_<persona>.yaml

  5. Add a metrics collector: code/metrics/<game>_balance.py

  6. Register in code/conf/grid.yaml under games and personas
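
The interface in step 1 can be sketched as a skeleton class plus a small conformance check. MyGameCore, missing_members, and the placeholder WIDTH/HEIGHT values are hypothetical. The method bodies are deliberately left unimplemented so the sketch does not depend on a particular Gym version.

```python
from typing import Any, Optional

# Member names taken from the checklist in step 1.
REQUIRED_METHODS = ("get_action_space", "get_observation_space", "reset", "step", "render")
REQUIRED_ATTRS = ("WIDTH", "HEIGHT")


def missing_members(game_cls: type) -> list[str]:
    """List the required methods and attributes that game_cls lacks."""
    missing = [n for n in REQUIRED_METHODS if not callable(getattr(game_cls, n, None))]
    missing += [n for n in REQUIRED_ATTRS if not hasattr(game_cls, n)]
    return missing


class MyGameCore:
    """Hypothetical skeleton for code/games/<game>_core.py."""

    WIDTH = 640   # placeholder value
    HEIGHT = 480  # placeholder value

    def get_action_space(self) -> Any:
        raise NotImplementedError("return a Gym space describing the actions")

    def get_observation_space(self) -> Any:
        raise NotImplementedError("return a Gym space describing observations")

    def reset(self, seed: Optional[int] = None) -> Any:
        raise NotImplementedError("reset game state and return the first obs")

    def step(self, action: Any, dt: Optional[float] = None) -> tuple:
        raise NotImplementedError("return (obs, base_reward, terminated, info)")

    def render(self, surface: Any, blit_only: bool = False) -> None:
        raise NotImplementedError("draw the current frame onto surface")


if __name__ == "__main__":
    print(missing_members(MyGameCore))  # empty list when the interface is complete
```

A check like this could run in a unit test before the new game is registered, so that a missing method surfaces early rather than during a training run.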