Add move-and-shoot training with curriculum learning#4

Merged: cgpadwick merged 3 commits into main from feature/move-and-shoot on Feb 4, 2026

@cgpadwick

Contributor

Summary

  • Adds move-and-shoot training where the robot follows a path through the alliance zone while shooting at fixed intervals
  • Ball inherits the robot's velocity at launch; the trajectory is computed with full 3D Euler integration
  • Automatic curriculum learning ramps path speed as the agent improves: stationary → slow (0.5–1.0 m/s) → medium (1.0–3.0 m/s) → fast (3.0–5.0 m/s)
  • Enabled via the --move-and-shoot flag (continuous env only); backward compatible with existing training
  • Adds --resume flag for continuing training from checkpoints
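
To make the velocity-inheritance point concrete, here is a minimal sketch of an explicit Euler integrator in which the ball's initial velocity is the shooter's launch velocity plus the robot's velocity. It is not the code in src/physics/projectile.py: the function name `simulate_shot`, its parameters, the launch height, the time step, and the quadratic drag term are all assumptions made for illustration.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def simulate_shot(launch_speed, elevation, azimuth, robot_vx, robot_vy,
                  launch_height=0.5, dt=0.001, drag_k=0.0):
    """Integrate a ball trajectory with explicit Euler steps.

    Returns a list of (x, y, z) positions, ending with the first point
    below z = 0. All names and defaults here are hypothetical.
    """
    # Field-frame ball velocity = shooter launch velocity + robot velocity.
    vx = launch_speed * math.cos(elevation) * math.cos(azimuth) + robot_vx
    vy = launch_speed * math.cos(elevation) * math.sin(azimuth) + robot_vy
    vz = launch_speed * math.sin(elevation)
    x, y, z = 0.0, 0.0, launch_height
    points = [(x, y, z)]
    while z >= 0.0:
        speed = math.sqrt(vx * vx + vy * vy + vz * vz)
        # Assumed quadratic drag: a = -k * |v| * v (vanishes when drag_k == 0).
        ax = -drag_k * speed * vx
        ay = -drag_k * speed * vy
        az = -G - drag_k * speed * vz
        # Explicit Euler: move with the current velocity, then update velocity.
        x += vx * dt
        y += vy * dt
        z += vz * dt
        vx += ax * dt
        vy += ay * dt
        vz += az * dt
        points.append((x, y, z))
    return points


# Same shot fired while stationary and while drifting sideways at 1 m/s.
still = simulate_shot(10.0, math.radians(45), 0.0, robot_vx=0.0, robot_vy=0.0)
moving = simulate_shot(10.0, math.radians(45), 0.0, robot_vx=0.0, robot_vy=1.0)
```

Without drag, the sideways robot velocity leaves the downrange distance unchanged and shifts the landing point laterally by the robot speed times the flight time, which is the offset the agent has to learn to compensate for.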

New modules

| Module | Purpose |
| --- | --- |
| src/paths/ | Path generation with zone boundary and hub avoidance |
| src/callbacks/ | Curriculum callback for automatic difficulty progression |
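
The automatic difficulty progression can be sketched as a small, framework-agnostic tracker that advances one level once the mean episode reward over a recent window clears a threshold. This is only an illustration of the idea, not the callback in src/callbacks/: the class name `CurriculumTracker`, the threshold, and the window size are assumptions, and only the level names and speed ranges come from the summary above.

```python
from collections import deque

# Level names and speed ranges (m/s) as listed in the PR summary.
LEVELS = [
    ("stationary", (0.0, 0.0)),
    ("slow", (0.5, 1.0)),
    ("medium", (1.0, 3.0)),
    ("fast", (3.0, 5.0)),
]


class CurriculumTracker:
    """Hypothetical helper: advance when the windowed mean reward is high enough."""

    def __init__(self, threshold, window):
        self.threshold = threshold           # assumed advancement criterion
        self.level = 0
        self.rewards = deque(maxlen=window)  # most recent episode rewards

    @property
    def speed_range(self):
        return LEVELS[self.level][1]

    def record_episode(self, reward):
        """Store one episode reward; return True if the level advanced."""
        self.rewards.append(reward)
        window_full = len(self.rewards) == self.rewards.maxlen
        if not window_full or self.level >= len(LEVELS) - 1:
            return False
        if sum(self.rewards) / len(self.rewards) < self.threshold:
            return False
        self.level += 1
        self.rewards.clear()  # require fresh evidence at the new level
        return True


tracker = CurriculumTracker(threshold=50.0, window=3)
advanced = [tracker.record_episode(r) for r in (60.0, 60.0, 60.0)]
```

Clearing the reward window on advancement means rewards earned at the easier level cannot trigger a second promotion on their own.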

Key changes

| File | Change |
| --- | --- |
| src/physics/projectile.py | compute_trajectory_3d_moving(): full 3D integration with robot velocity |
| src/env/shooter_env_continuous.py | 4D observation space [dist, bearing, vx, vy], path-based episodes |
| scripts/train.py | --move-and-shoot, --resume flags, curriculum callback wiring |
| scripts/evaluate.py | --air-resistance, --move-and-shoot passthrough |
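
As a rough illustration of the 4D observation, the sketch below builds [dist, bearing, vx, vy] from a robot pose and a hub position. The PR does not spell out the conventions, so everything here is an assumption: the function name `make_observation`, the bearing being measured relative to the robot heading and wrapped to [-π, π), the velocity being expressed in the field frame, and the absence of any normalization.

```python
import math


def make_observation(robot_xy, robot_heading, robot_velocity, hub_xy):
    """Return [dist, bearing, vx, vy] under the assumed conventions above."""
    dx = hub_xy[0] - robot_xy[0]
    dy = hub_xy[1] - robot_xy[1]
    dist = math.hypot(dx, dy)
    # Angle to the hub relative to the robot heading, wrapped to [-pi, pi).
    bearing = math.atan2(dy, dx) - robot_heading
    bearing = (bearing + math.pi) % (2.0 * math.pi) - math.pi
    vx, vy = robot_velocity
    return [dist, bearing, vx, vy]


obs = make_observation((0.0, 0.0), 0.0, (1.0, 0.0), (3.0, 4.0))
```

In the stationary case the last two entries are simply zero, so the layout of the first two entries is unchanged.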

Training results (74K steps)

  • Stationary shooting mastered by ~25K steps (reward peaked at +83)
  • Curriculum advanced to slow-moving at ~28K steps
  • Agent adapting to velocity compensation (reward recovering from transition dip)

Test plan

  • All 98 tests pass (pytest tests/ -v)
  • Gymnasium check_env passes for both stationary and moving modes
  • Zero robot velocity matches existing compute_trajectory_3d results
  • Stationary training backward compatible
  • Full curriculum progression through all 4 levels (longer run needed)

🤖 Generated with Claude Code

cgpadwick and others added 3 commits February 2, 2026 21:37
Enables training the robot to shoot while moving along a path through
the alliance zone. Ball inherits robot velocity at launch for realistic
physics. Automatic curriculum ramps speed as the agent improves:
stationary → slow (0.5-1.0 m/s) → medium (1.0-3.0) → fast (3.0-5.0).

New modules:
- src/paths/: Path generation with zone boundary and hub avoidance
- src/callbacks/: Curriculum callback for automatic difficulty progression
- src/physics/projectile.py: Full 3D Euler integration with robot velocity
- 4D observation space [distance, bearing, vx, vy] when enabled

Activated via --move-and-shoot flag (continuous env only).
Also adds --resume flag for continuing training from checkpoints.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…tic divergence

- Use LoggingSAC instead of plain SAC so gradient norms are actually logged
  to TensorBoard (actor_grad_norm, critic_grad_norm, and _max variants)
- Fix AttributeError in LoggingSAC on replay_data.discounts (doesn't exist
  in SB3's ReplayBufferSamples)
- Add gradient clipping (max_norm=1.0) on actor and critic to prevent
  gradient explosion in move-and-shoot training
- Set target_entropy=-6.0 (2x default) since optimal policy is nearly
  deterministic; default target drives entropy back up after convergence
- Reduce batch_size 4096->256 and gradient_steps 4->1 to fix critic
  Q-value divergence from over-updating on small replay buffers
- Bump learning_starts 500->1000 for better initial buffer coverage
- Update curriculum levels: first level is now "crawl" (0.1-0.5 m/s)
  instead of stationary, with adjusted speed ranges
- Clear replay buffer on curriculum advancement to prevent stale
  transitions from poisoning Q-value estimates
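
For reference, clipping by global norm with max_norm=1.0 rescales the whole gradient vector only when its L2 norm exceeds the limit. The pure-Python sketch below mirrors the rule used by PyTorch's `torch.nn.utils.clip_grad_norm_`; the helper name `clip_by_global_norm` and the flat list of floats are illustrative assumptions, not the LoggingSAC code.

```python
import math


def clip_by_global_norm(grads, max_norm=1.0, eps=1e-6):
    """Scale grads so their global L2 norm is at most max_norm.

    Returns (clipped_grads, norm_before_clipping). Hypothetical helper
    operating on a flat list of floats.
    """
    total_norm = math.sqrt(sum(g * g for g in grads))
    # A scale factor above 1 would amplify small gradients, so cap it at 1.
    scale = min(1.0, max_norm / (total_norm + eps))
    return [g * scale for g in grads], total_norm


large, large_norm = clip_by_global_norm([3.0, 4.0])  # norm 5.0, scaled down
small, small_norm = clip_by_global_norm([0.3, 0.4])  # norm 0.5, left alone
```

Returning the pre-clip norm is convenient for logging, since it shows how often and by how much clipping is active.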

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
…otes

- Add move-and-shoot training commands and --resume flag to README
- Document curriculum learning levels and thresholds
- Add move-and-shoot results table (with and without air resistance)
- Document SAC hyperparameters and rationale (batch_size, target_entropy, etc.)
- Update project structure to include sac_logging, paths, and callbacks
- Update CLAUDE.md with LoggingSAC, gradient clipping, and curriculum details

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
@cgpadwick cgpadwick merged commit a48b440 into main Feb 4, 2026
3 checks passed
@cgpadwick cgpadwick deleted the feature/move-and-shoot branch February 4, 2026 03:46