Add move-and-shoot training with curriculum learning by cgpadwick · Pull Request #4 · Team766/rebuilt_rl

cgpadwick · 2026-02-03T05:41:43Z

Summary

Adds move-and-shoot training where the robot follows a path through the alliance zone while shooting at fixed intervals
The ball inherits the robot's velocity at launch; its trajectory is computed with full 3D Euler integration
Automatic curriculum learning ramps path speed as the agent improves: stationary → slow (0.5–1.0 m/s) → medium (1.0–3.0 m/s) → fast (3.0–5.0 m/s)
Enabled via the --move-and-shoot flag (continuous env only); backward compatible with existing training
Adds --resume flag for continuing training from checkpoints
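
As a rough illustration of the velocity-inheritance idea in the summary, the sketch below adds the robot's planar velocity to the launch velocity and steps the ball forward with explicit Euler integration. It is not the PR's `compute_trajectory_3d_moving()`: the function name, parameters, launch height, and time step are all assumptions, and air resistance is left out.

```python
import math

G = 9.81  # gravitational acceleration, m/s^2


def simulate_launch(speed, elevation, azimuth, robot_vx, robot_vy,
                    launch_height=0.5, dt=0.001, max_time=5.0):
    """Hypothetical sketch: Euler-integrate a drag-free ball launched from a moving robot.

    Angles are in radians. Returns the list of (x, y, z) points, ending with
    the first point below the ground plane (or at max_time).
    """
    # The ball's initial velocity is the shooter's launch velocity plus the
    # robot's velocity at the moment of release.
    vx = speed * math.cos(elevation) * math.cos(azimuth) + robot_vx
    vy = speed * math.cos(elevation) * math.sin(azimuth) + robot_vy
    vz = speed * math.sin(elevation)

    x, y, z = 0.0, 0.0, launch_height
    points = [(x, y, z)]
    t = 0.0
    while z >= 0.0 and t < max_time:
        # Explicit Euler: advance position with the current velocity,
        # then update the velocity with gravity.
        x += vx * dt
        y += vy * dt
        z += vz * dt
        vz -= G * dt
        t += dt
        points.append((x, y, z))
    return points
```

With zero robot velocity this reduces to an ordinary stationary shot, which is the property the test plan checks against the existing `compute_trajectory_3d`.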

New modules

| Module | Purpose |
| --- | --- |
| `src/paths/` | Path generation with zone boundary and hub avoidance |
| `src/callbacks/` | Curriculum callback for automatic difficulty progression |
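
The curriculum progression could be tracked along the lines of the sketch below. It uses the speed ranges from the summary, but the class name, the rolling hit-rate criterion, the threshold, and the window size are assumptions; the PR does not state what metric the real callback in `src/callbacks/` uses.

```python
from collections import deque

# (name, min speed, max speed) in m/s, as listed in the PR summary.
LEVELS = [
    ("stationary", 0.0, 0.0),
    ("slow", 0.5, 1.0),
    ("medium", 1.0, 3.0),
    ("fast", 3.0, 5.0),
]


class CurriculumTracker:
    """Hypothetical tracker: advance one level when the recent hit rate is high enough."""

    def __init__(self, advance_threshold=0.8, window=100):
        self.advance_threshold = advance_threshold  # assumed value
        self.results = deque(maxlen=window)         # rolling window of outcomes
        self.level = 0

    def record(self, hit):
        """Record one shot outcome; return True if the level advanced."""
        self.results.append(1.0 if hit else 0.0)
        window_full = len(self.results) == self.results.maxlen
        at_last_level = self.level == len(LEVELS) - 1
        if window_full and not at_last_level:
            if sum(self.results) / len(self.results) >= self.advance_threshold:
                self.level += 1
                self.results.clear()  # start the new level with a fresh window
                return True
        return False

    def speed_range(self):
        """Return the (min, max) path speed for the current level."""
        _, low, high = LEVELS[self.level]
        return low, high
```

In a Stable-Baselines3 setup, logic like this would typically be called from a callback at the end of each episode, with the environment reading `speed_range()` when it generates the next path.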

Key changes

| File | Change |
| --- | --- |
| `src/physics/projectile.py` | `compute_trajectory_3d_moving()`: full 3D integration with robot velocity |
| `src/env/shooter_env_continuous.py` | 4D observation space `[dist, bearing, vx, vy]`, path-based episodes |
| `scripts/train.py` | `--move-and-shoot`, `--resume` flags, curriculum callback wiring |
| `scripts/evaluate.py` | `--air-resistance`, `--move-and-shoot` passthrough |
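
For orientation, a 4D observation of the form `[dist, bearing, vx, vy]` might be assembled as in the sketch below. The function name and arguments are hypothetical, and the reference frame, angle convention, and any normalization used in `shooter_env_continuous.py` are not stated in the PR; here the bearing is simply the field-frame angle from robot to target.

```python
import math


def make_observation(robot_x, robot_y, robot_vx, robot_vy, target_x, target_y):
    """Hypothetical sketch: build [dist, bearing, vx, vy] for a moving shooter."""
    dx = target_x - robot_x
    dy = target_y - robot_y
    dist = math.hypot(dx, dy)     # straight-line distance to the target, m
    bearing = math.atan2(dy, dx)  # angle to the target in radians (assumed convention)
    return [dist, bearing, robot_vx, robot_vy]
```

The two velocity components are what let the policy learn to lead or lag its shot; in stationary mode they would both be zero.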

Training results (74K steps)

Stationary shooting mastered by ~25K steps (reward peaked at +83)
Curriculum advanced to slow-moving at ~28K steps
Agent adapting to velocity compensation (reward recovering from transition dip)

Test plan

All 98 tests pass (pytest tests/ -v)
Gymnasium check_env passes for both stationary and moving modes
Zero robot velocity matches existing compute_trajectory_3d results
Stationary training backward compatible
Full curriculum progression through all 4 levels not yet verified (longer run needed)

🤖 Generated with Claude Code

Enables training the robot to shoot while moving along a path through the alliance zone. Ball inherits robot velocity at launch for realistic physics. Automatic curriculum ramps speed as the agent improves: stationary → slow (0.5-1.0 m/s) → medium (1.0-3.0) → fast (3.0-5.0).

New modules:
- src/paths/: Path generation with zone boundary and hub avoidance
- src/callbacks/: Curriculum callback for automatic difficulty progression
- src/physics/projectile.py: Full 3D Euler integration with robot velocity
- 4D observation space [distance, bearing, vx, vy] when enabled

Activated via --move-and-shoot flag (continuous env only). Also adds --resume flag for continuing training from checkpoints.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>

…tic divergence

- Use LoggingSAC instead of plain SAC so gradient norms are actually logged to TensorBoard (actor_grad_norm, critic_grad_norm, and _max variants)
- Fix AttributeError in LoggingSAC on replay_data.discounts (doesn't exist in SB3's ReplayBufferSamples)
- Add gradient clipping (max_norm=1.0) on actor and critic to prevent gradient explosion in move-and-shoot training
- Set target_entropy=-6.0 (2x default) since optimal policy is nearly deterministic; default target drives entropy back up after convergence
- Reduce batch_size 4096->256 and gradient_steps 4->1 to fix critic Q-value divergence from over-updating on small replay buffers
- Bump learning_starts 500->1000 for better initial buffer coverage
- Update curriculum levels: first level is now "crawl" (0.1-0.5 m/s) instead of stationary, with adjusted speed ranges
- Clear replay buffer on curriculum advancement to prevent stale transitions from poisoning Q-value estimates

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
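
The gradient clipping mentioned in this commit (max_norm=1.0) rescales all gradients together when their combined L2 norm exceeds the limit. The pure-Python sketch below shows that arithmetic only; the function name is hypothetical. In PyTorch the usual call for this is `torch.nn.utils.clip_grad_norm_`, though the commit message does not say which call LoggingSAC uses.

```python
import math


def clip_by_global_norm(grads, max_norm=1.0):
    """Hypothetical sketch: scale a list of gradient vectors so their joint L2 norm <= max_norm.

    Returns (clipped_grads, total_norm_before_clipping).
    """
    # Global norm across every element of every gradient vector.
    total_norm = math.sqrt(sum(g * g for vec in grads for g in vec))
    # Scale down only when the norm exceeds the limit; the small epsilon
    # avoids division by zero for all-zero gradients.
    scale = min(1.0, max_norm / (total_norm + 1e-6))
    clipped = [[g * scale for g in vec] for vec in grads]
    return clipped, total_norm
```

Returning the pre-clipping norm alongside the clipped values makes it straightforward to log a quantity like actor_grad_norm or critic_grad_norm at each update.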

…otes

- Add move-and-shoot training commands and --resume flag to README
- Document curriculum learning levels and thresholds
- Add move-and-shoot results table (with and without air resistance)
- Document SAC hyperparameters and rationale (batch_size, target_entropy, etc.)
- Update project structure to include sac_logging, paths, and callbacks
- Update CLAUDE.md with LoggingSAC, gradient clipping, and curriculum details

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
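
The SAC hyperparameters named in these commits can be collected as keyword arguments, as in the sketch below. The values come from the commit messages; the dict and helper names are illustrative. The helper reflects Stable-Baselines3's automatic target entropy of minus the action dimension, under which "2x default" at -6.0 would correspond to a 3-dimensional action space. That dimensionality is an inference, not something the PR states.

```python
# Values from the commit messages; the name SAC_KWARGS is illustrative.
SAC_KWARGS = {
    "batch_size": 256,        # reduced from 4096
    "gradient_steps": 1,      # reduced from 4
    "learning_starts": 1000,  # raised from 500
    "target_entropy": -6.0,   # described as 2x the default
}


def default_target_entropy(action_dim):
    """SB3's automatic SAC target entropy: minus the number of action dimensions."""
    return -float(action_dim)


# Assumed usage, if LoggingSAC accepts the same constructor arguments as SB3's SAC:
# model = LoggingSAC("MlpPolicy", env, **SAC_KWARGS)
```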

cgpadwick and others added 3 commits February 2, 2026 21:37

cgpadwick merged commit a48b440 into main Feb 4, 2026
3 checks passed

cgpadwick deleted the feature/move-and-shoot branch February 4, 2026 03:46
