Train a Unitree Go2 quadruped robot using Reinforcement Learning (PPO) in the Genesis physics simulator.
# Create and activate conda environment
conda create -n genesis python=3.10 -y
conda activate genesis
# Install Genesis
pip install genesis-world
# Install RL library (IMPORTANT: must be this exact version)
pip install rsl-rl-lib==2.2.4
# Install other dependencies
pip install torch tensorboard
cd /Users/aks/Desktop/Robotbuild
# Train walking (stable locomotion at 0.3-0.8 m/s)
python train.py --mode walk
# Train running (fast locomotion at 1.0-2.5 m/s)
python train.py --mode run
# Train jumping
python train.py --mode jump
# Train all modes sequentially
python train.py --mode all
# Optional flags (comments cannot follow a trailing backslash, so they are listed here):
#   --num_envs        Number of parallel environments (default: 4096)
#   --max_iterations  Training iterations (default: 300)
#   --viewer          Show visualization during training
#   --resume          Resume from latest checkpoint
python train.py --mode walk \
    --num_envs 4096 \
    --max_iterations 300 \
    --viewer \
    --resume
For machines without a GPU:
python train.py --mode walk --cpu --num_envs 64
tensorboard --logdir=logs/
Open http://localhost:6006 in your browser.
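The training flags listed above could be parsed along the following lines. This is a minimal sketch, not the actual contents of train.py: the flag names and the defaults for `--num_envs` and `--max_iterations` come from this README, while `build_parser` and the choice to make `--mode` required are assumptions made for illustration.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the flags documented in this README."""
    parser = argparse.ArgumentParser(description="Train a Go2 locomotion policy")
    # Assumption: --mode is required; the README always passes it explicitly.
    parser.add_argument("--mode", choices=["walk", "run", "jump", "all"], required=True)
    parser.add_argument("--num_envs", type=int, default=4096,
                        help="Number of parallel environments")
    parser.add_argument("--max_iterations", type=int, default=300,
                        help="Training iterations")
    parser.add_argument("--viewer", action="store_true",
                        help="Show visualization during training")
    parser.add_argument("--resume", action="store_true",
                        help="Resume from latest checkpoint")
    parser.add_argument("--cpu", action="store_true",
                        help="Run on CPU instead of GPU")
    return parser


# Equivalent to: python train.py --mode walk --cpu --num_envs 64
args = build_parser().parse_args(["--mode", "walk", "--cpu", "--num_envs", "64"])
print(args.mode, args.num_envs, args.max_iterations, args.cpu)
```

Boolean switches use `store_true`, so omitting `--viewer`, `--resume`, or `--cpu` leaves them off.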
# Evaluate a trained model with visualization
python evaluate.py --checkpoint logs/go2_walk_XXXXXX/
# Run more episodes
python evaluate.py --checkpoint logs/go2_run_XXXXXX/ --episodes 10

| Mode | Speed Target | Description |
|---|---|---|
| walk | 0.3-0.8 m/s | Stable, energy-efficient walking |
| run | 1.0-2.5 m/s | Fast running with dynamic gait |
| jump | N/A | Jumping with stable landing |
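
One way to read the speed targets in the table is as the range from which a commanded forward speed is drawn for each episode. The sketch below assumes uniform sampling and assumes that jump episodes command zero forward speed; neither detail is confirmed by this README, and `SPEED_RANGES_MPS` and `sample_forward_speed` are hypothetical names.

```python
import random

# Speed ranges (m/s) copied from the table above; jump has no speed target.
SPEED_RANGES_MPS = {
    "walk": (0.3, 0.8),
    "run": (1.0, 2.5),
    "jump": None,
}


def sample_forward_speed(mode: str, rng: random.Random) -> float:
    """Draw a commanded forward speed in m/s for one episode of the given mode."""
    speed_range = SPEED_RANGES_MPS[mode]
    if speed_range is None:
        # Assumption: no forward speed is commanded when there is no target.
        return 0.0
    low, high = speed_range
    return rng.uniform(low, high)


rng = random.Random(0)  # seeded so repeated runs draw the same commands
print(sample_forward_speed("walk", rng))
```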
Robotbuild/
├── train.py # Main training script
├── evaluate.py # Model evaluation script
├── go2_env.py # Environment with all reward functions
├── urdf/ # Robot model files
│ ├── go2/ # Go2 robot URDF
│ └── plane/ # Ground plane URDF
├── logs/ # Training logs and checkpoints
└── models/ # Saved models
The environment includes comprehensive reward functions:
Locomotion rewards:
- tracking_lin_vel - Track commanded velocity
- tracking_ang_vel - Track commanded turning
- forward_vel - Reward forward motion (run mode)

Stability penalties:
- lin_vel_z - Penalize vertical bouncing
- ang_vel_xy - Penalize roll/pitch rotation
- orientation - Penalize tilting
- base_height - Maintain proper height

Smoothness penalties:
- action_rate - Penalize jerky movements
- dof_acc - Penalize joint acceleration
- energy - Penalize energy consumption

Jump-specific:
- jump_height - Reward height achieved
- air_time - Reward time in air
- land_stable - Reward stable landing
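
To make the reward terms above concrete, here is a rough sketch of how three of them are commonly formulated in legged-locomotion RL: an exponential kernel for velocity tracking and squared penalties for vertical velocity and action changes. The formulas, the `sigma` value, and the scale weights are illustrative assumptions; the actual definitions live in go2_env.py and may differ.

```python
import math


def tracking_lin_vel(cmd_xy, base_xy, sigma=0.25):
    """Reward in (0, 1]; equals 1.0 when the base velocity matches the command."""
    squared_error = sum((c - b) ** 2 for c, b in zip(cmd_xy, base_xy))
    return math.exp(-squared_error / sigma)


def lin_vel_z(base_vel_z):
    """Squared vertical base velocity; grows with bouncing."""
    return base_vel_z ** 2


def action_rate(actions, last_actions):
    """Squared change in actions between consecutive steps; grows with jerkiness."""
    return sum((a - p) ** 2 for a, p in zip(actions, last_actions))


def total_reward(terms, scales):
    """Weighted sum of reward terms; penalties carry negative scales."""
    return sum(scales[name] * value for name, value in terms.items())


# Hypothetical scales chosen only for illustration.
scales = {"tracking_lin_vel": 1.0, "lin_vel_z": -1.0, "action_rate": -0.005}
terms = {
    "tracking_lin_vel": tracking_lin_vel((0.5, 0.0), (0.4, 0.0)),
    "lin_vel_z": lin_vel_z(0.2),
    "action_rate": action_rate([0.1, -0.2], [0.0, 0.0]),
}
print(total_reward(terms, scales))
```

Keeping each term as a separate function makes it straightforward to log them individually and see which one dominates the total.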
"No module named genesis"
pip install genesis-world
"Please install rsl-rl-lib==2.2.4"
pip uninstall rsl-rl rsl_rl # Remove any old versions
pip install rsl-rl-lib==2.2.4
Training is very slow
- Use GPU: remove the --cpu flag
- Reduce environments: --num_envs 1024
Robot falls immediately
- This is normal at the start of training
- Wait for 50-100 iterations for learning to begin
- Check TensorBoard for reward curves
- Start with walking - It's the foundation for other behaviors
- Use GPU - Training is 10-100x faster on GPU
- Monitor TensorBoard - Watch for reward convergence
- 300 iterations is usually enough for basic walking
- Resume training if you need to continue: --resume
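
When resuming or evaluating, it helps to know which run directory is the most recent. The helper below is a sketch under the assumption that runs are stored as logs/go2_<mode>_XXXXXX/ directories, as in the evaluation commands above; `latest_run_dir` is a hypothetical name and this is not necessarily how `--resume` selects its checkpoint.

```python
from pathlib import Path
from typing import Optional


def latest_run_dir(log_root: str, mode: str) -> Optional[Path]:
    """Return the most recently modified go2_<mode>_* directory, or None if absent."""
    candidates = [p for p in Path(log_root).glob(f"go2_{mode}_*") if p.is_dir()]
    if not candidates:
        return None
    # Modification time is used as a proxy for "latest run".
    return max(candidates, key=lambda p: p.stat().st_mtime)


run_dir = latest_run_dir("logs", "walk")
print(run_dir if run_dir is not None else "No walk runs found under logs/")
```

The returned path can then be passed to `evaluate.py --checkpoint`.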