PPO#

RoboVerse provides two PPO implementations with different features and use cases:

2. CleanRL PPO#

Based on CleanRL, this implementation takes a more minimal, educational approach: the algorithm is implemented directly in a single script rather than behind framework abstractions.
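
At its core, CleanRL-style PPO optimizes the clipped surrogate objective from the PPO paper. The sketch below shows that objective in PyTorch; it is a generic illustration of the technique, not the contents of roboverse_learn/rl/clean_rl/ppo.py, and the function name and batch values are made up for the example.

# Minimal sketch of PPO's clipped surrogate loss (Schulman et al., 2017).
# Illustrative only -- not the RoboVerse/CleanRL implementation.
import torch

def ppo_clip_loss(new_logprobs, old_logprobs, advantages, clip_coef=0.2):
    ratio = torch.exp(new_logprobs - old_logprobs)  # pi_new(a|s) / pi_old(a|s)
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1.0 - clip_coef, 1.0 + clip_coef) * advantages
    # Take the pessimistic (minimum) surrogate, then negate to minimize.
    return -torch.min(unclipped, clipped).mean()

# Toy batch of four transitions.
new_lp = torch.tensor([-1.0, -0.5, -2.0, -1.5], requires_grad=True)
old_lp = torch.tensor([-1.1, -0.6, -1.9, -1.4])
adv = torch.tensor([0.5, -0.2, 1.0, 0.3])
print(ppo_clip_loss(new_lp, old_lp, adv))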

Usage#

# CleanRL PPO with RoboVerse environment
python roboverse_learn/rl/clean_rl/ppo.py --task reach_origin --robot franka --sim mjx --num_envs 2048
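
This command trains a Franka arm on the reach_origin task with 2048 parallel environments on the MJX simulator backend; the flags are described under Configuration below.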

Configuration#

Check the file header in roboverse_learn/rl/clean_rl/ppo.py for the available configuration options, including:

  • Task selection (--task)

  • Robot type (--robot)

  • Simulator backend (--sim)

  • Training hyperparameters (--num_envs, --learning_rate, etc.)
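
For example, a run that overrides the defaults might look like the following (the hyperparameter values are illustrative, not tuned recommendations):

python roboverse_learn/rl/clean_rl/ppo.py \
    --task reach_origin \
    --robot franka \
    --sim mjx \
    --num_envs 4096 \
    --learning_rate 3e-4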

Quick Start Examples#

For detailed tutorials and infrastructure setup: