0. PPO Reaching
Reinforcement learning (RL) is a powerful tool for training agents to perform tasks in simulation, especially when large-scale parallel simulation environments are available.
In this example, we train a PPO agent on two tasks in a 3D environment: reaching as far away as possible, and reaching a target position.
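The two tasks pull in opposite directions, which a toy reward sketch can make concrete. The functions below are hypothetical and are not taken from the debug:reach_far_away or debug:reach_origin implementations. They assume that the target is the origin (as the task name reach_origin suggests) and that the reward depends only on the Euclidean distance of a tracked 3D point from the origin.

```python
import math


def distance_from_origin(position):
    """Euclidean distance of a 3D point (x, y, z) from the origin."""
    return math.sqrt(sum(c * c for c in position))


def reach_far_away_reward(position):
    # Hypothetical reward: the farther from the origin, the higher the reward.
    return distance_from_origin(position)


def reach_origin_reward(position):
    # Hypothetical reward: the closer to the origin, the higher the reward
    # (the maximum value, 0, is attained at the origin itself).
    return -distance_from_origin(position)
```

Under these assumptions the same policy update rule (PPO) is applied in both cases; only the sign of the distance term in the reward differs.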
One Command to Train PPO, Run Inference, and Save Video
We provide a tutorial script that trains PPO, runs inference, and saves a video. In this example, we use Stable-Baselines3 to train PPO.
If you are using macOS: only MuJoCo without parallelism is currently supported. Run the commands below with mjpython instead of python, and add the flag --num_envs 1.
If you are using Windows: only MuJoCo without parallelism is currently supported. Add the flag --num_envs 1.
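The platform notes above can be sketched as a small helper that assembles the command line. This is only an illustration: build_ppo_command is a hypothetical name, and the --sim value "mujoco" is an assumption based on the wording of the notes, not a value confirmed by the script.

```python
import platform


def build_ppo_command(task, sim, num_envs, system=None):
    """Return the argument list for get_started/rl/0_ppo.py.

    `system` defaults to platform.system(), which returns "Darwin" on
    macOS, "Windows" on Windows, and "Linux" on Linux.
    """
    system = system or platform.system()
    interpreter = "python"
    if system in ("Darwin", "Windows"):
        # Only MuJoCo without parallelism is supported on these platforms.
        sim = "mujoco"  # assumed spelling of the --sim value
        num_envs = 1
        if system == "Darwin":
            # macOS additionally needs mjpython instead of python.
            interpreter = "mjpython"
    return [
        interpreter, "get_started/rl/0_ppo.py",
        "--sim", sim,
        "--task", task,
        "--num_envs", str(num_envs),
        "--headless",
    ]
```

For example, " ".join(build_ppo_command("debug:reach_origin", "isaacgym", 128)) on Linux reproduces the Isaac Gym command from the examples section, while on macOS the same call falls back to mjpython with a single MuJoCo environment.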
Task: Reach Far Away
python get_started/rl/0_ppo.py --sim <simulator> --task debug:reach_far_away --num_envs <num_envs> --headless
Task: Reach Target
python get_started/rl/0_ppo.py --sim <simulator> --task debug:reach_origin --num_envs <num_envs> --headless
Example Commands and Results
Task: Reach Far Away
Isaac Gym:
python get_started/rl/0_ppo.py --sim isaacgym --task debug:reach_far_away --num_envs 128 --headless
Isaac Lab:
python get_started/rl/0_ppo.py --sim isaaclab --task debug:reach_far_away --num_envs 128 --headless
Task: Reach Target
Isaac Gym:
python get_started/rl/0_ppo.py --sim isaacgym --task debug:reach_origin --num_envs 128 --headless
Isaac Lab:
python get_started/rl/0_ppo.py --sim isaaclab --task debug:reach_origin --num_envs 128 --headless
The resulting videos look like this:
Reach Far Away:
Isaac Gym
Isaac Sim
Reach Origin:
Isaac Gym
Isaac Sim