SkillBlender RL#
We provide an implementation of SkillBlender in our framework.
RL algorithm: PPO (via rsl_rl v1.0.2)
RL learning framework: hierarchical RL
Simulator: IsaacGym
Installation#
pip install -e roboverse_learn/rl/rsl_rl
Training#
IsaacGym:
python3 roboverse_learn/skillblender_rl/train.py --task "skillblender:Walking" --sim "isaacgym" --num_envs 1024 --robot "h1_wrist" --use_wandb
After training for a few minutes on the tasks skillblender:Walking and skillblender:Stepping, you should see results like this. Note that you should always use h1_wrist instead of the naive h1 so that the wrist links exist. To speed up training, click the IsaacGym viewer and press V to stop rendering.
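For example, to train the Stepping skill, you can reuse the same command and change only the task name (the flags below simply mirror the Walking command above):

python3 roboverse_learn/skillblender_rl/train.py --task "skillblender:Stepping" --sim "isaacgym" --num_envs 1024 --robot "h1_wrist" --use_wandb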
Play#
After training for a few minutes, you can run the following play script:
python3 roboverse_learn/skillblender_rl/play.py --task skillblender:Reaching --sim isaacgym --robot h1_wrist --load_run 2025_0628_232507 --checkpoint 15000
You should see videos like the following:

Skillblender::Reaching

Skillblender::Walking
Checkpoints#
We also provide checkpoints trained with the RoboVerse humanoid infrastructure. To use them with roboverse_learn/skillblender_rl/play.py, rename the file to model_xxx.pt and move it into the appropriate directory, which should have the following layout:
outputs/
└── skillblender/
    └── h1_wrist_reaching/            # Task name
        └── 2025_0628_232507/         # Timestamped experiment folder
            ├── reaching_cfg.py       # Config snapshot (copied from metasim/cfg/tasks/skillblender)
            ├── model_0.pt            # Checkpoint at iteration 0
            ├── model_500.pt          # Checkpoint at iteration 500
            └── ...
Then pass the experiment folder name via --load_run and the checkpoint iteration via --checkpoint when playing.
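For example, assuming the provided checkpoint has been renamed to model_500.pt and placed in the layout above, it could be played back with:

python3 roboverse_learn/skillblender_rl/play.py --task skillblender:Reaching --sim isaacgym --robot h1_wrist --load_run 2025_0628_232507 --checkpoint 500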
Task list#
4 Goal-Conditioned Skills
Walking
Squatting
Stepping
Reaching
8 Loco-Manipulation Tasks
FarReach
ButtonPress
CabinetClose
FootballShoot
BoxPush
PackageLift
BoxTransfer
PackageCarry
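All of these tasks are selected through the --task flag. Assuming the loco-manipulation tasks follow the same skillblender:&lt;TaskName&gt; naming pattern as the skills above (an assumption; check the task configs under metasim/cfg/tasks/skillblender for the exact names), a training command would look like:

python3 roboverse_learn/skillblender_rl/train.py --task "skillblender:FarReach" --sim "isaacgym" --num_envs 1024 --robot "h1_wrist" --use_wandb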
Supported robots#
h1
g1
h1_2
Todos#
domain randomization
pushing robot
sim2sim
How to add a new Task#
1. Create your wrapper module: add a new file abc_wrapper.py under roboverse_learn/skillblender_rl/env_wrappers (a minimal sketch of such a wrapper is given after this list).
2. Add a config file abc_cfg.py under metasim/cfg/tasks/skillblender.
3. Define your reward functions in reward_fun_cfg.py, and check whether the current states and variables are sufficient for reward computation.
   - If the existing states are not sufficient, add variables or buffers by overriding _init_buffers():

   ```python
   def _init_buffers(self):
       super()._init_buffers()
       # define your variables or buffers here
       self.xxx = xxx
   ```
   - Parse your new states for reward computation if necessary:

   ```python
   def _parse_NEW_STATES(self, envstate):
       # parse the new states and expose them via the extra dict
       envstate[robot_name].extra['xxx'] = self.xxx

   def _parse_state_for_reward(self, envstate):
       super()._parse_state_for_reward(envstate)
       self._parse_NEW_STATES(envstate)
   ```
4. Implement _compute_observation() to fill obs and privileged_obs.
5. Modify _post_physics_step to reset the variables you defined with reset_env_idx.
6. Add the Cfg for your task under metasim/cfg/tasks/skillblender.
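Putting the steps together, here is a minimal, illustrative sketch of what abc_wrapper.py could look like. The base class name BaseHumanoidWrapper, the goal_pos buffer, and the attributes num_envs, device, and robot_name are hypothetical placeholders rather than the actual RoboVerse API; only the overridden method names follow the steps above, so check the existing wrappers in roboverse_learn/skillblender_rl/env_wrappers for the real interfaces.

```python
# abc_wrapper.py -- illustrative sketch only; BaseHumanoidWrapper, goal_pos,
# num_envs, device, and robot_name are hypothetical placeholders.
import torch


class AbcWrapper(BaseHumanoidWrapper):  # hypothetical base class
    def _init_buffers(self):
        super()._init_buffers()
        # extra buffer needed by the new reward terms
        self.goal_pos = torch.zeros(self.num_envs, 3, device=self.device)

    def _parse_new_states(self, envstate):
        # expose the new buffer to the reward functions via the extra dict
        envstate[self.robot_name].extra["goal_pos"] = self.goal_pos

    def _parse_state_for_reward(self, envstate):
        super()._parse_state_for_reward(envstate)
        self._parse_new_states(envstate)

    def _compute_observation(self, envstate):
        # fill obs and privileged_obs from the parsed states
        ...

    def _post_physics_step(self):
        super()._post_physics_step()
        # reset the buffers defined above for environments that were reset
        ...
```

The reward functions defined in reward_fun_cfg.py can then read the values parsed into extra (here, 'goal_pos') when computing rewards.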
References and Acknowledgements#
Our implementation of SkillBlender is based on, and inspired by, the following projects: