SkillBlender RL#

We provide an implementation of SkillBlender in our framework.

RL algorithm: PPO from rsl_rl v1.0.2

RL learning framework: hierarchical RL

Simulator: IsaacGym

Installation#

pip install -e roboverse_learn/rl/rsl_rl
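
To verify the editable install, you can query pip (assuming the package is distributed under the name rsl_rl):

pip show rsl_rl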

Training#

  • IsaacGym:

    python3 roboverse_learn/skillblender_rl/train.py --task "skillblender:Walking" --sim "isaacgym" --num_envs 1024 --robot "h1_wrist" --use_wandb
    

    After training for a few minutes on the skillblender:Walking or skillblender:Stepping task, you should see something like this. Note that you should always use h1_wrist instead of the plain h1 so that the wrist links exist. To speed up training, click the IsaacGym viewer and press V to stop rendering.
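
    For example, the Stepping skill is trained with the same flags, changing only the task name:

    python3 roboverse_learn/skillblender_rl/train.py --task "skillblender:Stepping" --sim "isaacgym" --num_envs 1024 --robot "h1_wrist" --use_wandb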

Play#

After training for a few minutes, you can run the following play script:

python3 roboverse_learn/skillblender_rl/play.py --task skillblender:Reaching --sim isaacgym --robot h1_wrist --load_run 2025_0628_232507  --checkpoint 15000

You should see a video like this:

SkillBlender::Reaching

SkillBlender::Walking

Checkpoints#

We also provide checkpoints trained with the RoboVerse humanoid infrastructure. To use one with roboverse_learn/skillblender_rl/play.py, rename the file to model_xxx.pt and move it into the appropriate directory, which should have the following layout:

outputs/
└── skillblender/
    └── h1_wrist_reaching/        # Task name
        └── 2025_0628_232507/     # Timestamped experiment folder
            ├── reaching_cfg.py   # Config snapshot (copied from metasim/cfg/tasks/skillblender)
            ├── model_0.pt        # Checkpoint at iteration 0
            ├── model_500.pt      # Checkpoint at iteration 500
            └── ...

When playing, pass --load_run and --checkpoint values that match the experiment folder name and the checkpoint iteration, as shown below.
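
A minimal sketch, assuming the downloaded checkpoint corresponds to iteration 15000 of the Reaching experiment above (folder and file names are illustrative):

mkdir -p outputs/skillblender/h1_wrist_reaching/2025_0628_232507
mv <downloaded_checkpoint>.pt outputs/skillblender/h1_wrist_reaching/2025_0628_232507/model_15000.pt
python3 roboverse_learn/skillblender_rl/play.py --task skillblender:Reaching --sim isaacgym --robot h1_wrist --load_run 2025_0628_232507 --checkpoint 15000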

Task list#

4 Goal-Conditioned Skills

  • Walking

  • Squatting

  • Stepping

  • Reaching

8 Loco-Manipulation Tasks

  • FarReach

  • ButtonPress

  • CabinetClose

  • FootballShoot

  • BoxPush

  • PackageLift

  • BoxTransfer

  • PackageCarry
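
Assuming the loco-manipulation tasks follow the same skillblender:<TaskName> naming pattern as the skills above, a task such as FarReach would presumably be trained with:

python3 roboverse_learn/skillblender_rl/train.py --task "skillblender:FarReach" --sim "isaacgym" --num_envs 1024 --robot "h1_wrist" --use_wandb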

Supported robots#

  • h1

  • g1

  • h1_2

Todos#

  • domain randomization

  • robot pushing (random perturbations)

  • sim2sim

How to add a new task#

  1. Create your wrapper module

    • Add a new file abc_wrapper.py under roboverse_learn/skillblender_rl/env_wrappers

    • Add a config file abc_cfg.py under metasim/cfg/tasks/skillblender

    • Define your reward functions in reward_fun_cfg.py, and check whether the currently parsed states and variables are sufficient for reward computation; see the sketch below.
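
    A minimal sketch of a reward term, assuming reward functions receive the parsed environment state and return a per-env tensor; the name reward_wrist_box_distance and the extra keys 'wrist_pos' and 'box_pos' are illustrative, not part of the existing API:

    import torch

    def reward_wrist_box_distance(envstate, robot_name):
        """Hypothetical reward: exponential kernel on wrist-to-box distance."""
        # both tensors are assumed to be stored in `extra` by
        # _parse_state_for_reward (see step 3 below)
        wrist_pos = envstate[robot_name].extra['wrist_pos']  # (num_envs, 3)
        box_pos = envstate[robot_name].extra['box_pos']      # (num_envs, 3)
        dist = torch.norm(wrist_pos - box_pos, dim=-1)
        return torch.exp(-dist)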

  2. If the existing states are not sufficient, add member variables by overriding _init_buffers()

    def _init_buffers(self):
        super()._init_buffers()
        # define your extra variables or buffers here
        self.xxx = xxx
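
    For instance, a buffer tracking each environment's root position at reset, for a displacement-style reward (the names start_root_pos, num_envs, and device are assumptions about the wrapper, not guaranteed):

    def _init_buffers(self):
        super()._init_buffers()
        # per-env root position recorded at reset, shape (num_envs, 3)
        self.start_root_pos = torch.zeros(self.num_envs, 3, device=self.device)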
    
  3. Parse your new states for reward computation if necessary:

    def _parse_new_states(self, envstate):
        """Parse newly added states into envstate for reward computation."""
        envstate[self.robot_name].extra['xxx'] = self.xxx

    def _parse_state_for_reward(self, envstate):
        super()._parse_state_for_reward(envstate)
        self._parse_new_states(envstate)
    
  4. Implement _compute_observation()

    • Fill obs and privileged_obs; see the sketch after this step.

    • Modify _post_physics_step so that the variables you defined are reset through reset_env_idx.
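
    A minimal sketch of the observation assembly, assuming legged-gym-style proprioception buffers; the attribute names below are illustrative and should be replaced by whatever buffers your wrapper actually maintains:

    def _compute_observation(self, envstates):
        # actor observation: proprioception only
        obs = torch.cat(
            (
                self.base_ang_vel,  # base angular velocity
                self.dof_pos,       # joint positions
                self.dof_vel,       # joint velocities
                self.actions,       # previous actions
            ),
            dim=-1,
        )
        # the privileged (critic) observation may additionally include
        # states that are hard to estimate on the real robot
        privileged_obs = torch.cat((obs, self.base_lin_vel), dim=-1)
        return obs, privileged_obs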

  5. Add the cfg for your task under metasim/cfg/tasks/skillblender.

References and Acknowledgements#

Our implementation of SkillBlender is based on and inspired by the following projects: