# Unitree RL

Train and deploy locomotion policies for Unitree robots across three stages:

- Training in IsaacGym or IsaacSim
- Sim2Sim evaluation in IsaacGym, IsaacSim, or MuJoCo
- Real-world deployment (networked controller)

Well-supported robots: `g1_dof29` (full body, with or without hands) and `g1_dof12` (lower body).

## Environment setup

Install the RL library dependency (rsl_rl v3.1.1) from source:

```
git clone https://github.com/leggedrobotics/rsl_rl
cd rsl_rl
git checkout v3.1.1
pip install -e .
```

## Training

General form:

```
python roboverse_learn/rl/unitree_rl/main.py \
    --task <task> \
    --sim isaacgym \
    --num_envs 8192 \
    --robot <robot>
```

Examples:

- G1 humanoid walking (IsaacSim):

```
python roboverse_learn/rl/unitree_rl/main.py --task walk_g1_dof29 --sim isaacsim --num_envs 8192 --robot g1_dof29
```

- G1 DOF12 walking (IsaacGym):

```
python roboverse_learn/rl/unitree_rl/main.py --task walk_g1_dof12 --sim isaacgym --num_envs 8192 --robot g1_dof12
```

Outputs and checkpoints are saved to:

```
outputs/unitree_rl/<task>_<robot>/<datetime>/
```

Each checkpoint is named `model_<iteration>.pt`.

## Evaluation / Play

You can evaluate trained policies in IsaacGym, IsaacSim, and MuJoCo. During evaluation, `main.py` also exports a TorchScript (JIT) version of the policy to `outputs/unitree_rl/<task>_<robot>/<datetime>/exported/model_exported_jit.pt`, which can then be used for real-world deployment.

IsaacGym evaluation:

```
python roboverse_learn/rl/unitree_rl/main.py \
    --task walk_g1_dof29 \
    --sim isaacgym \
    --num_envs 1 \
    --robot g1_dof29 \
    --resume \
    --checkpoint <iteration> \
    --eval
```

MuJoCo evaluation (e.g., DOF12 with a public policy):

```
python roboverse_learn/rl/unitree_rl/main.py \
    --checkpoint <iteration> \
    --task walk_g1_dof12 \
    --sim mujoco \
    --robot g1_dof12 \
    --resume \
    --eval
```

The `--resume` and `--checkpoint` options can also be used during training to resume from a checkpoint.

## Real-world deployment

First, install the `unitree_sdk2_python` package:

```
cd third_party
git clone https://github.com/unitreerobotics/unitree_sdk2_python.git
cd unitree_sdk2_python
pip install -e .
```

Real-world deployment entry point:

```
python roboverse_learn/rl/unitree_rl/deploy/deploy_real.py
```

Example:

```
python roboverse_learn/rl/unitree_rl/deploy/deploy_real.py eno1 g1_dof29_dex3.yaml
```

Here `eno1` is the network interface connected to the robot and `g1_dof29_dex3.yaml` is the deployment config. Modify the corresponding `yaml` file in `roboverse_learn/rl/unitree_rl/deploy/configs`, setting `policy_path` to the exported JIT policy. This initializes the real controller and streams commands to the robot. Ensure your networking and safety interlocks are correctly configured.

## Command-line arguments

The most relevant flags (see `helper/utils.py`):

- `--task` (str): Task name. CamelCase or snake_case accepted. Examples: `walk_g1_dof29`, `walk_g1_dof12`.
- `--robot` (str): Robot identifier. Common: `g1_dof29`, `g1_dof12`.
- `--num_envs` (int): Number of parallel environments.
- `--sim` (str): Simulator. Supported: `isaacgym` and `isaacsim` (training and evaluation), `mujoco` (evaluation).
- `--run_name` (str): Required run tag for training logs/checkpoints.
- `--learning_iterations` (int): Number of learning iterations (default 15000).
- `--resume` (flag): Resume from a checkpoint directory (`<datetime>`) in the specified run.
- `--checkpoint` (int): Which checkpoint to load. `-1` loads the latest.
- `--headless` (flag): Headless rendering (IsaacGym).
- `--jit_load` (flag): Load the JIT policy.

Notes:

- Checkpoints: `outputs/unitree_rl/<task>_<robot>/<datetime>/model_<iteration>.pt`
- Exported JIT model (when used): `outputs/unitree_rl/<task>_<robot>/<datetime>/exported/model_exported_jit.pt`
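As a concrete illustration of the checkpoint layout above, the following sketch shows one way to resolve the latest checkpoint in a run directory, mirroring the documented `--checkpoint -1` behavior. The `latest_checkpoint` helper is illustrative, not part of the repo, and assumes the `model_<iteration>.pt` naming described in the Notes:

```
from pathlib import Path

def latest_checkpoint(run_dir: str) -> Path:
    """Return the highest-numbered model_<iteration>.pt in a run directory.

    Illustrative helper (not part of the repo), mimicking what
    `--checkpoint -1` is documented to do.
    """
    # Collect model_<iteration>.pt files and sort by iteration number.
    ckpts = sorted(
        Path(run_dir).glob("model_*.pt"),
        key=lambda p: int(p.stem.split("_")[1]),
    )
    if not ckpts:
        raise FileNotFoundError(f"no checkpoints under {run_dir}")
    return ckpts[-1]

# Substitute a real run directory for the placeholders.
print(latest_checkpoint("outputs/unitree_rl/<task>_<robot>/<datetime>"))
```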
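Similarly, before real-world deployment you can sanity-check the exported JIT policy by loading it in plain PyTorch. This is a minimal sketch under stated assumptions: the path placeholders must be filled in with a real run, and the observation dimension (96 here) is an assumed placeholder that depends on your task/robot config:

```
import torch

# Path layout from the Notes above; <task>, <robot>, <datetime> are placeholders.
policy = torch.jit.load(
    "outputs/unitree_rl/<task>_<robot>/<datetime>/exported/model_exported_jit.pt"
)
policy.eval()

# Dummy observation; 96 is an assumed placeholder, not the real obs dim.
obs = torch.zeros(1, 96)
with torch.no_grad():
    action = policy(obs)
print(action.shape)
```

If the policy loads and produces an action tensor of the expected shape, the export step succeeded and the file is ready to be referenced from `policy_path` in the deployment config.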