Diffusion-Policy#

Installation#

cd roboverse_learn/algorithms/diffusion_policy
pip install -e .
cd ../../../

pip install pandas wandb

Usage of Training#

  1. First, run `data2zarr_dp.py` to convert the demonstrations into Zarr format

# --------- command format ---------#

python roboverse_learn/algorithms/diffusion_policy/data2zarr_dp.py <task_name> <expert_data_num> <metadata_dir> <split_offset>

# 'task_name' follows the format {task}_{robot}
# e.g. StackCube_franka or CloseBox_franka

# 'expert_data_num' is the number of demonstrations used for training;
# choose any value up to the number of demonstrations available in 'metadata_dir'

# 'metadata_dir' is the directory containing the demonstration metadata

# --------- command example ---------#

python roboverse_learn/algorithms/diffusion_policy/data2zarr_dp.py CloseBox_franka 10 ~/Project/RoboVerse/RoboVerse/data_isaaclab/demo/CloseBox/robot-franka 1

The processed Zarr data is saved under `data_policy`. If you want to move the Zarr data to a headless server, make sure it is placed at the same `data_policy` path on that server.

ATTENTION:

This script uses `joint_qpos` as the state and `joint_qpos_target` as the action. To change this, modify lines 84 and 85 of `roboverse_learn/algorithms/diffusion_policy/data2zarr_dp.py`.
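The conversion step can be pictured roughly as follows. This is a stdlib-only sketch, not the script's actual code: it assumes the common diffusion-policy replay-buffer layout of flat `state`/`action` arrays plus an `episode_ends` index marking demo boundaries, and the function and field names are illustrative.

```python
# Sketch (assumed layout): per-step states (joint_qpos) and actions
# (joint_qpos_target) are flattened across all demos, with episode_ends
# recording the cumulative end index of each demo.

def demos_to_buffers(demos):
    """demos: list of demos; each demo is a list of per-step dicts."""
    states, actions, episode_ends = [], [], []
    for demo in demos:
        for step in demo:
            states.append(step["joint_qpos"])          # state
            actions.append(step["joint_qpos_target"])  # action
        episode_ends.append(len(states))               # cumulative end index
    return states, actions, episode_ends

# Two toy demos with 2-DoF "joint" vectors
demos = [
    [{"joint_qpos": [0.0, 0.0], "joint_qpos_target": [0.1, 0.0]},
     {"joint_qpos": [0.1, 0.0], "joint_qpos_target": [0.2, 0.1]}],
    [{"joint_qpos": [0.5, 0.5], "joint_qpos_target": [0.6, 0.5]}],
]
states, actions, ends = demos_to_buffers(demos)
print(ends)  # [2, 3]
```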

If you switch `joint_qpos` and `joint_qpos_target` to other parameters, also update `agent_pos: shape` and `action: shape` in `roboverse_learn/algorithms/diffusion_policy/diffusion_policy/config/task/default_task.yaml` so they are consistent with the shapes of the new parameters.
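For orientation, the relevant entries in `default_task.yaml` look roughly like the fragment below. This is a sketch: the `shape_meta` nesting follows common diffusion-policy configs, and the 9-dimensional values are placeholders, so check the actual file.

```yaml
shape_meta:
  obs:
    agent_pos:
      shape: [9]   # must equal the dimension of the chosen state (e.g. joint_qpos)
  action:
    shape: [9]     # must equal the dimension of the chosen action (e.g. joint_qpos_target)
```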

  2. Then run `train.sh` to train the diffusion policy (DP)

# --------- command example ---------#
# example task: train a diffusion policy on 100 CloseBox Franka demonstrations
python roboverse_learn/algorithms/diffusion_policy/train.py --config-name=robot_dp.yaml \
                task.name=CloseBox_franka_100 \
                task.dataset.zarr_path="data_policy/CloseBox_franka_100.zarr" \
                training.debug=False \
                training.seed=0 \
                training.device="cuda:0" \
                exp_name=CloseBox_franka_100_dp \
                horizon=4 \
                n_obs_steps=2 \
                n_action_steps=2

You can modify training parameters (including the number of epochs and the batch size) in `roboverse_learn/algorithms/diffusion_policy/diffusion_policy/config/robot_dp.yaml`.

Important parameters you may want to change (line numbers refer to `robot_dp.yaml`):

  • horizon (line 12)

  • n_obs_steps (line 13)

  • n_action_steps (line 14)

  • dataloader: batch_size (line 76)

  • val_dataloader: batch_size (line 83)

  • num_epochs (line 104)

  • checkpoint_every (line 113)

  • val_every (line 114)
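To see how `horizon`, `n_obs_steps`, and `n_action_steps` interact, the sketch below simulates a receding-horizon rollout, the scheme diffusion policies commonly use at evaluation time: the policy sees the last `n_obs_steps` observations, predicts `horizon` actions, and only the first `n_action_steps` are executed before re-planning. The policy and environment here are dummies, not the RoboVerse API.

```python
# Receding-horizon rollout sketch (dummy policy and env, illustrative only)
from collections import deque

HORIZON = 4         # length of each predicted action sequence
N_OBS_STEPS = 2     # observations fed to the policy per prediction
N_ACTION_STEPS = 2  # actions executed before re-planning

def dummy_policy(obs_window):
    """Stand-in for the diffusion policy: returns HORIZON actions."""
    base = sum(obs_window) / len(obs_window)
    return [base + i for i in range(HORIZON)]

def rollout(total_steps=8):
    obs = 0.0
    obs_window = deque([obs] * N_OBS_STEPS, maxlen=N_OBS_STEPS)
    executed = []
    while len(executed) < total_steps:
        plan = dummy_policy(list(obs_window))
        for action in plan[:N_ACTION_STEPS]:  # execute only the first few
            obs = action                      # dummy env: next obs = action
            obs_window.append(obs)
            executed.append(action)
            if len(executed) == total_steps:
                break
    return executed

print(len(rollout()))  # 8 actions executed, re-planning every N_ACTION_STEPS
```

Smaller `n_action_steps` means more frequent re-planning (more reactive, slower); larger values commit to longer open-loop chunks of the predicted horizon.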

Usage of Validation#

Use the following command to run validation (customize the input parameters as needed):

python roboverse_learn/eval.py --task CloseBox --sim isaaclab --checkpoint_path <absolute_path_to_checkpoint>