# Diffusion Policy
## Installation

```bash
cd roboverse_learn/algorithms/diffusion_policy
pip install -e .
cd ../../../
pip install pandas wandb
```
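To confirm the editable install succeeded, you can run a quick import check from the repository root (this assumes the package installs under the name `diffusion_policy`; adjust if your setup differs):

```bash
# Sanity check: should print the package location without raising ImportError
# (assumes the editable install registers the package as `diffusion_policy`).
python -c "import diffusion_policy; print(diffusion_policy.__file__)"
```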
## Option 1: Two Steps, Pre-processing and Training

### Data Preparation

`data2zarr_dp.py` converts the metadata stored by the `collect_demo` script into Zarr format for efficient dataloading. This script can handle both joint-position and end-effector action and observation spaces.

Command:
```bash
python roboverse_learn/algorithms/data2zarr_dp.py \
  --task_name <task_name> \
  --expert_data_num <expert_data_num> \
  --metadata_dir <metadata_dir> \
  --action_space <action_space> \
  --observation_space <observation_space>
```
| Argument | Description | Example |
|---|---|---|
| `task_name` | Name of the task | `CloseBoxFrankaL0` |
| `expert_data_num` | Number of expert demonstrations to process | `100` |
| `metadata_dir` | Path to the directory containing demonstration metadata saved by `collect_demo` | `roboverse_demo/demo_isaaclab/CloseBox-Level0/robot-franka` |
| `action_space` | Type of action space to use (options: `joint_pos` or `ee`) | `joint_pos` |
| `observation_space` | Type of observation space to use (options: `joint_pos` or `ee`) | `joint_pos` |
| `delta_ee` | (optional) Delta control (0: absolute, 1: delta; default 0) | `0` |
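For example, converting the CloseBox demonstrations used later in this guide (the metadata path, task name, and demonstration count are taken from the `train_dp.sh` example below):

```bash
# Convert 100 CloseBox demonstrations to Zarr using joint-position obs/actions.
python roboverse_learn/algorithms/data2zarr_dp.py \
  --task_name CloseBoxFrankaL0 \
  --expert_data_num 100 \
  --metadata_dir roboverse_demo/demo_isaaclab/CloseBox-Level0/robot-franka \
  --action_space joint_pos \
  --observation_space joint_pos
```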
### Training

`diffusion_policy/train.py` uses the generated Zarr data, which is stored in the `data_policy/` directory, to train the diffusion policy model. Note that the `policy_runner` arguments should match the arguments used in `data2zarr_dp.py`; they are also used in downstream evaluations.

Command:
```bash
python roboverse_learn/algorithms/diffusion_policy/train.py \
  --config-name=robot_dp.yaml \
  task.name=<task_name> \
  task.dataset.zarr_path=<zarr_path> \
  training.seed=<seed> \
  horizon=<horizon> \
  n_obs_steps=<n_obs_steps> \
  n_action_steps=<n_action_steps> \
  training.num_epochs=<num_epochs> \
  policy_runner.obs.obs_type=<obs_type> \
  policy_runner.action.action_type=<action_type> \
  policy_runner.action.delta=<delta> \
  training.device=<device>
```
| Argument | Description | Example |
|---|---|---|
| `task.name` | Name of the task | `CloseBoxFrankaL0` |
| `task.dataset.zarr_path` | Path to the Zarr dataset created in Step 1. This will be `{task_name}_{expert_data_num}.zarr` | `data_policy/CloseBoxFrankaL0_100.zarr` |
| `training.seed` | Random seed for reproducibility | `42` |
| `horizon` | Time horizon for the policy | |
| `n_obs_steps` | Number of observation steps | |
| `n_action_steps` | Number of action steps | |
| `training.num_epochs` | Number of training epochs | `200` |
| `policy_runner.obs.obs_type` | Observation type (`joint_pos` or `ee`) | `joint_pos` |
| `policy_runner.action.action_type` | Action type (`joint_pos` or `ee`) | `joint_pos` |
| `policy_runner.action.delta` | Delta control mode (0 for absolute, 1 for delta) | `0` |
| `training.device` | GPU device to use | `cuda:0` |
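For reference, a filled-in invocation for the CloseBox example used below might look like the following. The seed, horizon, and step counts here are illustrative placeholders rather than values prescribed by the repository; the Zarr path follows the `{task_name}_{expert_data_num}.zarr` convention.

```bash
# Illustrative values only: seed, horizon, n_obs_steps, and n_action_steps are placeholders.
python roboverse_learn/algorithms/diffusion_policy/train.py \
  --config-name=robot_dp.yaml \
  task.name=CloseBoxFrankaL0 \
  task.dataset.zarr_path=data_policy/CloseBoxFrankaL0_100.zarr \
  training.seed=42 \
  horizon=16 \
  n_obs_steps=2 \
  n_action_steps=8 \
  training.num_epochs=200 \
  policy_runner.obs.obs_type=joint_pos \
  policy_runner.action.action_type=joint_pos \
  policy_runner.action.delta=0 \
  training.device=cuda:0
```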
## Option 2: Run with a Single Command: train_dp.sh

We further wrap the data preparation and training into a single command: `train_dp.sh`. This ensures consistency between the parameters of data preparation and training, especially the action space, observation space, and data directory.

```bash
bash roboverse_learn/algorithms/diffusion_policy/train_dp.sh <metadata_dir> <task_name> <expert_data_num> <gpu_id> <num_epochs> <obs_space> <act_space> [<delta_ee>]
```
| Argument | Description |
|---|---|
| `metadata_dir` | Path to the directory containing demonstration metadata saved by `collect_demo` |
| `task_name` | Name of the task |
| `expert_data_num` | Number of expert demonstrations to use |
| `gpu_id` | ID of the GPU to use |
| `num_epochs` | Number of training epochs |
| `obs_space` | Observation space (`joint_pos` or `ee`) |
| `act_space` | Action space (`joint_pos` or `ee`) |
| `delta_ee` | Optional: Delta control (0: absolute, 1: delta; default 0) |
Example:

```bash
bash roboverse_learn/algorithms/diffusion_policy/train_dp.sh roboverse_demo/demo_isaaclab/CloseBox-Level0/robot-franka CloseBoxFrankaL0 100 0 200 joint_pos joint_pos
```
**Important Parameter Overrides:**

- `horizon`, `n_obs_steps`, and `n_action_steps` are set directly in `train_dp.sh` and override the YAML configurations.
- All other parameters (e.g., batch size, number of epochs) can be manually adjusted in the YAML file: `roboverse_learn/algorithms/diffusion_policy/diffusion_policy/config/robot_dp.yaml`
- If you alter the observation and action spaces, verify the corresponding shapes in `roboverse_learn/algorithms/diffusion_policy/diffusion_policy/config/task/default_task.yaml`. Both end-effector control and the Franka joint space have dimension 9, but keep this in mind if using a different robot. A quick way to inspect the converted dataset's shapes is shown below.
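As a sanity check before editing the shape configuration, you can print the structure of the converted Zarr dataset and confirm the observation and action dimensions. The path below is illustrative and follows the `{task_name}_{expert_data_num}.zarr` naming convention; adjust it to your own dataset:

```bash
# Print the array hierarchy and shapes of the converted dataset
# (path is illustrative; substitute your own Zarr dataset).
python -c "import zarr; root = zarr.open('data_policy/CloseBoxFrankaL0_100.zarr', mode='r'); print(root.tree())"
```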
## Switching between Joint Position and End Effector Control

- **Joint Position Control**: Set both `obs_space` and `act_space` to `joint_pos`.
- **End Effector Control**: Set both `obs_space` and `act_space` to `ee`. You may use `delta_ee=1` for delta mode or `delta_ee=0` for absolute positioning.

Adjust the relevant configuration parameters in `roboverse_learn/algorithms/diffusion_policy/diffusion_policy/config/robot_dp.yaml`. An end-effector example command is shown below.
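For instance, to train with end-effector observations and actions using the same CloseBox data as the earlier example (the task name, demonstration count, and epoch count are reused from that example; the choice of delta mode here is illustrative):

```bash
# Same CloseBox example as above, but with end-effector obs/actions and delta control enabled.
bash roboverse_learn/algorithms/diffusion_policy/train_dp.sh roboverse_demo/demo_isaaclab/CloseBox-Level0/robot-franka CloseBoxFrankaL0 100 0 200 ee ee 1
```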
## Evaluation

To deploy and evaluate the trained policy:

```bash
python roboverse_learn/eval.py --task CloseBox --algo diffusion_policy --num_envs <num_envs> --checkpoint_path <checkpoint_path>
```

Up to roughly 50 environments works on an RTX GPU. Ensure that `<checkpoint_path>` points to the trained model checkpoint file, i.e. `info/outputs/DP/...`