# Diffusion Policy
## Installation

```shell
cd roboverse_learn/il/utils/diffusion_policy
pip install -e .
cd ../../../../
pip install pandas wandb
```
Register for a Weights & Biases (wandb) account to obtain an API key.
## Workflow
### Step 1: Collect and preprocess data

```shell
./roboverse_learn/il/collect_demo.sh
```

`collect_demo.sh` collects demonstrations (i.e., metadata) using `~/RoboVerse/scripts/advanced/collect_demo.py` and converts the metadata into Zarr format for efficient dataloading. The script handles both joint-position and end-effector action and observation spaces.
Outputs: the demonstration metadata is written to the directory given by `metadata_dir`; the converted dataset is stored in `~/RoboVerse/data_policy`.
#### Parameters
| Argument | Description | Example |
|---|---|---|
| | Name of the task | |
| | Name of the selected simulator | |
| | Maximum index of demos to collect | |
| | Number of expert demonstrations to process | |
| `metadata_dir` | Path to the directory containing demonstration metadata saved by `collect_demo` | |
| | Type of action space to use (`joint_pos` or `ee`) | |
| | Type of observation space to use (`joint_pos` or `ee`) | |
| | (optional) Delta control (0: absolute, 1: delta; default 0) | |
| | User-defined name | |
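For efficient dataloading, the converted Zarr dataset typically concatenates every episode's timesteps into flat arrays and records episode boundaries separately (this is the convention used by the diffusion_policy replay buffer). The indexing scheme can be sketched in numpy; the array names here are illustrative, not the exact on-disk keys:

```python
import numpy as np

# Hypothetical stand-in for the converted dataset: all timesteps from all
# episodes concatenated along axis 0, plus an index of episode end offsets.
state = np.arange(20, dtype=np.float32).reshape(10, 2)  # 10 steps, 2-dim obs
action = np.ones((10, 2), dtype=np.float32)
episode_ends = np.array([4, 10])  # episode 0 = steps [0, 4), episode 1 = [4, 10)

def get_episode(arr: np.ndarray, i: int, episode_ends: np.ndarray) -> np.ndarray:
    """Slice out episode i from a concatenated array."""
    start = 0 if i == 0 else episode_ends[i - 1]
    return arr[start:episode_ends[i]]

ep0_obs = get_episode(state, 0, episode_ends)    # shape (4, 2)
ep1_act = get_episode(action, 1, episode_ends)   # shape (6, 2)
```

This layout lets a dataloader sample fixed-length training windows anywhere inside an episode without loading whole episodes into memory.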
### Step 2: Training and evaluation

```shell
./roboverse_learn/il/dp/dp_run.sh
```

`dp_run.sh` uses `roboverse_learn/il/dp/main.py` and the generated Zarr data (stored in the `data_policy/` directory) to train and evaluate the DP model.

Outputs: training results are stored in `~/RoboVerse/info/outputs/DP`; evaluation results are stored in `~/RoboVerse/tmp`.
#### Parameters
| Argument | Description | Example |
|---|---|---|
| | Name of the task | |
| | Name of the selected simulator | |
| | ID of the GPU to use | |
| | Enable training | |
| | Enable evaluation | |
| | Training/inference algorithm (0: DDPM, 1: DDIM, 2: flow matching, 3: score-based) | |
| | Path to the trained DP model; must be set when evaluation runs without training | |
| | Path to the Zarr dataset | |
| | Random seed for reproducibility | |
| | Number of training epochs | |
| `obs_space` | Observation type (`joint_pos` or `ee`) | |
| `act_space` | Action type (`joint_pos` or `ee`) | |
| `delta_ee` | Delta control mode (0: absolute, 1: delta) | |
| | GPU device to use | |
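The algorithm switch (0: DDPM, 1: DDIM, 2: flow matching, 3: score-based) selects the diffusion formulation. As background, here is a minimal numpy sketch of the DDPM forward (noising) process that this style of training is built on; the schedule values are illustrative, not taken from this repo's config:

```python
import numpy as np

# DDPM forward process: x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps.
# Linear beta schedule; T and the beta range are illustrative values.
T = 1000
betas = np.linspace(1e-4, 0.02, T)
alpha_bar = np.cumprod(1.0 - betas)  # decreases from ~1 toward ~0

def q_sample(x0: np.ndarray, t: int, rng: np.random.Generator) -> np.ndarray:
    """Sample x_t ~ q(x_t | x_0) for an action chunk of shape (horizon, dim)."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps

rng = np.random.default_rng(0)
x0 = np.zeros((16, 7))         # e.g. a 16-step chunk of 7-DoF joint actions
xT = q_sample(x0, T - 1, rng)  # at t = T-1 this is essentially unit Gaussian noise
```

The denoising network is trained to predict `eps` from `x_t`; DDIM reuses the same trained model with a deterministic, fewer-step sampler, while the flow-matching and score-based options change the training objective.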
Important Parameter Overrides:
`horizon`, `n_obs_steps`, and `n_action_steps` are set directly in `dp_runner.sh` and override the YAML configurations.
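These three values implement receding-horizon control: the policy conditions on the last `n_obs_steps` observations, predicts a `horizon`-length action sequence, and executes only the first `n_action_steps` before re-planning. A minimal Python sketch (the values and the dummy predictor are illustrative):

```python
# Illustrative values; the real ones are set in dp_runner.sh and
# override the YAML configuration.
horizon, n_obs_steps, n_action_steps = 16, 2, 8

def plan_step(obs_history, predict):
    """One receding-horizon step: condition on the last n_obs_steps
    observations, predict `horizon` actions, execute the first n_action_steps."""
    obs_window = obs_history[-n_obs_steps:]   # conditioning context
    plan = predict(obs_window)                # horizon-length action sequence
    assert len(plan) == horizon
    return plan[:n_action_steps]              # the actions actually executed

# Dummy predictor standing in for the trained diffusion policy.
dummy_predict = lambda obs_window: list(range(horizon))

executed = plan_step([0.1, 0.2, 0.3], dummy_predict)  # 8 actions, then re-plan
```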
Switching between Joint Position and End Effector Control:

- Joint Position Control: set both `obs_space` and `act_space` to `joint_pos`.
- End Effector Control: set both `obs_space` and `act_space` to `ee`. You may use `delta_ee=1` for delta mode or `delta_ee=0` for absolute positioning.
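The effect of `delta_ee` can be illustrated with a toy 1-D end-effector trajectory (a numpy sketch of the absolute/delta relationship, not the repo's conversion code):

```python
import numpy as np

# Absolute end-effector positions over four steps (illustrative 1-D case).
abs_actions = np.array([0.10, 0.15, 0.15, 0.30])

# delta_ee=1: each action is the change relative to the previous position.
delta_actions = np.diff(abs_actions, prepend=abs_actions[0])

# delta_ee=0 targets can be recovered by accumulating deltas from the start pose.
recovered = abs_actions[0] + np.cumsum(delta_actions)
```

Delta mode makes the policy predict small local motions, which is often easier to learn but accumulates error over long rollouts; absolute mode predicts world-frame targets directly.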
Adjust the relevant configuration parameters in `roboverse_learn/il/dp/config/.yaml`.