ACT#
ACT (Action Chunking with Transformers) implements a transformer-based VAE policy that predicts a chunk of roughly 100 actions at each step. Overlapping predictions are combined via temporal ensembling into a single action per timestep. The algorithm was introduced in the ALOHA paper, and this implementation follows the original ACT codebase.
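To make the chunking and ensembling behavior concrete, below is a minimal sketch of temporal ensembling under the exponential weighting scheme described in the ACT paper (weights w_i = exp(-m * i), with older predictions weighted more heavily). The buffer layout and function name are illustrative and not taken from this repository:

```python
import numpy as np

def temporal_ensemble(chunk_buffer, t, m=0.01):
    """Combine overlapping action chunks into a single action for timestep t.

    chunk_buffer: chronologically ordered list of (start_step, actions) pairs,
    where `actions` is a (chunk_size, action_dim) array predicted at `start_step`.
    Each chunk that covers timestep t contributes one candidate action; candidates
    are averaged with weights w_i = exp(-m * i), the scheme from the ACT paper.
    """
    candidates, weights = [], []
    for i, (start, actions) in enumerate(chunk_buffer):
        offset = t - start
        if 0 <= offset < len(actions):          # this chunk covers timestep t
            candidates.append(actions[offset])
            weights.append(np.exp(-m * i))      # older predictions get larger weight
    return np.average(np.stack(candidates), axis=0, weights=np.array(weights))
```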
Installation#
cd roboverse_learn/algorithms/act/detr
pip install -e .
cd ../../../
pip install pandas wandb
Option 1: Two Step, Pre-processing and Training#
Data Preparation:#
data2zarr_dp.py converts the metadata saved by the collect_demo script into Zarr format for efficient dataloading. The script supports both joint-position and end-effector action and observation spaces.
Command:
python roboverse_learn/algorithms/data2zarr_dp.py \
--task_name <task_name> \
--expert_data_num <expert_data_num> \
--metadata_dir <metadata_dir> \
--action_space <action_space> \
--observation_space <observation_space> \
--delta_ee <delta_ee>
| Argument | Description | Example |
|---|---|---|
| `task_name` | Name of the task | `CloseBoxFrankaL0` |
| `expert_data_num` | Number of expert demonstrations to process | `100` |
| `metadata_dir` | Path to the directory containing demonstration metadata saved by collect_demo | `roboverse_demo/demo_isaaclab/CloseBox-Level0/robot-franka` |
| `action_space` | Type of action space to use (options: `joint_pos` or `ee`) | `joint_pos` |
| `observation_space` | Type of observation space to use (options: `joint_pos` or `ee`) | `joint_pos` |
| `delta_ee` | (optional) Delta control (0: absolute, 1: delta; default 0) | `0` |
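If you want to sanity-check the converted data before training, the resulting store can be opened with the zarr Python package. The path below is a hypothetical example of a store written under data_policy/, not a guaranteed naming convention:

```python
import zarr

# Hypothetical path: data2zarr_dp.py writes its Zarr store under data_policy/;
# the exact directory name depends on the task name and arguments you passed.
root = zarr.open("data_policy/CloseBoxFrankaL0_100.zarr", mode="r")

# Print the group/array hierarchy (names, shapes, dtypes) to confirm step counts
# and the action/observation dimensions before training.
print(root.tree())
```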
Training:#
roboverse_learn/algorithms/act/train.py trains the ACT model on the generated Zarr data, which is stored in the data_policy/ directory.
Command:
python -m roboverse_learn.algorithms.act.train \
--task_name <task_name> \
--num_episodes <num_episodes> \
--dataset_dir <dataset_dir> \
--policy_class <policy_class> \
--kl_weight <kl_weight> \
--chunk_size <chunk_size> \
--hidden_dim <hidden_dim> \
--batch_size <batch_size> \
--dim_feedforward <dim_feedforward> \
--num_epochs <num_epochs> \
--lr <lr> \
--state_dim <state_dim> \
--seed <seed>
| Argument | Description | Example |
|---|---|---|
| `task_name` | Name of the task | `CloseBoxFrankaL0` |
| `num_episodes` | Number of episodes in the dataset | `100` |
| `dataset_dir` | Path to the Zarr dataset created in the Data Preparation step | |
| `policy_class` | Policy class to use | `ACT` |
| `kl_weight` | Weight for the KL divergence loss | `10` |
| `chunk_size` | Number of actions per chunk | `100` |
| `hidden_dim` | Hidden dimension size for the transformer | `512` |
| `batch_size` | Batch size for training | `8` |
| `dim_feedforward` | Feedforward dimension for the transformer | `3200` |
| `num_epochs` | Number of training epochs | `2000` |
| `lr` | Learning rate | `1e-5` |
| `state_dim` | State dimension (action space dimension) | `9` |
| `seed` | Random seed for reproducibility | `0` |
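For intuition on how `kl_weight` enters training: ACT optimizes a VAE-style objective, an L1 reconstruction loss on the predicted action chunk plus a KL divergence term scaled by this weight. Below is a minimal PyTorch sketch of that objective with illustrative tensor names; it is not the exact code in train.py:

```python
import torch
import torch.nn.functional as F

def act_loss(pred_actions, target_actions, mu, logvar, kl_weight=10.0):
    """VAE-style training objective used by ACT-like policies.

    pred_actions, target_actions: (batch, chunk_size, state_dim) tensors.
    mu, logvar: parameters of the latent Gaussian produced by the encoder.
    """
    # L1 reconstruction loss over the predicted action chunk.
    recon = F.l1_loss(pred_actions, target_actions)
    # KL divergence between the latent posterior and a standard normal prior.
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl_weight * kl
```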
Option 2: Run with Single Command: train_act.sh#
We also wrap data preparation and training into a single command, train_act.sh. This keeps the data preparation and training parameters consistent, especially the action space, observation space, and data directory.
bash roboverse_learn/algorithms/act/train_act.sh <metadata_dir> <task_name> <expert_data_num> <gpu_id> <num_epochs> <obs_space> <act_space> [<delta_ee>]
| Argument | Description | Example |
|---|---|---|
| `metadata_dir` | Path to the directory containing demonstration metadata saved by collect_demo | `roboverse_demo/demo_isaaclab/CloseBox-Level0/robot-franka` |
| `task_name` | Name of the task | `CloseBoxFrankaL0` |
| `expert_data_num` | Number of expert demonstrations to use | `100` |
| `gpu_id` | ID of the GPU to use | `0` |
| `num_epochs` | Number of training epochs | `2000` |
| `obs_space` | Observation space (`joint_pos` or `ee`) | `joint_pos` |
| `act_space` | Action space (`joint_pos` or `ee`) | `joint_pos` |
| `delta_ee` | Optional: delta control (`0`: absolute, `1`: delta; default `0`) | `0` |
Example:
bash roboverse_learn/algorithms/act/train_act.sh roboverse_demo/demo_isaaclab/CloseBox-Level0/robot-franka CloseBoxFrankaL0 100 0 2000 joint_pos joint_pos
Important Parameter Overrides:
Key hyperparameters, including `kl_weight` (set to 10), `chunk_size` (set to 100), `hidden_dim` (set to 512), `batch_size` (set to 8), `dim_feedforward` (set to 3200), and `lr` (set to 1e-5), are set directly in train_act.sh. `state_dim` is set to 9 by default, which works for both Franka joint space and end-effector space. Notably, `chunk_size` is the most important parameter; it defaults to 100 actions per step.
Switching between Joint Position and End Effector Control#
- Joint Position Control: Set both `obs_space` and `act_space` to `joint_pos`.
- End Effector Control: Set both `obs_space` and `act_space` to `ee`. You may use `delta_ee=1` for delta mode or `delta_ee=0` for absolute positioning (see the sketch below).

Note that the original ACT paper uses a 14-dimensional joint action space, but we modify the code to accept a parameterized action dimensionality `state_dim`, passed into the training Python script, which we default to 9 for Franka joint space or end-effector space.
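As a rough illustration of the difference between the two end-effector modes, here is a sketch of how a position command could be interpreted under each setting; the function and variable names are illustrative and this is not the repo's actual conversion code:

```python
import numpy as np

def resolve_ee_target(current_ee_pos, action, delta_ee):
    """Illustrative only: interpret a 3-D end-effector position command.

    delta_ee = 0: the action is an absolute target position.
    delta_ee = 1: the action is an offset applied to the current position.
    """
    if delta_ee == 1:
        return current_ee_pos + action   # delta mode: offset from current pose
    return action                        # absolute mode: action is the target itself

# Example: current position plus a small delta along x.
print(resolve_ee_target(np.array([0.4, 0.0, 0.3]), np.array([0.05, 0.0, 0.0]), delta_ee=1))
```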
Evaluation#
To deploy and evaluate the trained policy:
python roboverse_learn/eval.py --task CloseBox --algo ACT --num_envs <up to ~50 envs works on RTX> --checkpoint_path <save_directory>
Ensure that `<save_directory>` points to the directory containing your trained model checkpoint, which should be saved under info/outputs/ACT/...