ACT#
ACT (Action Chunking with Transformers) implements a transformer-based VAE policy that generates a chunk of roughly 100 actions at each step. The chunked predictions are combined via temporal ensembling into a single action to execute. The algorithm was introduced in the ALOHA paper, and this module reuses the original implementation.
Key features:
- An action chunking strategy is used to avoid compounding errors.
- Temporal ensembling helps prevent jerky robot motion (a sketch of this step follows the list).
- A generative model, i.e., a conditional variational autoencoder, handles the stochasticity of human demonstration data.
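To make the temporal ensembling step concrete, here is a minimal sketch of the exponential-weight averaging described in the ACT paper (weights exp(-m*i), with i = 0 indexing the oldest prediction). The buffer layout and the populated-row check are illustrative assumptions, not this repo's exact evaluation code.

```python
# Minimal sketch of ACT-style temporal ensembling (illustrative only).
# Every past chunk that covers timestep t contributes its prediction for t;
# the predictions are averaged with exponential weights exp(-m * i),
# where i = 0 indexes the oldest prediction.
import numpy as np

def temporal_ensemble(all_time_actions, t, m=0.01):
    """all_time_actions: (num_steps, num_steps + chunk_size, action_dim) buffer,
    where row s holds the chunk predicted at step s in columns s..s+chunk_size-1
    and untouched entries remain zero."""
    preds_for_t = all_time_actions[:, t]                # every prediction made for step t
    populated = np.any(preds_for_t != 0, axis=1)        # rows that actually predicted step t
    preds_for_t = preds_for_t[populated]
    weights = np.exp(-m * np.arange(len(preds_for_t)))  # oldest prediction gets the largest weight
    weights /= weights.sum()
    return (preds_for_t * weights[:, None]).sum(axis=0)
```

As described in the paper, m controls how quickly new observations are incorporated: a smaller m shifts weight toward recent predictions, while a larger m weights older predictions more heavily.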
Installation#
```bash
cd roboverse_learn/il/utils/act/detr
pip install -e .
cd ../../../../../
pip install pandas wandb
```
Workflow#
Step 1: Collect and pre-process data#
```bash
./roboverse_learn/il/collect_demo.sh
```
collect_demo.sh collects demonstrations (metadata) using ~/RoboVerse/scripts/advanced/collect_demo.py and converts the metadata into Zarr format for efficient dataloading. The script handles both joint-position and end-effector action and observation spaces.
Outputs: the metadata directory is stored in metadata_dir, and the converted dataset is stored in ~/RoboVerse/data_policy.
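The exact layout of the converted Zarr store is determined by the conversion script; as a rough illustration of how to inspect it, the sketch below assumes a common replay-buffer-style layout (the group/array names `data/state`, `data/action`, `meta/episode_ends` and the store path are hypothetical, not verified against this repo):

```python
# Rough sketch for inspecting a converted Zarr dataset.
# The group/array names below are assumptions based on a common replay-buffer
# layout; check the conversion script for the names it actually writes.
import zarr

root = zarr.open("data_policy/CloseBox_franka_100.zarr", mode="r")  # hypothetical path
print(root.tree())                        # show the group/array hierarchy

states = root["data/state"]               # e.g. (total_steps, state_dim)
actions = root["data/action"]             # e.g. (total_steps, action_dim)
episode_ends = root["meta/episode_ends"]  # cumulative step index where each episode ends

print(states.shape, actions.shape, len(episode_ends))
```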
Parameters:#
| Argument | Description | Example |
|---|---|---|
|  | Name of the task |  |
|  | Name of the selected simulator |  |
|  | Maximum index of demos to collect |  |
|  | Number of expert demonstrations to process |  |
|  | Path to the directory containing demonstration metadata saved by collect_demo |  |
| `act_space` | Type of action space to use (options: `joint_pos` or `ee`) |  |
| `obs_space` | Type of observation space to use (options: `joint_pos` or `ee`) |  |
| `delta_ee` | (optional) Delta control (0: absolute, 1: delta; default 0) | 0 |
|  | User-defined name |  |
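To make the `delta_ee` option concrete, the sketch below contrasts absolute and delta end-effector position actions. It is a simplified illustration (positions only, numbers invented); the actual conversion convention lives in the data-processing code.

```python
# Illustration of absolute vs. delta end-effector position actions
# (simplified: positions only; the real pipeline also handles orientation and gripper).
import numpy as np

ee_positions = np.array([   # recorded end-effector positions over 4 steps (meters)
    [0.40, 0.00, 0.30],
    [0.42, 0.01, 0.29],
    [0.44, 0.02, 0.28],
    [0.46, 0.02, 0.27],
])

# delta_ee = 0: the action at step t is the absolute target pose for the next step
absolute_actions = ee_positions[1:]

# delta_ee = 1: the action at step t is the displacement from the current pose
delta_actions = ee_positions[1:] - ee_positions[:-1]

print(absolute_actions[0])  # [0.42 0.01 0.29]
print(delta_actions[0])     # [ 0.02  0.01 -0.01]
```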
Step 2: Training and evaluation#
```bash
./roboverse_learn/il/act/act_run.sh
```
act_run.sh uses roboverse_learn/il/utils/act/train.py and the generated Zarr data (stored in the data_policy/ directory) to train the ACT model, and then runs roboverse_learn/il/act/act_eval_runner.py to evaluate the trained model.
Outputs: training results are stored in ~/RoboVerse/info/outputs/ACT, and evaluation results are stored in ~/RoboVerse/tmp/act.
Parameters:#
| Argument | Description | Example |
|---|---|---|
|  | Name of the task |  |
|  | ID of the GPU to use |  |
|  | Enable training |  |
|  | Enable evaluation |  |
|
Training parameters:#
| Argument | Description | Example |
|---|---|---|
| `num_episodes` | Number of episodes in the dataset | 100 |
|  | Path to the Zarr dataset created in the data-preparation step |  |
|  | Policy class to use |  |
| `kl_weight` | Weight for the KL divergence loss | 10 |
| `chunk_size` | Number of actions per chunk | 100 |
| `hidden_dim` | Hidden dimension size for the transformer | 512 |
| `batch_size` | Batch size for training | 8 |
| `dim_feedforward` | Feedforward dimension for the transformer | 3200 |
| `num_epochs` | Number of training epochs | 100 |
| `lr` | Learning rate | 1e-5 |
| `state_dim` | State dimension (action space dimension) | 9 |
|  | Random seed for reproducibility |  |
Important Parameter Overrides:
Key hyperparameters, including `kl_weight` (set to 10), `chunk_size` (set to 100), `hidden_dim` (set to 512), `batch_size` (set to 8), `dim_feedforward` (set to 3200), and `lr` (set to 1e-5), are set directly in train_act.sh. `state_dim` is set to 9 by default, which works for both Franka joint space and end-effector space. Notably, `chunk_size` is the most important parameter; it defaults to 100 actions per step.

Switching between Joint Position and End Effector Control
- Joint Position Control: set both `obs_space` and `act_space` to `joint_pos`.
- End Effector Control: set both `obs_space` and `act_space` to `ee`. You may use `delta_ee=1` for delta mode or `delta_ee=0` for absolute positioning.

Note: the original ACT paper uses a 14-dimensional joint action space, but we modify the code to accept a parameterized action dimensionality, `state_dim`, passed into the training Python script; it defaults to 9 for both Franka joint space and end-effector space.
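For intuition about `kl_weight`, the ACT objective combines an L1 action-reconstruction term with a KL regularizer on the CVAE latent. The sketch below is a simplified, self-contained version of that loss (function name and shapes are illustrative, not the project's training code):

```python
# Simplified sketch of the ACT/CVAE training objective: L1 action reconstruction
# plus kl_weight * KL(q(z | ...) || N(0, I)). Illustrative only.
import torch
import torch.nn.functional as F

def act_loss(pred_actions, target_actions, mu, logvar, kl_weight=10.0):
    """pred_actions, target_actions: (batch, chunk_size, state_dim);
    mu, logvar: (batch, latent_dim) from the CVAE encoder."""
    l1 = F.l1_loss(pred_actions, target_actions)
    # KL divergence between N(mu, exp(logvar)) and the standard normal prior
    kl = (-0.5 * (1 + logvar - mu.pow(2) - logvar.exp())).sum(dim=-1).mean()
    return l1 + kl_weight * kl
```

A larger `kl_weight` pushes the latent toward the prior (stronger regularization); a smaller one lets the CVAE encode more of the demonstration variability.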
Evaluation parameters:#
| Argument | Description | Example |
|---|---|---|
|  | Evaluation algorithm |  |
|  | Number of environments |  |
|  | Number of evaluated samples |  |
|  | Directory containing your trained model checkpoint |  |
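During evaluation the policy predicts `chunk_size` actions per query; the ablations in the next section vary this value. As a point of reference, a minimal open-loop rollout that consumes one chunk at a time (no temporal ensembling) might look like the sketch below; `policy`, `env`, and the gymnasium-style step API are hypothetical stand-ins rather than this repo's evaluation runner.

```python
# Minimal sketch of open-loop chunked execution at evaluation time.
# `policy` maps an observation to a (chunk_size, action_dim) array and
# `env` follows a gymnasium-style API; both are hypothetical stand-ins.
def rollout_open_loop(policy, env, max_steps=500):
    obs, _ = env.reset()
    steps = 0
    while steps < max_steps:
        action_chunk = policy(obs)        # predict a whole chunk in one forward pass
        for action in action_chunk:       # execute it open-loop, then re-query
            obs, _, terminated, truncated, _ = env.step(action)
            steps += 1
            if terminated or truncated or steps >= max_steps:
                return obs
    return obs
```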
Initial test results#
Task: close_box#
Setup: temporal_agg=True, num_episodes=100, num_epochs=100
| Chunking size | Success rate |
|---|---|
| 1 | 0.17 |
| 10 | 0.17 |
| 20 | 0.76 |
| 40 | 0.55 |
| 60 | 0.51 |
| 80 | 0.56 |
| 100 | 0.09 |
| 120 | 0.66 |
| 140 | 0.53 |
| 160 | 0.54 |
Task: pick_butter#
Setup: temporal_agg=True, num_episodes=100, num_epochs=100
| Chunking size | Success rate |
|---|---|
| 1 | 0 |
| 10 | 0.5 |
| 20 | 0 |
| 40 | 1 |
| 60 | 1 |
| 80 | 0 |
| 100 | 0 |
| 120 | 0 |
| 140 | 0.5 |
| 160 | 0 |