Benchmark Results

The following results come from models trained on 50 episodes and evaluated on a separate set of 50 episodes.

In-distribution Evaluation

Task name    RGB       RGBD      RGBD  RGBD      PointCloud  PointCloud
             resnet18  resnet18  ViT   MultiViT  pointnet    spUnet
CloseBoxL0   0.81      0.91      0.89  0.80      0.82        0.92
CloseBoxL1   0.40      0.58      0.40  0.42      0.73        0.88
CloseBoxL2   0.42      0.30      0.30  0.32      0.82        0.62
StackCubeL0  0.91      0.87      0.06  0.06      0.00        0.00
StackCubeL1  0.01      0.00      0.00  0.00      0.00        0.00
StackCubeL2  0.01      0.00      0.00  0.00      0.00        0.00

Out-of-distribution Evaluation (Zero-shot)

Task name    RGB       RGBD      RGBD  RGBD      PointCloud  PointCloud
             resnet18  resnet18  ViT   MultiViT  pointnet    spUnet
CloseBoxL0   0.52      0.72      0.68  0.80      0.60        0.94
CloseBoxL1   0.20      0.50      0.36  0.34      0.77        0.88
CloseBoxL2   0.32      0.38      0.40  0.32      0.38        0.42
StackCubeL0  0.29      0.19      0.00  0.02      0.00        0.00
StackCubeL1  0.00      0.00      0.00  0.00      0.00        0.00
StackCubeL2  0.00      0.00      0.00  0.00      0.00        0.00
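
To compare the two evaluations directly, one option is to subtract each out-of-distribution score from the matching in-distribution score. The Python sketch below does this with the numbers copied from the two tables above. The helper name `generalization_gap` and the `modality/backbone` column labels are illustrative and not part of the benchmark code. The labels assume the six columns group as one RGB, three RGBD, and two PointCloud backbones, in header order.

```python
# Scores copied from the two tables above, in header-row column order.
# The "modality/backbone" labels are an assumed grouping of the header.
COLUMNS = [
    "RGB/resnet18", "RGBD/resnet18", "RGBD/ViT",
    "RGBD/MultiViT", "PointCloud/pointnet", "PointCloud/spUnet",
]

IN_DIST = {
    "CloseBoxL0": [0.81, 0.91, 0.89, 0.80, 0.82, 0.92],
    "CloseBoxL1": [0.40, 0.58, 0.40, 0.42, 0.73, 0.88],
    "CloseBoxL2": [0.42, 0.30, 0.30, 0.32, 0.82, 0.62],
    "StackCubeL0": [0.91, 0.87, 0.06, 0.06, 0.00, 0.00],
    "StackCubeL1": [0.01, 0.00, 0.00, 0.00, 0.00, 0.00],
    "StackCubeL2": [0.01, 0.00, 0.00, 0.00, 0.00, 0.00],
}

OUT_DIST = {
    "CloseBoxL0": [0.52, 0.72, 0.68, 0.80, 0.60, 0.94],
    "CloseBoxL1": [0.20, 0.50, 0.36, 0.34, 0.77, 0.88],
    "CloseBoxL2": [0.32, 0.38, 0.40, 0.32, 0.38, 0.42],
    "StackCubeL0": [0.29, 0.19, 0.00, 0.02, 0.00, 0.00],
    "StackCubeL1": [0.00, 0.00, 0.00, 0.00, 0.00, 0.00],
    "StackCubeL2": [0.00, 0.00, 0.00, 0.00, 0.00, 0.00],
}


def generalization_gap(task):
    """Return {column: in-distribution score minus out-of-distribution score}."""
    return {
        col: round(in_score - out_score, 2)
        for col, in_score, out_score in zip(COLUMNS, IN_DIST[task], OUT_DIST[task])
    }


if __name__ == "__main__":
    # One line per task; a positive value means the score dropped out of distribution.
    for task in IN_DIST:
        gaps = generalization_gap(task)
        print(task, " ".join(f"{gaps[col]:+.2f}" for col in COLUMNS))
```

A positive gap means the score fell under the out-of-distribution setting. A negative gap means the out-of-distribution score was higher, as with spUnet on CloseBoxL0.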