# Prepare Data

If you are not a developer, you can skip this guide: the required data is downloaded automatically from HuggingFace when you run the code.

At the current stage, the HuggingFace dataset is private, so you need a token to access it. See tips for using HuggingFace [here](../developer_guide/tips/huggingface.md).

## Data on HuggingFace

Download the release-ready data by running:

```bash
git clone git@hf.co:datasets/RoboVerseOrg/roboverse_data
```

````{note}
If you haven't used Git LFS on your machine, run the following commands to install it:
```bash
sudo apt install git-lfs
git lfs install
```
````

### File structure

```text
roboverse_data
|-- robots
    |-- {robot}
        |-- usd
            |-- *.usd
        |-- urdf
            |-- *.urdf
        |-- mjcf
            |-- *.xml
|-- trajs
    |-- {benchmark}
        |-- {task}
            |-- v2
                |-- {robot}_v2.pkl
|-- assets
    |-- {benchmark}
        |-- {task}
            |-- COMMON
            |-- {object}
                |-- textures
                    |-- *.png
                |-- usd
                    |-- *.usd
                |-- urdf
                    |-- *.urdf
                |-- mjcf
                    |-- *.xml
|-- materials
    |-- arnold
        |-- {Category}
            |-- *.mdl
    |-- vMaterial_2
        |-- {Category}
            |-- *.mdl
    |-- ...
|-- scenes
    |-- {source} (e.g. arnold, physcene)
        |-- {scene}
            |-- textures
                |-- *.png
            |-- usd
                |-- *.usd
            |-- urdf
                |-- *.urdf
            |-- mjcf
                |-- *.xml

metasim
|-- data
    |-- quickstarts
        |-- ...
    |-- robots (deprecated)
        |-- {robot}
            |-- usd
                |-- *.usd
            |-- urdf
                |-- *.urdf
            |-- mjcf
                |-- *.xml
```

Explanation:
- When assets are reused in multiple tasks, they can be stored in a `COMMON` folder to avoid redundancy.

Naming convention:
- `{benchmark}` is in lowercase
- `{task}` is in snake_case
- `{robot}` is in snake_case
- `{object}` is in snake_case

## Data on Google Drive (Deprecated)

1. Download the data from [here](https://drive.google.com/drive/folders/1ORMP3__KIlXettN8eUCF3YQNybZQxzkw) and put it under `./data`.
2. Download the converted isaaclab data from [here](https://drive.google.com/drive/folders/1nF-5SU4nC6S_vgC_NL7E7Rt63Zl9sYqW) and put it under `./data_isaaclab`.
3. Link `./data/source_data` to `./data_isaaclab/source_data`:
    ```bash
    cd data_isaaclab
    ln -s ../data/source_data source_data
    ```

### Sync data with Google Drive

We recommend using rclone to sync the data with Google Drive.

#### Setup rclone

Create an rclone config by following the [official guide](https://rclone.org/drive/). During setup, you need to create your own client ID and token as described [here](https://rclone.org/drive/#making-your-own-client-id). The final rclone config should look like this:

```
[roboverse]
type = drive
client_id =
client_secret =
scope = drive
token =
team_drive =
root_folder_id = 1ORMP3__KIlXettN8eUCF3YQNybZQxzkw

[roboverse_isaaclab]
type = drive
client_id =
client_secret =
scope = drive
token =
team_drive =
root_folder_id = 1nF-5SU4nC6S_vgC_NL7E7Rt63Zl9sYqW
```

#### Download data

```bash
rclone copy -P roboverse: data --exclude='demo/**'
rclone copy -P roboverse_isaaclab: data_isaaclab
```

To speed up the download, you can set `--multi-thread-streams=16` (the default is 4).

#### Upload data

For example, if you are a developer migrating the rlbench task, you can upload the assets and source demos to Google Drive by running:

```bash
rclone copy -P data/assets/rlbench roboverse_isaaclab:assets/rlbench --exclude='.thumbs/**'
rclone copy -P data/source_data/rlbench roboverse:data/source_data --include='trajectory-unified.pkl'
```

You ***should*** first replace `-P` with `-n` (dry run) to check what would be transferred before uploading.
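
The dry-run check before uploading could be wrapped in a small shell function so that it is hard to skip. The sketch below is a hypothetical helper, not part of RoboVerse or rclone: the function name `upload_checked`, the `RCLONE` override, and the `CONFIRM` variable are assumptions made for illustration. Only `rclone copy`, `-n`, and `-P` come from rclone itself.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of RoboVerse or rclone): preview an upload
# with a dry run, and only transfer when CONFIRM=yes is set.

# RCLONE can be overridden, e.g. to point at a specific rclone binary.
RCLONE="${RCLONE:-rclone}"

upload_checked() {
    if [ "$#" -lt 2 ]; then
        echo "usage: upload_checked SRC DST [rclone filter flags...]" >&2
        return 2
    fi
    local src="$1" dst="$2"
    shift 2

    # Step 1: -n (--dry-run) reports what would be copied without copying it.
    "$RCLONE" copy -n "$src" "$dst" "$@" || return 1

    # Step 2: run the real transfer only on explicit confirmation.
    if [ "${CONFIRM:-no}" = "yes" ]; then
        "$RCLONE" copy -P "$src" "$dst" "$@"
    else
        echo "Dry run only; set CONFIRM=yes to upload." >&2
    fi
}
```

For instance, `upload_checked data/assets/rlbench roboverse_isaaclab:assets/rlbench --exclude='.thumbs/**'` only previews the transfer, and prefixing the same call with `CONFIRM=yes` performs it after the preview succeeds.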