Architecture Overview#

Metasim Overview#

Metasim is a standalone simulator layer designed to provide a unified interface to different underlying physics backends (e.g. MuJoCo, Isaac). It is simulator-agnostic, and only contains code and configuration necessary for simulating scenes and extracting structured state information.

Its design principles:

  1. Metasim is a standalone simulation interface that supports multiple use cases.

  2. Configurations only describe static, simulator-related properties.

  3. New tasks should be easy to migrate or implement from scratch without modifying simulator logic.


Directory Structure#

Inside the metasim/ folder:

| Folder/File | Description |
| --- | --- |
| cfg/ | Static Python configs that define simulation-related properties, such as robot models, scenes, objects, and task setups. |
| sim/ | Simulator-specific handlers. Each simulator has a handler that defines how to set, step, and get state. |
| scripts/ | Runnable tools that operate within Metasim, e.g. trajectory replay and asset conversion. |
| test/ | Consistency tests for debugging handler behavior. In particular, they check that state information does not change after a get_state() / set_state() round trip. |
| utils/ | Shared utility functions that Metasim uses internally. |
| constants.py / types.py | Global definitions for enums and shared constants used throughout Metasim. |


Core Components#

The two most important folders in Metasim are:

  1. sim folder — Simulator Adapters

    The sim/ module defines simulation-specific handlers that bridge between low-level simulators (like MuJoCo, IsaacGym) and RoboVerse’s unified task interface.

    Each simulator implements a handler class (e.g., MJXHandler, IsaacHandler) by inheriting from BaseSimHandler. These handlers are responsible for loading assets, stepping physics, setting/resetting environment state, and extracting structured state for upper layers.


    Handler Lifecycle

    Every handler follows a common lifecycle:

    1. Initialization (__init__): Receives a ScenarioCfg which includes simulation metadata such as robots, objects, sensors, lights, task checker, etc. It extracts these components and stores useful references like self.robots, self.cameras, and self.object_dict.

    2. Launch (launch): This function builds the simulator model (e.g., loading MJCF/URDF files), compiles it, allocates buffers, and optionally initializes renderers or viewers.

    3. Close (close): Releases all simulator resources, such as viewers, renderers, or GPU memory buffers.
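The lifecycle above can be sketched in a few lines. This is only an illustration: the `ScenarioCfg` and `BaseSimHandler` classes below are simplified, hypothetical stand-ins written for this example (they are not imported from Metasim), and `ToyHandler` does no real physics.

```python
from dataclasses import dataclass, field


@dataclass
class ScenarioCfg:
    """Hypothetical stand-in holding only the fields this sketch needs."""
    robots: list = field(default_factory=list)
    objects: list = field(default_factory=list)
    cameras: list = field(default_factory=list)


class BaseSimHandler:
    """Hypothetical stand-in for the base class named above."""

    def __init__(self, scenario: ScenarioCfg):
        # Initialization: keep references to the static scene description.
        self.scenario = scenario
        self.robots = scenario.robots
        self.cameras = scenario.cameras
        self.object_dict = {obj["name"]: obj for obj in scenario.objects}
        self.launched = False

    def launch(self):
        raise NotImplementedError

    def close(self):
        raise NotImplementedError


class ToyHandler(BaseSimHandler):
    def launch(self):
        # A real backend would load MJCF/URDF files, compile the model,
        # and allocate buffers here.
        self.buffers = {name: [0.0, 0.0, 0.0] for name in self.object_dict}
        self.launched = True

    def close(self):
        # Release whatever launch() allocated.
        self.buffers = {}
        self.launched = False


handler = ToyHandler(ScenarioCfg(robots=["franka"], objects=[{"name": "cube"}]))
handler.launch()
launched_after_launch = handler.launched
handler.close()
```

Keeping construction (`__init__`) separate from resource allocation (`launch`) means a handler can be created and inspected without touching the simulator.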


    Key Interface Functions

    1. get_state() → TensorState

    Purpose: Extracts structured simulator state for all robots, objects, and cameras into a unified TensorState data structure.

    This includes:

    • Root position/orientation of each object

    • Joint positions & velocities

    • Actuator states

    • Camera outputs (RGB / depth)

    • Optional per-task “extras”

    It supports multi-env batched extraction, and ensures consistent structure across backends.


    2. set_state(ts: TensorState)

    Purpose: Restores or manually sets the simulator state using a full TensorState snapshot.

    This is often used for:

    • Episode resets to a known state

    • State injection during training

    • Replaying trajectories

    Internally, this maps the unified TensorState back to simulator-specific structures (qpos, qvel, ctrl, etc.).
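A minimal sketch of the get_state() / set_state() round trip follows, which is also the property the consistency tests in test/ are described as checking. `ToyState` is a hypothetical, much-simplified stand-in for TensorState (plain lists, a single environment), and `ToyHandler` is invented for this example.

```python
from dataclasses import dataclass, field


@dataclass
class ToyState:
    """Hypothetical, simplified stand-in for TensorState."""
    root_pos: dict = field(default_factory=dict)   # object name -> [x, y, z]
    joint_pos: dict = field(default_factory=dict)  # robot name -> joint angles


class ToyHandler:
    def __init__(self):
        # Backend-specific storage, loosely analogous to qpos in MuJoCo.
        self._qpos = {"cube": [0.0, 0.0, 0.1], "arm": [0.0, 0.5]}

    def get_state(self) -> ToyState:
        # Map backend storage to the unified structure (copies, not views).
        return ToyState(
            root_pos={"cube": list(self._qpos["cube"])},
            joint_pos={"arm": list(self._qpos["arm"])},
        )

    def set_state(self, ts: ToyState) -> None:
        # Map the unified structure back to backend storage.
        self._qpos["cube"] = list(ts.root_pos["cube"])
        self._qpos["arm"] = list(ts.joint_pos["arm"])

    def simulate(self) -> None:
        # Toy "physics": the cube sinks a little on every call.
        self._qpos["cube"][2] -= 0.01


handler = ToyHandler()
snapshot = handler.get_state()
handler.simulate()
moved = handler.get_state()
handler.set_state(snapshot)
restored = handler.get_state()
```

Returning copies from `get_state()` is a deliberate choice in this sketch: a snapshot that aliased the backend buffers would silently change when the simulation advances.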


    3. simulate()

    Purpose: Executes the physics update (step function) in the simulator.

    This is typically called after applying actions or updating the state. It may involve multiple substeps (based on decimation rate) and handles model-specific quirks.
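The relationship between one simulate() call and its physics substeps can be sketched as below. The attribute names `dt` and `decimation` and the private `_physics_step` helper are assumptions made for this example; actual handlers may organize substepping differently.

```python
class ToyHandler:
    def __init__(self, dt: float = 0.002, decimation: int = 5):
        self.dt = dt                  # physics timestep in seconds
        self.decimation = decimation  # physics substeps per simulate() call
        self.sim_time = 0.0
        self.substeps_run = 0

    def _physics_step(self) -> None:
        # A real backend would advance its physics engine here.
        self.sim_time += self.dt
        self.substeps_run += 1

    def simulate(self) -> None:
        # One control step spans `decimation` physics substeps.
        for _ in range(self.decimation):
            self._physics_step()


handler = ToyHandler(dt=0.002, decimation=5)
handler.simulate()
```

With these values, one simulate() call advances simulated time by dt × decimation.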


    4. get_extras(env_ids=None) → dict[str, Tensor]

    Purpose: Returns task-specific, non-standard information not present in the core TensorState.

    Examples include:

    • Site positions

    • Contact forces

    • Body mass

    • IMU sensor data

      ……

    Usage overview

    The full pipeline looks like this:

    Task.extra_spec()                       # Declares what is needed
            │
            ▼
    SimHandler.get_extras()                 # Called by RL wrapper
            │
            ▼
    SimHandler.query_derived_obs(spec)      # Parses query dict
            │
            ▼
    Querier.query(query_obj, handler)       # Resolves each field
    

    Task-level declaration

    from metasim.cfg.query_type import SitePos, SensorData, GeomCollision, BodyMass
    
    def extra_spec(self):
        return {
            "head_pos"        : SitePos(["head"]),
            "gyro_torso"      : SensorData("gyro_torso"),
            "torso_mass"      : BodyMass("torso"),
            "left_foot_touch" : GeomCollision("left_foot", "floor"),
        }
    

    Output from get_extras()

    The returned dictionary will look like:

    {
        "head_pos":        Tensor of shape (N_env, 3),
        "gyro_torso":      Tensor of shape (N_env, 3),
        "torso_mass":      Tensor of shape (N_env,),  # scalar per env
        "left_foot_touch": Tensor of shape (N_env,),  # bool mask
    }
    

    Each value is resolved independently via the corresponding query type and handler logic.
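One way to picture the "resolved independently per query type" step is a dispatch table keyed by query class. The sketch below is an assumption about the shape of that logic, not the actual Querier implementation: the query classes are re-declared locally (with `SitePos` taking a single name rather than a list, for brevity), the handler data is invented, and nested lists stand in for tensors.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SitePos:
    site: str


@dataclass(frozen=True)
class BodyMass:
    body: str


class ToyHandler:
    """Hypothetical handler holding two environments' worth of toy data."""
    num_envs = 2
    site_xyz = {"head": [[0.0, 0.0, 1.2], [0.1, 0.0, 1.2]]}
    body_mass = {"torso": [8.0, 8.5]}


# One resolver per query type; real backends would read simulator buffers.
RESOLVERS = {
    SitePos: lambda q, h: h.site_xyz[q.site],
    BodyMass: lambda q, h: h.body_mass[q.body],
}


def query(query_obj, handler):
    # Look up the resolver by the query object's class.
    return RESOLVERS[type(query_obj)](query_obj, handler)


def get_extras(spec: dict, handler) -> dict:
    # Each field is resolved independently of the others.
    return {name: query(q, handler) for name, q in spec.items()}


spec = {"head_pos": SitePos("head"), "torso_mass": BodyMass("torso")}
extras = get_extras(spec, ToyHandler())
```

Under this design, supporting a new kind of extra means adding one query class and one resolver, with no change to tasks that do not request it.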

  2. cfg folder — Simulator Configuration

    What Belongs in Config

    Each config file under cfg/ specifies only information required to build and launch the simulation. This includes:

    | Key Section | Purpose |
    | --- | --- |
    | robots | List of robot instances, including model path (e.g. MJCF or URDF), initial pose, joint limits, etc. |
    | objects | Static or dynamic scene objects, such as tables, cubes, buttons. Each has position, type, and optional fixations. |
    | lights | Light source settings for visual fidelity or vision-based tasks (e.g. color, direction, intensity). |
    | cameras | Camera positions and intrinsics, e.g., for RGB, depth, or offscreen rendering. |
    | scene | Ground plane, friction, or other high-level environment descriptors. |
    | sim_params | Physics timestep, solver config, gravity toggle, etc. |
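As a rough illustration of a config that carries only static, simulator-related information, the sketch below uses plain dataclasses. Every class and field name here is hypothetical and chosen to mirror the sections listed above; the real config classes in cfg/ may differ, and the model path is a placeholder.

```python
from dataclasses import dataclass, field


@dataclass
class RobotCfg:
    name: str
    model_path: str                      # e.g. an MJCF or URDF file
    init_pos: tuple = (0.0, 0.0, 0.0)


@dataclass
class ObjectCfg:
    name: str
    kind: str                            # e.g. "cube", "table"
    pos: tuple = (0.0, 0.0, 0.0)
    fix_base: bool = False               # whether the object is fixed in place


@dataclass
class SimParamsCfg:
    dt: float = 0.002                    # physics timestep in seconds
    gravity: bool = True


@dataclass
class ScenarioCfg:
    robots: list = field(default_factory=list)
    objects: list = field(default_factory=list)
    lights: list = field(default_factory=list)
    cameras: list = field(default_factory=list)
    sim_params: SimParamsCfg = field(default_factory=SimParamsCfg)


scenario = ScenarioCfg(
    robots=[RobotCfg(name="arm", model_path="path/to/arm.xml")],
    objects=[ObjectCfg(name="table", kind="table", fix_base=True)],
)
```

Note what is absent: there is no reward, observation, or termination field anywhere in the structure.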


    What Does Not Belong in Config

    To keep cfg/ clean and portable across tasks and RL settings, the following things are explicitly excluded:

    • Reward functions

    • Observation definitions

    • Success checkers

    • Task-level logic or termination conditions

    • Algorithm-specific parameters (policy type, optimizer, etc.)

    These should all live in the upper-level wrappers in RoboVerse Learn.


    Integration with ScenarioCfg

    Every handler is initialized with a ScenarioCfg object parsed from these configs. The ScenarioCfg aggregates all static config elements (robot, objects, lights, etc.), and passes them to the simulation backend during launch.

    This decoupling ensures that you can:

    • Reuse one config across multiple RL tasks

    • Load the same config for visualization, trajectory replay, or debugging

    • Build new tasks without touching simulator configs

RoboVerse Learn Overview#

RoboVerse Learn consists of Task Wrappers and a Learning Framework.
Its goal is to present one standard interface that:

  • Lets any algorithm (PPO, SAC, BC, etc.) work with any task

  • Hides simulator & task differences, so you can swap tasks, simulators or algorithms with minimal friction


Design Principles#

| # | Principle | Key Points |
| --- | --- | --- |
| 1 | Standardised Wrapper API | • TaskWrapper exposes step / reset / _reward / _observation / _success. • Once an algorithm is connected to a single TaskWrapper, it can switch to any other task simply by replacing the wrapper. • Upper‑level algorithms need not care whether the backend is MuJoCo, Isaac, etc. |
| 2 | Minimise Task‑Migration Cost | • Add a task: just subclass / compose a wrapper. • Switch simulator: wrappers/algorithms stay unchanged. • Directory layout, config management (except the sim-related part), and training scripts all stay the same. |
| 3 | Reusable Reward & Checker Primitives | • Tasks build complex logic by composing primitives → no copy‑paste across tasks. |
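To make the third principle concrete, here is a small sketch of two tasks sharing the same reward and checker primitives. All function names, weights, and tolerances are invented for this example and are not part of RoboVerse.

```python
import math


def reach_reward(pos, target):
    """Primitive: negative Euclidean distance to a target."""
    return -math.dist(pos, target)


def within(pos, target, tol):
    """Primitive checker: True when pos is within tol of target."""
    return math.dist(pos, target) <= tol


def weighted_sum(*terms):
    """Combine (weight, value) pairs into one scalar reward."""
    return sum(w * v for w, v in terms)


def push_task_reward(cube, goal):
    # Task A reuses both primitives with its own weights.
    return weighted_sum(
        (1.0, reach_reward(cube, goal)),
        (10.0, float(within(cube, goal, 0.05))),
    )


def reach_task_success(hand, goal):
    # Task B reuses the same checker with a tighter tolerance.
    return within(hand, goal, 0.01)


reward_at_goal = push_task_reward((0.0, 0.0, 0.0), (0.0, 0.0, 0.0))
far_success = reach_task_success((0.0, 0.0, 0.0), (0.0, 0.0, 0.5))
```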


1. Module Composition#

| Sub‑module | Responsibilities |
| --- | --- |
| Task Wrapper | • Combines a Handler & exposes step / reset. • Assembles Reward / Observation / Success. • Provides pre_sim_step & post_sim_step callbacks for task‑level DR. |
| Handler (Metasim) | • set_state / get_state / get_extras unified across engines. • Physics‑level DR (pre_sim_step). • Pure simulator adapter; no algorithm logic. |
| Learning Framework | • Any RL / IL algorithm. • No simulator knowledge. |
| Custom Util Wrapper | • Provides lightweight extensions (e.g., NumPy-to-Torch conversion, first-frame caching) to support logging, preprocessing, or offline data collection without modifying core task logic. |


2. Interface List#

| Method | Purpose |
| --- | --- |
| step(action) | Runs one simulation step: calls pre_sim_step, then handler.simulate(), then post_sim_step; returns (obs, reward, done, info). |
| reset() | Resets the environment, applies reset_callback, and returns the initial observation. |
| pre_sim_step() | (Optional) Hook for task-level domain randomization before simulation. |
| post_sim_step() | (Optional) Hook for post-processing (e.g., observation noise). |
| get_state() / set_state() | Unified simulator-agnostic state interface using TensorState. |
| get_extras(spec) | Returns task-specific quantities (e.g., site poses, contact forces) via query descriptors. |
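The call order inside step() can be sketched as follows. This is a toy under stated assumptions: `ToyHandler` is a one-dimensional stand-in, `apply_action` is a hypothetical method (the interface list does not say how actions reach the handler), and the `calls` list exists only to make the ordering observable.

```python
class ToyHandler:
    """Hypothetical 1-D point that moves by the commanded amount per step."""

    def __init__(self):
        self.x = 0.0
        self.cmd = 0.0

    def apply_action(self, action: float) -> None:  # hypothetical method
        self.cmd = action

    def simulate(self) -> None:
        self.x += self.cmd

    def get_state(self) -> dict:
        return {"x": self.x}

    def set_state(self, state: dict) -> None:
        self.x = state["x"]


class ReachTask:
    """Toy wrapper whose goal is to reach x >= 1.0."""

    def __init__(self, handler):
        self.handler = handler
        self.calls = []  # records the hook order for inspection

    def pre_sim_step(self):
        self.calls.append("pre_sim_step")

    def post_sim_step(self):
        self.calls.append("post_sim_step")

    def _observation(self, state):
        return [state["x"]]

    def _reward(self, state):
        return -abs(1.0 - state["x"])

    def _success(self, state):
        return state["x"] >= 1.0

    def reset(self):
        self.handler.set_state({"x": 0.0})
        return self._observation(self.handler.get_state())

    def step(self, action):
        self.handler.apply_action(action)
        self.pre_sim_step()
        self.handler.simulate()
        self.calls.append("simulate")
        self.post_sim_step()
        state = self.handler.get_state()
        return self._observation(state), self._reward(state), self._success(state), {}


env = ReachTask(ToyHandler())
obs0 = env.reset()
obs, reward, done, info = env.step(0.5)
obs, reward, done, info = env.step(0.5)
```

A learning algorithm that only calls reset() and step() never needs to know which handler sits underneath.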

3. Domain Randomisation Layers#

| Layer | Location | Examples |
| --- | --- | --- |
| Physics‑level | Handler | Friction, mass, light, material |
| Task‑level | Wrapper.pre/post_sim_step() | Action noise, observation noise, initial‑pose jitter |

Rule: Simulator parameters → Handler; task‑coupled noise → Wrapper.
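The rule can be illustrated with a toy split between the two layers. The method name `randomize_physics`, the `pre_sim_step(action)` signature, and the noise ranges are assumptions made for this sketch; only the placement of each kind of randomization follows the table above.

```python
import random


class ToyHandler:
    def __init__(self, rng: random.Random):
        self.rng = rng
        self.friction = 1.0

    def randomize_physics(self) -> None:  # hypothetical hook name
        # Physics-level DR: a simulator parameter, so it lives in the handler.
        self.friction = self.rng.uniform(0.5, 1.5)


class ToyWrapper:
    def __init__(self, handler, rng: random.Random, action_noise: float = 0.05):
        self.handler = handler
        self.rng = rng
        self.action_noise = action_noise

    def pre_sim_step(self, action: float) -> float:
        # Task-level DR: noise coupled to the task, so it lives in the wrapper.
        return action + self.rng.gauss(0.0, self.action_noise)


rng = random.Random(0)  # seeded so runs are repeatable
handler = ToyHandler(rng)
handler.randomize_physics()
wrapper = ToyWrapper(handler, rng)
noisy_action = wrapper.pre_sim_step(1.0)
```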


4. Migrating a New Task into RoboVerse#

We support two ways to bring an external task into the RoboVerse Learn pipeline:

Approach 1: Direct Integration (Quick Migration)#

The fastest way to integrate a new task is to:

  1. Copy the task codebase (from an external repo) into roboverse_learn/

  2. Replace any simulator-specific API calls with Handler equivalents

  3. Convert raw observations into RoboVerse TensorState via get_state()

  4. Move simulator-related config (e.g. robot model path, asset layout, dt, decimation, n_substeps) into ScenarioCfg and Metasim config files

This transforms the original task into a RoboVerse-compatible format while preserving its logic and structure.

Cross-simulator support is now enabled for this task.

Approach 2: Structured Wrapper Integration#

To enable better reuse and cross-task comparison:

  1. Subclass BaseTaskWrapper

  2. Implement standardized interfaces: _reward(), _observation(), _terminated()

  3. Use callbacks (pre_sim_step, post_sim_step, reset_callback) as needed

  4. Leverage existing Handler and ScenarioCfg setup from Approach 1

This approach supports full compatibility with:

  • Multi-task learning benchmarks

  • One-click algorithm switching

  • Clean architectural separation between task, sim, and learning logic
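A sketch of the structured approach follows, assuming a base class that owns the step/reset plumbing while each task fills in three methods. `BaseTaskWrapper` here is a hypothetical re-declaration written for this example, not the real class, and the handler methods `reset` and `apply` are likewise invented.

```python
from abc import ABC, abstractmethod


class BaseTaskWrapper(ABC):
    """Hypothetical base: owns step/reset, tasks implement the rest."""

    def __init__(self, handler):
        self.handler = handler

    def reset_callback(self):  # optional hook, no-op by default
        pass

    def reset(self):
        self.handler.reset()
        self.reset_callback()
        return self._observation()

    def step(self, action):
        self.handler.apply(action)
        self.handler.simulate()
        return self._observation(), self._reward(), self._terminated(), {}

    @abstractmethod
    def _observation(self): ...

    @abstractmethod
    def _reward(self): ...

    @abstractmethod
    def _terminated(self): ...


class CounterHandler:
    """Toy backend whose whole state is a single integer."""

    def reset(self):
        self.n = 0

    def apply(self, action):
        self.pending = action

    def simulate(self):
        self.n += self.pending


class CountToThree(BaseTaskWrapper):
    def _observation(self):
        return self.handler.n

    def _reward(self):
        return 1.0 if self.handler.n == 3 else 0.0

    def _terminated(self):
        return self.handler.n >= 3


def rollout(env, policy, max_steps=10):
    # Knows only reset/step, so any wrapper with this interface plugs in.
    obs, total = env.reset(), 0.0
    for _ in range(max_steps):
        obs, reward, done, _ = env.step(policy(obs))
        total += reward
        if done:
            break
    return obs, total


final_obs, total_reward = rollout(CountToThree(CounterHandler()), policy=lambda obs: 1)
```

Because `rollout` depends only on the wrapper interface, swapping in another task or another handler requires no change to it.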


With either approach, you can quickly benchmark new tasks under different simulators or algorithms — with no boilerplate or duplicate integration.