Architecture Overview#

Metasim Overview#

Metasim is a standalone simulator layer designed to provide a unified interface to different underlying physics backends (e.g. MuJoCo, Isaac). It is simulator-agnostic, and only contains code and configuration necessary for simulating scenes and extracting structured state information.

Its design principles:

- Metasim is a standalone simulation interface that supports multiple use cases.
- Configurations only describe static, simulator-related properties.
- New tasks should be easy to migrate or implement from scratch without modifying simulator logic.

Directory Structure#

Inside the metasim/ folder:

Folder/File	Description
`cfg/`	Contains static Python configs that define simulation-related properties, such as robot models, scenes, objects, and task setups.
`sim/`	Simulator-specific handlers. Each simulator has a handler that defines how to set, step, and get state.
`scripts/`	Includes runnable tools that operate within Metasim — e.g. trajectory replay, asset conversion.
`test/`	Contains consistency tests for debugging handler behavior. In particular, they verify that state information does not change after a `get_state()` / `set_state()` round trip.
`utils/`	Shared utility functions that Metasim uses internally.
`constants.py`, `types.py`	Global definitions for enums and shared constants used throughout Metasim.

Core Components#

The two most important folders in Metasim are:

sim folder — Simulator Adapters

The sim/ module defines simulation-specific handlers that bridge between low-level simulators (like MuJoCo, IsaacGym) and RoboVerse’s unified task interface.

Each simulator implements a handler class (e.g., MJXHandler, IsaacHandler) by inheriting from BaseSimHandler. These handlers are responsible for loading assets, stepping physics, setting/resetting environment state, and extracting structured state for upper layers.

Handler Lifecycle

Every handler follows a common lifecycle:
1. Initialization (__init__): Receives a ScenarioCfg which includes simulation metadata such as robots, objects, sensors, lights, task checker, etc. It extracts these components and stores useful references like self.robots, self.cameras, and self.object_dict.
2. Launch (launch): This function builds the simulator model (e.g., loading MJCF/URDF files), compiles it, allocates buffers, and optionally initializes renderers or viewers.
3. Close (close): Releases all simulator resources, such as viewers, renderers, or GPU memory buffers.
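The lifecycle above can be sketched as follows. This is a minimal illustration, not the actual BaseSimHandler: `ToyScenarioCfg`, `ToyHandler`, the robot name "franka", and the context-manager helpers are all assumptions introduced for the example.

```python
from dataclasses import dataclass, field


@dataclass
class ToyScenarioCfg:
    """Hypothetical stand-in for ScenarioCfg; field names are assumptions."""
    robots: list = field(default_factory=list)
    objects: list = field(default_factory=list)
    cameras: list = field(default_factory=list)


class ToyHandler:
    """Illustrative handler following the init -> launch -> close lifecycle."""

    def __init__(self, scenario: ToyScenarioCfg):
        # Initialization: only store references taken from the scenario.
        self.robots = list(scenario.robots)
        self.cameras = list(scenario.cameras)
        self.object_dict = {name: {"name": name} for name in scenario.objects}
        self.launched = False

    def launch(self) -> None:
        # Launch: a real backend would load MJCF/URDF files, compile the
        # model, and allocate buffers here. This sketch only flips a flag.
        self.launched = True

    def close(self) -> None:
        # Close: a real backend would release viewers, renderers, and buffers.
        self.launched = False

    # Context-manager helpers (an assumption) so close() runs even on errors.
    def __enter__(self):
        self.launch()
        return self

    def __exit__(self, exc_type, exc, tb):
        self.close()
        return False


scenario = ToyScenarioCfg(robots=["franka"], objects=["cube", "table"])
with ToyHandler(scenario) as handler:
    assert handler.launched
    print(sorted(handler.object_dict))
```

Wrapping launch and close in a context manager is one way to guarantee that resources are released; the real handlers may expect explicit calls instead.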
Key Interface Functions
1. get_state() → TensorState
Purpose: Extracts structured simulator state for all robots, objects, and cameras into a unified TensorState data structure.

This includes:
- Root position/orientation of each object
- Joint positions & velocities
- Actuator states
- Camera outputs (RGB / depth)
- Optional per-task “extras”
It supports multi-env batched extraction, and ensures consistent structure across backends.
2. set_state(ts: TensorState)
Purpose: Restores or manually sets the simulator state using a full TensorState snapshot.

This is often used for:
- Episode resets to a known state
- State injection during training
- Replaying trajectories
Internally, this maps the unified TensorState back to simulator-specific structures (qpos, qvel, ctrl, etc.).
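The get_state() / set_state() pair can be illustrated with a round-trip check, in the spirit of the consistency tests under test/. The sketch below is an assumption-laden toy: `ToyState` is a simplified, single-environment stand-in for TensorState that uses plain lists, and `ToyBackend` is a hypothetical simulator.

```python
from copy import deepcopy
from dataclasses import dataclass, field


@dataclass
class ToyState:
    """Hypothetical, simplified stand-in for TensorState (plain lists, one env)."""
    root_pos: dict = field(default_factory=dict)   # object name -> [x, y, z]
    joint_pos: dict = field(default_factory=dict)  # robot name -> joint angles


class ToyBackend:
    """Pretend simulator that keeps state in its own backend-specific buffers."""

    def __init__(self):
        self.qpos = {"cube": [0.0, 0.0, 0.5]}
        self.joints = {"arm": [0.1, -0.2]}

    def get_state(self) -> ToyState:
        # Copy so that callers cannot mutate simulator buffers by accident.
        return ToyState(deepcopy(self.qpos), deepcopy(self.joints))

    def set_state(self, ts: ToyState) -> None:
        # Map the unified snapshot back onto the backend buffers.
        self.qpos = deepcopy(ts.root_pos)
        self.joints = deepcopy(ts.joint_pos)


backend = ToyBackend()
snapshot = backend.get_state()

# Disturb the simulator, then restore the snapshot.
backend.qpos["cube"][2] = 9.9
backend.set_state(snapshot)

# Round-trip check: the state read back must equal the snapshot written.
assert backend.get_state() == snapshot
```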
3. simulate()
Purpose: Executes the physics update (step function) in the simulator.

This is typically called after applying actions or updating the state. It may involve multiple substeps (based on decimation rate) and handles model-specific quirks.
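As a rough illustration of substepping, the sketch below advances a toy 1-D point mass several physics substeps per simulate() call. The class, its explicit Euler integrator, and the `dt` and `decimation` values are assumptions for the example, not Metasim's actual stepping code.

```python
class ToyPhysics:
    """Hypothetical 1-D unit point mass integrated with explicit Euler."""

    def __init__(self, dt: float = 0.002, decimation: int = 5):
        self.dt = dt                  # physics timestep in seconds
        self.decimation = decimation  # physics substeps per control step
        self.pos = 0.0
        self.vel = 0.0
        self.force = 0.0

    def _substep(self) -> None:
        # Unit mass, so acceleration equals the applied force.
        self.vel += self.force * self.dt
        self.pos += self.vel * self.dt

    def simulate(self) -> None:
        # One control step advances physics `decimation` times.
        for _ in range(self.decimation):
            self._substep()


sim = ToyPhysics(dt=0.002, decimation=5)
sim.force = 1.0   # the "action" applied before stepping
sim.simulate()
print(round(sim.vel, 6), round(sim.pos, 6))
```

With these values, one control step covers `dt * decimation` = 0.01 s of simulated time.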
4. get_extras(env_ids=None) → dict[str, Tensor]
Purpose: Returns task-specific, non-standard information not present in the core TensorState.

Examples include:
- Site positions
- Contact forces
- Body mass
- IMU sensor data
  
  ……
Usage overview

The full pipeline looks like this:
```
Task.extra_spec()       # Declares what is needed
        │
        ▼
SimHandler.get_extras()                 # Called by RL wrapper
        │
        ▼
SimHandler.query_derived_obs(spec)      # Parses query dict
        │
        ▼
Querier.query(query_obj, handler)       # Resolves each field
```
Task-level declaration
```
from metasim.cfg.query_type import SitePos, SensorData, GeomCollision, BodyMass

def extra_spec(self):
    return {
        "head_pos"        : SitePos(["head"]),
        "gyro_torso"      : SensorData("gyro_torso"),
        "torso_mass"      : BodyMass("torso"),
        "left_foot_touch" : GeomCollision("left_foot", "floor"),
    }
```
Output from get_extras()

The returned dictionary will look like:
```
{
    "head_pos":        Tensor of shape (N_env, 3),
    "gyro_torso":      Tensor of shape (N_env, 3),
    "torso_mass":      Tensor of shape (N_env,),  # scalar per env
    "left_foot_touch": Tensor of shape (N_env,),  # bool mask
}
```
Each value is resolved independently via the corresponding query type and handler logic.
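One way to picture the independent resolution of each field is a dispatch table keyed by descriptor type. The sketch below is hypothetical: `ToySitePos`, `ToyBodyMass`, `ToyHandler`, and `RESOLVERS` are simplified stand-ins defined locally, and the real query types and Querier may be organised differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToySitePos:
    """Toy query descriptor: position of a named site (hypothetical)."""
    site: str


@dataclass(frozen=True)
class ToyBodyMass:
    """Toy query descriptor: mass of a named body (hypothetical)."""
    body: str


class ToyHandler:
    """Pretend backend exposing raw per-environment data for two envs."""
    num_envs = 2
    site_xpos = {"head": [[0.0, 0.0, 1.2], [0.1, 0.0, 1.2]]}
    body_mass = {"torso": [8.5, 8.5]}


# One resolver per descriptor type; each returns one entry per environment.
RESOLVERS = {
    ToySitePos: lambda q, h: h.site_xpos[q.site],
    ToyBodyMass: lambda q, h: h.body_mass[q.body],
}


def query(query_obj, handler):
    """Resolve a single descriptor by dispatching on its type."""
    return RESOLVERS[type(query_obj)](query_obj, handler)


def get_extras(spec, handler):
    """Resolve every field in the spec independently of the others."""
    return {name: query(q, handler) for name, q in spec.items()}


spec = {"head_pos": ToySitePos("head"), "torso_mass": ToyBodyMass("torso")}
extras = get_extras(spec, ToyHandler())
print(extras["torso_mass"])
```

Because each descriptor is resolved on its own, adding a new kind of extra only requires a new descriptor type and a matching resolver.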

cfg folder — Simulator Configuration

What Belongs in Config

Each config file under cfg/ specifies only information required to build and launch the simulation. This includes:

Key Section	Purpose
`robots`	List of robot instances, including model path (e.g. MJCF or URDF), initial pose, joint limits, etc.
`objects`	Static or dynamic scene objects, such as tables, cubes, buttons. Each has position, type, and optional fixations.
`lights`	Light source settings for visual fidelity or vision-based tasks (e.g. color, direction, intensity).
`cameras`	Camera positions and intrinsics, e.g., for RGB, depth, or offscreen rendering.
`scene`	Ground plane, friction, or other high-level environment descriptors.
`sim_params`	Physics timestep, solver config, gravity toggle, etc.
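A config built from the sections above might look roughly like the following. This is a sketch using hypothetical dataclasses (`ToyRobotCfg`, `ToySimParams`, `ToyScenarioCfg`) and a made-up asset path; the real config classes and their field names may differ.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRobotCfg:
    """Hypothetical robot entry: model file plus initial pose."""
    name: str
    model_path: str                      # e.g. an MJCF or URDF file
    init_pos: tuple = (0.0, 0.0, 0.0)


@dataclass
class ToySimParams:
    """Hypothetical physics settings."""
    dt: float = 0.002                    # physics timestep in seconds
    gravity: bool = True                 # gravity toggle


@dataclass
class ToyScenarioCfg:
    """Static description of a scene: no rewards, observations, or checkers."""
    robots: list = field(default_factory=list)
    objects: list = field(default_factory=list)
    lights: list = field(default_factory=list)
    cameras: list = field(default_factory=list)
    sim_params: ToySimParams = field(default_factory=ToySimParams)


scenario = ToyScenarioCfg(
    robots=[ToyRobotCfg(name="arm", model_path="assets/arm.xml")],  # made-up path
    objects=["table", "cube"],
    sim_params=ToySimParams(dt=0.001),
)

# The same object could be handed to any backend handler, a replay script,
# or a viewer, because it carries no task or algorithm logic.
print(scenario.sim_params.dt, [r.name for r in scenario.robots])
```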

What Does Not Belong in Config

To keep cfg/ clean and portable across tasks and RL settings, the following things are explicitly excluded:

- Reward functions
- Observation definitions
- Success checkers
- Task-level logic or termination conditions
- Algorithm-specific parameters (policy type, optimizer, etc.)

These should all live in upper-level wrappers in roboverse_learn.

Integration with ScenarioCfg

Every handler is initialized with a ScenarioCfg object parsed from these configs. The ScenarioCfg aggregates all static config elements (robot, objects, lights, etc.), and passes them to the simulation backend during launch.

This decoupling ensures that you can:

- Reuse one config across multiple RL tasks
- Load the same config for visualization, trajectory replay, or debugging
- Build new tasks without touching simulator configs

RoboVerse Learn Overview#

RoboVerse Learn consists of Task Wrappers and a Learning Framework.
Its goal is to present one standard interface that:

- Lets any algorithm (PPO, SAC, BC, etc.) work with any task
- Hides simulator and task differences, so you can swap tasks, simulators, or algorithms with minimal friction

Design Principles#

#	Principle	Key Points
1	Standardised Wrapper API	• `TaskWrapper` exposes `step / reset / _reward / _observation / _success`. • Once an algorithm is connected to a single `TaskWrapper`, it can seamlessly switch to any other task simply by replacing the wrapper. • Upper‑level algorithms need not care whether the backend is MuJoCo, Isaac, etc.
2	Minimise Task‑Migration Cost	• Add a task: just subclass / compose a wrapper. • Switch simulator: wrappers/algorithms stay unchanged. • Directory layout, config management (except the sim-related part), and training scripts all stay the same.
3	Reusable Reward & Checker Primitives	• Tasks build complex logic by composing primitives → no copy‑paste across tasks.
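The third principle, composing rewards and checkers from primitives, can be sketched with small factory functions. Everything here is hypothetical: `distance_reward`, `height_checker`, `weighted_sum`, and the dict-based state are illustrative and are not RoboVerse's actual primitives.

```python
import math


def distance_reward(key_a, key_b, scale=1.0):
    """Primitive: negative Euclidean distance between two positions."""
    def reward(state):
        return -scale * math.dist(state[key_a], state[key_b])
    return reward


def height_checker(key, min_height):
    """Primitive: True once the object's z coordinate reaches min_height."""
    def check(state):
        return state[key][2] >= min_height
    return check


def weighted_sum(*terms):
    """Combinator: sum of (weight, reward_fn) pairs."""
    def reward(state):
        return sum(weight * fn(state) for weight, fn in terms)
    return reward


# A "lift the cube" task assembled purely from primitives.
reach = distance_reward("gripper", "cube")
lifted = height_checker("cube", min_height=0.1)
total_reward = weighted_sum((1.0, reach), (2.0, lambda s: float(lifted(s))))

state = {"gripper": [0.0, 0.0, 0.0], "cube": [3.0, 4.0, 0.0]}
print(total_reward(state), lifted(state))
```

A second task could reuse `distance_reward` and `height_checker` with different keys and weights instead of copying reward code.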

1. Module Composition#

Sub‑module	Responsibilities
Task Wrapper	• Combines a `Handler` & exposes `step / reset`. • Assembles Reward / Observation / Success. • Provides `pre_sim_step` & `post_sim_step` callbacks for task‑level DR.
Handler (Metasim)	• `set_state / get_state / get_extras` unified across engines. • Physics‑level DR (`pre_sim_step`). • Pure simulator adapter—no algorithm logic.
Learning Framework	• Any RL / IL algorithm. • No simulator knowledge.
Custom Util Wrapper	• Provide lightweight extensions (e.g., NumPy-to-Torch conversion, first-frame caching) to support logging, preprocessing, or offline data collection without modifying core task logic.

2. Interface List#

Method	Purpose
`step(action)`	Runs one simulation step: calls `pre_sim_step`, then `handler.simulate()`, then `post_sim_step`; returns `(obs, reward, done, info)`
`reset()`	Resets the environment and applies `reset_callback`, returns initial observation
`pre_sim_step()`	(Optional) Hook for task-level domain randomization before simulation
`post_sim_step()`	(Optional) Hook for post-processing (e.g., observation noise)
`get_state()` / `set_state()`	Unified simulator-agnostic state interface using `TensorState`
`get_extras(spec)`	Returns task-specific quantities (e.g., site poses, contact forces) via query descriptors
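The call order inside step() can be sketched as below. `CountingHandler` and `ToyTaskWrapper` are hypothetical stand-ins whose "state" is just a step counter; the real TaskWrapper and handler signatures may differ.

```python
class CountingHandler:
    """Hypothetical handler whose entire state is a step counter."""

    def __init__(self):
        self.t = 0

    def simulate(self):
        self.t += 1

    def get_state(self):
        return {"t": self.t}

    def set_state(self, state):
        self.t = state["t"]


class ToyTaskWrapper:
    """Illustrative wrapper: pre_sim_step -> simulate -> post_sim_step."""

    def __init__(self, handler, horizon=3):
        self.handler = handler
        self.horizon = horizon
        self.calls = []  # records hook order, for illustration only

    def pre_sim_step(self, action):
        self.calls.append("pre")

    def post_sim_step(self):
        self.calls.append("post")

    def _observation(self, state):
        return [state["t"]]

    def _reward(self, state):
        return float(state["t"])

    def _success(self, state):
        return state["t"] >= self.horizon

    def reset(self):
        self.handler.set_state({"t": 0})
        return self._observation(self.handler.get_state())

    def step(self, action):
        self.pre_sim_step(action)
        self.handler.simulate()
        self.post_sim_step()
        state = self.handler.get_state()
        done = self._success(state)
        return self._observation(state), self._reward(state), done, {"success": done}


env = ToyTaskWrapper(CountingHandler())
obs = env.reset()
done = False
while not done:
    obs, reward, done, info = env.step(action=None)
print(obs, reward, info)
```

An algorithm that only touches `reset()` and `step()` never needs to know which handler sits underneath.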

3. Domain Randomisation Layers#

Layer	Location	Examples
Physics‑level	`Handler`	Friction, mass, light, material
Task‑level	`Wrapper.pre/post_sim_step()`	Action noise, observation noise, initial‑pose jitter

Rule: Simulator parameters → Handler; task‑coupled noise → Wrapper.
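The split in the rule above can be illustrated with a toy example: the handler resamples a simulator parameter, while the wrapper perturbs the action. The class names, the `randomize_physics` method, and the noise ranges are assumptions made for this sketch.

```python
import random


class ToyHandler:
    """Hypothetical handler that owns simulator parameters."""

    def __init__(self, rng: random.Random):
        self.rng = rng
        self.friction = 1.0
        self.last_action = None

    def randomize_physics(self) -> None:
        # Physics-level DR: resample a simulator parameter.
        self.friction = self.rng.uniform(0.5, 1.5)

    def simulate(self, action) -> None:
        # A real backend would step physics; this sketch records the action.
        self.last_action = list(action)


class ToyWrapper:
    """Hypothetical wrapper that owns task-coupled noise."""

    def __init__(self, handler: ToyHandler, rng: random.Random, action_noise: float = 0.05):
        self.handler = handler
        self.rng = rng
        self.action_noise = action_noise

    def pre_sim_step(self, action):
        # Task-level DR: perturb the action before it reaches the simulator.
        return [a + self.rng.gauss(0.0, self.action_noise) for a in action]

    def step(self, action):
        noisy = self.pre_sim_step(action)
        self.handler.simulate(noisy)
        return noisy


rng = random.Random(0)  # fixed seed so the sketch is reproducible
handler = ToyHandler(rng)
wrapper = ToyWrapper(handler, rng)

handler.randomize_physics()
noisy_action = wrapper.step([0.0, 0.0])
print(handler.friction, noisy_action)
```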

4. Migrating a New Task into RoboVerse#

We support two ways to bring an external task into the RoboVerse Learn pipeline:

Approach 1: Direct Integration (Quick Migration)#

The fastest way to integrate a new task is to:

- Copy the task codebase (from an external repo) into roboverse_learn/
- Replace any simulator-specific API calls with Handler equivalents
- Convert raw observations into RoboVerse TensorState via get_state()
- Move simulator-related config (e.g. robot model path, asset layout, dt, decimation, n_substeps) into ScenarioCfg and Metasim config files

This transforms the original task into a RoboVerse-compatible format while preserving its logic and structure.

Cross-simulator support is now enabled for this task.

Approach 2: Structured Wrapper Integration#

To enable better reuse and cross-task comparison:

- Subclass BaseTaskWrapper
- Implement standardized interfaces: _reward(), _observation(), _terminated()
- Use callbacks (pre_sim_step, post_sim_step, reset_callback) as needed
- Leverage existing Handler and ScenarioCfg setup from Approach 1
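A subclass following these steps might look like the sketch below. `ToyBaseTaskWrapper` is a hypothetical stand-in for BaseTaskWrapper, written with abstract methods so that a subclass missing a required hook fails at construction time; the real base class may enforce its interface differently.

```python
from abc import ABC, abstractmethod


class ToyBaseTaskWrapper(ABC):
    """Hypothetical stand-in for BaseTaskWrapper."""

    def __init__(self, handler):
        self.handler = handler

    @abstractmethod
    def _reward(self, state): ...

    @abstractmethod
    def _observation(self, state): ...

    @abstractmethod
    def _terminated(self, state): ...

    def reset_callback(self):
        """Optional hook; the default does nothing."""


class ReachHeightTask(ToyBaseTaskWrapper):
    """Toy task: raise an object to a target height."""

    def __init__(self, handler, target=1.0):
        super().__init__(handler)
        self.target = target

    def _observation(self, state):
        return [state["height"]]

    def _reward(self, state):
        return -abs(self.target - state["height"])

    def _terminated(self, state):
        return state["height"] >= self.target


class IncompleteTask(ToyBaseTaskWrapper):
    """Implements only one of the three required hooks."""

    def _reward(self, state):
        return 0.0


task = ReachHeightTask(handler=None)  # no simulator needed for this sketch
state = {"height": 0.25}
print(task._reward(state), task._terminated(state))

try:
    IncompleteTask(handler=None)
except TypeError as err:
    missing_hooks_error = str(err)  # raised because abstract hooks are missing
```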

This approach supports full compatibility with:

- Multi-task learning benchmarks
- One-click algorithm switching
- Clean architectural separation between task, sim, and learning logic

With either approach, you can quickly benchmark new tasks under different simulators or algorithms — with no boilerplate or duplicate integration.