RoboMonster is a compositional embodied manipulation framework and benchmark built on ManiSkill that treats a robot as a team of heterogeneous, specialized end-effectors instead of a single gripper. It pairs a constraint-driven high-level planner with imitation-learned control policies for each tool to select, coordinate, and sequence the right end-effector for each sub-task, showing clear gains over gripper-only baselines.
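To make the composition concrete, here is a minimal, purely illustrative Python sketch. Every class and name in it is hypothetical and does not correspond to the actual RoboMonster code; it only shows the pattern of a high-level planner sequencing sub-tasks, picking an end-effector for each, and handing control to that tool's imitation-learned policy.
from dataclasses import dataclass

@dataclass
class SubTask:
    name: str
    tool: str          # end-effector chosen by the planner
    horizon: int = 3   # toy rollout length

class Planner:
    """Stand-in for the constraint-driven high-level planner."""
    def plan(self, goal):
        # A real planner reasons over task constraints; here we hard-code a toy plan.
        return [SubTask("approach", "gripper"), SubTask("swipe", "card_swiper")]
    def select_policy(self, sub_task, policies):
        return policies[sub_task.tool]   # pick the specialized end-effector's policy

class ToolPolicy:
    """Stand-in for one tool's imitation-learned control policy."""
    def act(self, obs):
        return [0.0] * 7                 # dummy 7-DoF action

def run(goal):
    planner = Planner()
    policies = {"gripper": ToolPolicy(), "card_swiper": ToolPolicy()}
    obs = {}
    for sub in planner.plan(goal):                     # sequence the sub-tasks
        policy = planner.select_policy(sub, policies)  # coordinate: the right tool per sub-task
        for _ in range(sub.horizon):
            action = policy.act(obs)                   # low-level imitation-learned control
            print(sub.name, sub.tool, action[:2])

if __name__ == "__main__":
    run("swipe_card")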
First, clone this repository to your local machine, then install Vulkan and the following dependencies.
git clone [email protected]:MARS-EAI/RoboMonster.git
conda create -n RoboMonster python=3.9
conda activate RoboMonster
cd RoboMonster
pip install -r requirements.txt
# (optional): conda install -c conda-forge networkx=2.5
Then download the 3D assets used by the RoboMonster tasks:
python script/download_assets.py
We use a specific fork of ManiSkill for RoboMonster: Maniskill_fork_for_RoboMonster. You should replace the mani_skill package in your local conda environment with the files from this fork:
# NOTE: the mani_skill install path is usually the ~/anaconda3/envs/RoboMonster/lib/python3.9/site-packages/mani_skill/ folder
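# If you are unsure where mani_skill is installed, this standard one-liner prints the exact path:
python -c "import mani_skill, os; print(os.path.dirname(mani_skill.__file__))"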
# Example:
mv -f ../Maniskill_fork_for_RoboMonster/agent/robots/panda {your mani_skill install path}/agent/robots/
mv -f ../Maniskill_fork_for_RoboMonster/assets/robots/panda {your mani_skill install path}/assets/robots/
Now, try running a task with a single command:
# You can choose the variant from ["gripper", "ours"]: "gripper" means gripper-only and "ours" means heterogeneous multi-end-effectors. For example:
python script/run_task.py configs/table/swipe_card.yaml ours
# or
python script/run_task.py configs/table/circle_vase.yaml gripper
For more complex scenes such as RoboCasa, you can download them with the following command. Note that if you use these scenes in your work, please cite the scene dataset authors.
python -m mani_skill.utils.download_asset RoboCasa
After downloading the scene dataset, you can try running it:
python script/run_task.py configs/robocasa/swipe_card.yaml ours
If you are running simulation environments on a headless Debian server without a graphical desktop, you will need to install a minimal set of OpenGL and EGL libraries to ensure compatibility.
Run the following commands to install the necessary runtime libraries:
sudo apt update
sudo apt install libgl1 libglvnd0 libegl1-mesa libgles2-mesa libopengl0
You can use the following scripts to generate data for DP2 or DP3. The generated data is usually placed in the demos/ folder.
# Format: python script/generate_data.py --config {config_path} --num {traj_num} --variant {gripper | ours} [--save-video]
# Generate data for DP2:
python script/generate_data.py --config configs/table/swipe_card.yaml --num 75 --variant ours --save-video
# Generate data for DP3:
python script/generate_data_pointcloud.py --config configs/table/circle_vase.yaml --num 150 --variant gripper --save-video
# For short:
python script/generate_data.py configs/table/swipe_card.yaml 75 ours
The data generated by the ManiSkill framework is in .h5 format. To accelerate training, we restructure the data format.
# 1. Create the data folders if they do not exist
mkdir -p data/h5_data
# 2. Move your .h5 and .json files into the data/h5_data folder.
# NOTE: {data_type} can be chosen from ["rgb", "pointcloud"]. You should follow this naming convention to avoid issues in later scripts.
mv {your_h5_file}.h5 data/h5_data/{task_name}_{data_type}.h5
mv {your_h5_file}.json data/h5_data/{task_name}_{data_type}.json
# 3. Run the script to process the data.
# NOTE: This script assumes the default config. If you add an additional camera in the config yaml, modify the script to match the data.
# Example for DP2:
# --load-num is the number of demonstrations (identical to {traj_num} in the data generation command).
python script/convert_data.py data/h5_data/swipe_card_rgb.h5 --agent-num 2 --modality image --load-num 75
# Example for DP3:
python script/convert_data.py data/h5_data/circle_vase_pointcloud.h5 --agent-num 1 --modality pointcloud --load-num 150
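# (Optional, illustrative) Quick sanity check of a converted file using plain h5py (assumes h5py is installed);
# it prints every dataset's path, shape, and dtype so you can verify the conversion. See also step 4 below.
python -c "import sys, h5py; h5py.File(sys.argv[1], 'r').visititems(lambda n, o: print(n, o.shape, o.dtype) if isinstance(o, h5py.Dataset) else None)" data/h5_data/swipe_card_rgb.h5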
# 4. (Optional) You can check the converted .h5 files with read_h5.py in the script/tools/h5py/ folder.
# Example:
cp data/{task_name}_{data_type}.h5 script/tools/h5py/input
python script/tools/h5py/read_h5.py -i data/{task_name}_{data_type}.h5
We currently provide training code for Diffusion Policy (DP) and 3D Diffusion Policy (DP3), and we plan to support more policies in the future. You can train a DP model with the following commands:
# Format: python policy/Diffusion-Policy/diffusion_policy/workspace/{policy_workspace} --config-name={policy} task={task_config}
# Example for DP2 training:
python policy/Diffusion-Policy/diffusion_policy/workspace/workspace_dp2.py --config-name=dp2 task=2a_swipe_card_2d
# Example for DP3 training:
python policy/Diffusion-Policy/diffusion_policy/workspace/workspace_dp3.py --config-name=dp3 task=1a_circle_vase_3d
After training completes, use the .ckpt file (usually in the outputs/ folder) to evaluate your model. Setting DEBUG_MODE to 1 outputs more information.
# Example for DP2 inference:
python policy/Diffusion-Policy/eval_dp2.py --config configs/table/swipe_card.yaml --variant ours --ckpt {your_ckpt_path}
# Example for DP3 inference:
python policy/Diffusion-Policy/eval_dp3.py --config configs/table/circle_vase.yaml --variant gripper --ckpt {your_ckpt_path}
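# (Assumption) If DEBUG_MODE is read from the environment rather than from the config, you can enable verbose output inline, e.g.:
DEBUG_MODE=1 python policy/Diffusion-Policy/eval_dp2.py --config configs/table/swipe_card.yaml --variant ours --ckpt {your_ckpt_path}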