| 1 | +# LIBERO-plus |
| 2 | + |
| 3 | +LIBERO-plus is a **robustness benchmark** for Vision-Language-Action (VLA) models built on top of [LIBERO](./libero). It systematically stress-tests policies by applying **seven independent perturbation dimensions** to the original LIBERO task set, exposing failure modes that standard benchmarks miss. |
| 4 | + |
| 5 | +- Paper: [In-depth Robustness Analysis of Vision-Language-Action Models](https://arxiv.org/abs/2510.13626) |
| 6 | +- GitHub: [sylvestf/LIBERO-plus](https://github.com/sylvestf/LIBERO-plus) |
| 7 | +- Dataset: [lerobot/libero_plus](https://huggingface.co/datasets/lerobot/libero_plus) |
| 8 | + |
| 9 | + |
| 10 | + |
| 11 | +## Perturbation dimensions |
| 12 | + |
| 13 | +LIBERO-plus creates ~10 000 task variants by perturbing each original LIBERO task along these axes: |
| 14 | + |
| 15 | +| Dimension | What changes | |
| 16 | +| --------------------- | ----------------------------------------------------- | |
| 17 | +| Objects layout | Target position, presence of confounding objects | |
| 18 | +| Camera viewpoints | Camera position, orientation, field-of-view | |
| 19 | +| Robot initial states | Manipulator start pose | |
| 20 | +| Language instructions | LLM-rewritten task description (paraphrase / synonym) | |
| 21 | +| Light conditions | Intensity, direction, color, shadow | |
| 22 | +| Background textures | Scene surface and object appearance | |
| 23 | +| Sensor noise | Photometric distortions and image degradation | |
| 24 | + |
| 25 | +## Available task suites |
| 26 | + |
| 27 | +LIBERO-plus covers the same five suites as LIBERO: |
| 28 | + |
| 29 | +| Suite | CLI name | Tasks | Max steps | Description | |
| 30 | +| -------------- | ---------------- | ----- | --------- | -------------------------------------------------- | |
| 31 | +| LIBERO-Spatial | `libero_spatial` | 10 | 280 | Tasks requiring reasoning about spatial relations | |
| 32 | +| LIBERO-Object | `libero_object` | 10 | 280 | Tasks centered on manipulating different objects | |
| 33 | +| LIBERO-Goal | `libero_goal` | 10 | 300 | Goal-conditioned tasks with changing targets | |
| 34 | +| LIBERO-90 | `libero_90` | 90 | 400 | Short-horizon tasks from the LIBERO-100 collection | |
| 35 | +| LIBERO-Long | `libero_10` | 10 | 520 | Long-horizon tasks from the LIBERO-100 collection | |
| 36 | + |
| 37 | +<Tip warning={true}> |
| 38 | + Installing LIBERO-plus **replaces** vanilla LIBERO — it uninstalls `hf-libero` |
| 39 | + so that `import libero` resolves to the LIBERO-plus fork. You cannot have both |
| 40 | + installed at the same time. To switch back to vanilla LIBERO, uninstall the |
| 41 | + fork and reinstall with `pip install -e ".[libero]"`. |
| 42 | +</Tip> |
| 43 | + |
| 44 | +## Installation |
| 45 | + |
| 46 | +### System dependencies (Linux only) |
| 47 | + |
| 48 | +```bash |
| 49 | +sudo apt install libexpat1 libfontconfig1-dev libmagickwand-dev |
| 50 | +``` |
| 51 | + |
| 52 | +### Python package |
| 53 | + |
| 54 | +```bash |
| 55 | +pip install -e ".[libero]" "robosuite==1.4.1" bddl easydict mujoco wand scikit-image gym |
| 56 | +git clone https://github.com/sylvestf/LIBERO-plus.git |
| 57 | +cd LIBERO-plus && pip install --no-deps -e . |
| 58 | +pip uninstall -y hf-libero # so `import libero` resolves to the fork |
| 59 | +``` |
| 60 | + |
| 61 | +LIBERO-plus is installed from its GitHub fork rather than through a pyproject extra: the fork ships as a namespace package that pip cannot pull in as a regular dependency, so it must be cloned and installed from the local checkout, as the commands above do. See `docker/Dockerfile.benchmark.libero_plus` for the canonical install. MuJoCo is required, so only Linux is supported. |
| 62 | + |
| 63 | +<Tip> |
| 64 | +Set the MuJoCo rendering backend before running evaluation: |
| 65 | + |
| 66 | +```bash |
| 67 | +export MUJOCO_GL=egl # headless / HPC / cloud |
| 68 | +``` |
| 69 | + |
| 70 | +</Tip> |
| 71 | + |
| 72 | +### Download LIBERO-plus assets |
| 73 | + |
| 74 | +LIBERO-plus ships its extended asset pack separately. Download `assets.zip` from the [Hugging Face dataset](https://huggingface.co/datasets/Sylvest/LIBERO-plus/tree/main) and extract it into the LIBERO-plus package directory: |
| 75 | + |
| 76 | +```bash |
| 77 | +# After installing the package, find where it was installed: |
| 78 | +python -c "import libero; print(list(libero.__path__))" |
| 79 | +# Then extract assets.zip into <package_root>/libero/assets/ |
| 80 | +``` |
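The extraction step can be scripted. The sketch below is a hypothetical helper, not part of LIBERO-plus: `extract_assets` and both of its arguments are illustrative names, and it assumes `python3` is on the `PATH`. The internal layout of `assets.zip` is not described here, so list its contents first with `python3 -m zipfile -l assets.zip` and choose the destination so that files are not nested one level too deep.

```bash
# Hypothetical helper: unpack an archive into a destination directory.
# Fails early if the archive is missing; creates the destination if needed.
extract_assets() {
  archive=$1
  dest=$2
  [ -f "$archive" ] || { echo "archive not found: $archive" >&2; return 1; }
  mkdir -p "$dest" || return 1
  # The standard-library zipfile CLI avoids a dependency on `unzip`.
  python3 -m zipfile -e "$archive" "$dest"
}

# Usage (both paths are placeholders to fill in for your install):
# extract_assets ./assets.zip "<package_root>/libero/assets"
```

Passing the destination explicitly keeps the helper independent of how the package directory was located.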
| 81 | + |
| 82 | +## Evaluation |
| 83 | + |
| 84 | +### Default evaluation (recommended) |
| 85 | + |
| 86 | +Evaluate across the four standard suites (10 episodes per task): |
| 87 | + |
| 88 | +```bash |
| 89 | +lerobot-eval \ |
| 90 | + --policy.path="your-policy-id" \ |
| 91 | + --env.type=libero_plus \ |
| 92 | + --env.task=libero_spatial,libero_object,libero_goal,libero_10 \ |
| 93 | + --eval.batch_size=1 \ |
| 94 | + --eval.n_episodes=10 \ |
| 95 | + --env.max_parallel_tasks=1 |
| 96 | +``` |
| 97 | + |
| 98 | +### Single-suite evaluation |
| 99 | + |
| 100 | +Evaluate on one LIBERO-plus suite: |
| 101 | + |
| 102 | +```bash |
| 103 | +lerobot-eval \ |
| 104 | + --policy.path="your-policy-id" \ |
| 105 | + --env.type=libero_plus \ |
| 106 | + --env.task=libero_spatial \ |
| 107 | + --eval.batch_size=1 \ |
| 108 | + --eval.n_episodes=10 |
| 109 | +``` |
| 110 | + |
| 111 | +- `--env.task` picks the suite (`libero_spatial`, `libero_object`, etc.). |
| 112 | +- `--env.task_ids` restricts to specific task indices (`[0]`, `[1,2,3]`, etc.). Omit to run all tasks in the suite. |
| 113 | +- `--eval.batch_size` controls how many environments run in parallel. |
| 114 | +- `--eval.n_episodes` sets how many episodes to run per task. |
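As a sketch of how the flags above combine, the following dry run assembles a single-suite command restricted to the first three tasks. The task indices and the `TASK_IDS` variable are illustrative assumptions; only the flag names come from the list above. The bracketed list is quoted so the shell does not treat it as a glob pattern.

```bash
# Illustrative subset: tasks 0, 1 and 2 of the chosen suite (an assumption).
TASK_IDS='[0,1,2]'

# Assemble the command as positional parameters so it can be inspected first.
set -- lerobot-eval \
  --policy.path="your-policy-id" \
  --env.type=libero_plus \
  --env.task=libero_spatial \
  --env.task_ids="$TASK_IDS" \
  --eval.batch_size=1 \
  --eval.n_episodes=10

# Dry run: print the command instead of executing it.
printf '%s\n' "$*"

# To launch the evaluation for real, run: "$@"
```

Printing the command before running it makes it easy to confirm that the task list reached the CLI unchanged.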
| 115 | + |
| 116 | +### Multi-suite evaluation |
| 117 | + |
| 118 | +Benchmark a policy across multiple suites at once by passing a comma-separated list: |
| 119 | + |
| 120 | +```bash |
| 121 | +lerobot-eval \ |
| 122 | + --policy.path="your-policy-id" \ |
| 123 | + --env.type=libero_plus \ |
| 124 | + --env.task=libero_spatial,libero_object \ |
| 125 | + --eval.batch_size=1 \ |
| 126 | + --eval.n_episodes=10 |
| 127 | +``` |
| 128 | + |
| 129 | +### Control mode |
| 130 | + |
| 131 | +LIBERO-plus supports two control modes — `relative` (default) and `absolute`. Different VLA checkpoints are trained with different action parameterizations, so make sure the mode matches your policy: |
| 132 | + |
| 133 | +```bash |
| 134 | +--env.control_mode=relative # or "absolute" |
| 135 | +``` |
| 136 | + |
| 137 | +### Policy inputs and outputs |
| 138 | + |
| 139 | +**Observations:** |
| 140 | + |
| 141 | +- `observation.state` — 8-dim proprioceptive features (eef position, axis-angle orientation, gripper qpos) |
| 142 | +- `observation.images.image` — main camera view (`agentview_image`), HWC uint8 |
| 143 | +- `observation.images.image2` — wrist camera view (`robot0_eye_in_hand_image`), HWC uint8 |
| 144 | + |
| 145 | +**Actions:** |
| 146 | + |
| 147 | +- Continuous control in `Box(-1, 1, shape=(7,))` — 6D end-effector delta + 1D gripper |
| 148 | + |
| 149 | +### Recommended evaluation episodes |
| 150 | + |
| 151 | +For reproducible benchmarking, use **10 episodes per task** across all four standard suites (Spatial, Object, Goal, Long). This gives 400 total episodes and matches the protocol used for published results. |
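The 400-episode figure follows from the suite table: four suites, 10 tasks each, 10 episodes per task. The sketch below recomputes it and also derives an upper bound on simulated steps, assuming every episode runs to its suite's max-step limit (episodes that succeed early use fewer). The `suite:tasks:max_steps` row format and the variable names are illustrative.

```bash
episodes_per_task=10
total_episodes=0
total_steps=0

# One row per standard suite, copied from the suite table: name:tasks:max_steps
for row in libero_spatial:10:280 libero_object:10:280 libero_goal:10:300 libero_10:10:520; do
  rest=${row#*:}          # strip the suite name
  tasks=${rest%%:*}       # number of tasks in the suite
  max_steps=${rest#*:}    # per-episode step limit

  episodes=$((tasks * episodes_per_task))
  total_episodes=$((total_episodes + episodes))
  total_steps=$((total_steps + episodes * max_steps))
done

echo "total episodes: $total_episodes"   # total episodes: 400
echo "max env steps:  $total_steps"      # max env steps:  138000
```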
| 152 | + |
| 153 | +## Training |
| 154 | + |
| 155 | +### Dataset |
| 156 | + |
| 157 | +A LeRobot-format training dataset for LIBERO-plus is available at: |
| 158 | + |
| 159 | +- [lerobot/libero_plus](https://huggingface.co/datasets/lerobot/libero_plus) |
| 160 | + |
| 161 | +### Example training command |
| 162 | + |
| 163 | +```bash |
| 164 | +lerobot-train \ |
| 165 | + --policy.type=smolvla \ |
| 166 | + --policy.repo_id=${HF_USER}/smolvla_libero_plus \ |
| 167 | + --policy.load_vlm_weights=true \ |
| 168 | + --dataset.repo_id=lerobot/libero_plus \ |
| 169 | + --env.type=libero_plus \ |
| 170 | + --env.task=libero_spatial \ |
| 171 | + --output_dir=./outputs/ \ |
| 172 | + --steps=100000 \ |
| 173 | + --batch_size=4 \ |
| 174 | + --eval.batch_size=1 \ |
| 175 | + --eval.n_episodes=1 \ |
| 176 | + --eval_freq=1000 |
| 177 | +``` |
| 178 | + |
| 179 | +## Relationship to LIBERO |
| 180 | + |
| 181 | +LIBERO-plus is a drop-in extension of LIBERO: |
| 182 | + |
| 183 | +- Same Python gym interface (`LiberoEnv`, `LiberoProcessorStep`) |
| 184 | +- Same camera names and observation/action format |
| 185 | +- Same task suite names |
| 186 | +- Installs under the same `libero` Python package name (different GitHub repo) |
| 187 | + |
| 188 | +To use the original LIBERO benchmark, see [LIBERO](./libero) and use `--env.type=libero`. |