huggingface · spirosperos · Aug 7, 2025 · Aug 7, 2025 · Oct 3, 2025 · Oct 8, 2025
diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml
@@ -36,26 +36,32 @@ repos:
       - id: check-yaml
       - id: check-toml
       - id: end-of-file-fixer
+        exclude: ^(outputs/|examples/hil_serl_simulation_training/outputs/)
       - id: trailing-whitespace
+        exclude: ^(outputs/|examples/hil_serl_simulation_training/outputs/)
 
   - repo: https://github.com/astral-sh/ruff-pre-commit
     rev: v0.12.4
     hooks:
       - id: ruff-format
+        exclude: ^(outputs/|examples/hil_serl_simulation_training/outputs/)
       - id: ruff
         args: [--fix, --exit-non-zero-on-fix]
+        exclude: ^(outputs/|examples/hil_serl_simulation_training/outputs/)
 
   - repo: https://github.com/adhtruong/mirrors-typos
     rev: v1.34.0
     hooks:
       - id: typos
         args: [--force-exclude]
+        exclude: ^(outputs/|examples/hil_serl_simulation_training/outputs/)
 
   - repo: https://github.com/asottile/pyupgrade
     rev: v3.20.0
     hooks:
     -   id: pyupgrade
         args: [--py310-plus]
+        exclude: ^(outputs/|examples/hil_serl_simulation_training/outputs/)
 
   ##### Markdown Quality #####
   - repo: https://github.com/rbubley/mirrors-prettier
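
Review note on the `exclude` entries added above: pre-commit treats `exclude` as a Python regular expression and matches it against repo-relative paths with `re.search`, so the pattern can be checked directly. The sketch below copies the pattern from this diff; the helper name `is_excluded` and the sample paths are hypothetical and exist only to exercise the match.

```python
import re

# Pattern copied verbatim from the `exclude:` lines in this diff.
EXCLUDE = re.compile(r"^(outputs/|examples/hil_serl_simulation_training/outputs/)")


def is_excluded(path: str) -> bool:
    """Return True if a repo-relative path would be skipped by the hooks."""
    return EXCLUDE.search(path) is not None


# Hypothetical paths, not taken from the repository.
sample_paths = [
    "outputs/run1/log.txt",
    "examples/hil_serl_simulation_training/outputs/config.json",
    "examples/grid_hil_serl/grid_cube_randomizer.py",
    "src/outputs/notes.txt",
]

for path in sample_paths:
    status = "excluded" if is_excluded(path) else "checked"
    print(f"{status:8s} {path}")
```

Because the pattern is anchored with `^`, only the two listed prefixes at the repository root are skipped; an `outputs/` directory nested elsewhere is still checked.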

diff --git a/examples/grid_hil_serl/README.md b/examples/grid_hil_serl/README.md
@@ -0,0 +1,152 @@
+# Grid HIL SERL Environment
+
+This example demonstrates a **simplified HIL-SERL setup** for computer vision-based grid position prediction. Instead of complex robotic manipulation, the algorithm learns to predict which of the 64 grid cells contains a red cube based on camera images, with human feedback during training.
+
+## Overview
+
+The environment consists of:
+- An 8x8 grid world with high-definition visual rendering
+- A red cube that randomly spawns at grid cell centers
+- A top-left-origin coordinate system: cell (0,0) is the top-left corner
+- Automatic high-definition image capture (1920x1080)
+
+## Files
+
+- `grid_scene.xml` - MuJoCo scene definition with 8x8 grid
+- `grid_cube_randomizer.py` - Main script for randomizing cube positions
+- `README.md` - This documentation
+
+## Usage
+
+### 1. Test the Environment
+```bash
+cd examples/grid_hil_serl
+python grid_cube_randomizer.py
+```
+
+### 2. Record Demonstrations
+```bash
+# Record training data automatically
+python record_grid_demo.py --episodes 50 --steps 10
+
+# Or use LeRobot's recording script (standard dataset format)
+python -m lerobot.scripts.rl.gym_manipulator --config_path record_grid_position_lerobot.json
+```
+
+### 3. Train HIL-SERL Policy
+```bash
+# Terminal 1: Start learner
+python -m lerobot.scripts.rl.learner --config_path train_grid_position.json
+
+# Terminal 2: Start actor (with human feedback)
+python -m lerobot.scripts.rl.actor --config_path train_grid_position.json
+```
+
+### Command Line Options
+```bash
+# Environment testing
+python grid_cube_randomizer.py --interval 2.0 --no-save
+
+# Recording options
+python record_grid_demo.py --episodes 100 --steps 5 --output ./my_recordings
+```
+
+## Features
+
+### Grid System
+- **8x8 grid**: 64 total cells
+- **Coordinate system**: (0,0) = top-left, (7,7) = bottom-right
+- **Cell centers**: Cube spawns at precise grid cell centers
+- **High-definition**: 32x32 texture with 256x256 resolution
+
+### Cube Positioning
+- **Random placement**: Uniform random distribution across all 64 cells
+- **Precise positioning**: Cube lands exactly at grid cell centers
+- **Physics compliant**: Proper velocity reset for instant teleportation
+- **Visual feedback**: Clear console output of cell coordinates
+
+### Image Capture
+- **HD resolution**: 1920x1080 (Full HD)
+- **Automatic saving**: Images saved after each cube repositioning
+- **Professional quality**: Suitable for datasets and documentation
+- **Top-down view**: Camera positioned for complete grid visibility
+
+## Coordinate System
+
+```
+(0,0) → (-3.5, 3.5)   (7,0) → (3.5, 3.5)
+      ↘                     ↙
+(0,7) → (-3.5, -3.5)  (7,7) → (3.5, -3.5)
+```
+
+## HIL-SERL Workflow
+
+This simplified setup demonstrates the core HIL-SERL concept with minimal complexity:
+
+### Training Phase (Offline)
+1. **Automatic Data Collection**: Environment randomly places cube in different grid positions
+2. **Supervised Learning**: Algorithm learns to predict grid position from images
+3. **Ground Truth Labels**: Exact grid coordinates provided for each image
+
+### Human-in-the-Loop Phase (Online)
+1. **Algorithm Prediction**: Model predicts cube position from camera images
+2. **Human Feedback**: Human indicates if prediction is correct/incorrect
+3. **Iterative Learning**: Model improves based on human guidance
+
+### Key Simplifications
+- **No Robot Control**: Focus purely on computer vision prediction
+- **Discrete Predictions**: 64 possible outputs (one per grid cell)
+- **Perfect Ground Truth**: Exact position labels available
+- **Visual Task Only**: No complex motor control or physics
+
+## Integration with LeRobot
+
+The environment integrates with LeRobot's HIL-SERL framework through:
+
+1. **Custom Gym Environment**: `GridPositionPrediction-v0` registered with gymnasium
+2. **LeRobot-Compatible Interface**: Proper observation/action space formatting
+3. **Config Files**: `record_grid_position.json` and `train_grid_position.json`
+4. **Dataset Collection**: Automated recording of image-position pairs
+
+## Technical Details
+
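+- **Physics**: MuJoCo physics engine; the cube is moved by writing its free-joint position and velocity directly
+- **Rendering**: Offscreen rendering with `mujoco.Renderer`, saved to disk with PIL
+- **Randomization**: NumPy-based random number generation (`np.random.randint`)
+- **Viewer**: Passive viewer (`mujoco.viewer.launch_passive`) polled in a loop, with `viewer.sync()` after each repositioning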
+
+## Example Output
+
+```
+Loading scene: grid_scene.xml
+
+==================================================
+8x8 Grid Cube Randomizer
+==================================================
+This scene shows an 8x8 grid with a randomly positioned cube.
+Cube position randomizes every 3.0 seconds.
+
+Controls:
+  R: Manually randomize cube position
+  S: Save current camera view to img.jpg
+  Space: Pause/unpause
+  Esc: Exit
+  Camera: Mouse controls for rotation/zoom
+==================================================
+Spawning cube at grid cell (3, 5) -> position (-0.5, -1.5)
+Camera view saved to: img.jpg
+Spawning cube at grid cell (1, 2) -> position (-2.5, 1.5)
+Camera view saved to: img.jpg
+```
+
+## Dependencies
+
+- mujoco
+- numpy
+- PIL (Pillow)
+- gymnasium (optional, for integration)
+
+## Related Examples
+
+- `hil_serl_simulation_training/` - Full HIL-SERL training examples
+- `lekiwi/` - Real robot integration examples
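
Review note on the README's coordinate system: the convention it describes (top-left origin, cell centers at half-integer world coordinates) can be sketched as a pair of conversion helpers. The function names and the row-major flat index for the 64 discrete outputs are assumptions for illustration; the PR does not show how the policy numbers its outputs.

```python
def cell_to_world(x_cell: int, y_cell: int, grid_size: int = 8) -> tuple[float, float]:
    """Map a grid cell (top-left origin) to the world position of its center."""
    half = grid_size / 2
    x_pos = x_cell - half + 0.5  # column 0 -> -3.5, column 7 -> 3.5
    y_pos = half - y_cell - 0.5  # row 0 -> 3.5, row 7 -> -3.5 (y axis flipped)
    return x_pos, y_pos


def world_to_cell(x_pos: float, y_pos: float, grid_size: int = 8) -> tuple[int, int]:
    """Inverse of cell_to_world for positions at cell centers."""
    half = grid_size / 2
    return int(round(x_pos + half - 0.5)), int(round(half - 0.5 - y_pos))


def cell_to_index(x_cell: int, y_cell: int, grid_size: int = 8) -> int:
    """Hypothetical row-major class index in [0, grid_size**2)."""
    return y_cell * grid_size + x_cell


print(cell_to_world(0, 0))  # top-left cell
print(cell_to_world(7, 7))  # bottom-right cell
print(cell_to_world(3, 5), cell_to_index(3, 5))
```

The cell (3, 5) maps to (-0.5, -1.5), which agrees with the first line of the README's example output.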
diff --git a/examples/grid_hil_serl/grid_cube_randomizer.py b/examples/grid_hil_serl/grid_cube_randomizer.py
@@ -0,0 +1,161 @@
+#!/usr/bin/env python
+
+"""
+Random Grid Cube Spawner
+
+This script loads the 8x8 grid scene and randomly positions a cube
+in one of the 64 grid cells. The cube spawns at the center of a cell,
+i.e. at half-integer world coordinates such as -3.5 or 0.5.
+"""
+
+import numpy as np
+import mujoco
+import mujoco.viewer
+import argparse
+import time
+from PIL import Image
+
+
+def save_camera_view(model, data, filename="img.jpg"):
+    """
+    Save the current camera view to a JPEG image file.
+
+    Args:
+        model: MuJoCo model
+        data: MuJoCo data
+        filename: Output filename (default: img.jpg)
+    """
+    try:
+        # Create a high-definition renderer for the current camera
+        renderer = mujoco.Renderer(model, height=1080, width=1920)
+
+        # Update the scene and render
+        renderer.update_scene(data, camera="grid_camera")
+        img = renderer.render()
+
+        if img is not None:
+            # Convert to PIL Image and save
+            image = Image.fromarray(img)
+            image.save(filename)
+            print(f"Camera view saved to: {filename}")
+        else:
+            print("Warning: Could not capture camera view")
+
+        # Clean up renderer (if close method exists)
+        if hasattr(renderer, 'close'):
+            renderer.close()
+
+    except Exception as e:
+        print(f"Error saving image: {e}")
+
+
+def randomize_cube_position(model, data, grid_size=8):
+    """
+    Randomly position the cube in one of the grid cells.
+
+    Args:
+        model: MuJoCo model
+        data: MuJoCo data
+        grid_size: Number of cells per side (default 8, i.e. an 8x8 grid)
+    """
+    # Draw random cell indices in [0, grid_size) for both x and y;
+    # randint's upper bound is exclusive, so every cell can be selected
+    x_cell = np.random.randint(0, grid_size)
+    y_cell = np.random.randint(0, grid_size)
+
+    # Convert cell indices to center positions (offset by 0.5 from grid lines)
+    # X: left(0) = -3.5, right(7) = 3.5
+    x_pos = (x_cell - grid_size // 2) + 0.5
+    # Y: top(0) = 3.5, bottom(7) = -3.5 (flipped coordinate system)
+    y_pos = (grid_size // 2 - y_cell) - 0.5
+
+    print(f"Spawning cube at grid cell ({x_cell}, {y_cell}) -> position ({x_pos}, {y_pos})")
+
+    # Set the cube pose and velocity. A free joint has 7 qpos entries
+    # (3 position + 4 quaternion) and 6 qvel entries (3 linear + 3 angular).
+    cube_joint_id = mujoco.mj_name2id(model, mujoco.mjtObj.mjOBJ_JOINT, "cube_joint")
+    qpos_adr = model.jnt_qposadr[cube_joint_id]
+    # Position (x, y, z), then the identity quaternion (w, x, y, z) = (1, 0, 0, 0)
+    data.qpos[qpos_adr:qpos_adr + 7] = [x_pos, y_pos, 0.5, 1, 0, 0, 0]
+    # Reset linear and angular velocity to zero
+    data.qvel[model.jnt_dofadr[cube_joint_id]:model.jnt_dofadr[cube_joint_id] + 6] = 0
+
+    return x_pos, y_pos
+
+
+def run_grid_viewer(xml_path, randomize_interval=3.0, auto_save=True):
+    """
+    Run the grid viewer with random cube positioning.
+
+    Args:
+        xml_path: Path to the XML scene file
+        randomize_interval: How often to randomize cube position (seconds)
+        auto_save: Whether to automatically save camera view after each repositioning
+    """
+    print(f"Loading scene: {xml_path}")
+    model = mujoco.MjModel.from_xml_path(xml_path)
+    data = mujoco.MjData(model)
+
+    print("\n" + "="*50)
+    print("8x8 Grid Cube Randomizer")
+    print("="*50)
+    print("This scene shows an 8x8 grid with a randomly positioned cube.")
+    print(f"Cube position randomizes every {randomize_interval} seconds.")
+    print()
+    print("Controls:")
+    print("  R: Manually randomize cube position")
+    print("  S: Save current camera view to img.jpg")
+    print("  Space: Pause/unpause")
+    print("  Esc: Exit")
+    print("  Camera: Mouse controls for rotation/zoom")
+    print("="*50)
+
+    last_randomize_time = 0
+
+    with mujoco.viewer.launch_passive(model, data) as viewer:
+        # Initial randomization
+        x, y = randomize_cube_position(model, data)
+        mujoco.mj_forward(model, data)
+
+        while viewer.is_running():
+            current_time = time.time()
+
+            # Auto-randomize every few seconds
+            if current_time - last_randomize_time > randomize_interval:
+                x, y = randomize_cube_position(model, data)
+                mujoco.mj_forward(model, data)
+                # Force viewer to update the scene
+                viewer.sync()
+                # Save the current camera view if auto_save is enabled
+                if auto_save:
+                    save_camera_view(model, data, "img.jpg")
+                last_randomize_time = current_time
+
+            # Small delay to prevent excessive CPU usage
+            time.sleep(0.01)
+
+        print("\nViewer closed.")
+
+
+def main():
+    parser = argparse.ArgumentParser(description="8x8 Grid Cube Randomizer")
+    parser.add_argument("--xml", type=str, default="grid_scene.xml",
+                       help="Path to XML scene file")
+    parser.add_argument("--interval", type=float, default=3.0,
+                       help="Randomization interval in seconds")
+    parser.add_argument("--no-save", action="store_true",
+                       help="Disable automatic saving of camera views")
+
+    args = parser.parse_args()
+
+    try:
+        run_grid_viewer(args.xml, args.interval, not args.no_save)
+    except FileNotFoundError:
+        print(f"Error: Could not find XML file '{args.xml}'")
+        print("Make sure the XML file exists in the current directory.")
+    except Exception as e:
+        print(f"Error: {e}")
+
+
+if __name__ == "__main__":
+    main()
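
Review note on the free-joint state written in `randomize_cube_position`: a MuJoCo free joint stores 7 values in `qpos` (position plus a unit quaternion in w, x, y, z order) and 6 values in `qvel` (linear plus angular velocity). The pure-Python sketch below only illustrates that layout; the helper names are hypothetical, and the default pose of the cube is assumed to be the identity rotation.

```python
import math

FREE_JOINT_NQ = 7  # qpos entries: x, y, z, qw, qx, qy, qz
FREE_JOINT_NV = 6  # qvel entries: vx, vy, vz, wx, wy, wz


def free_joint_rest_state(x: float, y: float, z: float):
    """Build qpos/qvel for a free body at (x, y, z), unrotated and at rest."""
    qpos = [x, y, z, 1.0, 0.0, 0.0, 0.0]  # identity quaternion: w = 1
    qvel = [0.0] * FREE_JOINT_NV
    return qpos, qvel


def quat_norm(qpos) -> float:
    """Norm of the quaternion part of a free-joint qpos slice."""
    return math.sqrt(sum(c * c for c in qpos[3:7]))


qpos, qvel = free_joint_rest_state(-0.5, -1.5, 0.5)
print(len(qpos), len(qvel), quat_norm(qpos))

# Writing only six values (position plus three zeros) into an identity pose
# overwrites w and leaves an all-zero quaternion, which is not a valid rotation.
short_write = [0.0, 0.0, 0.0, 1.0, 0.0, 0.0, 0.0]  # assumed default pose
short_write[0:6] = [-0.5, -1.5, 0.5, 0.0, 0.0, 0.0]
print(quat_norm(short_write))
```

Writing all seven entries, with `1` in the w slot, keeps the rotation well defined regardless of how the engine treats a degenerate quaternion.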