
Imitation Learning Data Collection

Purpose: Bookmark operator demonstrations for extracting training data from robot logs.


Design Philosophy

The robot owns all ground-truth data. OpenC2 is a lightweight "session bookmarking" tool:

| Concern | Owner | Notes |
|---|---|---|
| Sensor data (camera, LiDAR, IMU) | Robot | High-frequency, hardware-timestamped |
| Motor commands / actions | Robot | Ground-truth executed commands |
| Observation-action pairs | Robot | Self-consistent with robot clock |
| Session time windows | OpenC2 | Coarse markers for log slicing |
| Mission context (drawn areas) | OpenC2 | Spatial context for demonstrations |
| Operator notes & labels | OpenC2 | Quality tags for dataset curation |

Why this split? UI timestamps don't need sub-second precision. The goal is to identify "look at robot logs from ~12:34 to ~12:39" — then trim junk frames during post-processing.


Architecture

┌─────────────────────────────────────────────────────────────┐
│  HeaderBar                    [⏺ RECORDING 00:03:42]       │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│                    Map + Drawing Tools                      │
│               (vehicle path drawn in real-time)             │
│                                                             │
├─────────────────────────────────────────────────────────────┤
│  CommandPanel   [⏹ Stop] [Cancel]  │  Notes: ____________   │
└─────────────────────────────────────────────────────────────┘

State Management

New state in appStore:

interface RobotConfig {
  expectedTopics: string[];        // Topics that should be in rosbag
  perceptionStack?: {
    detector?: string;             // e.g., "yolov8n"
    detectorVersion?: string;      // e.g., "2026-01-15"
  };
  notes?: string;
}

interface ValidationCheck {
  name: string;
  passed: boolean;
  message?: string;
}

interface ValidationResult {
  status: 'pending' | 'valid' | 'warning' | 'invalid';
  checkedAt: string | null;
  checks: ValidationCheck[];
}

interface RecordingSession {
  id: string;                      // UUID for this session
  vehicleId: string;               // Which vehicle is being recorded
  operatorId?: string;             // Optional: for multi-operator datasets
  startTime: number;               // Unix timestamp (ms) — wall clock
  startTimeLocal: string;          // ISO string for display/logs
  notes: string;                   // Operator notes (editable during/after)
  tags: string[];                  // Categorization tags
  robotConfig?: RobotConfig;       // Provenance: what robot was running
}

recordingSession: RecordingSession | null;

New actions:

startRecording(vehicleId: string): void
stopRecording(outcome: 'success' | 'failure' | 'partial'): void
cancelRecording(): void                    // Discard without saving
updateRecordingNotes(notes: string): void  // Update notes mid-session
updateRecordingTags(tags: string[]): void  // Update tags mid-session

Multi-client: The gateway enforces one recording per vehicle. A second start request is rejected, and all connected UIs see the current recording state.
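The one-recording-per-vehicle rule can be sketched as a small in-memory registry on the gateway. This is a hedged sketch — `RecordingRegistry` and its methods are hypothetical names, not the actual gateway API:

```python
import uuid


class RecordingRegistry:
    """Gateway-side guard: at most one active recording per vehicle."""

    def __init__(self):
        self._active = {}  # vehicleId -> active session id

    def start(self, vehicle_id: str) -> str:
        # Reject a second start for a vehicle that is already recording
        if vehicle_id in self._active:
            raise RuntimeError(f"{vehicle_id} is already being recorded")
        session_id = str(uuid.uuid4())
        self._active[vehicle_id] = session_id
        return session_id

    def stop(self, vehicle_id: str) -> None:
        # Idempotent: stopping a non-recording vehicle is a no-op
        self._active.pop(vehicle_id, None)
```

In the real gateway this state would also be broadcast to every connected UI so all clients converge on the same recording status.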


Workflow

Recording Flow

  1. Launch OpenC2, connect vehicle
  2. (Optional) Draw mission area on map
  3. Click Start Recording
  4. Demonstrate behavior with controller
  5. Click Stop Recording
  6. Add notes: "clean run" / "collision at end" / "good obstacle avoidance"
  7. Session saved to ~/OpenC2/recordings/{date}_{time}/

UI States

Idle (ready to record):

┌────────────────────────────────────────────────────────────┐
│  🟢 ROVER-01 Connected   Battery: 87%   Signal: Good      │
├────────────────────────────────────────────────────────────┤
│                                                            │
│   [Draw Area]  [Clear]              Live map view          │
│                                                            │
├────────────────────────────────────────────────────────────┤
│  [ 🔴 START RECORDING ]                                    │
└────────────────────────────────────────────────────────────┘

Recording active:

┌────────────────────────────────────────────────────────────┐
│  ⏺ RECORDING   00:03:42   │  ROVER-01   🎮 Manual        │
├────────────────────────────────────────────────────────────┤
│                                                            │
│   Vehicle path drawing on map in real-time                 │
│                                                            │
├────────────────────────────────────────────────────────────┤
│  [ ⏹ STOP ]  [ ⚠️ CANCEL ]   Notes: [________________]    │
└────────────────────────────────────────────────────────────┘

Post-recording (before dismiss):

┌────────────────────────────────────────────────────────────┐
│  ✅ Recording Complete                                     │
│                                                            │
│  Duration: 3:42         Vehicle: ROVER-01                  │
│  Time window: 14:30:22 → 14:34:04                          │
│                                                            │
│  Outcome:  ● Success  ○ Partial  ○ Failure                 │
│  Notes:    [Good obstacle avoidance demo___________]       │
│  Tags:     [+outdoor] [+obstacles] [+add tag...]           │
│                                                            │
│  Saved to: ~/OpenC2/recordings/2026-03-15_143022/          │
│                                                            │
│  [ DONE ]                                                  │
└────────────────────────────────────────────────────────────┘

Data Output

Directory Structure

~/OpenC2/recordings/
├── 2026-03-15_143022/
│   ├── session.json        # Primary session metadata
│   └── mission.geojson     # Drawn areas (if any)
├── 2026-03-15_151847/
│   └── ...
└── manifest.json           # Index of all sessions (for tooling)

session.json Schema

{
  "version": "1.1",
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "vehicleId": "rover-01",
  "vehicleName": "Clearpath Husky A200",
  "operatorId": "ethan",
  
  "timeWindow": {
    "start": "2026-03-15T14:30:22.000Z",
    "end": "2026-03-15T14:34:04.000Z",
    "durationSec": 222
  },
  
  "outcome": "success",
  "notes": "Good obstacle avoidance demo, clean run",
  "tags": ["outdoor", "obstacles", "sunny"],
  
  "missionContext": {
    "hasDrawnAreas": true,
    "featureCount": 2
  },
  
  "robotConfig": {
    "expectedTopics": ["/cmd_vel", "/odom", "/camera/image_raw", "/scan"],
    "perceptionStack": {
      "detector": "yolov8n",
      "detectorVersion": "2026-01-15"
    },
    "notes": "Standard outdoor config"
  },
  
  "validation": {
    "status": "pending",
    "checkedAt": null,
    "checks": []
  },
  
  "annotations": [],
  
  "meta": {
    "createdAt": "2026-03-15T14:34:10.000Z",
    "appVersion": "0.4.0"
  }
}

Field Reference

| Field | Purpose | When Set |
|---|---|---|
| robotConfig.expectedTopics | Topics that should be in the rosbag | At recording start (from vehicle config) |
| robotConfig.perceptionStack | Detector versions for reproducibility | At recording start |
| validation.status | pending → valid / warning / invalid | After extraction & QA |
| validation.checks | Array of { name, passed, message? } | After running validation |
| annotations | Post-hoc labels (see below) | During dataset curation |

Annotations Schema

Annotations are added during post-collection curation, not during recording:

{
  "annotations": [
    {
      "id": "ann-001",
      "type": "event",
      "label": "pedestrian_encounter",
      "timestampSec": 45.2,
      "annotator": "ethan",
      "createdAt": "2026-03-16T10:00:00.000Z"
    },
    {
      "id": "ann-002",
      "type": "segment",
      "label": "recovery_maneuver",
      "startSec": 120.0,
      "endSec": 128.5,
      "annotator": "ethan",
      "createdAt": "2026-03-16T10:05:00.000Z"
    }
  ]
}

Annotation types:

  • event: Point-in-time occurrence (person detected, collision, etc.)
  • segment: Time range (good demo segment, recovery, failure mode)
  • quality: Overall session quality note ("noisy IMU", "sun glare")
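The structural rules implied by these three types can be checked mechanically. A sketch, assuming the field names from the annotations schema above (`validate_annotation` itself is hypothetical tooling, not part of the CLI suite):

```python
def validate_annotation(ann: dict) -> list[str]:
    """Return a list of problems with one annotation (empty list = valid)."""
    errors = []
    if ann.get("type") not in ("event", "segment", "quality"):
        errors.append(f"unknown type: {ann.get('type')!r}")
    # An event is a point in time
    if ann.get("type") == "event" and "timestampSec" not in ann:
        errors.append("event annotation needs timestampSec")
    # A segment is a time range with positive duration
    if ann.get("type") == "segment":
        if "startSec" not in ann or "endSec" not in ann:
            errors.append("segment annotation needs startSec and endSec")
        elif ann["endSec"] <= ann["startSec"]:
            errors.append("segment endSec must be after startSec")
    return errors
```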

manifest.json (Root Index)

{
  "version": "1.0",
  "sessions": [
    {
      "id": "550e8400-e29b-41d4-a716-446655440000",
      "path": "2026-03-15_143022",
      "vehicleId": "rover-01",
      "outcome": "success",
      "durationSec": 222,
      "tags": ["outdoor", "obstacles"]
    }
  ],
  "stats": {
    "totalSessions": 47,
    "totalDurationMin": 312,
    "byOutcome": { "success": 38, "partial": 7, "failure": 2 }
  }
}
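Since manifest.json is derivable from the per-session files, it can be rebuilt at any time. A sketch assuming the directory layout above (the function name and the stats rounding are illustrative):

```python
import json
from collections import Counter
from pathlib import Path


def build_manifest(recordings_dir: str) -> dict:
    """Rebuild the manifest dict from session.json files on disk."""
    sessions, outcomes, total_sec = [], Counter(), 0
    for session_file in sorted(Path(recordings_dir).glob("*/session.json")):
        s = json.loads(session_file.read_text())
        sessions.append({
            "id": s["id"],
            "path": session_file.parent.name,
            "vehicleId": s["vehicleId"],
            "outcome": s.get("outcome", "unknown"),
            "durationSec": s["timeWindow"]["durationSec"],
            "tags": s.get("tags", []),
        })
        outcomes[s.get("outcome", "unknown")] += 1
        total_sec += s["timeWindow"]["durationSec"]
    return {
        "version": "1.0",
        "sessions": sessions,
        "stats": {
            "totalSessions": len(sessions),
            "totalDurationMin": round(total_sec / 60),
            "byOutcome": dict(outcomes),
        },
    }
```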

Clock Alignment (Practical Reality)

Assume all clocks are different. A typical setup involves several independent clocks:

| Clock | Example | Typical Drift |
|---|---|---|
| UI/Operator | OpenC2 on laptop | NTP-synced, ~50ms from UTC |
| Robot main computer | Jetson/Pi running ROS2 | May have NTP, may not |
| Autopilot | PX4, ArduPilot, motor controller | Often no NTP, drifts seconds/hour |
| Sensors | LiDAR, camera | Hardware timestamps, own epoch |

The Solution: Robot-Internal Consistency

Not all clocks need to agree. What's required is:

  1. Robot logs are internally consistent — ROS2 timestamps everything with its clock
  2. Autopilot commands are logged by ROS2 — bridged via MAVROS/px4_ros_com with ROS2 timestamps
  3. UI provides approximate wall-clock window — for humans to find the right log segment
┌─────────────────────────────────────────────────────────────────────────┐
│  UI says: "Demo ran 14:30 → 14:35 (laptop clock)"                       │
│                                                                         │
│  Robot bag (ROS2 clock):                                                │
│  ├── /cmd_vel           ← operator joystick commands (ROS2 time)        │
│  ├── /mavros/setpoint   ← what we sent to autopilot (ROS2 time)         │
│  ├── /mavros/state      ← autopilot feedback, bridged (ROS2 time)       │
│  ├── /odom              ← state estimation (ROS2 time)                  │
│  └── /camera/image_raw  ← images with header.stamp (ROS2 time)          │
│                                                                         │
│  All aligned to ROS2 clock → self-consistent for training               │
└─────────────────────────────────────────────────────────────────────────┘

What Gets Logged Where

| Data | Logged By | Clock | Notes |
|---|---|---|---|
| Joystick → cmd_vel | ROS2 joy node | ROS2 | Ground-truth operator intent |
| cmd_vel → autopilot | MAVROS bridge | ROS2 | Stamped when sent |
| Autopilot state/feedback | MAVROS bridge | ROS2 | Re-stamped on receipt |
| Odometry | Robot state estimator | ROS2 | Fused from sensors |
| Camera images | Camera driver | ROS2 header.stamp | Use hardware sync if available |

Key insight: As long as ROS2 is the logger, everything gets a consistent timestamp. The autopilot's internal clock doesn't matter — we log what we sent to it and what it reported back, both stamped by ROS2.

Finding the Right Log Segment

Since UI clock ≠ robot clock, we use a two-step approach:

Step 1: Marker-based extraction

OpenC2 sends "recording started" / "recording stopped" messages over the gateway. Robot logs these as a ROS2 topic with robot-clock timestamps:

/openc2/session_marker
  - timestamp: (ROS2 clock)
  - event: "start" | "stop"
  - session_id: "550e8400-..."

Extraction:

# Find markers in bag
ros2 bag filter robot_log.mcap --topic /openc2/session_marker | grep "550e8400"
# → start: 1710513022.000, stop: 1710513264.000

# Extract segment
ros2 bag filter robot_log.mcap -o demo.mcap \
  --start 1710513022 --end 1710513264

Step 2: Visual verification

Always sanity-check the extracted segment:

# Play back and confirm it's the right demo
ros2 bag play demo.mcap

# Or quick stats check
ros2 bag info demo.mcap
# Verify: duration matches expected, topics present, no gaps

Why both? Markers can fail (gateway hiccup, missed message, duplicate session IDs from testing). A 30-second visual spot-check catches these before wasting hours training on garbage.

Fallback when markers are missing:

If markers didn't make it into the bag (gateway down, topic not subscribed, etc.), use the UI timestamps as a rough guide:

# UI says demo was 14:30-14:35 (laptop time)
# Robot clock might be off by minutes — extract a wide window

ros2 bag info robot_log.mcap
# → Start: 2026-03-15T14:28:00  End: 2026-03-15T16:45:00

# Extract generous window
ros2 bag filter robot_log.mcap -o candidate.mcap \
  --start 2026-03-15T14:25:00 --end 2026-03-15T14:40:00

# Visually find the actual demo, note the real timestamps, re-extract tight
ros2 bag play candidate.mcap

Handling Startup/Shutdown Lag

Reality: Operator clicks "Start" → takes 2-5 seconds to grab controller and begin demonstrating. The robot may already be in flight or moving autonomously.

Timeline (robot already moving):
  |---autonomous mode---|---operator demonstration---|---back to auto---|
  ^                      ^                           ^                  ^
                         UI start                    UI stop
                         marker                      marker
                         (operator takes over)       (operator releases)

The markers ARE the ground truth. Unlike stationary-start scenarios, trimming based on velocity isn't possible. The demonstration is everything between start/stop markers.

What the markers actually mean:

  • start: Operator has taken manual control (even if still orienting)
  • stop: Operator releases control back to autonomy (or lands/stops)

Include the "fumble time" — the first few seconds where operator is getting oriented is still valid training data. It shows recovery, stabilization, and intent formation. Don't over-trim.

/openc2/session_marker
  - timestamp: (ROS2 clock)
  - event: "start" | "stop"  
  - session_id: "550e8400-..."
  - control_mode: "manual"   # What mode the robot should be in

Post-processing considerations:

# For aerial/continuous motion: use markers directly, minimal trimming
def extract_demonstration(bag, session_id):
    # get_markers / extract_segment are pipeline helpers, not a library API
    markers = get_markers(bag, session_id)
    start_time = markers['start']
    end_time = markers['stop']

    # Optional: add a small 0.5s buffer for context.
    # Don't trim based on velocity — the operator was demonstrating the whole time.
    return extract_segment(bag, start_time - 0.5, end_time + 0.5)

For ground vehicles starting stationary: Idle frames at the very start can optionally be trimmed, but be conservative. A 2-second pause while the operator grabs the controller is fine to include.

For aerial vehicles / continuous operation: Don't trim at all. The entire marker-bounded segment is the demonstration, including any stabilization or reorientation at the start.

session.json with Clock Info

{
  "timeWindow": {
    "start": "2026-03-15T14:30:22.000Z",
    "end": "2026-03-15T14:34:04.000Z",
    "durationSec": 222,
    "clockSource": "operator_laptop",
    "note": "Robot clock may differ — use session markers or visual verification"
  }
}

Robot-Side Requirements

OpenC2 provides time windows. The robot must log everything needed for training.

Required Robot Logging

| Data | Frequency | Format | Notes |
|---|---|---|---|
| Odometry | 50-100 Hz | ROS2 nav_msgs/Odometry | Position, velocity, orientation |
| Commands | 50-100 Hz | ROS2 geometry_msgs/Twist | What the operator commanded |
| Camera | 10-30 Hz | JPEG/H.264 or raw | RGB, stereo, or RGBD |
| LiDAR | 10-20 Hz | ROS2 sensor_msgs/PointCloud2 | If available |
| IMU | 100-200 Hz | ROS2 sensor_msgs/Imu | For state estimation |
| TF tree | 50 Hz | ROS2 tf2_msgs/TFMessage | All transforms |

Recommended: ROS2 Bag Recording

# On robot, always-on logging to circular buffer:
ros2 bag record -a --max-cache-size 500000000 --storage mcap

# Or selective topics:
ros2 bag record /cmd_vel /odom /camera/image_raw /scan /tf /tf_static

Log Extraction Workflow

Ideal case: Session JSON is on the same machine as the robot bag.

# 1. Read session time window
cat ~/OpenC2/recordings/2026-03-15_143022/session.json | jq '.timeWindow'
# → { "start": "2026-03-15T14:30:22.000Z", "end": "2026-03-15T14:34:04.000Z" }

# 2. Extract relevant portion from robot bag (add ±30s buffer)
ros2 bag filter input.mcap -o demo_001.mcap \
  --start 2026-03-15T14:30:00 \
  --end 2026-03-15T14:34:30

# 3. Convert to training format (custom pipeline)
python extract_trajectories.py demo_001.mcap --output demo_001/

Reality: Most teams DON'T have centralized log storage. Session metadata is on one laptop, robot logs are on an SD card, and eventually things get uploaded to a shared drive with folder names like "march_outdoor_tests". See Decentralized Collection Workflow for practical patterns.


Dataset Organization

Recommended Structure for ML Training

datasets/
├── husky_outdoor_v1/
│   ├── dataset.json          # Dataset manifest
│   ├── train/
│   │   ├── traj_001/
│   │   │   ├── observations/  # Images, point clouds
│   │   │   ├── actions.npy    # Command sequence
│   │   │   └── states.npy     # Odometry sequence
│   │   └── traj_002/
│   ├── val/
│   └── test/
└── husky_outdoor_v2/

Splitting Guidelines

| Rule | Why |
|---|---|
| Split by session, not frame | Prevents temporal leakage |
| Keep operator-specific sessions together | Prevents style leakage |
| Stratify by tags | Ensures environment diversity in all splits |
| 70/15/15 or 80/10/10 | Standard train/val/test ratios |
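The first rule — split by session, never by frame — can be sketched as a seeded shuffle over session IDs. Stratification by tags and operator is omitted for brevity; `split_sessions` is illustrative, not part of the CLI tooling:

```python
import random


def split_sessions(session_ids: list[str], seed: int = 0,
                   ratios=(0.7, 0.15, 0.15)) -> dict:
    """Assign whole sessions to train/val/test splits, reproducibly."""
    ids = sorted(session_ids)          # deterministic base order
    random.Random(seed).shuffle(ids)   # seeded shuffle for reproducibility
    n_train = int(len(ids) * ratios[0])
    n_val = int(len(ids) * ratios[1])
    return {
        "train": ids[:n_train],
        "val": ids[n_train:n_train + n_val],
        "test": ids[n_train + n_val:],
    }
```

Because every frame of a session lands in the same split, temporally adjacent (and therefore nearly identical) frames can never leak between train and test.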

Dataset Versioning

Keep datasets immutable and versioned. Simple convention:

datasets/
├── husky_outdoor_v1/           # Never modify after "release"
│   ├── dataset.json
│   ├── CHANGELOG.md            # What's in this version
│   └── train/val/test/
├── husky_outdoor_v2/           # New version = new folder
│   ├── dataset.json
│   ├── CHANGELOG.md
│   └── train/val/test/
└── RELEASES.md                 # Index of all versions

Versioning rules:

  • Bump version when adding/removing sessions or changing splits
  • Never modify a released dataset — create a new version
  • Include sourceSessionIds in dataset.json for traceability

dataset.json example:

{
  "name": "husky_outdoor",
  "version": "2",
  "createdAt": "2026-03-16T12:00:00.000Z",
  "description": "Added 15 new obstacle demos, fixed IMU alignment",
  "sourceSessionIds": [
    "550e8400-e29b-41d4-a716-446655440000",
    "661f9500-f39c-52e5-b827-557766551111"
  ],
  "splits": {
    "train": 180,
    "val": 38,
    "test": 29
  },
  "parentVersion": "1"
}

Quality Assurance

During Collection

  • Use outcome field honestly — mark failures/partials
  • Add descriptive notes — this helps during later review
  • Use consistent tags — create a tag vocabulary

Post-Collection Checklist

These checks populate the validation.checks array in session.json:

| Check | Tool | Action if Failed |
|---|---|---|
| Session has matching robot logs | ros2 bag info | Re-sync clocks, retry |
| Expected topics present | Compare to robotConfig.expectedTopics | Check recording config |
| Trajectory is non-trivial (moved > 1m) | Odom delta | Discard or re-record |
| No long pauses (> 5s stationary) | Velocity analysis | Trim or split session |
| Commands match motion | Odom vs cmd_vel | Check hardware, re-record |
| Images are not corrupt | Frame decode test | Check camera driver |
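Two of these checks (non-trivial motion, long pauses) can be computed directly from logged odometry. A sketch that emits entries in the same shape as validation.checks — thresholds come from the table above, but the helper itself is hypothetical:

```python
def check_motion(odom, min_distance_m=1.0, max_pause_s=5.0):
    """odom: list of (t_sec, x, y) samples. Returns validation.checks entries."""
    # Total path length from successive position deltas
    dist = sum(((x2 - x1) ** 2 + (y2 - y1) ** 2) ** 0.5
               for (_, x1, y1), (_, x2, y2) in zip(odom, odom[1:]))
    checks = [{"name": "non_trivial_motion", "passed": dist >= min_distance_m}]

    # Longest stretch with (almost) no displacement counts as a pause
    pause, longest = 0.0, 0.0
    for (t1, x1, y1), (t2, x2, y2) in zip(odom, odom[1:]):
        step = ((x2 - x1) ** 2 + (y2 - y1) ** 2) ** 0.5
        pause = pause + (t2 - t1) if step < 0.01 else 0.0
        longest = max(longest, pause)
    check = {"name": "no_long_pauses", "passed": longest <= max_pause_s}
    if not check["passed"]:
        check["message"] = f"{longest:.1f}s pause detected"
    checks.append(check)
    return checks
```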

Validation result example:

{
  "validation": {
    "status": "warning",
    "checkedAt": "2026-03-16T15:00:00.000Z",
    "checks": [
      { "name": "has_matching_logs", "passed": true },
      { "name": "expected_topics", "passed": true },
      { "name": "non_trivial_motion", "passed": true },
      { "name": "no_long_pauses", "passed": false, "message": "6.2s pause at t=45s" },
      { "name": "commands_match_motion", "passed": true },
      { "name": "images_valid", "passed": true }
    ]
  }
}

Dataset Statistics to Track

{
  "totalTrajectories": 247,
  "totalFrames": 148203,
  "totalDistanceKm": 12.4,
  "avgTrajectoryLengthSec": 142,
  "environmentBreakdown": {
    "outdoor": 180,
    "indoor": 67
  },
  "outcomeBreakdown": {
    "success": 220,
    "partial": 22,
    "failure": 5
  }
}

Log Organization & Storage

See COMMS_PIPELINE.md for the full logging architecture.

Summary:

  • Network logs (always on): OpenC2 logs all gateway traffic to ~/OpenC2/logs/
  • Robot logs (ground truth): Rosbags on the robot, pulled via SSH by OpenC2
  • Sessions: Organized in ~/OpenC2/sessions/{vehicle}/{session_id}/

Imitation Learning Extensions

For ML training, sessions need a few extra fields in session.json:

| Field | Purpose |
|---|---|
| robotConfig.expectedTopics | Verify rosbag has required topics |
| robotConfig.perceptionStack | Track model versions for reproducibility |
| validation.status | pending → valid / warning / invalid after QA |

{
  "robotConfig": {
    "expectedTopics": ["/cmd_vel", "/odom", "/camera/image_raw"],
    "perceptionStack": {
      "detector": "yolov8n",
      "detectorVersion": "2026-01-15"
    }
  },
  "validation": {
    "status": "pending",
    "checks": []
  }
}

Post-Collection Validation

Run these checks before using sessions for training:

| Check | How |
|---|---|
| Robot log exists | Pairing succeeded |
| Expected topics present | Compare to robotConfig.expectedTopics |
| Non-trivial motion | Robot moved > 1m |
| Commands match motion | cmd_vel correlates with odom |

# Validate sessions in bulk
find sessions/ -name session.json | while read f; do
  openc2-validate "$f" || echo "FAILED: $f"
done

Decentralized Collection Workflow

Reality: Controlling how distributed teams collect data isn't realistic.

This system is designed around a fundamental constraint: operators in the field have autonomy. Session metadata ends up on one laptop, robot logs on an SD card, and everything eventually lands in a shared folder named something like march_outdoor_tests_v2_FINAL. Mandating infrastructure isn't practical. Enforcing process isn't practical. Getting people to write notes is difficult enough.

Design principles:

  1. Self-contained bundles over databases. Sessions export as ZIP files with everything needed to understand them. No dependency on a running server, shared database, or network connectivity. An operator can collect data in a desert and email the bundle later.

  2. UUIDs for identity, not filenames. People will rename files, duplicate folders, and create chaos. Session IDs are UUIDs baked into the metadata. Deduplication works even when demo_final_v3_USE_THIS.zip appears in five places.

  3. Approximate timestamps are fine. The UI clock doesn't need to match the robot clock. Session markers provide robot-clock alignment when available; manual verification handles the rest. Don't let clock sync become a prerequisite for collection.

  4. Aggregation happens later. There's no central coordinator during collection. Operators dump bundles to whatever shared storage exists. CLI tools (openc2-aggregate) build the unified manifest at dataset-creation time, not during capture.

  5. Fail open, not closed. Missing fields, absent markers, orphaned logs — the system tolerates all of it. Validation flags problems for humans to triage; it doesn't reject data that might be needed later.
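Principle 2 in code: deduplication keys on the session UUID, never on the filename. A minimal sketch of what openc2-aggregate might do once it has parsed bundle metadata into entry dicts (the function is illustrative; first occurrence wins):

```python
def dedupe_sessions(manifest_entries: list[dict]) -> list[dict]:
    """Drop duplicate session entries by UUID, preserving discovery order.

    Entries are dicts with at least an "id" key (the session UUID).
    Renamed or re-uploaded copies of the same bundle collapse to one entry.
    """
    seen, unique = set(), []
    for entry in manifest_entries:
        if entry["id"] not in seen:
            seen.add(entry["id"])
            unique.append(entry)
    return unique
```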

The workflow:

                              ┌─────────────────────────────────┐
                              │     Field collection            │
                              │     (operator autonomy)         │
                              └───────────────┬─────────────────┘
                                              │
                    ┌─────────────────────────┴─────────────────────────┐
                    ▼                                                   ▼
        ┌───────────────────────┐                         ┌───────────────────────┐
        │  Pull logs via SSH    │                         │  Manual SD card pull  │
        │  (while connected)    │                         │  (offline / later)    │
        └───────────┬───────────┘                         └───────────┬───────────┘
                    │                                                 │
                    ▼                                                 ▼
        ┌───────────────────────┐                         ┌───────────────────────┐
        │  Bundle with rosbag   │                         │  Session bundle only  │
        │  (complete)           │                         │  (needs pairing)      │
        └───────────┬───────────┘                         └───────────┬───────────┘
                    │                                                 │
                    └─────────────────────────┬───────────────────────┘
                                              ▼
                              ┌─────────────────────────────────┐
                              │  Upload to shared drive         │
                              │  (whatever exists)              │
                              └───────────────┬─────────────────┘
                                              ▼
                              ┌─────────────────────────────────┐
                              │  openc2-aggregate               │
                              │  (builds manifest, dedupes)     │
                              └───────────────┬─────────────────┘
                                              ▼
                              ┌─────────────────────────────────┐
                              │  openc2-pair (if needed)        │
                              │  (matches orphan sessions/bags) │
                              └───────────────┬─────────────────┘
                                              ▼
                              ┌─────────────────────────────────┐
                              │  Curate                         │
                              │  (human judgment)               │
                              └─────────────────────────────────┘

Preferred path: Pull logs via SSH while still connected to the robot. The bundle includes robot_log.mcap and pairing is automatic. If logs weren't pulled at that time, the SD card was grabbed later, or the system was offline — openc2-pair matches orphaned sessions with rosbags using session markers or timestamp approximation.

This isn't the "right" way to manage ML datasets. It's the realistic way when data collection is distributed across people who have real jobs and limited patience for tooling.


Implementation Checklist

Phase 1: MVP (~4-6 hours)

  • startRecording / stopRecording actions in appStore
  • Recording indicator in HeaderBar (red dot + timer)
  • Start/Stop buttons in CommandPanel
  • Save session.json to ~/OpenC2/recordings/{timestamp}/
  • Include mission.geojson if areas drawn

Phase 2: Quality Features (~6-8 hours)

  • Outcome selector (success/partial/failure) on stop
  • Notes field (editable during and after recording)
  • Tags input with autocomplete from previous tags
  • manifest.json index file with stats
  • Session browser panel (list past recordings)
  • robotConfig populated from vehicle settings (expectedTopics, perceptionStack)
  • validation.status field (default: pending)

Phase 3: Session Bundles (~6-8 hours)

  • Export bundle button in session browser
  • Bundle ZIP creation (session.json + mission.geojson + README)
  • Optional robot log attachment (file picker or path input)
  • Thumbnail generation (map screenshot)
  • Attachments support (arbitrary files)
  • collection metadata (operatorId, machine, timezone)
  • bundleIntegrity hash computation

Phase 4: CLI Tools (~8-10 hours)

  • openc2-export — Create bundle from session folder
  • openc2-pair — Match sessions with rosbags via markers
  • openc2-aggregate — Build manifest.json from bundle folder
  • openc2-extract — Extract training data from paired bundles
  • openc2-validate — Check bundle integrity and completeness

Phase 5: Polish (~4-6 hours)

  • Operator ID field (persisted in settings)
  • "Vehicle disconnected" handling (pause + prompt)
  • Keyboard shortcuts (R to start, Esc to cancel)
  • Help tooltips for new users

Phase 6: Curation Support (~4-6 hours)

  • Annotation editor (add event/segment labels to past sessions)
  • Validation runner (check extraction against expectedTopics)
  • Dataset export wizard (select sessions → create versioned dataset folder)
  • Session diff view (compare two recordings)

FAQ

Q: Do I need precise clock sync between UI and robot?
A: No. The robot logs everything with its own clock (ROS2). Session markers get robot-clock timestamps when received. UI timestamps are just a fallback.

Q: My autopilot clock is different from my ROS2 clock. Problem?
A: Not if logging via MAVROS/px4_ros_com. The bridge re-timestamps everything with ROS2 time. The model trains on "what ROS2 saw", which is self-consistent.

Q: How do I find the exact log segment?
A: Markers first, visual verification second. Extract using the /openc2/session_marker timestamps, then play back the segment to confirm it's correct. If markers are missing, use UI timestamps as a rough guide and find the demo visually.

Q: Can I record multiple vehicles at once?
A: Current design is single-vehicle. Multi-vehicle would need separate session files per vehicle.

Q: What if two operators try to record the same vehicle?
A: Gateway enforces one recording per vehicle. Second start request is rejected.

Q: How do I organize recordings for training?
A: Extract robot logs using the time windows, then use the appropriate ML pipeline format. See "Dataset Organization" above.

Q: We don't have centralized log infrastructure. How do we manage data?
A: Use the Decentralized Collection Workflow. Export self-contained session bundles from OpenC2, upload to a shared drive (Drive, NAS), then use CLI tools to aggregate and query. No central database required.

Q: Sessions and robot logs are on different machines. How do I pair them?
A: Use openc2-pair sessions/*.zip rosbags/. The tool looks for session markers in the rosbag to match IDs. If markers are missing, it falls back to timestamp approximation (manual verification required).

Q: What's the minimum workflow for a solo operator?
A: 1) Record session in OpenC2, 2) Copy rosbag from robot SD, 3) Run openc2-pair session.zip rosbag.mcap, 4) Use ros2 bag filter with the time window. No shared drive needed.

Q: How do we avoid duplicate sessions when uploading to shared drives?
A: Session IDs are UUIDs. openc2-aggregate deduplicates by ID, not filename. Name bundles consistently (e.g., {vehicle}_{timestamp}.zip) but the system tolerates duplicates.


Trainable Behaviors

What Gets Captured

┌─────────────────────────────────────────────────────────────────────────────┐
│                         During Demonstration                                │
│                                                                             │
│   Robot (ROS2 running continuously):                                        │
│   ├── Camera → YOLO/SegFormer → /detections         ← logged               │
│   ├── LiDAR → obstacle map → /costmap               ← logged               │
│   ├── State estimator → /odom                       ← logged               │
│   └── All perception outputs are live               ← logged               │
│                                                                             │
│   Operator (sees robot's perception):                                       │
│   ├── Watches video feed with detection overlays                            │
│   ├── Sees what the robot "sees"                                            │
│   └── Acts accordingly → /cmd_vel                   ← logged               │
│                                                                             │
│   Training pairs: (observations, actions) at each timestep                  │
└─────────────────────────────────────────────────────────────────────────────┘

ROS2 runs continuously. The robot's full perception stack is live. The rosbag captures everything — detections, costmaps, images, AND the operator's commands. Demonstrations are time windows where the operator is intentionally showing good behavior.


Tier 1: Direct Visuomotor (Easiest)

| Behavior | Observation | Action | Notes |
|---|---|---|---|
| Obstacle avoidance | RGB + depth/LiDAR | Twist (v, ω) | Classic, works well |
| Trail following | RGB camera | Twist | Needs consistent trails |
| Waypoint navigation | Odom + goal | Twist | Policy replaces path planner |
| Speed adaptation | RGB + terrain class | Velocity | Slow on rough, fast on smooth |

Example: Operator drives Husky through forest. Camera sees trees, policy learns "steer around dark vertical things."


Tier 2: Detection-Conditioned (Reactive)

| Behavior | Observation | Action | Notes |
|---|---|---|---|
| Stop for pedestrians | Detection boxes (person) | Twist | Operator stops when YOLO fires |
| Follow target | Detection (tracked object) | Twist | Operator keeps target centered |
| Avoid animals | Detection (dog, deer) | Twist + slow | Different reaction than static obstacles |
| Inspect anomalies | Anomaly detector output | Approach + pause | Operator moves closer, circles |

Key: Log detection outputs as part of the observation. The policy learns "when person_detected=True → velocity=0."


Tier 3: Task-Specific Policies (Complex)

| Behavior | Observation | Action | Notes |
|---|---|---|---|
| Perimeter patrol | Position + heading + time | Twist | Operator does laps, policy learns pattern |
| Search pattern | Coverage map + camera | Twist | Lawnmower, expanding spiral |
| Recovery from stuck | IMU (no motion) + cmd history | Recovery maneuver | Operator backs out, policy learns |
| Docking/approach | ArUco marker pose | Precise Twist | Final approach to station |

The Operator Display Rule

What the operator sees matters enormously. If the operator can see information X when deciding what to do, X must be in the observation space.

| Display Mode | What Operator Sees | Training Implication |
|---|---|---|
| Raw camera | RGB image only | End-to-end pixels → actions |
| Camera + detections | RGB with bounding boxes | Policy conditioned on (image, detections) |
| Full HUD | Camera + costmap + goal + battery | Must include ALL HUD info in observation |

Recommendation: Camera + detection overlays. Operator reacts to what robot "understands," and same detector runs at deployment.


Concrete Example: Emergency Stop for People

Setup:

# Robot runs continuously:
- /camera/image_raw → YOLOv8 → /detections (Person, confidence, bbox)
- /cmd_vel → motor controller
- rosbag records all topics

# Operator display:
- Video stream with detection boxes overlaid
- "PERSON DETECTED" warning when confidence > 0.7

What's logged:

t=10.0: image=<frame>, detections=[], cmd_vel=(0.5, 0.0)        # driving
t=10.5: image=<frame>, detections=[{person, 0.85}], cmd_vel=(0.5, 0.0)  # detection fires
t=10.6: image=<frame>, detections=[{person, 0.87}], cmd_vel=(0.0, 0.0)  # operator stopped
t=12.0: image=<frame>, detections=[], cmd_vel=(0.5, 0.0)        # person gone, resume

Policy learns: if person_detected → stop
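That mapping can be written down as a trivial rule-based baseline, useful for sanity-checking extracted (observation, action) pairs before training anything. A sketch — the 0.7 threshold mirrors the operator display warning above, and the names are illustrative:

```python
def estop_baseline(detections, cruise_cmd=(0.5, 0.0), conf_threshold=0.7):
    """Rule-of-thumb policy: stop when a confident person detection is present.

    detections: list of {"label": str, "confidence": float} dicts.
    Returns a (linear, angular) velocity command.
    """
    person_detected = any(
        d["label"] == "person" and d["confidence"] > conf_threshold
        for d in detections
    )
    return (0.0, 0.0) if person_detected else cruise_cmd
```

Comparing this baseline's output against the logged cmd_vel gives a cheap consistency check: large disagreement means either the extraction window is wrong or the operator wasn't actually demonstrating the stop-for-people behavior (note the operator's reaction lags the detection by a frame or two, as in the log above).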


What's Difficult to Train

| Behavior | Why It's Hard |
|---|---|
| Long-horizon planning | Demo is reactive; can't capture "I'm going there because of X 5 minutes ago" |
| Multi-agent coordination | Single-vehicle design; no swarm demonstrations |
| Rare edge cases | Would need to manufacture situations; hard to get enough data |
| Precision manipulation | Twist commands are coarse; need different action space |
| Anything requiring speech/gesture | Not capturing audio or operator body language |

Practical Recommendations

  1. Start with obstacle avoidance — highest ROI, works with ~50 demos
  2. Add detection conditioning early — log detector outputs from day 1, even if not using yet
  3. Keep operator display consistent — whatever they see during training, show during deployment test
  4. Log everything, filter later — disk is cheap, missing topics is painful

Related Documentation