
Imitation Learning Data Collection

Purpose: Bookmark operator demonstrations for extracting training data from robot logs.


Design Philosophy

The robot owns all ground-truth data. OpenC2 is a lightweight "session bookmarking" tool:

| Concern | Owner | Notes |
|---|---|---|
| Sensor data (camera, LiDAR, IMU) | Robot | High-frequency, hardware-timestamped |
| Motor commands / actions | Robot | Ground-truth executed commands |
| Observation-action pairs | Robot | Self-consistent with robot clock |
| Session time windows | OpenC2 | Coarse markers for log slicing |
| Mission context (drawn areas) | OpenC2 | Spatial context for demonstrations |
| Operator notes & labels | OpenC2 | Quality tags for dataset curation |

Why this split? UI timestamps don't need sub-second precision. The goal is to identify "look at robot logs from ~12:34 to ~12:39" — then trim junk frames during post-processing.


Architecture

┌─────────────────────────────────────────────────────────────┐
│  HeaderBar                    [⏺ RECORDING 00:03:42]       │
├─────────────────────────────────────────────────────────────┤
│                                                             │
│                    Map + Drawing Tools                      │
│               (vehicle path drawn in real-time)             │
│                                                             │
├─────────────────────────────────────────────────────────────┤
│  CommandPanel   [⏹ Stop] [Cancel]  │  Notes: ____________   │
└─────────────────────────────────────────────────────────────┘

State Management

New state in appStore:

interface RobotConfig {
  expectedTopics: string[];        // Topics that should be in rosbag
  perceptionStack?: {
    detector?: string;             // e.g., "yolov8n"
    detectorVersion?: string;      // e.g., "2026-01-15"
  };
  notes?: string;
}

interface ValidationCheck {
  name: string;
  passed: boolean;
  message?: string;
}

interface ValidationResult {
  status: 'pending' | 'valid' | 'warning' | 'invalid';
  checkedAt: string | null;
  checks: ValidationCheck[];
}

interface RecordingSession {
  id: string;                      // UUID for this session
  vehicleId: string;               // Which vehicle is being recorded
  operatorId?: string;             // Optional: for multi-operator datasets
  startTime: number;               // Unix timestamp (ms) — wall clock
  startTimeLocal: string;          // ISO string for display/logs
  notes: string;                   // Operator notes (editable during/after)
  tags: string[];                  // Categorization tags
  robotConfig?: RobotConfig;       // Provenance: what robot was running
}

recordingSession: RecordingSession | null;

New actions:

startRecording(vehicleId: string): void
stopRecording(outcome: 'success' | 'failure' | 'partial'): void
cancelRecording(): void                    // Discard without saving
updateRecordingNotes(notes: string): void  // Update notes mid-session
updateRecordingTags(tags: string[]): void  // Update tags mid-session

Multi-client: The gateway enforces one recording per vehicle. A second start request is rejected, and all connected UIs see the current recording state.
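The one-recording-per-vehicle rule can be sketched as a small in-memory registry on the gateway. This is a hedged sketch — `RecordingRegistry` and its methods are hypothetical names, not the actual gateway API:

```python
import uuid


class RecordingRegistry:
    """Gateway-side guard: at most one active recording per vehicle."""

    def __init__(self):
        self._active = {}  # vehicleId -> active session id

    def start(self, vehicle_id: str) -> str:
        # Reject a second start for a vehicle that is already recording
        if vehicle_id in self._active:
            raise RuntimeError(f"{vehicle_id} is already being recorded")
        session_id = str(uuid.uuid4())
        self._active[vehicle_id] = session_id
        return session_id

    def stop(self, vehicle_id: str) -> None:
        # Idempotent: stopping a non-recording vehicle is a no-op
        self._active.pop(vehicle_id, None)
```

In the real gateway this state would also be broadcast to every connected UI so all clients converge on the same recording status.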


Workflow

Recording Flow

  1. Launch OpenC2, connect vehicle
  2. (Optional) Draw mission area on map
  3. Click Start Recording
  4. Demonstrate behavior with controller
  5. Click Stop Recording
  6. Add notes: "clean run" / "collision at end" / "good obstacle avoidance"
  7. Session saved to ~/OpenC2/recordings/{date}_{time}/

UI States

Idle (ready to record):

┌────────────────────────────────────────────────────────────┐
│  🟢 ROVER-01 Connected   Battery: 87%   Signal: Good      │
├────────────────────────────────────────────────────────────┤
│                                                            │
│   [Draw Area]  [Clear]              Live map view          │
│                                                            │
├────────────────────────────────────────────────────────────┤
│  [ 🔴 START RECORDING ]                                    │
└────────────────────────────────────────────────────────────┘

Recording active:

┌────────────────────────────────────────────────────────────┐
│  ⏺ RECORDING   00:03:42   │  ROVER-01   🎮 Manual        │
├────────────────────────────────────────────────────────────┤
│                                                            │
│   Vehicle path drawing on map in real-time                 │
│                                                            │
├────────────────────────────────────────────────────────────┤
│  [ ⏹ STOP ]  [ ⚠️ CANCEL ]   Notes: [________________]    │
└────────────────────────────────────────────────────────────┘

Post-recording (before dismiss):

┌────────────────────────────────────────────────────────────┐
│  ✅ Recording Complete                                     │
│                                                            │
│  Duration: 3:42         Vehicle: ROVER-01                  │
│  Time window: 14:30:22 → 14:34:04                          │
│                                                            │
│  Outcome:  ● Success  ○ Partial  ○ Failure                 │
│  Notes:    [Good obstacle avoidance demo___________]       │
│  Tags:     [+outdoor] [+obstacles] [+add tag...]           │
│                                                            │
│  Saved to: ~/OpenC2/recordings/2026-03-15_143022/          │
│                                                            │
│  [ DONE ]                                                  │
└────────────────────────────────────────────────────────────┘

Data Output

Directory Structure

~/OpenC2/recordings/
├── 2026-03-15_143022/
│   ├── session.json        # Primary session metadata
│   └── mission.geojson     # Drawn areas (if any)
├── 2026-03-15_151847/
│   └── ...
└── manifest.json           # Index of all sessions (for tooling)

session.json Schema

{
  "version": "1.1",
  "id": "550e8400-e29b-41d4-a716-446655440000",
  "vehicleId": "rover-01",
  "vehicleName": "Clearpath Husky A200",
  "operatorId": "ethan",
  
  "timeWindow": {
    "start": "2026-03-15T14:30:22.000Z",
    "end": "2026-03-15T14:34:04.000Z",
    "durationSec": 222
  },
  
  "outcome": "success",
  "notes": "Good obstacle avoidance demo, clean run",
  "tags": ["outdoor", "obstacles", "sunny"],
  
  "missionContext": {
    "hasDrawnAreas": true,
    "featureCount": 2
  },
  
  "robotConfig": {
    "expectedTopics": ["/cmd_vel", "/odom", "/camera/image_raw", "/scan"],
    "perceptionStack": {
      "detector": "yolov8n",
      "detectorVersion": "2026-01-15"
    },
    "notes": "Standard outdoor config"
  },
  
  "validation": {
    "status": "pending",
    "checkedAt": null,
    "checks": []
  },
  
  "annotations": [],
  
  "meta": {
    "createdAt": "2026-03-15T14:34:10.000Z",
    "appVersion": "0.4.0"
  }
}

Field Reference

| Field | Purpose | When Set |
|---|---|---|
| robotConfig.expectedTopics | Topics that should be in the rosbag | At recording start (from vehicle config) |
| robotConfig.perceptionStack | Detector versions for reproducibility | At recording start |
| validation.status | pending → valid / warning / invalid | After extraction & QA |
| validation.checks | Array of { name, passed, message? } | After running validation |
| annotations | Post-hoc labels (see below) | During dataset curation |

Annotations Schema

Annotations are added during post-collection curation, not during recording:

{
  "annotations": [
    {
      "id": "ann-001",
      "type": "event",
      "label": "pedestrian_encounter",
      "timestampSec": 45.2,
      "annotator": "ethan",
      "createdAt": "2026-03-16T10:00:00.000Z"
    },
    {
      "id": "ann-002",
      "type": "segment",
      "label": "recovery_maneuver",
      "startSec": 120.0,
      "endSec": 128.5,
      "annotator": "ethan",
      "createdAt": "2026-03-16T10:05:00.000Z"
    }
  ]
}

Annotation types:

  • event: Point-in-time occurrence (person detected, collision, etc.)
  • segment: Time range (good demo segment, recovery, failure mode)
  • quality: Overall session quality note ("noisy IMU", "sun glare")
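The structural rules implied by these three types can be checked mechanically. A sketch, assuming the field names from the annotations schema above (`validate_annotation` itself is hypothetical tooling, not part of the CLI suite):

```python
def validate_annotation(ann: dict) -> list[str]:
    """Return a list of problems with one annotation (empty list = valid)."""
    errors = []
    if ann.get("type") not in ("event", "segment", "quality"):
        errors.append(f"unknown type: {ann.get('type')!r}")
    # An event is a point in time
    if ann.get("type") == "event" and "timestampSec" not in ann:
        errors.append("event annotation needs timestampSec")
    # A segment is a time range with positive duration
    if ann.get("type") == "segment":
        if "startSec" not in ann or "endSec" not in ann:
            errors.append("segment annotation needs startSec and endSec")
        elif ann["endSec"] <= ann["startSec"]:
            errors.append("segment endSec must be after startSec")
    return errors
```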

manifest.json (Root Index)

{
  "version": "1.0",
  "sessions": [
    {
      "id": "550e8400-e29b-41d4-a716-446655440000",
      "path": "2026-03-15_143022",
      "vehicleId": "rover-01",
      "outcome": "success",
      "durationSec": 222,
      "tags": ["outdoor", "obstacles"]
    }
  ],
  "stats": {
    "totalSessions": 47,
    "totalDurationMin": 312,
    "byOutcome": { "success": 38, "partial": 7, "failure": 2 }
  }
}
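Since manifest.json is derivable from the per-session files, it can be rebuilt at any time. A sketch assuming the directory layout above (the function name and the stats rounding are illustrative):

```python
import json
from collections import Counter
from pathlib import Path


def build_manifest(recordings_dir: str) -> dict:
    """Rebuild the manifest dict from session.json files on disk."""
    sessions, outcomes, total_sec = [], Counter(), 0
    for session_file in sorted(Path(recordings_dir).glob("*/session.json")):
        s = json.loads(session_file.read_text())
        sessions.append({
            "id": s["id"],
            "path": session_file.parent.name,
            "vehicleId": s["vehicleId"],
            "outcome": s.get("outcome", "unknown"),
            "durationSec": s["timeWindow"]["durationSec"],
            "tags": s.get("tags", []),
        })
        outcomes[s.get("outcome", "unknown")] += 1
        total_sec += s["timeWindow"]["durationSec"]
    return {
        "version": "1.0",
        "sessions": sessions,
        "stats": {
            "totalSessions": len(sessions),
            "totalDurationMin": round(total_sec / 60),
            "byOutcome": dict(outcomes),
        },
    }
```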

Clock Alignment (Practical Reality)

Assume all clocks are different. A typical setup involves several independent clocks:

| Clock | Example | Typical Drift |
|---|---|---|
| UI/Operator | OpenC2 on laptop | NTP-synced, ~50ms from UTC |
| Robot main computer | Jetson/Pi running ROS2 | May have NTP, may not |
| Autopilot | PX4, ArduPilot, motor controller | Often no NTP, drifts seconds/hour |
| Sensors | LiDAR, camera | Hardware timestamps, own epoch |

The Solution: Robot-Internal Consistency

Not all clocks need to agree. What's required is:

  1. Robot logs are internally consistent — ROS2 timestamps everything with its clock
  2. Autopilot commands are logged by ROS2 — bridged via MAVROS/px4_ros_com with ROS2 timestamps
  3. UI provides approximate wall-clock window — for humans to find the right log segment
┌─────────────────────────────────────────────────────────────────────────┐
│  UI says: "Demo ran 14:30 → 14:35 (laptop clock)"                       │
│                                                                         │
│  Robot bag (ROS2 clock):                                                │
│  ├── /cmd_vel           ← operator joystick commands (ROS2 time)        │
│  ├── /mavros/setpoint   ← what we sent to autopilot (ROS2 time)         │
│  ├── /mavros/state      ← autopilot feedback, bridged (ROS2 time)       │
│  ├── /odom              ← state estimation (ROS2 time)                  │
│  └── /camera/image_raw  ← images with header.stamp (ROS2 time)          │
│                                                                         │
│  All aligned to ROS2 clock → self-consistent for training               │
└─────────────────────────────────────────────────────────────────────────┘

What Gets Logged Where

| Data | Logged By | Clock | Notes |
|---|---|---|---|
| Joystick → cmd_vel | ROS2 joy node | ROS2 | Ground-truth operator intent |
| cmd_vel → autopilot | MAVROS bridge | ROS2 | Stamped when sent |
| Autopilot state/feedback | MAVROS bridge | ROS2 | Re-stamped on receipt |
| Odometry | Robot state estimator | ROS2 | Fused from sensors |
| Camera images | Camera driver | ROS2 header.stamp | Use hardware sync if available |

Key insight: As long as ROS2 is the logger, everything gets a consistent timestamp. The autopilot's internal clock doesn't matter — we log what we sent to it and what it reported back, both stamped by ROS2.

Finding the Right Log Segment

Since UI clock ≠ robot clock, we use a two-step approach:

Step 1: Marker-based extraction

OpenC2 sends "recording started" / "recording stopped" messages over the gateway. Robot logs these as a ROS2 topic with robot-clock timestamps:

/openc2/session_marker
  - timestamp: (ROS2 clock)
  - event: "start" | "stop"
  - session_id: "550e8400-..."

Extraction:

# Find markers in bag
ros2 bag filter robot_log.mcap --topic /openc2/session_marker | grep "550e8400"
# → start: 1710513022.000, stop: 1710513264.000

# Extract segment
ros2 bag filter robot_log.mcap -o demo.mcap \
  --start 1710513022 --end 1710513264

Step 2: Visual verification

Always sanity-check the extracted segment:

# Play back and confirm it's the right demo
ros2 bag play demo.mcap

# Or quick stats check
ros2 bag info demo.mcap
# Verify: duration matches expected, topics present, no gaps

Why both? Markers can fail (gateway hiccup, missed message, duplicate session IDs from testing). A 30-second visual spot-check catches these before wasting hours training on garbage.

Fallback when markers are missing:

If markers didn't make it into the bag (gateway down, topic not subscribed, etc.), use the UI timestamps as a rough guide:

# UI says demo was 14:30-14:35 (laptop time)
# Robot clock might be off by minutes — extract a wide window

ros2 bag info robot_log.mcap
# → Start: 2026-03-15T14:28:00  End: 2026-03-15T16:45:00

# Extract generous window
ros2 bag filter robot_log.mcap -o candidate.mcap \
  --start 2026-03-15T14:25:00 --end 2026-03-15T14:40:00

# Visually find the actual demo, note the real timestamps, re-extract tight
ros2 bag play candidate.mcap

Handling Startup/Shutdown Lag

Reality: Operator clicks "Start" → takes 2-5 seconds to grab controller and begin demonstrating. The robot may already be in flight or moving autonomously.

Timeline (robot already moving):
  |---autonomous mode---|---operator demonstration---|---back to auto---|
  ^                      ^                           ^                  ^
                         UI start                    UI stop
                         marker                      marker
                         (operator takes over)       (operator releases)

The markers ARE the ground truth. Unlike stationary-start scenarios, trimming based on velocity isn't possible. The demonstration is everything between start/stop markers.

What the markers actually mean:

  • start: Operator has taken manual control (even if still orienting)
  • stop: Operator releases control back to autonomy (or lands/stops)

Include the "fumble time" — the first few seconds where operator is getting oriented is still valid training data. It shows recovery, stabilization, and intent formation. Don't over-trim.

/openc2/session_marker
  - timestamp: (ROS2 clock)
  - event: "start" | "stop"  
  - session_id: "550e8400-..."
  - control_mode: "manual"   # What mode the robot should be in

Post-processing considerations:

# For aerial/continuous motion: use markers directly, minimal trimming
def extract_demonstration(bag, session_id):
    # get_markers / extract_segment are pipeline helpers, not a library API
    markers = get_markers(bag, session_id)
    start_time = markers['start']
    end_time = markers['stop']

    # Optional: add a small 0.5s buffer for context.
    # Don't trim based on velocity — the operator was demonstrating the whole time.
    return extract_segment(bag, start_time - 0.5, end_time + 0.5)

For ground vehicles starting stationary: Idle frames at the very start can optionally be trimmed, but be conservative. A 2-second pause while the operator grabs the controller is fine to include.

For aerial vehicles / continuous operation: Don't trim at all. The entire marker-bounded segment is the demonstration, including any stabilization or reorientation at the start.

session.json with Clock Info

{
  "timeWindow": {
    "start": "2026-03-15T14:30:22.000Z",
    "end": "2026-03-15T14:34:04.000Z",
    "durationSec": 222,
    "clockSource": "operator_laptop",
    "note": "Robot clock may differ — use session markers or visual verification"
  }
}

Robot-Side Requirements

OpenC2 provides time windows. The robot must log everything needed for training.

Required Robot Logging

| Data | Frequency | Format | Notes |
|---|---|---|---|
| Odometry | 50-100 Hz | ROS2 nav_msgs/Odometry | Position, velocity, orientation |
| Commands | 50-100 Hz | ROS2 geometry_msgs/Twist | What the operator commanded |
| Camera | 10-30 Hz | JPEG/H.264 or raw | RGB, stereo, or RGBD |
| LiDAR | 10-20 Hz | ROS2 sensor_msgs/PointCloud2 | If available |
| IMU | 100-200 Hz | ROS2 sensor_msgs/Imu | For state estimation |
| TF tree | 50 Hz | ROS2 tf2_msgs/TFMessage | All transforms |

Recommended: ROS2 Bag Recording

# On robot, always-on logging to circular buffer:
ros2 bag record -a --max-cache-size 500000000 --storage mcap

# Or selective topics:
ros2 bag record /cmd_vel /odom /camera/image_raw /scan /tf /tf_static

Log Extraction Workflow

Ideal case: Session JSON is on the same machine as the robot bag.

# 1. Read session time window
cat ~/OpenC2/recordings/2026-03-15_143022/session.json | jq '.timeWindow'
# → { "start": "2026-03-15T14:30:22.000Z", "end": "2026-03-15T14:34:04.000Z" }

# 2. Extract relevant portion from robot bag (add ±30s buffer)
ros2 bag filter input.mcap -o demo_001.mcap \
  --start 2026-03-15T14:30:00 \
  --end 2026-03-15T14:34:30

# 3. Convert to training format (custom pipeline)
python extract_trajectories.py demo_001.mcap --output demo_001/

Reality: Most teams DON'T have centralized log storage. Session metadata is on one laptop, robot logs are on an SD card, and eventually things get uploaded to a shared drive with folder names like "march_outdoor_tests". See Decentralized Collection Workflow for practical patterns.


Dataset Organization

Recommended Structure for ML Training

datasets/
├── husky_outdoor_v1/
│   ├── dataset.json          # Dataset manifest
│   ├── train/
│   │   ├── traj_001/
│   │   │   ├── observations/  # Images, point clouds
│   │   │   ├── actions.npy    # Command sequence
│   │   │   └── states.npy     # Odometry sequence
│   │   └── traj_002/
│   ├── val/
│   └── test/
└── husky_outdoor_v2/

Splitting Guidelines

| Rule | Why |
|---|---|
| Split by session, not frame | Prevents temporal leakage |
| Keep operator-specific sessions together | Prevents style leakage |
| Stratify by tags | Ensures environment diversity in all splits |
| 70/15/15 or 80/10/10 | Standard train/val/test ratios |
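The first rule — split by session, never by frame — can be sketched as a seeded shuffle over session IDs. Stratification by tags and operator is omitted for brevity; `split_sessions` is illustrative, not part of the CLI tooling:

```python
import random


def split_sessions(session_ids: list[str], seed: int = 0,
                   ratios=(0.7, 0.15, 0.15)) -> dict:
    """Assign whole sessions to train/val/test splits, reproducibly."""
    ids = sorted(session_ids)          # deterministic base order
    random.Random(seed).shuffle(ids)   # seeded shuffle for reproducibility
    n_train = int(len(ids) * ratios[0])
    n_val = int(len(ids) * ratios[1])
    return {
        "train": ids[:n_train],
        "val": ids[n_train:n_train + n_val],
        "test": ids[n_train + n_val:],
    }
```

Because every frame of a session lands in the same split, temporally adjacent (and therefore nearly identical) frames can never leak between train and test.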

Dataset Versioning

Keep datasets immutable and versioned. Simple convention:

datasets/
├── husky_outdoor_v1/           # Never modify after "release"
│   ├── dataset.json
│   ├── CHANGELOG.md            # What's in this version
│   └── train/val/test/
├── husky_outdoor_v2/           # New version = new folder
│   ├── dataset.json
│   ├── CHANGELOG.md
│   └── train/val/test/
└── RELEASES.md                 # Index of all versions

Versioning rules:

  • Bump version when adding/removing sessions or changing splits
  • Never modify a released dataset — create a new version
  • Include sourceSessionIds in dataset.json for traceability

dataset.json example:

{
  "name": "husky_outdoor",
  "version": "2",
  "createdAt": "2026-03-16T12:00:00.000Z",
  "description": "Added 15 new obstacle demos, fixed IMU alignment",
  "sourceSessionIds": [
    "550e8400-e29b-41d4-a716-446655440000",
    "661f9500-f39c-52e5-b827-557766551111"
  ],
  "splits": {
    "train": 180,
    "val": 38,
    "test": 29
  },
  "parentVersion": "1"
}

Quality Assurance

During Collection

  • Use outcome field honestly — mark failures/partials
  • Add descriptive notes — this helps during later review
  • Use consistent tags — create a tag vocabulary

Post-Collection Checklist

These checks populate the validation.checks array in session.json:

| Check | Tool | Action if Failed |
|---|---|---|
| Session has matching robot logs | ros2 bag info | Re-sync clocks, retry |
| Expected topics present | Compare to robotConfig.expectedTopics | Check recording config |
| Trajectory is non-trivial (moved > 1m) | Odom delta | Discard or re-record |
| No long pauses (> 5s stationary) | Velocity analysis | Trim or split session |
| Commands match motion | Odom vs cmd_vel | Check hardware, re-record |
| Images are not corrupt | Frame decode test | Check camera driver |
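Two of these checks (non-trivial motion, long pauses) can be computed directly from logged odometry. A sketch that emits entries in the same shape as validation.checks — thresholds come from the table above, but the helper itself is hypothetical:

```python
def check_motion(odom, min_distance_m=1.0, max_pause_s=5.0):
    """odom: list of (t_sec, x, y) samples. Returns validation.checks entries."""
    # Total path length from successive position deltas
    dist = sum(((x2 - x1) ** 2 + (y2 - y1) ** 2) ** 0.5
               for (_, x1, y1), (_, x2, y2) in zip(odom, odom[1:]))
    checks = [{"name": "non_trivial_motion", "passed": dist >= min_distance_m}]

    # Longest stretch with (almost) no displacement counts as a pause
    pause, longest = 0.0, 0.0
    for (t1, x1, y1), (t2, x2, y2) in zip(odom, odom[1:]):
        step = ((x2 - x1) ** 2 + (y2 - y1) ** 2) ** 0.5
        pause = pause + (t2 - t1) if step < 0.01 else 0.0
        longest = max(longest, pause)
    check = {"name": "no_long_pauses", "passed": longest <= max_pause_s}
    if not check["passed"]:
        check["message"] = f"{longest:.1f}s pause detected"
    checks.append(check)
    return checks
```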

Validation result example:

{
  "validation": {
    "status": "warning",
    "checkedAt": "2026-03-16T15:00:00.000Z",
    "checks": [
      { "name": "has_matching_logs", "passed": true },
      { "name": "expected_topics", "passed": true },
      { "name": "non_trivial_motion", "passed": true },
      { "name": "no_long_pauses", "passed": false, "message": "6.2s pause at t=45s" },
      { "name": "commands_match_motion", "passed": true },
      { "name": "images_valid", "passed": true }
    ]
  }
}

Dataset Statistics to Track

{
  "totalTrajectories": 247,
  "totalFrames": 148203,
  "totalDistanceKm": 12.4,
  "avgTrajectoryLengthSec": 142,
  "environmentBreakdown": {
    "outdoor": 180,
    "indoor": 67
  },
  "outcomeBreakdown": {
    "success": 220,
    "partial": 22,
    "failure": 5
  }
}

Log Organization & Storage

See COMMS_PIPELINE.md for the full logging architecture.

Summary:

  • Network logs (always on): OpenC2 logs all gateway traffic to ~/OpenC2/logs/
  • Robot logs (ground truth): Rosbags on the robot, pulled via SSH by OpenC2
  • Sessions: Organized in ~/OpenC2/sessions/{vehicle}/{session_id}/

Imitation Learning Extensions

For ML training, sessions need a few extra fields in session.json:

| Field | Purpose |
|---|---|
| robotConfig.expectedTopics | Verify rosbag has required topics |
| robotConfig.perceptionStack | Track model versions for reproducibility |
| validation.status | pending → valid / warning / invalid after QA |

{
  "robotConfig": {
    "expectedTopics": ["/cmd_vel", "/odom", "/camera/image_raw"],
    "perceptionStack": {
      "detector": "yolov8n",
      "detectorVersion": "2026-01-15"
    }
  },
  "validation": {
    "status": "pending",
    "checks": []
  }
}

Post-Collection Validation

Run these checks before using sessions for training:

| Check | How |
|---|---|
| Robot log exists | Pairing succeeded |
| Expected topics present | Compare to robotConfig.expectedTopics |
| Non-trivial motion | Robot moved > 1m |
| Commands match motion | cmd_vel correlates with odom |

# Validate sessions in bulk
find sessions/ -name session.json | while read f; do
  openc2-validate "$f" || echo "FAILED: $f"
done

Decentralized Collection Workflow

Reality: Controlling how distributed teams collect data isn't realistic.

This system is designed around a fundamental constraint: operators in the field have autonomy. Session metadata ends up on one laptop, robot logs on an SD card, and everything eventually lands in a shared folder named something like march_outdoor_tests_v2_FINAL. Mandating infrastructure isn't practical. Enforcing process isn't practical. Getting people to write notes is difficult enough.

Design principles:

  1. Self-contained bundles over databases. Sessions export as ZIP files with everything needed to understand them. No dependency on a running server, shared database, or network connectivity. An operator can collect data in a desert and email the bundle later.

  2. UUIDs for identity, not filenames. People will rename files, duplicate folders, and create chaos. Session IDs are UUIDs baked into the metadata. Deduplication works even when demo_final_v3_USE_THIS.zip appears in five places.

  3. Approximate timestamps are fine. The UI clock doesn't need to match the robot clock. Session markers provide robot-clock alignment when available; manual verification handles the rest. Don't let clock sync become a prerequisite for collection.

  4. Aggregation happens later. There's no central coordinator during collection. Operators dump bundles to whatever shared storage exists. CLI tools (openc2-aggregate) build the unified manifest at dataset-creation time, not during capture.

  5. Fail open, not closed. Missing fields, absent markers, orphaned logs — the system tolerates all of it. Validation flags problems for humans to triage; it doesn't reject data that might be needed later.
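Principle 2 in code: deduplication keys on the session UUID, never on the filename. A minimal sketch of what openc2-aggregate might do once it has parsed bundle metadata into entry dicts (the function is illustrative; first occurrence wins):

```python
def dedupe_sessions(manifest_entries: list[dict]) -> list[dict]:
    """Drop duplicate session entries by UUID, preserving discovery order.

    Entries are dicts with at least an "id" key (the session UUID).
    Renamed or re-uploaded copies of the same bundle collapse to one entry.
    """
    seen, unique = set(), []
    for entry in manifest_entries:
        if entry["id"] not in seen:
            seen.add(entry["id"])
            unique.append(entry)
    return unique
```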

The workflow:

                              ┌─────────────────────────────────┐
                              │     Field collection            │
                              │     (operator autonomy)         │
                              └───────────────┬─────────────────┘
                                              │
                    ┌─────────────────────────┴─────────────────────────┐
                    ▼                                                   ▼
        ┌───────────────────────┐                         ┌───────────────────────┐
        │  Pull logs via SSH    │                         │  Manual SD card pull  │
        │  (while connected)    │                         │  (offline / later)    │
        └───────────┬───────────┘                         └───────────┬───────────┘
                    │                                                 │
                    ▼                                                 ▼
        ┌───────────────────────┐                         ┌───────────────────────┐
        │  Bundle with rosbag   │                         │  Session bundle only  │
        │  (complete)           │                         │  (needs pairing)      │
        └───────────┬───────────┘                         └───────────┬───────────┘
                    │                                                 │
                    └─────────────────────────┬───────────────────────┘
                                              ▼
                              ┌─────────────────────────────────┐
                              │  Upload to shared drive         │
                              │  (whatever exists)              │
                              └───────────────┬─────────────────┘
                                              ▼
                              ┌─────────────────────────────────┐
                              │  openc2-aggregate               │
                              │  (builds manifest, dedupes)     │
                              └───────────────┬─────────────────┘
                                              ▼
                              ┌─────────────────────────────────┐
                              │  openc2-pair (if needed)        │
                              │  (matches orphan sessions/bags) │
                              └───────────────┬─────────────────┘
                                              ▼
                              ┌─────────────────────────────────┐
                              │  Curate                         │
                              │  (human judgment)               │
                              └─────────────────────────────────┘

Preferred path: Pull logs via SSH while still connected to the robot. The bundle includes robot_log.mcap and pairing is automatic. If logs weren't pulled at that time, the SD card was grabbed later, or the system was offline — openc2-pair matches orphaned sessions with rosbags using session markers or timestamp approximation.

This isn't the "right" way to manage ML datasets. It's the realistic way when data collection is distributed across people who have real jobs and limited patience for tooling.


Implementation Checklist

Phase 1: MVP (~4-6 hours)

  • startRecording / stopRecording actions in appStore
  • Recording indicator in HeaderBar (red dot + timer)
  • Start/Stop buttons in CommandPanel
  • Save session.json to ~/OpenC2/recordings/{timestamp}/
  • Include mission.geojson if areas drawn

Phase 2: Quality Features (~6-8 hours)

  • Outcome selector (success/partial/failure) on stop
  • Notes field (editable during and after recording)
  • Tags input with autocomplete from previous tags
  • manifest.json index file with stats
  • Session browser panel (list past recordings)
  • robotConfig populated from vehicle settings (expectedTopics, perceptionStack)
  • validation.status field (default: pending)

Phase 3: Session Bundles (~6-8 hours)

  • Export bundle button in session browser
  • Bundle ZIP creation (session.json + mission.geojson + README)
  • Optional robot log attachment (file picker or path input)
  • Thumbnail generation (map screenshot)
  • Attachments support (arbitrary files)
  • collection metadata (operatorId, machine, timezone)
  • bundleIntegrity hash computation

Phase 4: CLI Tools (~8-10 hours)

  • openc2-export — Create bundle from session folder
  • openc2-pair — Match sessions with rosbags via markers
  • openc2-aggregate — Build manifest.json from bundle folder
  • openc2-extract — Extract training data from paired bundles
  • openc2-validate — Check bundle integrity and completeness

Phase 5: Polish (~4-6 hours)

  • Operator ID field (persisted in settings)
  • "Vehicle disconnected" handling (pause + prompt)
  • Keyboard shortcuts (R to start, Esc to cancel)
  • Help tooltips for new users

Phase 6: Curation Support (~4-6 hours)

  • Annotation editor (add event/segment labels to past sessions)
  • Validation runner (check extraction against expectedTopics)
  • Dataset export wizard (select sessions → create versioned dataset folder)
  • Session diff view (compare two recordings)

FAQ

Q: Do I need precise clock sync between UI and robot?
A: No. The robot logs everything with its own clock (ROS2). Session markers get robot-clock timestamps when received. UI timestamps are just a fallback.

Q: My autopilot clock is different from my ROS2 clock. Problem?
A: Not if logging via MAVROS/px4_ros_com. The bridge re-timestamps everything with ROS2 time. The model trains on "what ROS2 saw", which is self-consistent.

Q: How do I find the exact log segment?
A: Markers first, visual verification second. Extract using the /openc2/session_marker timestamps, then play back the segment to confirm it's correct. If markers are missing, use UI timestamps as a rough guide and find the demo visually.

Q: Can I record multiple vehicles at once?
A: Current design is single-vehicle. Multi-vehicle would need separate session files per vehicle.

Q: What if two operators try to record the same vehicle?
A: Gateway enforces one recording per vehicle. Second start request is rejected.

Q: How do I organize recordings for training?
A: Extract robot logs using the time windows, then use the appropriate ML pipeline format. See "Dataset Organization" above.

Q: We don't have centralized log infrastructure. How do we manage data?
A: Use the Decentralized Collection Workflow. Export self-contained session bundles from OpenC2, upload to a shared drive (Drive, NAS), then use CLI tools to aggregate and query. No central database required.

Q: Sessions and robot logs are on different machines. How do I pair them?
A: Use openc2-pair sessions/*.zip rosbags/. The tool looks for session markers in the rosbag to match IDs. If markers are missing, it falls back to timestamp approximation (manual verification required).

Q: What's the minimum workflow for a solo operator?
A: 1) Record session in OpenC2, 2) Copy rosbag from robot SD, 3) Run openc2-pair session.zip rosbag.mcap, 4) Use ros2 bag filter with the time window. No shared drive needed.

Q: How do we avoid duplicate sessions when uploading to shared drives?
A: Session IDs are UUIDs. openc2-aggregate deduplicates by ID, not filename. Name bundles consistently (e.g., {vehicle}_{timestamp}.zip) but the system tolerates duplicates.


Trainable Behaviors

What Gets Captured

┌─────────────────────────────────────────────────────────────────────────────┐
│                         During Demonstration                                │
│                                                                             │
│   Robot (ROS2 running continuously):                                        │
│   ├── Camera → YOLO/SegFormer → /detections         ← logged               │
│   ├── LiDAR → obstacle map → /costmap               ← logged               │
│   ├── State estimator → /odom                       ← logged               │
│   └── All perception outputs are live               ← logged               │
│                                                                             │
│   Operator (sees robot's perception):                                       │
│   ├── Watches video feed with detection overlays                            │
│   ├── Sees what the robot "sees"                                            │
│   └── Acts accordingly → /cmd_vel                   ← logged               │
│                                                                             │
│   Training pairs: (observations, actions) at each timestep                  │
└─────────────────────────────────────────────────────────────────────────────┘

ROS2 runs continuously. The robot's full perception stack is live. The rosbag captures everything — detections, costmaps, images, AND the operator's commands. Demonstrations are time windows where the operator is intentionally showing good behavior.


Tier 1: Direct Visuomotor (Easiest)

| Behavior | Observation | Action | Notes |
|---|---|---|---|
| Obstacle avoidance | RGB + depth/LiDAR | Twist (v, ω) | Classic, works well |
| Trail following | RGB camera | Twist | Needs consistent trails |
| Waypoint navigation | Odom + goal | Twist | Policy replaces path planner |
| Speed adaptation | RGB + terrain class | Velocity | Slow on rough, fast on smooth |

Example: Operator drives Husky through forest. Camera sees trees, policy learns "steer around dark vertical things."


Tier 2: Detection-Conditioned (Reactive)

| Behavior | Observation | Action | Notes |
|---|---|---|---|
| Stop for pedestrians | Detection boxes (person) | Twist | Operator stops when YOLO fires |
| Follow target | Detection (tracked object) | Twist | Operator keeps target centered |
| Avoid animals | Detection (dog, deer) | Twist + slow | Different reaction than static obstacles |
| Inspect anomalies | Anomaly detector output | Approach + pause | Operator moves closer, circles |

Key: Log detection outputs as part of the observation. The policy learns "when person_detected=True → velocity=0."


Tier 3: Task-Specific Policies (Complex)

| Behavior | Observation | Action | Notes |
|---|---|---|---|
| Perimeter patrol | Position + heading + time | Twist | Operator does laps, policy learns pattern |
| Search pattern | Coverage map + camera | Twist | Lawnmower, expanding spiral |
| Recovery from stuck | IMU (no motion) + cmd history | Recovery maneuver | Operator backs out, policy learns |
| Docking/approach | ArUco marker pose | Precise Twist | Final approach to station |

The Operator Display Rule

What the operator sees matters enormously. If the operator can see information X when deciding what to do, X must be in the observation space.

| Display Mode | What Operator Sees | Training Implication |
|---|---|---|
| Raw camera | RGB image only | End-to-end pixels → actions |
| Camera + detections | RGB with bounding boxes | Policy conditioned on (image, detections) |
| Full HUD | Camera + costmap + goal + battery | Must include ALL HUD info in observation |

Recommendation: Camera + detection overlays. Operator reacts to what robot "understands," and same detector runs at deployment.


Concrete Example: Emergency Stop for People

Setup:

# Robot runs continuously:
- /camera/image_raw → YOLOv8 → /detections (Person, confidence, bbox)
- /cmd_vel → motor controller
- rosbag records all topics

# Operator display:
- Video stream with detection boxes overlaid
- "PERSON DETECTED" warning when confidence > 0.7

What's logged:

t=10.0: image=<frame>, detections=[], cmd_vel=(0.5, 0.0)        # driving
t=10.5: image=<frame>, detections=[{person, 0.85}], cmd_vel=(0.5, 0.0)  # detection fires
t=10.6: image=<frame>, detections=[{person, 0.87}], cmd_vel=(0.0, 0.0)  # operator stopped
t=12.0: image=<frame>, detections=[], cmd_vel=(0.5, 0.0)        # person gone, resume

Policy learns: if person_detected → stop
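That mapping can be written down as a trivial rule-based baseline, useful for sanity-checking extracted (observation, action) pairs before training anything. A sketch — the 0.7 threshold mirrors the operator display warning above, and the names are illustrative:

```python
def estop_baseline(detections, cruise_cmd=(0.5, 0.0), conf_threshold=0.7):
    """Rule-of-thumb policy: stop when a confident person detection is present.

    detections: list of {"label": str, "confidence": float} dicts.
    Returns a (linear, angular) velocity command.
    """
    person_detected = any(
        d["label"] == "person" and d["confidence"] > conf_threshold
        for d in detections
    )
    return (0.0, 0.0) if person_detected else cruise_cmd
```

Comparing this baseline's output against the logged cmd_vel gives a cheap consistency check: large disagreement means either the extraction window is wrong or the operator wasn't actually demonstrating the stop-for-people behavior (note the operator's reaction lags the detection by a frame or two, as in the log above).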


What's Difficult to Train

| Behavior | Why It's Hard |
|---|---|
| Long-horizon planning | Demo is reactive; can't capture "I'm going there because of X 5 minutes ago" |
| Multi-agent coordination | Single-vehicle design; no swarm demonstrations |
| Rare edge cases | Would need to manufacture situations; hard to get enough data |
| Precision manipulation | Twist commands are coarse; need different action space |
| Anything requiring speech/gesture | Not capturing audio or operator body language |

Practical Recommendations

  1. Start with obstacle avoidance — highest ROI, works with ~50 demos
  2. Add detection conditioning early — log detector outputs from day 1, even if not using yet
  3. Keep operator display consistent — whatever they see during training, show during deployment test
  4. Log everything, filter later — disk is cheap, missing topics is painful

Related Documentation