SpatialMP4

A C++/Python toolkit for processing SpatialMP4 format, supporting reading and processing spatial video files containing RGB, depth, pose, and audio data.

🚀 Features

Multi-modal Data Support: Simultaneously process RGB images, depth maps, pose data, and audio
Stereo Vision: Support for left and right eye RGB image data
High Performance: Efficient video decoding based on FFmpeg
Flexible Reading Modes: Support for RGB-only, Depth-only, and Depth-first reading modes
Random Access: Support for random access to video frames and keyframe indexing
3D Reconstruction: Built-in point cloud generation and RGBD data processing
Camera Calibration: Support for reading and applying intrinsic and extrinsic parameters
Visualization Tools: Rich data visualization and debugging capabilities
Cross-Platform: Full support for Linux and macOS

📋 System Requirements

Operating System:
- Linux (Ubuntu 18.04+ recommended)
- macOS (10.15+ Catalina, Xcode required)
Compiler:
- GCC 7.0+ or Clang 6.0+ (C++17 support required)
- Apple Clang from Xcode 11.0+ on macOS (how-to-install)
CMake: 3.24.1+

🔧 Dependencies

The project depends on the following third-party libraries:

FFmpeg: Video encoding/decoding (libavformat, libavcodec, libswscale)
OpenCV: Image processing and computer vision
Eigen3: Linear algebra operations
Sophus: Lie group operations for SE(3) group
spdlog: High-performance logging library
fmt: Modern C++ formatting library
Google Test: Unit testing framework (optional)

🛠️ Build and Installation

1. Clone Repository

git clone https://github.com/Pico-Developer/SpatialMP4
cd SpatialMP4

2. Build FFmpeg

Build ffmpeg first:

bash scripts/build_ffmpeg.sh

3. Install Dependencies

bash scripts/install_deps.sh

4. Configure and Build

mkdir build && cd build

# Configure project
cmake .. -DCMAKE_BUILD_TYPE=Release

# Build
make -j$(nproc)  # On Linux
make -j$(sysctl -n hw.ncpu)  # On macOS

5. Run Tests (Optional)

# Build with tests
cmake .. -DBUILD_TESTING=ON
make -j$(nproc)  # On Linux
make -j$(sysctl -n hw.ncpu)  # On macOS

# Run tests
cd ..
./build/test_reader

📖 Usage Guide

Basic Usage

#include "spatialmp4/reader.h"
#include "spatialmp4/data_types.h"

// Create reader
SpatialML::Reader reader("path/to/your/spatial.mp4");

// Check data types
if (reader.HasRGB()) {
    std::cout << "Contains RGB data" << std::endl;
}
if (reader.HasDepth()) {
    std::cout << "Contains depth data" << std::endl;
}
if (reader.HasPose()) {
    std::cout << "Contains pose data" << std::endl;
}

// Get camera parameters
auto rgb_intrinsics = reader.GetRgbIntrinsicsLeft();
auto depth_intrinsics = reader.GetDepthIntrinsics();

Reading RGB and Depth Data

// Set reading mode
reader.SetReadMode(SpatialML::Reader::ReadMode::DEPTH_FIRST);
reader.Reset();

while (reader.HasNext()) {
    SpatialML::rgb_frame rgb_frame;
    SpatialML::depth_frame depth_frame;
    
    // Read RGB and depth frames simultaneously
    reader.Load(rgb_frame, depth_frame);
    
    // Process data
    cv::Mat left_rgb = rgb_frame.left_rgb;
    cv::Mat right_rgb = rgb_frame.right_rgb;
    cv::Mat depth = depth_frame.depth;
    
    std::cout << "RGB timestamp: " << rgb_frame.timestamp << std::endl;
    std::cout << "Depth timestamp: " << depth_frame.timestamp << std::endl;
}

RGB-Only Reading

reader.SetReadMode(SpatialML::Reader::ReadMode::RGB_ONLY);
reader.Reset();

while (reader.HasNext()) {
    SpatialML::rgb_frame rgb_frame;
    reader.Load(rgb_frame);
    
    // Process RGB data
    cv::imshow("Left RGB", rgb_frame.left_rgb);
    cv::imshow("Right RGB", rgb_frame.right_rgb);
    cv::waitKey(1);
}

Pose Data Processing

// Get all pose data
auto pose_frames = reader.GetPoseFrames();

for (const auto& pose : pose_frames) {
    // Convert to SE(3) representation
    Sophus::SE3d se3_pose = pose.as_se3();
    
    // Get rotation and translation
    Eigen::Matrix3d rotation = se3_pose.rotationMatrix();
    Eigen::Vector3d translation = se3_pose.translation();
    
    std::cout << "Pose timestamp: " << pose.timestamp << std::endl;
    std::cout << "Position: " << translation.transpose() << std::endl;
}

Depth Projection to RGB

#include "utilities/RgbdUtils.h"

// Get camera parameters
auto rgb_intrinsics = reader.GetRgbIntrinsicsLeft().as_cvmat();
auto depth_intrinsics = reader.GetDepthIntrinsics().as_cvmat();

// Calculate transformation matrix
auto T_I_Srgb = reader.GetRgbExtrinsicsLeft().as_se3();
auto T_I_Stof = reader.GetDepthExtrinsics().as_se3();
auto T_Srgb_Stof = T_I_Srgb.inverse() * T_I_Stof;

// Project depth to RGB
cv::Mat projected_depth;
Utilities::ProjectDepthToRgb(depth_frame.depth, rgb_frame.left_rgb, 
                           rgb_intrinsics, depth_intrinsics, 
                           T_Srgb_Stof, projected_depth);

Point Cloud Generation

#include "utilities/PointcloudUtils.h"

// Generate point cloud from RGBD data
Utilities::Pointcloud pcd;
Utilities::RgbdToPointcloud(rgb_frame.left_rgb, projected_depth, 
                          rgb_intrinsics, pcd, 10.0f);

// Save point cloud
Utilities::SavePointcloudToFile("output.obj", pcd);

📚 API Reference

SpatialML::Reader

Main SpatialMP4 file reader class.

Constructor

Reader(const std::string& filename)

Data Check Methods

bool HasRGB() const;        // Whether contains RGB data
bool HasDepth() const;      // Whether contains depth data  
bool HasPose() const;       // Whether contains pose data
bool HasAudio() const;      // Whether contains audio data
bool HasDisparity() const;  // Whether contains disparity data

Parameter Getter Methods

camera_intrinsics GetRgbIntrinsicsLeft() const;   // Left RGB camera intrinsics
camera_intrinsics GetRgbIntrinsicsRight() const;  // Right RGB camera intrinsics
camera_extrinsics GetRgbExtrinsicsLeft() const;   // Left RGB camera extrinsics
camera_extrinsics GetRgbExtrinsicsRight() const;  // Right RGB camera extrinsics
camera_intrinsics GetDepthIntrinsics() const;     // Depth camera intrinsics
camera_extrinsics GetDepthExtrinsics() const;     // Depth camera extrinsics

Reading Control Methods

void SetReadMode(ReadMode mode);  // Set reading mode
bool HasNext() const;             // Whether has next frame
void Reset();                     // Reset to beginning
int GetIndex() const;             // Get current index

Data Loading Methods

void Load(rgb_frame& rgb_frame);                              // Load RGB frame
void Load(depth_frame& depth_frame);                          // Load depth frame  
void Load(rgb_frame& rgb_frame, depth_frame& depth_frame);    // Load RGB and depth frames simultaneously

Data Structures

rgb_frame

struct rgb_frame {
    double timestamp;     // Timestamp
    cv::Mat left_rgb;     // Left eye RGB image
    cv::Mat right_rgb;    // Right eye RGB image
    pose_frame pose;      // Corresponding pose data
};

depth_frame

struct depth_frame {
    double timestamp;     // Timestamp
    cv::Mat depth;        // Depth image
    pose_frame pose;      // Corresponding pose data
};

pose_frame

struct pose_frame {
    double timestamp;     // Timestamp
    double x, y, z;       // Position
    double qw, qx, qy, qz; // Quaternion rotation
    
    Sophus::SE3d as_se3() const;  // Convert to SE(3) representation
};

🔍 Utility Functions

Image Processing (OpencvUtils)

VisualizeMat(): Visualize matrix data
DumpMat() / LoadMat(): Save/load matrices
ConcatenateMat(): Concatenate multiple images

RGBD Processing (RgbdUtils)

ProjectDepthToRgb(): Project depth to RGB
RgbdToPointcloud(): Convert RGBD to point cloud

Point Cloud Processing (PointcloudUtils)

SavePointcloudToFile(): Save point cloud files
Support for OBJ format output

🐛 Debugging and Logging

The project uses spdlog for logging:

#include <spdlog/spdlog.h>

// Set log level
spdlog::set_level(spdlog::level::debug);

// Debug information will be automatically output in the code

📄 License

This project is licensed under the MIT. See the LICENSE file for details.

🤝 Contributing

Issues and Pull Requests are welcome to improve this project!

📞 Contact

For questions or suggestions, please contact us through GitHub Issues.

Name		Name	Last commit message	Last commit date
Latest commit History 26 Commits
bindings		bindings
cmake		cmake
docs		docs
python		python
scripts		scripts
src/spatialmp4		src/spatialmp4
video		video
.clang-format		.clang-format
.gitignore		.gitignore
.pre-commit-config.yaml		.pre-commit-config.yaml
CMakeLists.txt		CMakeLists.txt
LICENSE		LICENSE
README.md		README.md
README_zh.md		README_zh.md

License

Pico-Developer/SpatialMP4

Folders and files

Latest commit

History

Repository files navigation