Add normal map op, optical flow op, and universal segmentation op for videos. #970
Qirui-jiao wants to merge 15 commits into
Code Review
This pull request introduces three new video processing mappers—VideoNormalMapMapper, VideoOpticalFlowMapper, and VideoUniversalSegmentationMapper—along with configuration updates, model utilities, and unit tests. The review feedback highlights critical serialization issues caused by storing raw numpy arrays and torch tensors in metadata, which would break JSON exports. Other feedback addresses a color space bug in the optical flow mapper, fragile path parsing for video names, and the use of an incorrect package version and hardcoded mirror in the segmentation mapper.
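The serialization concern can be illustrated with a minimal sketch. It assumes the metadata is a nested dict/list structure and relies on the fact that both `numpy.ndarray` and `torch.Tensor` expose a `.tolist()` method. The helper name `to_jsonable` is hypothetical and is not part of this PR or of the repository.

```python
import json


def to_jsonable(value):
    """Recursively convert array-like values into plain Python types.

    Hypothetical helper for illustration only; not the PR's implementation.
    """
    # numpy arrays/scalars and torch tensors all provide .tolist(),
    # so duck-typing on it avoids importing either library here.
    if hasattr(value, "tolist"):
        return to_jsonable(value.tolist())
    if isinstance(value, dict):
        # JSON object keys must be strings.
        return {str(k): to_jsonable(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_jsonable(v) for v in value]
    return value


def dump_meta(meta):
    """Serialize a metadata dict after converting array-like fields."""
    return json.dumps(to_jsonable(meta))
```

Dense per-frame outputs such as normal maps or flow fields can be very large once expanded into nested lists, so writing them to disk and storing only the file path in the metadata may be the more practical variant.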
/gemini review
Code Review
This pull request introduces three new video processing mappers: video_normal_map_mapper (using Metric3D), video_optical_flow_mapper (using RAFT), and video_universal_segmentation_mapper (using OneFormer). It also standardizes configuration values in config_all.yaml and updates documentation and unit tests. Feedback focuses on improving robustness by adding checks for missing or corrupted image files, handling empty frame lists, and removing unused logit calculations. Additionally, a critical issue was identified regarding a non-existent package version and hardcoded PyPI mirror in the segmentation mapper, along with a recommendation to replace a brittle polling loop in the model preparation logic with standard repository patterns.
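The robustness checks suggested here could look roughly like the sketch below. It assumes frames are stored as image files on disk and that the model is called once per frame; `usable_frames`, `process_video`, and `run_model` are hypothetical names, not functions from this PR.

```python
import os


def usable_frames(frame_paths):
    """Return only the paths that point to existing, non-empty files."""
    kept = []
    for path in frame_paths:
        # Skip missing files and zero-byte files left by failed extraction.
        if os.path.isfile(path) and os.path.getsize(path) > 0:
            kept.append(path)
    return kept


def process_video(frame_paths, run_model):
    """Apply run_model to each usable frame; return [] if none are usable.

    run_model is a placeholder for the per-frame inference call.
    """
    frames = usable_frames(frame_paths)
    if not frames:
        # Empty frame list: return early instead of failing downstream.
        return []
    return [run_model(path) for path in frames]
```

A non-empty file can still be corrupted, so a decode check is also worth adding where the image is actually loaded; for example, `cv2.imread` returns `None` when it cannot read a file.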
Add ops:
- video_normal_map_mapper: Generate normal maps for videos (with the Metric3D model).
- video_optical_flow_mapper: Generate optical flow information for videos.
- video_universal_segmentation_mapper: Generate semantic segmentation, instance segmentation, and panoptic segmentation information for videos (with the OneFormer model).