Add normal map op, optical flow op, and universal segmentation op for videos. #970

Open
Qirui-jiao wants to merge 15 commits into main from dev/normal_and_optimal_flow_and_segmentation_ops

Conversation

@Qirui-jiao
Collaborator

Add ops:

  • video_normal_map_mapper: Generate normal maps for videos (with the Metric3D model).
  • video_optical_flow_mapper: Generate optical flow information for videos.
  • video_universal_segmentation_mapper: Generate semantic segmentation, instance segmentation, and panoptic segmentation information for videos (with the OneFormer model).

Qirui-jiao requested review from HYLcool and yxdyc on April 27, 2026 15:45
Qirui-jiao added the labels enhancement (New feature or request), dj:multimodal (issues/PRs about multimodal data processing), and dj:op (issues/PRs about some specific OPs) on Apr 27, 2026
Contributor

gemini-code-assist (Bot) left a comment

Code Review

This pull request introduces three new video processing mappers—VideoNormalMapMapper, VideoOpticalFlowMapper, and VideoUniversalSegmentationMapper—along with configuration updates, model utilities, and unit tests. The review feedback highlights critical serialization issues caused by storing raw numpy arrays and torch tensors in metadata, which would break JSON exports. Other feedback addresses a color space bug in the optical flow mapper, fragile path parsing for video names, and the use of an incorrect package version and hardcoded mirror in the segmentation mapper.
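Two of the points above can be illustrated with a short sketch. This is not the PR's code: `to_jsonable` and `video_stem` are hypothetical helper names, the `flow` key is invented for the example, and the standard-library `array.array` stands in for a NumPy array or torch tensor (both of which also expose `.tolist()`). It shows one way array-like values might be converted to plain lists before a JSON export, and how a video name might be derived from a path without manual string splitting.

```python
import json
from array import array
from pathlib import Path


def to_jsonable(value):
    """Recursively convert array-like values into JSON-serializable types."""
    # NumPy arrays, torch tensors, and array.array all provide .tolist().
    if hasattr(value, "tolist"):
        return to_jsonable(value.tolist())
    if isinstance(value, dict):
        return {str(k): to_jsonable(v) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return [to_jsonable(v) for v in value]
    return value


def video_stem(video_path):
    """Return the file name without its final extension."""
    return Path(video_path).stem


# Hypothetical metadata record; the key name is an assumption.
meta = {"flow": [array("d", [0.5, -1.0]), array("d", [2.0, 0.25])]}

# Exporting the raw record fails because array objects are not serializable.
try:
    json.dumps(meta)
    raw_export_ok = True
except TypeError:
    raw_export_ok = False

# After conversion the record contains only lists and floats.
exported = json.dumps(to_jsonable(meta))
name = video_stem("/data/videos/my.clip.v2.mp4")
```

For the example path, splitting on `/` and then taking the part before the first `.` would yield `my`, while `Path(...).stem` keeps `my.clip.v2`.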

Comment thread data_juicer/ops/mapper/video_normal_map_mapper.py Outdated
Comment thread data_juicer/ops/mapper/video_optical_flow_mapper.py Outdated
Comment thread data_juicer/ops/mapper/video_optical_flow_mapper.py Outdated
Comment thread data_juicer/ops/mapper/video_universal_segmentation_mapper.py
Comment thread data_juicer/ops/mapper/video_normal_map_mapper.py Outdated
Comment thread data_juicer/ops/mapper/video_optical_flow_mapper.py Outdated
Comment thread data_juicer/ops/mapper/video_universal_segmentation_mapper.py
Comment thread data_juicer/ops/mapper/video_universal_segmentation_mapper.py Outdated
Comment thread data_juicer/utils/model_utils.py Outdated
@Qirui-jiao
Collaborator Author

/gemini review

Contributor

gemini-code-assist (Bot) left a comment

Code Review

This pull request introduces three new video processing mappers: video_normal_map_mapper (using Metric3D), video_optical_flow_mapper (using RAFT), and video_universal_segmentation_mapper (using OneFormer). It also standardizes configuration values in config_all.yaml and updates documentation and unit tests. Feedback focuses on improving robustness by adding checks for missing or corrupted image files, handling empty frame lists, and removing unused logit calculations. Additionally, a critical issue was identified regarding a non-existent package version and hardcoded PyPI mirror in the segmentation mapper, along with a recommendation to replace a brittle polling loop in the model preparation logic with standard repository patterns.
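The robustness feedback (missing image files, empty frame lists) could be addressed along the following lines. This is a minimal sketch under stated assumptions, not the mappers' actual logic: `usable_frames` and `process_frames` are hypothetical names, `infer` is a placeholder for the per-frame model call, and the check only covers missing or zero-byte files. Detecting a corrupted image would additionally require catching decode errors from whichever image library the op uses.

```python
import os
import tempfile


def usable_frames(frame_paths):
    """Keep only paths that point to existing, non-empty files."""
    kept = []
    for path in frame_paths:
        if os.path.isfile(path) and os.path.getsize(path) > 0:
            kept.append(path)
    return kept


def process_frames(frame_paths, infer):
    """Run `infer` on each usable frame; return [] if none remain."""
    frames = usable_frames(frame_paths)
    if not frames:
        # Empty frame list: skip inference instead of failing downstream.
        return []
    return [infer(path) for path in frames]


with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "frame_0.jpg")
    empty = os.path.join(tmp, "frame_1.jpg")
    missing = os.path.join(tmp, "frame_2.jpg")  # never created
    with open(good, "wb") as f:
        f.write(b"\xff\xd8\xff")  # placeholder bytes, not a real image
    open(empty, "wb").close()  # zero-byte file

    kept_names = [os.path.basename(p) for p in usable_frames([good, empty, missing])]
    results = process_frames([empty, missing], infer=len)
```

Returning an empty list for a sample with no usable frames keeps the output shape predictable; whether the real ops should instead log a warning or fill a default value is a design decision for the PR author.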

Comment thread data_juicer/ops/mapper/video_universal_segmentation_mapper.py
Comment thread data_juicer/ops/mapper/video_normal_map_mapper.py
Comment thread data_juicer/ops/mapper/video_optical_flow_mapper.py
Comment thread data_juicer/ops/mapper/video_normal_map_mapper.py
Comment thread data_juicer/ops/mapper/video_universal_segmentation_mapper.py Outdated
Comment thread data_juicer/ops/mapper/video_universal_segmentation_mapper.py
Comment thread data_juicer/utils/model_utils.py
