This dataset is designed for the Temporally-Constrained Video Reasoning Segmentation (TCVideoRS) task in the operating room setting. It is built on the existing video object identification dataset MVOR, which is released under the CC BY-NC-SA 4.0 License. By downloading, using, or sharing this dataset, you agree to comply with that license.
The manuscript is available at https://arxiv.org/pdf/2507.16718.
You can first clone the repository:
git clone https://github.com/arcadelab/TCVideoRSBenchmark.git
Then unzip mask.zip into the same directory as the data file data.json.
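For readers who prefer to script the extraction step, here is a minimal Python sketch. It assumes that mask.zip sits in the cloned repository directory next to data.json. The function name `extract_masks` and the directory name in the usage note are illustrative and not part of the repository.

```python
import zipfile
from pathlib import Path


def extract_masks(repo_dir, archive_name="mask.zip"):
    """Extract the mask archive into the directory that holds data.json.

    repo_dir and archive_name are assumptions about the local layout;
    adjust them if the files live elsewhere.
    """
    repo_dir = Path(repo_dir)
    archive = repo_dir / archive_name
    with zipfile.ZipFile(archive) as zf:
        # Extract next to data.json so the mask paths stay in the same place.
        zf.extractall(repo_dir)
        return zf.namelist()
```

For example, `extract_masks("TCVideoRSBenchmark")` would unpack the archive in place and return the list of extracted member names, assuming the clone used the default directory name.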
Original video is available here.
The data file data.json contains 52 entries, each consisting of a query and a path to the ground truth masks, for 4 videos. For more information on the videos, please refer to MVOR.
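A quick way to check the file is to load it and look at the first entry. The sketch below deliberately makes no assumption about the field names inside data.json, since they are not documented here. It only assumes the file is valid JSON whose top level is either a list of entries or a dictionary mapping identifiers to entries. `load_entries` and `summarize` are hypothetical helper names.

```python
import json


def load_entries(json_path):
    """Load data.json and return its entries as a list."""
    with open(json_path, encoding="utf-8") as f:
        data = json.load(f)
    # Assumption: the top level is a list of entries or a dict of entries.
    if isinstance(data, dict):
        return list(data.values())
    return list(data)


def summarize(entries):
    """Return the entry count and the field names of the first entry, if any."""
    first = entries[0] if entries else None
    fields = sorted(first.keys()) if isinstance(first, dict) else []
    return len(entries), fields
```

Calling `summarize(load_entries("data.json"))` should report 52 entries if the file matches the description above, along with whatever field names the entries actually use.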
The ground truth file mask.zip contains the ground truth masks for the queries in data.json. For every entry, masks of frames that fall outside the time span constrained by the query are empty (all pixels are black).
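Because frames outside the queried time span carry all-black masks, the temporal extent of a query can be recovered by finding the frames whose mask has at least one non-zero pixel. The sketch below illustrates that idea on NumPy arrays. It assumes each mask has already been loaded into an array by an image library of your choice, since the mask file format is not specified here. Both helper names are hypothetical.

```python
import numpy as np


def is_empty_mask(mask):
    """Return True if every pixel in the mask is zero (black)."""
    return not np.asarray(mask).any()


def active_frame_indices(masks):
    """Return the indices of frames whose mask has a non-zero pixel.

    masks is a sequence of per-frame arrays in temporal order
    (an assumption about how the caller loaded them).
    """
    return [i for i, m in enumerate(masks) if not is_empty_mask(m)]


# Tiny synthetic example: only the middle frame contains a foreground pixel.
frames = [np.zeros((4, 4), dtype=np.uint8) for _ in range(3)]
frames[1][2, 2] = 255
print(active_frame_indices(frames))  # [1]
```

The same check is useful at evaluation time: a prediction that marks any pixel on a frame with an empty ground truth mask is segmenting outside the time span the query allows.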
The TCVideoRSBenchmark data is released for non-commercial research purposes only.
If you use this code or dataset in your research, please cite our paper:
@article{shen2025temporally,
title={Temporally-Constrained Video Reasoning Segmentation and Automated Benchmark Construction},
author={Shen, Yiqing and Li, Chenjia and Fan, Chenxiao and Unberath, Mathias},
journal={arXiv preprint arXiv:2507.16718},
year={2025}
}