[AAAI-24] VVS: Video-to-Video Retrieval With Irrelevant Frame Suppression [Project Page]
Official PyTorch implementation of VVS: Video-to-Video Retrieval With Irrelevant Frame Suppression
Paper: Video-to-Video Retrieval With Irrelevant Frame Suppression
- For quick verification, we provide a simple evaluation protocol. Fast evaluation of VVS on FIVR5K takes 3 steps:
  - Download the data from a Google Drive link.
  - Place the data as follows:
    - Place `pca.pkl` inside the `VVS/data/vcdb` folder.
    - Place `fivr5k_resnet50_l4imac` inside the `VVS/features` folder.
    - Place `table_benchmark_dim_3840` inside the `VVS/jobs` folder.
  - Run the following command to evaluate VVS on FIVR5K:
    bash experiments/review/fast_evaluation_fivr5k.sh
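Before launching the script, it can help to confirm that the three items are where the steps above say they should be. The following is a minimal sketch, not part of the repository: the helper name `missing_paths` is hypothetical, and it assumes it is run from the `VVS` repository root.

```python
from pathlib import Path

# Locations taken from the placement steps above, relative to the VVS root.
REQUIRED = [
    "data/vcdb/pca.pkl",
    "features/fivr5k_resnet50_l4imac",
    "jobs/table_benchmark_dim_3840",
]


def missing_paths(repo_root, required=REQUIRED):
    """Return the required paths that do not exist under repo_root."""
    root = Path(repo_root)
    return [rel for rel in required if not (root / rel).exists()]


if __name__ == "__main__":
    missing = missing_paths(".")
    if missing:
        print("Missing before fast evaluation:", ", ".join(missing))
    else:
        print("All items found; run: bash experiments/review/fast_evaluation_fivr5k.sh")
```

An empty list means the layout matches the steps; it does not validate the contents of the files.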
- Download the raw video dataset you want. The supported options are:
- If any videos turn out to be missing during the download, you should contact the author.
- Preparing raw videos is not essential, because we provide the features we used.
- If you do use raw videos, they should be arranged in the structure below.
├── videos
    ├── fivr
        └── videos
            ├── video_1
            ├── video_2
            └── ...
    ├── cc_web
        └── videos
            ├── video_1
            ├── video_2
            └── ...
    ├── evve
        └── videos
            ├── video_1
            ├── video_2
            └── ...
- For convenience, we provide the features we used. You can find them here.
- Before running, place the features inside the `VVS/features` folder.
├── features
    └── vcdb_resnet50_l4imac
        ├── features
            ├── feat_1
            ├── feat_2
            └── ...
    └── fivr_resnet50_l4imac
        ├── features
            ├── feat_1
            ├── feat_2
            └── ...
    └── cc_web_resnet50_l4imac
        ├── features
            ├── feat_1
            ├── feat_2
            └── ...
    └── evve_resnet50_l4imac
        ├── features
            ├── feat_1
            ├── feat_2
            └── ...
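To see at a glance which feature sets have been placed, a small script can count the entries in each folder of the tree above. This is a sketch under assumptions: the helper `count_features` is hypothetical, the folder names are copied from the tree, and nothing is assumed about the format of the individual feature files.

```python
from pathlib import Path

# Feature set folder names as they appear in the tree above.
FEATURE_SETS = (
    "vcdb_resnet50_l4imac",
    "fivr_resnet50_l4imac",
    "cc_web_resnet50_l4imac",
    "evve_resnet50_l4imac",
)


def count_features(features_root, feature_sets=FEATURE_SETS):
    """Map each feature set to its number of entries, or None if the folder is absent."""
    counts = {}
    for name in feature_sets:
        folder = Path(features_root) / name / "features"
        counts[name] = len(list(folder.iterdir())) if folder.is_dir() else None
    return counts


if __name__ == "__main__":
    for name, n in count_features("features").items():
        print(name, "not found" if n is None else f"{n} entries")
```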
- OS : Ubuntu 18.04
- CUDA : 10.2
- Python 3.7
- PyTorch 1.8.1, Torchvision 0.9.1
- GPU : NVIDIA Tesla V100 (32 GB)
Required packages are listed in environment.yaml. You can install them by running:
conda env create -f environment.yaml
conda activate VVS
If your GPU only supports CUDA 11.0 or later, you can install the packages by running:
conda env create -f environment_cuda11.yaml
conda activate VVS
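After activating the environment, you may want to compare the installed versions with the ones listed above (Python 3.7, PyTorch 1.8.1, Torchvision 0.9.1, CUDA 10.2). The sketch below is not part of the repository, and the helper name `describe_environment` is hypothetical. It imports `torch` and `torchvision` lazily, so it also runs where they are not installed.

```python
import platform


def describe_environment():
    """Collect Python, PyTorch, Torchvision, and CUDA information if available."""
    info = {
        "python": platform.python_version(),
        "torch": None,
        "torchvision": None,
        "cuda": None,
        "gpu_available": False,
    }
    try:
        import torch

        info["torch"] = torch.__version__
        info["cuda"] = torch.version.cuda  # None on CPU-only builds
        info["gpu_available"] = torch.cuda.is_available()
    except ImportError:
        pass
    try:
        import torchvision

        info["torchvision"] = torchvision.__version__
    except ImportError:
        pass
    return info


if __name__ == "__main__":
    for key, value in describe_environment().items():
        print(f"{key}: {value}")
```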
- Before running, place `pca.pkl` inside the `VVS/data/vcdb` folder, or calculate the PCA weights directly with `python cal_pca.py`.
- You can easily evaluate the model by running the provided script.
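For readers unfamiliar with the PCA weights mentioned above, the following NumPy sketch illustrates the general idea of fitting a PCA projection on a matrix of features. It is not the repository's `cal_pca.py`: the function names are hypothetical, the whitening step is an assumption, and the on-disk layout of `pca.pkl` is not described here, so the sketch does not read or write that file.

```python
import numpy as np


def fit_pca_whitening(features, n_components, eps=1e-8):
    """Fit PCA whitening on an (N, D) array.

    Returns the feature mean, shape (D,), and a projection matrix,
    shape (D, n_components), whose columns are the leading eigenvectors
    scaled by the inverse square root of their eigenvalues.
    """
    mean = features.mean(axis=0)
    centered = features - mean
    cov = centered.T @ centered / (len(features) - 1)
    eigvals, eigvecs = np.linalg.eigh(cov)  # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]
    proj = eigvecs[:, order] / np.sqrt(eigvals[order] + eps)
    return mean, proj


def apply_pca(features, mean, proj):
    """Center the features and project them onto the fitted components."""
    return (features - mean) @ proj


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    feats = rng.normal(size=(200, 8))  # synthetic stand-in for real features
    mean, proj = fit_pca_whitening(feats, n_components=4)
    print(apply_pca(feats, mean, proj).shape)
```

On the data used for fitting, the projected features have approximately unit variance and are decorrelated, which is the usual motivation for whitening before computing similarities.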
Please follow the instructions in README.md for training and evaluation.
We provide checkpoints to successfully reproduce our benchmark experiments.
- You can run the script according to the feature dimension.
| Dataset | script |
|---|---|
| FIVR5K | $ bash experiments/main_script/train/table_benchmark/eval_benchmark_fivr5k_dim_{dim}.sh |
| FIVR200K | $ bash experiments/main_script/train/table_benchmark/eval_benchmark_fivr200k_dim_{dim}.sh |
| CC_WEB_VIDEO | $ bash experiments/main_script/train/table_benchmark/eval_benchmark_cc_web_dim_{dim}.sh |
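The `{dim}` placeholder in the table can be filled in programmatically. The sketch below builds the command string for a given dataset and feature dimension. The helper `eval_command` is hypothetical, and it assumes that `{dim}` takes the dimensions that appear elsewhere in this README (500, 512, 1024, 3840); check the `experiments` folder for the scripts that actually exist.

```python
SCRIPT_DIR = "experiments/main_script/train/table_benchmark"

# Dataset names from the table mapped to the keys used in the script names.
DATASET_KEYS = {
    "FIVR5K": "fivr5k",
    "FIVR200K": "fivr200k",
    "CC_WEB_VIDEO": "cc_web",
}


def eval_command(dataset, dim):
    """Return the shell command for evaluating a dataset at a feature dimension."""
    key = DATASET_KEYS[dataset]  # raises KeyError for datasets not in the table
    return f"bash {SCRIPT_DIR}/eval_benchmark_{key}_dim_{dim}.sh"


if __name__ == "__main__":
    print(eval_command("FIVR5K", 3840))
```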
| Usage | Method | Train dataset | DSVR | CSVR | ISVR |
|---|---|---|---|---|---|
| frame | TN | VCDB | 0.724 | 0.699 | 0.589 |
| | DP | VCDB | 0.775 | 0.740 | 0.632 |
| | TCAsym | VCDB | 0.728 | 0.698 | 0.592 |
| | TCAf | VCDB | 0.877 | 0.830 | 0.703 |
| | SCFV+NIP256 | VCDB | 0.819 | 0.764 | 0.622 |
| | SCFV+TNIP256 | VCDB | 0.896 | 0.833 | 0.674 |
| | ViSiLsym | VCDB | 0.833 | 0.792 | 0.654 |
| | ViSiLf | VCDB | 0.843 | 0.797 | 0.660 |
| | ViSiLv | VCDB | 0.892 | 0.841 | 0.702 |
| | DnS(SfA) | DnS-100K | 0.921 | 0.875 | 0.741 |
| video | HC | VCDB | 0.265 | 0.247 | 0.193 |
| | DML | VCDB | 0.398 | 0.378 | 0.309 |
| | TMK | VCDB | 0.417 | 0.394 | 0.319 |
| | LAMV | VCDB | 0.489 | 0.459 | 0.364 |
| | VRAG | VCDB | 0.484 | 0.470 | 0.399 |
| | TCAc | VCDB | 0.570 | 0.553 | 0.473 |
| | DnS(Sc) | DnS-100K | 0.574 | 0.558 | 0.476 |
| | VVS500(Ours) | VCDB | 0.606 | 0.588 | 0.502 |
| | VVS512(Ours) | VCDB | 0.608 | 0.590 | 0.505 |
| | VVS1024(Ours) | VCDB | 0.645 | 0.627 | 0.536 |
| | VVS3840(Ours) | VCDB | 0.711 | 0.689 | 0.590 |
| Usage | Method | Train dataset | cc_web | cc_web* | cc_webc | cc_webc* |
|---|---|---|---|---|---|---|
| frame | TN | VCDB | 0.978 | 0.965 | 0.991 | 0.987 |
| | DP | VCDB | 0.975 | 0.958 | 0.990 | 0.982 |
| | CTE | VCDB | 0.996 | - | - | - |
| | TCAsym | VCDB | 0.982 | 0.962 | 0.992 | 0.981 |
| | TCAf | VCDB | 0.983 | 0.969 | 0.994 | 0.990 |
| | SCFV+NIP256 | VCDB | 0.973 | 0.953 | 0.976 | 0.959 |
| | SCFV+TNIP256 | VCDB | 0.978 | 0.969 | 0.983 | 0.975 |
| | ViSiLsym | VCDB | 0.982 | 0.969 | 0.991 | 0.988 |
| | ViSiLf | VCDB | 0.984 | 0.969 | 0.993 | 0.987 |
| | ViSiLv | VCDB | 0.985 | 0.971 | 0.996 | 0.993 |
| | DnS(SfA) | DnS-100K | 0.984 | 0.973 | 0.995 | 0.992 |
| video | HC | VCDB | 0.958 | - | - | - |
| | DML | VCDB | 0.971 | 0.941 | 0.979 | 0.959 |
| | VRAG | VCDB | 0.971 | 0.952 | 0.980 | 0.967 |
| | TCAc | VCDB | 0.973 | 0.947 | 0.983 | 0.965 |
| | DnS(Sc) | DnS-100K | 0.972 | 0.952 | 0.980 | 0.967 |
| | VVS500(Ours) | VCDB | 0.973 | 0.952 | 0.981 | 0.966 |
| | VVS512(Ours) | VCDB | 0.973 | 0.952 | 0.981 | 0.967 |
| | VVS1024(Ours) | VCDB | 0.973 | 0.952 | 0.982 | 0.969 |
| | VVS3840(Ours) | VCDB | 0.975 | 0.955 | 0.984 | 0.973 |
We referenced the repos below for the code.
If you have any questions or comments, please open an issue.

