This is the repository for DIVE (EMNLP 2023).
Most of the code in this repo is copied or modified from KM-BART.
- Clone the repository recursively.

```bash
git clone --recursive https://github.com/Park-ing-lot/DIVE.git
```
- Install the requirements. You can adjust the CUDA or PyTorch version to match your environment.

```bash
pip install -r requirments.txt
```
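If you change the CUDA or PyTorch version, a quick sanity check (assuming PyTorch is among the installed requirements) confirms the install before moving on:

```python
# Sanity check: report the installed PyTorch build and whether CUDA is usable.
import torch

print(torch.__version__)          # installed PyTorch version
print(torch.cuda.is_available())  # True if a CUDA-capable GPU is visible
```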
- Download the text data and visual features from VisualCOMET and extract them into `data`.

```bash
mkdir data
cd data
unzip visualcomet.zip
unzip features.zip
```
- Download the images from VCR and extract them into `data`.

```bash
cd data
unzip vcr1images.zip
```
- Unzip the filtered annotations and the unique/novel reference files (used for validation) and put them into `data/visualcomet/`.

```bash
unzip filtered_vcg.zip
mv filtered_vcg/* data/visualcomet/
```
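To confirm the annotations landed in the right place, a minimal peek (assuming the filtered files keep the public VisualCOMET JSON layout, i.e. a list of per-image records) looks like:

```python
# Inspect the first annotation record; field names follow the public
# VisualCOMET release (assumption: the filtered files keep that schema).
import json

with open("data/visualcomet/train_annots.json") as f:
    annots = json.load(f)

print(len(annots))        # number of annotation records
print(sorted(annots[0]))  # e.g. img_fn, event, place, intent, before, after
```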
- Prepare the training data.

```bash
mkdir data/dive
python -m scripts.prepare_vcg --output_dir data/dive --annot_dir data/visualcomet/ --data_dir data/vcr/vcr1images/
```
- Train DIVE.

```bash
mkdir output
python vcg_train_crl.py --data_dir data/dive/ --checkpoint facebook/bart-base --validate_score --validate_loss --dropout 0.3 --batch_size 256 --lr 5e-5 --do_sample
```
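Training starts from the `facebook/bart-base` checkpoint. If you want to make sure that checkpoint downloads cleanly before launching a long run, a minimal check with Hugging Face `transformers` (assumed to be in the requirements) is:

```python
# Pre-download and sanity-check the base checkpoint passed via --checkpoint above.
from transformers import BartForConditionalGeneration, BartTokenizer

tok = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")
print(f"{sum(p.numel() for p in model.parameters()):,} parameters")  # ~139M for bart-base
```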
- Generate inferences with nucleus sampling.

```bash
mkdir generated
python vcg_generate_gt.py --do_sample --top_p 0.9 --num_beams 1 --data_dir data/dive/ --output_file generated/dive --checkpoint {model path}
```
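The `--do_sample --top_p 0.9 --num_beams 1` flags select nucleus (top-p) sampling. As a standalone illustration of what those decoding flags do (this uses a plain BART model, not DIVE's fine-tuned checkpoint), consider:

```python
# Nucleus (top-p) sampling with Hugging Face generate(); mirrors the flags above.
import torch
from transformers import BartForConditionalGeneration, BartTokenizer

tok = BartTokenizer.from_pretrained("facebook/bart-base")
model = BartForConditionalGeneration.from_pretrained("facebook/bart-base")

inputs = tok("2 is smiling at 1 because", return_tensors="pt")
torch.manual_seed(0)  # sampling is stochastic; seed for repeatability
out = model.generate(**inputs, do_sample=True, top_p=0.9, num_beams=1, max_length=30)
print(tok.decode(out[0], skip_special_tokens=True))
```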
- Evaluate the generated inferences.

```bash
python vcg_eval.py --generation generated/dive --reference data/dive/val_ref.json --annotation data/visualcomet/train_annots.json
```
- Evaluate on the unique validation subset.

```bash
python vcg_eval.py --generation generated/dive --reference data/visualcomet/val_unique_ref.json --annotation data/visualcomet/train_annots.json
```
- Evaluate on the novel validation subset.

```bash
python vcg_eval.py --generation generated/dive --reference data/visualcomet/val_novel_ref.json --annotation data/visualcomet/train_annots.json
```
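The evaluation script scores generations against the reference files. As a toy illustration of reference-based scoring (this is sentence-level BLEU via NLTK, not the metric suite that `vcg_eval.py` computes):

```python
# Toy reference-based scoring example; vcg_eval.py computes its own metric suite.
from nltk.translate.bleu_score import SmoothingFunction, sentence_bleu

reference = "1 wants to greet 2".split()
hypothesis = "1 wants to say hello to 2".split()
score = sentence_bleu([reference], hypothesis,
                      smoothing_function=SmoothingFunction().method1)
print(f"BLEU: {score:.3f}")
```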