diff --git a/README.md b/README.md index 929ebced..51c9caf2 100644 --- a/README.md +++ b/README.md @@ -59,7 +59,7 @@ The V-JEPA feature predictions are indeed grounded, and exhibit spatio-temporal 3072 VideoMix2M checkpoint - configs + configs ViT-H @@ -69,7 +69,7 @@ The V-JEPA feature predictions are indeed grounded, and exhibit spatio-temporal 3072 VideoMix2M checkpoint - configs + configs ViT-H @@ -79,7 +79,7 @@ The V-JEPA feature predictions are indeed grounded, and exhibit spatio-temporal 2400 VideoMix2M checkpoint - configs + configs @@ -97,21 +97,21 @@ The V-JEPA feature predictions are indeed grounded, and exhibit spatio-temporal 224x224 80.8 attentive probe checkpoint - configs + configs ViT-H/16 224x224 82.0 attentive probe checkpoint - configs + configs ViT-H/16 384x384 81.9 attentive probe checkpoint - configs + configs @@ -129,21 +129,21 @@ The V-JEPA feature predictions are indeed grounded, and exhibit spatio-temporal 224x224 69.5 attentive probe checkpoint - configs + configs ViT-H/16 224x224 71.4 attentive probe checkpoint - configs + configs ViT-H/16 384x384 72.2 attentive probe checkpoint - configs + configs @@ -161,21 +161,21 @@ The V-JEPA feature predictions are indeed grounded, and exhibit spatio-temporal 224x224 74.8 attentive probe checkpoint - configs + configs ViT-H/16 224x224 75.9 attentive probe checkpoint - configs + configs ViT-H/16 384x384 77.4 attentive probe checkpoint - configs + configs @@ -193,21 +193,21 @@ The V-JEPA feature predictions are indeed grounded, and exhibit spatio-temporal 224x224 60.3 attentive probe checkpoint - configs + configs ViT-H/16 224x224 61.7 attentive probe checkpoint - configs + configs ViT-H/16 384x384 62.8 attentive probe checkpoint - configs + configs @@ -225,21 +225,21 @@ The V-JEPA feature predictions are indeed grounded, and exhibit spatio-temporal 224x224 67.8 attentive probe checkpoint - configs + configs ViT-H/16 224x224 67.9 attentive probe checkpoint - configs + configs ViT-H/16 384x384 72.6 
attentive probe checkpoint - configs + configs @@ -330,7 +330,7 @@ For example, suppose we have a directory called ``my_image_datasets``. We would ### Local training If you wish to debug your code or setup before launching a distributed training run, we provide the functionality to do so by running the pretraining script locally on a multi-GPU (or single-GPU) machine, however, reproducing our results requires launching distributed training. -The single-machine implementation starts from the [app/main.py](appmain.py), which parses the experiment config file and runs the pretraining locally on a multi-GPU (or single-GPU) machine. +The single-machine implementation starts from the [app/main.py](app/main.py), which parses the experiment config file and runs the pretraining locally on a multi-GPU (or single-GPU) machine. For example, to run V-JEPA pretraining on GPUs "0", "1", and "2" on a local machine using the config [configs/pretrain/vitl16.yaml](configs/pretrain/vitl16.yaml), type the command: ```bash python -m app.main \ @@ -353,31 +353,31 @@ python -m app.main_distributed \ ### Local training If you wish to debug your eval code or setup before launching a distributed training run, we provide the functionality to do so by running the pretraining script locally on a multi-GPU (or single-GPU) machine, however, reproducing the full eval would require launching distributed training. -The single-machine implementation starts from the [eval/main.py](eval/main.py), which parses the experiment config file and runs the eval locally on a multi-GPU (or single-GPU) machine. +The single-machine implementation starts from the [evals/main.py](evals/main.py), which parses the experiment config file and runs the eval locally on a multi-GPU (or single-GPU) machine. 
-For example, to run ImageNet image classification on GPUs "0", "1", and "2" on a local machine using the config [configs/eval/vitl16_in1k.yaml](configs/eval/vitl16_in1k.yaml), type the command: +For example, to run ImageNet image classification on GPUs "0", "1", and "2" on a local machine using the config [configs/evals/vitl16_in1k.yaml](configs/evals/vitl16_in1k.yaml), type the command: ```bash python -m evals.main \ - --fname configs/eval/vitl16_in1k.yaml \ + --fname configs/evals/vitl16_in1k.yaml \ --devices cuda:0 cuda:1 cuda:2 ``` ### Distributed training -To launch a distributed evaluation run, the implementation starts from [eval/main_distributed.py](eval/main_distributed.py), which, in addition to parsing the config file, also allows for specifying details about distributed training. For distributed training, we use the popular open-source [submitit](https://github.com/facebookincubator/submitit) tool and provide examples for a SLURM cluster. +To launch a distributed evaluation run, the implementation starts from [evals/main_distributed.py](evals/main_distributed.py), which, in addition to parsing the config file, also allows for specifying details about distributed training. For distributed training, we use the popular open-source [submitit](https://github.com/facebookincubator/submitit) tool and provide examples for a SLURM cluster. 
-For example, to launch a distributed ImageNet image classification experiment using the config [configs/eval/vitl16_in1k.yaml](configs/eval/vitl16_in1k.yaml), type the command: +For example, to launch a distributed ImageNet image classification experiment using the config [configs/evals/vitl16_in1k.yaml](configs/evals/vitl16_in1k.yaml), type the command: ```bash python -m evals.main_distributed \ - --fname configs/eval/vitl16_in1k.yaml \ + --fname configs/evals/vitl16_in1k.yaml \ --folder $path_to_save_stderr_and_stdout \ --partition $slurm_partition ``` -Similarly, to launch a distributed K400 video classification experiment using the config [configs/eval/vitl16_k400.yaml](configs/eval/vitl16_k400.yaml), type the command: +Similarly, to launch a distributed K400 video classification experiment using the config [configs/evals/vitl16_k400_16x8x3.yaml](configs/evals/vitl16_k400_16x8x3.yaml), type the command: ```bash python -m evals.main_distributed \ - --fname configs/eval/vitl16_k400.yaml \ + --fname configs/evals/vitl16_k400_16x8x3.yaml \ --folder $path_to_save_stderr_and_stdout \ --partition $slurm_partition ``` diff --git a/src/datasets/utils/video/randaugment.py b/src/datasets/utils/video/randaugment.py index 4c80a990..8d1d6789 100644 --- a/src/datasets/utils/video/randaugment.py +++ b/src/datasets/utils/video/randaugment.py @@ -7,8 +7,8 @@ """ This implementation is based on -https://github.com/rwightman/pytorch-image-models/blob/master/timm/data/auto_augment.py -pulished under an Apache License 2.0. +https://github.com/huggingface/pytorch-image-models/blob/main/timm/data/auto_augment.py +published under an Apache License 2.0. 
""" import math diff --git a/src/datasets/utils/video/randerase.py b/src/datasets/utils/video/randerase.py index d1f185c8..b073588c 100644 --- a/src/datasets/utils/video/randerase.py +++ b/src/datasets/utils/video/randerase.py @@ -7,8 +7,8 @@ """ This implementation is based on -https://github.com/rwightman/pytorch-image-models/blob/master/timm/data/random_erasing.py -pulished under an Apache License 2.0. +https://github.com/huggingface/pytorch-image-models/blob/main/timm/data/random_erasing.py +published under an Apache License 2.0. """ import math import random
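The K400 paragraph above shows how easily a Markdown link's visible text and its target drift apart during a rename. A small checker like the following (a sketch, not part of the repo) can scan a README for path-like links whose text no longer matches the target; the regex and function name are illustrative, not an existing tool.

```python
import re

# Matches Markdown inline links: [text](target)
LINK_RE = re.compile(r"\[([^\]]+)\]\(([^)]+)\)")

def mismatched_links(markdown: str) -> list[tuple[str, str]]:
    """Return (text, target) pairs where a path-like link text differs from
    its target, e.g. [configs/evals/a.yaml](configs/evals/b.yaml)."""
    bad = []
    for text, target in LINK_RE.findall(markdown):
        # Only flag links whose visible text itself looks like a repo path;
        # external URLs are expected to differ from their text.
        if "/" in text and not target.startswith(("http://", "https://")):
            if text != target:
                bad.append((text, target))
    return bad
```

Running this over the pre-fix README line would flag the K400 link, while self-consistent links pass silently.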