
Conversation

@ankitade (Contributor) commented Jun 26, 2022

  • Add an accuracy metric for finetuning
  • Add checkpoint saving and best-checkpoint loading based on validation accuracy (a rough sketch of how these pieces fit together follows the test plan output below)
  • Load the pretrained checkpoint by default in the classification model
  • Set the number of GPUs to 1 in qnli.yaml

Test plan:
python -m flava.finetune config=flava/configs/finetuning/qnli.yaml
(val acc: 0.8651)

Loaded model weights from checkpoint at /data/home/deankita/torchmultimodal/examples/flava-epoch=03-step=10000.ckpt
/data/home/deankita/miniconda/envs/flava/lib/python3.8/site-packages/pytorch_lightning/trainer/connectors/data_connector.py:330: PossibleUserWarning: Using DistributedSampler with the dataloaders. During trainer.validate(), it is recommended to use Trainer(devices=1) to ensure each sample/batch gets evaluated exactly once. Otherwise, multi-device settings use DistributedSampler that replicates some samples to make sure all devices have same batch size in case of uneven inputs.
rank_zero_warn(
Validation DataLoader 0: 100%|████████████████████████████████████████████████████████████████████████████████████████████████| 171/171 [00:54<00:00, 3.15it/s]
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Validate metric ┃ DataLoader 0 ┃
┡━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┩
│ validation/accuracy/classification │ 0.8651315569877625 │
│ validation/losses/classification │ 0.4168359339237213 │
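
A minimal sketch (not the actual flava.finetune code) of how the pieces above could fit together in PyTorch Lightning. The module class, the datamodule, and the model output fields (out.logits, out.loss, batch["labels"]) are illustrative assumptions, not the real torchmultimodal API.

```python
import torch
from pytorch_lightning import LightningModule, Trainer
from pytorch_lightning.callbacks import ModelCheckpoint
from torchmetrics import Accuracy


class FinetuneClassificationModule(LightningModule):  # illustrative name
    def __init__(self, model: torch.nn.Module, num_classes: int = 2):
        super().__init__()
        self.model = model
        # torchmetrics accumulates correctly across batches and devices
        self.val_accuracy = Accuracy(task="multiclass", num_classes=num_classes)

    def validation_step(self, batch, batch_idx):
        out = self.model(**batch)  # assumes the model returns .logits and .loss
        self.val_accuracy(out.logits, batch["labels"])
        self.log("validation/accuracy/classification", self.val_accuracy, prog_bar=True)
        self.log("validation/losses/classification", out.loss)


# Save the single best checkpoint, keyed on the validation accuracy logged above.
checkpoint_callback = ModelCheckpoint(
    monitor="validation/accuracy/classification",
    mode="max",  # higher accuracy is better
    save_top_k=1,
    filename="flava-{epoch:02d}-{step}",
)

# devices=1 for validation, per the PossibleUserWarning in the log above:
# with multiple devices, DistributedSampler may replicate samples on uneven inputs.
trainer = Trainer(devices=1, max_epochs=4, callbacks=[checkpoint_callback])
# trainer.fit(module, datamodule=datamodule)
# trainer.validate(module, datamodule=datamodule,
#                  ckpt_path=checkpoint_callback.best_model_path)
```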

Stack from ghstack (oldest at bottom):

Differential Revision: [D37444938](https://our.internmc.facebook.com/intern/diff/D37444938)

@facebook-github-bot added the CLA Signed label Jun 26, 2022
ankitade added a commit that referenced this pull request Jun 26, 2022
@codecov-commenter commented Jun 26, 2022

Codecov Report

❗ No coverage uploaded for pull request base (gh/ankitade/7/base@1ae2073).
The diff coverage is n/a.

@@                  Coverage Diff                  @@
##             gh/ankitade/7/base     #119   +/-   ##
=====================================================
  Coverage                      ?   88.59%           
=====================================================
  Files                         ?       40           
  Lines                         ?     2087           
  Branches                      ?        0           
=====================================================
  Hits                          ?     1849           
  Misses                        ?      238           
  Partials                      ?        0           

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1ae2073...5fb84e8.

@ankitade marked this pull request as ready for review June 26, 2022 17:42
@ankitade has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

ankitade added a commit that referenced this pull request Jun 26, 2022

ankitade added a commit that referenced this pull request Jun 30, 2022
classifier_activation: Callable[..., nn.Module] = nn.ReLU,
classifier_normalization: Optional[Callable[..., nn.Module]] = None,
loss_fn: Optional[Callable[..., Tensor]] = None,
pretrained_model_key: Optional[str] = "flava_full",
A reviewer (Contributor) commented on the new pretrained_model_key parameter:
Just wondering, why do we want to add this as a param to flava_model_for_classification? Feels to me like this class should not care about the pretrained weights. I like TorchVision's approach for handling this, maybe we can do something similar?

ankitade (Contributor, Author) replied:
I am just doing this for "short-term uniformity" with flava_for_pretraining. But yeah, I agree, TorchVision's approach is nicer and something we can potentially follow.
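
For context on the alternative being discussed, a rough sketch of the two options. Everything here is hypothetical: the builder names, the weights enum, the URL, and the stand-in model are not the actual torchmultimodal or torchvision APIs.

```python
from enum import Enum
from typing import Optional

from torch import nn
from torch.hub import load_state_dict_from_url


def _build_classification_model(num_classes: int) -> nn.Module:
    # Stand-in for the real FLAVA classification model.
    return nn.Linear(768, num_classes)


# Option 1 (roughly what this PR does): the builder takes a string key and
# loads pretrained weights by default, so callers opt out rather than opt in.
def flava_model_for_classification(
    num_classes: int, pretrained_model_key: Optional[str] = "flava_full"
) -> nn.Module:
    model = _build_classification_model(num_classes)
    if pretrained_model_key is not None:
        ...  # resolve the key to a checkpoint and load it into the model
    return model


# Option 2 (TorchVision-style): weights are an explicit enum argument, and the
# model/builder never hard-codes a default checkpoint.
class FLAVAClassificationWeights(Enum):  # hypothetical enum
    FLAVA_FULL = "https://example.com/flava_full.pt"  # placeholder URL


def flava_classification(
    num_classes: int, weights: Optional[FLAVAClassificationWeights] = None
) -> nn.Module:
    model = _build_classification_model(num_classes)
    if weights is not None:
        model.load_state_dict(load_state_dict_from_url(weights.value))
    return model
```

The enum approach keeps checkpoint handling out of the model itself, which is the separation the review comment above is pointing at.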

@ankitade commented Jul 5, 2022

@apsdehal I am going to merge this today. Please feel free to leave comments either way and ping me; I can address them in a separate PR if I end up merging before you review.

@facebook-github-bot deleted the gh/ankitade/7/head branch July 9, 2022 14:15