Summary
This PR introduces code to reproduce the main results of two papers: "Scaling Channel-Adaptive Self-Supervised Learning", published in TMLR a few months ago (https://openreview.net/forum?id=pT8sgtRVAf), and "Cell-DINO: Self-Supervised Image-based Embeddings for Cell Fluorescent Microscopy", currently in revision.
The code includes a main README_CELL-DINO_AND_CHANNEL-DINO.md that points to two separate READMEs, one for each paper.
We list the changes in 3 categories:
1- New non-.py files and documentation: licenses, READMEs, launch scripts to reproduce some results, and configuration files.
LICENSE_CELLDINO
LICENSE_CELLDINO_WEIGHTS
README.md
README_CELL-DINO_AND_CHANNEL-DINO.md
docs/Cell-DINO.png
docs/README_CELLDINO.md
docs/README_CHANNEL_ADAPTIVE_DINO.md
docs/launcher_CHAMMI_eval.sh
docs/launcher_knn_eval_on_chammi.sh
docs/test_inference_celldino.py
dinov2/configs/eval/celldino_hpaone.yaml
dinov2/configs/eval/channeldino_ext_chammi.yaml
dinov2/configs/train/hpafov_vitl16.yaml
dinov2/configs/train/hpafov_vitl16_boc.yaml
dinov2/configs/train/hpaone_vitl16.yaml
2- New .py files: data and evaluation files
These new files contain data and checkpointing helpers (dinov2/data/accumulators.py; dinov2/utils/checkpoint.py, which provides a function needed by the linear_celldino eval), transforms specific to pretraining (cell_augmentations.py) and to evaluation (dinov2/data/transforms_cells.py), and new data loaders (dinov2/data/datasets/chammi_cp.py, dinov2/data/datasets/chammi_hpa.py, dinov2/data/datasets/chammi_wtc.py, dinov2/data/datasets/hpafov.py, dinov2/data/datasets/hpaone.py).
Specific knn and linear evaluation files have been created with new options:
dinov2/eval/utils_celldino.py
dinov2/eval/knn_celldino.py
dinov2/eval/linear_celldino.py
dinov2/run/eval/knn_celldino.py
dinov2/run/eval/linear_celldino.py
3- Modified files
dinov2/configs/ssl_default_config.yaml
dinov2/data/__init__.py : added necessary imports
dinov2/data/adapters.py : implemented padding in DatasetWithEnumeratedTargets
dinov2/data/datasets/__init__.py : added necessary imports
dinov2/data/datasets/decoders.py : introduced new decoders necessary to load multi-channel images
dinov2/data/datasets/extended.py : introduced options to use different decoders
dinov2/data/loaders.py : added necessary imports to use the new datasets
dinov2/eval/metrics.py : added multi-label and multi-class F1 score metrics
dinov2/hub/backbones.py: introduced Celldino models
dinov2/models/__init__.py : propagated the in_chans (number of channels) and channel_adaptive parameters in build_model
dinov2/models/vision_transformer.py: implemented the channel adaptive option
dinov2/train/train.py : selects the pretraining augmentation according to the config option train.cell_augmentation
dinov2/utils/cluster.py : useful modification to test this PR.