Add high-resolution fine-tuning stage (448px tiles, magnification-aware)#68
Open
dat-rohit wants to merge 2 commits into MedARC-AI:main from
Conversation
Implements Phase 2 post-training: fine-tune from a Phase 1 checkpoint on 448px tiles (2x of 224) at halved magnifications [1, 0.5, 0.25, 0.125] µm/px, preserving physical tissue regions while doubling pixel resolution. The ViT processes 392px global crops (784 tokens vs 256 at 224px).

Key changes:
- Magnification-aware sample list generator targeting specific µm/px values with multiprocessing support
- SlideDataset: parse optional read_size from sample list, resize to patch_size_pixels
- train.py: _load_from_teacher_checkpoint() with pos_embed interpolation, gradient accumulation, eval transform uses cfg.crops.global_crops_size
- ssl_meta_arch.py: loss_scale parameter for gradient accumulation
- config.py: LR scaling accounts for gradient accumulation steps
- New config vitg14_reg4_highres.yaml: 448px tiles, 392/168 crops, batch=6, accum=4, lr=1e-4, 120k iterations

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Suppresses unnecessary all-reduce on non-final micro-steps. With accum_steps=4, this eliminates 3 redundant gradient synchronizations per optimizer step across GPUs. Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
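The two commits above describe scaling the loss by 1/accum_steps and skipping the DDP all-reduce on non-final micro-steps. A minimal sketch of that pattern, assuming the names `accumulation_step`, `model`, and `batches` are illustrative rather than taken from the actual train.py:

```python
import contextlib
import torch
import torch.nn as nn

def accumulation_step(model, batches, optimizer, accum_steps):
    """One optimizer step spread over accum_steps micro-batches.

    On non-final micro-steps, DDP's no_sync() suppresses the gradient
    all-reduce; gradients are synchronized only on the last micro-step.
    """
    optimizer.zero_grad(set_to_none=True)
    for i, batch in enumerate(batches):
        is_last = i == accum_steps - 1
        # Skip cross-GPU gradient sync except on the final micro-step.
        if is_last or not isinstance(model, nn.parallel.DistributedDataParallel):
            ctx = contextlib.nullcontext()
        else:
            ctx = model.no_sync()
        with ctx:
            loss = model(batch).sum()          # placeholder loss
            (loss / accum_steps).backward()    # loss_scale = 1 / accum_steps
    optimizer.step()
```

With accum_steps=4 this performs one gradient synchronization per optimizer step instead of four, matching the commit message.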
Summary
Implements Phase 2 high-resolution post-training from the Midnight paper (Section 2). Fine-tunes a Phase 1 checkpoint on 448px tiles at halved magnifications, preserving physical tissue regions while doubling pixel resolution. The ViT processes 392px global crops (28x28 = 784 tokens vs 16x16 = 256).
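Going from a 16x16 to a 28x28 patch grid means the learned position embeddings must be resized when loading the Phase 1 checkpoint. A sketch of that interpolation, with shapes taken from this PR's test plan; the function name and the bicubic choice are assumptions, not confirmed from the actual train.py:

```python
import torch
import torch.nn.functional as F

def interpolate_pos_embed(pos_embed, new_grid=28):
    """Resize ViT position embeddings [1, 1+g*g, D] -> [1, 1+new_grid**2, D].

    The class-token embedding is kept unchanged; patch embeddings are
    bicubically interpolated on their 2D grid.
    """
    cls_tok, patch = pos_embed[:, :1], pos_embed[:, 1:]
    dim = patch.shape[-1]
    old_grid = int(patch.shape[1] ** 0.5)   # 16 for 224px crops, patch size 14
    patch = patch.reshape(1, old_grid, old_grid, dim).permute(0, 3, 1, 2)
    patch = F.interpolate(patch, size=(new_grid, new_grid),
                          mode="bicubic", align_corners=False)
    patch = patch.permute(0, 2, 3, 1).reshape(1, new_grid * new_grid, dim)
    return torch.cat([cls_tok, patch], dim=1)
```

Applied to a [1, 257, 1536] embedding (256 patches + class token), this yields [1, 785, 1536], the shape change listed in the test plan.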
Key design decisions (validated against paper, codebase, and actual TCGA SVS data):
- Physical tissue regions are preserved: 224 * mpp = 448 * (mpp/2). Midnight uses 256→512 (same 2x ratio).
- Target magnifications [1.0, 0.5, 0.25, 0.125] µm/px are reached by computing read_size at the best SVS level and resizing. SVS levels are 4x apart, so intermediate magnifications are synthesized.

Changes
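The read-size arithmetic above can be sketched without OpenSlide. The level mpp values and the best-level rule here are assumptions inferred from the description (SVS pyramid levels 4x apart, finest at 0.25 µm/px); `plan_read` is an illustrative name, not the actual generator's API:

```python
def plan_read(target_mpp, tile_px=448, level_mpps=(0.25, 1.0, 4.0)):
    """Pick the finest SVS level not coarser than target_mpp, and compute
    how many native pixels to read so that resizing to tile_px yields
    exactly target_mpp um/px. Physical extent: tile_px * target_mpp um.
    """
    # Levels whose resolution is at least as fine as the target;
    # fall back to the finest level when the target is finer than level 0
    # (the tile is then upsampled, i.e. the magnification is synthesized).
    candidates = [(lvl, mpp) for lvl, mpp in enumerate(level_mpps)
                  if mpp <= target_mpp] or [(0, level_mpps[0])]
    level, level_mpp = max(candidates, key=lambda c: c[1])
    read_size = round(tile_px * target_mpp / level_mpp)
    return level, read_size

# Halved magnifications from this PR; the physical region is preserved
# because 448 px * (mpp/2) == 224 px * mpp.
for mpp in [1.0, 0.5, 0.25, 0.125]:
    print(mpp, plan_read(mpp))
```

For example, targeting 0.5 µm/px reads 896 native pixels at level 0 (0.25 µm/px) and downsizes to 448, while 0.125 µm/px reads 224 pixels and upsizes, since no finer level exists.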
- prepatching_scripts/create_sample_dataset_txt_highres.py: Magnification-aware sample list generator with multiprocessing. Outputs `path x y level read_size` format.
- dinov2/data/datasets/slide_dataset.py: Parse optional read_size field, resize to patch_size_pixels
- dinov2/train/train.py: _load_from_teacher_checkpoint() with pos_embed interpolation, gradient accumulation loop, eval at cfg.crops.global_crops_size
- dinov2/train/ssl_meta_arch.py: loss_scale parameter for gradient accumulation
- dinov2/utils/config.py: LR scaling includes accumulation steps
- dinov2/configs/train/vitg14_reg4_highres.yaml: Phase 2 config
- run_highres_finetune.sh: Launch script

How to use
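The config.py change listed above folds gradient-accumulation steps into the effective batch used for LR scaling. Upstream DINOv2 scales the base LR by the square root of the global batch over a reference of 1024; treat the exact formula and the 8-GPU example as assumptions:

```python
import math

def scaled_lr(base_lr, batch_per_gpu, num_gpus, accum_steps, ref_batch=1024):
    """Square-root LR scaling over the effective (accumulated) global batch."""
    effective_batch = batch_per_gpu * num_gpus * accum_steps
    return base_lr * math.sqrt(effective_batch / ref_batch)

# Phase 2 config from this PR: batch=6 per GPU, accum=4, lr=1e-4.
# num_gpus=8 is an illustrative assumption, not stated in the PR.
print(scaled_lr(1e-4, 6, 8, 4))
```

Without the change, only `batch_per_gpu * num_gpus` would enter the scaling, so a run with accum=4 would train at half the intended LR.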
Test plan
- Physical crop coverage [448, 224, 112, 56] µm matches Phase 1's [448, 224, 112, 56] µm
- pos_embed interpolation: [1, 257, 1536] → [1, 785, 1536]

🤖 Generated with Claude Code