
Add high-resolution fine-tuning stage (448px tiles, magnification-aware)#68

Open
dat-rohit wants to merge 2 commits into MedARC-AI:main from dat-rohit:feat/highres-finetuning-v2

Conversation

@dat-rohit

Summary

Implements Phase 2 high-resolution post-training from the Midnight paper (Section 2). Fine-tunes a Phase 1 checkpoint on 448px tiles at halved magnifications, preserving physical tissue regions while doubling pixel resolution. The ViT processes 392px global crops (28x28 = 784 tokens vs 16x16 = 256).

Key design decisions (validated against paper, codebase, and actual TCGA SVS data):

  • 448px tiles (not 512): OpenMidnight Phase 1 uses 224px tiles, so 2x = 448. This preserves the physical-area invariant (224 * mpp = 448 * (mpp/2)). Midnight uses 256→512 (same 2x ratio).
  • Magnification-aware sampling: Targets specific µm/px values [1.0, 0.5, 0.25, 0.125] by computing read_size at the best SVS level and resizing. SVS levels are 4x apart, so intermediate magnifications are synthesized.
  • Loads backbone + heads: Loads DINO and iBOT heads from checkpoint (not just backbone), preserving Phase 1 training. Interpolates pos_embed from 16x16 to 28x28.
  • LR scaling fix: Accounts for gradient accumulation in effective batch size calculation.
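The magnification-aware sampling above can be sketched roughly as follows. This is a hedged illustration, not the actual script's API: `plan_read`, `base_mpp`, and `level_downsamples` are hypothetical names standing in for the logic of picking the best SVS pyramid level and the region size to read so the resized 448px tile hits a target µm/px.

```python
# Illustrative sketch of magnification-aware read-size planning.
# All names here are assumptions, not the PR's actual code.

def plan_read(target_mpp, base_mpp, level_downsamples, tile_px=448):
    """Pick the best SVS level and the read_size (in level pixels) so that,
    after resizing to tile_px, the tile has ~target_mpp microns per pixel."""
    # Physical extent the final tile must cover, in microns.
    physical_um = tile_px * target_mpp
    # Choose the lowest-resolution level that is still at least as fine as
    # the target (read finer and downsample; never upsample).
    best = 0
    for lvl, ds in enumerate(level_downsamples):
        if base_mpp * ds <= target_mpp:
            best = lvl
    level_mpp = base_mpp * level_downsamples[best]
    read_size = round(physical_um / level_mpp)
    return best, read_size
```

For example, with a 0.25 µm/px base level and 4x-spaced levels, a 0.5 µm/px target must be synthesized from level 0 with a 896px read resized down to 448px; the resulting tile covers 224 µm, matching a 224px Phase 1 tile at 1.0 µm/px.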

Changes

  • prepatching_scripts/create_sample_dataset_txt_highres.py — Magnification-aware sample list generator with multiprocessing. Outputs `path x y level read_size` lines.
  • dinov2/data/datasets/slide_dataset.py — Parse optional read_size field, resize to patch_size_pixels
  • dinov2/train/train.py — _load_from_teacher_checkpoint() with pos_embed interpolation, gradient accumulation loop, eval at cfg.crops.global_crops_size
  • dinov2/train/ssl_meta_arch.py — loss_scale parameter for gradient accumulation
  • dinov2/utils/config.py — LR scaling includes accumulation steps
  • dinov2/configs/train/vitg14_reg4_highres.yaml — Phase 2 config
  • run_highres_finetune.sh — Launch script
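The LR-scaling fix in config.py amounts to folding accumulation steps into the effective batch size. A minimal sketch of the idea, using the linear scaling rule for illustration (the function name and 256 reference batch are assumptions; DINOv2's actual scaling rule may differ):

```python
# Hypothetical sketch of LR scaling that accounts for gradient accumulation.
def scaled_lr(base_lr, batch_per_gpu, num_gpus, accum_steps, ref_batch=256):
    # Accumulation multiplies the effective batch exactly as adding GPUs
    # does, so it must enter the same product -- omitting it (the bug this
    # PR fixes) silently under-scales the learning rate by accum_steps.
    effective_batch = batch_per_gpu * num_gpus * accum_steps
    return base_lr * effective_batch / ref_batch
```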

How to use

```shell
# 1. Generate sample list (magnification-aware, 448px tiles)
uv run python3 prepatching_scripts/create_sample_dataset_txt_highres.py \
  --target_patches 25000000 --workers 16

# 2. Update config with checkpoint path and sample list path

# 3. Launch
bash run_highres_finetune.sh
```
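The generated sample list uses the `path x y level read_size` format that SlideDataset parses, with read_size optional for backward compatibility with Phase 1 lists. A hedged sketch of that parsing (the function name is illustrative, not the dataset's actual method):

```python
# Illustrative parser for one sample-list line in the
# "path x y level read_size" format; names are assumptions.
def parse_sample_line(line):
    parts = line.split()
    path, x, y, level = parts[0], int(parts[1]), int(parts[2]), int(parts[3])
    # read_size is the optional fifth field; absent in Phase 1 lists.
    read_size = int(parts[4]) if len(parts) > 4 else None
    return path, x, y, level, read_size
```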

Test plan

  • Sample list generation verified — read_sizes cluster correctly around expected values for each target µm/px
  • Physical area invariant confirmed: [448, 224, 112, 56] µm matches Phase 1's [448, 224, 112, 56] µm
  • pos_embed interpolation verified: [1, 257, 1536] → [1, 785, 1536]
  • Dry-run training on 1 GPU (in progress)
  • Full 120k iteration run
  • Eva benchmark at 392px
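The pos_embed check above can be sketched as follows. This is a hedged approximation of the interpolation in _load_from_teacher_checkpoint, assuming a DINOv2-style layout with one CLS token prepended to a flattened 16x16 patch grid; register-token handling and antialias settings are omitted.

```python
# Sketch of bicubic pos_embed interpolation from a 16x16 to a 28x28 grid.
import torch
import torch.nn.functional as F

def interpolate_pos_embed(pos_embed, new_grid=28, old_grid=16):
    # pos_embed: [1, 1 + old_grid**2, dim] -> [1, 1 + new_grid**2, dim]
    cls_tok, patch_tok = pos_embed[:, :1], pos_embed[:, 1:]
    dim = patch_tok.shape[-1]
    # Reshape the flat patch tokens back onto their 2D grid for resampling.
    patch_tok = patch_tok.reshape(1, old_grid, old_grid, dim).permute(0, 3, 1, 2)
    patch_tok = F.interpolate(patch_tok, size=(new_grid, new_grid),
                              mode="bicubic", align_corners=False)
    patch_tok = patch_tok.permute(0, 2, 3, 1).reshape(1, new_grid * new_grid, dim)
    return torch.cat([cls_tok, patch_tok], dim=1)
```

With ViT-g's 1536-dim embeddings this maps [1, 257, 1536] to [1, 785, 1536], matching the shapes in the test plan.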

🤖 Generated with Claude Code

dat-rohit and others added 2 commits March 29, 2026 13:20
Implements Phase 2 post-training: fine-tune from a Phase 1 checkpoint on
448px tiles (2x of 224) at halved magnifications [1, 0.5, 0.25, 0.125]
µm/px, preserving physical tissue regions while doubling pixel resolution.
The ViT processes 392px global crops (784 tokens vs 256 at 224px).

Key changes:
- Magnification-aware sample list generator targeting specific µm/px values
  with multiprocessing support
- SlideDataset: parse optional read_size from sample list, resize to
  patch_size_pixels
- train.py: _load_from_teacher_checkpoint() with pos_embed interpolation,
  gradient accumulation, eval transform uses cfg.crops.global_crops_size
- ssl_meta_arch.py: loss_scale parameter for gradient accumulation
- config.py: LR scaling accounts for gradient accumulation steps
- New config vitg14_reg4_highres.yaml: 448px tiles, 392/168 crops,
  batch=6, accum=4, lr=1e-4, 120k iterations

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
Suppresses unnecessary all-reduce on non-final micro-steps. With
accum_steps=4, this eliminates 3 redundant gradient synchronizations
per optimizer step across GPUs.
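The pattern the commit describes can be sketched with DDP's no_sync() context manager, which suppresses the gradient all-reduce inside it. The model/optimizer plumbing below is illustrative, not the PR's actual training loop:

```python
# Hedged sketch: sync gradients only on the final micro-step of an
# accumulation window; names and loop structure are assumptions.
import contextlib

def accumulation_step(model, batches, optimizer, accum_steps):
    optimizer.zero_grad()
    for i, batch in enumerate(batches):
        is_last = (i + 1) == accum_steps
        # no_sync() suppresses DDP's all-reduce, so gradients are only
        # synchronized once per optimizer step instead of accum_steps times.
        ctx = contextlib.nullcontext() if is_last else model.no_sync()
        with ctx:
            loss = model(batch) / accum_steps  # loss_scale = 1 / accum_steps
            loss.backward()
    optimizer.step()
```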

Co-Authored-By: Claude Opus 4.6 (1M context) <[email protected]>
