Complete guide for training the Brownian Bridge Diffusion Model (BBDM) on the SynthRAD2023 dataset.
# Activate conda environment
conda activate ct2mri
# Verify installation
python -c "import torch; print(f'PyTorch: {torch.__version__}')"
python -c "import torch; print(f'CUDA: {torch.cuda.is_available()}')"
python -c "import neptune; print('Neptune OK')"

Ensure these are installed in your ct2mri conda environment:
- PyTorch >= 2.0 (with CUDA support)
- neptune >= 1.0
- albumentations
- h5py
- nibabel
- tqdm
- pyyaml
- wandb (optional, legacy)
Your SynthRAD2023 dataset should be organized as follows:
/pscratch/sd/s/seojw/CT_to_MRI/Task1-2/brain/
├── 1BA001/
│   ├── ct_brain_crop_pad_(0,1).pt    # CT volume [D, H, W] in [0, 1]
│   └── mri_brain_crop_pad_(0,1).pt   # MRI volume [D, H, W] in [0, 1]
├── 1BA002/
│   ├── ct_brain_crop_pad_(0,1).pt
│   └── mri_brain_crop_pad_(0,1).pt
└── ...
Important Notes:
- Files must be PyTorch tensors saved with torch.save()
- Tensor shape: [Depth, Height, Width]
- Values normalized to the [0, 1] range
- Subject IDs must contain "1B" for auto-detection
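A quick sanity check for the notes above can be scripted. The helper below is an illustrative sketch (not part of the repository) that verifies a loaded volume matches the expected shape and value range:

```python
import torch

def validate_volume(vol: torch.Tensor) -> None:
    """Raise if a volume doesn't match the expected [D, H, W] shape with values in [0, 1]."""
    if vol.ndim != 3:
        raise ValueError(f"expected 3D [D, H, W] tensor, got shape {tuple(vol.shape)}")
    if vol.min() < 0.0 or vol.max() > 1.0:
        raise ValueError("values must be normalized to [0, 1]")

# Usage (paths follow the layout above):
# validate_volume(torch.load("1BA001/ct_brain_crop_pad_(0,1).pt"))
```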
# Submit training job
cd /pscratch/sd/s/seojw/CT_to_MRI/CT2MRI
sbatch scripts/train_synthrad.sh

# Or run directly
python train_synthrad.py \
--train \
--config configs/BBDM_synthrad.yaml \
--base_dir /pscratch/sd/s/seojw/CT_to_MRI/Task1-2/brain \
--exp_name synthrad_bbdm_baseline \
--batch 8 \
--max_epoch 5000 \
    --gpu_ids 0

Training automatically resumes from the last checkpoint if available:
# Manual resume
python train_synthrad.py \
--train \
--config configs/BBDM_synthrad.yaml \
--resume_model results/SynthRAD2023_176/synthrad_bbdm_baseline/checkpoint/last_model.pth \
--resume_optim results/SynthRAD2023_176/synthrad_bbdm_baseline/checkpoint/last_optim_sche.pth \
--exp_name synthrad_bbdm_baseline \
    --gpu_ids 0

# Submit test job
sbatch scripts/test_synthrad.sh
# Or directly
python train_synthrad.py \
--sample_to_eval \
--config configs/BBDM_synthrad.yaml \
--resume_model results/SynthRAD2023_176/synthrad_bbdm_baseline/checkpoint/last_model.pth \
--base_dir /pscratch/sd/s/seojw/CT_to_MRI/Task1-2/brain \
    --gpu_ids 0

Model Architecture:
- num_timesteps: 1000 - Diffusion timesteps
- sample_step: 50 - Sampling steps (inference)
- mt_type: 'linear' - Marginal transformation (linear/sin)
- objective: 'grad' - Training objective (grad/noise/ysubx)
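These settings parameterize the Brownian-bridge forward process. Below is a minimal sketch of how `num_timesteps` and `mt_type` enter it, following the general BBDM formulation; the repository's actual implementation may scale the variance differently, and the 'sin' schedule shown is an illustrative form:

```python
import math
import torch

def bbdm_forward(x0, y, t, num_timesteps=1000, mt_type="linear"):
    """Sample x_t on the Brownian bridge from x0 (target MRI) toward y (source CT).

    m_t interpolates between the endpoints; the bridge variance
    delta_t = 2 * (m_t - m_t^2) vanishes at t = 0 and t = num_timesteps.
    """
    if mt_type == "linear":
        m_t = t / num_timesteps
    else:  # 'sin' schedule (illustrative form)
        m_t = math.sin(t / num_timesteps * math.pi / 2) ** 2
    delta_t = 2.0 * (m_t - m_t * m_t)
    noise = torch.randn_like(x0)
    return (1.0 - m_t) * x0 + m_t * y + math.sqrt(delta_t) * noise
```

At t = 0 the sample is exactly the target, and at t = num_timesteps it is exactly the source, which is the defining property of the bridge.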
Training:
- n_epochs: 5000 - Maximum epochs
- batch_size: 8 - Batch size per GPU
- lr: 1e-4 - Learning rate
- use_amp: True - Enable bfloat16 AMP
- sample_interval: 200 - Image logging frequency
- checkpoint_interval: 5 - Save checkpoint every N epochs
Data:
- image_size: 176 - Input image resolution
- channels: 3 - Multi-channel slices (center ± 1)
- augment: True - Random flip augmentation
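The channels: 3 setting corresponds to feeding each slice together with its two neighbors. A sketch of the idea (the repository may handle volume edges differently; this version clamps at the boundaries):

```python
import torch

def slice_with_neighbors(volume: torch.Tensor, d: int) -> torch.Tensor:
    """Return slices [d-1, d, d+1] as a 3-channel image, clamping at the edges."""
    depth = volume.shape[0]
    idx = [max(d - 1, 0), d, min(d + 1, depth - 1)]
    return volume[idx]  # shape [3, H, W]
```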
# Adjust image size (--HW), batch size (--batch), learning rate (--lr),
# max epochs (--max_epoch), sampling steps (--sample_step, fewer = faster
# inference), and experiment name (--exp_name):
python train_synthrad.py \
    --train \
    --config configs/BBDM_synthrad.yaml \
    --HW 256 \
    --batch 4 \
    --lr 5e-5 \
    --max_epoch 10000 \
    --sample_step 25 \
    --exp_name custom_experiment

Training:
- train/loss - Training loss (latent L1)
- train/lr - Current learning rate
- train/images/source_ct - Input CT slices
- train/images/target_mri - Ground truth MRI
- train/images/generated_mri - Generated MRI
Validation:
- val/loss - Validation loss
Model:
- model/total_params - Total parameters
- model/trainable_params - Trainable parameters
Checkpoints:
- checkpoints/epoch - Epoch checkpoints
- checkpoints/last - Latest checkpoint
- Go to https://app.neptune.ai
- Navigate to project: ejswjawnj/CT-to-MRI
- Find your experiment by name
- View metrics, images, and download checkpoints
results/SynthRAD2023_176/
└── synthrad_bbdm_baseline/
    ├── checkpoint/
    │   ├── epoch_0005.pth
    │   ├── epoch_0010.pth
    │   ├── last_model.pth       # Latest model
    │   └── config.yaml          # Saved config
    ├── log/                     # TensorBoard logs
    ├── image/                   # Training visualizations
    ├── sample/                  # Generated samples (training)
    └── sample_to_eval/          # Test-time generation
        ├── 1BA001_generated_mri.npy
        ├── 1BA002_generated_mri.npy
        └── ...
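Each generated volume in sample_to_eval/ is a plain NumPy array, so scoring one against its ground-truth MRI is easy to script. The PSNR helper below is an illustrative example, not part of the repository's evaluation code:

```python
import numpy as np

def psnr(pred: np.ndarray, target: np.ndarray, data_range: float = 1.0) -> float:
    """Peak signal-to-noise ratio for volumes normalized to [0, 1]."""
    mse = float(np.mean((pred - target) ** 2))
    if mse == 0.0:
        return float("inf")
    return float(10.0 * np.log10(data_range ** 2 / mse))

# Usage (paths follow the layouts above; hypothetical):
# gen = np.load("results/SynthRAD2023_176/synthrad_bbdm_baseline/"
#               "sample_to_eval/1BA001_generated_mri.npy")
# gt = torch.load("/pscratch/.../1BA001/mri_brain_crop_pad_(0,1).pt").numpy()
# print(psnr(gen, gt))
```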
# 1. Check dataset
ls /pscratch/sd/s/seojw/CT_to_MRI/Task1-2/brain/
# 2. Submit job
sbatch scripts/train_synthrad.sh
# 3. Monitor
squeue -u $USER
tail -f logs/train_*.out
# 4. Check Neptune dashboard

# Test different learning rates
for lr in 5e-5 1e-4 2e-4; do
python train_synthrad.py \
--train \
--config configs/BBDM_synthrad.yaml \
--lr $lr \
--exp_name "synthrad_lr${lr}" \
--max_epoch 100 \
--gpu_ids 0
done

# Training automatically resumes from the last checkpoint
sbatch scripts/train_synthrad.sh

# 1. Ensure model checkpoint exists
ls results/SynthRAD2023_176/synthrad_bbdm_baseline/checkpoint/last_model.pth
# 2. Run inference
sbatch scripts/test_synthrad.sh
# 3. Check results
ls results/SynthRAD2023_176/synthrad_bbdm_baseline/sample_to_eval/

# Verify base_dir in config
grep base_dir configs/BBDM_synthrad.yaml
# Check subjects
ls /pscratch/sd/s/seojw/CT_to_MRI/Task1-2/brain/ | grep 1B | head -5

# Test Neptune connection
python -c "import neptune; run = neptune.init_run(project='ejswjawnj/CT-to-MRI', api_token='YOUR_TOKEN'); run.stop()"

# Reduce batch size
python train_synthrad.py --train --batch 4 ...
# Or reduce image size
python train_synthrad.py --train --HW 128 ...

# Check checkpoint content
python -c "import torch; ckpt = torch.load('path/to/checkpoint.pth'); print(ckpt.keys())"

Default split: 90% train, 10% validation
- Controlled in datasets/synthrad_dataset.py
- Subjects sorted alphabetically, then split
- Consistent across runs (deterministic)
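The split logic described above presumably reduces to something like this sketch (not the repository's exact code):

```python
def split_subjects(subject_ids, train_ratio=0.9):
    """Deterministic train/val split: sort subject IDs alphabetically, then slice."""
    ids = sorted(subject_ids)
    n_train = int(len(ids) * train_ratio)
    return ids[:n_train], ids[n_train:]
```

Because the IDs are sorted before slicing, the same subjects land in the same split on every run.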
To customize:
# In synthrad_dataset.py, line ~40
n_train = int(n_total * 0.9)  # Change 0.9 to desired ratio

Performance Tips:
- Use bfloat16 AMP: Already enabled by default
- Optimize batch size: Balance between GPU memory and speed
- Reduce sample_interval: Less frequent image logging speeds up training
- Use pin_memory: Already enabled in DataLoader
- Multi-GPU: Set --gpu_ids 0,1,2,3 for DDP training
Each checkpoint contains:
- epoch: Current epoch number
- model_state_dict: Model weights
- optimizer_state_dict: Optimizer state
- scheduler_state_dict: LR scheduler state
- ema_state_dict: EMA weights (if enabled)
- val_loss: Validation loss at save time
- config: Full configuration
If you use this code, please cite:
@inproceedings{choo2024ct2mri,
title={Slice-Consistent 3D Volumetric Brain CT-to-MRI Translation with 2D Brownian Bridge Diffusion Model},
author={Choo et al.},
booktitle={MICCAI},
year={2024}
}

- Original BBDM paper: arXiv:2407.05059
- Original repository: micv-yonsei/ct2mri2024
- Neptune AI docs: docs.neptune.ai
Last Updated: 2025-10-24 Version: 1.0