juliensimon
Hi @cg123

I built this while experimenting with Arcee Fusion. If you like it, great. If not, no worries 😃

Add mergekit-diff tool for model weight analysis

Summary

Adds mergekit-diff, a CLI tool for comparing the weights of two models that share the same architecture.

Changes

  • Created mergekit/scripts/diff.py with class-based architecture
  • Added mergekit-diff entry point to pyproject.toml
  • Updated README.md with tool description and TOC entry
  • Supports weight difference analysis and KL divergence computation
  • Multiple loading strategies: direct state dict, lazy loading, full loading
  • Memory-efficient handling of large models, with GPU support (cuda, mps)

Technical Details

  • Weight comparison: percentage of differing weights per layer
  • KL divergence: histogram-based distributional comparison
  • Three loading modes: LazyTensorLoader (most efficient), layer-by-layer, full model
  • Type annotations, error handling, progress tracking
  • CLI: mergekit-diff BASE_MODEL MODEL [--device cpu|cuda|mps] [--verbose] [--num-bins 100] [--no-lazy]
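
To make the two metrics above concrete, here is a minimal sketch of how a per-layer "percentage of differing weights" and a histogram-based KL divergence could be computed. This is not the code in mergekit/scripts/diff.py: it works on plain Python lists of floats (a flattened tensor) so it runs without any dependencies, and the function names, the tol and eps parameters, and the choice of KL(base || other) as the direction are all assumptions made for illustration. Only num_bins (default 100) mirrors the --num-bins flag listed above.

```python
import math
from typing import List, Sequence


def percent_differing(base: Sequence[float], other: Sequence[float],
                      tol: float = 0.0) -> float:
    """Percentage of positions whose values differ by more than tol.

    tol is a hypothetical parameter; tol=0.0 counts any change at all.
    """
    if len(base) != len(other):
        raise ValueError("tensors must have the same number of elements")
    if not base:
        return 0.0
    changed = sum(1 for a, b in zip(base, other) if abs(a - b) > tol)
    return 100.0 * changed / len(base)


def _histogram(values: Sequence[float], lo: float, hi: float,
               num_bins: int) -> List[int]:
    """Count values into num_bins equal-width bins spanning [lo, hi]."""
    counts = [0] * num_bins
    width = (hi - lo) / num_bins
    for v in values:
        if width > 0:
            # Clamp so that v == hi falls into the last bin.
            idx = min(int((v - lo) / width), num_bins - 1)
        else:
            idx = 0  # all values identical: everything lands in one bin
        counts[idx] += 1
    return counts


def kl_divergence(base: Sequence[float], other: Sequence[float],
                  num_bins: int = 100, eps: float = 1e-10) -> float:
    """Histogram-based estimate of KL(base || other), in nats.

    Both histograms share one bin range so their bins line up. eps is an
    assumed smoothing constant that keeps empty bins from producing log(0).
    """
    if not base or not other:
        raise ValueError("both inputs must be non-empty")
    lo = min(min(base), min(other))
    hi = max(max(base), max(other))
    p_counts = _histogram(base, lo, hi, num_bins)
    q_counts = _histogram(other, lo, hi, num_bins)
    p_total = sum(p_counts) + eps * num_bins
    q_total = sum(q_counts) + eps * num_bins
    kl = 0.0
    for pc, qc in zip(p_counts, q_counts):
        p = (pc + eps) / p_total
        q = (qc + eps) / q_total
        kl += p * math.log(p / q)
    return kl
```

For example, percent_differing([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.5, 4.0]) reports that one weight in four changed, and kl_divergence of a layer against itself is zero. With real checkpoints the same logic would normally be expressed with tensor operations, one layer at a time, so that only a single pair of tensors is held in memory.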

Use Cases

Model comparison after fine-tuning, merge analysis, quality assurance, and research on weight-space geometry.

github-actions bot commented Jul 4, 2025

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

@juliensimon
Author

I have read the CLA Document and I hereby sign the CLA
