zongzhenyang

Implements the CABS (Conflict-Aware and Balanced Sparsification) model merging technique from the paper "CABS: Conflict-Aware and Balanced Sparsification for Enhancing Model Merging".

CABS aims to improve merged model quality by mitigating parameter interference in two ways: pruning task vectors sequentially in a conflict-aware manner, and applying n:m structural pruning to the task vector components.

Key Features:

  • Sequential Conflict-Aware Pruning: Processes task vectors in a user-defined pruning_order. Before each task vector is pruned, parameters already claimed by earlier models in the sequence are masked out, which reduces destructive overlap between task vectors.
  • N:M Structural Pruning:
    • Applies n:m pruning (retaining the n largest-magnitude weights out of every m consecutive weights) to the conflict-masked task vector components.
    • n and m values are configurable globally (default_n_val, default_m_val) and per-model (n_val, m_val).
  • Weighted Aggregation: Pruned task vectors are scaled by a weight (lambda) and added to the base model.
  • Added cabs.py implementing CABSMerge and CABSTask.
  • Added CABS example configuration in examples/cabs.yml.
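
The pruning and aggregation steps described above can be sketched in a few lines. This is a minimal illustration only, not the code in cabs.py: it works on flat Python lists instead of tensors, the helper names `nm_prune_mask` and `cabs_merge_sketch` are hypothetical, and it assumes a single n and m shared by all models.

```python
from typing import Dict, List, Sequence


def nm_prune_mask(values: Sequence[float], n: int, m: int) -> List[bool]:
    """Mark the n largest-magnitude entries in every group of m consecutive values."""
    mask = [False] * len(values)
    for start in range(0, len(values), m):
        group = range(start, min(start + m, len(values)))
        # Sort indices by descending magnitude; sorted() is stable, so ties
        # keep the earlier index.
        ranked = sorted(group, key=lambda i: -abs(values[i]))
        for i in ranked[:n]:
            mask[i] = True
    return mask


def cabs_merge_sketch(
    base: Sequence[float],
    task_vectors: Dict[str, Sequence[float]],
    pruning_order: Sequence[str],
    lambdas: Dict[str, float],
    n: int,
    m: int,
) -> List[float]:
    """Hypothetical sketch: sequential conflict-aware n:m pruning, then weighted sum."""
    merged = list(base)
    claimed = [False] * len(base)  # positions already kept by an earlier model
    for name in pruning_order:
        tv = task_vectors[name]
        # Conflict-aware step: hide positions claimed by earlier models.
        available = [0.0 if taken else v for v, taken in zip(tv, claimed)]
        keep = nm_prune_mask(available, n, m)
        for i, kept in enumerate(keep):
            if kept and not claimed[i]:
                # Weighted aggregation: add the scaled, pruned task vector entry.
                merged[i] += lambdas[name] * tv[i]
                claimed[i] = True
    return merged
```

With n=2 and m=4, the first model in pruning_order keeps its two largest-magnitude entries in each group of four. Each later model can only keep entries at positions no earlier model has claimed, so the retained entries of different task vectors never overlap in this sketch.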


github-actions bot commented May 9, 2025

All contributors have signed the CLA ✍️ ✅
Posted by the CLA Assistant Lite bot.

@zongzhenyang (Author)

I have read the CLA Document and I hereby sign the CLA

cg123 (Collaborator) commented May 10, 2025

Thanks for the PR! I'd love to have your method in mergekit.

Two things:

  • Could you please run the pre-commit hook to format the code and push the changes?
  • Would you like to add your method to the table in the README?

@CasualAutopsy

This is absolutely an amazing merging method; however, it seems to need better support for gradients. While it is technically compatible if you supply as many values to the array as there are blocks, it would be nice to have it round to the closest whole number so it doesn't throw an error.
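
For illustration, the rounding suggested here might look like the sketch below. It makes several assumptions: the affected parameters are ones that must be integers (presumably n_val and m_val), a gradient is a short list of anchor values interpolated linearly across blocks, and the helper name `expand_gradient_to_ints` is hypothetical. It is not mergekit's actual gradient handling.

```python
import math
from typing import List, Sequence


def expand_gradient_to_ints(anchors: Sequence[float], num_blocks: int) -> List[int]:
    """Linearly interpolate anchor values across blocks, rounding each to a whole number."""
    if len(anchors) == 1 or num_blocks == 1:
        return [math.floor(anchors[0] + 0.5)] * num_blocks
    result = []
    for block in range(num_blocks):
        # Position of this block along the anchor list, in [0, len(anchors) - 1].
        pos = block / (num_blocks - 1) * (len(anchors) - 1)
        lo = min(int(pos), len(anchors) - 2)
        frac = pos - lo
        value = anchors[lo] * (1 - frac) + anchors[lo + 1] * frac
        # Round half up; Python's built-in round() rounds halves to even instead.
        result.append(math.floor(value + 0.5))
    return result
```

Under these assumptions, a two-value gradient could be given for any number of blocks, and each block would still receive a whole-number setting.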
