
Fix LoRA weight exports to save merged full checkpoints#2182

Open
taivu1998 wants to merge 2 commits into PrimeIntellect-ai:main from taivu1998:fix/1707-lora-weight-export

Conversation


@taivu1998 taivu1998 commented Apr 2, 2026

Summary

  • merge LoRA deltas into full trainer weight exports before HF serialization so weights/step_<N> stays a standalone checkpoint
  • keep save_adapter_separately=true additive by still writing lora_adapters/ alongside the merged full checkpoint
  • add focused merge-math coverage plus full-checkpoint assertions for the existing SFT LoRA integration path
  • clarify the checkpoint/config contract for LoRA exports in the trainer config and checkpointing docs
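The merge the first bullet describes is the standard LoRA fold-in, W' = W + (alpha / r) · B @ A. The sketch below illustrates that math on plain Python lists; `merge_lora_delta` is a hypothetical name for illustration (the actual merge in `src/prime_rl/trainer/lora.py` operates on the gathered torch state dict).

```python
def merge_lora_delta(base, lora_a, lora_b, alpha, rank):
    """Fold a LoRA adapter into its base weight: W' = W + (alpha / r) * B @ A.

    Illustrative only, on nested lists. lora_a is (rank, in_features),
    lora_b is (out_features, rank), matching base of shape
    (out_features, in_features).
    """
    scaling = alpha / rank
    merged = [row[:] for row in base]      # leave the base weights untouched
    for i in range(len(base)):             # out_features
        for j in range(len(base[0])):      # in_features
            delta = sum(lora_b[i][k] * lora_a[k][j] for k in range(rank))
            merged[i][j] += scaling * delta
    return merged

# Toy rank-1 check: the delta is a scaled outer product of B's column and A's row.
base = [[0.0, 0.0, 0.0] for _ in range(4)]
a = [[1.0, 2.0, 3.0]]                      # (rank=1, in=3)
b = [[1.0], [0.0], [0.0], [0.0]]           # (out=4, rank=1)
merged = merge_lora_delta(base, a, b, alpha=2.0, rank=1)
print(merged[0])  # [2.0, 4.0, 6.0]
```

Because the merge is applied to a copy of the gathered weights, the in-training adapter parameters are unaffected; only the exported checkpoint changes.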

Closes #1707.

Testing

  • python3 -m py_compile src/prime_rl/trainer/lora.py src/prime_rl/trainer/weights.py src/prime_rl/trainer/ckpt.py src/prime_rl/configs/trainer.py tests/unit/train/test_lora.py tests/integration/test_sft_lora.py
  • git diff --check -- src/prime_rl/trainer/lora.py src/prime_rl/trainer/weights.py src/prime_rl/trainer/ckpt.py src/prime_rl/configs/trainer.py docs/checkpointing.md tests/unit/train/test_lora.py tests/integration/test_sft_lora.py
  • attempted: UV_CACHE_DIR=/tmp/uv-cache uv run pytest tests/unit/train/test_lora.py -q
  • this could not be completed locally because the repo lockfile only supports Linux environments, while this workspace is macOS

Note

Medium Risk: changes the semantics of LoRA weight exports by merging adapter deltas into the main weights/step_<N> checkpoint, which could affect downstream consumers that relied on the previous unmerged format. Added test coverage reduces the regression risk.

Overview
Ensures LoRA weight exports always produce a standalone Hugging Face-compatible merged checkpoint at weights/step_<N> by merging LoRA deltas into the gathered full-model state dict before HF serialization.

Makes ckpt.weights.save_adapter_separately additive: it writes weights/step_<N>/lora_adapters as a sidecar without altering the main checkpoint. Also removes the LoRA cleanup from gather_weights_on_master and adds unit tests plus expanded integration tests (including a new merged-only CI config) asserting that both the adapter exports and the merged full checkpoints are HF-compatible.
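The resulting export contract can be sketched as a small layout helper (`planned_export_layout` is a hypothetical name for illustration, not a function in the repo): the merged standalone checkpoint always lands in weights/step_<N>, and save_adapter_separately only adds a lora_adapters/ sidecar inside it.

```python
from pathlib import Path

def planned_export_layout(output_dir: str, step: int,
                          save_adapter_separately: bool) -> dict:
    """Sketch of the export contract described above (names hypothetical)."""
    step_dir = Path(output_dir) / "weights" / f"step_{step}"
    layout = {"merged_checkpoint": step_dir}   # always a standalone HF checkpoint
    if save_adapter_separately:
        layout["adapter_sidecar"] = step_dir / "lora_adapters"
    return layout

layout = planned_export_layout("outputs", 100, save_adapter_separately=True)
print(layout["merged_checkpoint"])  # outputs/weights/step_100
print(layout["adapter_sidecar"])    # outputs/weights/step_100/lora_adapters
```

Under this contract a consumer can point an HF loader at weights/step_<N> directly, with or without the sidecar present.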

Reviewed by Cursor Bugbot for commit 6a6ae08.


@cursor bot left a comment


Cursor Bugbot has reviewed your changes and found 1 potential issue.



Development

Successfully merging this pull request may close these issues.

LoRA adapters not saved to weights
