Conversation

@sugunav14
Contributor

What does this PR do?

Type of change: Bug fix

Overview: The current context manager for FSDP2-aware weight updates only works for modules with bias=False. Updated the code to also handle modules with bias=True.

Usage

# Add a code snippet demonstrating how to use this

Testing

`accelerate launch --config_file ./fsdp2.yaml --machine_rank=0 --num_machines=1 --num_processes=4 --main_process_ip=10.126.7.122 --main_process_port=6000 --fsdp_transformer_layer_cls_to_wrap=Qwen2DecoderLayer ./multinode_ptq.py --pyt_ckpt_path Qwen/Qwen2-7B-Instruct --qformat fp8 --kv_cache_qformat fp8 --batch_size 24 --calib_size 64 --export_path B200-Qwen2-7B-Instruct-fp8-kvcache-fp8 --trust_remote_code`

`python /app/tensorrt_llm/examples/llm-api/quickstart_advanced.py --model_dir B200-Qwen2-7B-Instruct-fp8-kvcache-fp8 --enable_attention_dp --tp_size 1 --moe_ep_size 1 --kv_cache_fraction 0.6 --disable_kv_cache_reuse --max_batch_size 8 --max_num_tokens 1024 --trust_remote_code`

Before your PR is "Ready for review"

  • Make sure you read and follow Contributor guidelines and your commits are signed.
  • Is this change backward compatible?: Yes
  • Did you write any new necessary tests?: N/A
  • Did you add or update any necessary documentation?: N/A
  • Did you update Changelog?: ?

Additional Information

NVBug [5711927]

@sugunav14 sugunav14 requested a review from a team as a code owner December 5, 2025 08:51
@sugunav14 sugunav14 self-assigned this Dec 5, 2025
@codecov

codecov bot commented Dec 5, 2025

Codecov Report

❌ Patch coverage is 0% with 22 lines in your changes missing coverage. Please review.
✅ Project coverage is 74.46%. Comparing base (c1c5ca0) to head (11d4cf4).
⚠️ Report is 5 commits behind head on main.

Files with missing lines Patch % Lines
modelopt/torch/quantization/utils.py 0.00% 22 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #651      +/-   ##
==========================================
- Coverage   74.66%   74.46%   -0.20%     
==========================================
  Files         183      183              
  Lines       18550    18409     -141     
==========================================
- Hits        13851    13709     -142     
- Misses       4699     4700       +1     


@sugunav14 sugunav14 requested review from a team as code owners December 5, 2025 20:02
@sugunav14 sugunav14 requested a review from realAsma December 5, 2025 23:28
@cjluo-nv
Collaborator

cjluo-nv commented Dec 6, 2025

@sugunav14 could you share how bias is handled in the PR?

@sugunav14
Contributor Author

@sugunav14 could you share how bias is handled in the PR?

Previously, I would iterate through the modules to be updated and register a new parameter only for the weight. Now, I iterate through the parameters of each module to be updated (which could be just the weight, or both the weight and the bias).
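The shape of the fix can be sketched as follows. This is a minimal, framework-free illustration, not the actual code in modelopt/torch/quantization/utils.py: `FakeModule` is a hypothetical stand-in for a torch.nn.Module, and `fsdp2_aware_weight_update` here is a placeholder name whose "update" just scales each parameter so the effect is visible.

```python
from contextlib import contextmanager

class FakeModule:
    """Hypothetical stand-in for torch.nn.Module, holding named parameters."""
    def __init__(self, weight, bias=None):
        self._params = {"weight": weight}
        if bias is not None:
            self._params["bias"] = bias

    def named_parameters(self):
        return self._params.items()

    def register_parameter(self, name, value):
        self._params[name] = value

@contextmanager
def fsdp2_aware_weight_update(modules):
    """Sketch of the fixed behavior: on exit, re-register ALL parameters of
    each module (weight, and bias when present), not just 'weight'."""
    try:
        yield
    finally:
        for module in modules:
            # Old (buggy) behavior: only module.register_parameter("weight", ...),
            # which silently skipped the bias of modules with bias=True.
            # New behavior: iterate over every parameter the module owns.
            for name, param in list(module.named_parameters()):
                module.register_parameter(name, param * 2)  # placeholder update

m = FakeModule(weight=3.0, bias=1.0)
with fsdp2_aware_weight_update([m]):
    pass
print(dict(m.named_parameters()))  # both weight and bias were updated
```

In the real code the re-registered values come from the FSDP2-aware update path rather than a scaling placeholder; the point is only that iterating over `named_parameters()` covers bias=True modules for free.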

Signed-off-by: Suguna Velury <[email protected]>
Signed-off-by: Suguna Velury <[email protected]>
Signed-off-by: Suguna Velury <[email protected]>
@sugunav14 sugunav14 force-pushed the svelury/nvbug-5711927 branch from fc0d6e8 to aa5f7df Compare December 7, 2025 21:41
Signed-off-by: Suguna Velury <[email protected]>
Signed-off-by: Suguna Velury <[email protected]>
Contributor

@kinjalpatel27 kinjalpatel27 left a comment


LGTM

@sugunav14 sugunav14 merged commit 07c3881 into main Dec 10, 2025
35 checks passed
@sugunav14 sugunav14 deleted the svelury/nvbug-5711927 branch December 10, 2025 04:06
kevalmorabia97 pushed a commit that referenced this pull request Dec 11, 2025