Conversation

@sugunav14
Contributor

What does this PR do?

Type of change: Bug fix

Overview: The current context manager for FSDP2-aware weight updates only works for modules with bias=False. Updated the code to also handle modules with bias=True.

Usage

# Add a code snippet demonstrating how to use this

Testing

`accelerate launch --config_file ./fsdp2.yaml --machine_rank=0 --num_machines=1 --num_processes=4 --main_process_ip=10.126.7.122 --main_process_port=6000 --fsdp_transformer_layer_cls_to_wrap=Qwen2DecoderLayer ./multinode_ptq.py --pyt_ckpt_path Qwen/Qwen2-7B-Instruct --qformat fp8 --kv_cache_qformat fp8 --batch_size 24 --calib_size 64 --export_path B200-Qwen2-7B-Instruct-fp8-kvcache-fp8 --trust_remote_code`

`python /app/tensorrt_llm/examples/llm-api/quickstart_advanced.py --model_dir B200-Qwen2-7B-Instruct-fp8-kvcache-fp8 --enable_attention_dp --tp_size 1 --moe_ep_size 1 --kv_cache_fraction 0.6 --disable_kv_cache_reuse --max_batch_size 8 --max_num_tokens 1024 --trust_remote_code`

Before your PR is "Ready for review"

  • Make sure you read and follow Contributor guidelines and your commits are signed.
  • Is this change backward compatible?: Yes
  • Did you write any new necessary tests?: N/A
  • Did you add or update any necessary documentation?: N/A
  • Did you update Changelog?: ?

Additional Information

NVBug [5711927]

@sugunav14 sugunav14 requested a review from a team as a code owner December 5, 2025 08:51
@sugunav14 sugunav14 self-assigned this Dec 5, 2025
@codecov

codecov bot commented Dec 5, 2025

Codecov Report

❌ Patch coverage is 0% with 22 lines in your changes missing coverage. Please review.
✅ Project coverage is 74.46%. Comparing base (c1c5ca0) to head (11d4cf4).
⚠️ Report is 5 commits behind head on main.

Files with missing lines Patch % Lines
modelopt/torch/quantization/utils.py 0.00% 22 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #651      +/-   ##
==========================================
- Coverage   74.66%   74.46%   -0.20%     
==========================================
  Files         183      183              
  Lines       18550    18409     -141     
==========================================
- Hits        13851    13709     -142     
- Misses       4699     4700       +1     


@sugunav14 sugunav14 requested review from a team as code owners December 5, 2025 20:02
@sugunav14 sugunav14 requested a review from realAsma December 5, 2025 23:28
@cjluo-nv
Collaborator

cjluo-nv commented Dec 6, 2025

@sugunav14 could you share how bias is handled in the PR?

@sugunav14
Contributor Author

@sugunav14 could you share how bias is handled in the PR?

Previously, I would iterate through the modules to be updated and register a new parameter only for the weight. Now, I iterate through the parameters of each module to be updated (which could be just the weight, or both the weight and the bias).
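The shape of the fix can be sketched as follows. This is a minimal, framework-free illustration, not the actual code in modelopt/torch/quantization/utils.py: `FakeModule` is a hypothetical stand-in for a torch.nn.Module, and `fsdp2_aware_weight_update` here is a placeholder name whose "update" just scales each parameter so the effect is visible.

```python
from contextlib import contextmanager

class FakeModule:
    """Hypothetical stand-in for torch.nn.Module, holding named parameters."""
    def __init__(self, weight, bias=None):
        self._params = {"weight": weight}
        if bias is not None:
            self._params["bias"] = bias

    def named_parameters(self):
        return self._params.items()

    def register_parameter(self, name, value):
        self._params[name] = value

@contextmanager
def fsdp2_aware_weight_update(modules):
    """Sketch of the fixed behavior: on exit, re-register ALL parameters of
    each module (weight, and bias when present), not just 'weight'."""
    try:
        yield
    finally:
        for module in modules:
            # Old (buggy) behavior: only module.register_parameter("weight", ...),
            # which silently skipped the bias of modules with bias=True.
            # New behavior: iterate over every parameter the module owns.
            for name, param in list(module.named_parameters()):
                module.register_parameter(name, param * 2)  # placeholder update

m = FakeModule(weight=3.0, bias=1.0)
with fsdp2_aware_weight_update([m]):
    pass
print(dict(m.named_parameters()))  # both weight and bias were updated
```

In the real code the re-registered values come from the FSDP2-aware update path rather than a scaling placeholder; the point is only that iterating over `named_parameters()` covers bias=True modules for free.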

Signed-off-by: Suguna Velury <[email protected]>
Signed-off-by: Suguna Velury <[email protected]>
Signed-off-by: Suguna Velury <[email protected]>
@sugunav14 sugunav14 force-pushed the svelury/nvbug-5711927 branch from fc0d6e8 to aa5f7df Compare December 7, 2025 21:41
Signed-off-by: Suguna Velury <[email protected]>
Signed-off-by: Suguna Velury <[email protected]>
Contributor

@kinjalpatel27 kinjalpatel27 left a comment


LGTM

@sugunav14 sugunav14 merged commit 07c3881 into main Dec 10, 2025
35 checks passed
@sugunav14 sugunav14 deleted the svelury/nvbug-5711927 branch December 10, 2025 04:06
kevalmorabia97 pushed a commit that referenced this pull request Dec 11, 2025