[Bug Fix]: NVBug 5711927 #651
Codecov Report

```
@@            Coverage Diff             @@
##             main     #651      +/-   ##
==========================================
- Coverage   74.66%   74.46%   -0.20%
==========================================
  Files         183      183
  Lines       18550    18409     -141
==========================================
- Hits        13851    13709     -142
- Misses       4699     4700       +1
==========================================
```
**@sugunav14** could you share how bias is handled in this PR?

Previously, I iterated through the modules to be updated and registered a new parameter only for the weight. Now I iterate through the parameters of each module to be updated, which may be just the weight, or both the weight and the bias.
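The change described above can be sketched roughly as follows. This is a minimal illustration, not the actual ModelOpt code: `reregister_updated_params` and the way new parameters are built are hypothetical, and the real context manager additionally deals with FSDP2 sharding. The point is the iteration pattern: looping over `named_parameters(recurse=False)` naturally covers both `weight` and, when present, `bias`, instead of hard-coding `weight` alone.

```python
# Hypothetical sketch of iterating over a module's own parameters
# (weight, and bias if the module was built with bias=True) rather
# than registering a new parameter only for "weight".
import torch
import torch.nn as nn

def reregister_updated_params(modules_to_update):
    for module in modules_to_update:
        # recurse=False restricts iteration to this module's own
        # parameters; bias is included only when it exists.
        for name, param in list(module.named_parameters(recurse=False)):
            new_param = nn.Parameter(param.detach().clone())
            module.register_parameter(name, new_param)

layers = [nn.Linear(4, 4, bias=True), nn.Linear(4, 4, bias=False)]
reregister_updated_params(layers)
```

With the old weight-only loop, the first layer's `bias` would have been skipped; here both `weight` and `bias` are re-registered for the first layer, while the bias-free second layer is handled without a special case.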
**kinjalpatel27** left a comment:
LGTM
## What does this PR do?

**Type of change:** Bug fix

**Overview:** The current context manager for FSDP2-aware weight updates only works for modules with `bias=False`. This PR updates the code to also handle modules with `bias=True`.

## Usage

```python
# Add a code snippet demonstrating how to use this
```

## Testing

```shell
accelerate launch --config_file ./fsdp2.yaml --machine_rank=0 --num_machines=1 --num_processes=4 --main_process_ip=10.126.7.122 --main_process_port=6000 --fsdp_transformer_layer_cls_to_wrap=Qwen2DecoderLayer ./multinode_ptq.py --pyt_ckpt_path Qwen/Qwen2-7B-Instruct --qformat fp8 --kv_cache_qformat fp8 --batch_size 24 --calib_size 64 --export_path B200-Qwen2-7B-Instruct-fp8-kvcache-fp8 --trust_remote_code
```

```shell
python /app/tensorrt_llm/examples/llm-api/quickstart_advanced.py --model_dir B200-Qwen2-7B-Instruct-fp8-kvcache-fp8 --enable_attention_dp --tp_size 1 --moe_ep_size 1 --kv_cache_fraction 0.6 --disable_kv_cache_reuse --max_batch_size 8 --max_num_tokens 1024 --trust_remote_code
```

## Before your PR is "*Ready for review*"

- **Make sure you read and follow [Contributor guidelines](https://github.com/NVIDIA/TensorRT-Model-Optimizer/blob/main/CONTRIBUTING.md)** and your commits are signed.
- **Is this change backward compatible?**: Yes
- **Did you write any new necessary tests?**: N/A
- **Did you add or update any necessary documentation?**: N/A
- **Did you update [Changelog](https://github.com/NVIDIA/TensorRT-Model-Optimizer/blob/main/CHANGELOG.rst)?**: ?

## Additional Information

NVBug [5711927]

---------

Signed-off-by: Suguna Velury <[email protected]>