
Conversation

@shengliangxu
Contributor

@shengliangxu shengliangxu commented Dec 6, 2025

What does this PR do?

Type of change: new feature

Overview:

  1. Start a new config system based on yaml/yml files. The config system adopts Hydra-style overrides via the defaults tag, but we implement the composition ourselves by simply merging the configs with OmegaConf (see the sketch after this list).

  2. Implement the quantization configs in the new config system; they are not actually in use yet.

  3. Make sure the configs produced by the new config system match the existing configs.

  4. Pipe config_file support through the hf_ptq script.

  5. Tested hf_ptq using both builtin and external config files.
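As a rough illustration of the defaults-style composition described in item 1, here is a minimal sketch using plain OmegaConf. The configs directory, the yaml file names, and the load_config helper are hypothetical illustrations, not the PR's actual code:

from pathlib import Path

from omegaconf import OmegaConf

CONFIG_DIR = Path("configs")  # hypothetical location of the builtin yaml files

def load_config(name: str):
    """Load a yaml config, recursively composing its Hydra-style `defaults` list."""
    cfg = OmegaConf.load(CONFIG_DIR / f"{name}.yaml")
    # Pop the `defaults` tag; the file's remaining keys act as overrides on the bases.
    defaults = cfg.pop("defaults", [])
    bases = [load_config(base) for base in defaults]
    # OmegaConf.merge gives later configs precedence, so the file's own keys win.
    return OmegaConf.merge(*bases, cfg)

# e.g. a hypothetical fp8.yaml containing "defaults: [base_quant]" plus fp8 overrides:
# cfg = load_config("fp8")
# print(OmegaConf.to_yaml(cfg))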

Testing

python examples/llm_ptq/hf_ptq.py \
    --pyt_ckpt_path=Qwen/Qwen3-8B \
    --export_path=qwen3-8B_fp8 \
    --qformat=fp8 \
    --kv_cache_qformat=fp8 \
    --calib_size=16 \
    --batch_size=0 \
    --trust_remote_code \
    --export_fmt=hf
python examples/llm_ptq/hf_ptq.py \
    --pyt_ckpt_path=Qwen/Qwen3-8B \
    --export_path=nvfp4_awq \
    --qformat=nvfp4_awq_full \
    --kv_cache_qformat=fp8 \
    --calib_size=16 \
    --batch_size=0 \
    --trust_remote_code \
    --export_fmt=hf
python examples/llm_ptq/hf_ptq.py \
    --qformat=nvfp4,fp8 \
    --auto_quantize_score_size 128 \
    --auto_quantize_bits 5.0 \
    --auto_quantize_checkpoint Qwen3-8B-auto-quantize-checkpoint \
    --pyt_ckpt_path=Qwen/Qwen3-8B \
    --export_path=qwen3-8B_auto_quantize \
    --kv_cache_qformat=fp8 \
    --calib_size=16 \
    --batch_size=0 \
    --trust_remote_code \
    --export_fmt=trtllm

@copy-pr-bot

copy-pr-bot bot commented Dec 6, 2025

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.


@codecov

codecov bot commented Dec 6, 2025

Codecov Report

❌ Patch coverage is 76.47059% with 16 lines in your changes missing coverage. Please review.
✅ Project coverage is 74.52%. Comparing base (9409412) to head (c5f657e).
⚠️ Report is 6 commits behind head on main.

Files with missing lines               | Patch % | Lines
modelopt/torch/opt/config.py           | 31.57%  | 13 Missing ⚠️
modelopt/torch/quantization/config.py  | 93.87%  | 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #657      +/-   ##
==========================================
- Coverage   74.66%   74.52%   -0.15%     
==========================================
  Files         183      183              
  Lines       18550    18467      -83     
==========================================
- Hits        13851    13763      -88     
- Misses       4699     4704       +5     

☔ View full report in Codecov by Sentry.


This script mixes several separate pieces of logic whose code is
entangled, making it really hard to add new features.

Refactor it so that these logics are separated:

1. sparsity: all logic goes to sparsity_main. TODO: we may eventually move
   this logic out into a separate script

2. quantize: all logic goes to quantize_main.

   2.1 plain quantization with a single quantization format

   2.2 auto quantization

Within quantization, split the pipeline into the following stages (see the sketch after this list):

1. model loading
2. calibration dataset loading
3. pre-quantize processing
4. actual quantization
5. post-quantize processing
6. quantized model export
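A minimal sketch of that six-stage split, with no-op stubs standing in for the real logic; every name below is a hypothetical placeholder, not the script's actual symbols:

from types import SimpleNamespace

def load_model(args):               # 1. model loading
    return SimpleNamespace(name=args.pyt_ckpt_path)

def load_calib_dataset(args):       # 2. calibration dataset loading
    return [f"sample-{i}" for i in range(args.calib_size)]

def pre_quantize(model, args):      # 3. pre-quantize processing
    return model

def quantize(model, calib, args):   # 4. actual quantization
    model.qformat = args.qformat
    return model

def post_quantize(model, args):     # 5. post-quantize processing
    return model

def export_model(model, args):      # 6. quantized model export
    print(f"exporting {model.name} ({model.qformat}) to {args.export_path}")

def quantize_main(args):
    model = load_model(args)
    calib = load_calib_dataset(args)
    model = pre_quantize(model, args)
    model = quantize(model, calib, args)
    model = post_quantize(model, args)
    export_model(model, args)

quantize_main(SimpleNamespace(pyt_ckpt_path="Qwen/Qwen3-8B", calib_size=16,
                              qformat="fp8", export_path="qwen3-8B_fp8"))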

Signed-off-by: Shengliang Xu <[email protected]>
@shengliangxu shengliangxu force-pushed the shengliangx/config_yaml branch from c5f657e to b4c0a27 on December 8, 2025 at 22:34
1. start a new config system using yaml/yml files. The config system
   adopts Hydra-style overrides, using the defaults tag, but we implement
   it ourselves by simply merging the configs using OmegaConf

2. implement the quantization configs using the new config system, but
   not actually in use yet.

3. make sure the configs from the new config system match the existing configs

4. pipe config_file support through the hf_ptq script

5. tested hf_ptq using both builtin and external config files

Signed-off-by: Shengliang Xu <[email protected]>
@shengliangxu shengliangxu force-pushed the shengliangx/config_yaml branch from b4c0a27 to db07d84 on December 9, 2025 at 02:04
and they block custom quantization config

Signed-off-by: Shengliang Xu <[email protected]>
Signed-off-by: Shengliang Xu <[email protected]>
@shengliangxu shengliangxu force-pushed the shengliangx/config_yaml branch from d5cba37 to 9c326f8 on December 9, 2025 at 17:59
@shengliangxu shengliangxu self-assigned this Dec 11, 2025