@binghanc

What does this PR do?

Type of change: new feature

Overview: support for newer checkpoints

Usage

torchrun --nproc-per-node=8 ptq.py \
    --mla_quant nvfp4_wq_a_wkv_a_wq_b_wo_fp8_wkv_b \
    --batch_size 4 \
    --model_path $DS_CKPT \
    --config DeepSeek-V3/inference/configs/config_671B.json \
    --quant_cfg NVFP4_DEFAULT_CFG \
    --output_path $AMAX_PATH
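
The command above reads $DS_CKPT and $AMAX_PATH from the environment, and an unset variable would expand to an empty argument. A minimal sketch of a pre-flight check follows; require_vars is a hypothetical helper name and is not part of this PR or of ptq.py.

```shell
# Hypothetical helper (not part of this PR): succeeds only if every
# named variable is set and non-empty; reports the first missing one.
require_vars() {
    for name in "$@"; do
        # Indirectly read the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "error: $name is not set" >&2
            return 1
        fi
    done
    return 0
}
```

Calling require_vars DS_CKPT AMAX_PATH before the torchrun line would stop early with a clear message instead of launching eight processes with an empty path.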

@copy-pr-bot

copy-pr-bot bot commented Nov 20, 2025

This pull request requires additional validation before any workflows can run on NVIDIA's runners.

Pull request vetters can view their responsibilities here.

Contributors can view more details about this message here.
