symmetric quantization to FBGEMM prefill token-wise FP8 (fixed) #4868

ColinPeppler · 2025-09-12T20:17:59Z

Summary:

What

Restores the changes from D79899622.
Fixes a numerical issue with Llama.

Llama numerical issue fix

In the original diff, I checked only that the std::optional<bool> symmetric held a value, not what that value was:

https://www.internalfb.com/code/fbsource/[e97b4a76367b61af327d1aa2683d691730220493]/fbcode/deeplearning/fbgemm/fbgemm_gpu/experimental/gen_ai/src/kv_cache/kv_cache.cu?lines=2903%2C2911%2C2981-2982

This is now fixed by reading the contained value and defaulting to false when it is unset:

bool use_symmetric_quantization = symmetric.value_or(false);
if (use_symmetric_quantization) {
  CALL_DEQUANTIZE_FP8_CACHE(false, true);
}
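To make the difference concrete, here is a minimal standalone sketch of the two checks. The helper names are hypothetical and not part of FBGEMM, and the "buggy" form assumes the original check was equivalent to `has_value()`, which the description implies but does not quote.

```cpp
#include <optional>

// Hypothetical helpers for illustration only; not FBGEMM code.

// Presence-only check (assumed shape of the original bug): this returns true
// whenever the optional holds a value, even when that value is false.
bool takes_symmetric_path_buggy(std::optional<bool> symmetric) {
  return symmetric.has_value();
}

// Value check, mirroring the fix: read the contained bool and fall back to
// false when the optional is empty.
bool takes_symmetric_path_fixed(std::optional<bool> symmetric) {
  return symmetric.value_or(false);
}
```

With the presence-only check, a caller that explicitly passes `false` still ends up on the symmetric path. With `value_or(false)`, only an explicit `true` selects it.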

Differential Revision: D82320500

netlify · 2025-09-12T20:18:05Z

✅ Deploy Preview for pytorch-fbgemm-docs ready!

Name	Link
🔨 Latest commit	`323989a`
🔍 Latest deploy log	https://app.netlify.com/projects/pytorch-fbgemm-docs/deploys/68cc3aa6db37b700082f67f1
😎 Deploy Preview	https://deploy-preview-4868--pytorch-fbgemm-docs.netlify.app

facebook-github-bot · 2025-09-12T20:18:22Z

@ColinPeppler has exported this pull request. If you are a Meta employee, you can view the originating diff in D82320500.


facebook-github-bot · 2025-09-18T17:00:28Z

@ColinPeppler has exported this pull request. If you are a Meta employee, you can view the originating diff in D82320500.


meta-cla bot added the cla signed label Sep 12, 2025

facebook-github-bot added fb-exported meta-exported labels Sep 12, 2025

symmetric quantization to FBGEMM prefill token-wise FP8 (fixed) (pyto…

323989a

…rch#4868) Summary: X-link: facebookresearch/FBGEMM#1890 Differential Revision: D82320500

ColinPeppler force-pushed the export-D82320500 branch from 5eb9f4c to 323989a on September 18, 2025 17:00

ColinPeppler added a commit to ColinPeppler/FBGEMM that referenced this pull request Sep 18, 2025

symmetric quantization to FBGEMM prefill token-wise FP8 (fixed) (pyto…

376c3e2

…rch#4868) Summary: X-link: facebookresearch/FBGEMM#1890 Differential Revision: D82320500

ColinPeppler added a commit to ColinPeppler/FBGEMM that referenced this pull request Sep 18, 2025

symmetric quantization to FBGEMM prefill token-wise FP8 (fixed) (pyto…

86359f1

…rch#4868) Summary: X-link: facebookresearch/FBGEMM#1890 Differential Revision: D82320500
