Pull requests: NVIDIA/TensorRT-Model-Optimizer
#590: Convert compressed-tensor int4 format to GPTQ int4 format (opened Nov 20, 2025 by Edwardf0t1)
#588: [3/N] Support for save/restoring AutoQuantize sensitivity scores (opened Nov 20, 2025 by realAsma)
#586: [1/N] Refactored AutoQuantizeSearcher to _AutoQuantizeBaseSearcher & AutoQuantizeGradientSearcher; separated quant modules and score modules (opened Nov 20, 2025 by realAsma)
#584: Add sewing kit and utilities used for pruning scoring - pruning scoring is self-contained now (opened Nov 20, 2025 by danielkorzekwa)
#583: Product Rename: TensorRT Model Optimizer to Model Optimizer (opened Nov 20, 2025 by kevalmorabia97; 1 of 2 tasks)
#579: Added support to export BF16 weight and amax for vLLM fakequant QAT (opened Nov 19, 2025 by kinjalpatel27)
#578: Bump TRT-LLM docker to 1.2.0rc2 (CUDA 13) (opened Nov 19, 2025 by kevalmorabia97; 1 task)
#575: [OMNIML-2244] Implement the ONNX quantization exporter for INT4 (opened Nov 18, 2025 by ajrasane)
#541: Specdec Bench: vLLM reqid, SGL path, conc > 1 metric fix (opened Nov 12, 2025 by IzzyPutterman)
#540: [OMNIML-3015] Add per-tensor/per-channel MSE calibrator (opened Nov 12, 2025 by Fridah-nv; 2 tasks)
#527: [OMNIML-2852] [2/n] Add Core Sparse Attention Infrastructure (opened Nov 7, 2025 by kaix-nv)