Commit 6467ec2
[2/N] Added KDLoss based AutoQuantize (#592)
## What does this PR do?
**Type of change:** New Feature
**Overview:**
This PR extends AutoQuantize with KL Divergence Loss-based sensitivity
measurement as an alternative to the existing gradient-based approach.
The KD Loss-based approach uses a binary searcher similar to the one in
FastNAS.

Gradient-based AutoQuantize is faster than KL Divergence-based
AutoQuantize. However, the KL Divergence-based approach does not require
the model implementation to support the gradient backward pass. In
addition, the KL Divergence collected by AutoQuantize is useful for
sensitivity analysis of the model, since it is a more direct measure of
sensitivity than gradient scores.
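
To illustrate why KL Divergence needs only forward passes, here is a minimal sketch in plain Python. It is not the implementation added by this PR: `fake_quantize`, the step sizes, and the logit values are hypothetical stand-ins chosen for illustration. It compares the output distribution of a reference (unquantized) forward pass against the distribution produced after a toy quantization, and uses the KL Divergence between them as the sensitivity score.

```python
import math


def softmax(logits):
    """Convert a list of logits into a probability distribution."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def kl_divergence(p_logits, q_logits):
    """KL(P || Q) between the softmax distributions of two logit vectors."""
    p = softmax(p_logits)
    q = softmax(q_logits)
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def fake_quantize(values, step):
    """Hypothetical toy quantizer: round each value to a multiple of `step`."""
    return [round(v / step) * step for v in values]


# Assumed example logits from an unquantized forward pass.
reference_logits = [2.3, -0.7, 0.4, 1.1]

# Two candidate "formats": a fine grid and a coarse grid.
fine_logits = fake_quantize(reference_logits, 0.25)
coarse_logits = fake_quantize(reference_logits, 1.0)

# Sensitivity scores: no gradients or backward pass are involved.
score_fine = kl_divergence(reference_logits, fine_logits)
score_coarse = kl_divergence(reference_logits, coarse_logits)

print(f"KL (fine grid):   {score_fine:.6f}")
print(f"KL (coarse grid): {score_coarse:.6f}")
```

In this toy setup the coarser grid distorts the output distribution more and therefore receives the larger score, which is the sense in which KL Divergence measures sensitivity directly on the model output.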
## Usage
See `tests/unit/torch/quantization/test_autoquant.py` for usage.
## Testing
Tested with unit tests.
Result for Qwen3 8B:
<img width="1979" height="980" alt="image"
src="https://github.com/user-attachments/assets/6cc36425-ea60-4a76-a3c6-25293667c742"
/>
---------
Signed-off-by: realAsma <[email protected]>
Co-authored-by: Asma Kuriparambil Thekkumpate <[email protected]>
1 parent 4b72089 commit 6467ec2
File tree
12 files changed, +614 -66 lines changed
- examples
- llm_eval
- llm_ptq
- scripts
- modelopt/torch/quantization
- tests/unit/torch/quantization
- plugins