 AutoRound automatically selects the best available backend based on the installed libraries and prompts the user to
 install additional libraries when a better backend is found. On CUDA, the default priority is Marlin > ExLLaMAV2 >
-Triton, but the final choice depends on factors such as bits group_size packing format compatibility, etc. Please refer
-to the following table for the details.
+Triton, but the final choice depends on factors such as bits, group_size, packing format compatibility, etc. And the backend may not always be the most suitable for certain devices. Please refer
+to the following table for the details and specify the backend you want.
 
 | Name | Devices | Bits | Dtypes | Priority | Packing format | Requirements |
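The selection behaviour described in the changed paragraph (choose the highest-priority backend that is both installed and compatible, hint when a better one could be installed, and let the user override the choice) can be sketched as follows. This is a minimal illustration and not AutoRound's actual implementation: the `Backend` class, the `BACKENDS` registry, the `select_backend` function, and every priority, bit-width, and group-size value are hypothetical placeholders.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class Backend:
    name: str
    priority: int            # higher wins; illustrative values only
    bits: frozenset          # supported bit widths (assumed)
    group_sizes: frozenset   # supported group sizes (assumed)
    installed: bool          # whether the required library is available


# Hypothetical registry mirroring the "Marlin > ExLLaMAV2 > Triton" ordering.
BACKENDS = [
    Backend("marlin", 3, frozenset({4}), frozenset({-1, 128}), installed=False),
    Backend("exllamav2", 2, frozenset({4}), frozenset({-1, 32, 64, 128}), installed=True),
    Backend("triton", 1, frozenset({2, 4, 8}), frozenset({-1, 32, 64, 128}), installed=True),
]


def select_backend(
    bits: int, group_size: int, requested: Optional[str] = None
) -> Tuple[str, Optional[str]]:
    """Return (backend name, optional install hint)."""
    # Keep only backends whose constraints match the quantization settings.
    compatible = [
        b for b in BACKENDS if bits in b.bits and group_size in b.group_sizes
    ]

    # An explicit request overrides the priority order, but must still be usable.
    if requested is not None:
        for b in compatible:
            if b.name == requested and b.installed:
                return b.name, None
        raise ValueError(f"backend {requested!r} is not usable for these settings")

    usable = [b for b in compatible if b.installed]
    if not usable:
        raise RuntimeError("no installed backend supports these settings")

    best = max(usable, key=lambda b: b.priority)

    # Any compatible backend that outranks `best` is necessarily not installed.
    better = [b for b in compatible if b.priority > best.priority]
    hint = None
    if better:
        top = max(better, key=lambda b: b.priority)
        hint = f"install the requirements for {top.name!r} to use a better backend"
    return best.name, hint
```

With this placeholder registry, `select_backend(4, 128)` falls back to `exllamav2` and returns a hint about `marlin`, while `select_backend(4, 128, requested="triton")` honours the explicit choice. That explicit override is the case the added sentence in the diff addresses: the automatically chosen backend may not be the most suitable one for a given device.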