Visual blocks are not quantized in code #2
Comments
Thank you for your interest in the MBQ work!
@Albert-huyc If so, do the VLM's visual blocks remain in FP16 without quantization?
@junghye0klee You're right. In all our experiments, except those detailed in Sec. 5.3.3, the visual block of the VLM remained unquantized.
@Albert-huyc Thank you for your kind response. If I may ask one more question: does additionally quantizing the visual blocks result in a significant decrease in accuracy?
@junghye0klee That's a good question! We've actually conducted several experiments on this, and you can find the detailed results in Table 4 (Section 5.3.3) of our paper. The experimental results show that quantizing the visual blocks to W4A8 maintains the model's performance without any noticeable degradation.
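For context on the notation above: W4A8 means the weights are stored in 4-bit precision and the activations in 8-bit precision. The snippet below is a minimal, generic fake-quantization sketch meant only to illustrate that notation; it is not the MBQ algorithm, and the `weight` and `activation` tensors are placeholders.

```python
# Illustration of the W4A8 notation: weights in 4 bits, activations in 8 bits.
# Generic symmetric fake-quantization sketch, NOT the MBQ method itself.
import torch


def fake_quantize(x: torch.Tensor, n_bits: int) -> torch.Tensor:
    """Symmetric per-tensor fake quantization: quantize to n_bits, then dequantize."""
    qmax = 2 ** (n_bits - 1) - 1
    scale = x.abs().max().clamp(min=1e-8) / qmax
    return (x / scale).round().clamp(-qmax - 1, qmax) * scale


weight = torch.randn(128, 128)    # stand-in for a visual block's linear weight
activation = torch.randn(4, 128)  # stand-in for its input activation

w_q = fake_quantize(weight, n_bits=4)      # "W4": 4-bit weights
a_q = fake_quantize(activation, n_bits=8)  # "A8": 8-bit activations
```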
The function "get_blocks" only returns the LLM blocks of the VLM model. Will the code for quantizing the visual blocks be released?
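As a rough sketch of what exposing the visual blocks might look like, the helper below collects both the LLM decoder blocks and the vision-tower encoder blocks of a LLaVA-style VLM so that both can be passed to a block-wise quantizer. It is not part of the released code; the function names and attribute paths are assumptions modeled on the Hugging Face LLaVA layout and would need to be adapted to the actual model being quantized.

```python
# Hypothetical extension (not part of the released code): gather both the LLM
# decoder blocks and the vision-tower encoder blocks of a LLaVA-style VLM so
# that the visual blocks can also be fed to a block-wise quantizer.
# The attribute paths below are assumptions based on the Hugging Face LLaVA
# layout and must be adapted to the model actually being used.
import torch.nn as nn


def get_llm_blocks(model: nn.Module) -> nn.ModuleList:
    # Decoder (LLM) blocks, analogous to what the existing get_blocks returns.
    return model.language_model.model.layers


def get_visual_blocks(model: nn.Module) -> nn.ModuleList:
    # ViT encoder blocks of the vision tower (assumed attribute path).
    return model.vision_tower.vision_model.encoder.layers


def get_all_blocks(model: nn.Module, include_visual: bool = True):
    """Return (modality, block) pairs for block-wise quantization."""
    blocks = [("llm", blk) for blk in get_llm_blocks(model)]
    if include_visual:
        blocks += [("visual", blk) for blk in get_visual_blocks(model)]
    return blocks
```

In principle, each returned block could then go through the same block-wise quantization pass, with the visual blocks at W4A8 as discussed above.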