Visual blocks are not quantized in code #2

Open
Yiman-GO opened this issue Jan 8, 2025 · 5 comments
Comments


Yiman-GO commented Jan 8, 2025

The function "get_blocks" only return the llm blocks of VLM model. Will the code for quantizing visual blocks be released?

@Albert-huyc

Thank you for your interest in the MBQ work!
The MBQ algorithm focuses on quantizing the LLM blocks in VLMs. In our experiments, we directly quantized the ViT encoder using SmoothQuant, and the implementation of that part is still rough. We are currently refining this part of the code.
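For readers unfamiliar with that step, below is a minimal sketch of the SmoothQuant smoothing operation applied to a single linear layer. The function name and the calibration statistic `act_absmax` are illustrative assumptions, not the repository's actual API.

```python
import torch

def smooth_linear(act_absmax: torch.Tensor, linear: torch.nn.Linear, alpha: float = 0.5):
    """SmoothQuant-style smoothing for one linear layer (illustrative sketch).

    act_absmax: per-input-channel max |activation|, collected on calibration data.
    The scale migrates activation outliers from the activations into the weights,
    so that both tensors become easier to quantize afterwards (e.g. to INT8).
    """
    # Per-input-channel max |weight|; weight shape is [out_features, in_features].
    w_absmax = linear.weight.abs().max(dim=0).values
    # s_j = max|X_j|^alpha / max|W_j|^(1 - alpha)
    scale = (act_absmax.pow(alpha) / w_absmax.pow(1 - alpha)).clamp(min=1e-5)
    # W' = W * diag(s); the activations (or the preceding LayerNorm) are divided by s.
    linear.weight.data *= scale
    return scale
```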

@junghye0klee

@Albert-huyc If so, do the VLM's visual blocks remain in FP16 without quantization?

@Albert-huyc

@junghye0klee You're right. In all our experiments except those detailed in Sec. 5.3.3, the visual blocks of the VLM remained unquantized in FP16.

@junghye0klee

@Albert-huyc Thank you for your kind response. May I ask one more question: does additionally quantizing the visual blocks result in a significant decrease in accuracy?

@Albert-huyc

@junghye0klee That's a good question! We've actually conducted several experiments on this, and you can find the detailed results in Table 4 (Section 5.3.3) of our paper. The experimental results show that quantizing the visual blocks to W4A8 maintains the model's performance without any noticeable degradation.
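For reference, W4A8 means 4-bit weights and 8-bit activations. Below is a minimal fake-quantization sketch of what that could look like for one visual-block linear layer; it is an assumption-laden illustration, not the repository's implementation, and the per-output-channel weight / per-tensor activation granularity is assumed.

```python
import torch
import torch.nn.functional as F

def fake_quant(x: torch.Tensor, n_bits: int, per_channel_dim=None) -> torch.Tensor:
    """Symmetric quantize-then-dequantize, as commonly used to simulate low-bit inference."""
    if per_channel_dim is None:
        absmax = x.abs().max()                                   # per-tensor scale
    else:
        reduce_dims = tuple(d for d in range(x.dim()) if d != per_channel_dim)
        absmax = x.abs().amax(dim=reduce_dims, keepdim=True)     # per-channel scale
    qmax = 2 ** (n_bits - 1) - 1
    scale = absmax.clamp(min=1e-8) / qmax
    return (x / scale).round().clamp(-qmax - 1, qmax) * scale

def w4a8_linear(x: torch.Tensor, linear: torch.nn.Linear) -> torch.Tensor:
    w_q = fake_quant(linear.weight, n_bits=4, per_channel_dim=0)  # 4-bit weights
    x_q = fake_quant(x, n_bits=8)                                 # 8-bit activations
    return F.linear(x_q, w_q, linear.bias)
```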
