[MM] Add text-only mode for Qwen3-VL #26000
Signed-off-by: Roger Wang <[email protected]>
Signed-off-by: Roger Wang <[email protected]>
Code Review
This pull request aims to add text-only support for Qwen3-VL by conditionally initializing the visual components. The changes correctly identify the parts of the code that need to be conditional (`Qwen3_VisionTransformer` initialization, `deepstack_input_embeds` initialization, and weight loading). However, there is a critical logic error in the condition for initializing the visual model, which inverts the intended behavior. This would cause the model to initialize visual components in text-only mode and skip them in multimodal mode. I've provided a suggestion to fix this. The other related changes are correct, assuming this primary logic is fixed.
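To make the polarity of that condition concrete, here is a minimal, self-contained sketch of the pattern the review describes. It is not the code from this PR: `MMLimits`, `VisionTower`, `ToyVLModel`, and the `visual.` weight-name prefix are hypothetical stand-ins chosen for illustration, and the real model classes are not used.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple


@dataclass
class MMLimits:
    """Hypothetical stand-in for per-prompt multimodal limits."""
    image: int = 1
    video: int = 1

    def is_text_only(self) -> bool:
        # Text-only means every modality is limited to zero items.
        return self.image == 0 and self.video == 0


class VisionTower:
    """Hypothetical placeholder for an expensive vision encoder."""


class ToyVLModel:
    def __init__(self, limits: MMLimits) -> None:
        # Intended polarity: build the vision tower only when at least one
        # modality is enabled. Writing the branches the other way round
        # would be the inversion the review warns about.
        self.visual: Optional[VisionTower] = (
            None if limits.is_text_only() else VisionTower()
        )
        self.loaded: List[str] = []

    def load_weights(self, weights: Iterable[Tuple[str, object]]) -> List[str]:
        """Load weights, returning the names skipped in text-only mode."""
        skipped: List[str] = []
        for name, _tensor in weights:
            # Assumed naming: vision weights carry a "visual." prefix.
            if self.visual is None and name.startswith("visual."):
                skipped.append(name)
                continue
            self.loaded.append(name)
        return skipped
```

With both limits at zero, `ToyVLModel(MMLimits(image=0, video=0)).visual` is `None` and any `visual.`-prefixed weights are skipped; with the default limits the tower is built and nothing is skipped.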
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com> Signed-off-by: Roger Wang <[email protected]>
Signed-off-by: Roger Wang <[email protected]>
Signed-off-by: simon-mo <[email protected]>
Purpose
Since this model performs well on text-only tasks, we may want to let people serve it as a text-only model.
Test Plan
Test Result
Running `vllm serve Qwen/Qwen3-VL-235B-A22B-Instruct --limit-mm-per-prompt.image 0 --limit-mm-per-prompt.video 0 --load-format dummy -tp 8` shows the following in the logs.
Confirm vision model weights are not loaded:
Without setting limit:
Without setting limit + DP ViT:
Setting all limits to 0:
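As a rough illustration of the three cases above, the sketch below folds the dotted `--limit-mm-per-prompt.<modality>` flags from the serve command into a dictionary and checks whether every modality is limited to zero. This is a hypothetical helper, not vLLM's actual argument parser; `fold_dotted_limits`, `is_text_only`, and the assumption that an unset limit counts as non-zero are all illustrative.

```python
from typing import Dict, List, Sequence

PREFIX = "--limit-mm-per-prompt."


def fold_dotted_limits(argv: List[str], prefix: str = PREFIX) -> Dict[str, int]:
    """Collect `<prefix><modality> <int>` pairs into {modality: int}."""
    limits: Dict[str, int] = {}
    i = 0
    while i < len(argv):
        arg = argv[i]
        if arg.startswith(prefix) and i + 1 < len(argv):
            # The modality name is whatever follows the dotted prefix.
            limits[arg[len(prefix):]] = int(argv[i + 1])
            i += 2
        else:
            i += 1
    return limits


def is_text_only(limits: Dict[str, int],
                 modalities: Sequence[str] = ("image", "video")) -> bool:
    # Assumption: a modality with no explicit limit is treated as enabled.
    return all(limits.get(m, 1) == 0 for m in modalities)


argv = (
    "vllm serve Qwen/Qwen3-VL-235B-A22B-Instruct "
    "--limit-mm-per-prompt.image 0 --limit-mm-per-prompt.video 0 "
    "--load-format dummy -tp 8"
).split()
limits = fold_dotted_limits(argv)
text_only = is_text_only(limits)
```

Under these assumptions, only the "all limits set to 0" invocation counts as text-only; leaving either limit unset keeps the vision path enabled.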
Essential Elements of an Effective PR Description Checklist
`supported_models.md` and `examples` for a new model.