Your current environment
PyTorch 2.7.1, vLLM 0.10.1.1, 8× H20 GPUs
How would you like to use vllm
I'm currently using Qwen2.5-VL 32B for multi-modal inference. From the torch profiler results, I can see that the vision encoder takes a long time during the prefill stage. Is there any way to speed up the vision encoder (e.g., via compilation)? Thanks!
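For context, one thing I'm considering (not yet verified to help) is capping the image pixel budget through `mm_processor_kwargs`, which is documented for the Qwen2-VL family, so that large images are downscaled and the vision encoder has fewer patches to process during prefill. A minimal sketch, assuming that pass-through also applies to Qwen2.5-VL 32B; the `min_pixels`/`max_pixels` values, model path, and prompt template below are placeholders:

```python
# Minimal sketch: cap the processor's pixel budget so large images are downscaled
# before they reach the vision encoder. Values below are placeholders to tune.
from PIL import Image
from vllm import LLM, SamplingParams

llm = LLM(
    model="Qwen/Qwen2.5-VL-32B-Instruct",   # assumed model path
    tensor_parallel_size=8,
    mm_processor_kwargs={
        "min_pixels": 28 * 28,          # placeholder lower bound
        "max_pixels": 1280 * 28 * 28,   # placeholder upper bound
    },
)

image = Image.new("RGB", (1024, 1024))  # stand-in for a real input image
prompt = (
    "<|im_start|>user\n"
    "<|vision_start|><|image_pad|><|vision_end|>Describe the image.<|im_end|>\n"
    "<|im_start|>assistant\n"
)

outputs = llm.generate(
    {"prompt": prompt, "multi_modal_data": {"image": image}},
    SamplingParams(max_tokens=64),
)
print(outputs[0].outputs[0].text)
```

This only reduces the encoder's workload by shrinking the inputs, though, so I'd still like to know whether the vision encoder itself can be compiled or otherwise accelerated.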
Before submitting a new issue...
- Make sure you already searched for relevant issues, and asked the chatbot living at the bottom right corner of the documentation page, which can answer lots of frequently asked questions.