Pull requests: vllm-project/vllm
[MM] Add text-only mode for Qwen3-VL
qwen
Related to Qwen models
#26000
opened Oct 1, 2025 by
ywang96
5 tasks
[Deepseek v3.2] Support indexer prefill chunking
deepseek
Related to DeepSeek models
v1
#25999
opened Oct 1, 2025 by
heheda12345
5 tasks
[NVIDIA] flashinfer TRTLLM attention prefill token limit
ready
ONLY add when PR is ready to merge/full CI is needed
#25998
opened Oct 1, 2025 by
jasonlizhengjian
5 tasks
[GPTOSS][DP/EP][Marlin] Enable GPTOSS Batched DP/EP using Marlin kernels
gpt-oss
Related to GPT-OSS models
#25997
opened Sep 30, 2025 by
varun-sundar-rabindranath
Draft
[Bugfix] Fix __syncwarp on ROCM
deepseek
Related to DeepSeek models
ready
ONLY add when PR is ready to merge/full CI is needed
rocm
Related to AMD ROCm
Fix test_mamba_ssm_ssd.py due to missing _query_start_loc_to_chunk_indices_offsets
#25995
opened Sep 30, 2025 by
hl475
1 of 5 tasks
[CI/Build] do not enforce precompilation on tpu ci tests
structured-output
v1
#25992
opened Sep 30, 2025 by
sixiang-google
5 tasks done
[Doc] updating torch.compile doc link
documentation
Improvements or additions to documentation
#25989
opened Sep 30, 2025 by
nadathurv
[Bugfix] Allow skipping MoE in NVFP4 (fix for MTP)
deepseek
Related to DeepSeek models
ready
ONLY add when PR is ready to merge/full CI is needed
speculative-decoding
#25987
opened Sep 30, 2025 by
benchislett
[P/D] KVConnector for decode benchmarking
kv-connector
v1
#25986
opened Sep 30, 2025 by
tlrmchlsmth
[Spec Decode] Enable efficient speculative decoding with FlashInfer-MLA
v1
#25984
opened Sep 30, 2025 by
benchislett
Quick fix for IMA with the Prefix Prefill kernel during graph capture
v1
#25983
opened Sep 30, 2025 by
SageMoore
[Model] MTP fallback to eager for DeepSeek v32
deepseek
Related to DeepSeek models
ready
ONLY add when PR is ready to merge/full CI is needed
speculative-decoding
v1
[P/D] Support async transfer for P2P NCCL connector
documentation
Improvements or additions to documentation
kv-connector
#25976
opened Sep 30, 2025 by
ruisearch42
Draft
5 tasks
Add more tests for batch invariant kernel-override logic [3/n]
v1
#25975
opened Sep 30, 2025 by
bwasti
3 of 5 tasks
[Misc] Add penalties sampling parameters to serve tool
performance
Performance-related issues
ready
ONLY add when PR is ready to merge/full CI is needed
#25974
opened Sep 30, 2025 by
southfreebird
Optimized topk in vllm for tree attention spec decoding
ci/build
#25973
opened Sep 30, 2025 by
yongqigood
[WIP][Core/DBO][4/N] Support alternate low-latency schedule
deepseek
Related to DeepSeek models
documentation
Improvements or additions to documentation
v1
#25972
opened Sep 30, 2025 by
LucasWilkinson
[Quantization/NVFP4] Speed up TRTLLM NVFP4 MOE weight loading
#25968
opened Sep 30, 2025 by
pavanimajety
Draft
5 tasks
[CI/Build] Update the Dockerfile to use vllm serve command
ci/build
ready
ONLY add when PR is ready to merge/full CI is needed
#25967
opened Sep 30, 2025 by
DarkLight1337
5 tasks
[Bugfix] Relax tokenizer regex for mixtral to include 'tokenizer.model'
#25964
opened Sep 30, 2025 by
BowenBao