Pull requests: vllm-project/vllm

Pull requests list

Add tree attention backend for v1 (part 1)
Labels: llama, speculative-decoding, v1
#20401 opened Jul 2, 2025 by TheEpicDolphin
[Misc] Fix Unable to detect current VLLM config. Defaulting to NHD kv cache layout warning
Labels: ready, v1
#20400 opened Jul 2, 2025 by NickLucche
[BugFix] Fix DP headless mode arg validation
Labels: bug, frontend, ready
#20398 opened Jul 2, 2025 by njhill
[Misc] Small: Remove global media connector. Each test should have its own test connector object.
Labels: documentation, multi-modality, tpu, v1
#20395 opened Jul 2, 2025 by huachenheli
Resolve the torch nightly sync issue
Labels: ci/build, documentation, ready
#20393 opened Jul 2, 2025 by yangw-dev
[wip] nvshmem all reduce
#20391 opened Jul 2, 2025 by Amir-19 Draft
4 tasks
[Misc] Small: Fix video loader return type annotations.
Labels: multi-modality, ready
#20389 opened Jul 2, 2025 by huachenheli
[Misc] Rename DecodingConfig to StructuredOutputConfig
Labels: documentation, structured-output, v1
#20386 opened Jul 2, 2025 by njhill
[TPU] Add a case to cover RedHatAI/Meta-Llama-3.1-8B-Instruct-quantized.w8a8
Labels: ci/build, llama, ready, tpu
#20385 opened Jul 2, 2025 by QiliangCui
3 tasks done
[Bugfix][CI/CD][CPU] Fix CPU CI tests
#20383 opened Jul 2, 2025 by bigPYJ1151
1 of 4 tasks
[Bugfix] Fix import of CutlassExpertsFp8 in compressed_tensors_moe.py
Labels: ready
#20381 opened Jul 2, 2025 by bnellnm
Update serial_utils.py
Labels: v1
#20379 opened Jul 2, 2025 by jue-cmd
4 tasks
[CI/Build] Fix torch nightly CI dependencies part 3
Labels: ci/build, ready
#20378 opened Jul 2, 2025 by zou3519
3 of 4 tasks
[Docs] Update EAGLE example
Labels: documentation, ready
#20375 opened Jul 2, 2025 by NickLucche
[V1] feat:add engine v1 tracing
Labels: v1
#20372 opened Jul 2, 2025 by RichardoMrMu
[Structured Outputs][V1] Skipping with models doesn't contain tokenizers
Labels: ready, structured-output, v1
#20365 opened Jul 2, 2025 by aarnphm
[Bugfix] Fix flaky test_streaming_response test
Labels: ready
#20363 opened Jul 2, 2025 by NickLucche
[PP][V1]: Integrate Token Throttling into vLLM v1
#20359 opened Jul 2, 2025 by gty111
4 tasks done
[WIP][RC] Update PyTorch to 2.8.0
Labels: ci/build, rocm
#20358 opened Jul 2, 2025 by huydhn Draft