Pull requests: vllm-project/vllm

Pull requests list

[v1] fix compilation cache
#11598 opened Dec 29, 2024 by youkaichao
[v1][bugfix] fix cudagraph with inplace buffer assignment (labels: ready)
#11596 opened Dec 29, 2024 by youkaichao
[Docker] bump up neuron sdk v2.21 (labels: ci/build)
#11593 opened Dec 29, 2024 by liangfu
[Bugfix] Reduce prefix prefill block size for Pascal
#11584 opened Dec 28, 2024 by sasha0552
[Frontend] Improve Error Handling (labels: documentation, frontend, needs-rebase)
#11570 opened Dec 27, 2024 by robertgshaw2-neuralmagic
[Misc]Minor Changes about Worker
#11555 opened Dec 27, 2024 by noemotiovon
[Benchmark] Add benchmark script for CPU offloading (labels: ready)
#11533 opened Dec 26, 2024 by ApostaC
[Core] Block Allocator to support KV cache CPU offloading (labels: frontend, ready)
#11532 opened Dec 26, 2024 by ApostaC
[Core] Performance optimization for swap_blocks by cuda kernels (labels: ready)
#11531 opened Dec 26, 2024 by ApostaC
Fixed docker build for ppc64le (labels: ci/build)
#11518 opened Dec 26, 2024 by npanpaliya