fix(deps): update dependency vllm to ^0.10.0 #257
This PR contains the following updates:

- **vllm**: `^0.5.0` -> `^0.10.0`
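For context on what the range change allows: under the caret semantics used by Poetry and Renovate, `^0.10.0` on a 0.x series pins the minor version and lets only patch releases float. A minimal sketch of that rule (the helper name is mine, not part of any tool):

```python
def in_caret_range(version: str, base: str = "0.10.0") -> bool:
    """Check whether `version` satisfies the caret range ^base.

    For a 0.x base such as 0.10.0, caret semantics (as used by Poetry
    and Renovate) resolve to >=0.10.0,<0.11.0: the minor component is
    pinned and only patch releases may float.
    """
    v = tuple(int(p) for p in version.split("."))
    b = tuple(int(p) for p in base.split("."))
    return b <= v < (b[0], b[1] + 1, 0)

print(in_caret_range("0.10.2"))  # True: patch bump stays inside ^0.10.0
print(in_caret_range("0.11.0"))  # False: a minor bump leaves the range
print(in_caret_range("0.5.0"))   # False: the old floor no longer matches
```

So v0.10.2 below is the newest release this constraint will accept.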
## Release Notes

### vllm-project/vllm (vllm)

#### v0.10.2

##### Highlights
This release contains 740 commits from 266 contributors (97 new)!
Breaking Changes: This release includes PyTorch 2.8.0 upgrade, V0 deprecations, and API changes - please review the changelog carefully.
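Since the release flags the PyTorch 2.8.0 upgrade and V0 deprecations as breaking, a downstream project may want to fail fast on an outdated install before touching any vLLM APIs. A minimal sketch, assuming a 0.10.0 floor; the helper name and the check itself are my own, not from the release notes:

```python
import re
from importlib.metadata import PackageNotFoundError, version

def check_vllm_floor(floor=(0, 10, 0)) -> bool:
    """Return True if the installed vllm is at or above `floor`.

    Hypothetical helper: reads package metadata without importing vllm,
    so it is cheap to run at startup before any engine initialization.
    """
    try:
        installed = version("vllm")
    except PackageNotFoundError:
        return False  # vllm is not installed at all
    # Take the first three numeric components (tolerates rc/post suffixes).
    parts = tuple(int(p) for p in re.findall(r"\d+", installed)[:3])
    return parts >= floor
```

A caller could pair this with a pointer at the changelog, e.g. `if not check_vllm_floor(): raise RuntimeError("vllm >= 0.10.0 required; see the v0.10.x release notes for breaking changes")`.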
**aarch64 support**: This release features native support for aarch64, allowing usage of vLLM on the GB200 platform. The docker image `vllm/vllm-openai` should already be multiplatform. To install the wheels, you can download them from this release artifact or install via

##### Model Support
##### Engine Core

- `--model-impl terratorch` support
- `--safetensors-load-strategy` for NFS-based file loading acceleration (#24469)
- critical CUDA graph capture throughput fix (#24128)
- scheduler optimization for single completions (#21917)
- multi-threaded model weight loading (#23928)
- tensor core usage enforcement for FlashInfer decode (#23214)

##### Hardware & Performance
##### Quantization

##### API & Frontend

##### Dependencies

##### V0 Deprecation

##### Breaking Changes
##### What's Changed

- `openai<1.100` to unblock CI by @mgoin in #23118
- `propose_draft_token_ids` non-blocking for lower TTFT by @WoosukKwon in #23041
- `/collective_rpc` API endpoint by @22quinn in #23075
- `--mm-encoder-tp-mode` by @DarkLight1337 in #23190
- `_convert_tokens_to_string_with_added_encoders` by 13.7x by @misrasaurabh1 in #20413
- `prompt_token_ids` arg fallback in `LLM.generate` and `LLM.embed` by @DarkLight1337 in #18800
- `MinPLogitsProcessor.update_states()` by @njhill in #23401

### Configuration
- 📅 **Schedule**: Branch creation - At any time (no schedule defined); Automerge - At any time (no schedule defined).
- 🚦 **Automerge**: Enabled.
- ♻ **Rebasing**: Whenever PR becomes conflicted, or you tick the rebase/retry checkbox.
- 🔕 **Ignore**: Close this PR and you won't be reminded about this update again.
This PR has been generated by Renovate Bot.