Pull requests: vllm-project/vllm-gaudi
- #787: Enable HPU Fused SDPA for Qwen3-VL vision attention using attention masks (opened Jan 7, 2026 by slokesha)
- #785: Draft: Add FlashAttention online merge in Unified Attention (draft; opened Jan 7, 2026 by kzawora-intel)
- #784: [WIP] Add Chunked Shared Attention with Dense Biases (draft; opened Jan 7, 2026 by kzawora-intel)
Revert "[GAUDISW-244336] Add missing long ctx prompt buckets (#739)"
#783
opened Jan 7, 2026 by
wpyszka
Loading…
- #774: [FIX_FOR_VLLM_LATEST] Fix embedding models, after bug found in #27614 (opened Jan 2, 2026 by iboiko-habana)
- #772: Documentation: Fix the mobile navigation issue in v0.13.0 (opened Jan 2, 2026 by mhelf-intel; labels: documentation, skip-gaudi-tests)
- #762: Introduce absolute and relative padding limits to the linear bucketing (opened Dec 26, 2025 by yangulei)
- #760: [GAUDISW-243560] Monkey-patching _get_attn_scale for the Llama4Attention layer (opened Dec 24, 2025 by rsmyrek)
- #753: Prefill batching logic to handle chunked prefill/prefix caching for HPU (opened Dec 23, 2025 by hlin99)
- #750: Release Notes for v0.13.0 (opened Dec 22, 2025 by mhelf-intel; labels: documentation, skip-gaudi-tests)
- #749: [GAUDISW-244752] Add dynamic scale for V-Cache on Hidden dim (opened Dec 21, 2025 by dudilester)
- #723: Dryrun implementation for generating command line file (opened Dec 16, 2025 by rajanintel24)