Commit 3676c94

Default to piecewise CUDA graphs when MTP is enabled for the indexer
Signed-off-by: Lu Fang <[email protected]>
1 parent cc43fce commit 3676c94

1 file changed: +1 -1 lines changed

vllm/v1/attention/backends/mla/indexer.py

Lines changed: 1 addition & 1 deletion
@@ -171,7 +171,7 @@ def get_max_prefill_buffer_size(vllm_config: VllmConfig):
 
 class DeepseekV32IndexerMetadataBuilder(AttentionMetadataBuilder):
     cudagraph_support: ClassVar[AttentionCGSupport] = \
-        AttentionCGSupport.UNIFORM_BATCH
+        AttentionCGSupport.UNIFORM_SINGLE_TOKEN_DECODE
 
     reorder_batch_threshold: int = 1
 
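
One way to read this one-line change, given the commit message, is sketched below. It is a simplified model and not vLLM's actual dispatch code: `CGSupport` is a hypothetical stand-in for `AttentionCGSupport` whose integer values only encode an assumed ordering, and `pick_cudagraph_mode` and `num_speculative_tokens` are illustrative names. The assumption is that with MTP enabled each decode request carries more than one query token, so a builder that only declares single-token decode support no longer qualifies for full-graph capture and ends up on the piecewise path.

```python
from enum import IntEnum


class CGSupport(IntEnum):
    """Hypothetical stand-in for AttentionCGSupport (ordering is assumed)."""
    NEVER = 0
    UNIFORM_SINGLE_TOKEN_DECODE = 1
    UNIFORM_BATCH = 2
    ALWAYS = 3


def pick_cudagraph_mode(support: CGSupport, num_speculative_tokens: int) -> str:
    """Return "FULL" or "PIECEWISE" for a uniform decode batch.

    Assumption: with MTP, every decode request has
    1 + num_speculative_tokens query tokens.
    """
    query_len = 1 + num_speculative_tokens
    if query_len == 1:
        # Plain decode: one token per request.
        needed = CGSupport.UNIFORM_SINGLE_TOKEN_DECODE
    else:
        # Speculative decode: uniform, but several tokens per request.
        needed = CGSupport.UNIFORM_BATCH
    return "FULL" if support >= needed else "PIECEWISE"


if __name__ == "__main__":
    # Before the commit (UNIFORM_BATCH) vs. after (UNIFORM_SINGLE_TOKEN_DECODE),
    # with one speculative token per request.
    print(pick_cudagraph_mode(CGSupport.UNIFORM_BATCH, 1))
    print(pick_cudagraph_mode(CGSupport.UNIFORM_SINGLE_TOKEN_DECODE, 1))
```

Under these assumptions, the old value would keep full-graph capture with MTP on, while the new value selects the piecewise path with MTP on and still allows full-graph capture for plain single-token decode.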
