
how does vllm handle wrong tokens in speculative decoding? #4284

Answered by cadedaniel
Tomorrowdawn asked this question in Q&A

Before the engine appends token ids to sequences, it removes -1 tokens. The logic is here:

# -1 means the output token is not valid (eg. due to spec decode rejecting tokens).
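For illustration, here is a minimal sketch of that flow, assuming simple greedy verification; the names below (`mark_rejections`, `append_valid_tokens`, `INVALID_TOKEN_ID`) are placeholders for this example, not vLLM's actual API. Every draft position after the first rejection is filled with -1, and those entries are dropped before the sequence is extended.

```python
# Illustrative sketch only -- not vLLM's actual implementation.
# INVALID_TOKEN_ID and all names below are placeholders.
from typing import List

INVALID_TOKEN_ID = -1  # marks positions rejected during verification


def mark_rejections(draft_tokens: List[int], target_tokens: List[int]) -> List[int]:
    """Greedy verification: accept draft tokens until the first mismatch with
    the target model, emit the target model's token there, then pad the
    remaining positions with -1."""
    out: List[int] = []
    still_accepting = True
    for draft, target in zip(draft_tokens, target_tokens):
        if still_accepting and draft == target:
            out.append(draft)
        elif still_accepting:
            out.append(target)            # correction token from the target model
            still_accepting = False
        else:
            out.append(INVALID_TOKEN_ID)  # everything after the first rejection is invalid
    return out


def append_valid_tokens(sequence: List[int], step_output: List[int]) -> None:
    """Drop the -1 entries before extending the sequence, mirroring the
    check described above."""
    sequence.extend(tok for tok in step_output if tok != INVALID_TOKEN_ID)


seq = [101, 7592]
step = mark_rejections(draft_tokens=[11, 22, 33, 44], target_tokens=[11, 22, 99, 55])
print(step)  # [11, 22, 99, -1]
append_valid_tokens(seq, step)
print(seq)   # [101, 7592, 11, 22, 99]
```

In the real engine, verification uses rejection sampling over the draft and target distributions rather than an exact token match, but the -1 padding and the filter before appending work the same way.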

Answer selected by Tomorrowdawn