Support Speculative Decoding #1474

Open
JOHW85 opened this issue Sep 12, 2023 · 5 comments
Labels
enhancement New feature or request

Comments

@JOHW85

JOHW85 commented Sep 12, 2023

This could be used for LLMs and, hopefully, for encoder-decoder models as well, for example by pairing a smaller NLLB model with a bigger NLLB model.

@guillaumekln added the "enhancement" (New feature or request) label Sep 12, 2023
@wsxiaoys
Contributor

This looks to be a duplicate of #1234

@guillaumekln
Collaborator

It's the same idea, but I'm not sure it refers to the same implementation. There is also "Speculative sampling", which seems to refer to yet another implementation/algorithm of this concept.

@epinnock

How hard would it be to implement a really naive version of this with CTranslate2? I would like to pick this up if possible.

@guillaumekln
Collaborator

guillaumekln commented Sep 15, 2023

Implementing this feature in its most basic form may already be possible with the existing Generator API. You could use generate_batch with a small model, and then use forward_batch with a big model to validate the output. The limitation of this approach is that when the big model does not agree, you have to restart the generation from scratch instead of resuming at the first mismatched position.
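
To make the validation step concrete, here is a minimal sketch of the accept/reject logic for greedy decoding, written in plain Python. It does not call CTranslate2 at all: `draft_fn` and `target_fn` are hypothetical callables standing in for the small and big models, and the toy models at the bottom are made up for illustration. In a real setup `draft_fn` could presumably wrap generate_batch and `target_fn` could wrap forward_batch (taking the argmax at each position), but that wiring is an assumption and is not shown here. Note that each round calls the draft model again on the extended prefix, which corresponds to re-running the small model without any cached state.

```python
from typing import Callable, List, Optional, Sequence

Token = int


def accepted_prefix_length(draft: Sequence[Token], verified: Sequence[Token]) -> int:
    """Count how many leading draft tokens match the big model's own choices."""
    n = 0
    for d, v in zip(draft, verified):
        if d != v:
            break
        n += 1
    return n


def speculative_greedy(
    prompt: Sequence[Token],
    draft_fn: Callable[[List[Token], int], List[Token]],
    target_fn: Callable[[List[Token], List[Token]], List[Token]],
    k: int = 4,
    max_new_tokens: int = 16,
    eos: Optional[Token] = None,
) -> List[Token]:
    """Greedy speculative decoding loop (sketch).

    draft_fn(tokens, k)      -> up to k tokens proposed by the small model.
    target_fn(tokens, draft) -> len(draft) + 1 tokens: the big model's greedy
                                choice at every draft position, plus one more
                                token after the full draft.
    """
    tokens = list(prompt)
    generated: List[Token] = []
    while len(generated) < max_new_tokens:
        draft = list(draft_fn(tokens, k))
        verified = list(target_fn(tokens, draft))
        n = accepted_prefix_length(draft, verified)
        # Keep the agreed prefix, then the big model's token at the first
        # mismatch (or its extra token if the whole draft was accepted).
        for tok in draft[:n] + [verified[n]]:
            generated.append(tok)
            tokens.append(tok)
            if tok == eos or len(generated) == max_new_tokens:
                return generated
    return generated


# --- Toy stand-ins for the two models (purely illustrative) ---

def toy_target_next(tokens: List[Token]) -> Token:
    return (tokens[-1] + 1) % 10


def toy_draft_next(tokens: List[Token]) -> Token:
    nxt = (tokens[-1] + 1) % 10
    return 0 if nxt == 4 else nxt  # deliberately disagrees with the target


def toy_draft_fn(tokens: List[Token], k: int) -> List[Token]:
    ctx, out = list(tokens), []
    for _ in range(k):
        out.append(toy_draft_next(ctx))
        ctx.append(out[-1])
    return out


def toy_target_fn(tokens: List[Token], draft: List[Token]) -> List[Token]:
    ctx, out = list(tokens), []
    for d in draft:
        out.append(toy_target_next(ctx))  # big model's choice at this position
        ctx.append(d)                     # but condition on the draft token
    out.append(toy_target_next(ctx))
    return out


result = speculative_greedy([0], toy_draft_fn, toy_target_fn, k=4, max_new_tokens=8)
```

With greedy decoding on both sides, the output should be identical to what the big model would generate on its own; the draft model only changes how many big-model passes are needed.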

@GaetanBaert

Hello,

I'm looking to use speculative decoding for Whisper: https://huggingface.co/blog/whisper-speculative-decoding

Is this possible with CTranslate2?


6 participants
@wsxiaoys @JOHW85 @guillaumekln @epinnock @GaetanBaert and others