Support Speculative Decoding #1474
Comments
This looks to be a duplicate of #1234.
It's the same idea, but I'm not sure it refers to the same implementation. There is also "speculative sampling", which seems to refer to yet another implementation/algorithm of this concept.
How hard would it be to implement a really naive version of this with CTranslate2? I would like to pick this up if possible.
Implementing this feature in its most basic form may already be possible with the existing Generator API. You could use
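For anyone picking this up, here is a minimal sketch of the greedy variant of the idea, written against plain Python callables rather than any CTranslate2 API. `target_next` and `draft_next` are hypothetical stand-ins that each return the next greedy token for a token sequence; how they would map onto Generator calls is not shown here and is left open. The target is queried one position at a time for clarity, so this version gives no speedup. A real implementation would score all drafted positions in a single batched target pass.

```python
from typing import Callable, List

# Hypothetical interface: given the tokens so far, return the next greedy token.
NextToken = Callable[[List[int]], int]


def speculative_greedy_decode(
    target_next: NextToken,
    draft_next: NextToken,
    prompt: List[int],
    max_new_tokens: int,
    draft_len: int = 4,
) -> List[int]:
    """Naive greedy speculative decoding (illustrative sketch only).

    The output is identical to greedy decoding with the target model alone.
    """
    if draft_len < 1:
        raise ValueError("draft_len must be at least 1")

    tokens = list(prompt)
    limit = len(prompt) + max_new_tokens

    while len(tokens) < limit:
        # 1. The draft model proposes a short continuation.
        proposal: List[int] = []
        for _ in range(min(draft_len, limit - len(tokens))):
            proposal.append(draft_next(tokens + proposal))

        # 2. The target model checks each proposed position in order.
        #    We always keep the target's token, so the first mismatch is
        #    corrected and the rest of the proposal is discarded.
        accepted: List[int] = []
        for proposed in proposal:
            expected = target_next(tokens + accepted)
            accepted.append(expected)
            if expected != proposed:
                break

        tokens.extend(accepted)

    return tokens


if __name__ == "__main__":
    # Toy models: the target counts upward modulo 10; the draft agrees
    # except after token 3, where it guesses wrong.
    target = lambda seq: (seq[-1] + 1) % 10
    draft = lambda seq: 0 if seq[-1] == 3 else (seq[-1] + 1) % 10
    print(speculative_greedy_decode(target, draft, [0], max_new_tokens=8))
```

Because every kept token comes from the target, the draft model only affects how many target steps could be batched together, not the result. The sampling-based variant ("speculative sampling") needs an accept/reject step on probabilities and is not covered by this sketch.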
Hello, I'm looking to use speculative decoding for Whisper: https://huggingface.co/blog/whisper-speculative-decoding. Is it possible with CTranslate2?
This could be used for LLMs and, hopefully, for encoder-decoder models too, for example by pairing a smaller NLLB model with a larger one.