Turbo-V3 #1025
Could you convert Whisper Turbo with the multilingual tokenizer? |
Thanks for the quick conversion! I'm getting a tokenizer error:
Any support would be appreciated :) |
It's done in: https://huggingface.co/deepdml/faster-whisper-large-v3-turbo-ct2 |
Tested https://huggingface.co/deepdml/faster-whisper-large-v3-turbo-ct2 |
Could you show how we can test it? On a Google Colab notebook I got an error that there is no model named "faster-whisper-large-v3-turbo-ct2" when loading it. |
You have to download the model locally first:

```python
from huggingface_hub import snapshot_download

repo_id = "deepdml/faster-whisper-large-v3-turbo-ct2"
local_dir = "faster-whisper-large-v3-turbo-ct2"
snapshot_download(repo_id=repo_id, local_dir=local_dir, repo_type="model")
```
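For what it's worth, here's a minimal sketch of then loading the downloaded directory with faster-whisper (the audio path "audio.mp3" and the device settings are placeholder assumptions, not from the comment above):

```python
from faster_whisper import WhisperModel

# Point at the locally downloaded CTranslate2 model directory
model = WhisperModel("faster-whisper-large-v3-turbo-ct2",
                     device="cuda", compute_type="float16")  # assumed GPU setup

segments, info = model.transcribe("audio.mp3")  # placeholder audio file
for segment in segments:
    print(f"[{segment.start:.2f}s -> {segment.end:.2f}s] {segment.text}")
``` |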
If you guys want to test the model as a real-time transcription tool, I have a simple Gradio demo for this. Just updated the code to use "deepdml/faster-whisper-large-v3-turbo-ct2" |
Thanks! |
Any idea how I can run it faster on Apple Silicon? I have an M2 Pro machine. |
Have you tried faster-whisper? It seems to be faster than any other framework. You could also try: https://github.com/mustafaaljadery/lightning-whisper-mlx
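For Apple Silicon specifically, usage should look roughly like this sketch based on that project's README (the model name, batch size, and audio path here are placeholder assumptions; I haven't verified this on an M2):

```python
from lightning_whisper_mlx import LightningWhisperMLX

# Model name and batch size are illustrative; see the project README
# for the list of supported models and quantization options
whisper = LightningWhisperMLX(model="large-v3", batch_size=12, quant=None)

result = whisper.transcribe(audio_path="audio.mp3")  # placeholder audio file
print(result["text"])
``` |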
lmao the Cantonese transcription is not word-for-word in the large-v3-turbo one... so sad... :( I'll still use large-v3 💖 |
I don't know that language; could you give more details on your observation? What's wrong, and how does the result differ from large-v3? |
So Cantonese has two variants: written Cantonese, which most subtitles are based on, and spoken Cantonese, which is literally the spoken characters written down.
是 => written Cantonese variant (read: si). In general, if you want to learn spoken Cantonese, you'll stick to the spoken version... bruh, just realized it doesn't even transcribe the most basic terms properly: the famous "DLLM" |
We now support the new whisper-large-v3-turbo on Sieve! Use it via Just set |
@trungkienbkhn Will |
I think that would be necessary for lots of downstream projects like faster-whisper-server |
You may find this discussion helpful: |
What's your experience with using v3-turbo for short audio clips? Mine has been that unfortunately it performs worse than other models, see #1030 (comment) |
I only planned to evaluate turbo, turbo-CTranslate2, and turbo-HQQ. I can only tell that it runs 2-3x faster based on my logs from the Gradio demo. How bad are the evaluation results? |
How do you use HQQ in faster-whisper? Could you share some sample code? |
Right, HQQ works with Transformers. But faster-whisper is just Whisper accelerated with CTranslate2, and there are turbo models accelerated with CT2 available on Hugging Face: deepdml/faster-whisper-large-v3-turbo-ct2

Also, HQQ is integrated in Transformers, so quantization should be as easy as passing an argument:

```python
import torch
from transformers import AutoModelForSpeechSeq2Seq, HqqConfig

model_id = "deepdml/faster-whisper-large-v3-turbo-ct2"
torch_dtype = torch.float16

quant_config = HqqConfig(nbits=4, group_size=64)
model = AutoModelForSpeechSeq2Seq.from_pretrained(
    model_id,
    torch_dtype=torch_dtype,
    low_cpu_mem_usage=True,
    use_safetensors=True,
    quantization_config=quant_config,
)
```

https://huggingface.co/docs/transformers/main/quantization/hqq

I didn't try it yet, so I don't know if that is going to work. |
No, there is no HQQ support for CTranslate2 yet. Faster-whisper has Whisper models in CTranslate2 format. I created a feature request in the past to support HQQ (with static cache and torch compilation): OpenNMT/CTranslate2#1717. The PR is still in progress and has some performance issues that need to be fixed. |
No standardized evaluation or anything; I'm just running it in my streaming application and seeing much worse results than medium (especially with it randomly not transcribing parts of the text). This is a CTranslate2 implementation, deepdml/faster-whisper-large-v3-turbo-ct2 to be exact. See my linked comment for the code. |
Were you able to convert turbo to faster-whisper format? |
The Mobiuslabs fork now supports |
Just to mention that I added support in https://github.com/Softcatala/whisper-ctranslate2 for anybody who wants to test the turbo model with the current faster-whisper version. |
There seems to be a lot of confusion in this thread -- if you want to use turbo with the current faster-whisper, all you have to do is load the converted model (see the sketch below).
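The exact snippet was elided above; presumably it was something like this minimal sketch. Note that faster-whisper can pull a CTranslate2-converted model straight from the Hugging Face Hub by repo id (using the deepdml conversion mentioned earlier here is an assumption):

```python
from faster_whisper import WhisperModel

# The repo id of a CTranslate2-converted turbo model; faster-whisper
# downloads it from the Hugging Face Hub automatically
model = WhisperModel("deepdml/faster-whisper-large-v3-turbo-ct2")

segments, info = model.transcribe("audio.mp3")  # placeholder audio file
print(" ".join(segment.text.strip() for segment in segments))
```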
Closing this thread since there is no issue. |
I converted the new openai model weights to be used with faster-whisper. Still playing around with it, but in terms of speed it's about the same as Distil-Whisper.
https://huggingface.co/freddierice/openwhisper-turbo-large-v3-ct2/blob/main/README.md