
Adding multilingual support via fine-tuning code #23

Open
loretoparisi opened this issue Oct 22, 2024 · 7 comments
Labels
enhancement New feature or request

Comments

@loretoparisi

While both the tiny and base models achieve impressive WER 💯 on academic datasets compared to the de facto standard (e.g. Whisper), in real-world scenarios a multilingual model would address many more use cases.
Releasing fine-tuning code would let the community add that support.
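
For anyone who wants to experiment in the meantime, here is a minimal sketch of what a single fine-tuning step could look like. Since this project has not published a training API, it uses `openai/whisper-tiny` from Hugging Face transformers as a stand-in encoder-decoder ASR model; the audio, target transcript, and hyperparameters are all illustrative, not the maintainers' method.

```python
# Minimal sketch of one fine-tuning step for a Whisper-style encoder-decoder
# ASR model, using openai/whisper-tiny from Hugging Face transformers as a
# stand-in; this project's own training API is not public, so everything
# below is illustrative.
import torch
from transformers import WhisperProcessor, WhisperForConditionalGeneration

processor = WhisperProcessor.from_pretrained("openai/whisper-tiny")
model = WhisperForConditionalGeneration.from_pretrained("openai/whisper-tiny")
model.train()

optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)

# Dummy example: one second of silence at 16 kHz paired with a target
# transcription in the new language. A real run would iterate in batches
# over a labeled multilingual corpus (e.g. Common Voice).
audio = [0.0] * 16000
inputs = processor(audio, sampling_rate=16000, return_tensors="pt")
labels = processor.tokenizer("hola mundo", return_tensors="pt").input_ids

# The model computes the cross-entropy loss internally when labels are passed.
loss = model(input_features=inputs.input_features, labels=labels).loss
loss.backward()
optimizer.step()
optimizer.zero_grad()
print(f"loss: {loss.item():.4f}")
```

A real training script would wrap this in a DataLoader, add a learning-rate schedule, and track validation WER per language.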

@evmaki
Contributor

evmaki commented Oct 22, 2024

Yes, I think releasing some fine-tuning code would be beneficial. I can add it to our TODO list.

@bil-ash

bil-ash commented Oct 23, 2024

@evmaki Please provide the fine-tuning code; it would be beneficial for extending the model to other languages.

evmaki changed the title from "Multilingual support" to "Multilingual support via fine-tuning code" Oct 23, 2024
evmaki added the enhancement label Oct 23, 2024
evmaki changed the title from "Multilingual support via fine-tuning code" to "Adding multilingual support via fine-tuning code" Oct 23, 2024
evmaki mentioned this issue Oct 25, 2024
@zalastone

Korean and Mandarin support would boost its popularity in ASR use cases. It's great work, guys!

@pprobst

pprobst commented Oct 28, 2024

+1

@wredan

wredan commented Oct 30, 2024

Results seem promising. Hoping for multilingual support or some fine-tuning code soon. Thank you for your work :)

@bil-ash

bil-ash commented Dec 4, 2024

Any updates on this?

@loretoparisi
Author

The reason why this (multilingual support) is extremely important can be seen from this simple chart of Whisper fine-tuning on the downstream tasks of transcription (WER) and translation (BLEU), and how each correlates with the amount of audio (in hours) transcribed or translated, respectively. For example, Spanish (ES) exhibits the best WER (2.5) for speech recognition but the lowest BLEU score (24) for translation; German (DE) has balanced performance (WER: 4, BLEU: 35); and, vice versa, Portuguese (PT) shows the highest BLEU score (39) for translation but a comparatively higher WER (4) for speech recognition.
[Chart: per-language WER (transcription) and BLEU (translation) for fine-tuned Whisper. Approximate data derived from "Robust Speech Recognition via Large-Scale Weak Supervision" (the Whisper paper).]
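
For reference, the two metrics in the chart above are straightforward to compute with the standard open-source implementations; a tiny illustration, assuming the jiwer (WER) and sacrebleu (BLEU) libraries, with made-up strings not taken from the chart:

```python
# Quick illustration of the two metrics discussed above, computed with the
# jiwer (WER) and sacrebleu (BLEU) libraries; the strings are invented.
import jiwer
import sacrebleu

# WER: word-level edit distance between reference and hypothesis transcripts.
reference = "the quick brown fox jumps over the lazy dog"
hypothesis = "the quick brown fox jumped over a lazy dog"
wer = jiwer.wer(reference, hypothesis)  # 2 substitutions / 9 reference words
print(f"WER: {wer:.3f}")

# BLEU: n-gram overlap between a translation hypothesis and its reference(s).
hyps = ["the cat sits on the mat"]
refs = [["the cat is sitting on the mat"]]
bleu = sacrebleu.corpus_bleu(hyps, refs)
print(f"BLEU: {bleu.score:.1f}")
```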
