Whisper support - best approach to add? #251

@Stefan-Olt

Description

Hello,
I would like to add whisper.cpp support to the build script (a new feature in FFmpeg 8; whisper.cpp is an open-source AI transcription library using the Whisper models provided by OpenAI). While this may sound simple, just build and link the whisper.cpp library, there is one major catch: whisper.cpp has to be compiled for a specific backend, so you cannot have, for example, CUDA and Vulkan support at the same time (most software handles this by loading the correct shared library at runtime).

So my idea is not to include Whisper by default, because it may lead to a negative user experience (either bad performance, or it won't work at all because the matching GPU is not available), but instead to add an option:
--enable-whisper=<backend>, where the user has to name the specific backend, such as cuda, vulkan, or coreml.

Vulkan and CUDA on Linux should be very simple to build, and the same goes for CoreML on Apple Silicon (though CoreML requires a different model format for the Apple NPU), because none of them need any additional library. OpenBLAS as a generic CPU optimization should also be quite simple to build. Some of the more niche backends seem complicated to build, and OpenVINO for Intel also does not look straightforward, as it has a lot of dependencies the script would have to build as well. So I would stick to those four backends.
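To make the idea concrete, here is a minimal sketch of how the build script could translate the proposed --enable-whisper=<backend> value into whisper.cpp CMake flags. This is only an illustration: the function name is hypothetical, and the flag names follow the whisper.cpp build documentation at the time of writing, so they should be checked against whichever release the script pins.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a user-selected backend name to the CMake flags
# used when configuring whisper.cpp. Flag names are taken from the whisper.cpp
# build docs and may differ between versions; verify before relying on them.
whisper_cmake_flags() {
  case "$1" in
    cuda)     echo "-DGGML_CUDA=ON" ;;
    vulkan)   echo "-DGGML_VULKAN=ON" ;;
    coreml)   echo "-DWHISPER_COREML=ON" ;;
    openblas) echo "-DGGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS" ;;
    *)
      echo "unsupported whisper backend: $1 (use cuda, vulkan, coreml or openblas)" >&2
      return 1
      ;;
  esac
}
```

Restricting the accepted values to a fixed whitelist like this also gives the user an immediate, clear error for unsupported backends instead of a confusing failure deep inside the whisper.cpp build.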

Do you think that is a good approach, or do you have a better idea?

Best regards
Stefan
