Whisper support - best approach to add? #251

@Stefan-Olt

Description

Hello,
I would like to add whisper.cpp support to the build script (a new feature in FFmpeg 8; whisper.cpp is an open-source AI transcription library using the Whisper models provided by OpenAI). While this may sound simple, just build and link the whisper.cpp library, there is one major catch: whisper.cpp has to be compiled for a specific backend, so you cannot have, for example, CUDA and Vulkan support at the same time (most software handles this by loading the correct shared library at runtime).

So my idea is not to include Whisper by default, because it may lead to a negative user experience (either bad performance, or it won't work at all because the matching GPU is not available), but instead to add an option:
--enable-whisper=<backend>, where the user has to name the specific backend, such as cuda, vulkan, or coreml.

Vulkan and CUDA on Linux should be very simple to build, and the same goes for CoreML on Apple Silicon (though CoreML requires a different model format for the Apple NPU), because none of them need any additional library. OpenBLAS as a generic CPU optimization should also be quite simple to build. Some of the more niche backends seem complicated to build, and OpenVINO for Intel also does not look straightforward, as it has a lot of dependencies the script would have to build as well. So I would stick to those four backends.
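To make the idea concrete, here is a minimal sketch of how the build script could translate the proposed --enable-whisper=<backend> value into whisper.cpp CMake flags. This is only an illustration: the function name is hypothetical, and the flag names follow the whisper.cpp build documentation at the time of writing, so they should be checked against whichever release the script pins.

```shell
#!/usr/bin/env bash
# Hypothetical helper: map a user-selected backend name to the CMake flags
# used when configuring whisper.cpp. Flag names are taken from the whisper.cpp
# build docs and may differ between versions; verify before relying on them.
whisper_cmake_flags() {
  case "$1" in
    cuda)     echo "-DGGML_CUDA=ON" ;;
    vulkan)   echo "-DGGML_VULKAN=ON" ;;
    coreml)   echo "-DWHISPER_COREML=ON" ;;
    openblas) echo "-DGGML_BLAS=ON -DGGML_BLAS_VENDOR=OpenBLAS" ;;
    *)
      echo "unsupported whisper backend: $1 (use cuda, vulkan, coreml or openblas)" >&2
      return 1
      ;;
  esac
}
```

Restricting the accepted values to a fixed whitelist like this also gives the user an immediate, clear error for unsupported backends instead of a confusing failure deep inside the whisper.cpp build.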

Do you think that is a good approach, or do you have a better idea?

Best regards
Stefan
