feat: Implement OpenAI style local API server for audio transcription #509

Yorick-Ryu wants to merge 2 commits into cjpais:main
Conversation
@cjpais Hello, is there any issue with this PR?
@Yorick-Ryu Please be patient, I have not had time to review it yet. I'm currently traveling and haven't been able to look at my laptop much. Handy was featured in Wired, which has brought in a lot of new issues and discussions I respond to every day.
I saw this: https://www.wired.com/story/handy-free-speech-to-text-app/
@cjpais any updates?
Patience is the key :)
+1 |
@cjpais Conflicts are fixed. Is it a good time to merge?
@Yorick-Ryu Please be patient; I will merge when I have time.
@cjpais Hey, just a reminder: when are you planning to merge it?
@samiulazm patience. I keep up with all of the issues and PRs. I don't need reminders; I need people to be patient with me. If I'm taking too long, you can always build the PR yourself!
Before Submitting This PR
Please confirm you have done the following:
If this is a feature or change that was previously closed/rejected:
Human Written Description
I implemented a local STT API that follows the OpenAI Whisper format. Currently, the Whisper model is only accessible within Handy; however, many users want to leverage this functionality for external tasks like subtitle transcription without loading multiple model instances. This change exposes the speech-to-text capability as a standardized service, allowing users to do more with limited system memory.
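As a minimal sketch of what an OpenAI-compatible local endpoint enables, the snippet below points the standard OpenAI Python SDK at Handy instead of the cloud. The base URL, port (8765), and model name are illustrative assumptions, not values confirmed by this PR.

```python
# Sketch: reuse Handy's already-loaded model from an external script by
# pointing the standard OpenAI SDK at the local server.
# NOTE: base_url/port and the model name are assumptions for illustration.
from openai import OpenAI

client = OpenAI(
    base_url="http://127.0.0.1:8765/v1",  # assumed local address and port
    api_key="unused",                     # local server; placeholder satisfies the SDK
)

with open("meeting.mp3", "rb") as audio:
    transcript = client.audio.transcriptions.create(
        model="whisper-1",  # assumed model identifier
        file=audio,
    )

print(transcript.text)
```

Because the endpoint follows the OpenAI format, existing clients and tools that already speak that API should work against the local server without modification.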
Related Issues/Discussions
Fixes: None
Discussion: #241
Community Feedback
#241
Testing
Environment:
Test Cases:
- `curl` and Demo: convert MP3 to SRT (a sketch follows this list)
- The `/v1/audio/transcriptions` endpoint correctly triggers the model loading process in the background.
- Added `LOCAL_API.md` to guide future contributors.
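Since the endpoint follows the OpenAI format, the MP3-to-SRT test case can be reproduced with a plain multipart request. Below is a sketch using Python's `requests`; the host, port, and model name are assumptions based on the OpenAI API shape, not details confirmed in this PR.

```python
# Sketch: convert input.mp3 to SRT subtitles via the local endpoint.
# NOTE: host/port and model name are assumptions for illustration.
import requests

BASE_URL = "http://127.0.0.1:8765"  # assumed local address and port

with open("input.mp3", "rb") as f:
    resp = requests.post(
        f"{BASE_URL}/v1/audio/transcriptions",
        files={"file": ("input.mp3", f, "audio/mpeg")},
        data={"model": "whisper-1", "response_format": "srt"},
    )
resp.raise_for_status()

# With response_format=srt, the OpenAI format returns subtitle text directly.
with open("output.srt", "w", encoding="utf-8") as out:
    out.write(resp.text)
```

Screenshots/Videos (if applicable)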