Add TranscriptControl with out-of-process Whisper transcription service#51
Draft
Copilot wants to merge 5 commits into
- Created Bookmarkly.Transcription.Abstractions with WinRT interface definitions
- Created Bookmarkly.Transcription with ONNX/Whisper implementation
- Created Bookmarkly.Transcription.Server for out-of-process COM server
- Created TranscriptControl UserControl in Bookmarkly.Views
- Updated Package.appxmanifest to register OOP server
- Updated Directory.Packages.Props with required NuGet packages
- Updated solution file to include new projects
- Added EnableWindowsTargeting property for Linux builds
Co-authored-by: Kumara-Krishnan <[email protected]>

- Optimize AudioProcessor to pre-allocate list capacity
- Add comments for tensor copy efficiency
- Reuse TranscriptionService instance in TranscriptControl
- Fix GetMany implementation with correct parameter handling
- Improve COM server lifetime management with ref counting
Co-authored-by: Kumara-Krishnan <[email protected]>
Copilot (AI) changed the title from "[WIP] Add TranscriptControl UserControl for audio transcription" to "Add TranscriptControl with out-of-process Whisper transcription service" on Dec 21, 2025
Implements audio transcription via Whisper ONNX models using an out-of-process COM server architecture for cross-app reusability.
Architecture
3 new projects:
- Bookmarkly.Transcription.Abstractions - WinRT contract library (.idl → .winmd)
- Bookmarkly.Transcription - ONNX inference, audio processing, tokenization
- Bookmarkly.Transcription.Server - OOP COM server (single instance, ref counting)

WinRT Interface:
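As a rough sketch, the contract might look like this when viewed from C#. The three method names and parameters come from the original prompt; the namespace and the return types (IAsyncOperation of string, and a read-only list of language codes) are assumptions, and the actual definition lives in the .idl file rather than in C#.

```csharp
using System.Collections.Generic;
using Windows.Foundation;
using Windows.Storage;

// Assumed namespace, inferred from the project name.
namespace Bookmarkly.Transcription.Abstractions
{
    // C# view of the WinRT contract. Return types are assumptions:
    // WinRT async methods normally surface as IAsyncOperation<T>.
    public interface ITranscriptionService
    {
        // Transcribes the file and returns the transcript text.
        IAsyncOperation<string> TranscribeAsync(StorageFile audioFile);

        // Same as above, but with an explicit language code.
        IAsyncOperation<string> TranscribeWithLanguageAsync(
            StorageFile audioFile, string languageCode);

        // Returns the list of supported languages.
        IAsyncOperation<IReadOnlyList<string>> GetSupportedLanguagesAsync();
    }
}
```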
Audio Pipeline:
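The shape of the tensor handed to the encoder can be worked out from the pipeline parameters. The sketch below uses the 16 kHz sample rate and 80 mel bins named in the original prompt; the 30-second window and 160-sample hop are assumptions taken from Whisper's standard front end, and WhisperFrontEnd is a hypothetical helper, not a class from this PR.

```csharp
using System;

// Hypothetical helper illustrating the mel spectrogram dimensions.
public static class WhisperFrontEnd
{
    public const int SampleRate = 16_000; // from the prompt: 16 kHz
    public const int MelBins = 80;        // from the prompt: 80 mel bins
    public const int ChunkSeconds = 30;   // assumption: Whisper's fixed window
    public const int HopLength = 160;     // assumption: 10 ms hop at 16 kHz

    // 16,000 samples/s * 30 s = 480,000 samples per chunk.
    public static int SamplesPerChunk => SampleRate * ChunkSeconds;

    // 480,000 / 160 = 3,000 spectrogram frames per chunk.
    public static int FramesPerChunk => SamplesPerChunk / HopLength;

    // Encoder input shape: [batch, mel bins, frames].
    public static int[] EncoderInputShape => new[] { 1, MelBins, FramesPerChunk };

    // Zero-pads short audio, or truncates long audio, to exactly one chunk.
    public static float[] PadOrTrim(float[] samples)
    {
        var chunk = new float[SamplesPerChunk];
        Array.Copy(samples, chunk, Math.Min(samples.Length, chunk.Length));
        return chunk;
    }
}
```

Under these assumptions every chunk yields an 80 x 3000 spectrogram regardless of the original clip length, which is why shorter clips are zero-padded before feature extraction.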
TranscriptControl UI
Location:
Bookmarkly.Views/Controls/TranscriptControl.xaml

Features:
Dependency Properties:
- AudioFile (StorageFile) - input
- Transcript (string, read-only) - output
- IsTranscribing (bool) - loading state
- SelectedLanguage (string) - language code

Usage:
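A minimal code-behind sketch of how a host page might configure the control, using only the dependency properties listed above. The Bookmarkly.Views.Controls namespace is inferred from the file location, and the helper class, the public setters, and the "en" language code are assumptions for illustration.

```csharp
using Bookmarkly.Views.Controls; // assumed namespace
using Windows.Storage;

// Hypothetical helper showing how the control could be set up.
public static class TranscriptControlUsage
{
    public static TranscriptControl Create(StorageFile audioFile)
    {
        return new TranscriptControl
        {
            AudioFile = audioFile,    // input audio to transcribe
            SelectedLanguage = "en",  // assumed language code
        };
    }
}
```

Transcript and IsTranscribing are left untouched here: the first is described as read-only output and the second as loading state, so a host would read or bind to them rather than set them.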
Configuration
Package.appxmanifest:
NuGet packages:
Implementation Notes
Files: 19 changed (12 new), 1660+ lines added
Original prompt
Overview
Build a TranscriptControl UserControl that accepts an audio file and uses the Whisper base model (from https://huggingface.co/onnx-community/whisper-base) to render the transcript. The transcription service should run in a separate out-of-process COM server so it can be reused by other apps from the same publisher.

Architecture Requirements
Project Structure
Create the following new projects in the solution:
Bookmarkly.Transcription.Abstractions (WinRT Contract Library)
- (.idl files compiled to .winmd)
- ITranscriptionService interface with methods:
  - TranscribeAsync(StorageFile audioFile) - returns transcript text
  - TranscribeWithLanguageAsync(StorageFile audioFile, string languageCode) - transcribe with specific language
  - GetSupportedLanguagesAsync() - returns list of supported languages
- .winmd that both server and client reference

Bookmarkly.Transcription (Class Library)
- Microsoft.ML.OnnxRuntime.DirectML (for GPU acceleration)
- Microsoft.ML.OnnxRuntime (fallback CPU)
- WhisperTranscriber - loads ONNX models and performs inference
- AudioProcessor - converts audio to mel spectrogram (80 mel bins, 16 kHz sample rate)
- WhisperTokenizer - decodes token IDs to text
- encoder_model.onnx (from https://huggingface.co/onnx-community/whisper-base/resolve/main/onnx/encoder_model.onnx)
- decoder_model_merged.onnx (from https://huggingface.co/onnx-community/whisper-base/resolve/main/onnx/decoder_model_merged.onnx)

Bookmarkly.Transcription.Server (Out-of-Process WinRT COM Server EXE)
- TranscriptionService that implements ITranscriptionService
- (Program.cs) that:

Package Manifest Updates
Update Bookmarkly.App/Package.appxmanifest to register the out-of-process server:

Add required namespace:
xmlns:uap5="http://schemas.microsoft.com/appx/manifest/uap/windows10/5"

TranscriptControl UI Requirements
Location
Create in Bookmarkly.Views project:
- Controls/TranscriptControl.xaml
- Controls/TranscriptControl.xaml.cs

UI Layout
Features
File Input
Shimmer Loading Effect
Language Picker (Top Right)
- GetSupportedLanguagesAsync()

Copy Button (Top Right)
This pull request was created from Copilot chat.