Local Meeting Transcript Mode #48

@asmit203

The feature I have in mind is mainly for transcribing meetings and other audio through the model.

Add Local Meeting Transcript Mode

Summary

  • Add a separate Meeting Capture mode that records system audio locally, transcribes it with the existing Whisper pipeline, and
    writes timestamped transcript logs into a user-selected folder.
  • Keep current microphone dictation unchanged.
  • In meeting mode, do not paste text into the focused app; only save transcript output.
  • Store one transcript file per meeting session, growing incrementally while capture is running.

Key Changes

  • Config and menu
    • Add meetingTranscriptDirectory: String? to Config.
    • Add tray actions: Start Meeting Capture, Stop Meeting Capture, Choose Transcript Folder…, Open Transcript Folder, Open
      Current Transcript.
    • Persist the chosen folder path in ~/.config/open-wispr/config.json.
    • Disable Start Meeting Capture until a writable transcript folder is configured.
  • Permissions and packaging
    • Add ScreenCaptureKit and CoreMedia to Package.swift.
    • Add NSAudioCaptureUsageDescription to the generated app Info.plist.
    • Add screen-recording permission checks to Permissions using CGPreflightScreenCaptureAccess / CGRequestScreenCaptureAccess.
    • If permission is newly granted, require an app restart before the first system-audio capture. This is the behavior
      that Apple’s sample assumes by default.
  • Capture pipeline
    • Introduce a capture abstraction so the current mic path and the new system-audio path share the same transcription pipeline.
    • Keep AudioRecorder for mic dictation.
    • Add a new SystemAudioCaptureSession that uses ScreenCaptureKit, captures audio only, excludes OpenWispr’s own audio, and
      ignores microphone input.
    • Capture audio continuously and cut it into fixed 30-second non-overlapping chunks.
    • Convert chunk audio to mono 16 kHz WAV before passing it to Transcriber.
    • Run chunk transcription on a single serial queue so appended transcript text stays in order.
    • Delete temporary chunk audio files after each transcription completes.
    • While meeting capture is active, ignore the dictation hotkey to avoid overlapping recorder/transcriber state.
  • Transcript logging
    • Add a TranscriptLogStore responsible for folder validation, session file creation, and append operations.
    • Create one file per session named meeting-YYYY-MM-DD-HHmmss.md.
    • Write a small header at session start: start time, source (System Audio), model, and language.
    • Append each successful chunk as one line: [HH:mm:ss] transcribed text.
    • Skip empty / blank-audio chunks after existing Whisper marker cleanup.
    • On stop, flush remaining queued transcriptions and append session end time.
  • App orchestration
    • Extend AppDelegate with a separate meeting-capture state machine and start/stop handlers.
    • Extend StatusBarController.State with meeting-specific states so the menu and icon reflect “capturing meeting audio” vs
      normal dictation/transcribing.
    • Reuse existing Transcriber and text post-processing; bypass TextInserter for meeting mode.
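
As a rough illustration of the Transcript logging items above, here is a minimal sketch of what a TranscriptLogStore could look like. Everything beyond the type name is an assumption made for the example: the method names (startSession, append, endSession), the error cases, and the exact header layout are hypothetical and not taken from the codebase. One detail worth noting: in DateFormatter patterns, `yyyy` is the calendar year while `YYYY` is the week-based year, so the sketch uses `yyyy-MM-dd-HHmmss` to produce the meeting-YYYY-MM-DD-HHmmss.md name.

```swift
import Foundation

/// Hypothetical sketch of the proposed TranscriptLogStore.
/// Method names, error cases, and header layout are assumptions.
final class TranscriptLogStore {
    enum StoreError: Error {
        case directoryMissing
        case directoryNotWritable
        case noActiveSession
    }

    private let directory: URL
    private var handle: FileHandle?

    /// Validates that the folder exists, is a directory, and is writable.
    init(directory: URL) throws {
        var isDirectory: ObjCBool = false
        let exists = FileManager.default.fileExists(atPath: directory.path,
                                                    isDirectory: &isDirectory)
        guard exists, isDirectory.boolValue else { throw StoreError.directoryMissing }
        guard FileManager.default.isWritableFile(atPath: directory.path) else {
            throw StoreError.directoryNotWritable
        }
        self.directory = directory
    }

    private static func formatter(_ format: String,
                                  timeZone: TimeZone = .current) -> DateFormatter {
        let formatter = DateFormatter()
        formatter.locale = Locale(identifier: "en_US_POSIX")
        formatter.timeZone = timeZone
        formatter.dateFormat = format
        return formatter
    }

    /// Builds the per-session file name, e.g. meeting-2025-01-31-140502.md.
    static func sessionFilename(for date: Date, timeZone: TimeZone = .current) -> String {
        let stamp = formatter("yyyy-MM-dd-HHmmss", timeZone: timeZone).string(from: date)
        return "meeting-\(stamp).md"
    }

    /// Creates the session file, writes the header, and keeps it open for appends.
    @discardableResult
    func startSession(at date: Date = Date(), model: String, language: String) throws -> URL {
        let url = directory.appendingPathComponent(Self.sessionFilename(for: date))
        let started = Self.formatter("yyyy-MM-dd HH:mm:ss").string(from: date)
        let header = "Started: \(started)\nSource: System Audio\n"
            + "Model: \(model)\nLanguage: \(language)\n\n"
        try Data(header.utf8).write(to: url, options: .atomic)
        let newHandle = try FileHandle(forWritingTo: url)
        newHandle.seekToEndOfFile()
        handle = newHandle
        return url
    }

    /// Appends one "[HH:mm:ss] text" line; whitespace-only text is skipped.
    func append(_ text: String, at date: Date = Date()) throws {
        guard let handle = handle else { throw StoreError.noActiveSession }
        let trimmed = text.trimmingCharacters(in: .whitespacesAndNewlines)
        guard !trimmed.isEmpty else { return }
        let stamp = Self.formatter("HH:mm:ss").string(from: date)
        handle.write(Data("[\(stamp)] \(trimmed)\n".utf8))
    }

    /// Writes the session end time and closes the file.
    func endSession(at date: Date = Date()) {
        guard let handle = handle else { return }
        let ended = Self.formatter("yyyy-MM-dd HH:mm:ss").string(from: date)
        handle.write(Data("\nEnded: \(ended)\n".utf8))
        handle.closeFile()
        self.handle = nil
    }
}
```

In this sketch the caller is expected to invoke append from the single serial transcription queue, so the store itself does no locking; that is a design assumption rather than a requirement stated in the plan.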

Test Plan

  • Add unit tests for Config decoding/saving of meetingTranscriptDirectory.
  • Add unit tests for TranscriptLogStore:
    • creates session filenames correctly
    • rejects missing/unwritable directories
    • appends timestamped chunks in order
    • writes session header and end marker
  • Add unit tests for chunk pipeline helpers:
    • chunk rollover at 30 seconds
    • blank chunk suppression
    • temp file cleanup after transcription
  • Add manual validation for the system-audio path:
    • first-time permission flow for Screen Recording / system audio
    • folder chooser persists across restart
    • transcript file grows during an active Zoom/Meet playback
    • no text is pasted during meeting mode
    • normal mic dictation still works after stopping meeting mode
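
To make the chunk pipeline helpers unit-testable without ScreenCaptureKit, the rollover and blank-suppression logic could live in small pure helpers along the lines of the sketch below. The names ChunkAccumulator and cleanedTranscript are hypothetical, the sample format ([Float] mono samples) is an assumption, and the bracket-stripping regex is only a stand-in for the existing Whisper marker cleanup, whose actual rules are not shown in this issue.

```swift
import Foundation

/// Hypothetical helper: splits a stream of mono samples into fixed-length,
/// non-overlapping chunks (30 seconds at 16 kHz by default).
struct ChunkAccumulator {
    let samplesPerChunk: Int
    private var pending: [Float] = []

    init(sampleRate: Int = 16_000, chunkSeconds: Int = 30) {
        samplesPerChunk = sampleRate * chunkSeconds
    }

    /// Buffers the samples and returns every chunk completed by this call.
    mutating func append(_ samples: [Float]) -> [[Float]] {
        pending.append(contentsOf: samples)
        var completed: [[Float]] = []
        while pending.count >= samplesPerChunk {
            completed.append(Array(pending.prefix(samplesPerChunk)))
            pending.removeFirst(samplesPerChunk)
        }
        return completed
    }

    /// Returns the trailing partial chunk on stop, or nil if nothing is buffered.
    mutating func flush() -> [Float]? {
        guard !pending.isEmpty else { return nil }
        defer { pending.removeAll() }
        return pending
    }
}

/// Stand-in for the existing marker cleanup (an assumption): removes
/// bracketed markers such as "[BLANK_AUDIO]" and returns nil when
/// nothing but whitespace remains, so the chunk can be skipped.
func cleanedTranscript(_ raw: String) -> String? {
    let stripped = raw.replacingOccurrences(of: #"\[[^\]]*\]"#,
                                            with: "",
                                            options: .regularExpression)
    let trimmed = stripped.trimmingCharacters(in: .whitespacesAndNewlines)
    return trimmed.isEmpty ? nil : trimmed
}
```

A rollover test can then use a tiny sample rate (for example 10 samples per second with 3-second chunks) instead of feeding 30 seconds of real audio.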

Assumptions And Defaults

  • V1 is a separate mode, not a source toggle inside the existing dictation flow.
  • V1 is save-only; meeting text is never auto-pasted.
  • V1 stores one transcript file per session.
  • V1 does not retain raw meeting audio after chunk transcription; only transcript logs persist.
  • The chosen folder is stored as a plain filesystem path, not a security-scoped bookmark, because this app is currently
    distributed as an unsandboxed macOS app.
  • V1 captures system output broadly and does not add per-app or per-window capture selection yet.
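
Because the folder is stored as a plain path, the Config change can stay a single optional field. The sketch below shows only that field plus two hypothetical convenience properties (transcriptDirectoryURL and canStartMeetingCapture, both invented for this example); the real Config has other fields that are omitted here. With synthesized Codable conformance, an older config.json that lacks the key decodes to nil.

```swift
import Foundation

/// Hypothetical subset of Config; the real struct has other fields not shown.
struct Config: Codable, Equatable {
    var meetingTranscriptDirectory: String?
}

extension Config {
    /// The configured folder as a file URL with a leading "~" expanded,
    /// or nil when no folder has been chosen.
    var transcriptDirectoryURL: URL? {
        guard let path = meetingTranscriptDirectory, !path.isEmpty else { return nil }
        let expanded = NSString(string: path).expandingTildeInPath
        return URL(fileURLWithPath: expanded, isDirectory: true)
    }

    /// True only when the folder exists, is a directory, and is writable.
    /// Could drive the enabled state of the Start Meeting Capture item.
    var canStartMeetingCapture: Bool {
        guard let url = transcriptDirectoryURL else { return false }
        var isDirectory: ObjCBool = false
        let exists = FileManager.default.fileExists(atPath: url.path,
                                                    isDirectory: &isDirectory)
        return exists && isDirectory.boolValue
            && FileManager.default.isWritableFile(atPath: url.path)
    }
}
```

If the app is ever sandboxed, this plain-path approach would need to be revisited in favor of a security-scoped bookmark, as the assumption above implies.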
