
added 3 more vars for LLM post-processing transcript #704

Open
khanhicetea wants to merge 2 commits into cjpais:main from khanhicetea:post-processing-context

Conversation

@khanhicetea commented Feb 3, 2026


Human Written Description

We now have more variables available in the prompt for post-processing:

$time_local : the current local time as a string
$current_app : identifies the app the user is working in, so you can teach the LLM about the purpose of the work in the prompt template
$short_prev_transcript : the last transcript from the same app, trimmed to roughly the last 200 words (and expiring after 5 minutes), so the LLM has more context about what comes next (rough sketch after this list)
$language : the selected transcription language for Whisper models (e.g. "en", "zh-Hans"), or "auto" for other models (Parakeet, Moonshine)
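
For illustration only, here is a rough sketch of how ${short_prev_transcript} could be derived from the previous transcript in the same app. The struct, function name, and storage are my own assumptions, not necessarily how the PR implements it:

```rust
use std::time::{Duration, Instant};

// Hypothetical sketch: keep the previous transcript per app, trim it to
// roughly the last 200 words, and drop it once it is older than 5 minutes.
struct PrevTranscript {
    app: String,
    text: String,
    recorded_at: Instant,
}

fn short_prev_transcript(prev: &Option<PrevTranscript>, current_app: &str) -> String {
    match prev {
        Some(p)
            if p.app == current_app
                && p.recorded_at.elapsed() < Duration::from_secs(5 * 60) =>
        {
            // Take at most the last 200 words of the previous transcript.
            let words: Vec<&str> = p.text.split_whitespace().collect();
            let start = words.len().saturating_sub(200);
            words[start..].join(" ")
        }
        // No previous transcript, different app, or expired: empty context.
        _ => String::new(),
    }
}
```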

Related Issues/Discussions

Fixes #168


Testing

Using this prompt:

Current App : ${current_app}
Current Local Time : ${time_local}

Clean this transcript:
1. Fix spelling, capitalization, and punctuation errors
2. Convert number words to digits (twenty-five → 25, ten percent → 10%, five dollars → $5)
3. Replace spoken punctuation with symbols (period → ., comma → ,, question mark → ?)
4. Remove filler words (um, uh, like as filler)
5. Keep the language of the original 'Transcript' (if it was French, keep it in French, for example)

Preserve exact meaning and word order. Do not paraphrase or reorder content.

Return only the cleaned transcript.

Last transcript in same app (for better understanding context) : ${short_prev_transcript}

Transcript:
${output}


AI Assistance

  • AI was used (please describe below)

If AI was used:

  • Tools used: Claude Opus 4.5 Thinking
  • How extensively:

@khanhicetea (Author)

btw, I plan to add 1 more var named '${language}', which is picked by the user in settings for multi-lang models. Should I push it?

@cjpais (Owner) commented Feb 4, 2026

Sure. If you can, could you also add documentation for this feature in the readme or somewhere?

@khanhicetea (Author)

  • Added $language (for Whisper models); falls back to 'auto' for everything else
  • Added docs to the readme (Experimental > Post Processing section)

I tested on my Mac and it works (not sure about Win and Linux)

This template opens up more ways to work with the transcript (fixing typos and grammar, even translation or dual-translation output) 😄

/// that has keyboard focus when the user starts transcribing.

#[cfg(target_os = "macos")]
pub fn get_frontmost_app_name() -> Option<String> {
@VirenMohindra (Contributor) commented Feb 8, 2026

hmm so the macOS impl returns the app name via localizedName (e.g. "Safari"), but Windows uses GetWindowTextW, which returns the window title (e.g. "Stack Overflow - How to do X - Google Chrome"). this means ${current_app} has different semantics per platform, and keying the ${short_prev_transcript} context by app name would create separate entries per browser tab on Windows. maybe consider using GetModuleFileNameExW or querying the process name instead? (rough sketch below)
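
For illustration, a rough sketch of the process-name approach. This is an assumption, not the PR's code; it presumes the winapi crate with the winuser, processthreadsapi, psapi, handleapi, and winnt features enabled:

```rust
// Hypothetical sketch: resolve the foreground window to its process
// executable name (e.g. "chrome.exe") so ${current_app} has roughly the same
// "application name" semantics on Windows as localizedName has on macOS.
#[cfg(target_os = "windows")]
pub fn get_frontmost_app_name() -> Option<String> {
    use std::ffi::OsString;
    use std::os::windows::ffi::OsStringExt;
    use winapi::um::handleapi::CloseHandle;
    use winapi::um::processthreadsapi::OpenProcess;
    use winapi::um::psapi::GetModuleFileNameExW;
    use winapi::um::winnt::{PROCESS_QUERY_INFORMATION, PROCESS_VM_READ};
    use winapi::um::winuser::{GetForegroundWindow, GetWindowThreadProcessId};

    unsafe {
        let hwnd = GetForegroundWindow();
        if hwnd.is_null() {
            return None;
        }
        // Map the window to its owning process id.
        let mut pid = 0u32;
        GetWindowThreadProcessId(hwnd, &mut pid);
        let handle = OpenProcess(PROCESS_QUERY_INFORMATION | PROCESS_VM_READ, 0, pid);
        if handle.is_null() {
            return None;
        }
        // Read the full path of the process executable.
        let mut buf = [0u16; 260];
        let len = GetModuleFileNameExW(handle, std::ptr::null_mut(), buf.as_mut_ptr(), buf.len() as u32);
        CloseHandle(handle);
        if len == 0 {
            return None;
        }
        // Keep only the file name portion of the path.
        let path = std::path::PathBuf::from(OsString::from_wide(&buf[..len as usize]));
        path.file_name().map(|n| n.to_string_lossy().into_owned())
    }
}
```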

);

// Replace ${output} variable in the prompt with the actual text
let processed_prompt = prompt.replace("${output}", transcription);
Contributor

the change logs the entire prompt, including the transcription text AND the app name. this is a privacy concern since the log file could contain sensitive speech content.

recommend reverting
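
If some debug logging were still wanted, a minimal sketch that avoids the sensitive content could look like this (assuming the log crate is available; not part of the PR):

```rust
// Hypothetical alternative: log only non-sensitive metadata about the prompt,
// never the transcription text or the prompt body itself.
log::debug!(
    "post-processing prompt built ({} chars, transcription {} chars)",
    processed_prompt.len(),
    transcription.len()
);
```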


[target.'cfg(target_os = "macos")'.dependencies]
tauri-nspanel = { git = "https://github.com/ahkohd/tauri-nspanel", branch = "v2.1" }
objc = "0.2"
Contributor
the objc crate is deprecated in favor of objc2, which is already a transitive dep in the project. maybe we could look into objc2-app-kit (rough sketch below) or a simpler approach for the macOS active app detection
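
A rough sketch of the same lookup via objc2-app-kit, as an assumption rather than the PR's code; method names follow the generated objc2-app-kit bindings and may differ between versions:

```rust
// Hypothetical sketch: query the frontmost application's localized name via
// objc2-app-kit instead of the deprecated objc crate. The unsafe block covers
// versions where these bindings are marked unsafe.
#[cfg(target_os = "macos")]
pub fn get_frontmost_app_name() -> Option<String> {
    use objc2_app_kit::NSWorkspace;

    unsafe {
        let workspace = NSWorkspace::sharedWorkspace();
        let app = workspace.frontmostApplication()?;
        let name = app.localizedName()?;
        Some(name.to_string())
    }
}
```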

use tauri::Manager;

/// Tracks the frontmost application captured at recording start, keyed by binding_id
static RECORDING_APP_CONTEXT: Lazy<Mutex<HashMap<String, String>>> =
Contributor
nit: entries are inserted in start() and removed in stop(), but CancelAction doesn't clean up its entry. over many cancelled recordings this could accumulate stale entries. minor, since the values are small strings, but worth adding cleanup in the cancel path too (something like the sketch below)
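
A minimal sketch of that cleanup, assuming a std::sync::Mutex behind the Lazy; the helper name and call site are my assumptions:

```rust
// Hypothetical sketch: mirror the stop() cleanup in the cancel path so a
// cancelled recording also removes its entry from RECORDING_APP_CONTEXT.
fn clear_recording_app_context(binding_id: &str) {
    if let Ok(mut map) = RECORDING_APP_CONTEXT.lock() {
        map.remove(binding_id);
    }
}
```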

@VirenMohindra (Contributor) commented Feb 8, 2026

solid feature idea, the context variables will make post-processing much more useful. some notes below~

merge conflicts + coordination with #706

this PR is currently conflicting with main and also has a semantic conflict with #706 (structured outputs). that PR splits prompts into a system prompt + user content, so the variable substitutions here (${current_app}, ${time_local}, etc.) will need to work in both the legacy ${output} path and the structured-output system prompt path (rough sketch of one way below). recommend merging #706 first and rebasing this on top
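
One way that coordination could be factored, sketched under my own assumptions (PromptContext and the function name are illustrative, not from either PR):

```rust
// Hypothetical sketch: a single substitution helper that both prompt paths
// call, so the new variables behave the same whether the legacy ${output}
// template or the #706 system-prompt split is in use.
struct PromptContext {
    time_local: String,
    current_app: String,
    short_prev_transcript: String,
    language: String,
}

fn apply_context_vars(template: &str, ctx: &PromptContext) -> String {
    template
        .replace("${time_local}", &ctx.time_local)
        .replace("${current_app}", &ctx.current_app)
        .replace("${short_prev_transcript}", &ctx.short_prev_transcript)
        .replace("${language}", &ctx.language)
}

// Legacy path: the transcription is substituted into the same template.
//   let prompt = apply_context_vars(&user_prompt, &ctx).replace("${output}", transcription);
// Structured-output path (#706): the context vars go into the system prompt,
// and the transcription is sent separately as user content.
//   let system_prompt = apply_context_vars(&system_template, &ctx);
```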
