Reintroduce explicit command mode? #6

Open
edgar-vincent opened this issue Sep 19, 2024 · 2 comments

Comments

@edgar-vincent

Hi!

Thanks for this lovely package.

As you mentioned in your blog post, you removed the explicit command mode in d80338a, which allowed one to separate dictation from edits.

Edit requests don't seem to work at all here, no matter how much I refine esi-dictate-llm-prompt and esi-dictate-fix-examples. They are simply added to the transcription. Presumably this is because I cannot use gpt4o-mini for very long (or at all?), since I only use OpenAI's free tier.

What do you think about the possibility of reintroducing the explicit command mode? It could be optional, and thus help users that need it, without interfering with other users' workflow.

Thanks again,

EV

@lepisma
Owner

lepisma commented Sep 28, 2024

Hello,

  1. I think there may be a bug where the hook does not execute, which would cause the edit issue you are facing. I have seen this happen too and will look into it. One hypothesis is that the speech_final flag from the ASR is not arriving (possibly due to noise?).
  2. You can run the command esi-dictate-fix-context manually at any time to make an LLM call on the current context (highlighted by an underline). But I believe you are asking about helping the LLM by providing the command separately, as in the older explicit command mode. I will see if I can bring that back when I get time.

@lepisma
Owner

lepisma commented Oct 14, 2024

Regarding the first point in my previous comment, I think I have found the issue. Deepgram has recently been sending the utterance end event without setting the speech_final flag, so the LLM command is never triggered. I will get to this.
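
To make the failure mode concrete, here is a minimal sketch of one possible workaround: treat either a `speech_final` result or an utterance end event as the end of an utterance, and fire only once per utterance. This is not the package's actual code. The class name `UtteranceTracker` and its `feed` method are hypothetical, and the message fields (`type`, `is_final`, `speech_final`, `channel.alternatives[0].transcript`, and the `UtteranceEnd` message type) are assumed to match Deepgram's streaming responses.

```python
import json


class UtteranceTracker:
    """Decide when an utterance has ended (hypothetical helper, not from esi-dictate)."""

    def __init__(self):
        # True when finalized transcript text has arrived since the last trigger.
        self.pending = False

    def feed(self, raw: str) -> bool:
        """Consume one JSON message; return True if the LLM fix should run now."""
        msg = json.loads(raw)
        kind = msg.get("type")

        if kind == "Results":
            alternatives = msg.get("channel", {}).get("alternatives", [])
            text = alternatives[0].get("transcript", "") if alternatives else ""
            if text and msg.get("is_final"):
                self.pending = True
            # Normal path: the ASR marks the end of speech on a result.
            if msg.get("speech_final") and self.pending:
                self.pending = False
                return True
            return False

        # Fallback path: the utterance end event arrives although
        # speech_final was never set on any result.
        if kind == "UtteranceEnd" and self.pending:
            self.pending = False
            return True

        return False
```

The `pending` flag is there so that an utterance end event arriving after a `speech_final` result does not trigger the LLM a second time for the same utterance.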
