feat: Added continuous listening functionality which is controlled by prop #5397

Open: 9 commits to be merged into base branch main

@RushikeshGavali RushikeshGavali commented Dec 26, 2024

Changelog Entry

Added

Description

  • Currently, Web Chat supports push-to-talk only: when a user clicks the Microphone button to talk with the bot, voice mode turns off after the message is sent, and the user needs to click the Microphone button again to continue talking. These changes introduce a continuous listening capability, allowing the user to interact with the bot without repeatedly clicking the Microphone button. Once the user clicks the Microphone button, Web Chat continues to listen until the user stops listening mode by clicking the Microphone button again.

Design

  • Introduced the enableContinuousListening prop to manage the continuous listening mode. This prop controls the dictate state so that listening mode stays active after a message is sent.

Specific Changes

  • The Composer component will accept a new prop called enableContinuousListening.
  • The continuousListening state has been created in the Redux store, and the Composer component will initialize its value.
  • The useContinuousListening hook has been created to fetch the value from the store.
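
A minimal sketch of how the store state described above could look. The action type, reducer, action creator, and selector names here are hypothetical illustrations, not the names used in this PR or in Web Chat itself; a hook such as useContinuousListening would presumably wrap a selector like the one shown.

```typescript
// Hypothetical action shapes; Web Chat's real action names may differ.
type Action =
  | { type: 'SET_CONTINUOUS_LISTENING'; payload: boolean }
  | { type: 'OTHER' };

// Reducer for the continuousListening slice; defaults to push-to-talk (false).
function continuousListeningReducer(state = false, action: Action): boolean {
  return action.type === 'SET_CONTINUOUS_LISTENING' ? action.payload : state;
}

// What the Composer might dispatch to initialize the value from its prop.
function setContinuousListening(enabled: boolean): Action {
  return { type: 'SET_CONTINUOUS_LISTENING', payload: enabled };
}

// Selector that a hook could use to read the value from the store.
function selectContinuousListening(state: { continuousListening: boolean }): boolean {
  return state.continuousListening;
}
```
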
Continuous_Listening.mp4
  • I have added tests and executed them locally
  • I have updated CHANGELOG.md
  • I have updated documentation

Review Checklist

This section is for contributors to review your work.

  • Accessibility reviewed (tab order, content readability, alt text, color contrast)
  • Browser and platform compatibilities reviewed
  • CSS styles reviewed (minimal rules, no z-index)
  • Documents reviewed (docs, samples, live demo)
  • Internationalization reviewed (strings, unit formatting)
  • package.json and package-lock.json reviewed
  • Security reviewed (no data URIs, check for nonce leak)
  • Tests reviewed (coverage, legitimacy)

@RushikeshGavali
Author

@RushikeshGavali please read the following Contributor License Agreement(CLA). If you agree with the CLA, please reply with the following information.

@microsoft-github-policy-service agree [company="{your company}"]

Options:

  • (default - no company specified) I have sole ownership of intellectual property rights to my Submissions and I am not making Submissions in the course of work for my employer.
@microsoft-github-policy-service agree
  • (when company given) I am making Submissions in the course of work for my employer (or my employer has intellectual property rights in my Submissions by contract or applicable law). I have permission from my employer to make Submissions and enter into this Agreement on behalf of my employer. By signing below, the defined term “You” includes me and my employer.
@microsoft-github-policy-service agree company="Microsoft"

Contributor License Agreement

@microsoft-github-policy-service agree company="Microsoft"

Collaborator

@OEvgeny OEvgeny left a comment

The approach needs validation and input from @compulim

The PR description is missing the PR number and contributor links in the Changelog section.

The PR is missing the CHANGELOG.md file update.

@RushikeshGavali
Author

Updated the description. CHANGELOG.md file is already updated.

@OEvgeny OEvgeny self-requested a review January 3, 2025 20:31
OEvgeny previously approved these changes Jan 3, 2025
@OEvgeny OEvgeny self-requested a review January 3, 2025 20:36
@OEvgeny OEvgeny dismissed their stale review January 3, 2025 20:37

The approach needs validation and input from @compulim

@compulim
Contributor

compulim commented Jan 14, 2025

speechRecognition.continuous needs to be set to true. Otherwise, the speech recognition engine will stop after the first recognition.

AFAIR, either Android or Safari doesn't work with continuous, so there will also need to be some code to turn it back on after it has stopped.

Also, this needs to handle multiple results with final = true, something like what is outlined in the following screenshot. Every result has a "final" property indicating whether it is completed. Try it with "continuous + interim" and you will see that some results don't have "final" set.

In each continuous session, all results are presented again on every single result event. That means speech that was recognized a minute ago will still be present, so some extraction will be needed.
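
The extraction described above could be sketched roughly as follows. This is an illustration only: ResultLike and AlternativeLike are hypothetical structural stand-ins for the Web Speech API's SpeechRecognitionResult and SpeechRecognitionAlternative, takeNewFinalTranscripts is a made-up helper name, and the sketch assumes results become final in list order. In a browser, event.resultIndex could also serve as the starting point.

```typescript
// Hypothetical stand-ins for the Web Speech API result types.
interface AlternativeLike {
  readonly transcript: string;
}

interface ResultLike {
  readonly isFinal: boolean;
  readonly length: number;
  readonly [index: number]: AlternativeLike;
}

// Returns only the final transcripts that have not been handled yet,
// plus the updated count of handled results for the next event.
function takeNewFinalTranscripts(
  results: ArrayLike<ResultLike>,
  alreadyHandled: number
): { transcripts: string[]; handled: number } {
  const transcripts: string[] = [];
  let handled = alreadyHandled;

  for (let i = alreadyHandled; i < results.length; i++) {
    const result = results[i];

    // Stop at the first interim result so it is revisited on the next event.
    if (!result || !result.isFinal) {
      break;
    }

    // Take the first (best) alternative of each newly finalized result.
    const best = result[0];

    if (best) {
      transcripts.push(best.transcript.trim());
    }

    handled = i + 1;
  }

  return { transcripts, handled };
}
```

The caller would keep the returned handled count for the duration of one continuous session and reset it to 0 whenever recognition is restarted.
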

Make sure manual testing is done on both browser-provided speech and Azure AI Speech, and on every browser combination: Edge, Chrome, Firefox, Android Chrome, macOS Safari, iOS Safari, iPadOS Safari. I.e. 14 manual tests.

image

I guess this doesn't need a new hook or saga, because it is native to the W3C Web Speech API.

For Android/Safari quirks, we should ponyfill a fix for their bug in a separate layer. I.e. don't fix the browser issue directly inside the Web Chat data/Redux layer; fix it in the speech factory layer instead.
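
One possible shape for such a fix, kept outside the data/Redux layer, is a thin wrapper that restarts recognition when the engine ends on its own. This is only a sketch under assumptions: RecognitionLike is a hypothetical, reduced stand-in for the SpeechRecognition interface, startWithAutoRestart is a made-up name, and real browsers may need extra handling (for example around errors or permission prompts) that is not shown here.

```typescript
// Hypothetical, reduced stand-in for the Web Speech API SpeechRecognition object.
interface RecognitionLike {
  continuous: boolean;
  onend: (() => void) | null;
  start(): void;
  stop(): void;
}

// Starts recognition and keeps it running until the returned function is called.
function startWithAutoRestart(recognition: RecognitionLike): () => void {
  let stoppedByUser = false;

  recognition.continuous = true;

  recognition.onend = () => {
    // The engine ended without the user asking for it: turn it back on.
    if (!stoppedByUser) {
      recognition.start();
    }
  };

  recognition.start();

  // Call this when the user clicks the Microphone button again.
  return () => {
    stoppedByUser = true;
    recognition.stop();
  };
}
```
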

@RushikeshGavali
Author

RushikeshGavali commented Jan 16, 2025

@compulim
I have tried passing the continuous property to the DictateComposer component. To enable continuous listening, we need to make some changes in the react-dictate-button library. Currently, once it receives a result with isFinal set to true, it sets the recognition reference to undefined and changes the ready state. This needs to be done conditionally.

Also, we need all the current PR changes where we are conditionally setting the dictateState in onDictate and inside the stopDictateOnCardActionSaga. Since we control the started props via the dictateState property, we should not set dictateState to IDLE in onDictate and the stopDictateOnCardActionSaga when enabling continuous listening.
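
To illustrate the conditional transition described above, here is a simplified sketch. The two-value DictateState, the event names, and nextDictateState are hypothetical and far smaller than Web Chat's real dictate state machine; they only show the idea that sending a message or a card action returns to IDLE unless continuous listening is enabled.

```typescript
// Simplified, hypothetical model; Web Chat's real dictate states are more granular.
type DictateState = 'IDLE' | 'DICTATING';
type DictateEvent = 'MICROPHONE_BUTTON_CLICK' | 'DICTATE_RESULT_SENT' | 'CARD_ACTION';

function nextDictateState(
  current: DictateState,
  event: DictateEvent,
  continuousListening: boolean
): DictateState {
  switch (event) {
    case 'MICROPHONE_BUTTON_CLICK':
      // The Microphone button always toggles listening on or off.
      return current === 'DICTATING' ? 'IDLE' : 'DICTATING';

    case 'DICTATE_RESULT_SENT':
    case 'CARD_ACTION':
      // Push-to-talk goes back to IDLE; continuous listening keeps the current state.
      return continuousListening ? current : 'IDLE';
  }
}
```
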

Using the continuous property will still require all the changes made in the current PR. Could you please share your thoughts?
