@alfonsomorab alfonsomorab commented Sep 6, 2025

Summary

Fixes the infinite echo loop triggered by SDK .speak() calls on Android by pausing STT (Speech-to-Text) processing during TTS playback, with reliability safeguards that ensure STT resumes afterwards.

Note: This PR addresses the Android implementation only. iOS still needs this solution implemented.

Problem

When using SDK .speak() calls with phone speaker + phone microphone:

  1. User says "hello"
  2. App calls session.audio.speak("hello")
  3. TTS plays through phone speaker
  4. Phone microphone captures TTS audio
  5. Speech recognition processes captured audio as new speech
  6. App responds with another .speak() call → Infinite loop
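
The steps above can be reproduced with a toy model. This is only an illustrative sketch, not MentraOS code: `EchoLoopDemo`, `countSpeakCalls`, and the `maxTurns` bound are hypothetical names, and the app is assumed to echo every transcript it receives.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Toy model of the speaker-to-microphone feedback path (hypothetical, not MentraOS code). */
final class EchoLoopDemo {
    /**
     * Counts how many times the app speaks after a single user utterance.
     * sttPausedDuringTts models pausing transcription while TTS plays;
     * maxTurns bounds the loop, which would otherwise never end.
     */
    static int countSpeakCalls(boolean sttPausedDuringTts, int maxTurns) {
        Deque<String> heardByStt = new ArrayDeque<>();
        heardByStt.add("hello");                      // step 1: the user speaks
        int speakCalls = 0;
        while (!heardByStt.isEmpty() && speakCalls < maxTurns) {
            String transcript = heardByStt.poll();    // step 5: STT produces a transcript
            speakCalls++;                             // steps 2-3: the app speaks it via TTS
            if (!sttPausedDuringTts) {
                heardByStt.add(transcript);           // step 4: the mic feeds TTS audio back to STT
            }
        }
        return speakCalls;
    }
}
```

With the pause disabled, the count always reaches `maxTurns`; with it enabled, the app speaks exactly once.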

Solution

Instead of trying to disable microphone hardware (which had timing and acoustic coupling issues), this solution pauses STT processing during SDK .speak() calls:

  • Mobile side: AudioManagerModule sends STT pause/resume commands via broadcast
  • Android Core: AugmentosService receives commands and posts PauseAsrEvent
  • Leverages existing infrastructure: Uses the same STT pause system that native TTS already uses successfully
  • Zero latency: Mobile-only solution with no network calls
  • Multi-layered reliability: Comprehensive error handling and safety mechanisms

Architecture

SDK .speak() → Mobile AudioManagerModule → Broadcast Intent →
AugmentosService STTControlReceiver → PauseAsrEvent →
SpeechRecSwitchSystem → SpeechRecFramework.pauseAsr()
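
A plain-Java sketch of this chain, with the Android pieces replaced by stand-ins so it runs anywhere. Only the name `PauseAsrEvent` and the idea of a pause flag come from this PR; the action strings, the event's field, and the `*Model` classes are assumptions made for illustration, and a `Consumer` stands in for the event bus.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

/** Event carrying the desired pause state (the field name is an assumption). */
final class PauseAsrEvent {
    final boolean pauseAsr;
    PauseAsrEvent(boolean pauseAsr) { this.pauseAsr = pauseAsr; }
}

/** Stand-in for the speech recognition side: drops audio chunks while paused. */
final class SpeechRecModel implements Consumer<PauseAsrEvent> {
    private volatile boolean pauseAsrFlag = false;
    final List<byte[]> processedChunks = new ArrayList<>();
    int blockedChunks = 0;

    @Override
    public void accept(PauseAsrEvent event) {
        pauseAsrFlag = event.pauseAsr;
    }

    void ingestAudioChunk(byte[] chunk) {
        if (pauseAsrFlag) {
            blockedChunks++;          // TTS is playing: do not transcribe
            return;
        }
        processedChunks.add(chunk);
    }
}

/** Stand-in for the STT control receiver: maps broadcast actions to events. */
final class SttControlReceiverModel {
    static final String ACTION_PAUSE = "example.stt.PAUSE";    // hypothetical action name
    static final String ACTION_RESUME = "example.stt.RESUME";  // hypothetical action name

    private final Consumer<PauseAsrEvent> eventBus;

    SttControlReceiverModel(Consumer<PauseAsrEvent> eventBus) {
        this.eventBus = eventBus;
    }

    void onReceive(String action) {
        if (ACTION_PAUSE.equals(action)) {
            eventBus.accept(new PauseAsrEvent(true));
        } else if (ACTION_RESUME.equals(action)) {
            eventBus.accept(new PauseAsrEvent(false));
        }
        // Unknown actions are ignored.
    }
}
```

The microphone keeps running throughout; only the hand-off of audio chunks to recognition is gated, which is what sidesteps the hardware timing issues mentioned above.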

Files Changed

  • mobile/android/app/src/main/java/com/mentra/mentra/AudioManagerModule.java: Added STT pause/resume commands with reliability mechanisms
  • android_core/app/src/main/java/com/augmentos/augmentos_core/AugmentosService.java: Added STT control BroadcastReceiver with Android 14+ compatibility

Edge Cases Handled

The solution includes comprehensive safeguards for reliability:

Failure Scenarios

  • Failed audio requests: STT resumes immediately if playAudio() throws an exception
  • Network/URL errors: Timeout safety resumes STT after 30 seconds if audio fails to load
  • Rapid .speak() calls: State tracking prevents redundant commands; timer cancellation prevents race conditions
  • Service crashes: STT pause state is managed independently and will resume via timeout
  • Audio loading failures: If audio fails during playback, timeout mechanism ensures STT eventually resumes
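
A minimal sketch of the immediate-failure path, assuming a callback-style player. The real playAudio() signature is not shown in this PR, so `AudioPlayerModel` and `SpeakHandlerModel` are hypothetical, and STT commands are recorded in a list rather than broadcast.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical player interface; the real audio API may look different. */
interface AudioPlayerModel {
    /** Starts playback and throws if the request fails immediately. */
    void play(String url, Runnable onCompleted) throws Exception;
}

final class SpeakHandlerModel {
    /** Records "pause"/"resume" in the order they would be sent. */
    final List<String> sttCommands = new ArrayList<>();
    private final AudioPlayerModel player;

    SpeakHandlerModel(AudioPlayerModel player) {
        this.player = player;
    }

    void playAudio(String url) {
        sttCommands.add("pause");                  // pause STT before any sound is produced
        try {
            // Success-based resume: STT resumes from the completion callback.
            player.play(url, () -> sttCommands.add("resume"));
        } catch (Exception e) {
            sttCommands.add("resume");             // immediate failure: resume right away
        }
    }
}
```

If playback fails after it has started, neither branch fires in this sketch, which is the gap the 30-second timeout is meant to cover.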

Reliability Mechanisms

  • Error handling: Try/catch in playAudio() ensures STT resumes on immediate failures
  • Success-based resume: STT only resumes when audio completes successfully
  • State tracking: Prevents redundant pause/resume commands from conflicting
  • Safety timeout: 30-second auto-resume mechanism for any stuck states
  • Timer management: Proper cancellation prevents old timers from interfering with new audio
  • Android 14+ compatibility: Uses RECEIVER_NOT_EXPORTED for proper BroadcastReceiver registration
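
State tracking, the safety timeout, and timer cancellation can be combined roughly as follows. This is a sketch under assumptions, not the code in AudioManagerModule.java: the class and method names are illustrative, and the generation counter is one possible way to keep a stale timer from resuming STT during newer audio.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.ScheduledFuture;
import java.util.concurrent.TimeUnit;

/** Sketch of pause/resume bookkeeping; names are illustrative, not taken from the PR. */
final class SttPauseController {
    private final ScheduledExecutorService scheduler;
    private final long safetyTimeoutMs;
    private final List<String> sentCommands = new ArrayList<>();
    private boolean paused = false;
    private long generation = 0;               // identifies the most recent speak() call
    private ScheduledFuture<?> safetyTimer;

    SttPauseController(ScheduledExecutorService scheduler, long safetyTimeoutMs) {
        this.scheduler = scheduler;
        this.safetyTimeoutMs = safetyTimeoutMs;
    }

    /** Call when a speak() request starts playing. */
    synchronized void onSpeakStarted() {
        final long myGeneration = ++generation;
        if (safetyTimer != null) {
            safetyTimer.cancel(false);          // drop the timer of the previous call
        }
        if (!paused) {                          // state tracking: no redundant pause
            paused = true;
            sentCommands.add("pause");
        }
        safetyTimer = scheduler.schedule(
                () -> onSafetyTimeout(myGeneration), safetyTimeoutMs, TimeUnit.MILLISECONDS);
    }

    /** Call when playback finishes successfully. */
    synchronized void onAudioCompleted() {
        if (safetyTimer != null) {
            safetyTimer.cancel(false);
        }
        resumeIfPaused();
    }

    private synchronized void onSafetyTimeout(long timerGeneration) {
        // A timer that already started firing cannot be cancelled, so also check
        // that it belongs to the latest speak() call before resuming.
        if (timerGeneration == generation) {
            resumeIfPaused();
        }
    }

    /** Caller must hold the lock. */
    private void resumeIfPaused() {
        if (paused) {                           // state tracking: no redundant resume
            paused = false;
            sentCommands.add("resume");
        }
    }

    synchronized boolean isPaused() {
        return paused;
    }

    synchronized List<String> sentCommands() {
        return new ArrayList<>(sentCommands);
    }
}
```

Constructed with a single-threaded scheduler and a 30000 ms timeout, two back-to-back onSpeakStarted() calls record a single "pause". The sketch assumes a new speak() call replaces the audio that was playing; overlapping playbacks would need per-request tracking.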

Test Plan

  • Tested with phone speaker + microphone configuration on Android
  • Verified SDK .speak() calls no longer create infinite loops
  • Confirmed STT resumes correctly after TTS completes successfully
  • Validated STT resumes after failed audio requests
  • Tested that rapid consecutive .speak() calls don't cause issues
  • Verified timeout safety mechanism works for edge cases
  • Confirmed existing native TTS functionality remains unaffected

Testing App

This solution can be tested using the MentraOS Echo Test App:
https://github.com/alfonsomorab/MentraOS-Echo-App

The test app demonstrates the echo loop scenario and validates the fix across various edge cases.

Platform Support

  • Android: Fixed with this PR
  • ⚠️ iOS: Still needs implementation (similar approach should work)

Related Issues

Resolves #1036 - SDK .speak() infinite echo loop (Android only)

- Mobile AudioManagerModule sends STT pause/resume commands via broadcast
- Android Core AugmentosService receives commands and posts PauseAsrEvent
- Leverages existing STT pause infrastructure used by native TTS
- Prevents speech recognition from processing echoed TTS audio
- Mobile-only solution with zero network latency

Resolves GitHub issue Mentra-Community#1036
@alfonsomorab alfonsomorab requested a review from a team as a code owner September 6, 2025 16:50
- Add RECEIVER_NOT_EXPORTED flag on API level 33+ (Android 13 and later; an export flag is required for apps targeting Android 14)
- Maintains backward compatibility with older Android versions
- Fixes SecurityException during AugmentosService creation
@alfonsomorab alfonsomorab force-pushed the 1036-fix-echo-cancellation-android branch from 997c765 to cf52d9a Compare September 6, 2025 17:39
- Add error handling to prevent STT staying paused on failed audio requests
- Always resume STT regardless of audio success/failure
- Add STT state tracking to prevent redundant pause/resume commands
- Add 30-second timeout safety with proper timer cancellation
- Fix race condition where old timers could resume STT during new audio

Addresses edge cases: failed audio, rapid .speak() calls, network errors,
service crashes, and overlapping timeout scenarios.
- Resume STT only when audio completes successfully
- Failed audio during playback relies on 30s timeout safety net
- Failed audio at start still handled by playAudio() error catch
- More precise control with multi-layered safety mechanisms
This commit resolves the issue where SDK .speak() calls were creating infinite echo loops
on Android 15 due to broadcast receiver restrictions. The STT pause mechanism now works
correctly across all Android versions (11-15).

Key Changes:
- Changed RECEIVER_NOT_EXPORTED to RECEIVER_EXPORTED in AugmentosService so the receiver accepts broadcasts from the mobile app on Android 14+
- Added comprehensive debug logging throughout STT pause/resume pipeline
- Enhanced EventBus message handling with proper error checking
- Added debug logging to track pause flag changes and audio blocking
- Updated version to 2.2.7 for both Android and iOS

Technical Details:
- BroadcastReceiver now properly receives cross-app communication on Android 15
- STT audio processing is successfully blocked during TTS playback
- EventBus delivery confirmed working with detailed logging
- Audio ingestion properly filtered when pauseAsrFlag is true
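
Why the flag matters can be shown with a toy model of receiver visibility. This is a deliberate simplification, not the Android API: it ignores system broadcasts, permissions, and shared UIDs, and the package names are assumed from the file paths in this PR.

```java
/** Toy model of context-registered receiver visibility; not the Android API itself. */
final class ReceiverVisibilityModel {
    enum Visibility { EXPORTED, NOT_EXPORTED }

    /**
     * Returns true if a broadcast sent by senderPackage reaches a receiver
     * registered by receiverPackage with the given visibility.
     */
    static boolean isDelivered(String senderPackage, String receiverPackage, Visibility visibility) {
        // A non-exported receiver only hears broadcasts from its own app.
        return visibility == Visibility.EXPORTED || senderPackage.equals(receiverPackage);
    }
}
```

In this model a pause command sent by `com.mentra.mentra` never reaches a non-exported receiver in `com.augmentos.augmentos_core`, so STT stays active and the echo loop persists. The trade-off is that an exported receiver accepts the command from any app unless it is further protected, for example by a permission.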

Testing:
- Verified STT pause/resume mechanism working on Android 15 (API level 35)
- Confirmed audio blocking logs show hundreds of blocked audio chunks during TTS
- EventBus message delivery working correctly across app boundaries

Co-authored-by: Development Team
@nic-olo nic-olo requested review from fossephate and isaiahb October 18, 2025 08:33
@aisraelov aisraelov closed this Nov 3, 2025