@eroub eroub commented Nov 25, 2025

Problem

The Daily transport's _join() and _leave() methods can hang indefinitely when the Daily completion callbacks fail to fire. This occurs because both methods use await future without timeout protection, waiting for the completion_callback to set the future's result.

When this happens:

  • The leave() operation never completes
  • The on_left event never fires
  • Pipeline cleanup handlers are blocked

This issue was discovered in production where a call termination hung indefinitely at transport.leave(), preventing cleanup.
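The failure mode described above can be reproduced in isolation. The following is a minimal sketch, not the transport's actual code: FakeCallClient, leave_without_timeout, and is_stuck are hypothetical names, and the client only imitates a callback-based API whose completion callback may never be invoked.

```python
import asyncio


class FakeCallClient:
    """Hypothetical stand-in for a callback-based call client (not the Daily SDK)."""

    def __init__(self, fire_callback: bool):
        self._fire_callback = fire_callback

    def leave(self, completion):
        # A real client would normally invoke the callback later, from its own thread.
        if self._fire_callback:
            completion(None)


async def leave_without_timeout(client):
    loop = asyncio.get_running_loop()
    future = loop.create_future()

    def on_done(error):
        # Hand the result back to the event loop; safe to call from any thread.
        loop.call_soon_threadsafe(future.set_result, error)

    client.leave(completion=on_done)
    # Unbounded wait: if on_done is never called, this never returns.
    return await future


async def is_stuck(client, probe_secs: float = 0.05) -> bool:
    """Return True if the leave call is still pending after probe_secs."""
    task = asyncio.create_task(leave_without_timeout(client))
    _done, pending = await asyncio.wait({task}, timeout=probe_secs)
    for t in pending:
        t.cancel()  # clean up the stuck task so the loop can shut down
    return task in pending


def demo():
    healthy = asyncio.run(is_stuck(FakeCallClient(fire_callback=True)))
    silent = asyncio.run(is_stuck(FakeCallClient(fire_callback=False)))
    return healthy, silent
```

With a client that fires its callback the wait completes promptly; with a silent client the task stays pending for as long as anyone is willing to wait, which is the hang seen at transport.leave().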

Solution

Add timeout protection to both _join() and _leave() operations:

  1. Added module-level timeout constants (30 seconds each)

    • DAILY_JOIN_TIMEOUT_SECS = 30.0
    • DAILY_LEAVE_TIMEOUT_SECS = 30.0
  2. Wrapped async operations with asyncio.wait_for()

    • Both methods now time out after 30 seconds if the completion callback doesn't fire
    • Timeout errors are logged with room URL context for debugging
    • Error strings are returned following Daily transport's error conventions
  3. Symmetric error handling

    • _join() returns (None, error_msg) tuple on timeout (matching its error signature)
    • _leave() returns error_msg string on timeout (matching its error signature)
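The three steps above can be sketched as follows. This is an illustration under stated assumptions rather than the code in this PR: TransportSketch and SilentClient are hypothetical, the callback signatures are invented for the example, and only the two constant names, the use of asyncio.wait_for(), and the return shapes are taken from the description.

```python
import asyncio
import logging

logger = logging.getLogger("daily_timeout_sketch")

# Constant names and values as given in the description.
DAILY_JOIN_TIMEOUT_SECS = 30.0
DAILY_LEAVE_TIMEOUT_SECS = 30.0


class SilentClient:
    """Hypothetical client whose completion callbacks never fire."""

    def join(self, room_url, completion):
        pass

    def leave(self, completion):
        pass


class TransportSketch:
    """Hypothetical wrapper; the timeout arguments exist so the sketch can be tested quickly."""

    def __init__(self, client, room_url,
                 join_timeout=DAILY_JOIN_TIMEOUT_SECS,
                 leave_timeout=DAILY_LEAVE_TIMEOUT_SECS):
        self._client = client
        self._room_url = room_url
        self._join_timeout = join_timeout
        self._leave_timeout = leave_timeout

    @staticmethod
    def _resolver(loop, future):
        def set_if_pending(value):
            # After a timeout wait_for cancels the future, so ignore late callbacks.
            if not future.done():
                future.set_result(value)

        # The returned callback may be invoked from any thread.
        return lambda *args: loop.call_soon_threadsafe(set_if_pending, args)

    async def _join(self):
        loop = asyncio.get_running_loop()
        future = loop.create_future()
        self._client.join(self._room_url, completion=self._resolver(loop, future))
        try:
            data, error = await asyncio.wait_for(future, timeout=self._join_timeout)
            return (data, error)
        except asyncio.TimeoutError:
            error_msg = f"Timeout joining {self._room_url}"
            logger.error(error_msg)
            return (None, error_msg)  # tuple, matching the join error shape

    async def _leave(self):
        loop = asyncio.get_running_loop()
        future = loop.create_future()
        self._client.leave(completion=self._resolver(loop, future))
        try:
            (error,) = await asyncio.wait_for(future, timeout=self._leave_timeout)
            return error
        except asyncio.TimeoutError:
            error_msg = f"Timeout leaving {self._room_url}"
            logger.error(error_msg)
            return error_msg  # plain string, matching the leave error shape


async def run_demo():
    transport = TransportSketch(SilentClient(), "https://example.invalid/room",
                                join_timeout=0.05, leave_timeout=0.05)
    return await transport._join(), await transport._leave()
```

One design point worth noting in the sketch: asyncio.wait_for() cancels the awaited future on timeout, so a completion callback that arrives late must check future.done() before setting a result, otherwise it would raise InvalidStateError.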

- Add 30s timeout constants for join/leave ops
- Wrap join/leave calls with asyncio.wait_for
- Log timeout errors with room URL context