Skip to content

Conversation

@zgreathouse
Copy link
Member

@zgreathouse zgreathouse commented Apr 11, 2025

Summary of Changes Made

EmpathicVoice Client

Adds built-in support for resuming a Chats when reconnecting

  • Adds optional shouldAttemptReconnect callback argument in ReconnectingWebSocket for implementing custom logic for when to try to reconnect.
  • Add means for setting and deleting query params that are included in the url when the ReconnectingWebSocket connects for the purpose of adding the resumed_chat_group_id query param to the handshake url to support resume on reconnect.)
  • Update ChatSocket to extract chatGroupId from chat_metadata message and update query param overrides on ReconnectingWebSocket by default to support resume on reconnect.
  • Add optional shouldResumeChat option for ChatClient
  • Update Chat client to accept optional shouldResumeChat argument (escape hatch for default resume on reconnect behavior)
  • Add static method on Chat client for reconnect logic, only attempting a reconnect when WebSocket was closed with one of the following WebSocket close codes:
    • 1006 (Abnormal Closure)
    • 1011 (Internal Error)
    • 1012 (Service Restart)
    • 1013 (Try Again Later)
    • 1014 (Bad Gateway)

@zgreathouse zgreathouse requested a review from twitchard April 11, 2025 03:45
@zgreathouse zgreathouse marked this pull request as draft April 11, 2025 03:47
Copy link
Contributor

@twitchard twitchard left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. We should add a test or manually test both paths before release though

}
}

const shouldResumeChat = args.shouldResumeChat ?? true;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO defaulting to true is "breaking" enough that we should highlight it by releasing minor instead of patch wdyt?

Copy link
Member Author

@zgreathouse zgreathouse Apr 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@twitchard I agree! Although, I am now leaning toward defaulting to false.

It is likely true, in nearly all cases, that users would want to resume a Chat when reconnecting after an unexpected disconnect. However, this current implementation actually breaks when attempting to resume on reconnect if zero data retention is enabled.

There is currently no field in the API which we can leverage to determine whether or not zero data retention has been enabled on the account. Since we are always adding the resumed_chat_group_id query param to ReconnectingWebSocket._queryParamOverrides upon receiving the chat_metadata message, when a new chat unexpectedly disconnects and we try to reconnect it will fail with a E0720 error instead of reconnecting. When receiving this error we need to remove the query param from the ReconnectingWebSocket._queryParamOverrides so when it tries to reconnect it won't fail.

Apart from that fix, defaulting to false would make the changes here purely additive, while giving users the option to enable resuming on reconnect.

@volskaya
Copy link

I feel like this might suffer from a server restriction, where it doesn't let you resume the group ID if it didn't detect a close, because a socket closing client side doesn't guarantee that it closed server side too.

Like if there's a network drop or switch, or on mobile if the app running this code was suspended or some other factor broke it, leaving the server hanging waiting for response, until its own networking library times it out or the EVI's inactivity timeout gets triggered.

🙇In my opinion the server should allow the client to resume when ever it wants, the behavior should be that if it sees an attempt to establish a new websocket with an already used group id, the old connection should be disposed and it should proceed with the new one.

It should let a test case like this pass to fully support resumes:

it("Chat group ID can take over an older connection", async () => {
  const hume = new HumeClient({
    accessToken: "<>",
  });

  let resolveNewerSocket: (socket: ChatSocket) => void | undefined;
  let resolveSuccess: () => void | undefined;
  let rejectSuccess: () => void | undefined;

  const promiseNewerSocket = new Promise<ChatSocket>((res) => (resolveNewerSocket = res));
  const promiseSuccess = new Promise<void>((res, rej) => ((resolveSuccess = res), (rejectSuccess = rej)));

  const olderSocket = hume.empathicVoice.chat.connect({
    debug: false,
    reconnectAttempts: 30,
  });

  await olderSocket.tillSocketOpen();
  olderSocket.on("message", (message) => {
    if (message.type === "chat_metadata") {
      resolveNewerSocket(
        hume.empathicVoice.chat.connect({
          debug: false,
          reconnectAttempts: 30,
          resumedChatGroupId: message.chatGroupId,
        }),
      );
    }
  });

  const newerSocket = await promiseNewerSocket;

  newerSocket.on("close", () => rejectSuccess());
  newerSocket.on("message", (message) => {
    if (message.type === "chat_metadata") resolveSuccess();
  });

  await newerSocket.tillSocketOpen();
  await promiseSuccess;

  olderSocket.close();
  newerSocket.close();
}, 100000);

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants