Skip to content

VIDCS-3699: Implement server-side controls of captions #161

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 8 commits into from
May 12, 2025

Conversation

behei-vonage
Copy link
Contributor

What is this PR doing?

This PR implements server-side controls of captions.

How should this be manually tested?

This needs to be tested with both vonage and opentok as VIDEO_SERVICE_PROVIDERs in your .env file in the backend. For that, you'll need to have both Vonage applicationId and OpenTok API key.

To test this, you will also need to have Postman installed.
Once that is done, run the app by doing yarn dev.
In Postman, make a GET request to localhost:3345/session/<room_name_of_your_choice> (notice no trailing slash) - this is going to create a room for you - captions cannot be enabled unless a room has already been created.
After that, make a POST request to localhost:3345/session/<room_name_of_your_choice>/enableCaptions.
Notice that you are getting a return that looks something like this:

{
    "captions": {
        "captionsId": "3f9ed25c-a12c-4ff4-abe9-428f4371dae2"
    },
    "status": 200
}

Copy-paste the captionsId you got back and make a POST request to localhost:3345/session/<room_name_of_your_choice>/<captionsId_here>/disableCaptions.
If using the example above, <captionsId_here> would be 3f9ed25c-a12c-4ff4-abe9-428f4371dae2.

What are the relevant tickets?

A maintainer will add this ticket number.

Resolves VIDCS-3699

Checklist

[ ] Branch is based on develop (not main).
[ ] Resolves a Known Issue.
[ ] If yes, did you remove the item from the docs/KNOWN_ISSUES.md?
[ ] Resolves an item reported in Issues.
If yes, which issue? Issue Number?

@behei-vonage behei-vonage self-assigned this May 7, 2025
@@ -0,0 +1,4 @@
declare module 'opentok-jwt' {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: this was needed since I'm using the opentok-jwt package to generate a project token; however, the package is not TypeScript friendly so this was needed. I also cannot it export with a default; therefore, disabling lint since it needs to be imported like this:

import { projectToken } from 'opentok-jwt`

therefore not supporting the default export approach

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's some nice magic here 🪄 💪 !

@behei-vonage behei-vonage requested a review from Copilot May 7, 2025 20:56
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements server-side controls for captions in video sessions by adding enable/disable caption methods to both the Vonage and OpenTok video service implementations.

  • Added new methods (enableCaptions and disableCaptions) in VonageVideoService and OpenTokVideoService.
  • Updated the VideoService interface to include caption control methods.
  • Extended tests and routes to cover the new captions functionality.

Reviewed Changes

Copilot reviewed 7 out of 8 changed files in this pull request and generated no comments.

Show a summary per file
File Description
backend/videoService/vonageVideoService.ts Added enable/disable captions methods using the Vonage SDK.
backend/videoService/videoServiceInterface.ts Updated the interface to include caption control methods.
backend/videoService/opentokVideoService.ts Added caption control methods with HTTP requests via axios.
backend/types/opentok-jwt-d.ts Added new type declaration for projectToken from opentok-jwt.
backend/tests/session.test.ts Added tests to cover enabling and disabling captions endpoints.
backend/routes/session.ts Created routes for enabling and disabling captions in a session.
Files not reviewed (1)
  • backend/package.json: Language not supported
Comments suppressed due to low confidence (2)

backend/videoService/videoServiceInterface.ts:10

  • [nitpick] Consider refining the return type of enableCaptions to consistently match its implementations instead of a union with string and undefined. For instance, if both services always return an EnableCaptionResponse on success, update the return type accordingly.
enableCaptions(sessionId: string): Promise<EnableCaptionResponse | string | undefined>;

backend/routes/session.ts:116

  • [nitpick] The response object for captions uses the key 'captionId' for the disableCaptions endpoint, while the enableCaptions endpoint returns a 'captions' object. Consider using consistent naming for response fields across caption-related endpoints to improve clarity.
captionId: responseCaptionId,

@behei-vonage behei-vonage requested a review from Copilot May 9, 2025 20:20
Copy link

@Copilot Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR implements server-side controls for captions in video sessions by adding methods to enable and disable captions in both the Vonage and OpenTok video service implementations. Key changes include updating the video service interface, extending the Vonage and OpenTok video service implementations with new captions methods, and adding corresponding API routes and tests.

Reviewed Changes

Copilot reviewed 9 out of 10 changed files in this pull request and generated no comments.

Show a summary per file
File Description
backend/videoService/vonageVideoService.ts Added enable/disable captions methods using the Vonage SDK.
backend/videoService/videoServiceInterface.ts Updated the interface to include captions control methods.
backend/videoService/tests/vonageVideoService.test.ts Added tests for captions enabling/disabling scenarios.
backend/videoService/opentokVideoService.ts Implemented captions control via direct HTTP requests to OpenTok API.
backend/routes/session.ts Added new session routes to enable and disable captions.
backend/tests/session.test.ts Extended integration tests to cover captions endpoints.
Files not reviewed (1)
  • backend/package.json: Language not supported
Comments suppressed due to low confidence (1)

backend/routes/session.ts:95

  • [nitpick] For consistency, consider using a uniform key for captions responses. Currently, the enableCaptions endpoint returns the key 'captions' while the disableCaptions endpoint returns 'captionId'. Aligning these property names can help reduce confusion.
res.json({ captions, status: 200 });

@@ -0,0 +1,118 @@
import { describe, expect, it, jest } from '@jest/globals';
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: if you notice some differences between this test file and vonageVideoService.test.ts, no you don't 😆
just kidding - there are some slight difference since the opentok package uses callback and exports itself without a default but with module.exports = OpenTok.

also you'll notice that in this file I'm also mocking axios - that is because we have to use axios in opentokVideoService but not vonageVideoService since the opentok library does not support enableCaptions and disableCaptions methods so we have to do direct REST API requests. I've left some comments about it in the codebase, too.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, to be fair, for tests we prefer WET over DRY ;-)

v-kpheng
v-kpheng previously approved these changes May 9, 2025
Copy link
Collaborator

@v-kpheng v-kpheng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! 💪 🚀

Some nice mocking here :-)

status: 200,
});
} else {
res.status(404).json({ message: 'Room not found' });
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This error message might be misleading if the room exists but captions couldn't be enabled for some reason.

(Unless enableCaptions() can never fail....)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perhaps but I'd like to keep it as-is to be consistent with the other routes we have.
though, let me know if I'm wrong -
if this line throws:

const captions = await videoService.enableCaptions(sessionId);

then it will be caught by:

catch (error: unknown) {
      const errorMessage = error instanceof Error ? error.message : 'Unknown error occurred';
      res.status(500).json({ message: errorMessage });
    }

however, if it so happens that sessionId is undefined, instead of making the enableCaptions request, it will go straight to:

 res.status(404).json({ message: 'Room not found' });

which seems to be covering all of our scenarios, right?

@@ -0,0 +1,4 @@
declare module 'opentok-jwt' {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's some nice magic here 🪄 💪 !

@@ -0,0 +1,118 @@
import { describe, expect, it, jest } from '@jest/globals';
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, to be fair, for tests we prefer WET over DRY ;-)

Copy link
Contributor

@cpettet cpettet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good so far! Just have a few comments/questions. Let me know what you think!

const { token } = requestToken;

try {
const captionOptions: CaptionOptions = {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note: if you notice that languageCode is missing here compared to opentokVideoService, no you didn't
when I was adding the caption options, I found an issue with vonage-node-sdk and I filed a PR to fix it here: Vonage/vonage-node-sdk#993
the big issue is that it accepts languageCode as en-us and not en-US; therefore, making it invalid and I'm not able to successfully enable captions.
for the time being, I decided to skip adding the languageCode as en-US is the default option anyway. once the DevRel team merges in / deploys the new version that has the change I added, I'll add it back here for consistency with opentokVideoService

@behei-vonage behei-vonage requested a review from cpettet May 12, 2025 19:06
Copy link
Contributor

@cpettet cpettet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM Great job! :shipit:

Copy link
Collaborator

@v-kpheng v-kpheng left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! 💪 🚀

token,
const { token } = this.generateToken(sessionId);
const captionOptions = {
// The following language codes are supported: en-US, en-AU, en-GB, fr-FR, fr-CA, de-DE, hi-IN, it-IT, pt-BR, ja-JP, ko-KR, zh-CN, zh-TW
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: maybe link to where this is documented instead?

languageCode: 'en-US',
// The maximum duration of the captions in seconds. The default is 14,400 seconds (4 hours).
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same nit above applies here too.

maxDuration: 1800,
// Enabling partial captions allows for more frequent updates to the captions.
// This is useful for real-time applications where the captions need to be updated frequently.
// However, it may also increase the number of inaccuracies in the captions.
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent comment! 💪

@DeliaTok
Copy link
Collaborator

looks good!

@behei-vonage behei-vonage merged commit f11eec0 into develop May 12, 2025
7 checks passed
@behei-vonage behei-vonage deleted the behei-vonage/vidcs-3699-captions-backend branch May 12, 2025 21:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants