Is your feature request related to a problem? Please describe.
Currently, the Virtual Courtroom module supports native WebRTC-based video conferencing for remote hearings. However, it lacks real-time transcription and closed-captioning. For litigants from diverse linguistic backgrounds or those with hearing impairments, participating in virtual hearings without live, translated text (Hinglish/regional languages) creates a severe barrier to justice.
Describe the solution you'd like
I want to implement a live-captioning system that extracts audio from the WebRTC stream, processes it via the backend, and displays synchronized text on the UI.
Objective & Tasks:
- Extract audio streams from the existing WebRTC client in the React frontend.
- Transmit chunked audio data to the backend via WebSockets for low-latency processing.
- Leverage the existing Bhashini / AI NLP integrations in the orchestrator to generate real-time transcripts.
- Stream the translated text back to the frontend and overlay it as synchronized closed-captions on the video UI.
- Update the `nyaysetu-frontend` Virtual Court component to capture and chunk microphone audio streams via the `MediaRecorder` API.
- Build a scalable UI overlay in the React video conferencing view to display live captions with language toggle options.
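As a rough illustration of the tasks above, the backend half of the round trip could look like the sketch below. It is framework-agnostic Python, not the orchestrator's actual code: the `Caption` fields, the `caption_for_chunk` helper, and the `Transcriber` callable signature are all assumptions made for this example, and the real Bhashini call is replaced by whatever function is passed in. Synchronization is approximated by deriving each caption's time window from the chunk's sequence number and a fixed chunk duration.

```python
import json
from dataclasses import asdict, dataclass
from typing import Callable

# Hypothetical signature: (raw audio bytes, target language code) -> transcript text.
Transcriber = Callable[[bytes, str], str]


@dataclass
class Caption:
    seq: int       # sequence number the client attached to the audio chunk
    start_ms: int  # start of the chunk, relative to the start of recording
    end_ms: int    # end of the chunk
    lang: str      # language code the caption text is written in
    text: str      # transcript (or translation) for this chunk


def caption_for_chunk(seq: int, audio: bytes, lang: str, chunk_ms: int,
                      transcribe: Transcriber) -> str:
    """Turn one audio chunk into a JSON caption message for the frontend."""
    caption = Caption(
        seq=seq,
        start_ms=seq * chunk_ms,
        end_ms=(seq + 1) * chunk_ms,
        lang=lang,
        text=transcribe(audio, lang),
    )
    # ensure_ascii=False keeps Devanagari and other scripts readable on the wire.
    return json.dumps(asdict(caption), ensure_ascii=False)


if __name__ == "__main__":
    # Stand-in transcriber so the sketch runs without any external API.
    def fake_transcribe(audio: bytes, lang: str) -> str:
        return f"<{len(audio)} bytes transcribed to {lang}>"

    print(caption_for_chunk(3, b"\x00" * 8, "hi", 2000, fake_transcribe))
```

A WebSocket handler would call `caption_for_chunk` once per incoming binary frame and send the returned JSON back to the client. The frontend could then use `seq` to order captions and `start_ms` / `end_ms` to time the overlay against the video.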
Describe alternatives you've considered
I considered using third-party browser extensions or external APIs directly from the client side, but routing the audio through the existing FastAPI (nlp-orchestrator) / Spring Boot backend via WebSockets ensures better security, lower latency, and leverages the project's existing Bhashini/AI NLP integrations.
Additional context
Tech Stack Required:
- React & WebRTC (Frontend)
- WebSockets (STOMP or standard WS)
- FastAPI / Python (NLP Orchestrator)
- Bhashini API / AI Transcription Models
Notes:
- To minimize latency, audio chunking should be optimized (e.g., 1-3 second intervals).
- We can fall back to basic local transcription models during local development to avoid API rate limits.
- I would love to work on this issue for GSSoC'26. Please assign it to me with the `gssoc` and `advanced` / `level-3` labels!
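The chunking and fallback notes above could be expressed as two small configuration helpers. This is a minimal sketch under stated assumptions: the `CAPTIONS_TRANSCRIBER` environment variable, the function names, and the stub transcriber are hypothetical and do not exist in the project today. The stub performs no speech recognition; it only stands in for a real local model so that development does not hit API rate limits.

```python
from typing import Callable, Mapping

# Hypothetical signature: (raw audio bytes, target language code) -> transcript text.
Transcriber = Callable[[bytes, str], str]

MIN_CHUNK_MS = 1000  # lower bound of the 1-3 second guidance
MAX_CHUNK_MS = 3000  # upper bound of the 1-3 second guidance


def clamp_chunk_ms(requested_ms: int) -> int:
    """Keep a requested chunk interval inside the 1-3 second window."""
    return max(MIN_CHUNK_MS, min(MAX_CHUNK_MS, requested_ms))


def local_stub_transcriber(audio: bytes, lang: str) -> str:
    """Placeholder for a local model: reports chunk size instead of real text."""
    return f"[local:{lang}] {len(audio)} bytes"


def pick_transcriber(env: Mapping[str, str], remote: Transcriber) -> Transcriber:
    """Use the local stub when CAPTIONS_TRANSCRIBER=local, else the remote backend."""
    if env.get("CAPTIONS_TRANSCRIBER", "remote") == "local":
        return local_stub_transcriber
    return remote
```

In local development one might call `pick_transcriber(os.environ, remote_backend)` with `CAPTIONS_TRANSCRIBER=local` exported, where `remote_backend` is whatever function wraps the real Bhashini integration.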