Problem Statement ID: 1716
Problem Statement Title: Indian Sign Language to Text/Speech Translation
Team Name: qwerty
This project aims to develop a real-time translation system for Indian Sign Language (ISL) to facilitate communication between ISL users and non-signers. The solution translates ISL gestures into text and synthesized speech, thereby promoting inclusivity and accessibility for the deaf and hard-of-hearing community.
A demo video of the system is available on YouTube.
The system leverages deep learning and computer vision techniques to detect and interpret ISL gestures in real time:
- Hand Detection and Tracking: Utilizes YOLO and MediaPipe integrated with OpenCV to accurately detect and track hand movements.
- Gesture Classification: Implements a CNN-based classifier using MobileNetV2 (TensorFlow and Keras) to recognize ISL gestures.
- Mapping to Text: Translates recognized gestures into corresponding text output.
- Text-to-Speech Conversion: Converts text into speech using gTTS, rendering audio via Pygame.
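The mapping step above can be sketched in miniature. The following is an illustrative sketch, not the project's code: the label list, window size, and vote threshold are assumptions, and the per-frame class indices would come from the MobileNetV2 classifier. Smoothing over a sliding window keeps a single noisy frame from emitting spurious text.

```python
from collections import Counter, deque

# Hypothetical label set; the real project would map classifier
# output indices to its own ISL vocabulary.
LABELS = ["hello", "thank_you", "yes", "no"]

class GestureSmoother:
    """Majority vote over a sliding window of per-frame predictions,
    so a gesture is emitted as text only once it is stable."""

    def __init__(self, window: int = 15, min_votes: int = 10):
        self.window = deque(maxlen=window)
        self.min_votes = min_votes
        self.last_emitted = None

    def update(self, class_index: int):
        """Feed one per-frame prediction; return a label once stable."""
        self.window.append(class_index)
        label_idx, votes = Counter(self.window).most_common(1)[0]
        if votes >= self.min_votes and label_idx != self.last_emitted:
            self.last_emitted = label_idx
            return LABELS[label_idx]
        return None  # not yet confident, or already emitted

smoother = GestureSmoother()
# Twelve consecutive frames classified as class 0 ("hello"):
emitted = [smoother.update(0) for _ in range(12)]
# The label is emitted exactly once, at the 10th stable frame.
```

The same smoother can also gate the speech stage, so a phrase is spoken only when a new stable gesture appears.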
The technology stack comprises:
- YOLO (You Only Look Once): For fast and efficient object detection.
- MediaPipe: For hand tracking and landmark detection.
- OpenCV: For real-time computer vision tasks.
- TensorFlow & Keras: Deep learning frameworks used to train the gesture classifier with MobileNetV2.
- gTTS (Google Text-to-Speech): Converts text into speech.
- Pygame: Outputs audio for the synthesized speech.
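As a sketch of how the gTTS and Pygame pieces could fit together, the snippet below caches each synthesized phrase under a hash-derived filename so repeated phrases are not re-synthesized on every recognition. The cache directory and helper names are illustrative assumptions, not the project's actual layout; gTTS needs network access, so those imports are kept inside the playback function.

```python
import hashlib
import os

CACHE_DIR = "tts_cache"  # assumed cache location; adjust as needed

def mp3_path_for(text: str) -> str:
    """Deterministic cache filename for a phrase, so each phrase
    is synthesized at most once."""
    digest = hashlib.sha1(text.encode("utf-8")).hexdigest()[:12]
    return os.path.join(CACHE_DIR, f"{digest}.mp3")

def speak(text: str) -> None:
    """Synthesize `text` with gTTS (a network call) and play it via Pygame."""
    from gtts import gTTS  # imported lazily: requires network access
    import pygame

    path = mp3_path_for(text)
    if not os.path.exists(path):
        os.makedirs(CACHE_DIR, exist_ok=True)
        gTTS(text=text, lang="en").save(path)

    pygame.mixer.init()
    pygame.mixer.music.load(path)
    pygame.mixer.music.play()
    while pygame.mixer.music.get_busy():
        pygame.time.wait(100)  # block until playback finishes
```

Caching keeps the real-time loop responsive: only the first occurrence of a phrase pays the synthesis round trip.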
Feasibility and challenges:
- Feasibility: The use of YOLO and CNNs for gesture recognition is achievable, though accuracy depends on training with a diverse dataset.
- Challenges:
  - Accuracy: Gesture recognition accuracy varies with background changes.
  - Performance: Real-time processing may experience delays.
  - Integration: Complex coordination among components.
- Solutions:
  - Train with varied backgrounds for better accuracy.
  - Use hardware acceleration and optimization to reduce latency.
  - Employ a modular design for smoother integration and maintenance.
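The modular design suggested above could be outlined as a chain of swappable stages. The stage names and the dict-based frame payload below are assumptions for illustration; the point is that detection, classification, mapping, and speech can each be replaced or profiled without touching the others.

```python
from typing import Any, Callable, List

Stage = Callable[[Any], Any]

class Pipeline:
    """Chain of independent stages (detect -> classify -> map -> speak);
    each stage can be swapped or benchmarked in isolation."""

    def __init__(self, stages: List[Stage]):
        self.stages = stages

    def __call__(self, frame: Any) -> Any:
        for stage in self.stages:
            frame = stage(frame)
            if frame is None:  # a stage may drop a frame (e.g. no hand found)
                return None
        return frame

# Stub stages standing in for the real detector/classifier/mapper:
detect   = lambda frame: frame if frame.get("hand") else None
classify = lambda frame: {**frame, "class_index": 0}
to_text  = lambda frame: "hello" if frame["class_index"] == 0 else "?"

pipeline = Pipeline([detect, classify, to_text])
result = pipeline({"hand": True})    # a frame with a detected hand
dropped = pipeline({"hand": False})  # dropped at the detection stage
```

Because each stage is just a callable, hardware-accelerated or optimized variants can be substituted per stage when tackling the latency concerns noted above.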
This project holds the potential to significantly enhance communication for the deaf and hard-of-hearing community by translating ISL gestures into text and speech in real time.
- Inclusivity: Enables more inclusive interactions and social participation.
- Access to Opportunities: Broadens job prospects and educational access for ISL users.
- Cost-Effective: Reduces dependency on physical translation services.
Future enhancements include:
- Multi-Language Support: Extend translation to additional spoken languages.
- Improved Accuracy: Implement advanced models and larger datasets to increase recognition accuracy.
- Mobile App Version: Develop a mobile application to enhance accessibility and portability.
Thank you!