Visual Aid for Visually Impaired is a mobile application that assists visually impaired individuals by converting images or videos into descriptive text. The app uses Visual Language Models (VLM) to analyze the user's surroundings and describe them in detail. It also keeps a record of activity that caregivers can review if necessary.
Vision Crafters
- Scene-to-Text Conversion: The app captures pictures or videos and processes them with a VLM to describe the surroundings.
- User Assistance: After processing, the app answers user queries related to their environment.
- Hazard Detection: The app raises alarms in case of hazardous surroundings.
- Conversation Log: The app maintains a log of interactions with the user for future reference.
- Capture Image/Video: The user opens the app and uses the mobile device's camera to capture a picture or video of their surroundings.
- Process Input: The captured media is processed by a VLM to generate descriptive text for the scene.
- Describe Surroundings: The app gives an auditory description of the surroundings, based on the scene-to-text output, detailing colors, objects, and other relevant features.
- Query Assistance: The user can ask questions about their environment, and the app responds with relevant information based on the processed data.
- Hazard Detection: If the app detects hazardous elements in the surroundings, it immediately raises an alarm to alert the user.
- Log Conversations: All interactions and descriptions are logged for future reference and can be reviewed by caregivers if needed.
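The flow described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `describe` and `answer` callables stand in for whatever VLM the app actually uses, and `HAZARD_TERMS`, `find_hazards`, and `Session` are hypothetical names. The keyword match is a deliberately simple placeholder, not the app's real hazard detector.

```python
import re
from dataclasses import dataclass, field
from typing import Callable, List, Tuple

# Hypothetical hazard vocabulary, for illustration only.
HAZARD_TERMS = ("fire", "smoke", "traffic", "stairs", "knife")


def find_hazards(description: str) -> List[str]:
    """Return the hazard terms that appear as whole words in the description."""
    words = set(re.findall(r"[a-z]+", description.lower()))
    return [term for term in HAZARD_TERMS if term in words]


@dataclass
class Session:
    # Stand-ins for the VLM calls; both signatures are assumptions.
    describe: Callable[[bytes], str]
    answer: Callable[[str, str], str]
    last_description: str = ""
    log: List[Tuple[str, str]] = field(default_factory=list)

    def process(self, media: bytes) -> dict:
        """Describe the captured media, check for hazards, and log the result."""
        description = self.describe(media)
        hazards = find_hazards(description)
        self.last_description = description
        self.log.append(("app", description))
        return {"description": description, "hazards": hazards, "alarm": bool(hazards)}

    def ask(self, question: str) -> str:
        """Answer a user question using the most recent scene description."""
        reply = self.answer(self.last_description, question)
        self.log.append(("user", question))
        self.log.append(("app", reply))
        return reply


# Fake model functions so the sketch runs without a real VLM.
def fake_describe(media: bytes) -> str:
    return "A kitchen counter with smoke rising from a pan."


def fake_answer(description: str, question: str) -> str:
    return f"Based on the scene: {description}"


session = Session(describe=fake_describe, answer=fake_answer)
result = session.process(b"raw image bytes")
print(result["alarm"], result["hazards"])  # prints: True ['smoke']
```

Passing the model calls in as plain functions keeps the capture, hazard, and logging logic testable without network access; a real build would replace the fake functions with calls to the chosen VLM.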
- Describing Colors/Objects: The app can describe the colors and objects in a scene to a person who is partially blind, helping them to better understand their surroundings.
- Raising Alarms: In case of hazardous surroundings, the app will raise an alarm to ensure the user is aware of potential dangers.
- Conversational Bot: The app can act as a conversational bot, describing the surroundings and answering questions to assist a visually impaired person in navigating their environment safely.
- A mobile device with camera capabilities.
- Pre-trained Visual Language Models (VLM) for scene-to-text conversion.
- Internet connectivity for processing and updates.
- Secure storage for maintaining conversation logs.
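As one possible reading of the conversation-log requirement, the sketch below appends each interaction to a JSON Lines file that is created with owner-only permissions. The function names, the `role` labels, and the file layout are assumptions made for illustration. Restrictive file permissions are not encryption, and the `0o600` mode is largely ignored on Windows, so a production app would more likely rely on the mobile platform's own secure storage.

```python
import json
import os
from datetime import datetime, timezone


def append_log_entry(path: str, role: str, text: str) -> dict:
    """Append one interaction to a JSON Lines log file and return the entry."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "role": role,  # hypothetical labels, e.g. "user" or "app"
        "text": text,
    }
    # Create the file readable and writable by the owner only (0o600 on POSIX).
    fd = os.open(path, os.O_WRONLY | os.O_APPEND | os.O_CREAT, 0o600)
    with os.fdopen(fd, "a", encoding="utf-8") as handle:
        handle.write(json.dumps(entry, ensure_ascii=False) + "\n")
    return entry


def read_log(path: str) -> list:
    """Load every logged interaction, oldest first, for caregiver review."""
    with open(path, encoding="utf-8") as handle:
        return [json.loads(line) for line in handle if line.strip()]
```

One line per interaction keeps appends cheap and lets a caregiver-facing view read the log incrementally.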