An image captioning and OCR app with audio feedback and voice control. With image captioning, this application can help visually impaired people or assist with learning a new language.
Screenshots from a video of the app running; GIFs are shown here, but the audio feedback can only be heard in the original video.
How to launch the application
This project contains two parts: the application (Android) and a server (Flask).
The server is written in Python in a Colab notebook (more convenient for the TensorFlow environment).
To start the server:
1. Go to this address and download the pre-trained model checkpoint, then put the three downloaded files (checkpoint, ckptxxx.data, ckptxxx.index) in a folder named checkpoint.
2. In CaptionModel.py, set the checkpoint path parameter to the path of the checkpoint folder you just created.
3. Execute the serverEndpoint.ipynb notebook.
For the moment the server works with ngrok, so a web address will be generated; this address is what you will use to configure the Android application. A rough sketch of what the notebook does is shown below.
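As a minimal sketch of the ngrok-backed Flask server described above (the flask-ngrok helper, the CaptionModel import, and the form field name are assumptions; the actual code lives in serverEndpoint.ipynb):

```python
from flask import Flask, request
from flask_ngrok import run_with_ngrok   # pip install flask-ngrok

# Hypothetical import: the real project loads the model from CaptionModel.py,
# whose checkpoint path parameter must point to the "checkpoint" folder created above.
from CaptionModel import CaptionModel

app = Flask(__name__)
run_with_ngrok(app)        # prints a public ngrok URL when the app starts

model = CaptionModel()     # assumed to restore weights from the checkpoint folder

@app.route("/hello", methods=["POST"])
def hello():
    # The image arrives in the multipart form of the request;
    # the field name "file" is an assumption.
    image_bytes = request.files["file"].read()
    return model.caption(image_bytes)

app.run()
```

The URL printed at startup is the address to enter in the Android application's options.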
The server can also be used without the application by calling the /hello endpoint of the Flask app directly and passing the image in the request form, as the application currently does; see the example below.
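For instance, a minimal sketch of such a direct call with the Python requests library (the ngrok URL and the form field name "file" are assumptions):

```python
import requests

# Replace with the ngrok URL printed when serverEndpoint.ipynb starts.
SERVER_URL = "https://xxxx.ngrok.io"

# Send an image in the request form to the /hello endpoint;
# the form field name "file" is an assumption.
with open("photo.jpg", "rb") as f:
    response = requests.post(f"{SERVER_URL}/hello", files={"file": f})

print(response.text)  # the generated caption or OCR text
```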
To launch the application, open the project in Android Studio and run it on a device. In the app, go to Options, enter the web address of the server in the "Deep Vision API URL" field, and click Save options. Go back to the main page, click Camera, and voilà: you have two buttons, Describe for image captioning and Read for OCR.
Options also contains additional settings, such as "Glasses camera address", where you can enter the IP address of an external camera to use instead of the smartphone camera. You can also activate the voice command, which currently supports French and English, so instead of clicking Describe or Read you can simply say "describe" or "read".