!! This won't run on insecure origins (plain http), because browsers only allow camera access via getUserMedia on secure origins (or localhost). To test the app, run: `python server.py`
This is a web app built around WaveNet, a text-to-speech model: it generates an audio signal from text extracted from an image or video frame!
- Run the web app with `python run.py`
- Take a snap (this gets fed into the network)
- The Flask backend does the heavy lifting: it converts the pixels into text and the text into an audio signal
- The audio signal is sent back to the front end (see the sketch after this list)
- Enjoy!
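Roughly, the Flask side could look like the sketch below. This is a minimal illustration, not the repo's actual code: the `/synthesize` route, the `image` form field, and the two helper functions are hypothetical names, and the OCR and WaveNet steps are left as stubs.

```python
# Minimal sketch of the request/response flow. Assumes the snapshot is
# POSTed as a base64 data URL under a form field named "image"; route and
# helper names are illustrative placeholders, not this repo's actual ones.
import base64
import io

from flask import Flask, request, jsonify
from PIL import Image

app = Flask(__name__)

def image_to_text(image):
    # Placeholder for the OCR step (see the OCR snippet further down).
    return ""

def text_to_audio(text):
    # Placeholder for the WaveNet step; would return base64-encoded audio.
    return ""

@app.route("/synthesize", methods=["POST"])
def synthesize():
    # The canvas snapshot arrives as "data:image/png;base64,<payload>".
    data_url = request.form["image"]
    payload = data_url.split(",", 1)[1]
    image = Image.open(io.BytesIO(base64.b64decode(payload)))

    text = image_to_text(image)    # pixels -> text
    audio = text_to_audio(text)    # text -> audio signal

    # The front end receives both, shows the text, and plays the audio.
    return jsonify({"text": text, "audio": audio})

if __name__ == "__main__":
    app.run(port=8000)
```

The front end would POST the canvas data URL via Ajax and play whatever audio comes back through WebAudio.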
#### Done
- OCR (Optical Character Recognition) set up to translate an image into text (see the snippet after this list)
- Webcam capture enabled via getUserMedia to take input from your camera
- Flask server set up -> Ajax calls can now pass data between the front end and back end
- Front end done -> you can take a snap and preview the image that gets fed into the network
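The README doesn't name the OCR library; as one concrete possibility, the `image_to_text` stub above could be filled in with pytesseract (a thin wrapper around the Tesseract binary):

```python
# Assumes the Tesseract binary plus the pytesseract and Pillow packages are
# installed; this is one option for the OCR step, not necessarily the one
# this repo uses.
from PIL import Image
import pytesseract

def image_to_text(image: Image.Image) -> str:
    # Run Tesseract on the PIL image and tidy up surrounding whitespace.
    return pytesseract.image_to_string(image).strip()

if __name__ == "__main__":
    # "snapshot.png" is a hypothetical file name for a saved webcam frame.
    print(image_to_text(Image.open("snapshot.png")))
```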
#### TODOs
- Save images to feed into the model? Not sure how this should work just yet.
- Set up WaveNet for generating audio signals
- Feed the text into WaveNet and see what it generates - it should sound like a human voice! (see the sketch after this list)
- Fix the ugly CSS
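Whatever WaveNet implementation ends up producing the samples, they still have to be packaged into something the browser can play. A hedged sketch, assuming the model yields mono float samples in [-1, 1] at a known sample rate; it uses only the standard library plus NumPy, and `samples_to_wav_base64` is just an illustrative name:

```python
# Converts float audio samples (assumed in [-1, 1]) into base64-encoded WAV
# bytes that the front end can decode and play with WebAudio. The sine tone
# below is a stand-in for real model output.
import base64
import io
import wave

import numpy as np

def samples_to_wav_base64(samples, sample_rate=16000):
    # Scale to 16-bit PCM and write a mono WAV file into an in-memory buffer.
    pcm = (np.clip(samples, -1.0, 1.0) * 32767).astype(np.int16)
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)        # mono
        wav.setsampwidth(2)        # 16-bit samples
        wav.setframerate(sample_rate)
        wav.writeframes(pcm.tobytes())
    return base64.b64encode(buf.getvalue()).decode("ascii")

if __name__ == "__main__":
    # One second of a 440 Hz tone, just to exercise the packaging code.
    t = np.linspace(0, 1, 16000, endpoint=False)
    print(len(samples_to_wav_base64(0.5 * np.sin(2 * np.pi * 440 * t))))
```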