Say what you see!

!! This won't run on insecure origins (plain http), because browsers restrict the media APIs the app relies on (such as getUserMedia for the webcam) to secure contexts, i.e. HTTPS or localhost. To test the app, run: python server.py
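As a rough illustration of which origins browsers tend to treat as secure, here is a minimal sketch in Python. The helper name `is_secure_origin` is hypothetical and not part of this repo, and the rule is a simplification of what browsers actually implement.

```python
from ipaddress import ip_address
from urllib.parse import urlsplit


def is_secure_origin(url: str) -> bool:
    """Simplified approximation of the browser "secure context" rule.

    Hypothetical helper for illustration only; real browsers apply more cases.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        return True
    host = parts.hostname or ""
    # localhost (and its subdomains) is treated as trustworthy even over http.
    if host == "localhost" or host.endswith(".localhost"):
        return True
    try:
        # Loopback IP literals (127.0.0.0/8 and ::1) are also accepted.
        return ip_address(host).is_loopback
    except ValueError:
        return False  # not an IP literal and not localhost


if __name__ == "__main__":
    for origin in ("http://localhost:5000", "http://192.168.1.5:5000", "https://example.com"):
        print(origin, is_secure_origin(origin))
```

So opening the app on localhost should work for local testing, while opening it over plain http from another machine on the network will most likely block the webcam.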

This is a web app that uses WaveNet, a text-to-speech model, to generate audio from the text found in an image or video frame!

  1. Run the web app with python run.py
  2. Take a snap (this gets fed into the network)
  3. The Flask backend does the work: it converts the pixels into text, then turns the text into audio signals
  4. The backend sends the audio signals back to the front-end
  5. Enjoy!
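The steps above can be sketched roughly as follows. This is not the repo's actual code: it assumes the front-end sends the snap as a base64 data URL (what `canvas.toDataURL()` produces), and `decode_snapshot`, `say_what_you_see`, `ocr` and `tts` are hypothetical names. The OCR and text-to-speech stages are passed in as plain callables so the flow can be tried before either one is wired up.

```python
import base64
from typing import Callable


def decode_snapshot(data_url: str) -> bytes:
    """Turn a 'data:image/png;base64,...' string into raw image bytes."""
    header, sep, payload = data_url.partition(",")
    if not sep or not header.startswith("data:image/") or not header.endswith(";base64"):
        raise ValueError("expected a base64-encoded image data URL")
    return base64.b64decode(payload)


def say_what_you_see(
    data_url: str,
    ocr: Callable[[bytes], str],   # hypothetical: image bytes -> recognised text
    tts: Callable[[str], bytes],   # hypothetical: text -> audio bytes
) -> bytes:
    """Steps 2-4: snapshot -> text -> audio bytes to send back to the front-end."""
    image_bytes = decode_snapshot(data_url)
    text = ocr(image_bytes).strip()
    if not text:
        raise ValueError("OCR found no text in the snapshot")
    return tts(text)


if __name__ == "__main__":
    # Stand-in stages, just to exercise the flow end to end.
    fake_snap = "data:image/png;base64," + base64.b64encode(b"fake-png").decode()
    audio = say_what_you_see(fake_snap, ocr=lambda img: " hello ", tts=lambda t: t.encode())
    print(audio)
```

In the real app, a Flask route would call something like `say_what_you_see` with the posted snap and return the audio bytes in the response.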

DONE

  • OCR (Optical Character Recognition) set up to extract text from an image
  • Webcam capture enabled through getUserMedia
  • Flask server set up -> the front-end and back-end can now talk to each other over Ajax
  • Front-end done -> you can take a snap now and see which image will get fed into the network

TODOs

  • Save images to feed into the model? Not sure what's up here just yet.
  • Set up WaveNet for generating audio signals
  • Feed the text input into WaveNet and see what it generates (should be a human-like sound!)
  • Fix the ugly CSS
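Until the model is set up, the audio leg can be tried with a stand-in. The sketch below assumes the model will eventually produce float samples in [-1.0, 1.0] and packs them into WAV bytes that a browser can play; `samples_to_wav` and `placeholder_tone` are hypothetical helpers, and the sine tone only stands in for real generated speech.

```python
import io
import math
import struct
import wave


def samples_to_wav(samples, sample_rate=16000):
    """Pack float samples in [-1.0, 1.0] into 16-bit mono WAV bytes."""
    # Clip each sample, scale it to the int16 range, and pack it little-endian.
    pcm = b"".join(
        struct.pack("<h", int(max(-1.0, min(1.0, s)) * 32767)) for s in samples
    )
    buf = io.BytesIO()
    with wave.open(buf, "wb") as wav:
        wav.setnchannels(1)         # mono
        wav.setsampwidth(2)         # 2 bytes per sample = 16-bit
        wav.setframerate(sample_rate)
        wav.writeframes(pcm)
    return buf.getvalue()


def placeholder_tone(seconds=0.5, freq=440.0, sample_rate=16000):
    """Stand-in for model output: a plain sine tone, not speech."""
    n = int(seconds * sample_rate)
    return [0.5 * math.sin(2 * math.pi * freq * i / sample_rate) for i in range(n)]


if __name__ == "__main__":
    wav_bytes = samples_to_wav(placeholder_tone())
    print(len(wav_bytes), "bytes of WAV audio")
```

Swapping `placeholder_tone` for the real model output later should leave the rest of the audio path unchanged.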

Ignore this line: python -mSimpleHTTPServer 8000 .