A Deep Convolutional Neural Network (DCNN) designed for the task of localizing human speech to 168 location classes using binaural microphone inputs.

ghunkins/Binaural-Source-Localization-CNN

Latest commit: Gregory David Hunkins, Dec 16, 2017 (4368280)


Binaural-Source-Localization-CNN

Basic Information


Author: Gregory Hunkins

Organization: University of Rochester

License: MIT

Abstract: A Convolutional Neural Network (CNN) classification system was designed for the task of source localization of human voices in 3-D space. A new dataset, VoiceBin100K, is introduced to accomplish this task and to support future work in the field. The CNN takes variable-length binaural short-time Fourier transform (STFT) magnitude and phase features as input and predicts the location of the speaker's voice as one of 168 location classes.
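The input features described in the abstract can be illustrated with a minimal sketch. This is not the repository's code: the function names `stft` and `binaural_features`, the window length `n_fft=512`, the hop size `hop=256`, the Hann window, and the channel order (left magnitude, left phase, right magnitude, right phase) are all assumptions chosen for illustration. It assumes the left and right recordings have the same length and uses only NumPy.

```python
import numpy as np

def stft(signal, n_fft=512, hop=256):
    """Short-time Fourier transform of a 1-D signal with a Hann window.

    n_fft and hop are illustrative values, not taken from the repository.
    """
    window = np.hanning(n_fft)
    # Zero-pad short signals so that at least one full frame exists.
    if len(signal) < n_fft:
        signal = np.pad(signal, (0, n_fft - len(signal)))
    n_frames = 1 + (len(signal) - n_fft) // hop
    frames = np.stack(
        [signal[i * hop : i * hop + n_fft] * window for i in range(n_frames)]
    )
    # rfft keeps the n_fft // 2 + 1 non-negative frequency bins.
    return np.fft.rfft(frames, axis=1).T  # shape: (freq_bins, n_frames)

def binaural_features(left, right, n_fft=512, hop=256):
    """Stack STFT magnitude and phase of both ears.

    Returns an array of shape (4, freq_bins, n_frames) in the assumed order
    [left magnitude, left phase, right magnitude, right phase].
    """
    channels = []
    for ear in (left, right):
        spec = stft(np.asarray(ear, dtype=np.float64), n_fft, hop)
        channels.append(np.abs(spec))    # magnitude
        channels.append(np.angle(spec))  # phase in radians, within [-pi, pi]
    return np.stack(channels)

# Two synthetic clips of different duration give different time dimensions.
rng = np.random.default_rng(0)
long_clip = binaural_features(rng.standard_normal(16000), rng.standard_normal(16000))
short_clip = binaural_features(rng.standard_normal(8000), rng.standard_normal(8000))
print(long_clip.shape, short_clip.shape)
```

The frequency dimension is fixed by `n_fft`, while the number of frames grows with clip duration, which is why a model consuming these features has to accept variable-length input.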

Running The Code


Reference: https://cs.rochester.edu/~cxu22/t/577F17/bluehive_tutorial.html

Data


Please contact ghunkins@u.rochester.edu for access to the data. A public link will be available shortly.
