# Detecting Distribution Shift in Speech Audio

This repo contains tools for covariate shift detection on speech audio, along with an example application on the [VOiCES dataset](https://iqtlabs.github.io/voices/). Our approach uses pretrained speech featurizers and aggregation methods to embed waveforms into fixed-length vectors.

![Embedding pipeline](pipeline.png)

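As a rough illustration of this featurize-then-aggregate pattern, the sketch below uses log-mel features from `librosa` as a stand-in for a pretrained speech featurizer and mean pooling as the aggregator; the actual classes live in `embeddors.py` and `aggregators.py` and differ in detail.

```
import librosa

def embed_waveform(path, sr=16000, n_mels=64):
    """Toy embedding pipeline: featurize a waveform into a sequence of
    frame-level feature vectors, then aggregate them into one
    fixed-length vector (here, log-mel features + mean pooling)."""
    waveform, _ = librosa.load(path, sr=sr)
    mel = librosa.feature.melspectrogram(y=waveform, sr=sr, n_mels=n_mels)  # (n_mels, n_frames)
    log_mel = librosa.power_to_db(mel)
    return log_mel.mean(axis=1)  # aggregate over time -> shape (n_mels,)

# embedding = embed_waveform("some_clip.wav")
```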
We then use the approach of [Failing Loudly](https://arxiv.org/abs/1810.11953) to detect distribution shift: untrained autoencoders first reduce the dimensionality of the embeddings, followed by non-parametric two-sample hypothesis testing on samples of source and target data.

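A minimal sketch of that detection step, assuming a recent `alibi-detect` release and synthetic embeddings in place of real source/target data (argument names may differ in older versions):

```
import numpy as np
import tensorflow as tf
from alibi_detect.cd import MMDDrift

# Stand-ins for (n_samples, embedding_dim) matrices of source/target embeddings.
rng = np.random.default_rng(0)
x_source = rng.normal(size=(200, 512)).astype(np.float32)
x_target = rng.normal(loc=0.5, size=(200, 512)).astype(np.float32)

# Untrained (randomly initialized) encoder for dimensionality reduction,
# following the Failing Loudly recipe.
encoder = tf.keras.Sequential([
    tf.keras.layers.Dense(128, activation="relu", input_shape=(512,)),
    tf.keras.layers.Dense(32),
])
z_source = encoder(x_source).numpy()
z_target = encoder(x_target).numpy()

# Kernel MMD two-sample test between reference (source) and target samples.
detector = MMDDrift(z_source, backend="tensorflow", p_val=0.05)
result = detector.predict(z_target)
print(result["data"]["is_drift"], result["data"]["p_val"])
```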
## Contents

* `embeddors.py`: Classes wrapping models that convert waveforms to sequences of feature vectors.
* `aggregators.py`: Classes for aggregating sequences of feature vectors into single fixed-length embedding vectors.
* `data_utils.py`: Tools for combining featurizers and aggregators into the embedding pipeline and applying it to lists of waveforms or .wav files.
* `detection_utils.py`: Functional wrappers around [the alibi_detect implementation](https://docs.seldon.io/projects/alibi-detect/en/stable/methods/mmddrift.html) of [MMD two-sample testing](https://en.wikipedia.org/wiki/Kernel_embedding_of_distributions#Kernel_two-sample_test) for single and repeated tests (a rough sketch of a repeated test appears after this list).
* `hypothesis_test.ipynb`: A notebook that walks through loading the VOiCES dataset, preprocessing the waveforms into embeddings, and evaluating the performance of distribution shift detection.

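For context on the repeated tests, the sketch below shows one plausible way to estimate a detection rate by re-running the MMD test on random subsamples of the reduced embeddings (`z_source`, `z_target` as in the earlier sketch); the actual helpers in `detection_utils.py` may differ.

```
import numpy as np
from alibi_detect.cd import MMDDrift

def detection_rate(z_source, z_target, sample_size=100, n_trials=20, p_val=0.05, seed=0):
    """Fraction of trials in which the MMD test flags drift between
    random subsamples of the source and target embeddings."""
    rng = np.random.default_rng(seed)
    hits = 0
    for _ in range(n_trials):
        ref = z_source[rng.choice(len(z_source), size=sample_size, replace=False)]
        test = z_target[rng.choice(len(z_target), size=sample_size, replace=False)]
        detector = MMDDrift(ref, backend="tensorflow", p_val=p_val)
        hits += detector.predict(test)["data"]["is_drift"]
    return hits / n_trials
```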
## Dependencies and prerequisites

To follow the experiments in `hypothesis_test.ipynb` you will need to download the [VOiCES dataset](https://iqtlabs.github.io/voices/). Instructions for that can be found [here](https://iqtlabs.github.io/voices/downloads/). We recommend using the devkit subset for expediency.

The following packages are required:

```
alibi-detect
torch
fairseq
tensorflow
tensorflow_hub
librosa
pandas
seaborn
wget
```
All of them can be installed with pip:
```
pip install -r requirements.txt
```