This is the repository for the DTU 02456 Deep Learning project. The aim is to separate speakers in mixed speech signals using the Demucs model, which was originally intended for stem separation of music. The project is supervised by Prof. Bjørn Sand Jensen (Department of Applied Mathematics and Computer Science).
Links:
- See the synopsis for the outline of the project
- See the poster for a summary of the results
- See the paper for the full results
- See the developer's guidelines if you want to contribute
- Ensure Python 3.9 is installed
- Create a virtual environment with a Python 3.9 interpreter
- Install requirements from `requirements.txt`
- Generate LibriMix dataset (for training)
- Add the DTU HPC transfer server `transfer.gbar.dtu.dk` as SSH host named `dtu-hpc-transfer` (to use copy scripts)
  - See this guide for more information on SSH access
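
The SSH host alias from the last setup step could be defined with an entry like the one below in `~/.ssh/config`. This is a minimal sketch: the host alias and server name are taken from the step above, while the user name is a placeholder to replace with your own DTU login, and any key or authentication options your account needs are not shown.

```
# ~/.ssh/config
Host dtu-hpc-transfer
    HostName transfer.gbar.dtu.dk
    # Placeholder: replace with your own DTU user name
    User your-dtu-username
```

With such an entry in place, `ssh dtu-hpc-transfer` should connect to the transfer server under the alias.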
- Training: run `bin/librimix/train.sh` in the project directory to train locally, or submit an LSF job on the DTU HPC cluster using `bsub < bin/librimix/train_job.sh`. The trained model will be stored in `data/models/librimix/version_1`. Adjust the LibriMix folder if needed.
  - To copy your code to the HPC cluster, use `bin/copy_code_to_hpc.sh`
  - To copy trained models from the HPC cluster, use `bin/copy_model_from_hpc.sh librimix X` where `X` is the version
  - It may be necessary to restart the training from a checkpoint if the 24 h walltime limit for jobs is exceeded. In that case, add `--checkpoint-path data/models/....ckpt` to the training script.
- Prediction: run `bin/librimix/predict.sh` in the project directory to predict a single example (set by `--item`) in the LibriMix dataset. Set the checkpoint/trained model to use in the script.
  - To easily find the checkpoint path, use `find data/models/librimix/ -name '*.ckpt'`
- Training plot generation: run `bin/evaluate_training.sh A B C` where `A B C` are the versions you want to compare
  - If a training consists of more than one version (i.e. it has been restarted from a checkpoint), use `A+B` where `A` and `B` belong to the same training
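
The checkpoint lookup mentioned under prediction and restarting can also be done from Python, which may be handy when picking the checkpoint to resume from. The sketch below is not part of the repository: `find_checkpoints` and `latest_checkpoint` are hypothetical helper names, and the only assumption about the layout is that checkpoints are `*.ckpt` files somewhere below `data/models/librimix`.

```python
from pathlib import Path
from typing import List, Optional


def find_checkpoints(models_dir: str = "data/models/librimix") -> List[Path]:
    """Return all *.ckpt files below models_dir, sorted by path.

    Searches recursively, so it does not depend on how the version
    folders are organised. A missing directory yields an empty list.
    """
    return sorted(Path(models_dir).rglob("*.ckpt"))


def latest_checkpoint(models_dir: str = "data/models/librimix") -> Optional[Path]:
    """Return the most recently modified checkpoint, or None if none exist."""
    checkpoints = find_checkpoints(models_dir)
    return max(checkpoints, key=lambda p: p.stat().st_mtime, default=None)
```

Run from the project root, `latest_checkpoint()` returns a path that could be passed as `--checkpoint-path` in the training script. Whether the newest file is the right one to resume from depends on how the checkpoints were written.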
- Format code using `black .` in the project directory before committing
- Run all code in the project root directory so that paths relative to the current working directory work as intended