PerformanceNet is a deep convolutional model that learns, in an end-to-end manner, the score-to-audio mapping between musical scores and the corresponding real audio performances. Our work represents a humble yet valuable step towards the dream of The AI Musician. Find more details in our AAAI 2019 paper!
Below we assume the working directory is the repository root.
```bash
# Install the dependencies
pip install -r requirements.txt
```
PerformanceNet utilizes the MusicNet dataset, which provides musical scores and the corresponding performance audio data.
```bash
# Download the training data
./scripts/download_data.sh
```
You can also download the training data manually (musicnet.npz).
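If you download musicnet.npz manually, the archive can be inspected with NumPy as sketched below. The loading flags and the per-recording (audio, labels) layout follow the public MusicNet documentation rather than this repository's scripts, so treat this as a rough guide only.

```python
import numpy as np

# Minimal sketch of inspecting musicnet.npz (layout per the MusicNet release
# documentation; not part of this repository's scripts).
data = np.load('musicnet.npz', allow_pickle=True, encoding='latin1')

rec_id = data.files[0]          # recording IDs are the archive keys
audio, labels = data[rec_id]    # raw 44.1 kHz audio and its note labels
print(rec_id, audio.shape, len(labels))
```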
Pre-process the dataset into the pianorolls and spectrograms used for training PerformanceNet.
```bash
# Pre-process the dataset
./scripts/process_data.sh
```
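To get a feel for what the pre-processing produces, here is a rough sketch of building a magnitude spectrogram and a binary piano-roll with librosa and NumPy. The hop size, FFT size, and label-parsing details are illustrative assumptions and may differ from what process_data.sh actually does.

```python
import numpy as np
import librosa

SR = 44100    # MusicNet audio sample rate
HOP = 256     # illustrative hop size, not necessarily the repo's setting
N_FFT = 2048  # illustrative FFT size

def make_spectrogram(audio):
    """Magnitude spectrogram of shape (freq_bins, frames)."""
    return np.abs(librosa.stft(audio, n_fft=N_FFT, hop_length=HOP))

def make_pianoroll(labels, n_frames):
    """Binary piano-roll (128 pitches x frames) from MusicNet note labels.

    Assumes the label format described in the MusicNet tutorials: an interval
    tree keyed by sample positions, whose entries carry (instrument, note, ...)
    tuples in their data field.
    """
    roll = np.zeros((128, n_frames), dtype=np.float32)
    for interval in labels:
        start = interval.begin // HOP
        end = max(start + 1, interval.end // HOP)
        note = interval.data[1]          # MIDI pitch of the note
        roll[note, start:end] = 1.0
    return roll
```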
We provide scripts for easily managing experiments.
- Run the following command to set up a new experiment. (Roughly 300 epochs are needed to obtain good results.)
The arguments are, in order: 1. instrument, 2. training epochs, 3. testing frequency, 4. experiment name.
```bash
# Set up a new experiment
./scripts/train_model.sh cello 300 10 cello_exp_1
```
We use the Griffin-Lim algorithm to convert the output spectrogram into an audio waveform. (Note: synthesizing longer audio can take a very long time.)
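For reference, the sketch below shows how a magnitude spectrogram can be inverted with librosa's Griffin-Lim implementation; the STFT parameters, iteration count, and output path are illustrative assumptions and need not match what synthesize_audio.sh actually uses. More iterations generally give cleaner audio but take proportionally longer, which is why long clips are slow to synthesize.

```python
import librosa
import soundfile as sf

# Invert a magnitude spectrogram with Griffin-Lim (illustrative parameters;
# the repo's synthesis script may use different settings).
def spectrogram_to_audio(mag_spec, sr=44100, n_fft=2048, hop_length=256,
                         n_iter=100, out_path='output.wav'):
    audio = librosa.griffinlim(mag_spec, n_iter=n_iter,
                               hop_length=hop_length, win_length=n_fft)
    sf.write(out_path, audio, sr)
    return audio
```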
- Synthesizing with the test split of the MusicNet dataset (a new folder containing the generated audio will be created automatically in your experiment directory):
The arguments are, in order: 1. experiment directory, 2. data source (TEST_DATA means using the test split of the training dataset).
```bash
# Generates five 5-second audio clips by default
./scripts/synthesize_audio.sh cello_exp_1 TEST_DATA
```
- Synthesizing audio from your own MIDI file:
Manually create a directory called "midi" in your experiment directory and put your MIDI files into it before executing this script.
```bash
# Generates one audio clip; its length depends on your MIDI score
./scripts/synthesize_audio.sh cello_exp_1 YOUR_MIDI_FILE.midi
```
Our model can perform any solo piece given its score, so we provide a convenient script to convert any .midi file into the input format for our model. The quality may vary across keys, as some notes may never appear in the training data; common keys (C, D, G) should work well. It is also important to make sure the notes stay within the instrument's range.
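As a rough illustration of that range check, the sketch below uses pretty_midi to flag notes outside an instrument's playable range; the range table and helper function are hypothetical and not taken from the provided conversion script.

```python
import pretty_midi

# Approximate playable ranges in MIDI note numbers (illustrative values only).
INSTRUMENT_RANGE = {
    'cello': (36, 76),    # roughly C2-E5
    'violin': (55, 103),  # roughly G3-G7
    'flute': (60, 96),    # roughly C4-C7
}

def check_note_range(midi_path, instrument='cello'):
    """Return the pitches in the MIDI file that fall outside the instrument's range."""
    low, high = INSTRUMENT_RANGE[instrument]
    midi = pretty_midi.PrettyMIDI(midi_path)
    out_of_range = [note.pitch
                    for inst in midi.instruments
                    for note in inst.notes
                    if not low <= note.pitch <= high]
    if out_of_range:
        print(f'{len(out_of_range)} notes are outside the {instrument} range')
    return out_of_range
```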
- Violin: https://www.youtube.com/watch?v=kAEbbNUEEgI
- Flute: https://www.youtube.com/watch?v=Y38Z2De1NFo
- Cello: https://www.youtube.com/watch?v=3LzN3GvMNeU
- Cover of 吳萼洋's "蜂蜜檸檬" (Honey Lemon): https://youtu.be/k0-cT6GxS3g
If you use this code in your research, please cite the following papers:
- PerformanceNet: Score-to-Audio Music Generation with Multi-Band Convolutional Residual Network. Bryan Wang and Yi-Hsuan Yang. In Proceedings of the 33rd AAAI Conference on Artificial Intelligence (AAAI), 2019. [paper]
- Demonstration of PerformanceNet: A Convolutional Neural Network Model for Score-to-Audio Music Generation. Yu-Hua Chen, Bryan Wang, and Yi-Hsuan Yang. In Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence (IJCAI), Demos, 2019. [paper]