In this project we aimed to create various representations of a soundtrack in the form of a video. If you remember, Windows XP used to have a visualizer that was based on pitch, tempo, bass, and metadata from the file itself. We wanted to make something similar that is actually based on the music itself: its vibe, the instruments used, its speed, how danceable it is, and other factors that go beyond hard-coding some predefined effects.
Sparks by Coldplay
Somethings never change by Bathe Alone
In this project, we're using Google's YAMNet to extract features from the soundtrack. YAMNet is trained on millions of audio clips, and it has learned 1024 features that it extracts from each 0.96-second time frame.
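To get a feel for how many feature vectors a track produces, here is a minimal sketch of the frame arithmetic. It assumes YAMNet's documented default hop of 0.48 seconds between frame starts (the hop is not stated in this README), and `frame_start_times` is a hypothetical helper, not a function from this project. The real model pads its input, so the exact count it returns may differ slightly.

```python
FRAME_SECONDS = 0.96  # length of one YAMNet frame, as described above
HOP_SECONDS = 0.48    # assumption: YAMNet's default hop between frame starts


def frame_start_times(duration_seconds):
    """Return the start time of every full 0.96 s frame that fits in the track."""
    starts = []
    index = 0
    # The small epsilon guards against floating-point rounding at the boundary.
    while index * HOP_SECONDS + FRAME_SECONDS <= duration_seconds + 1e-9:
        starts.append(round(index * HOP_SECONDS, 2))
        index += 1
    return starts


# A 10-second clip: each start time would correspond to one 1024-value vector.
starts = frame_start_times(10.0)
print(len(starts), starts[:3])
```

Under these assumptions, a three-minute song yields a few hundred frames, each carrying 1024 values.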
For our purposes, 1024 values is a little too much to process and visualize effectively, so we're currently using k-means clustering to reduce the 1024 values to only 4 values, which represent x position, y position, color palette, and size in the final output.
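The README does not spell out how k-means turns 1024 values into 4, so the following is only one possible reading, not necessarily what main.py does: fit 4 clusters on all frames, then describe each frame by its distance to each of the 4 cluster centers (the same idea as scikit-learn's `KMeans.transform`). The helper names `kmeans` and `reduce_to_four` are hypothetical, and random numbers stand in for real YAMNet embeddings.

```python
import math
import random


def kmeans(points, k, iterations=20, seed=0):
    """Plain k-means: return k centroids for a list of equal-length vectors."""
    rng = random.Random(seed)
    centroids = [list(p) for p in rng.sample(points, k)]
    for _ in range(iterations):
        # Assign every point to its nearest centroid.
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda i: math.dist(p, centroids[i]))
            groups[nearest].append(p)
        # Move each centroid to the mean of its group (keep it if the group is empty).
        for i, group in enumerate(groups):
            if group:
                centroids[i] = [sum(col) / len(group) for col in zip(*group)]
    return centroids


def reduce_to_four(embedding, centroids):
    """Map one embedding to 4 values: its distance to each of the 4 centroids."""
    return [math.dist(embedding, c) for c in centroids]


# Toy stand-in for YAMNet output: 50 frames of 1024 random values each.
rng = random.Random(1)
frames = [[rng.random() for _ in range(1024)] for _ in range(50)]
centroids = kmeans(frames, k=4)
reduced = [reduce_to_four(f, centroids) for f in frames]
print(len(reduced), len(reduced[0]))
```

Each row of `reduced` could then be scaled to screen coordinates, a palette index, and a size, in whatever order the visualizer expects.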
In this project we're using Arcade to visualize the clustered embeddings. We essentially made a game that takes in the points and creates more points in between them to make the motion look smoother and more connected. We then visualize in real time and capture each rendered frame in memory. Once all frames are captured, the script creates a video and attaches the music to it.
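The "more points in between" step can be sketched as plain linear interpolation. This is an assumption about the approach: the project may use a different smoothing method, and `interpolate` is a hypothetical helper rather than a function from this repository.

```python
def interpolate(points, steps):
    """Insert `steps` evenly spaced points between each consecutive pair."""
    if not points:
        return []
    smooth = []
    for start, end in zip(points, points[1:]):
        for s in range(steps + 1):
            t = s / (steps + 1)  # fraction of the way from start to end
            smooth.append(tuple(a + (b - a) * t for a, b in zip(start, end)))
    smooth.append(tuple(points[-1]))  # the final point has no successor
    return smooth


path = interpolate([(0.0, 0.0), (4.0, 8.0)], steps=3)
print(path)
# [(0.0, 0.0), (1.0, 2.0), (2.0, 4.0), (3.0, 6.0), (4.0, 8.0)]
```

Because the function works on tuples of any length, the same call could interpolate all 4 values (x, y, color, size) at once rather than just the position.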
git clone https://github.com/pooriaahmadi/music-visualizer.git
cd music-visualizer
pip install -r requirements.txt
pip3 install -r requirements.txt
We recommend finding your desired soundtrack on SoundCloud and then passing it through this SoundCloud downloader. Afterwards, convert the downloaded .mp3 to .wav using this website.
Make sure to put the .wav file in the root of the project (in the same folder as main.py, for ease of use).
python main.py --music sparks.wav --visuals snake --embed_model yamnet
python3 main.py --music sparks.wav --visuals snake --embed_model yamnet
This will take a while: the program has to visualize the whole track first. DO NOT CLOSE the visualizer window. It will close itself once the visualization is done, and generating the final output video then takes a while as well. BE PATIENT.
Once it completes, you should be left with final_video.mp4, which is your finished video.
- @pooriaahmadi Pooria Ahmadi
- @dolev497 Dolev Klein