Creativity might be the ultimate measure of artificial intelligence, and songwriting is one such creative field. This project explores how well language models can generate song lyrics.
The aim of this project is to:
- Scrape and collect the lyrics of Billboard Hot-100 songs from 1960–2021 and select the songs belonging to Pop, Rock and Rap. Similarly, scrape and collect the lyrics of songs by 5 artists from each of those 3 genres.
- Perform exploratory data analysis for each of the 3 selected genres by plotting word clouds and examining the most frequent words, average line length, vocabulary richness, etc.
- Build language models using vanilla LSTMs, RoBERTa and GPT-2 (from the 🤗 Hugging Face library) and train them separately on the lyrics of each of the 3 selected genres.
- Select the best-performing model for each genre based on the performance metrics and fine-tune it on the lyrics of the selected artists. This yields 15 models (3 genres × 5 artists), each of which can generate lyrics in the style of one artist.
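
The exploratory metrics listed above can be sketched in a few lines of Python. This is only an illustration, not the notebook's actual code: the function names `tokenize` and `lyric_stats` are hypothetical, average line length is assumed to mean words per non-empty line, and vocabulary richness is assumed to be the type-token ratio (unique words divided by total words). The project may define these differently.

```python
import re
from collections import Counter


def tokenize(text):
    # Assumption: lowercase the text and keep runs of letters and apostrophes.
    return re.findall(r"[a-z']+", text.lower())


def lyric_stats(lyrics, top_n=3):
    # Hypothetical helper: computes simple per-song statistics.
    lines = [ln for ln in lyrics.splitlines() if ln.strip()]
    tokens = tokenize(lyrics)
    counts = Counter(tokens)
    return {
        # Assumed definition: mean number of words per non-empty line.
        "avg_line_length": len(tokens) / len(lines) if lines else 0.0,
        # Assumed definition: type-token ratio (unique words / total words).
        "vocab_richness": len(counts) / len(tokens) if tokens else 0.0,
        "top_words": counts.most_common(top_n),
    }


sample = "la la la\nsing it loud\nla la"
stats = lyric_stats(sample)
print(stats)
```

Applying the same function to all lyrics of a genre (for example, by concatenating the songs) would give genre-level figures that can be compared across Pop, Rock and Rap.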
This repository contains:
- A data folder containing the main Billboard Hot-100 dataset, from which smaller Pop, Rock and Rap datasets are created. The data folder also contains subfolders for the artists of each genre; each subfolder holds the lyrics of the 5 selected artists for that genre.
- The Lyrics Generation using Language Models Jupyter notebook.
References: