# Fine-tuning DistilBERT on senator tweets

A guide to fine-tuning DistilBERT on the tweets of American Senators with snscrape, SQLite, and Transformers (PyTorch) on Google Colab.

Built in Python 🐍 using 🤗 Transformers and deployed on Streamlit 🎈 (coming soon!).

Read the Medium article here.

## Code

- Part 1: Creating the dataset - `get_tweets.ipynb`
- Part 2: Fine-tuning DistilBERT - `finetune_distilbert_senator_tweets_pt.ipynb`
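In Part 1, the scraped tweets are stored in SQLite before fine-tuning. A minimal sketch of that step using Python's built-in `sqlite3` module, assuming a simple `tweets` table (the actual schema in `get_tweets.ipynb` may differ):

```python
import sqlite3

def save_tweets(db_path, rows):
    """Insert (handle, date, text) rows into a local SQLite database.

    Hypothetical schema for illustration; the notebook's table layout may differ.
    """
    con = sqlite3.connect(db_path)
    con.execute(
        "CREATE TABLE IF NOT EXISTS tweets (handle TEXT, date TEXT, text TEXT)"
    )
    con.executemany("INSERT INTO tweets VALUES (?, ?, ?)", rows)
    con.commit()
    return con

# Usage: an in-memory database with one example row
con = save_tweets(":memory:", [("SenExample", "2021-06-01", "Hello from the floor.")])
print(con.execute("SELECT COUNT(*) FROM tweets").fetchone()[0])  # 1
```

Storing the raw tweets in a database like this makes it easy to re-query and re-label the dataset without re-scraping.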

## Sample

All ~100,000 tweets posted in 2021 by the 100 United States Senators, scraped by me.
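The tweets were collected with snscrape. A rough sketch of how a per-senator 2021 search query can be built and scraped (the handle, the helper names, and the per-account limit are illustrative; the real logic lives in `get_tweets.ipynb`, and snscrape's tweet attribute names vary across versions):

```python
def build_query(handle: str, year: int = 2021) -> str:
    """Build an snscrape Twitter search query for one account and one year."""
    return f"from:{handle} since:{year}-01-01 until:{year + 1}-01-01"

def scrape(handle: str, limit: int = 100):
    """Yield (date, text) pairs for one account. Requires snscrape + network."""
    # Import kept local so build_query works without snscrape installed.
    import snscrape.modules.twitter as sntwitter

    scraper = sntwitter.TwitterSearchScraper(build_query(handle))
    for i, tweet in enumerate(scraper.get_items()):
        if i >= limit:
            break
        yield tweet.date, tweet.content  # attribute names vary by snscrape version

print(build_query("SenSanders"))  # from:SenSanders since:2021-01-01 until:2022-01-01
```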

## Model

DistilBERT base model (uncased) for sequence classification.

## Evaluation

The model was evaluated on a held-out test split (20% of the data):

```
{'accuracy': 0.908, 'f1': 0.912}
```
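For reference, accuracy and (binary) F1 reduce to simple counts over the test predictions. A pure-Python sketch of the two metrics, equivalent to what a metrics library would report (the example labels below are made up):

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

def f1(y_true, y_pred, positive=1):
    """Binary F1: harmonic mean of precision and recall for the positive class."""
    tp = sum(t == p == positive for t, p in zip(y_true, y_pred))
    fp = sum(p == positive and t != positive for t, p in zip(y_true, y_pred))
    fn = sum(t == positive and p != positive for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return 2 * precision * recall / (precision + recall) if precision + recall else 0.0

# Toy labels, for illustration only
y_true = [1, 0, 1, 1, 0]
y_pred = [1, 0, 0, 1, 0]
print(accuracy(y_true, y_pred))  # 0.8
print(f1(y_true, y_pred))        # 0.8
```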