Application developed during the Digital House Data Science 2021 course
This app extracts tweets and Reddit posts using their respective APIs, runs an analysis using an LGBM model trained on this dataset:
https://www.kaggle.com/jessicali9530/kuc-hackathon-winter-2018
and reports statistics about the medical drug on that social media site.
The dataset contains 161,297 reviews.
During the COVID-19 pandemic, dexamethasone became a really popular medication, especially for patients in critical condition.
In this example, we run the app for this drug, searching Reddit's r/all (a subreddit that aggregates the most popular posts from multiple subreddits).
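As a rough illustration of the collection step, the sketch below builds a search URL for Reddit's public JSON listing and turns a listing-shaped payload into simple post records. It assumes the standard listing layout (`data.children[].data` with `title`, `selftext`, and `created_utc`); the function names and the sample payload are hypothetical, and the app itself may well use a different client or endpoint.

```python
import json
from datetime import datetime, timezone
from urllib.parse import urlencode


def build_search_url(drug, subreddit="all", limit=100):
    """Build a search URL for a subreddit's JSON listing (assumed endpoint)."""
    query = urlencode({"q": drug, "restrict_sr": 1, "limit": limit})
    return f"https://www.reddit.com/r/{subreddit}/search.json?{query}"


def parse_listing(payload):
    """Turn an already-decoded listing payload into simple post records."""
    posts = []
    for child in payload.get("data", {}).get("children", []):
        data = child.get("data", {})
        # Combine title and body so the model sees the whole post text.
        text = f"{data.get('title', '')} {data.get('selftext', '')}".strip()
        created = datetime.fromtimestamp(data.get("created_utc", 0), tz=timezone.utc)
        posts.append({"text": text, "created": created})
    return posts


# Hand-written sample in the shape of a listing response (not real data).
sample = json.loads("""
{"data": {"children": [
  {"data": {"title": "Dexamethasone trial results", "selftext": "", "created_utc": 1592265600}},
  {"data": {"title": "Question", "selftext": "Anyone prescribed dexamethasone?", "created_utc": 1585699200}}
]}}
""")

posts = parse_listing(sample)
print(build_search_url("dexamethasone"))
for post in posts:
    print(post["created"].date(), post["text"])
```

Keeping the parsing separate from the network call makes it possible to check the record format without hitting the API.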
After running the model, these are our final results:
As we can see, there is a spike of interest during the first months of 2020, when the pandemic started.
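A spike like this can be spotted by counting posts per calendar month. The following is a minimal sketch, assuming each post carries a Unix timestamp; `posts_per_month` is a hypothetical helper and the timestamps are made up for illustration, not taken from the app's data.

```python
from collections import Counter
from datetime import datetime, timezone


def posts_per_month(timestamps):
    """Count posts per calendar month (UTC) from Unix timestamps."""
    months = Counter()
    for ts in timestamps:
        created = datetime.fromtimestamp(ts, tz=timezone.utc)
        months[created.strftime("%Y-%m")] += 1
    # Sort by month key so the result reads chronologically.
    return dict(sorted(months.items()))


# Made-up timestamps for illustration only, not the app's data.
example = [1577836800, 1583020800, 1583107200, 1585699200]
counts = posts_per_month(example)
peak = max(counts, key=counts.get)  # month with the most posts
print(counts, peak)
```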
We also generate a word cloud of all the collected posts.
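A word cloud is essentially a picture of word frequencies. The sketch below only computes those frequencies with the standard library; the stop-word list, the `word_frequencies` helper, and the sample posts are illustrative assumptions, and the rendering step the app uses is not shown here.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real run would use a fuller one.
STOP_WORDS = {"a", "an", "and", "the", "is", "it", "of", "to", "for", "in", "i", "my", "was"}


def word_frequencies(posts, top_n=50):
    """Return the top_n most common words across all posts, skipping stop words."""
    counts = Counter()
    for post in posts:
        words = re.findall(r"[a-z']+", post.lower())
        # Drop stop words and very short tokens, which add noise to the cloud.
        counts.update(w for w in words if w not in STOP_WORDS and len(w) > 2)
    return counts.most_common(top_n)


# Invented example posts, not collected data.
sample_posts = [
    "Dexamethasone helped my father in the ICU",
    "Is dexamethasone safe? The ICU doctor prescribed it",
]
freqs = word_frequencies(sample_posts, top_n=2)
print(freqs)
```

If the `wordcloud` package is installed, a dictionary built from these pairs could be passed to `WordCloud().generate_from_frequencies(...)` to draw the image.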
Adderall is a drug commonly used to treat Attention-Deficit/Hyperactivity Disorder (ADHD). There is a subreddit (r/adhd) where many people with this condition discuss this and many other medications used for treatment.
A wise man once said: "A machine learning model is as good as its data"
We wanted to test this by comparing our model with TextBlob, an open-source NLP module that ships with a pre-trained sentiment model. It is a really popular model in the data science community, but it has a big flaw: TextBlob's model was trained on a movie review dataset.
This is important to keep in mind, because TextBlob's model most likely won't understand all of the variables that matter in a medical review.
We can see this in our deployment, in the "Sentiment Analysis" section.
Here we can see that, for a given sentence, both models identify it as a positive review.
But here we can see that our model performs correctly (classifying the text as a negative review), while TextBlob labels it as a positive one (which is wrong).
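This kind of side-by-side check can be automated by running both models over the same texts and listing the cases where they disagree. The sketch below is a minimal version under stated assumptions: the two classifiers are toy stand-ins (neither is the app's LGBM model nor TextBlob), the zero polarity threshold is an arbitrary choice, and the example sentences are invented.

```python
def label_from_polarity(polarity, threshold=0.0):
    """Map a polarity score in [-1, 1] to a label (threshold is an assumption)."""
    return "positive" if polarity > threshold else "negative"


def find_disagreements(texts, model_a, model_b):
    """Return (text, label_a, label_b) for every text the two models label differently."""
    rows = []
    for text in texts:
        label_a, label_b = model_a(text), model_b(text)
        if label_a != label_b:
            rows.append((text, label_a, label_b))
    return rows


# Toy stand-in for a model trained on drug reviews (illustration only).
def domain_model(text):
    return "negative" if "side effects" in text.lower() else "positive"


# Toy stand-in for a general-purpose polarity model (illustration only).
def general_model(text):
    return label_from_polarity(0.5 if "great" in text.lower() else -0.5)


texts = [
    "This medication worked great for me",
    "Great at first, then the side effects were unbearable",
]
disagreements = find_disagreements(texts, domain_model, general_model)
for text, ours, other in disagreements:
    print(f"{ours} vs {other}: {text}")
```

With TextBlob installed, the general-purpose stand-in could be swapped for `lambda t: label_from_polarity(TextBlob(t).sentiment.polarity)`, and the domain stand-in for a wrapper around the trained model's prediction.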
Link to the app hosted on Heroku: https://tinyurl.com/DrugReviewAnalyzer
Group 6 - TP 4 - Digital House - Data Science
Juan Boirazian
Jorge Corro
Federico Vessuri
Mariana Peinado
Franco Visintini