Medical Drug Review Machine Learning Analyzer app

Application developed during the Digital House Data Science 2021 course

This app extracts tweets and Reddit posts using their respective APIs, runs a sentiment analysis using an LGBM (LightGBM) model trained on this dataset:

https://www.kaggle.com/jessicali9530/kuc-hackathon-winter-2018

and reports statistics about the medical drug on that social media site.
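As a rough illustration of the "statistics" step, the sketch below aggregates per-post predictions into counts and a positive share. The function name `summarize_sentiment` and the label strings `"positive"` / `"negative"` are assumptions made for this example; the app's actual output format may differ.

```python
from collections import Counter


def summarize_sentiment(labels):
    """Return per-label counts and the share of posts labeled positive."""
    counts = Counter(labels)
    total = sum(counts.values())
    # Guard against an empty search result to avoid dividing by zero.
    positive_share = counts["positive"] / total if total else 0.0
    return {
        "total": total,
        "counts": dict(counts),
        "positive_share": positive_share,
    }


# Hypothetical model predictions for five collected posts.
predicted = ["positive", "negative", "positive", "positive", "negative"]
summary = summarize_sentiment(predicted)
print(summary)
```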

The dataset contains 161,297 reviews.

Example 1:

During the COVID-19 pandemic, dexamethasone became a very popular medication, especially for patients in critical condition.

In this example we run the app for this drug, searching in Reddit's r/all (a subreddit with the most popular posts from multiple subreddits).

After running the model, these are our final results:

As we can see, there is a spike of interest during the first months of 2020, when the pandemic started.
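A spike like this can be found by counting posts per month. The following is a minimal sketch, assuming each collected post carries a Unix timestamp in UTC (Reddit post objects expose one as `created_utc`); the helper name `posts_per_month` and the sample dates are made up for illustration.

```python
from collections import Counter
from datetime import datetime, timezone


def posts_per_month(timestamps):
    """Count posts per calendar month, keyed as 'YYYY-MM' in ascending order."""
    months = Counter(
        datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m")
        for ts in timestamps
    )
    return dict(sorted(months.items()))


# Hypothetical post timestamps (seconds since the Unix epoch, UTC).
sample = [
    datetime(2019, 11, 5, tzinfo=timezone.utc).timestamp(),
    datetime(2020, 3, 2, tzinfo=timezone.utc).timestamp(),
    datetime(2020, 3, 20, tzinfo=timezone.utc).timestamp(),
    datetime(2020, 4, 1, tzinfo=timezone.utc).timestamp(),
]
monthly = posts_per_month(sample)
print(monthly)
```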

We also generate a word cloud of all the collected posts.
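A word cloud is driven by word frequencies. The sketch below shows one way to compute them with the standard library only; the stop-word list, the tokenizing regex, and the sample posts are assumptions for this example, not the app's actual preprocessing.

```python
import re
from collections import Counter

# A tiny illustrative stop-word list; a real one would be much longer.
STOP_WORDS = {"the", "a", "an", "and", "i", "it", "my", "for", "of", "to", "was", "is"}


def word_frequencies(posts, stop_words=STOP_WORDS):
    """Count lowercase words across posts, skipping stop words and very short tokens."""
    words = []
    for post in posts:
        words.extend(re.findall(r"[a-z']+", post.lower()))
    return Counter(w for w in words if w not in stop_words and len(w) > 2)


# Hypothetical collected posts.
posts = [
    "Dexamethasone helped my father recover.",
    "The dexamethasone trial results are promising.",
]
freqs = word_frequencies(posts)
print(freqs.most_common(3))
```

If the third-party `wordcloud` package is installed, a mapping like `freqs` can be rendered with `WordCloud().generate_from_frequencies(freqs)`.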

Example 2:

Adderall is a drug that is commonly used to treat Attention-Deficit/Hyperactivity Disorder (ADHD). There is a subreddit (r/adhd) where many people with this medical condition talk about this and many other medications used for its treatment.

Our model vs TextBlob: understanding why it is important to train the model with the proper data

A wise man once said: "A machine learning model is as good as its data."

We wanted to test this by comparing our model with TextBlob, an open-source Python NLP library that ships with ready-made sentiment analyzers. It is very popular in the data science community, but it has a big flaw for our purposes: its trained sentiment classifier was trained on a movie review dataset.

This is important to keep in mind, because TextBlob most likely won't understand all of the variables that matter in a medical review.

We can see this in our deployed app, in the "Sentiment Analysis" section.

Here we can see that, for a given sentence, both models identify it as a positive review:

But here we can see that our model performs correctly (classifying the text as a negative review), while TextBlob classifies it as a positive one (which is wrong):
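To reproduce the TextBlob side of this comparison, a sketch along these lines may help. TextBlob reports a polarity score between -1 and 1 through `TextBlob(text).sentiment.polarity`; mapping that score to a binary label with a threshold of 0.0 is an assumption made here, as are the helper names `polarity_to_label` and `textblob_label`. The `textblob` package is imported lazily, so only the second function requires it.

```python
def polarity_to_label(polarity: float, threshold: float = 0.0) -> str:
    """Map a polarity score in [-1, 1] to a binary label.

    Assumption: scores at or below the threshold count as negative.
    """
    return "positive" if polarity > threshold else "negative"


def textblob_label(text: str) -> str:
    """Label a text using TextBlob's default sentiment analyzer."""
    # Third-party dependency, imported here so the rest of the module
    # still loads when textblob is not installed.
    from textblob import TextBlob

    return polarity_to_label(TextBlob(text).sentiment.polarity)


# The thresholding step on its own, with hypothetical polarity scores.
print(polarity_to_label(0.35))
print(polarity_to_label(-0.2))
```

Running `textblob_label` and the drug-review model on the same sentence gives a side-by-side view like the one in the app.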

Link to the app hosted on Heroku: https://tinyurl.com/DrugReviewAnalyzer

Group 6 - TP 4 - Digital House - Data Science

Members:

Juan Boirazian : [email protected]

Jorge Corro : [email protected]

Federico Vessuri : [email protected]

Mariana Peinado : [email protected]

Franco Visintini: [email protected]
