Binary Sentiment Classification

This project implements three machine learning algorithms (Naive Bayes, ID3, and AdaBoost) to classify movie reviews as either positive or negative.

The algorithms are trained and tested on the "Large Movie Review Dataset", also known as the "IMDB dataset", which can be found here. For the code to work, download the dataset from the link provided and import it into the project folder.

In this implementation, each review is represented by a binary vector that indicates which words from a vocabulary appear in that review. The vocabulary is defined by an information gain algorithm that selects the most informative words.
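The representation described above can be illustrated with a minimal sketch. It is not the code from `functions.py`: the function names (`entropy`, `information_gain`, `select_vocabulary`, `to_binary_vector`), the toy reviews, and the whitespace tokenization are all assumptions made for illustration. The sketch scores each candidate word by how much knowing its presence reduces the entropy of the labels, keeps the top-scoring words as the vocabulary, and then encodes each review as a 0/1 vector over that vocabulary.

```python
import math


def entropy(pos, neg):
    """Entropy in bits of a set with `pos` positive and `neg` negative labels."""
    total = pos + neg
    if total == 0:
        return 0.0
    h = 0.0
    for count in (pos, neg):
        if count:
            p = count / total
            h -= p * math.log2(p)
    return h


def information_gain(docs, labels, word):
    """Reduction in label entropy from knowing whether `word` occurs in a review."""
    n = len(labels)
    with_word = [y for d, y in zip(docs, labels) if word in d]
    without_word = [y for d, y in zip(docs, labels) if word not in d]
    base = entropy(sum(labels), n - sum(labels))
    remainder = 0.0
    for subset in (with_word, without_word):
        remainder += len(subset) / n * entropy(sum(subset), len(subset) - sum(subset))
    return base - remainder


def select_vocabulary(docs, labels, size):
    """Return the `size` words with the highest information gain (hypothetical helper)."""
    candidates = sorted(set().union(*docs))
    ranked = sorted(
        candidates,
        key=lambda w: information_gain(docs, labels, w),
        reverse=True,
    )
    return ranked[:size]


def to_binary_vector(doc, vocabulary):
    """1 if the vocabulary word occurs in the review, else 0."""
    return [1 if w in doc else 0 for w in vocabulary]


# Toy data (assumed for illustration): 1 = positive, 0 = negative.
reviews = [
    "a great and moving film",
    "great cast and a great story",
    "a dull and boring film",
    "boring plot and a dull cast",
]
labels = [1, 1, 0, 0]

# Each review becomes a set of words; naive whitespace tokenization is an assumption.
docs = [set(r.split()) for r in reviews]
vocabulary = select_vocabulary(docs, labels, size=3)
vectors = [to_binary_vector(d, vocabulary) for d in docs]
```

On this toy data, words such as "a" and "and" occur in every review and therefore have zero information gain, while "great", "dull", and "boring" split the labels perfectly and are selected. A real vocabulary would be much larger, and scoring every candidate word this way is quadratic in practice, so a production version would likely precompute word counts per class.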

The learning curves of the algorithms are shown below.

Naive Bayes Learning Curves

ID3 Learning Curves

AdaBoost Learning Curves
