vedang-04/RECOMMENDER_SYSTEM

RECOMMENDER_SYSTEM

Introduction

This project builds a content-based recommendation system by calculating item-item interactions from the given data. Entering the valid Netflix name of a movie or TV show makes the model produce a description of the content-based recommendations for that title.

Description

The recommendation model combines a TF-IDF vectorizer (Tfidf Vectorizer) for text-to-vector conversion with the unsupervised Nearest Neighbors algorithm, which finds the cosine distance between the vectors corresponding to the movies/TV shows.
Recommendations are based on the minimum cosine distance, i.e., the maximum cosine similarity, between two movies, a movie and a TV show, or two TV shows.
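
As a rough illustration of the approach described above, the following is a minimal sketch using scikit-learn's TfidfVectorizer and NearestNeighbors with the cosine metric. It is not the repository's actual code: the toy titles, the descriptions, and the `recommend` helper are assumptions made up for this example.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.neighbors import NearestNeighbors

# Toy catalogue: titles and descriptions are invented for illustration only.
titles = ["Space Quest", "Galaxy Wars", "Baking Masters", "Cake Showdown"]
descriptions = [
    "astronauts explore deep space on a dangerous mission",
    "rebels fight an empire in deep space battles",
    "amateur bakers compete to bake the best cake",
    "pastry chefs compete in a cake baking contest",
]

# Text-to-vector conversion with TF-IDF.
vectorizer = TfidfVectorizer(stop_words="english")
tfidf = vectorizer.fit_transform(descriptions)

# Unsupervised nearest-neighbour search using cosine distance.
nn = NearestNeighbors(metric="cosine", algorithm="brute")
nn.fit(tfidf)


def recommend(title, k=2):
    """Return up to k (title, cosine similarity) pairs closest to `title`.

    Hypothetical helper: the name and signature are assumptions.
    """
    idx = titles.index(title)
    # Ask for k + 1 neighbours because the query item is its own nearest neighbour.
    distances, indices = nn.kneighbors(tfidf[idx], n_neighbors=k + 1)
    results = []
    for dist, i in zip(distances[0], indices[0]):
        if i != idx:
            # Cosine similarity = 1 - cosine distance.
            results.append((titles[i], 1.0 - float(dist)))
    return results[:k]


print(recommend("Baking Masters", k=1))
```

Minimising cosine distance is the same as maximising cosine similarity, so the helper simply reports `1 - distance` for each neighbour.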

Dataset

We used 3 different datasets and then merged them to form the final dataset. All of them are available on Kaggle.
Netflix Movies and Shows: https://www.kaggle.com/shivamb/netflix-shows
IMDB Movies and Shows: https://www.kaggle.com/stefanoleone992/imdb-extensive-dataset
IMDB Ratings: https://www.kaggle.com/stefanoleone992/imdb-extensive-dataset

They are provided in the datasets folder. Use them as needed in the notebook.

It is advisable to fetch the data (see Setup) rather than download it separately. A Kaggle account is required for fetching the dataset.
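
This README does not spell out how the three datasets are merged, so the following is only a sketch of one plausible way to do it with pandas. The tiny DataFrames stand in for the real Kaggle files, and the column names and join keys (`title`, `imdb_title_id`, `weighted_average_vote`) are assumptions for illustration.

```python
import pandas as pd

# Tiny stand-ins for the three Kaggle files; all values are invented.
netflix = pd.DataFrame({
    "title": ["Space Quest", "Baking Masters", "Only On Netflix"],
    "type": ["Movie", "TV Show", "Movie"],
    "description": ["astronauts explore space", "bakers compete", "no IMDb match"],
})
imdb_titles = pd.DataFrame({
    "imdb_title_id": ["tt001", "tt002"],
    "title": ["Space Quest", "Baking Masters"],
    "genre": ["Sci-Fi", "Reality-TV"],
})
imdb_ratings = pd.DataFrame({
    "imdb_title_id": ["tt001", "tt002"],
    "weighted_average_vote": [7.1, 6.4],
})

# Step 1: attach ratings to IMDb titles via the shared id (assumed key).
imdb = imdb_titles.merge(imdb_ratings, on="imdb_title_id", how="inner")

# Step 2: attach IMDb information to Netflix titles by name (assumed key).
# A left join keeps Netflix entries that have no IMDb match.
final = netflix.merge(imdb, on="title", how="left")

print(final)
```

A left join is used here so that no Netflix title is dropped; rows without an IMDb match simply carry missing values in the IMDb columns.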

Requirements

General:

Python>=3.6

For Model Package:

numpy>=1.19.0,<1.21.0
pandas>=1.2.0,<1.3.0
pydantic>=1.8.1,<1.9.0
scikit-learn>=0.24.0,<0.25.0
strictyaml>=1.3.2,<1.4.0
ruamel.yaml==0.16.12
feature-engine>=1.0.2,<1.1.0
joblib>=1.0.1,<1.1.0
kaggle==1.5.2
bash

For API:

uvicorn>=0.11.3,<0.12.0
fastapi>=0.64.0,<1.0.0
python-multipart>=0.0.5,<0.1.0
typing_extensions>=3.7.4,<3.8.0
loguru>=0.5.3,<0.6.0
python-json-logger>=0.1.11,<0.2.0
jinja2==3.0.2
python-multipart==0.0.5

For Testing and Tooling of the Project:

pytest>=6.2.3,<6.3.0
requests>=2.23.0,<2.24.0
black==20.8b1
flake8>=3.9.0,<3.10.0
mypy==0.812
isort==5.8.0

Setup (open a suitable command prompt)

1] Clone the repository to your local system. Adjust the commands below to match the path of your folder. In these examples the project is on my Desktop (Windows C: drive).

2] To train the model and generate the cosine distances and similarities between content:

cd C:\Users\kshir\OneDrive\Desktop\RECOMMENDER_SYSTEM\ds_project
tox -e fetch_data (fetching the data)
tox -e train_test_package

3] To install the package locally:

tox -e train_test_package
(use this to ensure that you have a trained model)
cd C:\Users\kshir\OneDrive\Desktop\RECOMMENDER_SYSTEM\ds_project\package
python setup.py sdist bdist_wheel
pip install -e .

If you run into a memory error, directly use the ind_csv.csv file present in the datasets of the recommender model package.

4] To run the API:

cd C:\Users\kshir\OneDrive\Desktop\RECOMMENDER_SYSTEM\ds_project
tox -e test_api
tox -e run

5] Once the application is running, go to http://localhost:8001/recommendersystem

More on this repository can be found in this file: https://drive.google.com/file/d/1nOQCYNwy4yLqpzeLjWL2xubG8qeLKW5z/view?usp=sharing

An in-depth analysis and explanation of this repository can be found in this file: https://drive.google.com/file/d/16b4EvfuRbhXnFMoTNd6cmHsJsOpd2kSa/view?usp=sharing

The files at the two links above are also available in the documents of this repository.

Contributors

VEDANG KSHIRSAGAR (vedang-04)
