Classifying Products from Their Ingredients Using Various Machine Learning Algorithms

Team InsightScan

CSE523 - Machine Learning, Ahmedabad University

Introduction

Our goal is to use machine learning techniques to accurately classify products as either food or beauty products based on their ingredient lists. To achieve this, we manually collected ingredient information for 1,100 food and beauty products using the ChatGPT API, which yielded 1,672 unique ingredients. We pre-processed the data and tested several classification algorithms. Based on those results, we built ensemble learning models and tuned them for test performance. We selected a Support Vector Machine (SVM) as the most suitable model for our dataset and applied additional techniques such as PCA, GridSearchCV, and cosine similarity to improve the results. We also built a recommendation model that, given a product from our dataset, returns a list of similar products.
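The recommendation step described above can be illustrated with a minimal sketch. It assumes each product is represented as a set of ingredients (equivalent to a binary ingredient vector) and ranks the other products by cosine similarity. The product names, ingredients, and the `recommend` helper are hypothetical and are not taken from the project's code or dataset.

```python
import math

# Hypothetical toy catalogue; the project's real dataset is not reproduced here.
PRODUCTS = {
    "granola bar": {"oats", "honey", "almond", "salt"},
    "oat cookie": {"oats", "sugar", "butter", "salt"},
    "face cream": {"water", "glycerin", "shea butter", "fragrance"},
    "body lotion": {"water", "glycerin", "fragrance", "almond oil"},
}


def cosine_similarity(a, b):
    """Cosine similarity of two ingredient sets viewed as binary vectors."""
    if not a or not b:
        return 0.0
    # Dot product of binary vectors = number of shared ingredients;
    # the norm of a binary vector = sqrt(number of ingredients).
    return len(a & b) / math.sqrt(len(a) * len(b))


def recommend(name, products, top_n=2):
    """Return the top_n products most similar to `name`, best first."""
    query = products[name]
    scores = [
        (other, cosine_similarity(query, ingredients))
        for other, ingredients in products.items()
        if other != name
    ]
    scores.sort(key=lambda pair: pair[1], reverse=True)
    return scores[:top_n]


if __name__ == "__main__":
    for product, score in recommend("face cream", PRODUCTS):
        print(f"{product}: {score:.2f}")
```

In this toy catalogue, "face cream" and "body lotion" share three of their four ingredients, so the lotion is ranked first for the cream.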

Results

For a comparison of the results produced by the different algorithms, see the table in the report.

Classification
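As a rough illustration of how PCA, an SVM, and GridSearchCV can be combined for this kind of task, here is a minimal scikit-learn sketch. It is not the project's actual code: the toy ingredient lists, the two-component PCA, the parameter grid, and the helper names `split_ingredients` and `to_dense` are all assumptions made for the example.

```python
from sklearn.decomposition import PCA
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.model_selection import GridSearchCV
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer
from sklearn.svm import SVC


def split_ingredients(text):
    """Split a comma-separated ingredient string into clean tokens."""
    return [token.strip() for token in text.split(",") if token.strip()]


def to_dense(matrix):
    """Convert the sparse count matrix to a dense array before PCA."""
    return matrix.toarray()


# Hypothetical toy data; the project's real dataset is not reproduced here.
ingredient_lists = [
    "oats, honey, almond, salt",
    "wheat flour, sugar, butter, salt",
    "milk, sugar, cocoa, vanilla",
    "rice, salt, vegetable oil",
    "tomato, salt, sugar, vinegar",
    "water, glycerin, shea butter, fragrance",
    "water, glycerin, fragrance, almond oil",
    "water, sodium laureth sulfate, fragrance",
    "glycerin, titanium dioxide, fragrance, water",
    "water, cetyl alcohol, glycerin, parfum",
]
labels = ["food"] * 5 + ["beauty"] * 5

pipeline = Pipeline([
    # One binary feature per distinct ingredient.
    ("vectorize", CountVectorizer(tokenizer=split_ingredients,
                                  token_pattern=None, binary=True)),
    ("densify", FunctionTransformer(to_dense, accept_sparse=True)),
    # Assumed dimensionality; the real project may use a different value.
    ("pca", PCA(n_components=2)),
    ("svc", SVC()),
])

# Assumed search space over SVM hyperparameters.
param_grid = {"svc__C": [0.1, 1, 10], "svc__kernel": ["linear", "rbf"]}
search = GridSearchCV(pipeline, param_grid, cv=2)
search.fit(ingredient_lists, labels)

predictions = search.predict(["water, glycerin, fragrance", "oats, sugar, salt"])
print(search.best_params_, list(predictions))
```

Wrapping the vectorizer and PCA in the same pipeline as the classifier means each cross-validation fold fits them on its training split only, which avoids leaking information from the held-out split into the grid search.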

Cumulative Explained Variance vs No. of Components
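A curve of this kind is typically obtained by summing the per-component explained variance ratios of a fitted PCA. The sketch below shows one way to compute it with scikit-learn and NumPy. The random binary matrix stands in for the real ingredient features, and the 95% threshold is an assumption chosen for illustration only.

```python
import numpy as np
from sklearn.decomposition import PCA

# Hypothetical stand-in for the binary product-by-ingredient matrix.
rng = np.random.default_rng(0)
X = rng.integers(0, 2, size=(50, 20)).astype(float)

# Fit PCA with all components, then accumulate the variance ratios.
pca = PCA().fit(X)
cumulative = np.cumsum(pca.explained_variance_ratio_)

# Smallest number of components whose cumulative variance reaches 95%
# (the threshold is an assumption, not a value from the project).
n_components_95 = int(np.searchsorted(cumulative, 0.95) + 1)

print(f"components needed for 95% variance: {n_components_95}")
```

Plotting `cumulative` against the component index gives the cumulative explained variance curve.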

ROC Curve

Precision Recall Curve

2D Data Visualization after PCA

Heat Map After PCA

Poster

References

Swain, P. H., & Hauska, H. (1977). The decision tree classifier: Design and potential. IEEE Transactions on Geoscience Electronics, 15(3), 142-147.
Hosmer Jr, D. W., Lemeshow, S., & Sturdivant, R. X. (2013). Applied logistic regression (Vol. 398). John Wiley & Sons.
Dasu, T., & Johnson, T. (2003). Exploratory data mining and data cleaning. John Wiley & Sons.
Pal, M. (2005). Random forest classifier for remote sensing classification. International journal of remote sensing, 26(1), 217-222.
Pedregosa, F., Varoquaux, G., Gramfort, A., Michel, V., Thirion, B., Grisel, O., Blondel, M., Prettenhofer, P., Weiss, R., Dubourg, V., Vanderplas, J., Passos, A., Cournapeau, D., Brucher, M., Perrot, M., & Duchesnay, E. (2011). Scikit-learn: Machine learning in Python. Journal of Machine Learning Research, 12, 2825-2830.
