This project estimates obesity levels from the eating habits and physical condition of individuals from Mexico, Peru, and Colombia, aged 14 to 61. The dataset comes from the UCI Machine Learning Repository.
The main objective is to explore how well features describing eating habits and physical condition predict an individual's obesity level. Three classification algorithms are used to build the predictive models:
- K-Nearest Neighbors (KNN)
- Random Forest
- LightGBM (LGBM)

Data Analysis & Preprocessing:
- The data consists of multiple features related to the lifestyle, eating habits, and physical conditions of individuals.
- The dataset is cleaned, features are engineered, and the result is preprocessed so that it is compatible with the classification models.
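
The preprocessing steps are not spelled out here, so the following is only a minimal, dependency-free sketch of two common ones: min-max scaling of numeric columns and one-hot encoding of categorical columns. The column names (`age`, `weight_kg`, `snacking`, `label`) and the records are hypothetical placeholders, not the dataset's actual schema.

```python
# Hypothetical records: column names and values are illustrative only.
rows = [
    {"age": 21.0, "weight_kg": 64.0, "snacking": "Sometimes", "label": "Normal_Weight"},
    {"age": 35.0, "weight_kg": 98.0, "snacking": "Frequently", "label": "Obesity_Type_I"},
    {"age": 14.0, "weight_kg": 50.0, "snacking": "no", "label": "Normal_Weight"},
    {"age": 61.0, "weight_kg": 81.0, "snacking": "Sometimes", "label": "Overweight_Level_I"},
]

NUMERIC = ["age", "weight_kg"]
CATEGORICAL = ["snacking"]


def fit_preprocessor(rows):
    """Learn the min/max of each numeric column and the categories of each categorical one."""
    ranges = {c: (min(r[c] for r in rows), max(r[c] for r in rows)) for c in NUMERIC}
    categories = {c: sorted({r[c] for r in rows}) for c in CATEGORICAL}
    return ranges, categories


def transform(row, ranges, categories):
    """Return a flat numeric feature vector for one record."""
    features = []
    for c in NUMERIC:
        lo, hi = ranges[c]
        # Min-max scale to [0, 1]; a constant column maps to 0.0.
        features.append((row[c] - lo) / (hi - lo) if hi > lo else 0.0)
    for c in CATEGORICAL:
        # One-hot encode: 1.0 for the matching category, 0.0 elsewhere.
        features.extend(1.0 if row[c] == cat else 0.0 for cat in categories[c])
    return features


ranges, categories = fit_preprocessor(rows)
X = [transform(r, ranges, categories) for r in rows]
y = [r["label"] for r in rows]
```

Scaling matters most for KNN, which compares records by distance, so a column with a large numeric range would otherwise dominate the result. In practice the ranges and categories should be learned from the training split only and then reused on the test split.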

Model Building:
- The project implements three different classification algorithms: KNN, Random Forest, and LightGBM.
- Each model's hyperparameters are tuned to improve predictive performance.
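
The tuning method is not stated, so as an illustration only, here is a small dependency-free sketch of one common approach: a grid search over the KNN hyperparameter `k`, scored by cross-validated accuracy. The synthetic data, the grid values, and the helper names (`knn_predict`, `cv_accuracy`) are assumptions made for this example, not the project's actual code.

```python
import math
import random
from collections import Counter


def knn_predict(train_X, train_y, x, k):
    """Predict the majority label among the k nearest training points."""
    nearest = sorted(zip(train_X, train_y), key=lambda pair: math.dist(pair[0], x))[:k]
    return Counter(label for _, label in nearest).most_common(1)[0][0]


def cv_accuracy(X, y, k, n_folds=5):
    """Mean KNN accuracy over n_folds cross-validation folds."""
    scores = []
    for fold in range(n_folds):
        # Every n_folds-th sample, starting at `fold`, is held out for testing.
        test_idx = set(range(fold, len(X), n_folds))
        train_X = [x for i, x in enumerate(X) if i not in test_idx]
        train_y = [t for i, t in enumerate(y) if i not in test_idx]
        hits = sum(knn_predict(train_X, train_y, X[i], k) == y[i] for i in test_idx)
        scores.append(hits / len(test_idx))
    return sum(scores) / n_folds


# Synthetic two-class data standing in for the preprocessed features.
rng = random.Random(0)
X = [[rng.gauss(c, 1.0), rng.gauss(c, 1.0)] for c in (0.0, 3.0) for _ in range(40)]
y = [label for label in ("low", "high") for _ in range(40)]

# Evaluate each candidate k and keep the one with the best mean accuracy.
grid = [1, 3, 5, 7, 9]
results = {k: cv_accuracy(X, y, k) for k in grid}
best_k = max(results, key=results.get)
```

The same pattern carries over to Random Forest and LightGBM: only the grid changes, for example to the number of trees or the maximum depth. Library implementations of grid search typically automate this loop.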

Performance Evaluation:
- Various performance metrics such as accuracy, precision, recall, and F1-score are used to evaluate the effectiveness of the models.
- Visualizations are generated to compare the performance of the models and to identify the most significant features influencing obesity levels.
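
For reference, the four metrics can be computed directly from true and predicted labels. The sketch below assumes macro averaging (the unweighted mean over classes) for precision, recall, and F1; the averaging scheme the project actually uses is not stated, and the labels are made up for the example.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy plus macro-averaged precision, recall, and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    pairs = list(zip(y_true, y_pred))
    precisions, recalls, f1s = [], [], []
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)  # true positives
        fp = sum(t != label and p == label for t, p in pairs)  # false positives
        fn = sum(t == label and p != label for t, p in pairs)  # false negatives
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    n = len(labels)
    return {
        "accuracy": sum(t == p for t, p in pairs) / len(pairs),
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }


# Made-up labels for illustration.
y_true = ["normal", "normal", "overweight", "obese", "obese", "obese"]
y_pred = ["normal", "overweight", "overweight", "obese", "obese", "normal"]
metrics = classification_metrics(y_true, y_pred)
```

With several obesity classes of unequal size, accuracy alone can hide poor performance on a small class, which is why the per-class metrics are reported alongside it.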

Feature Importance:
- The project also examines feature importance to show which factors contribute most to the predicted obesity level.
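
How importance is computed is not stated. Tree-based models such as Random Forest and LightGBM expose built-in importances, but as a model-agnostic illustration, the sketch below uses permutation importance instead: shuffle one feature column at a time and measure how much accuracy drops. The synthetic data and the `predict` stand-in are assumptions for this example only.

```python
import random


def accuracy(predict, X, y):
    """Fraction of samples for which predict(x) matches the true label."""
    return sum(predict(x) == t for x, t in zip(X, y)) / len(y)


def permutation_importance(predict, X, y, n_repeats=10, seed=0):
    """Mean drop in accuracy when each feature column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(predict, X, y)
    importances = []
    for j in range(len(X[0])):
        drops = []
        for _ in range(n_repeats):
            # Shuffle column j only, leaving every other column intact.
            column = [row[j] for row in X]
            rng.shuffle(column)
            X_perm = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            drops.append(baseline - accuracy(predict, X_perm, y))
        importances.append(sum(drops) / n_repeats)
    return importances


# Synthetic data: feature 0 determines the label, feature 1 is pure noise.
rng = random.Random(1)
X = [[rng.uniform(0, 1), rng.uniform(0, 1)] for _ in range(200)]
y = ["high" if row[0] > 0.5 else "low" for row in X]


def predict(x):
    # Stand-in for a fitted classifier; it only looks at feature 0.
    return "high" if x[0] > 0.5 else "low"


importances = permutation_importance(predict, X, y)
```

Here shuffling feature 0 breaks the link between input and label, so its importance is large, while shuffling feature 1 changes nothing. Importances should ideally be measured on held-out data so that they reflect generalization rather than memorization.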

Algorithms Used:
- K-Nearest Neighbors (KNN)
- Random Forest
- LightGBM (LGBM)

Performance Metrics:
- Accuracy
- Precision
- Recall
- F1-Score

Visualization:
- Performance comparisons of the models.
- Feature importance analysis.