This project was developed for a Kaggle competition within the Data Science for Decision Making Master's program at the Barcelona School of Economics.
Using the player data from FIFA 2022, the project builds a classification model that recommends the most suitable playing position for each player. By examining how players' physical attributes relate to on-field performance, it aims to give coaching staff actionable insights for player positioning and skill development.
- Data Preprocessing and Quality Assurance: Clean the FIFA 2022 player dataset, addressing missing values, outliers, and inconsistencies so that subsequent analyses are accurate and reliable.
- Feature Creation and Engineering: Use domain knowledge to craft new features from the existing dataset that improve the model's predictive power and capture nuanced player attributes.
- Exploratory Data Analysis (EDA): Analyze the most influential features in depth, examining correlations between physical attributes and playing positions to guide feature selection and model development.
- Cross-Validation Strategy: Implement robust cross-validation techniques to mitigate overfitting and improve the model's generalization performance.
- Model Selection and Justification: Deploy at least two machine learning models seen in class, such as Random Forest and Gradient Boosting. Explain the rationale behind each model choice, considering its ability to handle non-linearity and feature interactions as well as its overall performance.
- Performance Evaluation: Use appropriate evaluation metrics such as accuracy, precision, recall, and F1-score to assess the models' classification performance.
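
The cross-validation, model comparison, and evaluation steps above can be sketched roughly as follows. This is a minimal illustration with scikit-learn, not the project's actual code: the data is a synthetic stand-in generated with `make_classification` (the real FIFA 2022 features and position labels are not loaded here), and the number of classes, the hyperparameters, the 5% missing-value rate, and the median imputation are all assumptions chosen for the example.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier, RandomForestClassifier
from sklearn.impute import SimpleImputer
from sklearn.model_selection import StratifiedKFold, cross_validate
from sklearn.pipeline import make_pipeline

# Synthetic stand-in for the player data (assumption: 4 position classes,
# 12 numeric attributes). Replace with the real features and labels.
X, y = make_classification(
    n_samples=400, n_features=12, n_informative=8,
    n_classes=4, random_state=0,
)

# Knock out about 5% of the values to mimic missing entries in the raw data.
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan

# Two candidate models, as in the list above. The imputer lives inside each
# pipeline so that it is re-fitted on the training part of every fold.
models = {
    "random_forest": make_pipeline(
        SimpleImputer(strategy="median"),
        RandomForestClassifier(n_estimators=50, random_state=0),
    ),
    "gradient_boosting": make_pipeline(
        SimpleImputer(strategy="median"),
        GradientBoostingClassifier(n_estimators=50, random_state=0),
    ),
}

# Stratified folds keep the class proportions similar across splits.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)
scoring = ["accuracy", "precision_macro", "recall_macro", "f1_macro"]

# Mean score per metric across the 5 folds, for each model.
results = {}
for name, model in models.items():
    scores = cross_validate(model, X, y, cv=cv, scoring=scoring)
    results[name] = {m: scores[f"test_{m}"].mean() for m in scoring}

for name, metrics in results.items():
    print(name, {m: round(v, 3) for m, v in metrics.items()})
```

Macro-averaged precision, recall, and F1 weight every position equally, which may be a reasonable default if some positions are much rarer than others; weighted averages are an alternative if overall accuracy matters more.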
pip install -r requirements.txt
- Viktoriia Yuzkiv
- Maëlys Boudier
- Miguel Handt Fueyo