- Oliver Gatland
- Arianna Michelangelo
- Pere Pericot
- Edward Monbiot
As the newly appointed data science team at a top-tier football club, our role is to use data analytics to strengthen the club's talent development strategy. Using data from FIFA 2022, this project builds a classification model that recommends the most suitable playing position for each youth player. By analyzing players' physical attributes and how they relate to on-field performance, the project gives coaching staff actionable insights to guide player positioning and skill development.
- Data Preprocessing and Quality Assurance: Implementation of rigorous data cleansing and preprocessing.
- Exploratory Data Analysis (EDA): Employing statistical and visualization techniques for insights.
- Feature Engineering: Creating and selecting features (KNN).
- Model Development and Evaluation: Training and evaluating logistic regression, RandomForest, LightGBM, and ensemble models.
- Cross-Validation Techniques: Applying StratifiedKFold and LeaveOneOut to check the robustness of the models.
- Problem-Solving: Addressing a real-world sports analytics challenge with actionable data insights.
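The model evaluation and cross-validation steps listed above can be sketched roughly as follows. This is a minimal illustration, not the project's actual pipeline: the data is synthetic (generated with `make_classification` as a stand-in for the FIFA 2022 player attributes), the four classes are hypothetical position groups, and the hyperparameters are arbitrary. LightGBM is left out so the sketch depends only on Scikit-Learn.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, VotingClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the player attribute table (NOT the FIFA 2022 data):
# 300 "players", 10 numeric attributes, 4 hypothetical position classes.
X, y = make_classification(
    n_samples=300,
    n_features=10,
    n_informative=6,
    n_redundant=0,
    n_classes=4,
    n_clusters_per_class=1,
    random_state=0,
)

# Candidate models; logistic regression is scaled first so it converges cleanly.
log_reg = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
forest = RandomForestClassifier(n_estimators=100, random_state=0)
models = {
    "logistic_regression": log_reg,
    "random_forest": forest,
    # Simple ensemble: average the predicted class probabilities of both models.
    "soft_voting_ensemble": VotingClassifier(
        estimators=[("lr", log_reg), ("rf", forest)], voting="soft"
    ),
}

# Stratified folds keep the class proportions similar in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

# One accuracy value per fold for each model.
scores = {
    name: cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    for name, model in models.items()
}

for name, fold_scores in scores.items():
    print(f"{name}: mean={fold_scores.mean():.3f} std={fold_scores.std():.3f}")
```

Passing `LeaveOneOut()` from `sklearn.model_selection` as `cv` instead would evaluate each sample in its own fold, which is far slower but can serve as an additional robustness check on small datasets.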
- Languages: Python
- Libraries: Pandas, Seaborn, Matplotlib, Scikit-Learn, LightGBM
- Data Analysis: Statistical analysis and data visualization
- Machine Learning: Implementation of diverse machine learning algorithms
This project is part of our journey toward mastering data science and machine learning in a collaborative team setting. It demonstrates our skills in data processing, analysis, and modelling as applied to a complex, real-world problem.
Detailed instructions for setting up and running the project are provided, ensuring straightforward replication and comprehension.