About The Project

Data Ingestion Service: Scrapes NBA player box scores from basketball-reference.com
Data Processing Service: Processes raw data for model training
Feature Generation Service: Prepares features for the ML model
This pipeline scrapes box score data from basketball-reference.com for any NBA season that runs from October through July. Unusual seasons such as 2019-2020 may require adjusting the scraping configuration at data_pipeline_services/config/data_ingestion/scraping_config.yml
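
To illustrate the October - July month range, here is a minimal sketch of how the months of a season could be enumerated. The function name `season_months` and its parameters are hypothetical and are not taken from this repository or from scraping_config.yml; the actual configuration keys may differ.

```python
from typing import List, Tuple


def season_months(
    start_year: int, first_month: int = 10, last_month: int = 7
) -> List[Tuple[int, int]]:
    """Return (year, month) pairs for one season.

    Hypothetical helper: the season is assumed to start in `first_month`
    of `start_year` and end in `last_month` of the following year.
    """
    months = []
    year, month = start_year, first_month
    end = (start_year + 1, last_month)
    # Tuples compare element by element, so this walks month by month
    # until the end month has been included.
    while (year, month) <= end:
        months.append((year, month))
        month += 1
        if month > 12:
            year, month = year + 1, 1
    return months


if __name__ == "__main__":
    # A regular season: October 2023 through July 2024.
    print(season_months(2023))
    # An unusual season: 2019-2020 games ran into October 2020,
    # so the end month is overridden.
    print(season_months(2019, last_month=10))
```

Keeping the first and last month as parameters mirrors the idea that an unusual season only needs a different range, not different code.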

Uses XGBoost to predict fantasy basketball points
Jupyter notebooks for model development and analysis are stored in the notebooks folder
The trained model is stored in the models folder
Model metrics and configuration are stored in config/model_metadata.yaml:

performance_metrics:
  test_mae: 6.163228800313608
  test_mse: 66.63370955961138
  test_rmse: 8.162947357395574
  test_r2: 0.7177536081562295
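
For readers unfamiliar with these metrics, the sketch below shows how MAE, MSE, RMSE, and R2 are conventionally computed from actual and predicted values. It is an illustration only: the function name `regression_metrics` and the sample numbers are hypothetical, and the project's own evaluation code may compute them differently (for example with a library).

```python
import math
from typing import Dict, Sequence


def regression_metrics(
    y_true: Sequence[float], y_pred: Sequence[float]
) -> Dict[str, float]:
    """Compute standard regression metrics (hypothetical helper)."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n   # mean absolute error
    mse = sum(e * e for e in errors) / n    # mean squared error
    rmse = math.sqrt(mse)                   # root mean squared error
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    ss_res = sum(e * e for e in errors)
    r2 = 1.0 - ss_res / ss_tot              # coefficient of determination
    return {"test_mae": mae, "test_mse": mse, "test_rmse": rmse, "test_r2": r2}


if __name__ == "__main__":
    # Made-up fantasy point totals, for illustration only.
    actual = [10.0, 20.0, 30.0, 40.0]
    predicted = [12.0, 18.0, 33.0, 37.0]
    for name, value in regression_metrics(actual, predicted).items():
        print(f"{name}: {value}")
```

Note that RMSE is the square root of MSE, which is consistent with the values reported above.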
