The goal of this project was to predict the closing stock price of the fictional company Waystar Royco using Regression and Time Series Models.

smruthig/Stock-Price-Forecasting-and-Model-Comparison

Stock Price Forecasting and Model Comparison

Trading today with tomorrow's signals

Developed with the software and tools below.

scikitlearn Jupyter SciPy pandas NumPy


Overview

The goal of this project was to predict the closing stock price of the fictional company Waystar Royco (WAYA US) from July 30, 2021, to September 10, 2021. Historical stock price data from August 14, 2015, to July 29, 2021, was provided, including opening price, high price, low price, closing price, and trading volume for each day.

The task was to use regression and time series modeling techniques to make predictions, compare the models, and determine which is best suited to this type of stock price forecasting. Because Waystar Royco is a fictional company, external factors beyond the provided data should not be considered.

Original Data for the project: https://www.kaggle.com/competitions/ue19cs312-assignment/data


Results

  • Data exploration showed no null values and no significant outliers. The features were highly correlated.
  • PCA was tried for dimensionality reduction, but two components already explained all of the variability, so it was not needed.
  • The data was scaled and normalized before modeling.
  • Regression models tried: Linear Regression, Ridge, Lasso, Kernel Ridge, and KNN. Linear Regression performed best.
  • Time series models tried: ARIMA, SARIMAX, and Holt-Winters. Of these, SARIMAX gave the lowest RMSE.
  • The best SARIMAX parameters were (2,1,1)(2,1,1) with a seasonal period of m=52.
  • Regression beat the time series models overall on the performance metrics.
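
The PCA check mentioned in the list above can be sketched roughly as follows. This is a minimal illustration and not the notebook's actual code: the data is synthetic (a random walk standing in for the Kaggle prices), and the helper `make_synthetic_ohlcv` and the feature layout (open, high, low, volume) are assumptions made for this example.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.preprocessing import StandardScaler


def make_synthetic_ohlcv(n_days=500, seed=0):
    """Hypothetical stand-in for the price data (NOT the Kaggle data)."""
    rng = np.random.default_rng(seed)
    # Closing price as a random walk around 100.
    close = 100.0 + np.cumsum(rng.normal(0.0, 1.0, n_days))
    open_ = close + rng.normal(0.0, 0.5, n_days)
    high = np.maximum(open_, close) + np.abs(rng.normal(0.0, 0.5, n_days))
    low = np.minimum(open_, close) - np.abs(rng.normal(0.0, 0.5, n_days))
    volume = rng.lognormal(mean=12.0, sigma=0.3, size=n_days)
    features = np.column_stack([open_, high, low, volume])
    return features, close


X, y = make_synthetic_ohlcv()

# Standardize first so that volume does not dominate purely by scale.
X_scaled = StandardScaler().fit_transform(X)

# Fit PCA with all components and inspect the cumulative explained variance.
pca = PCA().fit(X_scaled)
cumulative = np.cumsum(pca.explained_variance_ratio_)
print("Cumulative explained variance:", np.round(cumulative, 4))
```

When the price columns are almost perfectly correlated, as here, the first few components carry nearly all of the variance, which is the kind of result that makes further dimensionality reduction unnecessary for so few features.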

Analysis

The linear regression model was simpler, avoided overfitting the seasonal patterns, and handled the fluctuations better than the time series models. It had an RMSE of around 1.45 on the test set, while SARIMAX had a much higher RMSE.

Even after SARIMAX was tuned, the regression model was more robust. This indicates that classical regression is well suited to this type of stock price forecasting problem, where a linear combination of the open, high, and low prices and the volume provides a good fit.
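
As a rough illustration of the comparison described above, the sketch below fits the five regression models on open, high, low, and volume, and scores each by RMSE on a chronological hold-out. The data is synthetic, and the hyperparameters (`alpha`, `n_neighbors`, the RBF kernel) and the 80/20 split are assumptions for this example, so the numbers it prints will not match the RMSE reported here.

```python
import numpy as np
from sklearn.kernel_ridge import KernelRidge
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.metrics import mean_squared_error
from sklearn.neighbors import KNeighborsRegressor
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the price history (NOT the Kaggle data).
rng = np.random.default_rng(42)
n_days = 600
close = 100.0 + np.cumsum(rng.normal(0.0, 1.0, n_days))
open_ = close + rng.normal(0.0, 0.5, n_days)
high = np.maximum(open_, close) + np.abs(rng.normal(0.0, 0.5, n_days))
low = np.minimum(open_, close) - np.abs(rng.normal(0.0, 0.5, n_days))
volume = rng.lognormal(mean=12.0, sigma=0.3, size=n_days)

X = np.column_stack([open_, high, low, volume])
y = close

# Chronological split: no shuffling, the last 20% of days are the test set.
split = int(0.8 * n_days)
X_train, X_test = X[:split], X[split:]
y_train, y_test = y[:split], y[split:]

# Hyperparameters below are illustrative guesses, not tuned values.
models = {
    "LinearRegression": LinearRegression(),
    "Ridge": Ridge(alpha=1.0),
    "Lasso": Lasso(alpha=0.1),
    "KernelRidge": KernelRidge(kernel="rbf", alpha=1.0),
    "KNN": KNeighborsRegressor(n_neighbors=5),
}

rmse_by_model = {}
for name, model in models.items():
    # Scale inside the pipeline so the scaler is fit on training data only.
    pipeline = make_pipeline(StandardScaler(), model)
    pipeline.fit(X_train, y_train)
    predictions = pipeline.predict(X_test)
    rmse_by_model[name] = float(np.sqrt(mean_squared_error(y_test, predictions)))

for name, rmse in sorted(rmse_by_model.items(), key=lambda item: item[1]):
    print(f"{name:>16}: RMSE = {rmse:.3f}")
```

Keeping the split chronological matters for price data: a shuffled split would let the models see days that come after the ones they are asked to predict.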

Time series models should not be ruled out, though, as they can capture seasonality and cyclic trends. With more complex data or longer time spans, SARIMAX may start to outperform regression. Overall, the analysis shows regression to be the better approach for now, but both should be considered depending on the structure of the stock data.


Repository Structure

└── Stock-Price-Forecasting-and-Model-Comparison/
    ├── README.md
    ├── data
    │   ├── test.csv
    │   └── train.csv
    ├── gridsearch_results.txt
    ├── notebooks
    │   ├── gridsearch_cv_script.ipynb
    │   └── submission.ipynb
    ├── requirements.txt
    └── submission.csv
