diff --git a/_freeze/assignments/A3/execute-results/html.json b/_freeze/assignments/A3/execute-results/html.json
index ce8e71f..699e437 100644
--- a/_freeze/assignments/A3/execute-results/html.json
+++ b/_freeze/assignments/A3/execute-results/html.json
@@ -1,9 +1,11 @@
 {
- "hash": "384b347a77fbf3979a342be45278343a",
+ "hash": "83e21a1c36ba4c572e69f73b51b6309b",
 "result": {
 "engine": "knitr",
- "markdown": "---\ntitle: Assignment 3\n---\n\nThis assignment will use the same data that you will use in the [retail project](Project.qmd) later in the semester. Each student will use a different time series, selected using their student ID number as follows.\n\n```r\nlibrary(fpp3)\nget_my_data <- function(student_id) {\n set.seed(student_id)\n all_data <- readr::read_rds(\"https://bit.ly/monashretaildata\")\n while(TRUE) {\n retail <- filter(all_data, `Series ID` == sample(`Series ID`, 1))\n if(!any(is.na(fill_gaps(retail)$Turnover))) return(retail)\n }\n}\n# Replace the argument with your student ID\nretail <- get_my_data(12345678)\n```\n\nUse a training set up to and including 2018.\n\n* What transformations (Box-Cox and/or differencing) would be required to make the data stationary? You should use a unit-root test as part of the discussion.\n* Use a plot of the ACF and PACF of the (possibly differenced) data to determine two plausible models for this data set.\n* Fit both models, along with an automatically chosen model, and produce forecasts for 2019--2022.\n* Which model is best based on AIC? Which model is best based on the test set RMSE? Which do you think is best to use for future forecasts? Why?\n* Check the residuals from your preferred model, using an ACF plot and a Ljung-Box test. Do the residuals appear to be white noise?\n\nSubmit a Quarto (`qmd`) file which carries out the above analysis. You need to submit one file which implements all steps above. You may use this file as a starting point.\n\n


Due: 16 May 2025
  Submit  
\n",
- "supporting": [],
+ "markdown": "---\ntitle: Assignment 3\n---\n\nThis assignment will use the same data that you will use in the [retail project](Project.qmd) later in the semester. Each student will use a different time series, selected using their student ID number as follows.\n\n```r\nlibrary(fpp3)\nget_my_data <- function(student_id) {\n set.seed(student_id)\n all_data <- readr::read_rds(\"https://bit.ly/monashretaildata\")\n while(TRUE) {\n retail <- filter(all_data, `Series ID` == sample(`Series ID`, 1))\n if(!any(is.na(fill_gaps(retail)$Turnover))) return(retail)\n }\n}\n# Replace the argument with your student ID\nretail <- get_my_data(12345678)\n```\n\nUse a training set up to and including 2018.\n\n* What transformations (Box-Cox and/or differencing) would be required to make the data stationary? You should use a unit-root test as part of the discussion.\n* Use a plot of the ACF and PACF of the (possibly differenced) data to determine two plausible ARIMA models for this data set.\n* Fit both models, along with an automatically chosen model, and produce forecasts for 2019--2022.\n* Which model is best based on AIC? Which model is best based on the test set RMSE? Which do you think is best to use for future forecasts? Why?\n* Check the residuals from your preferred model, using an ACF plot and a Ljung-Box test. Do the residuals appear to be white noise?\n\nSubmit a Quarto (`qmd`) file which carries out the above analysis. You need to submit one file which implements all steps above. You may use this file as a starting point.\n\n


Due: 16 May 2025
  Submit  
\n", + "supporting": [ + "A3_files" + ], "filters": [ "rmarkdown/pagebreak.lua" ], diff --git a/_freeze/assignments/Project/execute-results/html.json b/_freeze/assignments/Project/execute-results/html.json index cfc4699..9e5fcc1 100644 --- a/_freeze/assignments/Project/execute-results/html.json +++ b/_freeze/assignments/Project/execute-results/html.json @@ -1,9 +1,11 @@ { - "hash": "d12847e1959878cc9a8c39b1eefa8900", + "hash": "c254cf2950b657b5686338f94ca6d3fe", "result": { "engine": "knitr", - "markdown": "---\ntitle: Retail Project\n---\n\n**Objective:** To forecast a real time series using ETS and ARIMA models.\n\n**Data:** The data are monthly measures of retail trade volume in Australia, [obtained from the ABS](https://www.abs.gov.au/methodologies/retail-trade-australia-methodology/jan-2024). Each student will be use a different time series, selected using their student ID number as follows. This is the same series that you used in previous assignments.\n\n```r\nlibrary(fpp3)\nget_my_data <- function(student_id) {\n set.seed(student_id)\n all_data <- readr::read_rds(\"https://bit.ly/monashretaildata\")\n while(TRUE) {\n retail <- filter(all_data, `Series ID` == sample(`Series ID`, 1))\n if(!any(is.na(fill_gaps(retail)$Turnover))) return(retail)\n }\n}\n# Replace the argument with your student ID\nretail <- get_my_data(12345678)\n```\n\n**Assignment value:** This assignment is worth 10% of the overall unit assessment. You may copy and paste material from previous assignments, but you must take into account any feedback that you have received on these assignments.\n\n**Report:**\n\nYou should produce forecasts of the series using ETS and ARIMA models. Write a report in Quarto format of your analysis explaining carefully what you have done and why you have done it. Your report should include the following elements.\n\n* Produce some plots of your series, and describe what you learn from each plot. 
[2 marks]\n* Discuss the statistical features of the data, including the effect of COVID-19 on your series. [2 marks]\n* Find an appropriate Box-Cox transformation for your data and explain why you have chosen the particular transformation parameter $\\lambda$.\n* Produce a plot of an STL decomposition of the transformed data. What do you learn from the plot?\n* What differencing would be required to make the data stationary? You should use a unit-root test as part of the discussion. [4 marks]\n* A description of the methodology used to create a short-list of appropriate ARIMA models and ETS models. Include discussion of AIC values as well as results from applying the models to a test-set consisting of the last 24 months of the data provided. [6 marks]\n* Choose one ARIMA model and one ETS model based on this analysis and show parameter estimates, residual diagnostics, forecasts and prediction intervals for both models. Diagnostic checking for both models should include ACF graphs and the Ljung-Box test. [8 marks]\n* Comparison of the results from each of your preferred models. Which method do you think gives the better forecasts? Explain with reference to the test-set. [2 marks]\n* Apply your two chosen models to the full data set, re-estimating the parameters but not changing the model structure. Produce out-of-sample point forecasts and 80% prediction intervals for each model for two years past the end of the data provided. [4 marks]\n* Obtain up-to-date data from the [ABS website](https://www.abs.gov.au/statistics/industry/retail-and-wholesale-trade/retail-trade-australia) (Table 11). You may need to use the previous release of data, rather than the latest release. Compare your forecasts with the actual numbers. How well did you do? [5 marks]\n* A discussion of benefits and limitations of the models for your data. [3 marks]\n* Graphs should be properly labelled, including appropriate units of measurement. 
[3 marks]\n\n**Notes**\n\n* Your submission must include the Quarto file (`.qmd`), and should run without error.\n* There will be a 5 marks penalty if file does not run without error.\n* When using the updated ABS data set, do not edit the downloaded file in any way.\n* There is no need to provide the updated ABS data with your submission.\n\n


Due: 30 May 2025
  Submit  
\n", - "supporting": [], + "markdown": "---\ntitle: Retail Project\n---\n\n**Objective:** To forecast a real time series using ETS and ARIMA models.\n\n**Data:** The data are monthly measures of retail trade volume in Australia, [obtained from the ABS](https://www.abs.gov.au/methodologies/retail-trade-australia-methodology/jan-2024). Each student will be use a different time series, selected using their student ID number as follows. This is the same series that you used in previous assignments.\n\n```r\nlibrary(fpp3)\nget_my_data <- function(student_id) {\n set.seed(student_id)\n all_data <- readr::read_rds(\"https://bit.ly/monashretaildata\")\n while(TRUE) {\n retail <- filter(all_data, `Series ID` == sample(`Series ID`, 1))\n if(!any(is.na(fill_gaps(retail)$Turnover))) return(retail)\n }\n}\n# Replace the argument with your student ID\nretail <- get_my_data(12345678)\n```\n\n**Assignment value:** This assignment is worth 12% of the overall unit assessment. You may copy and paste material from previous assignments, but you must take into account any feedback that you have received on these assignments.\n\n**Report:**\n\nYou should produce forecasts of the series using ETS and ARIMA models. Write a report in Quarto format of your analysis explaining carefully what you have done and why you have done it. Your report should include the following elements.\n\n* Produce some plots of your series, and describe what you learn from each plot. [2 marks]\n* Discuss the statistical features of the data, including the effect of COVID-19 on your series. [2 marks]\n* Find an appropriate Box-Cox transformation for your data and explain why you have chosen the particular transformation parameter $\\lambda$.\n* Produce a plot of an STL decomposition of the transformed data. What do you learn from the plot?\n* What differencing would be required to make the data stationary? You should use a unit-root test as part of the discussion. 
[4 marks]\n* A description of the methodology used to create a short-list of appropriate ARIMA models and ETS models. Include discussion of AIC values as well as results from applying the models to a test-set consisting of the last 24 months of the data provided. [6 marks]\n* Choose one ARIMA model and one ETS model based on this analysis and show parameter estimates, residual diagnostics, forecasts and prediction intervals for both models. Diagnostic checking for both models should include ACF graphs and the Ljung-Box test. [8 marks]\n* Comparison of the results from each of your preferred models. Which method do you think gives the better forecasts? Explain with reference to the test-set. [2 marks]\n* Apply your two chosen models to the full data set, re-estimating the parameters but not changing the model structure. Produce out-of-sample point forecasts and 80% prediction intervals for each model for two years past the end of the data provided. [4 marks]\n* Obtain up-to-date data from the [ABS website](https://www.abs.gov.au/statistics/industry/retail-and-wholesale-trade/retail-trade-australia) (Table 11). You may need to use the previous release of data, rather than the latest release. Compare your forecasts with the actual numbers. How well did you do? [5 marks]\n* A discussion of benefits and limitations of the models for your data. [3 marks]\n* Graphs should be properly labelled, including appropriate units of measurement. [3 marks]\n\n**Notes**\n\n* Your submission must include the Quarto file (`.qmd`), and should run without error.\n* There will be a 5 marks penalty if file does not run without error.\n* When using the updated ABS data set, do not edit the downloaded file in any way.\n* There is no need to provide the updated ABS data with your submission.\n\n


Due: 30 May 2025
  Submit  
\n",
+ "supporting": [
+ "Project_files"
+ ],
 "filters": [
 "rmarkdown/pagebreak.lua"
 ],
diff --git a/assignments/A3.qmd b/assignments/A3.qmd
index 5cc9559..c85a5d3 100644
--- a/assignments/A3.qmd
+++ b/assignments/A3.qmd
@@ -21,7 +21,7 @@ retail <- get_my_data(12345678)
 Use a training set up to and including 2018.
 
 * What transformations (Box-Cox and/or differencing) would be required to make the data stationary? You should use a unit-root test as part of the discussion.
-* Use a plot of the ACF and PACF of the (possibly differenced) data to determine two plausible models for this data set.
+* Use a plot of the ACF and PACF of the (possibly differenced) data to determine two plausible ARIMA models for this data set.
 * Fit both models, along with an automatically chosen model, and produce forecasts for 2019--2022.
 * Which model is best based on AIC? Which model is best based on the test set RMSE? Which do you think is best to use for future forecasts? Why?
 * Check the residuals from your preferred model, using an ACF plot and a Ljung-Box test. Do the residuals appear to be white noise?
diff --git a/assignments/Project.qmd b/assignments/Project.qmd
index 050e64b..f50bcf4 100644
--- a/assignments/Project.qmd
+++ b/assignments/Project.qmd
@@ -20,7 +20,7 @@ get_my_data <- function(student_id) {
 retail <- get_my_data(12345678)
 ```
 
-**Assignment value:** This assignment is worth 10% of the overall unit assessment. You may copy and paste material from previous assignments, but you must take into account any feedback that you have received on these assignments.
+**Assignment value:** This assignment is worth 12% of the overall unit assessment. You may copy and paste material from previous assignments, but you must take into account any feedback that you have received on these assignments.
 
 **Report:**