Recompiled

numbats · Mar 12, 2024 · 6465726
1 parent 1c3eed2
commit 6465726
Showing 47 changed files with 36 additions and 32 deletions.
diff --git a/_freeze/assignments/A1/execute-results/html.json b/_freeze/assignments/A1/execute-results/html.json
@@ -2,7 +2,7 @@
   "hash": "52ed2c72c91ae4f8daa8336e7db58c5e",
   "result": {
     "engine": "knitr",
-    "markdown": "---\ntitle: Assignment 1\n---\n\n\n**You must provide forecasts for the following items:**\n\n  1. Google closing stock price on 20 March 2024 [[Data](https://finance.yahoo.com/quote/GOOG/)].\n  2. Maximum temperature at Melbourne airport on 10 April 2024 [[Data](http://www.bom.gov.au/climate/dwo/IDCJDW3049.latest.shtml)].\n  3. The difference in points (Collingwood minus Essendon) scored in the AFL match between Collingwood and Essendon for the Anzac Day clash. 25 April 2024 [[Data](https://en.wikipedia.org/wiki/Anzac_Day_match)].\n  4. The seasonally adjusted estimate of total employment for April 2024. ABS CAT 6202, to be released around mid May 2024 [[Data](https://www.abs.gov.au/statistics/labour/employment-and-unemployment/labour-force-australia/latest-release)].\n  5. Google closing stock price on 22 May 2024 [[Data](https://finance.yahoo.com/quote/GOOG/)].\n\n**For each of these, give a point forecast and an 80% prediction interval, and explain in a couple of sentences how each was obtained.**\n\n* The [Data] links give you possible data to start with, but you are free to use any data you like.\n* There is no need to use any fancy models or sophisticated methods. Simple is better for this assignment. The methods you use should be understandable to any high school student.\n* Full marks will be awarded if you submit the required information, and are able to meaningfully justify your results in a couple of sentences in each case.\n* Once the true values in each case are available, we will come back to this exercise and see who did the best using the scoring method described in class.\n* The student with the lowest score is the winner of our forecasting competition, and will win a $50 cash prize.\n* The assignment mark is not dependent on your score.\n\n\n<br><br><hr><b>Due: 8 March 2024</b><br><a href=https://learning.monash.edu/mod/quiz/view.php?id=2298116 class = 'badge badge-large badge-blue'><font size='+2'>&nbsp;&nbsp;<b>Submit</b>&nbsp;&nbsp;</font><br></a>\n",
+    "markdown": "---\ntitle: Assignment 1\n---\n\n\n\n\n**You must provide forecasts for the following items:**\n\n  1. Google closing stock price on 20 March 2024 [[Data](https://finance.yahoo.com/quote/GOOG/)].\n  2. Maximum temperature at Melbourne airport on 10 April 2024 [[Data](http://www.bom.gov.au/climate/dwo/IDCJDW3049.latest.shtml)].\n  3. The difference in points (Collingwood minus Essendon) scored in the AFL match between Collingwood and Essendon for the Anzac Day clash. 25 April 2024 [[Data](https://en.wikipedia.org/wiki/Anzac_Day_match)].\n  4. The seasonally adjusted estimate of total employment for April 2024. ABS CAT 6202, to be released around mid May 2024 [[Data](https://www.abs.gov.au/statistics/labour/employment-and-unemployment/labour-force-australia/latest-release)].\n  5. Google closing stock price on 22 May 2024 [[Data](https://finance.yahoo.com/quote/GOOG/)].\n\n**For each of these, give a point forecast and an 80% prediction interval, and explain in a couple of sentences how each was obtained.**\n\n* The [Data] links give you possible data to start with, but you are free to use any data you like.\n* There is no need to use any fancy models or sophisticated methods. Simple is better for this assignment. The methods you use should be understandable to any high school student.\n* Full marks will be awarded if you submit the required information, and are able to meaningfully justify your results in a couple of sentences in each case.\n* Once the true values in each case are available, we will come back to this exercise and see who did the best using the scoring method described in class.\n* The student with the lowest score is the winner of our forecasting competition, and will win a $50 cash prize.\n* The assignment mark is not dependent on your score.\n\n\n\n\n<br><br><hr><b>Due: 8 March 2024</b><br><a href=https://learning.monash.edu/mod/quiz/view.php?id=2298116 class = 'badge badge-large badge-blue'><font size='+2'>&nbsp;&nbsp;<b>Submit</b>&nbsp;&nbsp;</font><br></a>\n",
     "supporting": [],
     "filters": [
       "rmarkdown/pagebreak.lua"

diff --git a/_freeze/assignments/A2/execute-results/html.json b/_freeze/assignments/A2/execute-results/html.json
@@ -2,7 +2,7 @@
   "hash": "18d74fd2481bfeae1e9e0303e96f3d0b",
   "result": {
     "engine": "knitr",
-    "markdown": "---\ntitle: Assignment 2\n---\n\n\nThis assignment will use the same data that you will use in the [retail project](Project.qmd) later in semester. Each student will use a different time series, selected using their student ID number as follows.\n\n```r\n# Replace the seed with your student ID\nset.seed(12345678)\nretail <- readr::read_rds(\"https://bit.ly/monashretaildata\") |>\n  filter(`Series ID` == sample(`Series ID`, 1))\n```\n\n  1. Plot your time series using the `autoplot()` command. What do you learn from the plot?\n  2. Plot your time series using the `gg_season()` command. What do you learn from the plot?\n  3. Plot your time series using the `gg_subseries()` command. What do you learn from the plot?\n  4. Find an appropriate Box-Cox transformation for your data and explain why you have chosen the particular transformation parameter $\\lambda$.\n  5. Produce a plot of an STL decomposition of the transformed data. What do you learn from the plot?\n\nFor all plots, please use appropriate axis labels and titles.\n\nYou need to submit one Rmarkdown or Quarto file which implements all steps above.\n\nTo receive full marks, the Rmd or qmd file must compile without errors.\n\n\n<br><br><hr><b>Due: 22 March 2024</b><br><a href=https://learning.monash.edu/mod/assign/view.php?id=2034165 class = 'badge badge-large badge-blue'><font size='+2'>&nbsp;&nbsp;<b>Submit</b>&nbsp;&nbsp;</font><br></a>\n",
+    "markdown": "---\ntitle: Assignment 2\n---\n\n\n\n\nThis assignment will use the same data that you will use in the [retail project](Project.qmd) later in semester. Each student will use a different time series, selected using their student ID number as follows.\n\n```r\n# Replace the seed with your student ID\nset.seed(12345678)\nretail <- readr::read_rds(\"https://bit.ly/monashretaildata\") |>\n  filter(`Series ID` == sample(`Series ID`, 1))\n```\n\n  1. Plot your time series using the `autoplot()` command. What do you learn from the plot?\n  2. Plot your time series using the `gg_season()` command. What do you learn from the plot?\n  3. Plot your time series using the `gg_subseries()` command. What do you learn from the plot?\n  4. Find an appropriate Box-Cox transformation for your data and explain why you have chosen the particular transformation parameter $\\lambda$.\n  5. Produce a plot of an STL decomposition of the transformed data. What do you learn from the plot?\n\nFor all plots, please use appropriate axis labels and titles.\n\nYou need to submit one Rmarkdown or Quarto file which implements all steps above.\n\nTo receive full marks, the Rmd or qmd file must compile without errors.\n\n\n\n\n<br><br><hr><b>Due: 22 March 2024</b><br><a href=https://learning.monash.edu/mod/assign/view.php?id=2034165 class = 'badge badge-large badge-blue'><font size='+2'>&nbsp;&nbsp;<b>Submit</b>&nbsp;&nbsp;</font><br></a>\n",
     "supporting": [],
     "filters": [
       "rmarkdown/pagebreak.lua"

diff --git a/_freeze/assignments/A3/execute-results/html.json b/_freeze/assignments/A3/execute-results/html.json
@@ -2,7 +2,7 @@
   "hash": "dd25dc06ba2854b756fac0005bb4a47e",
   "result": {
     "engine": "knitr",
-    "markdown": "---\ntitle: Assignment 3\n---\n\n\nThis assignment will use national population data from 1960 -- 2022. Each student will use a different time series, selected using their student ID number as follows.\n\n```r\n# Replace seed with your student ID\nset.seed(12345678)\npop <- readr::read_rds(\"https://bit.ly/monashpopulationdata\") |>\n  filter(Country == sample(Country, 1))\n```\n\nPopulation should be modelled as a logarithm as it increases exponentially.\n\n1. Using a test set of 2018--2022, fit an ETS model chosen automatically, and three benchmark methods to the training data. Which gives the best forecasts on the test set, based on RMSE?\n2. Check the residuals from the best model using an ACF plot and a Ljung-Box test. Do the residuals appear to be white noise?\n3. Now use time-series cross-validation with a minimum sample size of 15 years, a step size of 1 year, and a forecast horizon of 5 years. Calculate the RMSE of the results. Does it change the conclusion you reach based on the test set?\n4. Which of these two methods of computing accuracy is more reliable? Why?\n\nSubmit an Rmd or qmd file which carries out the above analysis. You need to submit one file which implements all steps above.\n\nTo receive full marks, the Rmd or qmd file must compile without errors.\n\n\n<br><br><hr><b>Due: 12 April 2024</b><br><a href=https://learning.monash.edu/mod/assign/view.php?id=2034169 class = 'badge badge-large badge-blue'><font size='+2'>&nbsp;&nbsp;<b>Submit</b>&nbsp;&nbsp;</font><br></a>\n",
+    "markdown": "---\ntitle: Assignment 3\n---\n\n\n\n\nThis assignment will use national population data from 1960 -- 2022. Each student will use a different time series, selected using their student ID number as follows.\n\n```r\n# Replace seed with your student ID\nset.seed(12345678)\npop <- readr::read_rds(\"https://bit.ly/monashpopulationdata\") |>\n  filter(Country == sample(Country, 1))\n```\n\nPopulation should be modelled as a logarithm as it increases exponentially.\n\n1. Using a test set of 2018--2022, fit an ETS model chosen automatically, and three benchmark methods to the training data. Which gives the best forecasts on the test set, based on RMSE?\n2. Check the residuals from the best model using an ACF plot and a Ljung-Box test. Do the residuals appear to be white noise?\n3. Now use time-series cross-validation with a minimum sample size of 15 years, a step size of 1 year, and a forecast horizon of 5 years. Calculate the RMSE of the results. Does it change the conclusion you reach based on the test set?\n4. Which of these two methods of computing accuracy is more reliable? Why?\n\nSubmit an Rmd or qmd file which carries out the above analysis. You need to submit one file which implements all steps above.\n\nTo receive full marks, the Rmd or qmd file must compile without errors.\n\n\n\n\n<br><br><hr><b>Due: 12 April 2024</b><br><a href=https://learning.monash.edu/mod/assign/view.php?id=2034169 class = 'badge badge-large badge-blue'><font size='+2'>&nbsp;&nbsp;<b>Submit</b>&nbsp;&nbsp;</font><br></a>\n",
     "supporting": [],
     "filters": [
       "rmarkdown/pagebreak.lua"

diff --git a/_freeze/assignments/A4/execute-results/html.json b/_freeze/assignments/A4/execute-results/html.json
@@ -2,7 +2,7 @@
   "hash": "fb25018867f57bd07a76176f4b58a69e",
   "result": {
     "engine": "knitr",
-    "markdown": "---\ntitle: Assignment 4\n---\n\n\n## Background\n\nHere is a function that generates data from an AR(1) model starting with the first value set to 0\n\n```r\ngenerate_ar1 <- function(n = 100, c = 0, phi, sigma = 1) {\n  # Generate errors\n  error <- rnorm(n, mean = 0, sd = sigma)\n  # Set up vector for the response with initial values set to 0\n  y <- rep(0, n)\n  # Generate remaining observations\n  for(i in seq(2, length = n-1)) {\n    y[i] <- c + phi * y[i-1] + error[i]\n  }\n  return(y)\n}\n```\n\nHere `n` is the number of observations to simulate, `c` is the constant, `phi` is the AR coefficient, and `sigma` is the standard deviation of the noise. The following example shows the function being used to generate 50 observations\n\n```r\nlibrary(fpp3)\ntsibble(time = 1:50, y = generate_ar1(n=50, c=1, phi=0.8), index = time) |>\n  autoplot(y)\n```\n\n## Instructions\n\n<ol>\n<li> Modify the `generate_ar1` function to generate data from an ARIMA(p,d,q) model with parameters to be specified by the user. The first line of your function definition should be\n\n  ```r\n  generate_arima <- function(n = 100, d = 0, c = 0, phi = NULL, theta = NULL, sigma = 1)\n  ```\n\n   Here `phi` and `theta` are vectors of AR and MA coefficients. Your function should return a numeric vector of length `n`.\n\n   For example `generate_arima(n = 50, d = 1, c = 2, theta = c(0.4, -0.6))` should return 50 observations generated from the ARIMA(2,1,0) model\n   $$y_t = y_{t-1} + 2 + 0.4\\varepsilon_{t-1} - 0.6\\varepsilon_{t-2} + \\varepsilon_t$$\n   where $\\varepsilon \\sim N(0,1)$.\n\n<li> The noise should be generated using the `rnorm()` function.\n\n<li> Your function should check stationarity and invertibility conditions and return an error if either condition is not satisfied. You can use the `stop()` function to generate an error. The model will be stationary if the following expression returns `TRUE`:\n\n  ```r\n  !any(abs(polyroot(c(1,-phi))) <= 1)\n  ```\n\n  The MA parameters will be invertible if the following expression returns `TRUE`:\n\n  ```r\n  !any(abs(polyroot(c(1,theta))) <= 1)\n  ```\n\n<li> The above function sets the first value of every series to 0. Your function should fix this problem by generating more observations than required and then discarding the first few observations. You will need to consider how many observations to discard, to prevent the returned series from being affected by the initial values.\n\n<li> You may find the `diffinv()` function useful.\n</ol>\n\nPlease submit your solution as a .R file.\n\n\n<br><br><hr><b>Due: 3 May 2024</b><br><a href=https://learning.monash.edu/mod/assign/view.php?id=2034170 class = 'badge badge-large badge-blue'><font size='+2'>&nbsp;&nbsp;<b>Submit</b>&nbsp;&nbsp;</font><br></a>\n",
+    "markdown": "---\ntitle: Assignment 4\n---\n\n\n\n\n## Background\n\nHere is a function that generates data from an AR(1) model starting with the first value set to 0\n\n```r\ngenerate_ar1 <- function(n = 100, c = 0, phi, sigma = 1) {\n  # Generate errors\n  error <- rnorm(n, mean = 0, sd = sigma)\n  # Set up vector for the response with initial values set to 0\n  y <- rep(0, n)\n  # Generate remaining observations\n  for(i in seq(2, length = n-1)) {\n    y[i] <- c + phi * y[i-1] + error[i]\n  }\n  return(y)\n}\n```\n\nHere `n` is the number of observations to simulate, `c` is the constant, `phi` is the AR coefficient, and `sigma` is the standard deviation of the noise. The following example shows the function being used to generate 50 observations\n\n```r\nlibrary(fpp3)\ntsibble(time = 1:50, y = generate_ar1(n=50, c=1, phi=0.8), index = time) |>\n  autoplot(y)\n```\n\n## Instructions\n\n<ol>\n<li> Modify the `generate_ar1` function to generate data from an ARIMA(p,d,q) model with parameters to be specified by the user. The first line of your function definition should be\n\n  ```r\n  generate_arima <- function(n = 100, d = 0, c = 0, phi = NULL, theta = NULL, sigma = 1)\n  ```\n\n   Here `phi` and `theta` are vectors of AR and MA coefficients. Your function should return a numeric vector of length `n`.\n\n   For example `generate_arima(n = 50, d = 1, c = 2, theta = c(0.4, -0.6))` should return 50 observations generated from the ARIMA(2,1,0) model\n   $$y_t = y_{t-1} + 2 + 0.4\\varepsilon_{t-1} - 0.6\\varepsilon_{t-2} + \\varepsilon_t$$\n   where $\\varepsilon \\sim N(0,1)$.\n\n<li> The noise should be generated using the `rnorm()` function.\n\n<li> Your function should check stationarity and invertibility conditions and return an error if either condition is not satisfied. You can use the `stop()` function to generate an error. The model will be stationary if the following expression returns `TRUE`:\n\n  ```r\n  !any(abs(polyroot(c(1,-phi))) <= 1)\n  ```\n\n  The MA parameters will be invertible if the following expression returns `TRUE`:\n\n  ```r\n  !any(abs(polyroot(c(1,theta))) <= 1)\n  ```\n\n<li> The above function sets the first value of every series to 0. Your function should fix this problem by generating more observations than required and then discarding the first few observations. You will need to consider how many observations to discard, to prevent the returned series from being affected by the initial values.\n\n<li> You may find the `diffinv()` function useful.\n</ol>\n\nPlease submit your solution as a .R file.\n\n\n\n\n<br><br><hr><b>Due: 3 May 2024</b><br><a href=https://learning.monash.edu/mod/assign/view.php?id=2034170 class = 'badge badge-large badge-blue'><font size='+2'>&nbsp;&nbsp;<b>Submit</b>&nbsp;&nbsp;</font><br></a>\n",
     "supporting": [],
     "filters": [
       "rmarkdown/pagebreak.lua"