Commit

Updated week4 seminar
robjhyndman committed Mar 20, 2024
1 parent 8ada98a commit e0f1a63
Showing 15 changed files with 211 additions and 34 deletions.
2 changes: 1 addition & 1 deletion _freeze/week4/index/execute-results/html.json
@@ -2,7 +2,7 @@
"hash": "e7f94273921daac99cfd07d9d41f4c79",
"result": {
"engine": "knitr",
"markdown": "---\ntitle: \"Week 4: The forecaster's toolbox\"\n---\n\n::: {.cell}\n\n:::\n\n\n\n\n## What you will learn this week\n\n* Four benchmark forecasting methods that we will use for comparison\n* Fitted values, residuals\n* Forecasting with transformations\n\n## Pre-class activities\n\nRead [Chapter 5 of the textbook](https://otexts.com/fpp3/toolbox.html) and watch all embedded videos\n\n## Exercises (on your own or in tutorial)\n\nComplete Exercises 1-5, 9-10 from [Section 3.7 of the book](https://otexts.com/fpp3/decomposition-exercises.html).\n\n\n\n\n## Slides for seminar\n\n<iframe src='https://docs.google.com/gview?url=https://af.numbat.space/week4/slides.pdf&embedded=true' width='100%' height=465></iframe>\n<a href=https://af.numbat.space/week4/slides.pdf class='badge badge-small badge-red'>Download pdf</a>\n\n## Seminar activities\n\n\n\n * Produce forecasts using an appropriate benchmark method for household wealth (`hh_budget`). Plot the results using `autoplot()`.\n * Produce forecasts using an appropriate benchmark method for Australian takeaway food turnover (`aus_retail`). Plot the results using `autoplot()`.\n\n * Compute seasonal naïve forecasts for quarterly Australian beer production from 1992.\n * Test if the residuals are white noise. What do you conclude?\n\n * Create a training set for household wealth (`hh_budget`) by withholding the last four years as a test set.\n * Fit all the appropriate benchmark methods to the training set and forecast the periods covered by the test set.\n * Compute the accuracy of your forecasts. Which method does best?\n * Repeat the exercise using the Australian takeaway food turnover data (`aus_retail`) with a test set of four years.\n\n\n\n## Assignments\n\n* [Assignment 2](../assignments/A2.qmd) is due on Sunday 24 March.\n* [Assignment 3](../assignments/A3.qmd) is due on Friday 12 April.\n",
"markdown": "---\ntitle: \"Week 4: The forecaster's toolbox\"\n---\n\n::: {.cell}\n\n:::\n\n\n\n\n## What you will learn this week\n\n* Four benchmark forecasting methods that we will use for comparison\n* Fitted values, residuals\n* Forecasting with transformations\n\n## Pre-class activities\n\nRead [Chapter 5 of the textbook](https://otexts.com/fpp3/toolbox.html) and watch all embedded videos\n\n## Exercises (on your own or in tutorial)\n\nComplete Exercises 1-5, 9-10 from [Section 3.7 of the book](https://otexts.com/fpp3/decomposition-exercises.html).\n\n\n\n\n## Slides for seminar\n\n<iframe src='https://docs.google.com/gview?url=https://af.numbat.space/week4/slides.pdf&embedded=true' width='100%' height=465></iframe>\n<a href=https://af.numbat.space/week4/slides.pdf class='badge badge-small badge-red'>Download pdf</a>\n\n## Seminar activities\n\n1. Create a training set for household wealth (`hh_budget`) by withholding the last four years as a test set.\n2. Fit all the appropriate benchmark methods to the training set and forecast the periods covered by the test set.\n3. Compute the accuracy of your forecasts. Which method does best?\n4. Do the residuals from the best method resemble white noise?\n\n\n\n## Assignments\n\n* [Assignment 2](../assignments/A2.qmd) is due on Sunday 24 March.\n* [Assignment 3](../assignments/A3.qmd) is due on Friday 12 April.\n",
"supporting": [],
"filters": [
"rmarkdown/pagebreak.lua"
17 changes: 17 additions & 0 deletions _freeze/week4/slides/execute-results/tex.json
@@ -0,0 +1,17 @@
{
"hash": "ee3371da1701f715a268398f50289e8b",
"result": {
"engine": "knitr",
"markdown": "---\ntitle: ETC3550/ETC5550 Applied forecasting\nauthor: \"Week 4: The Forecasters' toolbox\"\nformat:\n beamer:\n aspectratio: 169\n fontsize: 14pt\n section-titles: false\n knitr:\n opts_chunk:\n dev: \"cairo_pdf\"\n pdf-engine: pdflatex\n fig-width: 7.5\n fig-height: 3.5\n include-in-header: ../header.tex\n---\n\n\n\n::: {.cell}\n\n:::\n\n\n\n## Time series cross-validation {-}\n\n**Traditional evaluation**\n\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](slides_files/figure-beamer/traintest1-1.pdf)\n:::\n:::\n\n\n\n\\pause\n\n**Time series cross-validation**\n\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](slides_files/figure-beamer/tscvggplot1-1.pdf)\n:::\n:::\n\n\n\n## Time series cross-validation {-}\n\n**Traditional evaluation**\n\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](slides_files/figure-beamer/traintest2-1.pdf)\n:::\n:::\n\n\n\n**Time series cross-validation**\n\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](slides_files/figure-beamer/tscvggplot2-1.pdf)\n:::\n:::\n\n\n\n## Time series cross-validation {-}\n\n**Traditional evaluation**\n\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](slides_files/figure-beamer/traintest3-1.pdf)\n:::\n:::\n\n\n\n**Time series cross-validation**\n\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](slides_files/figure-beamer/tscvggplot3-1.pdf)\n:::\n:::\n\n\n\n## Time series cross-validation {-}\n\n**Traditional evaluation**\n\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](slides_files/figure-beamer/traintest4-1.pdf)\n:::\n:::\n\n\n\n**Time series cross-validation**\n\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](slides_files/figure-beamer/tscvggplot4-1.pdf)\n:::\n:::\n\n\n\n\\only<2>{\\begin{textblock}{8}(.5,6.4)\\begin{block}{}\\fontsize{12}{13}\\sf\n\\begin{itemize}\\tightlist\n\\item Forecast accuracy averaged over test sets.\n\\item Also known as \"evaluation on a rolling forecasting origin\"\n\\end{itemize}\\end{block}\\end{textblock}}\n\n\\vspace*{10cm}\n\n\n## Your turn\n\\fontsize{13}{14}\\sf\n\n\n\n\n1. Create a training set for household wealth (`hh_budget`) by withholding the last four years as a test set.\n2. Fit all the appropriate benchmark methods to the training set and forecast the periods covered by the test set.\n3. Compute the accuracy of your forecasts. Which method does best?\n",
"supporting": [
"slides_files"
],
"filters": [
"rmarkdown/pagebreak.lua"
],
"includes": {},
"engineDependencies": {},
"preserve": null,
"postProcess": false
}
}
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
8 changes: 5 additions & 3 deletions _freeze/week5/index/execute-results/html.json
@@ -1,9 +1,11 @@
{
"hash": "20b8222433f8e5a786d02c88a4cb49c6",
"hash": "cfabdd841ad01c84069586943cfb29bb",
"result": {
"engine": "knitr",
"markdown": "---\ntitle: \"Week 5: Exponential smoothing\"\n---\n\n::: {.cell}\n\n:::\n\n\n\n\n## What you will learn this week\n\n* Simple exponential smoothing\n* Corresponding ETS models\n\n## Pre-class activities\n\nRead [Sections 8.1-8.4 of the textbook](https://otexts.com/fpp3/expsmooth.html) and watch all embedded videos\n\n## Exercises (on your own or in tutorial)\n\nComplete Exercises 1-6, 8-9, 11-12 from [Section 5.11 of the book](https://otexts.com/fpp3/toolbox-exercises.html).\n\n\n\n\n## Slides for seminar\n\n<iframe src='https://docs.google.com/gview?url=https://af.numbat.space/week5/slides.pdf&embedded=true' width='100%' height=465></iframe>\n<a href=https://af.numbat.space/week5/slides.pdf class='badge badge-small badge-red'>Download pdf</a>\n\n## Seminar activities\n\nTry forecasting the Chinese GDP from the `global_economy` data set using an ETS model.\n\nExperiment with the various options in the `ETS()` function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each is doing to the forecasts.\n\n[Hint: use `h=20` when forecasting, so you can clearly see the differences between the various options when plotting the forecasts.]\n\n\n\n## Assignments\n\n* [Assignment 3](../assignments/A3.qmd) is due on Friday 12 April.\n",
"supporting": [],
"markdown": "---\ntitle: \"Week 5: Exponential smoothing\"\n---\n\n::: {.cell}\n\n:::\n\n\n\n\n## What you will learn this week\n\n* Simple exponential smoothing\n* Corresponding ETS models\n\n## Pre-class activities\n\nRead [Sections 8.1-8.4 of the textbook](https://otexts.com/fpp3/expsmooth.html) and watch all embedded videos\n\n## Exercises (on your own or in tutorial)\n\nComplete Exercises 1-6, 8, 11-12 from [Section 5.11 of the book](https://otexts.com/fpp3/toolbox-exercises.html).\n\n\n\n\n## Slides for seminar\n\n<iframe src='https://docs.google.com/gview?url=https://af.numbat.space/week5/slides.pdf&embedded=true' width='100%' height=465></iframe>\n<a href=https://af.numbat.space/week5/slides.pdf class='badge badge-small badge-red'>Download pdf</a>\n\n## Seminar activities\n\nTry forecasting the Chinese GDP from the `global_economy` data set using an ETS model.\n\nExperiment with the various options in the `ETS()` function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each is doing to the forecasts.\n\n[Hint: use `h=20` when forecasting, so you can clearly see the differences between the various options when plotting the forecasts.]\n\n\n\n## Assignments\n\n* [Assignment 3](../assignments/A3.qmd) is due on Friday 12 April.\n",
"supporting": [
"index_files"
],
"filters": [
"rmarkdown/pagebreak.lua"
],
16 changes: 4 additions & 12 deletions week4/activities.md
@@ -1,12 +1,4 @@


* Produce forecasts using an appropriate benchmark method for household wealth (`hh_budget`). Plot the results using `autoplot()`.
* Produce forecasts using an appropriate benchmark method for Australian takeaway food turnover (`aus_retail`). Plot the results using `autoplot()`.

* Compute seasonal naïve forecasts for quarterly Australian beer production from 1992.
* Test if the residuals are white noise. What do you conclude?

* Create a training set for household wealth (`hh_budget`) by withholding the last four years as a test set.
* Fit all the appropriate benchmark methods to the training set and forecast the periods covered by the test set.
* Compute the accuracy of your forecasts. Which method does best?
* Repeat the exercise using the Australian takeaway food turnover data (`aus_retail`) with a test set of four years.
1. Create a training set for household wealth (`hh_budget`) by withholding the last four years as a test set.
2. Fit all the appropriate benchmark methods to the training set and forecast the periods covered by the test set.
3. Compute the accuracy of your forecasts. Which method does best?
4. Do the residuals from the best method resemble white noise?
69 changes: 52 additions & 17 deletions week4/seminar_code.R
@@ -1,16 +1,14 @@
library(fpp3)

## ---- Holiday tourism by state------------------------------------------------
## ---- Holiday tourism by state --------------

holidays <- tourism |>
as_tibble() |>
filter(Purpose == "Holiday") |>
summarise(Trips = sum(Trips), .by = c("State", "Quarter")) |>
as_tsibble(index = Quarter, key = State)


## Fit models ------------------------------------------------------------------

## Fit models
fit <- holidays |>
model(
Seasonal_naive = SNAIVE(Trips),
@@ -20,7 +18,6 @@ fit <- holidays |>
)

## Check residuals

fit |>
filter(State == "Victoria") |>
select(Seasonal_naive) |>
@@ -33,10 +30,10 @@ augment(fit) |>
## Which model fits best?

accuracy(fit) |>
group_by(.model) |>
summarise(
RMSSE = sqrt(mean(RMSSE^2)),
MAPE = mean(MAPE)
MAPE = mean(MAPE),
.by = .model
) |>
arrange(RMSSE)

@@ -71,12 +68,6 @@ stl_fit |>
forecast(h = "4 years") |>
autoplot(holidays)

accuracy(stl_fit) |>
summarise(
RMSSE = sqrt(mean(RMSSE^2)),
MAPE = mean(MAPE)
)

# Use a test set of last 2 years to check forecast accuracy

training <- holidays |>
@@ -102,10 +93,10 @@ test_fc |>

test_fc |>
accuracy(holidays) |>
group_by(.model) |>
summarise(
RMSSE = sqrt(mean(RMSSE^2)),
MAPE = mean(MAPE)
MAPE = mean(MAPE),
.by = .model
) |>
arrange(RMSSE)

@@ -136,8 +127,52 @@ cv_fc <- cv_fit |>

cv_fc |>
accuracy(holidays, by = c("h", ".model", "State")) |>
group_by(.model, h) |>
summarise(RMSSE = sqrt(mean(RMSSE^2))) |>
summarise(
RMSSE = sqrt(mean(RMSSE^2)),
.by = c(.model, h)
) |>
ggplot(aes(x=h, y=RMSSE, group=.model, col=.model)) +
geom_line()


## hh_budget exercise

# 1. Create training set by withholding last four years
train <- hh_budget |>
filter(Year <= max(Year) - 4)
# 2. Fit benchmark methods
fit <- train |>
model(
naive = NAIVE(Wealth),
drift = RW(Wealth ~ drift()),
mean = MEAN(Wealth)
)
fc <- fit |> forecast(h = 4)

# 3. Compute accuracy
fc |>
accuracy(hh_budget) |>
arrange(Country, RMSE)
fc |>
accuracy(hh_budget) |>
summarise(RMSE = sqrt(mean(RMSE^2)), .by=.model) |>
arrange(RMSE)

# 4. Do the residuals resemble white noise?

fit |>
filter(Country == "Australia") |>
select(drift) |>
gg_tsresiduals()
fit |>
filter(Country == "Canada") |>
select(drift) |>
gg_tsresiduals()
fit |>
filter(Country == "Japan") |>
select(drift) |>
gg_tsresiduals()
fit |>
filter(Country == "USA") |>
select(drift) |>
gg_tsresiduals()
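
# Activity 4 above checks the residuals only graphically. As a possible
# follow-up (a sketch, not part of this commit), the plots could be
# complemented with a formal Ljung-Box test via `features()`:

```r
library(fpp3)

train <- hh_budget |>
  filter(Year <= max(Year) - 4)

fit <- train |>
  model(drift = RW(Wealth ~ drift()))

# lag = 10 is a common default for non-seasonal data; large p-values
# are consistent with white-noise residuals.
augment(fit) |>
  features(.innov, ljung_box, lag = 10)
```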
131 changes: 131 additions & 0 deletions week4/slides.qmd
@@ -0,0 +1,131 @@
---
title: ETC3550/ETC5550 Applied forecasting
author: "Week 4: The Forecasters' toolbox"
format:
beamer:
aspectratio: 169
fontsize: 14pt
section-titles: false
knitr:
opts_chunk:
dev: "cairo_pdf"
pdf-engine: pdflatex
fig-width: 7.5
fig-height: 3.5
include-in-header: ../header.tex
---

```{r setup, include=FALSE}
source(here::here("setup.R"))
```
```{r tscvplots, echo=FALSE}
tscv_plot <- function(.init, .step, h = 1) {
expand.grid(
time = seq(26),
.id = seq(trunc(11 / .step))
) |>
group_by(.id) |>
mutate(
observation = case_when(
time <= ((.id - 1) * .step + .init) ~ "train",
time %in% c((.id - 1) * .step + .init + h) ~ "test",
TRUE ~ "unused"
)
) |>
ungroup() |>
filter(.id <= 26 - .init) |>
ggplot(aes(x = time, y = .id)) +
geom_segment(
aes(x = 0, xend = 27, y = .id, yend = .id),
arrow = arrow(length = unit(0.015, "npc")),
col = "black", size = .25
) +
geom_point(aes(col = observation), size = 2) +
scale_y_reverse() +
scale_color_manual(values = c(train = "#0072B2", test = "#D55E00", unused = "gray")) +
# theme_void() +
# geom_label(aes(x = 28.5, y = 1, label = "time")) +
guides(col = FALSE) +
labs(x = "time", y = "") +
theme_void() +
theme(axis.title = element_text())
}
```

## Time series cross-validation {-}

**Traditional evaluation**

```{r traintest1, fig.height=1, echo=FALSE}
tscv_plot(.init = 18, .step = 10, h = 1:8) +
geom_text(aes(x = 10, y = 0.8, label = "Training data"), color = "#0072B2") +
geom_text(aes(x = 21, y = 0.8, label = "Test data"), color = "#D55E00") +
ylim(1, 0)
```

\pause

**Time series cross-validation**

```{r tscvggplot1, echo=FALSE, fig.height=2.2}
tscv_plot(.init = 8, .step = 1, h = 1) +
geom_text(aes(x = 21, y = 0, label = "h = 1"), color = "#D55E00")
```

## Time series cross-validation {-}

**Traditional evaluation**

```{r traintest2, ref.label="traintest1", fig.height=1, echo=FALSE}
```

**Time series cross-validation**

```{r tscvggplot2, echo=FALSE, fig.height=2.2}
tscv_plot(.init = 8, .step = 1, h = 2) +
geom_text(aes(x = 21, y = 0, label = "h = 2"), color = "#D55E00")
```

## Time series cross-validation {-}

**Traditional evaluation**

```{r traintest3, ref.label="traintest1", fig.height=1, echo=FALSE}
```

**Time series cross-validation**

```{r tscvggplot3, echo=FALSE, fig.height=2.2}
tscv_plot(.init = 8, .step = 1, h = 3) +
geom_text(aes(x = 21, y = 0, label = "h = 3"), color = "#D55E00")
```

## Time series cross-validation {-}

**Traditional evaluation**

```{r traintest4, ref.label="traintest1", fig.height=1, echo=FALSE}
```

**Time series cross-validation**

```{r tscvggplot4, echo=FALSE, fig.height=2.2}
tscv_plot(.init = 8, .step = 1, h = 4) +
geom_text(aes(x = 21, y = 0, label = "h = 4"), color = "#D55E00")
```

\only<2>{\begin{textblock}{8}(.5,6.4)\begin{block}{}\fontsize{12}{13}\sf
\begin{itemize}\tightlist
\item Forecast accuracy averaged over test sets.
\item Also known as "evaluation on a rolling forecasting origin"
\end{itemize}\end{block}\end{textblock}}

\vspace*{10cm}


## Your turn
\fontsize{13}{14}\sf

```{r}
#| child: activities.md
```
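
The slides above illustrate time series cross-validation only as diagrams. A minimal code sketch of the same idea (not part of this commit; it assumes the fpp3 convention of `stretch_tsibble()` to build the rolling training origins, here on `aus_production` purely as an illustration) could be:

```r
library(fpp3)

aus_production |>
  stretch_tsibble(.init = 24, .step = 4) |>  # growing training sets, origin rolled forward
  model(snaive = SNAIVE(Beer)) |>
  forecast(h = 4) |>
  accuracy(aus_production)                   # accuracy averaged over the test sets
```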
2 changes: 1 addition & 1 deletion week5/index.qmd
@@ -18,7 +18,7 @@ Read [Sections 8.1-8.4 of the textbook](https://otexts.com/fpp3/expsmooth.html)

## Exercises (on your own or in tutorial)

Complete Exercises 1-6, 8-9, 11-12 from [Section 5.11 of the book](https://otexts.com/fpp3/toolbox-exercises.html).
Complete Exercises 1-6, 8, 11-12 from [Section 5.11 of the book](https://otexts.com/fpp3/toolbox-exercises.html).

```{r}
#| output: asis
