Commit

Updated week4 seminar
robjhyndman committed Mar 20, 2024
1 parent 8ada98a commit e0f1a63
Showing 15 changed files with 211 additions and 34 deletions.
2 changes: 1 addition & 1 deletion _freeze/week4/index/execute-results/html.json
@@ -2,7 +2,7 @@
"hash": "e7f94273921daac99cfd07d9d41f4c79",
"result": {
"engine": "knitr",
"markdown": "---\ntitle: \"Week 4: The forecaster's toolbox\"\n---\n\n::: {.cell}\n\n:::\n\n\n\n\n## What you will learn this week\n\n* Four benchmark forecasting methods that we will use for comparison\n* Fitted values, residuals\n* Forecasting with transformations\n\n## Pre-class activities\n\nRead [Chapter 5 of the textbook](https://otexts.com/fpp3/toolbox.html) and watch all embedded videos\n\n## Exercises (on your own or in tutorial)\n\nComplete Exercises 1-5, 9-10 from [Section 3.7 of the book](https://otexts.com/fpp3/decomposition-exercises.html).\n\n\n\n\n## Slides for seminar\n\n<iframe src='https://docs.google.com/gview?url=https://af.numbat.space/week4/slides.pdf&embedded=true' width='100%' height=465></iframe>\n<a href=https://af.numbat.space/week4/slides.pdf class='badge badge-small badge-red'>Download pdf</a>\n\n## Seminar activities\n\n\n\n * Produce forecasts using an appropriate benchmark method for household wealth (`hh_budget`). Plot the results using `autoplot()`.\n * Produce forecasts using an appropriate benchmark method for Australian takeaway food turnover (`aus_retail`). Plot the results using `autoplot()`.\n\n * Compute seasonal naïve forecasts for quarterly Australian beer production from 1992.\n * Test if the residuals are white noise. What do you conclude?\n\n * Create a training set for household wealth (`hh_budget`) by withholding the last four years as a test set.\n * Fit all the appropriate benchmark methods to the training set and forecast the periods covered by the test set.\n * Compute the accuracy of your forecasts. Which method does best?\n * Repeat the exercise using the Australian takeaway food turnover data (`aus_retail`) with a test set of four years.\n\n\n\n## Assignments\n\n* [Assignment 2](../assignments/A2.qmd) is due on Sunday 24 March.\n* [Assignment 3](../assignments/A3.qmd) is due on Friday 12 April.\n",
"markdown": "---\ntitle: \"Week 4: The forecaster's toolbox\"\n---\n\n::: {.cell}\n\n:::\n\n\n\n\n## What you will learn this week\n\n* Four benchmark forecasting methods that we will use for comparison\n* Fitted values, residuals\n* Forecasting with transformations\n\n## Pre-class activities\n\nRead [Chapter 5 of the textbook](https://otexts.com/fpp3/toolbox.html) and watch all embedded videos\n\n## Exercises (on your own or in tutorial)\n\nComplete Exercises 1-5, 9-10 from [Section 3.7 of the book](https://otexts.com/fpp3/decomposition-exercises.html).\n\n\n\n\n## Slides for seminar\n\n<iframe src='https://docs.google.com/gview?url=https://af.numbat.space/week4/slides.pdf&embedded=true' width='100%' height=465></iframe>\n<a href=https://af.numbat.space/week4/slides.pdf class='badge badge-small badge-red'>Download pdf</a>\n\n## Seminar activities\n\n1. Create a training set for household wealth (`hh_budget`) by withholding the last four years as a test set.\n2. Fit all the appropriate benchmark methods to the training set and forecast the periods covered by the test set.\n3. Compute the accuracy of your forecasts. Which method does best?\n4. Do the residuals from the best method resemble white noise?\n\n\n\n## Assignments\n\n* [Assignment 2](../assignments/A2.qmd) is due on Sunday 24 March.\n* [Assignment 3](../assignments/A3.qmd) is due on Friday 12 April.\n",
"supporting": [],
"filters": [
"rmarkdown/pagebreak.lua"
17 changes: 17 additions & 0 deletions _freeze/week4/slides/execute-results/tex.json
@@ -0,0 +1,17 @@
{
"hash": "ee3371da1701f715a268398f50289e8b",
"result": {
"engine": "knitr",
"markdown": "---\ntitle: ETC3550/ETC5550 Applied forecasting\nauthor: \"Week 4: The Forecasters' toolbox\"\nformat:\n beamer:\n aspectratio: 169\n fontsize: 14pt\n section-titles: false\n knitr:\n opts_chunk:\n dev: \"cairo_pdf\"\n pdf-engine: pdflatex\n fig-width: 7.5\n fig-height: 3.5\n include-in-header: ../header.tex\n---\n\n\n\n::: {.cell}\n\n:::\n\n\n\n## Time series cross-validation {-}\n\n**Traditional evaluation**\n\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](slides_files/figure-beamer/traintest1-1.pdf)\n:::\n:::\n\n\n\n\\pause\n\n**Time series cross-validation**\n\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](slides_files/figure-beamer/tscvggplot1-1.pdf)\n:::\n:::\n\n\n\n## Time series cross-validation {-}\n\n**Traditional evaluation**\n\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](slides_files/figure-beamer/traintest2-1.pdf)\n:::\n:::\n\n\n\n**Time series cross-validation**\n\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](slides_files/figure-beamer/tscvggplot2-1.pdf)\n:::\n:::\n\n\n\n## Time series cross-validation {-}\n\n**Traditional evaluation**\n\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](slides_files/figure-beamer/traintest3-1.pdf)\n:::\n:::\n\n\n\n**Time series cross-validation**\n\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](slides_files/figure-beamer/tscvggplot3-1.pdf)\n:::\n:::\n\n\n\n## Time series cross-validation {-}\n\n**Traditional evaluation**\n\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](slides_files/figure-beamer/traintest4-1.pdf)\n:::\n:::\n\n\n\n**Time series cross-validation**\n\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](slides_files/figure-beamer/tscvggplot4-1.pdf)\n:::\n:::\n\n\n\n\\only<2>{\\begin{textblock}{8}(.5,6.4)\\begin{block}{}\\fontsize{12}{13}\\sf\n\\begin{itemize}\\tightlist\n\\item Forecast accuracy averaged over test sets.\n\\item Also known as \"evaluation on a rolling forecasting origin\"\n\\end{itemize}\\end{block}\\end{textblock}}\n\n\\vspace*{10cm}\n\n\n## Your turn\n\\fontsize{13}{14}\\sf\n\n\n\n\n1. Create a training set for household wealth (`hh_budget`) by withholding the last four years as a test set.\n2. Fit all the appropriate benchmark methods to the training set and forecast the periods covered by the test set.\n3. Compute the accuracy of your forecasts. Which method does best?\n",
"supporting": [
"slides_files"
],
"filters": [
"rmarkdown/pagebreak.lua"
],
"includes": {},
"engineDependencies": {},
"preserve": null,
"postProcess": false
}
}
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
8 changes: 5 additions & 3 deletions _freeze/week5/index/execute-results/html.json
@@ -1,9 +1,11 @@
{
"hash": "20b8222433f8e5a786d02c88a4cb49c6",
"hash": "cfabdd841ad01c84069586943cfb29bb",
"result": {
"engine": "knitr",
"markdown": "---\ntitle: \"Week 5: Exponential smoothing\"\n---\n\n::: {.cell}\n\n:::\n\n\n\n\n## What you will learn this week\n\n* Simple exponential smoothing\n* Corresponding ETS models\n\n## Pre-class activities\n\nRead [Sections 8.1-8.4 of the textbook](https://otexts.com/fpp3/expsmooth.html) and watch all embedded videos\n\n## Exercises (on your own or in tutorial)\n\nComplete Exercises 1-6, 8-9, 11-12 from [Section 5.11 of the book](https://otexts.com/fpp3/toolbox-exercises.html).\n\n\n\n\n## Slides for seminar\n\n<iframe src='https://docs.google.com/gview?url=https://af.numbat.space/week5/slides.pdf&embedded=true' width='100%' height=465></iframe>\n<a href=https://af.numbat.space/week5/slides.pdf class='badge badge-small badge-red'>Download pdf</a>\n\n## Seminar activities\n\nTry forecasting the Chinese GDP from the `global_economy` data set using an ETS model.\n\nExperiment with the various options in the `ETS()` function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each is doing to the forecasts.\n\n[Hint: use `h=20` when forecasting, so you can clearly see the differences between the various options when plotting the forecasts.]\n\n\n\n## Assignments\n\n* [Assignment 3](../assignments/A3.qmd) is due on Friday 12 April.\n",
"supporting": [],
"markdown": "---\ntitle: \"Week 5: Exponential smoothing\"\n---\n\n::: {.cell}\n\n:::\n\n\n\n\n## What you will learn this week\n\n* Simple exponential smoothing\n* Corresponding ETS models\n\n## Pre-class activities\n\nRead [Sections 8.1-8.4 of the textbook](https://otexts.com/fpp3/expsmooth.html) and watch all embedded videos\n\n## Exercises (on your own or in tutorial)\n\nComplete Exercises 1-6, 8, 11-12 from [Section 5.11 of the book](https://otexts.com/fpp3/toolbox-exercises.html).\n\n\n\n\n## Slides for seminar\n\n<iframe src='https://docs.google.com/gview?url=https://af.numbat.space/week5/slides.pdf&embedded=true' width='100%' height=465></iframe>\n<a href=https://af.numbat.space/week5/slides.pdf class='badge badge-small badge-red'>Download pdf</a>\n\n## Seminar activities\n\nTry forecasting the Chinese GDP from the `global_economy` data set using an ETS model.\n\nExperiment with the various options in the `ETS()` function to see how much the forecasts change with damped trend, or with a Box-Cox transformation. Try to develop an intuition of what each is doing to the forecasts.\n\n[Hint: use `h=20` when forecasting, so you can clearly see the differences between the various options when plotting the forecasts.]\n\n\n\n## Assignments\n\n* [Assignment 3](../assignments/A3.qmd) is due on Friday 12 April.\n",
"supporting": [
"index_files"
],
"filters": [
"rmarkdown/pagebreak.lua"
],
16 changes: 4 additions & 12 deletions week4/activities.md
@@ -1,12 +1,4 @@


* Produce forecasts using an appropriate benchmark method for household wealth (`hh_budget`). Plot the results using `autoplot()`.
* Produce forecasts using an appropriate benchmark method for Australian takeaway food turnover (`aus_retail`). Plot the results using `autoplot()`.

* Compute seasonal naïve forecasts for quarterly Australian beer production from 1992.
* Test if the residuals are white noise. What do you conclude?

* Create a training set for household wealth (`hh_budget`) by withholding the last four years as a test set.
* Fit all the appropriate benchmark methods to the training set and forecast the periods covered by the test set.
* Compute the accuracy of your forecasts. Which method does best?
* Repeat the exercise using the Australian takeaway food turnover data (`aus_retail`) with a test set of four years.
1. Create a training set for household wealth (`hh_budget`) by withholding the last four years as a test set.
2. Fit all the appropriate benchmark methods to the training set and forecast the periods covered by the test set.
3. Compute the accuracy of your forecasts. Which method does best?
4. Do the residuals from the best method resemble white noise?
69 changes: 52 additions & 17 deletions week4/seminar_code.R
@@ -1,16 +1,14 @@
library(fpp3)

## ---- Holiday tourism by state------------------------------------------------
## ---- Holiday tourism by state --------------

holidays <- tourism |>
as_tibble() |>
filter(Purpose == "Holiday") |>
summarise(Trips = sum(Trips), .by = c("State", "Quarter")) |>
as_tsibble(index = Quarter, key = State)


## Fit models ------------------------------------------------------------------

## Fit models
fit <- holidays |>
model(
Seasonal_naive = SNAIVE(Trips),
@@ -20,7 +18,6 @@ fit <- holidays |>
)

## Check residuals

fit |>
filter(State == "Victoria") |>
select(Seasonal_naive) |>
@@ -33,10 +30,10 @@ augment(fit) |>
## Which model fits best?

accuracy(fit) |>
group_by(.model) |>
summarise(
RMSSE = sqrt(mean(RMSSE^2)),
MAPE = mean(MAPE)
MAPE = mean(MAPE),
.by = .model
) |>
arrange(RMSSE)

@@ -71,12 +68,6 @@ stl_fit |>
forecast(h = "4 years") |>
autoplot(holidays)

accuracy(stl_fit) |>
summarise(
RMSSE = sqrt(mean(RMSSE^2)),
MAPE = mean(MAPE)
)

# Use a test set of last 2 years to check forecast accuracy

training <- holidays |>
@@ -102,10 +93,10 @@ test_fc |>

test_fc |>
accuracy(holidays) |>
group_by(.model) |>
summarise(
RMSSE = sqrt(mean(RMSSE^2)),
MAPE = mean(MAPE)
MAPE = mean(MAPE),
.by = .model
) |>
arrange(RMSSE)

@@ -136,8 +127,52 @@ cv_fc <- cv_fit |>

cv_fc |>
accuracy(holidays, by = c("h", ".model", "State")) |>
group_by(.model, h) |>
summarise(RMSSE = sqrt(mean(RMSSE^2))) |>
summarise(
RMSSE = sqrt(mean(RMSSE^2)),
.by = c(.model, h)
) |>
ggplot(aes(x=h, y=RMSSE, group=.model, col=.model)) +
geom_line()


## hh_budget exercise

# 1. Create training set by withholding last four years
train <- hh_budget |>
filter(Year <= max(Year) - 4)
# 2. Fit benchmark methods
fit <- train |>
model(
naive = NAIVE(Wealth),
drift = RW(Wealth ~ drift()),
mean = MEAN(Wealth)
)
fc <- fit |> forecast(h = 4)

# 3. Compute accuracy
fc |>
accuracy(hh_budget) |>
arrange(Country, RMSE)
fc |>
accuracy(hh_budget) |>
summarise(RMSE = sqrt(mean(RMSE^2)), .by=.model) |>
arrange(RMSE)

# 4. Do the residuals resemble white noise?

fit |>
filter(Country == "Australia") |>
select(drift) |>
gg_tsresiduals()
fit |>
filter(Country == "Canada") |>
select(drift) |>
gg_tsresiduals()
fit |>
filter(Country == "Japan") |>
select(drift) |>
gg_tsresiduals()
fit |>
filter(Country == "USA") |>
select(drift) |>
gg_tsresiduals()
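
# Activity 4 above checks the residuals only graphically. As a possible
# follow-up (a sketch, not part of this commit), the plots could be
# complemented with a formal Ljung-Box test via `features()`:

```r
library(fpp3)

train <- hh_budget |>
  filter(Year <= max(Year) - 4)

fit <- train |>
  model(drift = RW(Wealth ~ drift()))

# lag = 10 is a common default for non-seasonal data; large p-values
# are consistent with white-noise residuals.
augment(fit) |>
  features(.innov, ljung_box, lag = 10)
```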
131 changes: 131 additions & 0 deletions week4/slides.qmd
@@ -0,0 +1,131 @@
---
title: ETC3550/ETC5550 Applied forecasting
author: "Week 4: The Forecasters' toolbox"
format:
beamer:
aspectratio: 169
fontsize: 14pt
section-titles: false
knitr:
opts_chunk:
dev: "cairo_pdf"
pdf-engine: pdflatex
fig-width: 7.5
fig-height: 3.5
include-in-header: ../header.tex
---

```{r setup, include=FALSE}
source(here::here("setup.R"))
```
```{r tscvplots, echo=FALSE}
tscv_plot <- function(.init, .step, h = 1) {
expand.grid(
time = seq(26),
.id = seq(trunc(11 / .step))
) |>
group_by(.id) |>
mutate(
observation = case_when(
time <= ((.id - 1) * .step + .init) ~ "train",
time %in% c((.id - 1) * .step + .init + h) ~ "test",
TRUE ~ "unused"
)
) |>
ungroup() |>
filter(.id <= 26 - .init) |>
ggplot(aes(x = time, y = .id)) +
geom_segment(
aes(x = 0, xend = 27, y = .id, yend = .id),
arrow = arrow(length = unit(0.015, "npc")),
col = "black", size = .25
) +
geom_point(aes(col = observation), size = 2) +
scale_y_reverse() +
scale_color_manual(values = c(train = "#0072B2", test = "#D55E00", unused = "gray")) +
# theme_void() +
# geom_label(aes(x = 28.5, y = 1, label = "time")) +
guides(col = FALSE) +
labs(x = "time", y = "") +
theme_void() +
theme(axis.title = element_text())
}
```

## Time series cross-validation {-}

**Traditional evaluation**

```{r traintest1, fig.height=1, echo=FALSE}
tscv_plot(.init = 18, .step = 10, h = 1:8) +
geom_text(aes(x = 10, y = 0.8, label = "Training data"), color = "#0072B2") +
geom_text(aes(x = 21, y = 0.8, label = "Test data"), color = "#D55E00") +
ylim(1, 0)
```

\pause

**Time series cross-validation**

```{r tscvggplot1, echo=FALSE, fig.height=2.2}
tscv_plot(.init = 8, .step = 1, h = 1) +
geom_text(aes(x = 21, y = 0, label = "h = 1"), color = "#D55E00")
```

## Time series cross-validation {-}

**Traditional evaluation**

```{r traintest2, ref.label="traintest1", fig.height=1, echo=FALSE}
```

**Time series cross-validation**

```{r tscvggplot2, echo=FALSE, fig.height=2.2}
tscv_plot(.init = 8, .step = 1, h = 2) +
geom_text(aes(x = 21, y = 0, label = "h = 2"), color = "#D55E00")
```

## Time series cross-validation {-}

**Traditional evaluation**

```{r traintest3, ref.label="traintest1", fig.height=1, echo=FALSE}
```

**Time series cross-validation**

```{r tscvggplot3, echo=FALSE, fig.height=2.2}
tscv_plot(.init = 8, .step = 1, h = 3) +
geom_text(aes(x = 21, y = 0, label = "h = 3"), color = "#D55E00")
```

## Time series cross-validation {-}

**Traditional evaluation**

```{r traintest4, ref.label="traintest1", fig.height=1, echo=FALSE}
```

**Time series cross-validation**

```{r tscvggplot4, echo=FALSE, fig.height=2.2}
tscv_plot(.init = 8, .step = 1, h = 4) +
geom_text(aes(x = 21, y = 0, label = "h = 4"), color = "#D55E00")
```

\only<2>{\begin{textblock}{8}(.5,6.4)\begin{block}{}\fontsize{12}{13}\sf
\begin{itemize}\tightlist
\item Forecast accuracy averaged over test sets.
\item Also known as "evaluation on a rolling forecasting origin"
\end{itemize}\end{block}\end{textblock}}

\vspace*{10cm}


## Your turn
\fontsize{13}{14}\sf

```{r}
#| child: activities.md
```
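
The slides above illustrate time series cross-validation only as diagrams. A minimal code sketch of the same idea (not part of this commit; it assumes the fpp3 convention of `stretch_tsibble()` to build the rolling training origins, here on `aus_production` purely as an illustration) could be:

```r
library(fpp3)

aus_production |>
  stretch_tsibble(.init = 24, .step = 4) |>  # growing training sets, origin rolled forward
  model(snaive = SNAIVE(Beer)) |>
  forecast(h = 4) |>
  accuracy(aus_production)                   # accuracy averaged over the test sets
```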
2 changes: 1 addition & 1 deletion week5/index.qmd
@@ -18,7 +18,7 @@ Read [Sections 8.1-8.4 of the textbook](https://otexts.com/fpp3/expsmooth.html)

## Exercises (on your own or in tutorial)

Complete Exercises 1-6, 8-9, 11-12 from [Section 5.11 of the book](https://otexts.com/fpp3/toolbox-exercises.html).
Complete Exercises 1-6, 8, 11-12 from [Section 5.11 of the book](https://otexts.com/fpp3/toolbox-exercises.html).

```{r}
#| output: asis
