Refreshed

numbats · Jan 11, 2024 · 2dfb2e5
1 parent 6c095e3

Showing 57 changed files with 205 additions and 32 deletions.
diff --git a/_freeze/week1/index/execute-results/html.json b/_freeze/week1/index/execute-results/html.json
@@ -1,8 +1,8 @@
 {
-  "hash": "a73b99f2e2c91704c2fd0c057b0d09c0",
+  "hash": "486fd0324fed9230f02264be00a98236",
   "result": {
     "engine": "knitr",
-    "markdown": "---\ntitle: \"Week 1: What is forecasting?\"\n---\n\n::: {.cell}\n\n:::\n\n\n## What you will learn this week\n\n* How to think about forecasting from a statistical perspective\n* What makes something easy or hard to forecast?\n* Using the `tsibble` package in R\n\n## Pre-class activities\n\n* Install R and RStudio on your personal computer. Instructions are provided at [https://otexts.com/fpp3/appendix-using-r.html](https://otexts.com/fpp3/appendix-using-r.html).\n* Read [Chapter 1 of the textbook](http://OTexts.com/fpp3/intro.html) and watch all embedded videos\n* Watch this video\n\n<iframe width=\"100%\" height=\"415\" src=\"https://www.youtube.com/embed/HNJYRf0mvxg?si=k0wfI3Sq68TPm4Ek\" title=\"YouTube video player\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" allowfullscreen></iframe>\n\n## Exercises (on your own or in tutorial)\n\nYour task this week is to make sure you are familiar with R, RStudio and the tidyverse packages. If you've already done ETC1010, then you may not need to do anything! But if you're new to R and the tidyverse, then you will need to get yourself up-to-speed.\n\nWork through the first five chapters of the **LearnR** tutorial at [learnr.numbat.space](https://learnr.numbat.space). Do as much of it as you think you need. For those students new to R, it is strongly recommended that you do all five chapters. For those who have previously used R, concentrate on the parts where you feel you are weakest.\n\n\n## Slides for seminar\n\n<iframe src='https://docs.google.com/gview?url=https://af.numbat.space/week1slides_week1.pdf&embedded=true' width='100%' height=465></iframe>\n<a href=https://af.numbat.space/week1slides_week1.pdf class='badge badge-small badge-red'>Download pdf</a>\n\n\n## Seminar activities\n\n1. 
Download `tourism.xlsx` from [`http://robjhyndman.com/data/tourism.xlsx`](http://robjhyndman.com/data/tourism.xlsx), and read it into R using `read_excel()` from the `readxl` package.\n2. Create a tsibble which is identical to the `tourism` tsibble from the `tsibble` package.\n3. Find what combination of `Region` and `Purpose` had the maximum number of overnight trips on average.\n4. Create a new tsibble which combines the Purposes and Regions, and just has total trips by State.\n\n\n\n## Assignments\n\n* [Assignment 1](../assignments/A1.qmd) is due on Friday 08 March.\n",
+    "markdown": "---\ntitle: \"Week 1: What is forecasting?\"\n---\n\n::: {.cell}\n\n:::\n\n\n## What you will learn this week\n\n* How to think about forecasting from a statistical perspective\n* What makes something easy or hard to forecast?\n* Using the `tsibble` package in R\n\n## Pre-class activities\n\n* Install R and RStudio on your personal computer. Instructions are provided at [https://otexts.com/fpp3/appendix-using-r.html](https://otexts.com/fpp3/appendix-using-r.html).\n* Read [Chapter 1 of the textbook](http://OTexts.com/fpp3/intro.html) and watch all embedded videos\n* Watch this video\n\n<iframe width=\"100%\" height=\"415\" src=\"https://www.youtube.com/embed/HNJYRf0mvxg?si=k0wfI3Sq68TPm4Ek\" title=\"YouTube video player\" frameborder=\"0\" allow=\"accelerometer; autoplay; clipboard-write; encrypted-media; gyroscope; picture-in-picture; web-share\" allowfullscreen></iframe>\n\n## Exercises (on your own or in tutorial)\n\nYour task this week is to make sure you are familiar with R, RStudio and the tidyverse packages. If you've already done ETC1010, then you may not need to do anything! But if you're new to R and the tidyverse, then you will need to get yourself up-to-speed.\n\nWork through the first five chapters of the **LearnR** tutorial at [learnr.numbat.space](https://learnr.numbat.space). Do as much of it as you think you need. For those students new to R, it is strongly recommended that you do all five chapters. For those who have previously used R, concentrate on the parts where you feel you are weakest.\n\n\n## Slides for seminar\n\n<iframe src='https://docs.google.com/gview?url=https://af.numbat.space/week1slides_week1.pdf&embedded=true' width='100%' height=465></iframe>\n<a href=https://af.numbat.space/week1slides_week1.pdf class='badge badge-small badge-red'>Download pdf</a>\n\n## Seminar activities\n\n\n1. 
Download `tourism.xlsx` from [`http://robjhyndman.com/data/tourism.xlsx`](http://robjhyndman.com/data/tourism.xlsx), and read it into R using `read_excel()` from the `readxl` package.\n2. Create a tsibble which is identical to the `tourism` tsibble from the `tsibble` package.\n3. Find what combination of `Region` and `Purpose` had the maximum number of overnight trips on average.\n4. Create a new tsibble which combines the Purposes and Regions, and just has total trips by State.\n\n\n\n## Assignments\n\n* [Assignment 1](../assignments/A1.qmd) is due on Friday 08 March.\n",
     "supporting": [],
     "filters": [
       "rmarkdown/pagebreak.lua"

diff --git a/...ek1/slides_week1/execute-results/tex.json → ...eze/week1/slides/execute-results/tex.json b/...ek1/slides_week1/execute-results/tex.json → ...eze/week1/slides/execute-results/tex.json
diff --git a/_freeze/week10/index/execute-results/html.json b/_freeze/week10/index/execute-results/html.json
@@ -1,8 +1,8 @@
 {
-  "hash": "7a609a4605a40f10f4403cff779b9047",
+  "hash": "cf58012bf225f124ca1fa8047bd9991e",
   "result": {
     "engine": "knitr",
-    "markdown": "---\ntitle: \"Week 10: Multiple regression and forecasting\"\n---\n\n::: {.cell}\n\n:::\n\n\n## What you will learn this week\n\n* Useful predictors for time series forecasting using regression\n* Selecting predictors\n* Ex ante and ex post forecasting\n\n## Pre-class activities\n\nRead [Chapter 7 of the textbook](https://otexts.com/fpp3/regression.html) and watch all embedded videos\n\n## Exercises (on your own or in tutorial)\n\nComplete Exercises 11-17 from [Section 9.11 of the book](https://otexts.com/fpp3/??-exercises.html).\n\n\n## Slides for seminar\n\n<iframe src='https://docs.google.com/gview?url=https://af.numbat.space/week10slides_week10.pdf&embedded=true' width='100%' height=465></iframe>\n<a href=https://af.numbat.space/week10slides_week10.pdf class='badge badge-small badge-red'>Download pdf</a>\n\n\n## Seminar activities\n\n\n\n## Assignments\n\n* [Retail Project](../assignments/Project.qmd) is due on Friday 24 May.\n",
+    "markdown": "---\ntitle: \"Week 10: Multiple regression and forecasting\"\n---\n\n::: {.cell}\n\n:::\n\n\n## What you will learn this week\n\n* Useful predictors for time series forecasting using regression\n* Selecting predictors\n* Ex ante and ex post forecasting\n\n## Pre-class activities\n\nRead [Chapter 7 of the textbook](https://otexts.com/fpp3/regression.html) and watch all embedded videos\n\n## Exercises (on your own or in tutorial)\n\nComplete Exercises 11-17 from [Section 9.11 of the book](https://otexts.com/fpp3/??-exercises.html).\n\n\n## Slides for seminar\n\n<iframe src='https://docs.google.com/gview?url=https://af.numbat.space/week10slides_week10.pdf&embedded=true' width='100%' height=465></iframe>\n<a href=https://af.numbat.space/week10slides_week10.pdf class='badge badge-small badge-red'>Download pdf</a>\n\n## Assignments\n\n* [Retail Project](../assignments/Project.qmd) is due on Friday 24 May.\n",
     "supporting": [],
     "filters": [
       "rmarkdown/pagebreak.lua"

diff --git a/_freeze/week10/slides/execute-results/tex.json b/_freeze/week10/slides/execute-results/tex.json
@@ -0,0 +1,17 @@
+{
+  "hash": "79652497113d1f059760c711fe2b746f",
+  "result": {
+    "engine": "knitr",
+    "markdown": "---\ntitle: \"ETC3550/ETC5550 Applied&nbsp;forecasting\"\nauthor: \"Week 10: Regression models\"\nformat:\n  beamer:\n    aspectratio: 169\n    fontsize: 14pt\n    section-titles: false\n    knitr:\n      opts_chunk:\n        dev: \"cairo_pdf\"\n    pdf-engine: pdflatex\n    fig-width: 7.5\n    fig-height: 3.5\n    include-in-header: ../header.tex\n---\n\n\n\n\n## Multiple regression and forecasting\n\n\\vspace*{0.2cm}\\begin{block}{}\\vspace*{-0.3cm}\n$$\n  y_t = \\beta_0 + \\beta_1 x_{1,t} + \\beta_2 x_{2,t} + \\cdots + \\beta_kx_{k,t} + \\varepsilon_t.\n$$\n\\end{block}\n\n* $y_t$ is the variable we want to predict: the \"response\" variable\n* Each $x_{j,t}$ is numerical and is called a \"predictor\".\n They are usually assumed to be known for all past and future times.\n* The coefficients $\\beta_1,\\dots,\\beta_k$ measure the effect of each\npredictor *after taking account of the effect of all other predictors\nin the model*.\n* $\\varepsilon_t$ is a white noise error term\n\n## Trend\n\n**Linear trend**\n\n\\centerline{$x_t = t,\\qquad t = 1,2,\\dots,$}\\pause\n\n**Piecewise linear trend with bend at $\\tau$**\n\\vspace*{-0.6cm}\n\\begin{align*}\nx_{1,t} &= t \\\\\nx_{2,t} &= \\left\\{ \\begin{array}{ll}\n  0 & t <\\tau\\\\\n  (t-\\tau) & t \\ge \\tau\n\\end{array}\\right.\n\\end{align*}\n\\pause\\vspace*{-0.8cm}\n\n**Quadratic or higher order trend**\n\n\\centerline{$x_{1,t} =t,\\quad x_{2,t}=t^2,\\quad \\dots$}\n\n\\pause\\vspace*{-0.1cm}\n\\centerline{\\textcolor{orange}{\\textbf{NOT RECOMMENDED!}}}\n\n## Uses of dummy variables\n\\fontsize{13}{14}\\sf\n\n**Seasonal dummies**\n\n* For quarterly data: use 3 dummies\n* For monthly data: use 11 dummies\n* For daily data: use 6 dummies\n* What to do with weekly data?\n\n\\pause\n\n**Outliers**\n\n* A dummy variable can remove its effect.\n\n\\pause\n\n**Public holidays**\n\n* For daily data: if it is a public holiday, dummy=1, otherwise dummy=0.\n\n## Holidays\n\n**For monthly data**\n\n* 
Christmas: always in December so part of monthly seasonal effect\n* Easter: use a dummy variable $v_t=1$ if any part of Easter is in that month, $v_t=0$ otherwise.\n* Ramadan and Chinese New Year similar.\n\n## Fourier series\n\nPeriodic seasonality can be handled using pairs of Fourier \\rlap{terms:}\\vspace*{-0.3cm}\n$$\ns_{k}(t) = \\sin\\left(\\frac{2\\pi k t}{m}\\right)\\qquad c_{k}(t) = \\cos\\left(\\frac{2\\pi k t}{m}\\right)\n$$\n$$\ny_t = a + bt + \\sum_{k=1}^K \\left[\\alpha_k s_k(t) + \\beta_k c_k(t)\\right] + \\varepsilon_t$$\\vspace*{-0.8cm}\n\n* Every periodic function can be approximated by sums of sin and cos terms for large enough $K$.\n* Choose $K$ by minimizing AICc or CV.\n* Called \"harmonic regression\"\n\n## Distributed lags\n\nLagged values of a predictor.\n\nExample: $x$ is advertising which has a delayed effect\n\n\\vspace*{-0.8cm}\\begin{align*}\n  x_{1} &= \\text{advertising for previous month;} \\\\\n  x_{2} &= \\text{advertising for two months previously;} \\\\\n        & \\vdots \\\\\n  x_{m} &= \\text{advertising for $m$ months previously.}\n\\end{align*}\n\n## Comparing regression models\n\\fontsize{13}{14}\\sf\n\n* $R^2$  does not allow for \"degrees of freedom\".\n* Adding *any* variable tends to increase the value of $R^2$, even if that variable is irrelevant.\n\\pause\n\nTo overcome this problem, we can use *adjusted $R^2$*:\n\\begin{block}{}\n$$\n\\bar{R}^2 = 1-(1-R^2)\\frac{T-1}{T-k-1}\n$$\nwhere $k=$ no.\\ predictors and $T=$ no.\\ observations.\n\\end{block}\n\n\\pause\n\n\\begin{alertblock}{Maximizing $\\bar{R}^2$ is equivalent to minimizing $\\hat\\sigma^2$.}\n\\centerline{$\\displaystyle\n\\hat{\\sigma}^2 = \\frac{1}{T-k-1}\\sum_{t=1}^T \\varepsilon_t^2$\n}\n\\end{alertblock}\n\n## Akaike's Information Criterion\n\n\\vspace*{0.2cm}\\begin{block}{}\n\\centerline{$\\text{AIC} = -2\\log(L) + 2(k+2)$}\n\\end{block}\\vspace*{-0.5cm}\n\n* $L=$ likelihood\n* $k=$ \\# predictors in model.\n* AIC penalizes terms more heavily than 
$\\bar{R}^2$.\n\n\\pause\\begin{block}{}\n\\centerline{$\\text{AIC}_{\\text{C}} = \\text{AIC} + \\frac{2(k+2)(k+3)}{T-k-3}$}\n\\end{block}\n\n* Minimizing the AIC or AICc is asymptotically equivalent to minimizing MSE via **leave-one-out cross-validation** (for any linear regression).\n\n## Leave-one-out cross-validation\n\nFor regression, leave-one-out cross-validation is faster and more efficient than time-series cross-validation.\n\n* Select one observation for test set, and use *remaining* observations in training set. Compute error on test observation.\n* Repeat using each possible observation as the test set.\n* Compute accuracy measure over all errors.\n\n\n::: {.cell}\n\n:::\n\n\n## Cross-validation {-}\n\n**Traditional evaluation**\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](slides_files/figure-beamer/traintest1-1.pdf)\n:::\n:::\n\n\n\\pause\n\n**Time series cross-validation**\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](slides_files/figure-beamer/tscvggplot1-1.pdf)\n:::\n:::\n\n\n## Cross-validation {-}\n\n**Traditional evaluation**\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](slides_files/figure-beamer/traintest1a-1.pdf)\n:::\n:::\n\n\n**Leave-one-out cross-validation**\n\n\n::: {.cell}\n::: {.cell-output-display}\n![](slides_files/figure-beamer/unnamed-chunk-1-1.pdf)\n:::\n:::\n\n\n\\only<2>{\\begin{textblock}{4}(6,6)\\begin{block}{}\\fontsize{13}{15}\\sf\nCV = MSE on \\textcolor[HTML]{D55E00}{test sets}\\end{block}\\end{textblock}}\n\n## Bayesian Information Criterion\n\n\\begin{block}{}\n$$\n\\text{BIC} = -2\\log(L) + (k+2)\\log(T)\n$$\n\\end{block}\nwhere $L$ is the likelihood and $k$ is the number of predictors in the model.\\pause\n\n* BIC penalizes terms more heavily than AIC\n* Also called SBIC and SC.\n* Minimizing BIC is asymptotically equivalent to leave-$v$-out cross-validation when $v = T[1-1/(log(T)-1)]$.\n\n## Choosing regression variables\n\\fontsize{14}{15}\\sf\n\n**Best subsets regression**\n\n* Fit all possible 
regression models using one or more of the predictors.\n* Choose the best model based on one of the measures of predictive ability (CV, AIC, AICc).\n\\pause\n\n**Backwards stepwise regression**\n\n* Start with a model containing all variables.\n* Subtract one variable at a time. Keep model if lower CV.\n* Iterate until no further improvement.\n* Not guaranteed to lead to best model.\n\n## Ex-ante versus ex-post forecasts\n\n * *Ex ante forecasts* are made using only information available in advance.\n    - require forecasts of predictors\n * *Ex post forecasts* are made using later information on the predictors.\n    - useful for studying behaviour of forecasting models.\n\n * trend, seasonal and calendar variables are all known in advance, so these don't need to be forecast.\n",
+    "supporting": [
+      "slides_files"
+    ],
+    "filters": [
+      "rmarkdown/pagebreak.lua"
+    ],
+    "includes": {},
+    "engineDependencies": {},
+    "preserve": null,
+    "postProcess": false
+  }
+}
diff --git a/_freeze/week10/slides/figure-beamer/traintest1-1.pdf b/_freeze/week10/slides/figure-beamer/traintest1-1.pdf
diff --git a/_freeze/week10/slides/figure-beamer/traintest1a-1.pdf b/_freeze/week10/slides/figure-beamer/traintest1a-1.pdf
diff --git a/_freeze/week10/slides/figure-beamer/tscvggplot1-1.pdf b/_freeze/week10/slides/figure-beamer/tscvggplot1-1.pdf
diff --git a/_freeze/week10/slides/figure-beamer/unnamed-chunk-1-1.pdf b/_freeze/week10/slides/figure-beamer/unnamed-chunk-1-1.pdf
diff --git a/_freeze/week10/slides_week10/figure-beamer/traintest1-1.pdf b/_freeze/week10/slides_week10/figure-beamer/traintest1-1.pdf
diff --git a/_freeze/week10/slides_week10/figure-beamer/traintest1a-1.pdf b/_freeze/week10/slides_week10/figure-beamer/traintest1a-1.pdf
diff --git a/_freeze/week10/slides_week10/figure-beamer/tscvggplot1-1.pdf b/_freeze/week10/slides_week10/figure-beamer/tscvggplot1-1.pdf
diff --git a/_freeze/week10/slides_week10/figure-beamer/unnamed-chunk-1-1.pdf b/_freeze/week10/slides_week10/figure-beamer/unnamed-chunk-1-1.pdf
diff --git a/_freeze/week11/index/execute-results/html.json b/_freeze/week11/index/execute-results/html.json
@@ -1,8 +1,8 @@
 {
-  "hash": "42d043ce65fc34441751e5f5f48cdad3",
+  "hash": "c66a0d589ae7b0bf7f888d55183e2dec",
   "result": {
     "engine": "knitr",
-    "markdown": "---\ntitle: \"Week 11: Dynamic regression\"\n---\n\n::: {.cell}\n\n:::\n\n\n## What you will learn this week\n\n* How to combine regression models with ARIMA models to form dynamic regression models\n* Dynamic harmonic regression to handle complex seasonality\n* Lagged predictors\n\n## Pre-class activities\n\nRead [Chapter 10 of the textbook](https://otexts.com/fpp3/dynamic.html) and watch all embedded videos\n\n## Exercises (on your own or in tutorial)\n\nComplete Exercises 1-7 from [Section 7.10 of the book](https://otexts.com/fpp3/regression-exercises.html).\n\n\n## Slides for seminar\n\n<iframe src='https://docs.google.com/gview?url=https://af.numbat.space/week11slides_week11.pdf&embedded=true' width='100%' height=465></iframe>\n<a href=https://af.numbat.space/week11slides_week11.pdf class='badge badge-small badge-red'>Download pdf</a>\n\n\n## Seminar activities\n\nRepeat the daily electricity example, but instead of using a quadratic function of temperature, use a piecewise linear function with the \"knot\" around 20 degrees Celsius (use predictors `Temperature` & `Temp2`). How can you optimize the choice of knot?\n\nThe data can be created as follows.\n\n```r\nvic_elec_daily <- vic_elec |>\n  filter(year(Time) == 2014) |>\n  index_by(Date = date(Time)) |>\n  summarise(\n    Demand = sum(Demand)/1e3,\n    Temperature = max(Temperature),\n    Holiday = any(Holiday)\n  ) |>\n  mutate(\n    Temp2 = I(pmax(Temperature-20,0)),\n    Day_Type = case_when(\n      Holiday ~ \"Holiday\",\n      wday(Date) %in% 2:6 ~ \"Weekday\",\n      TRUE ~ \"Weekend\"\n    )\n  )\n```\n\nRepeat but using all available data, and handling the annual seasonality using Fourier terms.\n\n\n\n## Assignments\n\n* [Retail Project](../assignments/Project.qmd) is due on Friday 24 May.\n",
+    "markdown": "---\ntitle: \"Week 11: Dynamic regression\"\n---\n\n::: {.cell}\n\n:::\n\n\n## What you will learn this week\n\n* How to combine regression models with ARIMA models to form dynamic regression models\n* Dynamic harmonic regression to handle complex seasonality\n* Lagged predictors\n\n## Pre-class activities\n\nRead [Chapter 10 of the textbook](https://otexts.com/fpp3/dynamic.html) and watch all embedded videos\n\n## Exercises (on your own or in tutorial)\n\nComplete Exercises 1-7 from [Section 7.10 of the book](https://otexts.com/fpp3/regression-exercises.html).\n\n\n## Slides for seminar\n\n<iframe src='https://docs.google.com/gview?url=https://af.numbat.space/week11slides_week11.pdf&embedded=true' width='100%' height=465></iframe>\n<a href=https://af.numbat.space/week11slides_week11.pdf class='badge badge-small badge-red'>Download pdf</a>\n\n## Seminar activities\n\n\n\nRepeat the daily electricity example, but instead of using a quadratic function of temperature, use a piecewise linear function with the \"knot\" around 20 degrees Celsius (use predictors `Temperature` & `Temp2`). How can you optimize the choice of knot?\n\nThe data can be created as follows.\n\n```r\nvic_elec_daily <- vic_elec |>\n  filter(year(Time) == 2014) |>\n  index_by(Date = date(Time)) |>\n  summarise(\n    Demand = sum(Demand)/1e3,\n    Temperature = max(Temperature),\n    Holiday = any(Holiday)\n  ) |>\n  mutate(\n    Temp2 = I(pmax(Temperature-20,0)),\n    Day_Type = case_when(\n      Holiday ~ \"Holiday\",\n      wday(Date) %in% 2:6 ~ \"Weekday\",\n      TRUE ~ \"Weekend\"\n    )\n  )\n```\n\nRepeat but using all available data, and handling the annual seasonality using Fourier terms.\n\n\n\n## Assignments\n\n* [Retail Project](../assignments/Project.qmd) is due on Friday 24 May.\n",
     "supporting": [],
     "filters": [
       "rmarkdown/pagebreak.lua"