Commit
1 parent 6c095e3, commit 2dfb2e5
Showing 57 changed files with 205 additions and 32 deletions.
4 changes: 2 additions & 2 deletions
...ek1/slides_week1/execute-results/tex.json → ...eze/week1/slides/execute-results/tex.json
Large diffs are not rendered by default.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
{ | ||
"hash": "79652497113d1f059760c711fe2b746f", | ||
"result": { | ||
"engine": "knitr", | ||
"markdown": "---\ntitle: \"ETC3550/ETC5550 Applied forecasting\"\nauthor: \"Week 10: Regression models\"\nformat:\n beamer:\n aspectratio: 169\n fontsize: 14pt\n section-titles: false\n knitr:\n opts_chunk:\n dev: \"cairo_pdf\"\n pdf-engine: pdflatex\n fig-width: 7.5\n fig-height: 3.5\n include-in-header: ../header.tex\n---\n\n\n\n\n## Multiple regression and forecasting\n\n\\vspace*{0.2cm}\\begin{block}{}\\vspace*{-0.3cm}\n$$\n y_t = \\beta_0 + \\beta_1 x_{1,t} + \\beta_2 x_{2,t} + \\cdots + \\beta_kx_{k,t} + \\varepsilon_t.\n$$\n\\end{block}\n\n* $y_t$ is the variable we want to predict: the \"response\" variable\n* Each $x_{j,t}$ is numerical and is called a \"predictor\".\n They are usually assumed to be known for all past and future times.\n* The coefficients $\\beta_1,\\dots,\\beta_k$ measure the effect of each\npredictor *after taking account of the effect of all other predictors\nin the model*.\n* $\\varepsilon_t$ is a white noise error term\n\n## Trend\n\n**Linear trend**\n\n\\centerline{$x_t = t,\\qquad t = 1,2,\\dots,$}\\pause\n\n**Piecewise linear trend with bend at $\\tau$**\n\\vspace*{-0.6cm}\n\\begin{align*}\nx_{1,t} &= t \\\\\nx_{2,t} &= \\left\\{ \\begin{array}{ll}\n 0 & t <\\tau\\\\\n (t-\\tau) & t \\ge \\tau\n\\end{array}\\right.\n\\end{align*}\n\\pause\\vspace*{-0.8cm}\n\n**Quadratic or higher order trend**\n\n\\centerline{$x_{1,t} =t,\\quad x_{2,t}=t^2,\\quad \\dots$}\n\n\\pause\\vspace*{-0.1cm}\n\\centerline{\\textcolor{orange}{\\textbf{NOT RECOMMENDED!}}}\n\n## Uses of dummy variables\n\\fontsize{13}{14}\\sf\n\n**Seasonal dummies**\n\n* For quarterly data: use 3 dummies\n* For monthly data: use 11 dummies\n* For daily data: use 6 dummies\n* What to do with weekly data?\n\n\\pause\n\n**Outliers**\n\n* A dummy variable can remove its effect.\n\n\\pause\n\n**Public holidays**\n\n* For daily data: if it is a public holiday, dummy=1, otherwise dummy=0.\n\n## Holidays\n\n**For monthly data**\n\n* Christmas: always in December so part of monthly 
seasonal effect\n* Easter: use a dummy variable $v_t=1$ if any part of Easter is in that month, $v_t=0$ otherwise.\n* Ramadan and Chinese New Year similar.\n\n## Fourier series\n\nPeriodic seasonality can be handled using pairs of Fourier \\rlap{terms:}\\vspace*{-0.3cm}\n$$\ns_{k}(t) = \\sin\\left(\\frac{2\\pi k t}{m}\\right)\\qquad c_{k}(t) = \\cos\\left(\\frac{2\\pi k t}{m}\\right)\n$$\n$$\ny_t = a + bt + \\sum_{k=1}^K \\left[\\alpha_k s_k(t) + \\beta_k c_k(t)\\right] + \\varepsilon_t$$\\vspace*{-0.8cm}\n\n* Every periodic function can be approximated by sums of sin and cos terms for large enough $K$.\n* Choose $K$ by minimizing AICc or CV.\n* Called \"harmonic regression\"\n\n## Distributed lags\n\nLagged values of a predictor.\n\nExample: $x$ is advertising which has a delayed effect\n\n\\vspace*{-0.8cm}\\begin{align*}\n x_{1} &= \\text{advertising for previous month;} \\\\\n x_{2} &= \\text{advertising for two months previously;} \\\\\n & \\vdots \\\\\n x_{m} &= \\text{advertising for $m$ months previously.}\n\\end{align*}\n\n## Comparing regression models\n\\fontsize{13}{14}\\sf\n\n* $R^2$ does not allow for \"degrees of freedom\".\n* Adding *any* variable tends to increase the value of $R^2$, even if that variable is irrelevant.\n\\pause\n\nTo overcome this problem, we can use *adjusted $R^2$*:\n\\begin{block}{}\n$$\n\\bar{R}^2 = 1-(1-R^2)\\frac{T-1}{T-k-1}\n$$\nwhere $k=$ no.\\ predictors and $T=$ no.\\ observations.\n\\end{block}\n\n\\pause\n\n\\begin{alertblock}{Maximizing $\\bar{R}^2$ is equivalent to minimizing $\\hat\\sigma^2$.}\n\\centerline{$\\displaystyle\n\\hat{\\sigma}^2 = \\frac{1}{T-k-1}\\sum_{t=1}^T \\varepsilon_t^2$\n}\n\\end{alertblock}\n\n## Akaike's Information Criterion\n\n\\vspace*{0.2cm}\\begin{block}{}\n\\centerline{$\\text{AIC} = -2\\log(L) + 2(k+2)$}\n\\end{block}\\vspace*{-0.5cm}\n\n* $L=$ likelihood\n* $k=$ \\# predictors in model.\n* AIC penalizes terms more heavily than 
$\\bar{R}^2$.\n\n\\pause\\begin{block}{}\n\\centerline{$\\text{AIC}_{\\text{C}} = \\text{AIC} + \\frac{2(k+2)(k+3)}{T-k-3}$}\n\\end{block}\n\n* Minimizing the AIC or AICc is asymptotically equivalent to minimizing MSE via **leave-one-out cross-validation** (for any linear regression).\n\n## Leave-one-out cross-validation\n\nFor regression, leave-one-out cross-validation is faster and more efficient than time-series cross-validation.\n\n* Select one observation for test set, and use *remaining* observations in training set. Compute error on test observation.\n* Repeat using each possible observation as the test set.\n* Compute accuracy measure over all errors.\n\n\n::: {.cell}\n\n:::\n\n\n## Cross-validation {-}\n\n**Traditional evaluation**\n\n\n::: {.cell}\n::: {.cell-output-display}\n[Figure: Traditional evaluation]\n:::\n:::\n\n\n\\pause\n\n**Time series cross-validation**\n\n\n::: {.cell}\n::: {.cell-output-display}\n[Figure: Time series cross-validation]\n:::\n:::\n\n\n## Cross-validation {-}\n\n**Traditional evaluation**\n\n\n::: {.cell}\n::: {.cell-output-display}\n[Figure: Traditional evaluation]\n:::\n:::\n\n\n**Leave-one-out cross-validation**\n\n\n::: {.cell}\n::: {.cell-output-display}\n[Figure: Leave-one-out cross-validation]\n:::\n:::\n\n\n\\only<2>{\\begin{textblock}{4}(6,6)\\begin{block}{}\\fontsize{13}{15}\\sf\nCV = MSE on \\textcolor[HTML]{D55E00}{test sets}\\end{block}\\end{textblock}}\n\n## Bayesian Information Criterion\n\n\\begin{block}{}\n$$\n\\text{BIC} = -2\\log(L) + (k+2)\\log(T)\n$$\n\\end{block}\nwhere $L$ is the likelihood and $k$ is the number of predictors in the model.\\pause\n\n* BIC penalizes terms more heavily than AIC\n* Also called SBIC and SC.\n* Minimizing BIC is asymptotically equivalent to leave-$v$-out 
cross-validation when $v = T[1-1/(\\log(T)-1)]$.\n\n## Choosing regression variables\n\\fontsize{14}{15}\\sf\n\n**Best subsets regression**\n\n* Fit all possible regression models using one or more of the predictors.\n* Choose the best model based on one of the measures of predictive ability (CV, AIC, AICc).\n\\pause\n\n**Backwards stepwise regression**\n\n* Start with a model containing all variables.\n* Subtract one variable at a time. Keep model if lower CV.\n* Iterate until no further improvement.\n* Not guaranteed to lead to best model.\n\n## Ex-ante versus ex-post forecasts\n\n * *Ex ante forecasts* are made using only information available in advance.\n - require forecasts of predictors\n * *Ex post forecasts* are made using later information on the predictors.\n - useful for studying behaviour of forecasting models.\n\n * trend, seasonal and calendar variables are all known in advance, so these don't need to be forecast.\n", | ||
"supporting": [ | ||
"slides_files" | ||
], | ||
"filters": [ | ||
"rmarkdown/pagebreak.lua" | ||
], | ||
"includes": {}, | ||
"engineDependencies": {}, | ||
"preserve": null, | ||
"postProcess": false | ||
} | ||
} |
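The slides stored in the `markdown` field above define adjusted $R^2$, AIC, AICc, BIC and leave-one-out cross-validation for regression models. The following is a minimal Python sketch of those measures for a single linear-trend predictor ($k = 1$), not code from this repository. It assumes Gaussian errors, so that $-2\log(L)$ can be replaced by $T\log(\text{SSE}/T)$ with additive constants dropped. The data values and the function names `fit_line`, `loocv_mse` and `model_measures` are invented for illustration.

```python
import math


def fit_line(x, y):
    """Ordinary least squares for y = b0 + b1 * x; returns (b0, b1)."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    sxx = sum((xi - xbar) ** 2 for xi in x)
    b1 = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / sxx
    return ybar - b1 * xbar, b1


def loocv_mse(x, y):
    """Leave-one-out CV by brute force: refit without point i, then predict it."""
    errors = []
    for i in range(len(x)):
        b0, b1 = fit_line(x[:i] + x[i + 1:], y[:i] + y[i + 1:])
        errors.append(y[i] - (b0 + b1 * x[i]))
    return sum(e * e for e in errors) / len(errors)


def model_measures(x, y, k=1):
    """Measures from the slides; k is the number of predictors (1 here)."""
    T = len(y)
    b0, b1 = fit_line(x, y)
    ybar = sum(y) / T
    sse = sum((yi - (b0 + b1 * xi)) ** 2 for xi, yi in zip(x, y))
    sst = sum((yi - ybar) ** 2 for yi in y)
    r2 = 1 - sse / sst
    # Assumption: Gaussian errors, so -2 log(L) = T log(SSE / T) + constant.
    aic = T * math.log(sse / T) + 2 * (k + 2)
    return {
        "sse": sse,
        "r2": r2,
        "adj_r2": 1 - (1 - r2) * (T - 1) / (T - k - 1),
        "aic": aic,
        "aicc": aic + 2 * (k + 2) * (k + 3) / (T - k - 3),
        "bic": T * math.log(sse / T) + (k + 2) * math.log(T),
    }


# Invented example data: linear trend predictor x_t = t with a noisy response.
x = list(range(1, 13))
y = [2.1, 2.9, 4.2, 4.8, 6.1, 7.2, 7.8, 9.1, 9.9, 11.2, 11.8, 13.1]

m = model_measures(x, y)
cv = loocv_mse(x, y)
for name, value in m.items():
    print(f"{name}: {value:.4f}")
print(f"cv: {cv:.4f}")
```

With $T = 12$ and $k = 1$, the AICc correction is $2(3)(4)/8 = 3$, and BIC exceeds AIC because $\log(12) > 2$. The brute-force refit loop is used only to keep the sketch dependency-free; it is not the fastest way to compute leave-one-out errors for a linear model.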
Binary file not shown.
Binary file modified (not shown): _freeze/week10/slides_week10/figure-beamer/traintest1a-1.pdf, +0 Bytes (100%)
Binary file modified (not shown): _freeze/week10/slides_week10/figure-beamer/tscvggplot1-1.pdf, +0 Bytes (100%)
Binary file modified (not shown): _freeze/week10/slides_week10/figure-beamer/unnamed-chunk-1-1.pdf, -1 Byte (100%)