Showing 26 changed files with 353 additions and 26 deletions.
Binary file modified (-185 Bytes, 100%): docs/week2/images/slides.rmarkdown/data-in-model-space1-1.png
Binary file modified (-1.18 KB, 100%): docs/week2/images/slides.rmarkdown/model-in-the-data-space1-1.png
Binary file modified (-77 Bytes, 100%): docs/week2/images/slides.rmarkdown/model-in-the-data-space2-1.png
@@ -0,0 +1,276 @@
---
title: "ETC3250/5250 Introduction to Machine Learning"
title-slide-attributes: 
  data-background-image: "../images/bg.png"
  data-background-size: 100%
subtitle: "Week 6: Neural networks and deep learning"
author: 
 - name: "Professor Di Cook"
   email: "[email protected]"
   institute: "Department of Econometrics and Business Statistics"
footer: "ETC3250/5250 Lecture 6 | [iml.numbat.space](iml.numbat.space)"
format:
  revealjs:
    multiplex: false
    slide-number: c/t
    slide-tone: false
    theme: "../assets/monash.scss"
    width: 1600
    height: 900
    margin: 0.05
    embed-resources: true
---

```{r, include = FALSE}
source("../setup.R")
library(ggpubr)
library(kableExtra)
```

## Overview

We will cover:

* Structure of a neural network
* Fitting neural networks
* Diagnosing the fit

## Structure of a neural network {.transition-slide .center}

## Nested logistic regressions

:::: {.columns}
::: {.column}

Remember the logistic function:

\begin{align}
f(x) &= \frac{e^{\beta_0+\sum_{j=1}^p\beta_jx_j}}{1+e^{\beta_0+\sum_{j=1}^p\beta_jx_j}}\\
&= \frac{1}{1+e^{-(\beta_0+\sum_{j=1}^p\beta_jx_j)}}
\end{align}

Also,

$$\log_e\frac{f(x)}{1 - f(x)} = \beta_0+\sum_{j=1}^p\beta_jx_j$$
:::

::: {.column}
::: {.fragment}

<br><br>
[Above the threshold predict to be 1.]{.smaller}

```{r}
#| echo: false
#| out-width: 70%
#| fig-width: 4
#| fig-height: 4
library(tidyverse)
x <- seq(-2, 2, 0.1)
y <- exp(1+3*x)/(1+exp(1+3*x))
df2 <- tibble(x, y)
ggplot(df2, aes(x=x, y=y)) +
  geom_line() +
  geom_hline(yintercept=0.5, colour="orange") +
  annotate("text", x=0.84, y=0.55, label="Activation threshold ??", colour="orange") +
  geom_hline(yintercept=c(0,1), linetype=2)
```

:::

:::
::::

## Linear regression as a network

:::: {.columns}
::: {.column}
$$\widehat{y} =b_0+\sum_{j=1}^pb_jx_j$$

Drawing as a network model:

$p$ [inputs]{.monash-orange2} (predictors), multiplied by [weights]{.monash-orange2} (coefficients), summed, add a [constant]{.monash-orange2}, predicts [output]{.monash-orange2} (response).
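
As a quick numeric check (a sketch added here, not in the original slides; it assumes the `palmerpenguins` package is available), the fitted `lm()` coefficients play the role of the weights and constant:

```{r}
#| echo: true
#| eval: false
# Sketch (assumes palmerpenguins is installed): reproduce lm()
# predictions as constant + sum of weighted inputs.
fit <- lm(bill_length_mm ~ body_mass_g + flipper_length_mm,
          data = na.omit(palmerpenguins::penguins))
b <- coef(fit)                   # b[1] = constant, b[-1] = weights
X <- model.matrix(fit)[, -1]     # the p inputs
yhat <- b[1] + X %*% b[-1]       # weighted sum plus constant
all.equal(as.numeric(yhat), unname(predict(fit)))
```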
:::
::: {.column}
{width=80%}

:::
::::

## Single hidden layer NN

:::: {.columns}
::: {.column}

\begin{align}
\widehat{y} = a_{0}+\sum_{k=1}^s a_{k}\left(b_{0k}+\sum_{j=1}^p b_{jk}x_j\right)
\end{align}
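
A minimal base R sketch of this forward pass (added for illustration; the weights are made up and a sigmoid activation at the hidden nodes is assumed):

```{r}
#| echo: true
#| eval: false
# Sketch only: forward pass of a single-hidden-layer network with
# s = 2 hidden nodes, p = 3 inputs, sigmoid activation (assumed).
sigmoid <- function(z) 1 / (1 + exp(-z))
x  <- c(0.2, -1.0, 0.5)                    # one observation, p = 3
B  <- matrix(c( 0.1,  0.4, -0.3,           # b_{jk}: p x s weight matrix
               -0.2,  0.7,  0.5), ncol = 2)
b0 <- c(0.05, -0.10)                       # b_{0k}: hidden-node constants
a  <- c(0.3, 1.2)                          # a_k: output weights
a0 <- 0.5                                  # a_0: output constant
h  <- sigmoid(b0 + as.vector(t(B) %*% x))  # hidden-node outputs
yhat <- a0 + sum(a * h)                    # predicted response
yhat
```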

:::
::: {.column}

![](../images/nn_annotate.png){width=90%}
:::
::::

## What does this look like? [(1/2)]{.smallest}

:::: {.columns}
::: {.column style="font-size: 70%"}
The architecture allows for combining multiple linear models to generate non-linear classifications.

```{r}
#| echo: false
#| out-width: 70%
#| fig-width: 4
#| fig-height: 5
w <- read_csv(here::here("data/wiggly.csv"))
ggplot(w, aes(x=x, y=y, colour=class, shape=class)) +
  geom_point() +
  scale_color_brewer("", palette="Dark2") +
  scale_shape("") +
  theme(legend.position = "bottom")
```

:::
::: {.column style="font-size: 70%"}

The best fit uses $s=4$, four nodes in the hidden layer. Can you sketch four lines that would split this data well?
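
The stored fit (`nnet_best.rda`) was produced elsewhere; a hedged sketch of how such a fit could be obtained with `nnet` (the seed and tuning values here are illustrative, not the ones used):

```{r}
#| echo: true
#| eval: false
# Sketch of how such a fit could be produced with nnet (assumed call;
# the stored nnet_best.rda object was created separately).
library(nnet)
w <- read_csv(here::here("data/wiggly.csv"))
set.seed(2023)                                   # arbitrary seed
fit_nn <- nnet(factor(class) ~ x + y, data = w,
               size = 4, decay = 0.01, maxit = 500)
table(w$class, predict(fit_nn, type = "class"))  # confusion table on training data
```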

```{r}
#| echo: false
#| out-width: 70%
#| fig-width: 4
#| fig-height: 5
load(here::here("data/nnet_many.rda"))
load(here::here("data/nnet_best.rda"))
ggplot(subset(best$output, node == 1), aes(x, y)) +
  geom_raster(aes(fill = pred)) +
  geom_point(aes(shape = class), data = w) +
  scale_fill_gradient2("", low="#1B9E77",
                       high="#D95F02",
                       mid = "white",
                       midpoint = 0.5,
                       guide = "colourbar",
                       limits = c(0,1)) +
  scale_shape("") +
  theme(legend.position = "bottom",
        legend.text = element_text(size=6))
```
:::
::::

[[Wickham et al (2015) Removing the Blindfold](http://onlinelibrary.wiley.com/doi/10.1002/sam.11271/abstract)]{.smallest}

## What does this look like? [(2/2)]{.smallest}

:::: {.columns}
::: {.column}

The models at each of the nodes of the hidden layer.

```{r}
#| echo: false
#| out-width: 90%
#| fig-width: 5
#| fig-height: 4
ggplot(data=best$hidden, aes(x, y)) +
  geom_tile(aes(fill=pred)) +
  geom_point(data=w, aes(shape=class)) +
  facet_wrap(~node, ncol=2) +
  scale_fill_gradient2("", low="#AF8DC3",
                       mid="#F7F7F7",
                       high="#7FBF7B",
                       midpoint=0.5,
                       limits=c(0,1)) +
  scale_shape("") +
  theme(axis.text = element_blank(),
        axis.title = element_blank(),
        legend.text = element_text(size=6))
```
:::

::: {.column}
::: {.fragment}

```{r}
#| echo: false
#| out-width: 70%
#| fig-width: 4
#| fig-height: 4
ggplot(data=best$hidden, aes(x, y)) +
  geom_contour(aes(z=pred, group=node),
               colour="grey50",
               size=2,
               breaks = 0.5) +
  geom_point(data=w, aes(colour=class,
                         shape=class)) +
  scale_color_brewer("", palette="Dark2") +
  scale_shape("") +
  theme(legend.position = "bottom",
        legend.text = element_text(size=6))
```
:::
:::
::::

## But can be painful to find the best!

These are all the models fitted, using $s=2, 3, 4$ with the fit statistics.

```{r}
#| echo: false
#| out-width: 100%
#| fig-width: 12
#| fig-height: 4
qual <- unique(many[, c("value", "accuracy", "nodes", "id")])
ggplot(qual, aes(x=accuracy, y=value)) +
  geom_point(alpha=0.7, size=3) +
  xlab("Accuracy") +
  ylab("Value of fitting criterion") +
  facet_wrap(~nodes)
```

Fitted using the R package `nnet`. The fitting is very unstable, and this remains a problem with current procedures.
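
For instance (an illustrative sketch, not the code that generated the stored results), refitting the same architecture from different random starts gives a spread of training accuracies:

```{r}
#| echo: true
#| eval: false
# Sketch: the same s = 4 architecture refitted from different random
# starting weights can land on quite different solutions.
library(nnet)
w <- read_csv(here::here("data/wiggly.csv"))
accs <- sapply(1:20, function(i) {
  set.seed(i)
  fit <- nnet(factor(class) ~ x + y, data = w,
              size = 4, decay = 0.01, maxit = 500, trace = FALSE)
  mean(predict(fit, type = "class") == w$class)
})
summary(accs)   # spread of training accuracies across random starts
```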

## Fitting with keras {.transition-slide .center}

## Steps

1. Define architecture

    - If you are classifying images, you need to flatten each image into a single row of data, e.g. a 24x24 pixel image becomes a row of 576 values. Each pixel is a variable. (A sketch of the full workflow follows this list.)
    - How many hidden layers do you need?
    - How many nodes in the hidden layer?

2. Specify activation

    - linear
    - relu
    - sigmoid
    - softmax

3. Choose loss function

4. Training process

5. Evaluation
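
A skeleton of these five steps with the R `keras` interface (illustrative only: `x_train`, `y_train`, `x_test`, `y_test` are placeholder objects, and the layer sizes, loss and optimiser are arbitrary choices, not taken from the source):

```{r}
#| echo: true
#| eval: false
# Illustrative skeleton only: x_train, y_train, x_test, y_test are
# placeholders; sizes, loss and optimiser are arbitrary choices.
library(keras)

# 1. Define the architecture: flatten 24x24 images to 576 inputs,
#    one hidden layer with 16 nodes, 10-class output (assumed)
model <- keras_model_sequential() %>%
  layer_flatten(input_shape = c(24, 24)) %>%
  layer_dense(units = 16, activation = "relu") %>%     # 2. activation
  layer_dense(units = 10, activation = "softmax")

# 3. Loss function (plus optimiser and metric) set at compile time
model %>% compile(
  loss = "sparse_categorical_crossentropy",
  optimizer = "adam",
  metrics = "accuracy"
)

# 4. Training
history <- model %>% fit(x_train, y_train,
                         epochs = 20, validation_split = 0.2)

# 5. Evaluation on the test set
model %>% evaluate(x_test, y_test)
```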

## Example: penguins

## Example: fashion_mnist
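
A hedged sketch for this example (the slide content itself is not in the source): `fashion_mnist` ships with keras as 28x28 greyscale images, so flattening gives 784 inputs per image.

```{r}
#| echo: true
#| eval: false
# Sketch only: small dense network on fashion_mnist (28x28 images,
# 10 classes); layer sizes and epochs are arbitrary.
library(keras)
fashion <- dataset_fashion_mnist()
x_train <- fashion$train$x / 255          # rescale pixels to [0, 1]
y_train <- fashion$train$y
x_test  <- fashion$test$x / 255
y_test  <- fashion$test$y

model <- keras_model_sequential() %>%
  layer_flatten(input_shape = c(28, 28)) %>%   # 28x28 -> 784 inputs
  layer_dense(units = 128, activation = "relu") %>%
  layer_dense(units = 10, activation = "softmax")

model %>% compile(
  loss = "sparse_categorical_crossentropy",
  optimizer = "adam",
  metrics = "accuracy"
)
model %>% fit(x_train, y_train, epochs = 10, validation_split = 0.2)
model %>% evaluate(x_test, y_test)
```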

## Want to learn more?

This is a very nice slide set: [A gentle introduction to deep learning in R using Keras](https://lnalborczyk.github.io/slides/vendredi_quanti_2021/vendredi_quantis#1)

And the tutorials at [TensorFlow for R](https://tensorflow.rstudio.com/install/) have lots of examples.

## Next: Explainable artificial intelligence (XAI) {.transition-slide .center}
Binary file modified (-1.18 KB, 100%): week2/images/slides.rmarkdown/model-in-the-data-space1-1.png
Binary file modified (-77 Bytes, 100%): week2/images/slides.rmarkdown/model-in-the-data-space2-1.png