5664973
First draft of Tutorial 7.1, interactions
ghostpress Sep 16, 2025
5a0b0b9
Moved R code to Rmd file with chunks and exercise headings
ghostpress Sep 17, 2025
112f42e
Push before meeting
ghostpress Sep 18, 2025
c52a7b5
Updated todos
ghostpress Sep 18, 2025
2e717e2
Finalized Tutorial 7 lesson 1 and cleaned up draft files
ghostpress Oct 2, 2025
4012f57
Added html and drafts to gitignore
ghostpress Oct 2, 2025
cca5f41
Finalized Tutorial 7 lesson 2
ghostpress Oct 2, 2025
3383325
Updated model math to match OpenIntro formatting
ghostpress Oct 2, 2025
e8c375f
Removed hints that wouldn't compile
ghostpress Oct 2, 2025
f202845
Added note about model interpretation
ghostpress Oct 2, 2025
596a3df
Updated T7L1
ghostpress Oct 9, 2025
83c550d
Forgot to push changes
ghostpress Oct 15, 2025
38172ed
Initial simulation lesson
ghostpress Oct 17, 2025
f7c0a4b
Added plots of bootstrap lines
ghostpress Oct 28, 2025
00e0cb3
alternative ways of installing learnrhash
jhuggins Oct 31, 2025
2b4da26
Merge branch 'tutorial-7-lv' of github.com:BU-Intro-Stats/MA214-tutor…
jhuggins Oct 31, 2025
c16cf3c
Another try at 7.3 modeling uncertainty, drafted extrapolation/interp…
ghostpress Nov 3, 2025
6ce8fff
Started 7.4
ghostpress Nov 3, 2025
48207fa
Finished draft of Tutorial 7.4 on cross-validation
ghostpress Nov 3, 2025
eaf7706
Placeholder file for Tutorial 7.5 on linear regression
ghostpress Nov 3, 2025
f60cc00
Merge branch 'tutorial-7-lv' of https://github.com/jhuggins/MA214-tut…
ghostpress Nov 3, 2025
5768b7b
Fixed compile issues
ghostpress Nov 3, 2025
5d1327d
Updates to 7.3
ghostpress Nov 3, 2025
7f72795
Added graphs to show uncertainty in model and bootstrap estimates, st…
ghostpress Nov 6, 2025
660cec0
Added SE formula and discussion of uncertainty sources
ghostpress Nov 7, 2025
2d08f1a
Finished 7.3 lesson, still need to add 'your turn' sections on a 2nd …
ghostpress Nov 7, 2025
ae4e567
Moved computation of SE to extrapolation example, but still discusse…
ghostpress Nov 7, 2025
b62cd8e
Added a second dataset for students to try bootstrapping themselves
ghostpress Nov 10, 2025
707ad1d
Started Tutorial 7.5 - logistic regression case study
ghostpress Nov 13, 2025
a1f7c83
Forgot to push data files
ghostpress Nov 13, 2025
0a41ac3
Updated 7.3 from feedback
ghostpress Nov 20, 2025
d2ceed4
Added more odds and log-odds concepts
ghostpress Dec 3, 2025
cc379c7
Drafted Tutorial 7.5, case study
ghostpress Dec 4, 2025
0940d9f
Style changes, made sure the document knits and added a submit section
ghostpress Dec 4, 2025
1ea913c
Forgot a submit section
ghostpress Dec 4, 2025
a5d6596
Standardized filenames and directories
ghostpress Dec 6, 2025
24f2cf5
Added an index page (with placeholder content)
ghostpress Dec 6, 2025
53cd7e6
Deleted unnecessary files
ghostpress Dec 6, 2025
3 changes: 3 additions & 0 deletions .gitignore
@@ -8,3 +8,6 @@
*lesson.html
*lesson_files/
rsconnect/

*.html
drafts/
100 changes: 100 additions & 0 deletions 07-adv-model-infer/Tutorial7.Rmd
@@ -0,0 +1,100 @@
---
title: "Tutorial 7: Advanced Inferential Modeling"
output: learnr::tutorial
runtime: shiny_prerendered
---

```{r setup, message=FALSE, warning=FALSE, include=FALSE}
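# Install learnrhash only if it is missing, so it is not reinstalled on every run
if (!requireNamespace("learnrhash", quietly = TRUE)) {
  devtools::install_github("rundel/learnrhash")
}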

library(learnr)
library(tidyverse)
library(openintro)
library(grid)
library(png)
#library(emo)

knitr::opts_chunk$set(echo = FALSE,
fig.align = "center",
fig.height = 3,
fig.width = 5,
message = FALSE,
warning = FALSE)

tutorial_options(exercise.eval = FALSE)

# Hash generation helpers
# Should ideally be loaded from the imstutorials package when it exists
is_server_context <- function(.envir) {
# We are in the server context if the following are all present:
# * input - input reactive values
# * output - shiny output
# * session - shiny session
#
# Check context by examining the class of each of these.
# If any is missing then it will be a NULL which will fail.

inherits(.envir$input, "reactivevalues") &
inherits(.envir$output, "shinyoutput") &
inherits(.envir$session, "ShinySession")
}

check_server_context <- function(.envir) {
if (!is_server_context(.envir)) {
calling_func <- deparse(sys.calls()[[sys.nframe() - 1]])
err <- paste0("Function `", calling_func, "`", " must be called from an Rmd chunk where `context = \"server\"`")
stop(err, call. = FALSE)
}
}
encoder_logic <- function(strip_output = FALSE) {
p <- parent.frame()
check_server_context(p)
# Make this var available within the local context below
assign("strip_output", strip_output, envir = p)
# Evaluate in parent frame to get input, output, and session
local(
{
encoded_txt <- shiny::eventReactive(
input$hash_generate,
{
# shiny::getDefaultReactiveDomain()$userData$tutorial_state
state <- learnr:::get_tutorial_state()
shiny::validate(shiny::need(length(state) > 0, "No progress yet."))
shiny::validate(shiny::need(nchar(input$name) > 0, "No name entered."))
shiny::validate(shiny::need(nchar(input$studentID) > 0, "Please enter your student ID"))
user_state <- purrr::map_dfr(state, identity, .id = "label")
user_state <- dplyr::group_by(user_state, label, type, correct)
user_state <- dplyr::summarize(
user_state,
answer = list(answer),
timestamp = dplyr::first(timestamp),
.groups = "drop"
)
user_state <- dplyr::relocate(user_state, correct, .before = timestamp)
user_info <- tibble(
label = c("student_name", "student_id"),
type = "identifier",
answer = as.list(c(input$name, input$studentID)),
timestamp = format(Sys.time(), "%Y-%m-%d %H:%M:%S %Z", tz = "UTC")
)
learnrhash::encode_obj(bind_rows(user_info, user_state))
}
)
output$hash_output <- shiny::renderText(encoded_txt())
},
envir = p
)
}

hash_encoder_ui <- {
shiny::div("If you have completed this tutorial and are happy with all of your", "solutions, please enter your identifying information, then click the button below to generate your hash", textInput("name", "What's your name?"), textInput("studentID", "What is your student ID?"), renderText({
input$caption
}), )
}
```

```{r child = 'Tutorial7_Lesson1.Rmd'}
```

```{r child = 'Tutorial7_Lesson2.Rmd'}
```
263 changes: 263 additions & 0 deletions 07-adv-model-infer/Tutorial7_Lesson1.Rmd
@@ -0,0 +1,263 @@
---
title: "Lesson 1: Interactions"
#subtitle: "Lesson 1: Interactions"
output:
learnr::tutorial:
progressive: true
allow_skip: true
runtime: shiny_prerendered
---

# Lesson 1: Interactions

```{r setup, message=FALSE, warning=FALSE, include=FALSE}
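# Install learnrhash only if it is missing, so it is not reinstalled on every run
if (!requireNamespace("learnrhash", quietly = TRUE)) {
  devtools::install_github("rundel/learnrhash")
}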

library(learnr)
library(tidyverse)
library(openintro)
library(grid)
library(png)
#library(emo)

knitr::opts_chunk$set(echo = FALSE,
fig.align = "center",
fig.height = 3,
fig.width = 5,
message = FALSE,
warning = FALSE)

tutorial_options(exercise.eval = FALSE)

# Hash generation helpers
# Should ideally be loaded from the imstutorials package when it exists
is_server_context <- function(.envir) {
# We are in the server context if the following are all present:
# * input - input reactive values
# * output - shiny output
# * session - shiny session
#
# Check context by examining the class of each of these.
# If any is missing then it will be a NULL which will fail.

inherits(.envir$input, "reactivevalues") &
inherits(.envir$output, "shinyoutput") &
inherits(.envir$session, "ShinySession")
}

check_server_context <- function(.envir) {
if (!is_server_context(.envir)) {
calling_func <- deparse(sys.calls()[[sys.nframe() - 1]])
err <- paste0("Function `", calling_func, "`", " must be called from an Rmd chunk where `context = \"server\"`")
stop(err, call. = FALSE)
}
}
encoder_logic <- function(strip_output = FALSE) {
p <- parent.frame()
check_server_context(p)
# Make this var available within the local context below
assign("strip_output", strip_output, envir = p)
# Evaluate in parent frame to get input, output, and session
local(
{
encoded_txt <- shiny::eventReactive(
input$hash_generate,
{
# shiny::getDefaultReactiveDomain()$userData$tutorial_state
state <- learnr:::get_tutorial_state()
shiny::validate(shiny::need(length(state) > 0, "No progress yet."))
shiny::validate(shiny::need(nchar(input$name) > 0, "No name entered."))
shiny::validate(shiny::need(nchar(input$studentID) > 0, "Please enter your student ID"))
user_state <- purrr::map_dfr(state, identity, .id = "label")
user_state <- dplyr::group_by(user_state, label, type, correct)
user_state <- dplyr::summarize(
user_state,
answer = list(answer),
timestamp = dplyr::first(timestamp),
.groups = "drop"
)
user_state <- dplyr::relocate(user_state, correct, .before = timestamp)
user_info <- tibble(
label = c("student_name", "student_id"),
type = "identifier",
answer = as.list(c(input$name, input$studentID)),
timestamp = format(Sys.time(), "%Y-%m-%d %H:%M:%S %Z", tz = "UTC")
)
learnrhash::encode_obj(bind_rows(user_info, user_state))
}
)
output$hash_output <- shiny::renderText(encoded_txt())
},
envir = p
)
}

hash_encoder_ui <- {
shiny::div("If you have completed this tutorial and are happy with all of your", "solutions, please enter your identifying information, then click the button below to generate your hash", textInput("name", "What's your name?"), textInput("studentID", "What is your student ID?"), renderText({
input$caption
}), )
}
```


In the previous tutorials, you've learned how to perform multiple regression for inference and prediction. We will build on these skills in this tutorial, and in particular learn how to adapt models where two or more variables are highly inter-correlated.

## Exercise 1

First, load the `icecream.csv` data file into a data frame and create scatterplots to examine the relationships between the variables. In this analysis, we are interested in determining which factors influence ice cream consumption from an ice cream truck whose route travels across different neighborhoods.
Collaborator
Make sure to state what the variable names are (in this case for consumption, temperature, and price) since the students won't immediately be able to see that information

Collaborator
I think an example with an indicator variable makes it easier to motivate why you would want interactions

Collaborator
Maybe walk them through the indicator variable example, ask some questions about interpretation, then have them try the same approach on the current (non-indicator) example


```{r ex1, exercise = TRUE}
# Load data
icecream <- _______("icecream.csv")

# View some of the variables
head(icecream)

# Plot ice cream consumption vs outside temperature
ggplot(data=icecream, aes(y=_______, x=_______)) +
geom_point()

# Plot ice cream consumption vs ice cream price, colored by outside temperature
ggplot(data=icecream, aes(y=_______, x=_______, col=temp)) +
geom_point()

# Plot ice cream consumption vs consumer income, colored by price
ggplot(data=icecream, aes(y=_______, x=_______, col=_______)) +
geom_point()
```

```{r ex1-hint-1}
# Try the following command for loading the csv file:
read.csv("icecream.csv")
```

```{r ex1-solution}
# Solution
icecream <- read.csv("icecream.csv")

ggplot(data=icecream, aes(y=cons, x=temp)) +
geom_point()

ggplot(data=icecream, aes(y=cons, x=price, col=temp)) +
geom_point()

ggplot(data=icecream, aes(y=cons, x=income, col=price)) +
geom_point()
```


```{r mc1}
question("What is the relationship between ice cream consumption and temperature?",
answer("Positive and somewhat linear", correct=TRUE),
answer("Positive and strongly linear"),
answer("Negative and non-linear"),
answer("There is no correlation"))
```


## Exercise 2

We can see that ice cream consumption and temperature have a positive and somewhat linear relationship; that is, when temperature increases, consumption also tends to increase. But we have two other variables in the dataset: (ice cream) price and (consumer) income. Could these two variables be inter-related?

First, let's write a model for the data: if $y$ is consumption, then with the variables we explored above the model can be expressed as

$y = \beta_0 + \beta_{\rm temp}x_{\rm temp} + \beta_{\rm price}x_{\rm price} + \beta_{\rm income}x_{\rm income} + residuals$

Now, fit this model below:

```{r ex2, exercise = TRUE}
m1 <- ____

summary(m1)
```

```{r ex2-solution}
# Add the independent variables (covariates) after the ~ and separated by +
m1 <- lm(data=icecream, cons ~ temp + price + income)

summary(m1)
```


## Exercise 3

Now we will try adding "interaction terms," which allow the effect of one covariate on the response to depend on the value of another covariate. We can add such a term to the model as follows:

$y = \beta_0 + \beta_{\rm temp}x_{\rm temp} + \beta_{\rm price}x_{\rm price} + \beta_{\rm income}x_{\rm income} + \beta_{\rm price:income}x_{\rm price}x_{\rm income} + residuals$

In R, the syntax for adding an interaction term within the linear model `lm()` is `lm(y ~ x1 + x2 + x1:x2)`. Try it with the icecream data below:

```{r ex3, exercise = TRUE}
m2 <- ____

summary(m2)
```

```{r ex3-solution}
# Add the interaction term price:income after the main effects, separated by +
m2 <- lm(data=icecream, cons ~ temp + price + income + price:income)

summary(m2)
```
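
As a side note on the interaction syntax above, R also offers the shorthand `x1 * x2`, which expands to `x1 + x2 + x1:x2`. The sketch below checks this on R's built-in `mtcars` data rather than the ice cream data (an assumption made so the example runs without `icecream.csv`); the object names `fit_colon` and `fit_star` are illustrative only.

```r
# Two ways of writing the same model with an interaction between wt and hp.
# mtcars ships with base R, so no data file is needed for this sketch.
fit_colon <- lm(mpg ~ wt + hp + wt:hp, data = mtcars)  # explicit interaction term
fit_star  <- lm(mpg ~ wt * hp, data = mtcars)          # shorthand for the same formula

# Both fits should have the same coefficients, in the same order
all.equal(coef(fit_colon), coef(fit_star))
```

By the same rule, the ice cream model could presumably be written as `cons ~ temp + price * income`.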


## Exercise 4

```{r, echo=FALSE}
tab <- matrix(c(0.719, 0.687, 0.037, 0.759, 0.721, 0.035), ncol=3, byrow=TRUE)
colnames(tab) <- c('Multiple R-squared', 'Adjusted R-squared', 'Residual SE')
rownames(tab) <- c('m1', 'm2')
tab <- as.table(tab)
print(tab)
```

```{r mc2}
question("Given the measures of model fit above, which model do you think fits the ice cream consumption data better?",
answer("m1, linear model with no interactions"),
answer("m2, linear model with interaction between price and income", correct=TRUE),
answer("Not sure"))
```
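
The fit measures in the table above can be pulled directly from `summary()` instead of being typed in by hand. The following is a minimal sketch on the built-in `mtcars` data (an assumption, so that it runs without `icecream.csv`); `m_add`, `m_int`, and `fit_table` are hypothetical names, and the same calls should work on `m1` and `m2`.

```r
# Fit a model without and with an interaction term (mtcars is built into R)
m_add <- lm(mpg ~ wt + hp, data = mtcars)
m_int <- lm(mpg ~ wt + hp + wt:hp, data = mtcars)

# Collect the three fit measures reported by summary() for each model
s_add <- summary(m_add)
s_int <- summary(m_int)
fit_table <- data.frame(
  model         = c("m_add", "m_int"),
  r_squared     = c(s_add$r.squared,     s_int$r.squared),
  adj_r_squared = c(s_add$adj.r.squared, s_int$adj.r.squared),
  residual_se   = c(s_add$sigma,         s_int$sigma)
)
print(fit_table)

# Because the models are nested, an F-test can also compare them
anova(m_add, m_int)
```

Multiple R-squared never decreases when a term is added to a model, so adjusted R-squared and the residual standard error are usually the more informative columns when comparing nested models.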


## Exercise 5
Suppose we were to run an experiment where 24 bean plants are randomized into one of four groups:
Collaborator
Hmm...this example is already in the online supplement. It would be nice to have a different one.

Also, I think there's reason in the supplement they had this exercise at the beginning: it helps to motivate why we care about interactions.


1. Each plant receives 1 teaspoon of water and 1 hour of sunlight each day.
2. Each plant receives 4 tablespoons of water and 1 hour of sunlight each day.
3. Each plant receives 1 teaspoon of water and 8 hours of sunlight each day.
4. Each plant receives 4 tablespoons of water and 8 hours of sunlight each day.


```{r mc3}
question("Which group do you think will have the least plant growth?",
answer("Group 1"),
answer("Group 2", correct=TRUE),
answer("Group 3"),
answer("Group 4"))
```

```{r mc4}
question("The most plant growth?",
answer("Group 1"),
answer("Group 2"),
answer("Group 3"),
answer("Group 4", correct=TRUE))
```

```{r mc5}
question("Do you think the effects of water and sunlight on plants are independent?",
answer("Yes", message="Consider two examples: a plant that receives very little water and a lot of sun will dry up, while one that receives a lot of water and very little sun will rot. Most plants need a balance of water and sunlight to grow properly."),
answer("No", correct=TRUE),
answer("Not sure", message="Consider two examples: a plant that receives very little water and a lot of sun will dry up, while one that receives a lot of water and very little sun will rot. Most plants need a balance of water and sunlight to grow properly."))
```
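
To connect the bean plant example back to the model notation, here is a small sketch with indicator variables. The growth values are made up purely for illustration (they are not experimental data), and the names `plants` and `fit_plants` are hypothetical.

```r
# Hypothetical average growth for the four groups (made-up numbers)
plants <- data.frame(
  water  = c(0, 1, 0, 1),   # 0 = 1 teaspoon, 1 = 4 tablespoons
  sun    = c(0, 0, 1, 1),   # 0 = 1 hour,     1 = 8 hours
  growth = c(2, 1, 3, 10)   # assumed average growth per group
)

# Model with main effects and a water-by-sunlight interaction
fit_plants <- lm(growth ~ water + sun + water:sun, data = plants)
coef(fit_plants)
```

With these assumed numbers, the `water` coefficient is the effect of extra water when sunlight is low, and the `water:sun` coefficient is how much that effect changes when sunlight is high. A nonzero interaction coefficient is what it means for the two effects not to be independent.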


## Submit

```{r, echo=FALSE, context="server"}
encoder_logic()
```

```{r encode, echo=FALSE}
learnrhash::encoder_ui(ui_before = hash_encoder_ui)
```