forked from OpenIntroStat/ims-tutorials
Tutorial 7, lessons 1 and 2 #1
Open
ghostpress
wants to merge
38
commits into
main
from
tutorial-7-lv
Changes from 10 commits
5664973
First draft of Tutorial 7.1, interactions
ghostpress 5a0b0b9
Moved R code to Rmd file with chunks and exercise headings
ghostpress 112f42e
Push before meeting
ghostpress c52a7b5
Updated todos
ghostpress 2e717e2
Finalized Tutorial 7 lesson 1 and cleaned up draft files
ghostpress 4012f57
Added html and drafts to gitignore
ghostpress cca5f41
Finalized Tutorial 7 lesson 2
ghostpress 3383325
Updated model math to match OpenIntro formatting
ghostpress e8c375f
Removed hints that wouldn't compile
ghostpress f202845
Added note about model interpretation
ghostpress 596a3df
Updated T7L1
ghostpress 83c550d
Forgot to push changes
ghostpress 38172ed
Initial simulation lesson
ghostpress f7c0a4b
Added plots of bootstrap lines
ghostpress 00e0cb3
alternative ways of installing learnrhash
jhuggins 2b4da26
Merge branch 'tutorial-7-lv' of github.com:BU-Intro-Stats/MA214-tutor…
jhuggins c16cf3c
Another try at 7.3 modeling uncertainty, drafted extrapolation/interp…
ghostpress 6ce8fff
Started 7.4
ghostpress 48207fa
Finished draft of Tutorial 7.4 on cross-validation
ghostpress eaf7706
Placeholder file for Tutorial 7.5 on linear regression
ghostpress f60cc00
Merge branch 'tutorial-7-lv' of https://github.com/jhuggins/MA214-tut…
ghostpress 5768b7b
Fixed compile issues
ghostpress 5d1327d
Updates to 7.3
ghostpress 7f72795
Added graphs to show uncertainty in model and bootstrap estimates, st…
ghostpress 660cec0
Added SE formula and discussion of uncertainty sources
ghostpress 2d08f1a
Finished 7.3 lesson, still need to add 'your turn' sections on a 2nd …
ghostpress ae4e567
Moved computation of SE to extrapolation example, but still discusse…
ghostpress b62cd8e
Added a second dataset for students to try bootstrapping themselves
ghostpress 707ad1d
Started Tutorial 7.5 - logistic regression case study
ghostpress a1f7c83
Forgot to push data files
ghostpress 0a41ac3
Updated 7.3 from feedback
ghostpress d2ceed4
Added more odds and log-odds concepts
ghostpress cc379c7
Drafted Tutorial 7.5, case study
ghostpress 0940d9f
Style changes, made sure the document knits and added a submit section
ghostpress 1ea913c
Forgot a submit section
ghostpress a5d6596
Standardized filenames and directories
ghostpress 24f2cf5
Added an index page (with placeholder content)
ghostpress 53cd7e6
Deleted unnecessary files
ghostpress
| Original file line number | Diff line number | Diff line change |
|---|---|---|
|
|
@@ -8,3 +8,6 @@ | |
| *lesson.html | ||
| *lesson_files/ | ||
| rsconnect/ | ||
|
|
||
| *.html | ||
| drafts/ | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,100 @@ | ||
| --- | ||
| title: "Tutorial 7: Advanced Inferential Modeling" | ||
| output: learnr::tutorial | ||
| runtime: shiny_prerendered | ||
| --- | ||
|
|
||
| ```{r setup, message=FALSE, warning=FALSE, include=FALSE} | ||
| devtools::install_github("rundel/learnrhash") | ||
|
|
||
| library(learnr) | ||
| library(tidyverse) | ||
| library(openintro) | ||
| library(grid) | ||
| library(png) | ||
| #library(emo) | ||
|
|
||
| knitr::opts_chunk$set(echo = FALSE, | ||
| fig.align = "center", | ||
| fig.height = 3, | ||
| fig.width = 5, | ||
| message = FALSE, | ||
| warning = FALSE) | ||
|
|
||
| tutorial_options(exercise.eval = FALSE) | ||
|
|
||
| # Hash generation helpers | ||
| # Should ideally be loaded from the imstutorials package when it exists | ||
| is_server_context <- function(.envir) { | ||
| # We are in the server context if all of the following exist: | ||
| # * input - input reactive values | ||
| # * output - shiny output | ||
| # * session - shiny session | ||
| # | ||
| # Check context by examining the class of each of these. | ||
| # If any is missing then it will be a NULL which will fail. | ||
|
|
||
| inherits(.envir$input, "reactivevalues") & | ||
| inherits(.envir$output, "shinyoutput") & | ||
| inherits(.envir$session, "ShinySession") | ||
| } | ||
|
|
||
| check_server_context <- function(.envir) { | ||
| if (!is_server_context(.envir)) { | ||
| calling_func <- deparse(sys.calls()[[sys.nframe() - 1]]) | ||
| err <- paste0("Function `", calling_func, "`", " must be called from an Rmd chunk where `context = \"server\"`") | ||
| stop(err, call. = FALSE) | ||
| } | ||
| } | ||
| encoder_logic <- function(strip_output = FALSE) { | ||
| p <- parent.frame() | ||
| check_server_context(p) | ||
| # Make this var available within the local context below | ||
| assign("strip_output", strip_output, envir = p) | ||
| # Evaluate in parent frame to get input, output, and session | ||
| local( | ||
| { | ||
| encoded_txt <- shiny::eventReactive( | ||
| input$hash_generate, | ||
| { | ||
| # shiny::getDefaultReactiveDomain()$userData$tutorial_state | ||
| state <- learnr:::get_tutorial_state() | ||
| shiny::validate(shiny::need(length(state) > 0, "No progress yet.")) | ||
| shiny::validate(shiny::need(nchar(input$name) > 0, "No name entered.")) | ||
| shiny::validate(shiny::need(nchar(input$studentID) > 0, "Please enter your student ID")) | ||
| user_state <- purrr::map_dfr(state, identity, .id = "label") | ||
| user_state <- dplyr::group_by(user_state, label, type, correct) | ||
| user_state <- dplyr::summarize( | ||
| user_state, | ||
| answer = list(answer), | ||
| timestamp = dplyr::first(timestamp), | ||
| .groups = "drop" | ||
| ) | ||
| user_state <- dplyr::relocate(user_state, correct, .before = timestamp) | ||
| user_info <- tibble( | ||
| label = c("student_name", "student_id"), | ||
| type = "identifier", | ||
| answer = as.list(c(input$name, input$studentID)), | ||
| timestamp = format(Sys.time(), "%Y-%m-%d %H:%M:%S %Z", tz = "UTC") | ||
| ) | ||
| learnrhash::encode_obj(bind_rows(user_info, user_state)) | ||
| } | ||
| ) | ||
| output$hash_output <- shiny::renderText(encoded_txt()) | ||
| }, | ||
| envir = p | ||
| ) | ||
| } | ||
|
|
||
| hash_encoder_ui <- { | ||
| shiny::div("If you have completed this tutorial and are happy with all of your", "solutions, please enter your identifying information, then click the button below to generate your hash", textInput("name", "What's your name?"), textInput("studentID", "What is your student ID?"), renderText({ | ||
| input$caption | ||
| }), ) | ||
| } | ||
| ``` | ||
|
|
||
| ```{r child = 'Tutorial7_Lesson1.Rmd'} | ||
| ``` | ||
|
|
||
| ```{r child = 'Tutorial7_Lesson2.Rmd'} | ||
| ``` |
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,263 @@ | ||
| --- | ||
| title: "Lesson 1: Interactions" | ||
| #subtitle: "Lesson 1: Interactions" | ||
| output: | ||
| learnr::tutorial: | ||
| progressive: true | ||
| allow_skip: true | ||
| runtime: shiny_prerendered | ||
| --- | ||
|
|
||
| # Lesson 1: Interactions | ||
|
|
||
| ```{r setup, message=FALSE, warning=FALSE, include=FALSE} | ||
| devtools::install_github("rundel/learnrhash") | ||
|
|
||
| library(learnr) | ||
| library(tidyverse) | ||
| library(openintro) | ||
| library(grid) | ||
| library(png) | ||
| #library(emo) | ||
|
|
||
| knitr::opts_chunk$set(echo = FALSE, | ||
| fig.align = "center", | ||
| fig.height = 3, | ||
| fig.width = 5, | ||
| message = FALSE, | ||
| warning = FALSE) | ||
|
|
||
| tutorial_options(exercise.eval = FALSE) | ||
|
|
||
| # Hash generation helpers | ||
| # Should ideally be loaded from the imstutorials package when it exists | ||
| is_server_context <- function(.envir) { | ||
| # We are in the server context if all of the following exist: | ||
| # * input - input reactive values | ||
| # * output - shiny output | ||
| # * session - shiny session | ||
| # | ||
| # Check context by examining the class of each of these. | ||
| # If any is missing then it will be a NULL which will fail. | ||
|
|
||
| inherits(.envir$input, "reactivevalues") & | ||
| inherits(.envir$output, "shinyoutput") & | ||
| inherits(.envir$session, "ShinySession") | ||
| } | ||
|
|
||
| check_server_context <- function(.envir) { | ||
| if (!is_server_context(.envir)) { | ||
| calling_func <- deparse(sys.calls()[[sys.nframe() - 1]]) | ||
| err <- paste0("Function `", calling_func, "`", " must be called from an Rmd chunk where `context = \"server\"`") | ||
| stop(err, call. = FALSE) | ||
| } | ||
| } | ||
| encoder_logic <- function(strip_output = FALSE) { | ||
| p <- parent.frame() | ||
| check_server_context(p) | ||
| # Make this var available within the local context below | ||
| assign("strip_output", strip_output, envir = p) | ||
| # Evaluate in parent frame to get input, output, and session | ||
| local( | ||
| { | ||
| encoded_txt <- shiny::eventReactive( | ||
| input$hash_generate, | ||
| { | ||
| # shiny::getDefaultReactiveDomain()$userData$tutorial_state | ||
| state <- learnr:::get_tutorial_state() | ||
| shiny::validate(shiny::need(length(state) > 0, "No progress yet.")) | ||
| shiny::validate(shiny::need(nchar(input$name) > 0, "No name entered.")) | ||
| shiny::validate(shiny::need(nchar(input$studentID) > 0, "Please enter your student ID")) | ||
| user_state <- purrr::map_dfr(state, identity, .id = "label") | ||
| user_state <- dplyr::group_by(user_state, label, type, correct) | ||
| user_state <- dplyr::summarize( | ||
| user_state, | ||
| answer = list(answer), | ||
| timestamp = dplyr::first(timestamp), | ||
| .groups = "drop" | ||
| ) | ||
| user_state <- dplyr::relocate(user_state, correct, .before = timestamp) | ||
| user_info <- tibble( | ||
| label = c("student_name", "student_id"), | ||
| type = "identifier", | ||
| answer = as.list(c(input$name, input$studentID)), | ||
| timestamp = format(Sys.time(), "%Y-%m-%d %H:%M:%S %Z", tz = "UTC") | ||
| ) | ||
| learnrhash::encode_obj(bind_rows(user_info, user_state)) | ||
| } | ||
| ) | ||
| output$hash_output <- shiny::renderText(encoded_txt()) | ||
| }, | ||
| envir = p | ||
| ) | ||
| } | ||
|
|
||
| hash_encoder_ui <- { | ||
| shiny::div("If you have completed this tutorial and are happy with all of your", "solutions, please enter your identifying information, then click the button below to generate your hash", textInput("name", "What's your name?"), textInput("studentID", "What is your student ID?"), renderText({ | ||
| input$caption | ||
| }), ) | ||
| } | ||
| ``` | ||
|
|
||
|
|
||
| In the previous tutorials, you've learned how to perform multiple regression for inference and prediction. We will build on these skills in this tutorial, and in particular learn how to adapt models where two or more variables are highly inter-correlated. | ||
|
|
||
| ## Exercise 1 | ||
|
|
||
| First, load the `icecream.csv` data file into a data frame and create scatterplots to examine the relationships between the variables. In this analysis, we are interested in determining what factors influence ice cream consumption from an ice cream truck with a route that travels across different neighborhoods. | ||
|
|
||
| ```{r ex1, exercise = TRUE} | ||
| # Load data | ||
| icecream <- _______("icecream.csv") | ||
|
|
||
| # View some of the variables | ||
| head(icecream) | ||
|
|
||
| # Plot ice cream consumption vs outside temperature | ||
| ggplot(data=icecream, aes(y=_______, x=_______)) + | ||
| geom_point() | ||
|
|
||
| # Plot ice cream consumption vs ice cream price, colored by outside temperature | ||
| ggplot(data=icecream, aes(y=_______, x=_______, col=temp)) + | ||
| geom_point() | ||
|
|
||
| # Plot ice cream consumption vs consumer income, colored by price | ||
| ggplot(data=icecream, aes(y=_______, x=_______, col=_______)) + | ||
| geom_point() | ||
| ``` | ||
|
|
||
| ```{r ex1-hint-1} | ||
| # Try the following command for loading the csv file: | ||
| read.csv("icecream.csv") | ||
| ``` | ||
|
|
||
| ```{r ex1-solution} | ||
| # Solution | ||
| icecream <- read.csv("icecream.csv") | ||
|
|
||
| ggplot(data=icecream, aes(y=cons, x=temp)) + | ||
| geom_point() | ||
|
|
||
| ggplot(data=icecream, aes(y=cons, x=price, col=temp)) + | ||
| geom_point() | ||
|
|
||
| ggplot(data=icecream, aes(y=cons, x=income, col=price)) + | ||
| geom_point() | ||
| ``` | ||
|
|
||
|
|
||
| ```{r mc1} | ||
| question("What is the relationship between ice cream consumption and temperature?", | ||
| answer("Positive and somewhat linear", correct=TRUE), | ||
| answer("Positive and strongly linear"), | ||
| answer("Negative and non-linear"), | ||
| answer("There is no correlation")) | ||
| ``` | ||
|
|
||
|
|
||
| ## Exercise 2 | ||
|
|
||
| We can see that ice cream consumption and temperature have a positive and somewhat linear relationship; that is, when temperature increases, consumption also tends to increase. However, the dataset contains two other variables: ice cream price and consumer income. Could these two variables be related to each other? | ||
|
|
||
| First, let's write a model for the data: if $y$ is consumption, then with the variables we explored above the model can be expressed as | ||
|
|
||
| $y = \beta_0 + \beta_{\rm temp}x_{\rm temp} + \beta_{\rm price}x_{\rm price} + \beta_{\rm income}x_{\rm income} + {\rm residuals}$ | ||
|
|
||
| Now, fit this model below: | ||
|
|
||
| ```{r ex2, exercise = TRUE} | ||
| m1 <- ____ | ||
|
|
||
| summary(m1) | ||
| ``` | ||
|
|
||
| ```{r ex2-solution} | ||
| # Add the independent variables (covariates) after the ~ and separated by + | ||
| m1 <- lm(data=icecream, cons ~ temp + price + income) | ||
|
|
||
| summary(m1) | ||
| ``` | ||
|
|
||
|
|
||
| ## Exercise 3 | ||
|
|
||
| Now we will try adding "interaction terms," which allow the effect of one covariate on the response to depend on the value of another covariate. We can add such a term to the model as follows: | ||
|
|
||
| $y = \beta_0 + \beta_{\rm temp}x_{\rm temp} + \beta_{\rm price}x_{\rm price} + \beta_{\rm income}x_{\rm income} + \beta_{\rm price:income}x_{\rm price}x_{\rm income} + {\rm residuals}$ | ||
|
|
||
| In R, the syntax for adding an interaction term in a call to `lm()` is `lm(y ~ x1 + x2 + x1:x2)`. Try it with the `icecream` data below: | ||
|
|
||
| ```{r ex3, exercise = TRUE} | ||
| m2 <- ____ | ||
|
|
||
| summary(m2) | ||
| ``` | ||
|
|
||
| ```{r ex3-solution} | ||
| # Add the covariates after the ~ separated by +, and write the interaction term with : | ||
| m2 <- lm(data=icecream, cons ~ temp + price + income + price:income) | ||
|
|
||
| summary(m2) | ||
| ``` | ||
|
|
||
|
|
||
| ## Exercise 4 | ||
|
|
||
| ```{r, echo=FALSE} | ||
| tab <- matrix(c(0.719, 0.687, 0.037, 0.759, 0.721, 0.035), ncol=3, byrow=TRUE) | ||
| colnames(tab) <- c('Multiple R-squared', 'Adjusted R-squared', 'Residual SE') | ||
| rownames(tab) <- c('m1', 'm2') | ||
| tab <- as.table(tab) | ||
| print(tab) | ||
| ``` | ||
|
|
||
| ```{r mc2} | ||
| question("Given the measures of model fit above, which model do you think fits the ice cream consumption data better?", | ||
| answer("m1, linear model with no interactions"), | ||
| answer("m2, linear model with interaction between price and income", correct=TRUE), | ||
| answer("Not sure")) | ||
| ``` | ||
|
|
||
|
|
||
| ## Exercise 5 | ||
| Suppose we were to run an experiment where 24 bean plants are each randomly assigned to one of four groups: | ||
|
||
|
|
||
| 1. Each plant receives 1 teaspoon of water and 1 hour of sunlight each day. | ||
| 2. Each plant receives 4 tablespoons of water and 1 hour of sunlight each day. | ||
| 3. Each plant receives 1 teaspoon of water and 8 hours of sunlight each day. | ||
| 4. Each plant receives 4 tablespoons of water and 8 hours of sunlight each day. | ||
|
|
||
|
|
||
| ```{r mc3} | ||
| question("Which group do you think will have the least plant growth?", | ||
| answer("Group 1"), | ||
| answer("Group 2", correct=TRUE), | ||
| answer("Group 3"), | ||
| answer("Group 4")) | ||
| ``` | ||
|
|
||
| ```{r mc4} | ||
| question("The most plant growth?", | ||
| answer("Group 1"), | ||
| answer("Group 2"), | ||
| answer("Group 3"), | ||
| answer("Group 4", correct=TRUE)) | ||
| ``` | ||
|
|
||
| ```{r mc5} | ||
| question("Do you think the effects of the water and sunlight on plants are independent?", | ||
| answer("Yes", message="Consider two examples: a plant that receives very little water and a lot of sun will dry up, while one that receives a lot of water and very little sun will rot. Most plants need a balance of water and sunlight to grow properly."), | ||
| answer("No", correct=TRUE), | ||
| answer("Not sure", message="Consider two examples: a plant that receives very little water and a lot of sun will dry up, while one that receives a lot of water and very little sun will rot. Most plants need a balance of water and sunlight to grow properly.")) | ||
| ``` | ||
|
|
||
|
|
||
| ## Submit | ||
|
|
||
| ```{r, echo=FALSE, context="server"} | ||
| encoder_logic() | ||
| ``` | ||
|
|
||
| ```{r encode, echo=FALSE} | ||
| learnrhash::encoder_ui(ui_before = hash_encoder_ui) | ||
| ``` | ||
Make sure to state what the variable names are (in this case for consumption, temperature, and price) since the students won't immediately be able to see that information
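One possible way to surface the names, sketched with a stand-in data frame because `icecream.csv` itself is not shown in this PR. The column names `cons`, `temp`, `price`, and `income` are taken from the solution chunks; the object name `icecream_demo` and every value in it are made up for illustration only.

```r
# Stand-in for icecream.csv. Column names come from the solution code in this
# PR; the values are invented and do not reflect the real data.
icecream_demo <- data.frame(
  cons   = c(0.39, 0.37, 0.39, 0.43),
  temp   = c(41, 56, 63, 68),
  price  = c(0.270, 0.282, 0.277, 0.280),
  income = c(78, 79, 81, 80)
)

# List the column names so students know what to put inside aes()
names(icecream_demo)

# Show each column's type along with its first few values
str(icecream_demo)
```

Stating the mapping in the exercise text (for example, that `cons` is consumption) would likely still be needed, since the names alone do not say what each column measures.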
I think an example with an indicator variable makes it easier to motivate why you would want interactions
Maybe walk them through the indicator variable example, ask some questions about interpretation, then have them try the same approach on the current (non-indicator) example
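A rough sketch of what such an indicator-variable walkthrough might look like, using simulated data rather than anything from the tutorial. The variables `hours`, `tutored`, and `score`, the seed, and all coefficient values are hypothetical and chosen only so that the two groups have visibly different slopes.

```r
set.seed(214)  # arbitrary seed so the sketch is reproducible

n <- 200
# Hypothetical covariates: one continuous, one 0/1 indicator
hours   <- runif(n, 0, 10)
tutored <- rbinom(n, 1, 0.5)

# Simulated response: the slope on hours is 2 when tutored == 0
# and 2 + 3 = 5 when tutored == 1
score <- 50 + 2 * hours + 4 * tutored + 3 * hours * tutored + rnorm(n, sd = 2)
sim <- data.frame(score, hours, tutored)

# Same interaction syntax as in Exercise 3: x1 + x2 + x1:x2
fit <- lm(score ~ hours + tutored + hours:tutored, data = sim)
coef(fit)

# With an indicator, the interaction coefficient is the difference in slopes
slope_untutored <- coef(fit)[["hours"]]
slope_tutored   <- coef(fit)[["hours"]] + coef(fit)[["hours:tutored"]]
```

Students could be asked to interpret `slope_untutored` and `slope_tutored` as two separate lines before moving on to the price and income interaction, where both covariates are continuous and the same reading is harder to picture.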