Working on week 1 activities

numbats · Jan 29, 2025 · a10b7f0
1 parent 8853a4d

Showing 2 changed files with 52 additions and 11 deletions.
diff --git a/_freeze/week1/activities/execute-results/html.json b/_freeze/week1/activities/execute-results/html.json
@@ -1,9 +1,11 @@
 {
-  "hash": "ad21c08c29a5ce320571dc7f094777c2",
+  "hash": "0fe8fe2208e2d27d0745ffd369d2615a",
   "result": {
     "engine": "knitr",
-    "markdown": "---\ntitle: \"Activities: Week 1\"\neditor: source\nengine: knitr\nfilters:\n  - webr-teachr\n  - quiz-teachr\nwebr:\n  packages: [\"fpp3\", \"urca\"]\n  autoload-packages: false\n---\n\n# Time series data and patterns\n\n## Exercise 1\n\nThe `pedestrian` dataset contains hourly pedestrian counts from 2015-01-01 to 2016-12-31 at 4 sensors in the city of Melbourne.\n\nThe data is shown below:\n\n::: {.cell}\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 66,037 × 5\n   Sensor         Date_Time           Date        Time Count\n   <chr>          <dttm>              <date>     <int> <int>\n 1 Birrarung Marr 2015-01-01 00:00:00 2015-01-01     0  1630\n 2 Birrarung Marr 2015-01-01 01:00:00 2015-01-01     1   826\n 3 Birrarung Marr 2015-01-01 02:00:00 2015-01-01     2   567\n 4 Birrarung Marr 2015-01-01 03:00:00 2015-01-01     3   264\n 5 Birrarung Marr 2015-01-01 04:00:00 2015-01-01     4   139\n 6 Birrarung Marr 2015-01-01 05:00:00 2015-01-01     5    77\n 7 Birrarung Marr 2015-01-01 06:00:00 2015-01-01     6    44\n 8 Birrarung Marr 2015-01-01 07:00:00 2015-01-01     7    56\n 9 Birrarung Marr 2015-01-01 08:00:00 2015-01-01     8   113\n10 Birrarung Marr 2015-01-01 09:00:00 2015-01-01     9   166\n# ℹ 66,027 more rows\n```\n\n\n:::\n:::\n\n::: {.callout-caution}\n## Your turn!\n\nIdentify the `index` variable, `key` variable(s), and measured variable(s) of this dataset.\n:::\n\n::: {.callout-tip}\n## Hint\n\n* The `index` variable contains the complete time information\n* The `key` variable(s) identify each time series\n* The measured variable(s) are what you want to explore/forecast.\n:::\n\n::: columns\n\n::: {.column width=\"30%\"}\n\n## `index` variable\n:::{.quiz-singlechoice}\n- [ ] [Sensor]{hint=\"x\"}\n- [X] [Date_Time]{hint=\"o\"}\n- [ ] [Date]{hint=\"x\"}\n- [ ] [Time]{hint=\"x\"}\n- [ ] [Count]{hint=\"x\"}\n:::\n:::\n\n::: {.column width=\"30%\"}\n\n## `key` variable(s)\n:::{.quiz-multichoice}\n- [X] [Sensor]{hint=\"o\"}\n- [ ] 
[Date_Time]{hint=\"x\"}\n- [ ] [Date]{hint=\"x\"}\n- [ ] [Time]{hint=\"x\"}\n- [ ] [Count]{hint=\"x\"}\n:::\n:::\n\n::: {.column width=\"40%\"}\n\n## measured variable(s)\n:::{.quiz-multichoice}\n- [ ] [Sensor]{hint=\"x\"}\n- [ ] [Date_Time]{hint=\"x\"}\n- [ ] [Date]{hint=\"x\"}\n- [ ] [Time]{hint=\"x\"}\n- [X] [Count]{hint=\"o\"}\n:::\n:::\n:::\n\n## Exercise 2\n\nThe `aus_accommodation` dataset contains quarterly data on Australian tourist accommodation from short-term non-residential accommodation with 15 or more rooms, 1998 Q1 - 2016 Q2.\n\nThe units of the measured variables are as follows:\n\n* Takings are in millions of Australian dollars\n* Occupancy is a percentage of rooms occupied\n* CPI is an index with value 100 in 2012 Q1.\n\n::: {.callout-caution}\n## Your turn!\n\nComplete the code to convert this dataset into a tsibble.\n:::\n\n```{webr-teachr}\nlibrary(<<fpp3>>)\n\naus_accommodation <- read.csv(\n  \"https://workshop.nectric.com.au/user2024/data/aus_accommodation.csv\"\n) |>\n  mutate(Date = as.Date(Date)) |>\n  as_tsibble(\n    <<key = State, index = Date>>\n  )\n???\n\nif(!(\"fpp3\" %in% .packages())) return(c(\"You need to load the fpp3 package!\" = TRUE))\n\nchecks <- c(\n  \"You need to use the as_tsibble() function to convert the data into a tsibble.\" = !search_ast(.code, .fn = as_tsibble),\n  \"You should specify which column provides the time of the measurements with `index`.\" = !search_ast(.code, .fn = as_tsibble, index = Date),\n  \"You need to specify the key variables that identify each time series\" = exists_in(.errored, grepl, pattern = \"distinct rows\", fixed = TRUE)\n)\n\nif(any(checks)) return(checks)\n\nif(!is_yearquarter(aus_accommodation$Date)) cat(\"Great, you've got a tsibble!\\nAlthough something doesn't look right - check the frequency of the data, why isn't it quarterly?\\n\")\nFALSE\n```\n\n\n## Exercise 3\n\n:::{.callout-important}\n## Temporal granularity\n\nThe previous exercise produced a dataset with daily 
frequency - although clearly the data is quarterly! This is because we are using a daily granularity which is inappropriate for this data.\n:::\n\nCommon temporal granularities can be created with these functions:\n\n::: {.cell}\n::: {.cell-output-display}\n\n\n|Granularity |Function             |\n|:-----------|:--------------------|\n|Annual      |`as.integer()`       |\n|Quarterly   |`yearquarter()`      |\n|Monthly     |`yearmonth()`        |\n|Weekly      |`yearweek()`         |\n|Daily       |`as_date()`, `ymd()` |\n|Sub-daily   |`as_datetime()`      |\n\n\n:::\n:::\n\n\n::: {.callout-caution}\n## Your turn!\n\nUse the appropriate granularity for the `aus_accommodation` dataset, and verify that the frequency is now quarterly.\n:::\n\n\n```{webr-teachr}\naus_accommodation <- read.csv(\n  \"https://workshop.nectric.com.au/user2024/data/aus_accommodation.csv\"\n) |>\n  mutate(<<Quarter = yearquarter(Date)>>) |>\n  as_tsibble(\n    key = State, index = <<Quarter>>\n  )\n???\n\nif(!(\"fpp3\" %in% .packages())) return(c(\"You need to load the fpp3 package!\" = TRUE))\n\nc(\n  \"You need to save the dataset as `aus_accommodation`\" = !exists(\"aus_accommodation\"),\n  \"You need to use the as_tsibble() function to convert the data into a tsibble.\" = !search_ast(.code, .fn = as_tsibble),\n  \"You need to specify the key variables that identify each time series\" = exists_in(.errored, grepl, pattern = \"distinct rows\", fixed = TRUE),\n  \"You should use `yearquarter()` to change the time column into a quarterly granularity\" = !is_yearquarter(aus_accommodation[[index_var(aus_accommodation)]])\n)\n```\n\n## Exercise 4\n\nThe `tourism` dataset contains the quarterly overnight trips from 1998 Q1 to 2016 Q4 across Australia.\n\nIt is disaggregated by 3 key variables:\n\n* `State`: States and territories of Australia\n* `Region`: The tourism regions are formed through the aggregation of Statistical Local Areas (SLAs) which are defined by the various State and Territory 
tourism authorities according to their research and marketing needs\n* `Purpose`: Stopover purpose of visit: \"Holiday\", \"Visiting friends and relatives\", \"Business\", \"Other reason\".\n\n::: {.callout-caution}\n## Your turn!\n\nCalculate the total quarterly tourists visiting Victoria from the `tourism` dataset.\n:::\n\n\n1. Download `tourism.xlsx` from [`http://robjhyndman.com/data/tourism.xlsx`](http://robjhyndman.com/data/tourism.xlsx), and read it into R using `read_excel()` from the `readxl` package.\n2. Create a tsibble which is identical to the `tourism` tsibble from the `tsibble` package.\n3. Find what combination of `Region` and `Purpose` had the maximum number of overnight trips on average.\n4. Create a new tsibble which combines the Purposes and Regions, and just has total trips by State.\n",
-    "supporting": [],
+    "markdown": "---\ntitle: \"Activities: Week 1\"\neditor: source\nengine: knitr\nfilters:\n  - webr-teachr\n  - quiz-teachr\nwebr:\n  packages: [\"fpp3\", \"urca\"]\n  autoload-packages: false\n---\n\n# Time series data and patterns\n\n## Exercise 1\n\nThe `pedestrian` dataset contains hourly pedestrian counts from 2015-01-01 to 2016-12-31 at 4 sensors in the city of Melbourne.\n\nThe data is shown below:\n\n::: {.cell}\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 66,037 × 5\n   Sensor         Date_Time           Date        Time Count\n   <chr>          <dttm>              <date>     <int> <int>\n 1 Birrarung Marr 2015-01-01 00:00:00 2015-01-01     0  1630\n 2 Birrarung Marr 2015-01-01 01:00:00 2015-01-01     1   826\n 3 Birrarung Marr 2015-01-01 02:00:00 2015-01-01     2   567\n 4 Birrarung Marr 2015-01-01 03:00:00 2015-01-01     3   264\n 5 Birrarung Marr 2015-01-01 04:00:00 2015-01-01     4   139\n 6 Birrarung Marr 2015-01-01 05:00:00 2015-01-01     5    77\n 7 Birrarung Marr 2015-01-01 06:00:00 2015-01-01     6    44\n 8 Birrarung Marr 2015-01-01 07:00:00 2015-01-01     7    56\n 9 Birrarung Marr 2015-01-01 08:00:00 2015-01-01     8   113\n10 Birrarung Marr 2015-01-01 09:00:00 2015-01-01     9   166\n# ℹ 66,027 more rows\n```\n\n\n:::\n:::\n\n::: {.callout-caution}\n## Your turn!\n\nIdentify the `index` variable, `key` variable(s), and measured variable(s) of this dataset.\n:::\n\n::: {.callout-tip}\n## Hint\n\n* The `index` variable contains the complete time information\n* The `key` variable(s) identify each time series\n* The measured variable(s) are what you want to explore/forecast.\n:::\n\n::: columns\n\n::: {.column width=\"30%\"}\n\n## `index` variable\n:::{.quiz-singlechoice}\n- [ ] [Sensor]{hint=\"x\"}\n- [X] [Date_Time]{hint=\"o\"}\n- [ ] [Date]{hint=\"x\"}\n- [ ] [Time]{hint=\"x\"}\n- [ ] [Count]{hint=\"x\"}\n:::\n:::\n\n::: {.column width=\"30%\"}\n\n## `key` variable(s)\n:::{.quiz-multichoice}\n- [X] [Sensor]{hint=\"o\"}\n- [ ] 
[Date_Time]{hint=\"x\"}\n- [ ] [Date]{hint=\"x\"}\n- [ ] [Time]{hint=\"x\"}\n- [ ] [Count]{hint=\"x\"}\n:::\n:::\n\n::: {.column width=\"40%\"}\n\n## measured variable(s)\n:::{.quiz-multichoice}\n- [ ] [Sensor]{hint=\"x\"}\n- [ ] [Date_Time]{hint=\"x\"}\n- [ ] [Date]{hint=\"x\"}\n- [ ] [Time]{hint=\"x\"}\n- [X] [Count]{hint=\"o\"}\n:::\n:::\n:::\n\n## Exercise 2\n\nThe `aus_accommodation` dataset contains quarterly data on Australian tourist accommodation from short-term non-residential accommodation with 15 or more rooms, 1998 Q1 - 2016 Q2.\n\nThe units of the measured variables are as follows:\n\n* Takings are in millions of Australian dollars\n* Occupancy is a percentage of rooms occupied\n* CPI is an index with value 100 in 2012 Q1.\n\n::: {.callout-caution}\n## Your turn!\n\nComplete the code to convert this dataset into a tsibble.\n:::\n\n```{webr-teachr}\nlibrary(<<fpp3>>)\n\naus_accommodation <- read.csv(\n  \"https://workshop.nectric.com.au/user2024/data/aus_accommodation.csv\"\n) |>\n  mutate(Date = as.Date(Date)) |>\n  as_tsibble(\n    <<key = State, index = Date>>\n  )\n???\n\nif(!(\"fpp3\" %in% .packages())) return(c(\"You need to load the fpp3 package!\" = TRUE))\n\nchecks <- c(\n  \"You need to use the as_tsibble() function to convert the data into a tsibble.\" = !search_ast(.code, .fn = as_tsibble),\n  \"You should specify which column provides the time of the measurements with `index`.\" = !search_ast(.code, .fn = as_tsibble, index = Date),\n  \"You need to specify the key variables that identify each time series\" = exists_in(.errored, grepl, pattern = \"distinct rows\", fixed = TRUE)\n)\n\nif(any(checks)) return(checks)\n\nif(!is_yearquarter(aus_accommodation$Date)) cat(\"Great, you've got a tsibble!\\nAlthough something doesn't look right - check the frequency of the data, why isn't it quarterly?\\n\")\nFALSE\n```\n\n\n## Exercise 3\n\n:::{.callout-important}\n## Temporal granularity\n\nThe previous exercise produced a dataset with daily 
frequency - although clearly the data is quarterly! This is because we are using a daily granularity which is inappropriate for this data.\n:::\n\nCommon temporal granularities can be created with these functions:\n\n::: {.cell}\n::: {.cell-output-display}\n\n\n|Granularity |Function             |\n|:-----------|:--------------------|\n|Annual      |`as.integer()`       |\n|Quarterly   |`yearquarter()`      |\n|Monthly     |`yearmonth()`        |\n|Weekly      |`yearweek()`         |\n|Daily       |`as_date()`, `ymd()` |\n|Sub-daily   |`as_datetime()`      |\n\n\n:::\n:::\n\n\n::: {.callout-caution}\n## Your turn!\n\nUse the appropriate granularity for the `aus_accommodation` dataset, and verify that the frequency is now quarterly.\n:::\n\n\n```{webr-teachr}\naus_accommodation <- read.csv(\n  \"https://workshop.nectric.com.au/user2024/data/aus_accommodation.csv\"\n) |>\n  mutate(<<Quarter = yearquarter(Date)>>) |>\n  as_tsibble(\n    key = State, index = <<Quarter>>\n  )\n???\n\nif(!(\"fpp3\" %in% .packages())) return(c(\"You need to load the fpp3 package!\" = TRUE))\n\nc(\n  \"You need to save the dataset as `aus_accommodation`\" = !exists(\"aus_accommodation\"),\n  \"You need to use the as_tsibble() function to convert the data into a tsibble.\" = !search_ast(.code, .fn = as_tsibble),\n  \"You need to specify the key variables that identify each time series\" = exists_in(.errored, grepl, pattern = \"distinct rows\", fixed = TRUE),\n  \"You should use `yearquarter()` to change the time column into a quarterly granularity\" = !is_yearquarter(aus_accommodation[[index_var(aus_accommodation)]])\n)\n```\n\n## Exercise 4\n\nThe `tourism` dataset contains the quarterly overnight trips from 1998 Q1 to 2016 Q4 across Australia.\n\nIt is disaggregated by 3 key variables:\n\n* `State`: States and territories of Australia\n* `Region`: The tourism regions are formed through the aggregation of Statistical Local Areas (SLAs) which are defined by the various State and Territory 
tourism authorities according to their research and marketing needs\n* `Purpose`: Stopover purpose of visit: \"Holiday\", \"Visiting friends and relatives\", \"Business\", \"Other reason\".\n\nCalculate the total quarterly tourists visiting Victoria from the `tourism` dataset.\n\n```{webr-teachr}\ntourism |>\n  filter(<<State == \"Victoria\">>) |>\n  summarise(<<Trips = sum(Trips)>>)\n\n???\n\nif(!(\"fpp3\" %in% .packages())) return(c(\"You need to load the fpp3 package!\" = TRUE))\n\nc(\n  \"You need to use the filter() function to extract only Victorian tourists.\" = !search_ast(.code, .fn = filter),\n  \"You need to use the summarise() function to sum over the Region and Purpose keys.\" = !search_ast(.code, .fn = summarise)\n)\n```\n\nFind what combination of `Region` and `Purpose` had the maximum number of overnight trips on average.\n\n```{webr-teachr}\ntourism |>\n  as_tibble() |>\n  group_by(<<Region, Purpose>>) |>\n  summarise(<<Trips = mean(Trips), .groups = \"drop\">>) |>\n  filter(<<Trips == max(Trips)>>)\n\n???\n\nif(!(\"fpp3\" %in% .packages())) return(c(\"You need to load the fpp3 package!\" = TRUE))\n\nc(\n  \"You need to use the as_tibble() function to convert back to a tibble object.\" = !search_ast(.code, .fn = as_tibble),\n  \"You need to use the group_by() function to group by Region and Purpose.\" = !search_ast(.code, .fn = group_by)\n)\n```\n\nCreate a new tsibble which combines the Purposes and Regions, and just has total trips by State.\n\n```{webr-teachr}\ntourism\n\n???\n\nif(!(\"fpp3\" %in% .packages())) return(c(\"You need to load the fpp3 package!\" = TRUE))\n\nc(\n  \"You need to use the group_by() function to group by State.\" = !search_ast(.code, .fn = group_by),\n  \"You need to use the summarise() function to sum over the Region and Purpose keys.\" = !search_ast(.code, .fn = summarise)\n)\n```\n",
+    "supporting": [
+      "activities_files"
+    ],
     "filters": [
       "rmarkdown/pagebreak.lua"
     ],

diff --git a/week1/activities.qmd b/week1/activities.qmd
@@ -183,14 +183,53 @@ It is disaggregated by 3 key variables:
 * `Region`: The tourism regions are formed through the aggregation of Statistical Local Areas (SLAs) which are defined by the various State and Territory tourism authorities according to their research and marketing needs
 * `Purpose`: Stopover purpose of visit: "Holiday", "Visiting friends and relatives", "Business", "Other reason".
 
-::: {.callout-caution}
-## Your turn!
-
 Calculate the total quarterly tourists visiting Victoria from the `tourism` dataset.
-:::
 
+```{webr-teachr}
+tourism |>
+  filter(<<State == "Victoria">>) |>
+  summarise(<<Trips = sum(Trips)>>)
+
+???
+
+if(!("fpp3" %in% .packages())) return(c("You need to load the fpp3 package!" = TRUE))
+
+c(
+  "You need to use the filter() function to extract only Victorian tourists." = !search_ast(.code, .fn = filter),
+  "You need to use the summarise() function to sum over the Region and Purpose keys." = !search_ast(.code, .fn = summarise)
+)
+```
+
+Find what combination of `Region` and `Purpose` had the maximum number of overnight trips on average.
+
+```{webr-teachr}
+tourism |>
+  as_tibble() |>
+  group_by(<<Region, Purpose>>) |>
+  summarise(<<Trips = mean(Trips), .groups = "drop">>) |>
+  filter(<<Trips == max(Trips)>>)
+
+???
+
+if(!("fpp3" %in% .packages())) return(c("You need to load the fpp3 package!" = TRUE))
 
-1. Download `tourism.xlsx` from [`http://robjhyndman.com/data/tourism.xlsx`](http://robjhyndman.com/data/tourism.xlsx), and read it into R using `read_excel()` from the `readxl` package.
-2. Create a tsibble which is identical to the `tourism` tsibble from the `tsibble` package.
-3. Find what combination of `Region` and `Purpose` had the maximum number of overnight trips on average.
-4. Create a new tsibble which combines the Purposes and Regions, and just has total trips by State.
+c(
+  "You need to use the as_tibble() function to convert back to a tibble object." = !search_ast(.code, .fn = as_tibble),
+  "You need to use the group_by() function to group by Region and Purpose." = !search_ast(.code, .fn = group_by)
+)
+```
+
+Create a new tsibble which combines the Purposes and Regions, and just has total trips by State.
+
+```{webr-teachr}
+tourism
+
+???
+
+if(!("fpp3" %in% .packages())) return(c("You need to load the fpp3 package!" = TRUE))
+
+c(
+  "You need to use the group_by() function to group by State." = !search_ast(.code, .fn = group_by),
+  "You need to use the summarise() function to sum over the Region and Purpose keys." = !search_ast(.code, .fn = summarise)
+)
+```
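
The last exercise in this hunk is still a stub (`tourism` with no blanks to fill), so a possible solution may help when reviewing. The following is a minimal sketch, assuming the `tourism` tsibble that is available once `fpp3` is loaded; the object names `vic_trips` and `state_trips` are illustrative and do not appear in the commit. It also shows why the first exercise needs `Trips = sum(Trips)`: `=` names the summary column, whereas `==` would compare each value with the total.

```r
library(fpp3)

# Exercise: total quarterly trips to Victoria.
# summarise() on a tsibble keeps the time index (Quarter), so dropping
# the keys sums over Region and Purpose within each quarter.
vic_trips <- tourism |>
  filter(State == "Victoria") |>
  summarise(Trips = sum(Trips))

# Exercise: total trips by State.
# group_by(State) keeps State as the only key; Region and Purpose are
# summed out within each State and Quarter.
state_trips <- tourism |>
  group_by(State) |>
  summarise(Trips = sum(Trips))

key_vars(vic_trips)    # no key variables remain
key_vars(state_trips)  # State is the only key
```

Both results remain tsibbles, so they can be passed directly to `autoplot()` or a model without converting back from a tibble.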