Working on week 1 activities
robjhyndman committed Jan 29, 2025
1 parent 8853a4d commit a10b7f0
Showing 2 changed files with 52 additions and 11 deletions.
8 changes: 5 additions & 3 deletions _freeze/week1/activities/execute-results/html.json
@@ -1,9 +1,11 @@
{
"hash": "ad21c08c29a5ce320571dc7f094777c2",
"hash": "0fe8fe2208e2d27d0745ffd369d2615a",
"result": {
"engine": "knitr",
"markdown": "---\ntitle: \"Activities: Week 1\"\neditor: source\nengine: knitr\nfilters:\n - webr-teachr\n - quiz-teachr\nwebr:\n packages: [\"fpp3\", \"urca\"]\n autoload-packages: false\n---\n\n# Time series data and patterns\n\n## Exercise 1\n\nThe `pedestrian` dataset contains hourly pedestrian counts from 2015-01-01 to 2016-12-31 at 4 sensors in the city of Melbourne.\n\nThe data is shown below:\n\n::: {.cell}\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 66,037 × 5\n Sensor Date_Time Date Time Count\n <chr> <dttm> <date> <int> <int>\n 1 Birrarung Marr 2015-01-01 00:00:00 2015-01-01 0 1630\n 2 Birrarung Marr 2015-01-01 01:00:00 2015-01-01 1 826\n 3 Birrarung Marr 2015-01-01 02:00:00 2015-01-01 2 567\n 4 Birrarung Marr 2015-01-01 03:00:00 2015-01-01 3 264\n 5 Birrarung Marr 2015-01-01 04:00:00 2015-01-01 4 139\n 6 Birrarung Marr 2015-01-01 05:00:00 2015-01-01 5 77\n 7 Birrarung Marr 2015-01-01 06:00:00 2015-01-01 6 44\n 8 Birrarung Marr 2015-01-01 07:00:00 2015-01-01 7 56\n 9 Birrarung Marr 2015-01-01 08:00:00 2015-01-01 8 113\n10 Birrarung Marr 2015-01-01 09:00:00 2015-01-01 9 166\n# ℹ 66,027 more rows\n```\n\n\n:::\n:::\n\n::: {.callout-caution}\n## Your turn!\n\nIdentify the `index` variable, `key` variable(s), and measured variable(s) of this dataset.\n:::\n\n::: {.callout-tip}\n## Hint\n\n* The `index` variable contains the complete time information\n* The `key` variable(s) identify each time series\n* The measured variable(s) are what you want to explore/forecast.\n:::\n\n::: columns\n\n::: {.column width=\"30%\"}\n\n## `index` variable\n:::{.quiz-singlechoice}\n- [ ] [Sensor]{hint=\"x\"}\n- [X] [Date_Time]{hint=\"o\"}\n- [ ] [Date]{hint=\"x\"}\n- [ ] [Time]{hint=\"x\"}\n- [ ] [Count]{hint=\"x\"}\n:::\n:::\n\n::: {.column width=\"30%\"}\n\n## `key` variable(s)\n:::{.quiz-multichoice}\n- [X] [Sensor]{hint=\"o\"}\n- [ ] [Date_Time]{hint=\"x\"}\n- [ ] [Date]{hint=\"x\"}\n- [ ] [Time]{hint=\"x\"}\n- [ ] [Count]{hint=\"x\"}\n:::\n:::\n\n::: 
{.column width=\"40%\"}\n\n## measured variable(s)\n:::{.quiz-multichoice}\n- [ ] [Sensor]{hint=\"x\"}\n- [ ] [Date_Time]{hint=\"x\"}\n- [ ] [Date]{hint=\"x\"}\n- [ ] [Time]{hint=\"x\"}\n- [X] [Count]{hint=\"o\"}\n:::\n:::\n:::\n\n## Exercise 2\n\nThe `aus_accommodation` dataset contains quarterly data on Australian tourist accommodation from short-term non-residential accommodation with 15 or more rooms, 1998 Q1 - 2016 Q2.\n\nThe units of the measured variables are as follows:\n\n* Takings are in millions of Australian dollars\n* Occupancy is a percentage of rooms occupied\n* CPI is an index with value 100 in 2012 Q1.\n\n::: {.callout-caution}\n## Your turn!\n\nComplete the code to convert this dataset into a tsibble.\n:::\n\n```{webr-teachr}\nlibrary(<<fpp3>>)\n\naus_accommodation <- read.csv(\n \"https://workshop.nectric.com.au/user2024/data/aus_accommodation.csv\"\n) |>\n mutate(Date = as.Date(Date)) |>\n as_tsibble(\n <<key = State, index = Date>>\n )\n???\n\nif(!(\"fpp3\" %in% .packages())) return(c(\"You need to load the fpp3 package!\" = TRUE))\n\nchecks <- c(\n \"You need to use the as_tsibble() function to convert the data into a tsibble.\" = !search_ast(.code, .fn = as_tsibble),\n \"You should specify which column provides the time of the measurements with `index`.\" = !search_ast(.code, .fn = as_tsibble, index = Date),\n \"You need to specify the key variables that identify each time series\" = exists_in(.errored, grepl, pattern = \"distinct rows\", fixed = TRUE)\n)\n\nif(any(checks)) return(checks)\n\nif(!is_yearquarter(aus_accommodation$Date)) cat(\"Great, you've got a tsibble!\\nAlthough something doesn't look right - check the frequency of the data, why isn't it quarterly?\\n\")\nFALSE\n```\n\n\n## Exercise 3\n\n:::{.callout-important}\n## Temporal granularity\n\nThe previous exercise produced a dataset with daily frequency - although clearly the data is quarterly! 
This is because we are using a daily granularity which is inappropriate for this data.\n:::\n\nCommon temporal granularities can be created with these functions:\n\n::: {.cell}\n::: {.cell-output-display}\n\n\n|Granularity |Function |\n|:-----------|:--------------------|\n|Annual |`as.integer()` |\n|Quarterly |`yearquarter()` |\n|Monthly |`yearmonth()` |\n|Weekly |`yearweek()` |\n|Daily |`as_date()`, `ymd()` |\n|Sub-daily |`as_datetime()` |\n\n\n:::\n:::\n\n\n::: {.callout-caution}\n## Your turn!\n\nUse the appropriate granularity for the `aus_accommodation` dataset, and verify that the frequency is now quarterly.\n:::\n\n\n```{webr-teachr}\naus_accommodation <- read.csv(\n \"https://workshop.nectric.com.au/user2024/data/aus_accommodation.csv\"\n) |>\n mutate(<<Quarter = yearquarter(Date)>>) |>\n as_tsibble(\n key = State, index = <<Quarter>>\n )\n???\n\nif(!(\"fpp3\" %in% .packages())) return(c(\"You need to load the fpp3 package!\" = TRUE))\n\nc(\n \"You need to save the dataset as `aus_accommodation`\" = !exists(\"aus_accommodation\"),\n \"You need to use the as_tsibble() function to convert the data into a tsibble.\" = !search_ast(.code, .fn = as_tsibble),\n \"You need to specify the key variables that identify each time series\" = exists_in(.errored, grepl, pattern = \"distinct rows\", fixed = TRUE),\n \"You should use `yearquarter()` to change the time column into a quarterly granularity\" = !is_yearquarter(aus_accommodation[[index_var(aus_accommodation)]])\n)\n```\n\n## Exercise 4\n\nThe `tourism` dataset contains the quarterly overnight trips from 1998 Q1 to 2016 Q4 across Australia.\n\nIt is disaggregated by 3 key variables:\n\n* `State`: States and territories of Australia\n* `Region`: The tourism regions are formed through the aggregation of Statistical Local Areas (SLAs) which are defined by the various State and Territory tourism authorities according to their research and marketing needs\n* `Purpose`: Stopover purpose of visit: \"Holiday\", 
\"Visiting friends and relatives\", \"Business\", \"Other reason\".\n\n::: {.callout-caution}\n## Your turn!\n\nCalculate the total quarterly tourists visiting Victoria from the `tourism` dataset.\n:::\n\n\n1. Download `tourism.xlsx` from [`http://robjhyndman.com/data/tourism.xlsx`](http://robjhyndman.com/data/tourism.xlsx), and read it into R using `read_excel()` from the `readxl` package.\n2. Create a tsibble which is identical to the `tourism` tsibble from the `tsibble` package.\n3. Find what combination of `Region` and `Purpose` had the maximum number of overnight trips on average.\n4. Create a new tsibble which combines the Purposes and Regions, and just has total trips by State.\n",
"supporting": [],
"markdown": "---\ntitle: \"Activities: Week 1\"\neditor: source\nengine: knitr\nfilters:\n - webr-teachr\n - quiz-teachr\nwebr:\n packages: [\"fpp3\", \"urca\"]\n autoload-packages: false\n---\n\n# Time series data and patterns\n\n## Exercise 1\n\nThe `pedestrian` dataset contains hourly pedestrian counts from 2015-01-01 to 2016-12-31 at 4 sensors in the city of Melbourne.\n\nThe data is shown below:\n\n::: {.cell}\n::: {.cell-output .cell-output-stdout}\n\n```\n# A tibble: 66,037 × 5\n Sensor Date_Time Date Time Count\n <chr> <dttm> <date> <int> <int>\n 1 Birrarung Marr 2015-01-01 00:00:00 2015-01-01 0 1630\n 2 Birrarung Marr 2015-01-01 01:00:00 2015-01-01 1 826\n 3 Birrarung Marr 2015-01-01 02:00:00 2015-01-01 2 567\n 4 Birrarung Marr 2015-01-01 03:00:00 2015-01-01 3 264\n 5 Birrarung Marr 2015-01-01 04:00:00 2015-01-01 4 139\n 6 Birrarung Marr 2015-01-01 05:00:00 2015-01-01 5 77\n 7 Birrarung Marr 2015-01-01 06:00:00 2015-01-01 6 44\n 8 Birrarung Marr 2015-01-01 07:00:00 2015-01-01 7 56\n 9 Birrarung Marr 2015-01-01 08:00:00 2015-01-01 8 113\n10 Birrarung Marr 2015-01-01 09:00:00 2015-01-01 9 166\n# ℹ 66,027 more rows\n```\n\n\n:::\n:::\n\n::: {.callout-caution}\n## Your turn!\n\nIdentify the `index` variable, `key` variable(s), and measured variable(s) of this dataset.\n:::\n\n::: {.callout-tip}\n## Hint\n\n* The `index` variable contains the complete time information\n* The `key` variable(s) identify each time series\n* The measured variable(s) are what you want to explore/forecast.\n:::\n\n::: columns\n\n::: {.column width=\"30%\"}\n\n## `index` variable\n:::{.quiz-singlechoice}\n- [ ] [Sensor]{hint=\"x\"}\n- [X] [Date_Time]{hint=\"o\"}\n- [ ] [Date]{hint=\"x\"}\n- [ ] [Time]{hint=\"x\"}\n- [ ] [Count]{hint=\"x\"}\n:::\n:::\n\n::: {.column width=\"30%\"}\n\n## `key` variable(s)\n:::{.quiz-multichoice}\n- [X] [Sensor]{hint=\"o\"}\n- [ ] [Date_Time]{hint=\"x\"}\n- [ ] [Date]{hint=\"x\"}\n- [ ] [Time]{hint=\"x\"}\n- [ ] [Count]{hint=\"x\"}\n:::\n:::\n\n::: 
{.column width=\"40%\"}\n\n## measured variable(s)\n:::{.quiz-multichoice}\n- [ ] [Sensor]{hint=\"x\"}\n- [ ] [Date_Time]{hint=\"x\"}\n- [ ] [Date]{hint=\"x\"}\n- [ ] [Time]{hint=\"x\"}\n- [X] [Count]{hint=\"o\"}\n:::\n:::\n:::\n\n## Exercise 2\n\nThe `aus_accommodation` dataset contains quarterly data on Australian tourist accommodation from short-term non-residential accommodation with 15 or more rooms, 1998 Q1 - 2016 Q2.\n\nThe units of the measured variables are as follows:\n\n* Takings are in millions of Australian dollars\n* Occupancy is a percentage of rooms occupied\n* CPI is an index with value 100 in 2012 Q1.\n\n::: {.callout-caution}\n## Your turn!\n\nComplete the code to convert this dataset into a tsibble.\n:::\n\n```{webr-teachr}\nlibrary(<<fpp3>>)\n\naus_accommodation <- read.csv(\n \"https://workshop.nectric.com.au/user2024/data/aus_accommodation.csv\"\n) |>\n mutate(Date = as.Date(Date)) |>\n as_tsibble(\n <<key = State, index = Date>>\n )\n???\n\nif(!(\"fpp3\" %in% .packages())) return(c(\"You need to load the fpp3 package!\" = TRUE))\n\nchecks <- c(\n \"You need to use the as_tsibble() function to convert the data into a tsibble.\" = !search_ast(.code, .fn = as_tsibble),\n \"You should specify which column provides the time of the measurements with `index`.\" = !search_ast(.code, .fn = as_tsibble, index = Date),\n \"You need to specify the key variables that identify each time series\" = exists_in(.errored, grepl, pattern = \"distinct rows\", fixed = TRUE)\n)\n\nif(any(checks)) return(checks)\n\nif(!is_yearquarter(aus_accommodation$Date)) cat(\"Great, you've got a tsibble!\\nAlthough something doesn't look right - check the frequency of the data, why isn't it quarterly?\\n\")\nFALSE\n```\n\n\n## Exercise 3\n\n:::{.callout-important}\n## Temporal granularity\n\nThe previous exercise produced a dataset with daily frequency - although clearly the data is quarterly! 
This is because we are using a daily granularity which is inappropriate for this data.\n:::\n\nCommon temporal granularities can be created with these functions:\n\n::: {.cell}\n::: {.cell-output-display}\n\n\n|Granularity |Function |\n|:-----------|:--------------------|\n|Annual |`as.integer()` |\n|Quarterly |`yearquarter()` |\n|Monthly |`yearmonth()` |\n|Weekly |`yearweek()` |\n|Daily |`as_date()`, `ymd()` |\n|Sub-daily |`as_datetime()` |\n\n\n:::\n:::\n\n\n::: {.callout-caution}\n## Your turn!\n\nUse the appropriate granularity for the `aus_accommodation` dataset, and verify that the frequency is now quarterly.\n:::\n\n\n```{webr-teachr}\naus_accommodation <- read.csv(\n \"https://workshop.nectric.com.au/user2024/data/aus_accommodation.csv\"\n) |>\n mutate(<<Quarter = yearquarter(Date)>>) |>\n as_tsibble(\n key = State, index = <<Quarter>>\n )\n???\n\nif(!(\"fpp3\" %in% .packages())) return(c(\"You need to load the fpp3 package!\" = TRUE))\n\nc(\n \"You need to save the dataset as `aus_accommodation`\" = !exists(\"aus_accommodation\"),\n \"You need to use the as_tsibble() function to convert the data into a tsibble.\" = !search_ast(.code, .fn = as_tsibble),\n \"You need to specify the key variables that identify each time series\" = exists_in(.errored, grepl, pattern = \"distinct rows\", fixed = TRUE),\n \"You should use `yearquarter()` to change the time column into a quarterly granularity\" = !is_yearquarter(aus_accommodation[[index_var(aus_accommodation)]])\n)\n```\n\n## Exercise 4\n\nThe `tourism` dataset contains the quarterly overnight trips from 1998 Q1 to 2016 Q4 across Australia.\n\nIt is disaggregated by 3 key variables:\n\n* `State`: States and territories of Australia\n* `Region`: The tourism regions are formed through the aggregation of Statistical Local Areas (SLAs) which are defined by the various State and Territory tourism authorities according to their research and marketing needs\n* `Purpose`: Stopover purpose of visit: \"Holiday\", 
\"Visiting friends and relatives\", \"Business\", \"Other reason\".\n\nCalculate the total quarterly tourists visiting Victoria from the `tourism` dataset.\n\n```{webr-teachr}\ntourism |>\n filter(<<State == \"Victoria\">>) |>\n summarise(<<Trips == sum(Trips)>>)\n\n???\n\nif(!(\"fpp3\" %in% .packages())) return(c(\"You need to load the fpp3 package!\" = TRUE))\n\nc(\n \"You need to use the filter() function to extract only Victorian tourists.\" = !search_ast(.code, .fn = filter),\n \"You need to use the summarise() function to sum over the Region and Purpose keys.\" = !search_ast(.code, .fn = summarise),\n)\n```\n\nFind what combination of `Region` and `Purpose` had the maximum number of overnight trips on average.\n\n```{webr-teachr}\ntourism |>\n as_tibble() |>\n group_by(<<Region, Purpose>>) |>\n summarise(<<Trips = mean(Trips), .groups = \"drop\">>) |>\n filter(<<Trips == max(Trips)>>)\n\n???\n\nif(!(\"fpp3\" %in% .packages())) return(c(\"You need to load the fpp3 package!\" = TRUE))\n\nc(\n \"You need to use the as_tibble() function to convert back to a tibble object.\" = !search_ast(.code, .fn = as_tibble),\n \"You need to use the group_by() function to group by Region and Purpose.\" = !search_ast(.code, .fn = group_by),\n)\n```\n\nCreate a new tsibble which combines the Purposes and Regions, and just has total trips by State.\n\n```{webr-teachr}\ntourism\n\n???\n\nif(!(\"fpp3\" %in% .packages())) return(c(\"You need to load the fpp3 package!\" = TRUE))\n\nc(\n \"You need to use the filter() function to extract only Victorian tourists.\" = !search_ast(.code, .fn = filter),\n \"You need to use the summarise() function to sum over the Region and Purpose keys.\" = !search_ast(.code, .fn = summarise),\n)\n```\n",
"supporting": [
"activities_files"
],
"filters": [
"rmarkdown/pagebreak.lua"
],
55 changes: 47 additions & 8 deletions week1/activities.qmd
@@ -183,14 +183,53 @@ It is disaggregated by 3 key variables:
* `Region`: The tourism regions are formed through the aggregation of Statistical Local Areas (SLAs) which are defined by the various State and Territory tourism authorities according to their research and marketing needs
* `Purpose`: Stopover purpose of visit: "Holiday", "Visiting friends and relatives", "Business", "Other reason".

::: {.callout-caution}
## Your turn!

Calculate the total quarterly tourists visiting Victoria from the `tourism` dataset.
:::

```{webr-teachr}
tourism |>
  filter(<<State == "Victoria">>) |>
  summarise(<<Trips = sum(Trips)>>)

???

if(!("fpp3" %in% .packages())) return(c("You need to load the fpp3 package!" = TRUE))

c(
  "You need to use the filter() function to extract only Victorian tourists." = !search_ast(.code, .fn = filter),
  "You need to use the summarise() function to sum over the Region and Purpose keys." = !search_ast(.code, .fn = summarise)
)
```
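
A minimal sketch of why this result keeps one row per quarter, assuming the `tourism` tsibble bundled with the `tsibble` package (which `fpp3` also attaches). On a tsibble, `summarise()` aggregates within each value of the index, so the sum collapses the `Region` and `Purpose` keys but not time. The name `vic_total` is illustrative only.

```r
library(tsibble)
library(dplyr)

# Keep only the Victorian series, then sum across the remaining keys.
# summarise() on a tsibble implicitly groups by the index (Quarter).
vic_total <- tourism |>
  filter(State == "Victoria") |>
  summarise(Trips = sum(Trips))

# Still a tsibble, with no key variables left and one row per quarter
is_tsibble(vic_total)
key_vars(vic_total)
nrow(vic_total) == length(unique(tourism$Quarter))
```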

Find what combination of `Region` and `Purpose` had the maximum number of overnight trips on average.

```{webr-teachr}
tourism |>
  as_tibble() |>
  group_by(<<Region, Purpose>>) |>
  summarise(<<Trips = mean(Trips), .groups = "drop">>) |>
  filter(<<Trips == max(Trips)>>)

???

if(!("fpp3" %in% .packages())) return(c("You need to load the fpp3 package!" = TRUE))

c(
  "You need to use the as_tibble() function to convert back to a tibble object." = !search_ast(.code, .fn = as_tibble),
  "You need to use the group_by() function to group by Region and Purpose." = !search_ast(.code, .fn = group_by)
)
```
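
The `as_tibble()` step matters because a tsibble always retains its index when summarising. The sketch below contrasts the two cases, again assuming the `tourism` tsibble from the `tsibble` package. The names `by_quarter` and `overall` are hypothetical, and `slice_max()` is shown only as an alternative to `filter(Trips == max(Trips))`, not as the exercise's intended answer.

```r
library(tsibble)
library(dplyr)

# Without as_tibble(): the index is kept, so there is one row
# per Region, Purpose and Quarter.
by_quarter <- tourism |>
  group_by(Region, Purpose) |>
  summarise(Trips = mean(Trips))

# With as_tibble(): the mean is taken over all quarters, leaving
# one row per Region and Purpose.
overall <- tourism |>
  as_tibble() |>
  group_by(Region, Purpose) |>
  summarise(Trips = mean(Trips), .groups = "drop")

nrow(by_quarter) > nrow(overall)

# Alternative way to pick the combination with the highest average
overall |> slice_max(Trips, n = 1)
```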

Create a new tsibble which combines the Purposes and Regions, and just has total trips by State.

```{webr-teachr}
tourism

???

if(!("fpp3" %in% .packages())) return(c("You need to load the fpp3 package!" = TRUE))

c(
  "You need to use the group_by() function to group by State." = !search_ast(.code, .fn = group_by),
  "You need to use the summarise() function to sum over the Region and Purpose keys." = !search_ast(.code, .fn = summarise)
)
```
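
One possible approach to this last task, offered as a sketch and not as the intended solution: group by `State` and sum, which leaves a tsibble keyed by `State` alone. It assumes the `tourism` tsibble from the `tsibble` package, and `state_totals` is a hypothetical name.

```r
library(tsibble)
library(dplyr)

# Grouping by State and summing drops Region and Purpose from the key;
# the Quarter index is retained automatically.
state_totals <- tourism |>
  group_by(State) |>
  summarise(Trips = sum(Trips))

is_tsibble(state_totals)
key_vars(state_totals)
```

Because the result is still a tsibble, it can be passed straight to functions such as `autoplot()` when `fpp3` is loaded.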
