Accuracy function - grouping issue

Hi there!

The winter wind howls... and with it, a fresh bug has slipped under the door.
I was working on forecasting hierarchical time series and used the cross-validation feature. However, when computing accuracy metrics via `accuracy()`, the results come out wrong if the `by` argument does not include the keys of the data.

Here's the reproducible example:

``` r
library(tidyverse)
library(fpp3)

# --- FIT AND FORECAST ---
tourism_full <- tourism %>% 
  aggregate_key((State/Region), Trips = sum(Trips))

fit <- tourism_full %>% 
  slide_tsibble(.size = 60) %>% 
  relocate(.id) %>% 
  filter(.id <= 2) %>% # keep only the first 2 folds for cross-validation
  # modelling
  model(base = ETS(Trips)) %>%
  reconcile(
    ols = min_trace(base, method = "ols"),
    mint = min_trace(base, method = "mint_shrink")
  )

fc <- fit %>% 
  forecast(h = 4) %>% 
  group_by(.id, .model, State, Region) %>% 
  mutate(h = row_number()) %>%
  ungroup() %>%
  as_fable(response = "Trips", distribution = Trips)

# --- ACCURACY ---
# Directly group_by h and model
accuracy_1 <- fc %>% 
  accuracy(
    tourism_full,
    by = c("h", ".model"),
    measures = list(RMSE = RMSE)
  )

# Group_by h, model, series; then summarise later
accuracy_2 <- fc %>% 
  accuracy(
    tourism_full,
    by = c("h", ".model", "State", "Region"),
    measures = list(RMSE = RMSE)
  ) %>% 
  group_by(.model, h) %>%
  summarise(RMSE = sqrt(mean(RMSE^2)), .groups = "drop")

# Plotting
accuracy_1 %>% 
  ggplot(aes(x = h, y = RMSE, color = .model)) +
  geom_line()

accuracy_2 %>%
  ggplot(aes(x = h, y = RMSE, color = .model)) +
  geom_line()
```
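
The `sqrt(mean(RMSE^2))` step in `accuracy_2` relies on the fact that, when every group holds the same number of errors, the root of the mean of the squared per-group RMSEs equals the RMSE over all errors pooled together. Here is a minimal base R sketch with made-up errors (the `errors` data frame and its columns are hypothetical, not taken from `tourism`):

``` r
set.seed(1)

# Hypothetical forecast errors: 5 series with 4 errors each (equal group sizes)
errors <- data.frame(
  series = rep(letters[1:5], each = 4),
  e      = rnorm(20)
)

# RMSE over all errors pooled together
pooled_rmse <- sqrt(mean(errors$e^2))

# RMSE per series, then combined the same way as in accuracy_2
per_series_rmse <- tapply(errors$e, errors$series, function(e) sqrt(mean(e^2)))
combined_rmse   <- sqrt(mean(per_series_rmse^2))

# The two agree up to floating-point tolerance
all.equal(pooled_rmse, combined_rmse)
```

With unequal group sizes the two quantities would generally differ, so the comparison only holds because each series contributes the same number of forecasts per `h` and `.model`.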

In theory, both approaches should give the same numbers. After digging into the source code of `accuracy()`, I believe the problem lies in the first call.

Since we group by _h_ and _model_, there's a point ([here](https://github.com/tidyverts/fabletools/blob/3f3ce99899f20195a8f355fb357e84477549475c/R/accuracy.R#L578-L581)) where the function ***left-joins*** the forecast data with the actual data based on our `by` argument, `c("h", ".model")`.

However, the keys of the forecast data are `.id, State, Region, .model`, while the keys of the actual data are `State, Region`, and `accuracy()` doesn't take that into account:

``` r
# --- This is forecast data
# A tsibble: 2,040 x 8 [1Q]
# Key:       .id, State, Region, .model [510]
    .id State  Region     .fc        .dist Quarter     h .model
  <int> <chr*> <chr*>   <dbl>       <dist>   <qtr> <int> <chr> 
1     1 ACT    Canberra  486. N(486, 3706) 2013 Q1     1 base  
2     1 ACT    Canberra  486. N(486, 3706) 2013 Q2     2 base  
3     1 ACT    Canberra  486. N(486, 3706) 2013 Q3     3 base  
# ℹ 2,037 more rows

# --- This is actual data for calculating resid
# A tsibble: 6,800 x 4 [1Q]
# Key:       State, Region [85]
  State        Region       Quarter .actual
  <chr*>       <chr*>         <qtr>   <dbl>
1 <aggregated> <aggregated> 1998 Q1  23182.
2 <aggregated> <aggregated> 1998 Q2  20323.
3 <aggregated> <aggregated> 1998 Q3  19827.
# ℹ 6,797 more rows

# --- After left-join
# A tibble: 173,400 × 11
    .id State  Region     .fc        .dist Quarter     h .model State.y            Region.y     .actual
  <int> <chr*> <chr*>   <dbl>       <dist>   <qtr> <int> <chr>  <chr*>             <chr*>         <dbl>
1     1 ACT    Canberra  486. N(486, 3706) 2013 Q1     1 base   <aggregated>       <aggregated>  21984.
2     1 ACT    Canberra  486. N(486, 3706) 2013 Q1     1 base   ACT                <aggregated>    525.
3     1 ACT    Canberra  486. N(486, 3706) 2013 Q1     1 base   New South Wales    <aggregated>   6971.
4     1 ACT    Canberra  486. N(486, 3706) 2013 Q1     1 base   Northern Territory <aggregated>    133.
5     1 ACT    Canberra  486. N(486, 3706) 2013 Q1     1 base   Queensland         <aggregated>   4779.
# ℹ 173,395 more rows
```

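The join is effectively done on `Quarter` alone, so each of the 2,040 forecast rows is matched with the actuals of all 85 series for that quarter, giving a tibble with 2,040 × 85 = 173,400 rows.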
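
To illustrate the row explosion outside of `accuracy()`, here is a small sketch with toy data (the `fc_toy` and `actual_toy` tibbles and their columns are made up for illustration; this is not the code inside fabletools). Joining on the time column alone matches every forecast row with the actuals of every series, while adding the series key to `by` keeps one row per forecast:

``` r
library(dplyr)

# Hypothetical toy data: 2 series observed over 2 quarters
fc_toy <- tibble(
  series  = rep(c("A", "B"), each = 2),
  quarter = rep(c("2013 Q1", "2013 Q2"), times = 2),
  .fc     = c(10, 11, 20, 21)
)
actual_toy <- tibble(
  series  = rep(c("A", "B"), each = 2),
  quarter = rep(c("2013 Q1", "2013 Q2"), times = 2),
  .actual = c(9, 12, 19, 22)
)

# Join on the time column only: each forecast row matches both series,
# and the unmatched key shows up as `series.y` (like `State.y` above).
# Recent dplyr versions may also warn about a many-to-many relationship.
joined_by_time <- left_join(fc_toy, actual_toy, by = "quarter", suffix = c("", ".y"))
nrow(joined_by_time)  # 4 forecast rows x 2 series = 8

# Join on the series key and the time column: one actual per forecast row
joined_by_key <- left_join(fc_toy, actual_toy, by = c("series", "quarter"))
nrow(joined_by_key)   # 4
```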

This is just my own investigation, so please let me know if I'm missing anything! :)

Accuracy function - grouping issue #421

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Accuracy function - grouping issue #421

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions