FR: optional .interval argument to fill_gaps #302

@warnes

I am joining multiple time series whose values were collected at different intervals, ranging from months to years. Consequently, I need to harmonize the intervals before performing the join.

At the moment, I don't see a documented method for setting the desired interval, either directly, or when calling fill_gaps.

StackOverflow shows a mechanism for overriding the interval by explicitly changing the object attribute (see https://stackoverflow.com/a/75981369), but I prefer to use documented interfaces whenever possible.

For my current code, it would be very helpful to have an additional optional .interval argument to fill_gaps that performs this step.

Perhaps something like this:

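# Overwrite the tsibble's interval attribute; `...` is passed to new_interval()
set_interval <- function(object, ...)
{
  attr(object, 'interval') <- new_interval(...)
  object
}

# Wrapper around tsibble::fill_gaps() with an optional .interval argument,
# e.g. .interval = c(quarter = 1)
fill_gaps_interval <- function(.data, ..., .full = FALSE, .start = NULL, .end = NULL, .interval = NULL)
{
  if(!is.null(.interval))
  {
    # Apply the requested interval before filling
    .interval <- as.list(.interval)
    .interval$object <- .data
    .data <- do.call(set_interval, .interval)
  }

  # Re-dispatch the original call to fill_gaps() without the .interval argument
  call <- match.call()
  call$.data <- .data
  call$.interval <- NULL
  call[[1L]] <- quote(tsibble::fill_gaps)
  eval(call, parent.frame())
}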

Reproducible Example:

> library(tidyverse)
> library(tsibble)

> df1 <- tsibble(quarter = yearquarter(as_date(c('2020-1-1','2021-1-1','2022-3-1'))),
+                   amount = c(5, 2, 1))
Using `quarter` as index variable.

> df2 <- tsibble(quarter = yearquarter(as_date(c('2022-1-1','2022-4-1','2022-7-1'))),
+                   amount = c(5, 2, 1))
Using `quarter` as index variable.

> ###
> # Existing functionality
> ###
> 
> interval(df1)
<interval[1]>
[1] 4Q

> # --> Fills 4Q interval
> df1 %>% fill_gaps(.start=yearquarter('2020-01-01'), .end=yearquarter('2023-01-01'))
# A tsibble: 4 x 2 [4Q]
  quarter amount
    <qtr>  <dbl>
1 2020 Q1      5
2 2021 Q1      2
3 2022 Q1      1
4 2023 Q1     NA

> interval(df2)
<interval[1]>
[1] 1Q

> # --> Fills 1Q interval
> df2 %>% fill_gaps(.start=yearquarter('2020-01-01'), .end=yearquarter('2023-01-01'))
# A tsibble: 13 x 2 [1Q]
   quarter amount
     <qtr>  <dbl>
 1 2020 Q1     NA
 2 2020 Q2     NA
 3 2020 Q3     NA
 4 2020 Q4     NA
 5 2021 Q1     NA
 6 2021 Q2     NA
 7 2021 Q3     NA
 8 2021 Q4     NA
 9 2022 Q1      5
10 2022 Q2      2
11 2022 Q3      1
12 2022 Q4     NA
13 2023 Q1     NA

> ###
> # Desired functionality: Fill to individual quarter 
> ###
> df1 %>% fill_gaps_interval(.start=yearquarter('2020-01-01'), .end=yearquarter('2023-01-01'), .interval=c(quarter=1))
# A tsibble: 13 x 2 [1Q]
   quarter amount
     <qtr>  <dbl>
 1 2020 Q1      5
 2 2020 Q2     NA
 3 2020 Q3     NA
 4 2020 Q4     NA
 5 2021 Q1      2
 6 2021 Q2     NA
 7 2021 Q3     NA
 8 2021 Q4     NA
 9 2022 Q1      1
10 2022 Q2     NA
11 2022 Q3     NA
12 2022 Q4     NA
13 2023 Q1     NA

> df2 %>% fill_gaps_interval(.start=yearquarter('2020-01-01'), .end=yearquarter('2023-01-01'), .interval=c(quarter=1))
# A tsibble: 13 x 2 [1Q]
   quarter amount
     <qtr>  <dbl>
 1 2020 Q1     NA
 2 2020 Q2     NA
 3 2020 Q3     NA
 4 2020 Q4     NA
 5 2021 Q1     NA
 6 2021 Q2     NA
 7 2021 Q3     NA
 8 2021 Q4     NA
 9 2022 Q1      5
10 2022 Q2      2
11 2022 Q3      1
12 2022 Q4     NA
13 2023 Q1     NA
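
For context, once both series are on a 1Q interval, the join that motivates this request could look roughly like the sketch below. It is only an illustration under stated assumptions: `to_quarterly` is a hypothetical helper (not part of tsibble), and it relies on the undocumented attribute override from the StackOverflow answer rather than on any documented interface.

```r
library(tsibble)
library(dplyr)

df1 <- tsibble(
  quarter = yearquarter(as.Date(c("2020-01-01", "2021-01-01", "2022-03-01"))),
  amount  = c(5, 2, 1),
  index   = quarter
)
df2 <- tsibble(
  quarter = yearquarter(as.Date(c("2022-01-01", "2022-04-01", "2022-07-01"))),
  amount  = c(5, 2, 1),
  index   = quarter
)

# Hypothetical helper: force a 1Q interval by overriding the attribute
# (undocumented workaround), then fill the gaps over a common range.
to_quarterly <- function(x) {
  attr(x, "interval") <- new_interval(quarter = 1)
  fill_gaps(x,
            .start = yearquarter("2020-01-01"),
            .end   = yearquarter("2023-01-01"))
}

# Join on the shared quarterly index; converting to plain tibbles first
# keeps the join independent of tsibble-specific join behaviour.
joined <- full_join(
  as_tibble(to_quarterly(df1)),
  as_tibble(to_quarterly(df2)),
  by     = "quarter",
  suffix = c("_df1", "_df2")
)

# Back to a tsibble; the index is now complete at quarterly steps.
joined <- as_tsibble(joined, index = quarter)
```

With an `.interval` argument on `fill_gaps`, the helper would reduce to a single documented call, which is the point of this request.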
