# atime: Asymptotic Timing
- compare time/memory/other quantities of different R codes that depend on N: atime().
- estimate the asymptotic complexity (big-O notation) of any R code that depends on some data size N: references_best().
- compare time/memory of different git versions of R package code: atime_versions().
- continuous performance testing of R packages: atime_pkg().
 
```r
## Install last released version from CRAN:
install.packages("atime")
## Install latest version from GitHub:
if(!require("remotes"))install.packages("remotes")
remotes::install_github("tdhock/atime")
```

Get started by reading the sparse matrices and regular expressions vignettes, and others.
The main function is `atime()`, for which you can specify these arguments:

- `N` is a numeric vector of data sizes to vary.
- `setup` is an expression to evaluate for every data size, before timings.
- `times` is the number of times each expression is timed (so we can take the median and ignore outliers).
- `seconds.limit` is the max number of seconds. If an expression takes more time, then it will not be timed for larger `N` values.
- There should also be at least one other named argument (an expression to time for every size `N`; the name is the label which will appear on plots).
 
```r
## When studying asymptotic complexity, always provide sizes on a log
## scale (10^sequence) as below:
(subject.size.vec <- unique(as.integer(10^seq(0,3.5,l=100))))
## Compute asymptotic time and memory measurement:
atime.list <- atime::atime(
  N=subject.size.vec,#vector of sizes.
  setup={#Run for each size, before timings:
    subject <- paste(rep("a", N), collapse="")
    pattern <- paste(rep(c("a?", "a"), each=N), collapse="")
  },
  times=10,#number of timings to compute for each expression.
  seconds.limit=0.1,#max seconds per expression.
  ## Different expressions which will be evaluated for each size N:
  PCRE.match=regexpr(pattern, subject, perl=TRUE),
  TRE.match=regexpr(pattern, subject, perl=FALSE),
  constant.replacement=gsub("a","constant size replacement",subject),
  linear.replacement=gsub("a",subject,subject))
atime.list
plot(atime.list)
## Compute and plot asymptotic reference lines:
(best.list <- atime::references_best(atime.list))
plot(best.list)
## Compute and plot data size N for given time/memory.
pred.list <- predict(best.list, seconds=1e-2, kilobytes=10)
plot(pred.list)
```

On my machine I got the following results:
```
> (subject.size.vec <- unique(as.integer(10^seq(0,3.5,l=100))))
 [1]    1    2    3    4    5    6    7    8    9   10   11   12   13   14   15
[16]   17   18   20   22   23   25   28   30   33   35   38   42   45   49   53
[31]   58   63   68   74   81   87   95  103  112  121  132  143  155  168  183
[46]  198  215  233  253  275  298  323  351  380  413  448  486  527  572  620
[61]  673  730  792  859  932 1011 1097 1190 1291 1401 1519 1648 1788 1940 2104
[76] 2283 2477 2687 2915 3162
```

The vector above is the sequence of sizes N, used with each expression, to measure time and memory. When studying asymptotic complexity, always provide sizes on a log scale as above.
```
> atime.list
atime list with 228 measurements for
PCRE.match(N=1 to 20)
TRE.match(N=1 to 275)
constant.replacement(N=1 to 3162)
linear.replacement(N=1 to 3162)
```

The output above shows the min and max N values that were run for each of the expressions. In this case constant.replacement and linear.replacement were run all the way up to the max size (3162), but PCRE.match only went up to 20, and TRE.match only went up to 275, because no larger N values are considered after the median time for a given N has exceeded seconds.limit, which is 0.1 above. This behavior ensures that the total time taken by atime will be about seconds.limit * times * number of expressions, where times is the number of times each expression is evaluated at each data size (here, about 0.1 * 10 * 4 = 4 seconds). The output of the plot method for this atime result list is shown below.
```
> plot(atime.list)
```

The plot above facilitates comparing the time and memory of the different expressions, and makes it easy to see the ranking of different algorithms, but it does not show the asymptotic complexity class.
To estimate the asymptotic complexity class, use the code below:
```
> (best.list <- atime::references_best(atime.list))
references_best list with 456 measurements, best fit complexity:
constant.replacement (N kilobytes, N seconds)
linear.replacement (N^2 kilobytes, N^2 seconds)
PCRE.match (2^N seconds)
TRE.match (N^3 seconds)
```

The output above shows the best fit asymptotic time complexity for each expression. To visualize the results you can do:
```r
plot(best.list)
```

The plot above shows the timings of each expression as a function of data size N (black), as well as the two closest asymptotic reference lines (violet, one smaller, one larger). If you have chosen N and seconds.limit appropriately for your problem (as we have in this case) then you should be able to observe the following:
- On the left you can see timings for small N, where overhead dominates the timings, and the curve is approximately constant.
- On the right you can see the asymptotic trend.
- Polynomial complexity algorithms show up as linear trends, and the slope indicates the asymptotic complexity class (larger slope for a more complex algorithm in N).
- Exponential complexity algorithms show up as super-linear curves (such as PCRE.match in this case, but in practice you should rarely encounter exponential time algorithms).
- If you do not see an interpretable result with clear linear trends on the right of the log-log plot, you should try to increase `seconds.limit` and the max value in `N` until you start to see linear trends and clearly overlapping reference lines (as is the case here).
When comparing algorithms in terms of computational resources, we can show:

- Latency: the time/memory required for a given data size N (subset rows of measurements for a given value of N);
- Throughput: the data size N possible for a given time/memory budget (use the predict method).

We can do both using the code below:
```
> atime.list[["measurements"]][N==323, .(expr.name, seconds=median, kilobytes)]
              expr.name   seconds kilobytes
                 <char>     <num>     <num>
1:            TRE.match 0.0678032    0.0000
2: constant.replacement 0.0000667    7.9375
3:   linear.replacement 0.0002435  101.9375
> pred.list <- predict(best.list, seconds=1e-2, kilobytes=10)
> pred.list[["prediction"]]
        unit            expr.name unit.value          N
      <char>               <char>      <num>      <num>
1:   seconds           PCRE.match       0.01   17.82348
2:   seconds            TRE.match       0.01  168.46338
3:   seconds   linear.replacement       0.01 2069.38604
4: kilobytes constant.replacement      10.00  407.55220
5: kilobytes   linear.replacement      10.00  100.92007
> plot(pred.list)
```

atime_versions() runs different versions of your R package code, for varying data sizes N, so you can see if there are any asymptotic differences in performance between git versions of your package. See ?atime::atime_versions for documentation and examples (grates example and output).
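For illustration, here is a minimal sketch of an atime_versions() call; the package path, function name, version labels, and commit SHAs are placeholders to replace with values from your own package and git history (see ?atime::atime_versions for real examples):

```r
## Sketch only: yourPkg, some_fun, the version labels, and the commit
## SHAs below are placeholders, not real examples.
atime.ver.list <- atime::atime_versions(
  pkg.path="~/path/to/yourPkg",   # local git clone of the package
  N=10^seq(1, 5),                 # data sizes to vary
  setup={                         # run for each size N, before timings
    data.vec <- rnorm(N)
  },
  ## the yourPkg:: prefix below is re-written so that each git version
  ## of the package is timed on the same expression:
  expr=yourPkg::some_fun(data.vec),
  ## named arguments define the versions to compare (label = commit SHA):
  "before fix"="1234567890abcdef",
  "after fix"="fedcba0987654321")
plot(atime.ver.list)
```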
If you want to run atime_versions() to check R package performance in every Pull Request, autocomment-atime-results is a GitHub Action which can plot results in a PR comment, so you can see if the PR affects performance (example output: binsegRcpp, data.table). First, you should define a .ci/atime/tests.R code file that creates an R object called test.list, which should be a list of performance tests; each one is a list of arguments that will be passed to atime_versions. See ?atime_pkg for documentation, and see these repos for code examples (a sketch of such a tests.R file follows the list below):
- atime-test-funs branch of binsegRcpp repo has a simple example with 4 test cases.
- data.table has a more complex example with over 10 test cases.
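To make the expected structure concrete, here is a hypothetical tests.R sketch in the list-of-arguments format described above; the test name, package, function, and commit SHAs are invented placeholders, so consult ?atime_pkg and the repos listed above for the authoritative format:

```r
## Hypothetical .ci/atime/tests.R sketch: yourPkg, some_fun, and the
## commit SHAs below are placeholders, not real examples.
test.list <- list(
  "some_fun on numeric vectors" = list(
    N = 10^seq(1, 5),               # data sizes to vary
    setup = quote({                 # run for each size N, before timings
      data.vec <- rnorm(N)
    }),
    expr = quote(yourPkg::some_fun(data.vec)),
    ## optional historical versions (label = commit SHA) to compare
    ## against the commits relevant to the current PR:
    Slow = "1234567890abcdef",
    Fast = "fedcba0987654321"))
```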
 
bench::press (multi-dimensional search including N) does something similar to atime (runs different N) and atime_grid (search over parameters other than N). However, it cannot store results if check=FALSE, results must be equal if check=TRUE, and there is no way to easily specify a time limit which stops timing larger sizes (like the seconds.limit argument in atime).
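As a point of comparison, here is a minimal sketch of how atime_grid() and atime() work together; it reuses the regexpr() example from above, and the PERL parameter name is just a label chosen for this grid:

```r
## Sketch: atime_grid() expands an expression over parameters other than
## N (here the perl= option of regexpr), producing a named list of
## expressions that atime() accepts via its expr.list argument.
expr.list <- atime::atime_grid(
  list(PERL=c(TRUE, FALSE)),
  match=regexpr(pattern, subject, perl=PERL))
grid.result <- atime::atime(
  N=unique(as.integer(10^seq(0, 3, l=50))),
  setup={
    subject <- paste(rep("a", N), collapse="")
    pattern <- paste(rep(c("a?", "a"), each=N), collapse="")
  },
  expr.list=expr.list,
  seconds.limit=0.1)
plot(grid.result)
```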
testComplexity::asymptoticTimings does something similar, but only for one expression (not several), and there is no special setup argument as in atime (which means that the timing must include data setup code which may be irrelevant).
| Tool | Language | Users | GitHub workflow result display | Comparative benchmarking | Performance testing |
|---|---|---|---|---|---|
| atime (proposed) | R | data.table | PR comments | yes | yes |
| bench | R | - | - | yes | - |
| microbenchmark | R | - | - | yes | - |
| system.time | R | - | - | yes | - |
| rbenchmark | R | - | - | yes | - |
| airspeed velocity | Python | numpy | web page | - | yes |
| conbench | any | arrow | web page | - | yes |
| touchstone | R | - | PR comments | - | yes |
| pytest-benchmark | Python | - | web page | - | yes |
See Bencher prior art for even more related work, and see continuous benchmarking for a plot that shows how false positives can show up if you use a database of historical timings (perhaps run on different computers; see Dirk’s real timings for the typical variability of R CI on GitHub Actions). In contrast, atime_pkg uses a database of historical commits (known Fast and Slow), and runs them alongside commits which are relevant to the current PR (HEAD, merge-base, etc.), in the same R session, so we can be confident that any differences that we see are real. In the Bencher framework, a similar idea is presented in Relative Continuous Benchmarking, which shows how to compare two branches, feature-branch and main.


