Merge pull request #11 from nfidd/ensemble-session
ensemble session
seabbs authored Nov 4, 2024
2 parents e2405ba + beec9d7 commit 2e7bb9e
Showing 2 changed files with 166 additions and 64 deletions.
22 changes: 10 additions & 12 deletions sessions/forecast-ensembles.qmd
@@ -9,7 +9,7 @@ We can classify models along a spectrum by how much they include an understanding
These different approaches all have different strengths and weaknesses, and it is not clear a priori which one produces the best forecast in any given situation.
One way to attempt to draw strength from a diversity of approaches is the creation of so-called *forecast ensembles* from the forecasts produced by different models.

In this session, we'll start with forecasts from the models we explored in the [forecasting models](forecasting-models) session and build ensembles of these models.
In this session, we'll start with forecasts from models spanning the spectrum from mechanistic to statistical and build ensembles of these models.
We will then compare the performance of these ensembles to the individual models and to each other.
Rather than the forecast samples we have been using so far, we will now use quantile-based forecasts.

@@ -33,7 +33,7 @@ If we characterise a predictive distribution by its quantiles, we specify these
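
As an aside, a minimal sketch (ours, not the session's code) of how sampled forecasts can be summarised into quantiles; the `forecast_samples` data frame and its `.value` column are hypothetical:

```r
library(dplyr)

quantile_levels <- c(0.05, 0.25, 0.5, 0.75, 0.95)

# One row per model, forecast date, day and quantile level
forecast_quantiles <- forecast_samples |>
  group_by(model, forecast_date, day) |>
  reframe(
    quantile_level = quantile_levels,
    # empirical quantiles of the sampled predictions at each time point
    predicted = quantile(.value, probs = quantile_levels)
  )
```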

## Objectives

The aim of this session is to introduce the concept of ensembles of forecasts and to evaluate the performance of ensembles of the models we explored in the [forecasting models](forecasting-models) session.
The aim of this session is to introduce the concept of ensembles of forecasts and to evaluate the performance of ensembles of multiple models.

::: {.callout-note collapse="true"}

@@ -45,7 +45,7 @@ The source file of this session is located at `sessions/forecast-ensembles.qmd`.

## Libraries used

In this session we will use the `nfidd` package to load a data set of infection times and access stan models and helper functions, the `dplyr` and `tidyr` packages for data wrangling, the `ggplot2` package for plotting, the `tidybayes` package for extracting results of the inference, and the `scoringutils` package for evaluating forecasts.
We will also use `qra` for quantile regression averaging in the weighted ensemble section.

```{r libraries, message = FALSE}
@@ -74,15 +74,15 @@ set.seed(123)

# Individual forecast models

In this session we will use the forecasts from the models we explored in the session on [forecasting models](forecasting-models). They all shared the same basic renewal-with-delays structure but used different models for the evolution of the effective reproduction number over time. These were:
In this session we will use the forecasts from different models. They all shared the same basic renewal-with-delays structure but used different models for the evolution of the effective reproduction number over time. These were:

- A random walk model
- A random walk model (what we have looked at so far)
- A differenced autoregressive model referred to as "More statistical"
- A simple model of susceptible depeltion referred to as "More mechanistic"
- A simple model of susceptible depletion referred to as "More mechanistic"

For the purposes of this session the precise details of the models are not critical to the concepts we are exploring.

As in the session on [forecasting concepts](forecasting-concepts), we have fitted these models to a range of forecast dates so you don't have to wait for the models to fit.
As previously, we have fitted these models to a range of forecast dates so you don't have to wait for the models to fit.
We will now evaluate the forecasts from the mechanistic and statistical models.

```{r load_forecasts}
@@ -99,8 +99,6 @@ forecasts

::: {.callout-tip collapse="true"}
## How did we generate these forecasts?
We generated these forecasts using the code in `data-raw/generate-example-forecasts.r`, which uses the same approach we took for a single forecast date but generalises it to many forecast dates.
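
A hedged sketch of that approach (the data and helper functions here are hypothetical stand-ins, not the actual contents of `data-raw/generate-example-forecasts.r`):

```r
# Illustrative only: refit at each forecast date and combine the forecasts;
# `fit_model()`, `extract_forecast()` and `cases` are hypothetical
forecast_dates <- seq(as.Date("2024-06-01"), as.Date("2024-08-01"), by = 14)

forecasts <- forecast_dates |>
  lapply(\(forecast_date) {
    fit <- fit_model(cases, forecast_date = forecast_date)
    extract_forecast(fit, horizon = 14)  # 14-day horizon, as in the session
  }) |>
  dplyr::bind_rows()
```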

Some important things to note about these forecasts:

- We used a 14 day forecast horizon.
@@ -317,7 +315,7 @@ Are there any differences across forecast dates?

## Evaluation

As in the [forecasting concepts session](forecasting-concepts), we can evaluate the accuracy of the ensembles using the `{scoringutils}` package and in particular the `score()` function.
As in the [forecast evaluation session](forecast-evaluation), we can evaluate the accuracy of the ensembles using the `{scoringutils}` package and in particular the `score()` function.

```{r score-ensembles}
ensemble_scores <- simple_ensembles |>
@@ -743,7 +741,7 @@ weighted_ensemble_scores |>
summarise_scores(by = c("model"))
```
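
If you are recreating this outside the session: in recent `{scoringutils}` releases, `score()` expects a forecast object, so quantile forecasts are typically converted first. A minimal sketch under that assumption (`forecast_df` and its columns are illustrative):

```r
library(scoringutils)

# `forecast_df` is assumed to have observed, predicted and quantile_level
# columns alongside the forecast unit columns named below
scores <- forecast_df |>
  as_forecast_quantile(
    forecast_unit = c("model", "target_day", "horizon")
  ) |>
  score() |>
  summarise_scores(by = "model")
```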

Remembering the session on [forecasting concepts](forecasting-concepts), we should also check performance on the log scale.
Remembering the session on [forecast evaluation](forecast-evaluation), we should also check performance on the log scale.

```{r score-overview-weighted-log}
log_ensemble_scores <- weighted_ensembles |>
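  # (the remaining lines of this chunk are collapsed in the diff view)
  # Illustrative sketch only -- an assumption, not necessarily the session's
  # exact code -- a log-scale evaluation would transform the forecasts
  # before scoring, e.g.:
  #   weighted_ensembles |>
  #     transform_forecasts(fun = log_shift, offset = 1) |>
  #     score() |>
  #     summarise_scores(by = "model")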
@@ -767,7 +765,7 @@ How do the weighted ensembles compare to the simple ensembles on the natural and log scale?

::: {.callout-note collapse="true"}
## Solution
How do the weighted ensembles compare to the simple ensembles on the natural and log scale?
The best ensembles slightly outperform some of the simple ensembles but there is no obvious benefit from using weighted ensembles. Why might this be the case?
:::

# Going further
208 changes: 156 additions & 52 deletions sessions/slides/introduction-to-ensembles.qmd
@@ -1,12 +1,162 @@
---
title: "Multi-model ensembles"
author: "Nowcasting and forecasting of infectious disease dynamics"
author: "Understanding, evaluating, and improving forecasts of infectious disease burden"
format:
revealjs:
output: slides/forecasting-as-an-epidemiological-problem.html
footer: "Multi-model ensembles"
slide-level: 3
---

## Different types of models {.smaller}

![](figures/mechanism.png){width="70%"}

:::: {.columns}

::: {.column}

- We can classify models by the level of mechanism they include
- All of the model types we will introduce in the next few slides have been used for COVID-19 forecasting (in the US and/or European COVID-19 forecast hubs)

**NOTE:** level of mechanism $\neq$ model complexity

:::

::: {.column}

![](https://media.springernature.com/full/springer-static/image/art%3A10.1038%2Fs41597-022-01517-w/MediaObjects/41597_2022_1517_Fig4_HTML.png?as=webp)

[Cramer *et al.*, *Scientific Data*, 2022](https://doi.org/10.1038/s41597-022-01517-w)

:::

::::

### {.smaller}

![](figures/mechanism1.png){width="70%"}

:::: {.columns}

::: {.column}

**Complex agent-based models**

Conceptually probably the closest to meteorological forecasting, albeit with much less real-time data.

:::

::: {.column}

![](https://ars.els-cdn.com/content/image/1-s2.0-S1755436517300221-gr1.jpg)

[Venkatramanan *et al.*, *Epidemics*, 2018](https://doi.org/10.1016/j.epidem.2017.02.010)

:::

::::

### {.smaller}

![](figures/mechanism2.png){width="70%"}

:::: {.columns}

::: {.column}

**Compartmental models**

Aim to capture relevant mechanisms but without going to the individual level.

:::

::: {.column}

![](https://journals.plos.org/ploscompbiol/article/figure/image?size=large&id=10.1371/journal.pcbi.1008619.g001)

[Keeling *et al.*, *PLOS Comp Biol*, 2021](https://doi.org/10.1371/journal.pcbi.1008619)

:::

::::

### {.smaller}

![](figures/mechanism3.png){width="70%"}

:::: {.columns}

::: {.column}

**Semi-mechanistic models**

- Include some epidemiological mechanism (e.g. SIR or the renewal equation; see the equation below)
- Add a nonmechanistic time component inspired by statistical models (e.g. random walk)
- e.g., the model we have worked with so far
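
For reference, a standard form of the renewal equation that such models build on, where $g_s$ is the generation interval distribution:

$$I_t = R_t \sum_{s=1}^{t-1} I_{t-s} g_s$$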

:::

::: {.column}

![](https://ars.els-cdn.com/content/image/1-s2.0-S1755436516300445-gr2.jpg)

[Funk *et al.*, *Epidemics*, 2018](http://dx.doi.org/10.1016/j.epidem.2016.11.003)

:::

::::

### {.smaller}

![](figures/mechanism4.png){width="70%"}

:::: {.columns}

::: {.column}

**Statistical models**

- Models that don't include any epidemiological background, e.g. ARIMA; also called *time-series models*
- The random walk model when used on its own (without going via $R_t$) is called a [stochastic volatility model](https://mc-stan.org/docs/stan-users-guide/time-series.html#stochastic-volatility-models)
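
A minimal sketch of the random-walk component in our notation, with $X_t$ the forecast quantity:

$$\log X_t = \log X_{t-1} + \epsilon_t, \qquad \epsilon_t \sim \mathrm{Normal}(0, \sigma^2)$$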

:::

::: {.column}

![](figures/muni_model.png)

:::

::::

### {.smaller}

![](figures/mechanism.png){width="70%"}

:::: {.columns}

::: {.column}

**Other models**

- Expert or crowd opinion
- Machine learning
- ...

:::

::: {.column}

![](figures/crowdforecastr.png)

[Bosse *et al.*, *Wellcome Open Res*, 2024](https://doi.org/10.12688/wellcomeopenres.19380.2)

:::

::::


## Ensembles

- Combine many different models' forecasts into a single prediction (see the sketch after this list)
@@ -75,10 +225,6 @@ Many uncertainties, many approaches: many models

- Also enables consistent evaluation
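
A minimal sketch of the simplest such combination (ours; the tidy `forecasts` data frame with one row per model, target day and quantile level is assumed):

```r
library(dplyr)

# Unweighted "quantile mean" ensemble: average each predicted quantile
# across models
mean_ensemble <- forecasts |>
  group_by(target_day, quantile_level) |>
  summarise(predicted = mean(predicted), .groups = "drop")
```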

## Collaborative modelling

### "Forecast hubs"

- Outbreak modelling

- Since 2013 for US influenza
@@ -100,55 +246,13 @@ Many uncertainties, many approaches: many models
## ... ... Multi-model ensemble {.smaller}
![](figures/respicast-comparison.png)

## Evaluation: European COVID-19 Hub

- March 2021 - now

- ~40 teams, ~50 models

- Weekly ensemble & evaluation
![](figures/hub-metadata)

## Evaluation: European COVID-19 Hub {.smaller}

![](figures/hub-ensemble-good)


## Evaluation: European COVID-19 Hub {.smaller}

![](figures/hub-ensemble-less-good)


## Evaluating ensembles

::: {.fragment .fade-in}

- Ensemble reduces variance

- (+) Stable

- (-) Can't explore extremes
:::

::: {.fragment .fade-in}
## `r fontawesome::fa("laptop-code", "white")` Your Turn {background-color="#447099" transition="fade-in"}

- Dependent on components
1. Create unweighted and weighted ensembles using forecasts from multiple models.
2. Evaluate the forecasts from ensembles compared to their constituent models.

- (+) Ensemble typically more accurate than any individual component
#

- (-) Obscures mechanisms
:::
[Return to the session](../forecast-ensembles)

## Ensembles: reflections

- Comparing uncertainties

- Which aspects of uncertainty do we want to keep (compare), or combine?



- Collaborative modelling

- Opportunity for exchange & evaluation

- Consensus - at the cost of context?
