Merge pull request #11 from nfidd/ensemble-session
ensemble session
seabbs authored Nov 4, 2024
2 parents e2405ba + beec9d7 commit 2e7bb9e
Showing 2 changed files with 166 additions and 64 deletions.
22 changes: 10 additions & 12 deletions sessions/forecast-ensembles.qmd
@@ -9,7 +9,7 @@ We can classify models along a spectrum by how much they include an understanding
These different approaches all have different strengths and weaknesses, and it is not clear a priori which one produces the best forecast in any given situation.
One way to attempt to draw strength from a diversity of approaches is the creation of so-called *forecast ensembles* from the forecasts produced by different models.

In this session, we'll start with forecasts from the models we explored in the [forecasting models](forecasting-models) session and build ensembles of these models.
In this session, we'll start with forecasts from models spanning the spectrum from mechanistic to statistical and build ensembles of these models.
We will then compare the performance of these ensembles to the individual models and to each other.
Rather than the forecast samples we have been using so far, we will now use quantile-based forecasts.

@@ -33,7 +33,7 @@ If we characterise a predictive distribution by its quantiles, we specify these
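
As an aside, a minimal sketch (ours, not the session's code) of how sampled forecasts can be summarised into quantiles; the `forecast_samples` data frame and its `.value` column are hypothetical:

```r
library(dplyr)

quantile_levels <- c(0.05, 0.25, 0.5, 0.75, 0.95)

# One row per model, forecast date, day and quantile level
forecast_quantiles <- forecast_samples |>
  group_by(model, forecast_date, day) |>
  reframe(
    quantile_level = quantile_levels,
    # empirical quantiles of the sampled predictions at each time point
    predicted = quantile(.value, probs = quantile_levels)
  )
```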

## Objectives

The aim of this session is to introduce the concept of ensembles of forecasts and to evaluate the performance of ensembles of the models we explored in the [forecasting models](forecasting-models) session.
The aim of this session is to introduce the concept of ensembles of forecasts and to evaluate the performance of ensembles of multiple models.

::: {.callout-note collapse="true"}

@@ -45,7 +45,7 @@ The source file of this session is located at `sessions/forecast-ensembles.qmd`.

## Libraries used

In this session we will use the `nfidd` package to load a data set of infection times and access stan models and helper functions, the `dplyr` and `tidyr` packages for data wrangling, the `ggplot2` package for plotting, the `tidybayes` package for extracting results of the inference, and the `scoringutils` package for evaluating forecasts.
We will also use `qra` for quantile regression averaging in the weighted ensemble section.

```{r libraries, message = FALSE}
@@ -74,15 +74,15 @@ set.seed(123)

# Individual forecast models

In this session we will use the forecasts from the models we explored in the session on [forecasting models](forecasting-models). They all shared the same basic renewal-with-delays structure but used different models for the evolution of the effective reproduction number over time. These were:
In this session we will use the forecasts from different models. They all shared the same basic renewal-with-delays structure but used different models for the evolution of the effective reproduction number over time. These were:

- A random walk model
- A random walk model (what we have looked at so far)
- A differenced autoregressive model referred to as "More statistical"
- A simple model of susceptible depeltion referred to as "More mechanistic"
- A simple model of susceptible depletion referred to as "More mechanistic"

For the purposes of this session the precise details of the models are not critical to the concepts we are exploring.

As in the session on [forecasting concepts](forecasting-concepts), we have fitted these models to a range of forecast dates so you don't have to wait for the models to fit.
As previously, we have fitted these models to a range of forecast dates so you don't have to wait for the models to fit.
We will now evaluate the forecasts from the mechanistic and statistical models.

```{r load_forecasts}
@@ -99,8 +99,6 @@ forecasts

::: {.callout-tip collapse="true"}
## How did we generate these forecasts?
We generated these forecasts using the code in `data-raw/generate-example-forecasts.r`, which uses the same approach we took for a single forecast date but generalises it to many forecast dates.
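
A hedged sketch of that approach (the data and helper functions here are hypothetical stand-ins, not the actual contents of `data-raw/generate-example-forecasts.r`):

```r
# Illustrative only: refit at each forecast date and combine the forecasts;
# `fit_model()`, `extract_forecast()` and `cases` are hypothetical
forecast_dates <- seq(as.Date("2024-06-01"), as.Date("2024-08-01"), by = 14)

forecasts <- forecast_dates |>
  lapply(\(forecast_date) {
    fit <- fit_model(cases, forecast_date = forecast_date)
    extract_forecast(fit, horizon = 14)  # 14-day horizon, as in the session
  }) |>
  dplyr::bind_rows()
```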

Some important things to note about these forecasts:

- We used a 14 day forecast horizon.
@@ -317,7 +315,7 @@ Are there any differences across forecast dates?

## Evaluation

As in the [forecasting concepts session](forecasting-concepts), we can evaluate the accuracy of the ensembles using the `{scoringutils}` package and in particular the `score()` function.
As in the [forecast evaluation session](forecast-evaluation), we can evaluate the accuracy of the ensembles using the `{scoringutils}` package and in particular the `score()` function.

```{r score-ensembles}
ensemble_scores <- simple_ensembles |>
@@ -743,7 +741,7 @@ weighted_ensemble_scores |>
summarise_scores(by = c("model"))
```
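
If you are recreating this outside the session: in recent `{scoringutils}` releases, `score()` expects a forecast object, so quantile forecasts are typically converted first. A minimal sketch under that assumption (`forecast_df` and its columns are illustrative):

```r
library(scoringutils)

# `forecast_df` is assumed to have observed, predicted and quantile_level
# columns alongside the forecast unit columns named below
scores <- forecast_df |>
  as_forecast_quantile(
    forecast_unit = c("model", "target_day", "horizon")
  ) |>
  score() |>
  summarise_scores(by = "model")
```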

Remembering the session on [forecasting concepts](forecasting-concepts), we should also check performance on the log scale.
Remembering the session on [forecast evaluation](forecast-evaluation), we should also check performance on the log scale.

```{r score-overview-weighted-log}
log_ensemble_scores <- weighted_ensembles |>
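  # (the remaining lines of this chunk are collapsed in the diff view)
  # Illustrative sketch only -- an assumption, not necessarily the session's
  # exact code -- a log-scale evaluation would transform the forecasts
  # before scoring, e.g.:
  #   weighted_ensembles |>
  #     transform_forecasts(fun = log_shift, offset = 1) |>
  #     score() |>
  #     summarise_scores(by = "model")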
@@ -767,7 +765,7 @@ How do the weighted ensembles compare to the simple ensembles on the natural and log scale?

::: {.callout-note collapse="true"}
## Solution
How do the weighted ensembles compare to the simple ensembles on the natural and log scale?
The best ensembles slightly outperform some of the simple ensembles but there is no obvious benefit from using weighted ensembles. Why might this be the case?
:::

# Going further
208 changes: 156 additions & 52 deletions sessions/slides/introduction-to-ensembles.qmd
@@ -1,12 +1,162 @@
---
title: "Multi-model ensembles"
author: "Nowcasting and forecasting of infectious disease dynamics"
author: "Understanding, evaluating, and improving forecasts of infectious disease burden"
format:
revealjs:
output: slides/forecasting-as-an-epidemiological-problem.html
footer: "Multi-model ensembles"
slide-level: 3
---

## Different types of models {.smaller}

![](figures/mechanism.png){width="70%"}

:::: {.columns}

::: {.column}

- We can classify models by the level of mechanism they include
- All of the model types we will introduce in the next few slides have been used for COVID-19 forecasting (in the US and/or European COVID-19 forecast hubs)

**NOTE:** level of mechanism $\neq$ model complexity

:::

::: {.column}

![](https://media.springernature.com/full/springer-static/image/art%3A10.1038%2Fs41597-022-01517-w/MediaObjects/41597_2022_1517_Fig4_HTML.png?as=webp)

[Cramer *et al.*, *Scientific Data*, 2022](https://doi.org/10.1038/s41597-022-01517-w)

:::

::::

### {.smaller}

![](figures/mechanism1.png){width="70%"}

:::: {.columns}

::: {.column}

**Complex agent-based models**

Conceptually probably the closest to meteorological forecasting, albeit with much less real-time data.

:::

::: {.column}

![](https://ars.els-cdn.com/content/image/1-s2.0-S1755436517300221-gr1.jpg)

[Venkatramanan *et al.*, *Epidemics*, 2018](https://doi.org/10.1016/j.epidem.2017.02.010)

:::

::::

### {.smaller}

![](figures/mechanism2.png){width="70%"}

:::: {.columns}

::: {.column}

**Compartmental models**

Aim to capture relevant mechanisms but without going to the individual level.

:::

::: {.column}

![](https://journals.plos.org/ploscompbiol/article/figure/image?size=large&id=10.1371/journal.pcbi.1008619.g001)

[Keeling *et al.*, *PLOS Comp Biol*, 2021](https://doi.org/10.1371/journal.pcbi.1008619)

:::

::::

### {.smaller}

![](figures/mechanism3.png){width="70%"}

:::: {.columns}

::: {.column}

**Semi-mechanistic models**

- Include some epidemiological mechanism (e.g. SIR or the renewal equation; see the equation below)
- Add a nonmechanistic time component inspired by statistical models (e.g. random walk)
- e.g., the model we have worked with so far
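
For reference, a standard form of the renewal equation that such models build on, where $g_s$ is the generation interval distribution:

$$I_t = R_t \sum_{s=1}^{t-1} I_{t-s} g_s$$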

:::

::: {.column}

![](https://ars.els-cdn.com/content/image/1-s2.0-S1755436516300445-gr2.jpg)

[Funk *et al.*, *Epidemics*, 2018](http://dx.doi.org/10.1016/j.epidem.2016.11.003)

:::

::::

### {.smaller}

![](figures/mechanism4.png){width="70%"}

:::: {.columns}

::: {.column}

**Statistical models**

- Models that don't include any epidemiological background, e.g. ARIMA; also called *time-series models*
- The random walk model when used on its own (without going via $R_t$) is called a [stochastic volatility model](https://mc-stan.org/docs/stan-users-guide/time-series.html#stochastic-volatility-models)
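
A minimal sketch of the random-walk component in our notation, with $X_t$ the forecast quantity:

$$\log X_t = \log X_{t-1} + \epsilon_t, \qquad \epsilon_t \sim \mathrm{Normal}(0, \sigma^2)$$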

:::

::: {.column}

![](figures/muni_model.png)

:::

::::

### {.smaller}

![](figures/mechanism.png){width="70%"}

:::: {.columns}

::: {.column}

**Other models**

- Expert or crowd opinion
- Machine learning
- ...

:::

::: {.column}

![](figures/crowdforecastr.png)

[Bosse *et al.*, *Wellcome Open Res*, 2024](https://doi.org/10.12688/wellcomeopenres.19380.2)

:::

::::


## Ensembles

- Combine many different models' forecasts into a single prediction (see the sketch after this list)
@@ -75,10 +225,6 @@ Many uncertainties, many approaches: many models

- Also enables consistent evaluation
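
A minimal sketch of the simplest such combination (ours; the tidy `forecasts` data frame with one row per model, target day and quantile level is assumed):

```r
library(dplyr)

# Unweighted "quantile mean" ensemble: average each predicted quantile
# across models
mean_ensemble <- forecasts |>
  group_by(target_day, quantile_level) |>
  summarise(predicted = mean(predicted), .groups = "drop")
```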

## Collaborative modelling

### "Forecast hubs"

- Outbreak modelling

- Since 2013 for US influenza
@@ -100,55 +246,13 @@ Many uncertainties, many approaches: many models
## ... ... Multi-model ensemble {.smaller}
![](figures/respicast-comparison.png)

## Evaluation: European COVID-19 Hub

- March 2021 - now

- ~40 teams, ~50 models

- Weekly ensemble & evaluation
![](figures/hub-metadata)

## Evaluation: European COVID-19 Hub {.smaller}

![](figures/hub-ensemble-good)


## Evaluation: European COVID-19 Hub {.smaller}

![](figures/hub-ensemble-less-good)


## Evaluating ensembles

::: {.fragment .fade-in}

- Ensemble reduces variance

- (+) Stable

- (-) Can't explore extremes
:::

::: {.fragment .fade-in}
## `r fontawesome::fa("laptop-code", "white")` Your Turn {background-color="#447099" transition="fade-in"}

- Dependent on components
1. Create unweighted and weighted ensembles using forecasts from multiple models.
2. Evaluate the forecasts from ensembles compared to their constituent models.

- (+) Ensemble typically more accurate than any individual component
#

- (-) Obscures mechanisms
:::
[Return to the session](../forecast-ensembles)

## Ensembles: reflections

- Comparing uncertainties

- Which aspects of uncertainty do we want to keep (compare), or combine?



- Collaborative modelling

- Opportunity for exchange & evaluation

- Consensus - at the cost of context?
