karlabbabic/outbreak-probabilities

outbreak-probabilities

DTC Sandpit Challenge: methods for estimating the probability of a major outbreak

Table of Contents
  1. To-Do
  2. Set-up
  3. Methods

To-Do

  • Simulate – done
  • Analytic – upload the cell from the rough-work Colab notebook to GitHub, without the sliders (input params)
  • ML – write code that uses the simulated data and include plots (train 4 separate classifiers and save the models)
  • Write unit tests (ask Matthew) to cover as many lines as you can

Set-up

Continuous Integration

  • Create CI workflow in .github/workflows/ci.yml
    • GitHub Actions
    • Code Coverage

Testing

Simulation of Trajectories

Input: the first k days of infectious case counts, e.g. the first three entries [1, 2, 6] of the series [1, 2, 6, 8, ...].
Output: a CSV file, simulated_cases.csv, of case counts; columns are days (day_1, day_2, day_3, ...).

  • Consider using the tempfile module rather than saving to the user directory every time.
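
A minimal sketch of the tempfile suggestion, using only the standard library. The helper names write_simulated_cases and read_simulated_cases are hypothetical, and two layout details are assumptions not stated above: one trajectory per row, and shorter trajectories padded with zeros.

```python
import csv
import tempfile
from pathlib import Path


def write_simulated_cases(trajectories, path):
    """Write trajectories to CSV with columns day_1, day_2, ... (one row each)."""
    n_days = max(len(t) for t in trajectories)
    header = [f"day_{d}" for d in range(1, n_days + 1)]
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(header)
        for trajectory in trajectories:
            # Assumption: pad short trajectories with zeros so rows align.
            padding = [0] * (n_days - len(trajectory))
            writer.writerow(list(trajectory) + padding)


def read_simulated_cases(path):
    """Read the CSV back into a list of integer case-count lists."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        next(reader)  # skip the day_1, day_2, ... header
        return [[int(value) for value in row] for row in reader]


# The temporary directory is deleted on exit, so nothing is left
# behind in the user directory.
with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = Path(tmp_dir) / "simulated_cases.csv"
    write_simulated_cases([[1, 2, 6, 8], [1, 0, 0]], csv_path)
    round_trip = read_simulated_cases(csv_path)
```

Writing inside a TemporaryDirectory keeps tests isolated; a real run that needs to keep the file would pass an explicit output path instead.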

Methods

Method 1: Analytic Solution

Input:

  • the first k days' worth of simulated infection data from simulated_cases.csv
  • estimated range for the reproduction number

Output:

  • The conditional probability P([I1,I2,I3] | R)
  • Outbreak probability given first three cases P(PMO | [I1,I2,I3])
  • Outbreak probability given reproduction number P(PMO | R)
  • Overall outbreak probability: (conditional probability) × (outbreak probability given reproduction number)
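
One possible reading of the last output, sketched under assumptions that are not stated above: the range of R is discretised onto a grid with a uniform prior, the likelihoods P([I1,I2,I3] | R) are normalised over that grid, and P(PMO | R) takes the classical branching-process form 1 - 1/R for R > 1 (a single initial case). The function names and the likelihood values are hypothetical.

```python
def major_outbreak_probability_given_r(r):
    """Assumed form of P(PMO | R): 1 - 1/R for R > 1, otherwise 0."""
    if r <= 1.0:
        return 0.0
    return 1.0 - 1.0 / r


def overall_outbreak_probability(likelihoods, outbreak_probs):
    """Likelihood-weighted average of P(PMO | R) over a grid of R values."""
    total = sum(likelihoods)
    if total <= 0:
        raise ValueError("likelihoods must have a positive sum")
    weighted = sum(l * p for l, p in zip(likelihoods, outbreak_probs))
    return weighted / total


# Hypothetical grid over the estimated range of R, with made-up likelihoods.
r_grid = [0.5, 1.0, 2.0, 4.0]
likelihoods = [0.1, 0.2, 0.4, 0.3]
outbreak_probs = [major_outbreak_probability_given_r(r) for r in r_grid]
overall = overall_outbreak_probability(likelihoods, outbreak_probs)
```

Dividing by the summed likelihoods keeps the result between 0 and 1 even when the likelihoods themselves are unnormalised.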

What to do:

  1. Numerically compute the integral for the serial interval distribution
  2. Compute the expected number of new cases
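
The two steps above could be sketched as follows. This assumes a gamma-distributed serial interval and a renewal-equation model for the expected case count; neither is confirmed by this README, and the shape and scale values and the function names are hypothetical.

```python
import math


def gamma_pdf(x, shape, scale):
    """Density of a gamma distribution (assumed serial interval form)."""
    if x <= 0:
        return 0.0
    return (x ** (shape - 1) * math.exp(-x / scale)) / (
        math.gamma(shape) * scale ** shape
    )


def discretised_serial_interval(shape, scale, max_days, steps_per_day=200):
    """Step 1: integrate the density over each day with the midpoint rule.

    weights[s - 1] approximates the probability that the serial interval
    falls in (s - 1, s] days; the weights are renormalised to sum to 1.
    """
    h = 1.0 / steps_per_day
    weights = []
    for s in range(1, max_days + 1):
        area = h * sum(
            gamma_pdf((s - 1) + (j + 0.5) * h, shape, scale)
            for j in range(steps_per_day)
        )
        weights.append(area)
    norm = sum(weights)
    return [w / norm for w in weights]


def expected_new_cases(past_cases, weights, r):
    """Step 2: renewal equation, E[I_t] = R * sum_s w_s * I_{t-s}."""
    total = 0.0
    for s, w in enumerate(weights, start=1):
        if s <= len(past_cases):
            total += w * past_cases[-s]
    return r * total


weights = discretised_serial_interval(shape=2.0, scale=2.5, max_days=30)
expected = expected_new_cases([1, 2, 6], weights, r=1.5)
```

The midpoint rule is used here because it never evaluates the density at zero; any standard quadrature routine would serve equally well.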

Method 2: Trajectory Matching

Input:

  • Sequence of case counts

Output:

  • All trajectories of cases where the first k days of simulated data match the observed sequence
  • Outbreak probability: fraction of those trajectories classified as major outbreaks
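
A minimal sketch of the matching step. The rule for classifying a trajectory as a major outbreak is not specified here, so the total-case threshold below is purely an assumption, as are the function names and the toy trajectories.

```python
def matching_trajectories(trajectories, observed):
    """Keep trajectories whose first k entries equal the observed sequence."""
    k = len(observed)
    return [t for t in trajectories if list(t[:k]) == list(observed)]


def is_major_outbreak(trajectory, threshold=100):
    """Assumed rule: a major outbreak has at least `threshold` total cases."""
    return sum(trajectory) >= threshold


def outbreak_probability(trajectories, observed, threshold=100):
    """Fraction of matching trajectories classified as major outbreaks.

    Returns None when no simulated trajectory matches the observation.
    """
    matches = matching_trajectories(trajectories, observed)
    if not matches:
        return None
    major = sum(is_major_outbreak(t, threshold) for t in matches)
    return major / len(matches)


# Toy simulated data (hypothetical values).
simulated = [
    [1, 2, 6, 20, 80],
    [1, 2, 6, 3, 0],
    [1, 2, 6, 30, 90],
    [1, 1, 0, 0, 0],
]
probability = outbreak_probability(simulated, [1, 2, 6])
```

Exact matching discards most simulations as k grows, so the estimate is only as good as the number of matching trajectories; returning None makes the no-match case explicit rather than reporting 0.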

Method 3: Machine Learning

Input:

  • an observed input sequence of early case counts, e.g. data = [1, 2, 6], the first three entries of the case series
  • ML model(s) trained on simulated trajectories

Output:

  • predicted outbreak probability (and model metrics); saved model files (TBD)
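
As an illustration of the input/output contract only, here is a dependency-free stand-in: a single logistic regression trained by gradient descent on toy data. It is not the set of classifiers planned in the To-Do; the training data, labels, hyperparameters, and function names are all hypothetical.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def train_logistic(features, labels, learning_rate=0.05, epochs=2000):
    """Fit weights and bias by batch gradient descent on the log-loss."""
    n = len(features)
    weights = [0.0] * len(features[0])
    bias = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * len(weights)
        grad_b = 0.0
        for x, y in zip(features, labels):
            z = bias + sum(w * xi for w, xi in zip(weights, x))
            error = sigmoid(z) - y
            for j, xi in enumerate(x):
                grad_w[j] += error * xi
            grad_b += error
        weights = [w - learning_rate * g / n for w, g in zip(weights, grad_w)]
        bias -= learning_rate * grad_b / n
    return weights, bias


def predict_outbreak_probability(model, sequence):
    """Predicted probability of a major outbreak for early case counts."""
    weights, bias = model
    return sigmoid(bias + sum(w * x for w, x in zip(weights, sequence)))


# Toy training set: first three days of simulated trajectories, labelled
# 1 if the trajectory became a major outbreak (labels are made up).
train_x = [[1, 0, 0], [1, 1, 0], [1, 0, 1], [1, 2, 6], [1, 3, 5], [1, 4, 8]]
train_y = [0, 0, 0, 1, 1, 1]
model = train_logistic(train_x, train_y)
p_high = predict_outbreak_probability(model, [1, 2, 6])
p_low = predict_outbreak_probability(model, [1, 0, 0])
```

In practice a library classifier would replace train_logistic; the point of the sketch is that the model maps an early case sequence to a probability in [0, 1].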
