-
Notifications
You must be signed in to change notification settings - Fork 2
Instantiate infections upon exposure #80
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
swo
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Overall, I think this is an improvement
There are some name changes and docstrings I'd like to see
And then some unactionable comments about separations of powers
|
|
||
| class Simulation: | ||
| PROPERTIES = { | ||
| INIT_PROPERTIES = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's docstring these? I wasn't immediately clear to me what "init" vs. "sim" meant here
| def create_person( | ||
| self, infector: Optional[str], t_exposed: float, generation: int | ||
| ) -> str: | ||
| """Add a new person to the data""" | ||
| id = str(len(self.infections)) | ||
| self.infections[id] = {x: None for x in self.PROPERTIES} | ||
| self.infections[id] = { | ||
| "id": id, | ||
| "infector": infector, | ||
| "t_exposed": t_exposed, | ||
| "generation": generation, | ||
| "simulated": False, | ||
| } | {x: None for x in self.SIM_PROPERTIES} | ||
| return id |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Because this is a small, ad hoc simulation, this doesn't matter, but I liked having create_person return an id and do nothing else. Now create_person is a mix of creating a database-entry-person and updating them.
I guess we would have wanted to draw a distinction between a person-as-dictionary vs. a simulated-person, who has some properties.
But again this doesn't matter here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we want to immediately instantiate people, it seemed to me like it no longer made sense to draw that distinction. Otherwise we'd just be adding functions to the pile that we have to call every time a new infection arises, no?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, there would be an extra function, because the "create empty database entry" and "fill in defaults for this person" would be separate
Again, that might be good if we had a bigger simulation where wanted more clear separation between data mgmt and simulation, but here it doesn't matter so much
| def run(self) -> None: | ||
| """Run simulation""" | ||
| # queue is pairs (t_exposed, infector) | ||
| # queue is of infection ids |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a nice improvement
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Although now it means that methods like generate_infections now directly interact with the underlying data storage, rather than having run() do all that. Which violates the separation of powers a little, but again doesn't matter so much in this ad hoc simulation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm sure I'm missing something, but that feels like a line we crossed somewhere between "we should instantiate infections earlier" and "this will always be a branching process"
Or, put another way, the reason I saw "this will always be a BP" as necessary scope for "infections should instantiate earlier" was because I don't see how to instantiate infections when their time of infection is drawn without doing it inside the infection simulation routine. Anything else would just land us back where we were, wouldn't it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm just talking about class/static methods vs. instance methods, not necessarily about the ordering of anything
Like, the "resolve infection" function (i.e., figure out the natural history and no. of onward infections for some person) could be a class method that doesn't touch the database, but returns the natural history information back to run(), that in turn interacts with the database
I'm saying: I think there's a way to write code so that run() is the only instance method
| # and add the infections they cause to the queue, in time order | ||
| id = self.create_person() | ||
| self.generate_infection(id=id, t_exposed=t_exposed, infector=infector) | ||
| offspring = self.generate_infection(id=id) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think "infection" is now used to refer to too many things
Like, maybe we want to rename infection_queue to infected_queue? Because now it's a queue of IDs of people who have been infected, and whose onward infections we need to process
And now generate_infection returns a list of infectees, that gets merged with infection_queue, so these are the same species
I guess generate_infection now processes/resolves an infection, and returns the list of downstream infectees
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, the names are... not great at the moment. I feel like the less-descriptive queue might be best here.
The generate_ family also seems somewhat inadequate. I was tempted to rename them simulate_ since they're simulating parts of infection histories.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We've replaced the (time, infector) + dict paradigm with a dict + dict paradigm since they're all just infections now, except some have not been fleshed out to full histories.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cf. my note above. I think "generate" might have made more sense if/when that function was a static method that just returned information. But now that it's interacting with the database, I see that "simulate" or "process" or whatever makes more sense
| # number of generations | ||
| for t in self.get_person_property(id, "infection_times"): | ||
| bisect.insort_right(infection_queue, (t, id), key=lambda x: x[0]) | ||
| for child in offspring: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
And now "child" is essentially the same species too
| == "Cumulative" | ||
| ) | ||
|
|
||
| max_infections = 1000000 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| max_infections = 1000000 | |
| max_infections = 1_000_000 |
or int(1e6)
Resolves #67, and since that changes the behavior of
max_infectionsalso resolves #65, should lay the groundwork to make resolving #62 easy.The core changes here are:
Simulation.create_person(self)is nowcreate_person(self, infector: Optional[str], t_exposed: float, generation: int)Simulation.run()no longer callsSimulation.create_person()except for the index case. All other calls are inSimulation.generate_infections()There are a number of cascading changes required to accommodate this, including to
Simulate(e.g. adding a new property to track whether an individual was actually simulated or just initialized), to summarization (e.g. now we have to filter out the not-actually-simulated infections to count detections), and to plotting (all around the need to filter out purely-initialized infections).I think the new approach has some nice side-effects:
Simulate.register_infectee()prob_control_by_genis much more straightforwardThis also frees us to remove the redundant
"infection_times"property, which is now entirely retrievable from other, more needed, information ([sim.get_person_property(infectee, "t_exposed") for infectee in sim.get_person_property(id, "infectees")]). But that's out of scope here.Our new hard-coded (at least in the app)
max_infectionsof 1 million is sufficiently robust to we can simulate 4 generations at R=10, which I think indicates we're good to go on that front.It also means that to test that our exception, you either have to briefly adjust the source code so that it's smaller, or do something pretty absurd.