Fix consistency bug when storing aggregate shares #412
- Looks reasonable overall. However, I'm wondering if you could help me with something: in "Rename the `try_put_agg_share_span()` retry metric" (#411) I've tried to encapsulate this logic so that I can reuse it in draft07. (In draft07 we need to commit to the state change at an earlier point, just after AggregationJobInitReq.)
- It would be great to exercise the retry logic in unit tests. Could you set up `MockAggregator` to have one of the batch buckets return a replay and another not?
- nit: It's good practice to link PRs to issues they close, e.g., by adding `Closes #408` to the top comment.
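To make the unit-test request concrete, here is a minimal sketch of a mock in which one batch bucket reports a replay and another does not. `MockBuckets`, `seed()`, `try_put()`, and the `BucketId`/`ReportId` aliases are hypothetical stand-ins for illustration only; they are not the real `MockAggregator` API.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical stand-ins for illustration; not the real daphne types.
type BucketId = u64;
type ReportId = u32;

/// Toy mock: each bucket remembers which report IDs it has already aggregated.
#[derive(Default)]
struct MockBuckets {
    aggregated: HashMap<BucketId, HashSet<ReportId>>,
}

impl MockBuckets {
    /// Pre-seed a bucket so that a later put containing `report` is seen as a replay.
    fn seed(&mut self, bucket: BucketId, report: ReportId) {
        self.aggregated.entry(bucket).or_default().insert(report);
    }

    /// Try to store `reports` in `bucket`. If any report was already
    /// aggregated, return the replayed IDs and store nothing; otherwise
    /// record every report as aggregated.
    fn try_put(
        &mut self,
        bucket: BucketId,
        reports: &[ReportId],
    ) -> Result<(), HashSet<ReportId>> {
        let seen = self.aggregated.entry(bucket).or_default();
        let replayed: HashSet<ReportId> = reports
            .iter()
            .copied()
            .filter(|r| seen.contains(r))
            .collect();
        if !replayed.is_empty() {
            return Err(replayed);
        }
        seen.extend(reports.iter().copied());
        Ok(())
    }
}
```

A test would seed one bucket with a report ID, put the same span into both buckets, and assert that only the seeded bucket returns a replay. The caller can then retry the seeded bucket without the replayed report.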
I've tacked on a commit with a suggestion for cleaning up the retry logic. The basic idea is to add a table that tracks what we did with each report the last time we tried `try_put_agg_share_span()`. Feel free to iterate on this or replace it with something you think is better.
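As a rough illustration of the table idea (not the code in the suggested commit), the sketch below tracks a per-report state across attempts. `AttemptTable`, `ReportState`, and all method names are assumptions made up for this example.

```rust
use std::collections::HashMap;

// Hypothetical report identifier; the real type differs.
type ReportId = u32;

/// What happened to a report the last time we tried to store its span.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum ReportState {
    /// Not yet committed; include it in the next attempt.
    Pending,
    /// Storage reported it as a replay; exclude it from retries.
    Replayed,
    /// Its aggregate share was committed.
    Committed,
}

struct AttemptTable {
    states: HashMap<ReportId, ReportState>,
}

impl AttemptTable {
    /// Start with every report pending.
    fn new(ids: &[ReportId]) -> Self {
        let states = ids.iter().map(|id| (*id, ReportState::Pending)).collect();
        Self { states }
    }

    /// Record the replays returned by a failed attempt.
    fn mark_replayed(&mut self, ids: &[ReportId]) {
        for id in ids {
            self.states.insert(*id, ReportState::Replayed);
        }
    }

    /// Reports to include in the next attempt, in ascending order.
    fn to_retry(&self) -> Vec<ReportId> {
        let mut ids: Vec<ReportId> = self
            .states
            .iter()
            .filter(|(_, state)| **state == ReportState::Pending)
            .map(|(id, _)| *id)
            .collect();
        ids.sort_unstable();
        ids
    }

    /// Record a successful attempt: everything still pending is now committed.
    fn commit_pending(&mut self) {
        for state in self.states.values_mut() {
            if *state == ReportState::Pending {
                *state = ReportState::Committed;
            }
        }
    }

    /// Look up the recorded state of a single report.
    fn state(&self, id: ReportId) -> Option<ReportState> {
        self.states.get(&id).copied()
    }
}
```

With a table like this, the retry loop only needs to ask for `to_retry()` after each failed attempt rather than recomputing which reports were rejected.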
🦆 - thanks for the hard work on a really tricky change.
I just noticed the changes to DOs. (Sorry, I should have caught this earlier.) There might be changes here that break assumptions in `daphne`; we need to carefully document anything that's broken.
Also, `ReportsProcessed` is now obsolete, so we should either delete it or at least document somewhere that it's not used and will be deleted.
let replayed: usize = put_agg_share_result
    .into_iter()
    .map(|(_, (r, _))| r.map(|replayed| replayed.len()).unwrap_or_default())
    .sum();
If an error occurs, we need to handle it as fatal.
It should either commit an aggregate share to storage and mark the associated reports as aggregated, or do nothing at all. Due to the batched nature of the input, this requires an equally complex return type, which the helper needs to handle.
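One way to count replays while still treating a storage error as fatal, sketched under assumptions: `PutResult` below is a simplified, hypothetical shape for the per-bucket results (the real return type carries more than this), and errors are plain strings. Collecting into a `Result` with `sum()` stops at the first error instead of silently counting it as zero, which is what `unwrap_or_default()` does.

```rust
use std::collections::{HashMap, HashSet};

// Hypothetical, simplified types for illustration only.
type BucketId = u64;
type ReportId = u32;
/// Per-bucket outcome: the set of replayed reports, or a storage error.
type PutResult = HashMap<BucketId, Result<HashSet<ReportId>, String>>;

/// Total number of replayed reports across all buckets.
/// Any bucket-level error is propagated instead of being counted as zero.
fn count_replayed(results: PutResult) -> Result<usize, String> {
    results
        .into_values()
        .map(|r| r.map(|replayed| replayed.len()))
        .sum()
}
```

The caller can then use `?` on the returned value, so a failed bucket aborts the request instead of being reported as "no replays".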
Closes #408