-
Notifications
You must be signed in to change notification settings - Fork 26
ChainSync: let GSM disable and re-enable CSJ; also enable LoP in PreSyncing #1492
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
This is Draft because:
|
313e6c4
to
dd352dc
Compare
1e19435
to
ad70ffb
Compare
ad70ffb
to
54f958b
Compare
-- were left as Disengaged instead of being reset to | ||
-- Jumpers). | ||
-- | ||
-- One key remark: the 'CaughtUpPreSyncing' transition does |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Considering the context of this PR, it seems worthwhile to double-check this claim now, even though Issue #1491 will to.
TODO Now that the CSJ changes are actually so minor, we should split out the LoP changes. It'll make backporting the CSJ diff that much easier. |
-- behavior might handle the different possibilities of | ||
-- 'CaughtUpPreSyncing' better (and most likely adaptively). | ||
-- However, that new logic is not immediately obvious, and so | ||
-- it's not clear that the extra complexity is worthwhile. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only motivator I can think of for trying to handle this transition more carefully is if there's a bug in the CaughtUp BlockFetch such that the node might have several honest peers that are stuck on the forecast horizon "arbitrarily far" behind the wall clock (the hypothetical bug would be that the node has honest peers but is for some reason failing to download their blocks for more than ~20 minutes). If the GSM CaughtUpPreSyncing
transition ultimately unstucks the node, then its selection will finally advance, and now all of those honest peers will have to send all the headers between where the node had been stuck and the actual wall clock. Note that the node might have been stuck for much more than 20 minutes (ie for multiple cycles of the GSM anti-thrashing delay).
But it seems unwise to complicate CSJ to guard against a potential bug in CaughtUp's BlockFetch logic. IE we should assume that a node whose selection is far behind the honest chain simply has been eclipsed: ie has no honest peers, and so CSJ isn't obligated to reduce load on whatever upstream peers the node had during the eclipse.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What do you think, @coot @crocodile-dentist ? We can chat about it on the 7th, if you'd rather.
Fixes #1490. Also a couple minor related changes:
PreSyncing
instead of justSyncing
(minor).