ChainSync: let GSM disable and re-enable CSJ; also enable LoP in PreSyncing #1492

Draft: nfrisby wants to merge 3 commits into main from nfrisby/bugfix-csj-gsm

@nfrisby nfrisby commented Apr 30, 2025

Fixes #1490. Also includes a couple of minor related changes:

  • Enables the LoP in PreSyncing instead of just Syncing (minor).
  • The GSM now notifies other components of its transition instead of just the final state. This is useful because the PreSyncing state can be transitioned to from either Syncing or CaughtUp, and the appropriate reaction may depend on which.
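
A minimal sketch of the second bullet, not the code in this PR. The only identifier taken from the diff is `CaughtUpPreSyncing`; the other constructor names, the `targetState` and `csjReaction` helpers, and the reaction policy itself are assumptions made for illustration.

```haskell
-- Hypothetical types; only the name 'CaughtUpPreSyncing' appears in this PR.
data GsmState = PreSyncing | Syncing | CaughtUp
  deriving (Eq, Show)

-- One constructor per assumed edge of the GSM. Each name reads as
-- source state followed by target state.
data GsmTransition
  = PreSyncingSyncing
  | SyncingPreSyncing
  | SyncingCaughtUp
  | CaughtUpPreSyncing
  deriving (Eq, Show, Enum, Bounded)

-- The old-style notification (final state only) is still recoverable
-- by forgetting the source state.
targetState :: GsmTransition -> GsmState
targetState t = case t of
  PreSyncingSyncing  -> Syncing
  SyncingPreSyncing  -> PreSyncing
  SyncingCaughtUp    -> CaughtUp
  CaughtUpPreSyncing -> PreSyncing

-- Hypothetical reactions a CSJ-like listener might choose between.
data CsjReaction = DisableCsj | EnableCsj | LeaveCsjAsIs
  deriving (Eq, Show)

-- Assumed policy, for illustration only: CSJ is switched off on
-- reaching CaughtUp and switched back on when CaughtUp is left.
csjReaction :: GsmTransition -> CsjReaction
csjReaction t = case t of
  SyncingCaughtUp    -> DisableCsj
  CaughtUpPreSyncing -> EnableCsj
  PreSyncingSyncing  -> LeaveCsjAsIs
  SyncingPreSyncing  -> LeaveCsjAsIs
```

Both `SyncingPreSyncing` and `CaughtUpPreSyncing` map to `PreSyncing` under `targetState`, so a listener that only sees the final state cannot tell them apart, whereas `csjReaction` can.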

nfrisby commented Apr 30, 2025

This is Draft because:

@nfrisby nfrisby force-pushed the nfrisby/bugfix-csj-gsm branch 2 times, most recently from 313e6c4 to dd352dc Compare May 1, 2025 14:02
@nfrisby nfrisby force-pushed the nfrisby/bugfix-csj-gsm branch 2 times, most recently from 1e19435 to ad70ffb Compare May 2, 2025 14:41
@nfrisby nfrisby force-pushed the nfrisby/bugfix-csj-gsm branch from ad70ffb to 54f958b Compare May 2, 2025 14:45
-- were left as Disengaged instead of being reset to
-- Jumpers).
--
-- One key remark: the 'CaughtUpPreSyncing' transition does

@nfrisby nfrisby May 2, 2025

Considering the context of this PR, it seems worthwhile to double-check this claim now, even though Issue #1491 will too.

nfrisby commented May 2, 2025

TODO Now that the CSJ changes are actually so minor, we should split out the LoP changes. It'll make backporting the CSJ diff that much easier.

-- behavior might handle the different possibilities of
-- 'CaughtUpPreSyncing' better (and most likely adaptively).
-- However, that new logic is not immediately obvious, and so
-- it's not clear that the extra complexity is worthwhile.

The only motivator I can think of for trying to handle this transition more carefully is if there's a bug in the CaughtUp BlockFetch such that the node might have several honest peers that are stuck on the forecast horizon "arbitrarily far" behind the wall clock (the hypothetical bug would be that the node has honest peers but is for some reason failing to download their blocks for more than ~20 minutes). If the GSM CaughtUpPreSyncing transition ultimately unsticks the node, then its selection will finally advance, and all of those honest peers will then have to send all the headers between where the node had been stuck and the actual wall clock. Note that the node might have been stuck for much more than 20 minutes (i.e., for multiple cycles of the GSM anti-thrashing delay).

But it seems unwise to complicate CSJ to guard against a potential bug in CaughtUp's BlockFetch logic. I.e., we should assume that a node whose selection is far behind the honest chain has simply been eclipsed: it has no honest peers, and so CSJ isn't obligated to reduce load on whatever upstream peers the node had during the eclipse.
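
For a rough sense of scale, the header load in the hypothetical stuck-node scenario can be estimated as below. This is a back-of-the-envelope sketch: the function names are made up here, and the example inputs (1-second slots, an active slot coefficient of 1/20, five honest peers) are assumed mainnet-like values rather than anything measured in this PR.

```haskell
import Data.Ratio ((%))

-- | Expected number of headers on the chain for a gap of 'gapSec' seconds,
-- assuming a fraction 'activeSlotCoeff' of slots carry a block.
expectedHeadersInGap :: Rational -> Rational -> Rational -> Rational
expectedHeadersInGap slotLengthSec activeSlotCoeff gapSec =
  (gapSec / slotLengthSec) * activeSlotCoeff

-- | Total headers received if every one of 'numPeers' peers sends the
-- whole gap, i.e. with no jumping to deduplicate the downloads.
totalHeadersFromPeers :: Integer -> Rational -> Rational
totalHeadersFromPeers numPeers headersInGap =
  fromInteger numPeers * headersInGap

-- One 20-minute gap, assumed parameters: 1200 slots * 1/20 = 60 headers.
exampleOneCycle :: Rational
exampleOneCycle = expectedHeadersInGap 1 (1 % 20) (20 * 60)

-- Three 20-minute cycles and five peers: 5 * 180 = 900 headers.
exampleThreeCyclesFivePeers :: Rational
exampleThreeCyclesFivePeers =
  totalHeadersFromPeers 5 (expectedHeadersInGap 1 (1 % 20) (3 * 20 * 60))
```

Under these assumed inputs the redundant download stays in the hundreds of headers, and it grows linearly in both the number of peers and the length of the gap.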


@nfrisby nfrisby May 2, 2025

What do you think, @coot @crocodile-dentist ? We can chat about it on the 7th, if you'd rather.

Successfully merging this pull request may close these issues.

[BUG] - CSJ is always on, regardless of the GSM state