Fix race condition causing async payment failure #4106
base: main
👋 Thanks for assigning @joostjager as a reviewer!
Codecov Report
❌ Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #4106 +/- ##
==========================================
+ Coverage 88.53% 88.68% +0.15%
==========================================
Files 175 176 +1
Lines 132702 132928 +226
Branches 132702 132928 +226
==========================================
+ Hits 117484 117892 +408
+ Misses 12618 12344 -274
- Partials 2600 2692 +92
As the LSP of an async sender, when we receive an update_add with the hold_htlc flag set, we move the pending HTLC into ChannelManager::pending_intercepted_htlcs once its onion is decoded. However, if we receive the release_held_htlc message from the receiver *before* we've had a chance to make this transition, we'll fail to release the HTLC, and it will sit in the pending intercepts map until it is failed backwards. To fix this race condition, when we receive release_held_htlc from the recipient we now check not only the pending_intercepted_htlcs map for this HTLC but also the map where we keep HTLCs before their onions are decoded.
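The two-stage lookup described above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the code in this PR: `Lsp`, `HeldHtlc`, `UndecodedHtlc`, `ReleaseOutcome`, `undecoded_htlcs`, and `handle_release_held_htlc` are hypothetical names, and "releasing" an HTLC that has not been decoded yet is modeled here simply as clearing its hold flag, which may differ from what the real change does.

```rust
use std::collections::HashMap;

// Hypothetical, simplified stand-ins; these are NOT LDK's real types.
type InterceptId = [u8; 32];

#[derive(Debug, Clone, PartialEq)]
struct HeldHtlc {
    short_channel_id: u64,
    htlc_id: u64,
}

// An HTLC whose onion has not been decoded yet (assumed shape).
struct UndecodedHtlc {
    htlc_id: u64,
    hold_htlc: bool,
}

#[derive(Debug, PartialEq)]
enum ReleaseOutcome {
    // Found in the intercepts map: the HTLC can be forwarded now.
    ReleasedIntercepted(HeldHtlc),
    // Found before onion decoding: the hold flag was cleared (assumption).
    ReleasedBeforeDecode,
    // Not present in either map.
    NotFound,
}

#[derive(Default)]
struct Lsp {
    // HTLCs whose onions were decoded and which are being held.
    pending_intercepted_htlcs: HashMap<InterceptId, HeldHtlc>,
    // HTLCs still awaiting onion decoding, keyed by short channel id.
    undecoded_htlcs: HashMap<u64, Vec<UndecodedHtlc>>,
}

impl Lsp {
    fn handle_release_held_htlc(
        &mut self,
        intercept_id: InterceptId,
        short_channel_id: u64,
        htlc_id: u64,
    ) -> ReleaseOutcome {
        // Common path: the HTLC was already moved to the intercepts map.
        if let Some(htlc) = self.pending_intercepted_htlcs.remove(&intercept_id) {
            return ReleaseOutcome::ReleasedIntercepted(htlc);
        }
        // Race path: the release arrived before the onion was decoded,
        // so also look in the map of not-yet-decoded HTLCs.
        if let Some(htlcs) = self.undecoded_htlcs.get_mut(&short_channel_id) {
            if let Some(h) = htlcs
                .iter_mut()
                .find(|h| h.htlc_id == htlc_id && h.hold_htlc)
            {
                h.hold_htlc = false;
                return ReleaseOutcome::ReleasedBeforeDecode;
            }
        }
        ReleaseOutcome::NotFound
    }
}
```

Without the second lookup, a release that arrives early would fall through to `NotFound`, and the HTLC would later be held with nothing left to release it.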
1be75bd to 5ed1b9f
/// channel counterparty to the often-offline recipient.
intercept_id: InterceptId,
/// The short channel id corresponding to the to-be-released HTLC.
short_channel_id: u64,
What are your thoughts about the redundancy here? Maybe instead of intercept_id, it could just be node_id?
We'd need to also include the channel_id, I believe, to avoid an edge case where the scid changes if the channel splices in between the LSP intercepting the HTLC and the recipient coming online. Technically that splice edge case could also cause us to not release HTLCs in this PR's race condition fix code (I think?), but it's unlikely to, since the race involves the HTLC being released very quickly after it's locked in.
I don't really feel strongly, let me know if you want to dedup the htlc_id field that way.