valentinewallace
Contributor

As the LSP of an async sender, when we receive an update_add with the hold_htlc flag set, we move the pending HTLC into the ChannelManager::pending_intercepted_htlcs map once its onion is decoded. However, if we receive the release_held_htlc message from the receiver before we've had a chance to make this transition, we'll fail to release the HTLC and it will sit in the pending intercepts map until it is failed backwards.

To fix this race condition, if we receive release_held_htlc from the recipient we'll not only check the pending_intercepted_htlcs map for the presence of this HTLC but also check the map where we keep HTLCs prior to their onions being decoded.
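
A minimal sketch of the two-map lookup described above, for illustration only. The `HeldHtlc` and `HeldHtlcs` types, the key shapes, and the `release` method are hypothetical stand-ins, not the actual ChannelManager fields or API. What the real code does with an HTLC it finds in the pre-decode map is not shown in this conversation; the sketch simply removes and returns it.

```rust
use std::collections::HashMap;

/// Hypothetical stand-in for an HTLC held on behalf of an async sender.
#[derive(Debug, Clone, PartialEq)]
struct HeldHtlc {
    htlc_id: u64,
    amount_msat: u64,
}

/// Hypothetical, simplified model of the two places a held HTLC can live.
#[derive(Default)]
struct HeldHtlcs {
    /// HTLCs whose onions have not been decoded yet,
    /// keyed by (short_channel_id, htlc_id). Key shape is an assumption.
    undecoded: HashMap<(u64, u64), HeldHtlc>,
    /// HTLCs already moved to the intercepts map, keyed by an intercept id.
    intercepted: HashMap<[u8; 32], HeldHtlc>,
}

impl HeldHtlcs {
    /// Handle a release request from the recipient.
    ///
    /// Checks the intercepts map first. If the HTLC is not there, it may
    /// still be waiting for onion decoding, so the pre-decode map is
    /// checked as well. Returns `None` if the HTLC is in neither map.
    fn release(
        &mut self,
        intercept_id: &[u8; 32],
        short_channel_id: u64,
        htlc_id: u64,
    ) -> Option<HeldHtlc> {
        if let Some(htlc) = self.intercepted.remove(intercept_id) {
            return Some(htlc);
        }
        // Race case: the release arrived before the onion was decoded.
        self.undecoded.remove(&(short_channel_id, htlc_id))
    }
}
```

With only the first lookup, a release that arrives before onion decoding would return `None` and the HTLC would later land in the intercepts map with nothing left to release it.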

@ldk-reviews-bot

ldk-reviews-bot commented Sep 22, 2025

👋 Thanks for assigning @joostjager as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.


codecov bot commented Sep 22, 2025

Codecov Report

❌ Patch coverage is 96.15385% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 88.68%. Comparing base (e82ef2c) to head (5ed1b9f).
⚠️ Report is 6 commits behind head on main.

Files with missing lines Patch % Lines
lightning/src/ln/channelmanager.rs 88.00% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4106      +/-   ##
==========================================
+ Coverage   88.53%   88.68%   +0.15%     
==========================================
  Files         175      176       +1     
  Lines      132702   132928     +226     
  Branches   132702   132928     +226     
==========================================
+ Hits       117484   117892     +408     
+ Misses      12618    12344     -274     
- Partials     2600     2692      +92     
Flag Coverage Δ
fuzzing 21.54% <0.00%> (?)
tests 88.52% <96.15%> (-0.01%) ⬇️


@valentinewallace valentinewallace force-pushed the 2025-08-async-sender-fix-race branch from 1be75bd to 5ed1b9f Compare September 22, 2025 20:03
@valentinewallace valentinewallace added the weekly goal Someone wants to land this this week label Sep 22, 2025
@TheBlueMatt TheBlueMatt added this to the 0.2 milestone Sep 22, 2025
/// channel counterparty to the often-offline recipient.
intercept_id: InterceptId,
/// The short channel id corresponding to the to-be-released HTLC.
short_channel_id: u64,
Contributor

What are your thoughts about the redundancy here? Maybe instead of intercept_id, it could just be node_id?

Contributor Author

We'd also need to include the channel_id, I believe, to avoid an edge case where the scid changes because the channel is spliced between the LSP intercepting the HTLC and the recipient coming online. Technically, that splice edge case could also prevent us from releasing HTLCs in this PR's race condition fix (I think?), but that's unlikely, since the race involves the HTLC being released very quickly after it's locked in.

I don't feel strongly either way; let me know if you want to dedup the htlc_id field that way.
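
To illustrate the splice edge case mentioned above, here is a rough sketch under simplifying assumptions. The `Channel` struct and both lookup helpers are hypothetical and only model the idea that a channel_id stays stable while the scid recorded at intercept time may stop matching after a splice; the real channel bookkeeping may behave differently.

```rust
/// Hypothetical channel record: a stable `channel_id` and a
/// `short_channel_id` that is assumed to change when the channel splices.
struct Channel {
    channel_id: [u8; 32],
    short_channel_id: u64,
}

/// Look up a channel by the scid that was recorded when the HTLC was held.
fn find_by_scid(channels: &[Channel], scid: u64) -> Option<&Channel> {
    channels.iter().find(|c| c.short_channel_id == scid)
}

/// Look up a channel by its channel_id, which does not change in this model.
fn find_by_channel_id<'a>(channels: &'a [Channel], id: &[u8; 32]) -> Option<&'a Channel> {
    channels.iter().find(|c| &c.channel_id == id)
}
```

In this model, a lookup by the old scid fails once `short_channel_id` has been updated, while a lookup by `channel_id` still finds the channel.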

@ldk-reviews-bot

👋 The first review has been submitted!

Do you think this PR is ready for a second reviewer? If so, click here to assign a second reviewer.
