-
Notifications
You must be signed in to change notification settings - Fork 132
Fix bandwidth related issues #1478
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
9500e22
to
8958945
Compare
8906450
to
09c7f4f
Compare
Fixed unit test, ready for review. |
8958945
to
3b5aace
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
great fixes, have some comments, but overall looking good!
tapchannel/aux_invoice_manager.go
Outdated
|
||
// We also need to validate that the HTLC is actually the correct asset | ||
// and arrived through the correct asset channel. | ||
channels, err := s.cfg.LightningClient.ListChannels(ctx, true, false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels a bit off-place to call here, but I guess it can serve for a while
Could explore one of the following:
a1) Add a chanID
filter to list channels, to get a performance boost (LSPs with 100s of chans would suffer a bit here -- reminder: this is called for each incoming shard, and number of shards is up to payer)
a2) Otherwise a more isolated and quicker approach would be to cache the lndclient.ChannelInfo
(no LND changes)
b) Could add this strict assetID-set check over the channel but in a different part in the flow, where the channel info is already present and querying is not needed. We could let this "leak" into the commitment/allocation phase, but handle it gracefully there? Is definitely more involved and requires LND/aux-interface changes.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm yeah was thinking something similar here. We don't actually store the custom chan data ourselves, so for now we must relay on lnd
to deliver it to us. In other areas we pass it within the hook callbacks, but only the channels related to a given context.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC, I think we can actually get rid of this extra RPC call:
- With
handleInvoiceAccept
we're passed the actual invoice that we created. - This invoice has an
scid
in the hop hint. - We also have the circuit key here which contains the scid.
Can't we parse out the hop hint earlier in the pipeline, pass it in here, then assert that it matches the circuit key scid? Would need to check if this is the scid alias or not passed in...
Even with that though, we'd still the obtain the custom data from a channel to ensure that it can actually carry a given HTLC.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even with that though, we'd still the obtain the custom data from a channel to ensure that it can actually carry a given HTLC.
Yeah, we really want to make sure things come in through the correct channel. Otherwise our peer could trick us.
But I added the peer to the ListChannels
query, which should already cut down the size of the response by quite a large amount. And then I also added caching as @GeorgeTsagk suggested.
I think that should be fine in terms of performance.
tapchannel/aux_traffic_shaper.go
Outdated
return b.AssetID.Val | ||
})..., | ||
) | ||
if !commitment.HasAllAssetIDs(htlcAssetIDs) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
just to make sure:
when using group keys the sender will locally have an HTLC asset id which would be the x-coord of the group key, but by the point where the other peer of the channel receives the asset HTLC records the asset coin selection already happened, so we should be expecting only real asset IDs at this point. Correct?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Haven't run the itests against this yet, but it will be caught if not true
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, you're right. I didn't consider this so things actually failed here. Working on a fix right now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps once we land this fix, we should commit to just including the group key in all relevant areas for HTLCs/RFQ? Otherwise we'll always need to keep in mind the little trick to remember that sometimes an asset ID can actually be a group key. Transmitting the group key within these wire messages and TLV blobs will also save us from needing to do a db/universe look up to verify that something that looks like a group key is actually a group key.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I can attempt that in a follow-up PR. Though I think it might be quite a bit involved, but we'll see I guess.
09c7f4f
to
ac55403
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Subject to George's comments, LGTM 👍
Could do with an itest in terminal I suppose to ensure that this fix remains effective. Perhaps there is already a PR for that?
tapchannel/aux_invoice_manager.go
Outdated
|
||
// We also need to validate that the HTLC is actually the correct asset | ||
// and arrived through the correct asset channel. | ||
channels, err := s.cfg.LightningClient.ListChannels(ctx, true, false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm yeah was thinking something similar here. We don't actually store the custom chan data ourselves, so for now we must relay on lnd
to deliver it to us. In other areas we pass it within the hook callbacks, but only the channels related to a given context.
tapchannel/aux_invoice_manager.go
Outdated
|
||
// We also need to validate that the HTLC is actually the correct asset | ||
// and arrived through the correct asset channel. | ||
channels, err := s.cfg.LightningClient.ListChannels(ctx, true, false) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
IIUC, I think we can actually get rid of this extra RPC call:
- With
handleInvoiceAccept
we're passed the actual invoice that we created. - This invoice has an
scid
in the hop hint. - We also have the circuit key here which contains the scid.
Can't we parse out the hop hint earlier in the pipeline, pass it in here, then assert that it matches the circuit key scid? Would need to check if this is the scid alias or not passed in...
Even with that though, we'd still the obtain the custom data from a channel to ensure that it can actually carry a given HTLC.
tapchannel/aux_traffic_shaper.go
Outdated
return b.AssetID.Val | ||
})..., | ||
) | ||
if !commitment.HasAllAssetIDs(htlcAssetIDs) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps once we land this fix, we should commit to just including the group key in all relevant areas for HTLCs/RFQ? Otherwise we'll always need to keep in mind the little trick to remember that sometimes an asset ID can actually be a group key. Transmitting the group key within these wire messages and TLV blobs will also save us from needing to do a db/universe look up to verify that something that looks like a group key is actually a group key.
12194ad
to
ca6de1a
Compare
To avoid needing to call spew.Sdump() when the trace log level isn't even being used, we wrap the calls in a function closure instead. That way the potentially CPU intensive spew only happens when the trace log level is actually enabled.
To be able to detect certain asset related routing conditions, we'll want to commit the group key of assets into the funding blob of the channel.
To easily find out if all asset IDs from a set are actually committed to a channel, we add helper functions for the different channel messages that we can encounter in our subsystems (depending on whether we get the message from a blob in a hook or as JSON over the RPC interface).
We need to make sure an asset HTLC actually comes through the correct channel that commits to that asset in the first place.
We move a block of code further down to where it's going to be used, so the next commit(s) will have a more easy to digest diff.
This fixes the first part of the issue: We didn't tell lnd that we wanted to handle traffic for non-asset channels. But that's wrong, because then lnd will pick non-asset channels for HTLCs in some situations. So we explicitly need to tell lnd there is no bandwidth in non-asset channels if an asset HTLC should be forwarded (or sent).
This is the third part of the fix: We need to make sure that we don't pick an asset channel that has the wrong type of assets when telling lnd what channel it can use.
To debug commitment issues, it's useful to log the exact asset leaf that is committed. To be able to easily decode (and compare) that commitment, we also add a unit test that outputs the leaf as JSON.
When re-anchoring a passive asset, we need to reset any time locks it previously had on it. Because the re-anchoring is a normal transfer with just a simple signature, we need to clear any previous restrictions. The ReAnchorPassiveAssets query basically needs to mirror what the asset.CopySpendTemplate() method does.
ca6de1a
to
8133f36
Compare
Pull Request Test Coverage Report for Build 14619648845Details
💛 - Coveralls |
Ready for re-review. Fixed all bugs and also updated the corresponding integration tests. The |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 🐰
Depends on #1462.
Fixes #1471.
Fixes #1275.
Fixes #1374.