Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

multi: make reassignment of alias channel edge atomic #8777

Open
wants to merge 7 commits into
base: master
Choose a base branch
from
Open
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
2 changes: 1 addition & 1 deletion channeldb/channel.go
Original file line number Diff line number Diff line change
Expand Up @@ -1804,7 +1804,7 @@ func (c *OpenChannel) ChanSyncMsg() (*lnwire.ChannelReestablish, error) {
currentCommitSecret[0] ^= 1

// If this is a tweakless channel, then we'll purposefully send
// a next local height taht's invalid to trigger a force close
// a next local height that's invalid to trigger a force close
// on their end. We do this as tweakless channels don't require
// that the commitment point is valid, only that it's present.
if c.ChanType.IsTweakless() {
Expand Down
5 changes: 5 additions & 0 deletions docs/release-notes/release-notes-0.19.0.md
Original file line number Diff line number Diff line change
Expand Up @@ -63,6 +63,11 @@
* [Fixed a bug](https://github.com/lightningnetwork/lnd/pull/9322) that caused
estimateroutefee to ignore the default payment timeout.

* [Make reassignment of alias channel edges atomic](
https://github.com/lightningnetwork/lnd/pull/8777). This fixes an edge case
where we might rate limit a peer because he already sent us a channel update
with the confirmed channel ID but the graph was still not updated.

# New Features

* [Support](https://github.com/lightningnetwork/lnd/pull/8390) for
Expand Down
119 changes: 94 additions & 25 deletions funding/manager.go
Original file line number Diff line number Diff line change
Expand Up @@ -530,9 +530,12 @@ type Config struct {
// the initiator for channels of the anchor type.
MaxAnchorsCommitFeeRate chainfee.SatPerKWeight

// DeleteAliasEdge allows the Manager to delete an alias channel edge
// from the graph. It also returns our local to-be-deleted policy.
DeleteAliasEdge func(scid lnwire.ShortChannelID) (
// ReAssignSCID allows the Manager to assign a new SCID to an
// option-scid channel being part of the underlying graph. This is
// necessary because option-scid channels change their scid during their
// lifetime (public zeroconf channels for example) so we need to make
// sure to update the underlying graph.
ReAssignSCID func(aliasScID, newScID lnwire.ShortChannelID) (
*models.ChannelEdgePolicy, error)

// AliasManager is an implementation of the aliasHandler interface that
Expand Down Expand Up @@ -3719,19 +3722,29 @@ func (f *Manager) annAfterSixConfs(completeChan *channeldb.OpenChannel,
"maps: %v", err)
}

// We'll delete the edge and add it again via
// addToGraph. This is because the peer may have
// sent us a ChannelUpdate with an alias and we don't
// want to relay this.
ourPolicy, err := f.cfg.DeleteAliasEdge(baseScid)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the actually issue is we call DeleteSixConfs too early - if we only call it once the edge is recreated, we can fix the case where the peer sends a channel_update using alias?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm I don't think so, we don't want this ChanUpdate with the Alias so we need to delete the mapping earlier so we do not risk adding the Alias ChanUpdate after we readded the Edge ?

// We reassign the same scid to the graph db. This will
// trigger a deletion of the current edge data and
// reinsert the channel with the same edge info and
// policy. This is done to guarantee that potential
// ChannelUpdates using the alias as the scid are
// removed and not relayed to the broader network
// because the alias is not a verifiable channel id.
ourPolicy, err := f.cfg.ReAssignSCID(
baseScid, baseScid,
)
if err != nil {
return fmt.Errorf("failed deleting real edge "+
"for alias channel from graph: %v",
err)
return fmt.Errorf("unable to reassign alias "+
"edge in graph: %w", err)
}

err = f.addToGraph(
completeChan, &baseScid, nil, ourPolicy,
log.Infof("Successfully reassigned alias edge in "+
"graph(non-zeroconf): %v(%d) -> %v(%d)",
baseScid, baseScid.ToUint64(),
baseScid, baseScid.ToUint64())

// We send the rassigned ChannelUpdate to the peer.
err = f.sendChanUpdate(
completeChan, &baseScid, ourPolicy,
Comment on lines +3746 to +3747
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The unit tests pass even with this call commented out. I propose to cover this case of announcement sending in unit tests as well.

Another idea. DeleteSixConfs, ReAssignSCID, and sendChanUpdate is a common sequence in both cases where they are used. Does it make sense to create a method doing DeleteSixConfs, ReAssignSCID, and sendChanUpdate and reuse it in both cases? It case there will be a third case in the future, there will be less chances to use incorrectly (e.g. to forget to call sendChanUpdate).

)
if err != nil {
return fmt.Errorf("failed to re-add to "+
Expand Down Expand Up @@ -3808,23 +3821,33 @@ func (f *Manager) waitForZeroConfChannel(c *channeldb.OpenChannel) error {
"six confirmations: %v", err)
}

// TODO: Make this atomic!
ourPolicy, err := f.cfg.DeleteAliasEdge(c.ShortChanID())
// The underlying graph entry for this channel id needs to be
// reassigned with the new confirmed scid. Moreover channel
// updates with the alias scid are removed so that we do not
// relay them to the broader network.
ourPolicy, err := f.cfg.ReAssignSCID(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this fixes the issue mentioned in the PR description re the possible race. We can still receive an channel_update from the peer while we are in the process of re-assigning the SCID. Meanwhile, in the gossiper, if we receive a channel_update that we don't understand, it's put in the prematureChannelUpdates and will be processed later.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I described the problem in detail here: ef6cd7a

Looking back I think this issue could have been solved by deleting the rejectCache before adding the edge via the addToGraph function. Tho I think having the deletion and addition atomic is the better desing wdyt ?

c.ShortChanID(), confChan.shortChanID,
)
if err != nil {
return fmt.Errorf("unable to delete alias edge from "+
"graph: %v", err)
return fmt.Errorf("unable to reassign alias edge in "+
"graph: %w", err)
}

// We'll need to update the graph with the new ShortChannelID
// via an addToGraph call. We don't pass in the peer's
// alias since we'll be using the confirmed SCID from now on
// regardless if it's public or not.
err = f.addToGraph(
c, &confChan.shortChanID, nil, ourPolicy,
aliasScid := c.ShortChanID()
confirmedScid := confChan.shortChanID
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The variables can be used above in the ReAssignSCID call and below in the sendChanUpdate call.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

looks like this is not addressed


log.Infof("Successfully reassigned alias edge in "+
"graph(zeroconf): %v(%d) -> %v(%d)",
aliasScid, aliasScid.ToUint64(),
confirmedScid, confirmedScid.ToUint64())

// Send the ChannelUpdate with the confirmed scid to the peer.
err = f.sendChanUpdate(
c, &confChan.shortChanID, ourPolicy,
)
if err != nil {
return fmt.Errorf("failed adding confirmed zero-conf "+
"SCID to graph: %v", err)
return fmt.Errorf("failed to send ChannelUpdate to "+
"gossiper: %v", err)
}
}

Expand Down Expand Up @@ -4590,6 +4613,52 @@ func (f *Manager) announceChannel(localIDKey, remoteIDKey *btcec.PublicKey,
return nil
}

// sendChanUpdate sends a ChannelUpdate to the gossiper which is as a
// consequence sent to the peer.
//
// TODO(ziggie): Refactor the gossip msgs so that not always all msgs have
// to be created but only the ones which are needed.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

think we have a de-dup logic when sending the msgs tho

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was more referring to the creation of these messages, we always require all messages to be created although only single ones are used, that's why I propose splitting them, and it will also come handy in the follow-up PR which sets the dont_forward bit.

func (f *Manager) sendChanUpdate(completeChan *channeldb.OpenChannel,
shortChanID *lnwire.ShortChannelID,
ourPolicy *models.ChannelEdgePolicy) error {

chanID := lnwire.NewChanIDFromOutPoint(completeChan.FundingOutpoint)

fwdMinHTLC, fwdMaxHTLC := f.extractAnnounceParams(completeChan)

ann, err := f.newChanAnnouncement(
f.cfg.IDKey, completeChan.IdentityPub,
&completeChan.LocalChanCfg.MultiSigKey,
completeChan.RemoteChanCfg.MultiSigKey.PubKey, *shortChanID,
chanID, fwdMinHTLC, fwdMaxHTLC, ourPolicy,
completeChan.ChanType,
)
if err != nil {
return fmt.Errorf("error generating channel "+
"announcement: %v", err)
}

errChan := f.cfg.SendAnnouncement(ann.chanUpdateAnn)
select {
case err := <-errChan:
if err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

can be simplified to,

if graph.IsError(err, graph.ErrOutdated, graph.ErrIgnored) {
	log.Debugf("Graph rejected ChannelUpdate: %v", err)

	return nil
}

if err != nil {
	return fmt.Errorf("error sending channel update: %v",
				err)
}

or just,

if graph.IsError(err, graph.ErrOutdated, graph.ErrIgnored) {
	log.Debugf("Graph rejected ChannelUpdate: %v", err)

	return nil
}

return err

since all the returned error from sendChanUpdate have already been wrapped with contexts and save us from a long chained error msgs.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

not addressed

if graph.IsError(err, graph.ErrOutdated,
graph.ErrIgnored) {

log.Debugf("Graph rejected "+
"ChannelUpdate: %v", err)
} else {
return fmt.Errorf("error sending channel "+
"update: %v", err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

%v -> %w, weird the linter didn't catch this case.

}
}
case <-f.quit:
return ErrFundingManagerShuttingDown
}

return nil
}

// InitFundingWorkflow sends a message to the funding manager instructing it
// to initiate a single funder workflow with the source peer.
func (f *Manager) InitFundingWorkflow(msg *InitFundingMsg) {
Expand Down
Loading
Loading