-
Notifications
You must be signed in to change notification settings - Fork 2.2k
Description
The graphCache
keeps a map from node pub key to the channels it owns. So we don't have a map from a channel to the node's owned by it. This is for memory purposes since the cache will already be quite large.
This does, however, mean that when RemoveChannel
is called on the graph cache, that the node keys need to be specified so that we can find the channel entry to remove.
This is a problem when RemoveChannel
is called from (c *ChannelGraph) MarkEdgeZombe
which is only currently called from (*Builder) MarkZombieEdge
which passes in empty pub keys instead of the real node keys.
func (b *Builder) MarkZombieEdge(chanID uint64) error {
// We use the "zero key" for both node pubkeys so this edge
// can't be resurrected.
var zeroKey [33]byte
err := b.cfg.Graph.MarkEdgeZombie(chanID, zeroKey, zeroKey)
if err != nil {
return fmt.Errorf("unable to mark spent chan(id=%v) as a "+
"zombie: %w", chanID, err)
}
return nil
}
The purpose of the zero key idea is explained here:
// makeZombiePubkeys derives the node pubkeys to store in the zombie index for a
// particular pair of channel policies. The return values are one of:
// 1. (pubkey1, pubkey2)
// 2. (pubkey1, blank)
// 3. (blank, pubkey2)
//
// A blank pubkey means that corresponding node will be unable to resurrect a
// channel on its own. For example, node1 may continue to publish recent
// updates, but node2 has fallen way behind. After marking an edge as a zombie,
// we don't want another fresh update from node1 to resurrect, as the edge can
// only become live once node2 finally sends something recent.
//
// In the case where we have neither update, we allow either party to resurrect
// the channel. If the channel were to be marked zombie again, it would be
// marked with the correct lagging channel since we received an update from only
// one side.
func makeZombiePubkeys(info *models.ChannelEdgeInfo,
e1, e2 *models.ChannelEdgePolicy) ([33]byte, [33]byte) {
This means that when MarkEdgeZombie
calls the cache's RemoveChannel
method, that it is always passing in zero keys meaning that
the actual channels will not be removed from the cache.
This is a problem for 2 reasons:
- Memory leak. We will not clean up zombie channels
- pathfinding will keep seeing and maybe trying those channels.
Solution space
I can think of 3 possibilities here:
- (naive approach): we also let the cache keep track of chan->nodes. This is not ideal since this is an extra
uint64+(2x32 bytes)
per channel. - we just keep a
zombieIndex map[uint64]
in the cache and we add an entry there if and only if we did not find the nodes for the channels. Then, during any node channel traversal, we always check the zombie index to see if we should maybe remove this channel. at this point we can also remove it from the in-mem zombie index. This is much better than strategy 1 since we only track a uint64 per channel that we didnt find nodes for. But the down side here is that we may still never remove those channels if pathfinding never traverses through the channels of one of the nodes owning that channel. - this builds on 2: to address the issue in 2, we can just have a clean-up loop in the cache that runs ever day or so that does a full traversal and removes any channel that is found in the in-mem zombie-index.