Remove dangling anchor sweeping #8997
Labels: anchors, mempool, security, utxo sweeping
When a commitment transaction confirms without a spend of our anchor output, the anchor becomes "dangling". For 16 blocks after the commitment transaction confirms, the dangling anchor can only be spent by us. After 16 blocks, anyone can spend it.
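As a minimal sketch of the timing rule just described, the check below reports whether a third party could include a spend of the anchor in a block at a given height. It assumes the usual reading of a 16-block relative timelock (the spend is valid once the anchor has 16 confirmations); the function and constant names are hypothetical and are not taken from LND.

```go
package main

// anchorCSVDelay is the 16-block relative timelock on the anchor output,
// as stated in the issue text.
const anchorCSVDelay = 16

// anyoneCanSpendAt reports whether a spend of a dangling anchor by an
// arbitrary third party could be included in a block at spendHeight,
// given that the commitment transaction confirmed at confHeight.
// Assumption: a relative timelock of n blocks is satisfied once
// spendHeight - confHeight >= n.
func anyoneCanSpendAt(confHeight, spendHeight int32) bool {
	return spendHeight-confHeight >= anchorCSVDelay
}
```

Under this assumption, a commitment confirmed at height 100 leaves the anchor exclusively ours for spends mined through height 115, and open to anyone from height 116 onward.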
Currently, LND's sweeper system tracks dangling anchors and attempts to economically spend them every block, potentially batching them with other inputs. There are several good reasons to remove this functionality entirely, as we will explore below:
Dangling anchors are rare
Two rare events must generally occur for a dangling anchor to exist:
1. Force closes are rare because they are expensive and no one likes them. Everyone would rather co-op close if possible.
2. Force-closed channels with no CPFP are rare because local HTLC deadlines are the main reason to force close in the first place, in which case the anchor is used to CPFP and confirm the commitment transaction quickly. Someone may also force close to reallocate capital when a peer becomes unresponsive, but they are then also likely to spend their anchor to speed along the force close and minimize further opportunity cost for those funds.
The only case where dangling anchors are expected to consistently exist is when force closes happen due to implementation bugs. Ideally we should fix those bugs and not have to worry about expensive force closes and dangling anchor sweeping in the first place.
Recovering meaningful amounts is even rarer
Even when the rare force close with a dangling anchor does occur, the amount recovered from the dangling anchor is peanuts. Anchor outputs have a value of 330 sats ($0.20 at today's exchange rate), so even if we could claim the full amount (not possible due to transaction fees) the benefit is small.
In practice, the amounts recovered from dangling anchors are much less than 330 sats. Prior to the 16-block CSV expiry, it is only economical to sweep dangling anchors at fee rates below ~2 sat/vB. After the CSV expiry, it is only economical at fee rates below ~6 sat/vB. So as long as mempool fee rates are 6 sat/vB or higher, we can't recover anything from dangling anchors. At lower fee rates, some fraction of the 330 sat anchor can be recovered.
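The break-even arithmetic behind those thresholds can be sketched roughly as follows. The two size constants are illustrative assumptions, back-solved from the ~2 and ~6 sat/vB figures above (330 / 2 = 165 vB and 330 / 6 = 55 vB); they are not measured transaction sizes, and the function name is hypothetical.

```go
package main

// anchorValueSat is the anchor output value stated in the issue.
const anchorValueSat int64 = 330

// Assumed marginal virtual sizes attributable to one anchor sweep.
// These are back-solved from the issue's break-even fee rates and are
// illustrative only, not measured values.
const (
	assumedVSizePreCSV  int64 = 165 // implies break-even near 2 sat/vB
	assumedVSizePostCSV int64 = 55  // implies break-even near 6 sat/vB
)

// recoverableSat returns how many sats remain after paying for
// marginalVSize virtual bytes at feeRate sat/vB. It returns 0 when the
// fee meets or exceeds the anchor value, i.e. the sweep is uneconomical.
func recoverableSat(feeRate, marginalVSize int64) int64 {
	fee := feeRate * marginalVSize
	if fee >= anchorValueSat {
		return 0
	}
	return anchorValueSat - fee
}
```

With these assumed sizes, `recoverableSat(6, assumedVSizePostCSV)` is 0, while at 1 sat/vB the same post-CSV sweep would leave 275 sats and a pre-CSV sweep 165 sats.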
Anchor sweeping bots are more efficient than LND
When fee rates do dip below 6 sat/vB, we need to batch many dangling anchor inputs into a single sweep transaction to maximize the amount recovered. LND is currently quite bad at this, since it will only batch together dangling anchors from channels that were closed at the same time. And it only considers dangling anchors from its own channels, which severely limits the amount of batching possible.
By contrast, dedicated anchor sweeping bots track all dangling anchors in the UTXO set and attempt to batch them all when fee rates make it economical to do so. As a result, these bots are more fee efficient and better reduce the UTXO set than LND can.
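To illustrate why batch size matters, here is a rough sketch of how the fixed cost of a sweep transaction is amortized across many anchor inputs. The overhead and per-input sizes are hypothetical placeholders chosen for the example, and the function names are not from LND or any sweeping bot.

```go
package main

// Illustrative assumptions, not measured values.
const (
	anchorSat        int64 = 330 // anchor value stated in the issue
	assumedOverheadV int64 = 45  // assumed fixed vsize: tx header plus one output
	assumedInputV    int64 = 55  // assumed vsize per post-CSV anchor input
)

// netRecoveredSat returns the total anchor value minus the fee for a
// sweep that batches n anchors at feeRate sat/vB. A negative result
// means the batch costs more in fees than it recovers.
func netRecoveredSat(n, feeRate int64) int64 {
	vsize := assumedOverheadV + n*assumedInputV
	return n*anchorSat - feeRate*vsize
}

// minProfitableBatch returns the smallest batch size (up to maxN) with
// a positive net recovery at feeRate, or 0 if no such size exists.
func minProfitableBatch(feeRate, maxN int64) int64 {
	for n := int64(1); n <= maxN; n++ {
		if netRecoveredSat(n, feeRate) > 0 {
			return n
		}
	}
	return 0
}
```

Under these assumptions, a single anchor swept at 5 sat/vB loses 170 sats, while a batch of ten nets 325 sats, and at least five anchors are needed to come out ahead. A node limited to anchors from its own simultaneously closed channels will rarely reach such batch sizes, whereas a bot drawing on the whole UTXO set can.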
LND sweeps dangling anchors unsafely
Because anchor sweeping bots are more efficient than LND, they can generally offer higher fee rates when sweeping anchors than LND can. This means it is rather common for anchor sweeping bots to replace LND's anchor sweep transactions in mempools.
In some cases, this can cause inadvertent pinning of time-critical LND transactions, leading to potential fund loss. This is possible due to LND's current batching logic, which can batch dangling anchors with time-critical inputs, thereby enabling anyone on the network to execute a pinning attack (accidentally or intentionally).
In addition, LND generally adds one of its wallet UTXOs to dangling anchor spends to ensure the transaction output is above the dust limit. This is then one less wallet UTXO that could be used to CPFP a time-critical commitment transaction in the future. If enough of these UTXOs are locked up in this way, funds could be accidentally lost or intentionally stolen.
Sweeping dangling anchors adds complexity
Sweeping anchors properly has proven to be more difficult and complex than anyone anticipated [1, 2, 3]. It sometimes seems that every "improvement" we make to this logic causes a new regression or vulnerability. It would be really nice to rip this code out entirely, rather than further complicating things to "properly" do anchor sweeping.
Ephemeral anchors will eliminate dangling anchors
Rapid progress towards TRUC transactions and ephemeral anchors is being made in Bitcoin Core, with a potential release happening sooner than expected. As channels start using the new ephemeral anchor construct, dangling anchors will become a thing of the past.
In summary, handling dangling anchors is tricky and error-prone, and attempting to do so opens doors to multiple security vulnerabilities. The current situation in LND is not great, so we either need to add significant complexity to handle dangling anchors more safely, or we need to remove the logic entirely.
Putting in the engineering work to handle dangling anchors properly has minimal benefit, since dangling anchors are rare, recoverable values are tiny (or none), and ephemeral anchors will eliminate the problem entirely. As the recent sweeper rewrite has shown us, changing this part of the codebase is a lot of work, and things can go wrong. In the end, it seems that sweeping of dangling anchors is a task better suited for dedicated bots.
So, let's rip out dangling anchor sweeping entirely to simplify the codebase and improve security.