fundchannel_cancel hangs indefinitely on a peer's RBF'd second channel in CHANNELD_AWAITING_LOCKIN state with first channel in CHANNELD_NORMAL state #7785

Open
whitslack opened this issue Nov 2, 2024 · 2 comments

whitslack commented Nov 2, 2024

Issue and Steps to Reproduce

  1. Have one channel in CHANNELD_NORMAL state with a peer.
  2. Propose to fund a second channel with that same peer using fundchannel_start.
  3. Establish the second channel using fundchannel_complete.
  4. Broadcast the signed funding transaction with a low fee rate.
  5. Propose to fund a channel with some other peer using fundchannel_start.
  6. Construct a transaction to fund the channel with the second peer, spending the same coin(s) as the funding transaction broadcast in step 4 but paying a higher fee rate.
  7. Establish the channel with the second peer using fundchannel_complete.
  8. Broadcast the second signed funding transaction, which replaces the first transaction.
  9. Wait for confirmation of the second transaction.
  10. Attempt to cancel the funding of the second channel with the first peer using fundchannel_cancel.
  11. fundchannel_cancel never returns.
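
For concreteness, here's roughly what that sequence looks like through pyln-client (an untested sketch: the node IDs, RPC socket path, and PSBTs are placeholders, and the PSBT construction and the wait for confirmation are elided):

from pyln.client import LightningRpc

rpc = LightningRpc("/path/to/lightning-rpc")        # placeholder socket path
peer1_id = "<node id of the first peer>"            # placeholder
peer2_id = "<node id of the second peer>"           # placeholder
amount = 1_000_000                                  # arbitrary channel size, in satoshis

# Steps 2-4: second channel with the first peer (whose first channel is CHANNELD_NORMAL).
addr1 = rpc.fundchannel_start(peer1_id, amount)["funding_address"]
psbt1 = "<PSBT paying `amount` to addr1 at a low fee rate>"               # construction elided
rpc.fundchannel_complete(peer1_id, psbt1)
rpc.sendpsbt(rpc.signpsbt(psbt1)["signed_psbt"])    # broadcast; channel sits in CHANNELD_AWAITING_LOCKIN

# Steps 5-8: channel with the second peer, double-spending the same coin(s).
addr2 = rpc.fundchannel_start(peer2_id, amount)["funding_address"]
psbt2 = "<PSBT spending the same UTXO(s) to addr2 at a higher fee rate>"  # construction elided
rpc.fundchannel_complete(peer2_id, psbt2)
rpc.sendpsbt(rpc.signpsbt(psbt2)["signed_psbt"])    # replaces the first funding transaction

# Steps 9-11: once the replacement confirms, try to cancel the stranded channel.
rpc.fundchannel_cancel(peer1_id)                    # observed: never returns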

getinfo output

This is with CLN 24.08.2.

@whitslack whitslack changed the title fundchannel_cancel hangs indefinitely on a peer's RBF'd second channel with first channel in CHANNELD_AWAITING_LOCKIN state fundchannel_cancel hangs indefinitely on a peer's RBF'd second channel in CHANNELD_AWAITING_LOCKIN state with first channel in CHANNELD_NORMAL state Nov 2, 2024

whitslack commented Nov 2, 2024

Might be a red herring. The first peer was connected inbound to me via a Tor exit node. (I have no Tor service.) I disconnected the peer, and it has yet to reconnect, so maybe it went offline but the Tor exit node didn't. Still, I suspect something might not be plumbed up quite right if fundchannel_cancel never notices that the peer isn't responding.

whitslack commented

Nope. There is definitely a problem here. The peer cannot maintain a stable connection with me because I keep sending WIRE_ERROR for the second channel and dropping the whole connection.

Here's what I see in the log repeatedly (every minute or so):

2024-11-02T15:54:31.710Z DEBUG   03…55-connectd: Connect IN
2024-11-02T15:54:31.711Z DEBUG   03…55-connectd: peer_out WIRE_INIT
2024-11-02T15:54:31.712Z DEBUG   03…55-connectd: peer_in WIRE_INIT
2024-11-02T15:54:31.712Z DEBUG   03…55-connectd: Handed peer, entering loop
2024-11-02T15:54:31.712Z DEBUG   03…55-chan#255536: Peer has reconnected, state CHANNELD_NORMAL: connecting subd
2024-11-02T15:54:31.734Z DEBUG   03…55-channeld-chan#255536: pid 14729, msgfd 188
2024-11-02T15:54:31.735Z DEBUG   03…55-chan#255536: Already have funding locked in
2024-11-02T15:54:31.735Z DEBUG   03…55-chan#255537: Peer has reconnected, state CHANNELD_AWAITING_LOCKIN: connecting subd
2024-11-02T15:54:31.735Z DEBUG   03…55-chan#255537: Telling connectd to send error 00…2e
2024-11-02T15:54:31.736Z DEBUG   03…55-connectd: peer_out WIRE_GOSSIP_TIMESTAMP_FILTER
2024-11-02T15:54:31.736Z DEBUG   03…55-gossipd: seeker: disabling gossip
2024-11-02T15:54:31.740Z DEBUG   03…55-connectd: peer_out WIRE_ERROR
2024-11-02T15:54:31.740Z DEBUG   03…55-connectd: discard_peer
2024-11-02T15:54:31.740Z DEBUG   03…55-lightningd: peer_disconnect_done
2024-11-02T15:54:31.740Z DEBUG   03…55-lightningd: Already reconnecting
2024-11-02T15:54:31.741Z DEBUG   plugin-funder: Cleaning up inflights for peer id 03…55
2024-11-02T15:54:31.741Z DEBUG   03…55-channeld-chan#255536: option_static_remotekey = 1, option_anchor_outputs = 0 option_anchors_zero_fee_htlc_tx = 1
2024-11-02T15:54:31.741Z DEBUG   03…55-channeld-chan#255536: init LOCAL: remote_per_commit = 03…7e, old_remote_per_commit = 03…73 next_idx_local = 687 next_idx_remote = 688 revocations_received = 687 feerates { SENT_ADD_ACK_REVOCATION:1255 } range 253-12700 blockheights { SENT_ADD_ACK_REVOCATION:0 }, our current 868565
2024-11-02T15:54:31.742Z DEBUG   03…55-hsmd: Got WIRE_HSMD_GET_PER_COMMITMENT_POINT
2024-11-02T15:54:31.742Z DEBUG   03…55-hsmd: Got WIRE_HSMD_GET_PER_COMMITMENT_POINT
2024-11-02T15:54:31.742Z DEBUG   03…55-channeld-chan#255536: peer_out WIRE_CHANNEL_REESTABLISH
2024-11-02T15:54:31.743Z DEBUG   03…55-channeld-chan#255536: billboard: Sent reestablish, waiting for theirs
2024-11-02T15:54:36.737Z DEBUG   03…55-connectd: Subd did not close, forcing close
2024-11-02T15:54:36.737Z **BROKEN** 03…55-connectd: Peer did not close, forcing close
2024-11-02T15:54:36.737Z INFO    03…55-channeld-chan#255536: Peer connection lost
2024-11-02T15:54:36.737Z DEBUG   03…55-channeld-chan#255536: Status closed, but not exited. Killing
2024-11-02T15:54:36.737Z INFO    03…55-chan#255536: Peer transient failure in CHANNELD_NORMAL: channeld: Owning subdaemon channeld died (62208)

Note that there are two different channel IDs here. 255536 is the first channel, which is in CHANNELD_NORMAL state. 255537 is the second channel, which is in CHANNELD_AWAITING_LOCKIN state and would remain so indefinitely, since its funding transaction was (intentionally) replaced.
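
(The two states are easy to confirm over RPC; an untested pyln-client sketch with placeholder values — note that listpeerchannels reports the 256-bit channel_id, not the #NNNNNN index shown in the log:)

from pyln.client import LightningRpc

rpc = LightningRpc("/path/to/lightning-rpc")    # placeholder socket path
peer1_id = "<node id of the first peer>"        # placeholder

# One entry per channel with this peer: here, one CHANNELD_NORMAL and one CHANNELD_AWAITING_LOCKIN.
for ch in rpc.listpeerchannels(peer1_id)["channels"]:
    print(ch["channel_id"], ch["state"])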

Why does connectd disconnect the peer after sending WIRE_ERROR when there is still a perfectly good channel running on the connection?


Epilogue: Interestingly, my long-hung fundchannel_cancel finally returned (with the error below) when I gave up and called dev-forget-channel. Unfortunately, I suspect this means that nobody ended up telling the peer to abandon the channel too.

{
   "code": -1,
   "message": "Channel structure was freed!"
}
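
Roughly what this looks like through pyln-client (an untested sketch with placeholder values; dev-forget-channel requires developer mode and was issued separately while the cancel was still blocked):

from pyln.client import LightningRpc, RpcError

rpc = LightningRpc("/path/to/lightning-rpc")    # placeholder socket path
peer1_id = "<node id of the first peer>"        # placeholder

try:
    rpc.fundchannel_cancel(peer1_id)            # blocks indefinitely...
except RpcError as err:
    # ...until a second session forgets the stuck channel with something like
    #   lightning-cli dev-forget-channel id=<peer1_id> channel_id=<stuck channel_id> force=true
    # after which the hung call fails with the error shown above.
    print(err.error)                            # {'code': -1, 'message': 'Channel structure was freed!'}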
