Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

lightningd crashes when non-important plugin misbehaves #7838

Open
JssDWt opened this issue Nov 18, 2024 · 3 comments
Open

lightningd crashes when non-important plugin misbehaves #7838

JssDWt opened this issue Nov 18, 2024 · 3 comments
Milestone

Comments

@JssDWt
Copy link
Contributor

JssDWt commented Nov 18, 2024

Issue and Steps to Reproduce

  • plugin returns a non-result value
  • lightningd crashes
Plugin trampoline for htlc_accepted returned non-result response {"error":{"code":-32700,"data":null,"message":"missing field `forward_msat`"},"id":"cln:htlc_accepted#1","jsonrpc":"2.0"}
lightningd: FATAL SIGNAL 6 (version v24.08.1-17-ga780ad4-modded)
0x5639b61b0dc7 send_backtrace
        common/daemon.c:33
0x5639b61b0e4f crashdump
        common/daemon.c:75
0x7fbe5a5ccfcf ???
        ./signal/../sysdeps/unix/sysv/linux/x86_64/libc_sigaction.c:0
0x7fbe5a61bd3c __pthread_kill_implementation
        ./nptl/pthread_kill.c:44
0x7fbe5a5ccf31 __GI_raise
        ../sysdeps/posix/raise.c:26
0x7fbe5a5b7471 __GI_abort
        ./stdlib/abort.c:79
0x5639b615f606 fatal_vfmt
        lightningd/log.c:1038
0x5639b615f69f fatal
        lightningd/log.c:1048
0x5639b618585e plugin_hook_callback
        lightningd/plugin_hook.c:167
0x5639b617fd4f plugin_response_handle
        lightningd/plugin.c:663
0x5639b618349a plugin_read_json_one
        lightningd/plugin.c:775
0x5639b6183736 plugin_read_json
        lightningd/plugin.c:826
0x5639b631d6d7 next_plan
        ccan/ccan/io/io.c:60
0x5639b631db62 do_plan
        ccan/ccan/io/io.c:422
0x5639b631dc1b io_ready
        ccan/ccan/io/io.c:439
0x5639b631f4d6 io_loop
        ccan/ccan/io/poll.c:455
0x5639b61576a2 io_loop_with_timers
        lightningd/io_loop_with_timers.c:22
0x5639b615ce88 main
        lightningd/lightningd.c:1470
0x7fbe5a5b81c9 __libc_start_call_main
        ../sysdeps/nptl/libc_start_call_main.h:58
0x7fbe5a5b8284 __libc_start_main_impl
        ../csu/libc-start.c:360
0x5639b6131f10 ???
        ???:0
0xffffffffffffffff ???
        ???:0

getinfo output

v24.08

@rustyrussell rustyrussell added this to the v24.11 milestone Nov 18, 2024
@cdecker
Copy link
Member

cdecker commented Nov 18, 2024

I suspect that crashing in this case is the best we can do: The plugin has gotten the request, may have done something with it (created DB entries, ordered a Chai Latte, you name it), and that cannot be undone by ignoring the failed result, and pretending we never called the hook.

We have two options:

  • Ignore the broken result, which may lead to non-atomic changes between plugin state and node state. If the plugin is mandatory, this is just plain the wrong answer. Yours is a trampoline plugin, and we'd be rejecting the HTLC with a generic error message. The default entirety of the logic is in the plugin, and so it must share the fate of the rest of the node, and vice-versa
  • Crash when a mandatory component, such as a plugin which changes our behavior. This is what we currently do, and it guarantees that the plugin will be presented with the same HTLC upon startup (hence why the hooks need to be idempotent by the way).

In your case feeding the HTLC back into the plugin caused it to crash again, causing a crash loop, but what else could we do? If we didn't replay the HTLC then the plugin would not have seen some of the HTLCs, potentially breaking the guarantees lightningd gives the plugin.

Given these considerations, I don't think this is fixable, as a change/weakening in semantics would be our only option, and we don't want that. You were unfortunate since htlc_accepted is special in that the HTLC gets replayed at startup, which is not the case for all hooks.

@JssDWt
Copy link
Contributor Author

JssDWt commented Nov 18, 2024

The one option I see is to kill the plugin, if it's run with plugin rather than important-plugin. Then process htlcs as if the plugin didn't exist. I would consider that 'expected' if a non-important plugin misbehaves.

@vincenzopalazzo
Copy link
Contributor

vincenzopalazzo commented Nov 18, 2024

I believe the discussion is more about when a plugin should return an error to Core Lightning. I agree with @cdecker that killing the plugin on encountering an unexpected error does not solve anything. You cannot apply this rule universally to every plugin because you risk terminating a plugin that is performing critical operations and must not be killed—for example, a backup plugin.

That said, this is a Rust plugin (likely the one GL is shipping to handle the trampoline functionality), and in my opinion, this represents a misuse of Rust's ? operator.

Why should this error be returned to Core Lightning in this case? If it’s a JSON encoding/decoding error (I am assuming from the error, looks like a serde_json error), the plugin author should terminate the plugin intentionally (panic) because only they know whether it is safe to do so. Am I missing something here?

Plugin trampoline for htlc_accepted returned non-result response {"error":{"code":-32700,"data":null,"message":"missing field `forward_msat`"},"id":"cln:htlc_accepted#1","jsonrpc":"2.0"}

@rustyrussell rustyrussell modified the milestones: v24.11, v25.02 Dec 13, 2024
@rustyrussell rustyrussell modified the milestones: v25.02, v25.05 Feb 17, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants