
Conversation

TheBlueMatt
Collaborator

This finally adds support for full native Rust `async` persistence
to `ChainMonitor`.

Way back when, before we had any other persistence, we added the
`Persist` trait to persist `ChannelMonitor`s. It eventually grew
homegrown async persistence support via a simple immediate return
and a callback upon completion. We later added a persistence trait
in `lightning-background-processor` to persist the few fields that
it needed to drive writes for. Over time, we found more places
where persistence was useful, and we eventually added a generic
`KVStore` trait.

In dc75436c673fad8b5b8ed8d5a9db1ac95650685a we removed the
`lightning-background-processor` `Persister` in favor of simply
using the native `KVStore` directly.

Here we continue that trend, building native `async`
`ChannelMonitor` persistence on top of our native `KVStore` rather
than hacking support for it into the `chain::Persist` trait.
Because `MonitorUpdatingPersister` already exists as a common way
to wrap a `KVStore` into a `ChannelMonitor` persister, we build
exclusively on that (though note that the "monitor updating" part
is now optional), utilizing its new async option as our native
async driver.

Thus, we end up with a `ChainMonitor::new_async_beta` which takes
a `MonitorUpdatingPersisterAsync` rather than a classic
`chain::Persist` and then operates the same as a normal
`ChainMonitor`.
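
As a rough usage sketch (the exact argument list here is an assumption based on the
existing `ChainMonitor::new` signature and the `new_async_beta` snippet below, and
construction of the `MonitorUpdatingPersisterAsync` itself is elided):

```rust
// Hypothetical wiring; `async_persister` is assumed to have been built from an
// async `KVStore` (plus a `FutureSpawner`) via `MonitorUpdatingPersisterAsync`.
let chain_monitor = ChainMonitor::new_async_beta(
	Some(chain_source),
	broadcaster,
	logger,
	fee_estimator,
	async_persister,
);
```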

While the requirement that users now use a
`MonitorUpdatingPersister` to wrap their `KVStore` before providing
it to `ChainMonitor` is somewhat awkward, as we move towards a
`KVStore`-only world it seems like `MonitorUpdatingPersister`
should eventually merge into `ChainMonitor`.

Curious to hear what folks think. No tests yet, which I still need to write.

@ldk-reviews-bot

ldk-reviews-bot commented Sep 9, 2025

👋 Thanks for assigning @joostjager as a reviewer!
I'll wait for their review and will help manage the review process.
Once they submit their review, I'll check if a second reviewer would be helpful.


codecov bot commented Sep 9, 2025

Codecov Report

❌ Patch coverage is 85.38588% with 89 lines in your changes missing coverage. Please review.
✅ Project coverage is 88.72%. Comparing base (1ed6c4b) to head (462a647).
⚠️ Report is 29 commits behind head on main.

Files with missing lines Patch % Lines
lightning/src/util/persist.rs 79.72% 45 Missing and 15 partials ⚠️
lightning/src/util/test_utils.rs 81.31% 13 Missing and 4 partials ⚠️
lightning/src/util/native_async.rs 81.48% 5 Missing ⚠️
lightning/src/ln/chanmon_update_fail_tests.rs 97.24% 1 Missing and 3 partials ⚠️
lightning/src/chain/chainmonitor.rs 94.00% 3 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #4063      +/-   ##
==========================================
+ Coverage   88.71%   88.72%   +0.01%     
==========================================
  Files         176      177       +1     
  Lines      132778   133252     +474     
  Branches   132778   133252     +474     
==========================================
+ Hits       117790   118234     +444     
- Misses      12304    12320      +16     
- Partials     2684     2698      +14     
Flag Coverage Δ
fuzzing 21.47% <1.56%> (-0.07%) ⬇️
tests 88.56% <85.38%> (+0.01%) ⬆️

@tnull tnull self-requested a review September 9, 2025 07:44
Contributor

@joostjager joostjager left a comment

Great to see how you wrestle the wrappers, polling and generics, especially after having fought with that myself for quite some time.

);
(start, end)
})
let latest_update_id = monitor.get_latest_update_id();
Contributor

Is there test coverage to make this change safely? Migration code often isn't covered as well. It seems that persister_with_real_monitors provides some, but not sure what exactly.

Also wondering if this change is necessary for this PR. Otherwise it might be better to split it off.

Collaborator Author

It's sadly required. We have to make the persist_new_channel call as a non-async call and then block its future later. If we kept the old code which read the old monitor, we'd need to await it before calling persist_new_channel and suddenly we can have write-order inversions. I'll add a comment describing why it's important.
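
To illustrate the ordering concern with a toy sketch (this is not the PR's code; the
point is just that the write has to be handed to the store synchronously, in call
order, even though its completion is awaited later):

```rust
use core::future::Future;

// Toy store: `start_write` must be called in the intended on-disk ordering; the
// returned future only signals completion, it does not start the work.
trait ToyStore {
	fn start_write(&self, key: &str, value: Vec<u8>) -> impl Future<Output = ()>;
	fn read(&self, key: &str) -> impl Future<Output = Option<Vec<u8>>>;
}

async fn persist_then_cleanup<S: ToyStore>(store: &S, key: &str, value: Vec<u8>) {
	// Initiate the monitor write *before* any await, so a concurrent caller can't
	// slip its own write in ahead of ours (a write-order inversion).
	let write_done = store.start_write(key, value);

	// Only now is it safe to await anything, e.g. reading old state for cleanup.
	let _old = store.read(key).await;
	write_done.await;
}
```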

Contributor

But this would break upgrades for any monitors written pre-0.1, right (i.e., even if users just have them still lying around)? Should we document that in the pending_changelog?

Collaborator Author

It shouldn't? The change is still functionally equivalent, we just do the read and cleanup after the first write.

Contributor

Initial question about test coverage is still open?

Collaborator Author

Right, we should add test coverage, but the previous logic also had no coverage, and the new logic is clearly safer as it relies on listing and existing code rather than code written specifically for this case. #4104

/// [`MonitorUpdatingPersisterAsync`] and thus allows persistence to be completed async.
///
/// Note that async monitor updating is considered beta, and bugs may be triggered by its use.
pub fn new_async_beta(
Contributor

I think there are now three ways to do persistence: sync, the previous async way via implementing a different Persist and this new_async_beta?

Is there any form of consolidation possible between the two async setups?

Collaborator Author

Yea, I mention it in the last commit, but I think the eventual consolidation should be that we merge MonitorUpdatingPersister into ChainMonitor, at which point the Persist interface is just the interface between ChannelManager and ChainMonitor. A user will always just instantiate a ChainMonitor with either a KVStore or a KVStoreSync and we'll deal with the rest.

Contributor

That makes sense to me. I just wondered if we should already now steer towards MonitorUpdatingPersister with an async kv store as the only way to do async. I don't think it is more "beta" than the current callback-based async?

Contributor

Thoughts on this?

Collaborator Author

Ah, I'd missed this. I don't really see a strong reason to change the current API and remove the manual-async approach immediately. It's not additional code to maintain (given the new logic uses it under the hood anyway) and we do have folks using it. That said, it does probably make sense to deprecate it, which I'll go ahead and do here.

Collaborator Author

Oh actually never mind, we should have a discussion about whether we want to support async outside of Rust, which would need the old API (or a way to make async KVStore work outside of Rust, which I think we can do eventually as well).

Contributor

I didn't consider the bindings. Not great to remain stuck with multiple ways to do it, but not sure what we can do either.

Collaborator Author

I think we can map the async stuff to bindings eventually, so it's not like we're stuck, just a question of priorities.

/// - [`MonitorUpdatingPersister`] will potentially have more listing to do if you need to run
/// [`MonitorUpdatingPersister::cleanup_stale_updates`].
///
/// Note that you can disable the update-writing entirely by setting `maximum_pending_updates`
Contributor

Can the Persist impl for KvStore (non-updating) be removed now?

Collaborator Author

Yea, I think the next step in cleaning things up would be to consolidate ChainMonitor and MonitorUpdatingPersister and then we'd remove that blanket impl.

Contributor

I think we could already remove it now and let users use MonitorUpdatingPersister with zero updates? Or do you want to keep it to avoid a two-step API change?

@TheBlueMatt TheBlueMatt force-pushed the 2025-09-async-chainmonitor branch from c5c70bc to 53cb9f6 Compare September 9, 2025 21:06
@joostjager joostjager mentioned this pull request Sep 10, 2025
Contributor

@tnull tnull left a comment

Did a first high-level pass, looks good so far!

I think this would be a great candidate to follow the recently-discussed policy of testing on LDK Node. Mind also opening a draft PR against its develop branch (which we now have) to test this out? (Doesn't need to be production-ready, happy to take it over eventually.)

);
(start, end)
})
let latest_update_id = monitor.get_latest_update_id();
Contributor

But this would break upgrades for any monitors written pre-0.1, right (i.e., even if users just have them still lying around)? Should we document that in the pending_changelog?

}

/// A generic trait which is able to spawn futures in the background.
pub trait FutureSpawner: Send + Sync + 'static {
Contributor

Should this live in lightning-types or a new shared lightning-util crate, so that we can DRY up the one in lightning-block-sync?

Collaborator Author

Maybe? Not entirely sure what to do with it honestly. It doesn't really belong in lightning-types and adding a crate just for this seems like overkill... We could probably move it to lightning::util in a separate module and just use that one in lightning-block-sync?

Collaborator Author

Actually, I went ahead and did this.
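
For context, a minimal sketch of what a runtime-backed implementation could look
like (the trait here is copied from the snippet above with `MaybeSend` written as
plain `Send`, its meaning in std builds; `TokioSpawner` is a hypothetical name):

```rust
use core::future::Future;

// Mirrors the `FutureSpawner` trait shown above, spelled with plain `Send` bounds.
pub trait FutureSpawner: Send + Sync + 'static {
	fn spawn<T: Future<Output = ()> + Send + 'static>(&self, future: T);
}

pub struct TokioSpawner;

impl FutureSpawner for TokioSpawner {
	fn spawn<T: Future<Output = ()> + Send + 'static>(&self, future: T) {
		// Hand the persistence future off to the ambient tokio runtime.
		tokio::spawn(future);
	}
}
```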

struct PanicingSpawner;
impl FutureSpawner for PanicingSpawner {
	fn spawn<T: Future<Output = ()> + MaybeSend + 'static>(&self, _: T) {
		unreachable!();
Contributor

Can we add a message here explaining what happened if it was ever hit?

Collaborator Author

I thought that was clear from the PanicingSpawner name :) I could write out why it's used below but it seems weird to put that here rather than where PanicingSpawner is used?

Contributor

Right, but if it panics we won't see the type name, no? Not that important, but some message could provide some initial context on what happened/why we crashed?

Collaborator Author

It'll still give a line number 🤷‍♂️
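
For what it's worth, the suggestion would just be a message argument to the macro;
the wording here is a made-up example, not what the PR ended up with:

```rust
unreachable!("PanicingSpawner was asked to spawn a future; in the sync-KVStore path all persistence futures should complete synchronously");
```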

if let Some(a) = res_a {
	a.await?;
}
if let Some(b) = res_b {
Contributor

Can we make these else if branches to express that we'd only ever deal with one res here?

Collaborator Author

Seems more correct not to, given we're not returning the Ok from them, just ?-ing them?

///
/// Note that async monitor updating is considered beta, and bugs may be triggered by its use.
pub fn new_async_beta(
chain_source: Option<C>, broadcaster: T, logger: L, feeest: F,
Contributor

nit: Why not simply fee_estimator?

Collaborator Author

Cause I copied the one from new 🤷‍♂️

@TheBlueMatt
Collaborator Author

I think this would be a great candidate to follow the recently-discussed policy of testing on LDK Node. Mind also opening a draft PR against its develop branch (which we now have) to test this out? (Doesn't need to be production-ready, happy to take it over eventually.)

Indeed, though I don't actually think we want to use this right away - because async persist is still considered "beta" I don't think there's anything actionable in LDK Node until probably at least 0.3? I'm happy to do a test integration but does LDK Node already have support for async KVStore/use MonitorUpdatingPersister? Without those it doesn't seem worth much?

I'll at least go update ldk-sample, as my goal here is in part to make it easier to test async persist in my own node in 0.2.

@TheBlueMatt TheBlueMatt force-pushed the 2025-09-async-chainmonitor branch 3 times, most recently from ac1c9d5 to dfaa102 Compare September 11, 2025 16:55
@TheBlueMatt TheBlueMatt self-assigned this Sep 11, 2025
@tnull
Contributor

tnull commented Sep 12, 2025

I'm happy to do a test integration but does LDK Node already have support for async KVStore/use MonitorUpdatingPersister? Without those it doesn't seem worth much?

For the former, it will after lightningdevkit/ldk-node#633 lands, which I intend to finish ~early next week (it will also need some changes over here in LDK). For the latter we already had a PR open/closed; that was however blocked on the LDK upgrade (lightningdevkit/ldk-node#456). I think I'll see to picking that up in the coming week(s), too.

@tnull
Contributor

tnull commented Sep 12, 2025

will also need some changes over here in LDK

Actually: #4069

@TheBlueMatt TheBlueMatt added this to the 0.2 milestone Sep 18, 2025
@TheBlueMatt TheBlueMatt mentioned this pull request Sep 2, 2025
24 tasks
@TheBlueMatt TheBlueMatt force-pushed the 2025-09-async-chainmonitor branch 2 times, most recently from def7770 to 403385f Compare September 21, 2025 14:34
@TheBlueMatt TheBlueMatt marked this pull request as ready for review September 21, 2025 14:34
@TheBlueMatt
Collaborator Author

Updated TestKVStore to make future completion manual and added a test which utilizes that. I tested it against lightningdevkit/ldk-node#633 and it didn't seem to cause test failures, but I'm happy to work on updating that in the future if it does. Given we probably don't want to ship this in ldk-node in this version, I'm not sure updating ldk-node to use it should really be a priority, but I'll happily open an ldk-node PR to use this after this week.

@TheBlueMatt TheBlueMatt force-pushed the 2025-09-async-chainmonitor branch 2 times, most recently from 6616a70 to c2079d3 Compare September 21, 2025 18:33
Comment on lines +877 to +883
Ok(()) => inner.async_completed_updates.lock().unwrap().push(completion),
Err(e) => {
	log_error!(
		inner.logger,
		"Failed to persist new ChannelMonitor {channel_id}: {e}. The node will now likely stall as this channel will not be able to make progress. You should restart as soon as possible.",
	);
},
Contributor

The error handling in spawn_async_persist_new_channel adds the completion to async_completed_updates regardless of whether there's an error, while the similar error handling in spawn_async_update_persisted_channel at line 908 only adds the completion when there's no error. These behaviors should be consistent - either both should add completions only on success, or both should add completions regardless of errors.

Spotted by Diamond


@TheBlueMatt TheBlueMatt force-pushed the 2025-09-async-chainmonitor branch 2 times, most recently from c4748f9 to 5b058df Compare September 21, 2025 18:57
@TheBlueMatt TheBlueMatt force-pushed the 2025-09-async-chainmonitor branch from ed66038 to a501bc7 Compare September 22, 2025 12:52
Contributor

@tnull tnull left a comment

Changes LGTM, I think.

I do worry that in three weeks we'll start forgetting about a lot of the context/details regarding the KVStore ordering requirements and will trip up eventually. Makes sense to move forward now, but I do wonder if we could do a better job enforcing some of our assumptions via types eventually.

CHANNEL_MONITOR_PERSISTENCE_PRIMARY_NAMESPACE,
CHANNEL_MONITOR_PERSISTENCE_SECONDARY_NAMESPACE,
).await?;
let primary = CHANNEL_MONITOR_PERSISTENCE_PRIMARY_NAMESPACE;
Contributor

Personal preference, and we discussed this many times already, but IMO these 'cleanups' make the code harder to read, as to me it's much clearer to read the fully-capitalized constants in the write call than going through the indirection of two or three more, less descriptive variables.

I'd take some more verticality and some more indentation/structure over additional variables any day. FWIW, in these cases we even maintain the number of lines, so we don't even reduce verticality.

Collaborator Author

In some of the cases it could probably be walked back, as I was hoping to reuse the namespace variables but it turned out they were different. Given I failed to recognize they were different, we really need to change the constants so that they're much shorter and can be eyeballed much more easily.

Contributor

In some of the cases it could probably be walked back, as I was hoping to reuse the namespace variables but it turned out they were different. Given I failed to recognize they were different, we really need to change the constants so that they're much shorter and can be eyeballed much more easily.

I disagree. They were intentionally kept expressive and consistent.

Collaborator Author

We can discuss on #4105, but I wrote to the wrong namespace at least 3 times at various points in this PR, both from copying around code and from bogus attempts at cleaning up code. That indicates to me pretty clearly that we need to rework those constants because they're going to cause issues.

struct PanicingSpawner;
impl FutureSpawner for PanicingSpawner {
	fn spawn<T: Future<Output = ()> + MaybeSend + 'static>(&self, _: T) {
		unreachable!();
Contributor

Right, but if it panics we won't see the type name, no? Not that important, but some message could provide some initial context on what happened/why we crashed?

CHANNEL_MONITOR_UPDATE_PERSISTENCE_PRIMARY_NAMESPACE,
monitor_key.as_str(),
let primary = CHANNEL_MONITOR_UPDATE_PERSISTENCE_PRIMARY_NAMESPACE;
// Note that this is NOT an async function, but rather calls the *sync* KVStore
Contributor

This really makes me wonder if we should still somehow reflect the ordering requirements in the KVStore interface itself (we could split write into register_write/complete_write calls, for example). It seems very likely to me that we'll otherwise eventually mess up?

Collaborator Author

Yea, I agree there's room for a better API here, though it's not at all clear to me what that is. A register/write split seems like it locks the implementation into something like our queuing stuff in FS Store, but that doesn't need to be required globally. We'd probably end up with some kind of crazy intermediate state return that gets passed back to complete, which also seems pretty annoying.
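
To make the trade-off concrete, one hypothetical shape for such a split (this is
invented purely for illustration, not a proposed LDK API):

```rust
// A write is "registered" synchronously, fixing its position in the on-disk
// ordering, and the returned token is later redeemed to actually complete it. The
// token is exactly the "intermediate state return" mentioned above: every store
// implementation would be forced to produce and track one.
trait SplitKVStore {
	type PendingWrite;

	fn register_write(
		&self, primary_namespace: &str, secondary_namespace: &str, key: &str, value: Vec<u8>,
	) -> Self::PendingWrite;

	async fn complete_write(&self, pending: Self::PendingWrite) -> Result<(), std::io::Error>;
}
```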

@TheBlueMatt
Collaborator Author

CI failure appears to just be a slow-IO issue :/

joostjager
joostjager previously approved these changes Sep 22, 2025
Contributor

@tnull tnull left a comment

Feel free to squash.

@TheBlueMatt
Collaborator Author

TheBlueMatt commented Sep 22, 2025

Squashed, only additionally addressing a comment from @tnull that I'd missed.

$ git diff-tree -U1 0abc47b6c 462a6479d9
diff --git a/lightning/src/util/persist.rs b/lightning/src/util/persist.rs
index 5d3575e1ec..9036a27f49 100644
--- a/lightning/src/util/persist.rs
+++ b/lightning/src/util/persist.rs
@@ -1479,5 +1479,3 @@ mod tests {
 	// Exercise the `MonitorUpdatingPersister` with real channels and payments.
-	fn do_persister_with_real_monitors(persisters_max_pending_updates: (u64, u64)) {
-		let persister_0_max_pending_updates = persisters_max_pending_updates.0;
-		let persister_1_max_pending_updates = persisters_max_pending_updates.1;
+	fn do_persister_with_real_monitors(max_pending_updates_0: u64, max_pending_updates_1: u64) {
 		let chanmon_cfgs = create_chanmon_cfgs(4);
@@ -1487,3 +1485,3 @@ mod tests {
 			&chanmon_cfgs[0].logger,
-			persister_0_max_pending_updates,
+			max_pending_updates_0,
 			&chanmon_cfgs[0].keys_manager,
@@ -1497,3 +1495,3 @@ mod tests {
 			&chanmon_cfgs[1].logger,
-			persister_1_max_pending_updates,
+			max_pending_updates_1,
 			&chanmon_cfgs[1].keys_manager,
@@ -1546,6 +1544,6 @@ mod tests {
 					let monitor_name = mon.persistence_key();
-					let expected_updates = if persister_0_max_pending_updates == 0 {
+					let expected_updates = if max_pending_updates_0 == 0 {
 						0
 					} else {
-						mon.get_latest_update_id() % persister_0_max_pending_updates
+						mon.get_latest_update_id() % max_pending_updates_0
 					};
@@ -1564,6 +1562,6 @@ mod tests {
 					let monitor_name = mon.persistence_key();
-					let expected_updates = if persister_1_max_pending_updates == 0 {
+					let expected_updates = if max_pending_updates_1 == 0 {
 						0
 					} else {
-						mon.get_latest_update_id() % persister_1_max_pending_updates
+						mon.get_latest_update_id() % max_pending_updates_1
 					};
@@ -1592,3 +1590,3 @@ mod tests {
 		let mut sender = 0;
-		for i in 3..=persister_0_max_pending_updates * 2 {
+		for i in 3..=max_pending_updates_0 * 2 {
 			let receiver;
@@ -1635,6 +1633,5 @@ mod tests {
 		// Make sure everything is persisted as expected after close.
-		// We always send at least two payments, and loop up to persister_0_max_pending_updates *
-		// 2.
+		// We always send at least two payments, and loop up to max_pending_updates_0 * 2.
 		check_persisted_data!(
-			cmp::max(2, persister_0_max_pending_updates * 2) * EXPECTED_UPDATES_PER_PAYMENT + 1
+			cmp::max(2, max_pending_updates_0 * 2) * EXPECTED_UPDATES_PER_PAYMENT + 1
 		);
@@ -1644,5 +1641,5 @@ mod tests {
 	fn persister_with_real_monitors() {
-		do_persister_with_real_monitors((7, 3));
-		do_persister_with_real_monitors((0, 1));
-		do_persister_with_real_monitors((4, 2));
+		do_persister_with_real_monitors(7, 3);
+		do_persister_with_real_monitors(0, 1);
+		do_persister_with_real_monitors(4, 2);
 	}

Though users maybe shouldn't use `MonitorUpdatingPersister` if they
don't actually want to persist `ChannelMonitorUpdate`s, we also
shouldn't panic if `maximum_pending_updates` is set to zero.

In the coming commits `MonitorUpdatingPersister`'s internal state
will be reworked. To avoid spurious test diff, we instead use the
public API of `MonitorUpdatingPersister` rather than internal bits
in tests.

In the coming commits, we'll use the `MonitorUpdatingPersister` as
*the* way to do async monitor updating in the `ChainMonitor`.
However, to support folks who don't actually want a
`MonitorUpdatingPersister` in that case, we explicitly support them
setting `maximum_pending_updates` to 0, disabling all of the
update-writing behavior.
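
A minimal sketch of opting out of update-writing, assuming the constructor keeps its
current parameter list (KVStore, logger, `maximum_pending_updates`, entropy source,
signer provider, broadcaster, fee estimator):

```rust
// With `maximum_pending_updates` set to 0, every `ChannelMonitorUpdate` triggers a
// full `ChannelMonitor` rewrite and no per-update entries are written at all.
let persister = MonitorUpdatingPersister::new(
	kv_store, logger, 0 /* maximum_pending_updates */, entropy_source, signer_provider,
	broadcaster, fee_estimator,
);
```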

As we've done with several other structs, this adds an async
variant of `MonitorUpdatingPersister` and adds an async-sync
wrapper for those using `KVStoreSync`. Unlike with other structs,
we leave `MonitorUpdatingPersister` as the sync variant and make
the new async logic a `MonitorUpdatingPersisterAsync` as the async
monitor updating flow is still considered beta.

This does not yet expose the async monitor updating logic anywhere,
as doing a standard `Persist` async variant would not work for
ensuring the `ChannelManager` and `ChainMonitor` don't block on
async writes or suddenly require a runtime.

Pre-0.1, after a channel was closed we generated
`ChannelMonitorUpdate`s with a static `update_id` of `u64::MAX`. In
this case, when using `MonitorUpdatingPersister`, we had to read
the persisted `ChannelMonitor` to figure out what range of monitor
updates to remove from disk. However, now that we have a `list`
method there's no reason to do this anymore, we can just use that.

Simplifying code that we anticipate never hitting anymore is always
a win.
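
As a sketch of the listing-based cleanup this describes, written against a toy store
so the exact LDK `KVStore` method signatures aren't assumed:

```rust
// After the first full `ChannelMonitor` write, any leftover per-update entries for
// this monitor (including pre-0.1 `u64::MAX` closure updates) can be found by
// listing the update namespace and removed, rather than by reading the old monitor
// back just to reconstruct the range of update ids to delete.
trait ToyUpdateStore {
	fn list_update_keys(&self, monitor_key: &str) -> Vec<String>;
	fn remove_update(&self, monitor_key: &str, update_key: &str);
}

fn cleanup_stale_updates(store: &impl ToyUpdateStore, monitor_key: &str) {
	for update_key in store.list_update_keys(monitor_key) {
		store.remove_update(monitor_key, &update_key);
	}
}
```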

In the next commit we'll use this to spawn async persistence
operations in the background, but for now we just move the
`lightning-block-sync` `FutureSpawner` into `lightning`.

In the next commit we'll add the ability to use an async `KVStore`
as the backing for a `ChainMonitor`. Here we tee this up by adding
an async API to `MonitorUpdatingPersisterAsync`. It's not intended
for public use and is thus only `pub(crate)` but allows spawning
all operations via a generic `FutureSpawner` trait, initiating the
write via the `KVStore` before any `await`s (or async functions).

Because we aren't going to make the `ChannelManager` (or
`ChainMonitor`) fully async, we need a way to alert the
`ChainMonitor` when a persistence completes, but we leave that for
the next commit.

This finally adds support for full native Rust `async` persistence
to `ChainMonitor`.

Way back when, before we had any other persistence, we added the
`Persist` trait to persist `ChannelMonitor`s. It eventually grew
homegrown async persistence support via a simple immediate return
and a callback upon completion. We later added a persistence trait
in `lightning-background-processor` to persist the few fields that
it needed to drive writes for. Over time, we found more places
where persistence was useful, and we eventually added a generic
`KVStore` trait.

In dc75436 we removed the
`lightning-background-processor` `Persister` in favor of simply
using the native `KVStore` directly.

Here we continue that trend, building native `async`
`ChannelMonitor` persistence on top of our native `KVStore` rather
than hacking support for it into the `chain::Persist` trait.
Because `MonitorUpdatingPersister` already exists as a common way
to wrap a `KVStore` into a `ChannelMonitor` persister, we build
exclusively on that (though note that the "monitor updating" part
is now optional), utilizing its new async option as our native
async driver.

Thus, we end up with a `ChainMonitor::new_async_beta` which takes
a `MonitorUpdatingPersisterAsync` rather than a classic
`chain::Persist` and then operates the same as a normal
`ChainMonitor`.

While the requirement that users now use a
`MonitorUpdatingPersister` to wrap their `KVStore` before providing
it to `ChainMonitor` is somewhat awkward, as we move towards a
`KVStore`-only world it seems like `MonitorUpdatingPersister`
should eventually merge into `ChainMonitor`.

`TestStore` recently got the ability to make async writes, but it
wasn't a very useful test as all writes actually completed
immediately. Instead, here, we make the writes actually async,
forcing the test to mark writes complete as required.

We only require `Send + Sync` on the `Future`s which are spawned
in std builds, so it's weird to require them on the trait itself in
all builds. Instead, make them consistent.

We already have pretty good coverage of the
`MonitorUpdatingPersister` itself as well as `ChainMonitor`'s
update-completion handling, so here we just focus on an end-to-end
test to make sure `ChainMonitor` behaves at all correctly in the
new async mode.
@TheBlueMatt TheBlueMatt force-pushed the 2025-09-async-chainmonitor branch from 287b9c8 to 462a647 Compare September 22, 2025 18:46
@tnull tnull merged commit 32314e3 into lightningdevkit:main Sep 23, 2025
24 checks passed