@kim kim commented Sep 26, 2025

Report more metrics about snapshot compression, namely:

  • time to compress a single snapshot (histogram)
  • for each compression pass:
    • number of snapshots found to be already compressed (gauge)
    • number of snapshots compressed (gauge)
    • cumulative number of objects compressed (gauge)
    • cumulative number of objects hardlinked (gauge)

Those metrics are collected from the `spacetimedb-snapshot` crate
without imposing a prometheus dependency on it, i.e. they can be
observed by the caller as ordinary Rust types.
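
To illustrate "ordinary Rust types", here is a minimal sketch of what a per-pass stats value could look like. `CompressionStats`, `Outcome`, `record` and all field names are made up for this example and are not taken from the `spacetimedb-snapshot` crate; the caller would forward the counters to its gauges and the durations to its histogram.

```rust
use std::time::Duration;

/// Hypothetical per-pass statistics. Plain data, no metrics library involved.
#[derive(Debug, Default, Clone, PartialEq, Eq)]
pub struct CompressionStats {
    /// Snapshots that were found to be compressed already.
    pub already_compressed: u64,
    /// Snapshots compressed during this pass.
    pub compressed: u64,
    /// Objects compressed, summed over all snapshots of this pass.
    pub objects_compressed: u64,
    /// Objects hardlinked, summed over all snapshots of this pass.
    pub objects_hardlinked: u64,
    /// Time spent on each snapshot compressed in this pass.
    pub durations: Vec<Duration>,
}

/// Hypothetical outcome of visiting a single snapshot.
pub enum Outcome {
    AlreadyCompressed,
    Compressed {
        objects_compressed: u64,
        objects_hardlinked: u64,
        elapsed: Duration,
    },
}

impl CompressionStats {
    /// Fold the outcome of one snapshot into the stats of the pass.
    pub fn record(&mut self, outcome: Outcome) {
        match outcome {
            Outcome::AlreadyCompressed => self.already_compressed += 1,
            Outcome::Compressed {
                objects_compressed,
                objects_hardlinked,
                elapsed,
            } => {
                self.compressed += 1;
                self.objects_compressed += objects_compressed;
                self.objects_hardlinked += objects_hardlinked;
                self.durations.push(elapsed);
            }
        }
    }
}
```

Keeping the type free of any metrics library means the crate stays usable by callers that report elsewhere, or not at all.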

This is exploited to avoid scanning the entire snapshot repository on
each pass -- only the range `(last_compressed + 1)..newest_snapshot` is
visited (note that the `compress_snapshots` method now short-circuits on
errors).
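
A rough sketch of the range logic, under assumptions: `compress_range` and its parameters are hypothetical, snapshots are identified by a `u64`, and `existing` is assumed to be sorted ascending. It is not the signature of `compress_snapshots`. Progress is kept in `last_compressed` so that a pass which stops on an error resumes at the failed snapshot next time.

```rust
/// Visit only the snapshots in `(last_compressed + 1)..newest_snapshot`,
/// stopping at the first error. `existing` lists known snapshot ids in
/// ascending order. The half-open range leaves the newest snapshot alone.
pub fn compress_range<E>(
    existing: &[u64],
    last_compressed: &mut Option<u64>,
    newest_snapshot: u64,
    mut compress_one: impl FnMut(u64) -> Result<(), E>,
) -> Result<(), E> {
    // Nothing compressed yet means we start from the very beginning.
    let start = last_compressed.map_or(0, |last| last.saturating_add(1));
    let range = start..newest_snapshot;
    for &id in existing.iter().filter(|id| range.contains(*id)) {
        // `?` short-circuits: later snapshots are not visited after an error.
        compress_one(id)?;
        // Only advance the marker once the snapshot was handled successfully.
        *last_compressed = Some(id);
    }
    Ok(())
}
```

With snapshots `10, 20, 30, 40, 50`, `last_compressed = Some(20)` and `newest_snapshot = 50`, only `30` and `40` are visited; if `40` fails, the marker stays at `30`.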

Lastly, the snapshot worker can now be configured to disable
compression. This greatly simplifies implementation of alternative
post-processing strategies, e.g. involving archival, for which a more
coarse-grained compression strategy may be more appropriate.

Subscribers are notified of a new snapshot after compression, so that
any filesystem locks should already be released.
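
The ordering described in the last two paragraphs can be sketched as follows. This is only an illustration: `Compression`, `after_snapshot` and the use of `std::sync::mpsc` as the notification channel are assumptions of the example, not the worker's actual types.

```rust
use std::sync::mpsc::Sender;

/// Hypothetical switch for the worker's built-in compression.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Compression {
    Enabled,
    Disabled,
}

/// Post-process a freshly taken snapshot, then tell subscribers about it.
pub fn after_snapshot(
    snapshot_offset: u64,
    compression: Compression,
    compress_older: impl FnOnce(u64),
    subscribers: &[Sender<u64>],
) {
    // Compression (if enabled) runs first and finishes before anyone is told.
    if compression == Compression::Enabled {
        compress_older(snapshot_offset);
    }
    for tx in subscribers {
        // A subscriber that has gone away is not an error for the worker.
        let _ = tx.send(snapshot_offset);
    }
}
```

With `Compression::Disabled`, a subscriber receives the notification right away and can apply its own post-processing, e.g. archival.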

Expected complexity level and risk

2

Testing

May need some, I'm pondering.

The `DurabilityProvider` trait was introduced to enable the
`HostController` to procure an alternative `Durability` impl from an
external source.

It is also useful to be able to instantiate a `SnapshotWorker`
externally, in order to subscribe to snapshot creation events without
access to the `RelationalDB` instance it is operating on.

At a later stage, we may also use it to control the snapshot frequency
externally.

This patch thus reframes the trait as `PersistenceProvider`, whose job
is to provide persistence-related services.
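
For readers unfamiliar with the trait, a hypothetical sketch of the idea follows. Only the names `PersistenceProvider`, `Durability` and `SnapshotWorker` come from the description above; every method, field and helper type (`Persistence`, `NoSnapshots`, `Noop`) is invented for illustration and does not reflect the actual signatures.

```rust
use std::sync::Arc;

/// Stand-in for the real `Durability` abstraction; the method is hypothetical.
pub trait Durability: Send + Sync {
    fn name(&self) -> &str;
}

/// Stand-in for the real snapshot worker; its internals are not modelled.
pub struct SnapshotWorker;

/// Hypothetical bundle of the persistence services for one database.
pub struct Persistence {
    pub durability: Arc<dyn Durability>,
    /// `None` if the provider does not take snapshots for this database.
    pub snapshots: Option<SnapshotWorker>,
}

/// Hypothetical shape of the reframed trait: a single provider hands out
/// all persistence-related services, not just a `Durability` impl.
pub trait PersistenceProvider {
    fn persistence(&self, database: &str) -> Persistence;
}

/// Example provider: a no-op durability and no snapshot worker.
pub struct NoSnapshots;

struct Noop;

impl Durability for Noop {
    fn name(&self) -> &str {
        "noop"
    }
}

impl PersistenceProvider for NoSnapshots {
    fn persistence(&self, _database: &str) -> Persistence {
        Persistence {
            durability: Arc::new(Noop),
            snapshots: None,
        }
    }
}
```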

Also separates snapshot creation and compression of older snapshots, and
adds instrumentation to gather timing information for both.

kim commented Sep 26, 2025

Stacked on top of #3295

@kim kim force-pushed the kim/snapshot-compression-stats branch 3 times, most recently from 84b088f to c838ed5 Compare September 26, 2025 07:31
@kim kim force-pushed the kim/snapshot-compression-stats branch from c838ed5 to 89779b6 Compare September 26, 2025 08:55
@bfops bfops added the release-any To be landed in any release window label Sep 29, 2025
Base automatically changed from kim/persistence-provider to master October 1, 2025 09:12
@kim kim enabled auto-merge October 1, 2025 11:00
@kim kim added this pull request to the merge queue Oct 1, 2025
@github-merge-queue github-merge-queue bot removed this pull request from the merge queue due to failed status checks Oct 1, 2025
@kim kim added this pull request to the merge queue Oct 1, 2025
Merged via the queue into master with commit ffaf791 Oct 1, 2025
25 of 26 checks passed
@kim kim deleted the kim/snapshot-compression-stats branch October 1, 2025 12:31