chore: epoch change wait for previous blob object GC to finish #3100

Draft
halfprice wants to merge 1 commit into main from zhewu/wait-for-object-gc

halfprice (Collaborator) commented Mar 10, 2026

Description

This PR adds a dependency to the epoch change logic: epoch change now waits for the previous epoch's GC run to finish
cleaning up the blob object info table. With this wait in place, the blob info table and the blob object info table
should be strictly consistent when answering blob.is_certified and blob.is_registered queries. This is required both
for the consistency check and for any future development that depends on a consistent blob info table.
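
The wait described above could be sketched roughly as follows. This is a std-only illustration, not the code in this PR: `CleanupTracker`, `mark_done`, `wait_for`, and `run_epoch_change` are hypothetical names, and a `Mutex` plus `Condvar` stands in for whatever async primitive the node actually uses.

```rust
use std::sync::{Arc, Condvar, Mutex};
use std::thread;
use std::time::Duration;

/// Hypothetical tracker for the latest epoch whose blob object info
/// cleanup has finished. `None` means no cleanup has completed yet.
#[derive(Default)]
struct CleanupTracker {
    last_done_epoch: Mutex<Option<u64>>,
    changed: Condvar,
}

impl CleanupTracker {
    /// Records that cleanup for `epoch` finished and wakes all waiters.
    fn mark_done(&self, epoch: u64) {
        let mut last = self.last_done_epoch.lock().unwrap();
        // Keep the stored epoch monotonic so a late report cannot move it backwards.
        if (*last).map_or(true, |done| epoch > done) {
            *last = Some(epoch);
        }
        self.changed.notify_all();
    }

    /// Blocks until cleanup for `epoch` (or a later epoch) has been recorded.
    fn wait_for(&self, epoch: u64) {
        let guard = self.last_done_epoch.lock().unwrap();
        let _guard = self
            .changed
            .wait_while(guard, |last| (*last).map_or(true, |done| done < epoch))
            .unwrap();
    }
}

/// Simulates a change to `new_epoch` (assumed >= 1) that first waits for
/// the previous epoch's cleanup, which runs on a separate thread.
fn run_epoch_change(new_epoch: u64) -> u64 {
    let tracker = Arc::new(CleanupTracker::default());
    let gc_tracker = Arc::clone(&tracker);
    let gc = thread::spawn(move || {
        thread::sleep(Duration::from_millis(20)); // stand-in for the cleanup work
        gc_tracker.mark_done(new_epoch - 1);
    });
    // Epoch change does not proceed until the previous epoch's cleanup is done.
    tracker.wait_for(new_epoch - 1);
    gc.join().unwrap();
    new_epoch
}
```

Because the completed epoch is stored rather than only signalled, a caller that starts waiting after the cleanup has already finished returns immediately instead of blocking.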

Given this requirement, blob info cleanup should always be enabled. This PR therefore updates the GarbageCollector
to forcibly set enable_blob_info_cleanup to true, regardless of the value set in the node config. The
enable_blob_info_cleanup config field is kept for now for compatibility.
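
A minimal sketch of that override, assuming a simplified config struct. Only the names GarbageCollector and enable_blob_info_cleanup come from this PR; `GarbageCollectionConfig`, its `enable_data_deletion` field, and the constructor shape are hypothetical.

```rust
/// Hypothetical subset of the node's garbage-collection config.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
struct GarbageCollectionConfig {
    /// Kept for compatibility; the collector no longer honors `false`.
    enable_blob_info_cleanup: bool,
    /// Hypothetical second setting, included to show other fields are untouched.
    enable_data_deletion: bool,
}

/// Simplified stand-in for the real collector.
struct GarbageCollector {
    config: GarbageCollectionConfig,
}

impl GarbageCollector {
    fn new(mut config: GarbageCollectionConfig) -> Self {
        // Epoch change waits on this cleanup, so it must always run,
        // whatever the operator configured.
        config.enable_blob_info_cleanup = true;
        Self { config }
    }
}
```

Overriding the flag in the constructor keeps existing config files valid while guaranteeing that the cleanup the epoch change waits on is never skipped.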

Test plan

How did you test the new or updated feature?


Release notes

Check each box that your changes affect. If none of the boxes relate to your changes, release notes aren't required.
For each box you select, include information after the relevant heading that describes the impact of your changes that
a user might notice and any actions they must take to implement updates. (Add release notes after the colon for each item)

  • Storage node:
  • Aggregator:
  • Publisher:
  • CLI:

@halfprice (Collaborator, Author)

@codex review

@chatgpt-codex-connector chatgpt-codex-connector bot left a comment

💡 Codex Review

Here are some automated review suggestions for this pull request.

Reviewed commit: 7803d5799a

ℹ️ About Codex in GitHub

Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you

  • Open a pull request for review
  • Mark a draft as ready
  • Comment "@codex review".

If Codex has suggestions, it will comment; otherwise it will react with 👍.

Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".

.await;

// Notify that blob info cleanup is done (even on error, to unblock waiters).
let _ = self.blob_info_cleanup_done.send(epoch);

P1: Persist cleanup epoch even when no waiters are subscribed

blob_info_cleanup_done is updated with watch::Sender::send, but the channel has no long-lived receiver (the initial receiver is dropped). As a result, send can fail when no task is currently waiting, in which case the epoch value is not retained. A later wait_for_blob_info_cleanup subscriber will then still see an older epoch and can block indefinitely (for example, waiting for epoch N-1 after its GC has already finished). Use an update method that always mutates the stored value (e.g. send_replace or send_if_modified), or keep a receiver alive.
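
The failure mode can be illustrated with a toy, std-only model of a single-value watch channel. `ToyWatch` is not the tokio type; it only mimics the two behaviors this comment relies on: `send` rejects the value when there are no receivers, while `send_replace` always stores it. All names in the sketch are hypothetical.

```rust
/// Toy model of a watch channel holding one `u64`. Not the tokio type.
struct ToyWatch {
    value: u64,
    receivers: usize,
}

impl ToyWatch {
    /// Starts with no live receivers, as if the initial receiver was dropped.
    fn new(initial: u64) -> Self {
        Self { value: initial, receivers: 0 }
    }

    /// Models `send`: with no receivers the value is handed back, not stored.
    fn send(&mut self, value: u64) -> Result<(), u64> {
        if self.receivers == 0 {
            return Err(value);
        }
        self.value = value;
        Ok(())
    }

    /// Models `send_replace`: always stores the value and returns the old one.
    fn send_replace(&mut self, value: u64) -> u64 {
        std::mem::replace(&mut self.value, value)
    }

    /// Models a late subscriber: registers, then reads the current value.
    fn subscribe_and_read(&mut self) -> u64 {
        self.receivers += 1;
        self.value
    }
}

/// Publishes epoch 7 while nobody is subscribed, then returns what a
/// subscriber arriving afterwards observes.
fn late_subscriber_sees(use_send_replace: bool) -> u64 {
    let mut cleanup_done = ToyWatch::new(0);
    if use_send_replace {
        cleanup_done.send_replace(7);
    } else {
        // Error ignored, mirroring `let _ = ...send(epoch)` in the snippet above.
        let _ = cleanup_done.send(7);
    }
    cleanup_done.subscribe_and_read()
}
```

In this model, `late_subscriber_sees(false)` yields the stale initial epoch 0 and `late_subscriber_sees(true)` yields 7, which is the difference between a waiter that can block indefinitely and one that returns at once.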

Useful? React with 👍 / 👎.
