
Conversation

@sumeerbhola (Collaborator)
The snapshot is used to create an iterator, which is periodically closed and recreated based on the storage.snapshot.recreate_iter_duration cluster setting (default 20s).

These are mostly plumbing changes, except for catchup_scan.go.

Fixes #133851

Epic: none

Release note (ops change): The cluster setting
storage.snapshot.recreate_iter_duration (default 20s) controls how frequently a long-lived engine iterator, backed by an engine snapshot, will be closed and recreated. Currently, it is only used for iterators used in rangefeed catchup scans.

@sumeerbhola sumeerbhola requested review from a team as code owners October 27, 2025 18:51
@sumeerbhola sumeerbhola requested review from RaduBerinde, asg0451, kev-cao and mgartner and removed request for a team October 27, 2025 18:51
@cockroach-teamcity (Member)

This change is Reviewable

@sumeerbhola (Collaborator, Author) left a comment

@arulajmani @stevendanna this is the same code as in #154412, which was rolled back, after the Pebble fix to the snapshot and excise interaction. I noticed there has been some refactoring in this area since that change, so you may want to take a quick look to ensure I haven't messed with the spirit of that refactor.

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani, @asg0451, @kev-cao, @mgartner, @RaduBerinde, and @stevendanna)

@stevendanna (Collaborator) left a comment

Thanks for following up here.

@sumeerbhola (Collaborator, Author) left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani, @asg0451, @kev-cao, @mgartner, and @RaduBerinde)


pkg/kv/kvserver/rangefeed/catchup_scan.go line 258 at r1 (raw file):

			// iterator, and using the cpu time to amortize that cost seems
			// reasonable.
			if (readmitted && i.iterRecreateDuration > 0) || util.RaceEnabled {

@stevendanna @wenyihu6
If we have range key [a, d) and point keys a, b, c, it is now allowable for us to recreate the iterator at b and c. The first, second, and third iterators will all see part of that range key, which will result in us emitting three kvpb.RangeFeedDeleteRange events with spans [a, d), [b, d), and [c, d) respectively.
Two questions:

  • Is that ok behavior from a correctness perspective?
  • Is there an existing rangefeed/changefeed/... test that has a range key with multiple overlapping point keys. If yes, what test? Given we always recreate under RaceEnabled, I want to see how that test behaves with this change.
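The overlapping-span behavior raised in the question above can be sketched as follows. The span struct and truncateRangeKey function are hypothetical stand-ins (the real code works with roachpb.Span and MVCC range keys); the sketch only shows how recreating the iterator at successive point keys clamps the start of the range key's span while leaving its end fixed.

```go
package main

import "fmt"

// span is a hypothetical [key, endKey) span, standing in for roachpb.Span.
type span struct{ key, endKey string }

// truncateRangeKey clamps the start of a range key's span to the
// iterator's resume position, mirroring how a recreated catchup-scan
// iterator re-encounters only the suffix of a range key it had already
// partially emitted.
func truncateRangeKey(rangeKey span, resumeKey string) span {
	if resumeKey > rangeKey.key {
		return span{key: resumeKey, endKey: rangeKey.endKey}
	}
	return rangeKey
}

func main() {
	rk := span{key: "a", endKey: "d"}
	// Recreating the iterator at a, then b, then c yields three
	// overlapping DeleteRange-style spans rather than a single [a, d).
	for _, resume := range []string{"a", "b", "c"} {
		fmt.Println(truncateRangeKey(rk, resume))
	}
}
```

The three emitted spans all share the end key d, so a consumer that treats each event as "everything in this span was deleted" sees redundant but not contradictory information; that is the correctness question posed above.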

@stevendanna (Collaborator) left a comment

Reviewable status: :shipit: complete! 0 of 0 LGTMs obtained (waiting on @arulajmani, @asg0451, @kev-cao, @mgartner, @RaduBerinde, and @wenyihu6)


pkg/kv/kvserver/rangefeed/catchup_scan.go line 258 at r1 (raw file):

Previously, sumeerbhola wrote…

@stevendanna @wenyihu6
If we have range key [a, d) and point keys a, b, c, it is now allowable for us to recreate the iterator at b and c. The first, second, and third iterators will all see part of that range key, which will result in us emitting three kvpb.RangeFeedDeleteRange events with spans [a, d), [b, d), and [c, d) respectively.
Two questions:

  • Is that ok behavior from a correctness perspective?
  • Is there an existing rangefeed/changefeed/... test that has a range key with multiple overlapping point keys. If yes, what test? Given we always recreate under RaceEnabled, I want to see how that test behaves with this change.

Hrmmmm. This is a good callout.

  • I don't see why it wouldn't be OK from a correctness perspective. My understanding is that it would have been valid for us to see either set of range keys anyway; at least, we've assumed this is the case in export requests. I'll keep thinking about this.
  • At the moment changefeeds never see range keys because the only operations that write range keys are operations that first take the relevant table offline. However, we do expect to see these in PCR. I think the closest test to what you describe is TestWithOnDeleteRange in pkg/kv/kvclient/rangefeed/rangefeed_external_test.go. We might need to modify it to have more overlaps.

@mgartner mgartner removed their request for review October 28, 2025 20:40


Development

Successfully merging this pull request may close these issues.

kv/rangefeed: use snapshots instead of iterators for rangefeed catchup scans
