kvserver: use Pebble snapshot for catchup scans #156303
base: master
Conversation
The snapshot is used to create an iterator, which is recreated based on the storage.snapshot.recreate_iter_duration cluster setting, which defaults to 20s.

These are mostly plumbing changes, except for catchup_scan.go.

Fixes cockroachdb#133851

Epic: none

Release note (ops change): The cluster setting storage.snapshot.recreate_iter_duration (default 20s) controls how frequently a long-lived engine iterator, backed by an engine snapshot, will be closed and recreated. Currently, it is only used for iterators used in rangefeed catchup scans.
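To illustrate the recreation pattern this describes, here is a minimal sketch, assuming simplified stand-in `Snapshot` and `Iterator` interfaces; the names and types below are illustrative only and are not the actual storage package APIs used by the PR:

```go
package catchup

import "time"

// Sketch only: Snapshot and Iterator stand in for the real storage engine
// interfaces; they are not the actual CockroachDB APIs.
type Snapshot interface {
	NewIter(lowerBound []byte) (Iterator, error)
}

type Iterator interface {
	SeekGE(key []byte)
	Close()
}

// catchupIter holds an engine snapshot for the lifetime of a catchup scan and
// periodically swaps out the iterator built on top of it, so that no single
// iterator stays open for the whole scan.
type catchupIter struct {
	snap          Snapshot
	iter          Iterator
	recreateEvery time.Duration // plays the role of storage.snapshot.recreate_iter_duration (default 20s)
	lastCreated   time.Time
}

// maybeRecreate closes and reopens the iterator over the same snapshot once
// enough time has elapsed, then seeks back to the key where the scan resumes.
func (c *catchupIter) maybeRecreate(resumeKey []byte) error {
	if c.recreateEvery <= 0 || time.Since(c.lastCreated) < c.recreateEvery {
		return nil
	}
	c.iter.Close()
	iter, err := c.snap.NewIter(resumeKey)
	if err != nil {
		return err
	}
	c.iter = iter
	c.lastCreated = time.Now()
	c.iter.SeekGE(resumeKey)
	return nil
}
```

Because every new iterator is opened against the same snapshot, the scan continues to observe a consistent view of the data even as the underlying iterator is recycled.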
@arulajmani @stevendanna this is the same code as in #154412, which was rolled back; I'm re-landing it following the Pebble fix to the snapshot and excise interaction. I noticed there has been some refactoring in this area since that change, so you may want to take a quick look to ensure I haven't messed with the spirit of that refactor.
Thanks for following up here.
pkg/kv/kvserver/rangefeed/catchup_scan.go line 258 at r1 (raw file):
```go
// iterator, and using the cpu time to amortize that cost seems
// reasonable.
if (readmitted && i.iterRecreateDuration > 0) || util.RaceEnabled {
```
@stevendanna @wenyihu6
If we have range-key [a, d) and point keys a, b, c, it is now allowable for us to recreate the iterator at b and c. So the first, second, and third iterators will each see part of that range key, which will result in us emitting three kvpb.RangeFeedDeleteRange events with spans [a, d), [b, d), and [c, d) respectively.
Two questions:
- Is that ok behavior from a correctness perspective?
- Is there an existing rangefeed/changefeed/... test that has a range key with multiple overlapping point keys? If so, which test? Given we always recreate under RaceEnabled, I want to see how that test behaves with this change.
pkg/kv/kvserver/rangefeed/catchup_scan.go line 258 at r1 (raw file):
Previously, sumeerbhola wrote…
@stevendanna @wenyihu6
If we have range-key [a, d) and point keys a, b, c, it is now allowable for us to recreate the iterator at b and c. So the first, second, and third iterators will each see part of that range key, which will result in us emitting three kvpb.RangeFeedDeleteRange events with spans [a, d), [b, d), and [c, d) respectively.
Two questions:
- Is that ok behavior from a correctness perspective?
- Is there an existing rangefeed/changefeed/... test that has a range key with multiple overlapping point keys? If so, which test? Given we always recreate under RaceEnabled, I want to see how that test behaves with this change.
Hrmmmm. This is a good callout.
- I don't see why it wouldn't be OK from a correctness perspective. But my understanding is that it would have been valid for us to see either set of range keys anyway. At least, we've assumed this is the case in export requests. I'll keep thinking about this.
- At the moment changefeeds never see range keys because the only operations that write range keys are operations that first take the relevant table offline. However, we do expect to see these in PCR. I think the closest test to what you describe is TestWithOnDeleteRange in pkg/kv/kvclient/rangefeed/rangefeed_external_test.go. We might need to modify it to have more overlaps.
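For what it's worth, the truncation described in the question above amounts to clamping the emitted span's start key to wherever the recreated iterator resumes. A minimal sketch, assuming a simplified `Span` type; the function is hypothetical and not the PR's code:

```go
package catchup

import "bytes"

// Span is a simplified [Key, EndKey) span, standing in for roachpb.Span.
type Span struct {
	Key, EndKey []byte
}

// truncateRangeKey clamps a range key's span so it starts no earlier than the
// key at which a recreated iterator resumes. With range key [a, d) and resume
// keys a, b, and c, the three iterators would emit [a, d), [b, d), and [c, d).
func truncateRangeKey(rangeKey Span, resumeKey []byte) Span {
	if bytes.Compare(resumeKey, rangeKey.Key) > 0 {
		return Span{Key: resumeKey, EndKey: rangeKey.EndKey}
	}
	return rangeKey
}
```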