feat(snapshot-backfill): implement executor to consume upstream table #20167
license-eye has checked 5555 files.
Valid | Invalid | Ignored | Fixed |
---|---|---|---|
2347 | 1 | 3207 | 0 |
Click to see the invalid file list
- src/stream/src/executor/backfill/snapshot_backfill/consume_upstream/upstream_table_trait.rs
Use this command to fix any missing license headers
```bash
docker run -it --rm -v $(pwd):/github/workspace apache/skywalking-eyes header fix
```
Any description?
Added the PR description. @kwannoel @hzxa21 @yezizp2012 @st1page PTAL
I hereby agree to the terms of the RisingWave Labs, Inc. Contributor License Agreement.
What's changed and what's your intention?
This PR is for supporting the cross db streaming query described in #19631.
In this PR, we implement a `ConsumeUpstreamStream` to consume the upstream table epoch by epoch and yield `StreamChunk`. Inside `ConsumeUpstreamStream`, we use the `VnodeStream` implemented in #19936 to consume each vnode, and with it we are able to get the latest progress of each vnode at any time. After it finishes one epoch, it will call `next_epoch` to get the next epoch and create a new `VnodeStream` to consume the new epoch.

An `UpstreamTableExecutor` will poll the `ConsumeUpstreamStream` and the `barrier_rx` concurrently. We use the backfill progress state introduced in #19720 to track the progress state. When receiving a new barrier, the executor will inspect the latest progress of `ConsumeUpstreamStream`, write the progress state, yield the barrier, and then continue consuming the `ConsumeUpstreamStream`. On a vnode bitmap update, we will recreate the `ConsumeUpstreamStream` for the new vnode bitmap. The logic of the executor is like the following:

Ideally, the `ConsumeUpstreamStream` should be implemented with the `try_stream` macro, so that the state machine of async execution can be generated automatically by the Rust compiler. However, we need to be able to access the progress of the stream at the time we receive a barrier, but the stream generated by the `try_stream` macro will take ownership of the internal `VnodeStream`, and then we won't be able to get the latest progress of the ongoing `VnodeStream`. Therefore, in this PR, we implement the state machine ourselves. The state machine is like the following:

The states represent the await points in the following code
Note that in each state, we need to store the progress of all vnodes owned by the executor even if the progress is made in the previous epoch, so that the previous progress won't be lost.
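As a rough illustration of the hand-written state machine idea, here is a minimal, synchronous, std-only Rust sketch. It is not the code from this PR: `ConsumeSketch`, `State`, `Output`, `Log` and `rows_of` are hypothetical names, rows are plain `i64` values instead of `StreamChunk`, and the in-memory `Log` stands in for the upstream table. The point it tries to show is that, because the state is an explicit enum owned by the struct, the caller can read `progress()` between any two `step()` calls (for example, when a barrier arrives), and that progress made in an earlier epoch stays in the map.

```rust
use std::collections::{BTreeMap, VecDeque};

type Vnode = u16;
type Epoch = u64;
/// Hypothetical stand-in for the upstream table: epoch -> vnode -> rows.
type Log = BTreeMap<Epoch, BTreeMap<Vnode, Vec<i64>>>;

#[derive(Debug, PartialEq)]
enum Output {
    Row(Vnode, i64),
    EpochFinished(Epoch),
}

/// One variant per point where an async version would be awaiting.
enum State {
    /// Reading the rows of `epoch`; `remaining` is what is left to yield.
    Consuming { epoch: Epoch, remaining: VecDeque<(Vnode, i64)> },
    /// Waiting for the epoch that follows `prev_epoch`.
    ResolvingNextEpoch { prev_epoch: Epoch },
    Finished,
}

struct ConsumeSketch {
    log: Log,
    state: State,
    /// vnode -> (epoch of latest progress, rows consumed in that epoch).
    /// Entries from earlier epochs are kept until the vnode advances again.
    progress: BTreeMap<Vnode, (Epoch, usize)>,
}

/// Flatten the rows of one epoch into (vnode, row) pairs, in vnode order.
fn rows_of(log: &Log, epoch: Epoch) -> VecDeque<(Vnode, i64)> {
    let mut out = VecDeque::new();
    if let Some(per_vnode) = log.get(&epoch) {
        for (vnode, rows) in per_vnode {
            for row in rows {
                out.push_back((*vnode, *row));
            }
        }
    }
    out
}

impl ConsumeSketch {
    fn new(log: Log, start_epoch: Epoch) -> Self {
        let remaining = rows_of(&log, start_epoch);
        let state = State::Consuming { epoch: start_epoch, remaining };
        Self { log, state, progress: BTreeMap::new() }
    }

    /// Progress is readable at any time because the struct owns its state.
    fn progress(&self) -> &BTreeMap<Vnode, (Epoch, usize)> {
        &self.progress
    }

    /// Advance the state machine by one output item.
    fn step(&mut self) -> Option<Output> {
        match std::mem::replace(&mut self.state, State::Finished) {
            State::Consuming { epoch, mut remaining } => match remaining.pop_front() {
                Some((vnode, row)) => {
                    let entry = self.progress.entry(vnode).or_insert((epoch, 0));
                    if entry.0 != epoch {
                        *entry = (epoch, 0);
                    }
                    entry.1 += 1;
                    self.state = State::Consuming { epoch, remaining };
                    Some(Output::Row(vnode, row))
                }
                None => {
                    self.state = State::ResolvingNextEpoch { prev_epoch: epoch };
                    Some(Output::EpochFinished(epoch))
                }
            },
            State::ResolvingNextEpoch { prev_epoch } => {
                let next = self.log.range((prev_epoch + 1)..).next().map(|(e, _)| *e);
                match next {
                    Some(epoch) => {
                        let remaining = rows_of(&self.log, epoch);
                        self.state = State::Consuming { epoch, remaining };
                        self.step()
                    }
                    // No later epoch in this sketch: the state stays `Finished`.
                    None => None,
                }
            }
            State::Finished => None,
        }
    }
}
```

In this sketch an executor-like caller would alternate between `step()` and checking for a barrier; on a barrier it would read `progress()` and persist it before continuing. A stream produced by a macro such as `try_stream` would hide the equivalent of `progress` inside the generated future, which is the limitation described above.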
To allow for future extension, the input of `ConsumeUpstreamStream` can be any type that implements the following `UpstreamTable` trait.

In this way, we can easily reuse the logic for different ways of consuming an upstream table, such as consuming the subscription of another RisingWave cluster, as long as we implement this trait for it.
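To make the benefit of a trait-based input concrete, the sketch below shows the general pattern in std-only Rust. It does not reproduce the PR's `UpstreamTable` trait: `UpstreamSketch`, its two methods, `LocalTable`, `RemoteSubscription` and `consume_all` are all hypothetical, and the methods are synchronous for simplicity, whereas the real trait is presumably asynchronous. Only the idea of a `next_epoch`-style call is taken from the description above.

```rust
use std::collections::BTreeMap;

type Vnode = u16;
type Epoch = u64;

/// Hypothetical, simplified stand-in for an upstream abstraction.
trait UpstreamSketch {
    /// The epoch that follows `epoch`, if any.
    fn next_epoch(&self, epoch: Epoch) -> Option<Epoch>;
    /// Rows of one vnode in one epoch.
    fn rows(&self, vnode: Vnode, epoch: Epoch) -> Vec<i64>;
}

/// Assumed local source: epoch -> vnode -> rows.
struct LocalTable {
    log: BTreeMap<Epoch, BTreeMap<Vnode, Vec<i64>>>,
}

impl UpstreamSketch for LocalTable {
    fn next_epoch(&self, epoch: Epoch) -> Option<Epoch> {
        self.log.range((epoch + 1)..).next().map(|(e, _)| *e)
    }

    fn rows(&self, vnode: Vnode, epoch: Epoch) -> Vec<i64> {
        self.log
            .get(&epoch)
            .and_then(|per_vnode| per_vnode.get(&vnode))
            .cloned()
            .unwrap_or_default()
    }
}

/// Assumed remote source: a flat list of (epoch, vnode, row) events.
struct RemoteSubscription {
    events: Vec<(Epoch, Vnode, i64)>,
}

impl UpstreamSketch for RemoteSubscription {
    fn next_epoch(&self, epoch: Epoch) -> Option<Epoch> {
        self.events.iter().map(|ev| ev.0).filter(|e| *e > epoch).min()
    }

    fn rows(&self, vnode: Vnode, epoch: Epoch) -> Vec<i64> {
        self.events
            .iter()
            .filter(|ev| ev.0 == epoch && ev.1 == vnode)
            .map(|ev| ev.2)
            .collect()
    }
}

/// Consumption logic written once against the trait, epoch by epoch.
fn consume_all<U: UpstreamSketch>(
    upstream: &U,
    vnodes: &[Vnode],
    start_epoch: Epoch,
) -> Vec<(Epoch, Vnode, i64)> {
    let mut out = Vec::new();
    let mut epoch = Some(start_epoch);
    while let Some(e) = epoch {
        for &vnode in vnodes {
            for row in upstream.rows(vnode, e) {
                out.push((e, vnode, row));
            }
        }
        epoch = upstream.next_epoch(e);
    }
    out
}
```

Here `consume_all` works unchanged with either source, which is the kind of reuse the trait is meant to enable.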
Checklist
Documentation
Release note