-
-
Notifications
You must be signed in to change notification settings - Fork 0
Rust Arroyo adapter #98
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
9f63abb
to
10e66f7
Compare
bd59e6f
to
95dfcec
Compare
sentry_streams/src/operators.rs
Outdated
}, | ||
} | ||
|
||
/* |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is this supposed to be removed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah. It was a previous implementation. I forgot to remove.
Since the PR is fairly large, I think I'm having a bit of a hard time understanding how all the files in |
) | ||
|
||
|
||
class RustArroyoAdapter(StreamAdapter[Route, Route]): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't know if I really understand why we have this Adapter? At a high level, does this accomplish something different than ArroyoAdapter in streams/?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Conceptually it does not do much differently. Though I created a different one because the data structures involved are quite different:
- ArroyoStep does not exist, it is replaced by a Rust enum.
- ArroyoConsumer is replaced by a rust one, but its interface in Rust is quite different.
- Arroyo Consumer and Producers have to be created by the Rust code because they are the rust ones so I cannot use the logic that creates the python ones. These are Arroyo structs, they cannot be exposed to python.
- I cannot expose the StrategyFactory to python (that's an arroyo interface) so everything has to be hidden behind the
run
method in theArroyoConsumer
Rust structure.
All in all it could be possible to merge the logic into the python Arroyo adapter but the result would be very brittle as the interfaces to deal with between python and Rust are very different.
The logic is more clear this way.
Done. See the PR description |
A basic Rust Arroyo adapter to urn streaming pipelines.
The basic idea is to have the runtime built in rust and able to run either rust or python
application logic on top of it. Potentially mixing up rust and python on the same pipeline
on the same consumer should be possible.
The adapter is built with pyo3 and maturin and it is packaged inside the sentry_stream
python package as a binary.
The python runner is still the entrypoint. All the pipeline translation logic is still the same
and it is in python so only the basic primitives have to be written in rust.
At this stage only source, sink and map are provided, as a followup there will be a
basic rust Arroyo strategy that runs pyhon code so we can make python primitives
immediately runnable in rust and port them to rust native when we want. Still the
primitives will be implemented only once.
On top of this work we can define how to fully integrate Rust code in the pipeline.
Work to be done after this PR.
Run it this way
Architecture:
ArroyoConsumer
(https://github.com/getsentry/streams/blob/main/sentry_streams/sentry_streams/adapters/arroyo/consumer.py#L25) but in Rust. This struct accumulates streaming primitives, reverses the order and builds a ProcessingStrategy. The ProcessingStrategy is an Arroyo Rust one.InitialOffset
,PykafkaConsumerConfig
,PyKafkaProducerConfig
,Route
, etc.RuntimeOperator
enum. It serves a similar purposeArroyoStep
https://github.com/getsentry/streams/blob/main/sentry_streams/sentry_streams/adapters/arroyo/steps.py#L29 does. Except this is served better by an enum in rust.add_step
method the cool stuff happens in therun
method: https://github.com/getsentry/streams/pull/98/files#diff-3d009ee5960444519bcb6053ac4ae9d659e392e0a5f80bd1a6f4f16e9a681b82R77. This assembles the Arroyo strategies (in rust) and produces a strategy factory, then it runs it.