[pulsar-sink][samza][test][build] Add Pulsar sink #388
Thanks for the PR! Didn't scan all of it yet but just left a few comments...
Regarding GitHub Actions, it might be fine to leave this separate for now, but I guess eventually we would want to run it as part of the suite. The GH Actions workflow is generated by the build in the venice-test-common module, where we bucket the integration tests in order to minimize the wallclock time it takes to run the whole suite. Since the new tests leverage testcontainers, it may make sense to put them as separate (parallel) tasks in the same workflow.
@FelixGV this ^^^ was exactly my reasoning to keep this wf separate.
Left a few more comments, and an idea on how to debug the issue further.
LGTM
Only partially checked. Left a very high-level comment.
…s extra parameter for the router URL
I renamed the sink, added |
Thanks for continuing to iterate on this @dlg99 ! It's looking pretty good now. Only thing left would be the test coverage. IIUC, the venice-pulsar module is at 27%, failing to meet 33%. Since it's brand new code, I think we probably can push it over the threshold without too much trouble, right? You can see the jacoco report with exact lines that are covered/uncovered by downloading the artifact from the GH Action run, or by rerunning the build, e.g. something along the lines of:
For the refactorings you needed to do on the Samza side, I think we could override that, given that you're still new to the project and of course it's not your fault that the Samza module has poor coverage to begin with (at least in the way coverage is measured today, which is more restrictive than in reality).
@FelixGV I improved test coverage. The Samza changes are tested in the integration test, but this is not counted in the test coverage numbers. I can try adding unit tests, but realistically this will happen mid-next week. We could also merge this, assuming there are no other comments, and do a follow-up change.
Add Pulsar sink
This PR adds Pulsar Sink (send data from Apache Pulsar to Venice).
It includes slight refactoring of Samza client/VeniceProducer to avoid use/configuration of D2.
How was this PR tested?
The change adds unit tests and an integration test.
Does this PR introduce any user-facing changes?
It includes slight refactoring of Samza client/VeniceProducer to avoid use/configuration of D2 but it is backwards compatible.