Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[pulsar-sink][samza][test][build] Add Pulsar sink #388

Merged
merged 12 commits into from
May 11, 2023

Conversation

dlg99
Copy link
Contributor

@dlg99 dlg99 commented May 2, 2023

Add Pulsar sink

This PR adds Pulsar Sink (send data from Apache Pulsar to Venice).
It includes slight refactoring of Samza client/VeniceProducer to avoid use/configuration of D2.

How was this PR tested?

The change adds unit tests and an integration test.

Does this PR introduce any user-facing changes?

  • No. You can skip the rest of this section.
  • Yes. Make sure to explain your proposed changes and call out the behavior change.

It includes slight refactoring of Samza client/VeniceProducer to avoid use/configuration of D2 but it is backwards compatible.

@dlg99 dlg99 marked this pull request as draft May 2, 2023 20:24
@dlg99 dlg99 marked this pull request as ready for review May 2, 2023 22:10
Copy link
Contributor

@FelixGV FelixGV left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! Didn't scan all of it yet but just left a few comments...

Regarding Github Actions, it might be fine to leave this separate for now, but I guess eventually we would want to run it as part of the suite. The GH Actions workflow is generated by the build in the venice-test-common module, where we bucket the integration tests in order to minimize the wallclock time it takes to run the whole suite. Since the new tests leverage testcontainers, it may make sense to put them as separate (parallel) tasks in the same workflow.

build.gradle Show resolved Hide resolved
clients/venice-pulsar/build.gradle Outdated Show resolved Hide resolved
@dlg99
Copy link
Contributor Author

dlg99 commented May 2, 2023

Since the new tests leverage testcontainers, it may make sense to put them as separate (parallel) tasks in the same workflow.

@FelixGV this ^^^ was exactly my reasoning to keep this wf separate.

Copy link
Contributor

@FelixGV FelixGV left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Left a few more comments, and an idea on how to debug the issue further.

eolivelli
eolivelli previously approved these changes May 8, 2023
Copy link
Contributor

@eolivelli eolivelli left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Contributor

@nisargthakkar nisargthakkar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Only partially checked. Left very high-level comment

@dlg99 dlg99 changed the title [samza][test][build] Add Pulsar sink [pulsar-sink][samza][test][build] Add Pulsar sink May 10, 2023
@dlg99
Copy link
Contributor Author

dlg99 commented May 10, 2023

I renamed the sink, added pulsar-sink tag, renamed/updated readme.

@FelixGV
Copy link
Contributor

FelixGV commented May 11, 2023

Thanks for continuing to iterate on this @dlg99 ! It's looking pretty good now.

Only thing left would be the test coverage. IIUC, the venice-pulsar module is at 27%, failing to meet 33%. Since it's brand new code, I think we probably can push it over the threshold without too much trouble, right? You can see the jacoco report with exact lines that are covered/uncovered by downloading the artifact from the GH Action run, or by rerunning the build, e.g. something along the lines of:

./gradlew clients:venice-pulsar:test clients:venice-pulsar:diffCoverage

For the refactorings you needed to do on the Samza side, I think we could override that, given that you're still new to the project and of course it's not your fault that the Samza module has poor coverage to begin with (at least in the way coverage is measured today, which is more restrictive than in reality).

@dlg99
Copy link
Contributor Author

dlg99 commented May 11, 2023

@FelixGV I improved test coverage.
Good chunk of it with the tests for the config (get/set, and code generated by lombok, basically tests for the sake of coverage) and added expanded tests for specific failure cases. I'll leave it at that for now. CI is still running.

Samza changes tested in the integration test but this is not counted in the test coverage numbers. I can try adding unit tests but realistically this will happen mid-next week. We can as well merge this, assuming there no other comments, and do a follow up change.

@FelixGV FelixGV merged commit 0cb2594 into linkedin:main May 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants