-
Notifications
You must be signed in to change notification settings - Fork 3.2k
Use system-provided cyrus-sasl/libsasl2 at runtime #5168
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
|
🎉 All Contributor License Agreements have been signed. Ready to merge. |
a740b69 to
07e9f8e
Compare
07e9f8e to
9f54d07
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR implements runtime loading of the Cyrus SASL library (libsasl2) on Unix systems to enable SASL GSSAPI mechanism support without requiring static linking at build time. This change allows Python wheels to be distributed on PyPI while maintaining GSSAPI functionality and addresses ABI compatibility issues between different Linux distributions.
Key Changes
- Dynamic loading of
libsasl2at runtime usingdlopeninstead of build-time linking - Removal of build dependencies on
libsasl2across all packaging configurations - Simplification of build artifacts by eliminating separate GSSAPI and non-GSSAPI builds
Reviewed Changes
Copilot reviewed 25 out of 25 changed files in this pull request and generated 3 comments.
Show a summary per file
| File | Description |
|---|---|
src/rdkafka_sasl_cyrus.c |
Implements dynamic loading of libsasl2 with function pointer resolution and error handling |
src/rdkafka_sasl_int.h |
Adds new API functions for library status checking |
src/rdkafka_sasl.c |
Updates provider selection to check runtime library availability |
src/rdkafka_conf.c |
Updates error message for missing library support |
src/CMakeLists.txt |
Removes static linking to libsasl2 |
configure.self |
Removes libsasl2 build dependency configuration |
| Various packaging files | Removes libsasl2 dependencies and GSSAPI build variants |
Tip: Customize your code reviews with copilot-instructions.md. Create the file or learn how to get started.
9f54d07 to
4a87be9
Compare
|
Hi @emasab, could you please review this PR? |
- G-Research#3 (CI/CD script) - confluentinc#4972 (Avoid unnecessary producer epoch bumps) - confluentinc#4989 (Fully utilize the max.in.flight.requests.per.connection parameter on the idempotent producer) - confluentinc#4972 (Avoid unnecessary producer epoch bumps) - confluentinc#4989 (Fully utilize the max.in.flight.requests.per.connection parameter on the idempotent producer) - confluentinc#5168 (Use system-provided cyrus-sasl/libsasl2 at runtime)
- G-Research#3 (CI/CD script) - confluentinc#4972 (Avoid unnecessary producer epoch bumps) - confluentinc#4989 (Fully utilize the max.in.flight.requests.per.connection parameter on the idempotent producer) - confluentinc#5168 (Use system-provided cyrus-sasl/libsasl2 at runtime)
- #3 (CI/CD script) - confluentinc#4972 (Avoid unnecessary producer epoch bumps) - confluentinc#4989 (Fully utilize the max.in.flight.requests.per.connection parameter on the idempotent producer) - confluentinc#5168 (Use system-provided cyrus-sasl/libsasl2 at runtime)
- G-Research#3 (CI/CD script) - confluentinc#4972 (Avoid unnecessary producer epoch bumps) - confluentinc#4989 (Fully utilize the max.in.flight.requests.per.connection parameter on the idempotent producer) - confluentinc#5168 (Use system-provided cyrus-sasl/libsasl2 at runtime)
- G-Research#3 (CI/CD script) - confluentinc#4972 (Avoid unnecessary producer epoch bumps) - confluentinc#4989 (Fully utilize the max.in.flight.requests.per.connection parameter on the idempotent producer) - confluentinc#5168 (Use system-provided cyrus-sasl/libsasl2 at runtime)
- G-Research#3 (CI/CD script) - confluentinc#4972 (Avoid unnecessary producer epoch bumps) - confluentinc#4989 (Fully utilize the max.in.flight.requests.per.connection parameter on the idempotent producer) - confluentinc#5168 (Use system-provided cyrus-sasl/libsasl2 at runtime)
- G-Research#3 (CI/CD script) - confluentinc#4972 (Avoid unnecessary producer epoch bumps) - confluentinc#4989 (Fully utilize the max.in.flight.requests.per.connection parameter on the idempotent producer) - confluentinc#5168 (Use system-provided cyrus-sasl/libsasl2 at runtime)
Instead of having multiple builds with and without `libsasl2` on Linux, one of which cannot be redistributed within Python wheels, we use `dlopen` to load `libsasl2` at runtime on Unix. The SASL GSSAPI mechanism availability is thus checked at runtime. Because of differences in soname ABI bumps between Debian-based and RPM-based distros, the previous SASL builds did not work on Debian-based systems. This is also solved in this change, by probing the various known names for `libsasl2`.
|
This is causing us some pain due to environment differences at the moment. Any chance this can get looked at soon? |
4a87be9 to
99d647e
Compare
- G-Research#3 (CI/CD script) - confluentinc#4972 (Avoid unnecessary producer epoch bumps) - confluentinc#4989 (Fully utilize the max.in.flight.requests.per.connection parameter on the idempotent producer) - confluentinc#5168 (Use system-provided cyrus-sasl/libsasl2 at runtime)
- #3 (CI/CD script) - confluentinc#4972 (Avoid unnecessary producer epoch bumps) - confluentinc#4989 (Fully utilize the max.in.flight.requests.per.connection parameter on the idempotent producer) - confluentinc#5168 (Use system-provided cyrus-sasl/libsasl2 at runtime)
|
Hi @jgiannuzzi . Thanks for contributing to librdkafka! Our supported way of using libsasl2 is by linking librdkafka dynamically, so you install our JS: |
|
Thanks for your comment @emasab! The method you described does not work for a user install sadly, and requires the additional ecosystem-specific instructions you mentioned. This is exactly what this PR is trying to solve: allowing the regular Python wheels / .NET nuget package / etc. to just work without having to install anything system-wide. |
|
But you still have to install the
Is it because you're distributing an application that uses librdkafka and GSSAPI and want to simplify the installation procedure? |
|
I can say that we've been using this patch internally ourselves for a few months now and it has resulted in a tremendous amount of user gratitude for a number of reasons:
Side note, we tried baking in libsasl2 because our users were so frustrated with this and that went badly because people would set the SASL_PATH env var to whatever it was on their build machine (Jenkins) but the actual path on the prod image / machine would sometimes be different. This often gets caught in staging but in certain cases this has even caused prod incidents as certain kafka code paths escaped testing in an environment where the sasl path varied. You do have to install libsasl but that can trivially be added to most base images without needing users to worry about specifics. I haven't seen anyone actually asking about it, and to install it just requires an apt install using all the standard unix dependency tooling without users needing to understand specifics of particular kafka libs. I know it seems like a minor optimisation but it's hard to overstate just how much this has positively impacted peoples dev and release flows with Kafka. We're starting to see broader adoption among some groups that previously saw it as too much hassle because now it "just works" tm without people needing any special environment tuning which is always painful in locked down environments. |
|
I suppose what I'm saying is it doesn't strictly solve something that couldn't be worked around before but now there's no need for workarounds. It's just removing a few footguns and smoothing a few roads, much to the happiness of our users. |
|
I'd like to highlight another critical issue this PR resolves: version mismatches between the confluent-kafka Python package and the system-installed librdkafka (which lead to):
This PR fixes these issues by aligning with modern packaging expectations. It ensures that installing the Python package brings along the exact C library it was built and tested with, making builds predictable and upgrades straightforward. |
Motivation
Currently the Python bindings require the user to build from source in order to use the SASL GSSAPI mechanism on Linux. This is because PyPI doesn't allow Python wheels that link to non-core libraries, like
libsasl2.Other bindings that can use the builds linking to
libsasl2- like the .NET ones - do not work on Debian-based systems without adding a symlink fromlibsasl2.so.3tolibsasl2.so.2. This is because the builds are done on an RPM-based system, which have a different soname ABI policy.Implementation
Instead of having multiple builds with and without
libsasl2on Linux, we usedlopento loadlibsasl2at runtime on Unix. The SASL GSSAPI mechanism availability is thus checked at runtime.The differences between Debian-based and RPM-based distros are addressed by probing the various known names for
libsasl2. The subset of the ABI we use has not changed between the upstream soname ABI bumps.The documentation has been updated to remove the previous limitations around libsasl2/GSSAPI.
The CI and build scripts have been updated to only build one flavor per linux/libc, and the
--disable-gssapiparameter has been removed.The 2 build systems (mklove and cmake) have been updated to build SASL GSSAPI support based on the availability of
libdlinstead oflibsasl2.The
librdkafka.redistnuget package has been updated to include only one build for linux/glibc/x64, as thecentos8-librdkafka.sobuild is obsoleted by the single build with no dependencies.The static library build now supports the SASL GSSAPI mechanism.
A very small subset of the libsasl2 header has been included in
rdkafka_sasl_cyrus.c. The license can be found at https://github.com/cyrusimap/cyrus-sasl/blob/master/COPYING.