@jgiannuzzi jgiannuzzi commented Aug 14, 2025

Motivation

Currently the Python bindings require the user to build from source in order to use the SASL GSSAPI mechanism on Linux. This is because PyPI doesn't allow Python wheels that link to non-core libraries, like libsasl2.

Other bindings that can use the builds linking to libsasl2, such as the .NET ones, do not work on Debian-based systems without adding a symlink from libsasl2.so.3 to libsasl2.so.2. This is because the builds are done on an RPM-based system, which has a different soname ABI policy.

Implementation

Instead of having multiple builds with and without libsasl2 on Linux, we use dlopen to load libsasl2 at runtime on Unix. The SASL GSSAPI mechanism availability is thus checked at runtime.

The differences between Debian-based and RPM-based distros are addressed by probing the various known names for libsasl2. The subset of the ABI we use has not changed between the upstream soname ABI bumps.
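
The probing described above can be illustrated with a short Python sketch. This is not the C code from the PR: it uses `ctypes.CDLL`, which calls `dlopen` on Unix, and the candidate list and the helper name `load_first_available` are assumptions made for illustration. Only the `libsasl2.so.3` and `libsasl2.so.2` names are taken from the description.

```python
import ctypes

# Candidate file names to probe, in order. The .so.3 and .so.2 names come from
# the description above; the unversioned name is an assumption and is usually
# only present when development packages are installed.
LIBSASL2_CANDIDATES = ("libsasl2.so.3", "libsasl2.so.2", "libsasl2.so")


def load_first_available(candidates=LIBSASL2_CANDIDATES):
    """Return (name, handle) for the first library that loads, else None."""
    for name in candidates:
        try:
            # ctypes.CDLL calls dlopen() on Unix and raises OSError on failure.
            return name, ctypes.CDLL(name)
        except OSError:
            continue
    return None


if __name__ == "__main__":
    found = load_first_available()
    if found is None:
        print("libsasl2 not found: SASL GSSAPI would be reported as unavailable")
    else:
        print(f"loaded {found[0]}")
```

Returning `None` instead of raising keeps the "library missing" case an ordinary runtime condition, which matches the idea of checking mechanism availability at runtime rather than at build time.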

The documentation has been updated to remove the previous limitations around libsasl2/GSSAPI.

The CI and build scripts have been updated to only build one flavor per linux/libc, and the --disable-gssapi parameter has been removed.

The two build systems (mklove and cmake) have been updated to build SASL GSSAPI support based on the availability of libdl instead of libsasl2.

The librdkafka.redist nuget package has been updated to include only one build for linux/glibc/x64, as the centos8-librdkafka.so build is obsoleted by the single build with no dependencies.

The static library build now supports the SASL GSSAPI mechanism.

A very small subset of the libsasl2 header has been included in rdkafka_sasl_cyrus.c. The license can be found at https://github.com/cyrusimap/cyrus-sasl/blob/master/COPYING.

Copilot AI review requested due to automatic review settings August 14, 2025 10:17
@jgiannuzzi jgiannuzzi requested a review from a team as a code owner August 14, 2025 10:17

confluent-cla-assistant bot commented Aug 14, 2025

🎉 All Contributor License Agreements have been signed. Ready to merge.
✅ jgiannuzzi
Please push an empty commit if you would like to re-run the checks to verify CLA status for all contributors.

This comment was marked as outdated.

Copilot AI left a comment

Pull Request Overview

This PR implements runtime loading of the Cyrus SASL library (libsasl2) on Unix systems to enable SASL GSSAPI mechanism support without linking against libsasl2 at build time. This change allows Python wheels with GSSAPI support to be distributed on PyPI and addresses soname compatibility issues between different Linux distributions.

Key Changes

  • Dynamic loading of libsasl2 at runtime using dlopen instead of build-time linking
  • Removal of build dependencies on libsasl2 across all packaging configurations
  • Simplification of build artifacts by eliminating separate GSSAPI and non-GSSAPI builds

Reviewed Changes

Copilot reviewed 25 out of 25 changed files in this pull request and generated 3 comments.

| File | Description |
| --- | --- |
| src/rdkafka_sasl_cyrus.c | Implements dynamic loading of libsasl2 with function pointer resolution and error handling |
| src/rdkafka_sasl_int.h | Adds new API functions for library status checking |
| src/rdkafka_sasl.c | Updates provider selection to check runtime library availability |
| src/rdkafka_conf.c | Updates error message for missing library support |
| src/CMakeLists.txt | Removes static linking to libsasl2 |
| configure.self | Removes libsasl2 build dependency configuration |
| Various packaging files | Removes libsasl2 dependencies and GSSAPI build variants |
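
As a rough illustration of the function pointer resolution and runtime availability check summarized above, here is a Python sketch built on a `ctypes`-style library handle. The symbol list uses real libsasl2 function names, but which symbols the PR actually resolves is an assumption, and `resolve_symbols` and `check_mechanism` are hypothetical helpers, not functions from librdkafka.

```python
# Entry points a client-side GSSAPI provider might need. These are real
# libsasl2 function names, but the exact set used by the PR is an assumption.
REQUIRED_SYMBOLS = (
    "sasl_client_init",
    "sasl_client_new",
    "sasl_client_start",
    "sasl_client_step",
    "sasl_dispose",
    "sasl_errdetail",
)


def resolve_symbols(handle, names=REQUIRED_SYMBOLS):
    """Return {name: function} if every symbol resolves, else None."""
    resolved = {}
    for name in names:
        # On a ctypes.CDLL handle, attribute lookup performs the dlsym()
        # call and raises AttributeError when the symbol is missing.
        fn = getattr(handle, name, None)
        if fn is None:
            return None
        resolved[name] = fn
    return resolved


def check_mechanism(mechanism, handle):
    """Return None if the mechanism is usable, else an error string."""
    if mechanism != "GSSAPI":
        return None  # in this sketch, only GSSAPI depends on libsasl2
    if handle is None or resolve_symbols(handle) is None:
        return "GSSAPI requested but libsasl2 could not be loaded at runtime"
    return None
```

Treating a partially resolved library the same as a missing one avoids calling through a null function pointer later, at the cost of a less specific error message.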

@jgiannuzzi
Author

Hi @emasab, could you please review this PR?

jgiannuzzi added a commit to jgiannuzzi/librdkafka that referenced this pull request Sep 8, 2025
- G-Research#3 (CI/CD script)
- confluentinc#4972 (Avoid unnecessary producer epoch bumps)
- confluentinc#4989 (Fully utilize the max.in.flight.requests.per.connection parameter on the idempotent producer)
 - confluentinc#5168 (Use system-provided cyrus-sasl/libsasl2 at runtime)
Instead of having multiple builds with and without `libsasl2` on Linux,
one of which cannot be redistributed within Python wheels, we use
`dlopen` to load `libsasl2` at runtime on Unix.
The SASL GSSAPI mechanism availability is thus checked at runtime.
Because of differences in soname ABI bumps between Debian-based and
RPM-based distros, the previous SASL builds did not work on Debian-based
systems. This is also solved in this change, by probing the various
known names for `libsasl2`.

feldoh commented Oct 10, 2025

This is causing us some pain due to environment differences at the moment. Any chance this can get looked at soon?

@emasab
Contributor

emasab commented Nov 4, 2025

Hi @jgiannuzzi. Thanks for contributing to librdkafka! Our supported way of using libsasl2 is to link librdkafka dynamically: you install our librdkafka1 package for Debian or RH, which also depends on libsasl2 (https://packages.confluent.io/clients), and then do the following:
Python: pip install --no-binary=confluent-kafka
Go: you use -tags dynamic
.NET: you exclude the librdkafka.redist

<PackageReference Include="librdkafka.redist" Version="2.12.0" ExcludeAssets="All" />

JS: export CKJS_LINKING=dynamic

@jgiannuzzi
Author

Thanks for your comment @emasab! The method you described does not work for a user install sadly, and requires the additional ecosystem-specific instructions you mentioned. This is exactly what this PR is trying to solve: allowing the regular Python wheels / .NET nuget package / etc. to just work without having to install anything system-wide.
Could you please consider the approach I suggest in this PR and let me know what you think about it?

@emasab
Contributor

emasab commented Nov 4, 2025

But you still have to install the libsasl2 package of your distribution. If everything were included in the binary, I'd agree. With Python you need a compiler to build the C extension, and we have similar requirements for CGO.

> does not work for a user install sadly

Is it because you're distributing an application that uses librdkafka and GSSAPI and want to simplify the installation procedure?

feldoh commented Nov 4, 2025

I can say that we've been using this patch internally ourselves for a few months now and it has resulted in a tremendous amount of user gratitude for a number of reasons:

  • We used to need every lib that used Kafka to carry a series of big red boxes at the top of its docs, giving people the special instructions needed to use Kafka. These have now been deleted, as there are no special instructions.
  • We used to have users doing pip install XYZ, where XYZ relied on confluent-kafka, but because you can't express the binary requirements as proper dependencies, this just resulted in these libs/tools being broken. Users who actually read all the docs would generally complain but get it right; most users skip the docs, just try to pip install as works for almost all other tools, and get sad. We've seen a measurable drop in support requests on this issue: it's been 0 in the last quarter.
  • We used to have what I'll call late error states: someone's personal machine would run Ubuntu but, for various support or compatibility reasons, the "prod" version would run Red Hat, for example, so their binary deps were wrong. You can reasonably argue this is a skill issue. However, most users of Kafka do not know this level of system detail, so this is just a footgun. Since adopting this patch we've gone from a few issues every few weeks to 0; the footgun is just gone, it just works.
  • Library providers would compile a binary wheel, as we have secure build systems where compiling from source is disallowed or just extremely slow. That wheel would work fine for them but not for the secure build. This also naturally prevented us from using the confluent lib directly in the first place, so we had to do very careful alignment of base images, none of which is required at all after this change.

Side note, we tried baking in libsasl2 because our users were so frustrated with this and that went badly because people would set the SASL_PATH env var to whatever it was on their build machine (Jenkins) but the actual path on the prod image / machine would sometimes be different. This often gets caught in staging but in certain cases this has even caused prod incidents as certain kafka code paths escaped testing in an environment where the sasl path varied.

You do have to install libsasl but that can trivially be added to most base images without needing users to worry about specifics. I haven't seen anyone actually asking about it, and to install it just requires an apt install using all the standard unix dependency tooling without users needing to understand specifics of particular kafka libs.

I know it seems like a minor optimisation, but it's hard to overstate just how much this has positively impacted people's dev and release flows with Kafka. We're starting to see broader adoption among some groups that previously saw it as too much hassle, because now it "just works" (tm) without people needing any special environment tuning, which is always painful in locked-down environments.

feldoh commented Nov 4, 2025

I suppose what I'm saying is it doesn't strictly solve something that couldn't be worked around before but now there's no need for workarounds. It's just removing a few footguns and smoothing a few roads, much to the happiness of our users.

@marcin-krystianc
Contributor

I'd like to highlight another critical issue this PR resolves: version mismatches between the confluent-kafka Python package and the system-installed librdkafka, which lead to:

  • Lack of Reproducibility: When an application's dependencies are split between pip and a system package manager like apt, we lose reproducible builds. An application that works perfectly on a developer's machine can fail in a CI pipeline or production environment simply because the base image has a slightly different version of the system-installed librdkafka. This forces developers to debug the environment instead of their code.

  • Confusing Upgrade Path: The upgrade process is non-obvious and error-prone. A developer might upgrade the confluent-kafka Python package to get a new feature or bugfix, but see no change in behavior because the underlying C library they are actually using is the older, system-installed one. They have to remember to separately manage and upgrade the system package, which is an unintuitive and easily forgotten step.

This PR fixes these issues by aligning with modern packaging expectations. It ensures that installing the Python package brings along the exact C library it was built and tested with, making builds predictable and upgrades straightforward.
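
One way to see which librdkafka a Python process is actually using is to compare the versions the client reports. The sketch below relies on `confluent_kafka.version()` and `confluent_kafka.libversion()`, which each return a `(version_string, version_int)` tuple; the helper `describe_versions` is a hypothetical name, and it takes the module as a parameter so it can be exercised without the package installed.

```python
def describe_versions(ck):
    """Return the client and librdkafka version strings reported by `ck`.

    `ck` is expected to behave like the confluent_kafka module.
    """
    client_version, _ = ck.version()
    lib_version, _ = ck.libversion()
    return {"confluent-kafka": client_version, "librdkafka": lib_version}


if __name__ == "__main__":
    try:
        import confluent_kafka
    except ImportError:
        print("confluent-kafka is not installed")
    else:
        print(describe_versions(confluent_kafka))
```

Logging this at application startup makes it easier to notice when the librdkafka in use is older than the one the Python package was expected to ship with.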
