fix(node-integration-tests): Fix flaky kafkajs test race condition #20189
…tion order independent

The test was flaky because the producer and consumer transactions could arrive in either order, but the test asserted them in a fixed sequence. Now both transactions are collected and assertions are performed after both have been received, regardless of arrival order.

Fixes #20121

Co-Authored-By: Claude <[email protected]>
Agent-Logs-Url: https://github.com/getsentry/sentry-javascript/sessions/0bc166af-c93a-42a6-a762-dae244893e2b
Co-authored-by: Lms24 <[email protected]>
This PR has been automatically closed. All non-maintainer contributions must reference an existing GitHub issue. Please review our contributing guidelines for more details.
Semver Impact of This PR: 🟢 Patch (bug fixes)
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Reviewed by Cursor Bugbot for commit 282bd66.
node-overhead report 🧳

Note: This is a synthetic benchmark with a minimal express app and does not necessarily reflect the real-world performance impact in an application.
…ucing (#21074)

## Summary

- Replace the fixed 4s `setTimeout` in `suites/tracing/kafkajs/scenario.mjs` with a deterministic wait on the kafkajs `GROUP_JOIN` event.
- Also `await consumer.run(...)` instead of leaving the promise dangling.

## Why

The kafkajs integration test flaked with the consumer transaction never arriving within the test timeout. The scenario was:

1. `consumer.subscribe(...)` (awaited)
2. `consumer.run(...)` (**not** awaited)
3. `await new Promise(resolve => setTimeout(resolve, 4000))`
4. `producer.send(...)`

On slow CI runners, the consumer group rebalance / join can take longer than the fixed 4 seconds. Although `fromBeginning: true` lets a late-joining consumer still read offset 0, when the join is slow enough the consumer transaction isn't created before the per-test timeout.

Listening for the `GROUP_JOIN` event removes the timing assumption entirely: we proceed exactly when the consumer is in its group and actively polling. The listener is registered before `consumer.run()` so it cannot miss the event.

This mirrors the prior fix for the same suite (#20189, which addressed transaction ordering) and the same race-on-startup pattern already documented in the amqplib scenario.

Fixes #21044

Co-authored-by: claude[bot] <41898282+claude[bot]@users.noreply.github.com>
Co-authored-by: Claude Opus 4.7 (1M context) <[email protected]>
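For illustration, the `GROUP_JOIN` wait described in the commit message above could look roughly like the sketch below. This is not the actual `scenario.mjs` change. The helper names `waitForGroupJoin` and `runConsumerThenProduce`, the empty `eachMessage` handler, and the message payload are all assumptions made for the example. It relies only on the kafkajs instrumentation API, where `consumer.on(consumer.events.GROUP_JOIN, listener)` returns a function that removes the listener.

```javascript
// Sketch only: `consumer` is expected to be a kafkajs consumer
// (or any object exposing the same `events` / `on` shape).
function waitForGroupJoin(consumer) {
  return new Promise(resolve => {
    let removeListener;
    removeListener = consumer.on(consumer.events.GROUP_JOIN, () => {
      // Unsubscribe after the first join so later rebalances are ignored.
      if (removeListener) removeListener();
      resolve();
    });
  });
}

// Hypothetical scenario flow; `topic` and the message value are placeholders.
async function runConsumerThenProduce(consumer, producer, topic) {
  // Register the listener BEFORE run() so the join event cannot be missed.
  const joined = waitForGroupJoin(consumer);

  // Await run() instead of leaving the promise dangling.
  await consumer.run({ eachMessage: async () => {} });

  // Proceed only once the consumer has actually joined its group,
  // rather than after a fixed delay.
  await joined;

  await producer.send({ topic, messages: [{ value: 'TEST_MESSAGE' }] });
}
```

The point of the ordering is that the wait is tied to an observable state change rather than to wall-clock time, so a slow CI runner only makes the scenario slower instead of making it fail.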

The kafkajs integration test asserted producer and consumer transactions in a fixed order, but they can arrive in either order due to Kafka's async nature.
To fix the flake, we collect both transactions via callbacks, then assert after both have arrived, using `find()` by transaction name instead of relying on arrival order.

Closes #20121
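As a rough sketch of the collect-then-assert approach described above (not the test code from this PR): the helper names `createTransactionCollector` and `findTransaction` are hypothetical, and the example assumes each received event carries its name in a `transaction` field. The transaction names used in the usage note are placeholders.

```javascript
// Hypothetical collector: gathers events from callbacks and resolves
// once `expectedCount` of them have arrived, in whatever order.
function createTransactionCollector(expectedCount) {
  const transactions = [];
  let resolveAll;
  const allReceived = new Promise(resolve => {
    resolveAll = resolve;
  });

  return {
    transactions,
    allReceived,
    onTransaction(event) {
      transactions.push(event);
      if (transactions.length === expectedCount) {
        resolveAll(transactions);
      }
    },
  };
}

// Look an event up by name so assertions do not depend on arrival order.
// Assumes the name lives in `event.transaction`.
function findTransaction(transactions, name) {
  return transactions.find(event => event.transaction === name);
}
```

A test would pass `collector.onTransaction` as the callback for each expected transaction, `await collector.allReceived`, and then assert on `findTransaction(collector.transactions, 'send test-topic')` and `findTransaction(collector.transactions, 'process test-topic')` separately, where both names are placeholders for whatever the suite actually emits.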