252e3e3
Merged

ref(batcher): Only flush the bucket that triggered the flush event #6168

@sentry/warden / warden completed Apr 30, 2026 in 1m 42s

5 issues

Medium

kill() can race with _flush_loop and skip the final full flush - `sentry_sdk/_span_batcher.py:113-119`

kill() sets self._running = False before putting the None sentinel onto _flush_queue. If the flusher thread is between iterations when kill() runs (it has just finished a _flush call and is about to re-evaluate the 'while self._running' condition), it observes _running=False and exits without ever consuming the sentinel. No final full flush is performed, and any remaining buffered spans are dropped on shutdown. The previous Event-based design had the same shape, but moving to a queue does not fix this, and the docstring/issue explicitly aims to preserve full-flush-on-shutdown semantics. Consider performing a final self._flush() inline in kill() (or before setting _running=False) to guarantee that the buffer is drained on shutdown.

kill() can lose buffered spans if flush loop exits before consuming the sentinel - `sentry_sdk/_span_batcher.py:113-119`

kill() sets self._running = False and then puts None on the flush queue. The flush loop only performs the shutdown flush as a side effect of consuming a queued trace_id (or the None sentinel) and calling _flush(). However, the loop's continuation is gated by while self._running: checked at the top of each iteration. If kill() sets _running = False after the loop has finished an iteration but before it re-enters get(), the loop exits without ever consuming the None and without calling _flush(), so any spans still in self._span_buffer are silently dropped on shutdown. The previous Event-based implementation in the parent Batcher.kill() relied on _flush_loop calling _flush() unconditionally each iteration, which avoided this hazard.

Also found at:

  • sentry_sdk/_span_batcher.py:58-73
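The shutdown race described in the two findings above can be sketched with a small stand-in class. This is a minimal illustration, not the SDK's code: `MiniBatcher`, its `send` callback, and the `join` call in `kill()` are assumptions made for the example; only the names `_running`, `_flush_queue`, `_span_buffer`, `_flush`, `_flush_loop`, and `kill` come from the findings. The point it shows is that a final `_flush()` on the caller's thread drains the buffer regardless of where the loop was when `_running` flipped.

```python
import queue
import threading


class MiniBatcher:
    """Hypothetical stand-in for the span batcher; not the SDK's real class."""

    def __init__(self, send):
        self._send = send                # callable that receives one bucket (a list)
        self._lock = threading.Lock()
        self._span_buffer = {}           # trace_id -> list of buffered spans
        self._flush_queue = queue.Queue()
        self._running = True
        self._thread = threading.Thread(target=self._flush_loop, daemon=True)
        self._thread.start()

    def add(self, trace_id, span):
        with self._lock:
            self._span_buffer.setdefault(trace_id, []).append(span)

    def _flush(self, trace_id=None):
        # Flush one bucket, or every bucket when trace_id is None.
        with self._lock:
            if trace_id is None:
                buckets = list(self._span_buffer.values())
                self._span_buffer.clear()
            else:
                bucket = self._span_buffer.pop(trace_id, None)
                buckets = [bucket] if bucket else []
        for bucket in buckets:
            self._send(bucket)

    def _flush_loop(self):
        # The loop may exit here without ever seeing the None sentinel.
        while self._running:
            trace_id = self._flush_queue.get()
            self._flush(trace_id)

    def kill(self):
        self._running = False
        self._flush_queue.put(None)      # wake the loop if it is blocked in get()
        self._thread.join(timeout=5)     # assumption: waiting briefly is acceptable
        # Final drain on the caller's thread. It is a no-op if the loop
        # already consumed the sentinel and flushed everything.
        self._flush()
```

With this shape, calling `kill()` on a batcher that still holds spans hands every bucket to `send`, either from the flusher thread or from the final inline flush.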

Low

Flush queue can grow unboundedly under sustained high span volume - `sentry_sdk/_span_batcher.py:103-109`

Every span added past MAX_BEFORE_FLUSH (or exceeding MAX_BYTES_BEFORE_FLUSH) calls self._flush_queue.put(span.trace_id), and this repeats on every subsequent add until the flusher thread actually drains the bucket. Under sustained load on a single trace, many duplicate trace_id entries can pile up in _flush_queue while the flusher processes one per loop iteration. After the first flush deletes the bucket, the remaining duplicates become no-ops in _flush, but the queue itself keeps consuming memory until the flusher catches up. Consider deduplicating (e.g., only put when not already pending) or using a set-based signal to bound memory.
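One way to realize the "only put when not already pending" suggestion is a thin wrapper that tracks which trace_ids are currently queued. The sketch below is an assumption-laden illustration: `DedupFlushQueue` and `request_flush` are hypothetical names, and it is not taken from the PR.

```python
import queue
import threading


class DedupFlushQueue:
    """Hypothetical helper: a queue that holds each trace_id at most once."""

    def __init__(self):
        self._queue = queue.Queue()
        self._pending = set()            # trace_ids currently sitting in the queue
        self._lock = threading.Lock()

    def request_flush(self, trace_id):
        # Returns True only when a new entry was actually enqueued.
        with self._lock:
            if trace_id in self._pending:
                return False
            self._pending.add(trace_id)
        self._queue.put(trace_id)
        return True

    def get(self, timeout=None):
        # Blocks like queue.Queue.get and raises queue.Empty on timeout.
        trace_id = self._queue.get(timeout=timeout)
        with self._lock:
            self._pending.discard(trace_id)
        return trace_id

    def qsize(self):
        return self._queue.qsize()
```

Because an id is removed from the pending set before `get()` returns, a span added while the consumer is about to flush is either already in the bucket being flushed or triggers a fresh entry. Queue length is then bounded by the number of distinct buckets rather than the number of spans.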

Test coverage for shutdown/kill drainage of buckets is not evident in the diff

The PR changes shutdown semantics (kill() now uses a queue sentinel instead of an Event) and changes the flush loop to only flush on explicit triggers. The skill requires verifying that tests cover edge cases such as: kill() while buckets contain spans, multiple buckets receiving simultaneous threshold triggers, and the time-based full-flush path. The diff lists tests/tracing/test_span_buffer.py and test_span_streaming.py as updated, but reviewers should confirm these new behaviors are explicitly exercised, particularly that no spans are lost when kill() is called with non-empty buckets.

add() can enqueue many duplicate trace_ids while a bucket is awaiting flush - `sentry_sdk/_span_batcher.py:103-109`

Once size+1 >= MAX_BEFORE_FLUSH (or the byte threshold) is met, every subsequent span on the same trace_id (up to MAX_BEFORE_DROP) calls self._flush_queue.put(span.trace_id) again, because the bucket isn't drained until the flush loop consumes the entry. This can enqueue up to ~1000 duplicate entries per bucket; the flush loop processes each by calling _flush(trace_id) on an already-empty bucket, wasting wakeups and (more importantly) starving the time-based full flush check, since each successful get() in the loop runs before the time-check and can keep _last_full_flush from advancing under sustained load.
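The starvation concern can be avoided by checking the full-flush deadline on every iteration, whether or not an entry was dequeued. The following is a sketch under assumptions: `flush_step`, its parameters, and the injectable `now` clock are hypothetical, and the real loop in the PR may be structured differently.

```python
import queue
import time


def flush_step(flush_queue, flush, last_full_flush, interval, now=time.monotonic):
    """Run one loop iteration and return the updated last-full-flush timestamp."""
    # Wait only until the next full flush is due, never longer.
    remaining = max(0.0, interval - (now() - last_full_flush))
    try:
        trace_id = flush_queue.get(timeout=remaining)
    except queue.Empty:
        trace_id = None                  # timed out: treat like a full-flush request

    if trace_id is not None:
        flush(trace_id)                  # targeted flush of the triggering bucket

    # The deadline check runs even when an entry was dequeued, so a steady
    # stream of (possibly duplicate) trace_ids cannot postpone the full flush.
    if trace_id is None or now() - last_full_flush >= interval:
        flush(None)
        return now()
    return last_full_flush
```

A caller would loop with something like `last = flush_step(q, batcher_flush, last, interval)` while the batcher is running, where `batcher_flush` stands for whatever callable flushes one bucket or all buckets.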

4 skills analyzed
Skill            Findings  Duration  Cost
code-review      3         1m 6s     $1.14
find-bugs        2         1m 32s    $1.22
skill-scanner    0         20.1s     $0.45
security-review  0         30.1s     $0.46

Duration: 3m 28s · Tokens: 520.6k in / 10.2k out · Cost: $3.27 (+merge: $0.00, +dedup: $0.01)