test(graph-maintenance): reproduce concurrent-insert deadlock on the queue #2368

Merged
nicoloboschi merged 1 commit into main from test/graph-maintenance-deadlock-repro on Jun 23, 2026

Conversation

@nicoloboschi

Collaborator

Summary

A deterministic, DB-level regression guard for the concurrent-insert deadlock fixed in #2353.

#2353's own tests only assert that the Python list handed to conn.execute is sorted. They never exercise a real lock or prove that a deadlock is prevented, and they would fail spuriously if a refactor moved the sort into SQL. This PR adds the behavioural tests:

  • test_unordered_concurrent_enqueue_deadlocks: two concurrent transactions insert overlapping graph_maintenance_queue keys in opposite order, with an asyncio.Barrier between the two per-row unique-key locks so the cross-wait is guaranteed. Postgres detects the cycle and aborts one transaction with DeadlockDetectedError (the call takes ~1s, i.e. deadlock_timeout).
  • test_ordered_concurrent_enqueue_does_not_deadlock: same setup, but both transactions insert in sorted order (what #2353 produces). No cycle forms and both sets land in the queue.
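The lock-ordering argument behind the two tests can be tried without a database. The following is only an in-memory sketch, not the code in this PR: asyncio.Lock objects stand in for the per-row unique-key locks, a pair of asyncio.Event objects stands in for the asyncio.Barrier, and an outer timeout stands in for Postgres's deadlock detector. The names `deadlocks` and `_worker` and the `pause`/`timeout` values are hypothetical.

```python
import asyncio


async def _worker(locks, order, arrived, peer_arrived, pause):
    """Take the per-key locks in `order`, pausing after the first one."""
    held = []
    try:
        for i, key in enumerate(order):
            await locks[key].acquire()
            held.append(key)
            if i == 0:
                # Rendezvous after the first lock: give the peer a chance to
                # take its own first lock before we ask for our second one.
                arrived.set()
                try:
                    await asyncio.wait_for(peer_arrived.wait(), pause)
                except asyncio.TimeoutError:
                    # Peer is blocked behind us (shared lock order); carry on.
                    pass
    finally:
        # Runs on normal completion and on cancellation, like a rollback.
        for key in held:
            locks[key].release()


async def deadlocks(order_a, order_b, pause=0.05, timeout=0.5):
    """Return True if the two lock orders end in a cross-wait."""
    locks = {key: asyncio.Lock() for key in set(order_a) | set(order_b)}
    a_ready, b_ready = asyncio.Event(), asyncio.Event()
    both = asyncio.gather(
        _worker(locks, order_a, a_ready, b_ready, pause),
        _worker(locks, order_b, b_ready, a_ready, pause),
    )
    try:
        # The timeout plays the role of the deadlock detector here.
        await asyncio.wait_for(both, timeout)
        return False
    except asyncio.TimeoutError:
        return True


if __name__ == "__main__":
    print(asyncio.run(deadlocks(["u1", "u2"], ["u2", "u1"])))  # opposite order
    print(asyncio.run(deadlocks(["u1", "u2"], ["u1", "u2"])))  # shared order
```

With opposite orders each worker holds one key and waits for the other's, so only the timeout breaks the cycle. With a shared order the second worker simply queues behind the first on the first key, and both finish.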

Modelling note

Production enqueues a whole victim set in one INSERT ... SELECT unnest(...) ON CONFLICT DO NOTHING statement. That single statement still takes the per-row unique-key locks one at a time in unnest order — we just can't pause inside a statement, so each worker issues the rows one at a time with a barrier between. Same lock, made observable.
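To make the two shapes concrete, here is a rough sketch of the batch statement next to the per-row form the test workers issue. The column names (`bank_id`, `unit_id`), the `text[]` cast, and the helper names are assumptions for illustration; the real schema and SQL live in the repository and may differ.

```python
# Assumed schema: the column names and the text[] cast are illustrative only.
BATCH_SQL = (
    "INSERT INTO graph_maintenance_queue (bank_id, unit_id) "
    "SELECT $1, unnest($2::text[]) "
    "ON CONFLICT DO NOTHING"
)
ROW_SQL = (
    "INSERT INTO graph_maintenance_queue (bank_id, unit_id) "
    "VALUES ($1, $2) "
    "ON CONFLICT DO NOTHING"
)


def batch_call(bank_id, unit_ids):
    """Production shape: one statement carrying the whole victim set."""
    return (BATCH_SQL, bank_id, list(unit_ids))


def per_row_calls(bank_id, unit_ids):
    """Test shape: one statement per row, in the same order as the batch."""
    return [(ROW_SQL, bank_id, unit_id) for unit_id in unit_ids]
```

Because the per-row calls preserve the input order, a worker can await a barrier between two consecutive calls and hold the first row's lock while it waits.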

Notes

  • Runs against the real Postgres test DB via the existing memory fixture (like test_graph_maintenance.py); uses a unique bank_id per test, so it's isolated under xdist.
  • No production code changes.
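One possible way to get the per-test isolation mentioned in the notes is a random suffix on the bank_id, so parallel xdist workers never touch the same queue rows. This is a sketch only; the helper name and prefix are hypothetical and the PR may generate its ids differently.

```python
import uuid


def unique_bank_id(prefix="deadlock-repro"):
    """Return a bank_id that is, for practical purposes, unique per call."""
    return f"{prefix}-{uuid.uuid4().hex}"
```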

…queue

Deterministic, DB-level regression guard for the deadlock fixed in #2353.
Two concurrent transactions insert overlapping graph_maintenance_queue keys in
opposite order (with a barrier between the two per-row locks) and Postgres
aborts one with DeadlockDetectedError; the sorted-order companion test shows a
shared lock order eliminates the cycle. Unlike #2353's tests — which only assert
the Python list handed to execute() is sorted — this exercises the actual lock.
@nicoloboschi nicoloboschi merged commit 8eb2937 into main Jun 23, 2026
188 of 196 checks passed
@nicoloboschi nicoloboschi deleted the test/graph-maintenance-deadlock-repro branch June 23, 2026 12:11