test(graph-maintenance): reproduce concurrent-insert deadlock on the queue #2368
Merged
…queue Deterministic, DB-level regression guard for the deadlock fixed in #2353. Two concurrent transactions insert overlapping graph_maintenance_queue keys in opposite order (with a barrier between the two per-row locks) and Postgres aborts one with DeadlockDetectedError; the sorted-order companion test shows a shared lock order eliminates the cycle. Unlike #2353's tests — which only assert the Python list handed to execute() is sorted — this exercises the actual lock.
Summary
A deterministic, DB-level regression guard for the concurrent-insert deadlock fixed in #2353.
#2353's own tests only assert that the Python list handed to `conn.execute` is sorted; they never exercise a real lock or prove a deadlock is prevented (and would spuriously fail a refactor to sort in SQL). This adds the behavioural tests:
- `test_unordered_concurrent_enqueue_deadlocks`: two concurrent transactions insert overlapping `graph_maintenance_queue` keys in opposite order, with an `asyncio.Barrier` between the two per-row unique-key locks so the cross-wait is guaranteed. Postgres detects the cycle and aborts one with `DeadlockDetectedError` (the call takes ~1s = `deadlock_timeout`).
- `test_ordered_concurrent_enqueue_does_not_deadlock`: same setup, both in sorted order (what #2353, "fix(graph-maintenance): sort unit_ids to eliminate concurrent-insert deadlock", produces): no cycle, both sets land in the queue.

Modelling note
Production enqueues a whole victim set in one `INSERT ... SELECT unnest(...) ON CONFLICT DO NOTHING` statement. That single statement still takes the per-row unique-key locks one at a time in `unnest` order; we just can't pause inside a statement, so each worker issues the rows one at a time with a barrier between them. Same lock, made observable.

Notes
`memory` fixture (like `test_graph_maintenance.py`); uses a unique `bank_id` per test, so it's isolated under xdist.
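
For readers without a database to hand, the lock-ordering argument in the Summary can be illustrated with a small in-memory model. This is only a sketch, not the test added in this PR: it does not talk to Postgres, the function name `would_deadlock` and the keys `u1`/`u2` are hypothetical, and strict turn-taking between the two workers stands in for the barrier between per-row locks.

```python
def would_deadlock(order_1, order_2):
    """Model two workers taking per-key locks, one key per turn.

    Each worker holds every lock it has taken until it has taken all of
    its keys (its "commit"), then releases them together. Returns True
    if the two workers end up blocked on each other, False otherwise.
    """
    orders = {1: list(order_1), 2: list(order_2)}
    position = {1: 0, 2: 0}   # index of the next key each worker wants
    owner = {}                # key -> worker currently holding its lock
    done = {w for w, keys in orders.items() if not keys}

    while len(done) < 2:
        progressed = False
        for worker in (1, 2):
            if worker in done:
                continue
            key = orders[worker][position[worker]]
            if owner.get(key, worker) != worker:
                continue  # key is held by the other worker: blocked this turn
            owner[key] = worker
            position[worker] += 1
            progressed = True
            if position[worker] == len(orders[worker]):
                # "Commit": release every lock this worker holds.
                for held in [k for k, w in owner.items() if w == worker]:
                    del owner[held]
                done.add(worker)
        if not progressed:
            # Every unfinished worker is blocked on a key the other holds,
            # which is the wait-for cycle a deadlock detector would report.
            return True
    return False


victims_1 = ["u1", "u2"]
victims_2 = ["u2", "u1"]
print(would_deadlock(victims_1, victims_2))                  # True
print(would_deadlock(sorted(victims_1), sorted(victims_2)))  # False
```

In the opposite-order call each worker takes its first key and then blocks on the key the other holds. With both lists sorted, the second worker simply waits on the first key until the first worker commits, so no cycle can form.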