test: sync injected qsigshare in llmq signing test by thepastaclaw · Pull Request #7304 · dashpay/dash

thepastaclaw · 2026-05-04T01:32:02Z

Summary

- Flush the direct P2P qsigshare injection in
  feature_llmq_signing.py --spork21 with send_and_ping().
- Build and inject the full threshold set of RPC-created signature shares.
  This avoids depending on the earlier async submit=true shares being relayed
  before the recovered-signature timeout.
- Keep the existing recovered-signature assertions intact.
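
The ordering argument behind flushing with send_and_ping() can be illustrated with a toy model. This is a minimal sketch, not the test framework's code: `ToyPeer`, its queue, and the artificial delay are hypothetical. It assumes only that a peer handles the messages on one connection in order, so a pong implies that every message sent earlier on that connection has already been handled.

```python
import queue
import threading
import time


class ToyPeer:
    """Hypothetical stand-in for a node that handles one connection's messages in order."""

    def __init__(self):
        self.inbox = queue.Queue()
        self.processed = []
        self.pong = threading.Event()
        threading.Thread(target=self._run, daemon=True).start()

    def _run(self):
        while True:
            msg = self.inbox.get()
            if msg == "ping":
                # The pong is only sent after everything queued earlier was handled.
                self.pong.set()
            else:
                time.sleep(0.05)  # simulate slow message handling
                self.processed.append(msg)

    def send_message(self, msg):
        # Fire and forget: returns before the peer has handled msg.
        self.inbox.put(msg)

    def send_and_ping(self, msg, timeout=5):
        # Send, then block until the peer answers a ping queued behind msg.
        self.pong.clear()
        self.send_message(msg)
        self.inbox.put("ping")
        assert self.pong.wait(timeout), "no pong received"


peer = ToyPeer()
peer.send_and_ping("qsigshare")
handled_after_sync = list(peer.processed)  # the share was handled before we returned
```

The model only covers message handling on the connection. It makes no claim about asynchronous work that the handler may start afterwards, such as signature recovery in a background thread.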

Validation

Passed locally on macOS arm64:

python3 -m py_compile test/functional/feature_llmq_signing.py
git diff --check
python3 test/functional/feature_llmq_signing.py --spork21 \
  --configfile=/Users/claw/Projects/dash/test/config.ini

Repeated-run stress on macOS arm64 using the local build and this worktree's
test source:

python3 test/functional/test_runner.py -j10 --timeout-factor=1 \
  "feature_llmq_signing.py --spork21" \
  "feature_llmq_signing.py --spork21" \
  "feature_llmq_signing.py --spork21" \
  "feature_llmq_signing.py --spork21" \
  "feature_llmq_signing.py --spork21" \
  "feature_llmq_signing.py --spork21" \
  "feature_llmq_signing.py --spork21" \
  "feature_llmq_signing.py --spork21" \
  "feature_llmq_signing.py --spork21" \
  "feature_llmq_signing.py --spork21"

Result: 8/10 passed under local -j10 resource contention. This reduces the
observed local failure rate from the previously recorded 6/10 baseline on
develop to 2/10.
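
The repeated-run invocation above lists the same test ten times by hand. A minimal sketch of generating that argument list follows; `stress_command` and its parameters are hypothetical names for illustration, and the only flags used are the ones already shown in the command above.

```python
import shlex

TEST = "feature_llmq_signing.py --spork21"


def stress_command(runs=10, jobs=10, timeout_factor=1):
    """Build the test_runner.py argv that lists the same test `runs` times."""
    argv = [
        "python3",
        "test/functional/test_runner.py",
        f"-j{jobs}",
        f"--timeout-factor={timeout_factor}",
    ]
    argv += [TEST] * runs  # one argument per requested run, as in the command above
    return argv


command = stress_command()
print(shlex.join(command))  # shell-quoted form, suitable for copy and paste
```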

github-actions · 2026-05-04T01:32:35Z

✅ No Merge Conflicts Detected

This PR currently has no conflicts with other open PRs.

thepastaclaw · 2026-05-04T01:33:05Z

✅ Review complete (commit d21e0a7)

thepastaclaw · 2026-05-04T01:33:21Z

@coderabbitai review

coderabbitai · 2026-05-04T01:33:26Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai · 2026-05-04T01:34:47Z

Walkthrough

The functional test for LLMQ signing was modified to change how a QSIGSHARE is delivered to the recovery member: the test now uses P2PInterface.send_and_ping(...) in place of P2PInterface.send_message(...), combining the send with a ping/sync action to ensure the peer is ready when the msg_qsigshare is transmitted.

Estimated code review effort

🎯 1 (Trivial) | ⏱️ ~2 minutes

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

Check name	Status	Explanation	Resolution
Docstring Coverage	⚠️ Warning	Docstring coverage is 0.00% which is insufficient. The required threshold is 80.00%.	Write docstrings for the functions missing them to satisfy the coverage threshold.

✅ Passed checks (4 passed)

Check name	Status	Explanation
Title check	✅ Passed	The title accurately summarizes the main change: synchronizing an injected qsigshare message in the llmq signing test using a send-and-ping operation.
Linked Issues check	✅ Passed	The code changes directly address the root cause identified in issue `#7255` by flushing the injected qsigshare with sync_with_ping() to eliminate the async race condition causing intermittent timeouts.
Out of Scope Changes check	✅ Passed	The single line change (replacing send_message() with send_and_ping()) is directly scoped to fixing the flakiness issue in feature_llmq_signing.py --spork21 with no extraneous modifications.
Description check	✅ Passed	The pull request description is directly related to the changeset. It clearly describes the modification: changing from send_message() to send_and_ping() in the qsigshare injection, references the specific issue (`#7255`), and provides validation results from local testing.

thepastaclaw

Code Review

Single-line test fix adding sync_with_ping() after a direct qsigshare injection in feature_llmq_signing.py to eliminate a documented race against wait_for_sigs. Both reviewers converged that the change is minimal and correct. Only a stylistic nitpick remains.

Reviewed commit: 7c4f221

💬 1 nitpick(s)

knst · 2026-05-04T07:25:19Z

#7301 is fixing #7255

thepastaclaw · 2026-05-04T07:29:43Z

Thanks — I checked #7301. It adds qsendrecsigs message support plus InstantSend coverage, but it doesn't touch feature_llmq_signing.py or the direct qsigshare send path that timed out in #7255. This draft is narrower: it just flushes that injected qsigshare before wait_for_sigs(), so I think the two changes are complementary rather than duplicates.

thepastaclaw · 2026-05-05T23:41:16Z

@coderabbitai review

coderabbitai · 2026-05-05T23:41:22Z

✅ Actions performed

Review triggered.

Note: CodeRabbit is an incremental review system and does not re-review already reviewed commits. This command is applicable only when automatic reviews are paused.

coderabbitai

🧹 Nitpick comments (1)

test/functional/feature_llmq_signing.py (1)
59-60: ⚡ Quick win

Consider scaling the wait_for_sigs timeout by timeout_factor to fully eliminate CI-load flakiness.

send_and_ping guarantees the recovery member has received the share at the P2P layer, but the actual BLS share aggregation into a recovered signature happens asynchronously in a background thread. Under heavy CI multi-process load the 15-second ceiling at line 111 can still expire before recovery completes — exactly the scenario described in issue #7255.

The standard mitigation (also recommended in the original issue) is to multiply by self.options.timeout_factor:
♻️ Suggested change

     def wait_for_sigs(hasrecsigs, isconflicting1, isconflicting2, timeout):
-        self.wait_until(lambda: check_sigs(hasrecsigs, isconflicting1, isconflicting2), timeout = timeout)
+        self.wait_until(lambda: check_sigs(hasrecsigs, isconflicting1, isconflicting2), timeout = timeout * self.options.timeout_factor)

This covers every wait_for_sigs call site (lines 66, 111, 167, 171, 177, 201) in one change. The assert_sigs_nochange calls also pass a raw timeout to self.wait_until, but since those intentionally assert no change they should arguably stay fixed to avoid slow-running no-op waits.

Also applies to: 111-111
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@test/functional/feature_llmq_signing.py` around lines 59 - 60, The
wait_for_sigs helper currently calls self.wait_until(..., timeout=timeout) with
a raw timeout which can flake under CI; update the wait_for_sigs definition to
scale the passed timeout by self.options.timeout_factor (i.e. call
self.wait_until(..., timeout=timeout * self.options.timeout_factor)) so all
wait_for_sigs call sites (the ones at lines referenced in the review)
automatically get the scaled timeout; do not change the assert_sigs_nochange raw
timeout behavior.

ℹ️ Review info

📥 Commits

Reviewing files that changed from the base of the PR and between 7c4f221 and f670321.

📒 Files selected for processing (1)

test/functional/feature_llmq_signing.py

thepastaclaw

Code Review

Single-line functional-test synchronization fix at the exact reviewed commit f670321. Replacing send_message() with send_and_ping() for the injected qsigshare forces the recovery member to process that direct P2P message before the recovered-signature wait continues, without changing production code or the test's assertions. I found no correctness, security, or Dash-specific interaction issues in the reviewed change.

Reviewed commit: f670321

thepastaclaw

Code Review

The current head is a small, mechanically correct test-stability change for feature_llmq_signing.py --spork21: it converts RPC sig-share results into P2P QSIGSHARE payloads and flushes the direct send with send_and_ping(). The new full-threshold direct injection removes the recovered-signature assertion's previous dependence on the async submit=true relay path, which is a reasonable flake fix but leaves that positive relay behavior without coverage in this test.

Reviewed commit: 1dda821

🟡 1 suggestion(s)

🤖 Prompt for all review comments with AI agents

These findings are from an automated code review. Verify each finding against the current code and only fix it if needed.

In `test/functional/feature_llmq_signing.py`:
- [SUGGESTION] lines 99-115: Direct injection no longer positively covers the submit=true relay path
  This block now rebuilds shares for members 0, 1, and 2 with `submit=false` and injects all of them directly into the selected recovery member. In this test `set_dash_llmq_test_params(5, 3)` overrides `LLMQ_TEST` to a 3-of-5 quorum, so those three injected shares are sufficient by themselves; the recovered-signature assertion at line 120 can pass even if the earlier default `submit=true` calls at lines 79 and 87 stop relaying shares under SPORK_21. Avoiding that async dependency is the right way to deflake this specific check, but it also means the test no longer has a positive assertion that submit=true shares are relayed and consumed. If that behavior is meant to stay covered here, add a separate deterministic assertion for the submit=true relay path rather than relying on this recovery step.

thepastaclaw · 2026-05-06T01:49:15Z

+            # 3. Sending enough sig shares received from RPC to the recovery member through P2P interface, should
+            # result in a recovered sig. Build all threshold shares explicitly so this test does not depend on the
+            # asynchronous submit=true shares above being relayed before the timeout expires.
+            sig_shares = [
+                rpc_sig_share_to_p2p(self.mninfo[i].get_node(self).quorum("sign", q_type, id, msgHash, quorumHash, False))
+                for i in range(2)
+            ]
+            sig_shares.append(rpc_sig_share_to_p2p(sig_share_rpc_1))
            for mn in self.mninfo: # type: MasternodeInfo
                assert mn.get_node(self).getconnectioncount() == self.llmq_size
            # Get the current recovery member of the quorum
            q = self.nodes[0].quorum('selectquorum', q_type, id)
            mn: MasternodeInfo = self.get_mninfo(q['recoveryMembers'][0])
            # Open a P2P connection to it
            p2p_interface = mn.get_node(self).add_p2p_connection(P2PInterface())
-            # Send the last required QSIGSHARE message to the recovery member
-            p2p_interface.send_message(msg_qsigshare([sig_share]))
+            # Send the required QSIGSHARE messages to the recovery member
+            p2p_interface.send_and_ping(msg_qsigshare(sig_shares))


🟡 Suggestion: Direct injection no longer positively covers the submit=true relay path

This block now rebuilds shares for members 0, 1, and 2 with submit=false and injects all of them directly into the selected recovery member. In this test set_dash_llmq_test_params(5, 3) overrides LLMQ_TEST to a 3-of-5 quorum, so those three injected shares are sufficient by themselves; the recovered-signature assertion at line 120 can pass even if the earlier default submit=true calls at lines 79 and 87 stop relaying shares under SPORK_21. Avoiding that async dependency is the right way to deflake this specific check, but it also means the test no longer has a positive assertion that submit=true shares are relayed and consumed. If that behavior is meant to stay covered here, add a separate deterministic assertion for the submit=true relay path rather than relying on this recovery step.
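
The coverage gap described in this suggestion can be made concrete with a toy model. This is a sketch under stated assumptions, not the functional test: `recovers` is a hypothetical helper that only counts distinct member shares against the 3-of-5 threshold from `set_dash_llmq_test_params(5, 3)`, the member indices are illustrative, and BLS verification is ignored entirely.

```python
THRESHOLD = 3  # 3-of-5 quorum, as set by set_dash_llmq_test_params(5, 3)


def recovers(direct_shares, relayed_shares):
    """Hypothetical check: recovery succeeds once enough distinct members' shares arrive."""
    return len(set(direct_shares) | set(relayed_shares)) >= THRESHOLD


# Before the change: one directly injected share plus two relayed submit=true shares.
old_relay_ok = recovers({2}, {0, 1})
old_relay_broken = recovers({2}, set())

# After the change: all three threshold shares are injected directly.
new_relay_ok = recovers({0, 1, 2}, {0, 1})
new_relay_broken = recovers({0, 1, 2}, set())
```

In this model the old arrangement fails when relay is broken, while the new arrangement succeeds either way. That is why the recovered-signature assertion alone no longer says anything about the submit=true relay path.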

_{source: ['codex']}

thepastaclaw · 2026-05-06T02:44:09Z

Checked the CodeRabbit timeout suggestion. No code change needed here: BitcoinTestFramework.wait_until() already passes self.options.timeout_factor through to wait_until_helper(), which multiplies the timeout internally, so changing wait_for_sigs() to multiply by self.options.timeout_factor would double-scale these waits. The fixed raw timeouts in assert_sigs_nochange() also already go through the same framework scaling.
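
The double-scaling concern can be sketched with a toy model. This is a minimal illustration, assuming only what the comment above states, namely that wait_until() forwards timeout_factor to a helper that multiplies the timeout. `ToyFramework` and both `wait_for_sigs_*` methods are hypothetical, and the helper here returns the effective timeout instead of polling.

```python
def wait_until_helper(predicate, *, timeout, timeout_factor=1.0):
    """Toy stand-in: return the effective timeout instead of polling the predicate."""
    return timeout * timeout_factor


class ToyFramework:
    def __init__(self, timeout_factor):
        self.timeout_factor = timeout_factor

    def wait_until(self, predicate, timeout):
        # The framework-level method already forwards the factor to the helper.
        return wait_until_helper(predicate, timeout=timeout, timeout_factor=self.timeout_factor)

    def wait_for_sigs_current(self, timeout):
        # Current test code: pass the raw timeout through.
        return self.wait_until(lambda: True, timeout)

    def wait_for_sigs_suggested(self, timeout):
        # Suggested change: multiply once more before calling wait_until.
        return self.wait_until(lambda: True, timeout * self.timeout_factor)


fw = ToyFramework(timeout_factor=4)
current = fw.wait_for_sigs_current(15)      # 15 * 4 = 60
suggested = fw.wait_for_sigs_suggested(15)  # 15 * 4 * 4 = 240
```

With a factor of 4, the current code waits up to 60 seconds and the suggested change would wait up to 240, which is the factor applied twice.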

thepastaclaw

Code Review

Test-only change to feature_llmq_signing.py that hardens the spork21 LLMQ signing path. The split into a dedicated submit_true subtest plus explicit threshold-share construction with send_and_ping addresses a real flake risk. One minor coverage observation: the new submit_true subtest only asserts hasrecsig on the recovery member, not on all quorum members.

Reviewed commit: d21e0a7

💬 1 nitpick(s)

thepastaclaw · 2026-05-06T05:42:20Z

+            # Sign a distinct request through the default submit=true path to assert sig share relay and recovery.
+            q_submit_true = self.nodes[0].quorum('selectquorum', q_type, submit_true_id)
+            submit_true_recovery_member = self.get_mninfo(q_submit_true['recoveryMembers'][0])
+            submit_true_signers = [mn for mn in self.mninfo if mn != submit_true_recovery_member][:3]
+            for mn in submit_true_signers:
+                mn.get_node(self).quorum("sign", q_type, submit_true_id, submit_true_msgHash)
+            wait_for_recsig(submit_true_recovery_member, submit_true_id, submit_true_msgHash, 15)


💬 Nitpick: submit=true subtest only asserts recovery on the recovery member, not all quorum members

The new submit_true_id/submit_true_msgHash flow waits only for hasrecsig on submit_true_recovery_member. Prior to this PR, the implicit submit=true assertion was the all-nodes wait_for_sigs(True, False, True, 15) at line 136. After the split, that final wait still covers all-nodes recsig propagation, but only for the P2P-injected id/msgHash path. The recsig announcement code path is the same regardless of how recovery was triggered, so this is a minor coverage delta rather than a real regression risk — but extending the assertion to all masternodes is cheap and preserves the original guarantee. Note the suggested fix in the suggestion field; you'd want to re-derive submit_true_signers similarly when applying it.

💡 Suggested change

-            # Sign a distinct request through the default submit=true path to assert sig share relay and recovery.
+            # Sign a distinct request through the default submit=true path to assert sig share relay and recovery
+            # propagates to every quorum member.
             q_submit_true = self.nodes[0].quorum('selectquorum', q_type, submit_true_id)
             submit_true_recovery_member = self.get_mninfo(q_submit_true['recoveryMembers'][0])
             submit_true_signers = [mn for mn in self.mninfo if mn != submit_true_recovery_member][:3]
             for mn in submit_true_signers:
                 mn.get_node(self).quorum("sign", q_type, submit_true_id, submit_true_msgHash)
-            wait_for_recsig(submit_true_recovery_member, submit_true_id, submit_true_msgHash, 15)
+            self.wait_until(
+                lambda: all(
+                    mn.get_node(self).quorum("hasrecsig", q_type, submit_true_id, submit_true_msgHash)
+                    for mn in self.mninfo
+                ),
+                timeout=15,
+            )

_{source: ['codex']}

thepastaclaw · 2026-05-06T07:17:12Z

Closing this as obsolete. The intermittent feature_llmq_signing.py --spork21 issue should be handled by the QSENDRECSIGS / quorum-connection fix path already in develop (e.g. c1cdb75 from #7289, and the related #7301 test coverage), not by hardening this individual assertion with extra direct-injected shares.

thepastaclaw force-pushed the fix-7255-llmq-signing-p2p-sync branch from 7c4f221 to f670321 Compare May 5, 2026 23:41

thepastaclaw force-pushed the fix-7255-llmq-signing-p2p-sync branch from f670321 to 1dda821 Compare May 6, 2026 00:48

test: sync injected qsigshare in llmq signing test

d21e0a7

thepastaclaw force-pushed the fix-7255-llmq-signing-p2p-sync branch from 1dda821 to d21e0a7 Compare May 6, 2026 04:38

thepastaclaw closed this May 6, 2026
