
Fix explicit wait_for_schema_agreement() failing #837

Open
nyh wants to merge 2 commits into scylladb:master from nyh:fix-604

Conversation


@nyh nyh commented May 3, 2026

The Python driver's wait_for_schema_agreement() function has an optional connection parameter, and whether or not this parameter is given corresponds to two different use cases:

  1. connection is given when the user's DDL request generated a "RESULT_KIND_SCHEMA_CHANGE" response, and the driver is required to check schema agreement through that same node.
  2. connection is not given, so the driver may connect to any node, and will typically use its "control connection".

The second use case sometimes failed if one of the nodes had recently gone down. The problem is that the control connection could be the one to the node that went down, and the driver had not yet set up a new control connection. wait_for_schema_agreement() then hit an error on that one specific control connection and, instead of retrying on a new control connection, gave up immediately - which was wrong. The first patch in this series fixes this wait_for_schema_agreement(connection=None) case, and the second patch adds reproducing tests using in-process MockConnection/MockCluster objects.

This fix will improve, for example, ScyllaDB's dtest. dtest's create_ks_query() runs session.cluster.control_connection.wait_for_schema_agreement(wait_time=120). It is arguable whether it needs to do this (since the DDL that preceded it will have already waited for schema agreement!), but it does. And that used to fail - rarely, but not rarely enough - when one of the nodes went down just before this function was called.
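For reference, a minimal sketch of the call pattern in question, assuming a locally reachable multi-node cluster (nothing below is part of the patch):

from cassandra.cluster import Cluster

cluster = Cluster(['127.0.0.1'])
session = cluster.connect()

# Stop one node here (outside this script, e.g. with ccm or nodetool), ideally
# the one the control connection happens to be attached to, then:
agreed = session.cluster.control_connection.wait_for_schema_agreement(wait_time=120)

# Before this series, the call could raise ConnectionShutdown instead of retrying
# on the reconnected control connection; with the fix it returns True once the
# remaining nodes agree on the schema.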

Fixes #604.

nyh added 2 commits May 3, 2026 18:36
…nection

When wait_for_schema_agreement() is called without an explicit connection
(e.g., via the public API or from dtest's create_ks()), it uses the
control connection. If that connection is closed concurrently — for
example because a node was stopped — a ConnectionShutdown exception is
raised and propagated to the caller, causing an unrecoverable failure.

The root cause is that the connection was captured once at the start of
the function and never refreshed. The control connection can reconnect
to another node at any time, but the function was unaware of this.

Fix this by distinguishing between two call modes:

1. Explicit connection (passed after DDL RESULT_KIND_SCHEMA_CHANGE):
   The caller must verify schema agreement through the specific
   coordinator node. A ConnectionShutdown here is a real error and is
   still re-raised.

2. Control-connection mode (connection=None): There is no hard
   requirement on which node is queried. When a ConnectionShutdown
   occurs, wait briefly for the control connection to reconnect, pick
   up the new self._connection, and retry within the existing timeout
   budget. This matches the behavior for OperationTimedOut, which
   already retried in a loop.

This avoids the need for callers to work around the bug by discarding
and recreating their session after a node stop, as seen for example
in several scylla-dtest workarounds (e.g., SCYLLADB-1256).
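Schematically, the fixed control flow looks roughly like the sketch below. This is a simplified illustration, not the actual patch: _schemas_agree() is a placeholder for the existing peers/local comparison, and details such as preloaded results, per-query timeouts and the OperationTimedOut retry are omitted.

import time
from cassandra.connection import ConnectionShutdown

def wait_for_schema_agreement(self, connection=None, wait_time=None):
    explicit_connection = connection is not None
    if connection is None:
        connection = self._connection
    deadline = time.time() + (wait_time or 10.0)   # default here is illustrative
    while time.time() < deadline:
        try:
            if self._schemas_agree(connection):     # placeholder for the real check
                return True
        except ConnectionShutdown:
            if explicit_connection:
                # DDL path: agreement must be verified through this exact node.
                raise
            # Control-connection mode: give the reconnect a moment to finish,
            # then retry on whatever the current control connection is.
            time.sleep(0.2)
            connection = self._connection
        else:
            time.sleep(0.2)
    return False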

Fixes: scylladb#604
…ction

Add two unit tests to ControlConnectionTest that cover the bug fixed in
the previous commit (scylladb#604):

test_wait_for_schema_agreement_recovers_from_closed_control_connection:
  Simulates a node being stopped while wait_for_schema_agreement() is
  running in control-connection mode (no explicit connection passed).
  The mock initial connection raises ConnectionShutdown on its first
  query and simultaneously installs a replacement connection, mirroring
  what the driver's reconnect logic does in production.
  Before the fix this exception propagated to the caller.
  After the fix the function transparently picks up the new control
  connection and returns True.

test_wait_for_schema_agreement_explicit_connection_raises_on_shutdown:
  Guards the DDL code path, where the caller passes an explicit
  connection to ensure schema agreement is verified through the DDL
  coordinator. In this case ConnectionShutdown must still be re-raised
  — there is no safe fallback to a different node.
  This test passes both before and after the fix, confirming that the
  fix does not weaken the explicit-connection path.
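For orientation, the first test boils down to roughly the following shape. This is a sketch only; make_control_connection() and make_mock_connection() are hypothetical stand-ins for the MockCluster/MockConnection helpers this test file already provides.

from cassandra.connection import ConnectionShutdown

def test_recovery_sketch():
    cc = make_control_connection()        # hypothetical fixture (MockCluster + MockConnection)
    old_conn = cc._connection
    new_conn = make_mock_connection()     # hypothetical replacement connection

    def shutdown_and_reconnect(*args, **kwargs):
        # Simulate the node stopping mid-call: the in-flight query fails and,
        # as the driver's reconnect logic would do, a replacement control
        # connection is installed at the same time.
        cc._connection = new_conn
        raise ConnectionShutdown("node was stopped")

    # The control connection issues its peers/local queries through the mock
    # connection; make the first such query fail as described above.
    old_conn.wait_for_responses.side_effect = shutdown_and_reconnect

    # Before the fix this raised ConnectionShutdown to the caller; after it,
    # the wait transparently picks up new_conn and returns True.
    assert cc.wait_for_schema_agreement() is True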

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Copilot AI left a comment


Pull request overview

This PR fixes ControlConnection.wait_for_schema_agreement(connection=None) to recover when the control connection is closed during the schema agreement polling loop (e.g., a node goes down) by retrying using an updated/reconnected control connection, and adds unit test coverage for the regression.

Changes:

  • Retry schema agreement checks on ConnectionShutdown when no explicit connection was provided, switching to a refreshed control connection when available.
  • Preserve existing behavior when an explicit connection is passed (propagate ConnectionShutdown).
  • Add unit tests covering both the recovery path (implicit control connection) and the explicit-connection failure path.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File descriptions:

  • cassandra/cluster.py: Adds ConnectionShutdown recovery logic for the implicit-control-connection schema agreement path.
  • tests/unit/test_control_connection.py: Adds regression tests for control connection shutdown recovery and explicit connection behavior.


Comment thread: tests/unit/test_control_connection.py

Copilot AI left a comment


Pull request overview

Copilot reviewed 2 out of 2 changed files in this pull request and generated no new comments.

Comments suppressed due to low confidence (1)

tests/unit/test_control_connection.py:26

  • There are now two separate imports from cassandra.connection (ConnectionShutdown and EndPoint/DefaultEndPoint/DefaultEndPointFactory). Consider combining them into a single import statement to avoid duplication and keep imports organized (helps linters like isort/flake8).
from cassandra import OperationTimedOut, SchemaTargetType, SchemaChangeType
from cassandra.connection import ConnectionShutdown
from cassandra.protocol import ResultMessage, RESULT_KIND_ROWS
from cassandra.cluster import ControlConnection, _Scheduler, ProfileManager, EXEC_PROFILE_DEFAULT, ExecutionProfile
from cassandra.pool import Host
from cassandra.connection import EndPoint, DefaultEndPoint, DefaultEndPointFactory
from cassandra.policies import (SimpleConvictionPolicy, RoundRobinPolicy,
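The combined form the suggestion points at would be, roughly:

from cassandra.connection import (ConnectionShutdown, EndPoint, DefaultEndPoint,
                                  DefaultEndPointFactory)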



nyh commented May 4, 2026

@avikivity @mykaul this is the kind of issue that has caused numerous different Scylla tests over the years to sporadically fail (one of the recent ones: https://scylladb.atlassian.net/browse/SCYLLADB-1256), and it's important to finally fix it.

@sylwiaszunejko sylwiaszunejko requested a review from Lorak-mmk May 4, 2026 10:30

@Lorak-mmk Lorak-mmk left a comment


Looks good, thanks!

@Lorak-mmk

I'll wait with merging so that @dkropachev can take a look too.


@sylwiaszunejko sylwiaszunejko left a comment


@dkropachev ping, could you also take a look?


dkropachev commented May 6, 2026

@Lorak-mmk, I wonder, doesn't it have the same correctness problem?
I think that we shouldn't mix two cases:

  1. When the driver calls it, it knows exactly which connection/host the schema change came from, so we check schema agreement on exactly that host.
  2. When it is called from outside the driver, we don't know where the schema update happened, so we need to check the schema on all the nodes (probably with some scope: dc or whole cluster).

And I feel that the two cases need separate APIs: for the first one the current API on the CC is fine; the second one should definitely be on the Session.
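As a purely hypothetical sketch of what such a Session-level entry point might look like (no such API exists in the driver today):

def wait_for_schema_agreement(self, wait_time=None, scope='cluster'):
    """Hypothetical Session API: poll the schema version of every live node
    (or every node in the local DC when scope='dc') until they all match or
    wait_time expires; return True on agreement, False otherwise."""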

@Lorak-mmk

I wonder, doesn't it have the same correctness problem?

No, this PR doesn't introduce any new correctness problem. It only touches the case without an explicit connection (so the case where the user called it).
In that case, the current semantics is that it awaits schema agreement on the control connection. This PR makes it resistant to a CC reconnection happening during this wait. I see no bug - the semantics is still that after the call returns, there is schema agreement on the current CC (the only thing that changes is that the CC may be different than before the call).

What you propose (checking the schema on all nodes) is imo a good idea for a semantics change - but it's a separate task that should not block this PR.
Btw, this is already how we do the schema agreement check in the Rust Driver - we issue a system.local query to all nodes, instead of checking system.peers and system.local on a single node.
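For illustration, the all-nodes style of check described above could be expressed with the Python driver's public APIs roughly as follows. This is a sketch only, not what this PR or the driver does; routing each query through a throwaway single-host whitelist profile is just one way to target a specific node.

from cassandra.cluster import ExecutionProfile
from cassandra.policies import WhiteListRoundRobinPolicy

def schema_versions_per_node(cluster, session):
    # Ask every known node for its own schema_version, routing each query to
    # exactly one host via a single-host whitelist profile.
    versions = {}
    for host in cluster.metadata.all_hosts():
        name = 'only-%s' % host.address
        cluster.add_execution_profile(
            name,
            ExecutionProfile(
                load_balancing_policy=WhiteListRoundRobinPolicy([host.address])))
        row = session.execute("SELECT schema_version FROM system.local",
                              execution_profile=name).one()
        versions[host.address] = row.schema_version
    return versions

def all_nodes_agree(cluster, session):
    # Agreement means every node reports the same schema version.
    return len(set(schema_versions_per_node(cluster, session).values())) == 1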


dkropachev commented May 6, 2026

I wonder, doesn't it have the same correctness problem?

No, this PR doesn't introduce any new correctness problem. It only touches the case without an explicit connection (so the case where the user called it). In that case, the current semantics is that it awaits schema agreement on the control connection. This PR makes it resistant to a CC reconnection happening during this wait. I see no bug - the semantics is still that after the call returns, there is schema agreement on the current CC (the only thing that changes is that the CC may be different than before the call).

What you propose (checking the schema on all nodes) is imo a good idea for a semantics change - but it's a separate task that should not block this PR. Btw, this is already how we do the schema agreement check in the Rust Driver - we issue a system.local query to all nodes, instead of checking system.peers and system.local on a single node.

I did not put it clearly. What I was saying is that the user should not call session.cluster.control_connection.wait_for_schema_agreement at all; in fact it should be private, since there is no way the user can know whether the control connection has enough information to conclude that the cluster agrees on the schema. If we keep it this way it will lead to false positives.
In other words, this PR fixes a case that should not even be handled by this API; instead we should implement a Session API and update dtest to use it instead of the current one.

@Lorak-mmk

What I was saying is that the user should not call session.cluster.control_connection.wait_for_schema_agreement

I agree that control_connection should not be accessed. Unfortunately it is.

since there is no way the user can know whether the control connection has enough information to conclude that the cluster agrees on the schema

And I also agree that the semantics of the current API may not be very useful in some situations, and that we should introduce a better one.
Still, this does not conflict with this PR. We can both introduce a new API and make the current one a bit better.



Development

Successfully merging this pull request may close these issues.

wait_for_schema_agreement fails when connection is closed
