Skip to content

tests: fix NLB replacement test bootstrap crash due to missing rackdc properties#772

Merged
dkropachev merged 1 commit into
scylladb:masterfrom
mykaul:fix/nlb-test-bootstrap-rackdc
Apr 7, 2026
Merged

tests: fix NLB replacement test bootstrap crash due to missing rackdc properties#772
dkropachev merged 1 commit into
scylladb:masterfrom
mykaul:fix/nlb-test-bootstrap-rackdc

Conversation

@mykaul

@mykaul mykaul commented Mar 28, 2026

Copy link
Copy Markdown

Summary

  • Fix TestFullNodeReplacementThroughNlb::test_should_survive_full_node_replacement_through_nlb which always crashes when bootstrapping node 4/5
  • Root cause: _bootstrap_node() calls ccm_cluster.add() without data_center or rack — CCM leaves cassandra-rackdc.properties empty, and Scylla's snitch crashes with locator::bad_property_file_error
  • Fix: infer data_center/rack from existing cluster nodes and pass explicitly

Details

The test was added in PR #706 with @skip_scylla_version_lt(scylla_version="2026.1.0"). Since CI uses SCYLLA_VERSION: release:2025.2, the test was always skipped and the bug was never caught.

Scylla error from node4 startup log:

ERROR snitch_logger - Failed to parse a properties file (cassandra-rackdc.properties). Halting...
ERROR init - Startup failed: locator::bad_property_file_error (std::exception)

Testing

Verified 10/10 passes on Scylla 2026.1 after the fix (previously 0/10).

@dkropachev

Copy link
Copy Markdown
Collaborator

@mykaul , this fix is wrong - test needs to specify concrete dc and rack it wants to place new nodes in, taking them from nodes already provisioned

@mykaul

mykaul commented Mar 28, 2026

Copy link
Copy Markdown
Author

@mykaul , this fix is wrong - test needs to specify concrete dc and rack it wants to place new nodes in, taking them from nodes already provisioned

@dkropachev - isn't it correct anyway, to specify the topology explicitly? I can fix the test in addition, but this seems correct as well.

@mykaul mykaul marked this pull request as draft March 29, 2026 06:57
@dkropachev

Copy link
Copy Markdown
Collaborator

@mykaul , this fix is wrong - test needs to specify concrete dc and rack it wants to place new nodes in, taking them from nodes already provisioned

@dkropachev - isn't it correct anyway, to specify the topology explicitly? I can fix the test in addition, but this seems correct as well.

The problem is that test will dump all new nodes into the same dc+rack, which is wrong, it should spread them through existing racks.

… properties

The _bootstrap_node() method in TestFullNodeReplacementThroughNlb calls
ccm_cluster.add() without passing data_center or rack.  CCM does not
infer these from existing nodes, so the new node's
cassandra-rackdc.properties file is left with only template comments.
Scylla's GossipingPropertyFileSnitch fails to parse the empty file and
crashes on startup with 'locator::bad_property_file_error'.

Fix by reading data_center/rack from an existing cluster node and
passing them explicitly to ccm_cluster.add().

This test was added in PR scylladb#706 with a @skip_scylla_version_lt(2026.1.0)
decorator and CI runs Scylla 2025.2, so the bug was never caught.
@mykaul mykaul force-pushed the fix/nlb-test-bootstrap-rackdc branch from aab7e79 to 150ae37 Compare April 1, 2026 13:24
@mykaul

mykaul commented Apr 1, 2026

Copy link
Copy Markdown
Author

v2: Addressed review feedback from @dkropachev.

Instead of inferring data_center/rack from the first existing node (which is semantically wrong for multi-DC/multi-rack topologies), _bootstrap_node now accepts explicit data_center and rack parameters. The caller passes the known topology value (data_center='dc1') directly, making the intent clear at the call site.

Changes:

  • _bootstrap_node(self, ccm_cluster, node_id)_bootstrap_node(self, ccm_cluster, node_id, data_center=None, rack=None)
  • Removed inference logic (existing = next(iter(...)) / or 'dc1' / or 'RAC1')
  • Call site now passes data_center='dc1' explicitly
  • Rebased on current master

@mykaul mykaul marked this pull request as ready for review April 7, 2026 06:51
@dkropachev dkropachev merged commit 117abb2 into scylladb:master Apr 7, 2026
12 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants