client-routes: preserve partial route state #848

Closed
dkropachev wants to merge 1 commit into master from fix/issue-846-client-routes-route-state

Collaborator

@dkropachev dkropachev commented May 5, 2026

Fixes #846
Refs #813

Summary

  • limit partial CLIENT_ROUTES_CHANGE merge invalidation to host IDs from configured event pairs
  • query all configured connection_ids for affected hosts so the current preferred proxy route can be retained
  • include original_port in ClientRoutesEndPoint identity and make endpoint ordering tolerate missing ports
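The third item can be illustrated with a minimal sketch. `EndPointSketch` is a hypothetical stand-in, not the driver's `ClientRoutesEndPoint`; the field names other than `original_port` and the sort-key layout are assumptions made for illustration. It shows one way to make `original_port` part of equality and hashing while keeping instances sortable when the port is `None`.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class EndPointSketch:
    """Hypothetical stand-in for ClientRoutesEndPoint (not the driver's class)."""

    address: str
    port: int
    # Included in the generated __eq__/__hash__, so identity is port-aware.
    original_port: Optional[int] = None

    def _sort_key(self) -> Tuple[str, int, bool, int]:
        # The "is present" flag is compared before the value, so None is
        # never compared with an int; endpoints without a port sort first.
        has_port = self.original_port is not None
        return (self.address, self.port, has_port, self.original_port or 0)

    def __lt__(self, other: "EndPointSketch") -> bool:
        return self._sort_key() < other._sort_key()


a = EndPointSketch("10.0.0.1", 9042, original_port=19042)
b = EndPointSketch("10.0.0.1", 9042, original_port=None)
c = EndPointSketch("10.0.0.1", 9042, original_port=19042)

ordered = sorted([a, b])  # does not raise TypeError on the missing port
```

Here `a` and `c` compare equal, `a` and `b` do not, and sorting a mix of both kinds works because the key never places `None` next to an integer.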

Why cassandra/client_routes.py changed

handle_client_routes_change() receives parallel connection_ids and host_ids from the server. The driver filters those pairs to configured connection_ids before querying system.client_routes.

The first bug was that the query used the filtered pairs, but _RouteStore.merge() was told that every host ID from the original event was affected. In a mixed event, an unrelated connection_id could therefore cause merge() to delete a valid cached route even though that entry was filtered out and never refreshed. The fix passes only host IDs from filtered configured pairs as affected_host_ids.
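The difference can be sketched as follows. This is a simplified model, not the code in `cassandra/client_routes.py`: `filter_pairs`, the dict-based cache, and the `merge` function here are hypothetical and only mirror the behavior described above (drop cached routes for affected hosts, then apply the freshly queried rows).

```python
from typing import Dict, List, Set, Tuple

Route = Tuple[str, str]  # (connection_id, address); hypothetical shape


def filter_pairs(
    connection_ids: List[str], host_ids: List[str], configured: Set[str]
) -> List[Tuple[str, str]]:
    """Keep only (connection_id, host_id) pairs for configured connection IDs."""
    return [(c, h) for c, h in zip(connection_ids, host_ids) if c in configured]


def merge(
    cache: Dict[str, Route], fresh: Dict[str, Route], affected_host_ids: Set[str]
) -> Dict[str, Route]:
    """Drop cached routes for affected hosts, then apply freshly queried rows."""
    merged = {h: r for h, r in cache.items() if h not in affected_host_ids}
    merged.update(fresh)
    return merged


configured = {"proxy-a"}
cache = {"host-1": ("proxy-a", "10.0.0.1"), "host-2": ("proxy-a", "10.0.0.2")}

# Mixed event: host-1 changed on a configured proxy, host-2 on an unrelated one.
event_connection_ids = ["proxy-a", "proxy-x"]
event_host_ids = ["host-1", "host-2"]

pairs = filter_pairs(event_connection_ids, event_host_ids, configured)
fresh = {"host-1": ("proxy-a", "10.0.0.9")}  # rows the filtered query returns

# Old behavior: every host ID in the event counts as affected.
buggy = merge(cache, fresh, set(event_host_ids))
# Fixed behavior: only host IDs from the filtered pairs count as affected.
fixed = merge(cache, fresh, {h for _, h in pairs})
```

In this model `buggy` loses the `host-2` entry even though nothing was queried for it, while `fixed` keeps it and still refreshes `host-1`.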

The second bug was same-host stickiness. The route store keeps one preferred route per host. If the host currently used proxy A, but a partial event arrived for configured proxy B, the old query fetched only B. _select_preferred_routes() could not see that A still existed, so it could switch away from the preferred route. The fix queries all configured connection_ids for affected hosts, so existing preferred routes are preserved when still present.
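A small model of the stickiness problem, under stated assumptions: `select_preferred` and `query` are hypothetical helpers, the in-memory `table` stands in for `system.client_routes`, and the fallback rule (pick the lowest connection ID) is invented for the example. Only the "keep the current route when it is still present" rule comes from the description above.

```python
from typing import Dict, List, Set, Tuple

Row = Tuple[str, str, str]  # (host_id, connection_id, address); hypothetical


def select_preferred(current: Dict[str, str], rows: List[Row]) -> Dict[str, str]:
    """Map host_id -> connection_id, keeping the current choice if still present."""
    by_host: Dict[str, List[str]] = {}
    for host_id, connection_id, _address in rows:
        by_host.setdefault(host_id, []).append(connection_id)
    chosen: Dict[str, str] = {}
    for host_id, candidates in by_host.items():
        keep = current.get(host_id)
        # Fallback ordering is an assumption made only for this sketch.
        chosen[host_id] = keep if keep in candidates else sorted(candidates)[0]
    return chosen


# Stand-in for system.client_routes: host-1 is reachable through both proxies.
table: List[Row] = [
    ("host-1", "proxy-a", "10.0.0.1"),
    ("host-1", "proxy-b", "10.0.1.1"),
]


def query(host_ids: Set[str], connection_ids: Set[str]) -> List[Row]:
    return [r for r in table if r[0] in host_ids and r[1] in connection_ids]


current = {"host-1": "proxy-a"}  # host-1 currently prefers proxy A
configured = {"proxy-a", "proxy-b"}

# Partial event for (proxy-b, host-1).
old_rows = query({"host-1"}, {"proxy-b"})  # old: only the event's connection ID
new_rows = query({"host-1"}, configured)   # fix: all configured connection IDs

old_choice = select_preferred(current, old_rows)
new_choice = select_preferred(current, new_rows)
```

With the narrow query the selection step only sees proxy B and switches to it; with the broader per-host query it sees that proxy A is still present and keeps it.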

Risk

Low. Route invalidation gets narrower, and the broader per-host query only covers configured proxy IDs for hosts already marked affected by a valid event.

Testing

  • repo-ci fast: workflow-test passed; overall managed run timed out during workflow-build-2 (62ce3571fc997d5e5b5a93ecb55943e0083dc2d4a9cb6a4b409835a4d1836748).

@dkropachev dkropachev self-assigned this May 5, 2026
@dkropachev dkropachev requested a review from sylwiaszunejko May 5, 2026 21:21
Partial CLIENT_ROUTES_CHANGE handling must not treat filtered event entries as affected route state. Limit merge invalidation to configured event pairs so unrelated connection IDs cannot drop cached proxy routes.

For same-host partial updates, fetch all configured connection IDs for affected hosts. This lets the route store keep the currently preferred proxy route when it is still present instead of switching because the partial event omitted it.

Also keep ClientRoutesEndPoint identity port-aware and sortable when original_port is missing.

Fixes #846

Refs #813
@dkropachev dkropachev force-pushed the fix/issue-846-client-routes-route-state branch from 278d66e to d30b6d4 May 5, 2026 21:47
@dkropachev dkropachev changed the title from "client-routes: preserve mixed event route state" to "client-routes: preserve partial route state" May 5, 2026
@dkropachev dkropachev requested a review from Lorak-mmk May 5, 2026 22:01
@dkropachev
Collaborator Author

Closed in favor of #853

@dkropachev dkropachev closed this May 5, 2026

Development

Successfully merging this pull request may close these issues.

Client routes: mixed CLIENT_ROUTES_CHANGE events can drop cached proxy routes

1 participant