Prevent libev watchers running on closed socket by Lorak-mmk · Pull Request #903 · scylladb/python-driver

Lorak-mmk · 2026-06-12T16:20:35Z

As I understand Connection semantics, close can be called from any thread at any time.
That can happen while handle_write / handle_read are already running. If close closes the socket and one of the watchers is still using it, we may get EBADF.
Apart from this specific error, it seems conceptually wrong that the resources operating on the socket (the watchers) are torn down later than the socket itself.
The solution implemented here is simple: close the TCP socket only after the watchers are stopped.

close calls connection_destroyed, which registers the connection for destruction. Some time later, _loop_will_run is called; it goes through all connections registered for destruction and stops their watchers. I moved the socket close out of close and into _loop_will_run, where it now happens after the connection's watchers are stopped.
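The ordering described above can be sketched in a few lines. This is a minimal, self-contained illustration and not the driver's code: `ToyConnection`, `ToyLoop`, `loop_will_run` and the `watchers_active` flag are hypothetical stand-ins for `LibevConnection`, the libev loop, `_loop_will_run` and the real read/write watchers.

```python
import socket
import threading


class ToyConnection:
    """Stand-in for a reactor connection (assumption, not LibevConnection)."""

    def __init__(self, loop, sock):
        self.loop = loop
        self.sock = sock
        self.is_closed = False
        self.watchers_active = True  # stands in for the read/write watchers
        self._lock = threading.Lock()

    def close(self):
        # May run on any thread: mark the connection closed and register it
        # for destruction, but do not touch the socket here.
        with self._lock:
            if self.is_closed:
                return
            self.is_closed = True
        self.loop.connection_destroyed(self)


class ToyLoop:
    """Stand-in for the libev loop wrapper (assumption)."""

    def __init__(self):
        self._lock = threading.Lock()
        self._closed_connections = []

    def connection_destroyed(self, conn):
        with self._lock:
            self._closed_connections.append(conn)

    def loop_will_run(self):
        # Runs on the loop thread before the next iteration.
        with self._lock:
            to_stop, self._closed_connections = self._closed_connections, []
        for conn in to_stop:
            conn.watchers_active = False  # 1. stop the watchers
            conn.sock.close()             # 2. only then close the socket


a, b = socket.socketpair()
loop = ToyLoop()
conn = ToyConnection(loop, a)

conn.close()
fd_after_close = a.fileno()    # descriptor is still valid here

loop.loop_will_run()
fd_after_cleanup = a.fileno()  # -1 once the socket has been closed
b.close()
```

Between `close()` and the next `loop_will_run()`, any handler that is still running sees a valid file descriptor, which is the property the change relies on.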

Additionally, I took the early returns from the PRs of @vponomaryov and @fruch.
I'm not 100% convinced they are necessary for correctness, but:

- If the connection is closed, it makes sense to avoid additional work.
- If both watchers are scheduled in a single loop iteration and the first one closes the connection, there is no point in executing the second one.
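A minimal sketch of the early-return idea, assuming a toy connection class rather than the driver's `LibevConnection` (the `calls` list exists only to make the behavior observable):

```python
class ToyConnection:
    """Hypothetical connection with read/write handlers (not the driver's class)."""

    def __init__(self):
        self.is_closed = False
        self.calls = []  # records which handlers did real work

    def close(self):
        self.is_closed = True

    def handle_read(self):
        if self.is_closed:
            return  # nothing to do on a closed connection
        self.calls.append("read")
        # Pretend the peer hung up, so the read handler closes the connection.
        self.close()

    def handle_write(self):
        if self.is_closed:
            return  # skipped: the read handler already closed the connection
        self.calls.append("write")


conn = ToyConnection()
# One loop iteration in which the socket is ready for both read and write.
for callback in (conn.handle_read, conn.handle_write):
    callback()
```

After this iteration only the read handler has done any work; the write handler returned immediately.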

I decided not to take @vponomaryov's change that sets last_error when the connection is gracefully closed by the server. After a brief look at other reactors, it looks like an established convention that a graceful close does not set last_error; only Twisted does not follow it.

I did pick up another optimization, in a slightly changed form: factory now checks whether the connection is closed without last_error, and raises if so.
In my opinion this is not necessary for correctness, because the connection may get closed just after being returned from factory, so we are not preventing any scenario. It may still be an optimization in some cases, because we learn sooner that the connection is dead.
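Roughly, the check could look like the sketch below. Everything here is a simplified assumption: `ToyConnection`, the `closed_by_server` flag and the local `ConnectionShutdown` class are stand-ins, and the real `factory` in `cassandra/connection.py` takes different arguments and does more work.

```python
import threading


class ConnectionShutdown(Exception):
    """Local stand-in for the driver's exception of the same name."""


class ToyConnection:
    """Hypothetical connection that finishes its handshake immediately."""

    def __init__(self, closed_by_server=False):
        self.connected_event = threading.Event()
        self.last_error = None
        # A graceful close by the server sets is_closed but not last_error.
        self.is_closed = closed_by_server
        self.connected_event.set()


def factory(timeout=1.0, **kwargs):
    conn = ToyConnection(**kwargs)
    conn.connected_event.wait(timeout)
    if conn.last_error:
        raise conn.last_error
    if not conn.connected_event.is_set():
        raise TimeoutError("timed out creating connection")
    if conn.is_closed:
        # The extra check: closed without an error, so fail fast.
        raise ConnectionShutdown("connection was closed before it could be used")
    return conn


healthy = factory()
try:
    factory(closed_by_server=True)
    raised = False
except ConnectionShutdown:
    raised = True
```

As noted above, a connection can still be closed right after `factory` returns, so callers have to handle that case regardless; the check only shortens the time to discovery.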

Fixes: #614 (hopefully)

Pre-review checklist

I have split my patch into logically separate commits.
All commit messages clearly explain what they change and why.
~~I added relevant tests for new features and bug fixes.~~
All commits compile, pass static checks and pass tests.
PR description sums up the changes and reasons why they should be introduced.
~~I have provided docstrings for the public items that I want to introduce.~~
~~I have adjusted the documentation in ./docs/source/.~~
I added appropriate Fixes: annotations to PR description.

`close` can be called from anywhere, not only from reactor threads. If such a `close` call closes the socket during `handle_write` / `handle_read`, those functions may try to operate on a closed socket. Solution implemented in this commit: defer closing the socket until both watchers are stopped.

The previous commit deferred the socket close until the watchers are stopped, but there is one more case worth considering. If the socket becomes ready for both read and write during one libev loop iteration, both watchers will be called. If one of them decides to close the connection, the other one will still get called. This shouldn't cause EBADF, because the socket won't be closed yet, but I see no reason to perform unnecessary work.

When a connection is closed by the server but there is no other error, it will be closed (`is_closed == True`) without `last_error` being set. This is true for all reactors apart from Twisted, as far as I can tell. If we try to use such a connection, we'll quickly discover that it's broken, but we can slightly optimize this process by raising directly from `factory()`.

coderabbitai · 2026-06-12T16:20:49Z

No actionable comments were generated in the recent review. 🎉

📥 Commits

Reviewing files that changed from the base of the PR and between bf7966f and 7a9211f.

📒 Files selected for processing (2)

cassandra/connection.py
cassandra/io/libevreactor.py

📝 Walkthrough

This PR addresses socket lifecycle management and server-initiated connection closure handling in the Python driver. The connection factory now detects when the server closes a connection after the connection event completes and raises ConnectionShutdown explicitly. LibevConnection handlers check for closed connections early and return immediately to prevent stale processing. Socket cleanup responsibility shifts from LibevConnection.close() to the libev reactor's cleanup and closed-connection handling paths, which now explicitly close sockets and log debug messages during shutdown.

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 28.57% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Title check | ✅ Passed | The title accurately describes the main objective: preventing libev watchers from operating on closed sockets. |
| Description check | ✅ Passed | The PR description is detailed, explains the race condition, implementation approach, design decisions, and includes Fixes annotation. However, no new tests were added per author's note. |
| Linked Issues check | ✅ Passed | The changes directly address issue `#614` by eliminating the EBADF race condition through deferring socket closure until after watchers stop. |
| Out of Scope Changes check | ✅ Passed | All changes are within scope: socket closure deferral, watcher early returns, factory optimization, and debug logging for socket cleanup. |

Lorak-mmk added 3 commits June 12, 2026 17:47

Lorak-mmk requested review from sylwiaszunejko and vponomaryov June 12, 2026 16:20

vponomaryov mentioned this pull request Jun 12, 2026

fix(py-drv): use fixed python driver with proper reconnections - v4 scylladb/scylla-cluster-tests#15037

Draft

2 tasks

Lorak-mmk wants to merge 3 commits into scylladb:master from Lorak-mmk:fix-libev
