-
Notifications
You must be signed in to change notification settings - Fork 125
Fix 2 bugs and improve performance from O(N²+NxM) to O(N+M) in _assign_requests_to_connections #1035
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
Redesign of the connection pool's core request-connection assignment algorithm (_assign_requests_to_connections) fixing bugs and performance issues identified in the original implementation. Fixes: - Fixed incorrect idle connection counting, causing premature connection evictions - Fixed bug where multiple requests could be assigned to the same HTTP/1.1 connection, leading to ConnectionNotAvailable errors Performance improvements: - Reduced time complexity from O(N²+NxM) to O(N+M) where N=connections, M=requests
Already tried here the same but httpcore seems to be abandoned #929 |
@MarkusSintonen Thanks for replying and for linking that PR. |
Citing some issues related to this, including in downstream libraries relying on |
+1 |
1 similar comment
+1 |
Good luck, #1000 has been hanging for 6 months, despite being so tiny. The review you saw is not binding, I can't merge it until some maintainer approves it. This project does seem abandoned, I'm afraid. |
@blazewicz Nope, just the attention isn't on this PR for a reason... v1 comprehensively addresses this issue and is being quietly worked on. It's worth taking a look at the relative complexity of the connection pool handling there vs. the connection pool handling in the the existing httpcore. |
Summary
Fixes 2 logic bugs in
_assign_requests_to_connections
and improves complexity from O(N²+NxM) to O(N+M) where N=connections, M=requests.Although I attempted a smaller PR with an atomic change, I ultimately decided for a redesign of the method because the algorithm design of the method was fundamentally flawed.
Motivation
_assign_requests_to_connections
is in the hot path of this library and the bugs and complexity issue were severly slowing down my application.Bug 1:
max_keepalive_connections
is not respectedThis causes idle connections to be evicted unnecessarily. This bug is also addressed in this PR (currently approved but not merged).
The bug is on lines 294-295. The expression on line 294 simply returns
len(self._connections)
, ignoring the number of idle connections, and compares this number toself._max_keepalive_connections
.httpcore/httpcore/_async/connection_pool.py
Lines 294 to 295 in 5974b03
This test demonstrates the bug. This test fails with the current code on master, but passes with the new implementation of this PR.
Bug 2: multiple requests are assigned to the same HTTP/1.1 connection, leading to
ConnectionNotAvailable
exceptionsThe bug is on lines 322-323. An available connection is assigned to a request, but the status of that connection is not actually changed, resulting in that connection also being assigned to the next request in the queue.
httpcore/httpcore/_async/connection_pool.py
Lines 322 to 323 in 5974b03
This test demonstrates the bug. This test fails with the current code on master, but passes with the new implementation of this PR.
Perf: Reduce complexity from quadratic to linear
This PR reduces time complexity from O(N²+NxM) to O(N+M) where N=connections, M=requests.
In a typical use case, with for instance 100 connections in the pool and 500 requests in the queue this reduces complexity dramatically, especially since
_assign_requests_to_connections
is in the hot path of this library because it will be called twice for every request (on start and on finish).In my use case, this PR reduces the CPU utilization of my application by half. Profiling shows that
_assign_requests_to_connections
previously accounted for 59% of all samples; after applying this PR, its share dropped to 4%.The problem: unnecesary nested loops
This snippet contains a nested loop over all connections (O(N²)). This is unnecessary and can be done in a single pass (O(N)).
This snippet contains a nested loops over all requests and connections (O(NxM)). This is unnecessary and can be done in a single pass (O(M)) with early exit if all connections are occupied.
Checklist