Adds consecutive exception tracker to connections #249
Conversation
I'm personally very +1 on this change overall, given my review comments. The failover case, especially, is something that doesn't seem to be handle-able with the asyncpg API otherwise.
```diff
@@ -1431,6 +1447,11 @@ def _drop_global_statement_cache(self):
         default). Pass ``0`` to allow all statements to be cached
         regardless of their size.

+    :param int max_consecutive_exceptions:
```
Probably important to allow users to disable this, i.e. by passing `0`, in the same way as many of the rest of these parameters.

@asyncpg/maintainers: personally, I think a default of 5-ish is pretty reasonable here, but depending on your preference on backwards compatibility we might want to default this to `0` for no behavior change.
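As a sketch of the convention being discussed (class and method names here are hypothetical, not asyncpg's actual API): treating `0` as "disabled", the way parameters like `statement_cache_size` already do, keeps the default fully backwards-compatible.

```python
class ConsecutiveExceptionTracker:
    """Counts consecutive failures; a limit of 0 disables the check."""

    def __init__(self, max_consecutive_exceptions=0):
        self._limit = max_consecutive_exceptions
        self._count = 0

    def record_failure(self):
        """Return True when the connection should be closed."""
        self._count += 1
        return self._limit > 0 and self._count >= self._limit

    def record_success(self):
        # Any successful query resets the streak.
        self._count = 0


# With the limit at 0 (the backwards-compatible default), failures
# never trigger a close; with a positive limit, the Nth consecutive
# failure does.
disabled = ConsecutiveExceptionTracker(0)
assert not any(disabled.record_failure() for _ in range(100))

strict = ConsecutiveExceptionTracker(5)
results = [strict.record_failure() for _ in range(5)]
assert results == [False, False, False, False, True]
```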
Great idea with the `0` setting.
asyncpg/connection.py
```diff
@@ -1331,6 +1332,8 @@ def _drop_global_statement_cache(self):
         # It is not possible to recover (the statement is already done at
         # the server's side), the only way is to drop our caches and
         # reraise the exception to the caller.
         #
+        await self._maybe_close_bad_connection()
```
I'm unsure `OutdatedSchemaCacheError` is necessarily an indicator of a bad connection. Thoughts on this?
asyncpg/connection.py
```diff
@@ -1356,15 +1359,27 @@ def _drop_global_statement_cache(self):
         # and https://github.com/MagicStack/asyncpg/issues/76
         # for discussion.
         #
+        await self._maybe_close_bad_connection()
```
I'm unsure `OutdatedSchemaCacheError` is necessarily an indicator of a bad connection. Maybe move this down a few lines into the `if not retry` block? Thoughts on this?
asyncpg/connection.py
```python
            self._drop_global_statement_cache()
            if self._protocol.is_in_transaction() or not retry:
                raise
            else:
                return await self._do_execute(
                    query, executor, timeout, retry=False)
        except:
```
As far as I can see from my (admittedly cursory) glance at the custom exceptions here, it looks like they all inherit from `Exception`. At the very least, catching that instead of using a bare `except:` would be great, although it might be even better to go through and mark some as "potentially signifying a bad connection" and catch only those.

Just grabbing a random exception from that file as an example: `DeprecatedFeature` probably shouldn't lead to killing and restarting the connection, though (as you commented) `ReadOnlySQLTransactionError` definitely could indicate this case, and I am +1 on reacting to that case.
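A minimal illustration of that idea (the exception classes below are stand-ins defined for the sketch, not asyncpg's real hierarchy): keep an explicit tuple of "possibly bad connection" exception types and count only those, letting everything else propagate without touching the connection.

```python
# Hypothetical stand-ins for a server exception hierarchy.
class PostgresError(Exception):
    pass


class ReadOnlySQLTransactionError(PostgresError):
    pass


class DeprecatedFeature(PostgresError):
    pass


# Only exception types that plausibly signal a dead or degraded
# connection; everything else is a normal query error.
BAD_CONNECTION_ERRORS = (
    ReadOnlySQLTransactionError,
    ConnectionError,
    TimeoutError,
)


def counts_toward_closing(exc):
    """Return True if this exception should increment the failure counter."""
    return isinstance(exc, BAD_CONNECTION_ERRORS)


assert counts_toward_closing(ReadOnlySQLTransactionError())
assert counts_toward_closing(TimeoutError())
assert not counts_toward_closing(DeprecatedFeature())
```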
asyncpg/connection.py
```python
            self._drop_global_statement_cache()
            if self._protocol.is_in_transaction() or not retry:
                raise
            else:
                return await self._do_execute(
                    query, executor, timeout, retry=False)
        except:
            logging.warning('exception - maybe closing bad connection')
```
I would move this into the `_maybe_close_bad_connection` method, and only log a `logging.warning` when we actually close the connection.

Also -- need an `import logging`. Even better, initialize your logger before using it:

```python
import logging
# ...
log = logging.getLogger(__name__)
# ...
log.warning("I'm a configured log warning!")
```
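Putting both suggestions together, the method might look roughly like this. This is a sketch only: the `Connection` stand-in, its attributes, and the counting logic are assumptions, not asyncpg's actual implementation.

```python
import asyncio
import logging

log = logging.getLogger(__name__)


class Connection:
    """Sketch only: a stand-in for asyncpg's Connection."""

    def __init__(self, max_consecutive_exceptions=0):
        self._max_consecutive_exceptions = max_consecutive_exceptions
        self._consecutive_exceptions = 0
        self.is_closed = False

    async def close(self):
        self.is_closed = True

    async def _maybe_close_bad_connection(self):
        self._consecutive_exceptions += 1
        if (self._max_consecutive_exceptions > 0
                and self._consecutive_exceptions
                >= self._max_consecutive_exceptions):
            # Warn only when we actually close, per the review comment.
            log.warning('closing connection after %d consecutive exceptions',
                        self._consecutive_exceptions)
            await self.close()


async def demo():
    conn = Connection(max_consecutive_exceptions=2)
    await conn._maybe_close_bad_connection()   # 1st failure: stays open
    await conn._maybe_close_bad_connection()   # 2nd failure: closed
    return conn.is_closed


assert asyncio.run(demo())
```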
I have a strong feeling that this feature should be part of the connection pool. It already has the machinery to overload all methods on the Connection object. This way the Connection object (which already has a very complex implementation) stays unmodified.

There are many different things we want to do about the pool in asyncpg, including factoring it out into a separate generic package. This way we'll be able to have different connection pool implementations fine-tuned for specific use cases.

So although I understand the motivation for this PR, I'm -1 on it in its current form.
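The pool-level approach described here could, in outline, wrap each connection in a proxy that does the counting, leaving `Connection` untouched. All names below are hypothetical illustrations; asyncpg's real pool proxy machinery differs.

```python
import asyncio


class FakeConnection:
    """Stand-in for a raw connection whose execute() may fail."""

    def __init__(self):
        self.is_closed = False

    async def execute(self, query):
        if query == 'BOOM':
            raise RuntimeError('server error')
        return 'OK'

    async def close(self):
        self.is_closed = True


class CountingProxy:
    """Pool-side wrapper: tracks consecutive failures per connection."""

    def __init__(self, conn, max_consecutive_exceptions):
        self._conn = conn
        self._limit = max_consecutive_exceptions
        self._failures = 0

    async def execute(self, query):
        try:
            result = await self._conn.execute(query)
        except Exception:
            self._failures += 1
            if 0 < self._limit <= self._failures:
                # A real pool would also replace the connection here.
                await self._conn.close()
            raise
        else:
            self._failures = 0
            return result


async def demo():
    proxy = CountingProxy(FakeConnection(), max_consecutive_exceptions=2)
    for query in ('BOOM', 'BOOM'):
        try:
            await proxy.execute(query)
        except RuntimeError:
            pass
    return proxy._conn.is_closed


assert asyncio.run(demo())
```

The design point is that the counter lives entirely in the wrapper, so the (already complex) connection implementation carries no extra state.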
@1st1 Fair points. I'm working towards implementing this in the pool itself, just getting a bit lost in the pool/ch/proxy implementation at the moment.
Rather than reverting work, it was simpler to just open a new PR. See #253
TBH I'd like to put this on hold till we figure out how exactly we want asyncpg pools to be refactored. For instance, I'm not sure that…
This change might not be for everyone, so I respect the right of the core maintainers to downvote/close.

The backstory here is that I am using `asyncpg` to connect to an HA Postgres cluster. In the event of a failover, the `writer` (which the `asyncpg` connection pool is connected to) crashes/fails and is rebooted. The `reader` is then promoted to `writer`. The DB DNS is automatically updated, but this takes time. In addition, the connections in the pool continue operating once the crashed DB recovers, except now that DB is the reader (i.e. `SHOW transaction_read_only; -> on`). As a result, INSERT/UPDATE/DELETE operations fail. Other errors, like timeouts, are also possible offenders. Unfortunately, the only way I've found to refresh the connections is to set a low `max_queries` config param. This is generally OK, but degrades performance due to increased cycles of closing and opening new connections.

With this PR, a configurable `max_consecutive_exceptions` config param is introduced. This param is checked every time `_do_execute` results in an exception of any kind. I would very much like to be more precise with the exception handling. Initially, I wanted to `except exceptions.__all__`, but some exception classes do not inherit from `Exception`, which causes other exceptions. I would certainly appreciate input on how to make the `_do_execute` catch-all more limited to asyncpg exceptions.