@rafiss rafiss commented Oct 17, 2025

Backport:

Please see individual commits for details.

Release justification: low risk change that is gated by an opt-in cluster setting; will help reduce CPU usage during network timeouts.

In 39067de we added behavior to give up and close a network connection
if a threshold of repeated errors was reached. It retried on errors
since some network errors could be transient.

It was retrying tens of thousands of times, which is excessive. We lower
this to 256 now. This is motivated by a few tests that identified the
error handling logic in this tight loop as quite expensive. Retrying
fewer times reduces CPU usage during failure scenarios.

Release note: None
Previously, the maximum number of repeated network read errors before
aborting a connection was a hardcoded constant set to 256 (1 << 8).
This change makes the value configurable via a non-public cluster
setting `sql.pgwire.max_repeated_error_count`.

This allows operators to tune the threshold for aborting connections
experiencing repeated network errors, and allows us to backport this
change along with 5f562ad.

Epic: None
Release note: None
@rafiss rafiss requested a review from msbutler October 17, 2025 22:06
@rafiss rafiss requested review from a team as code owners October 17, 2025 22:06
blathers-crl bot commented Oct 17, 2025

Thanks for opening a backport.

Before merging, please confirm that it falls into one of the following categories (select one):

  • Non-production code changes. Includes test-only changes, build system changes, etc.
  • Fixes for serious issues. Defined in the policy as correctness, stability, or security issues, data corruption/loss, significant performance regressions, breaking working and widely used functionality, or an inability to detect and debug production issues.
  • Other approved changes. These changes must be gated behind a disabled-by-default feature flag unless there is a strong justification not to.

Add a brief release justification to the PR description explaining your selection.

Also, confirm that the change does not break backward compatibility and complies with all aspects of the backport policy.

All backports must be reviewed by the TL and EM for the owning area.

@blathers-crl blathers-crl bot added the labels backport (PRs that are backports to older release branches) and T-multitenant (issues owned by the multi-tenant virtual team) Oct 17, 2025
@cockroach-teamcity

This change is Reviewable

@rafiss rafiss merged commit b645213 into cockroachdb:release-25.2 Oct 20, 2025
15 checks passed
@rafiss rafiss deleted the backport25.2-154916-155656 branch October 22, 2025 14:38
Labels

backport, T-multitenant, target-release-25.2.8