
Conversation


@twz123 twz123 commented Oct 16, 2025

Don't emit a warning when the connection is closing. A warning implies that something is not as it should be, but it's expected that the connection attempt is interrupted in this case.

RELEASE NOTES: N/A

Signed-off-by: Tom Wieczorek <[email protected]>

codecov bot commented Oct 16, 2025

Codecov Report

❌ Patch coverage is 57.14286% with 3 lines in your changes missing coverage. Please review.
✅ Project coverage is 80.83%. Comparing base (ae62635) to head (4edf517).
⚠️ Report is 13 commits behind head on master.

Files with missing lines | Patch % | Lines
clientconn.go | 57.14% | 2 Missing and 1 partial ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##           master    #8654      +/-   ##
==========================================
- Coverage   81.21%   80.83%   -0.38%     
==========================================
  Files         416      416              
  Lines       41002    41096      +94     
==========================================
- Hits        33298    33220      -78     
- Misses       6226     6331     +105     
- Partials     1478     1545      +67     
Files with missing lines | Coverage Δ
clientconn.go | 78.61% <57.14%> (-10.83%) ⬇️

... and 26 files with indirect coverage changes



easwars commented Oct 16, 2025

@twz123 : Could you please describe the problem you are seeing in a little bit more detail and say how your change helps fix the problem? Thanks.


twz123 commented Oct 17, 2025

When connecting to etcd using its client, I'm seeing warnings like this during normal operations:

2025/10/15 12:55:29 WARNING: [core] [Channel #1 SubChannel #2] grpc: addrConn.createTransport failed to connect to {Addr: "127.0.0.1:2379", ServerName: "127.0.0.1:2379", }. Err: connection error: desc = "transport: Error while dialing: dial tcp 127.0.0.1:2379: operation was canceled"

or this:

2025/10/15 12:56:24 WARNING: [core] [Channel #1 SubChannel #2] grpc: addrConn.createTransport failed to connect to {Addr: "127.0.0.1:2379", ServerName: "127.0.0.1:2379", }. Err: connection error: desc = "transport: authentication handshake failed: context canceled"

This broke something on our side (k0sproject/k0sctl#953), as we weren't prepared for the additional output. While that was an oversight and easy to fix in k0sctl, we were wondering why we're seeing the warning in the first place. It seems to happen when/after the gRPC client connection gets established. I don't have a clue about gRPC internals, but it seems that a subchannel gets cancelled while it's still creating its transport. This seems like "normal operations" to me. A warning implies that something is not as it should be, so I figured the warning would be inappropriate in this situation.
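
For context, the test app boils down to roughly the following. This is a minimal sketch assuming the etcd clientv3 package and an unauthenticated local etcd; the actual code differs (the real setup uses TLS, hence the handshake errors above).

```go
package main

import (
	"context"
	"log"
	"time"

	clientv3 "go.etcd.io/etcd/client/v3"
)

func main() {
	log.Println("Connecting.")
	// Connect to a local etcd. The real app configures TLS credentials here.
	cli, err := clientv3.New(clientv3.Config{
		Endpoints:   []string{"127.0.0.1:2379"},
		DialTimeout: 5 * time.Second,
	})
	if err != nil {
		log.Fatalf("connect: %v", err)
	}

	log.Println("Connected, Sleeping one sec.")
	time.Sleep(time.Second)

	log.Println("Listing members.")
	ctx, cancel := context.WithTimeout(context.Background(), 5*time.Second)
	resp, err := cli.MemberList(ctx)
	cancel()
	if err != nil {
		log.Fatalf("member list: %v", err)
	}
	for _, m := range resp.Members {
		log.Printf("%s: %v", m.Name, m.PeerURLs)
	}

	log.Println("Disconnecting.")
	if err := cli.Close(); err != nil {
		log.Fatalf("close: %v", err)
	}
	log.Println("Disconnected.")
}
```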

Below is example output of my test app, which connects to etcd, executes a command, and disconnects again:

Oct 17 12:58:48.922 Connecting.
2025/10/17 12:58:48 INFO: [core] original dial target is: "etcd-endpoints://0xc000334960/127.0.0.1:2379"
2025/10/17 12:58:48 INFO: [core] [Channel #1] Channel created for target "etcd-endpoints://0xc000334960/127.0.0.1:2379"
2025/10/17 12:58:48 INFO: [core] [Channel #1] parsed dial target is: resolver.Target{URL:url.URL{Scheme:"etcd-endpoints", Opaque:"", User:(*url.Userinfo)(nil), Host:"0xc000334960", Path:"/127.0.0.1:2379", RawPath:"", OmitHost:false, ForceQuery:false, RawQuery:"", Fragment:"", RawFragment:""}}
2025/10/17 12:58:48 INFO: [core] [Channel #1] Channel authority set to "127.0.0.1:2379"
2025/10/17 12:58:48 INFO: [core] [Channel #1] Channel switches to new LB policy "pick_first"
2025/10/17 12:58:48 INFO: [core] [Channel #1] Channel Connectivity change to CONNECTING
2025/10/17 12:58:48 INFO: [core] [Channel #1 SubChannel #2] Subchannel created
2025/10/17 12:58:48 INFO: [core] [Channel #1] Channel switches to new LB policy "round_robin"
2025/10/17 12:58:48 INFO: [roundrobin] [0xc000368bd0] Created
2025/10/17 12:58:48 INFO: [core] [Channel #1 SubChannel #3] Subchannel created
2025/10/17 12:58:48 INFO: [core] [Channel #1] Channel exiting idle mode
Oct 17 12:58:48.923 Connected, Sleeping one sec.
2025/10/17 12:58:48 INFO: [core] [Channel #1 SubChannel #2] Subchannel Connectivity change to CONNECTING
2025/10/17 12:58:48 INFO: [core] [Channel #1 SubChannel #2] Subchannel picks a new address "127.0.0.1:2379" to connect
2025/10/17 12:58:48 INFO: [core] [Channel #1 SubChannel #3] Subchannel Connectivity change to CONNECTING
2025/10/17 12:58:48 INFO: [core] [Channel #1 SubChannel #2] Subchannel Connectivity change to SHUTDOWN
2025/10/17 12:58:48 INFO: [core] [Channel #1 SubChannel #2] Subchannel deleted
2025/10/17 12:58:48 INFO: [core] [Channel #1 SubChannel #3] Subchannel picks a new address "127.0.0.1:2379" to connect
2025/10/17 12:58:48 WARNING: [core] [Channel #1 SubChannel #2] grpc: addrConn.createTransport failed to connect to {Addr: "127.0.0.1:2379", ServerName: "127.0.0.1:2379", }. Err: connection error: desc = "transport: authentication handshake failed: context canceled"
2025/10/17 12:58:48 WARNING: [core] [Channel #1 SubChannel #2] grpc: addrConn.createTransport failed to connect to {Addr: "127.0.0.1:2379", ServerName: "127.0.0.1:2379", }. Err: connection error: desc = "transport: authentication handshake failed: context canceled"
2025/10/17 12:58:48 INFO: [core] [Channel #1 SubChannel #3] Subchannel Connectivity change to READY
2025/10/17 12:58:48 INFO: [core] [Channel #1] Channel Connectivity change to READY
Oct 17 12:58:49.923 Listing members.
{"members":{"controller-0":"https://172.29.182.39:2380"}}
Oct 17 12:58:49.924 Disconnecting.
2025/10/17 12:58:49 INFO: [core] [Channel #1] Channel Connectivity change to SHUTDOWN
2025/10/17 12:58:49 INFO: [core] [Channel #1] Closing the name resolver
2025/10/17 12:58:49 INFO: [core] [Channel #1] ccBalancerWrapper: closing
2025/10/17 12:58:49 INFO: [core] [Channel #1 SubChannel #3] Subchannel Connectivity change to SHUTDOWN
2025/10/17 12:58:49 INFO: [core] [Channel #1 SubChannel #3] Subchannel deleted
2025/10/17 12:58:49 INFO: [core] [Channel #1] Channel deleted
Oct 17 12:58:49.924 Disconnected.
Done with warnings.


easwars commented Oct 29, 2025

@twz123 Thank you for sharing logs.

From the output I do see that the load balancing policy on the gRPC Channel is changing from pick_first to round_robin. So, while you don't see any connectivity problems on the client, internally the gRPC Channel (which is a virtual connection from the client to the server) is changing its load balancing policy, and the latter is what creates subchannels (these are actual connections to real backend servers). When the pick_first policy is closed, it closes the subchannels that it created, and then round_robin comes along and creates a new subchannel. Although, in your case, both these subchannels are for the same backend address 127.0.0.1:2379.
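
For illustration, the following is a minimal sketch (not the etcd client's actual setup; the target and options are assumptions) of how a channel's load balancing policy can be selected via the default service config, which is the kind of policy choice behind the pick_first to round_robin switch seen in the logs above:

```go
package main

import (
	"log"

	"google.golang.org/grpc"
	"google.golang.org/grpc/credentials/insecure"
)

func main() {
	// Ask the channel to use round_robin instead of the default pick_first.
	// When a channel's policy changes (as in the logs above), the old
	// policy's subchannels are shut down and the new policy creates its own.
	conn, err := grpc.NewClient("dns:///127.0.0.1:2379",
		grpc.WithTransportCredentials(insecure.NewCredentials()),
		grpc.WithDefaultServiceConfig(`{"loadBalancingConfig":[{"round_robin":{}}]}`),
	)
	if err != nil {
		log.Fatalf("failed to create channel: %v", err)
	}
	defer conn.Close()
}
```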

We would like to keep throwing this warning log since a subchannel not being able to connect to the backend is an important event that we care about. This is in fact specifically mentioned in one of our docs that we use as a guideline for log statements: https://github.com/grpc/grpc-go/blob/master/Documentation/log_levels.md

I'd like to close this and keep the existing behavior, since it looks like this can be, and has been, easily fixed on your side.

Please feel free to reopen if you think otherwise.

@easwars easwars closed this Oct 29, 2025

twz123 commented Oct 29, 2025

From the output I do see that the load balancing policy on the gRPC Channel is changing from pick_first to round_robin. So, while you don't see any connectivity problems on the client, internally the gRPC Channel (which is a virtual connection from the client to the server) is changing its load balancing policy, and the latter is what creates subchannels (these are actual connections to real backend servers). When the pick_first policy is closed, it closes the subchannels that it created, and then round_robin comes along and creates a new subchannel. Although, in your case, both these subchannels are for the same backend address 127.0.0.1:2379.

Thanks for the explanation @easwars!

We would like to keep throwing this warning log since a subchannel not being able to connect to the backend is an important event that we care about. This is in fact specifically mentioned in one of our docs that we use as a guideline for log statements: https://github.com/grpc/grpc-go/blob/master/Documentation/log_levels.md

I agree that this warning is important and useful. I didn't remove it; I just silenced it in one specific case where I believe it is inappropriate: when the connection has been closed by the user. This is a normal operation, and subchannels are expected to close. They simply didn't finish connecting and were interrupted along the way. The code used to issue the warning in either case; now it only issues the warning if the connection is not closing.

The PR diff might make it look like the warning has been removed completely, but it was actually just moved from createTransport to tryAllAddrs. In createTransport, it was the last statement before returning an error, and tryAllAddrs is the only caller of createTransport. In tryAllAddrs, the warning is logged as before, except when the connection is closing.
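
As a standalone illustration of the pattern (not the actual grpc-go code; all names here are made up for the example), the idea is to emit the warning only when the attempt was not interrupted by the caller's own shutdown:

```go
package main

import (
	"context"
	"errors"
	"log"
	"time"
)

// dialBackend stands in for a connection attempt that can be interrupted
// when its context is canceled.
func dialBackend(ctx context.Context, addr string) error {
	select {
	case <-time.After(50 * time.Millisecond):
		return nil
	case <-ctx.Done():
		return ctx.Err()
	}
}

// connect warns about a failed attempt only if the caller did not cancel it.
func connect(ctx context.Context, addr string) error {
	err := dialBackend(ctx, addr)
	if err == nil {
		return nil
	}
	if ctx.Err() == nil {
		// A genuine connectivity problem: the backend is unreachable even
		// though nobody asked us to stop. Worth a warning.
		log.Printf("WARNING: failed to connect to %q: %v", addr, err)
	}
	// During shutdown the interrupted attempt is expected; just return the error.
	return err
}

func main() {
	ctx, cancel := context.WithCancel(context.Background())
	cancel() // simulate the channel being closed while still connecting
	if err := connect(ctx, "127.0.0.1:2379"); err != nil && !errors.Is(err, context.Canceled) {
		log.Fatalf("unexpected error: %v", err)
	}
	log.Println("shutdown interrupted the dial; no warning was emitted")
}
```

Checking the context after the failed attempt is what distinguishes a genuine connectivity problem from a dial that was deliberately interrupted.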

I'd like to close this and keep the existing behavior, since it looks like this can be, and has been, easily fixed on your side.

The hard error has been fixed in k0sctl, yes. However, the warning remains, not only in k0sctl, but every time something interacts with etcd on localhost. It's confusing for users and developers alike. Users see it every time they query the etcd member status via k0s, and the kube-apiserver logs are full of these warnings, too. We've already received bug reports claiming that etcd clusters were broken because users misinterpreted the logs.


twz123 commented Oct 29, 2025

Please feel free to reopen if you think otherwise.

PS: I don't have permission to reopen this.

