-
Notifications
You must be signed in to change notification settings - Fork 831
Add retries for netlink calls that may return a EINTR #1154
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Signed-off-by: Adrian Moisey <[email protected]>
Signed-off-by: Adrian Moisey <[email protected]>
A wrapper to add retry on for netlink when it receives a ErrDumpInterrupted Signed-off-by: Adrian Moisey <[email protected]>
These are functions identified as potentially receiving ErrDumpInterrupted and needing to retry Signed-off-by: Adrian Moisey <[email protected]>
| github.com/pkg/errors v0.9.1 | ||
| github.com/safchain/ethtool v0.5.10 | ||
| github.com/vishvananda/netlink v1.3.0 | ||
| github.com/vishvananda/netlink v1.3.1-0.20250303224720-0e7078ed04c8 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this contains the docker patch in the library that will alliviate the problem considerably, as it will not return early
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
aha, interesting -- do you have a link to the commit / patch?
| @@ -0,0 +1,321 @@ | |||
| // Package netlinksafe wraps vishvandanda/netlink functions that may return EINTR. | |||
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if you forking the code it is better to indicate the source , open source licenses and copyright should be respected
License is apache so it is ok, but I noticed they have copyright https://github.com/moby/moby/blob/6de8ba3bc52377ddb06d80c76b0c8d302cb81261/LICENSE#L179
// Copyright Docker, Inc.
// Copied from https://github.com/moby/moby/blob/edaa0eb56d7e749df81bd3dcc2945b81e911b52e/internal/nlwrap/nlwrap_linux.go
Different projects follow different rules, but this seems to be correct to the best of my knowledge
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, a copyright attribution is the right thing to do here. Something like
// SPDX-License-Identifier: Apache-2.0
// Copyright Docker, Inc.
// Copied from https://github.com/moby/moby/blob/edaa0eb56d7e749df81bd3dcc2945b81e911b52e/internal/nlwrap/nlwrap_linux.go
Given that the upstream files do not have a copyright header, we're on our own to Do The Right Thing.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've made #1159 as a follow up to get this added
|
LGTM module the comment for respecting the copyright #1154 (comment) Folks this is causing a lot of pain in CIs with errors spawning Pods, please take a look and prioritize if possible Thanks |
|
I really wish the maintainers of the netlink pkg did a major version bump for this. It really had a large impact. |
| // by netlink functions along with the EINTR. So, it's not possible to do | ||
| // anything but retry. After maxAttempts the EINTR will be returned to the | ||
| // caller. | ||
| package netlinksafe |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably bike-shedding, but I'm slightly wondering if netlinksafe is a good name for this; wondering if "safe" gives incorrect expectations for what it does.
Also similar comments as on the containerd PR;
- should this be an
internal/package, or do we want external users to use it directly? - if we're implementing it in this repository as a package, it could be used by containerd (instead of duplicating it there)
Slightly wondering if (now that more projects start to fork this code) it would perhaps be a better idea to upstream the package from moby to the github.com/vishvananda/netlink repository (as a utility package), so that we don't end up with a gazillion forks of the same code.
To my understanding, it was not correct to do this by default, but having it as an optional utility package could still work?
Any thoughts, @robmry?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@thaJeztah want to discuss this on slack? I sent you a dm on the cncf slack.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Probably bike-shedding, but I'm slightly wondering if
netlinksafeis a good name for this; wondering if "safe" gives incorrect expectations for what it does.
Agreed, when I picked the name it felt wrong. Happy to adjust it when a name is figured out
Slightly wondering if (now that more projects start to fork this code) it would perhaps be a better idea to upstream the package from moby to the github.com/vishvananda/netlink repository (as a utility package), so that we don't end up with a gazillion forks of the same code.
Agreed on this too.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MikeZappa87 about to jump into another call (sorry! etoomanycalls), but will have a peek later
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alternative is to downgrade the netlink pkg version until this is resolved as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm ok with netlinksafe, mostly we just need to know which package and where to go
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Honestly, cni/plugins isn't that big of a project, so I'm not too worried about having perfect naming conventions. safenetlink is fine by me 🤷.
Yeah, we really need one; I know debian packages of Docker are (or were) broken because of that. (also have some PRs pending there to slightly lift CI to a better situation (vishvananda/netlink#1033, vishvananda/netlink#1048, vishvananda/netlink#1009) @aboch @vishvananda any chance we could get help with that? |
I reached out to Vish via email. I am hoping we can possibly undo some of the blast radius from the change. |
| // by netlink functions along with the EINTR. So, it's not possible to do | ||
| // anything but retry. After maxAttempts the EINTR will be returned to the | ||
| // caller. | ||
| package netlinksafe |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm ok with netlinksafe, mostly we just need to know which package and where to go
|
We want to cut a release -- we can worry about the code header if it's a problem :-) |
Fixes #1121
Heavily inspired by moby/moby#48407