Process drain-failure nodes at the end #394
Merged: ZihanJiang96 merged 7 commits into keikoproj:master from ssheladiya:skip-drain-failures on Oct 25, 2023
ZihanJiang96
approved these changes
Oct 19, 2023
Codecov Report
@@           Coverage Diff           @@
##           master     #394   +/-   ##
=======================================
  Coverage   39.09%   39.09%
=======================================
  Files           7        7
  Lines         931      931
=======================================
  Hits          364      364
  Misses        540      540
  Partials       27       27
Signed-off-by: ssheladiya <[email protected]>
Signed-off-by: ssheladiya <[email protected]>
…new func Signed-off-by: ssheladiya <[email protected]>
Signed-off-by: ssheladiya <[email protected]>
Signed-off-by: ssheladiya <[email protected]>
ssheladiya force-pushed the skip-drain-failures branch from db15bfa to 10a5f5a on October 20, 2023 03:29
tekenstam
approved these changes
Oct 24, 2023
shreyas-badiger
approved these changes
Oct 24, 2023
LGTM
Merged
shreyas-badiger added a commit that referenced this pull request on Dec 5, 2023
* early-cordon nodes Signed-off-by: sbadiger <[email protected]> * early cordon Signed-off-by: sbadiger <[email protected]> * include context in cordon and drain functions * cordon only drifted instances * add unit tests * Update aws-sdk-go-cache to v0.0.2 (#399) Signed-off-by: Todd Ekenstam <[email protected]> Signed-off-by: sbadiger <[email protected]> * Process drain-failure nodes at the end (#394) * Process drain-failures at the end Signed-off-by: ssheladiya <[email protected]> Signed-off-by: sbadiger <[email protected]> * early-cordon nodes Signed-off-by: sbadiger <[email protected]> * early cordon Signed-off-by: sbadiger <[email protected]> * include context in cordon and drain functions Signed-off-by: sbadiger <[email protected]> * Release v1.0.8 (#400) Signed-off-by: sbadiger <[email protected]> * cordon only drifted instances Signed-off-by: sbadiger <[email protected]> * add unit tests Signed-off-by: sbadiger <[email protected]> * resolve merge conflicts * update go.sum * resolve test errors * remove cordon as upgrade strategy * remove space * improve test coverage * improve code coverage * remove redundant code * remove unused imports * Update controllers/providers/kubernetes/nodes.go Co-authored-by: Venkata Gunapati <[email protected]> * uncordon the nodes * error handling for uncordoning * add tests * handle uncordon scenario properly * Update controllers/providers/kubernetes/nodes.go Co-authored-by: Venkata Gunapati <[email protected]> * fix typo * fix lint errors * default the feature to false --------- Signed-off-by: sbadiger <[email protected]> Signed-off-by: Todd Ekenstam <[email protected]> Co-authored-by: Todd Ekenstam <[email protected]> Co-authored-by: Siddharth Sheladiya <[email protected]> Co-authored-by: Venkata Gunapati <[email protected]>
Problem:
At present, if upgrade-manager encounters a node that fails to drain, it gets stuck and does not proceed with draining other nodes until the failed node is repaired. This can cause significant delays if the drain-failure node takes a long time to fix.
Proposal:
Upgrade-manager skips the drain-failed nodes and returns to them once all the other nodes in the InstanceGroups have been rotated.
Changes introduced by this PR:
- [...] failed-drain value
- [...] failed-drain instances while selecting the target and moves on to other instances
- [...] failed-drain instances again
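The skip-and-retry behavior described in the proposal could be sketched roughly as follows. This is a minimal illustration and not the upgrade-manager implementation: the `Instance` type, the `Rotate` function, the `ErrDrainFailed` error, and the `drain` callback are all hypothetical names introduced here. Only the `failed-drain` value comes from the PR description.

```go
package main

import "errors"

// stateFailedDrain is the marker value named in the PR description.
// How upgrade-manager actually stores it is not shown here.
const stateFailedDrain = "failed-drain"

// ErrDrainFailed is a hypothetical error a drain callback can return.
var ErrDrainFailed = errors.New("drain failed")

// Instance is an illustrative stand-in for a node in an instance group.
type Instance struct {
	ID    string
	State string // empty, or stateFailedDrain once a drain has failed
}

// Rotate drains every instance once, deferring any that fail (or that are
// already marked failed-drain) until all other instances have been handled.
// It returns the IDs in the order they were rotated, plus the IDs that still
// failed on the second attempt.
func Rotate(instances []Instance, drain func(id string) error) (rotated, stillFailed []string) {
	var deferred []*Instance

	// First pass: skip failed-drain instances and move on to the others.
	for i := range instances {
		inst := &instances[i]
		if inst.State == stateFailedDrain {
			deferred = append(deferred, inst)
			continue
		}
		if err := drain(inst.ID); err != nil {
			inst.State = stateFailedDrain // mark and keep going
			deferred = append(deferred, inst)
			continue
		}
		rotated = append(rotated, inst.ID)
	}

	// Second pass: return to the failed-drain instances at the end.
	for _, inst := range deferred {
		if err := drain(inst.ID); err != nil {
			stillFailed = append(stillFailed, inst.ID)
			continue
		}
		inst.State = ""
		rotated = append(rotated, inst.ID)
	}
	return rotated, stillFailed
}
```

In this sketch, a drain failure on one instance no longer blocks the rest of the group: the failing instance is only retried after every other instance has been attempted.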