-
Notifications
You must be signed in to change notification settings - Fork 628
✨ Support EKS upgrade policy #5471
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Conversation
Welcome @phuhung273! |
Hi @phuhung273. Thanks for your PR. I'm waiting for a kubernetes-sigs member to verify that this patch is reasonable to test. If it is, they should reply with Once the patch is verified, the new status will be reflected by the I understand the commands that are listed here. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
70e3003
to
fd458b2
Compare
Hi @richardcase, can you please help me review this PR if you have time |
/ok-to-test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this @phuhung273 .
Feel free to ping me on the issue or in slack if you want a hand with anything.
// The default value is EXTENDED. Use STANDARD to disable extended support. | ||
// +kubebuilder:validation:Enum=EXTENDED;STANDARD | ||
// +optional | ||
UpgradePolicy *string `json:"upgradePolicy,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We generally use a string type alias for something like this and then variables defined for each supported value. Like this: https://github.com/kubernetes-sigs/cluster-api-provider-aws/blob/main/controlplane/eks/api/v1beta2/types.go#L147-L157.
Also, the actual string values are normally lowercase so extended
instead of EXTENDED
. This does mean that when creating the AWS request that some conversion is done, like this for addons: https://github.com/kubernetes-sigs/cluster-api-provider-aws/blob/main/pkg/cloud/services/eks/addons.go#L212
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you, I didn't know this. Updated to use string type alias.
pkg/cloud/services/eks/cluster.go
Outdated
} | ||
|
||
if err := wait.WaitForWithRetryable(wait.NewBackoff(), func() (bool, error) { | ||
if _, err := s.EKSClient.UpdateClusterConfig(input); err != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is already a UpdateClusterConfig call in the reconciler, it would be better to use this instead of having a new separate update. https://github.com/kubernetes-sigs/cluster-api-provider-aws/blob/main/pkg/cloud/services/eks/cluster.go#L543
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, updated to use current UpdateClusterConfig call
24f8b66
to
353e345
Compare
Hi @richardcase, if you have time, please help me review the latest update |
4b49e38
to
00efef1
Compare
Hi @richardcase, can you help me review this |
Hi @richardcase, can you help review the change addressing your comments. |
// The support policy to use for the cluster. | ||
// Extended support indicates that the cluster will not be automatically upgraded | ||
// when it leaves the standard support period, and will enter extended support. | ||
// Clusters in extended support have higher costs. | ||
// The default value is extended. Use standard to disable extended support. | ||
// +kubebuilder:validation:Enum=extended;standard | ||
// +optional | ||
UpgradePolicy *UpgradePolicy `json:"upgradePolicy,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does it have to be a pointer? Considering it is marked as optional If we omit it it'll be empty right?
Then in that case we can default it to extended.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes agree. As this is a type alias for a string, we can do without the pointer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In case current cluster is STANDARD, eg: by workaround since CAPA doesn't support yet. Defaulting to EXTENDED will suddenly switch all clusters to EXTENDED, increasing the cost.
There is a testcase for this situation: if the field is nil, don't do anything https://github.com/phuhung273/cluster-api-provider-aws/blob/00efef15c7cb5263a6409d2b998f8583d5e0d1bd/pkg/cloud/services/eks/cluster_test.go#L703-L710
Do you think it is a valid case ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the default in the SDK for EKS if nothing is specified as a policy at the moment? If the default is "standard"
then we should keep standard as default here IMO. And that wouldn't cause the issue you are talking about
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
By default:
- CreateCluster is EXTENDED
- UpdateCluster doesn't touch that param if not specified.
The case im mentioning is when user have other stuff managing clusters on top of CAPA.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes I agree. In that case we can add a comment that explains how CAPA will behave when the user omits the value or sets it to ""
.
i.e. It will let the AWS Platform choose whatever default value is there for the upgrade policy (at the time of writing Extended, but this may change in the future).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems like we have an agreement on: Extended, Standard and omitted (""). Thank you both.
I think for new clusters we should explicitly state what the default is in our API.
Regarding to this point, we should explicitly set EXTENDED in createCluster
if user ommit rite ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Regarding to this point, we should explicitly set EXTENDED in createCluster if user ommit rite ?
We should omit setting it in createCluster if the user omits it
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for the confirmation, let me update the PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've updated to: Extended, Standard or omitted (""). PTAL @damdo @richardcase, thanks.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good @phuhung273 .
It would also be worth adding a line to the docs so state that by default we create a cluster with "extended" support, perhaps somehwre here: https://cluster-api-aws.sigs.k8s.io/topics/eks/creating-a-cluster
// The support policy to use for the cluster. | ||
// Extended support indicates that the cluster will not be automatically upgraded | ||
// when it leaves the standard support period, and will enter extended support. | ||
// Clusters in extended support have higher costs. | ||
// The default value is extended. Use standard to disable extended support. | ||
// +kubebuilder:validation:Enum=extended;standard | ||
// +optional | ||
UpgradePolicy *UpgradePolicy `json:"upgradePolicy,omitempty"` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes agree. As this is a type alias for a string, we can do without the pointer.
pkg/cloud/services/eks/cluster.go
Outdated
return true, nil | ||
} | ||
|
||
func convertUpgradePolicy(input ekscontrolplanev1.UpgradePolicy) ekstypes.SupportType { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We sometimes but these in the "convert" package when converting to/from SDK types.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, can see a similar converter package. I moved this function there.
b5b2dc7
to
12b6056
Compare
/test pull-cluster-api-provider-aws-e2e-blocking |
/test pull-cluster-api-provider-aws-e2e |
2 similar comments
/test pull-cluster-api-provider-aws-e2e |
/test pull-cluster-api-provider-aws-e2e |
Look like test infra issue |
/test pull-cluster-api-provider-aws-e2e |
2 similar comments
/test pull-cluster-api-provider-aws-e2e |
/test pull-cluster-api-provider-aws-e2e |
@phuhung273 context for that error here: https://kubernetes.slack.com/archives/CD6U2V71N/p1758795545213209 |
This one is spot instance test so if it is flaky i'm not surprised 😅 So i think it make sense to retry. But this run https://prow.k8s.io/view/gs/kubernetes-ci-logs/pr-logs/pull/kubernetes-sigs_cluster-api-provider-aws/5471/pull-cluster-api-provider-aws-e2e/1971153449110212608 it failed 3 other cases, while the spot one you're referring passed. |
The underlying issue is the same one. If we fix that, we fix all of them. Let's keep chatting on that thread |
/test pull-cluster-api-provider-aws-e2e |
Co-authored-by: Damiano Donati <[email protected]>
Co-authored-by: Damiano Donati <[email protected]>
Co-authored-by: Damiano Donati <[email protected]>
…olplanes.yaml Co-authored-by: Faiq <[email protected]>
Co-authored-by: Faiq <[email protected]>
Co-authored-by: Damiano Donati <[email protected]>
a8f7d49
to
52f04d0
Compare
Just rebase to see if thing gets better. |
/test pull-cluster-api-provider-aws-e2e |
@phuhung273: The following tests failed, say
Full PR test history. Your PR dashboard. Please help us cut down on flakes by linking to an open issue when you hit one in your PR. Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. I understand the commands that are listed here. |
What type of PR is this?
/kind feature
What this PR does / why we need it:
Support EKS upgrade policy
Which issue(s) this PR fixes (optional, in
fixes #<issue number>(, fixes #<issue_number>, ...)
format, will close the issue(s) when PR gets merged):Fixes #5183
Special notes for your reviewer:
Checklist:
Release note: