-
Notifications
You must be signed in to change notification settings - Fork 1.4k
📖 Update in-place update proposal #13088
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
📖 Update in-place update proposal #13088
Conversation
|
[APPROVALNOTIFIER] This PR is NOT APPROVED This pull-request has been approved by: The full list of commands accepted by this bot can be found here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing |
873a1ce to
cffc8a3
Compare
|
/cherry-pick release-1.12 |
|
@fabriziopandini: once the present PR merges, I will cherry-pick it on top of In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
cffc8a3 to
e9b8505
Compare
|
/cherry-pick release-1.12 |
|
@fabriziopandini: once the present PR merges, I will cherry-pick it on top of In response to this:
Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes-sigs/prow repository. |
ef31d28 to
595ae83
Compare
alexander-demicev
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thanks a lot! We should add a note to not forget to update infra machine contract too.
/lgtm
|
LGTM label has been added. Git tree hash: 18d3cbb6c27cde15c6e9dab3dee09e56952ad61e
|
Added to the tracking issue |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be a separate PR / Bugfix so it shows up in the release notes properly too? (ok to keep it on this, just curious)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes definitely (and this PR already merged)
| * [Implementing Runtime Extensions](./implement-extensions.md) | ||
| * [Implementing Lifecycle Hook Extensions](./implement-lifecycle-hooks.md) | ||
| * [Implementing Topology Mutation Hook Extensions](./implement-topology-mutation-hook.md) | ||
| * [Implementing In-Place Update Hooks Extensions](./implement-in-place-update-hooks.md) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: we should either use In-place or In-Place everywhere :-)
|
|
||
| ## Introduction | ||
|
|
||
| The proposal for [n-place updates in Cluster API](https://github.com/kubernetes-sigs/cluster-api/blob/main/docs/20240807-in-place-updates.md) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| The proposal for [n-place updates in Cluster API](https://github.com/kubernetes-sigs/cluster-api/blob/main/docs/20240807-in-place-updates.md) | |
| The proposal for [in-place updates in Cluster API](https://github.com/kubernetes-sigs/cluster-api/blob/main/docs/20240807-in-place-updates.md) |
| is “buffer” for in-place, in-place update can proceed. | ||
| - When in-place is possible, the system should try to in-place update as many machines as possible. | ||
| In practice, this means that maxSurge might be not fully used (it is used only for scale up by one if maxUnavailable=0). | ||
| - No in-place updates are performed for workers machines when using rollout strategy on delete. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| - No in-place updates are performed for workers machines when using rollout strategy on delete. | |
| - No in-place updates are performed for workers machines when using rollout strategy `OnDelete`. |
|
|
||
| This hook is called by KCP when performing the "can update in-place" for a control plane machine. | ||
|
|
||
| Example request |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| Example request | |
| Example request: |
| - Only spec is provided, status fields are not included | ||
| - When more than one extension will be supported, the current state will already include changes that can handle in-place by other runtime extensions. | ||
| Example Response |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| Example Response | |
| Example Response: |
| Note: | ||
| - All the objects will have the latest API version known by Cluster API. | ||
| - Only spec is provided, status fields are not included | ||
| - When more than one extension will be supported, the current state will already include changes that can handle in-place by other runtime extensions. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"will be supported" is unclear to me, can we be more specific?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
clarified, PTAL
| This hook is called by the MachineDeployment controller when performing the "can update in-place" for all the Machines controlled by | ||
| a MachineSet. | ||
| Example request |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| Example request | |
| Example request: |
| - Only spec is provided, status fields are not included | ||
| - When more than one extension will be supported, the current state will already include changes that can handle in-place by other runtime extensions. | ||
| Example Response |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| Example Response | |
| Example Response: |
| This hook is called by the Machine controller when performing the in-place updates for a Machine. | ||
| Example request |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| Example request | |
| Example request: |
| - Only desired is provided (the external updater extension should know current state of the Machine). | ||
| - Only spec is provided, status fields are not included | ||
| Example Response |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| Example Response | |
| Example Response: |
|
|
||
| <h1>High complexity</h1> | ||
|
|
||
| Implementing the in-place update transition in a race condition free, re-entrant way is more complex that it might seem. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| Implementing the in-place update transition in a race condition free, re-entrant way is more complex that it might seem. | |
| Implementing the in-place update transition in a race condition-free, re-entrant way is more complex than it might seem. |
| As a first step, CAPI controllers will compute the set of desired changes (current and desired state). | ||
|
|
||
| If any of the desired changes cannot be covered by the updaters capabilities, CAPI will determine the desired state cannot be reached through external updaters. In this case, it will fallback to the rolling update strategy, replacing machines as needed. | ||
| Then CAPI controllers will then iterate over the registered external updaters, requesting to each updater if it can handle required changes through the `CanUpdateMachineSet` (MachineDeployment) and `CanUpdateMachine` (KCP). |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
| Then CAPI controllers will then iterate over the registered external updaters, requesting to each updater if it can handle required changes through the `CanUpdateMachineSet` (MachineDeployment) and `CanUpdateMachine` (KCP). | |
| Then CAPI controllers will iterate over the registered external updaters, requesting each updater if it can handle required changes through the `CanUpdateMachineSet` (MachineDeployment) and `CanUpdateMachine` (KCP). |
66f9925 to
f577682
Compare
f577682 to
b3e756d
Compare
|
Feedback addressed, also found a two other places where to add a few words about in-place |
|
Looks like all comments have been addressed, and overall looks good to me. /lgtm |
|
LGTM label has been added. Git tree hash: 6e60c5f91e39ba912b6cfef8b988dbc6c9e9e94b
|
What this PR does / why we need it:
Which issue(s) this PR fixes (optional, in
fixes #<issue number>(, fixes #<issue_number>, ...)format, will close the issue(s) when PR gets merged):Fixes #
/area documentation