KIM has to be able to deal with hibernated Shoot clusters #722

Open
5 tasks
tobiscr opened this issue Mar 11, 2025 · 0 comments

Description

KIM currently applies patch and update operations to Shoot-Specs without verifying whether the Shoot is hibernated. A hibernated Shoot whose Shoot-Spec gets updated stays in the PENDING state indefinitely.
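The missing check could look roughly like the sketch below. It assumes the Shoot is available as a plain dictionary shaped like its manifest and that hibernation is visible through `spec.hibernation.enabled` (requested) and `status.hibernated` (reported by Gardener). The helper name `is_hibernated` is hypothetical and not part of KIM.

```python
from typing import Any, Mapping


def is_hibernated(shoot: Mapping[str, Any]) -> bool:
    """Return True if hibernation is requested or already reported for the Shoot.

    Assumption: `shoot` mirrors the Shoot manifest as nested dictionaries.
    """
    spec = shoot.get("spec") or {}
    status = shoot.get("status") or {}
    # Hibernation requested in the spec (may not be completed yet).
    requested = bool((spec.get("hibernation") or {}).get("enabled", False))
    # Hibernation reported in the status.
    reported = bool(status.get("hibernated", False))
    return requested or reported
```

Treating a requested but not yet completed hibernation as hibernated is a conservative choice in this sketch; whether KIM should do the same is part of the open discussion.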

We have to agree on the technical approach for how KIM treats hibernated clusters.

First ideas for strategies to handle such clusters:

  1. Don't update the cluster and mark it as dirty
    PRO: Clusters can be out of sync, but the cluster state won't switch to PENDING.
    CON: When the cluster is resumed, its state is not in sync with the configuration described in the RuntimeCR. It can take a while until the cluster is synchronized by the next full-reconciliation loop. We could shorten this delay by reducing the interval of the full-reconciliation loop to a few minutes (requires "Avoid Gardener API requests by using caching layer for Runtime CR" #697).
  2. Apply the update and mark the cluster as dirty, so that KIM ignores the PENDING state and marks the cluster as READY
    PRO: The Shoot is always in sync with the RuntimeCR.
    CON: The Shoot-Spec shows the PENDING state and KIM has to ignore it until the cluster gets resumed. If Gardener fails to apply the changes, the Shoot state could switch to an error state without KIM noticing it. This could be mitigated by shortening the interval of the full-reconciliation loop to a few minutes (requires "Avoid Gardener API requests by using caching layer for Runtime CR" #697).
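To make the difference between the two strategies concrete, here is a minimal sketch of the decision each one implies. All names (`Strategy`, `Decision`, `plan_update`) are hypothetical and only illustrate the discussion; they do not describe existing KIM code.

```python
from dataclasses import dataclass
from enum import Enum


class Strategy(Enum):
    SKIP_UPDATE = 1       # strategy 1: do not touch the hibernated Shoot
    APPLY_AND_IGNORE = 2  # strategy 2: patch the Shoot and ignore PENDING


@dataclass(frozen=True)
class Decision:
    apply_update: bool    # patch/update the Shoot-Spec now
    mark_dirty: bool      # flag the RuntimeCR for the next full reconciliation
    ignore_pending: bool  # report READY although the Shoot shows PENDING


def plan_update(hibernated: bool, strategy: Strategy) -> Decision:
    """Decide how to handle a pending RuntimeCR change for a Shoot."""
    if not hibernated:
        # Regular case: apply the change and observe the Shoot state as usual.
        return Decision(apply_update=True, mark_dirty=False, ignore_pending=False)
    if strategy is Strategy.SKIP_UPDATE:
        # Shoot stays untouched, so it never enters PENDING but is out of sync.
        return Decision(apply_update=False, mark_dirty=True, ignore_pending=False)
    # Shoot is in sync with the RuntimeCR, but its PENDING state must be ignored.
    return Decision(apply_update=True, mark_dirty=True, ignore_pending=True)
```

In both strategies a hibernated cluster ends up marked as dirty; they differ only in whether the Shoot-Spec is written immediately and whether KIM has to mask the PENDING state.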

Independent of the approach used, we have to mark such Shoots as dirty (out of sync with the RuntimeCR). This could be done by adding a label or annotation to the RuntimeCR. During a full reconciliation of all RuntimeCRs, such clusters have to be reconciled, and we have to ensure that the RuntimeCR is applied to the Shoot-Spec and the cluster is in a healthy state.
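A rough sketch of the dirty marker and its handling during a full reconciliation follows. The annotation key `example.invalid/shoot-out-of-sync` is a placeholder, RuntimeCRs are modelled as plain dictionaries, and `sync_and_check` stands in for whatever KIM would use to apply the RuntimeCR to the Shoot-Spec and verify cluster health. None of these names come from KIM, and the sketch covers only the dirty clusters, not the rest of the full-reconciliation loop.

```python
from typing import Any, Callable, Dict, List

# Hypothetical annotation key; the real key would have to be agreed on.
DIRTY_ANNOTATION = "example.invalid/shoot-out-of-sync"


def mark_dirty(runtime_cr: Dict[str, Any]) -> None:
    """Flag a RuntimeCR whose Shoot may be out of sync."""
    metadata = runtime_cr.setdefault("metadata", {})
    metadata.setdefault("annotations", {})[DIRTY_ANNOTATION] = "true"


def is_dirty(runtime_cr: Dict[str, Any]) -> bool:
    annotations = (runtime_cr.get("metadata") or {}).get("annotations") or {}
    return annotations.get(DIRTY_ANNOTATION) == "true"


def full_reconcile(
    runtime_crs: List[Dict[str, Any]],
    sync_and_check: Callable[[Dict[str, Any]], bool],
) -> List[str]:
    """Reconcile dirty RuntimeCRs and return the names that remain dirty.

    `sync_and_check` is assumed to apply the RuntimeCR to the Shoot-Spec and
    return True only if the cluster is in sync and healthy afterwards.
    """
    still_dirty: List[str] = []
    for cr in runtime_crs:
        if not is_dirty(cr):
            continue
        if sync_and_check(cr):
            # In sync and healthy: the marker is no longer needed.
            del cr["metadata"]["annotations"][DIRTY_ANNOTATION]
        else:
            # Still hibernated or unhealthy: keep the marker for the next loop.
            still_dirty.append(cr["metadata"]["name"])
    return still_dirty
```

Keeping the marker until the health check succeeds means a cluster that is still hibernated, or that Gardener failed to update, is picked up again in the next loop.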

AC:

Expected result

Hibernated Shoots cannot receive updates without staying in the PENDING state indefinitely. KIM has to be extended to handle such Shoots properly.

Actual result

Hibernated Shoots are updated and stay in the PENDING state until the cluster is resumed and Gardener applies the updates.

Steps to reproduce

  1. Create a cluster and hibernate it
  2. Update the cluster with KIM
  3. Shoot-Spec shows PENDING state

Troubleshooting

tobiscr changed the title from "KIM has to ensure Shoot is not hibernated before applying patch/update operations on Shoot-Spec" to "KIM has to be able to deal with hibernated Shoot clusters" on Mar 11, 2025
tobiscr added and then removed the kind/bug label (Categorizes issue or PR as related to a bug) on Mar 11, 2025