Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversion webhook hardcoded at kube-system namespace #6818

Open
marcofranssen opened this issue Aug 21, 2024 · 21 comments
Open

Conversion webhook hardcoded at kube-system namespace #6818

marcofranssen opened this issue Aug 21, 2024 · 21 comments
Labels
bug Something isn't working needs-triage Issues that need to be triaged

Comments

@marcofranssen
Copy link

marcofranssen commented Aug 21, 2024

Description

Observed Behavior:

When upgrading our karpenter to the v1.0.0 chart it fails at the conversion webhook which is targeting the kube-system namespace. Our karpenter is deployed in the karpenter namespace.

controller logs

{"level":"ERROR","time":"2024-08-21T09:02:09.801Z","logger":"controller","message":"k8s.io/[email protected]/tools/cache/reflector.go:232: Failed to watch *v1.NodeCl │
│ aim: failed to list *v1.NodeClaim: conversion webhook for karpenter.sh/v1beta1, Kind=NodeClaim failed: Post \"https://karpenter.kube-system.svc:8443/conversion/karpenter.sh?timeout=30s\": service  │
│ \"karpenter\" not found","commit":"5bdf9c3"}

Expected Behavior:

The conversion webhook targets the Helm Release namespace.

Reproduction Steps (Please include YAML):

Install Karpenter in the karpenter namespace using the release prior to v1.0.0. Then upgrade karpenter to v1.0.0.

Versions:

  • Chart Version: v1.0.0
  • Kubernetes Version (kubectl version): 1.30
  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • Please do not leave "+1" or "me too" comments, they generate extra noise for issue followers and do not help prioritize the request
  • If you are interested in working on this issue or have submitted a pull request, please leave a comment
@marcofranssen marcofranssen added bug Something isn't working needs-triage Issues that need to be triaged labels Aug 21, 2024
@marcofranssen
Copy link
Author

Turns out this is controller by

https://github.com/aws/karpenter-provider-aws/blob/main/charts/karpenter-crd/values.yaml#L5

This is something that should be made very clear in the documentation or migration guide.

It isn't mentioned here https://aws.amazon.com/blogs/containers/announcing-karpenter-1-0/

@woehrl01
Copy link
Contributor

@marcofranssen We just stumbled over this as well. At best, this could be by default the namespace of the helm chart deployment.

@marcofranssen
Copy link
Author

yes indeed. that would even be better

@denniszag
Copy link

I encountered the same problem using ArgoCD. The karpenter-crd chart is installed correctly in the karpenter namespace. But the karpenter chart also installs the same CRDs (using ArgoCD) but wants to change it to kube-system namespace.
An option to disable installing CRDs using the karpenter chart can help.

@marcofranssen
Copy link
Author

I encountered the same problem using ArgoCD. The karpenter-crd chart is installed correctly in the karpenter namespace. But the karpenter chart also installs the same CRDs (using ArgoCD) but wants to change it to kube-system namespace. An option to disable installing CRDs using the karpenter chart can help.

You can already do so by --skip-crds.

@woehrl01
Copy link
Contributor

woehrl01 commented Aug 21, 2024

@marcofranssen using that value only works if you install the "karpenter-crd" helm chart, if you use the "normal" one, they are symlinked and replacement does not happen. (https://github.com/aws/karpenter-provider-aws/tree/28da0b96b6086679f75e656d31ac65bd7fca2bc0/charts/karpenter)

@woehrl01
Copy link
Contributor

woehrl01 commented Aug 21, 2024

Bildschirmfoto 2024-08-21 um 14 34 10

Looks like the modification is made with a post job hook, and the referenced image digest is not compatible with ARM

@woehrl01
Copy link
Contributor

Related to #6819 #6765

@Vinaum8
Copy link

Vinaum8 commented Aug 22, 2024

Related to #6544

@marcofranssen
Copy link
Author

Seems to be resolved in #6827

@jmdeal
Copy link
Contributor

jmdeal commented Aug 26, 2024

There's some additional justification given here, but if you need to template the CRDs (required if installing Karpenter outside of kube-system or customizing the service), you will need to use the karpenter-crd chart. This is why the karpenter-crd chart is used in the v1 migration guide, with an alternative of manually patching the CRDs.

As far as what's missing from our docs, I do think we can be a little more explicit. We do instruct users to install the CRD chart as part of the installation guide, but without a justification of why I can understand why users would continue to just use the standard chart since it already includes the CRDs, just without templating.

@booleanbetrayal
Copy link

As far as what's missing from our docs, I do think we can be a little more explicit. We do instruct users to install the CRD chart as part of the installation guide, but without a justification of why I can understand why users would continue to just use the standard chart since it already includes the CRDs, just without templating.

@jmdeal - In our case, in an ArgoCD environment, it's not the lack of clear instructions or justifications for using the karpenter-crd chart ... it's that you cannot deploy the CRDs from the karpenter-crd at all over ArgoCD managed resources, because of .Release.Service / managed-by and release-name stamping. You get fun errors like:

Error: rendered manifests contain a resource that already exists. Unable to continue with install: CustomResourceDefinition "ec2nodeclasses.karpenter.k8s.aws" in namespace "" exists and cannot be imported into the current release: invalid ownership metadata; label validation error: missing key "app.kubernetes.io/managed-by": must be set to "Helm"; annotation validation error: missing key "meta.helm.sh/release-name": must be set to "karpenter-crd"

So you're stuck with the situation of having to figure it out yourself and potentially dealing with the Application construction as we are discussing in #6847.

@frimik
Copy link

frimik commented Sep 12, 2024

For now, "fixing" it with an override on top of the helm chart. Because the validation in the webhook is still useful for me on older Kubernetes versions.

From Tanka with ❤️.

{
          local conversionWebhookSpecMixin = {
            spec+: {
              conversion+: {
                webhook+: {
                  clientConfig+: {
                    service+: {
                      namespace: c.karpenter.namespace,
                    },
                  },
                },
              },
            },
          },

      karpenter: helm.template(releaseName, './vendor/charts/karpenter', {
          apiVersions: apiVersions,
          includeCrds: includeCrds,
          kubeVersion: kubeVersion,
          namespace: c.karpenter.namespace,
          noHooks: noHooks,
          values: values,
        }) + {
          // https://github.com/aws/karpenter-provider-aws/issues/6818
          'custom_resource_definition_ec_2nodeclasses.karpenter.k_8s.aws'+: conversionWebhookSpecMixin,
          'custom_resource_definition_nodeclaims.karpenter.sh'+: conversionWebhookSpecMixin,
          'custom_resource_definition_nodepools.karpenter.sh'+: conversionWebhookSpecMixin,
       },
}

@frimik
Copy link

frimik commented Sep 12, 2024

Seems to be backported to v1.0.2. See #6855, #6849 and then there are possibly several other open issues at least partially related to this issue and fix. Right?

@icicimov
Copy link

I'm seeing the same error:

{"level":"ERROR","time":"2024-10-23T04:35:12.676Z","logger":"controller","message":"k8s.io/[email protected]/tools/cache/reflector.go:232: Failed to watch *v1.NodePool: failed to list *v1.NodePool: conversion webhook for karpenter.sh/v1beta1, Kind=NodePool failed: Post \"https://karpenter.kube-system.svc:8443/conversion/karpenter.sh?timeout=30s\": service \"karpenter\" not found","commit":"f587167"}
panic: Conversion webhook enabled but unable to complete call: Timeout: failed waiting for *v1.NodePool Informer to sync

but for karpenter chart 0.37.5 (not using karpenter-crd chart) installed in karpenter NameSpace:

$ kubectl get pods -n karpenter karpenter-5bdb98d7c-clpn9 -o yaml | grep 'image:'
    image: public.ecr.aws/karpenter/controller:0.37.5@sha256:df978d853e0d96f905eac1e7abd87075e9fb617ce610bee3dd9a4f947c31d573

I thought this should be an issue with 1.0+ as per the above conversations? Bit suspicious that it is looking for version v1.NodePool though right (since 0.37.5 is still v1beta1 API) ?

@honarkhah
Copy link

honarkhah commented Oct 23, 2024

Thanks to all this worked out for me

$ export KARPENTER_VERSION=1.0.6
$ helm registry logout public.ecr.aws
$ helm upgrade --install karpenter-crd oci://public.ecr.aws/karpenter/karpenter-crd  --version "${KARPENTER_VERSION}" --namespace karpenter --create-namespace
$ helm upgrade --install karpenter oci://public.ecr.aws/karpenter/karpenter --version "${KARPENTER_VERSION}" -f values/karpenter-values.yaml -n karpenter --skip-crds

@ketozhang
Copy link

ketozhang commented Nov 1, 2024

Those that installs Karpenter as a Helm subchart are susceptible to this issue.

Options are to move Karpenter out of the parent chart OR to install karpenter-crd separately.

@Makeshift
Copy link

Additionally, the service that Karpenter creates is named after the release name.

The webhook seems to be hardcoded to look for a service named karpenter, so mine was erroring due to me having called the release karpenter-controller.

@corkupine
Copy link

corkupine commented Dec 5, 2024

I thought this should be an issue with 1.0+ as per the above conversations? Bit suspicious that it is looking for version v1.NodePool though right (since 0.37.5 is still v1beta1 API) ?

@icicimov - I am running into the same issue with 0.35.11 while preparing to migrate to 1.0. Were you able to resolve it?

@icicimov
Copy link

icicimov commented Dec 9, 2024

@corkupine in preparation to 1.0 move I just split the charts into separate dependency as recommended in the docs:

dependencies:
  - name: karpenter-crd
    version: 0.37.5
    repository: oci://public.ecr.aws/karpenter
  - name: karpenter
    version: 0.37.5
    repository: oci://public.ecr.aws/karpenter

@Vinaum8
Copy link

Vinaum8 commented Dec 11, 2024

@corkupine in preparation to 1.0 move I just split the charts into separate dependency as recommended in the docs:

dependencies:
  - name: karpenter-crd
    version: 0.37.5
    repository: oci://public.ecr.aws/karpenter
  - name: karpenter
    version: 0.37.5
    repository: oci://public.ecr.aws/karpenter

I did this too, I'm using argocd and it works.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working needs-triage Issues that need to be triaged
Projects
None yet
Development

No branches or pull requests