MON-4361: Annotate optional monitoring manifests #2675
base: main
Conversation
@rexagod: This pull request references MON-4361 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the task to target the "4.21.0" version, but no target version was set.
[APPROVALNOTIFIER] This PR is APPROVED. This pull-request has been approved by: rexagod. The full list of commands accepted by this bot can be found here. The pull request process is described here.
Needs approval from an approver in each of these files:
Approvers can indicate their approval by writing
Force-pushed from c4db7d8 to a322ff8.
It might be good to wait for #2649 since it's migrating all the dashboards to static assets.
On the jsonnet implementation side, I wonder if it wouldn't be easier to read/maintain if we injected the annotation into each component that needs it.
E.g. here, for components for which all resources are OptionalMonitoring:
cluster-monitoring-operator/jsonnet/main.jsonnet
Lines 520 to 539 in ea9a533
```jsonnet
{ ['alertmanager/' + name]: inCluster.alertmanager[name] for name in std.objectFields(inCluster.alertmanager) } +
{ ['alertmanager-user-workload/' + name]: userWorkload.alertmanager[name] for name in std.objectFields(userWorkload.alertmanager) } +
{ ['cluster-monitoring-operator/' + name]: inCluster.clusterMonitoringOperator[name] for name in std.objectFields(inCluster.clusterMonitoringOperator) } +
{ ['dashboards/' + name]: inCluster.dashboards[name] for name in std.objectFields(inCluster.dashboards) } +
{ ['kube-state-metrics/' + name]: inCluster.kubeStateMetrics[name] for name in std.objectFields(inCluster.kubeStateMetrics) } +
{ ['node-exporter/' + name]: inCluster.nodeExporter[name] for name in std.objectFields(inCluster.nodeExporter) } +
{ ['openshift-state-metrics/' + name]: inCluster.openshiftStateMetrics[name] for name in std.objectFields(inCluster.openshiftStateMetrics) } +
{ ['prometheus-k8s/' + name]: inCluster.prometheus[name] for name in std.objectFields(inCluster.prometheus) } +
{ ['admission-webhook/' + name]: inCluster.admissionWebhook[name] for name in std.objectFields(inCluster.admissionWebhook) } +
{ ['prometheus-operator/' + name]: inCluster.prometheusOperator[name] for name in std.objectFields(inCluster.prometheusOperator) } +
{ ['prometheus-operator-user-workload/' + name]: userWorkload.prometheusOperator[name] for name in std.objectFields(userWorkload.prometheusOperator) } +
{ ['prometheus-user-workload/' + name]: userWorkload.prometheus[name] for name in std.objectFields(userWorkload.prometheus) } +
{ ['metrics-server/' + name]: inCluster.metricsServer[name] for name in std.objectFields(inCluster.metricsServer) } +
// needs to be removed once remote-write is allowed for sending telemetry
{ ['telemeter-client/' + name]: inCluster.telemeterClient[name] for name in std.objectFields(inCluster.telemeterClient) } +
{ ['monitoring-plugin/' + name]: inCluster.monitoringPlugin[name] for name in std.objectFields(inCluster.monitoringPlugin) } +
{ ['thanos-querier/' + name]: inCluster.thanosQuerier[name] for name in std.objectFields(inCluster.thanosQuerier) } +
{ ['thanos-ruler/' + name]: inCluster.thanosRuler[name] for name in std.objectFields(inCluster.thanosRuler) } +
{ ['control-plane/' + name]: inCluster.controlPlane[name] for name in std.objectFields(inCluster.controlPlane) } +
{ ['manifests/' + name]: inCluster.manifests[name] for name in std.objectFields(inCluster.manifests) } +
```
Or at the level of the jsonnet component file when the annotation is needed per resource.
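For illustration, here is a minimal jsonnet sketch of that idea. The `withOptionalMonitoring` helper and the toy `alertmanager` component are hypothetical stand-ins, not code from this PR; in main.jsonnet the real components would be e.g. `inCluster.alertmanager`:

```jsonnet
// Sketch only: stamp the capability annotation on every resource of a component,
// so the annotation is injected once per component rather than once per asset file.
local withOptionalMonitoring(component) = {
  [name]: component[name] + {
    metadata+: {
      annotations+: {
        'capability.openshift.io/name': 'OptionalMonitoring',
      },
    },
  }
  for name in std.objectFields(component)
};

// Toy component standing in for e.g. inCluster.alertmanager.
local alertmanager = {
  service: { kind: 'Service', metadata: { name: 'alertmanager-main' } },
  serviceAccount: { kind: 'ServiceAccount', metadata: { name: 'alertmanager-main' } },
};

// In main.jsonnet the component reference would be wrapped, leaving the
// per-manifest comprehension above otherwise unchanged.
{ ['alertmanager/' + name]: withOptionalMonitoring(alertmanager)[name] for name in std.objectFields(alertmanager) }
```

Components that are only partially optional would instead set the annotation on the individual resources in their own jsonnet component files.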
CHANGELOG.md
Outdated
- `KubePdbNotEnoughHealthyPods`
- `KubeNodePressure`
- `KubeNodeEviction`
- []() Allow cluster-admins to opt-into optional monitoring using the `OptionalMonitoring` capability.
I realize that adding the annotation to the manifests under the assets/
directory will have no direct effect since there's no logic in CMO to deploy these resources conditionally, right?
I've raised a PR for that: https://github.com/openshift/cluster-monitoring-operator/pull/2688/files.
Thanks!
Force-pushed from 461f6b9 to cbdd203.
Reverted the
Metric rules and metrics exporters have not been opted in, to keep the telemetry rules functioning. Optional components include:
* Alertmanager
* AlertmanagerUWM
* ClusterMonitoringOperatorDeps (partially, for AM)
* MonitoringPlugin
* PrometheusOperator (partially, for AM)
* PrometheusOperatorUWM
* ThanosRuler
Signed-off-by: Pranshu Srivastava <[email protected]>
@rexagod: The following tests failed:
```yaml
kind: ValidatingWebhookConfiguration
metadata:
  annotations:
    capability.openshift.io/name: OptionalMonitoring
```
If the service is optional (*), shouldn't we apply the annotation to all admission-webhook resources?
(*) there could be an argument that we still want the admission webhook for PrometheusRule resources because of telemetry?
Not directly related to this change but if the console is disabled, wouldn't it be logical to avoid deploying the monitoring plugin resources?
```yaml
annotations:
  api-approved.openshift.io: https://github.com/openshift/api/pull/1406
  api.openshift.io/merged-by-featuregates: "true"
  capability.openshift.io/name: OptionalMonitoring
```
I'm not sure that CMO will start if the CRDs aren't present.
```yaml
kind: CustomResourceDefinition
metadata:
  annotations:
    capability.openshift.io/name: OptionalMonitoring
```
Same here: IIRC the Prometheus operator will (at the minimum) log errors if the CRDs aren't installed.