MON-4031: Add prometheusOperatorConfig API #2481
Signed-off-by: Mario Fernandez <[email protected]>
Most, if not all, of my recent comments from #2461 apply here
// Specifically, it can configure how the Prometheus Operator instance is deployed, pod scheduling, and resource allocation.
// When omitted, this means no opinion and the platform is left to choose a reasonable default, which is subject to change over time.
// +optional
PrometheusOperatorConfig PrometheusOperatorConfig `json:"prometheusOperatorConfig,omitempty,omitzero"`
How does this configuration relate to the configuration proposed in #2463?
This one is for the Prometheus Operator; the other one is the Prometheus config. They are related, of course, but they have different configs.
Is the Prometheus config used by the PrometheusOperator?
Would it make sense to co-locate the configurations under a top-level prometheus field?
Not directly. The Prometheus config is used by Prometheus. The Prometheus Operator manages Prometheus instances, Alertmanager, etc.
So what configures the Prometheus instances created by the Prometheus Operator to use the Prometheus Config?
CMO takes the PrometheusK8sConfig configmap and creates a CR.
The Prometheus Operator takes that CR and configures Prometheus.
I can understand your idea, but the Prometheus Operator manages all these components and I think it's not a good idea to have all fields inside PrometheusOperator.
A core feature of the Prometheus Operator is to monitor the Kubernetes API server for changes to specific objects and ensure that the current Prometheus deployments match these objects. The Operator acts on the following [Custom Resource Definitions (CRDs)](https://kubernetes.io/docs/tasks/access-kubernetes-api/extend-api-custom-resource-definitions/):
Prometheus, which defines a desired Prometheus deployment.
PrometheusAgent, which defines a desired Prometheus deployment, but running in Agent mode.
Alertmanager, which defines a desired Alertmanager deployment.
ThanosRuler, which defines a desired Thanos Ruler deployment.
ServiceMonitor, which declaratively specifies how groups of Kubernetes services should be monitored. The Operator automatically generates Prometheus scrape configuration based on the current state of the objects in the API server.
PodMonitor, which declaratively specifies how groups of pods should be monitored. The Operator automatically generates Prometheus scrape configuration based on the current state of the objects in the API server.
Probe, which declaratively specifies how groups of ingresses or static targets should be monitored. The Operator automatically generates Prometheus scrape configuration based on the definition.
ScrapeConfig, which declaratively specifies scrape configurations to be added to Prometheus. This CustomResourceDefinition helps with scraping resources outside the Kubernetes cluster.
PrometheusRule, which defines a desired set of Prometheus alerting and/or recording rules. The Operator generates a rule file, which can be used by Prometheus instances.
AlertmanagerConfig, which declaratively specifies subsections of the Alertmanager configuration, allowing routing of alerts to custom receivers, and setting inhibit rules.
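To make the hand-off described in this thread concrete, here is a minimal sketch of the two steps: CMO turns user-facing configuration into a Prometheus CR, and the Prometheus Operator turns that CR into a workload. Everything in it is an assumption for illustration: the types `PrometheusK8sSettings` and `PrometheusCR`, the functions `buildPrometheusCR` and `reconcile`, the default replica count, and the names used are hypothetical stand-ins, not the real CMO or Prometheus Operator code.

```go
package main

import "fmt"

// PrometheusK8sSettings is a hypothetical, trimmed-down stand-in for the
// user-facing Prometheus configuration that CMO reads.
type PrometheusK8sSettings struct {
	Retention string
	Replicas  int32
}

// PrometheusCR is a hypothetical, simplified stand-in for the Prometheus
// custom resource that the Prometheus Operator watches.
type PrometheusCR struct {
	Name      string
	Namespace string
	Retention string
	Replicas  int32
}

// buildPrometheusCR plays the role of CMO: it translates user-facing
// configuration into the CR that the operator will act on.
func buildPrometheusCR(cfg PrometheusK8sSettings) PrometheusCR {
	cr := PrometheusCR{
		Name:      "k8s",                  // illustrative name
		Namespace: "openshift-monitoring", // illustrative namespace
		Retention: cfg.Retention,
		Replicas:  cfg.Replicas,
	}
	if cr.Replicas == 0 {
		cr.Replicas = 2 // illustrative default, not taken from the PR
	}
	return cr
}

// reconcile plays the role of the Prometheus Operator: it derives a
// description of the workload it would create from the CR alone.
func reconcile(cr PrometheusCR) string {
	return fmt.Sprintf("statefulset %s/prometheus-%s replicas=%d retention=%s",
		cr.Namespace, cr.Name, cr.Replicas, cr.Retention)
}

func main() {
	cr := buildPrometheusCR(PrometheusK8sSettings{Retention: "15d"})
	fmt.Println(reconcile(cr))
}
```

Note that `reconcile` never sees the user-facing settings, only the CR. That is the sense in which the Prometheus config is "not directly" used by the operator.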
So to make sure I am following along, the CMO will:
- Deploy the PrometheusOperator based on the PrometheusOperatorConfig
- Create Prometheus CRs using the configurations provided in PrometheusK8sConfig. Does this apply to all Prometheus CRs?
While these are two distinct things, they are both inherently related to how the CMO handles prometheus configuration on the cluster.
> I can understand your idea but PrometheusOperator manages all these components and I think it's not a good idea to have all fields inside PrometheusOperator.

I'm not suggesting that we put all the fields under PrometheusOperatorConfig; I'm suggesting we use a shared parent field named prometheus that can have sibling fields for configuring the Prometheus Operator itself and, separately, the individual Prometheus instances. This way, if you want to add more Prometheus-related configuration options in the future, you don't have to add another Prometheus* field.
> - Deploy the PrometheusOperator based on the PrometheusOperatorConfig
> - Create Prometheus CRs using the configurations provided in PrometheusK8sConfig. Does this apply to all Prometheus CRs?

Correct

> I'm not suggesting that we put all the fields under PrometheusOperatorConfig, I'm suggesting we use a shared parent field named prometheus that can have sibling fields for configuring the Prometheus Operator itself and, separately, configuring the individual Prometheus instance configurations. This way, if you want to add additional configuration options related to prometheus in the future, you don't have to add another Prometheus* field.

But they are different things. They are related, but from my point of view, and given how CMO works, it makes no sense.
https://github.com/prometheus-operator/prometheus-operator
https://github.com/prometheus/prometheus
@danielmellado @simonpasquier any thoughts?
I would keep @marioferh's approach. While I understand @everettraven's concern about API organization, the reality is that prometheusOperatorConfig and prometheusK8sConfig are solving different problems, and IMHO having them under a shared parent would actually make things more confusing.
Think about it from an operator's perspective: when you're configuring prometheusOperatorConfig, you're basically saying "how should we deploy and run the Prometheus Operator pods themselves" - stuff like resource limits, node scheduling, log levels. But when you're dealing with prometheusK8sConfig, you're configuring "what should the actual Prometheus servers do" - scraping rules, storage, retention policies, etc. Again, I think mixing them together would be confusing.
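As a rough illustration of that split, the flat layout could look like the sketch below. It is hedged on several assumptions: `OperatorSettings`, `ServerSettings`, `FlatSpec`, `renderFlat`, and the individual fields (`logLevel`, `nodeSelector`, `retention`) are hypothetical stand-ins chosen to mirror the examples in this comment, not the types in this PR. It also uses pointer fields with `omitempty` for optionality, which differs from the value-typed field in the PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// OperatorSettings is a hypothetical type covering "how the Prometheus
// Operator pods themselves run".
type OperatorSettings struct {
	LogLevel     string            `json:"logLevel,omitempty"`
	NodeSelector map[string]string `json:"nodeSelector,omitempty"`
}

// ServerSettings is a hypothetical type covering "what the Prometheus
// servers do".
type ServerSettings struct {
	Retention string `json:"retention,omitempty"`
}

// FlatSpec keeps the two concerns as sibling top-level fields.
// Pointers are an assumption of this sketch so unset fields are omitted.
type FlatSpec struct {
	PrometheusOperatorConfig *OperatorSettings `json:"prometheusOperatorConfig,omitempty"`
	PrometheusK8sConfig      *ServerSettings   `json:"prometheusK8sConfig,omitempty"`
}

// renderFlat serializes the spec the way a user would see it in JSON.
func renderFlat(s FlatSpec) string {
	b, err := json.Marshal(s)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	// Only the operator is configured; the server field is left unset.
	spec := FlatSpec{PrometheusOperatorConfig: &OperatorSettings{LogLevel: "debug"}}
	fmt.Println(renderFlat(spec))
	// prints {"prometheusOperatorConfig":{"logLevel":"debug"}}
}
```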
Plus, we already have a working pattern with alertmanagerConfig and metricsServerConfig that users understand. Why break that consistency for a theoretical future problem?
If we do end up with too many prometheus fields later, I'm totally happy to revisit the structure, but I think that for now the separation actually makes the API clearer and more intuitive.
@simonpasquier wdyt?
> If we do end up with too many prometheus fields later, I'm totally happy to revisit the structure

If you go with a distributed field approach, you have to maintain that essentially forever or go through a pretty painful process to refine the structure once you've promoted the API to v1.

> Why break that consistency for a theoretical future problem?
I'm not sold that doing something like:

prometheusConfig:
  operator:
    ...
  servers:
    ...

breaks that consistency, but if you folks feel strongly that users will have a better experience with multiple prometheus*Config fields I won't block it.
If you don't think you'll ever have more than the two fields for the operator and servers respectively, this probably isn't that big of a deal.
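For comparison, here is a minimal sketch of how the umbrella layout above might translate into Go types. The names `NestedSpec`, `PrometheusSettings`, `OperatorOptions`, `ServerOptions`, `renderNested`, and the leaf fields are all hypothetical and exist only to show the shape; nothing here is taken from the PR.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// OperatorOptions and ServerOptions are hypothetical leaf types.
type OperatorOptions struct {
	LogLevel string `json:"logLevel,omitempty"`
}

type ServerOptions struct {
	Retention string `json:"retention,omitempty"`
}

// PrometheusSettings is the hypothetical umbrella type: one parent with
// sibling fields for the operator and for the servers it manages.
type PrometheusSettings struct {
	Operator *OperatorOptions `json:"operator,omitempty"`
	Servers  *ServerOptions   `json:"servers,omitempty"`
}

// NestedSpec exposes a single top-level prometheusConfig field.
type NestedSpec struct {
	PrometheusConfig *PrometheusSettings `json:"prometheusConfig,omitempty"`
}

// renderNested serializes the spec the way a user would see it in JSON.
func renderNested(s NestedSpec) string {
	b, err := json.Marshal(s)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	spec := NestedSpec{PrometheusConfig: &PrometheusSettings{
		Operator: &OperatorOptions{LogLevel: "debug"},
		Servers:  &ServerOptions{Retention: "15d"},
	}}
	fmt.Println(renderNested(spec))
	// prints {"prometheusConfig":{"operator":{"logLevel":"debug"},"servers":{"retention":"15d"}}}
}
```

In this shape, a later Prometheus-related option becomes a new sibling inside `PrometheusSettings` rather than another top-level field, which is the trade-off being weighed here.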
> Think about it from an operator's perspective: when you're configuring prometheusOperatorConfig, you're basically saying "how should we deploy and run the Prometheus Operator pods themselves" - stuff like resource limits, node scheduling, log levels. But when you're dealing with prometheusK8sConfig, you're configuring "what should the actual Prometheus servers do" - scraping rules, storage, retention policies, etc. Again, I think mixing them together would be confusing

I think the example above still considers this perspective and difference of field responsibilities. Except now you have a dedicated umbrella field that captures everything related to the configuration of the "Prometheus stack" (operator and the servers).
Again, if you folks feel strongly that this doesn't make sense and users would have a better experience with the currently implemented approach I won't stop it from being done, but it must be clear to users what each prometheus*Config field is responsible for and when they should/should not be specifying the fields.
[APPROVALNOTIFIER] This PR is NOT APPROVED
@marioferh: This pull request references MON-4031 which is a valid jira issue. Warning: The referenced jira issue has an invalid target version for the target branch this PR targets: expected the sub-task to target the "4.21.0" version, but no target version was set.