use keep_firing_for to prevent flappy alerts #63
@@ -6,6 +6,7 @@ groups:
rules:
- alert: DeploymentMissingReplicas
expr: (kube_deployment_spec_replicas != kube_deployment_status_replicas_available) * ON (deployment, namespace) group_left(annotation_app_uw_systems_tier, annotation_app_uw_systems_system, annotation_app_uw_systems_owner) kube_deployment_annotations{}
keep_firing_for: 5m
This is great, thanks Anna. Should the value be the same for `for` and `keep_firing_for`?
They are doing the same thing, right? One applies before the alert triggers and the other once it's triggered, but they serve the same purpose?
I was considering aligning `for` and `keep_firing_for` because it seems less arbitrary, but I think that might defeat the purpose: we still want to alert quickly while avoiding alerts being re-opened frequently.
Example:
A deployment is missing replicas for 6 minutes. You want to receive an alert within 5 minutes (`for: 5m`) and prevent the alert from reopening for the next 10 minutes (`keep_firing_for: 10m`).
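A minimal sketch of how the two settings from the example above could sit together in one rule. The group name is hypothetical, and the expression is a simplified form of the one in this PR's diff (the annotation join is omitted for brevity); the durations are the ones from the example, not necessarily the values merged here.

```yaml
groups:
  - name: deployment-alerts  # hypothetical group name, not taken from this PR
    rules:
      - alert: DeploymentMissingReplicas
        # Simplified from the rule in the diff: the join on kube_deployment_annotations is left out.
        expr: kube_deployment_spec_replicas != kube_deployment_status_replicas_available
        # The condition must hold continuously for 5m before the alert fires.
        for: 5m
        # Once firing, the alert stays firing for 10m after the condition clears.
        keep_firing_for: 10m
```

With these values, if replicas recover briefly and go missing again within 10 minutes, the alert should simply stay firing instead of resolving and re-opening.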
I forgot about this, thanks to HH for the reminder: thanos-io/thanos#6165
Not included in the latest release 😞
👀 It's coming @george-angel https://github.com/thanos-io/thanos/releases/tag/v0.34.0-rc.0
Proposal to use `keep_firing_for` in missing-replicas alerts to reduce alert flapping. This addresses situations where replica sets become available only for short periods of time.
Default time: 10m