56 changes: 56 additions & 0 deletions .github/workflows/alerts-test.yml
@@ -0,0 +1,56 @@
name: Alerts-test-qubership-monitoring-operator
on:
  workflow_run:
    workflows: ["Build Artifacts"]
    types:
      - completed
  pull_request:
    branches:
      - all

env:
  kind_name: kind-cluster
  kind_version: v0.27.0
  vm_namespace: vm
  max_attempts: 30
  delay: 10

permissions:
  contents: read

jobs:
  Run-Alerts-Test:
    runs-on: ubuntu-latest
    timeout-minutes: 30
    steps:
      - name: Check out repository code
        uses: actions/checkout@v4

      - name: Check yq version
        run: yq --version

      - name: Install Helm
        run: |
          curl https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash

      - name: Render rules file from helm chart
        run: |
          helm template alertrules ./charts/qubership-monitoring-operator/charts/prometheus-rules -f ./test/alerts-tests/rendervalues.yaml > ./test/alerts-tests/rules.yaml
          sed '1,7d' -i ./test/alerts-tests/rules.yaml

      - name: Check that all necessary tests exists
        run: |
          chmod +x ./test/alerts-tests/tests-checker.sh
          cd ./test/alerts-tests/
          ./tests-checker.sh
        continue-on-error: true

      - name: Install vmalert-tool
        run: |
          wget https://github.com/VictoriaMetrics/VictoriaMetrics/releases/download/v1.122.4/vmutils-linux-amd64-v1.122.4-enterprise.tar.gz
          tar -xvf vmutils-linux-amd64-v1.122.4-enterprise.tar.gz
          chmod +x vmalert-tool-prod

      - name: Run test
        run: |
          ./vmalert-tool-prod unittest --files ./test/alerts-tests/test.yaml
6 changes: 6 additions & 0 deletions charts/qubership-monitoring-operator/Chart.yaml
@@ -103,3 +103,9 @@ dependencies:
    condition: stackdriverExporter.install
    version: ~0
    repository: "file://charts/stackdriver-exporter"

  # Qubership monitoring configuration
  - name: prometheusrules
    condition: prometheusRules.install
    version: ~0
    repository: "file://charts/prometheus-rules"
Original file line number Diff line number Diff line change
@@ -0,0 +1,24 @@
apiVersion: v2
name: prometheusrules
description: A Helm chart for Kubernetes

# A chart can be either an 'application' or a 'library' chart.
#
# Application charts are a collection of templates that can be packaged into versioned archives
# to be deployed.
#
# Library charts provide useful utilities or functions for the chart developer. They're included as
# a dependency of application charts to inject those utilities and functions into the rendering
# pipeline. Library charts do not define any templates and therefore cannot be deployed.
type: application

# This is the chart version. This version number should be incremented each time you make changes
# to the chart and its templates, including the app version.
# Versions are expected to follow Semantic Versioning (https://semver.org/)
version: 0.1.0

# This is the version number of the application being deployed. This version number should be
# incremented each time you make changes to the application. Versions are not expected to
# follow Semantic Versioning. They should reflect the version the application is using.
# It is recommended to use it with quotes.
appVersion: "1.16.0"

Large diffs are not rendered by default.

Original file line number Diff line number Diff line change
@@ -0,0 +1,52 @@
{{- if and (eq .Values.alertsPackVersion "v2") (.Values.install) }}
apiVersion: operator.victoriametrics.com/v1beta1
kind: VMRule
metadata:
  name: prometheusrules
spec:
  groups:

  {{- $defaultConfig := fromYaml (include "defaultAlerts" . ) -}}
  {{- $overrideConfig := .Values.alerts -}}
  {{- $finalConfig := merge $overrideConfig $defaultConfig -}}
  {{- $alertGroups := .Values.ruleGroups -}}


  {{- range $defaultGroupName, $defaultGroup := $finalConfig }}
    {{- $found := false }}
    {{- range $alertGroups }}
      {{- if eq $defaultGroupName . }}
        {{- $found = true }}
      {{- end }}
    {{- end }}
    {{- if $found }}
    - name: {{ $defaultGroupName }}
      labels:
      {{- range $defaultLabelName, $defaultLabelValue := $defaultGroup.labels }}
        {{ $defaultLabelName }}: {{ $defaultLabelValue }}
      {{- end }}
      {{- if $defaultGroup.interval }}
      interval: {{ $defaultGroup.interval }}
      {{- end }}
      {{- if $defaultGroup.concurrency }}
      concurrency: {{ $defaultGroup.concurrency }}
      {{- end }}
      rules:
      {{- range $defaultRuleName, $defaultRule := $defaultGroup.rules }}
        - alert: {{ $defaultRuleName }}
          expr: {{ $defaultRule.expr }}
          {{- if $defaultRule.for }}
          for: {{ $defaultRule.for }}
          {{- end }}
          labels:
          {{- range $defaultLabelName, $defaultLabelValue := $defaultRule.labels }}
            {{ $defaultLabelName }}: {{ $defaultLabelValue }}
          {{- end }}
          annotations:
          {{- range $defaultAnnotationName, $defaultAnnotationValue := $defaultRule.annotations }}
            {{ $defaultAnnotationName }}: {{ printf $defaultAnnotationValue | trimAll "\n" | toJson | replace "\\u0026" "&" | replace "\\u003e" ">" | nindent 14 }}
          {{- end }}
      {{- end }}
    {{- end }}
  {{- end }}
{{- end }}
Original file line number Diff line number Diff line change
@@ -1041,6 +1041,8 @@ spec:
{{- end }}
{{- end }}
{{- if and .Values.prometheusRules .Values.prometheusRules.install }}

{{- if ne .Values.prometheusRules.alertsPackVersion "v2" }}
prometheusRules:
install: {{ .Values.prometheusRules.install }}
ruleGroups:
@@ -1060,6 +1062,8 @@ spec:
{{- toYaml .Values.prometheusRules.override | nindent 6 }}
{{- end }}
{{- end }}
{{- end }}

{{- if .Values.alertManager.install }}
alertManager:
install: {{ .Values.alertManager.install }}
9 changes: 7 additions & 2 deletions charts/qubership-monitoring-operator/values.yaml
@@ -35,6 +35,7 @@ global:
# Type: object
# Mandatory: no
#

role:
# Allow to disable create Role and ClusterRole for monitoring-operator during deploy.
# If global.privilegedRights parameter is set to false, ClusterRole will not be installed in any case.
@@ -991,7 +992,7 @@ kubernetesMonitors:
metricRelabelings: []
relabelings: []
apiserverServiceMonitor:
install: true
install: false
interval: 30s
scrapeTimeout: 10s
metricRelabelings:
@@ -1280,18 +1281,22 @@ grafanaDashboards:
#
prometheusRules:
install: true
alertsPackVersion: v1
ruleGroups:
- SelfMonitoring
- AlertManager
- KubebernetesAlerts
- KubernetesAlerts
- NodeProcesses
- NodeExporters
- DockerContainers
- HAmode
- HAproxy
- Etcd
- NginxIngressAlerts
- CoreDnsAlerts
- DRAlerts
- BackupAlerts

# override:
# - group: SelfMonitoring
# alert: PrometheusNotificationsBacklog
1 change: 0 additions & 1 deletion docs/integration/google-cloud.md
@@ -368,7 +368,6 @@ To monitor external VM-s, use a Google Cloud monitoring agent:

* Cloud Monitoring agent overview - [https://cloud.google.com/monitoring/agent](https://cloud.google.com/monitoring/agent)
* Installing the Cloud Monitoring agent on a single VM - [https://cloud.google.com/monitoring/agent/installation](https://cloud.google.com/monitoring/agent/installation)
* Virtual Machine monitoring quick start - [https://cloud.google.com/monitoring/quickstart-lamp](https://cloud.google.com/monitoring/quickstart-lamp)

# Links

11 changes: 11 additions & 0 deletions docs/monitoring-configuration/alerts.md
@@ -46,6 +46,17 @@ parameter.

You can find examples of configuration [in the appropriate section](#examples).

### Deep alerts tuning using subchart

For deep customization of alerts (adding new alerts, overriding any alert field, disabling alerts, etc.), use the v2 alerts functionality.
To enable it:

1) Set `alertsPackVersion: v2` in the `prometheusRules` section of the values YAML.
2) Use the subchart's values YAML (`/charts/prometheus-rules`) to set overrides for alerts. Overrides are merged with the default alerts defined in the subchart's helpers.tpl and take priority over them.

If `alertsPackVersion` is set to any value other than `v2`, or is not set at all, the installation uses the old flavour.
Alert groups in the subchart are supported in the same manner as described above.
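
As a rough illustration of the two steps for enabling v2 alerts, the fragments below sketch what the values might look like. The layout under `alerts` (group name, then `rules`, then alert name with `expr`, `for`, `labels`, and `annotations`) is inferred from how the subchart template iterates over the merged configuration, so treat it as an assumption rather than a reference. The alert name `ExampleTargetDown`, its expression, and its label and annotation values are hypothetical and are not part of the default alerts.

```yaml
# Step 1: values YAML of the monitoring-operator chart.
prometheusRules:
  install: true
  alertsPackVersion: v2
---
# Step 2: values YAML of the subchart (/charts/prometheus-rules).
ruleGroups:
  - SelfMonitoring
alerts:
  SelfMonitoring:              # existing group; entries are merged over the defaults
    rules:
      ExampleTargetDown:       # hypothetical alert name, not a default alert
        expr: up == 0          # hypothetical expression
        for: 5m
        labels:
          severity: warning    # hypothetical label value
        annotations:
          summary: "A scrape target is down"
```

A group appears in the rendered rules only if its name is also listed in `ruleGroups`, so an override for a group that is missing from that list has no visible effect.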

### Dead Man's Switch alert

[Dead Man's Switch](https://en.wikipedia.org/wiki/Dead_man%27s_switch) alert is a special always-firing alert that is meant
16 changes: 16 additions & 0 deletions test/alerts-tests/rendervalues.yaml
@@ -0,0 +1,16 @@
alertsPackVersion: v2
install: true
ruleGroups:
- SelfMonitoring
- AlertManager
- KubernetesAlerts
- NodeProcesses
- NodeExporters
- DockerContainers
- HAmode
- HAproxy
- Etcd
- NginxIngressAlerts
- CoreDnsAlerts
- DRAlerts
- BackupAlerts