Commit 217882e

Merge pull request #218 from sysdiglabs/staging
Staging to Prod Y22W20
2 parents 3de3b53 + 361ecc6 commit 217882e

25 files changed, 2587 additions and 2 deletions

apps/fluentd.yaml

Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
---
apiVersion: v1
kind: App
name: "Fluentd"
keywords:
  - Observability
  - Logging
  - Available
availableVersions:
  - '1.12.4'
shortDescription: "Fluentd is an open source data collector for a unified logging layer."
description: |
  Fluentd is an open source data collector that lets you unify data collection and consumption for better use and understanding of data.
icon: https://raw.githubusercontent.com/sysdiglabs/promcat-resources/master/apps/images/fluentd.png
website: https://www.fluentd.org/
available: true

apps/images/fluentd.png

11 KB

apps/images/ntp.png

29.1 KB

apps/ntp.yaml

Lines changed: 15 additions & 0 deletions
@@ -0,0 +1,15 @@
---
apiVersion: v1
kind: App
name: "NTP"
keywords:
  - Network
  - Available
availableVersions:
  - '4'
shortDescription: "The Network Time Protocol (NTP) is a networking protocol for clock synchronization between computer systems."
description: |
  The Network Time Protocol (NTP) is a networking protocol for clock synchronization between computer systems over packet-switched, variable-latency data networks. In operation since before 1985, NTP is one of the oldest Internet protocols in current use. NTP was designed by David L. Mills of the University of Delaware.
icon: https://raw.githubusercontent.com/sysdiglabs/promcat-resources/master/apps/images/ntp.png
website: http://www.ntp.org/
available: true

resources/fluentd/ALERTS.md

Lines changed: 28 additions & 0 deletions
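@@ -0,0 +1,28 @@
# Alerts
## No Input From Container
No input records have arrived from a container (grouped by `input_namespace` and `input_container`) for 5 minutes.

## High Error Ratio
The ratio of errors to emits for an output plugin has stayed above 5% for 5 minutes.

## High Retry Ratio
The ratio of retries to emits for an output plugin has stayed above 5% for 5 minutes.

## High Retry Wait
The retry wait of an output plugin, taken at its 5-minute maximum, has stayed above 60 for 5 minutes.

## Low Buffer Available Space
The available buffer space ratio of an output plugin has stayed below 10 for 5 minutes.

## Buffer Queue Length Increasing
The 5-minute average buffer queue length has been higher than the average of the preceding 5-minute window for 5 minutes.

## Buffer Total Bytes Increasing
The 5-minute average of buffered bytes has been higher than the average of the preceding 5-minute window for 15 minutes.

## High Slow Flush Ratio
The ratio of slow flushes to emits for an output plugin has stayed above 5% for 5 minutes.

## No Output Records From Plugin
An output plugin has emitted no records for 5 minutes.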
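The two "Increasing" alerts rely on a window-over-window comparison, as written in `resources/fluentd/alerts.yaml` (`avg_over_time(...[5m]) - avg_over_time(...[5m] offset 5m) > 0`). The following is a minimal Python sketch of that arithmetic, not the Prometheus implementation. It assumes equally spaced gauge samples; the function names and the sample values are hypothetical.

```python
from statistics import mean


def window_delta(samples, window):
    """Average of the latest `window` samples minus the average of the
    `window` samples before them (equally spaced samples assumed)."""
    if window <= 0 or len(samples) < 2 * window:
        raise ValueError("need a positive window and two full windows of samples")
    current = samples[-window:]
    previous = samples[-2 * window:-window]
    return mean(current) - mean(previous)


def is_increasing(samples, window):
    """Mirror the `> 0` comparison used by the alert expression."""
    return window_delta(samples, window) > 0


# Hypothetical buffer queue lengths, one sample per minute (oldest first).
queue_lengths = [2, 2, 3, 2, 3, 4, 5, 5, 6, 7]
print(is_increasing(queue_lengths, window=5))
```

The `for:` clause of the rule is not modeled here: Prometheus additionally requires the condition to hold continuously for that duration before the alert fires.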

resources/fluentd/INSTALL.md

Lines changed: 30 additions & 0 deletions
@@ -0,0 +1,30 @@
# Prerequisites
Fluentd is instrumented with Prometheus metrics, and its pods are annotated with Prometheus annotations.

For Fluentd to expose Prometheus metrics, the following plugins need to be enabled:
- 'prometheus' input plugin
- 'prometheus_monitor' input plugin
- 'prometheus_output_monitor' input plugin

As described in the official plugin documentation (https://github.com/fluent/fluent-plugin-prometheus/blob/master/README.md), they can be enabled with the following configuration:
```
<source>
  @type prometheus
  @id in_prometheus
  bind "0.0.0.0"
  port 24231
  metrics_path "/metrics"
</source>

<source>
  @type prometheus_monitor
  @id in_prometheus_monitor
</source>

<source>
  @type prometheus_output_monitor
  @id in_prometheus_output_monitor
</source>
```

If you deploy Fluentd with the official Helm chart (https://github.com/fluent/helm-charts/tree/main/charts/fluentd), these plugins are already enabled in its default configuration, so no additional action is needed.
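One way to confirm that the plugins are active is to read the endpoint configured above (port 24231, path `/metrics`) and check that the metric names used by the alerts in this commit are present. The sketch below is an illustration only: the host `localhost`, the helper names, and the sample exposition text are assumptions, and the parser handles only the basic `name{labels} value` line shape.

```python
import re
import urllib.request

# Port and path come from the <source> block above; the host is an assumption.
DEFAULT_URL = "http://localhost:24231/metrics"


def fetch_metrics(url=DEFAULT_URL, timeout=5):
    """Download the Prometheus text exposition from the Fluentd endpoint."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return response.read().decode("utf-8")


def metric_names(exposition_text):
    """Collect metric names from exposition text, skipping comments and blanks."""
    names = set()
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        match = re.match(r"[a-zA-Z_:][a-zA-Z0-9_:]*", line)
        if match:
            names.add(match.group(0))
    return names


def missing_metrics(exposition_text, expected):
    """Return the expected metric names that do not appear in the text."""
    present = metric_names(exposition_text)
    return [name for name in expected if name not in present]


# Hypothetical sample, not captured from a real Fluentd instance.
SAMPLE = """\
# hypothetical sample output
fluentd_output_status_emit_count{plugin_id="out_example",type="stdout"} 120
fluentd_output_status_num_errors{plugin_id="out_example",type="stdout"} 0
"""

print(missing_metrics(SAMPLE, ["fluentd_output_status_emit_count",
                               "fluentd_output_status_retry_count"]))
```

Against a live instance, `missing_metrics(fetch_metrics(), [...])` would run the same check on real output.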

resources/fluentd/README.md

Lines changed: 12 additions & 0 deletions
@@ -0,0 +1,12 @@
# Fluentd
Fluentd is an open source data collector that lets you unify data collection and consumption for better use and understanding of data.


# Prometheus and exporters
Fluentd exposes all its metrics through a Prometheus endpoint on port 24231. In Kubernetes, the pod is already annotated, so the Sysdig agent can scrape the endpoint right away.

# Metrics
- Fluentd internal statistics

# Attributions
Configuration files, dashboards and alerts are maintained by the [Sysdig team](https://sysdig.com/).
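To illustrate how pod annotations can lead a scraper to the endpoint on port 24231, here is a minimal Python sketch. It assumes the commonly used `prometheus.io/scrape`, `prometheus.io/port` and `prometheus.io/path` annotation keys; the keys actually set on the Fluentd pod are not shown in this commit, and this is not the Sysdig agent's implementation. The function name and the pod IP are hypothetical.

```python
def scrape_target(pod_ip, annotations):
    """Build a scrape URL from pod annotations, or return None if the pod
    does not opt in. Annotation keys are an assumed convention."""
    if annotations.get("prometheus.io/scrape", "").lower() != "true":
        return None
    # Defaults follow the port and path used elsewhere in this commit.
    port = annotations.get("prometheus.io/port", "24231")
    path = annotations.get("prometheus.io/path", "/metrics")
    return f"http://{pod_ip}:{port}{path}"


# Hypothetical pod IP and annotations.
annotations = {"prometheus.io/scrape": "true", "prometheus.io/port": "24231"}
print(scrape_target("10.0.0.5", annotations))
```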

resources/fluentd/alerts.yaml

Lines changed: 85 additions & 0 deletions
@@ -0,0 +1,85 @@
apiVersion: v1
kind: Alert
app: Fluentd
version: 1.0.0
appVersion:
  - '1.12.4'
descriptionFile: ALERTS.md
configurations:
  - kind: Prometheus
    data: |-
      groups:
      - name: Fluentd
        rules:
        - alert: '[Fluentd] No Input From Container'
          expr: |
            sum by (input_namespace, input_container)(rate(fluentd_input_status_num_records_total[5m])) == 0
          for: 5m
          labels:
            severity: warning
          annotations:
            description: No Input From Container.
        - alert: '[Fluentd] High Error Ratio'
          expr: |
            sum by (type, plugin_id)(rate(fluentd_output_status_num_errors[5m])) / sum by (type, plugin_id)(rate(fluentd_output_status_emit_count[5m])) > 0.05
          for: 5m
          labels:
            severity: critical
          annotations:
            description: High Error Ratio.
        - alert: '[Fluentd] High Retry Ratio'
          expr: |
            sum by (type, plugin_id)(rate(fluentd_output_status_retry_count[5m])) / sum by (type, plugin_id)(rate(fluentd_output_status_emit_count[5m])) > 0.05
          for: 5m
          labels:
            severity: critical
          annotations:
            description: High Retry Ratio.
        - alert: '[Fluentd] High Retry Wait'
          expr: |
            sum by (type, plugin_id)(max_over_time(fluentd_output_status_retry_wait[5m])) > 60
          for: 5m
          labels:
            severity: critical
          annotations:
            description: High Retry Wait.
        - alert: '[Fluentd] Low Buffer Available Space'
          expr: |
            fluentd_output_status_buffer_available_space_ratio < 10
          for: 5m
          labels:
            severity: warning
          annotations:
            description: Low Buffer Available Space.
        - alert: '[Fluentd] Buffer Queue Length Increasing'
          expr: |
            avg_over_time(fluentd_output_status_buffer_queue_length[5m]) - avg_over_time(fluentd_output_status_buffer_queue_length[5m] offset 5m) > 0
          for: 5m
          labels:
            severity: warning
          annotations:
            description: Buffer Queue Length Increasing.
        - alert: '[Fluentd] Buffer Total Bytes Increasing'
          expr: |
            avg_over_time(fluentd_output_status_buffer_total_bytes[5m]) - avg_over_time(fluentd_output_status_buffer_total_bytes[5m] offset 5m) > 0
          for: 15m
          labels:
            severity: warning
          annotations:
            description: Buffer Total Bytes Increasing.
        - alert: '[Fluentd] High Slow Flush Ratio'
          expr: |
            sum by (type, plugin_id)(rate(fluentd_output_status_slow_flush_count[5m])) / sum by (type, plugin_id)(rate(fluentd_output_status_emit_count[5m])) > 0.05
          for: 5m
          labels:
            severity: warning
          annotations:
            description: High Slow Flush Ratio.
        - alert: '[Fluentd] No Output Records From Plugin'
          expr: |
            rate(fluentd_output_status_emit_records[5m]) == 0
          for: 5m
          labels:
            severity: warning
          annotations:
            description: No Output Records From Plugin.
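The three ratio alerts (error, retry and slow flush) share one shape: the rate of a problem counter divided by the rate of `fluentd_output_status_emit_count`, compared against 0.05. The Python sketch below works through that arithmetic with hypothetical counter samples. It is a simplification under stated assumptions: the rate is a plain difference quotient (Prometheus' `rate()` also handles counter resets and extrapolation), and the `for:` duration is not modeled.

```python
def simple_rate(first, last, seconds):
    """Per-second increase between two counter samples.
    Simplified: no counter-reset handling and no extrapolation."""
    return (last - first) / seconds


def ratio_exceeds(numerator_rate, denominator_rate, threshold=0.05):
    """Mirror `numerator / denominator > threshold` as evaluated in PromQL."""
    if denominator_rate == 0:
        # In PromQL, 0/0 is NaN (comparison is false) and x/0 with x > 0
        # is +Inf (comparison is true).
        return numerator_rate > 0
    return numerator_rate / denominator_rate > threshold


# Hypothetical samples taken 300 seconds (5 minutes) apart.
error_rate = simple_rate(first=10, last=40, seconds=300)      # 0.1 errors/s
emit_rate = simple_rate(first=1000, last=1300, seconds=300)   # 1.0 emits/s
print(ratio_exceeds(error_rate, emit_rate))
```

With these samples the ratio is 0.1, which is above the 0.05 threshold, so the condition holds for this evaluation.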

resources/fluentd/dashboards.yaml

Lines changed: 16 additions & 0 deletions
@@ -0,0 +1,16 @@
apiVersion: v1
kind: Dashboard
app: Fluentd
version: 1.0.0
appVersion:
  - '1.12.4'
configurations:
  - name: Fluentd
    kind: Sysdig
    image: fluentd/images/fluentd.png
    description: |
      This dashboard offers information on:
      * Input/Output
      * Buffer
      * Flush
    file: include/Fluentd.json

resources/fluentd/description.yaml

Lines changed: 7 additions & 0 deletions
@@ -0,0 +1,7 @@
apiVersion: v1
kind: Description
app: Fluentd
version: 1.0.0
appVersion:
  - '1.12.4'
descriptionFile: README.md
