Commit 32d80d2

Add non-cumulative histogram (influxdata#7071)
1 parent a6dc099 commit 32d80d2

3 files changed, +223 -110 lines changed

plugins/aggregators/histogram/README.md

+47 -24
@@ -3,8 +3,9 @@
 The histogram aggregator plugin creates histograms containing the counts of
 field values within a range.
 
-Values added to a bucket are also added to the larger buckets in the
-distribution. This creates a [cumulative histogram](https://en.wikipedia.org/wiki/Histogram#/media/File:Cumulative_vs_normal_histogram.svg).
+If `cumulative` is set to true, values added to a bucket are also added to the
+larger buckets in the distribution. This creates a [cumulative histogram](https://en.wikipedia.org/wiki/Histogram#/media/File:Cumulative_vs_normal_histogram.svg).
+Otherwise, values are added to only one bucket, which creates an [ordinary histogram](https://en.wikipedia.org/wiki/Histogram#/media/File:Cumulative_vs_normal_histogram.svg).
 
 Like other Telegraf aggregators, the metric is emitted every `period` seconds.
 By default bucket counts are not reset between periods and will be non-strictly
@@ -16,7 +17,7 @@ increasing while Telegraf is running. This behavior can be changed by setting th
 Each metric is passed to the aggregator and this aggregator searches
 histogram buckets for those fields, which have been specified in the
 config. If buckets are found, the aggregator will increment +1 to the appropriate
-bucket otherwise it will be added to the `+Inf` bucket. Every `period`
+bucket. Otherwise, it will be added to the `+Inf` bucket. Every `period`
 seconds this data will be forwarded to the outputs.
 
 The algorithm of hit counting to buckets was implemented on the base
@@ -39,16 +40,20 @@ of the algorithm which is implemented in the Prometheus
 ## of accumulating the results.
 reset = false
 
+## Whether bucket values should be accumulated. If set to false, "gt" tag will be added.
+## Defaults to true.
+cumulative = true
+
 ## Example config that aggregates all fields of the metric.
 # [[aggregators.histogram.config]]
-# ## The set of buckets.
+# ## Right borders of buckets (with +Inf implicitly added).
 # buckets = [0.0, 15.6, 34.5, 49.1, 71.5, 80.5, 94.5, 100.0]
 # ## The name of metric.
 # measurement_name = "cpu"
 
 ## Example config that aggregates only specific fields of the metric.
 # [[aggregators.histogram.config]]
-# ## The set of buckets.
+# ## Right borders of buckets (with +Inf implicitly added).
 # buckets = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0, 100.0]
 # ## The name of metric.
 # measurement_name = "diskio"
@@ -64,8 +69,9 @@ option. Optionally, if `fields` is set only the fields listed will be
 aggregated. If `fields` is not set all fields are aggregated.
 
 The `buckets` option contains a list of floats which specify the bucket
-boundaries. Each float value defines the inclusive upper bound of the bucket.
+boundaries. Each float value defines the inclusive upper (right) bound of the bucket.
 The `+Inf` bucket is added automatically and does not need to be defined.
+(For left boundaries, `-Inf` and these same bucket borders are used.)
 
 ### Measurements & Fields:
 
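To illustrate the inclusive upper bound described in the hunk above, here is a minimal Go sketch of how a value could be mapped to a bucket index. It assumes the lookup is a binary search over the sorted right borders using `sort.SearchFloat64s`; the helper name `bucketIndex` is hypothetical and is not part of this commit.

```go
package main

import (
	"fmt"
	"sort"
)

// bucketIndex returns the index of the bucket that value falls into.
// borders must be sorted in ascending order. The index len(borders)
// stands for the implicit +Inf bucket.
// Hypothetical helper for illustration, not the plugin's API.
func bucketIndex(borders []float64, value float64) int {
	// SearchFloat64s returns the smallest i with borders[i] >= value,
	// so a value equal to a border lands in that border's bucket.
	return sort.SearchFloat64s(borders, value)
}

func main() {
	borders := []float64{0, 10, 50, 100}
	for _, v := range []float64{-3, 10, 10.5, 100, 250} {
		fmt.Printf("value %v -> bucket index %d\n", v, bucketIndex(borders, v))
	}
}
```

Index `len(borders)` denotes the `+Inf` bucket, which matches the note in the Go change that `len(buckets) + 1 == len(counts)`.
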
@@ -77,26 +83,43 @@ The postfix `bucket` will be added to each field key.
 
 ### Tags:
 
-All measurements are given the tag `le`. This tag has the border value of
-bucket. It means that the metric value is less than or equal to the value of
-this tag. For example, let assume that we have the metric value 10 and the
-following buckets: [5, 10, 30, 70, 100]. Then the tag `le` will have the value
-10, because the metrics value is passed into bucket with right border value
-`10`.
+* `cumulative = true` (default):
+  * `le`: Right bucket border. It means that the metric value is less than or
+    equal to the value of this tag. If a metric value is sorted into a bucket,
+    it is also sorted into all larger buckets. As a result, the value of
+    `<field>_bucket` rises as the `le` value rises. When `le` is `+Inf`,
+    the bucket value is the count of all metrics, because all metric values are
+    less than or equal to positive infinity.
+* `cumulative = false`:
+  * `gt`: Left bucket border. It means that the metric value is greater than
+    (and not equal to) the value of this tag.
+  * `le`: Right bucket border. It means that the metric value is less than or
+    equal to the value of this tag.
+  * As both `gt` and `le` are present, each metric is sorted into exactly
+    one bucket.
+
 
 ### Example Output:
 
+Let us assume we have the buckets [0, 10, 50, 100] and the following field values
+for `usage_idle`: [50, 7, 99, 12]
+
+With `cumulative = true`:
+
+```
+cpu,cpu=cpu1,host=localhost,le=0.0 usage_idle_bucket=0i 1486998330000000000 # none
+cpu,cpu=cpu1,host=localhost,le=10.0 usage_idle_bucket=1i 1486998330000000000 # 7
+cpu,cpu=cpu1,host=localhost,le=50.0 usage_idle_bucket=3i 1486998330000000000 # 7, 12, 50
+cpu,cpu=cpu1,host=localhost,le=100.0 usage_idle_bucket=4i 1486998330000000000 # 7, 12, 50, 99
+cpu,cpu=cpu1,host=localhost,le=+Inf usage_idle_bucket=4i 1486998330000000000 # 7, 12, 50, 99
+```
+
+With `cumulative = false`:
+
 ```
-cpu,cpu=cpu1,host=localhost,le=0.0 usage_idle_bucket=0i 1486998330000000000
-cpu,cpu=cpu1,host=localhost,le=10.0 usage_idle_bucket=0i 1486998330000000000
-cpu,cpu=cpu1,host=localhost,le=20.0 usage_idle_bucket=1i 1486998330000000000
-cpu,cpu=cpu1,host=localhost,le=30.0 usage_idle_bucket=2i 1486998330000000000
-cpu,cpu=cpu1,host=localhost,le=40.0 usage_idle_bucket=2i 1486998330000000000
-cpu,cpu=cpu1,host=localhost,le=50.0 usage_idle_bucket=2i 1486998330000000000
-cpu,cpu=cpu1,host=localhost,le=60.0 usage_idle_bucket=2i 1486998330000000000
-cpu,cpu=cpu1,host=localhost,le=70.0 usage_idle_bucket=2i 1486998330000000000
-cpu,cpu=cpu1,host=localhost,le=80.0 usage_idle_bucket=2i 1486998330000000000
-cpu,cpu=cpu1,host=localhost,le=90.0 usage_idle_bucket=2i 1486998330000000000
-cpu,cpu=cpu1,host=localhost,le=100.0 usage_idle_bucket=2i 1486998330000000000
-cpu,cpu=cpu1,host=localhost,le=+Inf usage_idle_bucket=2i 1486998330000000000
+cpu,cpu=cpu1,host=localhost,gt=-Inf,le=0.0 usage_idle_bucket=0i 1486998330000000000 # none
+cpu,cpu=cpu1,host=localhost,gt=0.0,le=10.0 usage_idle_bucket=1i 1486998330000000000 # 7
+cpu,cpu=cpu1,host=localhost,gt=10.0,le=50.0 usage_idle_bucket=2i 1486998330000000000 # 12, 50
+cpu,cpu=cpu1,host=localhost,gt=50.0,le=100.0 usage_idle_bucket=1i 1486998330000000000 # 99
+cpu,cpu=cpu1,host=localhost,gt=100.0,le=+Inf usage_idle_bucket=0i 1486998330000000000 # none
 ```
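
The relationship between the two modes can be checked with a small Go sketch. It assumes values are assigned to buckets by the inclusive upper bound rule stated in the README, and it reuses the example buckets and `usage_idle` values. `countPerBucket` and `accumulate` are hypothetical names, not functions from the plugin.

```go
package main

import (
	"fmt"
	"sort"
)

// countPerBucket returns one count per bucket, including the implicit
// +Inf bucket, so the result has len(borders)+1 entries.
// Hypothetical helper; borders must be sorted in ascending order.
func countPerBucket(borders, values []float64) []int64 {
	counts := make([]int64, len(borders)+1)
	for _, v := range values {
		// Smallest index whose border is >= v (inclusive upper bound).
		counts[sort.SearchFloat64s(borders, v)]++
	}
	return counts
}

// accumulate turns per-bucket counts into running totals, which is
// what a cumulative histogram reports for each "le" border.
func accumulate(counts []int64) []int64 {
	out := make([]int64, len(counts))
	sum := int64(0)
	for i, c := range counts {
		sum += c
		out[i] = sum
	}
	return out
}

func main() {
	borders := []float64{0, 10, 50, 100}
	values := []float64{50, 7, 99, 12}

	plain := countPerBucket(borders, values)
	fmt.Println(plain)             // [0 1 2 1 0]
	fmt.Println(accumulate(plain)) // [0 1 3 4 4]
}
```

Because 50 sits exactly on a border, the inclusive rule counts it in the bucket with `gt=10` and `le=50`. The cumulative series is the running sum of the per-bucket counts, so its last entry always equals the total number of values.
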

plugins/aggregators/histogram/histogram.go

+39 -17
@@ -8,16 +8,23 @@ import (
 	"github.com/influxdata/telegraf/plugins/aggregators"
 )
 
-// bucketTag is the tag, which contains right bucket border
-const bucketTag = "le"
+// bucketRightTag is the tag, which contains right bucket border
+const bucketRightTag = "le"
 
-// bucketInf is the right bucket border for infinite values
-const bucketInf = "+Inf"
+// bucketPosInf is the right bucket border for infinite values
+const bucketPosInf = "+Inf"
+
+// bucketLeftTag is the tag, which contains left bucket border (exclusive)
+const bucketLeftTag = "gt"
+
+// bucketNegInf is the left bucket border for infinite values
+const bucketNegInf = "-Inf"
 
 // HistogramAggregator is aggregator with histogram configs and particular histograms for defined metrics
 type HistogramAggregator struct {
 	Configs      []config `toml:"config"`
 	ResetBuckets bool     `toml:"reset"`
+	Cumulative   bool     `toml:"cumulative"`
 
 	buckets bucketsByMetrics
 	cache   map[uint64]metricHistogramCollection
@@ -57,8 +64,10 @@ type groupedByCountFields struct {
 }
 
 // NewHistogramAggregator creates new histogram aggregator
-func NewHistogramAggregator() telegraf.Aggregator {
-	h := &HistogramAggregator{}
+func NewHistogramAggregator() *HistogramAggregator {
+	h := &HistogramAggregator{
+		Cumulative: true,
+	}
 	h.buckets = make(bucketsByMetrics)
 	h.resetCache()
 
@@ -77,16 +86,20 @@ var sampleConfig = `
 ## of accumulating the results.
 reset = false
 
+## Whether bucket values should be accumulated. If set to false, "gt" tag will be added.
+## Defaults to true.
+cumulative = true
+
 ## Example config that aggregates all fields of the metric.
 # [[aggregators.histogram.config]]
-# ## The set of buckets.
+# ## Right borders of buckets (with +Inf implicitly added).
 # buckets = [0.0, 15.6, 34.5, 49.1, 71.5, 80.5, 94.5, 100.0]
 # ## The name of metric.
 # measurement_name = "cpu"
 
 ## Example config that aggregates only specific fields of the metric.
 # [[aggregators.histogram.config]]
-# ## The set of buckets.
+# ## Right borders of buckets (with +Inf implicitly added).
 # buckets = [0.0, 10.0, 20.0, 30.0, 40.0, 50.0, 60.0, 70.0, 80.0, 90.0, 100.0]
 # ## The name of metric.
 # measurement_name = "diskio"
@@ -167,18 +180,27 @@ func (h *HistogramAggregator) groupFieldsByBuckets(
 	tags map[string]string,
 	counts []int64,
 ) {
-	count := int64(0)
-	for index, bucket := range h.getBuckets(name, field) {
-		count += counts[index]
+	sum := int64(0)
+	buckets := h.getBuckets(name, field) // note that len(buckets) + 1 == len(counts)
 
-		tags[bucketTag] = strconv.FormatFloat(bucket, 'f', -1, 64)
-		h.groupField(metricsWithGroupedFields, name, field, count, copyTags(tags))
-	}
+	for index, count := range counts {
+		if !h.Cumulative {
+			sum = 0 // reset sum -> don't store cumulative counts
 
-	count += counts[len(counts)-1]
-	tags[bucketTag] = bucketInf
+			tags[bucketLeftTag] = bucketNegInf
+			if index > 0 {
+				tags[bucketLeftTag] = strconv.FormatFloat(buckets[index-1], 'f', -1, 64)
+			}
+		}
 
-	h.groupField(metricsWithGroupedFields, name, field, count, tags)
+		tags[bucketRightTag] = bucketPosInf
+		if index < len(buckets) {
+			tags[bucketRightTag] = strconv.FormatFloat(buckets[index], 'f', -1, 64)
+		}
+
+		sum += count
+		h.groupField(metricsWithGroupedFields, name, field, sum, copyTags(tags))
+	}
 }
 
 // groupField groups field by count value
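
The tag assignment in `groupFieldsByBuckets` can be read in isolation as the sketch below. It is a simplified stand-in rather than the plugin code: `bucketTags` is a hypothetical helper, and the tag names and infinity strings are copied from the constants in this diff.

```go
package main

import (
	"fmt"
	"strconv"
)

// bucketTags builds the border tags for the bucket at index i.
// borders holds the configured right borders; index len(borders)
// is the implicit +Inf bucket. Hypothetical helper for illustration.
func bucketTags(borders []float64, i int, cumulative bool) map[string]string {
	tags := map[string]string{}

	// The left border is only emitted for non-cumulative histograms.
	if !cumulative {
		tags["gt"] = "-Inf"
		if i > 0 {
			tags["gt"] = strconv.FormatFloat(borders[i-1], 'f', -1, 64)
		}
	}

	// The right border is always emitted; the last bucket is +Inf.
	tags["le"] = "+Inf"
	if i < len(borders) {
		tags["le"] = strconv.FormatFloat(borders[i], 'f', -1, 64)
	}
	return tags
}

func main() {
	borders := []float64{0, 10, 50, 100}
	for i := 0; i <= len(borders); i++ {
		fmt.Println(i, bucketTags(borders, i, false))
	}
}
```

Each bucket's left border is the previous bucket's right border, so in non-cumulative mode the `gt` and `le` tags together describe disjoint half-open intervals that cover the whole number line.
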
