Conversation

@w1ll-i-code (Contributor):

PR for #10576. For now, this will be hidden behind a build flag until the feature is sufficiently tested and ready for production use. For all the changes made to the original elasticsearchwriter, please consult the individual commits.

Copy the elasticsearchwriter to be adapted for the new datastreamwriter.
Set up the build system and hide the feature behind a build flag for now.
Restructure the data sent to elasticsearch to align with the Elastic
Common Schema specification and separate document indices by check to
reduce the number of distinct fields per index.
Handle the errors returned by elasticsearch gracefully to let the user
know what went wrong during execution. Do not discard data as soon as a
request to elasticsearch fails, but keep retrying until the data can be
sent, relying on the WorkQueue for back pressure. Improve the handling
of the flush timer by rescheduling it after each flush, making sure
that there are no needless flushes under heavy load.
Re-add support for tags, but make them conform to the ECS specification.
Also add support for labels, which correspond to the former tags in the
elasticsearchwriter.

ref: https://www.elastic.co/docs/reference/ecs/ecs-base
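
As a rough illustration (the concrete values are my assumption, not taken from the PR), an event carrying both tags and labels per the ECS base fields could look like this:

{
  "@timestamp": "2025-10-07T12:00:00.000Z",
  "tags": [ "production", "icinga2" ],
  "labels": {
    "tenant": "customer-a",
    "team": "monitoring"
  }
}

Per ECS, tags is a plain list of keywords, while labels is a flat object with string values only (no nested objects).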
Allow the user to filter which data should be sent to elasticsearch.
This allows the user to use multiple datastreams to send the data to
different namespaces, e.g. for multi-tenancy or different retention policies.
Also accept an ApiKey for authentication with elasticsearch, in addition
to username + password and certificates.
Icinga2 should manage its own index template if possible. This allows us
to ship potential additions to the document later without requiring
further user input. However, the user has the option to disable the
feature, should they want to manage the template manually. For user
customization, we ship the `icinga2@custom` component template, so that
users can change the behaviour of the template without having to edit
the managed one, making updates easier (see the sketch after this list of commits).
Add a template config with all possible config options for the user, as
well as a short description of what each parameter does and how it can be
used. This allows a user to quickly configure the writer without having
to look up lots of documentation online.
Update the documentation to give a clearer overview of the new
elasticsearchdatastreamwriter feature and add the object and its fields
to the syntax highlighting of nano and vim.
Drop messages in the data buffer if the connection to elasticsearch
fails. This guarantees that the icinga2 process can still shut down
without stalling, even with a misconfigured writer or if elasticsearch
is down or not reachable.
Allow the 'datastream_namespace' variable to contain macros and expand
them properly. This allows data to be written into different datastreams
based on object properties, e.g. by zone or custom var.
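
Picking up the index template commit above: a minimal sketch of how a managed index template together with the `icinga2@custom` override hook could be installed (the template name, index pattern and mappings below are placeholders chosen for illustration, not necessarily what the PR ships):

curl -X PUT "http://localhost:9200/_index_template/icinga2" -H 'Content-Type: application/json' -d'
{
  "index_patterns": [ "icinga2-*" ],
  "data_stream": {},
  "priority": 500,
  "composed_of": [ "icinga2@custom" ],
  "ignore_missing_component_templates": [ "icinga2@custom" ],
  "template": {
    "mappings": {
      "properties": {
        "@timestamp": { "type": "date" }
      }
    }
  }
}
'

Because the custom component template is listed under ignore_missing_component_templates, users only have to create icinga2@custom when they actually want to override mappings or settings; otherwise the managed template works on its own.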
@mcodato (Contributor) left a comment:

I'm not sure about putting this feature behind a build-flag that is OFF by default.
IMHO, having it compiled by default, when the PR is merged, could make adoption easier and faster. The option to enable or disable the feature already exists.
WDYT @lippserd?

@jschmidt-icinga (Contributor) left a comment:

I just gave this a quick look, so this is not a complete review.

Firstly, I agree with @mcodato that the build flag is unnecessary. It can just be on by default same as all the other perfdata writers.

Secondly, I'd suggest you squash down your commits because they're all operating on the same new component added by this PR. Also there's some whitespace back-and-forth cluttering up the diff.

See below for some additional comments on the code, mostly things the linter complained about:

@w1ll-i-code (Contributor, Author):

#10577 (review): Hi, thanks for having a look at it, I really appreciate it.

I'd suggest you squash down your commits [...]

I left the commits separate deliberately, so it's easier to review. Each commit can be looked at on its own, without having to reason about all changes at once.

@w1ll-i-code force-pushed the wp_elasticsearchdatastreamwriter branch 5 times, most recently from 707f002 to 4a9699b on October 21, 2025 at 14:12
General improvements to code and documentation, addressing comments by:
- Mattia Codato
- Johannes Schmidt
@w1ll-i-code force-pushed the wp_elasticsearchdatastreamwriter branch from 4a9699b to 9a83f44 on October 21, 2025 at 15:06
@jschmidt-icinga (Contributor):

Thank you for this PR. I've briefly spoken to @lippserd about this and I'm going to take another look when I can. I know next to nothing about Elasticsearch yet, so it might take some time until I can test this thoroughly.

@martialblog (Member) commented Oct 23, 2025:

Hi,

Just a hint, I recently addressed another issue with the ElasticsearchWriter format here: #10511

I provided a small but breaking change here: #10518

If the ElasticsearchDatastreamWriter is introduced, both Writers should use the same format, right?

@jschmidt-icinga let me know if you need help with Elasticsearch know-how, you know where my office is.

@martialblog (Member):

FYI, I think this would also (somewhat) address issue #9837.

With datastreams the backing indices are managed by Elasticsearch.
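
For context (the data stream name below is only a placeholder): the backing indices that Elasticsearch manages for a data stream can be inspected with the data stream API, e.g.

curl -X GET "http://localhost:9200/_data_stream/icinga2-checks-default?pretty"

Elasticsearch creates the hidden .ds-* backing indices behind that name and rolls them over (e.g. via an ILM policy), so a writer only ever appends to the data stream itself.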

if (!pdv->GetCrit().IsEmpty() && GetEnableSendThresholds())
	metric->Set("crit", pdv->GetCrit());

pd_fields->Set(pdv->GetLabel(), metric);
@martialblog (Member) commented on the code above, Oct 23, 2025:

Hi,

Am I reading this correctly?
This would result in a schema where the perfdata label is in the field name?

Like so:

"_source": {
  "timestamp": "1761225482",
  "perfdata.rta": {
      "value": 0.091,
      "warn": 100,
      "min": 0,
      "crit": 200,
      "unit": "ms"
  },
  "perfdata.pl": {
      "value": 0,
      "warn": 5,
      "min": 0,
      "crit": 15,
      "unit": "%"
  }
}

I think this might cause issues in the long term, as described here: #6805 and #10511

Each new field will cause a new mapping entry in the index. Correct me if I'm wrong.

The issue with adding new fields for every label is this: https://www.elastic.co/docs/troubleshoot/elasticsearch/mapping-explosion

In the change to the Elasticwriter I proposed, I used a field called "label" in a list of objects.

Like so:

"_source": {
  "timestamp": "1761225482",
  "perfdata": [
    {
      "value": 0.091,
      "label": "rta"
      "warn": 100,
      "min": 0,
      "crit": 200,
      "unit": "ms"
    },
    {
      "value": 0,
      "label": "pl"
      "warn": 5,
      "min": 0,
      "crit": 15,
      "unit": "%"
    }
  ]
}

This would just create a single perfdata field, and there is an expected field name under which to find the label.

Otherwise you need to fetch the entire document, scan for all of the fields that start with "perfdata." to find the perfdata fields, and would still need to split the key name at the "." to get the label, right?

This field can then be mapped as either object or nested:

@martialblog (Member):
Just to make things a bit clearer: when using an array, using nested here might be better, since it maintains the independence of each object.

For example, when searching all perfdata for an object:

curl -X POST "http://localhost:9200/icinga2/_search" -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "must": [
        { "match": { "host": "myhost" }},
        { "match": { "service": "ping6" }}
      ]
    }
  },
   "fields": [
     "check_result.perfdata.*"
   ],
   "_source": false
}
'

An object mapping returns this:

{
  "_id": "AZoRQYZWWEqPxFmVW73R",
  "fields": {
    "check_result.perfdata.unit": [ "ms", "%" ],
    "check_result.perfdata.label": [ "rta", "pl" ],
    "check_result.perfdata.warn": [ 100, 5 ],
    "check_result.perfdata.min": [ 0, 0 ],
    "check_result.perfdata.value": [ 0, 0 ],
    "check_result.perfdata.crit": [ 200, 15 ]
  }
}

And a nested mapping returns:

{
  "_id": "AZoRUkSmucmeEsXVMUwm",
  "fields": {
    "check_result.perfdata": [
      {
        "warn": [ 100 ],
        "unit": [ "ms" ],
        "min": [ 0 ],
        "crit": [ 200 ],
        "label": [ "rta" ],
        "value": [ 0 ]
      },
      {
        "warn": [ 5 ],
        "unit": [ "%" ],
        "min": [ 0 ],
        "crit": [ 15 ],
        "label": [ "pl" ],
        "value": [ 0 ]
      }
    ]
  }
}
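
For comparison, a minimal sketch of how the nested variant could be declared (the index name matches the query above; the concrete field types are my assumption):

curl -X PUT "http://localhost:9200/icinga2/_mapping" -H 'Content-Type: application/json' -d'
{
  "properties": {
    "check_result": {
      "properties": {
        "perfdata": {
          "type": "nested",
          "properties": {
            "label": { "type": "keyword" },
            "unit": { "type": "keyword" },
            "value": { "type": "double" },
            "warn": { "type": "double" },
            "crit": { "type": "double" },
            "min": { "type": "double" }
          }
        }
      }
    }
  }
}
'

With "type": "nested", each perfdata entry is indexed as its own hidden document, so a nested query can require label and value to match within the same entry, whereas the default object mapping flattens the arrays as shown in the first result above.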
