Conversation

@w1ll-i-code (Contributor):

PR for #10576. For now, this will be hidden behind a build flag until the feature is sufficiently tested and ready for production use. For all the changes made to the original elasticsearchwriter, please consult the individual commits.

Copy the elasticsearchwriter to be adapted for the new datastreamwriter.
Set up the build system and hide the feature behind a build flag for now.
Restructure the data sent to elasticsearch to align with the Elastic
Common Schema specification and separate document indices by check to
reduce the number of distinct fields per index.
Handle the errors returned by elasticsearch gracefully to let the user
know what went wrong during execution. Do not discard data as soon as a
request to elasticsearch fails, but keep retrying until the data can be
sent, relying on the WorkQueue for back pressure. Improve the handling
of the flush timer by rescheduling it after each flush, making sure
that there are no needless flushes under heavy load.
Re-add support for tags, but make them conform to the ECS specification.
Also add support for labels, which correspond to the former tags in the
elasticsearchwriter.

ref: https://www.elastic.co/docs/reference/ecs/ecs-base
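
As a rough illustration (the concrete values are my assumption, not taken from the PR), an event carrying both tags and labels per the ECS base fields could look like this:

{
  "@timestamp": "2025-10-07T12:00:00.000Z",
  "tags": [ "production", "icinga2" ],
  "labels": {
    "tenant": "customer-a",
    "team": "monitoring"
  }
}

Per ECS, tags is a plain list of keywords, while labels is a flat object with string values only (no nested objects).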
Allow the user to filter which data should be sent to elasticsearch.
This allows the user to use multiple datastreams to send the data to
different namespaces, e.g. for multi-tenancy or different retention policies.
Also accept an ApiKey for authentication with elasticsearch, in addition
to username + password and certificates.
Icinga2 should manage its own index template if possible. This allows us
to ship potential additions to the document later without requiring
further user input. However, the user has the option to disable the
feature, should they want to manage the template manually. For user
customization, we ship the `icinga2@custom` component template, so that
users can change the behaviour of the template without having to edit
the managed one, making updates easier (see the sketch after this list of commits).
Add a template config with all possible config options for the user, as
well as a short description of what each parameter does and how it can be
used. This allows a user to quickly configure the writer without having
to look up lots of documentation online.
Update the documentation to give a clearer overview of the new
elasticsearchdatastreamwriter feature and add the object and its fields
to the syntax highlighting of nano and vim.
Drop messages in the data buffer if the connection to elasticsearch
fails. This guarantees that the icinga2 process can still shut down
without stalling, even with a misconfigured writer or if elasticsearch
is down or not reachable.
Allow the 'datastream_namespace' variable to contain macros and expand
them properly. This allows data to be written into different datastreams
based on object properties, e.g. by zone or custom var.
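
Picking up the index template commit above: a minimal sketch of how a managed index template together with the `icinga2@custom` override hook could be installed (the template name, index pattern and mappings below are placeholders chosen for illustration, not necessarily what the PR ships):

curl -X PUT "http://localhost:9200/_index_template/icinga2" -H 'Content-Type: application/json' -d'
{
  "index_patterns": [ "icinga2-*" ],
  "data_stream": {},
  "priority": 500,
  "composed_of": [ "icinga2@custom" ],
  "ignore_missing_component_templates": [ "icinga2@custom" ],
  "template": {
    "mappings": {
      "properties": {
        "@timestamp": { "type": "date" }
      }
    }
  }
}
'

Because the custom component template is listed under ignore_missing_component_templates, users only have to create icinga2@custom when they actually want to override mappings or settings; otherwise the managed template works on its own.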
@mcodato (Contributor) left a comment:

I'm not sure about putting this feature behind a build-flag that is OFF by default.
IMHO, having it compiled by default, when the PR is merged, could make adoption easier and faster. The option to enable or disable the feature already exists.
WDYT @lippserd?

@jschmidt-icinga (Contributor) left a comment:

I just gave this a quick look, so this is not a complete review.

Firstly, I agree with @mcodato that the build flag is unnecessary. It can just be on by default same as all the other perfdata writers.

Secondly, I'd suggest you squash down your commits because they're all operating on the same new component added by this PR. Also there's some whitespace back-and-forth cluttering up the diff.

See below for some additional comments on the code, mostly things the linter complained about:

@w1ll-i-code (Contributor, Author):

#10577 (review): Hi, thanks for having a look at it, I really appreciate it.

I'd suggest you squash down your commits [...]

I left the commits separate deliberately, so it's easier to review. Each commit can be looked at on its own, without having to reason about all changes at once.

@w1ll-i-code force-pushed the wp_elasticsearchdatastreamwriter branch 5 times, most recently from 707f002 to 4a9699b on October 21, 2025 at 14:12
General improvements to code and documentation, addressing comments by:
- Mattia Codato
- Johannes Schmidt
@w1ll-i-code force-pushed the wp_elasticsearchdatastreamwriter branch from 4a9699b to 9a83f44 on October 21, 2025 at 15:06
@jschmidt-icinga (Contributor):

Thank you for this PR. I've briefly spoken to @lippserd about this and I'm going to take another look when I can. I know next to nothing about Elasticsearch yet, so it might take some time until I can test this thoroughly.

@martialblog (Member) commented Oct 23, 2025:

Hi,

Just a hint, I recently addressed another issue with the ElasticsearchWriter format here: #10511

I provided a small but breaking change here: #10518

If the ElasticsearchDatastreamWriter is introduced, both Writers should use the same format, right?

@jschmidt-icinga let me know if you need help with Elasticsearch know-how, you know where my office is.

@martialblog (Member):

FYI, I think this would also (somewhat) address issue #9837.

With datastreams the backing indices are managed by Elasticsearch.
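
For context (the data stream name below is only a placeholder): the backing indices that Elasticsearch manages for a data stream can be inspected with the data stream API, e.g.

curl -X GET "http://localhost:9200/_data_stream/icinga2-checks-default?pretty"

Elasticsearch creates the hidden .ds-* backing indices behind that name and rolls them over (e.g. via an ILM policy), so a writer only ever appends to the data stream itself.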

if (!pdv->GetCrit().IsEmpty() && GetEnableSendThresholds())
	metric->Set("crit", pdv->GetCrit());

pd_fields->Set(pdv->GetLabel(), metric);
@martialblog (Member) commented on the code above, Oct 23, 2025:

Hi,

Am I reading this correctly?
This would result in a schema where the perfdata label is in the field name?

Like so:

"_source": {
  "timestamp": "1761225482",
  "perfdata.rta": {
      "value": 0.091,
      "warn": 100,
      "min": 0,
      "crit": 200,
      "unit": "ms"
  },
  "perfdata.pl": {
      "value": 0,
      "warn": 5,
      "min": 0,
      "crit": 15,
      "unit": "%"
  }
}

I think this might cause issues in the long term, as described here: #6805 and #10511

Each new field will cause a new mapping entry in the index. Correct me if I'm wrong.

The issue with adding new fields for every label is this: https://www.elastic.co/docs/troubleshoot/elasticsearch/mapping-explosion

In the change to the Elasticwriter I proposed, I used a field called "label" in a list of objects.

Like so:

"_source": {
  "timestamp": "1761225482",
  "perfdata": [
    {
      "value": 0.091,
      "label": "rta"
      "warn": 100,
      "min": 0,
      "crit": 200,
      "unit": "ms"
    },
    {
      "value": 0,
      "label": "pl"
      "warn": 5,
      "min": 0,
      "crit": 15,
      "unit": "%"
    }
  ]
}

This would just create a single perfdata field, and there is an expected field name under which to find the label.

Otherwise you need to fetch the entire document, scan for all of the fields that start with "perfdata." to find the perfdata fields, and would still need to split the key name at the "." to get the label, right?

This field can then be mapped as either object or nested:

@martialblog (Member):
Just to make things a bit clearer: when using an array, using nested here might be better, since it maintains the independence of each object.

For example, when searching all perfdata for an object:

curl -X POST "http://localhost:9200/icinga2/_search" -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "must": [
        { "match": { "host": "myhost" }},
        { "match": { "service": "ping6" }}
      ]
    }
  },
   "fields": [
     "check_result.perfdata.*"
   ],
   "_source": false
}
'

An object mapping returns this:

{
  "_id": "AZoRQYZWWEqPxFmVW73R",
  "fields": {
    "check_result.perfdata.unit": [ "ms", "%" ],
    "check_result.perfdata.label": [ "rta", "pl" ],
    "check_result.perfdata.warn": [ 100, 5 ],
    "check_result.perfdata.min": [ 0, 0 ],
    "check_result.perfdata.value": [ 0, 0 ],
    "check_result.perfdata.crit": [ 200, 15 ]
  }
}

And a nested mapping returns:

{
  "_id": "AZoRUkSmucmeEsXVMUwm",
  "fields": {
    "check_result.perfdata": [
      {
        "warn": [ 100 ],
        "unit": [ "ms" ],
        "min": [ 0 ],
        "crit": [ 200 ],
        "label": [ "rta" ],
        "value": [ 0 ]
      },
      {
        "warn": [ 5 ],
        "unit": [ "%" ],
        "min": [ 0 ],
        "crit": [ 15 ],
        "label": [ "pl" ],
        "value": [ 0 ]
      }
    ]
  }
}
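
For comparison, a minimal sketch of how the nested variant could be declared (the index name matches the query above; the concrete field types are my assumption):

curl -X PUT "http://localhost:9200/icinga2/_mapping" -H 'Content-Type: application/json' -d'
{
  "properties": {
    "check_result": {
      "properties": {
        "perfdata": {
          "type": "nested",
          "properties": {
            "label": { "type": "keyword" },
            "unit": { "type": "keyword" },
            "value": { "type": "double" },
            "warn": { "type": "double" },
            "crit": { "type": "double" },
            "min": { "type": "double" }
          }
        }
      }
    }
  }
}
'

With "type": "nested", each perfdata entry is indexed as its own hidden document, so a nested query can require label and value to match within the same entry, whereas the default object mapping flattens the arrays as shown in the first result above.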
