- 
                Notifications
    You must be signed in to change notification settings 
- Fork 598
Create a new ElasticsearchDatastreamWriter to more efficiently store data in elasticsearch. #10577
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Create a new ElasticsearchDatastreamWriter to more efficiently store data in elasticsearch. #10577
Conversation
Copy the elasticsearchwriter to be adapted for the new datastreamwriter. Setup the build system and hide the feature behind a build flag for now.
Restructure the data sent to elasticsearch to align with the Elastic Common Schema specification and separate document indices by check to reduce the number of distinct fields per index.
Handle the errors returned by elasticsearch gracefully to let the user know what went wrong during execution. Do not discard data as soon as a request to elasticsearch fails, but keep retrying until the data can be sent, relying on the WorkQueue for back pressure. Improve the handling of the flush timer, by rescheduling the timer after each flush, making sure that there are no needless flushes under heavy load.
Re-add support for tags, but make them conform to the ecs specification. Add also support for labels, which are the former tags in the elasticsearchwriter. ref: https://www.elastic.co/docs/reference/ecs/ecs-base
Allow the user to filter for which data should be sent to elasticsearch. This allows the user to use multiple datastreams to send the data to different namespaces, for multi-tenancy of different retention policies.
Accept also an ApiKey for authentication in elasticsearch in addition to username + password and certificates.
Icinga2 should manage its own index template if possible. This allows us to ship potential additions to the document later without requiring further user input. However the user has the option to disable the feature, should they want to manage the template manually. For user customization, we ship the `icinga2@custom` component template, so that users can change the behaviour of the template without having to edit the managed one, making updates easier.
Add a template config with all possible config options for the user as well as a short description on what the parameter does and how it can be used. This allows a user to quickly configure the writer without having to look up lots of documentation online.
Update the documentation to give a clearer overview of the new elasticsearchdatastreamwriter feature and add the object and its fields to the syntax highlighting of nano and vim.
Drop messages in the data buffer if the connection to elasticsearch fails. This guarantees that the icinga2 process can still shut down, even with a missconfigured writer or if elasticsearch is down or not reachable without stalling.
Allow the 'datasteam_namespace' variable to contain macros and expand them properly. This allows data to be written into different datastreams based on object properties, e.g. by zone or custom var.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure about putting this feature behind a build-flag that is OFF by default.
IMHO, having it compiled by default, when the PR is merged, could make adoption easier and faster. The option to enable or disable the feature already exists.
WDYT @lippserd ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I just gave this a quick look, so this is not a complete review.
Firstly, I agree with @mcodato that the build flag is unnecessary. It can just be on by default same as all the other perfdata writers.
Secondly, I'd suggest you squash down your commits because they're all operating on the same new component added by this PR. Also there's some whitespace back-and-forth cluttering up the diff.
See below for a some additional comments on the code, mostly things the linter complained about:
| #10577 (review) Hi, thanks for having a look at it, I really appreciate it. 
 I left the commits separate deliberately, so it's easier to review. Each commit can be looked at on their own, without having to reason about all changes at once. | 
707f002    to
    4a9699b      
    Compare
  
    General improvements to code and documentation. Fixing comments by: - Mattia Codato - Johannes Schmidt
4a9699b    to
    9a83f44      
    Compare
  
    | Thank you for this PR. I've briefly spoken to @lippserd about this and I'm going to take another look when I can. I know next to nothing about Elasticsearch yet, so it might take some until I can test this thoroughly. | 
| Hi, Just a hint, I recently addressed other issue with the ElasticsearchWriter format here #10511 I provided a small but breaking change here: #10518 If the ElasticsearchDatastreamWriter is introduced, both Writers should use the same format, right? @jschmidt-icinga let me know if you need help with Elasticsearch knowhow, you know where my office is. | 
| FYI, I think this would also (somewhat) addresses this issue #9837 With datastreams the backing indices are managed by Elasticsearch. | 
| if (!pdv->GetCrit().IsEmpty() && GetEnableSendThresholds()) | ||
| metric->Set("crit", pdv->GetCrit()); | ||
|  | ||
| pd_fields->Set(pdv->GetLabel(), metric); | 
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hi,
Am I reading this correct?
This would result in a schema where the perfdata label is in the field name?
Like so:
"_source": {
  "timestamp": "1761225482",
  "perfdata.rta": {
      "value": 0.091,
      "warn": 100,
      "min": 0,
      "crit": 200,
      "unit": "ms"
  },
  "perfdata.pl": {
      "value": 0,
      "warn": 5,
      "min": 0,
      "crit": 15,
      "unit": "%"
  }
}I think this might cause issues in the longterm, as described here: #6805 and #10511
Since each new field will cause a new mapping in the index. Correct me if I'm wrong.
The issue with adding new fields for every label is this: https://www.elastic.co/docs/troubleshoot/elasticsearch/mapping-explosion
In the change to the Elasticwriter I proposed, I used a field called "label" in a list of objects.
Like so:
"_source": {
  "timestamp": "1761225482",
  "perfdata": [
    {
      "value": 0.091,
      "label": "rta"
      "warn": 100,
      "min": 0,
      "crit": 200,
      "unit": "ms"
    },
    {
      "value": 0,
      "label": "pl"
      "warn": 5,
      "min": 0,
      "crit": 15,
      "unit": "%"
    }
  ]
}This would just create a single field. And there is an expected field name where to find the label.
Otherwise you need to fetch the entire document, scan for all of the fields that start with "perfdata." to find the perfdata fields... and would still need to split the key name at the . to get to the label, right?
This field can then the either object or nested:
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Just to make things a bit clearer. When using an array using nested here might be better, since it maintains the independence of each object.
For example, when searching all perfdata for an object:
curl -X POST "http://localhost:9200/icinga2/_search" -H 'Content-Type: application/json' -d'
{
  "query": {
    "bool": {
      "must": [
        { "match": { "host": "myhost" }},
        { "match": { "service": "ping6" }}
      ]
    }
  },
   "fields": [
     "check_result.perfdata.*"
   ],
   "_source": false
}
'An object mapping returns this:
{
  "_id": "AZoRQYZWWEqPxFmVW73R",
  "fields": {
    "check_result.perfdata.unit": [ "ms", "%" ],
    "check_result.perfdata.label": [ "rta", "pl" ],
    "check_result.perfdata.warn": [ 100, 5 ],
    "check_result.perfdata.min": [ 0, 0 ],
    "check_result.perfdata.value": [ 0, 0 ],
    "check_result.perfdata.crit": [ 200, 15 ]
  }
}And a nested mapping returns:
{
 "_id": "AZoRUkSmucmeEsXVMUwm",
 "fields": {
 "check_result.perfdata": [
   {
     "warn": [ 100 ],
     "unit": [ "ms" ],
     "min": [ 0 ],
     "crit": [200 ],
     "label": [ "rta" ],
     "value": [ 0 ]
   },
   {
     "warn": [ 5 ],
     "unit": [ "%" ],
     "min": [ 0 ],
     "crit": [ 15 ],
     "label": ["pl" ],
     "value": [ 0 ]
   }
 ]
}
PR for #10576. For right now, this will be hidden behind a build-flag, until the feature is sufficiently tested and ready for production use. For all the changes made to the original elasticsearch writer, please consult the individual commits.