Description
Describe the enhancement:
I am using the Kinesis input to collect structured logs from AWS CloudWatch that have been partly populated using the ecs-pino-format library. Inspecting the data output to Elasticsearch, I found that there are a couple of layers of structure around my application logs:
- Elastic Serverless Forwarder's ECS fields
- AWS log record structure, JSON stringified on the ECS `message` field
- My application logs are further stringified on the log record's `message` field
{
"@timestamp": "2023-02-14T21:26:04.387857Z",
"message": "{\"id\":\"37385191276969030971785538657917125344955573022299717635\",\"timestamp\":1676409955997,\"message\":\"{\\\"log.level\\\":\\\"info\\\",\\\"@timestamp\\\":\\\"2023-02-14T21:25:55.996Z\\\",\\\"message\\\":\\\"beepboop\\\"}\"}"
}
I was expecting to see my application logs as-is. This could be achieved if:
- The CloudWatch log record structure was parsed by ESF
- An attempt was made to parse the log record's `message` string as JSON
  - If `message` is valid JSON, merge the content into the root object that is forwarded
  - If `message` is not JSON, preserve it as-is
- Optionally add (or overwrite/merge) ECS objects with data from ESF
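To make the request concrete, here is a minimal Python sketch of the unwrapping described above. It is not how ESF is implemented. The function names `try_parse_json_object` and `unwrap_forwarded_event` are hypothetical, and letting application fields overwrite forwarder fields on conflict is only one possible merge policy.

```python
import json
from typing import Any, Dict, Optional


def try_parse_json_object(value: Any) -> Optional[Dict[str, Any]]:
    """Return the parsed dict if `value` is a JSON object string, else None."""
    if not isinstance(value, str):
        return None
    try:
        parsed = json.loads(value)
    except json.JSONDecodeError:
        return None
    return parsed if isinstance(parsed, dict) else None


def unwrap_forwarded_event(event: Dict[str, Any]) -> Dict[str, Any]:
    """Hypothetical helper: flatten the nested `message` layers of one event."""
    result = dict(event)

    # Layer 1: the CloudWatch log record, stringified on the ECS `message` field.
    log_record = try_parse_json_object(result.get("message"))
    if log_record is None or "message" not in log_record:
        return result  # not a recognizable log record; leave the event untouched

    # Layer 2: the application log, stringified on the record's `message` field.
    raw_app_message = log_record["message"]
    app_log = try_parse_json_object(raw_app_message)
    if app_log is None:
        # Not JSON: preserve the record's message as-is.
        result["message"] = raw_app_message
        return result

    # Valid JSON: merge the application's fields into the root object.
    # Assumption: application fields win over forwarder fields on conflict.
    # Dotted keys such as "log.level" are kept as literal keys, not expanded.
    result.update(app_log)
    return result


if __name__ == "__main__":
    app_log = {
        "log.level": "info",
        "@timestamp": "2023-02-14T21:25:55.996Z",
        "message": "beepboop",
    }
    log_record = {
        "id": "37385191276969030971785538657917125344955573022299717635",
        "timestamp": 1676409955997,
        "message": json.dumps(app_log),
    }
    forwarded = {
        "@timestamp": "2023-02-14T21:26:04.387857Z",
        "message": json.dumps(log_record),
    }
    print(unwrap_forwarded_event(forwarded))
```

With the sample event from above, the forwarded document would carry `message: "beepboop"` and `log.level: "info"` at the root, with `@timestamp` taken from the application log.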
Describe a specific use case for the enhancement or feature:
When I initially saw this, I thought that it was a reasonable preservation of the various layers my logs are passing through. I assumed that I could use Ingest Pipelines to parse each layer, extract my logs, and merge them into the ESF structure in order to leverage some of its ECS content while adding my own. For reference, here's a portion of my pipeline:
{
"processors": [
{
"json": {
"field": "message",
"target_field": "parsed_cloudwatch_log_event"
}
},
{
"json": {
"field": "parsed_cloudwatch_log_event.message",
"target_field": "parsed_app_log"
}
},
{
"set": {
"field": "@timestamp",
"copy_from": "parsed_app_log.@timestamp",
"ignore_failure": true
}
},
{
"set": {
"field": "message",
"copy_from": "parsed_app_log.message",
"ignore_failure": true
}
},
{
"remove": {
"field": "parsed_cloudwatch_log_event"
}
},
{
"remove": {
"field": "parsed_app_log"
}
}
]
}
However, I have encountered some limitations in the pipeline processors that are currently blocking me from accessing the `log.level` field that ecs-pino-format adds.
In my own case I may need to stop using ecs-pino-format as I don't see the blocking issues in the data processors being resolved anytime soon. I don't really expect that this enhancement will be implemented either. But I wanted to at least document what I'm dealing with as a user.