including parsed "cookie" HTTP request header in ingest can lead to mapping conflicts #4006

@trentm

Take this "cookie" HTTP request header value:

sessionid=42; foo=somevalue; foo.bar=some-other-value

As of #3322, this has been included in intake v2 transaction objects as:

...
  context: {
    request: {
      headers: {
        cookie: '[REDACTED]',
...
      },
      cookies: {
        sessionid: '[REDACTED]',
        'foo': 'somevalue',
        'foo.bar': 'some-other-value'
      }
...

This results in the APM server rejecting the transaction document due to a mapping conflict, logging an error like:

failed to index documents in '.ds-traces-apm-default-2024.05.09-000001' (illegal_argument_exception): can't merge a non object mapping [http.request.cookies.foo] with an object mapping

The issue is that the current mapping:

              {
                "http.request.cookies": {
                  "path_match": "http.request.cookies.*",
                  "mapping": {
                    "type": "keyword"
                  },
                  "match_mapping_type": "string"
                }
              },

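attempts to add both http.request.cookies.foo and http.request.cookies.foo.bar to the document. Because dots in field names are expanded into object paths, this requires http.request.cookies.foo to be both a string (keyword) and an object, which is not possible.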
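
To make the conflict concrete, here is a minimal sketch that flags cookie names where one name is a dotted prefix of another. `findDottedKeyConflicts` is a hypothetical helper written for illustration only; it is not part of the agent or APM server.

```javascript
// Hypothetical helper, for illustration only: finds pairs of cookie names
// where one name is a dotted prefix of another. Such pairs cannot both be
// indexed, because the shorter name would need to be a keyword and an
// object at the same time.
function findDottedKeyConflicts(cookies) {
  const names = Object.keys(cookies);
  const conflicts = [];
  for (const name of names) {
    const prefix = name + '.';
    for (const other of names) {
      if (other.startsWith(prefix)) {
        conflicts.push([name, other]);
      }
    }
  }
  return conflicts;
}

// The cookies object from the example above.
const exampleCookies = {
  sessionid: '[REDACTED]',
  foo: 'somevalue',
  'foo.bar': 'some-other-value',
};

// Reports the single conflicting pair: 'foo' and 'foo.bar'.
console.log(findDottedKeyConflicts(exampleCookies));
```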

The behaviour ends up being subtle, because the intake request responds with 202 -- i.e. the APM agent doesn't see any issue. The transaction name ends up being visible in the Kibana APM app -- because transaction metrics are still created for it. However, the trace waterfall cannot be rendered, because of the missing document in the traces-apm-default data stream.

(Aside: The [REDACTED] value is according to the https://www.elastic.co/guide/en/apm/agent/nodejs/current/configuration.html#sanitize-field-names config var. Whether it is redacted is unrelated to this issue.)

Proposal

Given that including context.request.cookies.* can result in a subtle issue, I think we should stop including it and go back to just including context.request.headers.cookie. Two options here:

  1. Leave context.request.headers.cookie as [REDACTED], i.e. fully redact it every time.
  2. Put the cookie string back together, with the values of sanitized fields replaced by [REDACTED]. In the example above this would be 'sessionid=[REDACTED]; foo=somevalue; foo.bar=some-other-value'.

The former is less processing, but not by much. Currently we are already parsing the cookie header.
I favour doing option 2.
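
A rough sketch of what option 2 could look like, assuming a simple split on `;` and a list of regular expressions for sensitive names. `redactCookieHeader` and the pattern list are hypothetical names for illustration; the agent's actual parsing and its handling of the sanitize-field-names config may differ.

```javascript
// Hypothetical sketch of option 2; the function name and the regex list are
// illustrative assumptions, not the agent's actual implementation.
const REDACTED = '[REDACTED]';

function redactCookieHeader(cookieHeader, sensitiveNamePatterns) {
  return cookieHeader
    .split(';')
    .map((pair) => pair.trim())
    .filter((pair) => pair.length > 0)
    .map((pair) => {
      const eq = pair.indexOf('=');
      if (eq === -1) {
        return pair; // no "=" in this pair: kept as-is in this sketch
      }
      const name = pair.slice(0, eq).trim();
      const isSensitive = sensitiveNamePatterns.some((re) => re.test(name));
      return isSensitive ? `${name}=${REDACTED}` : pair;
    })
    .join('; ');
}

const header = 'sessionid=42; foo=somevalue; foo.bar=some-other-value';
// Prints: sessionid=[REDACTED]; foo=somevalue; foo.bar=some-other-value
console.log(redactCookieHeader(header, [/session/i]));
```

Because the result is a single string under context.request.headers.cookie, no per-cookie fields are created and the dotted-name conflict cannot occur.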
