[doc] Improve wording regarding the overflow attribute once the cardinality limit is reached, for easier understanding by end users #5939

Open
xiang17 opened this issue Oct 30, 2024 · 0 comments
Labels
documentation Documentation related enhancement New feature or request good first issue Good for newcomers pkg:OpenTelemetry Issues related to OpenTelemetry NuGet package
Milestone

Comments

@xiang17
Contributor

xiang17 commented Oct 30, 2024

Package

OpenTelemetry

Is your feature request related to a problem?

According to the spec https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/sdk.md#overflow-attribute, once the cardinality limit is reached, measurements can no longer be "independently aggregated", but the spec does not describe them as "dropped". Instead, they are folded into a "synthetic aggregation":

An overflow attribute set is defined, containing a single attribute otel.metric.overflow having (boolean) value true, which is used to report a synthetic aggregation of the Measurements that could not be independently aggregated because of the limit.

This distinction can be confusing to end users.
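
To make the spec wording concrete, here is a toy model of a sum aggregation with a cardinality limit. It is only a sketch and not the SDK's implementation: `ToySumAggregator`, `OVERFLOW_KEY`, and the `route` attribute are hypothetical names made up for illustration, and the model assumes (per the spec text linked above) that one slot of the limit is reserved for the overflow attribute set.

```python
# Toy model only: this is NOT how the OpenTelemetry .NET SDK is implemented.
# The single attribute set that receives everything that cannot get its own series.
OVERFLOW_KEY = (("otel.metric.overflow", True),)


class ToySumAggregator:
    """Hypothetical sum aggregator with a cardinality limit."""

    def __init__(self, cardinality_limit):
        self.cardinality_limit = cardinality_limit
        self.points = {}  # attribute set -> running sum

    def add(self, value, **attributes):
        key = tuple(sorted(attributes.items()))
        # Count the series that are aggregated independently (overflow excluded).
        independent = sum(1 for k in self.points if k != OVERFLOW_KEY)
        # Assumption: one slot is reserved for overflow, so at most
        # (limit - 1) attribute sets can be aggregated independently.
        if key not in self.points and independent >= self.cardinality_limit - 1:
            key = OVERFLOW_KEY  # fold into the synthetic aggregation
        self.points[key] = self.points.get(key, 0) + value


agg = ToySumAggregator(cardinality_limit=3)
for route in ["foo", "bar", "baz", "foo", "qux"]:
    agg.add(1, route=route)

for attrs, total in agg.points.items():
    print(dict(attrs), total)
```

In this model the measurements for `baz` and `qux` are not discarded: they are added to the overflow series, so the grand total across all series still accounts for every measurement, while the per-route breakdown for those two routes is lost.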

I'm opening this issue to track @cijothomas' suggestion for improvement at #5909 (comment):

Agree that the wording may not be easily understood by all end users. The https://github.com/open-telemetry/opentelemetry-dotnet/tree/main/docs/metrics#cardinality-limits doc is very good, and perhaps we can make some additions to it to explain how the SDK behaves when the limit is hit, and how to interpret the overflow attribute correctly.

One important thing I'd like called out: if a user has a query like "sum of all requests where route=foo" and an overflow exists, then that query is no longer trustworthy, as there is no way to tell whether a route=foo measurement was folded into the overflow. The only thing that can be trusted when an overflow exists is the total (i.e., queries that do not filter on any dimension).

If no one volunteers to make this change in the doc, I can cover it. (I am implementing something similar for OTel Rust right now, so I can hopefully reuse some of the wording! Ideally this should be covered on the OTel docs website, so every language can benefit.) https://github.com/utpilla/MetricOverflowAttribute?tab=readme-ov-file can be a good starting point.
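
The point in the quoted comment about filtered queries can be sketched as follows. This is an illustration under stated assumptions, not a real query engine or SDK API: `query_sum`, `OVERFLOW`, and the sample `exported` data are all hypothetical. The helper returns the summed value together with a flag saying whether the result can be trusted.

```python
# Hypothetical exported data points for one metric: attribute set -> sum.
OVERFLOW = frozenset({("otel.metric.overflow", True)})

exported = {
    frozenset({("route", "bar")}): 4,
    frozenset({("route", "baz")}): 1,
    OVERFLOW: 3,  # measurements that could not be aggregated independently
}


def query_sum(points, **filters):
    """Return (value, trustworthy) for a sum over points matching the filters."""
    wanted = set(filters.items())
    if not wanted:
        # An unfiltered total includes the overflow series, so it stays exact.
        return sum(points.values()), True
    matched = sum(v for attrs, v in points.items() if wanted <= attrs)
    # Once an overflow series exists, any filtered result may be missing
    # measurements that were folded into it.
    return matched, OVERFLOW not in points


total, total_ok = query_sum(exported)
foo, foo_ok = query_sum(exported, route="foo")
print(total, total_ok)
print(foo, foo_ok)
```

Here the `route=foo` query finds no matching series at all, yet some or all of the 3 measurements in the overflow series may have carried `route=foo`; the data alone cannot tell. The unfiltered total is the only number that is still reliable.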

What is the expected behavior?

N/A

Which alternative solutions or features have you considered?

N/A

Additional context

No response

@xiang17 xiang17 added enhancement New feature or request needs-triage New issues which have not been classified or triaged by a community member labels Oct 30, 2024
@github-actions github-actions bot added the pkg:OpenTelemetry Issues related to OpenTelemetry NuGet package label Oct 30, 2024
@TimothyMothra TimothyMothra added good first issue Good for newcomers documentation Documentation related and removed needs-triage New issues which have not been classified or triaged by a community member labels Oct 31, 2024
@TimothyMothra TimothyMothra added this to the Future milestone Oct 31, 2024

2 participants