Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix a bucket mapping issue with the origin inspector latency metrics #172

Merged
merged 1 commit into from
Jul 25, 2024

Conversation

crivera-fastly
Copy link
Contributor

The exporter maps bucketed counters from the real-time stats API to a Prometheus histogram. Since it doesn't have the actual values that were used to build the bucketed counters it has to pick a value for each bucket to call Observe() with.

The current code uses values that are mapping to the wrong buckets. Here's a subset of the buckets from the origin latency histogram definition:

0.001, 0.005, 0.010

Calling Observe() with the value of 0.005 places the value in the 1-5ms bucket, not the 5-10ms bucket. The result of the current behavior is that the counts for each bucket are shifted to the next smallest bucket.

This fixes the issue by using the values at the end of each bucket interval as the parameters to Observe(). This will result in increased origin latency values when users upgrade to a release with this commit, but the increased numbers are more accurate.

The exporter maps bucketed counters from the real-time stats API to
a Prometheus histogram. Since it doesn't have the actual values that
were used to build the bucketed counters it has to pick a value for
each bucket to call Observe() with.

The current code uses values that are mapping to the wrong buckets.
Here's a subset of the buckets from the origin latency histogram
definition:

0.001, 0.005, 0.010

Calling Observe() with the value of 0.005 places the value in the 1-5ms
bucket, not the 5-10ms bucket. The result of the current behavior is that
the counts for each bucket are shifted to the next smallest bucket.

This fixes the issue by using the values at the end of each bucket interval
as the parameters to Observe(). This will result in increased origin latency
values when users upgrade to a release with this commit, but the increased
numbers are more accurate.
@leklund leklund added breaking-change Breaking Change bug Something isn't working labels Jul 25, 2024
Copy link
Member

@leklund leklund left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good 👍

@crivera-fastly crivera-fastly merged commit 2e8b6cf into main Jul 25, 2024
14 checks passed
@crivera-fastly crivera-fastly deleted the latency-fix branch July 25, 2024 18:56
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking-change Breaking Change bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants