-
Notifications
You must be signed in to change notification settings - Fork 371
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Support Block Adapter metrics with adapter_stats_id #8680
Conversation
Not sure what I'm meant to review here: PR says "stats", code says "metrics". I think we need only to rename the PR - if it's about stats then we might need to do rather different things. |
Well, let's have it written: If so, not the mix, since the Having said that, I don't mind renaming this PR 😄 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, requesting changes mostly because of the label name, that I'm not sure what it means
pkg/block/metrics.go
Outdated
adapterStatsID *string | ||
} | ||
|
||
func BuildHistogramsInstance(name string, adapterStatsID *string) Histograms { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is adapterStatsID? not sure what this means, please change name or document
pkg/block/metrics.go
Outdated
func BuildHistogramsInstance(name string, adapterStatsID *string) Histograms { | ||
labelNames := []string{"operation", "error"} | ||
if adapterStatsID != nil { | ||
labelNames = append(labelNames, "adapter_stats_id") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Same for the label, I have no idea what this represents
pkg/block/metrics.go
Outdated
if s.adapterStatsID != nil { | ||
labels = append(labels, *s.adapterStatsID) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not a blocker (thinking out load), but this one disturbs me a bit, why do we need to check every report? It's a bit weird that we have a label that is added only in some cases.
Maybe having two instead of making it generic, or providing getDurationHistogramWithLables. Also maybe moving the optional to the beginning will be safer because we tend to add labels to the end.
IMO just add the label, for everyone, it shouldn't affect too much if it has only one value
pkg/block/s3/stats.go
Outdated
durationHistograms.WithLabelValues(operation, isErrStr).Observe(time.Since(start).Seconds()) | ||
if sizeBytes != nil { | ||
requestSizeHistograms.WithLabelValues(operation, isErrStr).Observe(float64(*sizeBytes)) | ||
func NewS3Stats(adapterStatsID *string) block.Histograms { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
func NewS3Stats(adapterStatsID *string) block.Histograms { | |
func NewStats(adapterStatsID *string) block.Histograms { |
pkg/block/azure/stats.go
Outdated
durationHistograms.WithLabelValues(operation, isErrStr).Observe(time.Since(start).Seconds()) | ||
if sizeBytes != nil { | ||
requestSizeHistograms.WithLabelValues(operation, isErrStr).Observe(float64(*sizeBytes)) | ||
func NewAzureStats(adapterStatsID *string) block.Histograms { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
func NewAzureStats(adapterStatsID *string) block.Histograms { | |
func NewStats(adapterStatsID *string) block.Histograms { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Requesting changes again but this one is critical.
We can't initiate the same adapter twice, which raises the question, how was this tested? Did we try this out for two Adapters with same name and different AdapterStatID?
var durationHistograms = promauto.NewHistogramVec( | ||
prometheus.HistogramOpts{ | ||
Name: name + "_operation_duration_seconds", | ||
Help: "durations of outgoing " + name + " operations", | ||
}, | ||
labelNames, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I believe there is an issue here, each histogram can be initiated one time per name, otherwise it will panic. This means that:
- We can't initiate a block adapter twice!
- Don't think this can be used for 2 different adapterStatesIDs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@itaigilo how was this addressed?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, I see there's a nil check to ensure initialization happens only once. However, I still believe we should prioritize registering the metric during package-level initialization rather than run time
I've done a major refactoring here - Also renamed Tested it with two (S3) blockstores, and it reports each with a different label. @guy-har PTAL again - |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks! Guy's point is obviously blocking here.
@@ -134,3 +137,59 @@ func (m *MetricsAdapter) GetRegion(ctx context.Context, storageID, storageNamesp | |||
func (m *MetricsAdapter) RuntimeStats() map[string]string { | |||
return m.adapter.RuntimeStats() | |||
} | |||
|
|||
type Histograms struct { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Consider calling this a Histogram - after all that's literally the name Prometheus gives it.
prometheus.HistogramOpts{ | ||
Name: name + "_operation_size_bytes", | ||
Help: "handled sizes of outgoing " + name + " operations", | ||
Buckets: prometheus.ExponentialBuckets(1, 10, 10), //nolint: mnd |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
TIL these the buckets we have. That's fairly nasty... (No action in this PR, of course.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I prefer not to initiate metrics on New adapter. I would recommend:
- leave the metrics as they were
- add the adapter id label
- initiated the adapter with a property of adapter id
- call
reportMetrics
with adapterID
I believe this will reduce complexity
var durationHistograms = promauto.NewHistogramVec( | ||
prometheus.HistogramOpts{ | ||
Name: name + "_operation_duration_seconds", | ||
Help: "durations of outgoing " + name + " operations", | ||
}, | ||
labelNames, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@itaigilo how was this addressed?
@@ -132,7 +132,7 @@ func BuildGSClient(ctx context.Context, params params.GS) (*storage.Client, erro | |||
return storage.NewClient(ctx, opts...) | |||
} | |||
|
|||
func buildGSAdapter(ctx context.Context, params params.GS) (*gs.Adapter, error) { | |||
func buildGSAdapter(ctx context.Context, params params.GS, metricsID *string) (*gs.Adapter, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is metricsID, Isn't it the adapter ID?
Closes # 8672.
Change Description
Currently, the Block Adapters that use
reportMetrics
only report withoperation
andisErrStr
labels -Adding another
adapter_stats_id
label to it, to allow better tracing.