-
Notifications
You must be signed in to change notification settings - Fork 3.8k
chore(engine): Refactor streams result builder to use column-oriented processsing #19505
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
0720a8a to
c9658ff
Compare
c9658ff to
3e4384d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Generally LGTM.
Left a suggestion how to reduce allocations. Also, feel free to add the benchmark code and run the benchmarks for old and new version and post the output of benchstat old.txt new.txt
251bf2b to
3d6ac88
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The benchmarks look great!
ab921b8 to
761be1c
Compare
761be1c to
a8d54c8
Compare
|
@chaudum I updated the implementation to use a slice of structs for rows buffer. Benchmark shows there is no major difference: here's benchstat between the previously reviewed implementation and the latest one (diff): I also tried to use func TestLabelsBuilder_Playground(t *testing.T) {
lbs := labels.EmptyLabels()
b := NewBaseLabelsBuilder().ForLabels(lbs, labels.StableHash(lbs))
// b.Set(ParsedLabel, "a", "b") // will be returned as part of result.Labels()
b.Set(StreamLabel, "a", "b") // won't be returned as result.Labels()
result := b.LabelsResult()
require.Equal(t, "{a=\"b\"}", result.Labels().String())
}I'm trying to add this support but it takes time and increases the size of the PR so I'd like to have a different PR for it, wdyt? |
pkg/engine/compat.go
Outdated
| "github.com/prometheus/prometheus/promql" | ||
|
|
||
| "github.com/grafana/loki/pkg/push" | ||
| "github.com/grafana/loki/v3/pkg/logproto" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the linter will complain about this line not being in the third import block.
… processing Change streamsResultBuilder.CollectRecord() from row-by-row to column-by-column iteration. This better aligns with Arrow's columnar memory layout, improving CPU cache locality by reading contiguous memory regions.
a8d54c8 to
3bfcf9c
Compare
What this PR does / why we need it:
Change streamsResultBuilder.CollectRecord() from row-by-row to column-by-column iteration. This better aligns with Arrow's columnar memory layout, improving CPU cache locality by reading contiguous memory regions.
I also remove
IsValidchecks given thatIsNullchecks are present.IsValidseems to be an exact opposite ofIsNull(based on this code).The slices and label builders for intermediate rows data are reused between CollectRecord.
old vs latest version
Special notes for your reviewer:
Checklist
CONTRIBUTING.mdguide (required)featPRs are unlikely to be accepted unless a case can be made for the feature actually being a bug fix to existing behavior.docs/sources/setup/upgrade/_index.mddeprecated-config.yamlanddeleted-config.yamlfiles respectively in thetools/deprecated-config-checkerdirectory. Example PR