Use -Array variants of aggregates in schema_array_transformer #1152
Open
avelanarius wants to merge 1 commit into QuesmaOrg:main from avelanarius:arrayjoin_fixes
schema_array_transformer transforms the SQL query for Array columns. Before this change, if an aggregation was performed on an Array column, e.g. sum(myArrayColumn), the transformer would change it into sum(arrayJoin(myArrayColumn)). However, using the arrayJoin function has problems: arrayJoin modifies the result set of the SQL query by introducing additional rows. If there are many arrayJoins, a Cartesian product with many rows is produced: this slows the query down and makes the result invalid (we don't actually want a Cartesian product!). Solve the problem by using "Array" variants of aggregates (e.g. sumArray instead of sum(arrayJoin())), which do not inflate the number of result rows. Note that this PR does NOT get rid of arrayJoin() fully. There are panels that actually need it, such as "Top products this week" in the eCommerce dashboard.
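The row inflation described above can be illustrated without a database. The following is a minimal Go sketch that only simulates the behaviour. The `row` type and both helper functions are hypothetical and are not part of Quesma or ClickHouse. It assumes that two arrayJoin calls over different arrays expand each row into the Cartesian product of those arrays before aggregation.

```go
package main

import "fmt"

// row is a hypothetical table row with two Array columns.
type row struct {
	a []int
	b []int
}

// sumWithArrayJoin simulates SELECT sum(arrayJoin(a)), sum(arrayJoin(b)):
// every row is first expanded into the Cartesian product of its two
// arrays, and the sums are computed over the expanded rows.
func sumWithArrayJoin(rows []row) (sumA, sumB, expanded int) {
	for _, r := range rows {
		for _, x := range r.a {
			for _, y := range r.b {
				sumA += x
				sumB += y
				expanded++
			}
		}
	}
	return sumA, sumB, expanded
}

// sumWithArrayVariant simulates SELECT sumArray(a), sumArray(b):
// each array is summed on its own, so no extra rows are produced.
func sumWithArrayVariant(rows []row) (sumA, sumB int) {
	for _, r := range rows {
		for _, x := range r.a {
			sumA += x
		}
		for _, y := range r.b {
			sumB += y
		}
	}
	return sumA, sumB
}

func main() {
	rows := []row{{a: []int{1, 2}, b: []int{10, 20, 30}}}

	ja, jb, n := sumWithArrayJoin(rows)
	// One input row becomes 6 rows; the sums are 9 and 120 (both inflated).
	fmt.Println("arrayJoin:", n, "rows, sums", ja, jb)

	va, vb := sumWithArrayVariant(rows)
	// Still one logical row; the sums are 3 and 60.
	fmt.Println("-Array:   ", len(rows), "rows, sums", va, vb)
}
```

With a single input row the arrayJoin simulation already triples one sum and doubles the other, which is the "invalid result" the description refers to.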
avelanarius force-pushed the arrayjoin_fixes branch from 68315d9 to 68e7149 on January 7, 2025 10:18
avelanarius changed the title on Jan 7, 2025: schema_array_transformer fixes (part 1) → Use "Array" variants of aggregates in schema_array_transformer
avelanarius changed the title on Jan 7, 2025: Use "Array" variants of aggregates in schema_array_transformer → Use -Array variants of aggregates in schema_array_transformer
avelanarius requested review from nablaone, jakozaur, mieciu, trzysiek and pdelewski on January 7, 2025 10:29
jakozaur reviewed on Jan 7, 2025
@@ -81,13 +205,28 @@ func NewArrayTypeVisitor(resolver arrayTypeResolver) model.ExprVisitor {
		if ok {
If the suffix ends with merge, do special logic.
nablaone reviewed on Jan 13, 2025
@@ -81,13 +205,28 @@ func NewArrayTypeVisitor(resolver arrayTypeResolver) model.ExprVisitor {
		if ok {
			dbType := resolver.dbColumnType(column.ColumnName)
			if strings.HasPrefix(dbType, "Array") {
				if strings.HasPrefix(e.Name, "sum") {
Maybe we should resurrect that idea
schema_array_transformer transforms the SQL query for Array columns. Before this change, if an aggregation was performed on an Array column, e.g. sum(myArrayColumn), the transformer would change it into sum(arrayJoin(myArrayColumn)).
However, using the arrayJoin function has problems: arrayJoin modifies the result set of the SQL query by introducing additional rows. If there are many arrayJoins, a Cartesian product with many rows is produced: this slows the query down and makes the result invalid (we don't actually want a Cartesian product!).
Solve the problem by using -Array variants of aggregates (e.g. sumArray instead of sum(arrayJoin())), which do not inflate the number of result rows.
Note that this PR does NOT get rid of arrayJoin() fully in all cases. There are panels that actually need it, such as "Top products this week" in the eCommerce dashboard, where we GROUP BY an array column.
This remaining case should use the ARRAY JOIN operator, but this is out of scope for this PR.
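The rewrite rule described above can be sketched as a small Go function. This is only an illustration under stated assumptions, not the code from this PR: `arrayVariant` is a hypothetical helper, and the allow-list of aggregates is an assumption chosen for the example. The "Array" type-prefix check mirrors the `strings.HasPrefix(dbType, "Array")` test visible in the reviewed diff.

```go
package main

import (
	"fmt"
	"strings"
)

// arrayVariant is a hypothetical helper (not Quesma's actual code).
// Given an aggregate function name and the database type of its
// argument column, it returns the -Array variant of the aggregate
// when the column is an Array, and reports whether it rewrote the name.
func arrayVariant(fn, dbType string) (string, bool) {
	// Only Array columns need the rewrite.
	if !strings.HasPrefix(dbType, "Array") {
		return fn, false
	}
	// Assumed allow-list for this sketch; exact matching also means
	// names that already carry a suffix (e.g. sumArray) are left alone.
	switch fn {
	case "sum", "avg", "min", "max":
		return fn + "Array", true
	}
	return fn, false
}

func main() {
	for _, c := range []struct{ fn, dbType string }{
		{"sum", "Array(Int64)"},
		{"sum", "Int64"},
		{"count", "Array(String)"},
	} {
		name, rewritten := arrayVariant(c.fn, c.dbType)
		fmt.Printf("%s on %s -> %s (rewritten: %v)\n", c.fn, c.dbType, name, rewritten)
	}
}
```

Returning the unchanged name together with a boolean keeps the caller free to fall back to other handling, such as the remaining arrayJoin cases mentioned in the description.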