Cannot Count(Expr:Wildcard) with DataFrame API #5518
I think such logic should be moved to the Analyzer.
Thanks @mingmwang. As @Jefffrey mentioned, though, using the DataFrame API will produce an error.
I'm trying another way, following @Jefffrey's method.
Following everyone's suggestions, I made some modifications.
    //handle Count(Expr:Wildcard) with DataFrame API
    pub fn handle_wildcard(exprs: Vec<Expr>) -> Result<Vec<Expr>> {
        let exprs: Vec<Expr> = exprs
            .iter()
            .map(|expr| {
                if let Expr::AggregateFunction(AggregateFunction {
                    fun,
                    args,
                    distinct,
                    filter,
                }) = expr
                {
                    if let aggregate_function::AggregateFunction::Count = fun {
                        if args.len() == 1 {
                            let arg = args.get(0).unwrap().clone();
                            match arg {
                                Expr::Wildcard => {
                                    Expr::AggregateFunction(AggregateFunction {
                                        fun: fun.clone(),
                                        args: vec![lit(ScalarValue::UInt8(Some(1)))],
                                        distinct: *distinct,
                                        filter: filter.clone(),
                                    })
                                }
                                _ => expr.clone(),
                            }
                        } else {
                            expr.clone()
                        }
                    } else {
                        expr.clone()
                    }
                } else {
                    expr.clone()
                }
            })
            .collect();
        Ok(exprs)
    }
The extreme nesting is not ideal; I think it can be simplified to this:
    //handle Count(Expr:Wildcard) with DataFrame API
    pub fn handle_wildcard(exprs: Vec<Expr>) -> Result<Vec<Expr>> {
        let exprs: Vec<Expr> = exprs
            .iter()
            .map(|expr| match expr {
                Expr::AggregateFunction(AggregateFunction {
                    fun: aggregate_function::AggregateFunction::Count,
                    args,
                    distinct,
                    filter,
                }) if args.len() == 1 => match args[0] {
                    Expr::Wildcard => Expr::AggregateFunction(AggregateFunction {
                        fun: aggregate_function::AggregateFunction::Count,
                        // TODO: replace with the constant
                        args: vec![lit(ScalarValue::UInt8(Some(1)))],
                        distinct: *distinct,
                        filter: filter.clone(),
                    }),
                    _ => expr.clone(),
                },
                _ => expr.clone(),
            })
            .collect();
        Ok(exprs)
    }
I'm unsure whether it can be simplified further; feel free to explore that.
Also check my TODO comment about the constant (I think I mentioned it in the original issue).
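To illustrate the two points in this review (flattening the nesting, and replacing the inline literal with a named constant), here is a minimal self-contained sketch. It deliberately uses stand-in types rather than DataFusion's real `Expr`, and the constant name `COUNT_STAR_EXPANSION` is an assumption made for illustration, so treat it as a shape to aim for rather than a drop-in patch.

```rust
// Stand-in types for illustration only; these are NOT DataFusion's real definitions.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Wildcard,
    Column(String),
    Literal(u8),
    Count { args: Vec<Expr>, distinct: bool },
}

/// Hypothetical named constant for the literal that replaces `*` in `COUNT(*)`.
const COUNT_STAR_EXPANSION: u8 = 1;

/// Rewrite `COUNT(*)` to `COUNT(<constant>)`, leaving every other expression untouched.
fn handle_wildcard(exprs: Vec<Expr>) -> Vec<Expr> {
    exprs
        .into_iter()
        .map(|expr| match expr {
            // A slice pattern checks "exactly one argument, and it is a wildcard" in one step.
            Expr::Count { args, distinct } if matches!(args.as_slice(), [Expr::Wildcard]) => {
                Expr::Count {
                    args: vec![Expr::Literal(COUNT_STAR_EXPANSION)],
                    distinct,
                }
            }
            other => other,
        })
        .collect()
}
```

Consuming the vector with `into_iter()` avoids the `expr.clone()` calls in the fall-through arms; whether that fits the real call sites is something to check against the actual code.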
I saw @mingmwang's PR #5570; maybe I can migrate the current logic to @mingmwang's solution.
I'm trying to add an AnalyzerRule in #5627 to do the same thing.
This does look good, though it is worth checking other opinions about where the logic should reside (here, or in the analyzer as mentioned).
LGTM -- I also think #5627 looks good too, but I defer to @mingmwang
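For readers weighing the two placements, the analyzer-side alternative can be sketched roughly as follows. The `Rule` trait, the `Plan` and `Expr` enums, and the rule name are all hypothetical stand-ins, not DataFusion's actual `AnalyzerRule` interface; the point is only that a plan-level rewrite applies the same way regardless of whether the plan was built from SQL or from the DataFrame API.

```rust
// Stand-in plan and expression types; hypothetical, not DataFusion's API.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Wildcard,
    Literal(u8),
    Count(Vec<Expr>),
}

#[derive(Debug, Clone, PartialEq)]
enum Plan {
    Scan(String),
    Aggregate { input: Box<Plan>, aggr_expr: Vec<Expr> },
}

/// Hypothetical analyzer-rule interface: a named plan-to-plan rewrite.
trait Rule {
    fn name(&self) -> &str;
    fn analyze(&self, plan: Plan) -> Plan;
}

struct CountWildcardRule;

impl Rule for CountWildcardRule {
    fn name(&self) -> &str {
        "count_wildcard_rule"
    }

    fn analyze(&self, plan: Plan) -> Plan {
        match plan {
            // Recurse into the input, then rewrite this node's aggregate expressions.
            Plan::Aggregate { input, aggr_expr } => Plan::Aggregate {
                input: Box::new(self.analyze(*input)),
                aggr_expr: aggr_expr.into_iter().map(rewrite_count).collect(),
            },
            leaf => leaf,
        }
    }
}

/// Replace `COUNT(*)` with `COUNT(1)`; anything else passes through unchanged.
fn rewrite_count(expr: Expr) -> Expr {
    match expr {
        Expr::Count(args) if matches!(args.as_slice(), [Expr::Wildcard]) => {
            Expr::Count(vec![Expr::Literal(1)])
        }
        other => other,
    }
}
```

In this sketch the rule is idempotent: a plan that already contains `COUNT(1)` passes through unchanged, which is the property a plan-equality assertion would rely on.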
        .await?;

    //make sure sql plan same with df plan
    assert_eq!(
👍
Which issue does this PR close?
Closes #5473