Commit

update expected plans for Spark 4.0
andygrove committed Jan 19, 2025
1 parent 9d84407 commit 9ddf582
Showing 2 changed files with 104 additions and 104 deletions.
@@ -1,6 +1,6 @@
== Physical Plan ==
* CometColumnarToRow (4)
+- CometProject (3)
* Project (4)
+- * CometColumnarToRow (3)
+- CometFilter (2)
+- CometScan parquet spark_catalog.default.reason (1)

@@ -16,16 +16,16 @@ ReadSchema: struct<r_reason_sk:int>
Input [1]: [r_reason_sk#1]
Condition : (isnotnull(r_reason_sk#1) AND (r_reason_sk#1 = 1))

(3) CometProject
(3) CometColumnarToRow [codegen id : 1]
Input [1]: [r_reason_sk#1]
Arguments: [bucket1#2, bucket2#3, bucket3#4, bucket4#5, bucket5#6], [CASE WHEN (Subquery scalar-subquery#7, [id=#8].count(1) > 62316685) THEN ReusedSubquery Subquery scalar-subquery#7, [id=#8].avg(ss_ext_discount_amt) ELSE ReusedSubquery Subquery scalar-subquery#7, [id=#8].avg(ss_net_paid) END AS bucket1#2, CASE WHEN (Subquery scalar-subquery#9, [id=#10].count(1) > 19045798) THEN ReusedSubquery Subquery scalar-subquery#9, [id=#10].avg(ss_ext_discount_amt) ELSE ReusedSubquery Subquery scalar-subquery#9, [id=#10].avg(ss_net_paid) END AS bucket2#3, CASE WHEN (Subquery scalar-subquery#11, [id=#12].count(1) > 365541424) THEN ReusedSubquery Subquery scalar-subquery#11, [id=#12].avg(ss_ext_discount_amt) ELSE ReusedSubquery Subquery scalar-subquery#11, [id=#12].avg(ss_net_paid) END AS bucket3#4, CASE WHEN (Subquery scalar-subquery#13, [id=#14].count(1) > 216357808) THEN ReusedSubquery Subquery scalar-subquery#13, [id=#14].avg(ss_ext_discount_amt) ELSE ReusedSubquery Subquery scalar-subquery#13, [id=#14].avg(ss_net_paid) END AS bucket4#5, CASE WHEN (Subquery scalar-subquery#15, [id=#16].count(1) > 184483884) THEN ReusedSubquery Subquery scalar-subquery#15, [id=#16].avg(ss_ext_discount_amt) ELSE ReusedSubquery Subquery scalar-subquery#15, [id=#16].avg(ss_net_paid) END AS bucket5#6]

(4) CometColumnarToRow [codegen id : 1]
Input [5]: [bucket1#2, bucket2#3, bucket3#4, bucket4#5, bucket5#6]
(4) Project [codegen id : 1]
Output [5]: [CASE WHEN (Subquery scalar-subquery#2, [id=#3].count(1) > 62316685) THEN ReusedSubquery Subquery scalar-subquery#2, [id=#3].avg(ss_ext_discount_amt) ELSE ReusedSubquery Subquery scalar-subquery#2, [id=#3].avg(ss_net_paid) END AS bucket1#4, CASE WHEN (Subquery scalar-subquery#5, [id=#6].count(1) > 19045798) THEN ReusedSubquery Subquery scalar-subquery#5, [id=#6].avg(ss_ext_discount_amt) ELSE ReusedSubquery Subquery scalar-subquery#5, [id=#6].avg(ss_net_paid) END AS bucket2#7, CASE WHEN (Subquery scalar-subquery#8, [id=#9].count(1) > 365541424) THEN ReusedSubquery Subquery scalar-subquery#8, [id=#9].avg(ss_ext_discount_amt) ELSE ReusedSubquery Subquery scalar-subquery#8, [id=#9].avg(ss_net_paid) END AS bucket3#10, CASE WHEN (Subquery scalar-subquery#11, [id=#12].count(1) > 216357808) THEN ReusedSubquery Subquery scalar-subquery#11, [id=#12].avg(ss_ext_discount_amt) ELSE ReusedSubquery Subquery scalar-subquery#11, [id=#12].avg(ss_net_paid) END AS bucket4#13, CASE WHEN (Subquery scalar-subquery#14, [id=#15].count(1) > 184483884) THEN ReusedSubquery Subquery scalar-subquery#14, [id=#15].avg(ss_ext_discount_amt) ELSE ReusedSubquery Subquery scalar-subquery#14, [id=#15].avg(ss_net_paid) END AS bucket5#16]
Input [1]: [r_reason_sk#1]

===== Subqueries =====

Subquery:1 Hosting operator id = 3 Hosting Expression = Subquery scalar-subquery#7, [id=#8]
Subquery:1 Hosting operator id = 4 Hosting Expression = Subquery scalar-subquery#2, [id=#3]
* Project (13)
+- * HashAggregate (12)
+- * CometColumnarToRow (11)
@@ -80,11 +80,11 @@ Results [3]: [count(1)#31 AS count(1)#34, cast((avg(UnscaledValue(ss_ext_discoun
Output [1]: [named_struct(count(1), count(1)#34, avg(ss_ext_discount_amt), avg(ss_ext_discount_amt)#35, avg(ss_net_paid), avg(ss_net_paid)#36) AS mergedValue#37]
Input [3]: [count(1)#34, avg(ss_ext_discount_amt)#35, avg(ss_net_paid)#36]

Subquery:2 Hosting operator id = 3 Hosting Expression = ReusedSubquery Subquery scalar-subquery#7, [id=#8]
Subquery:2 Hosting operator id = 4 Hosting Expression = ReusedSubquery Subquery scalar-subquery#2, [id=#3]

Subquery:3 Hosting operator id = 3 Hosting Expression = ReusedSubquery Subquery scalar-subquery#7, [id=#8]
Subquery:3 Hosting operator id = 4 Hosting Expression = ReusedSubquery Subquery scalar-subquery#2, [id=#3]

Subquery:4 Hosting operator id = 3 Hosting Expression = Subquery scalar-subquery#9, [id=#10]
Subquery:4 Hosting operator id = 4 Hosting Expression = Subquery scalar-subquery#5, [id=#6]
* Project (22)
+- * HashAggregate (21)
+- * CometColumnarToRow (20)
@@ -139,11 +139,11 @@ Results [3]: [count(1)#52 AS count(1)#55, cast((avg(UnscaledValue(ss_ext_discoun
Output [1]: [named_struct(count(1), count(1)#55, avg(ss_ext_discount_amt), avg(ss_ext_discount_amt)#56, avg(ss_net_paid), avg(ss_net_paid)#57) AS mergedValue#58]
Input [3]: [count(1)#55, avg(ss_ext_discount_amt)#56, avg(ss_net_paid)#57]

Subquery:5 Hosting operator id = 3 Hosting Expression = ReusedSubquery Subquery scalar-subquery#9, [id=#10]
Subquery:5 Hosting operator id = 4 Hosting Expression = ReusedSubquery Subquery scalar-subquery#5, [id=#6]

Subquery:6 Hosting operator id = 3 Hosting Expression = ReusedSubquery Subquery scalar-subquery#9, [id=#10]
Subquery:6 Hosting operator id = 4 Hosting Expression = ReusedSubquery Subquery scalar-subquery#5, [id=#6]

Subquery:7 Hosting operator id = 3 Hosting Expression = Subquery scalar-subquery#11, [id=#12]
Subquery:7 Hosting operator id = 4 Hosting Expression = Subquery scalar-subquery#8, [id=#9]
* Project (31)
+- * HashAggregate (30)
+- * CometColumnarToRow (29)
@@ -198,11 +198,11 @@ Results [3]: [count(1)#73 AS count(1)#76, cast((avg(UnscaledValue(ss_ext_discoun
Output [1]: [named_struct(count(1), count(1)#76, avg(ss_ext_discount_amt), avg(ss_ext_discount_amt)#77, avg(ss_net_paid), avg(ss_net_paid)#78) AS mergedValue#79]
Input [3]: [count(1)#76, avg(ss_ext_discount_amt)#77, avg(ss_net_paid)#78]

Subquery:8 Hosting operator id = 3 Hosting Expression = ReusedSubquery Subquery scalar-subquery#11, [id=#12]
Subquery:8 Hosting operator id = 4 Hosting Expression = ReusedSubquery Subquery scalar-subquery#8, [id=#9]

Subquery:9 Hosting operator id = 3 Hosting Expression = ReusedSubquery Subquery scalar-subquery#11, [id=#12]
Subquery:9 Hosting operator id = 4 Hosting Expression = ReusedSubquery Subquery scalar-subquery#8, [id=#9]

Subquery:10 Hosting operator id = 3 Hosting Expression = Subquery scalar-subquery#13, [id=#14]
Subquery:10 Hosting operator id = 4 Hosting Expression = Subquery scalar-subquery#11, [id=#12]
* Project (40)
+- * HashAggregate (39)
+- * CometColumnarToRow (38)
@@ -257,11 +257,11 @@ Results [3]: [count(1)#94 AS count(1)#97, cast((avg(UnscaledValue(ss_ext_discoun
Output [1]: [named_struct(count(1), count(1)#97, avg(ss_ext_discount_amt), avg(ss_ext_discount_amt)#98, avg(ss_net_paid), avg(ss_net_paid)#99) AS mergedValue#100]
Input [3]: [count(1)#97, avg(ss_ext_discount_amt)#98, avg(ss_net_paid)#99]

Subquery:11 Hosting operator id = 3 Hosting Expression = ReusedSubquery Subquery scalar-subquery#13, [id=#14]
Subquery:11 Hosting operator id = 4 Hosting Expression = ReusedSubquery Subquery scalar-subquery#11, [id=#12]

Subquery:12 Hosting operator id = 3 Hosting Expression = ReusedSubquery Subquery scalar-subquery#13, [id=#14]
Subquery:12 Hosting operator id = 4 Hosting Expression = ReusedSubquery Subquery scalar-subquery#11, [id=#12]

Subquery:13 Hosting operator id = 3 Hosting Expression = Subquery scalar-subquery#15, [id=#16]
Subquery:13 Hosting operator id = 4 Hosting Expression = Subquery scalar-subquery#14, [id=#15]
* Project (49)
+- * HashAggregate (48)
+- * CometColumnarToRow (47)
@@ -316,8 +316,8 @@ Results [3]: [count(1)#115 AS count(1)#118, cast((avg(UnscaledValue(ss_ext_disco
Output [1]: [named_struct(count(1), count(1)#118, avg(ss_ext_discount_amt), avg(ss_ext_discount_amt)#119, avg(ss_net_paid), avg(ss_net_paid)#120) AS mergedValue#121]
Input [3]: [count(1)#118, avg(ss_ext_discount_amt)#119, avg(ss_net_paid)#120]

Subquery:14 Hosting operator id = 3 Hosting Expression = ReusedSubquery Subquery scalar-subquery#15, [id=#16]
Subquery:14 Hosting operator id = 4 Hosting Expression = ReusedSubquery Subquery scalar-subquery#14, [id=#15]

Subquery:15 Hosting operator id = 3 Hosting Expression = ReusedSubquery Subquery scalar-subquery#15, [id=#16]
Subquery:15 Hosting operator id = 4 Hosting Expression = ReusedSubquery Subquery scalar-subquery#14, [id=#15]


@@ -1,86 +1,86 @@
WholeStageCodegen (1)
CometColumnarToRow
InputAdapter
CometProject [bucket1,bucket2,bucket3,bucket4,bucket5]
Subquery #1
WholeStageCodegen (2)
Project [count(1),avg(ss_ext_discount_amt),avg(ss_net_paid)]
HashAggregate [count,sum,count,sum,count] [count(1),avg(UnscaledValue(ss_ext_discount_amt)),avg(UnscaledValue(ss_net_paid)),count(1),avg(ss_ext_discount_amt),avg(ss_net_paid),count,sum,count,sum,count]
CometColumnarToRow
InputAdapter
CometColumnarExchange #1
WholeStageCodegen (1)
HashAggregate [ss_ext_discount_amt,ss_net_paid] [count,sum,count,sum,count,count,sum,count,sum,count]
CometColumnarToRow
InputAdapter
CometProject [ss_ext_discount_amt,ss_net_paid]
CometFilter [ss_quantity,ss_ext_discount_amt,ss_net_paid,ss_sold_date_sk]
CometScan parquet spark_catalog.default.store_sales [ss_quantity,ss_ext_discount_amt,ss_net_paid,ss_sold_date_sk]
ReusedSubquery [mergedValue] #1
ReusedSubquery [mergedValue] #1
Subquery #2
WholeStageCodegen (2)
Project [count(1),avg(ss_ext_discount_amt),avg(ss_net_paid)]
HashAggregate [count,sum,count,sum,count] [count(1),avg(UnscaledValue(ss_ext_discount_amt)),avg(UnscaledValue(ss_net_paid)),count(1),avg(ss_ext_discount_amt),avg(ss_net_paid),count,sum,count,sum,count]
CometColumnarToRow
InputAdapter
CometColumnarExchange #2
WholeStageCodegen (1)
HashAggregate [ss_ext_discount_amt,ss_net_paid] [count,sum,count,sum,count,count,sum,count,sum,count]
CometColumnarToRow
InputAdapter
CometProject [ss_ext_discount_amt,ss_net_paid]
CometFilter [ss_quantity,ss_ext_discount_amt,ss_net_paid,ss_sold_date_sk]
CometScan parquet spark_catalog.default.store_sales [ss_quantity,ss_ext_discount_amt,ss_net_paid,ss_sold_date_sk]
ReusedSubquery [mergedValue] #2
ReusedSubquery [mergedValue] #2
Subquery #3
WholeStageCodegen (2)
Project [count(1),avg(ss_ext_discount_amt),avg(ss_net_paid)]
HashAggregate [count,sum,count,sum,count] [count(1),avg(UnscaledValue(ss_ext_discount_amt)),avg(UnscaledValue(ss_net_paid)),count(1),avg(ss_ext_discount_amt),avg(ss_net_paid),count,sum,count,sum,count]
CometColumnarToRow
InputAdapter
CometColumnarExchange #3
WholeStageCodegen (1)
HashAggregate [ss_ext_discount_amt,ss_net_paid] [count,sum,count,sum,count,count,sum,count,sum,count]
CometColumnarToRow
InputAdapter
CometProject [ss_ext_discount_amt,ss_net_paid]
CometFilter [ss_quantity,ss_ext_discount_amt,ss_net_paid,ss_sold_date_sk]
CometScan parquet spark_catalog.default.store_sales [ss_quantity,ss_ext_discount_amt,ss_net_paid,ss_sold_date_sk]
ReusedSubquery [mergedValue] #3
ReusedSubquery [mergedValue] #3
Subquery #4
WholeStageCodegen (2)
Project [count(1),avg(ss_ext_discount_amt),avg(ss_net_paid)]
HashAggregate [count,sum,count,sum,count] [count(1),avg(UnscaledValue(ss_ext_discount_amt)),avg(UnscaledValue(ss_net_paid)),count(1),avg(ss_ext_discount_amt),avg(ss_net_paid),count,sum,count,sum,count]
CometColumnarToRow
InputAdapter
CometColumnarExchange #4
WholeStageCodegen (1)
HashAggregate [ss_ext_discount_amt,ss_net_paid] [count,sum,count,sum,count,count,sum,count,sum,count]
CometColumnarToRow
InputAdapter
CometProject [ss_ext_discount_amt,ss_net_paid]
CometFilter [ss_quantity,ss_ext_discount_amt,ss_net_paid,ss_sold_date_sk]
CometScan parquet spark_catalog.default.store_sales [ss_quantity,ss_ext_discount_amt,ss_net_paid,ss_sold_date_sk]
ReusedSubquery [mergedValue] #4
ReusedSubquery [mergedValue] #4
Subquery #5
WholeStageCodegen (2)
Project [count(1),avg(ss_ext_discount_amt),avg(ss_net_paid)]
HashAggregate [count,sum,count,sum,count] [count(1),avg(UnscaledValue(ss_ext_discount_amt)),avg(UnscaledValue(ss_net_paid)),count(1),avg(ss_ext_discount_amt),avg(ss_net_paid),count,sum,count,sum,count]
CometColumnarToRow
InputAdapter
CometColumnarExchange #5
WholeStageCodegen (1)
HashAggregate [ss_ext_discount_amt,ss_net_paid] [count,sum,count,sum,count,count,sum,count,sum,count]
CometColumnarToRow
InputAdapter
CometProject [ss_ext_discount_amt,ss_net_paid]
CometFilter [ss_quantity,ss_ext_discount_amt,ss_net_paid,ss_sold_date_sk]
CometScan parquet spark_catalog.default.store_sales [ss_quantity,ss_ext_discount_amt,ss_net_paid,ss_sold_date_sk]
ReusedSubquery [mergedValue] #5
ReusedSubquery [mergedValue] #5
Project
Subquery #1
WholeStageCodegen (2)
Project [count(1),avg(ss_ext_discount_amt),avg(ss_net_paid)]
HashAggregate [count,sum,count,sum,count] [count(1),avg(UnscaledValue(ss_ext_discount_amt)),avg(UnscaledValue(ss_net_paid)),count(1),avg(ss_ext_discount_amt),avg(ss_net_paid),count,sum,count,sum,count]
CometColumnarToRow
InputAdapter
CometColumnarExchange #1
WholeStageCodegen (1)
HashAggregate [ss_ext_discount_amt,ss_net_paid] [count,sum,count,sum,count,count,sum,count,sum,count]
CometColumnarToRow
InputAdapter
CometProject [ss_ext_discount_amt,ss_net_paid]
CometFilter [ss_quantity,ss_ext_discount_amt,ss_net_paid,ss_sold_date_sk]
CometScan parquet spark_catalog.default.store_sales [ss_quantity,ss_ext_discount_amt,ss_net_paid,ss_sold_date_sk]
ReusedSubquery [mergedValue] #1
ReusedSubquery [mergedValue] #1
Subquery #2
WholeStageCodegen (2)
Project [count(1),avg(ss_ext_discount_amt),avg(ss_net_paid)]
HashAggregate [count,sum,count,sum,count] [count(1),avg(UnscaledValue(ss_ext_discount_amt)),avg(UnscaledValue(ss_net_paid)),count(1),avg(ss_ext_discount_amt),avg(ss_net_paid),count,sum,count,sum,count]
CometColumnarToRow
InputAdapter
CometColumnarExchange #2
WholeStageCodegen (1)
HashAggregate [ss_ext_discount_amt,ss_net_paid] [count,sum,count,sum,count,count,sum,count,sum,count]
CometColumnarToRow
InputAdapter
CometProject [ss_ext_discount_amt,ss_net_paid]
CometFilter [ss_quantity,ss_ext_discount_amt,ss_net_paid,ss_sold_date_sk]
CometScan parquet spark_catalog.default.store_sales [ss_quantity,ss_ext_discount_amt,ss_net_paid,ss_sold_date_sk]
ReusedSubquery [mergedValue] #2
ReusedSubquery [mergedValue] #2
Subquery #3
WholeStageCodegen (2)
Project [count(1),avg(ss_ext_discount_amt),avg(ss_net_paid)]
HashAggregate [count,sum,count,sum,count] [count(1),avg(UnscaledValue(ss_ext_discount_amt)),avg(UnscaledValue(ss_net_paid)),count(1),avg(ss_ext_discount_amt),avg(ss_net_paid),count,sum,count,sum,count]
CometColumnarToRow
InputAdapter
CometColumnarExchange #3
WholeStageCodegen (1)
HashAggregate [ss_ext_discount_amt,ss_net_paid] [count,sum,count,sum,count,count,sum,count,sum,count]
CometColumnarToRow
InputAdapter
CometProject [ss_ext_discount_amt,ss_net_paid]
CometFilter [ss_quantity,ss_ext_discount_amt,ss_net_paid,ss_sold_date_sk]
CometScan parquet spark_catalog.default.store_sales [ss_quantity,ss_ext_discount_amt,ss_net_paid,ss_sold_date_sk]
ReusedSubquery [mergedValue] #3
ReusedSubquery [mergedValue] #3
Subquery #4
WholeStageCodegen (2)
Project [count(1),avg(ss_ext_discount_amt),avg(ss_net_paid)]
HashAggregate [count,sum,count,sum,count] [count(1),avg(UnscaledValue(ss_ext_discount_amt)),avg(UnscaledValue(ss_net_paid)),count(1),avg(ss_ext_discount_amt),avg(ss_net_paid),count,sum,count,sum,count]
CometColumnarToRow
InputAdapter
CometColumnarExchange #4
WholeStageCodegen (1)
HashAggregate [ss_ext_discount_amt,ss_net_paid] [count,sum,count,sum,count,count,sum,count,sum,count]
CometColumnarToRow
InputAdapter
CometProject [ss_ext_discount_amt,ss_net_paid]
CometFilter [ss_quantity,ss_ext_discount_amt,ss_net_paid,ss_sold_date_sk]
CometScan parquet spark_catalog.default.store_sales [ss_quantity,ss_ext_discount_amt,ss_net_paid,ss_sold_date_sk]
ReusedSubquery [mergedValue] #4
ReusedSubquery [mergedValue] #4
Subquery #5
WholeStageCodegen (2)
Project [count(1),avg(ss_ext_discount_amt),avg(ss_net_paid)]
HashAggregate [count,sum,count,sum,count] [count(1),avg(UnscaledValue(ss_ext_discount_amt)),avg(UnscaledValue(ss_net_paid)),count(1),avg(ss_ext_discount_amt),avg(ss_net_paid),count,sum,count,sum,count]
CometColumnarToRow
InputAdapter
CometColumnarExchange #5
WholeStageCodegen (1)
HashAggregate [ss_ext_discount_amt,ss_net_paid] [count,sum,count,sum,count,count,sum,count,sum,count]
CometColumnarToRow
InputAdapter
CometProject [ss_ext_discount_amt,ss_net_paid]
CometFilter [ss_quantity,ss_ext_discount_amt,ss_net_paid,ss_sold_date_sk]
CometScan parquet spark_catalog.default.store_sales [ss_quantity,ss_ext_discount_amt,ss_net_paid,ss_sold_date_sk]
ReusedSubquery [mergedValue] #5
ReusedSubquery [mergedValue] #5
CometColumnarToRow
InputAdapter
CometFilter [r_reason_sk]
CometScan parquet spark_catalog.default.reason [r_reason_sk]
