Releases: apache/datafusion-comet
0.5.0
DataFusion Comet 0.5.0 Changelog
This release consists of 69 commits from 15 contributors. See credits at the end of this changelog for more information.
Fixed bugs:
- fix: Unsigned type related bugs #1095 (kazuyukitanimura)
- fix: Use RDD partition index #1112 (viirya)
- fix: Various metrics bug fixes and improvements #1111 (andygrove)
- fix: Don't create CometScanExec for subclasses of ParquetFileFormat #1129 (Kimahriman)
- fix: Fix metrics regressions #1132 (andygrove)
- fix: Enable scenarios accidentally commented out in CometExecBenchmark #1151 (mbutrovich)
- fix: Spark 4.0-preview1 SPARK-47120 #1156 (kazuyukitanimura)
- fix: Document enabling comet explain plan usage in Spark (4.0) #1176 (parthchandra)
- fix: stddev_pop should not directly return 0.0 when count is 1.0 #1184 (viirya)
- fix: fix missing explanation for then branch in case when #1200 (rluvaton)
- fix: Fall back to Spark for unsupported partition or sort expressions in window aggregates #1253 (andygrove)
- fix: Fall back to Spark for distinct aggregates #1262 (andygrove)
- fix: disable initCap by default #1276 (kazuyukitanimura)
Performance related:
- perf: Stop passing Java config map into native createPlan #1101 (andygrove)
- feat: Make native shuffle compression configurable and respect
spark.shuffle.compress
#1185 (andygrove) - perf: Improve query planning to more reliably fall back to columnar shuffle when native shuffle is not supported #1209 (andygrove)
- feat: Move shuffle block decompression and decoding to native code and add LZ4 & Snappy support #1192 (andygrove)
- feat: Implement custom RecordBatch serde for shuffle for improved performance #1190 (andygrove)
Implemented enhancements:
- feat: support array_insert #1073 (SemyonSinchenko)
- feat: enable decimal to decimal cast of different precision and scale #1086 (himadripal)
- feat: Improve ScanExec native metrics #1133 (andygrove)
- feat: Add Spark-compatible implementation of SchemaAdapterFactory #1169 (andygrove)
- feat: Improve shuffle metrics (second attempt) #1175 (andygrove)
- feat: Add a
spark.comet.exec.memoryPool
configuration for experimenting with various datafusion memory pool setups. #1021 (Kontinuation) - feat: Reenable tests for filtered SMJ anti join #1211 (comphead)
- feat: add support for array_remove expression #1179 (jatin510)
Documentation updates:
- docs: Update documentation for 0.4.0 release #1096 (andygrove)
- docs: Fix readme typo FGPA -> FPGA #1117 (gstvg)
- docs: Add more technical detail and new diagram to Comet plugin overview #1119 (andygrove)
- docs: Add some documentation explaining how shuffle works #1148 (andygrove)
- docs: Update TPC-H benchmark results #1257 (andygrove)
Other:
- chore: Add changelog for 0.4.0 #1089 (andygrove)
- chore: Prepare for 0.5.0 development #1090 (andygrove)
- build: Skip installation of spark-integration and fuzz testing modules #1091 (parthchandra)
- minor: Add hint for finding the GPG key to use when publishing to maven #1093 (andygrove)
- chore: Include first ScanExec batch in metrics #1105 (andygrove)
- chore: Improve CometScan metrics #1100 (andygrove)
- chore: Add custom metric for native shuffle fetching batches from JVM #1108 (andygrove)
- chore: Remove unused StringView struct #1143 (andygrove)
- test: enable more Spark 4.0 tests #1145 (kazuyukitanimura)
- chore: Refactor cast to use SparkCastOptions param #1146 (andygrove)
- chore: Move more expressions from core crate to spark-expr crate #1152 (andygrove)
- chore: Remove dead code #1155 (andygrove)
- chore: Move string kernels and expressions to spark-expr crate #1164 (andygrove)
- chore: Move remaining expressions to spark-expr crate + some minor refactoring #1165 (andygrove)
- chore: Add ignored tests for reading complex types from Parquet #1167 (andygrove)
- test: enabling Spark tests with offHeap requirement #1177 (kazuyukitanimura)
- minor: move shuffle classes from common to spark #1193 (andygrove)
- minor: refactor to move decodeBatches to broadcast exchange code as private function #1195 (andygrove)
- minor: refactor prepare_output so that it does not require an ExecutionContext #1194 (andygrove)
- minor: remove unused source files #1202 (andygrove)
- chore: Upgrade to DataFusion 44.0.0-rc2 #1154 (andygrove)
- chore: Add safety check to CometBuffer #1050 (viirya)
- chore: Remove unreachable code #1213 (andygrove)
- test: Enable Comet by default except some tests in SparkSessionExtensionSuite #1201 (kazuyukitanimura)
- chore: extract
struct
expressions to folders based on spark grouping #1216 (rluvaton) - chore: extract static invoke expressions to folders based on spark grouping #1217 (rluvaton)
- chore: Follow-on PR to fully enable onheap memory usage #1210 (andygrove)
- chore: extract agg_funcs expressions to folders based on spark grouping #1224 (rluvaton)
- chore: extract datetime_funcs expressions to folders based on spark grouping #1222 (rluvaton)
- chore: Upgrade to DataFusion 44.0.0 from 44.0.0 RC2 #1232 (rluvaton)
- chore: extract strings file to
strings_func
like in spark grouping #1215 (rluvaton) - chore: extract predicate_functions expressions to folders based on spark grouping #1218 (rluvaton)
- build(deps): bump protobuf version to 3.21.12 #1234 (wForget)
- chore: extract json_funcs expressions to folders based on spark grouping #1220 (rluvaton)
- test: Enable shuffle by default in Spark tests #1240 (kazuyukitanimura)
- chore: extract hash_funcs expressions to folders based on spark grouping #1221 (rluvaton)
- build: Fix test failure caused by merging conflicting PRs #1259 (andygrove)
Credits
Thank you to everyone who contributed to this release. Here is a breakdown of commits (PRs merged) per contributor.
37 Andy Grove
10 Raz Luvaton
7 KAZUYUKI TANIMURA
3 Liang-Chi Hsieh
2 Parth Chandra
1 Adam Binford
1 Dharan Aditya
1 Himadri Pal
1 Jagdish Parihar
1 Kristin Cowalcijk
1 Matt Butrovich
1 Oleks V
1 Sem
1 Zhen Wang
1 gstvg
Thank you also to everyone who contributed ...
0.4.0
DataFusion Comet 0.4.0 Changelog
This release consists of 51 commits from 10 contributors. See credits at the end of this changelog for more information.
Fixed bugs:
- fix: Use the number of rows from underlying arrays instead of logical row count from RecordBatch #972 (viirya)
- fix: The spilled_bytes metric of CometSortExec should be size instead of time #984 (Kontinuation)
- fix: Properly handle Java exceptions without error messages; fix loading of comet native library from java.library.path #982 (Kontinuation)
- fix: Fallback to Spark if scan has meta columns #997 (viirya)
- fix: Fallback to Spark if named_struct contains duplicate field names #1016 (viirya)
- fix: Make comet-git-info.properties optional #1027 (andygrove)
- fix: TopK operator should return correct results on dictionary column with nulls #1033 (viirya)
- fix: need default value for getSizeAsMb(EXECUTOR_MEMORY.key) #1046 (neyama)
Performance related:
- perf: Remove one redundant CopyExec for SMJ #962 (andygrove)
- perf: Add experimental feature to replace SortMergeJoin with ShuffledHashJoin #1007 (andygrove)
- perf: Cache jstrings during metrics collection #1029 (mbutrovich)
Implemented enhancements:
- feat: Support
GetArrayStructFields
expression #993 (Kimahriman) - feat: Implement bloom_filter_agg #987 (mbutrovich)
- feat: Support more types with BloomFilterAgg #1039 (mbutrovich)
- feat: Implement CAST from struct to string #1066 (andygrove)
- feat: Use official DataFusion 43 release #1070 (andygrove)
- feat: Implement CAST between struct types #1074 (andygrove)
- feat: support array_append #1072 (NoeB)
- feat: Require offHeap memory to be enabled (always use unified memory) #1062 (andygrove)
Documentation updates:
- doc: add documentation interlinks #975 (comphead)
- docs: Add IntelliJ documentation for generated source code #985 (mbutrovich)
- docs: Update tuning guide #995 (andygrove)
- docs: Various documentation improvements #1005 (andygrove)
- docs: clarify that Maven central only has jars for Linux #1009 (andygrove)
- doc: fix K8s links and doc #1058 (comphead)
- docs: Update benchmarking.md #1085 (rluvaton-flarion)
Other:
- chore: Generate changelog for 0.3.0 release #964 (andygrove)
- chore: fix publish-to-maven script #966 (andygrove)
- chore: Update benchmarks results based on 0.3.0-rc1 #969 (andygrove)
- chore: update rem expression guide #976 (kazuyukitanimura)
- chore: Enable additional CreateArray tests #928 (Kimahriman)
- chore: fix compatibility guide #978 (kazuyukitanimura)
- chore: Update for 0.3.0 release, prepare for 0.4.0 development #970 (andygrove)
- chore: Don't transform the HashAggregate to CometHashAggregate if Comet shuffle is disabled #991 (viirya)
- chore: Make parquet reader options Comet options instead of Hadoop options #968 (parthchandra)
- chore: remove legacy comet-spark-shell #1013 (andygrove)
- chore: Reserve memory for native shuffle writer per partition #988 (viirya)
- chore: Bump arrow-rs to 53.1.0 and datafusion #1001 (kazuyukitanimura)
- chore: Revert "chore: Reserve memory for native shuffle writer per partition (#988)" #1020 (viirya)
- minor: Remove hard-coded version number from Dockerfile #1025 (andygrove)
- chore: Reserve memory for native shuffle writer per partition #1022 (viirya)
- chore: Improve error handling when native lib fails to load #1000 (andygrove)
- chore: Use twox-hash 2.0 xxhash64 oneshot api instead of custom implementation #1041 (NoeB)
- chore: Refactor Arrow Array and Schema allocation in ColumnReader and MetadataColumnReader #1047 (viirya)
- minor: Refactor binary expr serde to reduce code duplication #1053 (andygrove)
- chore: Upgrade to DataFusion 43.0.0-rc1 #1057 (andygrove)
- chore: Refactor UnaryExpr and MathExpr in protobuf #1056 (andygrove)
- minor: use defaults instead of hard-coding values #1060 (andygrove)
- minor: refactor UnaryExpr handling to make code more concise #1065 (andygrove)
- chore: Refactor binary and math expression serde code #1069 (andygrove)
- chore: Simplify CometShuffleMemoryAllocator to use Spark unified memory allocator #1063 (viirya)
- test: Restore one test in CometExecSuite by adding COMET_SHUFFLE_MODE config #1087 (viirya)
Credits
Thank you to everyone who contributed to this release. Here is a breakdown of commits (PRs merged) per contributor.
19 Andy Grove
13 Matt Butrovich
8 Liang-Chi Hsieh
3 KAZUYUKI TANIMURA
2 Adam Binford
2 Kristin Cowalcijk
1 NoeB
1 Oleks V
1 Parth Chandra
1 neyama
Thank you also to everyone who contributed in other ways such as filing issues, reviewing PRs, and providing feedback on this release.
0.3.0
DataFusion Comet 0.3.0 Changelog
This release consists of 57 commits from 12 contributors. See credits at the end of this changelog for more information.
Fixed bugs:
- fix: Support type coercion for ScalarUDFs #865 (Kimahriman)
- fix: CometTakeOrderedAndProjectExec native scan node should use child operator's output #896 (viirya)
- fix: Fix various memory leaks problems #890 (Kontinuation)
- fix: Add output to Comet operators equal and hashCode #902 (viirya)
- fix: Fallback to Spark when cannot resolve AttributeReference #926 (viirya)
- fix: Fix memory bloat caused by holding too many unclosed
ArrowReaderIterator
s #929 (Kontinuation) - fix: Normalize NaN and zeros for floating number comparison #953 (viirya)
- fix: window function range offset should be long instead of int #733 (huaxingao)
- fix: CometScanExec on Spark 3.5.2 #915 (Kimahriman)
- fix: div and rem by negative zero #960 (kazuyukitanimura)
Performance related:
- perf: Optimize CometSparkToColumnar for columnar input #892 (mbutrovich)
- perf: Fall back to Spark if query uses DPP with v1 data sources #897 (andygrove)
- perf: Report accurate total time for scans #916 (andygrove)
- perf: Add metric for time spent casting in native scan #919 (andygrove)
- perf: Add criterion benchmark for aggregate expressions #948 (andygrove)
- perf: Add metric for time spent in CometSparkToColumnarExec #931 (mbutrovich)
- perf: Optimize decimal precision check in decimal aggregates (sum and avg) #952 (andygrove)
Implemented enhancements:
- feat: Add config option to enable converting CSV to columnar #871 (andygrove)
- feat: Implement basic version of string to float/double/decimal #870 (andygrove)
- feat: Implement to_json for subset of types #805 (andygrove)
- feat: Add ShuffleQueryStageExec to direct child node for CometBroadcastExchangeExec #880 (viirya)
- feat: Support sort merge join with a join condition #553 (viirya)
- feat: Array element extraction #899 (Kimahriman)
- feat: date_add and date_sub functions #910 (mbutrovich)
- feat: implement scripts for binary release build #932 (parthchandra)
- feat: Publish artifacts to maven #946 (parthchandra)
Documentation updates:
- doc: Documenting Helm chart for Comet Kube execution #874 (comphead)
- doc: Update native code path in development #921 (viirya)
- docs: Add more detailed architecture documentation #922 (andygrove)
Other:
- chore: Update installation.md #869 (haoxins)
- chore: Update versions to 0.3.0 / 0.3.0-SNAPSHOT #868 (andygrove)
- chore: Add documentation on running benchmarks with Microk8s #848 (andygrove)
- chore: Improve CometExchange metrics #873 (viirya)
- chore: Add spilling metrics of SortMergeJoin #878 (viirya)
- chore: change shuffle mode default from jvm to auto #877 (andygrove)
- chore: Enable shuffle by default #881 (andygrove)
- chore: print Comet native version to logs after Comet is initialized #900 (SemyonSinchenko)
- chore: Revise batch pull approach to more follow C Data interface semantics #893 (viirya)
- chore: Close dictionary provider when iterator is closed #904 (andygrove)
- chore: Remove unused function #906 (viirya)
- chore: Upgrade to latest DataFusion revision #909 (andygrove)
- build: fix build #917 (andygrove)
- chore: Revise array import to more follow C Data Interface semantics #905 (viirya)
- chore: Address reviews #920 (viirya)
- chore: Enable Comet shuffle for Spark core-1 test #924 (viirya)
- build: Add maven-compiler-plugin for java cross-build #911 (viirya)
- build: Disable upload-test-reports for macos-13 runner #933 (viirya)
- minor: cast timestamp test #468 #923 (himadripal)
- build: Set Java version arg for scala-maven-plugin #934 (viirya)
- chore: Remove redundant RowToColumnar from CometShuffleExchangeExec for columnar shuffle #944 (viirya)
- minor: rename CometMetricNode
add
toset
and update documentation #940 (andygrove) - chore: Add config for enabling SMJ with join condition #937 (andygrove)
- chore: Change maven group ID to
org.apache.datafusion
#941 (andygrove) - chore: Upgrade to DataFusion 42.0.0 #945 (andygrove)
- build: Fix regression in jar packaging #950 (andygrove)
- chore: Show reason for falling back to Spark when SMJ with join condition is not enabled #956 (andygrove)
- chore: clarify tarball installation #959 (comphead)
Credits
Thank you to everyone who contributed to this release. Here is a breakdown of commits (PRs merged) per contributor.
22 Andy Grove
18 Liang-Chi Hsieh
3 Adam Binford
3 Matt Butrovich
2 Kristin Cowalcijk
2 Oleks V
2 Parth Chandra
1 Himadri Pal
1 Huaxin Gao
1 KAZUYUKI TANIMURA
1 Semyon
1 Xin Hao
Thank you also to everyone who contributed in other ways such as filing issues, reviewing PRs, and providing feedback on this release.
0.2.0
DataFusion Comet 0.2.0 Changelog
This release consists of 87 commits from 14 contributors. See credits at the end of this changelog for more information.
Fixed bugs:
- fix: dictionary decimal vector optimization #705 (kazuyukitanimura)
- fix: Unsupported window expression should fall back to Spark #710 (viirya)
- fix: ReusedExchangeExec can be child operator of CometBroadcastExchangeExec #713 (viirya)
- fix: Fallback to Spark for window expression with range frame #719 (viirya)
- fix: Remove
skip.surefire.tests
mvn property #739 (wForget) - fix: subquery execution under CometTakeOrderedAndProjectExec should not fail #748 (viirya)
- fix: skip negative scale checks for creating decimals #723 (kazuyukitanimura)
- fix: Fallback to Spark for unsupported partitioning #759 (viirya)
- fix: Unsupported types for SinglePartition should fallback to Spark #765 (viirya)
- fix: unwrap dictionaries in CreateNamedStruct #754 (andygrove)
- fix: Fallback to Spark for unsupported input besides ordering #768 (viirya)
- fix: Native window operator should be CometUnaryExec #774 (viirya)
- fix: Fallback to Spark when shuffling on struct with duplicate field name #776 (viirya)
- fix: withInfo was overwriting information in some cases #780 (andygrove)
- fix: Improve support for nested structs #800 (eejbyfeldt)
- fix: Sort on single struct should fallback to Spark #811 (viirya)
- fix: Check sort order of SortExec instead of child output #821 (viirya)
- fix: Fix panic in
avg
aggregate and disablestddev
by default #819 (andygrove) - fix: Supported nested types in HashJoin #735 (eejbyfeldt)
Performance related:
- perf: Improve performance of CASE .. WHEN expressions #703 (andygrove)
- perf: Optimize IfExpr by delegating to CaseExpr #681 (andygrove)
- fix: optimize isNullAt #732 (kazuyukitanimura)
- perf: decimal decode improvements #727 (parthchandra)
- fix: Remove castting on decimals with a small precision to decimal256 #741 (kazuyukitanimura)
- fix: optimize some bit functions #718 (kazuyukitanimura)
- fix: Optimize getDecimal for small precision #758 (kazuyukitanimura)
- perf: add metrics to CopyExec and ScanExec #778 (andygrove)
- fix: Optimize decimal creation macros #764 (kazuyukitanimura)
- perf: Improve count aggregate performance #784 (andygrove)
- fix: Optimize read_side_padding #772 (kazuyukitanimura)
- perf: Remove some redundant copying of batches #816 (andygrove)
- perf: Remove redundant copying of batches after FilterExec #835 (andygrove)
- fix: Optimize CheckOverflow #852 (kazuyukitanimura)
- perf: Add benchmarks for Spark Scan + Comet Exec #863 (andygrove)
Implemented enhancements:
- feat: Add support for time-zone, 3 & 5 digit years: Cast from string to timestamp. #704 (akhilss99)
- feat: Support count AggregateUDF for window function #736 (huaxingao)
- feat: Implement basic version of RLIKE #734 (andygrove)
- feat: show executed native plan with metrics when in debug mode #746 (andygrove)
- feat: Add GetStructField expression #731 (Kimahriman)
- feat: Add config to enable native upper and lower string conversion #767 (andygrove)
- feat: Improve native explain #795 (andygrove)
- feat: Add support for null literal with struct type #797 (eejbyfeldt)
- feat: Optimze CreateNamedStruct preserve dictionaries #789 (eejbyfeldt)
- feat:
CreateArray
support #793 (Kimahriman) - feat: Add native thread configs #828 (viirya)
- feat: Add specific configs for converting Spark Parquet and JSON data to Arrow #832 (andygrove)
- feat: Support sum in window function #802 (huaxingao)
- feat: Simplify configs for enabling/disabling operators #855 (andygrove)
- feat: Enable
clippy::clone_on_ref_ptr
onproto
andspark_exprs
crates #859 (comphead) - feat: Enable
clippy::clone_on_ref_ptr
oncore
crate #860 (comphead) - feat: Use CometPlugin as main entrypoint #853 (andygrove)
Documentation updates:
- doc: Update outdated spark.comet.columnar.shuffle.enabled configuration doc #738 (wForget)
- docs: Add explicit configs for enabling operators #801 (andygrove)
- doc: Document CometPlugin to start Comet in cluster mode #836 (comphead)
Other:
- chore: Make rust clippy happy #701 (Xuanwo)
- chore: Update version to 0.2.0 and add 0.1.0 changelog #696 (andygrove)
- chore: Use rust-toolchain.toml for better toolchain support #699 (Xuanwo)
- chore(native): Make sure all targets in workspace been covered by clippy #702 (Xuanwo)
- Apache DataFusion Comet Logo #697 (aocsa)
- chore: Add logo to rat exclude list #709 (andygrove)
- chore: Use new logo in README and website #724 (andygrove)
- build: Add Comet logo files into exclude list #726 (viirya)
- chore: Remove TPC-DS benchmark results #728 (andygrove)
- chore: make Cast's logic reusable for other projects #716 (Blizzara)
- chore: move scalar_funcs into spark-expr #712 (Blizzara)
- chore: Bump DataFusion to rev 35c2e7e #740 (andygrove)
- chore: add more aggregate functions to benchmark test #706 (huaxingao)
- chore: Add criterion benchmark for decimal_div #743 (andygrove)
- build: Re-enable TPCDS q72 for broadcast and hash join configs #781 (viirya)
- chore: bump DataFusion to rev f4e519f #783 (huaxingao)
- chore: Upgrade to DataFusion rev bddb641 and disable "skip partial aggregates" feature [#788](https://github.com/apa...
0.1.0
DataFusion Comet 0.1.0 Changelog
This release consists of 343 commits from 41 contributors. See credits at the end of this changelog for more information.
Implemented enhancements:
- feat: Add native shuffle and columnar shuffle #30 (viirya)
- feat: Support Emit::First for SumDecimalGroupsAccumulator #47 (viirya)
- feat: Nested map support for columnar shuffle #51 (viirya)
- feat: Support Count(Distinct) and similar aggregation functions #42 (huaxingao)
- feat: Upgrade to
jni-rs
0.21 #50 (sunchao) - feat: Handle exception thrown from native side #61 (sunchao)
- feat: Support InSet expression in Comet #59 (viirya)
- feat: Add
CometNativeException
for exceptions thrown from the native side #62 (sunchao) - feat: Add cause to native exception #63 (viirya)
- feat: Pull based native execution #69 (viirya)
- feat: Add executeColumnarCollectIterator to CometExec to collect Comet operator result #71 (viirya)
- feat: Add CometBroadcastExchangeExec to support broadcasting the result of Comet native operator #80 (viirya)
- feat: Reduce memory consumption when writing sorted shuffle files #82 (sunchao)
- feat: Add struct/map as unsupported map key/value for columnar shuffle #84 (viirya)
- feat: Support multiple input sources for CometNativeExec #87 (viirya)
- feat: Date and timestamp trunc with format array #94 (parthchandra)
- feat: Support
First
/Last
aggregate functions #97 (huaxingao) - feat: Add support of TakeOrderedAndProjectExec in Comet #88 (viirya)
- feat: Support Binary in shuffle writer #106 (advancedxy)
- feat: Add license header by spotless:apply automatically #110 (advancedxy)
- feat: Add dictionary binary to shuffle writer #111 (viirya)
- feat: Minimize number of connections used by parallel reader #126 (parthchandra)
- feat: Support CollectLimit operator #100 (advancedxy)
- feat: Enable min/max for boolean type #165 (huaxingao)
- feat: Introduce
CometTaskMemoryManager
and native side memory pool #83 (sunchao) - feat: Fix old style names #201 (comphead)
- feat: enable comet shuffle manager for comet shell #204 (zuston)
- feat: Support bitwise aggregate functions #197 (huaxingao)
- feat: Support BloomFilterMightContain expr #179 (advancedxy)
- feat: Support sort merge join #178 (viirya)
- feat: Support HashJoin operator #194 (viirya)
- feat: Remove use of nightly int_roundings feature #228 (psvri)
- feat: Support Broadcast HashJoin #211 (viirya)
- feat: Enable Comet broadcast by default #213 (viirya)
- feat: Add CometRowToColumnar operator #206 (advancedxy)
- feat: Document the class path / classloader issue with the shuffle manager #256 (holdenk)
- feat: Port Datafusion Covariance to Comet #234 (huaxingao)
- feat: Add manual test to calculate spark builtin functions coverage #263 (comphead)
- feat: Support ANSI mode in CAST from String to Bool #290 (andygrove)
- feat: Add extended explain info to Comet plan #255 (parthchandra)
- feat: Improve CometSortMergeJoin statistics #304 (planga82)
- feat: Add compatibility guide #316 (andygrove)
- feat: Improve CometHashJoin statistics #309 (planga82)
- feat: Support Variance #297 (huaxingao)
- feat: Support murmur3_hash and sha2 family hash functions #226 (advancedxy)
- feat: Disable cast string to timestamp by default #337 (andygrove)
- feat: Improve CometBroadcastHashJoin statistics #339 (planga82)
- feat: Implement Spark-compatible CAST from string to integral types #307 (andygrove)
- feat: Implement Spark-compatible CAST from string to timestamp types #335 (vaibhawvipul)
- feat: Implement Spark-compatible CAST float/double to string #346 (mattharder91)
- feat: Only allow incompatible cast expressions to run in comet if a config is enabled #362 (andygrove)
- feat: Implement Spark-compatible CAST between integer types #340 (ganeshkumar269)
- feat: Supports Stddev #348 (huaxingao)
- feat: Improve cast compatibility tests and docs #379 (andygrove)
- feat: Implement Spark-compatible CAST from non-integral numeric types to integral types #399 (rohitrastogi)
- feat: Implement Spark unhex #342 (tshauck)
- feat: Enable columnar shuffle by default #250 (viirya)
- feat: Implement Spark-compatible CAST from floating-point/double to decimal #384 (vaibhawvipul)
- feat: Add logging to explain reasons for Comet not being able to run a query stage natively #397 (andygrove)
- feat: Add support for TryCast expression in Spark 3.2 and 3.3 #416 (vaibhawvipul)
- feat: Supports UUID column #395 (huaxingao)
- feat: correlation support #456 (huaxingao)
- feat: Implement Spark-compatible CAST from String to Date #383 (vidyasankarv)
- feat: Add COMET_SHUFFLE_MODE config to control Comet shuffle mode #460 (viirya)
- feat: Add random row generator in data generator #451 (advancedxy)
- feat: Add xxhash64 function support #424 (advancedxy)
- feat: add hex scalar function #449 (tshauck)
- feat: Add "Comet Fuzz" fuzz-testing utility #472 (andygrove)
- feat: Use enum to represent CAST eval_mode in expr.proto #415 (prashantksharma)
- feat: Implement ANSI support for UnaryMinus #471 (vaibhawvipul)
- feat: Add specific fuzz tests for cast and try_cast and fix NPE found during fuzz testing #514 (andygrove)
- feat: Add fuzz testing for arithmetic expressions #519 (andygr...