Smarter default for split_out in groupby aggregations #867
Labels: enhancement
Currently, the default for split_out is based on the number of partitions and groupby columns: https://github.com/dask-contrib/dask-expr/blob/fdeee4375df55499462bc35af50d3d216d5e256c/dask_expr/_groupby.py#L81-L82
Ideally, this would take (an estimate of) the number of unique groups into account to avoid situations like coiled/benchmarks#1376.
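To make the idea concrete, here is a rough sketch of what a cardinality-aware default could look like. It is not the dask-expr implementation: the function name `estimate_split_out`, the `groups_per_partition` threshold, and its default value are all hypothetical, and how the group-count estimate would be obtained (sampling, statistics, etc.) is left open.

```python
import math


def estimate_split_out(n_groups_estimate, npartitions, groups_per_partition=1_000_000):
    """Hypothetical heuristic: choose split_out from an estimated group count.

    n_groups_estimate: rough number of unique groups (assumed to come from
        some estimator; not specified here).
    npartitions: number of input partitions, used as an upper bound.
    groups_per_partition: assumed target number of groups per output
        partition (an arbitrary illustrative value, not a tuned one).
    """
    if n_groups_estimate <= 0:
        # No usable estimate: fall back to a single output partition.
        return 1
    # Enough output partitions to hold the estimated groups at the target size.
    wanted = math.ceil(n_groups_estimate / groups_per_partition)
    # Never fewer than 1 and never more than the number of input partitions.
    return max(1, min(npartitions, wanted))
```

With something like this, a low-cardinality groupby would collapse to a single output partition regardless of how many input partitions there are, while a high-cardinality one would be spread out, capped at `npartitions`.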