Support array and map argument in array_aggregate #15149

Open
thirtiseven wants to merge 4 commits into NVIDIA:main from thirtiseven:array-aggregate-nested-arg

@thirtiseven

Collaborator

Fixes #15147.

Description

This PR adds array and map type support to the array_aggregate override. The support was already implemented but was never enabled in GpuOverrides.

It also adds integration tests for the newly enabled input types.

Checklists

Documentation

  • Updated for new or modified user-facing features or behaviors
  • No user-facing change

Testing

  • Added or modified tests to cover new code paths
  • Covered by existing tests
    (Please provide the names of the existing tests in the PR description.)
  • Not required

Performance

  • Tests ran and results are added in the PR description
  • Issue filed with a link in the PR description
  • Not required

Signed-off-by: Haoyang Li <haoyangl@nvidia.com>
@thirtiseven thirtiseven self-assigned this Jun 26, 2026
Signed-off-by: Haoyang Li <haoyangl@nvidia.com>
Signed-off-by: Haoyang Li <haoyangl@nvidia.com>
@thirtiseven thirtiseven marked this pull request as ready for review June 26, 2026 09:17
@greptile-apps

greptile-apps Bot commented Jun 26, 2026

Contributor

Greptile Summary

This PR enables ARRAY and MAP element types in the array_aggregate GPU override — support that was already implemented in GpuArrayAggregateMeta / ArrayAggregateDecomposer but was blocked by the TypeSig declaration in GpuOverrides. The docs and integration tests are updated to match.

  • GpuOverrides.scala: Adds TypeSig.ARRAY and TypeSig.MAP to the argument TypeSig for ArrayAggregate, enabling inputs like ARRAY<ARRAY<...>>, ARRAY<MAP<...>>, and ARRAY<STRUCT<..., MAP/ARRAY field, ...>> to run on the GPU.
  • docs/supported_ops.md: Removes ARRAY and MAP from the list of unsupported child types for this operator, consistent with the signature change.
  • higher_order_functions_test.py: Adds four integration tests covering STRUCTs with nested ARRAY/MAP children, direct ARRAY-of-ARRAY and ARRAY-of-MAP inputs (parametrized), and a realistic nested filter+aggregate pattern.
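The input shapes listed above can be pictured with a small plain-Python model of the `aggregate(array, zero, merge, finish)` higher-order function. This is only an illustrative sketch: the helper `aggregate` and the sample data are hypothetical, it does not use Spark or the RAPIDS plugin, and it is not the code added by this PR. It shows nested elements (arrays, maps, structs with nested fields) being folded into a scalar accumulator.

```python
from functools import reduce


def aggregate(arr, zero, merge, finish=lambda acc: acc):
    """Hypothetical plain-Python model of SQL aggregate(array, zero, merge[, finish]).

    A null (None) input array yields None; otherwise the elements are folded
    left to right into the accumulator and `finish` is applied to the result.
    """
    if arr is None:
        return None
    return finish(reduce(merge, arr, zero))


# ARRAY<ARRAY<INT>>: the elements are arrays, the accumulator is a scalar int.
nested_arrays = [[1, 2, 3], [], [4, 5]]
total_len = aggregate(nested_arrays, 0, lambda acc, x: acc + len(x))

# ARRAY<MAP<STRING, INT>>: the elements are maps, the accumulator is a scalar int.
nested_maps = [{"a": 1}, {"b": 2, "c": 3}]
total_keys = aggregate(nested_maps, 0, lambda acc, m: acc + len(m))

# ARRAY<STRUCT<id INT, tags ARRAY<STRING>>>: the lambda reads only the scalar
# field, so the nested `tags` field is carried along but never used.
structs = [{"id": 10, "tags": ["x"]}, {"id": 32, "tags": []}]
id_sum = aggregate(structs, 0, lambda acc, s: acc + s["id"])

print(total_len, total_keys, id_sum)
```

In each case the element type is nested while the accumulator stays a scalar, which matches the summary's description of what the signature change permits.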

Confidence Score: 5/5

Safe to merge — the change is a minimal TypeSig unlock in GpuOverrides with no modifications to the GPU execution path itself.

The TypeSig change is a single-line addition that unblocks a code path already implemented and validated by GpuArrayAggregateMeta / ArrayAggregateDecomposer. The accumulator types remain scalar-only. Four new integration tests verify GPU execution.

No files require special attention.

Important Files Changed

| Filename | Overview |
| --- | --- |
| sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOverrides.scala | One-line TypeSig addition: adds ARRAY and MAP to the argument nested type set. |
| integration_tests/src/main/python/higher_order_functions_test.py | Four new GPU-verified integration tests for the newly enabled input shapes. |
| docs/supported_ops.md | Removes ARRAY and MAP from the unsupported child-type list, consistent with the TypeSig change. |

Reviews (2): Last reviewed commit: "add test coverage"

Comment thread integration_tests/src/main/python/higher_order_functions_test.py
Signed-off-by: Haoyang Li <haoyangl@nvidia.com>
@thirtiseven

Collaborator Author

build

Development

Successfully merging this pull request may close these issues.

[FEA] Support ArrayAggregate when the input array struct has unsupported unused nested fields

3 participants