[SPARK-51439][SQL] Support SQL UDF with DEFAULT argument #50408

Open
wengh wants to merge 25 commits into base: master

@wengh wengh (Contributor) commented Mar 26, 2025

Continuing @allisonwang-db's work on #50373 and #49471

What changes were proposed in this pull request?

This PR adds support for DEFAULT arguments in SQL UDFs. Examples:

CREATE FUNCTION foo1d1(a INT DEFAULT 10) RETURNS INT RETURN a;
SELECT foo1d1();   -- 10
SELECT foo1d1(20); -- 20

CREATE FUNCTION foo1d6(a INT, b INT DEFAULT 7) RETURNS TABLE(a INT, b INT) RETURN SELECT a, b;
SELECT * FROM foo1d6(5);    -- 5, 7
SELECT * FROM foo1d6(5, 2); -- 5, 2

See sql-udf.sql for more valid and invalid examples.
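
The call-site behavior in the examples above can be sketched roughly as follows. This is an illustrative Python model only, not Spark's implementation: `Param`, `bind_arguments`, and the `_NO_DEFAULT` sentinel are hypothetical names, and the sketch assumes arguments are passed positionally and that omitted trailing arguments are filled from their declared defaults.

```python
from dataclasses import dataclass
from typing import Any, List, Sequence

# Hypothetical sentinel: marks a parameter declared without a DEFAULT clause.
_NO_DEFAULT = object()


@dataclass(frozen=True)
class Param:
    """Hypothetical model of one SQL UDF parameter."""
    name: str
    default: Any = _NO_DEFAULT


def bind_arguments(params: Sequence[Param], args: Sequence[Any]) -> List[Any]:
    """Fill omitted trailing positional arguments from declared defaults."""
    if len(args) > len(params):
        raise TypeError(
            f"expected at most {len(params)} arguments, got {len(args)}"
        )
    bound = list(args)
    # Only parameters after the supplied arguments can fall back to a default.
    for param in params[len(args):]:
        if param.default is _NO_DEFAULT:
            raise TypeError(f"missing argument for parameter {param.name!r}")
        bound.append(param.default)
    return bound


# Mirrors foo1d1(a INT DEFAULT 10) and foo1d6(a INT, b INT DEFAULT 7).
foo1d1 = [Param("a", 10)]
foo1d6 = [Param("a"), Param("b", 7)]

print(bind_arguments(foo1d1, []))      # [10]
print(bind_arguments(foo1d1, [20]))    # [20]
print(bind_arguments(foo1d6, [5]))     # [5, 7]
print(bind_arguments(foo1d6, [5, 2]))  # [5, 2]
```

A sentinel is used instead of `None` so that a parameter whose default is a null value stays distinguishable from a parameter with no default at all.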

Why are the changes needed?

To support default arguments in SQL UDFs.

Does this PR introduce any user-facing change?

Yes. SQL UDFs now support DEFAULT arguments.

A side effect of the grammar change is that some invalid function parameter definitions are no longer rejected by the grammar. They are instead rejected by the parser logic, so the error reported for them changes.

Examples:

-- multiple COMMENT or multiple NOT NULL
CREATE TEMPORARY FUNCTION foo(a INT COMMENT 'hello' COMMENT 'world') RETURNS INT RETURN a;

-- before:
[PARSE_SYNTAX_ERROR] Syntax error at or near 'COMMENT'. SQLSTATE: 42601
== SQL (line 2, position 1) ==
CREATE TEMPORARY FUNCTION foo(a INT COMMENT 'hello' COMMENT 'world') RETURNS INT RETURN a;
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-- after:
[CREATE_TABLE_COLUMN_DESCRIPTOR_DUPLICATE] CREATE TABLE column a specifies descriptor "COMMENT" more than once, which is invalid. SQLSTATE: 42710
== SQL (line 1, position 1) ==
CREATE TEMPORARY FUNCTION foo(a INT COMMENT 'hello' COMMENT 'world')...
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-- GENERATED ALWAYS AS
CREATE TEMPORARY FUNCTION foo(a INT GENERATED ALWAYS AS (1)) RETURNS INT RETURN a;

-- before:
[PARSE_SYNTAX_ERROR] Syntax error at or near 'GENERATED'. SQLSTATE: 42601
== SQL (line 2, position 1) ==
CREATE TEMPORARY FUNCTION foo(a INT GENERATED ALWAYS AS (1)) RETURNS INT RETURN a;
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

-- after:
[INVALID_SQL_SYNTAX.CREATE_FUNC_WITH_GENERATED_COLUMNS_AS_PARAMETERS] Invalid SQL syntax: CREATE FUNCTION with generated columns as parameters is not allowed. SQLSTATE: 42000
== SQL (line 2, position 1) ==
CREATE TEMPORARY FUNCTION foo(a INT GENERATED ALWAYS AS (1)) RETURNS INT RETURN a;
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

This doesn't change the behavior of existing valid SQL.
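
The "accepted by the grammar, rejected by the parser logic" pattern behind the before/after errors above can be sketched as a validation pass over descriptors that a permissive grammar has already accepted. This is a hedged Python illustration, not the Spark code: `ParseError`, `validate_parameter`, and the list-of-keywords representation are assumptions made for the example. Only the two error class names and their messages are taken from the output quoted above.

```python
from typing import List


class ParseError(Exception):
    """Hypothetical error type carrying an error class name and a message."""

    def __init__(self, error_class: str, message: str) -> None:
        super().__init__(f"[{error_class}] {message}")
        self.error_class = error_class


def validate_parameter(name: str, descriptors: List[str]) -> None:
    """Check descriptors that the permissive grammar let through.

    `descriptors` is an assumed representation: the descriptor keywords of
    one parameter, in source order, e.g. ["COMMENT", "COMMENT"].
    """
    seen = set()
    for descriptor in descriptors:
        if descriptor == "GENERATED ALWAYS AS":
            raise ParseError(
                "INVALID_SQL_SYNTAX.CREATE_FUNC_WITH_GENERATED_COLUMNS_AS_PARAMETERS",
                "Invalid SQL syntax: CREATE FUNCTION with generated columns "
                "as parameters is not allowed.",
            )
        if descriptor in seen:
            # Assumption: any repeated descriptor is treated as a duplicate.
            raise ParseError(
                "CREATE_TABLE_COLUMN_DESCRIPTOR_DUPLICATE",
                f'CREATE TABLE column {name} specifies descriptor '
                f'"{descriptor}" more than once, which is invalid.',
            )
        seen.add(descriptor)


# A valid parameter passes silently; the two invalid shapes raise.
validate_parameter("a", ["DEFAULT", "COMMENT"])
for bad in (["COMMENT", "COMMENT"], ["GENERATED ALWAYS AS"]):
    try:
        validate_parameter("a", bad)
    except ParseError as err:
        print(err)
```

Moving such checks out of the grammar lets the rule for parameter descriptors stay general while the validation step reports a more specific error than a generic syntax error.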

How was this patch tested?

End-to-end regression tests in sql-udf.sql and simple tests in SQLFunctionSuite.

Was this patch authored or co-authored using generative AI tooling?

No

@wengh wengh changed the title [WIP][SPARK-51439] Support SQL UDF with DEFAULT argument [WIP][SPARK-51439][SQL] Support SQL UDF with DEFAULT argument Mar 26, 2025
@wengh wengh marked this pull request as ready for review March 27, 2025 15:05
@wengh wengh changed the title [WIP][SPARK-51439][SQL] Support SQL UDF with DEFAULT argument [SPARK-51439][SQL] Support SQL UDF with DEFAULT argument Mar 27, 2025
@wengh wengh (Contributor Author) commented Mar 27, 2025

@wengh wengh force-pushed the sql-udf-default branch from ecfb9ce to aee7338 Compare March 27, 2025 16:01
@allisonwang-db allisonwang-db (Contributor) left a comment

Looks great!

@wengh wengh marked this pull request as draft March 28, 2025 00:31
@wengh wengh force-pushed the sql-udf-default branch from 8f51e83 to 8c73a14 Compare March 28, 2025 16:48
@wengh wengh marked this pull request as ready for review March 28, 2025 17:13
@wengh wengh (Contributor Author) left a comment

explain refactor

@wengh wengh requested review from cloud-fan and zhengruifeng April 1, 2025 00:22
@wengh wengh requested a review from cloud-fan April 2, 2025 23:03
@wengh wengh force-pushed the sql-udf-default branch from 275f637 to e4a6883 Compare April 8, 2025 21:51
@cloud-fan cloud-fan (Contributor) left a comment

LGTM if CI passes

4 participants