feat: [datafusion-spark] Implement `next_day` function #16780
Conversation
```rust
impl SparkNextDay {
    pub fn new() -> Self {
        Self {
            signature: Signature::user_defined(Volatility::Immutable),
```
We can define a specific signature here to be `(Date32, Utf8/Utf8View/LargeUtf8)`. After that I think the implementation can be simplified:

- No need to implement `coerce_types()`; there is code to handle that automatically based on the signature.
- We can assume the signature is valid inside `invoke_with_args()`, so there would be no need to check invalid input (sanity checks like `unreachable!()` or returning internal errors for invalid input can still be applied).
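For illustration, a fixed signature along these lines might look like the following. This is only a sketch, assuming `Signature::one_of` and `TypeSignature::Exact` from `datafusion_expr`; the function name is hypothetical and this is not code from the PR:

```rust
use arrow::datatypes::DataType;
use datafusion_expr::{Signature, TypeSignature, Volatility};

// Sketch: accept (Date32, <any string type>) explicitly instead of
// Signature::user_defined, so argument coercion can be derived from
// the signature rather than a hand-written coerce_types().
fn next_day_signature() -> Signature {
    Signature::one_of(
        vec![
            TypeSignature::Exact(vec![DataType::Date32, DataType::Utf8]),
            TypeSignature::Exact(vec![DataType::Date32, DataType::Utf8View]),
            TypeSignature::Exact(vec![DataType::Date32, DataType::LargeUtf8]),
        ],
        Volatility::Immutable,
    )
}
```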
```
## Original Query: SELECT next_day('2015-01-14', 'TU');
## PySpark 3.5.5 Result: {'next_day(2015-01-14, TU)': datetime.date(2015, 1, 20), 'typeof(next_day(2015-01-14, TU))': 'date', 'typeof(2015-01-14)': 'string', 'typeof(TU)': 'string'}
#query
#SELECT next_day('2015-01-14'::string, 'TU'::string);
query D
```
I recommend adding tests for invalid inputs:

- 0 or >2 inputs
- Each element can be either a valid input, an invalid input of the correct type (like `2015-13-32`), or an invalid type, and finally nulls. We want to test different combinations, to ensure that for invalid inputs the expected (and easy-to-understand) errors are returned, instead of panicking.

Also, here we only checked `ScalarValue` input; let's also do the tests for `Array` inputs.
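To illustrate why a value like `2015-13-32` needs its own test case: it is a well-typed string but not a real date, so it only fails at calendar validation, not at type checking. A std-only sketch of such a validity check (helper names here are illustrative, not from the PR; the actual implementation would more likely rely on `chrono`'s parsing returning an error):

```rust
// True for leap years in the proleptic Gregorian calendar.
fn is_leap(y: i32) -> bool {
    (y % 4 == 0 && y % 100 != 0) || y % 400 == 0
}

// Checks that a "YYYY-MM-DD" string denotes a real calendar date.
fn is_valid_date(s: &str) -> bool {
    let parts: Vec<&str> = s.split('-').collect();
    if parts.len() != 3 {
        return false;
    }
    let (y, m, d) = match (
        parts[0].parse::<i32>(),
        parts[1].parse::<u32>(),
        parts[2].parse::<u32>(),
    ) {
        (Ok(y), Ok(m), Ok(d)) => (y, m, d),
        _ => return false,
    };
    if !(1..=12).contains(&m) {
        return false;
    }
    let days_in_month = match m {
        1 | 3 | 5 | 7 | 8 | 10 | 12 => 31,
        4 | 6 | 9 | 11 => 30,
        _ => if is_leap(y) { 29 } else { 28 },
    };
    (1..=days_in_month).contains(&d)
}
```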
Pull Request Overview
Add support for Spark's `next_day` function in DataFusion by implementing the UDF and its tests, registering it in the datetime module, and adding `chrono` as a dependency.

- Introduced SQLLogicTest cases for `next_day`
- Implemented `SparkNextDay` UDF (scalar + array)
- Registered the UDF in `mod.rs` and updated `Cargo.toml`
Reviewed Changes
Copilot reviewed 4 out of 5 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| next_day.slt | Added functional tests for next_day with various inputs |
| next_day.rs | Full implementation of next_day UDF logic |
| mod.rs | Registered and exported next_day in datetime module |
| Cargo.toml | Added chrono as a workspace dependency |
Comments suppressed due to low confidence (3)
datafusion/spark/src/function/datetime/next_day.rs:77

- The code only handles `Date32` inputs for the date argument, but the tests pass string dates. You need to add a branch to parse `ScalarValue::Utf8`/`LargeUtf8` as ISO-8601 dates and convert them to `Date32` before computing the next day.

```rust
(ColumnarValue::Scalar(date), ColumnarValue::Scalar(day_of_week)) => {
```
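If string dates were handled inside the UDF, the conversion itself is straightforward; here is a std-only sketch mapping an ISO-8601 string to a Date32-style day count (days since 1970-01-01). Function names are illustrative, not from the PR, which would more likely use `chrono` or rely on signature-driven coercion. The day-count formula follows the well-known civil-calendar arithmetic:

```rust
// Days since 1970-01-01 for a proleptic Gregorian (y, m, d).
fn days_from_civil(y: i64, m: u32, d: u32) -> i64 {
    let y = if m <= 2 { y - 1 } else { y };
    let era = (if y >= 0 { y } else { y - 399 }) / 400;
    let yoe = y - era * 400;                      // year of era [0, 399]
    let mp = ((m + 9) % 12) as i64;               // March-based month [0, 11]
    let doy = (153 * mp + 2) / 5 + d as i64 - 1;  // day of year [0, 365]
    let doe = yoe * 365 + yoe / 4 - yoe / 100 + doy;
    era * 146097 + doe - 719468                   // shift epoch to 1970-01-01
}

// Parses "YYYY-MM-DD" into a Date32-style day count.
// Note: bounds here are loose (no per-month day check); a full
// implementation would validate the calendar date as well.
fn parse_date32(s: &str) -> Option<i32> {
    let mut it = s.split('-');
    let y: i64 = it.next()?.parse().ok()?;
    let m: u32 = it.next()?.parse().ok()?;
    let d: u32 = it.next()?.parse().ok()?;
    if it.next().is_some() || !(1..=12).contains(&m) || !(1..=31).contains(&d) {
        return None;
    }
    Some(days_from_civil(y, m, d) as i32)
}
```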
datafusion/sqllogictest/test_files/spark/datetime/next_day.slt:32

- Consider adding tests for edge cases such as NULL inputs and invalid weekday strings to verify null propagation and error handling behavior.

```sql
SELECT next_day('2015-07-27'::string, 'Sun'::string);
```
datafusion/spark/Cargo.toml:40

- The syntax for adding a workspace dependency is incorrect. Change to `chrono = { workspace = true }` to match the other entries.

```toml
chrono.workspace = true
```
```rust
fn spark_next_day(days: i32, day_of_week: &str) -> Option<i32> {
    let date = Date32Type::to_naive_date(days);
```
[nitpick] The `spark_next_day` function recomputes `trim().to_uppercase()` and parses the weekday for each element in an array. You could pre-normalize and parse the target `Weekday` once outside the loops for better performance on large arrays.
Suggested change:

```rust
fn spark_next_day_with_weekday(days: i32, day_of_week: Weekday) -> Option<i32> {
    let date = Date32Type::to_naive_date(days);
    Some(Date32Type::from_naive_date(
        date + Duration::days(
            (7 - date.weekday().days_since(day_of_week)) as i64,
        ),
    ))
}

fn normalize_and_parse_weekday(day_of_week: &str) -> Option<Weekday> {
```
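The hoisting idea can also be sketched without `chrono`, using plain day-count arithmetic (Date32 values are days since 1970-01-01, which was a Thursday). This is an illustrative std-only version, not the PR's code; the point is that the string normalization and parse happen once, outside the per-element loop:

```rust
// Weekday index: 0 = Monday ... 6 = Sunday (ISO numbering).
fn parse_weekday(day_of_week: &str) -> Option<u32> {
    match day_of_week.trim().to_uppercase().as_str() {
        "MO" | "MON" | "MONDAY" => Some(0),
        "TU" | "TUE" | "TUESDAY" => Some(1),
        "WE" | "WED" | "WEDNESDAY" => Some(2),
        "TH" | "THU" | "THURSDAY" => Some(3),
        "FR" | "FRI" | "FRIDAY" => Some(4),
        "SA" | "SAT" | "SATURDAY" => Some(5),
        "SU" | "SUN" | "SUNDAY" => Some(6),
        _ => None,
    }
}

fn next_day_with_weekday(days: i32, target: u32) -> i32 {
    // 1970-01-01 was a Thursday (index 3 in Monday-based numbering).
    let current = ((((days % 7) + 7) % 7) as u32 + 3) % 7;
    let delta = (7 + target - current) % 7;
    // "Strictly later": landing on the same weekday means a full week ahead.
    let step = if delta == 0 { 7 } else { delta };
    days + step as i32
}

// Parse the weekday once, then apply cheap arithmetic per element.
fn next_day_array(days: &[i32], day_of_week: &str) -> Option<Vec<i32>> {
    let target = parse_weekday(day_of_week)?;
    Some(days.iter().map(|&d| next_day_with_weekday(d, target)).collect())
}
```

With this shape, an invalid weekday string is rejected once up front rather than per row, which also matches the reviewer's point about returning a single clear error instead of panicking inside the loop.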
```rust
export_functions!((
    next_day,
    "Returns the first date which is later than start_date and named as indicated. The function returns NULL if at least one of the input parameters is NULL. When both of the input parameters are not NULL and day_of_week is an invalid input, the function throws SparkIllegalArgumentException if spark.sql.ansi.enabled is set to true, otherwise NULL.",
```
I think this needs to be adjusted. Rust does not have exceptions, and ANSI mode is not hooked up yet (something like #16661 might be needed for that to happen).
Co-authored-by: Bruce Ritchie <[email protected]>
Which issue does this PR close?

- Implement Spark date function `next_day` #16775

Rationale for this change

See #16775

What changes are included in this PR?

Implement a Spark-compatible `next_day` function.

Are these changes tested?
Yes, I added tests from all of the links in the Spark Test Files README.md
Are there any user-facing changes?
Yes, new function.