Skip to content

Add ARRAY_TO_CSV helper function #4587#5404

Open
ishag4 wants to merge 2 commits intoopensearch-project:mainfrom
ishag4:issue-4587
Open

Add ARRAY_TO_CSV helper function #4587#5404
ishag4 wants to merge 2 commits intoopensearch-project:mainfrom
ishag4:issue-4587

Conversation

@ishag4
Copy link
Copy Markdown

@ishag4 ishag4 commented May 4, 2026

Description

[Describe what this change achieves]

Related Issues

Resolves #4587

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • New PPL command checklist all confirmed.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff or -s.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 4, 2026

PR Reviewer Guide 🔍

(Review updated until commit baa3553)

Here are some key observations to aid the review process:

🧪 PR contains tests
🔒 No security concerns identified
✅ No TODO sections
🔀 No multiple PR themes
⚡ Recommended focus areas for review

CSV Escaping

The arrayToCsv method does not handle CSV special characters. If an element contains the delimiter, a double-quote, or a newline, the output will be malformed CSV. Proper CSV encoding should quote fields containing special characters and escape internal double-quotes (e.g., RFC 4180 compliance).

public static String arrayToCsv(List<Object> array, String delimiter) {
  if (array == null) {
    return null;
  }

  if (delimiter == null) {
    delimiter = ",";
  }

  if (array.isEmpty()) {
    return "";
  }

  StringBuilder result = new StringBuilder();
  for (int i = 0; i < array.size(); i++) {
    if (i > 0) {
      result.append(delimiter);
    }
    Object element = array.get(i);
    if (element != null) {
      result.append(element.toString());
    }
  }

  return result.toString();
}
Null Policy Mismatch

The constructor uses NullPolicy.ARG0, which means the function returns null when the first argument (the array) is null — this is handled correctly. However, the arrayToCsv static method also has an explicit null check for the array. These are redundant but consistent. More importantly, if NullPolicy.ARG0 short-circuits before calling arrayToCsv, the null-delimiter fallback logic inside arrayToCsv will still be reached when the array is non-null but the delimiter is null. This is fine, but the interaction between the null policy and the runtime method should be validated to ensure no unexpected behavior when the delimiter argument is null at the Calcite expression level.

public ArrayToCsvFunctionImpl() {
  super(new ArrayToCsvImplementor(), NullPolicy.ARG0);
}
Missing ARRAY_SLICE/ARRAY_COMPACT Pattern

Other array functions like ARRAY_SLICE and ARRAY_COMPACT are registered with true (indicating they are variadic or have special handling) in BuiltinFunctionName. It should be verified whether ARRAY_TO_CSV also needs such a flag, given it supports an optional second argument.

}

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 4, 2026

PR Code Suggestions ✨

Latest suggestions up to baa3553
Explore these optional code suggestions:

CategorySuggestion                                                                                                                                    Impact
General
Use explicit null policy for consistent behavior

Using NullPolicy.ARG0 means the function will return null if the first argument (the
array) is null, which is the intended behavior. However, when a two-argument form is
used and the second argument (delimiter) is null, the NullPolicy.ARG0 policy will
NOT short-circuit on a null delimiter — the arrayToCsv method handles this case by
defaulting to ",". This is correct, but it should be verified that NullPolicy.ARG0
does not also short-circuit on ARG1 being null in the two-argument case, potentially
bypassing the null-delimiter fallback logic. Consider using NullPolicy.NONE and
handling all null checks explicitly in arrayToCsv to ensure consistent behavior.

core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/ArrayToCsvFunctionImpl.java [30-32]

 public ArrayToCsvFunctionImpl() {
-  super(new ArrayToCsvImplementor(), NullPolicy.ARG0);
+  super(new ArrayToCsvImplementor(), NullPolicy.NONE);
 }
Suggestion importance[1-10]: 5

__

Why: The suggestion raises a valid concern about NullPolicy.ARG0 potentially short-circuiting on the first argument while the second argument's null handling is done manually in arrayToCsv. Switching to NullPolicy.NONE ensures all null checks are handled explicitly in the method, providing more predictable behavior. This is a moderate correctness/consistency concern.

Low
Clarify null element handling in loop

The current implementation skips null elements entirely (no output for null), but
the tests expect null elements to produce empty strings between delimiters (e.g.,
"GET,,WRITE"). The if (element != null) block should use an else branch to append an
empty string, or simply call result.append(element != null ? element.toString() :
""). The current code actually does produce the correct output since the delimiter
is appended regardless, but the logic is misleading — the if (element != null) block
effectively appends nothing for null, which is correct. However, to make the intent
explicit and avoid future bugs, use a ternary expression.

core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/ArrayToCsvFunctionImpl.java [99-106]

 public static String arrayToCsv(List<Object> array, String delimiter) {
   if (array == null) {
     return null;
   }
 
   if (delimiter == null) {
     delimiter = ",";
   }
 
   if (array.isEmpty()) {
     return "";
   }
 
   StringBuilder result = new StringBuilder();
   for (int i = 0; i < array.size(); i++) {
     if (i > 0) {
       result.append(delimiter);
     }
     Object element = array.get(i);
-    if (element != null) {
-      result.append(element.toString());
-    }
+    result.append(element != null ? element.toString() : "");
   }
 
   return result.toString();
 }
Suggestion importance[1-10]: 4

__

Why: The suggestion correctly identifies that the current code produces the right output but the intent is unclear. Replacing the if (element != null) block with a ternary expression makes the null-handling intent explicit and reduces the chance of future bugs. This is a minor readability/maintainability improvement.

Low

Previous suggestions

Suggestions up to commit 1a85890
CategorySuggestion                                                                                                                                    Impact
Possible issue
Fix null policy to cover all arguments

The class uses NullPolicy.ARG0, which means the implementor is only called when the
first argument (the array) is non-null. However, the second argument (delimiter)
could still be null at runtime, and the arrayToCsv method handles that by defaulting
to ",". This is fine, but the NotNullImplementor contract assumes non-null inputs —
if the delimiter expression evaluates to null at the Calcite expression level, it
may not be passed correctly. Consider using NullPolicy.ANY or NullPolicy.NONE to
ensure consistent null handling for both arguments.

core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/ArrayToCsvFunctionImpl.java [30-32]

-public static class ArrayToCsvImplementor implements NotNullImplementor {
-  @Override
-  public Expression implement(
-      RexToLixTranslator translator, RexCall call, List<Expression> translatedOperands) {
-    // Handle both 1-argument (with default delimiter) and 2-argument cases
-    if (translatedOperands.size() == 1) {
-      ...
-    } else if (translatedOperands.size() == 2) {
-      ...
-    } else {
-      throw new IllegalArgumentException(
-          "ARRAY_TO_CSV expects 1 or 2 arguments, got " + translatedOperands.size());
-    }
-  }
+public ArrayToCsvFunctionImpl() {
+  super(new ArrayToCsvImplementor(), NullPolicy.ANY);
 }
Suggestion importance[1-10]: 5

__

Why: Using NullPolicy.ARG0 means only the first argument's nullability is checked, but the second argument (delimiter) could be null at the Calcite expression level. Changing to NullPolicy.ANY would ensure consistent null handling for both arguments, though the arrayToCsv method already handles null delimiters gracefully.

Low
General
Use safer null-tolerant string conversion

When an element is null, the delimiter is still appended but no value is added,
which produces correct output (e.g., "a,,b"). However, the leading-null case (e.g.,
[null, "a"]) produces ",a" — a leading delimiter — which may be unexpected. The
current behavior is tested and accepted, but consider using String.join with
null-to-empty-string mapping via streams for clarity and correctness consistency.
More critically, using String.valueOf(element) instead of element.toString() avoids
a potential NullPointerException if the null check is ever removed or refactored.

core/src/main/java/org/opensearch/sql/expression/function/CollectionUDF/ArrayToCsvFunctionImpl.java [99-107]

 for (int i = 0; i < array.size(); i++) {
   if (i > 0) {
     result.append(delimiter);
   }
   Object element = array.get(i);
   if (element != null) {
-    result.append(element.toString());
+    result.append(String.valueOf(element));
   }
 }
Suggestion importance[1-10]: 2

__

Why: The change from element.toString() to String.valueOf(element) is functionally equivalent here since there's already a null check (if (element != null)) guarding the call. The suggestion provides minimal safety improvement since the null check already prevents NPE.

Low

@github-actions
Copy link
Copy Markdown
Contributor

github-actions Bot commented May 6, 2026

Persistent review updated to latest commit baa3553

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[FEATURE] Add ARRAY_TO_CSV helper function

1 participant