Skip to content

Conversation

@srikanthpadakanti
Copy link

Description

This pull request adds native support for the mvexpand command in PPL to OpenSearch SQL, enabling users to expand multivalue fields (arrays) into separate rows directly within queries. This functionality is analogous to Splunk's mvexpand command and streamlines analytics, dashboarding, and data preparation involving arrays or multivalue fields.

Key features introduced:

Native mvexpand command for PPL queries to expand array fields into separate rows/events.
Optional limit parameter to restrict the number of expanded values per event/document.
Robust handling of empty/null arrays, large arrays (with memory/resource limits), and non-array fields.
Streaming/distributable execution for performance and scalability.
Comprehensive documentation and edge case coverage.
This feature makes OpenSearch SQL more powerful and user-friendly for log analytics, data exploration, and migration from platforms like Splunk.

Related Issues

Resolves #4439
#4439

Check List

  • [ X] New functionality includes testing.
  • [ X] New functionality has been documented.
  • [ X] New functionality has javadoc added.
  • [ X] New functionality has a user manual doc added.
  • [ X] New PPL command checklist all confirmed.
  • [ X] API changes companion pull request created.
  • [ X] Commits are signed per the DCO using --signoff or -s.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Srikanth Padakanti added 30 commits November 22, 2025 14:42
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
@srikanthpadakanti
Copy link
Author

srikanthpadakanti commented Jan 5, 2026

@ykmr1224
can you please trigger the workflows and review my changes. thanks.

dai-chen
dai-chen previously approved these changes Jan 6, 2026

PPL query:
```ppl
source=mvexpand_logs | where case='basic' | mvexpand tags | fields tags.value
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you explain the reason why we make this deviation?

Comment on lines 1 to 7
calcite:
logical: |
LogicalSystemLimit(sort0=[$2], dir0=[ASC], fetch=[10000], type=[QUERY_SIZE_LIMIT])
LogicalProject(skills=[$0], username=[$3], VALUE=[$10])
LogicalValues(tuples=[[]])
physical: |
EnumerableValues(tuples=[[]]) No newline at end of file
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't see any plan related to mvexpand.

Comment on lines 1 to 7
calcite:
logical: |
LogicalSystemLimit(sort0=[$2], dir0=[ASC], fetch=[10000], type=[QUERY_SIZE_LIMIT])
LogicalProject(skills=[$0], username=[$3], VALUE=[$10])
LogicalValues(tuples=[[]])
physical: |
EnumerableValues(tuples=[[]]) No newline at end of file
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In
@integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java:
- Around line 318-324: The testMvexpandExplain currently loads the expected plan
into the variable expected via loadExpectedPlan("explain_mvexpand.yaml") but
never asserts it against the explain output; update the test to capture the
explain output by calling explainQueryYaml(...) (e.g., assign its return to a
variable like actual) and assert equality with expected using the same assertion
pattern used elsewhere in this class (e.g., assertEquals(expected, actual) or
the test helper assertion method), referencing testMvexpandExplain, expected,
loadExpectedPlan, and explainQueryYaml to locate the code to change.
🧹 Nitpick comments (1)
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java (1)

851-855: Shared expand/mvexpand pipeline with per-doc limit is well-structured; double-check limit validation

Good points here:

  • visitPatterns and visitExpand now both delegate to buildExpandRelNode(..., null, context), so existing expand semantics are preserved while centralizing the unnest + correlate logic.
  • buildExpandRelNode scopes perDocLimit via relBuilder.limit(0, perDocLimit) on the right-hand uncollect side, which correctly enforces a per-document cap for mvexpand without affecting other expand users.

One thing to verify: what are the intended semantics for limit <= 0 (including 0 and negative) in mvexpand?

  • Today, a non-null perDocLimit is passed straight to limit(0, perDocLimit), so limit=0 yields no expansions and any negative value is handed directly to Calcite’s RelBuilder.limit, which may be undefined or error-prone.
  • If the syntax/analysis layer already enforces limit > 0, consider documenting that assumption here. Otherwise, it may be safer to guard and throw a SemanticCheckException on invalid values or treat 0 as “no cap” by mapping it to null.

Functionally this is solid; the request is just to confirm and, if needed, tighten the validation path.

Also applies to: 3110-3123, 3434-3486

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 600637f and 79c9b9d.

📒 Files selected for processing (2)
  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
🧰 Additional context used
📓 Path-based instructions (7)
**/*.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

**/*.java: Use PascalCase for class names (e.g., QueryExecutor)
Use camelCase for method and variable names (e.g., executeQuery)
Use UPPER_SNAKE_CASE for constants (e.g., MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
Prefer Optional<T> for nullable returns in Java
Avoid unnecessary object creation in loops
Use StringBuilder for string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code

Files:

  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java

⚙️ CodeRabbit configuration file

**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring

  • Flag classes >500 lines as needing organization review
  • Check for dead code, unused imports, and unused variables
  • Identify code reuse opportunities across similar implementations
  • Assess holistic maintainability - is code easy to understand and modify?
  • Flag code that appears AI-generated without sufficient human review
  • Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
  • Check for proper JavaDoc on public classes and methods
  • Flag redundant comments that restate obvious code
  • Ensure proper error handling with specific exception types
  • Check for Optional usage instead of null returns
  • Validate proper use of try-with-resources for resource management

Files:

  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
core/src/main/java/**/*.java

⚙️ CodeRabbit configuration file

core/src/main/java/**/*.java: - New functions MUST have unit tests in the same commit

  • Public methods MUST have JavaDoc with @param, @return, and @throws
  • Follow existing function implementation patterns in the same package
  • New expression functions should follow ExpressionFunction interface patterns
  • Validate function naming follows project conventions (camelCase)

Files:

  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
**/calcite/**/*.java

⚙️ CodeRabbit configuration file

**/calcite/**/*.java: - Follow existing Calcite integration patterns

  • Verify RelNode visitor implementations are complete
  • Check RexNode handling follows project conventions
  • Validate SQL generation is correct and optimized
  • Ensure Calcite version compatibility
  • Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
  • Document any Calcite-specific workarounds
  • Test compatibility with Calcite version constraints

Files:

  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java

⚙️ CodeRabbit configuration file

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java: - Flag methods >50 lines - this file is known to be hard to read

  • Suggest extracting complex logic into helper methods
  • Check for code organization and logical grouping
  • Validate all RelNode types are handled

Files:

  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
integ-test/**/*IT.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

End-to-end scenarios need integration tests in integ-test/ module

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java

⚙️ CodeRabbit configuration file

integ-test/**/*IT.java: - Integration tests MUST use valid test data from resources

  • Verify test data files exist in integ-test/src/test/resources/
  • Check test assertions are meaningful and specific
  • Validate tests clean up resources after execution
  • Ensure tests are independent and can run in any order
  • Flag tests that reference non-existent indices (e.g., EMP)
  • Verify integration tests are in correct module (integ-test/)
  • Check tests can be run with ./gradlew :integ-test:integTest
  • Ensure proper test data setup and teardown
  • Validate end-to-end scenario coverage

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
**/*IT.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

Name integration tests with *IT.java suffix in OpenSearch SQL

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
**/test/**/*.java

⚙️ CodeRabbit configuration file

**/test/**/*.java: - Verify NULL input tests for all new functions

  • Check boundary condition tests (min/max values, empty inputs)
  • Validate error condition tests (invalid inputs, exceptions)
  • Ensure multi-document tests for per-document operations
  • Flag smoke tests without meaningful assertions
  • Check test naming follows pattern: test
  • Verify test data is realistic and covers edge cases
  • Verify test coverage for new business logic
  • Ensure tests are independent and don't rely on execution order
  • Validate meaningful test data that reflects real-world scenarios
  • Check for proper cleanup of test resources

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
🧠 Learnings (8)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Verify AST generation for new PPL parser syntax
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Update corresponding AST builder classes when making PPL grammar changes
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in `CalciteRelNodeVisitor` and `CalciteRexNodeVisitor` for Calcite integration
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in `CalciteRelNodeVisitor` and `CalciteRexNodeVisitor` for Calcite integration

Applied to files:

  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
📚 Learning: 2025-12-29T05:32:03.491Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 4993
File: opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableTopK.java:20-20
Timestamp: 2025-12-29T05:32:03.491Z
Learning: For any custom Calcite RelNode class (e.g., ones that extend EnumerableLimitSort or other Calcite RelNode types), always override the copy method. If copy is not overridden, cloning/copy operations may downgrade the instance to the parent class type, losing the custom behavior. In your implementation, ensure copy returns a new instance of the concrete class with all relevant fields and traits preserved, mirroring the current instance state.

Applied to files:

  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Document Calcite-specific workarounds in code

Applied to files:

  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Flag generic exception handling during code review

Applied to files:

  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Use specific exception types with meaningful messages for error handling

Applied to files:

  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
📚 Learning: 2025-12-11T05:27:39.856Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 0
File: :0-0
Timestamp: 2025-12-11T05:27:39.856Z
Learning: In opensearch-project/sql, for SEMI and ANTI join types in CalciteRelNodeVisitor.java, the `max` option has no effect because these join types only use the left side to filter records based on the existence of matches in the right side. The join results are identical regardless of max value (max=1, max=2, or max=∞). The early return for SEMI/ANTI joins before processing the `max` option is intentional and correct behavior.

Applied to files:

  • core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes

Applied to files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
🧬 Code graph analysis (1)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java (1)
integ-test/src/test/java/org/opensearch/sql/sql/IdentifierIT.java (1)
  • Index (218-262)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
  • GitHub Check: WhiteSource Security Check
🔇 Additional comments (1)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java (1)

31-49: MVEXPAND edge-cases index wiring looks correct

Loading Index.MVEXPAND_EDGE_CASES in init() is consistent with the new mvexpand explain test and follows the existing pattern used for other indices. No changes needed here.

Srikanth Padakanti added 2 commits January 7, 2026 13:52
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In
@ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java:
- Around line 148-166: The tests testMvExpandEmptyOrNullArray,
testMvExpandWithDuplicates, testMvExpandLargeArray, and
testMvExpandPrimitiveArray currently only call assertMvexpandPlanned (verifying
planning/Uncollect) but not runtime behavior; update each to either (a) convert
to an integration-style test that executes the query and asserts concrete
results (e.g., for testMvExpandEmptyOrNullArray assert the result set is empty,
for testMvExpandWithDuplicates assert duplicate rows are/aren’t produced per
spec, for testMvExpandLargeArray assert behavior under large payloads or
expected error/memory handling, and for testMvExpandPrimitiveArray assert
expanded primitive values and types) or (b) enhance the unit assertions to
inspect the logical/physical plan nodes produced by assertMvexpandPlanned to
validate predicates, Uncollect node contents and null/empty handling rather than
only existence; locate usages of assertMvexpandPlanned and the named test
methods to implement these runtime assertions or deeper plan inspections.
🧹 Nitpick comments (1)
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java (1)

118-130: Consider verifying SQL generation for the limit parameter.

While the test confirms that planning succeeds and limit-related operators appear in the plan, it doesn't verify the actual SQL output contains the limit clause. Consider adding a verifyPPLToSparkSQL() call to ensure the generated SQL correctly implements the per-document limit.

Based on learnings, Calcite integration changes should test SQL generation and optimization paths.

💡 Example verification
 @Test
 public void testMvExpandWithLimitParameter() {
   String ppl = "source=DEPT | mvexpand EMPNOS limit=2";
   RelNode root = getRelNode(ppl);
   String plan = root.explain();
   Assert.assertTrue("Expected Uncollect in plan but got:\n" + plan, plan.contains("Uncollect"));
   Assert.assertTrue(
       "Expected limit-related operator in plan but got:\n" + plan,
       plan.contains("fetch=")
           || plan.contains("LIMIT")
           || plan.contains("RowNumber")
           || plan.contains("Window"));
+  // Verify the generated SQL includes the limit
+  String sql = toSparkSql(root);
+  Assert.assertTrue("Expected LIMIT or FETCH in SQL but got:\n" + sql, 
+      sql.contains("LIMIT") || sql.contains("FETCH"));
 }
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 19c2065 and 16dbaad.

📒 Files selected for processing (1)
  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
🧰 Additional context used
📓 Path-based instructions (5)
**/*.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

**/*.java: Use PascalCase for class names (e.g., QueryExecutor)
Use camelCase for method and variable names (e.g., executeQuery)
Use UPPER_SNAKE_CASE for constants (e.g., MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
Prefer Optional<T> for nullable returns in Java
Avoid unnecessary object creation in loops
Use StringBuilder for string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code

Files:

  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java

⚙️ CodeRabbit configuration file

**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring

  • Flag classes >500 lines as needing organization review
  • Check for dead code, unused imports, and unused variables
  • Identify code reuse opportunities across similar implementations
  • Assess holistic maintainability - is code easy to understand and modify?
  • Flag code that appears AI-generated without sufficient human review
  • Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
  • Check for proper JavaDoc on public classes and methods
  • Flag redundant comments that restate obvious code
  • Ensure proper error handling with specific exception types
  • Check for Optional usage instead of null returns
  • Validate proper use of try-with-resources for resource management

Files:

  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
**/*Test.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

**/*Test.java: All new business logic requires unit tests
Name unit tests with *Test.java suffix in OpenSearch SQL

Files:

  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
**/test/**/*.java

⚙️ CodeRabbit configuration file

**/test/**/*.java: - Verify NULL input tests for all new functions

  • Check boundary condition tests (min/max values, empty inputs)
  • Validate error condition tests (invalid inputs, exceptions)
  • Ensure multi-document tests for per-document operations
  • Flag smoke tests without meaningful assertions
  • Check test naming follows pattern: test
  • Verify test data is realistic and covers edge cases
  • Verify test coverage for new business logic
  • Ensure tests are independent and don't rely on execution order
  • Validate meaningful test data that reflects real-world scenarios
  • Check for proper cleanup of test resources

Files:

  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
**/ppl/**/*.java

⚙️ CodeRabbit configuration file

**/ppl/**/*.java: - For PPL parser changes, verify grammar tests with positive/negative cases

  • Check AST generation for new syntax
  • Ensure corresponding AST builder classes are updated
  • Validate edge cases and boundary conditions

Files:

  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
**/calcite/**/*.java

⚙️ CodeRabbit configuration file

**/calcite/**/*.java: - Follow existing Calcite integration patterns

  • Verify RelNode visitor implementations are complete
  • Check RexNode handling follows project conventions
  • Validate SQL generation is correct and optimized
  • Ensure Calcite version compatibility
  • Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
  • Document any Calcite-specific workarounds
  • Test compatibility with Calcite version constraints

Files:

  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
🧠 Learnings (4)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Update corresponding AST builder classes when making PPL grammar changes
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Verify AST generation for new PPL parser syntax
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in `CalciteRelNodeVisitor` and `CalciteRexNodeVisitor` for Calcite integration
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes

Applied to files:

  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Document Calcite-specific workarounds in code

Applied to files:

  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
📚 Learning: 2025-12-29T05:32:03.491Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 4993
File: opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableTopK.java:20-20
Timestamp: 2025-12-29T05:32:03.491Z
Learning: For any custom Calcite RelNode class (e.g., ones that extend EnumerableLimitSort or other Calcite RelNode types), always override the copy method. If copy is not overridden, cloning/copy operations may downgrade the instance to the parent class type, losing the custom behavior. In your implementation, ensure copy returns a new instance of the concrete class with all relevant fields and traits preserved, mirroring the current instance state.

Applied to files:

  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 0

🧹 Nitpick comments (3)
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java (3)

85-96: Minor: Clarify comment about test table.

The comment says "empty table" but the table has a defined schema (DEPTNO, EMPNOS). Consider clarifying that the table has no data rows rather than being "empty" structurally.

-    // Add an empty table with name DEPT for test purpose
+    // Add a test table DEPT with array column (no data rows)
     schema.add("DEPT", new TableWithArray());

118-130: Consider strengthening the limit parameter assertion.

The test verifies that some limit-related operator exists but doesn't confirm the actual limit value 2 is correctly applied. The broad OR-condition (fetch= || LIMIT || RowNumber || Window) may pass even if the limit isn't correctly wired.

Consider verifying the specific limit value appears in the plan:

     Assert.assertTrue(
         "Expected limit-related operator in plan but got:\n" + plan,
-        plan.contains("fetch=")
-            || plan.contains("LIMIT")
-            || plan.contains("RowNumber")
-            || plan.contains("Window"));
+        plan.contains("fetch=[2]") || plan.contains("LIMIT 2"));
   }

Alternatively, add a second test with a different limit value (e.g., limit=5) to ensure the parameter is correctly parsed and propagated.


157-160: Redundant test - identical query to testMvExpandBasic.

testMvExpandPrimitiveArray uses the exact same PPL query as testMvExpandBasic (source=DEPT | mvexpand EMPNOS), but with weaker assertions. Since the EMPNOS column is already defined as INTEGER ARRAY in TableWithArray, this test doesn't add coverage.

Consider either removing this test or modifying it to test a different primitive array type (e.g., VARCHAR array) if that's a distinct code path worth validating.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 16dbaad and 1d0a56e.

📒 Files selected for processing (1)
  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
🧰 Additional context used
📓 Path-based instructions (5)
**/*.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

**/*.java: Use PascalCase for class names (e.g., QueryExecutor)
Use camelCase for method and variable names (e.g., executeQuery)
Use UPPER_SNAKE_CASE for constants (e.g., MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
Prefer Optional<T> for nullable returns in Java
Avoid unnecessary object creation in loops
Use StringBuilder for string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code

Files:

  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java

⚙️ CodeRabbit configuration file

**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring

  • Flag classes >500 lines as needing organization review
  • Check for dead code, unused imports, and unused variables
  • Identify code reuse opportunities across similar implementations
  • Assess holistic maintainability - is code easy to understand and modify?
  • Flag code that appears AI-generated without sufficient human review
  • Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
  • Check for proper JavaDoc on public classes and methods
  • Flag redundant comments that restate obvious code
  • Ensure proper error handling with specific exception types
  • Check for Optional usage instead of null returns
  • Validate proper use of try-with-resources for resource management

Files:

  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
**/*Test.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

**/*Test.java: All new business logic requires unit tests
Name unit tests with *Test.java suffix in OpenSearch SQL

Files:

  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
**/test/**/*.java

⚙️ CodeRabbit configuration file

**/test/**/*.java: - Verify NULL input tests for all new functions

  • Check boundary condition tests (min/max values, empty inputs)
  • Validate error condition tests (invalid inputs, exceptions)
  • Ensure multi-document tests for per-document operations
  • Flag smoke tests without meaningful assertions
  • Check test naming follows pattern: test
  • Verify test data is realistic and covers edge cases
  • Verify test coverage for new business logic
  • Ensure tests are independent and don't rely on execution order
  • Validate meaningful test data that reflects real-world scenarios
  • Check for proper cleanup of test resources

Files:

  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
**/ppl/**/*.java

⚙️ CodeRabbit configuration file

**/ppl/**/*.java: - For PPL parser changes, verify grammar tests with positive/negative cases

  • Check AST generation for new syntax
  • Ensure corresponding AST builder classes are updated
  • Validate edge cases and boundary conditions

Files:

  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
**/calcite/**/*.java

⚙️ CodeRabbit configuration file

**/calcite/**/*.java: - Follow existing Calcite integration patterns

  • Verify RelNode visitor implementations are complete
  • Check RexNode handling follows project conventions
  • Validate SQL generation is correct and optimized
  • Ensure Calcite version compatibility
  • Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
  • Document any Calcite-specific workarounds
  • Test compatibility with Calcite version constraints

Files:

  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
🧠 Learnings (4)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in `CalciteRelNodeVisitor` and `CalciteRexNodeVisitor` for Calcite integration
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes

Applied to files:

  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Document Calcite-specific workarounds in code

Applied to files:

  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
📚 Learning: 2025-12-29T05:32:03.491Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 4993
File: opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableTopK.java:20-20
Timestamp: 2025-12-29T05:32:03.491Z
Learning: For any custom Calcite RelNode class (e.g., ones that extend EnumerableLimitSort or other Calcite RelNode types), always override the copy method. If copy is not overridden, cloning/copy operations may downgrade the instance to the parent class type, losing the custom behavior. In your implementation, ensure copy returns a new instance of the concrete class with all relevant fields and traits preserved, mirroring the current instance state.

Applied to files:

  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
🔇 Additional comments (4)
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java (4)

44-83: LGTM!

The TableWithArray inner class correctly implements the Calcite Table interface with appropriate schema definition for testing array expansion. The JavaDoc properly documents the purpose and cross-references related test infrastructure.


98-116: LGTM!

Well-structured test that verifies both the logical plan structure (with Uncollect and LogicalCorrelate) and the Spark SQL translation. This provides good coverage for the basic mvexpand functionality.


132-140: LGTM!

This test appropriately verifies that projection after mvexpand is correctly planned with both Uncollect and LogicalProject operators in the plan.


142-155: Edge case tests are smoke tests only - consider adding behavioral assertions.

Per coding guidelines, NULL input tests and boundary condition tests should validate actual behavior. These tests only verify that planning succeeds with Uncollect in the plan, but don't validate:

  • testMvExpandEmptyOrNullArray: That null/empty arrays produce zero expanded rows
  • testMvExpandLargeArray: Memory limits or expected behavior with large data
  • testMvExpandWithDuplicates: How duplicates are handled in expansion

If full integration tests with data execution aren't feasible here, consider at minimum:

  1. Documenting that these are planning-only smoke tests (update method names or add comments)
  2. Adding TODO comments to track integration test coverage
  3. Verifying more plan structure (e.g., that the filter predicate is correctly placed relative to Uncollect)

Based on coding guidelines requiring boundary and NULL input test validation.

Also applies to: 162-167

Signed-off-by: Srikanth Padakanti <[email protected]>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In
@ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java:
- Line 34: Add a JavaDoc block to the public class CalcitePPLMvExpandTest
documenting its purpose: state that it tests Calcite query planning for the
mvexpand PPL command and list covered scenarios (basic expansion, limit
handling, null/empty arrays, and edge cases); place the JavaDoc immediately
above the class declaration "public class CalcitePPLMvExpandTest" and follow
project style (brief description, @see or @author if required).
🧹 Nitpick comments (3)
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java (3)

120-129: Strengthen limit parameter verification.

The test uses assertAnyContains which only checks if ANY limit-related token exists in the plan, but doesn't verify that the specific limit value (2) is actually applied. This could pass even if the limit parameter is ignored or incorrectly set.

Consider enhancing the assertion to verify the actual limit value is present in the plan, similar to how testMvExpandBasic validates the full logical plan structure.

♻️ Suggested enhancement
 @Test
 public void testMvExpandWithLimitParameter() {
   String ppl = "source=DEPT | mvexpand EMPNOS limit=2";
   RelNode root = getRelNode(ppl);
 
   assertContains(root, "LogicalCorrelate");
   assertContains(root, "Uncollect");
-
-  assertAnyContains(root, "fetch=", "LIMIT", "RowNumber", "Window");
+  
+  // Verify the limit value is actually applied
+  String plan = root.explain();
+  Assert.assertTrue(
+    "Expected plan to contain limit value of 2",
+    plan.contains("fetch=2") || plan.contains("LIMIT 2") || plan.contains("RowNumber") && plan.contains("2"));
 }

131-139: Consider verifying projection field selection.

While the test confirms that mvexpand works with subsequent projections, it only verifies that the operators exist in the plan. Consider adding assertions to verify that the projection actually selects the correct fields (DEPTNO, EMPNOS), similar to the approach in testMvExpandBasic.


34-167: Consider adding test coverage for documented edge cases.

The PR objectives document several edge cases that should be tested but are not covered in this test class:

  1. Non-array field behavior: The objectives state behavior should be defined for "non-array fields (specify whether non-array is treated as single-value, yields no rows, or errors)" but no test verifies this scenario.

  2. Resource limit errors: The objectives mention "configurable memory/resource limits and surface clear, actionable error messaging when limits are exceeded" but no test verifies error handling when limits are exceeded.

  3. Multiple multivalue fields: The objectives mention "multiple multivalue fields" as an edge case but this scenario is not tested.

Per coding guidelines, error condition tests should validate invalid inputs and exceptions. Consider adding tests for these documented edge cases to ensure completeness.

Example test cases to consider
@Test
public void testMvExpandOnNonArrayField() {
  // Test behavior when mvexpand is applied to non-array field DEPTNO
  String ppl = "source=DEPT | mvexpand DEPTNO";
  // Verify expected behavior: error, single row, or no rows
}

@Test(expected = SomeResourceLimitException.class)
public void testMvExpandExceedsResourceLimit() {
  // Test that exceeding resource limits produces clear error
  // This may require integration test setup
}
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1d0a56e and d3651f8.

📒 Files selected for processing (1)
  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
🧰 Additional context used
📓 Path-based instructions (5)
**/*.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

**/*.java: Use PascalCase for class names (e.g., QueryExecutor)
Use camelCase for method and variable names (e.g., executeQuery)
Use UPPER_SNAKE_CASE for constants (e.g., MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
Prefer Optional<T> for nullable returns in Java
Avoid unnecessary object creation in loops
Use StringBuilder for string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code

Files:

  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java

⚙️ CodeRabbit configuration file

**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring

  • Flag classes >500 lines as needing organization review
  • Check for dead code, unused imports, and unused variables
  • Identify code reuse opportunities across similar implementations
  • Assess holistic maintainability - is code easy to understand and modify?
  • Flag code that appears AI-generated without sufficient human review
  • Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
  • Check for proper JavaDoc on public classes and methods
  • Flag redundant comments that restate obvious code
  • Ensure proper error handling with specific exception types
  • Check for Optional usage instead of null returns
  • Validate proper use of try-with-resources for resource management

Files:

  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
**/*Test.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

**/*Test.java: All new business logic requires unit tests
Name unit tests with *Test.java suffix in OpenSearch SQL

Files:

  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
**/test/**/*.java

⚙️ CodeRabbit configuration file

**/test/**/*.java: - Verify NULL input tests for all new functions

  • Check boundary condition tests (min/max values, empty inputs)
  • Validate error condition tests (invalid inputs, exceptions)
  • Ensure multi-document tests for per-document operations
  • Flag smoke tests without meaningful assertions
  • Check test naming follows pattern: test
  • Verify test data is realistic and covers edge cases
  • Verify test coverage for new business logic
  • Ensure tests are independent and don't rely on execution order
  • Validate meaningful test data that reflects real-world scenarios
  • Check for proper cleanup of test resources

Files:

  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
**/ppl/**/*.java

⚙️ CodeRabbit configuration file

**/ppl/**/*.java: - For PPL parser changes, verify grammar tests with positive/negative cases

  • Check AST generation for new syntax
  • Ensure corresponding AST builder classes are updated
  • Validate edge cases and boundary conditions

Files:

  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
**/calcite/**/*.java

⚙️ CodeRabbit configuration file

**/calcite/**/*.java: - Follow existing Calcite integration patterns

  • Verify RelNode visitor implementations are complete
  • Check RexNode handling follows project conventions
  • Validate SQL generation is correct and optimized
  • Ensure Calcite version compatibility
  • Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
  • Document any Calcite-specific workarounds
  • Test compatibility with Calcite version constraints

Files:

  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
🧠 Learnings (4)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in `CalciteRelNodeVisitor` and `CalciteRexNodeVisitor` for Calcite integration
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes

Applied to files:

  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Document Calcite-specific workarounds in code

Applied to files:

  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
📚 Learning: 2025-12-29T05:32:03.491Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 4993
File: opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableTopK.java:20-20
Timestamp: 2025-12-29T05:32:03.491Z
Learning: For any custom Calcite RelNode class (e.g., ones that extend EnumerableLimitSort or other Calcite RelNode types), always override the copy method. If copy is not overridden, cloning/copy operations may downgrade the instance to the parent class type, losing the custom behavior. In your implementation, ensure copy returns a new instance of the concrete class with all relevant fields and traits preserved, mirroring the current instance state.

Applied to files:

  • ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
🔇 Additional comments (4)
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java (4)

36-84: LGTM! Well-documented test fixture.

The TableWithArray inner class is well-documented and correctly implements the Calcite Table interface to provide a test table with array fields for mvexpand testing.


86-97: LGTM! Test configuration properly set up.

The config override correctly adds the test DEPT table and configures the Calcite planner for mvexpand testing.


99-118: LGTM! Comprehensive test with meaningful assertions.

This test properly validates both the logical plan structure and SQL generation for basic mvexpand functionality, including verification of LogicalCorrelate and Uncollect operators.


169-184: LGTM! Helper methods are correctly implemented.

The assertion helper methods provide clear error messages and serve their purpose for plan verification testing.

…fferent case by different document data

Signed-off-by: Srikanth Padakanti <[email protected]>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In
@integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java:
- Around line 26-70: The test in CalciteMvExpandCommandIT.init() embeds the
index mapping and bulk documents inline instead of loading the standard resource
files; change init() to read mvexpand_edge_cases_mapping.json and
mvexpand_edge_cases.json from integ-test/src/test/resources, pass the mapping
string to createIndex(INDEX, mapping) and expand the bulkInsert call to use the
documents loaded from the JSON resource (e.g., parse the resource array or
iterate resource lines) so the test follows the project pattern of using
mvexpand_edge_cases_mapping.json and mvexpand_edge_cases.json rather than
hard-coded literals.
🧹 Nitpick comments (4)
ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java (1)

658-666: Follow the established pattern for field extraction and remove redundant comment.

The method has two consistency concerns:

  1. Redundant comment (line 661): The comment "Always anonymize field names" is obvious in an anonymizer class—all visitor methods mask sensitive data. Based on learnings, redundant comments that restate the obvious should be avoided.

  2. Pattern inconsistency (line 661): Other similar visitor methods extract the field from the node and call visitExpression on it (see visitExpand line 462, visitFlatten line 629, visitRegex line 622). This method should follow the same pattern for maintainability.

♻️ Proposed refactor for consistency
 @Override
 public String visitMvExpand(MvExpand node, String context) {
   String child = node.getChild().get(0).accept(this, context);
-  String field = MASK_COLUMN; // Always anonymize field names
+  String field = visitExpression(node.getField());
   if (node.getLimit() != null) {
     return StringUtils.format("%s | mvexpand %s limit=%s", child, field, MASK_LITERAL);
   }
   return StringUtils.format("%s | mvexpand %s", child, field);
 }
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java (3)

291-311: Remove unused ID separator logic.

The bulkInsert method includes logic to parse explicit document IDs using a "|" separator (lines 297-300), but none of the test data uses this feature. All documents rely on auto-incrementing IDs.

This unused complexity violates the YAGNI (You Aren't Gonna Need It) principle and makes the code harder to understand.

♻️ Simplified implementation
  private static void bulkInsert(String index, String... docs) throws IOException {
    StringBuilder bulk = new StringBuilder();
    int nextAutoId = 1;
    for (String doc : docs) {
-     String id;
-     String json;
-     if (doc.contains("|")) {
-       String[] parts = doc.split("\\|", 2);
-       id = parts[0];
-       json = parts[1];
-     } else {
-       id = String.valueOf(nextAutoId++);
-       json = doc;
-     }
+     String id = String.valueOf(nextAutoId++);
      bulk.append("{\"index\":{\"_id\":").append(id).append("}}\n");
-     bulk.append(json).append("\n");
+     bulk.append(doc).append("\n");
    }
    Request request = new Request("POST", "/" + index + "/_bulk?refresh=true");
    request.setJsonEntity(bulk.toString());
    PPLIntegTestCase.adminClient().performRequest(request);
  }

260-272: Make semantic error assertion more specific.

The assertion at line 270 uses a broad condition checking for either "Cannot expand field" OR "Semantic" in the error message. This could match unrelated errors and reduce test reliability.

♻️ More specific assertion
  ResponseException ex = assertThrows(ResponseException.class, () -> executeQuery(query));
  Assertions.assertTrue(
-     ex.getMessage().contains("Cannot expand field") || ex.getMessage().contains("Semantic"),
-     "Expected semantic error for non-array field, got: " + ex.getMessage());
+     ex.getMessage().contains("Cannot expand field 'skills_int': expected ARRAY type but found INTEGER"),
+     "Expected semantic error for integer field, got: " + ex.getMessage());

26-70: Consider adding explicit test cleanup.

The test creates an index in init() and deletes it if it exists at the start, but there's no explicit cleanup after all tests complete (e.g., @AfterAll method). While the current approach works, explicit cleanup ensures resources are released even if tests fail mid-execution.

♻️ Add cleanup method
+ @org.junit.jupiter.api.AfterAll
+ public static void cleanup() throws IOException {
+   deleteIndexIfExists(INDEX);
+ }
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between d3651f8 and 5efd096.

📒 Files selected for processing (4)
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
  • integ-test/src/test/java/org/opensearch/sql/legacy/SQLIntegTestCase.java
  • integ-test/src/test/resources/mvexpand_edge_cases_mapping.json
  • ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java
🚧 Files skipped from review as they are similar to previous changes (2)
  • integ-test/src/test/resources/mvexpand_edge_cases_mapping.json
  • integ-test/src/test/java/org/opensearch/sql/legacy/SQLIntegTestCase.java
🧰 Additional context used
📓 Path-based instructions (6)
**/*.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

**/*.java: Use PascalCase for class names (e.g., QueryExecutor)
Use camelCase for method and variable names (e.g., executeQuery)
Use UPPER_SNAKE_CASE for constants (e.g., MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
Prefer Optional<T> for nullable returns in Java
Avoid unnecessary object creation in loops
Use StringBuilder for string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code

Files:

  • ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java

⚙️ CodeRabbit configuration file

**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring

  • Flag classes >500 lines as needing organization review
  • Check for dead code, unused imports, and unused variables
  • Identify code reuse opportunities across similar implementations
  • Assess holistic maintainability - is code easy to understand and modify?
  • Flag code that appears AI-generated without sufficient human review
  • Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
  • Check for proper JavaDoc on public classes and methods
  • Flag redundant comments that restate obvious code
  • Ensure proper error handling with specific exception types
  • Check for Optional usage instead of null returns
  • Validate proper use of try-with-resources for resource management

Files:

  • ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
**/ppl/**/*.java

⚙️ CodeRabbit configuration file

**/ppl/**/*.java: - For PPL parser changes, verify grammar tests with positive/negative cases

  • Check AST generation for new syntax
  • Ensure corresponding AST builder classes are updated
  • Validate edge cases and boundary conditions

Files:

  • ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java
integ-test/**/*IT.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

End-to-end scenarios need integration tests in integ-test/ module

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java

⚙️ CodeRabbit configuration file

integ-test/**/*IT.java: - Integration tests MUST use valid test data from resources

  • Verify test data files exist in integ-test/src/test/resources/
  • Check test assertions are meaningful and specific
  • Validate tests clean up resources after execution
  • Ensure tests are independent and can run in any order
  • Flag tests that reference non-existent indices (e.g., EMP)
  • Verify integration tests are in correct module (integ-test/)
  • Check tests can be run with ./gradlew :integ-test:integTest
  • Ensure proper test data setup and teardown
  • Validate end-to-end scenario coverage

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
**/*IT.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

Name integration tests with *IT.java suffix in OpenSearch SQL

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
**/test/**/*.java

⚙️ CodeRabbit configuration file

**/test/**/*.java: - Verify NULL input tests for all new functions

  • Check boundary condition tests (min/max values, empty inputs)
  • Validate error condition tests (invalid inputs, exceptions)
  • Ensure multi-document tests for per-document operations
  • Flag smoke tests without meaningful assertions
  • Check test naming follows pattern: test
  • Verify test data is realistic and covers edge cases
  • Verify test coverage for new business logic
  • Ensure tests are independent and don't rely on execution order
  • Validate meaningful test data that reflects real-world scenarios
  • Check for proper cleanup of test resources

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
**/calcite/**/*.java

⚙️ CodeRabbit configuration file

**/calcite/**/*.java: - Follow existing Calcite integration patterns

  • Verify RelNode visitor implementations are complete
  • Check RexNode handling follows project conventions
  • Validate SQL generation is correct and optimized
  • Ensure Calcite version compatibility
  • Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
  • Document any Calcite-specific workarounds
  • Test compatibility with Calcite version constraints

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
🧠 Learnings (6)
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Flag redundant or obvious comments during code review

Applied to files:

  • ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Avoid comments that merely restate what the code does

Applied to files:

  • ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in `CalciteRelNodeVisitor` and `CalciteRexNodeVisitor` for Calcite integration

Applied to files:

  • ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes

Applied to files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*IT.java : Name integration tests with `*IT.java` suffix in OpenSearch SQL

Applied to files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
📚 Learning: 2025-12-29T05:32:03.491Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 4993
File: opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableTopK.java:20-20
Timestamp: 2025-12-29T05:32:03.491Z
Learning: For any custom Calcite RelNode class (e.g., ones that extend EnumerableLimitSort or other Calcite RelNode types), always override the copy method. If copy is not overridden, cloning/copy operations may downgrade the instance to the parent class type, losing the custom behavior. In your implementation, ensure copy returns a new instance of the concrete class with all relevant fields and traits preserved, mirroring the current instance state.

Applied to files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
🧬 Code graph analysis (1)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java (2)
integ-test/src/test/java/org/opensearch/sql/ppl/PPLIntegTestCase.java (1)
  • PPLIntegTestCase (36-415)
integ-test/src/test/java/org/opensearch/sql/sql/IdentifierIT.java (1)
  • Index (218-262)
🔇 Additional comments (4)
ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java (1)

85-85: LGTM!

The import for MvExpand is correctly added and properly positioned alphabetically.

integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java (3)

72-258: Excellent test coverage and organization.

The test suite comprehensively covers mvexpand edge cases including:

  • Empty, null, and missing arrays
  • Single and multiple elements
  • Duplicates and partial data
  • Mixed shapes and heterogeneous types
  • Schema flattening
  • Limit parameters
  • Large arrays
  • Semantic errors for non-array fields

Test methods are focused, independent, and follow clear naming conventions. The use of appropriate matchers (verifyDataRows, verifySchema, verifyNumOfRows) ensures meaningful assertions.


72-80: This test correctly implements mvexpand semantics for nested objects and aligns with documented behavior.

The mvexpand skills | fields username, skills.name pattern is the intended design. After expanding an array of objects, you can project nested subfields using dot notation. The official mvexpand documentation explicitly states: "If the array contains objects, you can reference subfields (for example, skills.name)." All 10+ test cases in this file consistently verify this behavior across edge cases (empty arrays, null values, missing fields, mixed object shapes, and partial objects), confirming the implementation is correct.


274-316: Remove duplicate helper methods.

A previous review comment indicated that these helper methods already exist in SQLIntegTestCase and should be reused. The local implementations of createIndex, deleteIndexIfExists, bulkInsert, and refreshIndex duplicate parent class functionality.

Please use the existing helper methods from SQLIntegTestCase or PPLIntegTestCase to maintain consistency across the test suite and reduce code duplication.

#!/bin/bash
# Search for existing index management helpers in parent test classes
ast-grep --pattern $'class SQLIntegTestCase {
  $$$
  createIndex($$$) {
    $$$
  }
  $$$
}'

rg -nP --type=java 'protected.*createIndex|protected.*deleteIndex|protected.*bulkInsert' integ-test/src/test/java/org/opensearch/sql/

Likely an incorrect or invalid review comment.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 3

🤖 Fix all issues with AI agents
In
@integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java:
- Around line 26-70: The init() method currently builds the index mapping and
bulk documents inline (mapping variable, createIndex(INDEX, mapping), and
bulkInsert(...)); replace those inline strings by loading the mapping and test
documents from resources using the loadFromFile helper (e.g.,
loadFromFile("mvexpand_edge_cases_mapping.json") and
loadFromFile("mvexpand_edge_cases.json")), then pass the loaded mapping to
createIndex and the loaded documents to bulkInsert (or adapt bulkInsert to
accept the parsed resource entries) so the test uses
integ-test/src/test/resources/mvexpand_edge_cases_mapping.json and
mvexpand_edge_cases.json instead of hard-coded JSON.
- Around line 72-82: Remove the dead, commented-out test block for
testMvexpandSingleElement in the CalciteMvExpandCommandIT class: delete the
commented @Test annotation and the entire testMvexpandSingleElement method block
(the String query, executeQuery call, and verifyDataRows call) so the file
contains no leftover commented test code; keep formatting intact and rely on VCS
for history.
- Around line 319-361: The four local helper methods createIndex,
deleteIndexIfExists, bulkInsert, and refreshIndex duplicate utilities provided
by SQLIntegTestCase; remove these methods from CalciteMvExpandCommandIT and
replace any calls to them with the corresponding inherited methods from
SQLIntegTestCase (use the class's index creation/deletion, bulk/indexing, and
refresh helpers instead). After removing the local methods, update call sites to
the inherited utilities, clean up any now-unused imports, and run tests to
ensure behavior matches the original helpers.
🧹 Nitpick comments (1)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java (1)

106-125: Consider inlining or using standard matchers for single-use helper.

This helper is used only once (Line 92) and adds complexity. Consider inlining the assertion logic in the test method for better readability, or verify if standard matchers from MatcherUtils can handle nested field validation.

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 5efd096 and e067b46.

📒 Files selected for processing (1)
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
🧰 Additional context used
📓 Path-based instructions (5)
**/*.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

**/*.java: Use PascalCase for class names (e.g., QueryExecutor)
Use camelCase for method and variable names (e.g., executeQuery)
Use UPPER_SNAKE_CASE for constants (e.g., MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
Prefer Optional<T> for nullable returns in Java
Avoid unnecessary object creation in loops
Use StringBuilder for string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java

⚙️ CodeRabbit configuration file

**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring

  • Flag classes >500 lines as needing organization review
  • Check for dead code, unused imports, and unused variables
  • Identify code reuse opportunities across similar implementations
  • Assess holistic maintainability - is code easy to understand and modify?
  • Flag code that appears AI-generated without sufficient human review
  • Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
  • Check for proper JavaDoc on public classes and methods
  • Flag redundant comments that restate obvious code
  • Ensure proper error handling with specific exception types
  • Check for Optional usage instead of null returns
  • Validate proper use of try-with-resources for resource management

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
integ-test/**/*IT.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

End-to-end scenarios need integration tests in integ-test/ module

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java

⚙️ CodeRabbit configuration file

integ-test/**/*IT.java: - Integration tests MUST use valid test data from resources

  • Verify test data files exist in integ-test/src/test/resources/
  • Check test assertions are meaningful and specific
  • Validate tests clean up resources after execution
  • Ensure tests are independent and can run in any order
  • Flag tests that reference non-existent indices (e.g., EMP)
  • Verify integration tests are in correct module (integ-test/)
  • Check tests can be run with ./gradlew :integ-test:integTest
  • Ensure proper test data setup and teardown
  • Validate end-to-end scenario coverage

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
**/*IT.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

Name integration tests with *IT.java suffix in OpenSearch SQL

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
**/test/**/*.java

⚙️ CodeRabbit configuration file

**/test/**/*.java: - Verify NULL input tests for all new functions

  • Check boundary condition tests (min/max values, empty inputs)
  • Validate error condition tests (invalid inputs, exceptions)
  • Ensure multi-document tests for per-document operations
  • Flag smoke tests without meaningful assertions
  • Check test naming follows pattern: test
  • Verify test data is realistic and covers edge cases
  • Verify test coverage for new business logic
  • Ensure tests are independent and don't rely on execution order
  • Validate meaningful test data that reflects real-world scenarios
  • Check for proper cleanup of test resources

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
**/calcite/**/*.java

⚙️ CodeRabbit configuration file

**/calcite/**/*.java: - Follow existing Calcite integration patterns

  • Verify RelNode visitor implementations are complete
  • Check RexNode handling follows project conventions
  • Validate SQL generation is correct and optimized
  • Ensure Calcite version compatibility
  • Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
  • Document any Calcite-specific workarounds
  • Test compatibility with Calcite version constraints

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
🧠 Learnings (5)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Verify AST generation for new PPL parser syntax
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*IT.java : Name integration tests with `*IT.java` suffix in OpenSearch SQL

Applied to files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes

Applied to files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to integ-test/**/*IT.java : End-to-end scenarios need integration tests in `integ-test/` module

Applied to files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
📚 Learning: 2025-12-29T05:32:03.491Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 4993
File: opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableTopK.java:20-20
Timestamp: 2025-12-29T05:32:03.491Z
Learning: For any custom Calcite RelNode class (e.g., ones that extend EnumerableLimitSort or other Calcite RelNode types), always override the copy method. If copy is not overridden, cloning/copy operations may downgrade the instance to the parent class type, losing the custom behavior. In your implementation, ensure copy returns a new instance of the concrete class with all relevant fields and traits preserved, mirroring the current instance state.

Applied to files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
🧬 Code graph analysis (1)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java (2)
integ-test/src/test/java/org/opensearch/sql/ppl/PPLIntegTestCase.java (1)
  • PPLIntegTestCase (36-415)
integ-test/src/test/java/org/opensearch/sql/sql/IdentifierIT.java (1)
  • Index (218-262)
🔇 Additional comments (1)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java (1)

84-317: Excellent test coverage for mvexpand edge cases.

The test suite comprehensively covers:

  • Single and multiple element expansion (Lines 84-100, 171-179)
  • Edge cases: empty arrays, null arrays, missing fields (Lines 128-157)
  • Duplicate values (Lines 160-168)
  • Partial/mixed element shapes (Lines 182-205)
  • Schema flattening and type inference (Lines 208-278)
  • Limit parameter (Lines 257-266)
  • Large arrays (Lines 281-303)
  • Error handling for non-array fields (Lines 230-242, 306-317)

Test naming follows conventions, assertions are specific, and the tests appear independent. Great work on thorough validation of mvexpand behavior.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 2

🤖 Fix all issues with AI agents
In @docs/user/ppl/cmd/mvexpand.md:
- Around line 7-10: The fenced code block showing the mvexpand syntax is missing
a language identifier; update the triple-backtick fence for the snippet
"mvexpand <field> [limit=<int>]" to include the ppl language (i.e., change ```
to ```ppl) so it matches other examples and satisfies markdownlint MD040.
- Line 4: The description for the mvexpand command currently includes a
comparative phrase "similar to Splunk's `mvexpand` command"; remove that
comparative wording and rewrite the sentence to describe only the OpenSearch
implementation—e.g., state that the `mvexpand` command expands each value in a
multivalue (array) field into a separate row and that for each document every
element in the specified array field is returned as a new row, keeping the
reference to `mvexpand` but omitting any mention of Splunk.
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between e067b46 and 514a7dc.

📒 Files selected for processing (2)
  • docs/user/ppl/cmd/mvexpand.md
  • docs/user/ppl/index.md
🚧 Files skipped from review as they are similar to previous changes (1)
  • docs/user/ppl/index.md
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Update corresponding AST builder classes when making PPL grammar changes
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Verify AST generation for new PPL parser syntax
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: For PPL command PRs, refer docs/dev/ppl-commands.md and verify the PR satisfies the checklist

Applied to files:

  • docs/user/ppl/cmd/mvexpand.md
🪛 markdownlint-cli2 (0.18.1)
docs/user/ppl/cmd/mvexpand.md

8-8: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

🔇 Additional comments (2)
docs/user/ppl/cmd/mvexpand.md (2)

12-23: LGTM.

Field descriptions and output naming guidance are clear and consistent with the design intent.


15-18: Reposition the "Notes about these doctests" section.

This section is not inherent to the mvexpand command specification and interrupts the natural flow (Syntax → Output Naming → Examples). Per prior feedback, either remove it entirely or move it to the end of the documentation as procedural guidance.

💡 Suggested actions

Option 1 (Preferred): Move lines 15–18 to the end of the document (after all examples), as a meta-note about how the examples are constructed.

Option 2: Remove the section if it's redundant; the examples themselves make it clear that array() is being used.

Likely an incorrect or invalid review comment.

Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In @docs/user/ppl/cmd/mvexpand.md:
- Around line 7-10: The fenced code block under the "Syntax" section is missing
a language identifier (MD040); update that block to include the same language
tag used elsewhere (use "ppl") so the block becomes a fenced code block with the
ppl language identifier, matching other examples like the blocks at lines 27 and
52 for linting consistency and clarity.
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 514a7dc and 9e8ea2c.

📒 Files selected for processing (1)
  • docs/user/ppl/cmd/mvexpand.md
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Verify AST generation for new PPL parser syntax
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Update corresponding AST builder classes when making PPL grammar changes
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: For PPL command PRs, refer docs/dev/ppl-commands.md and verify the PR satisfies the checklist

Applied to files:

  • docs/user/ppl/cmd/mvexpand.md
🪛 markdownlint-cli2 (0.18.1)
docs/user/ppl/cmd/mvexpand.md

8-8: Fenced code blocks should have a language specified

(MD040, fenced-code-language)

…instead of checking message

Signed-off-by: Srikanth Padakanti <[email protected]>
Srikanth Padakanti added 2 commits January 7, 2026 16:38
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Copy link
Contributor

@coderabbitai coderabbitai bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In
@integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java:
- Around line 26-67: Replace the hard-coded mapping and inline bulkInsert JSONs
in init() with the resource-based loading helpers: call
loadFromFile("mvexpand_edge_cases_mapping.json") to get the mapping string and
pass it to createIndex(INDEX, mapping), then call
loadFromFile("mvexpand_edge_cases.json") to get the test documents and feed
those documents into the existing bulk insertion flow (e.g., split or parse the
file contents into individual JSON docs and pass them to bulkInsert(INDEX,
...)). Keep the existing refreshIndex(INDEX) call and remove the inline mapping
and JSON literals so the test uses the resource files.
📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 00f440b and 50031ce.

📒 Files selected for processing (1)
  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
🧰 Additional context used
📓 Path-based instructions (5)
**/*.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

**/*.java: Use PascalCase for class names (e.g., QueryExecutor)
Use camelCase for method and variable names (e.g., executeQuery)
Use UPPER_SNAKE_CASE for constants (e.g., MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
Prefer Optional<T> for nullable returns in Java
Avoid unnecessary object creation in loops
Use StringBuilder for string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java

⚙️ CodeRabbit configuration file

**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring

  • Flag classes >500 lines as needing organization review
  • Check for dead code, unused imports, and unused variables
  • Identify code reuse opportunities across similar implementations
  • Assess holistic maintainability - is code easy to understand and modify?
  • Flag code that appears AI-generated without sufficient human review
  • Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
  • Check for proper JavaDoc on public classes and methods
  • Flag redundant comments that restate obvious code
  • Ensure proper error handling with specific exception types
  • Check for Optional usage instead of null returns
  • Validate proper use of try-with-resources for resource management

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
integ-test/**/*IT.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

End-to-end scenarios need integration tests in integ-test/ module

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java

⚙️ CodeRabbit configuration file

integ-test/**/*IT.java: - Integration tests MUST use valid test data from resources

  • Verify test data files exist in integ-test/src/test/resources/
  • Check test assertions are meaningful and specific
  • Validate tests clean up resources after execution
  • Ensure tests are independent and can run in any order
  • Flag tests that reference non-existent indices (e.g., EMP)
  • Verify integration tests are in correct module (integ-test/)
  • Check tests can be run with ./gradlew :integ-test:integTest
  • Ensure proper test data setup and teardown
  • Validate end-to-end scenario coverage

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
**/*IT.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

Name integration tests with *IT.java suffix in OpenSearch SQL

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
**/test/**/*.java

⚙️ CodeRabbit configuration file

**/test/**/*.java: - Verify NULL input tests for all new functions

  • Check boundary condition tests (min/max values, empty inputs)
  • Validate error condition tests (invalid inputs, exceptions)
  • Ensure multi-document tests for per-document operations
  • Flag smoke tests without meaningful assertions
  • Check test naming follows pattern: test
  • Verify test data is realistic and covers edge cases
  • Verify test coverage for new business logic
  • Ensure tests are independent and don't rely on execution order
  • Validate meaningful test data that reflects real-world scenarios
  • Check for proper cleanup of test resources

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
**/calcite/**/*.java

⚙️ CodeRabbit configuration file

**/calcite/**/*.java: - Follow existing Calcite integration patterns

  • Verify RelNode visitor implementations are complete
  • Check RexNode handling follows project conventions
  • Validate SQL generation is correct and optimized
  • Ensure Calcite version compatibility
  • Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
  • Document any Calcite-specific workarounds
  • Test compatibility with Calcite version constraints

Files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
🧠 Learnings (5)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Verify AST generation for new PPL parser syntax
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Update corresponding AST builder classes when making PPL grammar changes
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in `CalciteRelNodeVisitor` and `CalciteRexNodeVisitor` for Calcite integration
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*IT.java : Name integration tests with `*IT.java` suffix in OpenSearch SQL

Applied to files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes

Applied to files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to integ-test/**/*IT.java : End-to-end scenarios need integration tests in `integ-test/` module

Applied to files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
📚 Learning: 2025-12-29T05:32:03.491Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 4993
File: opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableTopK.java:20-20
Timestamp: 2025-12-29T05:32:03.491Z
Learning: For any custom Calcite RelNode class (e.g., ones that extend EnumerableLimitSort or other Calcite RelNode types), always override the copy method. If copy is not overridden, cloning/copy operations may downgrade the instance to the parent class type, losing the custom behavior. In your implementation, ensure copy returns a new instance of the concrete class with all relevant fields and traits preserved, mirroring the current instance state.

Applied to files:

  • integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
🧬 Code graph analysis (1)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java (2)
integ-test/src/test/java/org/opensearch/sql/ppl/PPLIntegTestCase.java (1)
  • PPLIntegTestCase (36-415)
integ-test/src/test/java/org/opensearch/sql/sql/IdentifierIT.java (1)
  • Index (218-262)
🔇 Additional comments (2)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java (2)

304-346: Remove duplicate helper methods and use inherited utilities.

These helper methods (createIndex, deleteIndexIfExists, bulkInsert, refreshIndex) duplicate functionality available in SQLIntegTestCase. A previous review indicated these should be replaced with existing helpers, but they remain in the file.

Use the inherited methods from SQLIntegTestCase to follow DRY principles and maintain consistency with other integration tests.

Based on coding guidelines: "Identify code reuse opportunities across similar implementations."

⛔ Skipped due to learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*IT.java : Name integration tests with `*IT.java` suffix in OpenSearch SQL

69-85: OpenSearch SQL mvexpand behavior diverges from Splunk—clarify if this is intentional.

This test validates OpenSearch SQL's mvexpand on typed nested fields: projecting the full field returns the nested object {"name":"go"}, while projecting skills.name returns the scalar "go". However, Splunk's mvexpand works fundamentally differently—it expands multivalue fields and requires explicit spath JSON parsing to extract nested fields. Verify whether OpenSearch SQL's implementation (type-aware nested field expansion) is intentionally different from Splunk's approach, or if Splunk compatibility is required. If compatibility is mandatory, the implementation and test approach may need adjustment.

@srikanthpadakanti
Copy link
Author

@penghuo @anasalkouz @dai-chen Can you please review this. Thanks

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

backport 2.19-dev enhancement New feature or request PPL Piped processing language

Projects

None yet

Development

Successfully merging this pull request may close these issues.

Add native support for mvexpand command in PPL

4 participants