-
Notifications
You must be signed in to change notification settings - Fork 181
Mvexpand feature #4944
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Mvexpand feature #4944
Conversation
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
…it supposed to Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
|
@ykmr1224 |
ppl/src/test/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizerTest.java
Show resolved
Hide resolved
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
Outdated
Show resolved
Hide resolved
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
Outdated
Show resolved
Hide resolved
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
Show resolved
Hide resolved
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
Outdated
Show resolved
Hide resolved
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
Outdated
Show resolved
Hide resolved
docs/user/ppl/cmd/mvexpand.md
Outdated
|
|
||
| PPL query: | ||
| ```ppl | ||
| source=mvexpand_logs | where case='basic' | mvexpand tags | fields tags.value |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you explain the reason why we make this deviation?
| calcite: | ||
| logical: | | ||
| LogicalSystemLimit(sort0=[$2], dir0=[ASC], fetch=[10000], type=[QUERY_SIZE_LIMIT]) | ||
| LogicalProject(skills=[$0], username=[$3], VALUE=[$10]) | ||
| LogicalValues(tuples=[[]]) | ||
| physical: | | ||
| EnumerableValues(tuples=[[]]) No newline at end of file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still don't see any plan related to mvexpand.
| calcite: | ||
| logical: | | ||
| LogicalSystemLimit(sort0=[$2], dir0=[ASC], fetch=[10000], type=[QUERY_SIZE_LIMIT]) | ||
| LogicalProject(skills=[$0], username=[$3], VALUE=[$10]) | ||
| LogicalValues(tuples=[[]]) | ||
| physical: | | ||
| EnumerableValues(tuples=[[]]) No newline at end of file |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ditto
ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java
Outdated
Show resolved
Hide resolved
Signed-off-by: Srikanth Padakanti <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In
@integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java:
- Around line 318-324: The testMvexpandExplain currently loads the expected plan
into the variable expected via loadExpectedPlan("explain_mvexpand.yaml") but
never asserts it against the explain output; update the test to capture the
explain output by calling explainQueryYaml(...) (e.g., assign its return to a
variable like actual) and assert equality with expected using the same assertion
pattern used elsewhere in this class (e.g., assertEquals(expected, actual) or
the test helper assertion method), referencing testMvexpandExplain, expected,
loadExpectedPlan, and explainQueryYaml to locate the code to change.
🧹 Nitpick comments (1)
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java (1)
851-855: Shared expand/mvexpand pipeline with per-doc limit is well-structured; double-check limit validationGood points here:
visitPatternsandvisitExpandnow both delegate tobuildExpandRelNode(..., null, context), so existing expand semantics are preserved while centralizing the unnest + correlate logic.buildExpandRelNodescopesperDocLimitviarelBuilder.limit(0, perDocLimit)on the right-hand uncollect side, which correctly enforces a per-document cap for mvexpand without affecting other expand users.One thing to verify: what are the intended semantics for
limit <= 0(including 0 and negative) inmvexpand?
- Today, a non-null
perDocLimitis passed straight tolimit(0, perDocLimit), solimit=0yields no expansions and any negative value is handed directly to Calcite’sRelBuilder.limit, which may be undefined or error-prone.- If the syntax/analysis layer already enforces
limit > 0, consider documenting that assumption here. Otherwise, it may be safer to guard and throw aSemanticCheckExceptionon invalid values or treat0as “no cap” by mapping it tonull.Functionally this is solid; the request is just to confirm and, if needed, tighten the validation path.
Also applies to: 3110-3123, 3434-3486
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
🧰 Additional context used
📓 Path-based instructions (7)
**/*.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*.java: UsePascalCasefor class names (e.g.,QueryExecutor)
UsecamelCasefor method and variable names (e.g.,executeQuery)
UseUPPER_SNAKE_CASEfor constants (e.g.,MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
PreferOptional<T>for nullable returns in Java
Avoid unnecessary object creation in loops
UseStringBuilderfor string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code
Files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
⚙️ CodeRabbit configuration file
**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring
- Flag classes >500 lines as needing organization review
- Check for dead code, unused imports, and unused variables
- Identify code reuse opportunities across similar implementations
- Assess holistic maintainability - is code easy to understand and modify?
- Flag code that appears AI-generated without sufficient human review
- Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
- Check for proper JavaDoc on public classes and methods
- Flag redundant comments that restate obvious code
- Ensure proper error handling with specific exception types
- Check for Optional usage instead of null returns
- Validate proper use of try-with-resources for resource management
Files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
core/src/main/java/**/*.java
⚙️ CodeRabbit configuration file
core/src/main/java/**/*.java: - New functions MUST have unit tests in the same commit
Files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
**/calcite/**/*.java
⚙️ CodeRabbit configuration file
**/calcite/**/*.java: - Follow existing Calcite integration patterns
- Verify RelNode visitor implementations are complete
- Check RexNode handling follows project conventions
- Validate SQL generation is correct and optimized
- Ensure Calcite version compatibility
- Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
- Document any Calcite-specific workarounds
- Test compatibility with Calcite version constraints
Files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
⚙️ CodeRabbit configuration file
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java: - Flag methods >50 lines - this file is known to be hard to read
- Suggest extracting complex logic into helper methods
- Check for code organization and logical grouping
- Validate all RelNode types are handled
Files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
integ-test/**/*IT.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
End-to-end scenarios need integration tests in
integ-test/module
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
⚙️ CodeRabbit configuration file
integ-test/**/*IT.java: - Integration tests MUST use valid test data from resources
- Verify test data files exist in integ-test/src/test/resources/
- Check test assertions are meaningful and specific
- Validate tests clean up resources after execution
- Ensure tests are independent and can run in any order
- Flag tests that reference non-existent indices (e.g., EMP)
- Verify integration tests are in correct module (integ-test/)
- Check tests can be run with ./gradlew :integ-test:integTest
- Ensure proper test data setup and teardown
- Validate end-to-end scenario coverage
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
**/*IT.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
Name integration tests with
*IT.javasuffix in OpenSearch SQL
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
**/test/**/*.java
⚙️ CodeRabbit configuration file
**/test/**/*.java: - Verify NULL input tests for all new functions
- Check boundary condition tests (min/max values, empty inputs)
- Validate error condition tests (invalid inputs, exceptions)
- Ensure multi-document tests for per-document operations
- Flag smoke tests without meaningful assertions
- Check test naming follows pattern: test
- Verify test data is realistic and covers edge cases
- Verify test coverage for new business logic
- Ensure tests are independent and don't rely on execution order
- Validate meaningful test data that reflects real-world scenarios
- Check for proper cleanup of test resources
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
🧠 Learnings (8)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Verify AST generation for new PPL parser syntax
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Update corresponding AST builder classes when making PPL grammar changes
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in `CalciteRelNodeVisitor` and `CalciteRexNodeVisitor` for Calcite integration
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in `CalciteRelNodeVisitor` and `CalciteRexNodeVisitor` for Calcite integration
Applied to files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
📚 Learning: 2025-12-29T05:32:03.491Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 4993
File: opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableTopK.java:20-20
Timestamp: 2025-12-29T05:32:03.491Z
Learning: For any custom Calcite RelNode class (e.g., ones that extend EnumerableLimitSort or other Calcite RelNode types), always override the copy method. If copy is not overridden, cloning/copy operations may downgrade the instance to the parent class type, losing the custom behavior. In your implementation, ensure copy returns a new instance of the concrete class with all relevant fields and traits preserved, mirroring the current instance state.
Applied to files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Document Calcite-specific workarounds in code
Applied to files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Flag generic exception handling during code review
Applied to files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Use specific exception types with meaningful messages for error handling
Applied to files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
📚 Learning: 2025-12-11T05:27:39.856Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 0
File: :0-0
Timestamp: 2025-12-11T05:27:39.856Z
Learning: In opensearch-project/sql, for SEMI and ANTI join types in CalciteRelNodeVisitor.java, the `max` option has no effect because these join types only use the left side to filter records based on the existence of matches in the right side. The join results are identical regardless of max value (max=1, max=2, or max=∞). The early return for SEMI/ANTI joins before processing the `max` option is intentional and correct behavior.
Applied to files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
🧬 Code graph analysis (1)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java (1)
integ-test/src/test/java/org/opensearch/sql/sql/IdentifierIT.java (1)
Index(218-262)
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (1)
- GitHub Check: WhiteSource Security Check
🔇 Additional comments (1)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java (1)
31-49: MVEXPAND edge-cases index wiring looks correctLoading
Index.MVEXPAND_EDGE_CASESininit()is consistent with the new mvexpand explain test and follows the existing pattern used for other indices. No changes needed here.
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java
Show resolved
Hide resolved
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In
@ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java:
- Around line 148-166: The tests testMvExpandEmptyOrNullArray,
testMvExpandWithDuplicates, testMvExpandLargeArray, and
testMvExpandPrimitiveArray currently only call assertMvexpandPlanned (verifying
planning/Uncollect) but not runtime behavior; update each to either (a) convert
to an integration-style test that executes the query and asserts concrete
results (e.g., for testMvExpandEmptyOrNullArray assert the result set is empty,
for testMvExpandWithDuplicates assert duplicate rows are/aren’t produced per
spec, for testMvExpandLargeArray assert behavior under large payloads or
expected error/memory handling, and for testMvExpandPrimitiveArray assert
expanded primitive values and types) or (b) enhance the unit assertions to
inspect the logical/physical plan nodes produced by assertMvexpandPlanned to
validate predicates, Uncollect node contents and null/empty handling rather than
only existence; locate usages of assertMvexpandPlanned and the named test
methods to implement these runtime assertions or deeper plan inspections.
🧹 Nitpick comments (1)
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java (1)
118-130: Consider verifying SQL generation for the limit parameter.While the test confirms that planning succeeds and limit-related operators appear in the plan, it doesn't verify the actual SQL output contains the limit clause. Consider adding a
verifyPPLToSparkSQL()call to ensure the generated SQL correctly implements the per-document limit.Based on learnings, Calcite integration changes should test SQL generation and optimization paths.
💡 Example verification
@Test public void testMvExpandWithLimitParameter() { String ppl = "source=DEPT | mvexpand EMPNOS limit=2"; RelNode root = getRelNode(ppl); String plan = root.explain(); Assert.assertTrue("Expected Uncollect in plan but got:\n" + plan, plan.contains("Uncollect")); Assert.assertTrue( "Expected limit-related operator in plan but got:\n" + plan, plan.contains("fetch=") || plan.contains("LIMIT") || plan.contains("RowNumber") || plan.contains("Window")); + // Verify the generated SQL includes the limit + String sql = toSparkSql(root); + Assert.assertTrue("Expected LIMIT or FETCH in SQL but got:\n" + sql, + sql.contains("LIMIT") || sql.contains("FETCH")); }
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
🧰 Additional context used
📓 Path-based instructions (5)
**/*.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*.java: UsePascalCasefor class names (e.g.,QueryExecutor)
UsecamelCasefor method and variable names (e.g.,executeQuery)
UseUPPER_SNAKE_CASEfor constants (e.g.,MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
PreferOptional<T>for nullable returns in Java
Avoid unnecessary object creation in loops
UseStringBuilderfor string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code
Files:
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
⚙️ CodeRabbit configuration file
**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring
- Flag classes >500 lines as needing organization review
- Check for dead code, unused imports, and unused variables
- Identify code reuse opportunities across similar implementations
- Assess holistic maintainability - is code easy to understand and modify?
- Flag code that appears AI-generated without sufficient human review
- Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
- Check for proper JavaDoc on public classes and methods
- Flag redundant comments that restate obvious code
- Ensure proper error handling with specific exception types
- Check for Optional usage instead of null returns
- Validate proper use of try-with-resources for resource management
Files:
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
**/*Test.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*Test.java: All new business logic requires unit tests
Name unit tests with*Test.javasuffix in OpenSearch SQL
Files:
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
**/test/**/*.java
⚙️ CodeRabbit configuration file
**/test/**/*.java: - Verify NULL input tests for all new functions
- Check boundary condition tests (min/max values, empty inputs)
- Validate error condition tests (invalid inputs, exceptions)
- Ensure multi-document tests for per-document operations
- Flag smoke tests without meaningful assertions
- Check test naming follows pattern: test
- Verify test data is realistic and covers edge cases
- Verify test coverage for new business logic
- Ensure tests are independent and don't rely on execution order
- Validate meaningful test data that reflects real-world scenarios
- Check for proper cleanup of test resources
Files:
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
**/ppl/**/*.java
⚙️ CodeRabbit configuration file
**/ppl/**/*.java: - For PPL parser changes, verify grammar tests with positive/negative cases
- Check AST generation for new syntax
- Ensure corresponding AST builder classes are updated
- Validate edge cases and boundary conditions
Files:
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
**/calcite/**/*.java
⚙️ CodeRabbit configuration file
**/calcite/**/*.java: - Follow existing Calcite integration patterns
- Verify RelNode visitor implementations are complete
- Check RexNode handling follows project conventions
- Validate SQL generation is correct and optimized
- Ensure Calcite version compatibility
- Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
- Document any Calcite-specific workarounds
- Test compatibility with Calcite version constraints
Files:
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
🧠 Learnings (4)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Update corresponding AST builder classes when making PPL grammar changes
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Verify AST generation for new PPL parser syntax
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in `CalciteRelNodeVisitor` and `CalciteRexNodeVisitor` for Calcite integration
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
Applied to files:
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Document Calcite-specific workarounds in code
Applied to files:
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
📚 Learning: 2025-12-29T05:32:03.491Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 4993
File: opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableTopK.java:20-20
Timestamp: 2025-12-29T05:32:03.491Z
Learning: For any custom Calcite RelNode class (e.g., ones that extend EnumerableLimitSort or other Calcite RelNode types), always override the copy method. If copy is not overridden, cloning/copy operations may downgrade the instance to the parent class type, losing the custom behavior. In your implementation, ensure copy returns a new instance of the concrete class with all relevant fields and traits preserved, mirroring the current instance state.
Applied to files:
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (3)
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java (3)
85-96: Minor: Clarify comment about test table.The comment says "empty table" but the table has a defined schema (DEPTNO, EMPNOS). Consider clarifying that the table has no data rows rather than being "empty" structurally.
- // Add an empty table with name DEPT for test purpose + // Add a test table DEPT with array column (no data rows) schema.add("DEPT", new TableWithArray());
118-130: Consider strengthening the limit parameter assertion.The test verifies that some limit-related operator exists but doesn't confirm the actual limit value
2is correctly applied. The broad OR-condition (fetch=||LIMIT||RowNumber||Window) may pass even if the limit isn't correctly wired.Consider verifying the specific limit value appears in the plan:
Assert.assertTrue( "Expected limit-related operator in plan but got:\n" + plan, - plan.contains("fetch=") - || plan.contains("LIMIT") - || plan.contains("RowNumber") - || plan.contains("Window")); + plan.contains("fetch=[2]") || plan.contains("LIMIT 2")); }Alternatively, add a second test with a different limit value (e.g.,
limit=5) to ensure the parameter is correctly parsed and propagated.
157-160: Redundant test - identical query totestMvExpandBasic.
testMvExpandPrimitiveArrayuses the exact same PPL query astestMvExpandBasic(source=DEPT | mvexpand EMPNOS), but with weaker assertions. Since the EMPNOS column is already defined asINTEGER ARRAYinTableWithArray, this test doesn't add coverage.Consider either removing this test or modifying it to test a different primitive array type (e.g., VARCHAR array) if that's a distinct code path worth validating.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
🧰 Additional context used
📓 Path-based instructions (5)
**/*.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*.java: UsePascalCasefor class names (e.g.,QueryExecutor)
UsecamelCasefor method and variable names (e.g.,executeQuery)
UseUPPER_SNAKE_CASEfor constants (e.g.,MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
PreferOptional<T>for nullable returns in Java
Avoid unnecessary object creation in loops
UseStringBuilderfor string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code
Files:
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
⚙️ CodeRabbit configuration file
**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring
- Flag classes >500 lines as needing organization review
- Check for dead code, unused imports, and unused variables
- Identify code reuse opportunities across similar implementations
- Assess holistic maintainability - is code easy to understand and modify?
- Flag code that appears AI-generated without sufficient human review
- Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
- Check for proper JavaDoc on public classes and methods
- Flag redundant comments that restate obvious code
- Ensure proper error handling with specific exception types
- Check for Optional usage instead of null returns
- Validate proper use of try-with-resources for resource management
Files:
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
**/*Test.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*Test.java: All new business logic requires unit tests
Name unit tests with*Test.javasuffix in OpenSearch SQL
Files:
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
**/test/**/*.java
⚙️ CodeRabbit configuration file
**/test/**/*.java: - Verify NULL input tests for all new functions
- Check boundary condition tests (min/max values, empty inputs)
- Validate error condition tests (invalid inputs, exceptions)
- Ensure multi-document tests for per-document operations
- Flag smoke tests without meaningful assertions
- Check test naming follows pattern: test
- Verify test data is realistic and covers edge cases
- Verify test coverage for new business logic
- Ensure tests are independent and don't rely on execution order
- Validate meaningful test data that reflects real-world scenarios
- Check for proper cleanup of test resources
Files:
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
**/ppl/**/*.java
⚙️ CodeRabbit configuration file
**/ppl/**/*.java: - For PPL parser changes, verify grammar tests with positive/negative cases
- Check AST generation for new syntax
- Ensure corresponding AST builder classes are updated
- Validate edge cases and boundary conditions
Files:
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
**/calcite/**/*.java
⚙️ CodeRabbit configuration file
**/calcite/**/*.java: - Follow existing Calcite integration patterns
- Verify RelNode visitor implementations are complete
- Check RexNode handling follows project conventions
- Validate SQL generation is correct and optimized
- Ensure Calcite version compatibility
- Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
- Document any Calcite-specific workarounds
- Test compatibility with Calcite version constraints
Files:
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
🧠 Learnings (4)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in `CalciteRelNodeVisitor` and `CalciteRexNodeVisitor` for Calcite integration
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
Applied to files:
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Document Calcite-specific workarounds in code
Applied to files:
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
📚 Learning: 2025-12-29T05:32:03.491Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 4993
File: opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableTopK.java:20-20
Timestamp: 2025-12-29T05:32:03.491Z
Learning: For any custom Calcite RelNode class (e.g., ones that extend EnumerableLimitSort or other Calcite RelNode types), always override the copy method. If copy is not overridden, cloning/copy operations may downgrade the instance to the parent class type, losing the custom behavior. In your implementation, ensure copy returns a new instance of the concrete class with all relevant fields and traits preserved, mirroring the current instance state.
Applied to files:
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
🔇 Additional comments (4)
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java (4)
44-83: LGTM!The
TableWithArrayinner class correctly implements the CalciteTableinterface with appropriate schema definition for testing array expansion. The JavaDoc properly documents the purpose and cross-references related test infrastructure.
98-116: LGTM!Well-structured test that verifies both the logical plan structure (with
UncollectandLogicalCorrelate) and the Spark SQL translation. This provides good coverage for the basic mvexpand functionality.
132-140: LGTM!This test appropriately verifies that projection after mvexpand is correctly planned with both
UncollectandLogicalProjectoperators in the plan.
142-155: Edge case tests are smoke tests only - consider adding behavioral assertions.Per coding guidelines, NULL input tests and boundary condition tests should validate actual behavior. These tests only verify that planning succeeds with
Uncollectin the plan, but don't validate:
- testMvExpandEmptyOrNullArray: That null/empty arrays produce zero expanded rows
- testMvExpandLargeArray: Memory limits or expected behavior with large data
- testMvExpandWithDuplicates: How duplicates are handled in expansion
If full integration tests with data execution aren't feasible here, consider at minimum:
- Documenting that these are planning-only smoke tests (update method names or add comments)
- Adding TODO comments to track integration test coverage
- Verifying more plan structure (e.g., that the filter predicate is correctly placed relative to Uncollect)
Based on coding guidelines requiring boundary and NULL input test validation.
Also applies to: 162-167
Signed-off-by: Srikanth Padakanti <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In
@ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java:
- Line 34: Add a JavaDoc block to the public class CalcitePPLMvExpandTest
documenting its purpose: state that it tests Calcite query planning for the
mvexpand PPL command and list covered scenarios (basic expansion, limit
handling, null/empty arrays, and edge cases); place the JavaDoc immediately
above the class declaration "public class CalcitePPLMvExpandTest" and follow
project style (brief description, @see or @author if required).
🧹 Nitpick comments (3)
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java (3)
120-129: Strengthen limit parameter verification.The test uses
assertAnyContainswhich only checks if ANY limit-related token exists in the plan, but doesn't verify that the specific limit value (2) is actually applied. This could pass even if the limit parameter is ignored or incorrectly set.Consider enhancing the assertion to verify the actual limit value is present in the plan, similar to how
testMvExpandBasicvalidates the full logical plan structure.♻️ Suggested enhancement
@Test public void testMvExpandWithLimitParameter() { String ppl = "source=DEPT | mvexpand EMPNOS limit=2"; RelNode root = getRelNode(ppl); assertContains(root, "LogicalCorrelate"); assertContains(root, "Uncollect"); - - assertAnyContains(root, "fetch=", "LIMIT", "RowNumber", "Window"); + + // Verify the limit value is actually applied + String plan = root.explain(); + Assert.assertTrue( + "Expected plan to contain limit value of 2", + plan.contains("fetch=2") || plan.contains("LIMIT 2") || plan.contains("RowNumber") && plan.contains("2")); }
131-139: Consider verifying projection field selection.While the test confirms that mvexpand works with subsequent projections, it only verifies that the operators exist in the plan. Consider adding assertions to verify that the projection actually selects the correct fields (
DEPTNO,EMPNOS), similar to the approach intestMvExpandBasic.
34-167: Consider adding test coverage for documented edge cases.The PR objectives document several edge cases that should be tested but are not covered in this test class:
Non-array field behavior: The objectives state behavior should be defined for "non-array fields (specify whether non-array is treated as single-value, yields no rows, or errors)" but no test verifies this scenario.
Resource limit errors: The objectives mention "configurable memory/resource limits and surface clear, actionable error messaging when limits are exceeded" but no test verifies error handling when limits are exceeded.
Multiple multivalue fields: The objectives mention "multiple multivalue fields" as an edge case but this scenario is not tested.
Per coding guidelines, error condition tests should validate invalid inputs and exceptions. Consider adding tests for these documented edge cases to ensure completeness.
Example test cases to consider
@Test public void testMvExpandOnNonArrayField() { // Test behavior when mvexpand is applied to non-array field DEPTNO String ppl = "source=DEPT | mvexpand DEPTNO"; // Verify expected behavior: error, single row, or no rows } @Test(expected = SomeResourceLimitException.class) public void testMvExpandExceedsResourceLimit() { // Test that exceeding resource limits produces clear error // This may require integration test setup }
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
🧰 Additional context used
📓 Path-based instructions (5)
**/*.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*.java: UsePascalCasefor class names (e.g.,QueryExecutor)
UsecamelCasefor method and variable names (e.g.,executeQuery)
UseUPPER_SNAKE_CASEfor constants (e.g.,MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
PreferOptional<T>for nullable returns in Java
Avoid unnecessary object creation in loops
UseStringBuilderfor string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code
Files:
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
⚙️ CodeRabbit configuration file
**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring
- Flag classes >500 lines as needing organization review
- Check for dead code, unused imports, and unused variables
- Identify code reuse opportunities across similar implementations
- Assess holistic maintainability - is code easy to understand and modify?
- Flag code that appears AI-generated without sufficient human review
- Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
- Check for proper JavaDoc on public classes and methods
- Flag redundant comments that restate obvious code
- Ensure proper error handling with specific exception types
- Check for Optional usage instead of null returns
- Validate proper use of try-with-resources for resource management
Files:
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
**/*Test.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*Test.java: All new business logic requires unit tests
Name unit tests with*Test.javasuffix in OpenSearch SQL
Files:
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
**/test/**/*.java
⚙️ CodeRabbit configuration file
**/test/**/*.java: - Verify NULL input tests for all new functions
- Check boundary condition tests (min/max values, empty inputs)
- Validate error condition tests (invalid inputs, exceptions)
- Ensure multi-document tests for per-document operations
- Flag smoke tests without meaningful assertions
- Check test naming follows pattern: test
- Verify test data is realistic and covers edge cases
- Verify test coverage for new business logic
- Ensure tests are independent and don't rely on execution order
- Validate meaningful test data that reflects real-world scenarios
- Check for proper cleanup of test resources
Files:
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
**/ppl/**/*.java
⚙️ CodeRabbit configuration file
**/ppl/**/*.java: - For PPL parser changes, verify grammar tests with positive/negative cases
- Check AST generation for new syntax
- Ensure corresponding AST builder classes are updated
- Validate edge cases and boundary conditions
Files:
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
**/calcite/**/*.java
⚙️ CodeRabbit configuration file
**/calcite/**/*.java: - Follow existing Calcite integration patterns
- Verify RelNode visitor implementations are complete
- Check RexNode handling follows project conventions
- Validate SQL generation is correct and optimized
- Ensure Calcite version compatibility
- Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
- Document any Calcite-specific workarounds
- Test compatibility with Calcite version constraints
Files:
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
🧠 Learnings (4)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in `CalciteRelNodeVisitor` and `CalciteRexNodeVisitor` for Calcite integration
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
Applied to files:
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Document Calcite-specific workarounds in code
Applied to files:
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
📚 Learning: 2025-12-29T05:32:03.491Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 4993
File: opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableTopK.java:20-20
Timestamp: 2025-12-29T05:32:03.491Z
Learning: For any custom Calcite RelNode class (e.g., ones that extend EnumerableLimitSort or other Calcite RelNode types), always override the copy method. If copy is not overridden, cloning/copy operations may downgrade the instance to the parent class type, losing the custom behavior. In your implementation, ensure copy returns a new instance of the concrete class with all relevant fields and traits preserved, mirroring the current instance state.
Applied to files:
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
🔇 Additional comments (4)
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java (4)
36-84: LGTM! Well-documented test fixture.The
TableWithArrayinner class is well-documented and correctly implements the CalciteTableinterface to provide a test table with array fields for mvexpand testing.
86-97: LGTM! Test configuration properly set up.The config override correctly adds the test DEPT table and configures the Calcite planner for mvexpand testing.
99-118: LGTM! Comprehensive test with meaningful assertions.This test properly validates both the logical plan structure and SQL generation for basic mvexpand functionality, including verification of
LogicalCorrelateandUncollectoperators.
169-184: LGTM! Helper methods are correctly implemented.The assertion helper methods provide clear error messages and serve their purpose for plan verification testing.
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvExpandTest.java
Show resolved
Hide resolved
…fferent case by different document data Signed-off-by: Srikanth Padakanti <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In
@integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java:
- Around line 26-70: The test in CalciteMvExpandCommandIT.init() embeds the
index mapping and bulk documents inline instead of loading the standard resource
files; change init() to read mvexpand_edge_cases_mapping.json and
mvexpand_edge_cases.json from integ-test/src/test/resources, pass the mapping
string to createIndex(INDEX, mapping) and expand the bulkInsert call to use the
documents loaded from the JSON resource (e.g., parse the resource array or
iterate resource lines) so the test follows the project pattern of using
mvexpand_edge_cases_mapping.json and mvexpand_edge_cases.json rather than
hard-coded literals.
🧹 Nitpick comments (4)
ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java (1)
658-666: Follow the established pattern for field extraction and remove redundant comment.The method has two consistency concerns:
Redundant comment (line 661): The comment "Always anonymize field names" is obvious in an anonymizer class—all visitor methods mask sensitive data. Based on learnings, redundant comments that restate the obvious should be avoided.
Pattern inconsistency (line 661): Other similar visitor methods extract the field from the node and call
visitExpressionon it (seevisitExpandline 462,visitFlattenline 629,visitRegexline 622). This method should follow the same pattern for maintainability.♻️ Proposed refactor for consistency
@Override public String visitMvExpand(MvExpand node, String context) { String child = node.getChild().get(0).accept(this, context); - String field = MASK_COLUMN; // Always anonymize field names + String field = visitExpression(node.getField()); if (node.getLimit() != null) { return StringUtils.format("%s | mvexpand %s limit=%s", child, field, MASK_LITERAL); } return StringUtils.format("%s | mvexpand %s", child, field); }integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java (3)
291-311: Remove unused ID separator logic.The
bulkInsertmethod includes logic to parse explicit document IDs using a"|"separator (lines 297-300), but none of the test data uses this feature. All documents rely on auto-incrementing IDs.This unused complexity violates the YAGNI (You Aren't Gonna Need It) principle and makes the code harder to understand.
♻️ Simplified implementation
private static void bulkInsert(String index, String... docs) throws IOException { StringBuilder bulk = new StringBuilder(); int nextAutoId = 1; for (String doc : docs) { - String id; - String json; - if (doc.contains("|")) { - String[] parts = doc.split("\\|", 2); - id = parts[0]; - json = parts[1]; - } else { - id = String.valueOf(nextAutoId++); - json = doc; - } + String id = String.valueOf(nextAutoId++); bulk.append("{\"index\":{\"_id\":").append(id).append("}}\n"); - bulk.append(json).append("\n"); + bulk.append(doc).append("\n"); } Request request = new Request("POST", "/" + index + "/_bulk?refresh=true"); request.setJsonEntity(bulk.toString()); PPLIntegTestCase.adminClient().performRequest(request); }
260-272: Make semantic error assertion more specific.The assertion at line 270 uses a broad condition checking for either "Cannot expand field" OR "Semantic" in the error message. This could match unrelated errors and reduce test reliability.
♻️ More specific assertion
ResponseException ex = assertThrows(ResponseException.class, () -> executeQuery(query)); Assertions.assertTrue( - ex.getMessage().contains("Cannot expand field") || ex.getMessage().contains("Semantic"), - "Expected semantic error for non-array field, got: " + ex.getMessage()); + ex.getMessage().contains("Cannot expand field 'skills_int': expected ARRAY type but found INTEGER"), + "Expected semantic error for integer field, got: " + ex.getMessage());
26-70: Consider adding explicit test cleanup.The test creates an index in
init()and deletes it if it exists at the start, but there's no explicit cleanup after all tests complete (e.g.,@AfterAllmethod). While the current approach works, explicit cleanup ensures resources are released even if tests fail mid-execution.♻️ Add cleanup method
+ @org.junit.jupiter.api.AfterAll + public static void cleanup() throws IOException { + deleteIndexIfExists(INDEX); + }
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.javainteg-test/src/test/java/org/opensearch/sql/legacy/SQLIntegTestCase.javainteg-test/src/test/resources/mvexpand_edge_cases_mapping.jsonppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java
🚧 Files skipped from review as they are similar to previous changes (2)
- integ-test/src/test/resources/mvexpand_edge_cases_mapping.json
- integ-test/src/test/java/org/opensearch/sql/legacy/SQLIntegTestCase.java
🧰 Additional context used
📓 Path-based instructions (6)
**/*.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*.java: UsePascalCasefor class names (e.g.,QueryExecutor)
UsecamelCasefor method and variable names (e.g.,executeQuery)
UseUPPER_SNAKE_CASEfor constants (e.g.,MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
PreferOptional<T>for nullable returns in Java
Avoid unnecessary object creation in loops
UseStringBuilderfor string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code
Files:
ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
⚙️ CodeRabbit configuration file
**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring
- Flag classes >500 lines as needing organization review
- Check for dead code, unused imports, and unused variables
- Identify code reuse opportunities across similar implementations
- Assess holistic maintainability - is code easy to understand and modify?
- Flag code that appears AI-generated without sufficient human review
- Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
- Check for proper JavaDoc on public classes and methods
- Flag redundant comments that restate obvious code
- Ensure proper error handling with specific exception types
- Check for Optional usage instead of null returns
- Validate proper use of try-with-resources for resource management
Files:
ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
**/ppl/**/*.java
⚙️ CodeRabbit configuration file
**/ppl/**/*.java: - For PPL parser changes, verify grammar tests with positive/negative cases
- Check AST generation for new syntax
- Ensure corresponding AST builder classes are updated
- Validate edge cases and boundary conditions
Files:
ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java
integ-test/**/*IT.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
End-to-end scenarios need integration tests in
integ-test/module
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
⚙️ CodeRabbit configuration file
integ-test/**/*IT.java: - Integration tests MUST use valid test data from resources
- Verify test data files exist in integ-test/src/test/resources/
- Check test assertions are meaningful and specific
- Validate tests clean up resources after execution
- Ensure tests are independent and can run in any order
- Flag tests that reference non-existent indices (e.g., EMP)
- Verify integration tests are in correct module (integ-test/)
- Check tests can be run with ./gradlew :integ-test:integTest
- Ensure proper test data setup and teardown
- Validate end-to-end scenario coverage
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
**/*IT.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
Name integration tests with
*IT.javasuffix in OpenSearch SQL
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
**/test/**/*.java
⚙️ CodeRabbit configuration file
**/test/**/*.java: - Verify NULL input tests for all new functions
- Check boundary condition tests (min/max values, empty inputs)
- Validate error condition tests (invalid inputs, exceptions)
- Ensure multi-document tests for per-document operations
- Flag smoke tests without meaningful assertions
- Check test naming follows pattern: test
- Verify test data is realistic and covers edge cases
- Verify test coverage for new business logic
- Ensure tests are independent and don't rely on execution order
- Validate meaningful test data that reflects real-world scenarios
- Check for proper cleanup of test resources
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
**/calcite/**/*.java
⚙️ CodeRabbit configuration file
**/calcite/**/*.java: - Follow existing Calcite integration patterns
- Verify RelNode visitor implementations are complete
- Check RexNode handling follows project conventions
- Validate SQL generation is correct and optimized
- Ensure Calcite version compatibility
- Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
- Document any Calcite-specific workarounds
- Test compatibility with Calcite version constraints
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
🧠 Learnings (6)
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Flag redundant or obvious comments during code review
Applied to files:
ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Avoid comments that merely restate what the code does
Applied to files:
ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in `CalciteRelNodeVisitor` and `CalciteRexNodeVisitor` for Calcite integration
Applied to files:
ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*IT.java : Name integration tests with `*IT.java` suffix in OpenSearch SQL
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
📚 Learning: 2025-12-29T05:32:03.491Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 4993
File: opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableTopK.java:20-20
Timestamp: 2025-12-29T05:32:03.491Z
Learning: For any custom Calcite RelNode class (e.g., ones that extend EnumerableLimitSort or other Calcite RelNode types), always override the copy method. If copy is not overridden, cloning/copy operations may downgrade the instance to the parent class type, losing the custom behavior. In your implementation, ensure copy returns a new instance of the concrete class with all relevant fields and traits preserved, mirroring the current instance state.
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
🧬 Code graph analysis (1)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java (2)
integ-test/src/test/java/org/opensearch/sql/ppl/PPLIntegTestCase.java (1)
PPLIntegTestCase(36-415)integ-test/src/test/java/org/opensearch/sql/sql/IdentifierIT.java (1)
Index(218-262)
🔇 Additional comments (4)
ppl/src/main/java/org/opensearch/sql/ppl/utils/PPLQueryDataAnonymizer.java (1)
85-85: LGTM!The import for
MvExpandis correctly added and properly positioned alphabetically.integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java (3)
72-258: Excellent test coverage and organization.The test suite comprehensively covers mvexpand edge cases including:
- Empty, null, and missing arrays
- Single and multiple elements
- Duplicates and partial data
- Mixed shapes and heterogeneous types
- Schema flattening
- Limit parameters
- Large arrays
- Semantic errors for non-array fields
Test methods are focused, independent, and follow clear naming conventions. The use of appropriate matchers (
verifyDataRows,verifySchema,verifyNumOfRows) ensures meaningful assertions.
72-80: This test correctly implements mvexpand semantics for nested objects and aligns with documented behavior.The
mvexpand skills | fields username, skills.namepattern is the intended design. After expanding an array of objects, you can project nested subfields using dot notation. The official mvexpand documentation explicitly states: "If the array contains objects, you can reference subfields (for example,skills.name)." All 10+ test cases in this file consistently verify this behavior across edge cases (empty arrays, null values, missing fields, mixed object shapes, and partial objects), confirming the implementation is correct.
274-316: Remove duplicate helper methods.A previous review comment indicated that these helper methods already exist in
SQLIntegTestCaseand should be reused. The local implementations ofcreateIndex,deleteIndexIfExists,bulkInsert, andrefreshIndexduplicate parent class functionality.Please use the existing helper methods from
SQLIntegTestCaseorPPLIntegTestCaseto maintain consistency across the test suite and reduce code duplication.#!/bin/bash # Search for existing index management helpers in parent test classes ast-grep --pattern $'class SQLIntegTestCase { $$$ createIndex($$$) { $$$ } $$$ }' rg -nP --type=java 'protected.*createIndex|protected.*deleteIndex|protected.*bulkInsert' integ-test/src/test/java/org/opensearch/sql/Likely an incorrect or invalid review comment.
Signed-off-by: Srikanth Padakanti <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 3
🤖 Fix all issues with AI agents
In
@integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java:
- Around line 26-70: The init() method currently builds the index mapping and
bulk documents inline (mapping variable, createIndex(INDEX, mapping), and
bulkInsert(...)); replace those inline strings by loading the mapping and test
documents from resources using the loadFromFile helper (e.g.,
loadFromFile("mvexpand_edge_cases_mapping.json") and
loadFromFile("mvexpand_edge_cases.json")), then pass the loaded mapping to
createIndex and the loaded documents to bulkInsert (or adapt bulkInsert to
accept the parsed resource entries) so the test uses
integ-test/src/test/resources/mvexpand_edge_cases_mapping.json and
mvexpand_edge_cases.json instead of hard-coded JSON.
- Around line 72-82: Remove the dead, commented-out test block for
testMvexpandSingleElement in the CalciteMvExpandCommandIT class: delete the
commented @Test annotation and the entire testMvexpandSingleElement method block
(the String query, executeQuery call, and verifyDataRows call) so the file
contains no leftover commented test code; keep formatting intact and rely on VCS
for history.
- Around line 319-361: The four local helper methods createIndex,
deleteIndexIfExists, bulkInsert, and refreshIndex duplicate utilities provided
by SQLIntegTestCase; remove these methods from CalciteMvExpandCommandIT and
replace any calls to them with the corresponding inherited methods from
SQLIntegTestCase (use the class's index creation/deletion, bulk/indexing, and
refresh helpers instead). After removing the local methods, update call sites to
the inherited utilities, clean up any now-unused imports, and run tests to
ensure behavior matches the original helpers.
🧹 Nitpick comments (1)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java (1)
106-125: Consider inlining or using standard matchers for single-use helper.This helper is used only once (Line 92) and adds complexity. Consider inlining the assertion logic in the test method for better readability, or verify if standard matchers from
MatcherUtilscan handle nested field validation.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
🧰 Additional context used
📓 Path-based instructions (5)
**/*.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*.java: UsePascalCasefor class names (e.g.,QueryExecutor)
UsecamelCasefor method and variable names (e.g.,executeQuery)
UseUPPER_SNAKE_CASEfor constants (e.g.,MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
PreferOptional<T>for nullable returns in Java
Avoid unnecessary object creation in loops
UseStringBuilderfor string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
⚙️ CodeRabbit configuration file
**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring
- Flag classes >500 lines as needing organization review
- Check for dead code, unused imports, and unused variables
- Identify code reuse opportunities across similar implementations
- Assess holistic maintainability - is code easy to understand and modify?
- Flag code that appears AI-generated without sufficient human review
- Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
- Check for proper JavaDoc on public classes and methods
- Flag redundant comments that restate obvious code
- Ensure proper error handling with specific exception types
- Check for Optional usage instead of null returns
- Validate proper use of try-with-resources for resource management
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
integ-test/**/*IT.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
End-to-end scenarios need integration tests in
integ-test/module
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
⚙️ CodeRabbit configuration file
integ-test/**/*IT.java: - Integration tests MUST use valid test data from resources
- Verify test data files exist in integ-test/src/test/resources/
- Check test assertions are meaningful and specific
- Validate tests clean up resources after execution
- Ensure tests are independent and can run in any order
- Flag tests that reference non-existent indices (e.g., EMP)
- Verify integration tests are in correct module (integ-test/)
- Check tests can be run with ./gradlew :integ-test:integTest
- Ensure proper test data setup and teardown
- Validate end-to-end scenario coverage
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
**/*IT.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
Name integration tests with
*IT.javasuffix in OpenSearch SQL
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
**/test/**/*.java
⚙️ CodeRabbit configuration file
**/test/**/*.java: - Verify NULL input tests for all new functions
- Check boundary condition tests (min/max values, empty inputs)
- Validate error condition tests (invalid inputs, exceptions)
- Ensure multi-document tests for per-document operations
- Flag smoke tests without meaningful assertions
- Check test naming follows pattern: test
- Verify test data is realistic and covers edge cases
- Verify test coverage for new business logic
- Ensure tests are independent and don't rely on execution order
- Validate meaningful test data that reflects real-world scenarios
- Check for proper cleanup of test resources
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
**/calcite/**/*.java
⚙️ CodeRabbit configuration file
**/calcite/**/*.java: - Follow existing Calcite integration patterns
- Verify RelNode visitor implementations are complete
- Check RexNode handling follows project conventions
- Validate SQL generation is correct and optimized
- Ensure Calcite version compatibility
- Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
- Document any Calcite-specific workarounds
- Test compatibility with Calcite version constraints
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
🧠 Learnings (5)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Verify AST generation for new PPL parser syntax
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*IT.java : Name integration tests with `*IT.java` suffix in OpenSearch SQL
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to integ-test/**/*IT.java : End-to-end scenarios need integration tests in `integ-test/` module
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
📚 Learning: 2025-12-29T05:32:03.491Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 4993
File: opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableTopK.java:20-20
Timestamp: 2025-12-29T05:32:03.491Z
Learning: For any custom Calcite RelNode class (e.g., ones that extend EnumerableLimitSort or other Calcite RelNode types), always override the copy method. If copy is not overridden, cloning/copy operations may downgrade the instance to the parent class type, losing the custom behavior. In your implementation, ensure copy returns a new instance of the concrete class with all relevant fields and traits preserved, mirroring the current instance state.
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
🧬 Code graph analysis (1)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java (2)
integ-test/src/test/java/org/opensearch/sql/ppl/PPLIntegTestCase.java (1)
PPLIntegTestCase(36-415)integ-test/src/test/java/org/opensearch/sql/sql/IdentifierIT.java (1)
Index(218-262)
🔇 Additional comments (1)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java (1)
84-317: Excellent test coverage for mvexpand edge cases.The test suite comprehensively covers:
- Single and multiple element expansion (Lines 84-100, 171-179)
- Edge cases: empty arrays, null arrays, missing fields (Lines 128-157)
- Duplicate values (Lines 160-168)
- Partial/mixed element shapes (Lines 182-205)
- Schema flattening and type inference (Lines 208-278)
- Limit parameter (Lines 257-266)
- Large arrays (Lines 281-303)
- Error handling for non-array fields (Lines 230-242, 306-317)
Test naming follows conventions, assertions are specific, and the tests appear independent. Great work on thorough validation of mvexpand behavior.
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
Outdated
Show resolved
Hide resolved
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
Show resolved
Hide resolved
Signed-off-by: Srikanth Padakanti <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 2
🤖 Fix all issues with AI agents
In @docs/user/ppl/cmd/mvexpand.md:
- Around line 7-10: The fenced code block showing the mvexpand syntax is missing
a language identifier; update the triple-backtick fence for the snippet
"mvexpand <field> [limit=<int>]" to include the ppl language (i.e., change ```
to ```ppl) so it matches other examples and satisfies markdownlint MD040.
- Line 4: The description for the mvexpand command currently includes a
comparative phrase "similar to Splunk's `mvexpand` command"; remove that
comparative wording and rewrite the sentence to describe only the OpenSearch
implementation—e.g., state that the `mvexpand` command expands each value in a
multivalue (array) field into a separate row and that for each document every
element in the specified array field is returned as a new row, keeping the
reference to `mvexpand` but omitting any mention of Splunk.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
docs/user/ppl/cmd/mvexpand.mddocs/user/ppl/index.md
🚧 Files skipped from review as they are similar to previous changes (1)
- docs/user/ppl/index.md
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Update corresponding AST builder classes when making PPL grammar changes
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Verify AST generation for new PPL parser syntax
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: For PPL command PRs, refer docs/dev/ppl-commands.md and verify the PR satisfies the checklist
Applied to files:
docs/user/ppl/cmd/mvexpand.md
🪛 markdownlint-cli2 (0.18.1)
docs/user/ppl/cmd/mvexpand.md
8-8: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
🔇 Additional comments (2)
docs/user/ppl/cmd/mvexpand.md (2)
12-23: LGTM.Field descriptions and output naming guidance are clear and consistent with the design intent.
15-18: Reposition the "Notes about these doctests" section.This section is not inherent to the
mvexpandcommand specification and interrupts the natural flow (Syntax → Output Naming → Examples). Per prior feedback, either remove it entirely or move it to the end of the documentation as procedural guidance.💡 Suggested actions
Option 1 (Preferred): Move lines 15–18 to the end of the document (after all examples), as a meta-note about how the examples are constructed.
Option 2: Remove the section if it's redundant; the examples themselves make it clear that
array()is being used.Likely an incorrect or invalid review comment.
Signed-off-by: Srikanth Padakanti <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @docs/user/ppl/cmd/mvexpand.md:
- Around line 7-10: The fenced code block under the "Syntax" section is missing
a language identifier (MD040); update that block to include the same language
tag used elsewhere (use "ppl") so the block becomes a fenced code block with the
ppl language identifier, matching other examples like the blocks at lines 27 and
52 for linting consistency and clarity.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
docs/user/ppl/cmd/mvexpand.md
🧰 Additional context used
🧠 Learnings (2)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Verify AST generation for new PPL parser syntax
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Update corresponding AST builder classes when making PPL grammar changes
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: For PPL command PRs, refer docs/dev/ppl-commands.md and verify the PR satisfies the checklist
Applied to files:
docs/user/ppl/cmd/mvexpand.md
🪛 markdownlint-cli2 (0.18.1)
docs/user/ppl/cmd/mvexpand.md
8-8: Fenced code blocks should have a language specified
(MD040, fenced-code-language)
…instead of checking message Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In
@integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java:
- Around line 26-67: Replace the hard-coded mapping and inline bulkInsert JSONs
in init() with the resource-based loading helpers: call
loadFromFile("mvexpand_edge_cases_mapping.json") to get the mapping string and
pass it to createIndex(INDEX, mapping), then call
loadFromFile("mvexpand_edge_cases.json") to get the test documents and feed
those documents into the existing bulk insertion flow (e.g., split or parse the
file contents into individual JSON docs and pass them to bulkInsert(INDEX,
...)). Keep the existing refreshIndex(INDEX) call and remove the inline mapping
and JSON literals so the test uses the resource files.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
🧰 Additional context used
📓 Path-based instructions (5)
**/*.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*.java: UsePascalCasefor class names (e.g.,QueryExecutor)
UsecamelCasefor method and variable names (e.g.,executeQuery)
UseUPPER_SNAKE_CASEfor constants (e.g.,MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
PreferOptional<T>for nullable returns in Java
Avoid unnecessary object creation in loops
UseStringBuilderfor string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
⚙️ CodeRabbit configuration file
**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring
- Flag classes >500 lines as needing organization review
- Check for dead code, unused imports, and unused variables
- Identify code reuse opportunities across similar implementations
- Assess holistic maintainability - is code easy to understand and modify?
- Flag code that appears AI-generated without sufficient human review
- Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
- Check for proper JavaDoc on public classes and methods
- Flag redundant comments that restate obvious code
- Ensure proper error handling with specific exception types
- Check for Optional usage instead of null returns
- Validate proper use of try-with-resources for resource management
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
integ-test/**/*IT.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
End-to-end scenarios need integration tests in
integ-test/module
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
⚙️ CodeRabbit configuration file
integ-test/**/*IT.java: - Integration tests MUST use valid test data from resources
- Verify test data files exist in integ-test/src/test/resources/
- Check test assertions are meaningful and specific
- Validate tests clean up resources after execution
- Ensure tests are independent and can run in any order
- Flag tests that reference non-existent indices (e.g., EMP)
- Verify integration tests are in correct module (integ-test/)
- Check tests can be run with ./gradlew :integ-test:integTest
- Ensure proper test data setup and teardown
- Validate end-to-end scenario coverage
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
**/*IT.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
Name integration tests with
*IT.javasuffix in OpenSearch SQL
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
**/test/**/*.java
⚙️ CodeRabbit configuration file
**/test/**/*.java: - Verify NULL input tests for all new functions
- Check boundary condition tests (min/max values, empty inputs)
- Validate error condition tests (invalid inputs, exceptions)
- Ensure multi-document tests for per-document operations
- Flag smoke tests without meaningful assertions
- Check test naming follows pattern: test
- Verify test data is realistic and covers edge cases
- Verify test coverage for new business logic
- Ensure tests are independent and don't rely on execution order
- Validate meaningful test data that reflects real-world scenarios
- Check for proper cleanup of test resources
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
**/calcite/**/*.java
⚙️ CodeRabbit configuration file
**/calcite/**/*.java: - Follow existing Calcite integration patterns
- Verify RelNode visitor implementations are complete
- Check RexNode handling follows project conventions
- Validate SQL generation is correct and optimized
- Ensure Calcite version compatibility
- Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
- Document any Calcite-specific workarounds
- Test compatibility with Calcite version constraints
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
🧠 Learnings (5)
📓 Common learnings
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Verify AST generation for new PPL parser syntax
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Update corresponding AST builder classes when making PPL grammar changes
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in `CalciteRelNodeVisitor` and `CalciteRexNodeVisitor` for Calcite integration
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*IT.java : Name integration tests with `*IT.java` suffix in OpenSearch SQL
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to integ-test/**/*IT.java : End-to-end scenarios need integration tests in `integ-test/` module
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
📚 Learning: 2025-12-29T05:32:03.491Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 4993
File: opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableTopK.java:20-20
Timestamp: 2025-12-29T05:32:03.491Z
Learning: For any custom Calcite RelNode class (e.g., ones that extend EnumerableLimitSort or other Calcite RelNode types), always override the copy method. If copy is not overridden, cloning/copy operations may downgrade the instance to the parent class type, losing the custom behavior. In your implementation, ensure copy returns a new instance of the concrete class with all relevant fields and traits preserved, mirroring the current instance state.
Applied to files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
🧬 Code graph analysis (1)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java (2)
integ-test/src/test/java/org/opensearch/sql/ppl/PPLIntegTestCase.java (1)
PPLIntegTestCase(36-415)integ-test/src/test/java/org/opensearch/sql/sql/IdentifierIT.java (1)
Index(218-262)
🔇 Additional comments (2)
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java (2)
304-346: Remove duplicate helper methods and use inherited utilities.These helper methods (
createIndex,deleteIndexIfExists,bulkInsert,refreshIndex) duplicate functionality available inSQLIntegTestCase. A previous review indicated these should be replaced with existing helpers, but they remain in the file.Use the inherited methods from
SQLIntegTestCaseto follow DRY principles and maintain consistency with other integration tests.Based on coding guidelines: "Identify code reuse opportunities across similar implementations."
⛔ Skipped due to learnings
Learnt from: CR Repo: opensearch-project/sql PR: 0 File: .rules/REVIEW_GUIDELINES.md:0-0 Timestamp: 2025-12-02T17:27:55.938Z Learning: Applies to **/*IT.java : Name integration tests with `*IT.java` suffix in OpenSearch SQL
69-85: OpenSearch SQL mvexpand behavior diverges from Splunk—clarify if this is intentional.This test validates OpenSearch SQL's mvexpand on typed
nestedfields: projecting the full field returns the nested object{"name":"go"}, while projectingskills.namereturns the scalar"go". However, Splunk's mvexpand works fundamentally differently—it expands multivalue fields and requires explicitspathJSON parsing to extract nested fields. Verify whether OpenSearch SQL's implementation (type-aware nested field expansion) is intentionally different from Splunk's approach, or if Splunk compatibility is required. If compatibility is mandatory, the implementation and test approach may need adjustment.
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvExpandCommandIT.java
Show resolved
Hide resolved
|
@penghuo @anasalkouz @dai-chen Can you please review this. Thanks |
Description
This pull request adds native support for the mvexpand command in PPL to OpenSearch SQL, enabling users to expand multivalue fields (arrays) into separate rows directly within queries. This functionality is analogous to Splunk's mvexpand command and streamlines analytics, dashboarding, and data preparation involving arrays or multivalue fields.
Key features introduced:
Native mvexpand command for PPL queries to expand array fields into separate rows/events.
Optional limit parameter to restrict the number of expanded values per event/document.
Robust handling of empty/null arrays, large arrays (with memory/resource limits), and non-array fields.
Streaming/distributable execution for performance and scalability.
Comprehensive documentation and edge case coverage.
This feature makes OpenSearch SQL more powerful and user-friendly for log analytics, data exploration, and migration from platforms like Splunk.
Related Issues
Resolves #4439
#4439
Check List
--signoffor-s.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.