-
Notifications
You must be signed in to change notification settings - Fork 181
Feature/mvcombine #5025
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: main
Are you sure you want to change the base?
Feature/mvcombine #5025
Conversation
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
📝 WalkthroughSummary by CodeRabbit
✏️ Tip: You can customize this high-level summary in your review settings. WalkthroughAdds end-to-end support for a new PPL command Changes
Sequence Diagram(s)sequenceDiagram
participant User as User / PPL Query
participant LexerParser as Lexer & Parser
participant AstBuilder as AST Builder
participant Analyzer as Analyzer
participant CalciteVisitor as Calcite RelVisitor
participant RelBuilder as Calcite RelBuilder
User->>LexerParser: mvcombine <field> [DELIM='<s>']
LexerParser->>AstBuilder: parse mvcombineCommand
AstBuilder->>AstBuilder: build MvCombine(field, delim)
AstBuilder->>Analyzer: pass MvCombine node
Analyzer->>CalciteVisitor: route node for Calcite lowering (getOnlyForCalciteException path)
CalciteVisitor->>RelBuilder: lower child input plan
CalciteVisitor->>RelBuilder: build group-by (non-target fields)
CalciteVisitor->>RelBuilder: aggregate COLLECT(target) excluding NULLs -> MULTISET
CalciteVisitor->>RelBuilder: cast MULTISET -> ARRAY and restore column order
RelBuilder-->>CalciteVisitor: RelNode plan
CalciteVisitor-->>User: final plan/result
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
Suggested labels
Suggested reviewers
🚥 Pre-merge checks | ✅ 3 | ❌ 2❌ Failed checks (1 warning, 1 inconclusive)
✅ Passed checks (3 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing touches
📜 Recent review detailsConfiguration used: Path: .coderabbit.yaml Review profile: CHILL Plan: Pro 📒 Files selected for processing (1)
🚧 Files skipped from review as they are similar to previous changes (1)
Comment |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 8
🤖 Fix all issues with AI agents
In @core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java:
- Around line 3125-3128: CalciteRelNodeVisitor currently defaults delim to a
single space — change the fallback to a comma by updating the delim assignment
in CalciteRelNodeVisitor (the local variable "delim" derived from
node.getDelim()) so the default is ","; also update MvCombine (its constructor
around the current line 29) to normalize null/default delim to "," instead of "
" so both places share the comma default and remove any redundant null-handling
in CalciteRelNodeVisitor if you rely on MvCombine's normalization.
In @docs/user/ppl/cmd/mvcombine.md:
- Line 49: Rename the heading/text labeled "Example 6: Multiple groups" to
"Example 5: Multiple groups" so the examples are sequential (since Example 5 is
missing); locate the "Example 6" occurrence in docs/user/ppl/cmd/mvcombine.md
(the example heading and any internal references or example captions) and change
the numeral from 6 to 5 to restore correct numbering and update any
cross-references if present.
- Line 16: Update the mvcombine usage line to require an explicit boolean for
nomv: replace "mvcombine <field> [nomv] [delim=<string>]" with "mvcombine
<field> [nomv=<boolean>] [delim=<string>]" so the docs match the parser
behavior; this corresponds to the implementation in AstBuilder.java which
expects a boolean literal (e.g., nomv=true or nomv=false) rather than a bare
flag, so adjust any nearby examples or references to use nomv=<boolean> syntax
as well.
- Line 19: The docs state the default delimiter is a comma but the code in
MvCombine.java (constructor assigning this.delim = (delim == null) ? " " :
delim) uses a space; update the docs' line describing "delim" to say the default
is a space (i.e., "Defaults to a space when not provided") and ensure the
example remains consistent with the space-separated output shown elsewhere.
In @ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java:
- Around line 872-873: There are two identical Javadoc comment lines "/**
mvcombine command. */" in AstBuilder; remove the duplicate so only one Javadoc
remains immediately above the mvcombine-related node or method (e.g., the
mvcombine command handling method in class AstBuilder) to avoid redundant
comments.
- Around line 1313-1325: Replace usages of the local helper unquoteStringLiteral
with StringUtils.unquoteText so string delimiters and escape sequences are
handled consistently (e.g., where delimiter is processed in AstBuilder). Update
calls referencing unquoteStringLiteral to call StringUtils.unquoteText(text) and
remove or deprecate the unquoteStringLiteral method; if you keep it, add proper
JavaDoc describing behavior and that it is legacy and does not handle escape
sequences. Ensure imports include StringUtils and run tests to validate
escape/quoted-string handling.
In
@ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvCombineTest.java:
- Around line 36-40: Add a JavaDoc to the public test class
CalcitePPLMvCombineTest explaining its purpose: state that it verifies Calcite
integration for the PPL "mvcombine" command, including tests for basic
aggregation behavior and delimiter handling; place the JavaDoc immediately above
the class declaration for CalcitePPLMvCombineTest and reference the mvcombine
functionality so readers know this class focuses on aggregation and delimiter
functionality within the SCOTT_WITH_TEMPORAL schema.
🧹 Nitpick comments (4)
core/src/main/java/org/opensearch/sql/ast/dsl/AstDSL.java (1)
472-482: Add JavaDoc for public factory methods.Per coding guidelines, all public methods should have proper JavaDoc with
@paramand@returndocumentation. Consider adding documentation similar to other factory methods in this class.📝 Suggested JavaDoc
+ /** + * Creates an MvCombine node with default settings (no delimiter, array output). + * + * @param field the target field to combine + * @return MvCombine node + */ public static MvCombine mvcombine(Field field) { return new MvCombine(field, null, false); } + /** + * Creates an MvCombine node with a custom delimiter. + * + * @param field the target field to combine + * @param delim the delimiter used when nomv is enabled + * @return MvCombine node + */ public static MvCombine mvcombine(Field field, String delim) { return new MvCombine(field, delim, false); } + /** + * Creates an MvCombine node with full configuration. + * + * @param field the target field to combine + * @param delim the delimiter used when nomv is enabled + * @param nomv if true, returns a joined string instead of multivalue array + * @return MvCombine node + */ public static MvCombine mvcombine(Field field, String delim, boolean nomv) { return new MvCombine(field, delim, nomv); }ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java (1)
874-890: Add JavaDoc for visitor method.Per coding guidelines, public methods should have JavaDoc. While other visitor methods in this file follow a minimal comment style, consider adding a brief description for consistency and clarity.
The implementation looks correct:
- Field extraction follows the established pattern
- DELIM and NOMV parameter handling is appropriate
- The boolean check
ctx.booleanLiteral().TRUE() != nullcorrectly distinguishes TRUE from FALSEcore/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java (1)
3114-3215: Method exceeds 50-line threshold — consider extracting helper methods.This method is ~100 lines and handles multiple responsibilities: validation, group-key construction, aggregation, projection reordering, and nomv transformation. Per coding guidelines, methods >50 lines should be flagged. Consider extracting logical sections into helper methods (e.g.,
buildMvCombineGroupKey,buildNomvProjection) to improve readability and testability.integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvCombineCommandIT.java (1)
113-153: Defensive ordering check is redundant — query already sorts.Since the query at line 125 includes
| sort ip, the result order should be deterministic ("10.0.0.7"<"10.0.0.8"lexicographically). The if-else block at lines 136-152 handles both orderings defensively, which obscures the test intent and could mask sorting bugs.Simplified test assertions
JSONArray r0 = result.getJSONArray("datarows").getJSONArray(0); JSONArray r1 = result.getJSONArray("datarows").getJSONArray(1); - String ip0 = r0.getString(0); - String ip1 = r1.getString(0); - - if ("10.0.0.7".equals(ip0)) { - List<String> mv0 = toStringListDropNulls(r0.get(3)); - Assertions.assertTrue( - mv0.contains("1") && mv0.contains("2"), - "Expected 10.0.0.7 to include 1 and 2, got " + mv0); - - List<String> mv1 = toStringListDropNulls(r1.get(3)); - Assertions.assertTrue(mv1.contains("9"), "Expected 10.0.0.8 to include 9, got " + mv1); - } else { - List<String> mv0 = toStringListDropNulls(r0.get(3)); - Assertions.assertTrue(mv0.contains("9"), "Expected 10.0.0.8 to include 9, got " + mv0); - - List<String> mv1 = toStringListDropNulls(r1.get(3)); - Assertions.assertTrue( - mv1.contains("1") && mv1.contains("2"), - "Expected 10.0.0.7 to include 1 and 2, got " + mv1); - } + // After sort, 10.0.0.7 < 10.0.0.8 lexicographically + Assertions.assertEquals("10.0.0.7", r0.getString(0)); + Assertions.assertEquals("10.0.0.8", r1.getString(0)); + + List<String> mv0 = toStringListDropNulls(r0.get(3)); + Assertions.assertTrue( + mv0.contains("1") && mv0.contains("2"), + "Expected 10.0.0.7 to include 1 and 2, got " + mv0); + + List<String> mv1 = toStringListDropNulls(r1.get(3)); + Assertions.assertTrue(mv1.contains("9"), "Expected 10.0.0.8 to include 9, got " + mv1);
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (19)
core/src/main/java/org/opensearch/sql/analysis/Analyzer.javacore/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.javacore/src/main/java/org/opensearch/sql/ast/dsl/AstDSL.javacore/src/main/java/org/opensearch/sql/ast/tree/MvCombine.javacore/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.javadocs/category.jsondocs/user/dql/metadata.rstdocs/user/ppl/cmd/mvcombine.mddoctest/test_data/mvcombine.jsondoctest/test_docs.pydoctest/test_mapping/mvcombine.jsoninteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvCombineCommandIT.javainteg-test/src/test/java/org/opensearch/sql/legacy/SQLIntegTestCase.javainteg-test/src/test/resources/mvcombine.jsoninteg-test/src/test/resources/mvcombine_index_mapping.jsonppl/src/main/antlr/OpenSearchPPLLexer.g4ppl/src/main/antlr/OpenSearchPPLParser.g4ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.javappl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvCombineTest.java
🧰 Additional context used
📓 Path-based instructions (14)
**/*.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*.java: UsePascalCasefor class names (e.g.,QueryExecutor)
UsecamelCasefor method and variable names (e.g.,executeQuery)
UseUPPER_SNAKE_CASEfor constants (e.g.,MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
PreferOptional<T>for nullable returns in Java
Avoid unnecessary object creation in loops
UseStringBuilderfor string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code
Files:
core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.javacore/src/main/java/org/opensearch/sql/ast/dsl/AstDSL.javacore/src/main/java/org/opensearch/sql/analysis/Analyzer.javacore/src/main/java/org/opensearch/sql/ast/tree/MvCombine.javappl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.javainteg-test/src/test/java/org/opensearch/sql/legacy/SQLIntegTestCase.javacore/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvCombineCommandIT.javappl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvCombineTest.java
⚙️ CodeRabbit configuration file
**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring
- Flag classes >500 lines as needing organization review
- Check for dead code, unused imports, and unused variables
- Identify code reuse opportunities across similar implementations
- Assess holistic maintainability - is code easy to understand and modify?
- Flag code that appears AI-generated without sufficient human review
- Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
- Check for proper JavaDoc on public classes and methods
- Flag redundant comments that restate obvious code
- Ensure proper error handling with specific exception types
- Check for Optional usage instead of null returns
- Validate proper use of try-with-resources for resource management
Files:
core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.javacore/src/main/java/org/opensearch/sql/ast/dsl/AstDSL.javacore/src/main/java/org/opensearch/sql/analysis/Analyzer.javacore/src/main/java/org/opensearch/sql/ast/tree/MvCombine.javappl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.javainteg-test/src/test/java/org/opensearch/sql/legacy/SQLIntegTestCase.javacore/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvCombineCommandIT.javappl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvCombineTest.java
core/src/main/java/**/*.java
⚙️ CodeRabbit configuration file
core/src/main/java/**/*.java: - New functions MUST have unit tests in the same commit
Files:
core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.javacore/src/main/java/org/opensearch/sql/ast/dsl/AstDSL.javacore/src/main/java/org/opensearch/sql/analysis/Analyzer.javacore/src/main/java/org/opensearch/sql/ast/tree/MvCombine.javacore/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
core/src/main/java/org/opensearch/sql/ast/**/*.java
⚙️ CodeRabbit configuration file
core/src/main/java/org/opensearch/sql/ast/**/*.java: - AST nodes must be immutable where possible
- Follow visitor pattern for AST traversal
- Ensure proper toString() implementation for debugging
Files:
core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.javacore/src/main/java/org/opensearch/sql/ast/dsl/AstDSL.javacore/src/main/java/org/opensearch/sql/ast/tree/MvCombine.java
**/*.g4
⚙️ CodeRabbit configuration file
**/*.g4: - Check if modifying unrelated grammar files (scope creep)
- Verify grammar rule placement follows project patterns
- Question if new rule needed vs reusing existing rules
- Ensure changes are relevant to the PR's stated purpose
Files:
ppl/src/main/antlr/OpenSearchPPLLexer.g4ppl/src/main/antlr/OpenSearchPPLParser.g4
docs/**/*.rst
⚙️ CodeRabbit configuration file
docs/**/*.rst: - Verify examples use valid test data and indices
- Check documentation follows existing patterns and structure
- Validate syntax examples are complete and correct
- Ensure code examples are tested and working
- Check for consistent formatting and style
Files:
docs/user/dql/metadata.rst
integ-test/src/test/resources/**/*
⚙️ CodeRabbit configuration file
integ-test/src/test/resources/**/*: - Verify test data is realistic and representative
- Check data format matches expected schema
- Ensure test data covers edge cases and boundary conditions
Files:
integ-test/src/test/resources/mvcombine.jsoninteg-test/src/test/resources/mvcombine_index_mapping.json
**/ppl/**/*.java
⚙️ CodeRabbit configuration file
**/ppl/**/*.java: - For PPL parser changes, verify grammar tests with positive/negative cases
- Check AST generation for new syntax
- Ensure corresponding AST builder classes are updated
- Validate edge cases and boundary conditions
Files:
ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.javappl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvCombineTest.java
**/test/**/*.java
⚙️ CodeRabbit configuration file
**/test/**/*.java: - Verify NULL input tests for all new functions
- Check boundary condition tests (min/max values, empty inputs)
- Validate error condition tests (invalid inputs, exceptions)
- Ensure multi-document tests for per-document operations
- Flag smoke tests without meaningful assertions
- Check test naming follows pattern: test
- Verify test data is realistic and covers edge cases
- Verify test coverage for new business logic
- Ensure tests are independent and don't rely on execution order
- Validate meaningful test data that reflects real-world scenarios
- Check for proper cleanup of test resources
Files:
integ-test/src/test/java/org/opensearch/sql/legacy/SQLIntegTestCase.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvCombineCommandIT.javappl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvCombineTest.java
**/calcite/**/*.java
⚙️ CodeRabbit configuration file
**/calcite/**/*.java: - Follow existing Calcite integration patterns
- Verify RelNode visitor implementations are complete
- Check RexNode handling follows project conventions
- Validate SQL generation is correct and optimized
- Ensure Calcite version compatibility
- Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
- Document any Calcite-specific workarounds
- Test compatibility with Calcite version constraints
Files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvCombineCommandIT.javappl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvCombineTest.java
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
⚙️ CodeRabbit configuration file
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java: - Flag methods >50 lines - this file is known to be hard to read
- Suggest extracting complex logic into helper methods
- Check for code organization and logical grouping
- Validate all RelNode types are handled
Files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
integ-test/**/*IT.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
End-to-end scenarios need integration tests in
integ-test/module
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvCombineCommandIT.java
⚙️ CodeRabbit configuration file
integ-test/**/*IT.java: - Integration tests MUST use valid test data from resources
- Verify test data files exist in integ-test/src/test/resources/
- Check test assertions are meaningful and specific
- Validate tests clean up resources after execution
- Ensure tests are independent and can run in any order
- Flag tests that reference non-existent indices (e.g., EMP)
- Verify integration tests are in correct module (integ-test/)
- Check tests can be run with ./gradlew :integ-test:integTest
- Ensure proper test data setup and teardown
- Validate end-to-end scenario coverage
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvCombineCommandIT.java
**/*IT.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
Name integration tests with
*IT.javasuffix in OpenSearch SQL
Files:
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvCombineCommandIT.java
ppl/src/main/antlr/OpenSearchPPLParser.g4
⚙️ CodeRabbit configuration file
ppl/src/main/antlr/OpenSearchPPLParser.g4: - Identify code reuse opportunities with existing commands
- Validate command follows PPL naming and structure patterns
- Check if command should be alias vs separate implementation
Files:
ppl/src/main/antlr/OpenSearchPPLParser.g4
**/*Test.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*Test.java: All new business logic requires unit tests
Name unit tests with*Test.javasuffix in OpenSearch SQL
Files:
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvCombineTest.java
🧠 Learnings (9)
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in `CalciteRelNodeVisitor` and `CalciteRexNodeVisitor` for Calcite integration
Applied to files:
core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.javacore/src/main/java/org/opensearch/sql/analysis/Analyzer.javacore/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.javappl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvCombineTest.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Update corresponding AST builder classes when making PPL grammar changes
Applied to files:
ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Verify AST generation for new PPL parser syntax
Applied to files:
ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*IT.java : Name integration tests with `*IT.java` suffix in OpenSearch SQL
Applied to files:
integ-test/src/test/java/org/opensearch/sql/legacy/SQLIntegTestCase.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvCombineCommandIT.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*Test.java : Name unit tests with `*Test.java` suffix in OpenSearch SQL
Applied to files:
integ-test/src/test/java/org/opensearch/sql/legacy/SQLIntegTestCase.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes
Applied to files:
integ-test/src/test/java/org/opensearch/sql/legacy/SQLIntegTestCase.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvCombineCommandIT.javappl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvCombineTest.java
📚 Learning: 2025-12-29T05:32:03.491Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 4993
File: opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableTopK.java:20-20
Timestamp: 2025-12-29T05:32:03.491Z
Learning: For any custom Calcite RelNode class (e.g., ones that extend EnumerableLimitSort or other Calcite RelNode types), always override the copy method. If copy is not overridden, cloning/copy operations may downgrade the instance to the parent class type, losing the custom behavior. In your implementation, ensure copy returns a new instance of the concrete class with all relevant fields and traits preserved, mirroring the current instance state.
Applied to files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.javainteg-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvCombineCommandIT.javappl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvCombineTest.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Document Calcite-specific workarounds in code
Applied to files:
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvCombineTest.java
📚 Learning: 2025-12-11T05:27:39.856Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 0
File: :0-0
Timestamp: 2025-12-11T05:27:39.856Z
Learning: In opensearch-project/sql, for SEMI and ANTI join types in CalciteRelNodeVisitor.java, the `max` option has no effect because these join types only use the left side to filter records based on the existence of matches in the right side. The join results are identical regardless of max value (max=1, max=2, or max=∞). The early return for SEMI/ANTI joins before processing the `max` option is intentional and correct behavior.
Applied to files:
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvCombineTest.java
🪛 Biome (2.1.2)
integ-test/src/test/resources/mvcombine.json
[error] 1-2: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 2-3: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 3-4: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 4-5: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 5-6: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 7-8: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 8-9: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 10-11: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 11-12: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 12-13: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 13-14: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 14-15: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 15-16: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 17-18: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 18-19: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 19-20: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 20-21: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 21-22: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 22-23: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 24-25: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 25-26: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 26-27: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 27-28: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 28-29: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 29-30: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 31-32: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 32-33: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 33-34: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 34-35: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 35-36: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 36-37: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 38-39: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 39-40: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 40-41: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 41-42: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
doctest/test_data/mvcombine.json
[error] 1-2: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 2-3: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 3-4: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 4-5: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 5-6: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 6-7: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 7-8: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 8-9: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 9-10: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 10-11: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 11-12: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 12-13: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 13-14: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 14-15: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 15-16: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 16-17: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
[error] 17-18: End of file expected
Use an array for a sequence of values: [1, 2]
(parse)
🪛 LanguageTool
docs/user/ppl/cmd/mvcombine.md
[style] ~12-~12: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...elimiter only when nomv is enabled. * If some rows are missing the target field,...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
🔇 Additional comments (19)
doctest/test_mapping/mvcombine.json (1)
1-12: LGTM!This is a well-structured test mapping file with appropriate field types for testing the mvcombine feature. The mapping includes a good mix of field types (keywords, numeric, specialized IP type) and field names that logically represent test scenarios (grouping fields like
caseandip, combination-candidate fields liketagsandpackets_str).ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvCombineTest.java (2)
42-61: LGTM! Test data appropriately includes non-consecutive rows.The test data correctly includes non-consecutive "A" values (rows 1, 2, and 4) to verify that
mvcombinegroups all matching rows across the result set, not just adjacent ones. This aligns with the PR objectives stating "Grouping is not limited to adjacent rows."
111-160: LGTM! Clean ScannableTable implementation.The inner class is well-structured:
- Properly implements all
ScannableTableinterface methods- Uses Lombok
@RequiredArgsConstructorappropriately for immutable fields- Schema definition is clear with explicit nullability
- Statistics implementation provides deterministic behavior for test planning
docs/category.json (1)
39-39: LGTM! Documentation reference added correctly.The mvcombine.md reference is properly added to the ppl_cli_calcite category and follows the expected alphabetical ordering.
doctest/test_data/mvcombine.json (1)
1-18: LGTM! Test data structure is appropriate.The NDJSON (newline-delimited JSON) format is correct for this test data file. The static analysis errors from Biome are false positives—NDJSON is the expected format where each line contains a separate JSON object.
The test data covers relevant scenarios:
- Multiple rows with identical grouping keys (ip/bytes/tags) for combining
- Missing target fields
- Duplicate values
- Empty strings
doctest/test_docs.py (1)
60-61: LGTM! Test data mapping added correctly.The mvcombine_data entry follows the existing pattern and properly maps the index name to the test data file.
docs/user/dql/metadata.rst (1)
38-38: LGTM! Example output correctly updated.The fetched rows count and table entry for the new mvcombine_data index are correctly added, maintaining consistency with the existing format and alphabetical ordering.
Also applies to: 51-51
core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java (1)
71-71: LGTM!The import and visitor method for
MvCombinefollow the established visitor pattern correctly, delegating tovisitChildrenfor traversal like other AST node types.Also applies to: 466-468
integ-test/src/test/resources/mvcombine_index_mapping.json (1)
1-13: LGTM!The index mapping defines appropriate test fields for mvcombine integration tests. The static analysis errors from Biome are false positives—this is valid Elasticsearch mapping JSON format.
ppl/src/main/antlr/OpenSearchPPLLexer.g4 (1)
73-74: LGTM!The new lexer tokens
MVCOMBINEandNOMVare correctly defined and appropriately placed in the COMMAND KEYWORDS section, following the existing pattern.integ-test/src/test/resources/mvcombine.json (1)
1-42: LGTM!The test data is comprehensive and covers important edge cases for mvcombine testing:
- Multiple rows with identical non-target fields for combining
- Missing target field values
- Duplicate target values
- Empty strings
- Varying field presence
The static analysis errors are false positives—this is valid NDJSON format for Elasticsearch bulk API indexing.
core/src/main/java/org/opensearch/sql/analysis/Analyzer.java (1)
83-83: LGTM!The
visitMvCombineimplementation correctly follows the pattern for Calcite-only commands, throwing the appropriate exception to route mvcombine operations to the Calcite execution path. This is consistent with other Calcite-only commands likeaddtotalsandaddcoltotals.Also applies to: 537-540
integ-test/src/test/java/org/opensearch/sql/legacy/SQLIntegTestCase.java (1)
924-928: LGTM!The new
MVCOMBINEtest index entry follows the established pattern of other entries in theIndexenum. The naming convention, type, and file path patterns are consistent with existing test indexes.ppl/src/main/antlr/OpenSearchPPLParser.g4 (3)
535-541: Consider semantic validation fordelimwithoutnomv.Per the PR description, "delim is applicable only when nomv is enabled." However, the grammar allows specifying
delimwithoutnomv. While the grammar correctly parses this, ensure semantic validation exists downstream (in the analyzer or execution layer) to either:
- Warn/error when
delimis specified butnomvis false/unset, or- Document that
delimis silently ignored whennomv=falseAdditionally, the parameter order is fixed (DELIM must come before NOMV). Consider if this is intentional or if the grammar should support either order.
1585-1585: LGTM!Adding
NOMVtosearchableKeyWordis correct and consistent with how other argument keywords are handled.
91-91: LGTM!The
mvcombineCommandis properly integrated into thecommandsproduction, following the pattern of other similar commands.core/src/main/java/org/opensearch/sql/ast/tree/MvCombine.java (1)
17-46: Well-structured AST node following existing patterns.The class correctly implements the visitor pattern and follows the existing
UnresolvedPlanconventions. The mutablechildfield viaattach()mirrors the pattern used by other AST nodes in this package.Note: The delimiter default issue (space vs. comma per spec) is addressed in the
CalciteRelNodeVisitorcomment.integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvCombineCommandIT.java (2)
159-186: Strengthen delimiter default verification after fixing default.The regex
Pattern.compile("1.*2.*3")at line 179 matches regardless of delimiter. Once the default delimiter is corrected to comma (per spec), consider tightening the assertion to verify the exact format.// After fixing default to comma, verify exact format: Assertions.assertTrue( s.matches("1 2 3") || s.equals("1, 2, 3") || s.contains(","), "Expected comma-delimited output, got: " + s);
22-31: Good test initialization and comprehensive coverage.The test class correctly initializes Calcite mode and loads the MVCOMBINE test index. Test coverage includes happy paths (basic grouping, multi-group), edge cases (single-row groups, missing target values), nomv/delim variations, and error semantics (missing field → 4xx).
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
Outdated
Show resolved
Hide resolved
ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java
Outdated
Show resolved
Hide resolved
ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java
Outdated
Show resolved
Hide resolved
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvCombineTest.java
Show resolved
Hide resolved
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvCombineTest.java
Show resolved
Hide resolved
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (3)
docs/user/ppl/cmd/mvcombine.md (1)
7-13: Consider varying sentence structure for better readability.Three consecutive bullet points begin with "If", which slightly reduces readability. Consider rephrasing to vary the sentence structure.
♻️ Suggested rewording
Key aspects of `mvcombine`: * It generates one row per group, where the group keys are all fields currently in the pipeline except the target field. * The target field becomes a multivalue field containing the combined values from the grouped rows. -* If `nomv` is specified, the target field is returned as a single scalar string instead of a multivalue array. -* If `delim` is specified, it controls the delimiter only when `nomv` is enabled. -* If some rows are missing the target field, those rows contribute no value to the combined output for that group. +* When `nomv` is specified, the target field is returned as a single scalar string instead of a multivalue array. +* The `delim` parameter controls the delimiter only when `nomv` is enabled. +* Rows missing the target field contribute no value to the combined output for that group.Based on static analysis hints.
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java (2)
3114-3216: Consider extracting the nomv conversion logic into a helper method.The
visitMvCombinemethod is 102 lines, exceeding the 50-line complexity guideline. The nomv conversion logic (lines 3180-3212) is a self-contained block that could be extracted to improve readability and maintainability.♻️ Proposed refactor
Extract the nomv conversion logic to a private helper method:
/** * Applies nomv conversion: transforms multivalue array output to delimited string. * Casts array elements to VARCHAR and applies ARRAY_TO_STRING with the given delimiter. */ private void applyNomvConversion( List<String> inputFieldNames, int targetIndex, String delim, CalcitePlanContext context) { final RelBuilder relBuilder = context.relBuilder; final RexBuilder rexBuilder = context.rexBuilder; final List<RexNode> nomvProjections = new ArrayList<>(inputFieldNames.size()); final List<String> nomvNames = new ArrayList<>(inputFieldNames.size()); // Build ARRAY<VARCHAR> type once final RelDataType stringType = relBuilder .getTypeFactory() .createSqlType(org.apache.calcite.sql.type.SqlTypeName.VARCHAR); final RelDataType stringArrayType = relBuilder.getTypeFactory().createArrayType(stringType, -1); for (int i = 0; i < inputFieldNames.size(); i++) { final String name = inputFieldNames.get(i); nomvNames.add(name); if (i == targetIndex) { final RexNode stringArray = rexBuilder.makeCast(stringArrayType, relBuilder.field(i)); nomvProjections.add( relBuilder.call( org.apache.calcite.sql.fun.SqlLibraryOperators.ARRAY_TO_STRING, stringArray, relBuilder.literal(delim))); } else { nomvProjections.add(relBuilder.field(i)); } } relBuilder.project(nomvProjections, nomvNames, /* force= */ true); }Then simplify the main method:
relBuilder.project(projections, projNames, /* force= */ true); // 7) nomv=true converts multivalue output to a single delimited string. - // arrayToString in this engine supports only String/ByteString, so cast elements to STRING - // first. if (node.isNomv()) { - final List<RexNode> nomvProjections = new ArrayList<>(inputFieldNames.size()); - final List<String> nomvNames = new ArrayList<>(inputFieldNames.size()); - - // Build ARRAY<VARCHAR> type once - final RelDataType stringType = - relBuilder - .getTypeFactory() - .createSqlType(org.apache.calcite.sql.type.SqlTypeName.VARCHAR); - final RelDataType stringArrayType = - relBuilder.getTypeFactory().createArrayType(stringType, -1); - - for (int i = 0; i < inputFieldNames.size(); i++) { - final String name = inputFieldNames.get(i); - nomvNames.add(name); - - if (i == targetIndex) { - // At this point relBuilder.field(i) is ARRAY<elemType>. Cast to ARRAY<VARCHAR> so - // ARRAY_TO_STRING works. - final RexNode stringArray = rexBuilder.makeCast(stringArrayType, relBuilder.field(i)); - - nomvProjections.add( - relBuilder.call( - org.apache.calcite.sql.fun.SqlLibraryOperators.ARRAY_TO_STRING, - stringArray, - relBuilder.literal(delim))); - } else { - nomvProjections.add(relBuilder.field(i)); - } - } - - relBuilder.project(nomvProjections, nomvNames, /* force= */ true); + applyNomvConversion(inputFieldNames, targetIndex, delim, context); } return relBuilder.peek();As per coding guidelines: methods should be kept under 50 lines, and methods over 50 lines should be flagged as potentially too complex.
56-56: Consider using explicit imports instead of wildcard.The wildcard import
import org.apache.calcite.rex.*;was introduced, which can reduce code clarity. Explicit imports make it easier to understand which classes are actually used.However, if many Rex classes are used throughout the file, the wildcard import may be acceptable for brevity.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (4)
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.javadocs/user/ppl/cmd/mvcombine.mdppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.javappl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvCombineTest.java
🚧 Files skipped from review as they are similar to previous changes (1)
- ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvCombineTest.java
🧰 Additional context used
📓 Path-based instructions (5)
**/*.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*.java: UsePascalCasefor class names (e.g.,QueryExecutor)
UsecamelCasefor method and variable names (e.g.,executeQuery)
UseUPPER_SNAKE_CASEfor constants (e.g.,MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
PreferOptional<T>for nullable returns in Java
Avoid unnecessary object creation in loops
UseStringBuilderfor string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code
Files:
ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.javacore/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
⚙️ CodeRabbit configuration file
**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring
- Flag classes >500 lines as needing organization review
- Check for dead code, unused imports, and unused variables
- Identify code reuse opportunities across similar implementations
- Assess holistic maintainability - is code easy to understand and modify?
- Flag code that appears AI-generated without sufficient human review
- Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
- Check for proper JavaDoc on public classes and methods
- Flag redundant comments that restate obvious code
- Ensure proper error handling with specific exception types
- Check for Optional usage instead of null returns
- Validate proper use of try-with-resources for resource management
Files:
ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.javacore/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
**/ppl/**/*.java
⚙️ CodeRabbit configuration file
**/ppl/**/*.java: - For PPL parser changes, verify grammar tests with positive/negative cases
- Check AST generation for new syntax
- Ensure corresponding AST builder classes are updated
- Validate edge cases and boundary conditions
Files:
ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java
core/src/main/java/**/*.java
⚙️ CodeRabbit configuration file
core/src/main/java/**/*.java: - New functions MUST have unit tests in the same commit
Files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
**/calcite/**/*.java
⚙️ CodeRabbit configuration file
**/calcite/**/*.java: - Follow existing Calcite integration patterns
- Verify RelNode visitor implementations are complete
- Check RexNode handling follows project conventions
- Validate SQL generation is correct and optimized
- Ensure Calcite version compatibility
- Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
- Document any Calcite-specific workarounds
- Test compatibility with Calcite version constraints
Files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
⚙️ CodeRabbit configuration file
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java: - Flag methods >50 lines - this file is known to be hard to read
- Suggest extracting complex logic into helper methods
- Check for code organization and logical grouping
- Validate all RelNode types are handled
Files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
🧠 Learnings (4)
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Update corresponding AST builder classes when making PPL grammar changes
Applied to files:
ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Flag redundant or obvious comments during code review
Applied to files:
ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in `CalciteRelNodeVisitor` and `CalciteRexNodeVisitor` for Calcite integration
Applied to files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
📚 Learning: 2025-12-29T05:32:03.491Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 4993
File: opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableTopK.java:20-20
Timestamp: 2025-12-29T05:32:03.491Z
Learning: For any custom Calcite RelNode class (e.g., ones that extend EnumerableLimitSort or other Calcite RelNode types), always override the copy method. If copy is not overridden, cloning/copy operations may downgrade the instance to the parent class type, losing the custom behavior. In your implementation, ensure copy returns a new instance of the concrete class with all relevant fields and traits preserved, mirroring the current instance state.
Applied to files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
🪛 LanguageTool
docs/user/ppl/cmd/mvcombine.md
[style] ~12-~12: Three successive sentences begin with the same word. Consider rewording the sentence or use a thesaurus to find a synonym.
Context: ...elimiter only when nomv is enabled. * If some rows are missing the target field,...
(ENGLISH_WORD_REPEAT_BEGINNING_RULE)
🔇 Additional comments (1)
ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java (1)
872-889: LGTM! Clean and straightforward implementation.The
visitMvcombineCommandmethod correctly:
- Extracts the required field parameter
- Handles optional
delimparameter with proper unquoting- Handles optional
nomvparameter with boolean literal parsing- Constructs the MvCombine AST node with appropriate defaults
The implementation follows existing patterns in the AstBuilder class and is concise at 17 lines.
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java:
- Around line 3120-3121: Add a JavaDoc block to the public method visitMvCombine
in CalciteRelNodeVisitor: describe the method's purpose and behavior (it visits
an MvCombine AST node and produces a RelNode that combines multivalue fields),
document @param node as the MvCombine AST node to process, @param context as the
CalcitePlanContext used during planning, @return as the resulting RelNode with
combined multivalue fields, and include @throws SemanticCheckException when the
target is not a direct field reference; ensure formatting follows the project's
JavaDoc style and appears immediately above the visitMvCombine method
declaration.
🧹 Nitpick comments (1)
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java (1)
3120-3221: Consider refactoring: method exceeds complexity guidelines.This 101-line method exceeds the 50-line guideline for this file. The
nomvhandling logic (lines 3186-3218) is a self-contained 33-line block that could be extracted into a private helper method, improving readability and maintainability.Based on coding guidelines, methods >50 lines in this file should be flagged as potentially too complex.
Suggested refactoring approach
Consider extracting:
private void applyNomvProjection( RelBuilder relBuilder, RexBuilder rexBuilder, List<String> inputFieldNames, int targetIndex, String delim) { // Lines 3187-3217 logic here }Then call from
visitMvCombine:if (node.isNomv()) { applyNomvProjection(relBuilder, rexBuilder, inputFieldNames, targetIndex, delim); }
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.javadocs/user/ppl/cmd/mvcombine.md
🚧 Files skipped from review as they are similar to previous changes (1)
- docs/user/ppl/cmd/mvcombine.md
🧰 Additional context used
📓 Path-based instructions (4)
**/*.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*.java: UsePascalCasefor class names (e.g.,QueryExecutor)
UsecamelCasefor method and variable names (e.g.,executeQuery)
UseUPPER_SNAKE_CASEfor constants (e.g.,MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
PreferOptional<T>for nullable returns in Java
Avoid unnecessary object creation in loops
UseStringBuilderfor string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code
Files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
⚙️ CodeRabbit configuration file
**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring
- Flag classes >500 lines as needing organization review
- Check for dead code, unused imports, and unused variables
- Identify code reuse opportunities across similar implementations
- Assess holistic maintainability - is code easy to understand and modify?
- Flag code that appears AI-generated without sufficient human review
- Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
- Check for proper JavaDoc on public classes and methods
- Flag redundant comments that restate obvious code
- Ensure proper error handling with specific exception types
- Check for Optional usage instead of null returns
- Validate proper use of try-with-resources for resource management
Files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
core/src/main/java/**/*.java
⚙️ CodeRabbit configuration file
core/src/main/java/**/*.java: - New functions MUST have unit tests in the same commit
Files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
**/calcite/**/*.java
⚙️ CodeRabbit configuration file
**/calcite/**/*.java: - Follow existing Calcite integration patterns
- Verify RelNode visitor implementations are complete
- Check RexNode handling follows project conventions
- Validate SQL generation is correct and optimized
- Ensure Calcite version compatibility
- Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
- Document any Calcite-specific workarounds
- Test compatibility with Calcite version constraints
Files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
⚙️ CodeRabbit configuration file
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java: - Flag methods >50 lines - this file is known to be hard to read
- Suggest extracting complex logic into helper methods
- Check for code organization and logical grouping
- Validate all RelNode types are handled
Files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
🧠 Learnings (2)
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in `CalciteRelNodeVisitor` and `CalciteRexNodeVisitor` for Calcite integration
Applied to files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
📚 Learning: 2025-12-29T05:32:03.491Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 4993
File: opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableTopK.java:20-20
Timestamp: 2025-12-29T05:32:03.491Z
Learning: For any custom Calcite RelNode class (e.g., ones that extend EnumerableLimitSort or other Calcite RelNode types), always override the copy method. If copy is not overridden, cloning/copy operations may downgrade the instance to the parent class type, losing the custom behavior. In your implementation, ensure copy returns a new instance of the concrete class with all relevant fields and traits preserved, mirroring the current instance state.
Applied to files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
🔇 Additional comments (2)
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java (2)
125-125: LGTM: Import added for MvCombine support.The import is correctly placed and necessary for the new
visitMvCombinemethod implementation.
3153-3181: Verify ARRAY_TO_STRING edge case handling and complex type support.Integration tests confirm that COLLECT aggregation preserves null values in the target field (see
testMvCombine_missingTargetWithinGroup_collapses_nonNullPreserved()), and emptygroupExprs(when all non-target fields form the group key) is the intended behavior for global aggregation.However, the following edge cases lack test coverage:
- ARRAY_TO_STRING (lines 3210-3212): No tests verify behavior with null array elements or empty arrays in nomv mode
- Complex types: No test coverage for nested arrays or struct types as the target field
These gaps should be addressed with additional integration tests to ensure robustness across all supported scenarios.
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
🤖 Fix all issues with AI agents
In @core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java:
- Around line 3121-3222: The method visitMvCombine is too long and missing
JavaDoc; split its long blocks into two private helpers
(restoreColumnOrderWithArrayCast and applyNomvConversion) and add JavaDoc to
visitMvCombine with @param, @return, and @throws. Extract the logic that
restores original column order and casts MULTISET→ARRAY (current block around
building projections and relBuilder.project where aggPos, elemType, arrayType,
projections, projNames, groupPos are used) into private void
restoreColumnOrderWithArrayCast(... ) taking relBuilder, rexBuilder, input,
inputFieldNames, targetIndex, and groupExprs as needed; extract the nomv
conversion block (where stringArrayType, stringArray cast, and ARRAY_TO_STRING
call are built) into private void applyNomvConversion(... ) taking relBuilder,
rexBuilder, inputFieldNames, targetIndex, and delim; update visitMvCombine to
call these helpers and keep behavior identical; ensure helpers are private,
well-documented if public, and preserve existing variable semantics and
exceptions (SemanticCheckException) so functionality and error handling remain
unchanged.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
🧰 Additional context used
📓 Path-based instructions (4)
**/*.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*.java: UsePascalCasefor class names (e.g.,QueryExecutor)
UsecamelCasefor method and variable names (e.g.,executeQuery)
UseUPPER_SNAKE_CASEfor constants (e.g.,MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
PreferOptional<T>for nullable returns in Java
Avoid unnecessary object creation in loops
UseStringBuilderfor string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code
Files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
⚙️ CodeRabbit configuration file
**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring
- Flag classes >500 lines as needing organization review
- Check for dead code, unused imports, and unused variables
- Identify code reuse opportunities across similar implementations
- Assess holistic maintainability - is code easy to understand and modify?
- Flag code that appears AI-generated without sufficient human review
- Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
- Check for proper JavaDoc on public classes and methods
- Flag redundant comments that restate obvious code
- Ensure proper error handling with specific exception types
- Check for Optional usage instead of null returns
- Validate proper use of try-with-resources for resource management
Files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
core/src/main/java/**/*.java
⚙️ CodeRabbit configuration file
core/src/main/java/**/*.java: - New functions MUST have unit tests in the same commit
Files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
**/calcite/**/*.java
⚙️ CodeRabbit configuration file
**/calcite/**/*.java: - Follow existing Calcite integration patterns
- Verify RelNode visitor implementations are complete
- Check RexNode handling follows project conventions
- Validate SQL generation is correct and optimized
- Ensure Calcite version compatibility
- Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
- Document any Calcite-specific workarounds
- Test compatibility with Calcite version constraints
Files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
⚙️ CodeRabbit configuration file
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java: - Flag methods >50 lines - this file is known to be hard to read
- Suggest extracting complex logic into helper methods
- Check for code organization and logical grouping
- Validate all RelNode types are handled
Files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
🧠 Learnings (2)
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in `CalciteRelNodeVisitor` and `CalciteRexNodeVisitor` for Calcite integration
Applied to files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
📚 Learning: 2025-12-29T05:32:03.491Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 4993
File: opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableTopK.java:20-20
Timestamp: 2025-12-29T05:32:03.491Z
Learning: For any custom Calcite RelNode class (e.g., ones that extend EnumerableLimitSort or other Calcite RelNode types), always override the copy method. If copy is not overridden, cloning/copy operations may downgrade the instance to the parent class type, losing the custom behavior. In your implementation, ensure copy returns a new instance of the concrete class with all relevant fields and traits preserved, mirroring the current instance state.
Applied to files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
🔇 Additional comments (1)
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java (1)
56-56: LGTM: New imports are appropriate.The
RexBuilderandMvCombineimports are necessary for the newvisitMvCombineimplementation and are used correctly.Also applies to: 126-126
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
Show resolved
Hide resolved
Signed-off-by: Srikanth Padakanti <[email protected]>
Signed-off-by: Srikanth Padakanti <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 1
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
🧰 Additional context used
📓 Path-based instructions (4)
**/*.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*.java: UsePascalCasefor class names (e.g.,QueryExecutor)
UsecamelCasefor method and variable names (e.g.,executeQuery)
UseUPPER_SNAKE_CASEfor constants (e.g.,MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
PreferOptional<T>for nullable returns in Java
Avoid unnecessary object creation in loops
UseStringBuilderfor string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code
Files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
⚙️ CodeRabbit configuration file
**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring
- Flag classes >500 lines as needing organization review
- Check for dead code, unused imports, and unused variables
- Identify code reuse opportunities across similar implementations
- Assess holistic maintainability - is code easy to understand and modify?
- Flag code that appears AI-generated without sufficient human review
- Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
- Check for proper JavaDoc on public classes and methods
- Flag redundant comments that restate obvious code
- Ensure proper error handling with specific exception types
- Check for Optional usage instead of null returns
- Validate proper use of try-with-resources for resource management
Files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
core/src/main/java/**/*.java
⚙️ CodeRabbit configuration file
core/src/main/java/**/*.java: - New functions MUST have unit tests in the same commit
Files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
**/calcite/**/*.java
⚙️ CodeRabbit configuration file
**/calcite/**/*.java: - Follow existing Calcite integration patterns
- Verify RelNode visitor implementations are complete
- Check RexNode handling follows project conventions
- Validate SQL generation is correct and optimized
- Ensure Calcite version compatibility
- Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
- Document any Calcite-specific workarounds
- Test compatibility with Calcite version constraints
Files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
⚙️ CodeRabbit configuration file
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java: - Flag methods >50 lines - this file is known to be hard to read
- Suggest extracting complex logic into helper methods
- Check for code organization and logical grouping
- Validate all RelNode types are handled
Files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
🧠 Learnings (1)
📚 Learning: 2025-12-29T05:32:03.491Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 4993
File: opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableTopK.java:20-20
Timestamp: 2025-12-29T05:32:03.491Z
Learning: For any custom Calcite RelNode class (e.g., ones that extend EnumerableLimitSort or other Calcite RelNode types), always override the copy method. If copy is not overridden, cloning/copy operations may downgrade the instance to the parent class type, losing the custom behavior. In your implementation, ensure copy returns a new instance of the concrete class with all relevant fields and traits preserved, mirroring the current instance state.
Applied to files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
🔇 Additional comments (6)
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java (6)
56-56: LGTM: Necessary imports for mvcombine implementation.The added imports are appropriate for the mvcombine functionality:
RexBuilderis needed for type casting operations (MULTISET → ARRAY)MvCombineis the AST node representing the mvcombine commandAlso applies to: 126-126
3177-3184: LGTM: Clear validation with specific error handling.The helper method correctly validates that the mvcombine target is a direct field reference (not an expression), which aligns with the command semantics. The use of
SemanticCheckExceptionwith a descriptive error message follows the project's error-handling guidelines.
3186-3194: LGTM: Correct group-by logic.The helper correctly builds group expressions for all fields except the target. Returning an empty list when only the target field exists is the correct behavior—it groups all rows together into a single combined multivalue result.
3206-3236: LGTM: Correct column order restoration with proper type casting.The helper method correctly:
- Identifies the aggregated MULTISET position after grouping
- Extracts the original element type for the target field
- Constructs the appropriate ARRAY type
- Rebuilds projections in the original column order
- Casts MULTISET → ARRAY for the target field position
The use of
force=truein the project call is appropriate to ensure the transformation is applied. The unbounded array size (-1) is correct for this use case.Note: If the target field is already an array type, this will create a nested
ARRAY<ARRAY<T>>structure. This may be the intended behavior, but verify with test coverage for nested array scenarios.
3238-3273: LGTM: Correct nomv conversion using native Calcite functions.The helper method properly implements the
nomv=truebehavior by:
- Building the target
ARRAY<VARCHAR>type once for efficiency- Casting the multivalue array to
ARRAY<VARCHAR>to ensure type compatibility- Using Calcite's native
ARRAY_TO_STRINGfunction with the provided delimiter- Preserving all other fields unchanged
- Maintaining the original column order and names
The two-step cast (ARRAY → ARRAY → STRING) is necessary for type safety with
ARRAY_TO_STRING.
3196-3204: NULL handling in COLLECT aggregation is correct and tested.The SQL standard
COLLECTfunction includes NULL values in the resulting multiset, which is the expected behavior here. Unit tests (CalcitePPLMvCombineTest.java, testMvCombineWithNullTargetValues) and integration tests (CalciteMvCombineCommandIT.java) verify NULL handling works correctly. The documentation clarifies that "rows missing the target field" (i.e., field not present in the schema) do not contribute; this is distinct from NULL values, which are properly included per SQL semantics. No changes needed.
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
Show resolved
Hide resolved
Signed-off-by: Srikanth Padakanti <[email protected]>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java (1)
3139-3175: Consider refactoring: Method exceeds complexity threshold.The
visitMvCombinemethod is ~55 lines, exceeding the 50-line threshold. While the logic is well-organized with helper methods, the main method still orchestrates multiple concerns (validation, grouping, aggregation, projection, conversion).Consider extracting the nomv conversion orchestration into a separate method to improve readability and align with the single-responsibility principle.
♻️ Suggested refactoring approach
Extract the final conversion step:
@Override public RelNode visitMvCombine(MvCombine node, CalcitePlanContext context) { // 1) Lower child first visitChildren(node, context); final RelBuilder relBuilder = context.relBuilder; final RexBuilder rexBuilder = context.rexBuilder; final RelNode input = relBuilder.peek(); final List<String> inputFieldNames = input.getRowType().getFieldNames(); - // 2) Accept delim (default is single space). - // NOTE: delim only affects single-value rendering when nomv=true. - final String delim = node.getDelim(); // 3) Resolve target field to an input ref (must be a direct field reference) final Field targetField = node.getField(); final int targetIndex = resolveTargetIndex(targetField, context); final String targetName = inputFieldNames.get(targetIndex); // 4) Group key = all fields except target final List<RexNode> groupExprs = buildGroupExpressionsExcludingTarget(targetIndex, inputFieldNames, relBuilder); // 5) Aggregate target values: COLLECT => MULTISET performCollectAggregation(relBuilder, targetIndex, targetName, groupExprs); // 6) Restore original output column order AND cast MULTISET -> ARRAY using RexBuilder restoreColumnOrderWithArrayCast( relBuilder, rexBuilder, input, inputFieldNames, targetIndex, groupExprs); // 7) nomv=true converts multivalue output to a single delimited string. if (node.isNomv()) { - applyNomvConversion(relBuilder, rexBuilder, inputFieldNames, targetIndex, delim); + applyNomvConversion(relBuilder, rexBuilder, inputFieldNames, targetIndex, node.getDelim()); } return relBuilder.peek(); }This minor change removes the intermediate
delimvariable and keeps the method focused on orchestration.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
🧰 Additional context used
📓 Path-based instructions (4)
**/*.java
📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)
**/*.java: UsePascalCasefor class names (e.g.,QueryExecutor)
UsecamelCasefor method and variable names (e.g.,executeQuery)
UseUPPER_SNAKE_CASEfor constants (e.g.,MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
PreferOptional<T>for nullable returns in Java
Avoid unnecessary object creation in loops
UseStringBuilderfor string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code
Files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
⚙️ CodeRabbit configuration file
**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring
- Flag classes >500 lines as needing organization review
- Check for dead code, unused imports, and unused variables
- Identify code reuse opportunities across similar implementations
- Assess holistic maintainability - is code easy to understand and modify?
- Flag code that appears AI-generated without sufficient human review
- Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)
- Check for proper JavaDoc on public classes and methods
- Flag redundant comments that restate obvious code
- Ensure proper error handling with specific exception types
- Check for Optional usage instead of null returns
- Validate proper use of try-with-resources for resource management
Files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
core/src/main/java/**/*.java
⚙️ CodeRabbit configuration file
core/src/main/java/**/*.java: - New functions MUST have unit tests in the same commit
Files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
**/calcite/**/*.java
⚙️ CodeRabbit configuration file
**/calcite/**/*.java: - Follow existing Calcite integration patterns
- Verify RelNode visitor implementations are complete
- Check RexNode handling follows project conventions
- Validate SQL generation is correct and optimized
- Ensure Calcite version compatibility
- Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor
- Document any Calcite-specific workarounds
- Test compatibility with Calcite version constraints
Files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
⚙️ CodeRabbit configuration file
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java: - Flag methods >50 lines - this file is known to be hard to read
- Suggest extracting complex logic into helper methods
- Check for code organization and logical grouping
- Validate all RelNode types are handled
Files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
🧠 Learnings (2)
📚 Learning: 2025-12-02T17:27:55.938Z
Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in `CalciteRelNodeVisitor` and `CalciteRexNodeVisitor` for Calcite integration
Applied to files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
📚 Learning: 2025-12-29T05:32:03.491Z
Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 4993
File: opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableTopK.java:20-20
Timestamp: 2025-12-29T05:32:03.491Z
Learning: For any custom Calcite RelNode class (e.g., ones that extend EnumerableLimitSort or other Calcite RelNode types), always override the copy method. If copy is not overridden, cloning/copy operations may downgrade the instance to the parent class type, losing the custom behavior. In your implementation, ensure copy returns a new instance of the concrete class with all relevant fields and traits preserved, mirroring the current instance state.
Applied to files:
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
🔇 Additional comments (5)
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java (5)
56-56: LGTM: New imports added correctly.The imports for
RexBuilderandMvCombineare appropriately added to support the mvcombine command implementation.Also applies to: 126-126
3177-3199: LGTM: Robust target field validation.The validation logic correctly:
- Ensures the target is a direct field reference (not an expression)
- Prevents mvcombine on already-nested array/multiset types
- Provides clear error messages
3221-3251: Well-structured column order restoration.The
restoreColumnOrderWithArrayCastmethod correctly:
- Preserves the original input column order
- Casts MULTISET to ARRAY using the element type from the input schema
- Uses explicit projection with names to maintain field identity
The logic handles the index mapping properly between group fields and the aggregated field.
3253-3288: LGTM: Correct nomv conversion with type safety.The
applyNomvConversionmethod correctly:
- Casts the array to
ARRAY<VARCHAR>before callingARRAY_TO_STRING(required for type compatibility)- Uses the user-specified delimiter
- Preserves all other fields unchanged
3211-3219: Clarify NULL handling in COLLECT aggregation.The implementation uses Calcite's COLLECT operator, which by default includes NULL values in the multiset. The JavaDoc states "rows missing the target field do not contribute values," but this is ambiguous—clarify whether "missing" refers to NULL values or field-schema absence.
If NULL values should be excluded, add a filter before aggregation:
relBuilder.filter(relBuilder.isNotNull(relBuilder.field(targetIndex)));Otherwise, update the JavaDoc to explicitly document that NULL values are included in the aggregated multiset.
|
@songkant-aws Can you please look into this - integ-test/src/yamlRestTest/resources/rest-api-spec/test/issues/4866.yml For context: pasting the error here Caused by: java.lang.AssertionError: datarows didn't match expected value: |
Signed-off-by: Srikanth Padakanti <[email protected]>
|
@penghuo @chenqi0805 @anasalkouz Can you Please add the PR label. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actionable comments posted: 0
🧹 Nitpick comments (1)
docs/user/ppl/index.md (1)
83-83: Slightly awkward phrasing in the mvcombine description.The description "Combines values of a specified field across rows identical on all other fields" could be clearer and more natural. Consider rephrasing to "Combine values from a specified field across rows that are identical in all other fields" for better readability.
📜 Review details
Configuration used: Path: .coderabbit.yaml
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
docs/user/ppl/index.md
🔇 Additional comments (2)
docs/user/ppl/index.md (2)
81-83: Verify command table placement and organization.The three new entries (addtotals, addcoltotals, mvcombine) are placed at the end of the command table after the metadata commands (describe, explain, show datasources). The existing table appears to group commands by functionality or category rather than alphabetically. Verify that this placement is intentional and aligns with the intended documentation structure, especially given that the AI summary indicates addtotals and addcoltotals were "reinstated" (possibly moved).
83-83: All referenced documentation files exist and links are valid.
|
@penghuo @dai-chen can you please review this. @anasalkouz |
…et implemented Signed-off-by: Srikanth Padakanti <[email protected]>
…et implemented Signed-off-by: Srikanth Padakanti <[email protected]>
…et implemented Signed-off-by: Srikanth Padakanti <[email protected]>
…et implemented Signed-off-by: Srikanth Padakanti <[email protected]>
Description
This change introduces the mvcombine PPL command, which combines values of a specified field across rows that are identical on all other fields in the current result set.
mvcombine behaves as a pipeline-level grouping operator: it collapses matching rows into a single row and aggregates the target field values into a multivalue field, while preserving all non-target fields.
Command
mvcombine [delim=""]
Arguments
field (required)
The field whose values will be combined.
delim (optional)
Delimiter to be used when rendering a single-value representation of the combined field.
Semantics
mvcombine operates on the current result set.
Rows are grouped where all fields except have identical values.
For each group:
Scope clarification
Related Issues
Resolves #4766
#4766
Check List
--signoffor-s.By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.