Feature/mvcombine #5025

srikanthpadakanti · 2026-01-07T18:39:28Z

Description

This change introduces the mvcombine PPL command, which combines values of a specified field across rows that are identical on all other fields in the current result set.

mvcombine behaves as a pipeline-level grouping operator: it collapses matching rows into a single row and aggregates the target field values into a multivalue field, while preserving all non-target fields.

Command
mvcombine <field> [delim="<string>"]

Arguments

field (required)
The field whose values will be combined.

Must be a direct field reference.
Must be a single-valued (scalar) field.
If the field does not exist in the current schema, the command fails.

delim (optional)
Delimiter to be used when rendering a single-value representation of the combined field.

Defaults to a single space (" "), consistent with Splunk.
Accepted for forward compatibility.
Has no observable effect in this implementation, since nomv is not yet supported.

Semantics

mvcombine operates on the current result set.

Rows are grouped where all fields except <field> have identical values.

For each group:

All non-target fields are preserved.
The target becomes a multivalue array containing the combined values.
Grouping is not limited to adjacent rows; all matching rows in the result set are combined.
Value order is preserved from the input stream order at the point where mvcombine executes.
Users must apply sort before mvcombine if deterministic ordering is required.
Rows missing the target field do not contribute a value to the combined output.
If the target field does not exist or is not a direct field reference, the command fails with an error.
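The semantics above can be modeled with a small in-memory sketch. This is only an illustration of the described behavior, not the PR's Calcite implementation: the class `MvCombineSemanticsSketch`, the helper `row`, and the sample field names `ip` and `packets` are all hypothetical, and the sketch places the combined field last instead of preserving its original column position. It also skips the "field must exist" validation.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical in-memory model of the mvcombine semantics described above. */
final class MvCombineSemanticsSketch {

  /**
   * Collapses rows that agree on every field except {@code target} and collects
   * the target values, in input order, into a list.
   */
  static List<Map<String, Object>> mvcombine(List<Map<String, Object>> rows, String target) {
    // LinkedHashMap keeps groups in first-seen order; grouping is not limited to adjacent rows.
    Map<Map<String, Object>, List<Object>> groups = new LinkedHashMap<>();
    for (Map<String, Object> row : rows) {
      Map<String, Object> key = new LinkedHashMap<>(row);
      Object value = key.remove(target); // key now holds all non-target fields
      List<Object> values = groups.computeIfAbsent(key, k -> new ArrayList<>());
      if (value != null) {
        values.add(value); // rows missing the target contribute no value
      }
    }
    List<Map<String, Object>> out = new ArrayList<>();
    for (Map.Entry<Map<String, Object>, List<Object>> group : groups.entrySet()) {
      Map<String, Object> combined = new LinkedHashMap<>(group.getKey());
      combined.put(target, group.getValue()); // simplification: target is appended last
      out.add(combined);
    }
    return out;
  }

  /** Builds a sample row; a null {@code packets} models a row missing the target field. */
  static Map<String, Object> row(String ip, Object packets) {
    Map<String, Object> r = new LinkedHashMap<>();
    r.put("ip", ip);
    if (packets != null) {
      r.put("packets", packets);
    }
    return r;
  }
}
```

Feeding this sketch the rows `(10.0.0.7, 1)`, `(10.0.0.8, 9)`, `(10.0.0.7, 2)` yields two rows, with the non-adjacent `10.0.0.7` rows merged and their values kept in arrival order.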

Scope clarification

This PR implements multivalue output only, which is the default and primary behavior in Splunk.
nomv support (single-value string output using delim) is out of scope for this change and will be introduced separately.
The delim argument is parsed and validated for syntax parity and forward compatibility, but does not affect output in this implementation.

Related Issues

Resolves #4766

Check List

[ X] New functionality includes testing.
[ X] New functionality has been documented.
[ X] New functionality has javadoc added.
[ X] New functionality has a user manual doc added.
[ X] New PPL command checklist all confirmed.
[ X] API changes companion pull request created.
[ X] Commits are signed per the DCO using --signoff or -s.
[ X] Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Srikanth Padakanti <[email protected]>

coderabbitai · 2026-01-07T18:39:44Z

📝 Walkthrough

Summary by CodeRabbit

New Features
- Introduced mvcombine command to group rows by all other fields and aggregate a target field into a multivalue output (optional delimiter parameter supported).
Documentation
- Added comprehensive mvcombine docs with syntax, semantics, examples, and error guidance; updated command index and metadata examples.
Tests
- Added unit, integration, and doctest coverage plus test data and mappings to validate mvcombine behavior.

Walkthrough

Adds end-to-end support for a new PPL command mvcombine: new AST node and DSL helpers, parser/lexer and AstBuilder support, analyzer visitor stub, Calcite lowering implementation, tests/fixtures, and documentation updates.

Changes

| Cohort / File(s) | Summary |
| --- | --- |
| AST Node `core/src/main/java/org/opensearch/sql/ast/tree/MvCombine.java` | New public `MvCombine` AST node with fields `field`, `delim` (defaults to space), optional attached child, `attach`/`getChild`, and visitor `accept`. |
| AST DSL `core/src/main/java/org/opensearch/sql/ast/dsl/AstDSL.java` | Added two `mvcombine` factory methods (`mvcombine(Field)` and `mvcombine(Field, String)`) and import. |
| Visitor Base `core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java` | Added `visitMvCombine(MvCombine node, C context)` delegating to children. |
| Analyzer `core/src/main/java/org/opensearch/sql/analysis/Analyzer.java` | Added `visitMvCombine` override that routes `MvCombine` to Calcite-only exception path (`getOnlyForCalciteException("mvcombine")`). |
| Calcite Lowering `core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java` | New `visitMvCombine(...)` implementation: lowers child, resolves target index, groups by non-target fields, aggregates `COLLECT(target)` excluding NULLs, casts multiset→array, restores column order. New private helpers added. |
| PPL Grammar & Parser `ppl/src/main/antlr/OpenSearchPPLLexer.g4`, `ppl/src/main/antlr/OpenSearchPPLParser.g4`, `ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java` | Added `MVCOMBINE` token and `mvcombineCommand` rule (optional `DELIM`), and `visitMvcombineCommand()` building `MvCombine(field, delim)`. |
| Calcite Unit Tests `ppl/src/test/java/.../CalcitePPLMvCombineTest.java` | New unit tests validating logical planning for `mvcombine`, plus deterministic `MvCombineDataTable` helper. |
| Integration Tests `integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvCombineCommandIT.java`, `integ-test/src/test/java/.../SQLIntegTestCase.java` | New integration test class and new test index registration; tests cover grouping, delim handling, error cases, and helpers to inspect multivalue cells. |
| Test Data & Mappings `doctest/test_data/mvcombine.json`, `doctest/test_mapping/mvcombine.json`, `integ-test/src/test/resources/mvcombine.json`, `integ-test/src/test/resources/mvcombine_index_mapping.json` | Added datasets and ES index mappings used by unit/integration tests. |
| Test Registry `doctest/test_docs.py` | Registered `mvcombine_data` mapping for doctests. |
| Documentation `docs/user/ppl/cmd/mvcombine.md`, `docs/category.json`, `docs/user/dql/metadata.rst`, `docs/user/ppl/index.md` | New command docs with examples and manifest entry; sample metadata update and index page entry added. |

Sequence Diagram(s)

sequenceDiagram
    participant User as User / PPL Query
    participant LexerParser as Lexer & Parser
    participant AstBuilder as AST Builder
    participant Analyzer as Analyzer
    participant CalciteVisitor as Calcite RelVisitor
    participant RelBuilder as Calcite RelBuilder

    User->>LexerParser: mvcombine <field> [DELIM='<s>']
    LexerParser->>AstBuilder: parse mvcombineCommand
    AstBuilder->>AstBuilder: build MvCombine(field, delim)
    AstBuilder->>Analyzer: pass MvCombine node
    Analyzer->>CalciteVisitor: route node for Calcite lowering (getOnlyForCalciteException path)
    CalciteVisitor->>RelBuilder: lower child input plan
    CalciteVisitor->>RelBuilder: build group-by (non-target fields)
    CalciteVisitor->>RelBuilder: aggregate COLLECT(target) excluding NULLs -> MULTISET
    CalciteVisitor->>RelBuilder: cast MULTISET -> ARRAY and restore column order
    RelBuilder-->>CalciteVisitor: RelNode plan
    CalciteVisitor-->>User: final plan/result
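The "group by non-target fields" and "restore column order" steps in the diagram come down to index bookkeeping, because a Calcite aggregate emits its group keys first and its aggregate calls last. The following is a minimal sketch of that bookkeeping in plain Java, without any Calcite dependency. The class and method names (`MvCombineColumnOrderSketch`, `groupKeyIndices`, `restoreOrder`) are hypothetical and are not the private helpers added in this PR.

```java
import java.util.ArrayList;
import java.util.List;

/** Hypothetical index bookkeeping for lowering mvcombine to group-by plus COLLECT. */
final class MvCombineColumnOrderSketch {

  /** Indices of every column except the target; these form the group-by key. */
  static List<Integer> groupKeyIndices(List<String> fields, String target) {
    int targetIndex = fields.indexOf(target);
    if (targetIndex < 0) {
      throw new IllegalArgumentException("Field not found: " + target);
    }
    List<Integer> keys = new ArrayList<>();
    for (int i = 0; i < fields.size(); i++) {
      if (i != targetIndex) {
        keys.add(i);
      }
    }
    return keys;
  }

  /**
   * The aggregate output is laid out as [group keys..., collected target]. For each original
   * column position, returns the aggregate-output index to project from.
   */
  static List<Integer> restoreOrder(int fieldCount, int targetIndex) {
    List<Integer> projection = new ArrayList<>();
    int nextKey = 0;
    for (int i = 0; i < fieldCount; i++) {
      // The collected target sits in the last aggregate-output slot.
      projection.add(i == targetIndex ? fieldCount - 1 : nextKey++);
    }
    return projection;
  }
}
```

For input columns `[ip, packets, host]` with target `packets`, the group key is columns 0 and 2, the aggregate output is `[ip, host, packets]`, and the projection `[0, 2, 1]` puts `packets` back in its original position.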

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

Support nested aggregation when calcite enabled #4979 — overlaps on CalciteRelNodeVisitor changes and lowering logic.
Introduce logical dedup operators for PPL #5014 — touches CalciteRelNodeVisitor for visitor/lowering additions.

Suggested labels

PPL, feature, documentation

Suggested reviewers

penghuo
LantaoJin
dai-chen
qianheng-aws
ykmr1224
ps48
kavithacm
joshuali925
GumpacG

🚥 Pre-merge checks | ✅ 3 | ❌ 2

❌ Failed checks (1 warning, 1 inconclusive)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 37.50% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
| Title check | ❓ Inconclusive | The title 'Feature/mvcombine' is vague and generic, using a feature branch naming convention rather than describing the actual change clearly. | Use a more descriptive title such as 'Add mvcombine PPL command for field value aggregation' to clearly convey what functionality was added. |

✅ Passed checks (3 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description check | ✅ Passed | The PR description comprehensively explains mvcombine's purpose, syntax, arguments, semantics, scope, and includes proper issue resolution links and a complete checklist. |
| Linked Issues check | ✅ Passed | The code changes fully implement the mvcombine feature as specified in #4766: command syntax, argument handling, grouping rules, multivalue output, error handling, and value ordering are all correctly implemented across lexer, parser, AST, Calcite visitor, and tests. |
| Out of Scope Changes check | ✅ Passed | All code changes are directly scoped to mvcombine implementation. Lexer/parser rules, AST nodes, visitor implementations, documentation, and integration tests are all focused on the mvcombine feature with no unrelated modifications. |

📜 Recent review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 1259270 and 11fb3e0.

📒 Files selected for processing (1)

ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvCombineTest.java

🚧 Files skipped from review as they are similar to previous changes (1)

ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvCombineTest.java

coderabbitai

Actionable comments posted: 8

🤖 Fix all issues with AI agents

In @core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java:
- Around line 3125-3128: CalciteRelNodeVisitor currently defaults delim to a
single space — change the fallback to a comma by updating the delim assignment
in CalciteRelNodeVisitor (the local variable "delim" derived from
node.getDelim()) so the default is ","; also update MvCombine (its constructor
around the current line 29) to normalize null/default delim to "," instead of "
" so both places share the comma default and remove any redundant null-handling
in CalciteRelNodeVisitor if you rely on MvCombine's normalization.

In @docs/user/ppl/cmd/mvcombine.md:
- Line 49: Rename the heading/text labeled "Example 6: Multiple groups" to
"Example 5: Multiple groups" so the examples are sequential (since Example 5 is
missing); locate the "Example 6" occurrence in docs/user/ppl/cmd/mvcombine.md
(the example heading and any internal references or example captions) and change
the numeral from 6 to 5 to restore correct numbering and update any
cross-references if present.
- Line 16: Update the mvcombine usage line to require an explicit boolean for
nomv: replace "mvcombine <field> [nomv] [delim=<string>]" with "mvcombine
<field> [nomv=<boolean>] [delim=<string>]" so the docs match the parser
behavior; this corresponds to the implementation in AstBuilder.java which
expects a boolean literal (e.g., nomv=true or nomv=false) rather than a bare
flag, so adjust any nearby examples or references to use nomv=<boolean> syntax
as well.
- Line 19: The docs state the default delimiter is a comma but the code in
MvCombine.java (constructor assigning this.delim = (delim == null) ? " " :
delim) uses a space; update the docs' line describing "delim" to say the default
is a space (i.e., "Defaults to a space when not provided") and ensure the
example remains consistent with the space-separated output shown elsewhere.

In @ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java:
- Around line 872-873: There are two identical Javadoc comment lines "/**
mvcombine command. */" in AstBuilder; remove the duplicate so only one Javadoc
remains immediately above the mvcombine-related node or method (e.g., the
mvcombine command handling method in class AstBuilder) to avoid redundant
comments.
- Around line 1313-1325: Replace usages of the local helper unquoteStringLiteral
with StringUtils.unquoteText so string delimiters and escape sequences are
handled consistently (e.g., where delimiter is processed in AstBuilder). Update
calls referencing unquoteStringLiteral to call StringUtils.unquoteText(text) and
remove or deprecate the unquoteStringLiteral method; if you keep it, add proper
JavaDoc describing behavior and that it is legacy and does not handle escape
sequences. Ensure imports include StringUtils and run tests to validate
escape/quoted-string handling.

In
@ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvCombineTest.java:
- Around line 36-40: Add a JavaDoc to the public test class
CalcitePPLMvCombineTest explaining its purpose: state that it verifies Calcite
integration for the PPL "mvcombine" command, including tests for basic
aggregation behavior and delimiter handling; place the JavaDoc immediately above
the class declaration for CalcitePPLMvCombineTest and reference the mvcombine
functionality so readers know this class focuses on aggregation and delimiter
functionality within the SCOTT_WITH_TEMPORAL schema.
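The delimiter comments above disagree on which default is right, but they agree that the default should be normalized in one place so the visitor does not repeat the null handling. A minimal sketch of that idea follows. The class `DelimDefaultSketch` and its nested `MvCombineNode` are hypothetical stand-ins, and the space default is taken from the PR description; the real `MvCombine` node also carries a child plan and a `Field` reference.

```java
/** Hypothetical stand-in showing a single normalization point for the delim default. */
final class DelimDefaultSketch {

  /** Default taken from the PR description (a single space); treated here as an assumption. */
  static final String DEFAULT_DELIM = " ";

  static final class MvCombineNode {
    private final String field;
    private final String delim;

    MvCombineNode(String field, String delim) {
      this.field = field;
      // Normalize once here so downstream visitors never see a null delimiter.
      this.delim = (delim == null) ? DEFAULT_DELIM : delim;
    }

    String getField() {
      return field;
    }

    String getDelim() {
      return delim;
    }
  }
}
```

With this shape, a lowering visitor can read `getDelim()` directly, and the documentation only has to match one constant.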

🧹 Nitpick comments (4)

core/src/main/java/org/opensearch/sql/ast/dsl/AstDSL.java (1)
472-482: Add JavaDoc for public factory methods.

Per coding guidelines, all public methods should have proper JavaDoc with @param and @return documentation. Consider adding documentation similar to other factory methods in this class.
📝 Suggested JavaDoc
+  /**
+   * Creates an MvCombine node with default settings (no delimiter, array output).
+   *
+   * @param field the target field to combine
+   * @return MvCombine node
+   */
   public static MvCombine mvcombine(Field field) {
     return new MvCombine(field, null, false);
   }

+  /**
+   * Creates an MvCombine node with a custom delimiter.
+   *
+   * @param field the target field to combine
+   * @param delim the delimiter used when nomv is enabled
+   * @return MvCombine node
+   */
   public static MvCombine mvcombine(Field field, String delim) {
     return new MvCombine(field, delim, false);
   }

+  /**
+   * Creates an MvCombine node with full configuration.
+   *
+   * @param field the target field to combine
+   * @param delim the delimiter used when nomv is enabled
+   * @param nomv if true, returns a joined string instead of multivalue array
+   * @return MvCombine node
+   */
   public static MvCombine mvcombine(Field field, String delim, boolean nomv) {
     return new MvCombine(field, delim, nomv);
   }
ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java (1)

874-890: Add JavaDoc for visitor method.

Per coding guidelines, public methods should have JavaDoc. While other visitor methods in this file follow a minimal comment style, consider adding a brief description for consistency and clarity.

The implementation looks correct:

Field extraction follows the established pattern

DELIM and NOMV parameter handling is appropriate

The boolean check ctx.booleanLiteral().TRUE() != null correctly distinguishes TRUE from FALSE

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java (1)

3114-3215: Method exceeds 50-line threshold — consider extracting helper methods.

This method is ~100 lines and handles multiple responsibilities: validation, group-key construction, aggregation, projection reordering, and nomv transformation. Per coding guidelines, methods >50 lines should be flagged. Consider extracting logical sections into helper methods (e.g., buildMvCombineGroupKey, buildNomvProjection) to improve readability and testability.
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvCombineCommandIT.java (1)
113-153: Defensive ordering check is redundant — query already sorts.

Since the query at line 125 includes | sort ip, the result order should be deterministic ("10.0.0.7" < "10.0.0.8" lexicographically). The if-else block at lines 136-152 handles both orderings defensively, which obscures the test intent and could mask sorting bugs.
Simplified test assertions
     JSONArray r0 = result.getJSONArray("datarows").getJSONArray(0);
     JSONArray r1 = result.getJSONArray("datarows").getJSONArray(1);

-    String ip0 = r0.getString(0);
-    String ip1 = r1.getString(0);
-
-    if ("10.0.0.7".equals(ip0)) {
-      List<String> mv0 = toStringListDropNulls(r0.get(3));
-      Assertions.assertTrue(
-          mv0.contains("1") && mv0.contains("2"),
-          "Expected 10.0.0.7 to include 1 and 2, got " + mv0);
-
-      List<String> mv1 = toStringListDropNulls(r1.get(3));
-      Assertions.assertTrue(mv1.contains("9"), "Expected 10.0.0.8 to include 9, got " + mv1);
-    } else {
-      List<String> mv0 = toStringListDropNulls(r0.get(3));
-      Assertions.assertTrue(mv0.contains("9"), "Expected 10.0.0.8 to include 9, got " + mv0);
-
-      List<String> mv1 = toStringListDropNulls(r1.get(3));
-      Assertions.assertTrue(
-          mv1.contains("1") && mv1.contains("2"),
-          "Expected 10.0.0.7 to include 1 and 2, got " + mv1);
-    }
+    // After sort, 10.0.0.7 < 10.0.0.8 lexicographically
+    Assertions.assertEquals("10.0.0.7", r0.getString(0));
+    Assertions.assertEquals("10.0.0.8", r1.getString(0));
+
+    List<String> mv0 = toStringListDropNulls(r0.get(3));
+    Assertions.assertTrue(
+        mv0.contains("1") && mv0.contains("2"),
+        "Expected 10.0.0.7 to include 1 and 2, got " + mv0);
+
+    List<String> mv1 = toStringListDropNulls(r1.get(3));
+    Assertions.assertTrue(mv1.contains("9"), "Expected 10.0.0.8 to include 9, got " + mv1);
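The ordering claim in this comment holds for the two sample addresses when they are compared as strings, which the sketch below checks. The class `IpOrderingSketch` is hypothetical, and whether `sort ip` actually compares strings depends on the field mapping, so treat this as an assumption about the test index. Note that string order and numeric address order can differ once the last octet has more than one digit.

```java
/** Hypothetical helper illustrating lexicographic comparison of dotted IPv4 strings. */
final class IpOrderingSketch {

  /** Returns true when {@code a} sorts strictly before {@code b} as a plain string. */
  static boolean lexicographicallyBefore(String a, String b) {
    return a.compareTo(b) < 0;
  }
}
```

`lexicographicallyBefore("10.0.0.7", "10.0.0.8")` is true, which matches the simplified assertions. `lexicographicallyBefore("10.0.0.10", "10.0.0.7")` is also true, so a test that later adds two-digit octets should not rely on string order matching numeric order.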

📜 Review details

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro

📥 Commits

Reviewing files that changed from the base of the PR and between 46302b7 and 5ac0f3d.

📒 Files selected for processing (19)

core/src/main/java/org/opensearch/sql/analysis/Analyzer.java
core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java
core/src/main/java/org/opensearch/sql/ast/dsl/AstDSL.java
core/src/main/java/org/opensearch/sql/ast/tree/MvCombine.java
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
docs/category.json
docs/user/dql/metadata.rst
docs/user/ppl/cmd/mvcombine.md
doctest/test_data/mvcombine.json
doctest/test_docs.py
doctest/test_mapping/mvcombine.json
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvCombineCommandIT.java
integ-test/src/test/java/org/opensearch/sql/legacy/SQLIntegTestCase.java
integ-test/src/test/resources/mvcombine.json
integ-test/src/test/resources/mvcombine_index_mapping.json
ppl/src/main/antlr/OpenSearchPPLLexer.g4
ppl/src/main/antlr/OpenSearchPPLParser.g4
ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvCombineTest.java

🧰 Additional context used

📓 Path-based instructions (14)

**/*.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

**/*.java: Use PascalCase for class names (e.g., QueryExecutor)
Use camelCase for method and variable names (e.g., executeQuery)
Use UPPER_SNAKE_CASE for constants (e.g., MAX_RETRY_COUNT)
Keep methods under 20 lines with single responsibility
All public classes and methods must have proper JavaDoc
Use specific exception types with meaningful messages for error handling
Prefer Optional<T> for nullable returns in Java
Avoid unnecessary object creation in loops
Use StringBuilder for string concatenation in loops
Validate all user inputs, especially queries
Sanitize data before logging to prevent injection attacks
Use try-with-resources for proper resource cleanup in Java
Maintain Java 11 compatibility when possible for OpenSearch 2.x
Document Calcite-specific workarounds in code

Files:

core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java
core/src/main/java/org/opensearch/sql/ast/dsl/AstDSL.java
core/src/main/java/org/opensearch/sql/analysis/Analyzer.java
core/src/main/java/org/opensearch/sql/ast/tree/MvCombine.java
ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java
integ-test/src/test/java/org/opensearch/sql/legacy/SQLIntegTestCase.java
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvCombineCommandIT.java
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvCombineTest.java

⚙️ CodeRabbit configuration file

**/*.java: - Flag methods >50 lines as potentially too complex - suggest refactoring

Flag classes >500 lines as needing organization review

Check for dead code, unused imports, and unused variables

Identify code reuse opportunities across similar implementations

Assess holistic maintainability - is code easy to understand and modify?

Flag code that appears AI-generated without sufficient human review

Verify Java naming conventions (PascalCase for classes, camelCase for methods/variables)

Check for proper JavaDoc on public classes and methods

Flag redundant comments that restate obvious code

Ensure proper error handling with specific exception types

Check for Optional usage instead of null returns

Validate proper use of try-with-resources for resource management

Files:

core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java
core/src/main/java/org/opensearch/sql/ast/dsl/AstDSL.java
core/src/main/java/org/opensearch/sql/analysis/Analyzer.java
core/src/main/java/org/opensearch/sql/ast/tree/MvCombine.java
ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java
integ-test/src/test/java/org/opensearch/sql/legacy/SQLIntegTestCase.java
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvCombineCommandIT.java
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvCombineTest.java

core/src/main/java/**/*.java

⚙️ CodeRabbit configuration file

core/src/main/java/**/*.java: - New functions MUST have unit tests in the same commit

Public methods MUST have JavaDoc with @param, @return, and @throws

Follow existing function implementation patterns in the same package

New expression functions should follow ExpressionFunction interface patterns

Validate function naming follows project conventions (camelCase)

Files:

core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java
core/src/main/java/org/opensearch/sql/ast/dsl/AstDSL.java
core/src/main/java/org/opensearch/sql/analysis/Analyzer.java
core/src/main/java/org/opensearch/sql/ast/tree/MvCombine.java
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java

core/src/main/java/org/opensearch/sql/ast/**/*.java

⚙️ CodeRabbit configuration file

core/src/main/java/org/opensearch/sql/ast/**/*.java: - AST nodes must be immutable where possible

Follow visitor pattern for AST traversal

Ensure proper toString() implementation for debugging

Files:

core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java
core/src/main/java/org/opensearch/sql/ast/dsl/AstDSL.java
core/src/main/java/org/opensearch/sql/ast/tree/MvCombine.java

**/*.g4

⚙️ CodeRabbit configuration file

**/*.g4: - Check if modifying unrelated grammar files (scope creep)

Verify grammar rule placement follows project patterns

Question if new rule needed vs reusing existing rules

Ensure changes are relevant to the PR's stated purpose

Files:

ppl/src/main/antlr/OpenSearchPPLLexer.g4
ppl/src/main/antlr/OpenSearchPPLParser.g4

docs/**/*.rst

⚙️ CodeRabbit configuration file

docs/**/*.rst: - Verify examples use valid test data and indices

Check documentation follows existing patterns and structure

Validate syntax examples are complete and correct

Ensure code examples are tested and working

Check for consistent formatting and style

Files:

docs/user/dql/metadata.rst

integ-test/src/test/resources/**/*

⚙️ CodeRabbit configuration file

integ-test/src/test/resources/**/*: - Verify test data is realistic and representative

Check data format matches expected schema

Ensure test data covers edge cases and boundary conditions

Files:

integ-test/src/test/resources/mvcombine.json
integ-test/src/test/resources/mvcombine_index_mapping.json

**/ppl/**/*.java

⚙️ CodeRabbit configuration file

**/ppl/**/*.java: - For PPL parser changes, verify grammar tests with positive/negative cases

Check AST generation for new syntax

Ensure corresponding AST builder classes are updated

Validate edge cases and boundary conditions

Files:

ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvCombineTest.java

**/test/**/*.java

⚙️ CodeRabbit configuration file

**/test/**/*.java: - Verify NULL input tests for all new functions

Check boundary condition tests (min/max values, empty inputs)

Validate error condition tests (invalid inputs, exceptions)

Ensure multi-document tests for per-document operations

Flag smoke tests without meaningful assertions

Check test naming follows pattern: test

Verify test data is realistic and covers edge cases

Verify test coverage for new business logic

Ensure tests are independent and don't rely on execution order

Validate meaningful test data that reflects real-world scenarios

Check for proper cleanup of test resources

Files:

integ-test/src/test/java/org/opensearch/sql/legacy/SQLIntegTestCase.java
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvCombineCommandIT.java
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvCombineTest.java

**/calcite/**/*.java

⚙️ CodeRabbit configuration file

**/calcite/**/*.java: - Follow existing Calcite integration patterns

Verify RelNode visitor implementations are complete

Check RexNode handling follows project conventions

Validate SQL generation is correct and optimized

Ensure Calcite version compatibility

Follow existing patterns in CalciteRelNodeVisitor and CalciteRexNodeVisitor

Document any Calcite-specific workarounds

Test compatibility with Calcite version constraints

Files:

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvCombineCommandIT.java
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvCombineTest.java

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java

⚙️ CodeRabbit configuration file

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java: - Flag methods >50 lines - this file is known to be hard to read

Suggest extracting complex logic into helper methods

Check for code organization and logical grouping

Validate all RelNode types are handled

Files:

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java

integ-test/**/*IT.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

End-to-end scenarios need integration tests in integ-test/ module

Files:

integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvCombineCommandIT.java

⚙️ CodeRabbit configuration file

integ-test/**/*IT.java: - Integration tests MUST use valid test data from resources

Verify test data files exist in integ-test/src/test/resources/

Check test assertions are meaningful and specific

Validate tests clean up resources after execution

Ensure tests are independent and can run in any order

Flag tests that reference non-existent indices (e.g., EMP)

Verify integration tests are in correct module (integ-test/)

Check tests can be run with ./gradlew :integ-test:integTest

Ensure proper test data setup and teardown

Validate end-to-end scenario coverage

Files:

integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvCombineCommandIT.java

**/*IT.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

Name integration tests with *IT.java suffix in OpenSearch SQL

Files:

integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvCombineCommandIT.java

ppl/src/main/antlr/OpenSearchPPLParser.g4

⚙️ CodeRabbit configuration file

ppl/src/main/antlr/OpenSearchPPLParser.g4: - Identify code reuse opportunities with existing commands

Validate command follows PPL naming and structure patterns

Check if command should be alias vs separate implementation

Files:

ppl/src/main/antlr/OpenSearchPPLParser.g4

**/*Test.java

📄 CodeRabbit inference engine (.rules/REVIEW_GUIDELINES.md)

**/*Test.java: All new business logic requires unit tests
Name unit tests with *Test.java suffix in OpenSearch SQL

Files:

ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvCombineTest.java

🧠 Learnings (9)

📚 Learning: 2025-12-02T17:27:55.938Z

Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Follow existing patterns in `CalciteRelNodeVisitor` and `CalciteRexNodeVisitor` for Calcite integration

Applied to files:

core/src/main/java/org/opensearch/sql/ast/AbstractNodeVisitor.java
core/src/main/java/org/opensearch/sql/analysis/Analyzer.java
core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvCombineTest.java

📚 Learning: 2025-12-02T17:27:55.938Z

Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Update corresponding AST builder classes when making PPL grammar changes

Applied to files:

ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java

📚 Learning: 2025-12-02T17:27:55.938Z

Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Verify AST generation for new PPL parser syntax

Applied to files:

ppl/src/main/java/org/opensearch/sql/ppl/parser/AstBuilder.java

📚 Learning: 2025-12-02T17:27:55.938Z

Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*IT.java : Name integration tests with `*IT.java` suffix in OpenSearch SQL

Applied to files:

integ-test/src/test/java/org/opensearch/sql/legacy/SQLIntegTestCase.java
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvCombineCommandIT.java

📚 Learning: 2025-12-02T17:27:55.938Z

Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*Test.java : Name unit tests with `*Test.java` suffix in OpenSearch SQL

Applied to files:

integ-test/src/test/java/org/opensearch/sql/legacy/SQLIntegTestCase.java

📚 Learning: 2025-12-02T17:27:55.938Z

Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Test SQL generation and optimization paths for Calcite integration changes

Applied to files:

integ-test/src/test/java/org/opensearch/sql/legacy/SQLIntegTestCase.java
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvCombineCommandIT.java
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvCombineTest.java

📚 Learning: 2025-12-29T05:32:03.491Z

Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 4993
File: opensearch/src/main/java/org/opensearch/sql/opensearch/planner/physical/CalciteEnumerableTopK.java:20-20
Timestamp: 2025-12-29T05:32:03.491Z
Learning: For any custom Calcite RelNode class (e.g., ones that extend EnumerableLimitSort or other Calcite RelNode types), always override the copy method. If copy is not overridden, cloning/copy operations may downgrade the instance to the parent class type, losing the custom behavior. In your implementation, ensure copy returns a new instance of the concrete class with all relevant fields and traits preserved, mirroring the current instance state.

Applied to files:

core/src/main/java/org/opensearch/sql/calcite/CalciteRelNodeVisitor.java
integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteMvCombineCommandIT.java
ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvCombineTest.java
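The copy-override learning can be illustrated without Calcite. In this sketch the classes `BaseNode`, `ForgetfulNode`, and `CustomNode` are hypothetical and only mimic the pattern: a parent whose `copy` constructs the parent type, one subclass that forgets to override it, and one that overrides it correctly.

```java
/** Hypothetical parent node whose copy() constructs the parent type. */
class BaseNode {
  final int limit;

  BaseNode(int limit) {
    this.limit = limit;
  }

  BaseNode copy(int newLimit) {
    return new BaseNode(newLimit);
  }
}

/** Subclass that does not override copy(): copies silently become BaseNode. */
class ForgetfulNode extends BaseNode {
  ForgetfulNode(int limit) {
    super(limit);
  }
}

/** Subclass that overrides copy() and so keeps its concrete type across copies. */
class CustomNode extends BaseNode {
  CustomNode(int limit) {
    super(limit);
  }

  @Override
  CustomNode copy(int newLimit) {
    return new CustomNode(newLimit);
  }
}
```

Copying a `ForgetfulNode` returns a plain `BaseNode`, so any behavior defined only in the subclass is lost after the copy, while `CustomNode` survives it.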

📚 Learning: 2025-12-02T17:27:55.938Z

Learnt from: CR
Repo: opensearch-project/sql PR: 0
File: .rules/REVIEW_GUIDELINES.md:0-0
Timestamp: 2025-12-02T17:27:55.938Z
Learning: Applies to **/*.java : Document Calcite-specific workarounds in code

Applied to files:

ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvCombineTest.java

📚 Learning: 2025-12-11T05:27:39.856Z

Learnt from: LantaoJin
Repo: opensearch-project/sql PR: 0
File: :0-0
Timestamp: 2025-12-11T05:27:39.856Z
Learning: In opensearch-project/sql, for SEMI and ANTI join types in CalciteRelNodeVisitor.java, the `max` option has no effect because these join types only use the left side to filter records based on the existence of matches in the right side. The join results are identical regardless of max value (max=1, max=2, or max=∞). The early return for SEMI/ANTI joins before processing the `max` option is intentional and correct behavior.

Applied to files:

ppl/src/test/java/org/opensearch/sql/ppl/calcite/CalcitePPLMvCombineTest.java

🪛 Biome (2.1.2)

integ-test/src/test/resources/mvcombine.json

[error] 1-2: End of file expected