[QA] Add FieldFormatCommandIT for the analytics-engine REST path #21544
PPL `fieldformat` is a Calcite-only command that lowers to a plain Eval node (see SQL plugin's `AstBuilder.visitFieldformatCommand`). Its expressions go through Calcite's `||` (CONCAT) operator and CAST, both already wired in the DataFusion backend's STANDARD_PROJECT_OPS via opensearch-project#21498. **No code changes are required for the analytics route; this PR is QA-only.**

The unique surface vs plain `eval` is the prefix-`.` and suffix-`.` string-concat sugar emitted by `AstExpressionBuilder.visitFieldFormatEvalClause` for the StringDotlogicalExpression / LogicalExpressionDotString rules: `fieldformat x = "prefix".CAST(y AS STRING)." suffix"` expands to a chain of CONCAT calls. Both forms route through the existing CONCAT capability, so no extension lookup or adapter is needed: isthmus' default catalog binds the `||` operator natively.

Four tests run against the in-process QA cluster, exercising the analytics path end-to-end via the test-ppl-frontend plugin:

| Test | Shape |
|---|---|
| `testFieldformatPlusConcat` | `'Hello ' + str0`: basic `+`-concat. |
| `testFieldformatPrefixDotCast` | `'Code: '.CAST(int0 AS STRING)`: StringDotlogicalExpression branch. |
| `testFieldformatCastDotSuffix` | `CAST(int0 AS STRING).' pts'`: LogicalExpressionDotString branch. |
| `testFieldformatPrefixDotCastDotSuffix` | `'Code: '.CAST(int0 AS STRING).' pts'`: combined. |

Tests filter `where isnotnull(int0)` before sorting/limiting so the deterministic-row assertions don't flap on the calcs dataset's six null int0 rows (Calcite's default ascending sort puts nulls first).

Out of scope: the v2-side `testFieldFormatStringConcatenationWithNullFieldToString` uses `tostring(age, "commas")`, a multi-mode UDF (binary / hex / commas / duration) with substantial Java logic in `ToStringFunction`. Adding it to the analytics path would need either Calcite-level rewrites or a DataFusion Rust UDF; tracked separately.

Validates: 4/4 FieldFormatCommandIT pass; full :sandbox:qa:analytics-engine-rest:integTest suite green (**132 tests across 17 ITs**, no regressions).

Signed-off-by: Kai Huang <[email protected]>
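
To make the "expands to a chain of CONCAT calls" point concrete, here is a minimal sketch of how a dotted operand list could fold into nested CONCAT calls. It is an illustration only: `DotConcatSketch` and `toConcatChain` are hypothetical names, the expression is rendered as plain text rather than as Calcite nodes, and the left-to-right nesting order is an assumption, not something the PR states.

```java
import java.util.List;

/** Hypothetical illustration; this is not the SQL plugin's AstExpressionBuilder. */
final class DotConcatSketch {

    /**
     * Left-folds the operands of a dotted fieldformat expression into
     * nested CONCAT(...) calls, rendered as a string for readability.
     * The nesting direction is an assumption made for this sketch.
     */
    static String toConcatChain(List<String> operands) {
        if (operands.isEmpty()) {
            throw new IllegalArgumentException("need at least one operand");
        }
        String acc = operands.get(0);
        for (String next : operands.subList(1, operands.size())) {
            acc = "CONCAT(" + acc + ", " + next + ")";
        }
        return acc;
    }

    public static void main(String[] args) {
        // Operands of 'Code: '.CAST(int0 AS STRING).' pts'
        System.out.println(
            toConcatChain(List.of("'Code: '", "CAST(int0 AS STRING)", "' pts'")));
        // prints: CONCAT(CONCAT('Code: ', CAST(int0 AS STRING)), ' pts')
    }
}
```

Under this reading, the prefix-only and suffix-only shapes are the two-operand case of the same fold, which would be consistent with both branches needing nothing beyond the existing CONCAT capability.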
❌ Gradle check result for f0ea4c4: FAILURE. Please examine the workflow log, locate, and copy-paste the failure(s) below, then iterate to green. Is the failure a flaky test unrelated to your change?
Codecov Report

✅ All modified and coverable lines are covered by tests.

| Metric | main | #21544 | +/- |
|---|---|---|---|
| Coverage | 73.42% | 73.39% | -0.03% |
| Complexity | 74547 | 74552 | +5 |
| Files | 5978 | 5978 | |
| Lines | 338743 | 338743 | |
| Branches | 48843 | 48843 | |
| Hits | 248707 | 248621 | -86 |
| Misses | 70229 | 70287 | +58 |
| Partials | 19807 | 19835 | +28 |
Summary

PPL `fieldformat` already works on the analytics-engine route with no code changes: it lowers to a plain `Eval` node whose expressions route through `CONCAT` and `CAST`, both already wired into the DataFusion backend's `STANDARD_PROJECT_OPS` via #21498. This PR is QA-only.

The unique surface vs plain `eval` is the prefix-`.` / suffix-`.` string-concat sugar emitted by `AstExpressionBuilder.visitFieldFormatEvalClause`:

…expands to a chain of `CONCAT` calls. Both the `+`-form and the dotted form route through Calcite's `||` operator, which isthmus' default catalog binds natively, so no extension lookup or adapter is required.

Pass rate

`FieldFormatCommandIT` is added by this PR; before this PR there was no analytics-engine QA pin for `fieldformat`. After this PR:

- `testFieldformatPlusConcat`
- `testFieldformatPrefixDotCast`
- `testFieldformatCastDotSuffix`
- `testFieldformatPrefixDotCastDotSuffix`

Full QA suite regression check: 132 / 132 across 17 ITs (no regressions in any pre-existing IT after `FieldFormatCommandIT` joined the suite).

The v2-side `CalciteFieldFormatCommandIT` itself goes from 0 / 5 → 4 / 5 on the analytics route once the corresponding SQL-repo IT-init fix lands (opensearch-project/sql#5417, modeled on #5407 for `CalciteEvalCommandIT`). The remaining 5th case uses `tostring(age, "commas")`, a multi-mode UDF tracked separately as out of scope; estimated effort is ~1 day for a native Rust UDF + Substrait extension if pursued.

Tests

Four tests against the in-process QA cluster via `test-ppl-frontend`:

| Test | Shape |
|---|---|
| `testFieldformatPlusConcat` | `'Hello ' + str0`: basic `+`-concat. |
| `testFieldformatPrefixDotCast` | `'Code: '.CAST(int0 AS STRING)`: StringDotlogicalExpression branch. |
| `testFieldformatCastDotSuffix` | `CAST(int0 AS STRING).' pts'`: LogicalExpressionDotString branch. |
| `testFieldformatPrefixDotCastDotSuffix` | `'Code: '.CAST(int0 AS STRING).' pts'`: combined. |

Each test filters `where isnotnull(int0)` before sorting/limiting so the deterministic-row assertions don't flap on the `calcs` dataset's six null `int0` rows (Calcite's default ascending sort puts nulls first).

Test plan

- `./gradlew :sandbox:qa:analytics-engine-rest:integTest --tests '*FieldFormatCommandIT' -Dsandbox.enabled=true`: 4 / 4 green.
- `./gradlew :sandbox:qa:analytics-engine-rest:integTest -Dsandbox.enabled=true`: 132 / 132 across 17 ITs (no regressions in any existing IT).

Related

`CalciteFieldFormatCommandIT` through the helper-managed index path so the v2-side IT actually exercises the analytics route under `tests.analytics.parquet_indices=true`).
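
The `where isnotnull(int0)` guard described under Tests can be illustrated with a small standalone sketch. It assumes, as the description states, that an ascending sort places nulls first. The class name, method names, and sample values are made up for illustration and are not taken from the `calcs` dataset or from the IT itself.

```java
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;
import java.util.Objects;
import java.util.stream.Collectors;

/** Hypothetical illustration of why the tests filter nulls before sort + head. */
final class NullsFirstSketch {

    /** Ascending sort with nulls first, then keep the first `limit` rows. */
    static List<Integer> sortHead(List<Integer> rows, int limit) {
        return rows.stream()
            .sorted(Comparator.nullsFirst(Comparator.<Integer>naturalOrder()))
            .limit(limit)
            .collect(Collectors.toList());
    }

    /** Same pipeline, but drops nulls first, mirroring `where isnotnull(int0)`. */
    static List<Integer> filterSortHead(List<Integer> rows, int limit) {
        List<Integer> nonNull =
            rows.stream().filter(Objects::nonNull).collect(Collectors.toList());
        return sortHead(nonNull, limit);
    }

    public static void main(String[] args) {
        List<Integer> int0 = Arrays.asList(8, null, 1, null, 4); // made-up values
        System.out.println(sortHead(int0, 2));       // [null, null]
        System.out.println(filterSortHead(int0, 2)); // [1, 4]
    }
}
```

Without the filter, the head of the result is made up entirely of null rows, which are indistinguishable by the sort key, so an assertion on specific rows has nothing stable to pin. With the filter, the leading rows are determined by the `int0` values.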