Normalize datetime types for unified query API #5408
dai-chen merged 1 commit into opensearch-project:main
70aabfe to 01fba06
Persistent review updated to latest commit 01fba06
All test-related CI is failing on a Gradle version issue:
01fba06 to cc90d7f
Persistent review updated to latest commit cc90d7f
Will rebase once #5414 is merged.
cc90d7f to af2b288
Persistent review updated to latest commit af2b288
Add postAnalysisRules (List<RelShuttle>) to LanguageSpec.LanguageExtension and register DatetimeExtension in UnifiedPplSpec with two rules:

1. DatetimeUdtNormalizeRule rewrites datetime UDT return types (EXPR_DATE/TIME/TIMESTAMP) on RexCall nodes to standard Calcite DATE/TIME(9)/TIMESTAMP(9) types via call.clone(). Precision is derived from the type system (OpenSearchTypeSystem.getMaxPrecision).
2. DatetimeOutputCastRule adds a final LogicalProject that casts standard datetime output columns to VARCHAR, aligning with PPL's wire-format contract (ISO string representation).

Both rules run as postAnalysisRules after the planning strategy produces the RelNode, applied uniformly to both SQL and PPL paths.

Also bumps OpenSearchTypeSystem max datetime precision from 3 to 9 (nanosecond) for TIME and TIMESTAMP types.

No changes to UDF definitions or implementors in core/; the mismatch between rewritten signatures and UDF implementations is a known limitation addressed separately.

Signed-off-by: Chen Dai <daichen@amazon.com>
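To make the precision bump from 3 to 9 concrete, here is a minimal sketch using plain java.time rather than the plugin's type system. The class PrecisionSketch and the helper truncateToPrecision are hypothetical names for illustration only, and truncation is an assumption about how lower precision would drop digits. The sketch shows only which fractional digits a TIME(3) versus a TIME(9) value can carry.

```java
import java.time.LocalTime;
import java.time.temporal.ChronoUnit;

public class PrecisionSketch {
    // Hypothetical helper (not part of the PR): keep only the fractional
    // digits that a TIME(precision) value could represent.
    static LocalTime truncateToPrecision(LocalTime t, int precision) {
        if (precision == 3) {
            return t.truncatedTo(ChronoUnit.MILLIS); // millisecond precision
        } else if (precision == 9) {
            return t.truncatedTo(ChronoUnit.NANOS);  // nanosecond precision
        }
        throw new IllegalArgumentException("sketch handles only precision 3 and 9");
    }

    public static void main(String[] args) {
        LocalTime t = LocalTime.of(12, 0, 0, 123_456_789);
        System.out.println(truncateToPrecision(t, 3)); // 12:00:00.123
        System.out.println(truncateToPrecision(t, 9)); // 12:00:00.123456789
    }
}
```

With a maximum precision of 3, the last six fractional digits of a nanosecond-resolution value cannot be represented.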
af2b288 to 58b0420
Bring the analytics-engine catch-up PR up to current upstream/main by resolving conflicts introduced by 4 main commits since 2026-04-30:
- opensearch-project#5419 (LENGTH/REGEXP_REPLACE/DATE_TRUNC unified function spec)
- opensearch-project#5408 (datetime type normalization)
- opensearch-project#5414 (Gradle wrapper bump + @ignore exclusion)
- opensearch-project#5399 (FGAC-scoped SQL cursor continuation)

Resolutions:

api/spec/* (LanguageSpec, UnifiedFunctionSpec, UnifiedPplSpec, UnifiedSqlSpec): took main. Main is a strict superset: it adds postAnalysisRules and preCompilationRules extension points, the new FunctionSpecBuilder DSL, SCALAR category for length/regexp_replace/date_trunc, the DatetimeExtension on PPL spec, and the CoreExtension wiring on SQL spec. PR's RELEVANCE category is preserved unchanged.

api/UnifiedQueryPlanner.java, api/compiler/UnifiedQueryCompiler.java: took main. Both adopt the new postAnalysisRules / preCompilationRules hooks introduced in opensearch-project#5408 / opensearch-project#5419.

core/executor/QueryService.java: composed both sides: kept HEAD's CalciteClassLoaderHelper.withCalciteClassLoader wrapper around main's StageErrorHandler stage tracking. Same pattern as the original PR resolution; both improvements are orthogonal.

plugin/SQLPlugin.java: kept HEAD imports for ExplainResponse and QueryType (referenced by createSqlAnalyticsRouter, which only exists on the feature branch).

plugin/transport/TransportPPLQueryAction.java: kept HEAD's queryPlanExecutor parameter alongside main's extensionsHolder parameter, since both are referenced in the constructor body.

legacy/plugin/RestSqlAction.java: took HEAD. The 3-way merge produced a duplicated handleException/getRawErrorCode block; HEAD already contained both the delegateToV2Engine refactor and the ErrorReport unwrap from main, so HEAD is the correct superset.

Build: :api, :core, :opensearch-sql-plugin, :legacy compileJava and unit tests pass; spotlessCheck passes.

Signed-off-by: Kai Huang <ahkcs@amazon.com>
…ation

Brings the catch-up branch up to current upstream/main (4 commits since this PR was opened) and current feature/mustang-ppl-integration (9 commits since this PR was opened), so the PR is mergeable into feature/mustang-ppl-integration without conflicts.

Squashed (rather than two real merge commits) for the same DCO reason the original commit was squashed: upstream commits authored by many contributors with inconsistent or missing Signed-off-by trailers would otherwise be brought into this PR's history.

Newer main commits absorbed (4):
- opensearch-project#5419 (LENGTH/REGEXP_REPLACE/DATE_TRUNC unified function spec)
- opensearch-project#5408 (datetime type normalization)
- opensearch-project#5414 (Gradle wrapper bump + @ignore exclusion)
- opensearch-project#5399 (FGAC-scoped SQL cursor continuation)

Newer feature commits absorbed (9):
- opensearch-project#5403 (analytics-engine optional dependency; major rewiring)
- opensearch-project#5407 (Carry CalciteEvalCommandIT through helper-managed index path)
- opensearch-project#5413 (Default plugins.calcite.enabled=true on unified path)
- opensearch-project#5415, opensearch-project#5416, opensearch-project#5417, opensearch-project#5409, opensearch-project#5400, opensearch-project#5406 (smaller carryovers + bumps)

Conflict resolutions (10 from main side, 3 from feature side):

api/spec/* (LanguageSpec, UnifiedFunctionSpec, UnifiedPplSpec, UnifiedSqlSpec): took main. Main is a strict superset: it adds postAnalysisRules and preCompilationRules extension points, the new FunctionSpecBuilder DSL, SCALAR category for length/regexp_replace/date_trunc, the DatetimeExtension on PPL spec, and the CoreExtension wiring on SQL spec. PR's RELEVANCE category is preserved unchanged.

api/UnifiedQueryPlanner.java, api/compiler/UnifiedQueryCompiler.java: took main. Both adopt the new postAnalysisRules / preCompilationRules hooks introduced in opensearch-project#5408 / opensearch-project#5419.

core/executor/QueryService.java: composed both sides: kept HEAD's CalciteClassLoaderHelper.withCalciteClassLoader wrapper around main's StageErrorHandler stage tracking. Same pattern as the original PR resolution; both improvements are orthogonal.

legacy/plugin/RestSqlAction.java: took HEAD. The 3-way merge produced a duplicated handleException/getRawErrorCode block; HEAD already contained both the delegateToV2Engine refactor and the ErrorReport unwrap from main, so HEAD is the correct superset.

integ-test/build.gradle: took feature. Both sides added the same @ignore exclusion block; feature has alphabetical ordering and a more detailed comment explaining the Gradle 9.4.1 TestEventReporterAsListener cast bug.

integ-test/.../CalciteEvalCommandIT.java: composed both sides. Took feature's helper-managed test_eval provisioning (createIndexByRestClient + isIndexExist guard, from opensearch-project#5407) so analytics-engine compatibility runs get a parquet-backed index. Added back PR HEAD's test_eval_agent setup (needed by the dotted-path eval tests for opensearch-project#5351) wrapped in its own isIndexExist guard for the same parquet-aware idempotency.

plugin/.../TransportPPLQueryAction.java: took feature. PR opensearch-project#5403 made analytics-engine an optional dependency by moving QueryPlanExecutor from a required constructor parameter to an @Inject(optional=true) setter. Feature's design supersedes our prior wiring.

plugin/.../SQLPlugin.java: took feature. The same opensearch-project#5403 simplification removed loadExtensions/EngineExtensionsHolder/executionEngineExtensions plumbing (no longer needed once analytics-engine is optionally bound). Feature retains the createSqlAnalyticsRouter method this PR introduced.

plugin/.../config/EngineExtensionsHolder.java: deleted. Unreferenced after taking feature's SQLPlugin/TransportPPLQueryAction; not present on feature branch.

Build: :api, :core, :opensearch-sql-plugin, :legacy compileJava + :integ-test compileTestJava all pass; unit tests pass; spotlessCheck clean.

Signed-off-by: Kai Huang <ahkcs@amazon.com>
Single squashed commit on top of feature/mustang-ppl-integration that absorbs upstream/main's commits not yet on the feature branch. Replaces the prior catch-up squash (opensearch-project#5396 base + the original af831d3 rebase commit) so this PR is a fast-forward into feature/mustang-ppl-integration.

Squashed (rather than a merge commit) because upstream main commits were authored by many contributors with inconsistent or missing Signed-off-by trailers; DCO would otherwise reject those commits.

Main commits absorbed (54 since divergence; 4 since the original catch-up squash was made on 2026-04-30):
- opensearch-project#5419 (LENGTH/REGEXP_REPLACE/DATE_TRUNC unified function spec)
- opensearch-project#5408 (datetime type normalization)
- opensearch-project#5414 (Gradle wrapper bump + @ignore exclusion)
- opensearch-project#5399 (FGAC-scoped SQL cursor continuation)
- opensearch-project#5394 (SQL Vector Search), opensearch-project#5361 (OpenSearch 3.7), opensearch-project#5360 (unified SQL language spec), opensearch-project#5240 (PPL Union), and 46 others.

Conflict resolutions:

api/spec/* (LanguageSpec, UnifiedFunctionSpec, UnifiedPplSpec, UnifiedSqlSpec): took main. Main is a strict superset: it adds postAnalysisRules and preCompilationRules extension points, the new FunctionSpecBuilder DSL, SCALAR category for length/regexp_replace/date_trunc, the DatetimeExtension on PPL spec, and the CoreExtension wiring on SQL spec. RELEVANCE category is preserved unchanged.

api/UnifiedQueryPlanner.java, api/compiler/UnifiedQueryCompiler.java: took main. Both adopt the new postAnalysisRules / preCompilationRules hooks introduced in opensearch-project#5408 / opensearch-project#5419.

core/executor/QueryService.java: composed both sides: kept feature's CalciteClassLoaderHelper.withCalciteClassLoader wrapper around main's StageErrorHandler stage tracking; both improvements are orthogonal.

legacy/plugin/RestSqlAction.java: took feature. The 3-way merge produced a duplicated handleException/getRawErrorCode block; feature already contained both the delegateToV2Engine refactor and the ErrorReport unwrap from main, so feature is the correct superset.

CLAUDE.md, docs/user/ppl/functions/condition.md: took main.

explain_streamstats_global{,_null_bucket}.yaml: took main (post-opensearch-project#5359 shape).

core/CalciteRelNodeVisitor + utils/PlanUtils: took main (collation utility hoisted from CalciteRelNodeVisitor.backtrackForCollation into PlanUtils.findInputCollation).

integ-test/CalciteNoPushdownIT.java: added CalciteMixedFieldTypeIT.

ppl/antlr/OpenSearchPPLParser.g4: added unionCommand.

ppl/calcite/CalcitePPLStreamstatsTest.java: added testMultipleStreamstatsWithWindow.

integ-test/build.gradle: took feature. Both sides added the same @ignore exclusion block; feature has alphabetical ordering and a more detailed comment explaining the Gradle 9.4.1 cast bug.

integ-test/.../CalciteEvalCommandIT.java: composed both sides. Took feature's helper-managed test_eval provisioning (createIndexByRestClient + isIndexExist guard, from opensearch-project#5407) so analytics-engine compatibility runs get a parquet-backed index. Added the test_eval_agent setup (needed by the dotted-path eval tests for opensearch-project#5351) wrapped in its own isIndexExist guard for the same parquet-aware idempotency.

plugin/.../TransportPPLQueryAction.java, plugin/.../SQLPlugin.java: took feature. PR opensearch-project#5403 made analytics-engine an optional dependency by moving QueryPlanExecutor from a required constructor parameter to an @Inject(optional=true) setter, and removed the loadExtensions / EngineExtensionsHolder / executionEngineExtensions plumbing. Feature retains the createSqlAnalyticsRouter method this catch-up introduced.

plugin/.../config/EngineExtensionsHolder.java: deleted (unreferenced post-opensearch-project#5403; not present on feature).

Build: :api, :core, :opensearch-sql-plugin, :legacy compileJava + :integ-test compileTestJava all pass; unit tests pass; spotlessCheck clean.

Signed-off-by: Kai Huang <ahkcs@amazon.com>
Description
Add postAnalysisRules to LanguageSpec.LanguageExtension and register DatetimeExtension in UnifiedPplSpec with two post-analysis rules:
- DatetimeUdtNormalizeRule: rewrites datetime UDT return types (EXPR_DATE/TIME/TIMESTAMP) on RexCall nodes to standard Calcite types, enabling downstream consumers to process the plan without UDT-related failures.
- DatetimeOutputCastRule: adds a final LogicalProject that casts standard datetime output columns to VARCHAR, aligning with PPL's wire-format contract (ISO string representation).
Examples
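As a toy model of the two post-analysis rules described above, the following sketch applies them in order to a list of output columns. It does not use Calcite: Column, normalizeUdt, castOutput, and applyRules are hypothetical stand-ins, and the cast is shown as an in-place wrap rather than the extra LogicalProject the real rule adds. It is meant only to illustrate the idea of running a list of rewrite rules after analysis.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.UnaryOperator;

public class PostAnalysisSketch {
    /** Toy stand-in for one output column: expression text plus a type name. */
    static final class Column {
        final String name, expr, type;
        Column(String name, String expr, String type) {
            this.name = name; this.expr = expr; this.type = type;
        }
        @Override public String toString() { return name + "=[" + expr + ":" + type + "]"; }
    }

    /** Toy rule 1: replace datetime UDT type names with standard type names. */
    static List<Column> normalizeUdt(List<Column> cols) {
        List<Column> out = new ArrayList<>();
        for (Column c : cols) {
            String t = c.type;
            if (t.equals("EXPR_DATE")) t = "DATE";
            else if (t.equals("EXPR_TIME")) t = "TIME(9)";
            else if (t.equals("EXPR_TIMESTAMP")) t = "TIMESTAMP(9)";
            out.add(new Column(c.name, c.expr, t));
        }
        return out;
    }

    /** Toy rule 2: cast standard datetime columns to VARCHAR, leave others alone. */
    static List<Column> castOutput(List<Column> cols) {
        List<Column> out = new ArrayList<>();
        for (Column c : cols) {
            boolean datetime = c.type.equals("DATE") || c.type.startsWith("TIME");
            out.add(datetime ? new Column(c.name, "CAST(" + c.expr + ")", "VARCHAR") : c);
        }
        return out;
    }

    /** Apply each rule in registration order, feeding each result to the next rule. */
    static List<Column> applyRules(List<Column> cols, List<UnaryOperator<List<Column>>> rules) {
        for (UnaryOperator<List<Column>> rule : rules) cols = rule.apply(cols);
        return cols;
    }

    /** Runs both toy rules on a two-column plan and renders the result. */
    static String demo() {
        List<UnaryOperator<List<Column>>> rules = new ArrayList<>();
        rules.add(PostAnalysisSketch::normalizeUdt);
        rules.add(PostAnalysisSketch::castOutput);
        List<Column> plan = new ArrayList<>();
        plan.add(new Column("d", "DATE($1)", "EXPR_DATE"));
        plan.add(new Column("n", "$0", "INTEGER"));
        return applyRules(plan, rules).toString();
    }

    public static void main(String[] args) {
        System.out.println(demo()); // [d=[CAST(DATE($1)):VARCHAR], n=[$0:INTEGER]]
    }
}
```

In this toy version the order matters: the cast rule only recognizes standard type names, so it relies on the normalize rule having run first.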
Case 1: UDT Normalize Rule: source=events | eval d = DATE(name) | fields d
Before:
  LogicalProject(d=[DATE($1):EXPR_DATE])
After:
  LogicalProject(d=[DATE($1):DATE])

Case 2: Output Cast Rule: source=events | fields hire_date, start_time, created_at
Before:
  LogicalProject(hire_date=[$2], start_time=[$3], created_at=[$4])
    LogicalTableScan(table=[[events]])
After:
  LogicalProject(hire_date=[CAST($0):VARCHAR], start_time=[CAST($1):VARCHAR], created_at=[CAST($2):VARCHAR])
    LogicalProject(hire_date=[$2], start_time=[$3], created_at=[$4])
      LogicalTableScan(table=[[events]])
Notes
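One of the notes in this section concerns the string format produced by CAST(datetime AS VARCHAR). As a minimal illustration of the two formats involved, the sketch below renders the same timestamp both ways with plain java.time. The class and method names are hypothetical, and this is not the code path either engine uses.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

public class DatetimeStringFormats {
    // Space-separated date and time, the ANSI SQL style cited in the notes.
    static final DateTimeFormatter ANSI_STYLE =
        DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss");
    // 'T'-separated date and time, the ISO 8601 style cited in the notes.
    static final DateTimeFormatter ISO_8601_STYLE =
        DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss");

    static String ansi(LocalDateTime ts) { return ts.format(ANSI_STYLE); }

    static String iso(LocalDateTime ts) { return ts.format(ISO_8601_STYLE); }

    public static void main(String[] args) {
        LocalDateTime ts = LocalDateTime.of(2024, 1, 15, 12, 0, 0);
        System.out.println(ansi(ts)); // 2024-01-15 12:00:00
        System.out.println(iso(ts));  // 2024-01-15T12:00:00
    }
}
```

The only difference is the separator between date and time, but a client that parses the wire-format string strictly would treat the two as different formats.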
- … UnifiedQueryCompiler path. A follow-up PR will address function implementations.
- … CAST(datetime AS VARCHAR), whose string format is engine-dependent: PPL Calcite produces ANSI SQL format (2024-01-15 12:00:00) like most other SQL databases (SparkSQL, PostgreSQL, MySQL, Oracle, SQL Server), while DataFusion produces ISO 8601 format (2024-01-15T12:00:00).
Related Issues
Part of #5250
Check List
- … --signoff or -s.
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.