Description
The DatetimeOutputCastRule introduced in #5408 wraps datetime
output columns in CAST(... AS VARCHAR) so PPL responses are
rendered as ISO strings. The string produced by that cast is
engine-dependent:
- PPL Calcite path (UnifiedQueryCompiler): emits the ANSI SQL
format 2024-01-15 12:00:00 (space separator, no T),
consistent with SparkSQL, PostgreSQL, MySQL, Oracle, and
SQL Server.
- Analytics-engine path (DataFusion native runtime): emits the
ISO 8601 format 2024-01-15T12:00:00 (with T separator).
This means the same PPL query against the same data returns
two different string formats depending on which execution
engine handled it, breaking the wire-format contract that
#5408 set out to enforce.
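The divergence can be illustrated with a small check that compares the two example strings quoted above. This is only a sketch: the class and method names are hypothetical and not part of the plugin, and it assumes both engines emit a local datetime with no zone offset.

```java
import java.time.LocalDateTime;

// Hypothetical helper for illustration only; not part of the plugin.
final class EngineOutputComparison {

    // Parses either separator style by mapping the space form onto the
    // ISO 8601 form that LocalDateTime.parse expects.
    static LocalDateTime parseEither(String raw) {
        return LocalDateTime.parse(raw.trim().replace(' ', 'T'));
    }

    // True when the two strings denote the same datetime but differ as text,
    // which is the situation this issue describes.
    static boolean sameValueDifferentText(String a, String b) {
        return !a.equals(b) && parseEither(a).equals(parseEither(b));
    }

    public static void main(String[] args) {
        String calcite = "2024-01-15 12:00:00";     // example from the Calcite path
        String dataFusion = "2024-01-15T12:00:00";  // example from the DataFusion path
        System.out.println(sameValueDifferentText(calcite, dataFusion)); // prints true
    }
}
```

A client that compares response strings byte for byte would see these as different values even though they encode the same datetime.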
Reproduction
source=events | fields created_at
- Calcite engine → "2024-01-15 12:00:00"
- DataFusion engine → "2024-01-15T12:00:00"
Context
Called out as Note 2 of #5408. Filed as a follow-up so the
divergence is tracked outside that already-merged PR.
Possible directions
- Replace the implicit CAST(... AS VARCHAR) with an explicit
DATE_FORMAT(..., '<pattern>') (or to_char / format_datetime)
that pins the wire format on both engines.
- Push the formatting into the response formatter rather than
the logical plan, so engine-side cast semantics never leak
to the wire.
- Add a DataFusion-side cast adapter that emits the ANSI SQL
string format to match Calcite.
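As a rough illustration of the second direction (formatting in the response formatter), the sketch below normalizes either engine's string to one pinned pattern. The class name, the method name, and the choice of the space-separated form as the target are all assumptions made for the example, not existing plugin code or a decided format. It also assumes values carry no zone offset; any fractional seconds would be dropped by this pattern.

```java
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical response-side normalizer; names and target pattern are
// assumptions for illustration, not the plugin's actual formatter.
final class DatetimeWireFormat {

    // Pinned output pattern (assumed here to be the space-separated form).
    private static final DateTimeFormatter WIRE =
        DateTimeFormatter.ofPattern("uuuu-MM-dd HH:mm:ss");

    // Accepts "2024-01-15 12:00:00" or "2024-01-15T12:00:00" and always
    // returns the pinned form, regardless of which engine produced the input.
    static String normalize(String engineValue) {
        LocalDateTime parsed =
            LocalDateTime.parse(engineValue.trim().replace(' ', 'T'));
        return parsed.format(WIRE);
    }

    public static void main(String[] args) {
        System.out.println(normalize("2024-01-15T12:00:00")); // 2024-01-15 12:00:00
        System.out.println(normalize("2024-01-15 12:00:00")); // 2024-01-15 12:00:00
    }
}
```

Parsing and re-formatting strings is a workaround shape; a real formatter would more likely format the typed datetime value directly so the engine-side cast is never involved.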
Related
DatetimeOutputCastRule