Update UDF compiler for shared helper layout #15028

Open
gerashegalov wants to merge 1 commit into codex/unshim-stack-03-delta-iceberg from codex/unshim-stack-04-udf-docs

@gerashegalov gerashegalov commented Jun 10, 2026

Collaborator

Related to #14834.

Description

This PR is one reviewable layer in the unshim stack introduced by #15025, and it is the final follow-up layer on top of that stack. It updates the UDF compiler and related documentation for the shared helper layout.

Stack context

Testing and validation notes

  • Covered by the full-stack packaging/build validation described in #15025 (Add default common unshim packaging flow) and by the existing UDF compiler coverage for the affected integration points.
  • The full split stack was verified to be tree-equivalent to the pre-split stack top.

Checklists

Documentation

  • Updated for new or modified user-facing features or behaviors
  • No user-facing change

Testing

  • Added or modified tests to cover new code paths
  • Covered by existing tests
    (Covered by the validation notes in the PR description.)
  • Not required

Performance

  • Tests ran and results are added in the PR description
  • Issue filed with a link in the PR description
  • Not required

@gerashegalov gerashegalov force-pushed the codex/unshim-stack-04-udf-docs branch from 3cf1779 to c99cc88 Compare June 10, 2026 15:08
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-03-delta-iceberg branch 2 times, most recently from 6ebd35c to f608c12 Compare June 10, 2026 15:44
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-04-udf-docs branch from c99cc88 to bc737d5 Compare June 10, 2026 15:44
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-03-delta-iceberg branch from f608c12 to f04280c Compare June 10, 2026 20:49
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-04-udf-docs branch 2 times, most recently from 6f0d32e to 39664ac Compare June 10, 2026 21:13
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-03-delta-iceberg branch 2 times, most recently from 2aad668 to 7918b20 Compare June 10, 2026 21:32
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-04-udf-docs branch 2 times, most recently from 12050e2 to da7172c Compare June 10, 2026 21:36
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-03-delta-iceberg branch from 9974576 to bc3dcf9 Compare June 10, 2026 22:20
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-04-udf-docs branch 2 times, most recently from d7cc002 to 58fd9d3 Compare June 10, 2026 22:37
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-03-delta-iceberg branch from 09556c1 to 951af07 Compare June 10, 2026 22:41
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-04-udf-docs branch from 58fd9d3 to dd489ce Compare June 10, 2026 22:41
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-03-delta-iceberg branch from 951af07 to d0da6bd Compare June 10, 2026 22:46
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-04-udf-docs branch from dd489ce to 9a0fceb Compare June 10, 2026 22:46
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-03-delta-iceberg branch from d0da6bd to 9523c22 Compare June 10, 2026 22:59
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-04-udf-docs branch 2 times, most recently from 2d11a0d to c9b6910 Compare June 10, 2026 23:12
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-03-delta-iceberg branch 2 times, most recently from 6a78825 to 389bcfb Compare June 10, 2026 23:15
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-04-udf-docs branch from c9b6910 to 38f68ad Compare June 10, 2026 23:15
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-03-delta-iceberg branch from 389bcfb to 1c955c0 Compare June 10, 2026 23:29
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-04-udf-docs branch from 38f68ad to 42519c0 Compare June 10, 2026 23:29
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-04-udf-docs branch from 42519c0 to a5a6171 Compare June 10, 2026 23:33
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-03-delta-iceberg branch 2 times, most recently from 57a7219 to a1e022a Compare June 10, 2026 23:48
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-04-udf-docs branch from a5a6171 to 0681545 Compare June 10, 2026 23:48
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-03-delta-iceberg branch from a1e022a to fa4c038 Compare June 10, 2026 23:59
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-04-udf-docs branch 2 times, most recently from 8d38745 to c2a810e Compare June 11, 2026 00:25
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-03-delta-iceberg branch 2 times, most recently from 7c8095e to c012546 Compare June 11, 2026 00:37
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-04-udf-docs branch from c2a810e to b83a78a Compare June 11, 2026 00:37
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-03-delta-iceberg branch from c012546 to e9c90de Compare June 11, 2026 00:51
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-04-udf-docs branch from b83a78a to 97298e1 Compare June 11, 2026 00:51
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-03-delta-iceberg branch from e9c90de to afbcd10 Compare June 11, 2026 01:18
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-04-udf-docs branch from 97298e1 to 23b2e37 Compare June 11, 2026 01:18
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-03-delta-iceberg branch from afbcd10 to 632eb14 Compare June 11, 2026 01:32
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-04-udf-docs branch 2 times, most recently from 07a37f3 to f52fcdc Compare June 11, 2026 01:43
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-03-delta-iceberg branch 2 times, most recently from d630815 to 7c174ff Compare June 11, 2026 01:58
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-04-udf-docs branch from f52fcdc to 681d6a2 Compare June 11, 2026 01:58
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-03-delta-iceberg branch from 7c174ff to 85991fb Compare June 11, 2026 02:26
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-04-udf-docs branch 2 times, most recently from 07130f5 to 3720055 Compare June 11, 2026 02:56
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-03-delta-iceberg branch from 85991fb to 2c95704 Compare June 11, 2026 02:56
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-04-udf-docs branch from 3720055 to 76c8964 Compare June 13, 2026 12:13
Signed-off-by: Gera Shegalov <gshegalov@nvidia.com>
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-04-udf-docs branch from 76c8964 to f1ce3e8 Compare June 13, 2026 12:20
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-03-delta-iceberg branch from 2db9df3 to e557937 Compare June 13, 2026 12:20
@gerashegalov gerashegalov marked this pull request as ready for review June 13, 2026 12:49
greptile-apps Bot commented Jun 13, 2026

Contributor

Greptile Summary

This PR adapts the UDF compiler module to the "shared helper layout" introduced by the unshim stack. The main changes remove the dependency on Spark's internal Logging trait (replacing it with direct SLF4J calls via LoggerFactory), switch Cast and Encode from companion-object apply to direct new construction with explicit None timezone, convert case class LogicalPlanRules() to class LogicalPlanRules (compatible with reflection-based instantiation), and update documentation links.

  • Logging migration: All UDF compiler classes (BB, Instruction, CatalystExpressionBuilder, GpuScalaUDFLogical, LogicalPlanRules) drop the Logging mixin (with Logging) and instead declare a static SLF4J logger in a companion object. isDebugEnabled/isTraceEnabled guards are added around debug and trace calls because an interpolated string passed to SLF4J is built eagerly at the call site, whereas Spark's logDebug takes its message by name and only builds it when the level is enabled.
  • Cast/Encode construction: Companion-object apply calls (e.g. Cast(e, t)) are replaced with new Cast(e, t, None) and new Encode(...), reflecting a layout change that no longer exposes the Spark companion objects directly; timezone is None in all cases where it was previously also the default.
  • Doc updates: Pandas UDF links updated from the archived Spark 3.2.0 docs to live Spark 3.5.7 docs, and the AQE code snippet is corrected to use new GpuQueryStagePrepOverrides (matching the case class → class pattern in the prior stack layer).
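The reasoning behind the level guards can be illustrated with a small, dependency-free sketch. `ToyLogger`, `EagerVsByName`, and `expensive` are hypothetical names invented for this illustration; `ToyLogger` only mimics the two parameter-passing styles and is neither SLF4J nor Spark's Logging trait.

```scala
// Hypothetical stand-in for a logger; NOT SLF4J and NOT Spark's Logging trait.
final class ToyLogger(val debugEnabled: Boolean) {
  // SLF4J-like shape: the message arrives as an already-built String.
  def debug(msg: String): Unit = if (debugEnabled) println(msg)

  // Spark-Logging-like shape: the message is by-name and built only on demand.
  def logDebug(msg: => String): Unit = if (debugEnabled) println(msg)
}

object EagerVsByName {
  // Counts how often the costly rendering actually runs.
  var renders = 0

  // Stands in for an expensive toString of a plan or basic block.
  def expensive(): String = { renders += 1; "rendered state" }

  def main(args: Array[String]): Unit = {
    val log = new ToyLogger(debugEnabled = false)

    // By-name parameter: the interpolation is never evaluated.
    log.logDebug(s"state: ${expensive()}")
    assert(renders == 0)

    // Plain String parameter: the interpolation runs, then the result is dropped.
    log.debug(s"state: ${expensive()}")
    assert(renders == 1)

    // Guarded call: the interpolation is skipped again.
    if (log.debugEnabled) log.debug(s"state: ${expensive()}")
    assert(renders == 1)
  }
}
```

With debug disabled, only the unguarded `debug` call pays for building the message, which is the cost the added guards avoid.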

Confidence Score: 5/5

This PR is safe to merge — all changes are mechanical refactors with no logic mutations.

The changes swap Spark's internal Logging trait for direct SLF4J calls, convert Cast/Encode companion-object apply to new construction (preserving all defaults), and convert one case class to a plain class that is only ever instantiated via reflection. All call sites have been verified, isDebugEnabled guards are correctly added, and the timeZoneId forwarding in CatalystExpressionBuilder.simplifyExpr is preserved unchanged.
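A toy model may help show why the constructor change can preserve behavior. `ToyCast` and `CastConstruction` below are hypothetical and written only for this illustration; Spark's real Cast has a different signature that varies by Spark version, so this is a sketch of the pattern rather than the code in this PR.

```scala
// Hypothetical expression node with an optional time zone.
// This is NOT Spark's Cast; it only models a defaulted third argument.
final case class ToyCast(
    child: String,
    dataType: String,
    timeZoneId: Option[String] = None)

object CastConstruction {
  // Style before the change: companion apply, relying on the default argument.
  def viaApply(child: String, dataType: String): ToyCast =
    ToyCast(child, dataType)

  // Style after the change: direct constructor with every argument spelled out.
  def viaNew(child: String, dataType: String): ToyCast =
    new ToyCast(child, dataType, None)

  // Rebuilding an existing node: forward its time zone instead of resetting it.
  def rebuild(ce: ToyCast, newChild: String): ToyCast =
    new ToyCast(newChild, ce.dataType, ce.timeZoneId)

  def main(args: Array[String]): Unit = {
    // Both construction styles yield equal nodes when the default was None.
    assert(viaApply("col", "int") == viaNew("col", "int"))

    // An explicit time zone survives a rebuild that forwards it.
    val zoned = new ToyCast("ts", "string", Some("UTC"))
    assert(rebuild(zoned, "ts2").timeZoneId == Some("UTC"))
  }
}
```

Passing `None` explicitly is equivalent to the old call only where the default was previously in effect; a node that already carries a time zone has to forward it, as `rebuild` does.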

No files require special attention; both Scala 2.12 and 2.13 Instruction.scala variants are in sync.

Important Files Changed

| Filename | Overview |
| --- | --- |
| udf-compiler/src/main/scala/com/nvidia/spark/udf/LogicalPlanRules.scala | Changed from case class LogicalPlanRules() to class LogicalPlanRules; instantiated via reflection in ShimLoader so no direct call-site impact; Logging trait removed cleanly. |
| udf-compiler/src/main/scala/com/nvidia/spark/udf/GpuScalaUDF.scala | Logging moved to companion object logger; logDebug guarded with isDebugEnabled, logWarning replaced with direct warn (no guard needed at warn level). |
| udf-compiler/src/main/scala/com/nvidia/spark/udf/CatalystExpressionBuilder.scala | Logging and Cast/Encode construction updated for shared helper layout; timezone is correctly forwarded (ce.timeZoneId) where the original used it, and None elsewhere. |
| udf-compiler/src/main/scala/com/nvidia/spark/udf/CFG.scala | Added object BB companion with SLF4J logger; all trace/debug calls wrapped with level guards; no logic changes. |
| udf-compiler/src/main/scala-2.12/com/nvidia/spark/udf/Instruction.scala | Scala 2.12 copy: same Logging→SLF4J and Cast/Encode constructor changes as the shared file; in sync with 2.13 variant. |
| udf-compiler/src/main/scala-2.13/com/nvidia/spark/udf/Instruction.scala | Scala 2.13 copy: identical changes to 2.12 variant; both kept in sync. |
| udf-compiler/src/main/scala/com/nvidia/spark/udf/State.scala | Copyright year update and two Cast(l, t) calls updated to new Cast(l, t, None); semantics preserved. |
| docs/additional-functionality/rapids-udfs.md | All Pandas UDF documentation links updated from archive.apache.org/dist/spark/docs/3.2.0 to spark.apache.org/docs/3.5.7; no content changes. |
| docs/dev/adaptive-query.md | Code snippet updated from GpuQueryStagePrepOverrides() companion-object style to new GpuQueryStagePrepOverrides, matching the prior stack layer's class change. |
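The note that LogicalPlanRules is only instantiated via reflection can be sketched as follows. `Rule`, `ToyPlanRules`, and this `newInstanceOf` helper are hypothetical names for illustration; the helper is an assumption about the general technique and does not reproduce ShimLoader's actual implementation.

```scala
// Hypothetical rule interface, for this illustration only.
trait Rule { def name: String }

// A plain class with a public no-arg constructor, as opposed to a case class.
class ToyPlanRules extends Rule { val name: String = "toy-plan-rules" }

object ReflectiveLoad {
  // Assumed shape of a reflective factory: look the class up by name and
  // call its no-arg constructor. No companion apply method is involved.
  def newInstanceOf[T](className: String): T =
    Class.forName(className)
      .getDeclaredConstructor()
      .newInstance()
      .asInstanceOf[T]

  def main(args: Array[String]): Unit = {
    val rules = newInstanceOf[Rule](classOf[ToyPlanRules].getName)
    assert(rules.name == "toy-plan-rules")
  }
}
```

Because the lookup goes through the constructor, dropping the case-class companion does not affect a loader of this kind.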

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[UDF Compiler Module] --> B[LogicalPlanRules\nclass - no-arg ctor]
    B --> C[GpuScalaUDFLogical\ncase class + companion log]
    C --> D[CatalystExpressionBuilder\ncase class + companion log]
    D --> E[CFG / BB\ncase class + companion log]
    D --> F[Instruction\ncase class + companion log]
    D --> G[State]

    subgraph Logging
        H[Before: extends Logging\nSpark internal trait]
        I[After: LoggerFactory.getLogger\nSLF4J direct - shared helper]
    end

    subgraph CastConstruction
        J["Before: Cast(e, t)\ncompanion apply"]
        K["After: new Cast(e, t, None)\ndirect constructor"]
    end

    B -.->|reflection| L[ShimLoader.newInstanceOf]
    H --> I
    J --> K

Reviews (1): Last reviewed commit: "Update UDF compiler for shared helper la..."
