Skip to content

Move columnar runtime config helpers to Java#15032

Open
gerashegalov wants to merge 2 commits into
codex/unshim-stack-02g-columnar-scalar-utilsfrom
codex/unshim-stack-02h-columnar-runtime-config
Open

Move columnar runtime config helpers to Java#15032
gerashegalov wants to merge 2 commits into
codex/unshim-stack-02g-columnar-scalar-utilsfrom
codex/unshim-stack-02h-columnar-runtime-config

Conversation

@gerashegalov

@gerashegalov gerashegalov commented Jun 10, 2026

Copy link
Copy Markdown
Collaborator

Related to #14834.

Description

This PR is one reviewable layer in the unshim stack introduced by #15025. It moves columnar runtime configuration helpers into Java-friendly sources so runtime configuration access can be shared by later caller-migration layers.

Stack context

Testing and validation notes

  • No standalone behavior change is intended in this layer. It is covered by the full-stack packaging/build validation described in Add default common unshim packaging flow #15025 and the existing tests for the affected subsystem.
  • The full split stack was verified to be tree-equivalent to the pre-split stack top.

Checklists

Documentation

  • Updated for new or modified user-facing features or behaviors
  • No user-facing change

Testing

  • Added or modified tests to cover new code paths
  • Covered by existing tests
    (Covered by the validation notes in the PR description.)
  • Not required

Performance

  • Tests ran and results are added in the PR description
  • Issue filed with a link in the PR description
  • Not required

@gerashegalov gerashegalov changed the title codex/unshim stack 02h columnar runtime config Move columnar runtime config helpers to Java Jun 10, 2026
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-02g-columnar-scalar-utils branch from 7bd098b to e444cc4 Compare June 10, 2026 20:49
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-02h-columnar-runtime-config branch 2 times, most recently from 6593a9f to db64c8d Compare June 10, 2026 21:13
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-02g-columnar-scalar-utils branch 2 times, most recently from 7d76368 to e3ce48d Compare June 10, 2026 21:32
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-02h-columnar-runtime-config branch from db64c8d to 288d815 Compare June 10, 2026 21:32
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-02g-columnar-scalar-utils branch from e3ce48d to 0e024e3 Compare June 10, 2026 21:36
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-02h-columnar-runtime-config branch 2 times, most recently from 037114c to 275be6e Compare June 10, 2026 22:37
Signed-off-by: Gera Shegalov <gshegalov@nvidia.com>
Signed-off-by: Gera Shegalov <gshegalov@nvidia.com>
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-02g-columnar-scalar-utils branch from 0e024e3 to 9d75d70 Compare June 13, 2026 12:13
@gerashegalov gerashegalov force-pushed the codex/unshim-stack-02h-columnar-runtime-config branch from 275be6e to 66c2b1b Compare June 13, 2026 12:13
@gerashegalov gerashegalov marked this pull request as ready for review June 13, 2026 12:49
@greptile-apps

greptile-apps Bot commented Jun 13, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR migrates a set of columnar runtime-configuration helpers from Scala (sql-plugin) into Java (sql-plugin-columnar) to enable downstream Scala-free callers in the unshim stack. Call-sites in GpuReaderFactory.scala and WithRetrySuite.scala are updated to use Java constructors instead of the Scala companion-object apply methods.

  • Eight new Java classes are added (AutoCloseableTargetSize, CombineConf, DefaultThreadPoolConf, MemoryBoundedPoolConf, ThreadPoolConf, HostAllocResult, ExecutorCache, GpuOrcTimezoneUtils, DeviceBuffersUtils), faithfully porting the Scala originals with initialization-on-demand holders replacing lazy val.
  • Two Scala source files (ExecutorCache.scala, GpuOrcTimezoneUtils.scala) are deleted; four test call-sites and one Iceberg call-site are trivially updated.

Confidence Score: 4/5

The change is a straightforward Scala-to-Java translation with no intended behavior change; nearly all of it is safe, with one native-resource cleanup inconsistency in GpuOrcTimezoneUtils.java worth fixing before merge.

The closeAll(ColumnView[]) method in GpuOrcTimezoneUtils.java does not catch exceptions thrown by individual close() calls, so a single failing close leaves the remaining GPU columns in the array unclosed. This is inconsistent with both the closeAll(List<ColumnView>) overload in the same file and DeviceBuffersUtils.closeAll, and the array version is called from the finally path that releases all output columns.

sql-plugin-columnar/src/main/java/com/nvidia/spark/rapids/GpuOrcTimezoneUtils.java — the closeAll(ColumnView[]) method needs exception-safe iteration before merge.

Important Files Changed

Filename Overview
sql-plugin-columnar/src/main/java/com/nvidia/spark/rapids/GpuOrcTimezoneUtils.java Port of GpuOrcTimezoneUtils from Scala to Java; contains an inconsistency in closeAll(ColumnView[]) that can leak GPU column references if any close() call throws.
sql-plugin-columnar/src/main/java/com/nvidia/spark/rapids/DeviceBuffersUtils.java New Java utility for device buffer allocation and ref-count management; exception-safe with proper suppressed-exception chaining in closeAll.
sql-plugin-columnar/src/main/java/com/nvidia/spark/rapids/AutoCloseableTargetSize.java Java port of the Scala case class; exposes both public final fields and no-arg accessor methods for Java/Scala interop; no-op close() and serialization are intentional.
sql-plugin-columnar/src/main/java/com/nvidia/spark/rapids/ExecutorCache.java Correct translation of Scala lazy vals to the initialization-on-demand holder idiom; package-private visibility matches original private[rapids] scope.
sql-plugin-columnar/src/main/java/com/nvidia/spark/rapids/CombineConf.java Clean Scala case class to Java value class port; Serializable with explicit serialVersionUID, proper equals/hashCode.
sql-plugin-columnar/src/main/java/com/nvidia/spark/rapids/ThreadPoolConf.java New Java interface extending Serializable; simple and correct.
sql-plugin-columnar/src/main/java/com/nvidia/spark/rapids/DefaultThreadPoolConf.java New Java implementation of ThreadPoolConf; correct serialVersionUID and equals/hashCode.
sql-plugin-columnar/src/main/java/com/nvidia/spark/rapids/MemoryBoundedPoolConf.java New Java implementation of ThreadPoolConf with memory-bounded extensions; clean translation.
sql-plugin-columnar/src/main/java/com/nvidia/spark/rapids/HostAllocResult.java Straightforward Java value class holding a HostMemoryBuffer and pinned flag; no resource management concerns (caller owns the buffer).
iceberg/common/src/main/scala/org/apache/iceberg/spark/source/GpuReaderFactory.scala Trivial call-site update: CombineConf case-class apply → new CombineConf(...) Java constructor now that the class lives in Java.
tests/src/test/scala/com/nvidia/spark/rapids/WithRetrySuite.scala Four test call-sites updated from Scala apply to new Java constructor for AutoCloseableTargetSize; no logic change.
sql-plugin/src/main/scala/com/nvidia/spark/rapids/ExecutorCache.scala Deleted; functionality moved to ExecutorCache.java.
sql-plugin/src/main/scala/com/nvidia/spark/rapids/GpuOrcTimezoneUtils.scala Deleted; functionality moved to GpuOrcTimezoneUtils.java.

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    subgraph Before
        ScalaEC[ExecutorCache.scala\nlazy val]
        ScalaORC[GpuOrcTimezoneUtils.scala\nScala object]
        ScalaCC[CombineConf / AutoCloseableTargetSize\nScala case classes in sql-plugin]
    end

    subgraph After["After (sql-plugin-columnar Java)"]
        JavaEC[ExecutorCache.java\nInit-on-demand holders]
        JavaORC[GpuOrcTimezoneUtils.java\nfinal utility class]
        JavaCC[CombineConf / AutoCloseableTargetSize\nJava value classes]
        JavaTP[ThreadPoolConf interface\nDefaultThreadPoolConf / MemoryBoundedPoolConf]
        JavaDB[DeviceBuffersUtils.java]
        JavaHA[HostAllocResult.java]
    end

    ScalaEC -->|deleted / replaced by| JavaEC
    ScalaORC -->|deleted / replaced by| JavaORC
    ScalaCC -->|deleted / replaced by| JavaCC

    GpuReaderFactory[GpuReaderFactory.scala\niceberg] -->|new CombineConf| JavaCC
    WithRetrySuite[WithRetrySuite.scala\ntests] -->|new AutoCloseableTargetSize| JavaCC
Loading

Reviews (1): Last reviewed commit: "Move columnar runtime config helpers to ..." | Re-trigger Greptile

Comment on lines +138 to +144
private static void closeAll(ColumnView[] views) {
for (ColumnView view : views) {
if (view != null) {
view.close();
}
}
}

Copy link
Copy Markdown
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

P1 The closeAll(ColumnView[]) overload does not guard against exceptions thrown by view.close(). If element i throws, elements i+1 through the end of newColumns are never closed, leaking native GPU memory. The companion closeAll(List<ColumnView>) in the same file and DeviceBuffersUtils.closeAll both handle this correctly by catching each element's exception individually.

Suggested change
private static void closeAll(ColumnView[] views) {
for (ColumnView view : views) {
if (view != null) {
view.close();
}
}
}
private static void closeAll(ColumnView[] views) {
RuntimeException firstException = null;
for (ColumnView view : views) {
if (view != null) {
try {
view.close();
} catch (RuntimeException e) {
if (firstException == null) {
firstException = e;
} else {
firstException.addSuppressed(e);
}
}
}
}
if (firstException != null) {
throw firstException;
}
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants