2 changes: 2 additions & 0 deletions dist/unshimmed-from-each-spark3xx.txt
@@ -1,3 +1,5 @@
com/nvidia/spark/rapids/Arm*
com/nvidia/spark/rapids/CloseableHolder*
com/nvidia/spark/rapids/*/RapidsShuffleManager.class
com/nvidia/spark/rapids/AvroProvider.class
com/nvidia/spark/rapids/HiveProvider.class
42 changes: 42 additions & 0 deletions docs/additional-functionality/rapids-udfs.md
@@ -93,6 +93,48 @@ closed by the RAPIDS Accelerator, so the UDF only needs to close any
intermediate data generated while producing the final result that is
returned.

#### ARM Helper Methods

For Scala, helper methods implementing the automatic resource management
pattern are exposed via the public Spark RAPIDS API under the `Arm` object:

Collaborator: It is worth pointing out that this originated as a solution for Scala 2.12, before `Using` was introduced in Scala 2.13. Thus, with 2.13 builds of Apache Spark, `Using` is an alternative as well.

Collaborator, Author: Done

```scala
import com.nvidia.spark.rapids.Arm.{withResource, closeOnExcept}
```

Collaborator: If we show both we should explain the difference and when to use one over the other

Collaborator, Author: Good point, done

These patterns serve different use cases:
- `withResource`: closes the resource after the code block completes regardless
of whether it succeeds or throws an exception. This should be used for intermediate
resources that are unused beyond the code block.
- `closeOnExcept`: only closes the resource if an exception occurs within the
code block; if the block succeeds, the resource remains open. This should be used
if you want to continue using the resource after the code block but need to clean
up if an error occurs.

Examples:

```scala
override def evaluateColumnar(numRows: Int, args: ColumnVector*): ColumnVector = {
  // nulls is always closed after the code block
  val nullMask = withResource(args.head.isNull()) { nulls =>
    nulls.not()
  }
  // ...
}
```

```scala
override def evaluateColumnar(numRows: Int, args: ColumnVector*): ColumnVector = {
  // nulls is only closed if an exception occurs
  val nullsCol = closeOnExcept(args.head.isNull()) { nulls =>
    validate(nulls) // do something that might throw an exception
    nulls
  }
  // ...
}
```
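
To make the difference between the two helpers concrete without a GPU or the plugin on the classpath, the following is a minimal, self-contained sketch. The `ArmSketch` object, its simplified `withResource` and `closeOnExcept` definitions, and the `Tracked` class are hypothetical stand-ins written for illustration; they are not the RAPIDS Accelerator implementation, which may handle details such as suppressed exceptions differently.

```scala
// Illustration only: simplified stand-ins, not the com.nvidia.spark.rapids.Arm code.
object ArmSketch {
  // Hypothetical resource that records whether close() has been called.
  final class Tracked extends AutoCloseable {
    var closed: Boolean = false
    override def close(): Unit = { closed = true }
  }

  // Closes `r` after `block` finishes, whether it returns or throws.
  def withResource[T <: AutoCloseable, V](r: T)(block: T => V): V = {
    try {
      block(r)
    } finally {
      r.close()
    }
  }

  // Closes `r` only if `block` throws; on success the caller still owns `r`.
  def closeOnExcept[T <: AutoCloseable, V](r: T)(block: T => V): V = {
    try {
      block(r)
    } catch {
      case t: Throwable =>
        r.close()
        throw t
    }
  }

  // Returns the closed flags observed after each of the three scenarios.
  def demo(): (Boolean, Boolean, Boolean) = {
    // 1. withResource: closed even though the block succeeded.
    val a = new Tracked
    withResource(a) { _ => () }

    // 2. closeOnExcept, success: the resource is still open afterwards.
    val b = new Tracked
    val kept = closeOnExcept(b) { r => r }
    val closedAfterSuccess = kept.closed
    kept.close() // the caller is responsible for closing it eventually

    // 3. closeOnExcept, failure: the resource is closed before the exception propagates.
    val c = new Tracked
    try {
      closeOnExcept(c) { _ => throw new IllegalStateException("boom") }
    } catch {
      case _: IllegalStateException => // expected in this sketch
    }

    (a.closed, closedAfterSuccess, c.closed)
  }

  def main(args: Array[String]): Unit = {
    println(demo()) // prints (true,false,true)
  }
}
```

In this sketch, the second scenario is the one that matters for a UDF that returns the resource: ownership passes to the caller on success, so nothing is leaked if validation throws and nothing is closed prematurely if it does not.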

Note that these methods originated as a solution for Scala 2.12. Scala 2.13 introduced
the [Using](https://www.scala-lang.org/api/2.13.6/scala/util/Using$.html) utility, which is an alternative to `withResource` for 2.13 builds of Apache Spark.
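
As a rough illustration of the Scala 2.13 alternative, the sketch below uses `scala.util.Using` with a hypothetical `TrackedBuffer` resource defined only for this example. It assumes a Scala 2.13 or later standard library and does not depend on the RAPIDS Accelerator.

```scala
import scala.util.{Try, Using}

object UsingSketch {
  // Hypothetical resource for illustration; records whether close() was called.
  final class TrackedBuffer extends AutoCloseable {
    var closed: Boolean = false
    override def close(): Unit = { closed = true }
  }

  def demo(): (Int, Boolean, Int) = {
    val buf = new TrackedBuffer

    // Using.resource closes `buf` when the block completes and rethrows any
    // exception, which is the closest match to withResource.
    val length = Using.resource(buf) { _ => "abc".length }

    // Using(...) also closes the resource, but wraps the outcome in a Try
    // instead of throwing.
    val wrapped: Try[Int] = Using(new TrackedBuffer) { _ => 42 }

    (length, buf.closed, wrapped.get)
  }

  def main(args: Array[String]): Unit = {
    println(demo()) // prints (3,true,42)
  }
}
```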

### Generating Columnar Output

The `evaluateColumnar` method must return a `ColumnVector` of an appropriate