Commit 38d8042

committed
Removed databricks-connect by default in notebooks
Signed-off-by: Miguel Rodriguez Gutierrez <[email protected]>
1 parent ca881f1 commit 38d8042

File tree

2 files changed: +9 −3 lines changed


kedro-datasets/RELEASE.md  (+3)

@@ -20,13 +20,16 @@
 * Fixed deprecated load and save approaches of GBQTableDataset and GBQQueryDataset by invoking save and load directly over `pandas-gbq` lib
 
 ## Breaking Changes
+* Now `_get_spark()` does not use `databricks-connect` by default when run in a Databricks notebook
+
 ## Community contributions
 Many thanks to the following Kedroids for contributing PRs to this release:
 * [Brandon Meek](https://github.com/bpmeek)
 * [yury-fedotov](https://github.com/yury-fedotov)
 * [gitgud5000](https://github.com/gitgud5000)
 * [janickspirig](https://github.com/janickspirig)
 * [Galen Seilis](https://github.com/galenseilis)
+* [MigQ2](https://github.com/MigQ2)
 
 
 # Release 4.1.0

kedro-datasets/kedro_datasets/spark/spark_dataset.py  (+6 −3)

@@ -38,7 +38,10 @@ def _get_spark() -> Any:
     extended configuration mechanisms and notebook compatibility,
     otherwise we use classic pyspark.
     """
-    try:
+    if (
+        "DATABRICKS_RUNTIME_VERSION" in os.environ
+        and int(os.environ["DATABRICKS_RUNTIME_VERSION"].split(".")[0]) >= 13
+    ):
         # When using databricks-connect >= 13.0.0 (a.k.a databricks-connect-v2)
         # the remote session is instantiated using the databricks module
         # If the databricks-connect module is installed, we use a remote session
@@ -47,9 +50,9 @@ def _get_spark() -> Any:
         # We can't test this as there's no Databricks test env available
         spark = DatabricksSession.builder.getOrCreate()  # pragma: no cover
 
-    except ImportError:
+    else:
         # For "normal" spark sessions that don't use databricks-connect
-        # we get spark normally
+        # or for databricks-connect<13 we get spark "normally"
         spark = SparkSession.builder.getOrCreate()
 
     return spark
