[SPARK-48665][PYTHON][CONNECT] Support providing a dict in pyspark lit to create a map. #49318

Open: wants to merge 11 commits into base: master

@skanderboudawara skanderboudawara commented Dec 27, 2024

What changes were proposed in this pull request?

Reopening the PR by Ronserruya and adding small changes.

Added the option to pass a dict to pyspark.sql.functions.lit in order to create a map.

Why are the changes needed?

To make it easier to create a map in PySpark.
Currently, this is only possible via create_map, which requires a flat sequence of key, value, key, value, ...
Scala already supports this functionality through typedLit.

A similar PR previously added the ability to create an array from a list, so I tried to follow the changes made there as well.
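
For illustration, here is a minimal sketch of the flattening step that create_map currently forces on callers. The helper name flatten_items is hypothetical and is not part of PySpark or of this PR; it only shows how a dict's items interleave into the key, value, key, value order.

```python
from itertools import chain
from typing import Any, Dict, List


def flatten_items(mapping: Dict[Any, Any]) -> List[Any]:
    """Interleave keys and values: {"a": 1, "b": 2} -> ["a", 1, "b", 2]."""
    return list(chain.from_iterable(mapping.items()))


flat = flatten_items({"a": 1, "b": 2})
print(flat)  # ['a', 1, 'b', 2]

# With PySpark installed, the flattened sequence could be wrapped like this
# (left as a comment so the snippet runs without a Spark installation):
#   from pyspark.sql import functions as F
#   map_col = F.create_map(*[F.lit(x) for x in flat])
```

Accepting a dict directly in lit would hide this interleaving from the caller.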

Does this PR introduce any user-facing change?

Yes, the docstring of lit was edited and new functionality was added.

Before:

from pyspark.sql import functions as F
F.lit({"a": 1})
# pyspark.errors.exceptions.captured.SparkRuntimeException: [UNSUPPORTED_FEATURE.LITERAL_TYPE] The feature is not supported: Literal for '{asd=2}' of class java.util.HashMap.

After:

from pyspark.sql import functions as F
F.lit({"a": 1, "b": 2})
# Column<'map(a, 1, b, 2)'>

How was this patch tested?

Manual tests plus unit tests in CI.

Was this patch authored or co-authored using generative AI tooling?

No

@@ -262,6 +277,14 @@ def lit(col: Any) -> Column:
                errorClass="COLUMN_IN_LIST", messageParameters={"func_name": "lit"}
            )
        return array(*[lit(item) for item in col])
    elif isinstance(col, dict):
Member

The problem is that a dict can also be mapped to a struct.

Member

You might need a configuration or parameter to control this.
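
To make the ambiguity concrete, here is a rough sketch in plain Python. It assumes that a map needs one key type and one value type while a struct only needs string field names, and it checks Python types only, which is a simplification and not Spark's actual type inference. The helper names fits_map and fits_struct are hypothetical.

```python
from typing import Any, Dict


def fits_map(d: Dict[Any, Any]) -> bool:
    """True if all keys share one Python type and all values share one Python type."""
    key_types = {type(k) for k in d}
    value_types = {type(v) for v in d.values()}
    return len(key_types) <= 1 and len(value_types) <= 1


def fits_struct(d: Dict[Any, Any]) -> bool:
    """True if every key is a string, so it could serve as a struct field name."""
    return all(isinstance(k, str) for k in d)


homogeneous = {"a": 1, "b": 2}
mixed = {"name": "x", "age": 3}

print(fits_map(homogeneous), fits_struct(homogeneous))  # True True
print(fits_map(mixed), fits_struct(mixed))  # False True
```

The first dict passes both checks, so the input alone cannot tell lit whether the caller wants a map or a struct.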

Author

I have added a to_struct parameter, if you want to check.
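
The implementation behind to_struct is not shown in this thread, so the following is only a sketch of how such a flag could route a dict. The function plan_dict_literal is hypothetical, and the default of to_struct=False is an assumption. It returns a description of the call rather than a Column so that it runs without Spark.

```python
from itertools import chain
from typing import Any, Dict, List, Tuple


def plan_dict_literal(d: Dict[Any, Any], to_struct: bool = False) -> Tuple[str, List[Any]]:
    """Describe which Spark constructor a dict literal would be routed to.

    Hypothetical helper: returns a (function name, arguments) pair instead of
    building a Column. The default to_struct=False is an assumption.
    """
    if to_struct:
        if not all(isinstance(k, str) for k in d):
            raise TypeError("struct field names must be strings")
        # One (field name, value) pair per struct field; in PySpark this could
        # become F.struct(*[F.lit(v).alias(k) for k, v in d.items()]).
        return "struct", list(d.items())
    # create_map expects key, value, key, value, ...
    return "create_map", list(chain.from_iterable(d.items()))


print(plan_dict_literal({"a": 1, "b": 2}))
# ('create_map', ['a', 1, 'b', 2])
print(plan_dict_literal({"a": 1, "b": 2}, to_struct=True))
# ('struct', [('a', 1), ('b', 2)])
```

An explicit flag keeps the choice with the caller instead of guessing from the dict's contents.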
