serialization of columns added into the definition of the table #1733

matteocacciola · 2025-05-13T10:21:12Z

Important

Add column serialization to DataframeSerializer and update type annotations in semantic_layer_schema.py.

Behavior:
- DataframeSerializer.serialize() now includes column metadata in the serialized output.
- Updates test_serialize_with_name_and_description and test_serialize_with_name_and_description_with_dialect in test_dataframe_serializer.py to check for column serialization.
Type Annotations:
- Use Python 3.10 union type syntax (str | None) in semantic_layer_schema.py for Column, Relation, TransformationParams, Transformation, Source, Destination, and SemanticLayerSchema classes.
Testing:
- Adds tests for column serialization in test_dataframe_serializer.py.

^{This description was created by}^{for f60b9b3. You can customize this summary. It will automatically update as commits are pushed.}

ellipsis-dev

Important

Looks good to me! 👍

Reviewed everything up to f60b9b3 in 1 minute and 20 seconds. Click for details.

Reviewed 315 lines of code in 4 files
Skipped 0 files when reviewing.
Skipped posting 4 draft comments. View those below.
Modify your settings and rules to customize what types of comments Ellipsis leaves. And don't forget to react with 👍 or 👎 to teach Ellipsis.

1. CONTRIBUTING.md:56

Draft comment:
Typo detected: change "usee" to "use" in the codespell section.
Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.

2. pandasai/helpers/dataframe_serializer.py:31

Draft comment:
The columns metadata is now serialized via json.dumps, which resolves Columns not added during the serialization of the DataFrame #1714. Consider noting that if field order matters, enforcing sorted keys (or relying on deterministic model_dump ordering) might improve test stability.
Reason this comment was not posted:
Comment looked like it was already resolved.

3. tests/unit_tests/helpers/test_dataframe_serializer.py:8

Draft comment:
The expected strings now include the 'columns' attribute. Ensure that the order of keys in the serialized JSON matches the expected output consistently. If order becomes non-deterministic, tests might fail.
Reason this comment was not posted:
Comment looked like it was already resolved.

4. CONTRIBUTING.md:56

Draft comment:
Typo: In the Spell check section, "We usee codespell..." should be corrected to "We use codespell...".
Reason this comment was not posted:
Comment was not on a location in the diff, so it can't be submitted as a review comment.

Workflow ID: wflow_rg5lomeeqiDGFaM6

^{You can customize}^{by changing your verbosity settings, reacting with 👍 or 👎, replying to comments, or adding code review rules.}

ellipsis-dev bot reviewed May 13, 2025

View reviewed changes

serialization of columns added into the definition of the table

6e01a99

matteocacciola force-pushed the fix/serialize-dataframe-columns branch from f60b9b3 to 6e01a99 Compare May 22, 2025 17:29

Provide feedback