fix(datasets): make `GBQTableDataset` serializable #961

deepyaman · 2024-12-11T07:23:48Z

Description

Resolve https://kedro.hall.community/using-pandas-with-bigquery-and-parallel-runners-hTdF7LG3995h#a921cd9d-aa25-4752-9307-1862590a833d by delaying `bigquery.Client()` creation, as done in similar datasets.
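For readers unfamiliar with the pattern, here is a minimal sketch of why delaying client creation helps with pickling (which parallel runners rely on). This is not the code from this PR: `FakeClient` and `LazyTableDataset` are hypothetical stand-ins, and the `__getstate__` hook is an extra illustration rather than something the PR is known to do.

```python
import pickle
import threading


class FakeClient:
    """Hypothetical stand-in for a BigQuery client.

    It holds a lock, so (like many real clients) it cannot be pickled.
    """

    def __init__(self, project):
        self.project = project
        self._lock = threading.Lock()


class LazyTableDataset:
    """Hypothetical dataset that keeps only plain arguments at init time."""

    def __init__(self, dataset, table_name, project=None):
        self._dataset = dataset
        self._table_name = table_name
        self._project = project
        self._client = None  # created on first use, not in __init__

    @property
    def client(self):
        # Build the client lazily so a freshly constructed dataset
        # contains nothing unpicklable.
        if self._client is None:
            self._client = FakeClient(project=self._project)
        return self._client

    def __getstate__(self):
        # Drop any already-created client; it is rebuilt on demand
        # after unpickling.
        state = self.__dict__.copy()
        state["_client"] = None
        return state


ds = LazyTableDataset("my_dataset", "my_table", project="my-project")
restored = pickle.loads(pickle.dumps(ds))  # works: no client exists yet
print(restored.client.project)
```

Creating the client eagerly in `__init__` would make `pickle.dumps(ds)` fail with a `TypeError` in this sketch, because the lock inside the client cannot be pickled.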

Development notes

Also, as it stands now, the `GBQQueryDataset` doesn't need a `bigquery.Client()`? I think it would make sense to potentially implement `GBQQueryDataset.exists()` (if not `query_or_table._is_query()`, checked with something like https://github.com/googleapis/python-bigquery-pandas/blob/78aa01e3f039500c9fabbcdd8d8dcfa3e5c42b73/pandas_gbq/gbq.py#L71), but I haven't done that in this PR. (You'd need to figure out what to do if the input is a query rather than a table; would you return `True` but raise a warning?)
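A rough sketch of what such an `exists()` could look like, purely for discussion. Everything here is hypothetical: `looks_like_query` uses a whitespace heuristic that is an assumption rather than a copy of the linked helper, and `table_exists` is an injected callable standing in for whatever table lookup would actually be used.

```python
import re
import warnings


def looks_like_query(query_or_table: str) -> bool:
    """Assumed heuristic: a string containing whitespace is SQL, not a table ID."""
    return re.search(r"\s", query_or_table.strip()) is not None


class QueryDatasetSketch:
    """Hypothetical stand-in for a query dataset; not the real class."""

    def __init__(self, query_or_table, table_exists):
        self._query_or_table = query_or_table
        # Hypothetical callable: table ID -> bool.
        self._table_exists = table_exists

    def exists(self) -> bool:
        if looks_like_query(self._query_or_table):
            # One possible answer to the open question: assume the data
            # exists, but tell the user that nothing was actually checked.
            warnings.warn(
                "Cannot check existence for a SQL query; assuming True.",
                stacklevel=2,
            )
            return True
        return self._table_exists(self._query_or_table)
```

Returning `True` with a warning keeps runners from treating a query-backed dataset as missing; raising instead would be the stricter alternative.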

Checklist

Opened this PR as a 'Draft Pull Request' if it is work-in-progress
Updated the documentation to reflect the code changes
Added a description of this change in the relevant RELEASE.md file
Added tests to cover my changes
Received approvals from at least half of the TSC (required for adding a new, non-experimental dataset)

Signed-off-by: Deepyaman Datta <[email protected]>

deepyaman · 2024-12-11T18:33:10Z

Relevant tests are passing; just need to go through the rigamarole of rerunning flaky tests until they pass before merging.

ankatiyar

LGTM

DimedS

Thanks, @deepyaman ! LGTM

deepyaman force-pushed the feat/datasets/serializable-gbq branch from e3e70f3 to f5d828f Compare December 11, 2024 15:35

fix(datasets): make GBQTableDataset serializable

c6a40e7

Signed-off-by: Deepyaman Datta <[email protected]>

deepyaman force-pushed the feat/datasets/serializable-gbq branch from f5d828f to c6a40e7 Compare December 11, 2024 15:41

deepyaman marked this pull request as ready for review December 11, 2024 15:41

docs(datasets): add change description for release

8ef65c9

Signed-off-by: Deepyaman Datta <[email protected]>

deepyaman enabled auto-merge (squash) December 11, 2024 15:49

deepyaman mentioned this pull request Dec 12, 2024

Release kedro-datasets 6.0.0 #942

Closed

7 tasks

Merge branch 'main' into feat/datasets/serializable-gbq

c7f283e

ravi-kumar-pilla requested review from DimedS and ankatiyar December 17, 2024 03:42

ankatiyar approved these changes Dec 17, 2024

DimedS approved these changes Dec 17, 2024

deepyaman merged commit 9054d99 into main Dec 17, 2024
12 of 13 checks passed

deepyaman deleted the feat/datasets/serializable-gbq branch December 17, 2024 13:00
