Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(datasets): make GBQTableDataset serializable #961

Merged
merged 3 commits into from
Dec 17, 2024

Conversation

deepyaman
Copy link
Member

@deepyaman deepyaman commented Dec 11, 2024

Description

Resolve https://kedro.hall.community/using-pandas-with-bigquery-and-parallel-runners-hTdF7LG3995h#a921cd9d-aa25-4752-9307-1862590a833d by delaying bigquery.Client() creation, as done in similar datasets.

Development notes

Also, as it stands now, the GBQQueryDataset doesn't need a bigquery.Client()? I think it would make sense to potentially implement GBQQueryDataset.exists() (if not query_or_table._is_query(), checked with something like https://github.com/googleapis/python-bigquery-pandas/blob/78aa01e3f039500c9fabbcdd8d8dcfa3e5c42b73/pandas_gbq/gbq.py#L71), but I haven't done that in this PR. (You'd need to figure out what to do if it isn't a table; would you return True but raise a warning?)

Checklist

  • Opened this PR as a 'Draft Pull Request' if it is work-in-progress
  • Updated the documentation to reflect the code changes
  • Added a description of this change in the relevant RELEASE.md file
  • Added tests to cover my changes
  • Received approvals from at least half of the TSC (required for adding a new, non-experimental dataset)

@deepyaman deepyaman force-pushed the feat/datasets/serializable-gbq branch from e3e70f3 to f5d828f Compare December 11, 2024 15:35
@deepyaman deepyaman force-pushed the feat/datasets/serializable-gbq branch from f5d828f to c6a40e7 Compare December 11, 2024 15:41
@deepyaman deepyaman marked this pull request as ready for review December 11, 2024 15:41
@deepyaman deepyaman enabled auto-merge (squash) December 11, 2024 15:49
@deepyaman
Copy link
Member Author

Relevant tests are passing; just need to go through the rigamarole of rerunning flaky tests pass before merging.

@deepyaman deepyaman mentioned this pull request Dec 12, 2024
7 tasks
Copy link
Contributor

@ankatiyar ankatiyar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Copy link
Member

@DimedS DimedS left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, @deepyaman ! LGTM

@deepyaman deepyaman merged commit 9054d99 into main Dec 17, 2024
12 of 13 checks passed
@deepyaman deepyaman deleted the feat/datasets/serializable-gbq branch December 17, 2024 13:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants