fix(datasets): make GBQTableDataset serializable
#961
Merged
Description
Resolve https://kedro.hall.community/using-pandas-with-bigquery-and-parallel-runners-hTdF7LG3995h#a921cd9d-aa25-4752-9307-1862590a833d by delaying `bigquery.Client()` creation, as done in similar datasets.
Development notes
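For readers unfamiliar with the pattern, here is a minimal sketch of what "delaying client creation" can look like. It is not the diff in this PR. `FakeClient`, `EagerDataset`, and `LazyDataset` are hypothetical names, and `FakeClient` is only a stand-in that cannot be pickled because it holds a lock; the real `bigquery.Client()` may fail to serialize for different reasons.

```python
import pickle
import threading


class FakeClient:
    """Hypothetical stand-in for a client object that cannot be pickled."""

    def __init__(self, project):
        self.project = project
        self._lock = threading.Lock()  # lock objects are not picklable


class EagerDataset:
    """Creates the client in __init__, so the instance itself is unpicklable."""

    def __init__(self, project):
        self.client = FakeClient(project)


class LazyDataset:
    """Stores only plain config and builds the client on first use."""

    def __init__(self, project):
        self._project = project
        self._client = None

    @property
    def client(self):
        # Build the client the first time it is needed, then reuse it.
        if self._client is None:
            self._client = FakeClient(self._project)
        return self._client

    def __getstate__(self):
        # Drop any client that was already built; it is rebuilt after unpickling.
        state = self.__dict__.copy()
        state["_client"] = None
        return state


def can_pickle(obj):
    """Return True if obj survives pickle.dumps, False on a TypeError."""
    try:
        pickle.dumps(obj)
        return True
    except TypeError:
        return False
```

Under these assumptions, `can_pickle(EagerDataset("p"))` is false while `can_pickle(LazyDataset("p"))` is true, even after `.client` has been accessed. That is the property a runner needs when it sends datasets to worker processes.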
Also, as it stands now, `GBQQueryDataset` doesn't need a `bigquery.Client()`? I think it would make sense to potentially implement `GBQQueryDataset.exists()` (if not `query_or_table._is_query()`, checked with something like https://github.com/googleapis/python-bigquery-pandas/blob/78aa01e3f039500c9fabbcdd8d8dcfa3e5c42b73/pandas_gbq/gbq.py#L71), but I haven't done that in this PR. (You'd need to figure out what to do if it isn't a table; would you return `True` but raise a warning?)
Checklist
`RELEASE.md` file