Add shards to valid_encodings to enable sharded Zarr writing #9948

Open
wants to merge 6 commits into
base: main

Conversation


@jacobbieker jacobbieker commented Jan 13, 2025

Adds shards to the list of valid_encodings in the zarr backend, so that sharded Zarr V3s can be written.
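For readers unfamiliar with why a missing entry blocks writing, here is a minimal sketch of an allow-list check on encoding keys. It is not xarray's actual implementation: `check_encoding`, `BASE_VALID_ENCODINGS`, and the contents of that set are hypothetical names and values chosen only for illustration. The only assumption taken from this PR is that encoding keys are validated against a `valid_encodings` collection, so a `shards` key is rejected until it is added.

```python
# Hypothetical sketch of an encoding allow-list check.
# This is NOT xarray's code; names and set contents are illustrative only.

# Illustrative allow-list that does not yet know about "shards".
BASE_VALID_ENCODINGS = {"chunks", "compressor", "filters"}


def check_encoding(encoding: dict, valid: set[str]) -> dict:
    """Return a copy of `encoding` if every key is in `valid`, else raise."""
    invalid = sorted(key for key in encoding if key not in valid)
    if invalid:
        raise ValueError(f"unexpected encoding parameters: {invalid}")
    return dict(encoding)


# Encoding a user might request for a sharded array.
requested = {"chunks": (5, 5), "shards": (10, 10)}

# Without "shards" in the allow-list, the request is rejected.
try:
    check_encoding(requested, BASE_VALID_ENCODINGS)
    rejected = False
except ValueError:
    rejected = True

# With "shards" added, the same request passes through unchanged.
accepted = check_encoding(requested, BASE_VALID_ENCODINGS | {"shards"})
```

In this sketch the fix is a one-element change to the allow-list, which mirrors the scope described in the PR summary above.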


welcome bot commented Jan 13, 2025

Thank you for opening this pull request! It may take us a few days to respond here, so thank you for being patient.
If you have questions, some answers may be found in our contributing guidelines.

@jhamman
Member

jhamman commented Jan 17, 2025

Thanks @jacobbieker for opening a PR. Do you have time to add a roundtrip test for this? This test would be a good test to emulate:

    def test_chunk_encoding(self) -> None:
        # These datasets have no dask chunks. All chunking specified in
        # encoding
        data = create_test_data()
        chunks = (5, 5)
        data["var2"].encoding.update({"chunks": chunks})
        with self.roundtrip(data) as actual:
            assert chunks == actual["var2"].encoding["chunks"]

@jacobbieker
Author

jacobbieker commented Jan 17, 2025

> Thanks @jacobbieker for opening a PR. Do you have time to add a roundtrip test for this? This test would be a good test to emulate:
>
>     def test_chunk_encoding(self) -> None:
>         # These datasets have no dask chunks. All chunking specified in
>         # encoding
>         data = create_test_data()
>         chunks = (5, 5)
>         data["var2"].encoding.update({"chunks": chunks})
>         with self.roundtrip(data) as actual:
>             assert chunks == actual["var2"].encoding["chunks"]

Yep! I think I've added one that works for that, and updated the Zarr V3 loading to include the shards. It seems to pass the tests locally so far at least.

Edit: Ah, it seems this may cause some issues with threads on 3.12?

Labels
bug, topic-zarr (Related to zarr storage library)
Development

Successfully merging this pull request may close these issues.

Unable to write to sharded Zarr V3 because shards not in valid_encodings
3 participants