ArraySettings.output_key and downsampling_method shouldn't gate the presence/absence of ngff multiscales group #182

Description

@tlambert03

This is basically the same issue as #170, but extended to any array. It also touches on the output_key issue mentioned in #181 (but I'll leave that issue to focus on docs, while this one focuses on the expected output). Briefly: there's too much magic going on when it comes to deciding what sort of zarr hierarchy to create. output_key and downsampling_method are both doing double duty: besides their obvious job, each also determines what the structure of the zarr hierarchy looks like.

Here's the TL;DR feature/bug-fix request, with the background/rationale below:

  1. I think ArraySettings.output_key should only determine path, not structure: its presence/absence shouldn't gate whether you get an array node or an OME multiscales group. (Side note: I do like that there is a way to directly write only the array node, but I think that result is currently a little conflated and unclear.)
  2. I think downsampling_method should only determine the downsampling method, and possibly whether or not downsampling occurs. Its presence should also not gate whether you get an array node or a multiscales group.
  3. chunk_size_px should not determine how many downsampling levels you get. Sure, it can determine the maximum valid number of scales, but you should be able to select the number of downsampled levels without modifying the chunk layout of the top-level array.
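
To make the three requests above concrete, here is a minimal sketch of what decoupled settings could look like. Everything in it is hypothetical: `ProposedArraySettings`, the `multiscale` and `num_levels` fields, and `plan_node` are illustrative names, not the library's API, and `max_valid_levels` stands in for whatever limit the chunk layout would impose.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class ProposedArraySettings:
    """Hypothetical stand-in for illustration; NOT the library's ArraySettings."""
    output_key: str = ""                        # path only, never structure
    downsampling_method: Optional[str] = None   # method only (None = no downsampling)
    multiscale: bool = True                     # hypothetical explicit structure switch
    num_levels: Optional[int] = None            # hypothetical; None = as many as are valid


def plan_node(s: ProposedArraySettings, max_valid_levels: int) -> Tuple[str, str, int]:
    """Return (path, node kind, number of scale levels) for one array."""
    path = s.output_key or "<root>"
    if not s.multiscale:
        # Explicit opt-out: write a bare array node at the given path.
        return path, "array", 1
    if s.downsampling_method is None:
        # Multiscales group with a single scale; no downsampling needed.
        return path, "multiscales", 1
    # The chunk layout only caps the number of levels; it does not choose it.
    requested = s.num_levels if s.num_levels is not None else max_valid_levels
    return path, "multiscales", max(1, min(requested, max_valid_levels))
```

With this shape, `ProposedArraySettings(output_key="Pos0")` would describe a single-scale multiscales group at `Pos0`, which is the case that currently has no clean spelling.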

Note

Background: generally speaking, I think most typical experiments fall into one of three categories (at least as far as what ome-zarr can currently represent):

  1. A single ≤5D image (TCZYX, at a single position). Multiple scales are optional.
  2. An HCS plate, where each FOV is a single ≤5D image.
  3. A collection of ≤5D images, usually corresponding to different positions. There's no great spec for this at the moment (there's an old, ongoing discussion in "Collections Specification", ome/ngff#31), but the best current option is piggybacking on the bioformats2raw.layout transitional spec.

This issue is mostly about category 3 (collections), but it also touches on category 1 (single images, with or without multiple scales).

Generating a valid OME multiscales image for a single position is easy enough. Just use the standard pattern, but make sure to leave output_key empty. This works (though it's semantically a bit confusing/magic, and that's what #181 is discussing):

settings = StreamSettings(
    arrays=[
        ArraySettings(
            output_key="",  # any other value will NOT result in an OME multiscales group
            ...
        )
    ]
)

However, generating a collection of multiscale images is currently not possible unless you also opt in to downsampling.

This will NOT work. You will get root.zarr/Pos0 and root.zarr/Pos1, but providing output_key means that each of these is directly an Array node; no multiscales metadata is written.

settings = StreamSettings(
    arrays=[
        ArraySettings(
            output_key="Pos0",
            ...
        ),
        ArraySettings(
            output_key="Pos1",
            ...
        )
    ]
)

This will work, but it's a hack (the presence of downsampling_method shouldn't be related to group structure), and there's no way to say "I only want a single scale": the number of downsampled scales is always automagically calculated from the chunking scheme.

settings = StreamSettings(
    arrays=[
        ArraySettings(
            output_key="Pos0",
            downsampling_method=DownsamplingMethod.MEAN,
            ...
        ),
        ArraySettings(
            output_key="Pos1",
            downsampling_method=DownsamplingMethod.MEAN,
            ...
        )
    ]
)
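
Putting the three snippets above side by side, the current gating can be summarized in a small sketch. This models only the behavior as described in this issue, not the library's actual implementation; `current_node_kind` is a hypothetical helper name, and the downsampling method is passed as a plain string to keep the example self-contained.

```python
from typing import Optional


def current_node_kind(output_key: str, downsampling_method: Optional[str] = None) -> str:
    """Model of the behavior described in this issue (an assumption, not library code).

    A multiscales group is written only if output_key is empty or a
    downsampling method is set; otherwise a bare array node is written.
    """
    if output_key == "" or downsampling_method is not None:
        return "multiscales group"
    return "array"


# The three cases from the snippets above.
for key, method in [("", None), ("Pos0", None), ("Pos0", "mean")]:
    print(repr(key), method, "->", current_node_kind(key, method))
```

The point of the sketch is that neither argument is about structure by name, yet together they are the only inputs that decide it.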
