API for a Sliceable Metadata Class #56

137 changes: 137 additions & 0 deletions SEP-0013.md
@@ -0,0 +1,137 @@
# SEP-0013 -- A Metadata Class that Supports Axis-aligned Metadata

| SEP | 13 |
|---------------|------------------------------------------------------|
| title | A Metadata Class that Supports Axis-aligned Metadata |
| author(s) | [Daniel Ryan](https://orcid.org/0000-0001-8661-3825) |
| contact email | [email protected] |
| date-creation | 2021-11-21 |
| type | standard |
| discussion | link to discussion if available |
| status | discussion |

# Introduction
Currently in `ndcube` and `sunpy`, the metadata object attached to data classes
like `NDCube` and `Map` can be of any `dict`-like type.
While this level of generality allows users to use the metadata object however
they want, it prevents the data classes from interacting with the metadata object
because there is no standardized API.
One important use-case is supporting axis-aligned metadata, i.e. a piece of
metadata that has a different value for different indices in the data array,
e.g. exposure time.
This SEP strives to define a `dict`-like metadata API that is independent of the
type of data it describes and enables, but does not require, axis-aligned metadata.

The API enables:
- Access to the metadata values via their name key;
- Comments associated with specific pieces of metadata;
- Editing of the metadata values and/or comments;
- The preservation of a copy of the original metadata values;
- Addition and removal of metadata;
- Association of metadata with one or more data axes;
- Slicing of the metadata object given a slice item valid for the data array the metadata object describes.
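
To make axis association concrete, here is a minimal sketch that uses plain dicts as stand-ins for the `header` and `axes` inputs. The helper `check_axis_alignment` and the example values are hypothetical and not part of the proposed API. It only illustrates the requirement that axis-aligned metadata have the same shape as the data axes it is associated with.

```python
import numpy as np

# Hypothetical data array: 3 exposures (axis 0), each with 4 pixels (axis 1).
data = np.ones((3, 4))

# Plain-dict stand-ins for the header and axes inputs described in this SEP.
header = {"instrument": "demo", "exposure_time": np.array([1.0, 2.0, 4.0])}
axes = {"exposure_time": (0,)}  # exposure_time varies along data axis 0


def check_axis_alignment(header, axes, data_shape):
    """Return the names of metadata whose shape does not match their axes."""
    mismatched = []
    for name, axis in axes.items():
        # Shape the value must have, given the data axes it is tied to.
        expected = tuple(data_shape[i] for i in axis)
        if np.shape(header[name]) != expected:
            mismatched.append(name)
    return mismatched


mismatched = check_axis_alignment(header, axes, data.shape)
```

In this sketch `instrument` has no entry in `axes`, so it is treated as not associated with any axis and is never checked.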

It is anticipated that this API will be implemented in `ndcube`.
The motivating use-case is storing exposure time and enabling `sunraster` data objects
to normalize their intensity values.
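
The exposure-time use-case can be sketched with NumPy alone. The array values, the variable names, and the helper `normalize_by_exposure` are assumptions made for illustration. This is not how `sunraster` implements normalization, only the kind of operation that axis-aligned metadata would make possible.

```python
import numpy as np

# Hypothetical raster: 3 exposures (axis 0) of 4 pixels each (axis 1).
intensity = np.array([[2.0, 4.0, 6.0, 8.0],
                      [3.0, 6.0, 9.0, 12.0],
                      [10.0, 20.0, 30.0, 40.0]])
exposure_time = np.array([2.0, 3.0, 10.0])  # seconds, aligned with axis 0
exposure_axis = 0


def normalize_by_exposure(data, exposure, axis):
    """Divide data by a 1-D exposure array aligned with the given data axis."""
    # Reshape exposure so it broadcasts along `axis` only.
    shape = [1] * data.ndim
    shape[axis] = -1
    return data / exposure.reshape(shape)


normalized = normalize_by_exposure(intensity, exposure_time, exposure_axis)
```

Because the metadata records which axis the exposure times belong to, the data class can do the broadcasting itself instead of relying on the user to know the layout.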

This metadata API does not standardize the names of specific metadata associated
with particular data types (e.g. images) or instruments.
Instead it is hoped that such metadata schemes can be implemented on top of this API.

# MetaABC API
Below is the class definition of an Abstract Base Class (ABC) for a `Meta` object.

```python
import abc
from typing import Any, Iterable, Mapping, Optional, Union

import numpy


class MetaABC(dict, metaclass=abc.ABCMeta):

    def __init__(self,
                 header: Optional[Mapping] = None,
                 comments: Optional[Mapping] = None,
                 axes: Optional[Mapping] = None,
                 data_shape: Optional[Iterable[int]] = None):
        """
        A sliceable object for storing metadata.

        Metadata can be linked to a data array axis. This causes it to be sliced when the
        standard Python numeric slicing API is applied to the object.
        Specific pieces of metadata can be obtained using the dict-like str slicing API.
        Metadata associated with an axis/axes must have the same length/shape as those axes.

        Parameters
        ----------
        header:
            The names and values of metadata.
        comments:
            Comments associated with any of the above pieces of metadata.
            This does not have to include a key for every key in header,
            but every key in comments must also be in header.
        axes:
            The axis/axes associated with the above metadata values.
            Each axis value must be None (for no axis association), `int`
            or an iterable of `int` if the metadata is associated with multiple axes.
            Metadata in header without a corresponding entry here are assumed to not
            be associated with an axis.
            All keys in axes must also be in header.
        data_shape:
            The shape of the data array with which this metadata is associated.
            Must be set if axes input is set.
        """

    # Note: @property must be the outer decorator so that abstractmethod
    # is applied to the plain function.
    @property
    @abc.abstractmethod
    def original_header(self) -> Any:
        """A copy of the original header input."""

    @property
    @abc.abstractmethod
    def comments(self) -> dict:
        """Returns a dict of comments associated with specific pieces of metadata."""

    @property
    @abc.abstractmethod
    def axes(self) -> dict:
        """Returns a dict of array axes with which specific pieces of metadata are associated.

        Axes are returned as tuples of ints. If the metadata is not associated with
        any array axis, None is returned. Metadata keys absent from this dict are assumed
        to be associated with no array axes.
        """

    @property
    @abc.abstractmethod
    def shape(self) -> numpy.ndarray:
        """1-D array giving the length of each array axis of the associated data array."""

    @abc.abstractmethod
    def add(self,
            name: str,
            value: Any,
            comment: Optional[str] = None,
            axis: Optional[Union[int, Iterable[int]]] = None,
            overwrite: bool = False):
        """Add a new piece of metadata to the instance."""

    @abc.abstractmethod
    def remove(self,
               name: str):
        """Remove a piece of metadata from the instance."""

    @abc.abstractmethod
    def __getitem__(self, item: Union[int, slice, Iterable[Union[slice, int]]]) -> "MetaABC":
        """Slice the Meta object.

        The input item must be a valid slice item for an array with a shape given by
        MetaABC.shape. If MetaABC.shape is None, then this method should error.
        This method will alter the following things:
        * The values of axis-aligned metadata;
        * The axes of axis-aligned metadata if any data axes are dropped by the slicing;
        * MetaABC.shape.
        Data axes can be dropped by slicing if an integer is applied to an axis. This means
        that axis indices greater than the dropped axis will be decremented by 1.
        Metadata, all of whose axes are dropped by the slicing, will have a new axis value of None.
        """
```
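
The slicing rules in the `__getitem__` docstring above can be illustrated with a standalone function. This is a rough sketch under stated assumptions, not the proposed implementation: `slice_meta` is a hypothetical helper, it works on plain dicts instead of a `Meta` object, and it only handles items made of integers and slices (no `Ellipsis` or `None`).

```python
import numpy as np


def slice_meta(header, axes, data_shape, item):
    """Slice axis-aligned metadata with an item valid for the data array."""
    if data_shape is None:
        raise TypeError("Cannot slice metadata that has no data shape.")
    if not isinstance(item, tuple):
        item = (item,)
    # Pad with full slices so there is one entry per data axis.
    item = item + (slice(None),) * (len(data_shape) - len(item))

    # Data axes indexed with an integer are dropped from the result.
    dropped = [i for i, it in enumerate(item) if isinstance(it, (int, np.integer))]
    kept = [i for i in range(len(data_shape)) if i not in dropped]
    # Map each surviving old axis index to its new, decremented index.
    new_index = {old: new for new, old in enumerate(kept)}

    new_header, new_axes = dict(header), {}
    for name, axis in axes.items():
        if axis is None:
            new_axes[name] = None
            continue
        axis = (axis,) if isinstance(axis, int) else tuple(axis)
        # Apply to the value only the parts of the item for its own axes.
        new_header[name] = np.asarray(header[name])[tuple(item[i] for i in axis)]
        remaining = tuple(new_index[i] for i in axis if i in new_index)
        # Metadata whose axes were all dropped is no longer axis-aligned.
        new_axes[name] = remaining if remaining else None

    # Length of each surviving axis after its slice is applied.
    new_shape = tuple(len(range(*item[i].indices(data_shape[i]))) for i in kept)
    return new_header, new_axes, new_shape


# Example inputs (all values are made up for illustration).
header = {"instrument": "demo",
          "exposure_time": np.array([1.0, 2.0, 4.0]),
          "wavelength": np.array([10.0, 20.0, 30.0, 40.0, 50.0])}
axes = {"exposure_time": 0, "wavelength": (2,)}
new_header, new_axes, new_shape = slice_meta(
    header, axes, (3, 4, 5), (1, slice(None), slice(0, 2)))
```

In this example the integer index drops data axis 0, so `exposure_time` becomes a scalar with an axis value of `None`, while `wavelength` is cut to its first two values and its axis index moves from 2 to 1.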

# Decision Rationale
This is a great idea because...