Add groupby and groupby_apply functionality to metadata #383

Draft
wants to merge 8 commits into
base: dev
from

sjvenditto
Collaborator

Summary

This PR adds grouping functionality to metadata, inspired by the pandas implementation.

Two new methods are added to objects that carry metadata:

obj.groupby(by, get_group=None)

  • Parameters:
    • by (required): a string or list of strings naming the metadata columns to group by
    • get_group (optional): the name of a group; if supplied, a new, reduced object filtered to that group is returned
  • Returns:
    • if get_group is not supplied, a dictionary whose keys are the groups and whose values are the indices belonging to each group
    • if get_group is supplied, a pynapple object filtered to that group
  • Example:
>>> import pynapple as nap
>>> import numpy as np

>>> times = np.array([[0, 5], [10, 12], [20, 33]])
>>> metadata = {"l1": [1, 2, 3], "l2": ["x", "x", "y"]}
>>> ep = nap.IntervalSet(times, metadata=metadata)
>>> ep.groupby("l2")
{'x': [0, 1], 'y': [2]}

>>> ep.groupby("l2", "x")
  index    start    end    l1  l2
      0        0      5     1  x
      1       10     12     2  x
shape: (2, 2), time unit: sec.
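
To make the grouping semantics described above concrete, here is a minimal pure-Python sketch. It is not the PR's implementation: `group_indices` is a hypothetical helper, the metadata is modeled as a plain dictionary of columns, and the tuple keys used for multi-column grouping are an assumption borrowed from pandas conventions rather than something this PR states.

```python
from collections import defaultdict


def group_indices(metadata, by):
    """Map each group key to the list of row positions that belong to it.

    `metadata` is a dict of column name -> list of values (a stand-in for
    the real metadata table). `by` is a column name or a list of names.
    """
    columns = [by] if isinstance(by, str) else list(by)
    n_rows = len(metadata[columns[0]])
    groups = defaultdict(list)
    for i in range(n_rows):
        values = tuple(metadata[c][i] for c in columns)
        # Assumption: single-column keys are scalars, multi-column keys are tuples.
        key = values[0] if len(columns) == 1 else values
        groups[key].append(i)
    return dict(groups)


metadata = {"l1": [1, 2, 3], "l2": ["x", "x", "y"]}
print(group_indices(metadata, "l2"))          # {'x': [0, 1], 'y': [2]}
print(group_indices(metadata, ["l2", "l1"]))  # {('x', 1): [0], ('x', 2): [1], ('y', 3): [2]}
```

Under these assumptions, passing get_group would amount to selecting the rows listed under one key of this dictionary.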

obj.groupby_apply(by, func, grouped_arg=None, **func_kwargs)

  • Parameters:
    • by (required): same as for groupby
    • func (required): the function to apply to each group of the object
    • grouped_arg (optional): a string naming the argument of func that receives the grouped object. If None, the grouped object is passed as the first positional argument.
    • func_kwargs (optional): additional keyword arguments passed on to func
  • Returns:
    • dictionary with keys corresponding to groups and values corresponding to the function output for each group
  • Examples:
>>> ep.groupby_apply("l2", np.mean)
{'x': 6.75, 'y': 26.5}

>>> tsg = nap.TsGroup(
...     {
...         3: nap.Ts(t=np.arange(0, 40)),
...         2: nap.Ts(t=np.arange(0, 40, 0.5), time_units="s"),
...         4: nap.Ts(t=np.arange(0, 40, 0.2), time_units="s"),
...         10: nap.Ts(t=np.arange(0, 40, 1), time_units="s"),
...     },
...     metadata={"label": [1, 1, 2, 2]},
... )
>>> feature = nap.Tsd(t=np.arange(40), d=np.concatenate([np.zeros(20), np.ones(20)]))
>>> func_kwargs = {
...     "group": tsg,
...     "feature": feature,
...     "nb_bins": 2,
... }
>>> ep.groupby_apply("l2", nap.compute_1d_tuning_curves, "ep", **func_kwargs)
{'x':             2         3         4         10
 0.25  1.823362  1.025641  4.216524  1.025641
 0.75       NaN       NaN       NaN       NaN,
 'y':             2         3         4         10
 0.25       NaN       NaN       NaN       NaN
 0.75  1.978022  1.025641  4.835165  1.025641}
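
The dispatch rule for grouped_arg described above can be sketched in plain Python as follows. This is an illustration under stated assumptions, not the code in this PR: `apply_by_group`, `flat_mean`, and `total_duration` are hypothetical names, and intervals are modeled as simple (start, end) tuples rather than pynapple objects.

```python
def apply_by_group(rows, groups, func, grouped_arg=None, **func_kwargs):
    """Apply `func` to the subset of `rows` belonging to each group.

    `groups` maps a group key to a list of row positions. If `grouped_arg`
    is None the subset is passed as the first positional argument,
    otherwise it is passed under the keyword named by `grouped_arg`.
    """
    results = {}
    for key, idx in groups.items():
        subset = [rows[i] for i in idx]
        if grouped_arg is None:
            results[key] = func(subset, **func_kwargs)
        else:
            results[key] = func(**{grouped_arg: subset}, **func_kwargs)
    return results


def flat_mean(intervals):
    """Mean of all start and end values, mimicking np.mean on the raw array."""
    values = [v for interval in intervals for v in interval]
    return sum(values) / len(values)


def total_duration(scale, ep):
    """Hypothetical function whose grouped input is not its first argument."""
    return scale * sum(end - start for start, end in ep)


intervals = [(0, 5), (10, 12), (20, 33)]
groups = {"x": [0, 1], "y": [2]}
print(apply_by_group(intervals, groups, flat_mean))                      # {'x': 6.75, 'y': 26.5}
print(apply_by_group(intervals, groups, total_duration, "ep", scale=2))  # {'x': 14, 'y': 26}
```

The second call mirrors the tuning-curve example: the grouped object has to land in a specific keyword (`ep`), so it cannot simply be passed positionally.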

codecov bot commented Jan 3, 2025

Codecov Report

Attention: Patch coverage is 58.33333% with 10 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
pynapple/core/metadata_class.py 58.33% 6 Missing and 4 partials ⚠️
Files with missing lines Coverage Δ
pynapple/core/metadata_class.py 78.19% <58.33%> (-4.38%) ⬇️

Contributor

gviejo commented Jan 6, 2025

Looks great. If anything, I found the name "grouped_arg" a bit confusing. Maybe there is a better name for the argument?

2 participants