New Feature: support user-defined functions (#146)
- Support for user-defined functions (UDFs). Note that these require annotating the return type. For more on the theory behind them, see ADR-003.
from siuba.siu import symbolic_dispatch
from pandas.core.groupby import SeriesGroupBy, GroupBy
from pandas import Series
@symbolic_dispatch(cls = Series)
def cummean(x):
    """Return a same-length array, containing the cumulative mean."""
    return x.expanding().mean()
@cummean.register(SeriesGroupBy)
def _cummean_grouped(x) -> SeriesGroupBy:
    grouper = x.grouper
    # count of non-missing entries seen so far within each group
    n_entries = x.obj.notna().groupby(grouper).cumsum()
    res = x.cumsum() / n_entries
    return res.groupby(grouper)
from siuba import _, mutate
from siuba.data import mtcars
# a pandas DataFrameGroupBy object
g_cyl = mtcars.groupby("cyl")
mutate(g_cyl, cumul_mean = cummean(_.mpg))
- Support for many methods in vector.py, using UDFs (#158)
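As a rough analogy for how a UDF picks an implementation based on the type of its input, here is a minimal sketch using only the standard library's functools.singledispatch. It assumes symbolic_dispatch selects implementations by the type of the first argument in a similar way; it is not siuba's implementation. The Values and GroupedValues classes are hypothetical stand-ins for Series and SeriesGroupBy.

```python
from functools import singledispatch


# Hypothetical stand-in types, for illustration only (not pandas classes).
class Values(list):
    """A plain sequence of numbers."""


class GroupedValues:
    """Numbers paired with a group label for each entry."""

    def __init__(self, values, groups):
        self.values = list(values)
        self.groups = list(groups)


@singledispatch
def cummean(x):
    # Fallback for types with no registered implementation.
    raise TypeError(f"cummean is not implemented for {type(x).__name__}")


@cummean.register(Values)
def _cummean_values(x) -> Values:
    # Running total divided by the number of entries seen so far.
    out, total = Values(), 0.0
    for i, v in enumerate(x, start=1):
        total += v
        out.append(total / i)
    return out


@cummean.register(GroupedValues)
def _cummean_grouped(x) -> GroupedValues:
    # Keep a separate running total and count per group label.
    totals, counts, out = {}, {}, []
    for v, g in zip(x.values, x.groups):
        totals[g] = totals.get(g, 0.0) + v
        counts[g] = counts.get(g, 0) + 1
        out.append(totals[g] / counts[g])
    return GroupedValues(out, x.groups)
```

Calling cummean on a Values object runs the ungrouped version, while a GroupedValues object runs the grouped version and returns a grouped result, mirroring the Series and SeriesGroupBy pair in the example above.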
Bug Fixes
- Fix regression where .str wasn't being removed when processing siu expressions for SQL (#159)
- Grouped filter now preserves order
- Verbs now tested to preserve original index (d938ab3)
Tests
- Add many more versions of Python and pandas to the Travis CI test matrix (#161)