allow referring to previously created columns in summarize #20

machow · 2019-05-10T21:07:37Z

e.g.

from siuba.data import mtcars
from siuba import *

mtcars >> summarize(avg_mpg = _.mpg.mean(), avg_kpg = _.avg_mpg * 1.6)
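
For illustration only, the requested behavior amounts to evaluating the summary expressions in order, with each result made visible to the expressions after it. Below is a minimal pure-Python sketch of that idea. The `summarize_sequential` helper, the lambda-based expressions, and the toy values are all assumptions made for this example; none of them is siuba's API or the real mtcars data.

```python
from statistics import mean


def summarize_sequential(columns, **exprs):
    """Evaluate summary expressions in order; each one sees earlier results.

    `columns` maps column names to lists of values. Each expression is a
    callable that receives a namespace dict. Hypothetical helper, not siuba.
    """
    namespace = dict(columns)  # start from the original columns
    results = {}
    # Keyword arguments keep their call order (Python 3.6+), so avg_mpg
    # is computed before avg_kpg.
    for name, expr in exprs.items():
        value = expr(namespace)
        results[name] = value
        namespace[name] = value  # expose the new column to later expressions
    return results


# Toy values standing in for a column of mtcars.
data = {"mpg": [21.0, 22.8, 18.7]}

out = summarize_sequential(
    data,
    avg_mpg=lambda d: mean(d["mpg"]),
    avg_kpg=lambda d: d["avg_mpg"] * 1.6,  # refers to the column created above
)
```

Here `out["avg_kpg"]` is computed from `out["avg_mpg"]` rather than from the raw column, which is the behavior this issue asks for.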

machow · 2020-02-08T23:20:10Z

Just a thought: this is very doable in the experimental fast_summarize, because it can compose operations that mix aggregated results (e.g. 1 value per group) with the original data (see ADR-003).

e.g.

mtcars >> group_by(_.cyl) >> summarize(wat = _.mpg.mean() + _.mpg)

Would just be an elementwise + operation over:

- GroupByAgg (mpg.mean())
- Original column mpg
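
As a rough sketch of that composition, the per-group aggregate can be broadcast back to the original rows and then combined elementwise with the original column. The pure-Python code below only illustrates the idea: `group_mean_broadcast` and the toy values are assumptions for this example, not how fast_summarize is implemented. In pandas, the broadcast step corresponds roughly to `groupby(...).transform("mean")`.

```python
from collections import defaultdict
from statistics import mean


def group_mean_broadcast(keys, values):
    """Return each row's group mean, aligned to the original row order."""
    groups = defaultdict(list)
    for key, value in zip(keys, values):
        groups[key].append(value)
    # Aggregation step: one value per group.
    means = {key: mean(vals) for key, vals in groups.items()}
    # Broadcast step: repeat each group's value for every row in that group.
    return [means[key] for key in keys]


# Toy values, not the real mtcars data.
cyl = [6, 4, 6, 8]
mpg = [21.0, 22.8, 19.0, 18.0]

agg = group_mean_broadcast(cyl, mpg)          # plays the role of GroupByAgg (mpg.mean())
wat = [a + m for a, m in zip(agg, mpg)]       # elementwise + with the original column
```

Note that `wat` has one value per original row, not one per group, since the original column takes part in the operation.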

machow mentioned this issue Oct 22, 2019

pandas Summarize needs to validate each result #138

Closed

machow added the time:5 Half a day or more label Feb 8, 2020

machow added the dplyr:parity Enables a dplyr behavior label Aug 11, 2020

machow added this to siuba Jan 6, 2022

machow added type:feature New feature or request api:verb labels Jan 11, 2022
