allow referring to previously created columns in summarize #20

Open
machow opened this issue May 10, 2019 · 1 comment
Labels
api:verb dplyr:parity Enables a dplyr behavior time:5 Half a day or more type:feature New feature or request

Comments


machow commented May 10, 2019

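e.g. the following should work, with avg_kpg referring to the avg_mpg column created earlier in the same summarize call: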

from siuba.data import mtcars
from siuba import *

mtcars >> summarize(avg_mpg = _.mpg.mean(), avg_kpg = _.avg_mpg * 1.6)
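To make the requested behavior concrete, here is a minimal pure-Python sketch of the idea, not siuba's implementation. The function `summarize_sequential` and the three sample mpg values are hypothetical, made up for illustration. Each summary is evaluated in order, and its result is added to the lookup environment so later summaries can refer to it.

```python
from statistics import mean

def summarize_sequential(columns, **exprs):
    """Hypothetical helper: evaluate summaries in order, exposing earlier results.

    columns: dict mapping column name -> list of values
    exprs:   name=callable pairs; each callable receives the environment dict
    """
    env = dict(columns)  # start with the original columns
    out = {}
    for name, fn in exprs.items():  # keyword arguments keep their call order
        out[name] = fn(env)
        env[name] = out[name]  # make the new column visible to later summaries
    return out

# Illustrative data only, not the full mtcars dataset
data = {"mpg": [21.0, 22.8, 18.7]}

result = summarize_sequential(
    data,
    avg_mpg=lambda d: mean(d["mpg"]),
    avg_kpg=lambda d: d["avg_mpg"] * 1.6,  # refers to the summary created just above
)
```

Note that in this sketch a newly created summary shadows an original column of the same name for all later expressions; whether siuba should behave that way is a separate design question.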

machow commented Feb 8, 2020

Just a thought, this is very doable in the experimental fast_summarize. This is because it can compose operations mixing aggregated results (e.g. 1 value per group) and the original data (see ADR-003).

e.g.

mtcars >> group_by(_.cyl) >> summarize(wat = _.mpg.mean() + _.mpg)

Would just be an elementwise operation (+) over:

  • GroupByAgg (mpg.mean())
  • Original column mpg
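The composition above can be sketched in plain Python, assuming the aggregated result is broadcast back to the original rows before the elementwise `+`. This is only an illustration of the idea, not how fast_summarize is implemented. The helper `group_mean_broadcast` and the four sample rows are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def group_mean_broadcast(keys, values):
    """Hypothetical helper: per-group mean, repeated for every row of the group."""
    groups = defaultdict(list)
    for k, v in zip(keys, values):
        groups[k].append(v)
    means = {k: mean(vs) for k, vs in groups.items()}  # 1 value per group
    return [means[k] for k in keys]  # aligned with the original rows

# Illustrative rows only, not the full mtcars dataset
cyl = [6, 4, 6, 8]
mpg = [21.0, 22.8, 21.4, 18.7]

agg = group_mean_broadcast(cyl, mpg)         # stands in for GroupByAgg (mpg.mean())
wat = [a + x for a, x in zip(agg, mpg)]      # elementwise + with original column mpg
```

In this sketch `wat` ends up with one value per original row rather than one per group, since the group mean is aligned to the rows before adding.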

@machow machow added the dplyr:parity Enables a dplyr behavior label Aug 11, 2020
@machow machow added this to siuba Jan 6, 2022
@machow machow added type:feature New feature or request api:verb labels Jan 11, 2022