-
-
Notifications
You must be signed in to change notification settings - Fork 147
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Default output_format
to "dataframe"
#1115
Comments
I fully agree to this suggestion. Indeed, dataframes were added later and have matured substantially since then and I guess we can consider handling them standard knowledge for machine learners. |
I took a look at this to see how the proposed changes would behave and it appeared to lead to a few issues with some functions not fully integrated with dataframes yet. Many of these errors were Attribute errors centering around flow_id, task_id, and run_id, and appear to be the result of some functions expecting objects other than dataframes. |
In that case it might be best to do this incrementally. First we'd get a PR that changes the default, but makes internal calls use the explicit |
Hi! I am not sure how I ended up here, but I looked at what you guys are doing and It's pretty remarkable. I find this challenge to be my starter in this project and would love to contribute more. =) |
Hi @varshneydevansh, thank you! We already started working on this, as can be seen in #1258. It's already assigned to me since it's pretty involved. However, if you would like to do this instead, here is basically what needs to be done: If you'd like something smaller, I could recommend you to work on #1142 or #1213 (the latter requires some Github Actions knowledge, but no knowledge of our code-base). |
We missed this in the 0.15 release, adding it to 0.16 |
I propose that the several functions that take the
ouput_format
parameter and produce lists change their defaults to return dataframes instead of dictionaries (e.g.list_datasets
).I think originally we had loose integration with pandas and it was an optional dependency, but since it's a required dependency now I don't see a reason to provide
dicts
as default overdataframes
. I'd argue most people prefer dataframes in interactive sessions, and it makes sense to minimize the amount of writing people have to do in those (also plenty of people might prefer dataframes in scripts too).The text was updated successfully, but these errors were encountered: