Feat/expand add features #2202
…tures and add features. - This function is important when we want to add one feature and remove another: doing both in a single call avoids copying and recreating the dataset multiple times
Pull Request Overview
This PR refactors the dataset tools module to improve efficiency by consolidating feature modification operations. It replaces the single-feature add_feature function with a multi-feature add_features function and introduces a new modify_features function that can add and remove features simultaneously.
Key changes:
- Replace add_feature with add_features for batch feature addition
- Introduce modify_features function for simultaneous add/remove operations
- Update all tests and examples to use the new API
Reviewed Changes
Copilot reviewed 3 out of 3 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| src/lerobot/datasets/dataset_tools.py | Implements new modify_features and add_features functions, refactors remove_feature to use modify_features |
| tests/datasets/test_dataset_tools.py | Updates all tests to use new API, adds comprehensive tests for modify_features |
| examples/dataset/use_dataset_tools.py | Updates example script to demonstrate new batch operations and modify_features |
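The delegation described in the table can be sketched on a toy in-memory dataset. This is a minimal illustration, not the code in dataset_tools.py: the dict-of-columns layout, the signatures, and the validation rules below are assumptions made for the example, and only the three function names come from this PR.

```python
from typing import Any, Dict, List, Optional

# Hypothetical stand-in for a dataset: feature name -> per-frame values.
Dataset = Dict[str, List[Any]]


def modify_features(
    dataset: Dataset,
    add: Optional[Dict[str, List[Any]]] = None,
    remove: Optional[List[str]] = None,
) -> Dataset:
    """Return a new dataset with features added and removed in one copy."""
    add = add or {}
    remove = remove or []

    missing = [name for name in remove if name not in dataset]
    if missing:
        raise ValueError(f"Cannot remove unknown features: {missing}")

    clashes = [name for name in add if name in dataset and name not in remove]
    if clashes:
        raise ValueError(f"Features already exist: {clashes}")

    # Every added feature must provide one value per existing frame.
    if dataset:
        num_frames = len(next(iter(dataset.values())))
        bad = [name for name, values in add.items() if len(values) != num_frames]
        if bad:
            raise ValueError(f"Wrong number of frames for: {bad}")

    # Single pass: copy the kept columns once, then attach the new ones.
    new_dataset = {
        name: list(values) for name, values in dataset.items() if name not in remove
    }
    new_dataset.update({name: list(values) for name, values in add.items()})
    return new_dataset


def add_features(dataset: Dataset, features: Dict[str, List[Any]]) -> Dataset:
    """Thin wrapper: add several features with a single copy."""
    return modify_features(dataset, add=features)


def remove_feature(dataset: Dataset, feature_names: List[str]) -> Dataset:
    """Thin wrapper: remove features with a single copy."""
    return modify_features(dataset, remove=feature_names)
```

With this shape, the add-only and remove-only paths share one validation and copy routine, so any fix to the copy logic applies to all three entry points.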
…s always start at file_idx=0
….py` instead of to_parquet
e4f2d5c to 2213fa7
What this does
Expand the dataset_tools functions for adding features:
- add_feature -> add_features: takes a dict of features. The first iteration of this function added one feature at a time. This was very inefficient, as we had to create a new dataset and copy the files each time we wanted to add a feature.
- modify_features: combines add_features and remove_feature. These two functions behave very similarly, so it makes sense to have one general function that adds and removes features simultaneously and avoids multiple copies.

This PR will simplify the logic in #2138 and improve performance.
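The cost argument can be made concrete with a small counter. The sketch below is illustrative only: CopyCountingDataset and with_changes are hypothetical names, and each "copy" stands in for recreating the dataset and copying its files.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class CopyCountingDataset:
    """Hypothetical dataset that records how many full copies produced it."""

    features: Dict[str, List[float]] = field(default_factory=dict)
    copies: int = 0

    def with_changes(
        self,
        add: Optional[Dict[str, List[float]]] = None,
        remove: Optional[List[str]] = None,
    ) -> "CopyCountingDataset":
        """Apply all requested additions and removals in one full copy."""
        dropped = set(remove or [])
        kept = {k: list(v) for k, v in self.features.items() if k not in dropped}
        kept.update({k: list(v) for k, v in (add or {}).items()})
        return CopyCountingDataset(features=kept, copies=self.copies + 1)


base = CopyCountingDataset({"action": [0.1, 0.2], "state": [1.0, 2.0]})
new_features = {"reward": [0.0, 1.0], "success": [0.0, 1.0]}

# One change per call: every addition or removal recreates the dataset.
one_by_one = base
for name, values in new_features.items():
    one_by_one = one_by_one.with_changes(add={name: values})
one_by_one = one_by_one.with_changes(remove=["state"])

# Batched: the same additions and removal applied in a single copy.
batched = base.with_changes(add=new_features, remove=["state"])

print(one_by_one.copies, batched.copies)  # prints: 3 1
```

Both paths end with the same features; only the number of full copies differs, and the gap grows with each extra feature changed.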
Testing
- examples/dataset/use_dataset_tools.py
- tests/datasets/test_dataset_tools.py
- modify_features