Split behaviour #86
Merged
mathysgrapotte merged 36 commits into mathysgrapotte:dev from JulienRaynal:split_behaviour on Feb 19, 2025
…fig per split and not all the config possible between splits and transforms
… the right variable transforms
… of YamlTransform
…shot for the unique generated file
…plits, gives one file per split
… transform, gives one file per transform
…ew ones exist now
…a list of transforms because the titanic test uses a unique value and the dna uses a list
…lits in a yaml config in x files holding unique splits
…ique yaml config with splits and multiple transforms in configs with one unique split and one unique transform
…le with a unique split and multiple Transforms
…file because now the the config are first separated per unique split and then per unique transforms
…ne split and one transform and changed the test file is just formatting
… one with only a unique split and multiple transforms
… to not be a file but a YamlSplitTransformDict object
…TransformDict and not the yaml path anymore
…g_path' to 'data_config' for the DatasetLoader
…ate_data_configs' to 'generate_split_transform_configs' as the awaited output is the sub config file
…e_split and generate_split_transform function + fixed old tests
…w generate_split_config and generate_split_transform_config functions
…plitTransformDict directly instead of yaml files as argument for the class:
…e two new class YamlSplitConfigDict and YamlSplitTransformDict
…creates a YamlSplitConfigDict from the path given as argument
…s directly YamlSplitTransformDict, the tests have been fixed to work with this
…eplaced with two new files for splitting in YamlSplitConfigDict and YamlSplitTransformDict
…a YamlSplitConfigDict as input now
417aadd to 10092ee
…he docstring to match the right parameters
mathysgrapotte approved these changes on Feb 19, 2025
LGTM, linting is failing but I will fix it
There are 2 main changes in the commit; we now create 2 config files instead of one:
Making this work required changes across multiple files. Many objects now take a `dict_config` instead of a `config_dict_path`, because we need to be explicit about which of those two objects a function is supposed to take. Many of the changes are also automatic refactoring in the files and can be dismissed.
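For reviewers, here is a minimal sketch of the idea described in the commit messages: a config is first separated per unique split, then each of those is separated per unique transform, and functions receive the resulting config object rather than a path. It uses plain dictionaries, not the real `YamlSplitConfigDict` / `YamlSplitTransformDict` classes; the function names (`split_per_split`, `split_per_transform`) and the config keys (`columns`, `split`, `transforms`) are hypothetical and chosen only for illustration.

```python
from copy import deepcopy
from typing import Any


def split_per_split(config: dict[str, Any]) -> list[dict[str, Any]]:
    """Return one sub-config per split; the list of transforms is kept whole.

    Hypothetical helper: assumes config["split"] is a list of split definitions.
    """
    sub_configs = []
    for split in config["split"]:
        sub = deepcopy(config)  # never mutate the caller's config
        sub["split"] = split
        sub_configs.append(sub)
    return sub_configs


def split_per_transform(split_config: dict[str, Any]) -> list[dict[str, Any]]:
    """Return one sub-config per transform for a config holding a unique split.

    Hypothetical helper: assumes split_config["transforms"] is a list.
    """
    sub_configs = []
    for transform in split_config["transforms"]:
        sub = deepcopy(split_config)
        sub["transforms"] = transform
        sub_configs.append(sub)
    return sub_configs


# Illustrative config (made-up values): 2 splits and 3 transforms.
config = {
    "columns": ["age", "fare"],
    "split": [{"name": "random"}, {"name": "stratified"}],
    "transforms": [{"name": "noise"}, {"name": "shuffle"}, {"name": "mask"}],
}

# Step 1: one config per unique split.
per_split = split_per_split(config)

# Step 2: one config per unique (split, transform) pair.
per_split_transform = [
    sub for split_config in per_split for sub in split_per_transform(split_config)
]

print(len(per_split), len(per_split_transform))
```

Passing the already-built dictionary to downstream code, instead of a file path, makes it unambiguous which of the two config levels a function expects.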
Each commit is specific to one file, and its message describes what was done to that file.
Please review this carefully: the tests pass, but there may be breaking changes in how the pipeline is used.
Tests:
✓ (python3.10) Running tests
✓ (python3.11) Running tests
✓ (python3.11) Running tests (couldn't find how to fix the double run on python 3.11)