CrossValidation results suffering from oversampling/augmentation #97

Open
dominikmn opened this issue Apr 21, 2021 · 0 comments
Labels
bug (Something isn't working), enhancement (New feature or request)

Comments

@dominikmn
Owner

Problem

In our current GridSearch approach, we train the models on the oversampled/augmented train set.
We also perform the cross-validation on that same set.
This is a problem because oversampled or augmented copies of a sample can end up in both the train split and the validation split of a fold, so the model is validated on samples it has effectively already seen.
Hence, the validation scores are inflated and the GridSearch favors models that overfit.
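
One way to avoid this leakage is to split first and resample only the training part of each fold, leaving the validation part untouched. The following is a minimal sketch of that idea, not the project's actual code: the helper names `random_oversample` and `cv_scores_resampling_inside_folds` are hypothetical, and the toy dataset, `LogisticRegression`, and the F1 score are stand-ins for whatever data, model, and metric the GridSearch really uses. It also assumes plain random oversampling; an augmentation step would go in the same place.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import StratifiedKFold


def random_oversample(X, y, rng):
    """Duplicate random samples of the smaller classes until every class
    has as many samples as the largest one (hypothetical helper)."""
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    idx_parts = []
    for cls, count in zip(classes, counts):
        cls_idx = np.flatnonzero(y == cls)
        extra = rng.choice(cls_idx, size=target - count, replace=True)
        idx_parts.append(np.concatenate([cls_idx, extra]))
    idx = np.concatenate(idx_parts)
    return X[idx], y[idx]


def cv_scores_resampling_inside_folds(model, X, y, n_splits=5, seed=0):
    """Cross-validate `model`, oversampling only the train split of each fold."""
    rng = np.random.default_rng(seed)
    cv = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    scores = []
    for train_idx, val_idx in cv.split(X, y):
        # Resample AFTER splitting, so no copy of a validation sample
        # can end up in the training data of this fold.
        X_train, y_train = random_oversample(X[train_idx], y[train_idx], rng)
        fitted = clone(model).fit(X_train, y_train)
        # The validation split keeps its original, imbalanced distribution.
        scores.append(f1_score(y[val_idx], fitted.predict(X[val_idx])))
    return np.array(scores)


# Toy imbalanced data, used only for illustration.
X, y = make_classification(n_samples=300, weights=[0.9, 0.1], random_state=0)
scores = cv_scores_resampling_inside_folds(LogisticRegression(max_iter=1000), X, y)
print(scores.mean())
```

The same behaviour can be obtained inside `GridSearchCV` by wrapping the sampler and the model in imbalanced-learn's `Pipeline`, which applies samplers only while fitting and not while predicting or scoring. The custom samplers linked under Resources look like a way to plug an augmentation step into such a pipeline.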

Resources

https://imbalanced-learn.org/dev/miscellaneous.html#custom-samplers

@dominikmn added the bug and enhancement labels on Apr 21, 2021