[python-package] add scikit-learn-style API for early stopping #5808
I think we should make these keyword-only arguments, as scikit-learn does: https://github.com/scikit-learn/scikit-learn/blob/6cccd99aee3483eb0f7562afdd3179ccccab0b1d/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py#L1689
Could you please try that (in these estimators and the Dask ones)?
I don't want to do that for other existing parameters, to prevent breaking existing user code, but since these are new parameters, it's safe to be stricter.
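For illustration, here is a minimal sketch of the keyword-only pattern being asked for. The class and parameter names below are placeholders (they only mirror this PR's new parameters, they are not the real `lightgbm` code):

```python
# Minimal sketch, not the actual estimator: everything after the bare `*`
# must be passed by keyword, mirroring scikit-learn's HistGradientBoosting
# estimators linked above.
class Estimator:
    def __init__(
        self,
        num_leaves=31,            # existing parameters stay positional-or-keyword
        *,                        # everything after this marker is keyword-only
        early_stopping=False,     # new parameter: keyword-only
        validation_fraction=0.1,  # new parameter: keyword-only
    ):
        self.num_leaves = num_leaves
        self.early_stopping = early_stopping
        self.validation_fraction = validation_fraction


Estimator(31, early_stopping=True)   # ok
# Estimator(31, False)               # TypeError: too many positional arguments
```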
Why was this type hint removed? If it was just an accident, please put it back to reduce the size of the diff.
There are not any tests in `test_sklearn.py` that pass `validation_fraction`. Please add some, covering both the default behavior and that passing a non-default value (like `0.4`) works as expected.

I don't know the exact code paths off the top of my head, would appreciate if you can investigate... but I think it should be possible to test this by checking the size of the datasets added to `valid_sets` and confirming that they're as expected (e.g. that the automatically-added validation set has 4,000 rows if the input data `X` has 40,000 rows and `validation_fraction=0.1` is passed).

If that's not observable through the public API, try to use mocking/patching to observe it instead of adding any additional properties to the Booster / estimators' public API.
Comment in-thread here if you have questions or need help with that.
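A rough sketch of what such a patching-based test could look like. The patch target `lightgbm.sklearn.train`, the new `early_stopping` / `validation_fraction` parameters, and the keyword under which the validation sets are passed are assumptions that would need to be checked against the actual code path in this PR:

```python
# Rough sketch only: the patch target and the keyword used to pass valid_sets
# are assumptions about lightgbm's internals, not verified against this PR.
from unittest import mock

import numpy as np
import lightgbm as lgb


def test_validation_fraction_controls_holdout_size():
    rng = np.random.default_rng(42)
    X = rng.random((40_000, 5))
    y = rng.random(40_000)

    # Spy on the engine-level train() call so the Datasets it receives can be
    # inspected without adding new public attributes to the estimator.
    with mock.patch("lightgbm.sklearn.train", wraps=lgb.train) as spy:
        model = lgb.LGBMRegressor(
            n_estimators=10,
            early_stopping=True,       # new parameter from this PR
            validation_fraction=0.1,   # expect a 4,000-row holdout
        )
        model.fit(X, y)

    valid_sets = spy.call_args.kwargs.get("valid_sets", [])
    # The exact assertion depends on how the split is constructed; the intent
    # is to confirm the auto-created validation set holds 10% of the rows.
    assert any(vs.num_data() == 4_000 for vs in valid_sets)
```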
Thanks for your comment. I have added a couple of tests for that, using patching to achieve it. Let me know if you think there is more that can be improved in those tests.
This looks to me like it might turn off early stopping enabled other ways (like passing `early_stopping_round=15` or the `lgb.early_stopping()` callback + some `valid_sets`) if keyword argument `early_stopping=False`. Since `early_stopping=False` is the default, that'd be a backwards-incompatible change.

The `early_stopping` keyword argument in this PR is not intended to control ALL early stopping, right? I think it should be limited to controlling the scikit-learn-style early stopping, but that the other mechanisms that people have been using with `lightgbm` for years should continue to work.
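To make that intent concrete, here is a small hedged sketch of the gating logic. The helper name `_assemble_callbacks` is illustrative and not part of the PR:

```python
# Illustrative sketch only; the helper name is an assumption, not the PR's
# actual code. The key point: early_stopping=False (the default) should only
# mean "don't add the new scikit-learn-style holdout", and must not strip
# early stopping configured through existing mechanisms
# (early_stopping_round, the lgb.early_stopping() callback, eval_set).
import lightgbm as lgb


def _assemble_callbacks(early_stopping, stopping_rounds, user_callbacks):
    callbacks = list(user_callbacks)  # existing mechanisms always pass through
    if early_stopping:
        # Only the NEW behavior is gated on the flag; the estimator would also
        # carve out its validation_fraction holdout at this point.
        callbacks.append(lgb.early_stopping(stopping_rounds=stopping_rounds))
    return callbacks


# With the default early_stopping=False, a user-supplied callback is untouched:
user_cbs = [lgb.early_stopping(stopping_rounds=15)]
assert _assemble_callbacks(False, 10, user_cbs) == user_cbs
```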
Most of this part of the diff appears to only be whitespace changes. When I hide whitespace, it looks like just this: _(snippet not shown in this excerpt)_

To make it easier for reviewers and to shrink the total number of lines touched, please change your approach here. On `master`, `lightgbm` currently has this: _(snippet not shown in this excerpt)_

Change it to: _(snippet not shown in this excerpt)_
I am not a huge fan of how the validation set is created from the train set using the `_make_n_folds` function. If `1 / validation_fraction` is not an integer, the actual validation set size will not match the validation fraction specified by the user. For example, if the validation fraction is 0.4, the number of splits calculated here will be 2, which results in a fraction of 0.5 instead of 0.4.

Using something like `train_test_split` from scikit-learn would solve the issue for the classification and regression cases, but for ranking tasks our best option is `GroupShuffleSplit`, which will inevitably suffer from the same issue expressed above. The options that I thought of to solve this issue are: _(options not shown in this excerpt)_

I would lean more towards option 2, but this will make the MR bigger.

@jameslamb I would like to hear your opinion on it. Do you perhaps already have something else in mind?
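For reference, a small sketch of the rounding issue described above, next to a direct `train_test_split` split. The `int(1 / validation_fraction)` line is a stand-in for what the fold-based approach effectively does, not the PR's exact code:

```python
# Stand-in illustration of the rounding problem, not the PR's exact code.
from sklearn.model_selection import train_test_split

validation_fraction = 0.4

# Fold-based splitting: 1 / 0.4 -> 2 folds, so the holdout ends up being
# 1 / 2 = 0.5 of the data instead of the requested 0.4.
n_splits = int(1 / validation_fraction)
print(n_splits, 1 / n_splits)  # 2 0.5

# Splitting on the fraction directly honors the requested size
# (here 4,000 of 10,000 rows):
X = list(range(10_000))
X_train, X_valid = train_test_split(X, test_size=validation_fraction, random_state=0)
print(len(X_valid))  # 4000
```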