[python-package] add scikit-learn-style API for early stopping #5808

Open
wants to merge 33 commits into
base: master
Changes from 23 commits
Commits
ad43e17
Enable Auto Early Stopping
ClaudioSalvatoreArcidiacono Sep 15, 2023
f05e5e0
Relax test conditions
ClaudioSalvatoreArcidiacono Sep 18, 2023
76f3c19
Merge branch 'master' into 3313-enable-auto-early-stopping
ClaudioSalvatoreArcidiacono Sep 20, 2023
457c7f6
Merge master
ClaudioSalvatoreArcidiacono Jan 16, 2024
0db1941
Revert "Merge master"
ClaudioSalvatoreArcidiacono Jan 16, 2024
10fac65
Merge remote-tracking branch 'lgbm/master' into 3313-enable-auto-earl…
ClaudioSalvatoreArcidiacono Jan 16, 2024
d10ca54
Add missing import
ClaudioSalvatoreArcidiacono Jan 16, 2024
3b8eb0a
Remove added extra new line
ClaudioSalvatoreArcidiacono Jan 17, 2024
e47acc0
Merge branch 'master' into 3313-enable-auto-early-stopping
ClaudioSalvatoreArcidiacono Jan 17, 2024
66701ac
Merge branch 'master' into 3313-enable-auto-early-stopping
ClaudioSalvatoreArcidiacono Jan 25, 2024
39d333e
Merge branch 'master' into 3313-enable-auto-early-stopping
ClaudioSalvatoreArcidiacono Feb 2, 2024
cad7eb6
Merge branch 'master' into 3313-enable-auto-early-stopping
ClaudioSalvatoreArcidiacono Feb 6, 2024
1234ccf
Merge master
ClaudioSalvatoreArcidiacono Nov 28, 2024
d54c96a
Improve documentation, check default behavior of early stopping
ClaudioSalvatoreArcidiacono Nov 28, 2024
9c1c8b4
Solve python 3.8 compatibility issue
ClaudioSalvatoreArcidiacono Nov 28, 2024
724c7fe
Remove default to auto
ClaudioSalvatoreArcidiacono Nov 29, 2024
c957fce
Revert changes in fit top part
ClaudioSalvatoreArcidiacono Nov 29, 2024
2d7da78
Make interface as similar as possible to sklearn
ClaudioSalvatoreArcidiacono Nov 29, 2024
069a84e
Add parameters to dask interface
ClaudioSalvatoreArcidiacono Nov 29, 2024
c430ec1
Improve documentation
ClaudioSalvatoreArcidiacono Nov 29, 2024
416323a
Linting
ClaudioSalvatoreArcidiacono Nov 29, 2024
73562ff
Check for exact value equal true for early stopping
ClaudioSalvatoreArcidiacono Nov 29, 2024
38edc42
Merge branch 'master' into 3313-enable-auto-early-stopping
jameslamb Dec 15, 2024
9a32376
Switch if/else conditions order in fit
ClaudioSalvatoreArcidiacono Dec 18, 2024
f33ebd3
Merge remote-tracking branch 'origin/master' into 3313-enable-auto-ea…
ClaudioSalvatoreArcidiacono Dec 18, 2024
a61726f
fix issues in engine.py
ClaudioSalvatoreArcidiacono Dec 18, 2024
44316d7
make new early stopping parameters keyword-only
ClaudioSalvatoreArcidiacono Dec 18, 2024
4cbfc84
Remove n_iter_no_change parameter
ClaudioSalvatoreArcidiacono Dec 18, 2024
93acf6a
Address comments in tests
ClaudioSalvatoreArcidiacono Dec 18, 2024
2b049c9
Improve tests
ClaudioSalvatoreArcidiacono Dec 18, 2024
61371cb
Add tests to check for validation fraction
ClaudioSalvatoreArcidiacono Dec 18, 2024
65c4e2f
Remove validation_fraction=None option
ClaudioSalvatoreArcidiacono Dec 18, 2024
0a8e843
Remove validation_fraction=None option also in dask
ClaudioSalvatoreArcidiacono Dec 18, 2024
4 changes: 4 additions & 0 deletions docs/Python-Intro.rst
@@ -244,6 +244,10 @@ This works with both metrics to minimize (L2, log loss, etc.) and to maximize (N
Note that if you specify more than one evaluation metric, all of them will be used for early stopping.
However, you can change this behavior and make LightGBM check only the first metric for early stopping by passing ``first_metric_only=True`` in ``early_stopping`` callback constructor.

In the scikit-learn API of lightgbm, early stopping can also be enabled by setting the parameter ``early_stopping`` to ``True``.
When early stopping is enabled and no validation set is provided, a portion of the training data will be used as validation set.
The amount of data to use for validation is controlled by the parameter ``validation_fraction`` and defaults to 0.1.
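
As an illustration of the stopping rule described here, the following is a minimal sketch in plain Python of patience-based early stopping for a metric to minimize. It is not LightGBM's implementation; `best_iteration_with_patience` and the sample losses are hypothetical and exist only for this example.

```python
from typing import List, Tuple


def best_iteration_with_patience(scores: List[float], patience: int) -> Tuple[int, bool]:
    """Return (best_iteration, stopped_early) for a metric to minimize.

    Iterations are 1-based. Training "stops" once `patience` consecutive
    iterations have failed to improve on the best score seen so far.
    """
    best_score = float("inf")
    best_iter = 0
    for i, score in enumerate(scores, start=1):
        if score < best_score:
            best_score, best_iter = score, i
        elif i - best_iter >= patience:
            return best_iter, True
    return best_iter, False


# Hypothetical validation losses: improvement stalls after iteration 3.
losses = [0.9, 0.7, 0.6, 0.61, 0.62, 0.63, 0.5]
print(best_iteration_with_patience(losses, patience=3))
```

With a patience of 3 the sketch stops after iteration 6 and reports iteration 3 as the best one, so the later improvement at iteration 7 is never seen. A larger patience trades extra training time for a chance to reach it.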

Prediction
----------

9 changes: 9 additions & 0 deletions python-package/lightgbm/dask.py
@@ -1134,6 +1134,9 @@ def __init__(
random_state: Optional[Union[int, np.random.RandomState, "np.random.Generator"]] = None,
n_jobs: Optional[int] = None,
importance_type: str = "split",
early_stopping: bool = False,
n_iter_no_change: int = 10,
validation_fraction: Optional[float] = 0.1,
client: Optional[Client] = None,
**kwargs: Any,
):
@@ -1337,6 +1340,9 @@ def __init__(
random_state: Optional[Union[int, np.random.RandomState, "np.random.Generator"]] = None,
n_jobs: Optional[int] = None,
importance_type: str = "split",
early_stopping: bool = False,
n_iter_no_change: int = 10,
validation_fraction: Optional[float] = 0.1,
client: Optional[Client] = None,
**kwargs: Any,
):
@@ -1504,6 +1510,9 @@ def __init__(
random_state: Optional[Union[int, np.random.RandomState, "np.random.Generator"]] = None,
n_jobs: Optional[int] = None,
importance_type: str = "split",
early_stopping: bool = False,
n_iter_no_change: int = 10,
validation_fraction: Optional[float] = 0.1,
client: Optional[Client] = None,
**kwargs: Any,
):
36 changes: 22 additions & 14 deletions python-package/lightgbm/engine.py
@@ -510,11 +510,9 @@ def _make_n_folds(
nfold: int,
params: Dict[str, Any],
seed: int,
fpreproc: Optional[_LGBM_PreprocFunction],
stratified: bool,
shuffle: bool,
eval_train_metric: bool,
) -> CVBooster:
) -> Iterable[Tuple[np.ndarray, np.ndarray]]:
"""Make a n-fold list of Booster from random indices."""
full_data = full_data.construct()
num_data = full_data.num_data()
@@ -559,7 +557,16 @@ def _make_n_folds(
test_id = [randidx[i : i + kstep] for i in range(0, num_data, kstep)]
train_id = [np.concatenate([test_id[i] for i in range(nfold) if k != i]) for k in range(nfold)]
folds = zip(train_id, test_id)
return folds


def _make_cvbooster(
full_data: Dataset,
params: Dict[str, Any],
folds: Iterable[Tuple[np.ndarray, np.ndarray]],
fpreproc: Optional[_LGBM_PreprocFunction],
eval_train_metric: bool,
) -> CVBooster:
ret = CVBooster()
for train_idx, test_idx in folds:
train_set = full_data.subset(sorted(train_idx))
@@ -764,10 +771,11 @@ def cv(
nfold=nfold,
params=params,
seed=seed,
fpreproc=fpreproc,
stratified=stratified,
shuffle=shuffle,
eval_train_metric=eval_train_metric,
)
cvbooster = _make_cvbooster(
full_data=train_set, params=params, folds=cvfolds, fpreproc=fpreproc, eval_train_metric=eval_train_metric
)

# setup callbacks
@@ -802,24 +810,24 @@ def cv(
for cb in callbacks_before_iter:
cb(
callback.CallbackEnv(
model=cvfolds,
model=cvbooster,
params=params,
iteration=i,
begin_iteration=0,
end_iteration=num_boost_round,
evaluation_result_list=None,
)
)
cvfolds.update(fobj=fobj) # type: ignore[call-arg]
res = _agg_cv_result(cvfolds.eval_valid(feval)) # type: ignore[call-arg]
cvbooster.update(fobj=fobj) # type: ignore[call-arg]
res = _agg_cv_result(cvbooster.eval_valid(feval)) # type: ignore[call-arg]
for _, key, mean, _, std in res:
results[f"{key}-mean"].append(mean)
results[f"{key}-stdv"].append(std)
try:
for cb in callbacks_after_iter:
cb(
callback.CallbackEnv(
model=cvfolds,
model=cvbooster,
params=params,
iteration=i,
begin_iteration=0,
@@ -828,14 +836,14 @@ def cv(
)
)
except callback.EarlyStopException as earlyStopException:
cvfolds.best_iteration = earlyStopException.best_iteration + 1
for bst in cvfolds.boosters:
bst.best_iteration = cvfolds.best_iteration
cvbooster.best_iteration = earlyStopException.best_iteration + 1
for bst in cvbooster.boosters:
bst.best_iteration = cvbooster.best_iteration
for k in results:
results[k] = results[k][: cvfolds.best_iteration]
results[k] = results[k][: cvbooster.best_iteration]
break

if return_cvbooster:
results["cvbooster"] = cvfolds # type: ignore[assignment]
results["cvbooster"] = cvbooster # type: ignore[assignment]

return dict(results)
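
The point of splitting fold generation out of `_make_n_folds` is that a caller can take just the first (train, validation) index pair. A rough, standard-library-only sketch of the shuffled, non-stratified path follows. `make_fold_indices` is a hypothetical stand-in, not the function in engine.py, and it ignores stratification, groups, and any remainder rows.

```python
import random
from typing import Iterator, List, Tuple


def make_fold_indices(num_data: int, nfold: int, seed: int) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_idx, test_idx) pairs built from one shuffled index list."""
    randidx = list(range(num_data))
    random.Random(seed).shuffle(randidx)
    kstep = num_data // nfold  # rows per fold; any remainder is ignored in this sketch
    test_id = [randidx[i : i + kstep] for i in range(0, kstep * nfold, kstep)]
    for k in range(nfold):
        # Training indices are all folds except fold k.
        train_idx = [idx for i in range(nfold) if i != k for idx in test_id[i]]
        yield train_idx, test_id[k]


# Taking only the first pair gives a single train/validation split.
train_idx, val_idx = next(make_fold_indices(num_data=10, nfold=5, seed=0))
print(sorted(val_idx), len(train_idx))
```

Because the generator is lazy, asking for only the first pair does not build the remaining folds.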
145 changes: 95 additions & 50 deletions python-package/lightgbm/sklearn.py
@@ -44,7 +44,7 @@
dt_DataTable,
pd_DataFrame,
)
from .engine import train
from .engine import _make_n_folds, train

if TYPE_CHECKING:
from .compat import _sklearn_Tags
@@ -507,7 +507,10 @@ def __init__(
random_state: Optional[Union[int, np.random.RandomState, np.random.Generator]] = None,
n_jobs: Optional[int] = None,
importance_type: str = "split",
**kwargs: Any,
early_stopping: bool = False,
Collaborator commented:

Suggested change
- early_stopping: bool = False,
+ *,
+ early_stopping: bool = False,

I think we should make these keyword-only arguments, as scikit-learn does:

https://github.com/scikit-learn/scikit-learn/blob/6cccd99aee3483eb0f7562afdd3179ccccab0b1d/sklearn/ensemble/_hist_gradient_boosting/gradient_boosting.py#L1689

Could you please try that (in these estimators and the Dask ones)?

I don't want to do that for other existing parameters, to prevent breaking existing user code, but since these are new parameters, it's safe to be stricter.
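
For readers unfamiliar with the syntax, here is a small sketch of what the bare `*` does. `ToyEstimator` is a hypothetical class, not LightGBM's actual signature; only the parameter names are borrowed from this PR.

```python
class ToyEstimator:
    """Hypothetical estimator used only to illustrate keyword-only parameters."""

    def __init__(
        self,
        n_estimators: int = 100,
        *,  # everything after this marker must be passed by name
        early_stopping: bool = False,
        validation_fraction: float = 0.1,
    ) -> None:
        self.n_estimators = n_estimators
        self.early_stopping = early_stopping
        self.validation_fraction = validation_fraction


est = ToyEstimator(50, early_stopping=True)  # accepted

try:
    ToyEstimator(50, True)  # positional use of a keyword-only parameter
except TypeError as err:
    print(f"rejected: {err}")
```

Existing positional parameters before the `*` keep working as before, which is why the change is only safe to apply to newly added parameters.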

n_iter_no_change: int = 10,
validation_fraction: Optional[float] = 0.1,
**kwargs,
Collaborator commented:

Suggested change
- **kwargs,
+ **kwargs: Any,

Why was this type hint removed? If it was just an accident, please put it back to reduce the size of the diff.

):
r"""Construct a gradient boosting model.

@@ -587,6 +590,16 @@ def __init__(
The type of feature importance to be filled into ``feature_importances_``.
If 'split', result contains numbers of times the feature is used in a model.
If 'gain', result contains total gains of splits which use the feature.
early_stopping : bool, optional (default=False)
Whether to enable early stopping. If set to True, training will stop if the validation score does not improve
for a specified number of rounds (controlled by `n_iter_no_change`).
n_iter_no_change : int, optional (default=10)
If early stopping is enabled, this parameter specifies the number of iterations with no
improvement after which training will be stopped.
validation_fraction : float or None, optional (default=0.1)
Collaborator commented:

There are no tests in test_sklearn.py that pass validation_fraction. Please add some, covering both the default behavior and that passing a non-default value (like 0.4) works as expected.

I don't know the exact code paths off the top of my head, so I'd appreciate it if you could investigate, but I think it should be possible to test this by checking the size of the datasets added to valid_sets and confirming that they're as expected (e.g. that the automatically added validation set has 4,000 rows if the input data X has 40,000 rows and validation_fraction=0.1 is passed).

If that's not observable through the public API, try to use mocking/patching to observe it instead of adding any additional properties to the Booster / estimators' public API.

Comment in-thread here if you have questions or need help with that.
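
One possible shape for such a test, sketched with `unittest.mock` against a toy class rather than the real estimators. `ToyEstimator`, its `_train` method, and `observed_validation_size` are all hypothetical; a real test would patch whatever LightGBM call actually receives the validation data.

```python
from unittest import mock


class ToyEstimator:
    """Hypothetical estimator that holds out a fraction of rows for validation."""

    def __init__(self, *, validation_fraction: float = 0.1) -> None:
        self.validation_fraction = validation_fraction

    def _train(self, train_rows, valid_rows):
        """Stand-in for the real training call; patched in the test below."""
        return None

    def fit(self, rows):
        n_valid = round(len(rows) * self.validation_fraction)
        return self._train(rows[n_valid:], rows[:n_valid])


def observed_validation_size(n_rows: int, validation_fraction: float) -> int:
    """Patch the training call and report how many validation rows it received."""
    est = ToyEstimator(validation_fraction=validation_fraction)
    with mock.patch.object(ToyEstimator, "_train") as patched:
        est.fit(list(range(n_rows)))
    _train_rows, valid_rows = patched.call_args[0]
    return len(valid_rows)


print(observed_validation_size(40_000, 0.1))
print(observed_validation_size(40_000, 0.4))
```

Patching keeps the check outside the public API: nothing new has to be exposed on the estimator just to make the split observable.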

Contributor Author commented:

Thanks for your comment. I have added a couple of tests for that, using patching. Let me know if you think there is more that can be improved in those tests.

Proportion of training data to set aside as
validation data for early stopping. If None, early stopping is done on
the training data. Only used if early stopping is performed.
**kwargs
Other parameters for the model.
Check http://lightgbm.readthedocs.io/en/latest/Parameters.html for more parameters.
@@ -651,6 +664,9 @@ def __init__(
self.random_state = random_state
self.n_jobs = n_jobs
self.importance_type = importance_type
self.early_stopping = early_stopping
self.n_iter_no_change = n_iter_no_change
self.validation_fraction = validation_fraction
self._Booster: Optional[Booster] = None
self._evals_result: _EvalResultDict = {}
self._best_score: _LGBM_BoosterBestScoreType = {}
@@ -816,11 +832,19 @@ def _process_params(self, stage: str) -> Dict[str, Any]:
params.pop("importance_type", None)
params.pop("n_estimators", None)
params.pop("class_weight", None)
params.pop("validation_fraction", None)
params.pop("early_stopping", None)
params.pop("n_iter_no_change", None)

if isinstance(params["random_state"], np.random.RandomState):
params["random_state"] = params["random_state"].randint(np.iinfo(np.int32).max)
elif isinstance(params["random_state"], np.random.Generator):
params["random_state"] = int(params["random_state"].integers(np.iinfo(np.int32).max))

params = _choose_param_value("early_stopping_round", params, self.n_iter_no_change)
if self.early_stopping is not True:
params["early_stopping_round"] = None
Collaborator commented:

Suggested change
- if self.early_stopping is not True:
- params["early_stopping_round"] = None

This looks to me like it might turn off early stopping enabled in other ways (like passing early_stopping_round=15 or the lgb.early_stopping() callback + some valid_sets) if the keyword argument early_stopping=False is passed. Since early_stopping=False is the default, that'd be a backwards-incompatible change.

The early_stopping keyword argument in this PR is not intended to control ALL early stopping, right? I think it should be limited to controlling the scikit-learn-style early stopping, but that the other mechanisms that people have been using with lightgbm for years should continue to work.
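
A sketch of the precedence this comment asks for, written as a standalone function. This is one possible reading, not the code that was merged: `resolve_early_stopping_round` and the `_EXISTING_NAMES` tuple are hypothetical, and the real code would go through LightGBM's own alias handling.

```python
from typing import Any, Dict, Optional

# Illustrative names under which a user may already have set a patience value.
_EXISTING_NAMES = ("early_stopping_round", "early_stopping_rounds")


def resolve_early_stopping_round(
    params: Dict[str, Any],
    early_stopping: bool,
    n_iter_no_change: int,
) -> Optional[int]:
    """Pick the patience value without overriding pre-existing settings."""
    # 1. A value the user passed the "old" way always wins.
    for name in _EXISTING_NAMES:
        if params.get(name) is not None:
            return params[name]
    # 2. Otherwise the scikit-learn-style switch decides.
    if early_stopping is True:
        return n_iter_no_change
    # 3. Nothing requested: leave early stopping off.
    return None


print(resolve_early_stopping_round({"early_stopping_round": 15}, False, 10))
print(resolve_early_stopping_round({}, True, 10))
print(resolve_early_stopping_round({}, False, 10))
```

The key property is the first case: with the default `early_stopping=False`, a patience value supplied through the existing parameter is returned unchanged instead of being reset to `None`.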


if self._n_classes > 2:
for alias in _ConfigAliases.get("num_class"):
params.pop(alias, None)
@@ -957,54 +981,75 @@ def fit(
params=params,
)

valid_sets: List[Dataset] = []
if eval_set is not None:
if isinstance(eval_set, tuple):
eval_set = [eval_set]
for i, valid_data in enumerate(eval_set):
# reduce cost for prediction training data
if valid_data[0] is X and valid_data[1] is y:
valid_set = train_set
else:
valid_weight = _extract_evaluation_meta_data(
collection=eval_sample_weight,
name="eval_sample_weight",
i=i,
)
valid_class_weight = _extract_evaluation_meta_data(
collection=eval_class_weight,
name="eval_class_weight",
i=i,
)
if valid_class_weight is not None:
if isinstance(valid_class_weight, dict) and self._class_map is not None:
valid_class_weight = {self._class_map[k]: v for k, v in valid_class_weight.items()}
valid_class_sample_weight = _LGBMComputeSampleWeight(valid_class_weight, valid_data[1])
if valid_weight is None or len(valid_weight) == 0:
valid_weight = valid_class_sample_weight
else:
valid_weight = np.multiply(valid_weight, valid_class_sample_weight)
valid_init_score = _extract_evaluation_meta_data(
collection=eval_init_score,
name="eval_init_score",
i=i,
)
valid_group = _extract_evaluation_meta_data(
collection=eval_group,
name="eval_group",
i=i,
)
valid_set = Dataset(
data=valid_data[0],
label=valid_data[1],
weight=valid_weight,
group=valid_group,
init_score=valid_init_score,
categorical_feature="auto",
params=params,
)

valid_sets.append(valid_set)
if self.early_stopping is True and eval_set is None:
Collaborator commented:

Most of this part of the diff appears to only be whitespace changes. When I hide whitespace, it looks like just this:

[screenshot of the diff with whitespace changes hidden]

To make it easier for reviewers and to shrink the total number of lines touched, please change your approach here. On master, lightgbm currently has this:

if eval_set is not None:
    # (existing code)

Change it to:

if eval_set is not None:
    # (existing code)
elif self.early_stopping is True:
    # (new code added in this PR)

if self.validation_fraction is not None:
n_splits = max(int(np.ceil(1 / self.validation_fraction)), 2)
stratified = isinstance(self, LGBMClassifier)
Contributor Author commented:

I am not a huge fan of how the validation set is created from the train set using the _make_n_folds function.

If 1/validation_fraction is not an integer, the actual validation set size will not match the validation fraction specified by the user.

For example, if the validation fraction is 0.4 the number of splits calculated here will be 2, which will result in a fraction of 0.5, instead of 0.4.

Using something like train_test_split from scikit-learn would solve the issue for the classification and regression cases, but for ranking tasks our best option is GroupShuffleSplit, which will inevitably suffer from the same issue described above. The options I have considered to solve this issue are:

  1. Leave the code as-is and raise a warning when 1/validation_fraction is not an integer.
  2. Use train_test_split for creating the validation set in the classification and regression cases; raise a warning when 1/validation_fraction is not an integer in the ranking case.

I would lean more towards option 2, but this will make the MR bigger.

@jameslamb I would like to hear your opinion on it, do you perhaps already have something else in mind?
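
The mismatch can be checked with a few lines of arithmetic. The sketch below reuses the `max(ceil(1 / validation_fraction), 2)` expression as it appears in this diff and assumes one fold out of `n_splits` equal folds becomes the validation set; `n_splits_for` is a hypothetical helper name.

```python
import math


def n_splits_for(validation_fraction: float) -> int:
    """Number of folds implied by the expression used in this diff."""
    return max(int(math.ceil(1 / validation_fraction)), 2)


for fraction in (0.1, 0.25, 0.4, 0.5):
    n_splits = n_splits_for(fraction)
    # One fold out of n_splits is held out, so the effective fraction is 1 / n_splits.
    print(f"requested={fraction:.2f}  n_splits={n_splits}  effective={1 / n_splits:.3f}")
```

With the ceiling-based expression, a requested fraction of 0.4 yields 3 folds and an effective hold-out of about 0.333. A floor-based rounding would give 2 folds and 0.5 instead. Either way the requested fraction is only honored when 1/validation_fraction is an integer.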

cvfolds = _make_n_folds(
full_data=train_set,
folds=None,
nfold=n_splits,
params=params,
seed=self.random_state,
stratified=stratified,
shuffle=True,
)
train_idx, val_idx = next(cvfolds)
valid_set = train_set.subset(sorted(val_idx))
train_set = train_set.subset(sorted(train_idx))
else:
valid_set = train_set
valid_set = valid_set.construct()
valid_sets = [valid_set]
else:
valid_sets: List[Dataset] = []
if eval_set is not None:
if isinstance(eval_set, tuple):
eval_set = [eval_set]
for i, valid_data in enumerate(eval_set):
# reduce cost for prediction training data
if valid_data[0] is X and valid_data[1] is y:
valid_set = train_set
else:
valid_weight = _extract_evaluation_meta_data(
collection=eval_sample_weight,
name="eval_sample_weight",
i=i,
)
valid_class_weight = _extract_evaluation_meta_data(
collection=eval_class_weight,
name="eval_class_weight",
i=i,
)
if valid_class_weight is not None:
if isinstance(valid_class_weight, dict) and self._class_map is not None:
valid_class_weight = {self._class_map[k]: v for k, v in valid_class_weight.items()}
valid_class_sample_weight = _LGBMComputeSampleWeight(valid_class_weight, valid_data[1])
if valid_weight is None or len(valid_weight) == 0:
valid_weight = valid_class_sample_weight
else:
valid_weight = np.multiply(valid_weight, valid_class_sample_weight)
valid_init_score = _extract_evaluation_meta_data(
collection=eval_init_score,
name="eval_init_score",
i=i,
)
valid_group = _extract_evaluation_meta_data(
collection=eval_group,
name="eval_group",
i=i,
)
valid_set = Dataset(
data=valid_data[0],
label=valid_data[1],
weight=valid_weight,
group=valid_group,
init_score=valid_init_score,
categorical_feature="auto",
params=params,
)

valid_sets.append(valid_set)

if isinstance(init_model, LGBMModel):
init_model = init_model.booster_