
Conversation

@ladezai commented Oct 17, 2025

Premise: a Random Forest is just Bagging with the additional constraint that each weak predictor sees only a subset of the dataset's features. This PR updates the code of EnsembleLearner to account for that key constraint.

Implementation details:

  • Adds three new methods to src/dataset/impl_dataset.rs: bootstrap_with_indices, bootstrap_samples_with_indices, and bootstrap_features_with_indices. Knowing the sampled indices is essential to track which features each weak classifier should use.
  • Adds the parameter feature_proportion to EnsembleLearnerValidParams.
  • Adds model_features to EnsembleLearner to keep track of which features each weak learner should use.
  • Renames examples/bagging_iris.rs to examples/ensemble.rs and updates it accordingly.

Hope this implementation meets the library standards; a rough usage sketch is included below. If any further modification is required, just ping me or feel free to modify it :)
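To make the intended usage concrete, here is a rough sketch rather than the exact code in examples/ensemble.rs: it assumes the existing EnsembleLearnerParams builder methods (ensemble_size, bootstrap_proportion) together with the feature_proportion parameter introduced in this PR, so the names may differ from what finally lands.

```rust
use linfa::prelude::*;
use linfa_ensemble::EnsembleLearnerParams;
use linfa_trees::DecisionTree;
use rand::{rngs::SmallRng, SeedableRng};

fn main() {
    let mut rng = SmallRng::seed_from_u64(42);
    let (train, valid) = linfa_datasets::iris()
        .shuffle(&mut rng)
        .split_with_ratio(0.8);

    // Each weak tree is fit on a bootstrap sample of the rows and, with the
    // constraint added in this PR, on a random subset of the columns -- which
    // is what turns plain bagging into a random forest.
    let forest = EnsembleLearnerParams::new(DecisionTree::params())
        .ensemble_size(100)
        .bootstrap_proportion(0.7)
        .feature_proportion(0.7) // hypothetical name: parameter added by this PR
        .fit(&train)
        .unwrap();

    let cm = forest
        .predict(&valid)
        .confusion_matrix(&valid)
        .unwrap();
    println!("accuracy: {}", cm.accuracy());
}
```

Setting feature_proportion below 1.0 is what distinguishes the random forest from plain bagging: each tree trains on its own random column subset.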

…straint on predictors (or features) to be used. Introduce Dataset bootstrap implementation with indices denoted as '_with_indices' methods.
@relf (Member) commented Oct 19, 2025

Thanks for your contribution.

I would like to mention the prior art on RandomForest. Did you see it?
In this comment, I was suggesting an implementation.
As it is a well-known algorithm, I think random forest deserves a struct named after it. What do you think?

Otherwise, you need to make clippy happy (I think you can go with #[allow(clippy::type_complexity)]).
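For reference, a minimal standalone illustration (not code from this PR) of where the attribute goes; it suppresses clippy's type_complexity lint for that item only:

```rust
// The attribute silences clippy's `type_complexity` warning for this item,
// which is useful when a return type is unavoidably verbose.
#[allow(clippy::type_complexity)]
fn grouped_scores() -> Vec<(String, Vec<(usize, Option<f64>)>)> {
    Vec::new()
}

fn main() {
    let _ = grouped_scores();
}
```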

@ladezai (Author) commented Oct 20, 2025

> Thanks for your contribution.

Hi, thanks for taking the time to review the code!

> I would like to mention the prior art on RandomForest. Did you see it? In this comment, I was suggesting an implementation. As it is a well-known algorithm, I think random forest deserves a struct named after it. What do you think?

Yes, I saw the previous implementation, and for that reason I tried to stick with the already working EnsembleLearner implementation, with only minor edits.

I pushed a new commit that adds a type alias RandomForest<F, L> = EnsembleLearner<DecisionTree<F, L>>. Do you think that is enough, or should I add more documentation to the type alias, or provide a different solution altogether?
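For reference, the alias looks roughly like this (as proposed in the new commit, not necessarily the final merged form):

```rust
use linfa_ensemble::EnsembleLearner;
use linfa_trees::DecisionTree;

/// A random forest is just an ensemble of decision trees in which each tree
/// sees a bootstrap sample of the rows and a random subset of the features,
/// so a type alias over EnsembleLearner is enough to expose it under its
/// usual name.
pub type RandomForest<F, L> = EnsembleLearner<DecisionTree<F, L>>;
```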

codecov bot commented Oct 20, 2025

Codecov Report

❌ Patch coverage is 61.53846% with 20 lines in your changes missing coverage. Please review.
✅ Project coverage is 36.16%. Comparing base (3f2c202) to head (7b3bdb2).

| Files with missing lines | Patch % | Lines |
|---|---|---|
| algorithms/linfa-ensemble/src/algorithm.rs | 47.05% | 9 Missing ⚠️ |
| algorithms/linfa-ensemble/src/hyperparams.rs | 25.00% | 6 Missing ⚠️ |
| src/dataset/impl_dataset.rs | 81.48% | 5 Missing ⚠️ |
Additional details and impacted files
@@            Coverage Diff             @@
##           master     #410      +/-   ##
==========================================
+ Coverage   36.05%   36.16%   +0.11%     
==========================================
  Files         100      100              
  Lines        6549     6592      +43     
==========================================
+ Hits         2361     2384      +23     
- Misses       4188     4208      +20     

☔ View full report in Codecov by Sentry.

@relf (Member) commented Oct 21, 2025

Thanks! The implementation looks good. And yes, it would be great to adapt the documentation in lib.rs and add some for the RandomForest and EnsembleLearner types.

You also have to add some automated tests in lib.rs for the random forest, as well as in dataset/mod.rs for the new bootstrap functions.

@ladezai (Author) commented Oct 21, 2025

> Thanks! The implementation looks good. And yes, it would be great to adapt the documentation in lib.rs and add some for the RandomForest and EnsembleLearner types.

For now, to avoid increasing the size of this PR, I added minimal documentation with an example (and fixed some typos in the EnsembleLearner documentation). Later on, I'd guess a rewrite of this part of the documentation is needed to be consistent with the other sub-crates.

> You also have to add some automated tests in lib.rs for the random forest, as well as in dataset/mod.rs for the new bootstrap functions.

Added! I don't know if they are enough; if you have something more in mind, just ask.
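For illustration, a rough sketch of the kind of test this refers to, not the exact test that was pushed; it reuses the hypothetical feature_proportion parameter from this PR alongside the existing builder methods:

```rust
#[test]
fn random_forest_predicts_iris() {
    use linfa::prelude::*;
    use linfa_ensemble::EnsembleLearnerParams;
    use linfa_trees::DecisionTree;

    let (train, valid) = linfa_datasets::iris().split_with_ratio(0.8);

    let forest = EnsembleLearnerParams::new(DecisionTree::params())
        .ensemble_size(10)
        .bootstrap_proportion(0.7)
        .feature_proportion(0.5) // hypothetical name: parameter added by this PR
        .fit(&train)
        .unwrap();

    // The ensemble should comfortably beat random guessing on the held-out split.
    let cm = forest.predict(&valid).confusion_matrix(&valid).unwrap();
    assert!(cm.accuracy() > 0.7);
}
```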
