Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New_AdaptiveRandomForest #243

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

pass-always
Copy link
Contributor

Comparing with the original AdaptiveRandomForest, New_AdaptiveRandomForest with an extra option to turn off resampling function. The original is designed to deal with imbalanced datasets.

…orest with an extra option to turn off resampling function. The original is designed to deal with imbalanced datasets.
@hmgomes
Copy link
Member

hmgomes commented Oct 11, 2021

Hi Ethan,
Please review the suggestions and update

Cheers,
Heitor

"Defines how m, defined by mFeaturesPerTreeSize, is interpreted. M represents the total number of features.",
new String[]{"Specified m (integer value)", "sqrt(M)+1", "M-(sqrt(M)+1)",
"Percentage (M * (m / 100))"},
new String[]{"SpecifiedM", "SqrtM1", "MSqrtM1", "Percentage"}, 1);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default value should be 3 in this option

"Percentage (M * (m / 100))"},
new String[]{"SpecifiedM", "SqrtM1", "MSqrtM1", "Percentage"}, 3);

"The number of trees.", 10, 1, Integer.MAX_VALUE);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The default option should be 100 for the number of trees


@Override
public String getPurposeString() {
return "Adaptive Random Forest algorithm for evolving data streams from Gomes et al.";
return "Adaptive Random Forest with Resampling --- beta version";
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please revert the purpose string, there is not need to change it

public IntOption mFeaturesPerTreeSizeOption = new IntOption("mFeaturesPerTreeSize", 'm',
"Number of features allowed considered for each split. Negative values corresponds to M - m", 60, Integer.MIN_VALUE, Integer.MAX_VALUE);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

similar to the other comments, leave these default values as they are

"The lambda parameter for bagging.", 6.0, 1.0, Float.MAX_VALUE);
"The lambda parameter for bagging.", 6.0, 1.0, Float.MAX_VALUE);

public MultiChoiceOption streamBalanceOption = new MultiChoiceOption("streamBalanceOption", 'r', "The option of stream balance", new String[]{"Imbalance", "Balance"},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This should be the only option added / changed for this PR

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The name would be better if it were classBalancingOption. The options that are displayed on the GUI should be None (No balance) and Balanced and not Imbalance and Balance

private void init(int indexOriginal, ARFHoeffdingTree instantiatedClassifier, BasicClassificationPerformanceEvaluator evaluatorInstantiated,
long instancesSeen, boolean useBkgLearner, boolean useDriftDetector, ClassOption driftOption, ClassOption warningOption, boolean isBackgroundLearner) {
//Option stream/dataset balance or not. Resampling for imbalance, No resampling for balance dataset
public MultiChoiceOption resampling = streamBalanceOption;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not use the Option directly? It is not clear why the attribute resampling is needed
I see now why you did it. However, perhaps use a boolean attribute in ARFBaseLearner instead of a MultiChoiceOption. Also, instead of calling it resampling, what about calling balancing, which can be on (true) or off (false).

this.executor = Executors.newFixedThreadPool(numberOfJobs);
}

@Override
public void trainOnInstanceImpl(Instance instance) {
++this.instancesSeen;
if(this.ensemble == null)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

avoid changes to the format of the code, such as adding spaces or extra lines

this.classifier.resetLearning();
this.createdOn = instancesSeen;
this.driftDetectionMethod = ((ChangeDetector) getPreparedClassOption(this.driftOption)).copy();
}
this.evaluator.reset();
}

private void populateCounterArray(Instance instnc) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can call it instance instead of instnc :)

…on as a FlagOption. Modified some default option values.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants