-
Notifications
You must be signed in to change notification settings - Fork 355
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
New_AdaptiveRandomForest #243
base: master
Are you sure you want to change the base?
Conversation
…orest with an extra option to turn off resampling function. The original is designed to deal with imbalanced datasets.
Hi Ethan, Cheers, |
"Defines how m, defined by mFeaturesPerTreeSize, is interpreted. M represents the total number of features.", | ||
new String[]{"Specified m (integer value)", "sqrt(M)+1", "M-(sqrt(M)+1)", | ||
"Percentage (M * (m / 100))"}, | ||
new String[]{"SpecifiedM", "SqrtM1", "MSqrtM1", "Percentage"}, 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default value should be 3 in this option
"Percentage (M * (m / 100))"}, | ||
new String[]{"SpecifiedM", "SqrtM1", "MSqrtM1", "Percentage"}, 3); | ||
|
||
"The number of trees.", 10, 1, Integer.MAX_VALUE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The default option should be 100 for the number of trees
|
||
@Override | ||
public String getPurposeString() { | ||
return "Adaptive Random Forest algorithm for evolving data streams from Gomes et al."; | ||
return "Adaptive Random Forest with Resampling --- beta version"; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please revert the purpose string, there is not need to change it
public IntOption mFeaturesPerTreeSizeOption = new IntOption("mFeaturesPerTreeSize", 'm', | ||
"Number of features allowed considered for each split. Negative values corresponds to M - m", 60, Integer.MIN_VALUE, Integer.MAX_VALUE); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
similar to the other comments, leave these default values as they are
"The lambda parameter for bagging.", 6.0, 1.0, Float.MAX_VALUE); | ||
"The lambda parameter for bagging.", 6.0, 1.0, Float.MAX_VALUE); | ||
|
||
public MultiChoiceOption streamBalanceOption = new MultiChoiceOption("streamBalanceOption", 'r', "The option of stream balance", new String[]{"Imbalance", "Balance"}, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This should be the only option added / changed for this PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The name would be better if it were classBalancingOption
. The options that are displayed on the GUI should be None (No balance)
and Balanced
and not Imbalance
and Balance
private void init(int indexOriginal, ARFHoeffdingTree instantiatedClassifier, BasicClassificationPerformanceEvaluator evaluatorInstantiated, | ||
long instancesSeen, boolean useBkgLearner, boolean useDriftDetector, ClassOption driftOption, ClassOption warningOption, boolean isBackgroundLearner) { | ||
//Option stream/dataset balance or not. Resampling for imbalance, No resampling for balance dataset | ||
public MultiChoiceOption resampling = streamBalanceOption; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use the Option directly? It is not clear why the attribute resampling
is needed
I see now why you did it. However, perhaps use a boolean attribute in ARFBaseLearner instead of a MultiChoiceOption
. Also, instead of calling it resampling, what about calling balancing
, which can be on (true) or off (false).
this.executor = Executors.newFixedThreadPool(numberOfJobs); | ||
} | ||
|
||
@Override | ||
public void trainOnInstanceImpl(Instance instance) { | ||
++this.instancesSeen; | ||
if(this.ensemble == null) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
avoid changes to the format of the code, such as adding spaces or extra lines
this.classifier.resetLearning(); | ||
this.createdOn = instancesSeen; | ||
this.driftDetectionMethod = ((ChangeDetector) getPreparedClassOption(this.driftOption)).copy(); | ||
} | ||
this.evaluator.reset(); | ||
} | ||
|
||
private void populateCounterArray(Instance instnc) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can call it instance
instead of instnc
:)
…on as a FlagOption. Modified some default option values.
Comparing with the original AdaptiveRandomForest, New_AdaptiveRandomForest with an extra option to turn off resampling function. The original is designed to deal with imbalanced datasets.