AutoTuner `$model` is set to some bogus value after `resample()` call. #1091
Thank you @JHarrisonEcoEvo for your thoughtful feedback. (And sorry that you are getting a reply three years later...). There are a few points here that basically boil down to design decisions, but there is also an error in your code.
We basically have that, and you almost found it: the `$base_learner()` method. It would probably also make sense to expose certain meta-information about the base model, e.g. its importance values. Printing `rr$learners[[1]]` shows the `AutoTuner` wrapper, not the fitted ranger model:

```r
rr$learners[[1]]
#> <AutoTuner:imputehist.scale.regr.ranger.tuned>
#> * Model: auto_tuner_model
#> * Parameters: list()
#> * Validate: NULL
#> * Packages: mlr3, mlr3tuning, mlr3pipelines, graphics, mlr3learners,
#>   ranger
#> * Predict Types: [response], se, distr
#> * Feature Types: logical, integer, numeric, character, factor, ordered,
#>   POSIXct
#> * Properties: featureless, hotstart_backward, hotstart_forward,
#>   importance, loglik, marshal, missings, oob_error, selected_features,
#>   weights
#> * Search Space:
#>                       id    class lower upper nlevels
#>                   <char>   <char> <num> <num>   <num>
#> 1: regr.ranger.num.trees ParamInt    10   200     191
#> 2: regr.ranger.splitrule ParamFct    NA    NA       2
```
```r
# using the `$importance()` interface:
rr$learners[[1]]$base_learner()$importance()
#>     lstat        rm     tract       nox       dis     indus      crim       lon
#> 48.148068 31.680772 11.407258  8.476442  7.139589  6.704006  5.855580  4.117204
#>       age   ptratio       rad       tax        zn         b       lat
#>  3.422048  3.398435  2.558387  2.257436  1.950319  1.790536  1.782093

# alternatively, going into the model
rr$learners[[1]]$base_learner()$model$variable.importance
#>       age         b      crim       dis     indus       lat       lon     lstat
#>  3.422048  1.790536  5.855580  7.139589  6.704006  1.782093  4.117204 48.148068
#>       nox   ptratio       rad        rm       tax     tract        zn
#>  8.476442  3.398435  2.558387 31.680772  2.257436 11.407258  1.950319
```

If you train the `AutoTuner` directly, the fitted model is also reachable:

```r
at$train(boston_task)
```
```r
at$base_learner()$model
#> Ranger result
#>
#> Call:
#>  ranger::ranger(dependent.variable.name = task$target_names, data = task$data(), case.weights = task$weights$weight, importance = "permutation", num.threads = 1L, num.trees = 22L, splitrule = "variance")
#>
#> Type:                             Regression
#> Number of trees:                  22
#> Sample size:                      506
#> Number of independent variables:  15
#> Mtry:                             3
#> Target node size:                 5
#> Variable importance mode:         permutation
#> Splitrule:                        variance
#> OOB prediction error (MSE):       12.33812
#> R squared (OOB):                  0.8536618
```

To the degree that `GraphLearner` does not expose the fitted base model directly, `at$model` only shows the wrapper's own state:

```r
at$model
#> $learner
#> $learner$predict_type
#> [1] "response"
```

This may be confusing, I think.
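For anyone landing here with the same question, here is a minimal sketch of the two access paths, assuming a trained `AutoTuner` `at` that wraps a `GraphLearner` around `regr.ranger` as in the snippets above:

```r
# Assumes mlr3, mlr3learners, mlr3pipelines and mlr3tuning are loaded,
# and `at` is an already-trained AutoTuner wrapping a GraphLearner.

# $model holds the AutoTuner's own state (tuned wrapper, tuning results),
# not the fitted ranger object:
str(at$model, max.level = 1)

# $base_learner() recursively unwraps AutoTuner -> GraphLearner -> Learner,
# so these calls reach the fitted ranger model and its importance scores:
at$base_learner()$importance()
at$base_learner()$model$variable.importance
```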
> `$model` is set to some bogus value after `resample()` call.

Bumping this back to
Fixed by mlr-org/mlr3tuning#484
Hi there,
Thanks for the package. I recently encountered difficulty extracting variable importance metrics from a ranger learner after building a pipeline that did some feature engineering, etc. A working example is below, with possible points of discussion included as comments.

In short, the difficulty of extracting expected model outputs will easily stymie new users... it did me, for sure. If possible, it seems the base learner should always be attached to mlr3 objects at the top level as the model object, e.g. `learner$model` should return the innermost base model. I get that this might be hard when learners are wrapped by other learners, as I do here.

Perhaps a solution, when multiple learners are being wrapped/combined, is to let users specify which learner they want to "track" through the modeling process. That way, after byzantine tuning, resampling, etc., they could ask for the model they marked, and by default that model would fill the top slot of `learner$model`. Perhaps this suggestion is naive, since I know very little about the mlr3 architecture, but if at all possible it should be made easier to get at the innermost learner.
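The kind of setup described above can be sketched as follows. This is a hypothetical minimal reconstruction, not the original example: the task, pipeline steps, and tuning ranges are made up for illustration (the ranges roughly follow the search space printed elsewhere in this thread).

```r
library(mlr3)
library(mlr3learners)
library(mlr3pipelines)
library(mlr3tuning)

set.seed(1)

# feature engineering + ranger wrapped into a single GraphLearner
graph <- po("imputehist") %>>% po("scale") %>>%
  lrn("regr.ranger",
      importance = "permutation",
      num.trees  = to_tune(10, 200),
      splitrule  = to_tune(c("variance", "extratrees")))
glrn <- as_learner(graph)

at <- auto_tuner(
  tuner      = tnr("random_search"),
  learner    = glrn,
  resampling = rsmp("holdout"),
  measure    = msr("regr.mse"),
  term_evals = 10
)

# store_models = TRUE is needed so rr$learners[[1]] keeps its fitted model
rr <- resample(tsk("mtcars"), at, rsmp("cv", folds = 3), store_models = TRUE)

# the fitted ranger model sits behind $base_learner(), not $model
rr$learners[[1]]$base_learner()$importance()
```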
Thanks for your time!