Benchmark/rf use case #294
Draft: cxzhang4 wants to merge 39 commits into main from benchmark/rf_use_case
Commits
2380913 TODO: write tests (cxzhang4)
86f87c8 name -> TB. began refactoring based on last meeting with Sebastian (cxzhang4)
400ed74 slight description change (cxzhang4)
9e6acd8 removed extraneous comments (cxzhang4)
fc4f2fa added n_last_loss frequency test (cxzhang4)
81d1ded in progress (cxzhang4)
cb03eb3 autotest working, accidentally used the wrong callback_generator (cxzhang4)
78b95a5 simple and eval_freq tests pass (cxzhang4)
a365757 changed logging methods to private (cxzhang4)
43a8ffb removed magrittr pipe from tests (cxzhang4)
6b9a845 added details for callback class (cxzhang4)
d354b2c formatting (cxzhang4)
b5b27b1 built docs (cxzhang4)
565456b Merge branch 'main' into feat/tflog-callback (cxzhang4)
7c9f431 all tests pass, I think this is parity with the previous broken commi… (cxzhang4)
c6c9333 implemented step logging (cxzhang4)
43e7396 removed extraneous comments (cxzhang4)
ec5d8fc added tensorboard instructions (cxzhang4)
f26a254 passes R CMD Check, minimally addresses every comment in the previous PR (cxzhang4)
a86c946 moved newest news to bottom (cxzhang4)
3652fe6 init (cxzhang4)
92b4ffc Update benchmarks/rf_use_case/run_benchmark.R (cxzhang4)
f821e09 use mlr3oml cache (cxzhang4)
5903001 Copied in Sebastian's solution for tuning the neurons as a paramset (cxzhang4)
869aba2 looks like benchmark code working (cxzhang4)
ab3bedf Error: Inner tuning and parameter transformations are currently not s… (cxzhang4)
31b3964 changed to grid search (cxzhang4)
a489897 LLM-generated fn for neuron search space (cxzhang4)
0073dcc should work, test this on another machine (cxzhang4)
10f3448 fjwoie (cxzhang4)
b81c23b encapsulated the learner for parallelization (cxzhang4)
00b272f comments (cxzhang4)
89a72f1 added install script (cxzhang4)
52af8ed looks ready to run. 100 evals of mbo (cxzhang4)
95f0a45 addoed surrogate learner for mbo (cxzhang4)
5c0a447 Delete R/CallbackSetTB.R (sebffischer)
c384529 Delete tests/testthat/test_CallbackSetTB.R (sebffischer)
ee3f51d merge main (sebffischer)
c01c531 update benchmark (sebffischer)
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -14,7 +14,8 @@ inst/doc | ||
| /doc/ | ||
| /Meta/ | ||
| CRAN-SUBMISSION | ||
| benchmarks/data | ||
| paper/data | ||
| .idea/ | ||
| .vsc/ | ||
| paper/data | ||
| paper/data | ||
Binary file not shown.
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,25 @@ | ||
| library(here) | ||
| library(mlr3oml) | ||
| library(tidytable) | ||
| | ||
| cc18_collection = ocl(99) | ||
| | ||
| cc18_simple = list_oml_data(data_id = cc18_collection$data_ids, | ||
| number_classes = 2, | ||
| number_missing_values = 0) | ||
| | ||
| cc18_small = cc18_simple |> | ||
| filter(NumberOfSymbolicFeatures == 1) |> # the target class is a symbolic feature | ||
| select(data_id, name, NumberOfFeatures, NumberOfInstances) |> | ||
| filter(name %in% c("qsar-biodeg", "madelon", "kc1", "blood-transfusion-service-center", "climate-model-simulation-crashes")) | ||
| | ||
| data_dir = here("benchmarks", "data") | ||
| if (!dir.exists(data_dir)) { | ||
| dir.create(data_dir) | ||
| } | ||
| | ||
| options(mlr3oml.cache = here(data_dir, "oml")) | ||
| mlr3misc::pwalk(cc18_small, function(data_id, name, NumberOfFeatures, NumberOfInstances) odt(data_id)) | ||
| | ||
| dir.create(here("benchmarks", "data", "oml", "collections"), recursive = TRUE, showWarnings = FALSE) | ||
| fwrite(cc18_small, here("benchmarks", "data", "oml", "collections", "cc18_small.csv")) | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,11 @@ | ||
| devtools::install_github("mlr-org/mlr3torch") | ||
| devtools::install_github("mlr-org/mlr3tuning@fix/int-tune-trafo") | ||
| | ||
| # Package names | ||
| packages = c("here", "mlr3oml", "tidytable", "mlr3", "mlr3learners", "mlr3tuning", "mlr3mbo", "bbotk", "bench", "data.table") | ||
| | ||
| # Install packages not yet installed | ||
| installed_packages = packages %in% rownames(installed.packages()) | ||
| if (any(!installed_packages)) { | ||
| install.packages(packages[!installed_packages], repos = "https://ftp.fau.de/cran/") | ||
| } | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,62 @@ | ||
| library(mlr3verse) | ||
| library(mlr3oml) | ||
| library(mlr3torch) | ||
| library(mlr3batchmark) | ||
| library(mlr3mbo) | ||
| library(mlr3tuning) | ||
| | ||
| ids = c(1067, 1464, 1485, 1494, 40994) | ||
| task_list = lapply(ids, function(id) tsk("oml", data_id = id)) | ||
| | ||
| mlp = lrn("classif.mlp", | ||
| activation = nn_relu, | ||
| n_layers = to_tune(lower = 1, upper = 10), | ||
| neurons = to_tune(p_int(lower = 10, upper = 1000)), | ||
| batch_size = to_tune(c(64, 128, 256)), | ||
| p = to_tune(0.1, 0.9), | ||
| epochs = to_tune(lower = 1, upper = 1000L, internal = TRUE), | ||
| validate = "test", | ||
| measures_valid = msr("classif.logloss"), | ||
| patience = 10, | ||
| device = "auto", | ||
| predict_type = "prob" | ||
| ) | ||
| | ||
| mlp$encapsulate("callr", lrn("classif.featureless")) | ||
| | ||
| surrogate = srlrn(as_learner(po("imputesample", affect_columns = selector_type("logical")) %>>% | ||
| po("imputeoor", multiplier = 3, affect_columns = selector_type(c("integer", "numeric", "character", "factor", "ordered"))) %>>% | ||
| po("colapply", applicator = as.factor, affect_columns = selector_type("character")) %>>% | ||
| lrn("regr.ranger")), catch_errors = TRUE) | ||
| | ||
| # define an AutoTuner that wraps the classif.mlp | ||
| at = auto_tuner( | ||
| learner = mlp, | ||
| tuner = tnr("mbo", surrogate = surrogate), | ||
| resampling = rsmp("cv", folds = 5), | ||
| measure = msr("internal_valid_score", minimize = TRUE), | ||
| term_evals = 1 | ||
| ) | ||
| | ||
| lrn_rf = lrn("classif.ranger") | ||
| | ||
| design = benchmark_grid( | ||
| task_list, | ||
| learners = list(at, lrn_rf), | ||
| resampling = rsmp("cv", folds = 3) | ||
| ) | ||
| | ||
| design1 = benchmark_grid( | ||
| task_list[[1]], | ||
| learners = list(at, lrn_rf), | ||
| resampling = rsmp("holdout") | ||
| ) | ||
| | ||
| benchmark(design1) | ||
| | ||
| reg = makeExperimentRegistry( | ||
| file.dir = here::here("benchmarks", "rf_use_case", "reg"), | ||
| packages = c("mlr3verse", "mlr3oml", "mlr3torch", "mlr3batchmark") | ||
| ) | ||
| | ||
| batchmark(design) | ||
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,85 @@ | ||
| library(mlr3) | ||
| library(mlr3learners) | ||
| library(mlr3oml) | ||
| library(mlr3torch) | ||
| library(mlr3tuning) | ||
| library(mlr3mbo) | ||
| library(bbotk) | ||
| | ||
| library(bench) | ||
| library(data.table) | ||
| library(here) | ||
| | ||
| options(mlr3oml.cache = here("benchmarks", "data", "oml")) | ||
| | ||
| # define the tasks | ||
| cc18_small = fread(here(getOption("mlr3oml.cache"), "collections", "cc18_small.csv")) | ||
| | ||
| task_list = mlr3misc::pmap(cc18_small, function(data_id, name, NumberOfFeatures, NumberOfInstances) tsk("oml", data_id = data_id)) | ||
| | ||
| task_list | ||
| | ||
| # define the learners | ||
| # neurons = function(n_layers, latent_dim) { | ||
| # rep(latent_dim, n_layers) | ||
| # } | ||
| | ||
| # n_layers_values <- 1:5 | ||
| # latent_dim_values <- seq(10, 200, by = 20) | ||
| # neurons_search_space <- mapply( | ||
| # neurons, | ||
| # expand.grid(n_layers = n_layers_values, latent_dim = latent_dim_values)$n_layers, | ||
| # expand.grid(n_layers = n_layers_values, latent_dim = latent_dim_values)$latent_dim, | ||
| # SIMPLIFY = FALSE | ||
| # ) | ||
| | ||
| mlp = lrn("classif.mlp", | ||
| activation = nn_relu, | ||
| neurons = to_tune(ps( | ||
| n_layers = p_int(lower = 1, upper = 10), latent = p_int(10, 500), | ||
| .extra_trafo = function(x, param_set) { | ||
| list(neurons = rep(x$latent, x$n_layers)) | ||
| }) | ||
| ), | ||
| # neurons = to_tune(neurons_search_space), | ||
| batch_size = to_tune(c(64, 128, 256)), | ||
| p = to_tune(0.1, 0.7), | ||
| epochs = to_tune(upper = 1000L, internal = TRUE), | ||
| validate = "test", | ||
| measures_valid = msr("classif.acc"), | ||
| patience = 10, | ||
| device = "cpu" | ||
| ) | ||
| | ||
| mlp$encapsulate("callr", lrn("classif.featureless")) | ||
| | ||
| # define an AutoTuner that wraps the classif.mlp | ||
| at = auto_tuner( | ||
| learner = mlp, | ||
| tuner = tnr("mbo"), | ||
| resampling = rsmp("cv", folds = 5), | ||
| measure = msr("classif.acc"), | ||
| term_evals = 10 | ||
| ) | ||
| | ||
| future::plan("multisession", workers = 8) | ||
| | ||
| lrn_rf = lrn("classif.ranger") | ||
| | ||
| options(mlr3.exec_random = FALSE) | ||
| | ||
| design = benchmark_grid( | ||
| task_list[[1]], | ||
| learners = list(at, lrn_rf), | ||
| resampling = rsmp("cv", folds = 3) | ||
| ) | ||
| design = design[order(mlr3misc::ids(learner)), ] | ||
| | ||
| time = bench::system_time( | ||
| bmr <- benchmark(design) | ||
| ) | ||
| | ||
| bmrdt = as.data.table(bmr) | ||
| | ||
| fwrite(bmrdt, here("benchmarks", "rf_use_case", "results", "bmrdt.csv")) | ||
| fwrite(as.data.table(as.list(unclass(time))), here("benchmarks", "rf_use_case", "results", "time.csv")) | ||
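Review note on the `neurons` search space in the file above: the `.extra_trafo` turns the two tuned integers (`n_layers`, `latent`) into a single `neurons` vector with one entry per hidden layer. A minimal base-R sketch of that mapping follows. The function name `neurons_trafo` and the sample configuration `cfg` are hypothetical and only for illustration; the body mirrors the trafo in the diff and does not call mlr3 or paradox.

```r
# Hypothetical stand-in for the .extra_trafo in the diff above.
# Input: one sampled configuration as a named list.
# Output: a list holding the `neurons` vector, one entry per hidden layer.
neurons_trafo = function(x, param_set = NULL) {
  list(neurons = rep(x$latent, x$n_layers))
}

# Assumed example configuration (not taken from the PR)
cfg = list(n_layers = 3L, latent = 64L)
res = neurons_trafo(cfg)
print(res$neurons)
```

Under this mapping every hidden layer gets the same width, so the tuner searches over depth and a shared width rather than over per-layer sizes.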
| Original file line number | Diff line number | Diff line change |
|---|---|---|
| @@ -0,0 +1,12 @@ | ||
| library(data.table) | ||
| library(mlr3) | ||
| | ||
| library(here) | ||
| | ||
| bmr_ce = fread(here("benchmarks", "rf_use_case", "results", "bmr_ce.csv")) | ||
| | ||
| bmr_ce | ||
| | ||
| time = fread(here("benchmarks", "rf_use_case", "results", "time.csv")) | ||
| | ||
| time | ||
You can also add this to your .Rprofile