Fix bugs in cross validation vignette (added by me when I made initial edit)
ericward-noaa committed Feb 14, 2025
1 parent 4f8bea1 commit df5ea4d
Showing 1 changed file with 11 additions and 11 deletions.
22 changes: 11 additions & 11 deletions vignettes/articles/cross-validation.Rmd
@@ -27,7 +27,7 @@ knitr::opts_chunk$set(
)
```

-```{r packages, message=FALSE, warning=TRUE}
+```{r packages, message=FALSE, warning=TRUE, eval=TRUE}
library(ggplot2)
library(dplyr)
library(sdmTMB)
@@ -54,7 +54,7 @@ In LFOCV, data up to year $t$ are used to predict observations at $t+1$, etc.
Cross validation in sdmTMB is implemented using the `sdmTMB_cv()` function, with the `k_folds` argument specifying the number of folds (defaults to 8).
The function uses parallelization by default a `future::plan()` is set, but this can be turned off with the `parallel` argument.

-```{r ex0}
+```{r ex0, eval=TRUE}
data(pcod)
mesh <- make_mesh(pcod, c("X", "Y"), cutoff = 25)
pcod$fyear <- as.factor(pcod$year)
@@ -65,7 +65,7 @@ pcod$fyear <- as.factor(pcod$year)
# Set parallel processing if desired:
library(future)
plan(multisession, workers = 2)
-m_cv <- sdmTMB_cv(
+m_cv <- sdmTMB::sdmTMB_cv(
density ~ 0 + s(depth_scaled) + fyear,
data = pcod,
mesh = mesh,
@@ -80,7 +80,7 @@ Without getting into the complexities of the `blockCV` or `spatialsample` packag
```{r ex2, eval=TRUE}
clust <- kmeans(pcod[, c("X", "Y")], 20)$cluster
-m_cv <- sdmTMB_cv(
+m_cv <- sdmTMB::sdmTMB_cv(
density ~ 0 + s(depth_scaled) + fyear,
data = pcod,
mesh = mesh,
@@ -94,7 +94,7 @@ Or similarly, these clusters could be assigned in time---here, each year to a un
```{r ex3, eval=TRUE}
clust <- as.numeric(as.factor(pcod$year))
-m_cv <- sdmTMB_cv(
+m_cv <- sdmTMB::sdmTMB_cv(
density ~ 0 + s(depth_scaled),
data = pcod,
mesh = mesh,
@@ -133,16 +133,16 @@ m <- sdmTMB_cv(
)
# RMSE across entire dataset:
-sqrt(mean((d$density - d$cv_predicted)^2))
+sqrt(mean((m$data$density - m$data$cv_predicted)^2))
# MAE across entire dataset:
-mean(abs(d$density - d$cv_predicted))
+mean(abs(m$data$density - m$data$cv_predicted))
```

Alternatively, we might be interested in calculating RMSE and MAE by fold,

-```{r ex5, eval=TRUE}
+```{r ex5b, eval=TRUE}
# RMSE and MAE by fold:
-group_by(d, cv_fold) |>
+group_by(m$data, cv_fold) |>
summarize(
rmse = sqrt(mean((density - cv_predicted)^2)),
mae = mean(abs(density - cv_predicted))
@@ -207,15 +207,15 @@ In this example, using either the predictive log-likelihood or ELPD would lead o
```{r ex7, eval=TRUE}
clust <- sample(seq_len(10), size = nrow(pcod), replace = TRUE)
-m1 <- sdmTMB_cv(
+m1 <- sdmTMB::sdmTMB_cv(
density ~ 0 + fyear,
data = pcod,
mesh = mesh,
fold_ids = clust,
family = tweedie(link = "log")
)
-m2 <- sdmTMB_cv(
+m2 <- sdmTMB::sdmTMB_cv(
density ~ 0 + fyear + s(depth_scaled),
data = pcod,
mesh = mesh,

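The RMSE and MAE hunks in this commit replace an undefined `d` with `m$data`, the data frame stored on the object returned by `sdmTMB_cv()`, which carries the `cv_predicted` and `cv_fold` columns. As a rough illustration of the same overall and per-fold summaries, here is a minimal base-R sketch. The data frame `cv_data` and its values are invented stand-ins for `m$data` (an assumption, not output from the vignette), so no model needs to be fitted.

```r
# Hypothetical stand-in for m$data: observed values, held-out predictions,
# and the fold each row was assigned to (values invented for illustration).
cv_data <- data.frame(
  density      = c(1, 2, 3, 4),
  cv_predicted = c(1, 4, 3, 2),
  cv_fold      = c(1, 1, 2, 2)
)

# Overall RMSE and MAE, mirroring the corrected chunk in the diff.
err <- cv_data$density - cv_data$cv_predicted
rmse_all <- sqrt(mean(err^2))
mae_all <- mean(abs(err))

# Per-fold RMSE and MAE, using base R in place of dplyr::group_by().
by_fold <- do.call(rbind, lapply(split(cv_data, cv_data$cv_fold), function(f) {
  e <- f$density - f$cv_predicted
  data.frame(
    cv_fold = f$cv_fold[1],
    rmse = sqrt(mean(e^2)),
    mae = mean(abs(e))
  )
}))
print(by_fold)
```

With a real fit, swapping `cv_data` for `m$data` should give the quantities computed in the vignette, provided the response column is named `density`.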