
Result of tune_cluster() depends on the name of the split? #193

Open
@trevorcampbell

Description


When I try to use tune_cluster() with an apparent() split (k-means isn't often used with resampling splits, so apparent() seems to make the most sense to me), the result is full of NAs. After a lot of digging I eventually traced it to something really weird: the result seems to depend on the name of the split (!?).

You can reproduce this in the docker image ubcdsci/r-dsci-100-grading:cafad0999c16.
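
For reference, the versions of the packages involved can be checked inside that image with something like the snippet below (a sketch; the exact versions are not listed in this report):

# sketch: record the versions of the packages used for the reprex
packageVersion("tidyclust")
packageVersion("tune")
packageVersion("rsample")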

Reprex:

library(tidyverse)
library(tidymodels)
library(tidyclust)

# start by reducing the size of mtcars just to make things cleaner (this is not important for the bug)
mt <- mtcars |> rep_sample_n(size = 10, replace = TRUE, reps = 1) |> ungroup() |> select(mpg, disp)

# specification and recipe
kmeans_spec <- k_means(num_clusters = tune()) |>
    set_engine("stats")

kmeans_recipe <- recipe(~ ., data = mt) |>
    step_scale(all_predictors()) |>
    step_center(all_predictors())

# tuning 1-4 clusters
ks <- tibble(num_clusters = 1:4)

# Now we create two rsets. One using apparent, one manually. They're identical except for the split name.

# RSET 1: a manually created single split that just tunes on the whole data set.
# The split can be named anything you want EXCEPT "Apparent"; I named it "banana".
# Note: if you name it "Apparent", you get the same buggy result as with apparent().
indices <- list(list(analysis = 1:nrow(mt), assessment = 1:nrow(mt)))
splits <- lapply(indices, make_splits, data = mt)
split_good <- manual_rset(splits, c("banana"))

# RSET 2: using apparent. 
split_bad <- apparent(mt)

# if you inspect split_good and split_bad, they're identical aside from the split name.
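# (sketch, not part of the bug) one way to verify they hold the same rows:
all.equal(analysis(split_good$splits[[1]]), analysis(split_bad$splits[[1]]))
all.equal(assessment(split_good$splits[[1]]), assessment(split_bad$splits[[1]]))
split_good$id  # "banana"
split_bad$id   # "Apparent"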

# Now we tune the number of clusters with each rset
results_good <- workflow() |>
    add_recipe(kmeans_recipe) |>
    add_model(kmeans_spec) |>
    tune_cluster(resamples = split_good, grid = ks) |>
    collect_metrics()

results_bad <- workflow() |>
    add_recipe(kmeans_recipe) |>
    add_model(kmeans_spec) |>
    tune_cluster(resamples = split_bad, grid = ks) |>
    collect_metrics()

The outputs look like:

[screenshot of the two collect_metrics() tibbles: results_good has metric estimates for every value of num_clusters, while results_bad is largely NA]
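
To make the NAs concrete, the affected rows can be pulled out with something like this (a sketch, assuming the usual mean column returned by collect_metrics()):

# sketch: show the summarised metric rows that come back as NA in the bad result
results_bad |> filter(is.na(mean))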

Labels: bug (an unexpected problem or unintended behavior)
