RC 0.2.0 #171

Merged 8 commits on Sep 25, 2023
1 change: 1 addition & 0 deletions .Rbuildignore
@@ -12,3 +12,4 @@
^vignettes/articles$
^cran-comments\.md$
^CRAN-SUBMISSION$
^revdep$
5 changes: 5 additions & 0 deletions .gitignore
@@ -5,3 +5,8 @@
.DS_Store
docs
hex sticker/
revdep/checks.noindex
revdep/library.noindex
revdep/data.sqlite
.httr-oauth
revdep/cloud.noindex/*
4 changes: 3 additions & 1 deletion DESCRIPTION
@@ -1,6 +1,6 @@
Package: tidyclust
Title: A Common API to Clustering
Version: 0.1.2.9000
Version: 0.2.0.9000
Authors@R: c(
person("Emil", "Hvitfeldt", , "[email protected]", role = c("aut", "cre"),
comment = c(ORCID = "0000-0002-0679-1945")),
@@ -13,6 +13,8 @@ Description: A common interface to specifying clustering models, in the
License: MIT + file LICENSE
URL: https://github.com/tidymodels/tidyclust, https://tidyclust.tidymodels.org/
BugReports: https://github.com/tidymodels/tidyclust/issues
Depends:
R (>= 3.6)
Imports:
cli (>= 3.0.0),
dials (>= 1.1.0),
28 changes: 19 additions & 9 deletions NEWS.md
@@ -1,5 +1,19 @@
# tidyclust (development version)

# tidyclust 0.2.0

## New Engines

* The clustMixType engine has been added to `k_means()`. This engine allows fitting of k-prototype models. (#63)

* The klaR engine has been added to `k_means()`. This engine allows fitting of k-modes models. (#63)
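
As a rough illustration of the two entries above (a sketch, not code from this PR), selecting the new engines might look like the following. The object names `kmodes_spec` and `kproto_spec` are hypothetical, and actually fitting either specification would additionally require the klaR or clustMixType package to be installed.

```r
library(tidyclust)

# k-modes: a k_means() specification that uses the klaR engine
kmodes_spec <- set_engine(k_means(num_clusters = 3), "klaR")

# k-prototypes: the same model type with the clustMixType engine
kproto_spec <- set_engine(k_means(num_clusters = 3), "clustMixType")

# Printing a specification shows the chosen engine
kmodes_spec
kproto_spec
```

Only the engine string changes between the two; the model function and the `num_clusters` argument are the same as for the existing `"stats"` engine.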

## Improvements

* Engine-specific documentation has been added for all models and engines. (#159)

## Bug Fixes

* Fixed bug where engine-specific arguments were passed along for `k_means()` when the engine is ClusterR. (#142)

* Fixed bug where the `prefix` argument wouldn't be correctly passed through `extract_cluster_assignment()`, `extract_centroids()`, and `predict()`. (#145)
@@ -12,19 +26,15 @@

* `k_means()` now errors informatively if `fit()` is called without `num_clusters` specified. (#134)

* Exported internal functions `ClusterR_kmeans_fit()`, `stats_kmeans_fit()`, and `hclust_fit()` have been renamed to `.k_means_fit_ClusterR()`, `.k_means_fit_stats()`, and `.hier_clust_fit_stats()` to reduce visibility for users.

* The clustMixType engine has been added to `k_means()`. This engine allows fitting of k-prototype models. (#63)
* Fixed bug where levels didn't match the number of clusters when predicting on fewer observations. (#158)

* The klaR engine has been added to `k_means()`. This engine allows fitting of k-modes models. (#63)
* Fixed bug where `tune_cluster()` would error if used with a recipe that contained non-predictor variables such as id variables. (#124)

* Cluster reordering is now done at fitting time, not at extraction and prediction time. (#154)
## Breaking Changes

* Engine-specific documentation has been added for all models and engines. (#159)

* Fixed bug where levels didn't match the number of clusters when predicting on fewer observations. (#158)
* Exported internal functions `ClusterR_kmeans_fit()`, `stats_kmeans_fit()`, and `hclust_fit()` have been renamed to `.k_means_fit_ClusterR()`, `.k_means_fit_stats()`, and `.hier_clust_fit_stats()` to reduce visibility for users.

* Fixed bug where `tune_cluster()` would error if used with a recipe that contained non-predictor variables such as id variables. (#124)
* Cluster reordering is now done at fitting time, not at extraction and prediction time. (#154)
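
The `prefix` fix listed under Bug Fixes (#145) could be exercised roughly as follows. This is a minimal sketch rather than a test from the PR: the `"C_"` prefix and the object names are illustrative assumptions, and the fit mirrors the `mtcars` example used in the README.

```r
library(tidyclust)

set.seed(1234)

# Fit a 3-cluster k-means model with the default stats engine
kmeans_spec <- set_engine(k_means(num_clusters = 3), "stats")
kmeans_fit <- fit(kmeans_spec, ~ ., data = mtcars)

# `prefix` replaces the default "Cluster_" in the cluster labels
assignments <- extract_cluster_assignment(kmeans_fit, prefix = "C_")
levels(assignments$.cluster)
```

With the fix in place, the same `prefix` value should also be honoured by `extract_centroids()` and `predict()`.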

# tidyclust 0.1.2

1 change: 0 additions & 1 deletion R/engine_docs.R
@@ -3,7 +3,6 @@
#' Knit engine-specific documentation
#' @param pattern A regular expression to specify which files to knit. The
#' default knits all engine documentation files.
#' @param ... Options passed to [knitr::knit()].
#' @return A tibble with column `file` for the file name and `result` (a
#' character vector that echoes the output file name or, when there is
#' a failure, the error message).
2 changes: 1 addition & 1 deletion R/hier_clust.R
@@ -171,7 +171,7 @@ translate_tidyclust.hier_clust <- function(x, engine = x$engine, ...) {
#'
#' @param x matrix or data frame
#' @param num_clusters the number of clusters
#' @param h the height to cut the dendrogram
#' @param cut_height the height to cut the dendrogram
#' @param linkage_method the agglomeration method to be used. This should be (an
#' unambiguous abbreviation of) one of `"ward.D"`, `"ward.D2"`, `"single"`,
#' `"complete"`, `"average"` (= UPGMA), `"mcquitty"` (= WPGMA), `"median"` (=
28 changes: 14 additions & 14 deletions README.md
@@ -41,7 +41,7 @@ library(tidyclust)
set.seed(1234)

kmeans_spec <- k_means(num_clusters = 3) %>%
set_engine("stats")
set_engine("stats")

kmeans_spec
#> K Means Cluster Specification (partition)
@@ -60,38 +60,38 @@ kmeans_spec_fit <- kmeans_spec %>%
kmeans_spec_fit
#> tidyclust cluster object
#>
#> K-means clustering with 3 clusters of sizes 7, 14, 11
#> K-means clustering with 3 clusters of sizes 7, 11, 14
#>
#> Cluster means:
#> mpg cyl disp hp drat wt qsec vs
#> 1 19.74286 6 183.3143 122.28571 3.585714 3.117143 17.97714 0.5714286
#> 2 15.10000 8 353.1000 209.21429 3.229286 3.999214 16.77214 0.0000000
#> 3 26.66364 4 105.1364 82.63636 4.070909 2.285727 19.13727 0.9090909
#> 2 15.10000 8 353.1000 209.21429 3.229286 3.999214 16.77214 0.0000000
#> am gear carb
#> 1 0.4285714 3.857143 3.428571
#> 2 0.1428571 3.285714 3.500000
#> 3 0.7272727 4.090909 1.545455
#> 2 0.1428571 3.285714 3.500000
#>
#> Clustering vector:
#> Mazda RX4 Mazda RX4 Wag Datsun 710 Hornet 4 Drive
#> 1 1 3 1
#> 1 1 2 1
#> Hornet Sportabout Valiant Duster 360 Merc 240D
#> 2 1 2 3
#> 3 1 3 2
#> Merc 230 Merc 280 Merc 280C Merc 450SE
#> 3 1 1 2
#> 2 1 1 3
#> Merc 450SL Merc 450SLC Cadillac Fleetwood Lincoln Continental
#> 2 2 2 2
#> 3 3 3 3
#> Chrysler Imperial Fiat 128 Honda Civic Toyota Corolla
#> 2 3 3 3
#> Toyota Corona Dodge Challenger AMC Javelin Camaro Z28
#> 3 2 2 2
#> Pontiac Firebird Fiat X1-9 Porsche 914-2 Lotus Europa
#> Toyota Corona Dodge Challenger AMC Javelin Camaro Z28
#> 2 3 3 3
#> Pontiac Firebird Fiat X1-9 Porsche 914-2 Lotus Europa
#> 3 2 2 2
#> Ford Pantera L Ferrari Dino Maserati Bora Volvo 142E
#> 2 1 2 3
#> 3 1 3 2
#>
#> Within cluster sum of squares by cluster:
#> [1] 13954.34 93643.90 11848.37
#> [1] 13954.34 11848.37 93643.90
#> (between_SS / total_SS = 80.8 %)
#>
#> Available components:
@@ -132,7 +132,7 @@ extract_cluster_assignment(kmeans_spec_fit)
#> 8 Cluster_2
#> 9 Cluster_2
#> 10 Cluster_1
#> # … with 22 more rows
#> # 22 more rows
```

and `extract_centroids()` returns the locations of the clusters
2 changes: 1 addition & 1 deletion cran-comments.md
@@ -1,6 +1,6 @@
## Comments

Patch release to make sure `utils::packageVersion()` doesn't cause issues when package isn't available.
Release including fixes for all known bugs, new engines, and better documentation and error messages.

## R CMD check results

4 changes: 2 additions & 2 deletions man/dot-hier_clust_fit_stats.Rd

Some generated files are not rendered by default.