document details_hier_clust_stats

EmilHvitfeldt · EmilHvitfeldt · commit 022144027e48 · 2023-08-30T16:30:48.000-07:00
diff --git a/R/hier_clust.R b/R/hier_clust.R
@@ -5,6 +5,12 @@
 #' `hier_clust()` defines a model that fits clusters based on a distance-based
 #' dendrogram
 #'
+#' There are different ways to fit this model, and the method of estimation is
+#' chosen by setting the model engine. The engine-specific pages for this model
+#' are listed below.
+#'
+#' - \link[=details_hier_clust_stats]{stats}
+#'
 #' @param mode A single character string for the type of model. The only
 #'   possible value for this model is "partition".
 #' @param engine A single character string specifying what computational engine
@@ -23,7 +29,8 @@
 #' ## What does it mean to predict?
 #'
 #' To predict the cluster assignment for a new observation, we find the closest
-#' cluster. How we measure “closeness” is dependent on the specified type of linkage in the model:
+#' cluster. How we measure “closeness” is dependent on the specified type of
+#' linkage in the model:
 #'
 #' - *single linkage*: The new observation is assigned to the same cluster as
 #'   its nearest observation from the training data.
diff --git a/R/hier_clust_stats.R b/R/hier_clust_stats.R
@@ -0,0 +1,11 @@
+#' Hierarchical (Agglomerative) Clustering via stats
+#'
+#' [hier_clust()] creates a Hierarchical (Agglomerative) Clustering model.
+#'
+#' @includeRmd man/rmd/hier_clust_stats.md details
+#'
+#' @name details_hier_clust_stats
+#' @keywords internal
+NULL
+
+# See inst/README-DOCS.md for a description of how these files are processed
diff --git a/man/details_hier_clust_stats.Rd b/man/details_hier_clust_stats.Rd
diff --git a/man/hier_clust.Rd b/man/hier_clust.Rd
diff --git a/man/rmd/hier_clust_stats.Rmd b/man/rmd/hier_clust_stats.Rmd
@@ -0,0 +1,60 @@
+```{r, child = "aaa.Rmd", include = FALSE}
+```
+
+`r descr_models("hier_clust", "stats")`
+
+## Tuning Parameters
+
+```{r stats-param-info, echo = FALSE}
+defaults <- 
+  tibble::tibble(tidyclust = c("num_clusters"),
+                 default = c("no default"))
+
+param <-
+  hier_clust() %>% 
+  set_engine("stats") %>% 
+  set_mode("partition") %>% 
+  make_parameter_list(defaults)
+```
+
+This model has `r nrow(param)` tuning parameters:
+
+```{r stats-param-list, echo = FALSE, results = "asis"}
+param$item
+```
+
+## Translation from tidyclust to the original package (partition)
+
+```{r stats-cls}
+hier_clust(num_clusters = integer(1)) %>% 
+  set_engine("stats") %>% 
+  set_mode("partition") %>% 
+  translate_tidyclust()
+```
+
+## Preprocessing requirements
+
+```{r child = "template-makes-dummies.Rmd"}
+```
+
+## References
+
+- Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988). The New S Language. Wadsworth & Brooks/Cole. (S version.)
+
+- Everitt, B. (1974). Cluster Analysis. London: Heinemann Educ. Books.
+
+- Hartigan, J.A. (1975). Clustering Algorithms. New York: Wiley.
+
+- Sneath, P. H. A. and R. R. Sokal (1973). Numerical Taxonomy. San Francisco: Freeman.
+
+- Anderberg, M. R. (1973). Cluster Analysis for Applications. Academic Press: New York.
+
+- Gordon, A. D. (1999). Classification. Second Edition. London: Chapman and Hall / CRC.
+
+- Murtagh, F. (1985). “Multidimensional Clustering Algorithms”, in COMPSTAT Lectures 4. Wuerzburg: Physica-Verlag (for algorithmic details of algorithms used).
+
+- McQuitty, L.L. (1966). Similarity Analysis by Reciprocal Pairs for Discrete and Continuous Data. Educational and Psychological Measurement, 26, 825–831. doi:10.1177/001316446602600402.
+
+- Legendre, P. and L. Legendre (2012). Numerical Ecology, 3rd English ed. Amsterdam: Elsevier Science BV.
+
+- Murtagh, Fionn and Legendre, Pierre (2014). Ward's hierarchical agglomerative clustering method: which algorithms implement Ward's criterion? Journal of Classification, 31, 274–295. doi:10.1007/s00357-014-9161-z.
diff --git a/man/rmd/hier_clust_stats.md b/man/rmd/hier_clust_stats.md
@@ -0,0 +1,63 @@
+
+
+
+For this engine, there is a single mode: partition
+
+## Tuning Parameters
+
+
+
+This model has 1 tuning parameter:
+
+- `num_clusters`: # Clusters (type: integer, default: no default)
+
+## Translation from tidyclust to the original package (partition)
+
+
+```r
+hier_clust(num_clusters = integer(1)) %>% 
+  set_engine("stats") %>% 
+  set_mode("partition") %>% 
+  translate_tidyclust()
+```
+
+```
+## Hierarchical Clustering Specification (partition)
+## 
+## Main Arguments:
+##   num_clusters = integer(1)
+##   linkage_method = complete
+## 
+## Computational engine: stats 
+## 
+## Model fit template:
+## tidyclust::.hier_clust_fit_stats(data = missing_arg(), num_clusters = integer(1), 
+##     linkage_method = "complete")
+```
+
+## Preprocessing requirements
+
+
+Factor/categorical predictors need to be converted to numeric values (e.g., dummy or indicator variables) for this engine. When using the formula method via \\code{\\link[=fit.cluster_spec]{fit()}}, tidyclust will convert factor columns to indicators.
+
+## References
+
+- Becker, R. A., Chambers, J. M. and Wilks, A. R. (1988). The New S Language. Wadsworth & Brooks/Cole. (S version.)
+
+- Everitt, B. (1974). Cluster Analysis. London: Heinemann Educ. Books.
+
+- Hartigan, J.A. (1975). Clustering Algorithms. New York: Wiley.
+
+- Sneath, P. H. A. and R. R. Sokal (1973). Numerical Taxonomy. San Francisco: Freeman.
+
+- Anderberg, M. R. (1973). Cluster Analysis for Applications. Academic Press: New York.
+
+- Gordon, A. D. (1999). Classification. Second Edition. London: Chapman and Hall / CRC.
+
+- Murtagh, F. (1985). “Multidimensional Clustering Algorithms”, in COMPSTAT Lectures 4. Wuerzburg: Physica-Verlag (for algorithmic details of algorithms used).
+
+- McQuitty, L.L. (1966). Similarity Analysis by Reciprocal Pairs for Discrete and Continuous Data. Educational and Psychological Measurement, 26, 825–831. doi:10.1177/001316446602600402.
+
+- Legendre, P. and L. Legendre (2012). Numerical Ecology, 3rd English ed. Amsterdam: Elsevier Science BV.
+
+- Murtagh, Fionn and Legendre, Pierre (2014). Ward's hierarchical agglomerative clustering method: which algorithms implement Ward's criterion? Journal of Classification, 31, 274–295. doi:10.1007/s00357-014-9161-z.
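
Note on the translation and the single-linkage prediction rule documented above: the following is a minimal base R sketch of what a fit with `num_clusters` and `linkage_method` might look like, assuming the stats engine builds on `stats::dist()`, `stats::hclust()` and `stats::cutree()`. That is an assumption, not something this diff confirms, and the internals of `.hier_clust_fit_stats()` are not reproduced here. The helper names `fit_hier_sketch()` and `predict_single_sketch()` are hypothetical and exist only for illustration.

```r
# Hypothetical helper: cluster `data` hierarchically and cut into k groups.
# Assumes Euclidean distances; the real engine may differ.
fit_hier_sketch <- function(data, num_clusters, linkage_method = "complete") {
  tree <- stats::hclust(stats::dist(data), method = linkage_method)
  list(
    tree = tree,
    clusters = stats::cutree(tree, k = num_clusters),
    data = as.matrix(data)
  )
}

# Hypothetical helper: single-linkage style prediction, i.e. each new row
# gets the cluster of its nearest training observation.
predict_single_sketch <- function(fit, new_data) {
  new <- as.matrix(new_data)
  vapply(seq_len(nrow(new)), function(i) {
    # Euclidean distance from new row i to every training row
    d <- sqrt(rowSums(sweep(fit$data, 2, new[i, ])^2))
    unname(fit$clusters[which.min(d)])
  }, integer(1))
}

fit <- fit_hier_sketch(mtcars[, c("mpg", "disp")], num_clusters = 3,
                       linkage_method = "single")
table(fit$clusters)
predict_single_sketch(fit, mtcars[1:5, c("mpg", "disp")])
```

Predicting on rows that were part of the training data returns their own training assignments, since each such row is at distance zero from itself. Numeric columns are used directly here; factor columns would first need the dummy-variable conversion described under "Preprocessing requirements".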