docs/README.md (2 additions & 0 deletions)
@@ -2,6 +2,8 @@
The cuVS documentation is a Fern project in [../fern](../fern).

+Fern requires Node.js 18 or newer. If the docs fail with an error such as `SyntaxError: Unexpected token '.'`, check `node --version` and activate a newer Node.js runtime.
fern/README.md (2 additions & 0 deletions)
@@ -4,6 +4,8 @@ The cuVS documentation lives in this Fern project. Pages are in `fern/pages`, an
The C, C++, Python, Java, Rust, and Go API reference pages are generated from the source tree by `fern/scripts/generate_api_reference.py`. `fern/build_docs.sh` refreshes those pages before validation, preview, and publish runs.

+Fern requires Node.js 18 or newer. If the docs fail with an error such as `SyntaxError: Unexpected token '.'`, check `node --version` and activate a newer Node.js runtime.
+
## Preview locally

Start the local preview server from the repository root:
fern/pages/build.md (5 additions & 5 deletions)
@@ -199,18 +199,18 @@ cuVS has the following configurable cmake flags available:
### Preview documentation

-The cuVS documentation is a Fern project in the repository's `fern` directory. Install the Fern CLI, then run the local preview from the repository root:
+The cuVS documentation is a Fern project in the repository's `fern` directory. Fern requires Node.js 18 or newer. If the docs fail with an error such as `SyntaxError: Unexpected token '.'`, check `node --version` and activate a newer Node.js runtime.
+
+Run the local preview from the repository root:

```bash
-npm install -g fern-api
-fern docs dev
+fern/build_docs.sh dev
```

Fern serves the preview at [http://localhost:3000](http://localhost:3000) by default.

Run the Fern checks before publishing documentation changes:
|`oversampling_factor`|`double`| Oversampling factor for use in the k-means\|\| algorithm |
|`batch_samples`|`int`| batch_samples and batch_centroids are used to tile 1NN computation which is useful to optimize/control the memory footprint Default tile is [batch_samples x n_clusters] i.e. when batch_centroids is 0 then don't tile the centroids |
|`batch_centroids`|`int`| if 0 then batch_centroids = n_clusters |
-|`inertia_check`|`bool`|Check inertia during iterations for early convergence. |
+|`inertia_check`|`bool`|Deprecated, ignored. Kept for ABI compatibility. |
|`hierarchical`|`bool`| Whether to use hierarchical (balanced) kmeans or not |
|`hierarchical_n_iters`|`int`| For hierarchical k-means , defines the number of training iterations |
|`streaming_batch_size`|`int64_t`| Number of samples to process per GPU batch for the batched (host-data) API. When set to 0, defaults to n_samples (process all at once). |
+|`init_size`|`int64_t`| Number of samples to draw for KMeansPlusPlus initialization. When set to 0, uses heuristic min(3 * n_clusters, n_samples) for host data, or n_samples for device data. |
TODO: Remove this after cuvsKMeansParams is replaced in ABI 2.0
+
+```c
+struct cuvsKMeansParams_v2 { ... };
+```
+
+**Fields**
+
+| Name | Type | Description |
+| --- | --- | --- |
+| `n_clusters` | `int` | The number of clusters to form as well as the number of centroids to generate (default:8). |
+| `init` | [`cuvsKMeansInitMethod`](/api-reference/c-api-cluster-kmeans#cuvskmeansinitmethod) | Method for initialization, defaults to k-means++:<br />- cuvsKMeansInitMethod::KMeansPlusPlus (k-means++): Use scalable k-means++ algorithm to select the initial cluster centers.<br />- cuvsKMeansInitMethod::Random (random): Choose 'n_clusters' observations (rows) at random from the input data for the initial centroids.<br />- cuvsKMeansInitMethod::Array (ndarray): Use 'centroids' as initial cluster centers. |
+| `max_iter` | `int` | Maximum number of iterations of the k-means algorithm for a single run. |
+| `tol` | `double` | Relative tolerance with regards to inertia to declare convergence. |
+| `n_init` | `int` | Number of instance k-means algorithm will be run with different seeds. |
+| `oversampling_factor` | `double` | Oversampling factor for use in the k-means\|\| algorithm |
+| `batch_samples` | `int` | batch_samples and batch_centroids are used to tile 1NN computation which is useful to optimize/control the memory footprint Default tile is [batch_samples x n_clusters] i.e. when batch_centroids is 0 then don't tile the centroids |
+| `batch_centroids` | `int` | if 0 then batch_centroids = n_clusters |
+| `hierarchical` | `bool` | Whether to use hierarchical (balanced) kmeans or not |
+| `hierarchical_n_iters` | `int` | For hierarchical k-means , defines the number of training iterations |
+| `streaming_batch_size` | `int64_t` | Number of samples to process per GPU batch for the batched (host-data) API. When set to 0, defaults to n_samples (process all at once). |
+| `init_size` | `int64_t` | Number of samples to draw for KMeansPlusPlus initialization. When set to 0, uses heuristic min(3 * n_clusters, n_samples) for host data, or n_samples for device data. |
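
The `init_size` default in the table above can be sketched in plain C, assuming the rule is exactly as documented. `resolve_init_size` and its `data_on_host` flag are hypothetical names used only for illustration and are not part of the cuVS API. Whether cuVS clamps an explicit nonzero `init_size` to `n_samples` is not stated here, so the sketch returns it unchanged.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical helper (not a cuVS function): number of samples drawn for
 * k-means++ initialization, following the documented init_size rule. */
int64_t resolve_init_size(int64_t init_size, int n_clusters,
                          int64_t n_samples, bool data_on_host)
{
  assert(init_size >= 0 && n_clusters > 0 && n_samples > 0);

  if (init_size != 0) {
    /* Explicit value: returned as given (no clamping is assumed). */
    return init_size;
  }
  if (!data_on_host) {
    /* Device data: the documented default is all of n_samples. */
    return n_samples;
  }
  /* Host data: documented heuristic min(3 * n_clusters, n_samples). */
  int64_t heuristic = 3 * (int64_t)n_clusters;
  return heuristic < n_samples ? heuristic : n_samples;
}
```

Under these assumptions, `resolve_init_size(0, 8, 100, true)` evaluates to 24, and the same call with `data_on_host` set to `false` evaluates to 100.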
@@ -127,6 +203,8 @@ Initial centroids are chosen with k-means++ algorithm. Empty clusters are reinit
X may reside on either host (CPU) or device (GPU) memory. When X is on the host the data is streamed to the GPU in batches controlled by params->streaming_batch_size.
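
As a rough illustration of the host-streaming behavior described above, the following C sketch counts how many batches a host-resident X would be split into. Both function names are hypothetical and not part of cuVS. The sketch assumes that the final batch may be shorter than the others and that a batch size larger than `n_samples` yields a single batch; the source only states that 0 defaults to `n_samples`.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper (not a cuVS function): documented default,
 * where streaming_batch_size == 0 means "process all samples at once". */
int64_t resolve_streaming_batch_size(int64_t streaming_batch_size,
                                     int64_t n_samples)
{
  assert(streaming_batch_size >= 0 && n_samples > 0);
  return streaming_batch_size == 0 ? n_samples : streaming_batch_size;
}

/* Hypothetical helper: number of host-to-GPU batches, assuming the last
 * batch is allowed to hold fewer rows (ceiling division). */
int64_t count_streaming_batches(int64_t streaming_batch_size,
                                int64_t n_samples)
{
  int64_t batch = resolve_streaming_batch_size(streaming_batch_size, n_samples);
  return (n_samples + batch - 1) / batch;
}
```

With 1000 samples, a `streaming_batch_size` of 256 gives four batches under these assumptions, while 0 gives one.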

+replaced by cuvsKMeansFit_v2.
+
**Parameters**

| Name | Direction | Type | Description |
@@ -141,15 +219,48 @@ X may reside on either host (CPU) or device (GPU) memory. When X is on the host
Mirrors cuvsKMeansFit but takes cuvsKMeansParams_v2_t. Will become the unsuffixed cuvsKMeansFit in cuVS 26.08.
+
+**Parameters**
+
+| Name | Direction | Type | Description |
+| --- | --- | --- | --- |
+|`res`| in |[`cuvsResources_t`](/api-reference/c-api-core-c-api#cuvsresources-t)| opaque C handle |
+|`params`| in |[`cuvsKMeansParams_v2_t`](/api-reference/c-api-cluster-kmeans#cuvskmeansparams-v2)| Parameters for KMeans model (v2 layout). |
+|`X`| in |`DLManagedTensor*`| Training instances to cluster. The data must be in row-major format. May be on host or device memory. [dim = n_samples x n_features]|
+|`sample_weight`| in |`DLManagedTensor*`| Optional weights for each observation in X. Must be on the same memory space as X. [len = n_samples]|
+|`centroids`| inout |`DLManagedTensor*`|[in] When init is InitMethod::Array, use centroids as the initial cluster centers. [out] The generated centroids from the kmeans algorithm are stored at the address pointed by 'centroids'. Must be on device. [dim = n_clusters x n_features]|
+|`inertia`| out |`double*`| Sum of squared distances of samples to their closest cluster center. |
+|`n_iter`| out |`int*`| Number of iterations run. |
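
To make the `inertia` output and the row-major layout concrete, here is a small CPU reference in C. It is a sketch for checking values by hand, not the cuVS implementation: `compute_inertia` is a hypothetical name, it works on plain `float` arrays rather than `DLManagedTensor`, and it ignores `sample_weight` (every sample is assumed to have weight 1).

```c
#include <assert.h>
#include <float.h>
#include <stddef.h>

/* Hypothetical CPU reference (not a cuVS function): sum over samples of the
 * squared Euclidean distance to the closest centroid.
 * X is row-major [n_samples x n_features];
 * centroids is row-major [n_clusters x n_features]. */
double compute_inertia(const float *X, const float *centroids,
                       size_t n_samples, size_t n_clusters, size_t n_features)
{
  assert(X != NULL && centroids != NULL && n_clusters > 0);

  double inertia = 0.0;
  for (size_t i = 0; i < n_samples; ++i) {
    double best = DBL_MAX; /* smallest squared distance seen for sample i */
    for (size_t c = 0; c < n_clusters; ++c) {
      double dist = 0.0;
      for (size_t f = 0; f < n_features; ++f) {
        /* Row-major indexing: element (row, f) lives at row * n_features + f. */
        double diff = (double)X[i * n_features + f] -
                      (double)centroids[c * n_features + f];
        dist += diff * diff;
      }
      if (dist < best) {
        best = dist;
      }
    }
    inertia += best;
  }
  return inertia;
}
```

For three 2-D samples (0,0), (1,0), (10,0) and centroids (0,0), (10,0), only the middle sample is off-center, by a squared distance of 1, so the function returns 1.0.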