Commit 0afad3c

Update Fern docs guidelines and references
1 parent 4700779 commit 0afad3c

67 files changed

Lines changed: 2249 additions & 1835 deletions

docs/README.md

Lines changed: 2 additions & 0 deletions
@@ -2,6 +2,8 @@

 The cuVS documentation is a Fern project in [../fern](../fern).

+Fern requires Node.js 18 or newer. If the docs fail with an error such as `SyntaxError: Unexpected token '.'`, check `node --version` and activate a newer Node.js runtime.
+
 ## Preview locally

 ```bash

fern/README.md

Lines changed: 2 additions & 0 deletions
@@ -4,6 +4,8 @@ The cuVS documentation lives in this Fern project. Pages are in `fern/pages`, an

 The C, C++, Python, Java, Rust, and Go API reference pages are generated from the source tree by `fern/scripts/generate_api_reference.py`. `fern/build_docs.sh` refreshes those pages before validation, preview, and publish runs.

+Fern requires Node.js 18 or newer. If the docs fail with an error such as `SyntaxError: Unexpected token '.'`, check `node --version` and activate a newer Node.js runtime.
+
 ## Preview locally

 Start the local preview server from the repository root:

fern/build_docs.sh

Lines changed: 22 additions & 0 deletions
@@ -30,6 +30,28 @@ Examples:
 EOF
 }

+require_node_18() {
+  if ! command -v node >/dev/null 2>&1; then
+    echo "Fern docs require Node.js 18 or newer, but node was not found on PATH." >&2
+    echo "Install or activate Node.js 18+ before running fern/build_docs.sh." >&2
+    exit 1
+  fi
+
+  local node_version
+  local node_major
+  node_version=$(node -p 'process.versions.node' 2>/dev/null || true)
+  node_major="${node_version%%.*}"
+
+  if [[ ! "${node_major}" =~ ^[0-9]+$ || "${node_major}" -lt 18 ]]; then
+    echo "Fern docs require Node.js 18 or newer, but found Node.js ${node_version:-unknown}." >&2
+    echo "Older Node.js versions can fail with errors such as \"SyntaxError: Unexpected token '.'.\"" >&2
+    echo "Install or activate Node.js 18+ before running fern/build_docs.sh." >&2
+    exit 1
+  fi
+}
+
+require_node_18
+
 if [[ -n "${FERN_CLI:-}" ]]; then
   FERN_CMD=("${FERN_CLI}")
 elif command -v fern >/dev/null 2>&1; then
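
The `require_node_18` guard in this hunk takes the major version with `${node_version%%.*}` and rejects anything that is non-numeric or below 18. The following is a minimal standalone sketch of that comparison, not code from the commit: `version_at_least` is a hypothetical helper introduced here only for illustration. It also shows why the script reads `process.versions.node` instead of parsing `node --version`: the latter prints a leading `v` (for example `v18.19.0`), which would fail the numeric check.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not part of fern/build_docs.sh): succeeds when the
# major component of a dotted version string is numeric and >= min_major.
version_at_least() {
  local version="$1"
  local min_major="$2"
  # Drop everything from the first dot onward, leaving the major component.
  local major="${version%%.*}"
  # Reject empty or non-numeric majors before comparing numerically.
  [[ "${major}" =~ ^[0-9]+$ ]] && (( major >= min_major ))
}

# Sample inputs: an old release, the minimum, a "v"-prefixed string, and
# an empty string (what the guard sees if the node call fails).
for candidate in "16.20.2" "18.0.0" "v18.19.0" ""; do
  if version_at_least "${candidate}" 18; then
    echo "${candidate:-unknown}: accepted"
  else
    echo "${candidate:-unknown}: rejected"
  fi
done
```

Only `18.0.0` is accepted in this sketch. The empty-string case corresponds to the `|| true` fallback in the guard, where a failed `node -p` call leaves `node_version` empty and the script reports the version as `unknown`.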

fern/docs.yml

Lines changed: 7 additions & 3 deletions
@@ -2,7 +2,7 @@

 title: "cuVS"
 instances:
-- url: "nvidia-cuvs.docs.buildwithfern.com/cuvs"
+- url: "nvidia-cuvs.docs.buildwithfern.com/cuvs"
 custom-domain: docs/nvidia.com/cuvs
 footer: "./theme/nvidia/components/CustomFooter.tsx"
 logo:
@@ -159,8 +159,12 @@ navigation:
 path: "./pages/user_guide/integration_patterns.md"
 - section: "Developer Guide"
 contents:
-- page: "Guidelines"
-path: "./pages/developer_guide.md"
+- section: "Guidelines"
+contents:
+- page: "C++ Guidelines"
+path: "./pages/cpp_guidelines.md"
+- page: "Python Guidelines"
+path: "./pages/python_guidelines.md"
 - section: "Advanced Topics"
 path: "./pages/advanced_topics.md"
 contents:

fern/pages/build.md

Lines changed: 5 additions & 5 deletions
@@ -199,18 +199,18 @@ cuVS has the following configurable cmake flags available:

 ### Preview documentation

-The cuVS documentation is a Fern project in the repository's `fern` directory. Install the Fern CLI, then run the local preview from the repository root:
+The cuVS documentation is a Fern project in the repository's `fern` directory. Fern requires Node.js 18 or newer. If the docs fail with an error such as `SyntaxError: Unexpected token '.'`, check `node --version` and activate a newer Node.js runtime.
+
+Run the local preview from the repository root:

 ```bash
-npm install -g fern-api
-fern docs dev
+fern/build_docs.sh dev
 ```

 Fern serves the preview at [http://localhost:3000](http://localhost:3000) by default.

 Run the Fern checks before publishing documentation changes:

 ```bash
-fern check --warnings --strict-broken-links
-fern docs md check
+fern/build_docs.sh check
 ```

fern/pages/c_api/c-api-cluster-kmeans.md

Lines changed: 159 additions & 11 deletions
@@ -30,6 +30,8 @@ typedef enum { ... } cuvsKMeansInitMethod;

 Hyper-parameters for the kmeans algorithm

+NB: The inertia_check field is kept for ABI compatibility. Removed in cuvsKMeansParams_v2. TODO: CalVer for the replacement: 26.08
+
 ```c
 struct cuvsKMeansParams { ... };
 ```
@@ -46,10 +48,40 @@ struct cuvsKMeansParams { ... };
 | `oversampling_factor` | `double` | Oversampling factor for use in the k-means\|\| algorithm |
 | `batch_samples` | `int` | batch_samples and batch_centroids are used to tile 1NN computation which is useful to optimize/control the memory footprint Default tile is [batch_samples x n_clusters] i.e. when batch_centroids is 0 then don't tile the centroids |
 | `batch_centroids` | `int` | if 0 then batch_centroids = n_clusters |
-| `inertia_check` | `bool` | Check inertia during iterations for early convergence. |
+| `inertia_check` | `bool` | Deprecated, ignored. Kept for ABI compatibility. |
 | `hierarchical` | `bool` | Whether to use hierarchical (balanced) kmeans or not |
 | `hierarchical_n_iters` | `int` | For hierarchical k-means , defines the number of training iterations |
 | `streaming_batch_size` | `int64_t` | Number of samples to process per GPU batch for the batched (host-data) API. When set to 0, defaults to n_samples (process all at once). |
+| `init_size` | `int64_t` | Number of samples to draw for KMeansPlusPlus initialization. When set to 0, uses heuristic min(3 * n_clusters, n_samples) for host data, or n_samples for device data. |
+| `metric` | [`cuvsDistanceType`](/api-reference/c-api-distance-distance#cuvsdistancetype) | |
+
+<a id="cuvskmeansparams-v2"></a>
+### cuvsKMeansParams_v2
+
+Hyper-parameters for the kmeans algorithm
+
+TODO: Remove this after cuvsKMeansParams is replaced in ABI 2.0
+
+```c
+struct cuvsKMeansParams_v2 { ... };
+```
+
+**Fields**
+
+| Name | Type | Description |
+| --- | --- | --- |
+| `n_clusters` | `int` | The number of clusters to form as well as the number of centroids to generate (default:8). |
+| `init` | [`cuvsKMeansInitMethod`](/api-reference/c-api-cluster-kmeans#cuvskmeansinitmethod) | Method for initialization, defaults to k-means++:<br />- cuvsKMeansInitMethod::KMeansPlusPlus (k-means++): Use scalable k-means++ algorithm to select the initial cluster centers.<br />- cuvsKMeansInitMethod::Random (random): Choose 'n_clusters' observations (rows) at random from the input data for the initial centroids.<br />- cuvsKMeansInitMethod::Array (ndarray): Use 'centroids' as initial cluster centers. |
+| `max_iter` | `int` | Maximum number of iterations of the k-means algorithm for a single run. |
+| `tol` | `double` | Relative tolerance with regards to inertia to declare convergence. |
+| `n_init` | `int` | Number of instance k-means algorithm will be run with different seeds. |
+| `oversampling_factor` | `double` | Oversampling factor for use in the k-means\|\| algorithm |
+| `batch_samples` | `int` | batch_samples and batch_centroids are used to tile 1NN computation which is useful to optimize/control the memory footprint Default tile is [batch_samples x n_clusters] i.e. when batch_centroids is 0 then don't tile the centroids |
+| `batch_centroids` | `int` | if 0 then batch_centroids = n_clusters |
+| `hierarchical` | `bool` | Whether to use hierarchical (balanced) kmeans or not |
+| `hierarchical_n_iters` | `int` | For hierarchical k-means , defines the number of training iterations |
+| `streaming_batch_size` | `int64_t` | Number of samples to process per GPU batch for the batched (host-data) API. When set to 0, defaults to n_samples (process all at once). |
+| `init_size` | `int64_t` | Number of samples to draw for KMeansPlusPlus initialization. When set to 0, uses heuristic min(3 * n_clusters, n_samples) for host data, or n_samples for device data. |
 | `metric` | [`cuvsDistanceType`](/api-reference/c-api-distance-distance#cuvsdistancetype) | |

 <a id="cuvskmeansparamscreate"></a>
@@ -58,9 +90,11 @@ struct cuvsKMeansParams { ... };
 Allocate KMeans params, and populate with default values

 ```c
-cuvsError_t cuvsKMeansParamsCreate(cuvsKMeansParams_t* params);
+CUVS_EXPORT cuvsError_t cuvsKMeansParamsCreate(cuvsKMeansParams_t* params);
 ```

+replaced by cuvsKMeansParamsCreate_v2.
+
 **Parameters**

 | Name | Direction | Type | Description |
@@ -69,17 +103,19 @@ cuvsError_t cuvsKMeansParamsCreate(cuvsKMeansParams_t* params);

 **Returns**

-[`cuvsError_t`](/api-reference/c-api-core-c-api#cuvserror-t)
+[`CUVS_EXPORT cuvsError_t`](/api-reference/c-api-core-c-api#cuvserror-t)

 <a id="cuvskmeansparamsdestroy"></a>
 ### cuvsKMeansParamsDestroy

 De-allocate KMeans params

 ```c
-cuvsError_t cuvsKMeansParamsDestroy(cuvsKMeansParams_t params);
+CUVS_EXPORT cuvsError_t cuvsKMeansParamsDestroy(cuvsKMeansParams_t params);
 ```

+replaced by cuvsKMeansParamsDestroy_v2.
+
 **Parameters**

 | Name | Direction | Type | Description |
@@ -88,7 +124,47 @@ cuvsError_t cuvsKMeansParamsDestroy(cuvsKMeansParams_t params);

 **Returns**

-[`cuvsError_t`](/api-reference/c-api-core-c-api#cuvserror-t)
+[`CUVS_EXPORT cuvsError_t`](/api-reference/c-api-core-c-api#cuvserror-t)
+
+<a id="cuvskmeansparamscreate-v2"></a>
+### cuvsKMeansParamsCreate_v2
+
+Allocate KMeans params
+
+```c
+CUVS_EXPORT cuvsError_t cuvsKMeansParamsCreate_v2(cuvsKMeansParams_v2_t* params);
+```
+
+Mirrors cuvsKMeansParamsCreate but operates on cuvsKMeansParams_v2. Will become the unsuffixed cuvsKMeansParamsCreate in cuVS 26.08.
+
+**Parameters**
+
+| Name | Direction | Type | Description |
+| --- | --- | --- | --- |
+| `params` | in | [`cuvsKMeansParams_v2_t*`](/api-reference/c-api-cluster-kmeans#cuvskmeansparams-v2) | cuvsKMeansParams_v2_t to allocate |
+
+**Returns**
+
+[`CUVS_EXPORT cuvsError_t`](/api-reference/c-api-core-c-api#cuvserror-t)
+
+<a id="cuvskmeansparamsdestroy-v2"></a>
+### cuvsKMeansParamsDestroy_v2
+
+De-allocate KMeans params allocated by cuvsKMeansParamsCreate_v2.
+
+```c
+CUVS_EXPORT cuvsError_t cuvsKMeansParamsDestroy_v2(cuvsKMeansParams_v2_t params);
+```
+
+**Parameters**
+
+| Name | Direction | Type | Description |
+| --- | --- | --- | --- |
+| `params` | in | [`cuvsKMeansParams_v2_t`](/api-reference/c-api-cluster-kmeans#cuvskmeansparams-v2) | |
+
+**Returns**
+
+[`CUVS_EXPORT cuvsError_t`](/api-reference/c-api-core-c-api#cuvserror-t)

 <a id="cuvskmeanstype"></a>
 ### cuvsKMeansType
@@ -114,7 +190,7 @@ typedef enum { ... } cuvsKMeansType;
 Find clusters with k-means algorithm.

 ```c
-cuvsError_t cuvsKMeansFit(cuvsResources_t res,
+CUVS_EXPORT cuvsError_t cuvsKMeansFit(cuvsResources_t res,
 cuvsKMeansParams_t params,
 DLManagedTensor* X,
 DLManagedTensor* sample_weight,
@@ -127,6 +203,8 @@ Initial centroids are chosen with k-means++ algorithm. Empty clusters are reinit

 X may reside on either host (CPU) or device (GPU) memory. When X is on the host the data is streamed to the GPU in batches controlled by params-&gt;streaming_batch_size.

+replaced by cuvsKMeansFit_v2.
+
 **Parameters**

 | Name | Direction | Type | Description |
@@ -141,15 +219,48 @@ X may reside on either host (CPU) or device (GPU) memory. When X is on the host

 **Returns**

-[`cuvsError_t`](/api-reference/c-api-core-c-api#cuvserror-t)
+[`CUVS_EXPORT cuvsError_t`](/api-reference/c-api-core-c-api#cuvserror-t)
+
+<a id="cuvskmeansfit-v2"></a>
+### cuvsKMeansFit_v2
+
+Find clusters with k-means algorithm (v2 params layout).
+
+```c
+CUVS_EXPORT cuvsError_t cuvsKMeansFit_v2(cuvsResources_t res,
+cuvsKMeansParams_v2_t params,
+DLManagedTensor* X,
+DLManagedTensor* sample_weight,
+DLManagedTensor* centroids,
+double* inertia,
+int* n_iter);
+```
+
+Mirrors cuvsKMeansFit but takes cuvsKMeansParams_v2_t. Will become the unsuffixed cuvsKMeansFit in cuVS 26.08.
+
+**Parameters**
+
+| Name | Direction | Type | Description |
+| --- | --- | --- | --- |
+| `res` | in | [`cuvsResources_t`](/api-reference/c-api-core-c-api#cuvsresources-t) | opaque C handle |
+| `params` | in | [`cuvsKMeansParams_v2_t`](/api-reference/c-api-cluster-kmeans#cuvskmeansparams-v2) | Parameters for KMeans model (v2 layout). |
+| `X` | in | `DLManagedTensor*` | Training instances to cluster. The data must be in row-major format. May be on host or device memory. [dim = n_samples x n_features] |
+| `sample_weight` | in | `DLManagedTensor*` | Optional weights for each observation in X. Must be on the same memory space as X. [len = n_samples] |
+| `centroids` | inout | `DLManagedTensor*` | [in] When init is InitMethod::Array, use centroids as the initial cluster centers. [out] The generated centroids from the kmeans algorithm are stored at the address pointed by 'centroids'. Must be on device. [dim = n_clusters x n_features] |
+| `inertia` | out | `double*` | Sum of squared distances of samples to their closest cluster center. |
+| `n_iter` | out | `int*` | Number of iterations run. |
+
+**Returns**
+
+[`CUVS_EXPORT cuvsError_t`](/api-reference/c-api-core-c-api#cuvserror-t)

 <a id="cuvskmeanspredict"></a>
 ### cuvsKMeansPredict

 Predict the closest cluster each sample in X belongs to.

 ```c
-cuvsError_t cuvsKMeansPredict(cuvsResources_t res,
+CUVS_EXPORT cuvsError_t cuvsKMeansPredict(cuvsResources_t res,
 cuvsKMeansParams_t params,
 DLManagedTensor* X,
 DLManagedTensor* sample_weight,
@@ -159,6 +270,8 @@ bool normalize_weight,
 double* inertia);
 ```

+replaced by cuvsKMeansPredict_v2.
+
 **Parameters**

 | Name | Direction | Type | Description |
@@ -174,15 +287,50 @@ double* inertia);

 **Returns**

-[`cuvsError_t`](/api-reference/c-api-core-c-api#cuvserror-t)
+[`CUVS_EXPORT cuvsError_t`](/api-reference/c-api-core-c-api#cuvserror-t)
+
+<a id="cuvskmeanspredict-v2"></a>
+### cuvsKMeansPredict_v2
+
+Predict the closest cluster each sample in X belongs to (v2 params layout).
+
+```c
+CUVS_EXPORT cuvsError_t cuvsKMeansPredict_v2(cuvsResources_t res,
+cuvsKMeansParams_v2_t params,
+DLManagedTensor* X,
+DLManagedTensor* sample_weight,
+DLManagedTensor* centroids,
+DLManagedTensor* labels,
+bool normalize_weight,
+double* inertia);
+```
+
+Mirrors cuvsKMeansPredict but takes cuvsKMeansParams_v2_t. Will become the unsuffixed cuvsKMeansPredict in cuVS 26.08.
+
+**Parameters**
+
+| Name | Direction | Type | Description |
+| --- | --- | --- | --- |
+| `res` | in | [`cuvsResources_t`](/api-reference/c-api-core-c-api#cuvsresources-t) | opaque C handle |
+| `params` | in | [`cuvsKMeansParams_v2_t`](/api-reference/c-api-cluster-kmeans#cuvskmeansparams-v2) | Parameters for KMeans model (v2 layout). |
+| `X` | in | `DLManagedTensor*` | New data to predict. [dim = n_samples x n_features] |
+| `sample_weight` | in | `DLManagedTensor*` | Optional weights for each observation in X. [len = n_samples] |
+| `centroids` | in | `DLManagedTensor*` | Cluster centroids. The data must be in row-major format. [dim = n_clusters x n_features] |
+| `labels` | out | `DLManagedTensor*` | Index of the cluster each sample in X belongs to. [len = n_samples] |
+| `normalize_weight` | in | `bool` | True if the weights should be normalized |
+| `inertia` | out | `double*` | Sum of squared distances of samples to their closest cluster center. |
+
+**Returns**
+
+[`CUVS_EXPORT cuvsError_t`](/api-reference/c-api-core-c-api#cuvserror-t)

 <a id="cuvskmeansclustercost"></a>
 ### cuvsKMeansClusterCost

 Compute cluster cost

 ```c
-cuvsError_t cuvsKMeansClusterCost(cuvsResources_t res,
+CUVS_EXPORT cuvsError_t cuvsKMeansClusterCost(cuvsResources_t res,
 DLManagedTensor* X,
 DLManagedTensor* centroids,
 double* cost);
@@ -199,4 +347,4 @@ double* cost);

 **Returns**

-[`cuvsError_t`](/api-reference/c-api-core-c-api#cuvserror-t)
+[`CUVS_EXPORT cuvsError_t`](/api-reference/c-api-core-c-api#cuvserror-t)
