11 changes: 11 additions & 0 deletions fern/docs.yml
Original file line number Diff line number Diff line change
@@ -114,6 +114,17 @@ navigation:
path: "./pages/api_guide.md"
collapsed: true
contents:
- section: "Core Types"
path: "./pages/core_types.md"
contents:
- page: "Memory Management"
path: "./pages/other/memory_management.md"
- page: "Multi-dim Arrays"
path: "./pages/other/multidimensional_arrays.md"
- page: "Multi-GPU"
path: "./pages/other/multi_gpu.md"
- page: "Resources"
path: "./pages/other/resources.md"
- section: "Clustering Guide"
path: "./pages/cluster/index.md"
contents:
8 changes: 4 additions & 4 deletions fern/pages/advanced_topics.md
@@ -4,10 +4,10 @@ slug: developer-guide/advanced-topics

# Advanced Topics

Use these pages when working on specialized cuVS development topics that need lower-level implementation context, runtime behavior, or platform-specific guidance.
Use these pages when working on specialized NVIDIA cuVS development topics that need lower-level implementation context, runtime behavior, or platform-specific guidance.

## Topic Guides

- [ABI Stability](../developer_guide/abi_stability.md): understand ABI expectations for developer-facing APIs and downstream integrations.
- [JIT Compilation](jit_compilation.md): understand when cuVS triggers just-in-time compilation, how caches behave, and how to warm up JIT-compiled kernels.
- [Link-time Optimization](jit_lto_guide.md): use JIT LTO for CUDA compilation, fragment generation, and runtime linking workflows.
- [ABI Stability](/developer-guide/advanced-topics/abi-stability): understand ABI expectations for developer-facing APIs and downstream integrations.
- [JIT Compilation](/developer-guide/advanced-topics/jit-compilation): understand when NVIDIA cuVS triggers just-in-time compilation, how caches behave, and how to warm up JIT-compiled kernels.
- [Link-time Optimization](/developer-guide/advanced-topics/link-time-optimization): use JIT LTO for CUDA compilation, fragment generation, and runtime linking workflows.
6 changes: 3 additions & 3 deletions fern/pages/api_basics.md
@@ -1,11 +1,11 @@
# cuVS API Basics
# NVIDIA cuVS API Basics

- [Memory management](#memory-management)
- [Resource management](#resource-management)

## Memory management

Centralized memory management allows flexible configuration of allocation strategies, such as sharing the same CUDA memory pool across library boundaries. cuVS uses the [RMM](https://github.com/rapidsai/rmm) library, which eases the burden of configuring different allocation strategies globally across GPU-accelerated libraries.
Centralized memory management allows flexible configuration of allocation strategies, such as sharing the same CUDA memory pool across library boundaries. NVIDIA cuVS uses the [RMM](https://github.com/rapidsai/rmm) library, which eases the burden of configuring different allocation strategies globally across GPU-accelerated libraries.

RMM currently has APIs for C++ and Python.

@@ -38,7 +38,7 @@ rmm.mr.set_current_device_resource(pool)

## Resource management

cuVS uses an API from the [RAFT](https://github.com/rapidsai/raft) library of ML and data mining primitives to centralize and reuse expensive resources, such as memory management. The below code examples demonstrate how to create these resources for use throughout this guide.
NVIDIA cuVS uses an API from the [RAFT](https://github.com/rapidsai/raft) library of ML and data mining primitives to centralize and reuse expensive resources, such as memory management. The below code examples demonstrate how to create these resources for use throughout this guide.

See RAFT's [resource API documentation](https://docs.rapids.ai/api/raft/nightly/cpp_api/core_resources/) for more information.

46 changes: 27 additions & 19 deletions fern/pages/api_guide.md
@@ -4,34 +4,42 @@ slug: user-guide/api-guides

# API Guide

Use these pages to find task-focused cuVS API examples for clustering, vector indexing, preprocessing, and supporting routines.
Use these pages to find task-focused NVIDIA cuVS API examples for clustering, vector indexing, preprocessing, and supporting routines.

## Core Types

- [Memory Management](/user-guide/api-guides/core-types/memory-management): configure RMM device, pool, pinned host, host, and managed memory resources for NVIDIA cuVS workflows.
- [Multi-dim Arrays](/user-guide/api-guides/core-types/multi-dim-arrays): use RAFT `mdspan`, `span`, and `mdarray` types with NVIDIA cuVS C++ APIs.
- [Multi-GPU](/user-guide/api-guides/core-types/multi-gpu): initialize multi-GPU resources and understand RAFT/NCCL communication setup.
- [Resources](/user-guide/api-guides/core-types/resources): reuse CUDA streams, library handles, stream pools, and workspace resources across NVIDIA cuVS calls.

## Clustering Guide

- [K-Means](cluster/kmeans.md): partition vectors into a fixed number of clusters, often as part of scalable vector-search systems.
- [Single-linkage](cluster/single_linkage.md): build hierarchical clusters from nearest-neighbor relationships.
- [Spectral Clustering](cluster/spectral.md): use graph structure and spectral methods to identify clusters with more complex shapes.
- [K-Means](/user-guide/api-guides/clustering-guide/k-means): partition vectors into a fixed number of clusters, often as part of scalable vector-search systems.
- [Single-linkage](/user-guide/api-guides/clustering-guide/single-linkage): build hierarchical clusters from nearest-neighbor relationships.
- [Spectral Clustering](/user-guide/api-guides/clustering-guide/spectral-clustering): use graph structure and spectral methods to identify clusters with more complex shapes.

## Indexing Guide

- [Brute-force](neighbors/bruteforce.md): run exact nearest-neighbor search by comparing each query with every vector.
- [CAGRA](neighbors/cagra.md): build and search GPU-optimized graph indexes for high-throughput ANN search.
- [NN-Descent](neighbors/nn_descent.md): build approximate nearest-neighbor graphs with an iterative algorithm.
- [IVF-Flat](neighbors/ivfflat.md): partition vectors into inverted-file lists while storing full-precision vectors.
- [IVF-PQ](neighbors/ivfpq.md): combine inverted-file partitioning with product quantization for compact indexes.
- [ScaNN](neighbors/scann.md): combine partitioning, quantization, and refinement for high-quality approximate search.
- [Vamana](neighbors/vamana.md): build graph indexes for large-scale and disk-backed search workflows.
- [All-neighbors](neighbors/all_neighbors.md): compute all-neighbors graph structures.
- [Brute-force](/user-guide/api-guides/indexing-guide/brute-force): run exact nearest-neighbor search by comparing each query with every vector.
- [CAGRA](/user-guide/api-guides/indexing-guide/cagra): build and search GPU-optimized graph indexes for high-throughput ANN search.
- [NN-Descent](/user-guide/api-guides/indexing-guide/nn-descent): build approximate nearest-neighbor graphs with an iterative algorithm.
- [IVF-Flat](/user-guide/api-guides/indexing-guide/ivf-flat): partition vectors into inverted-file lists while storing full-precision vectors.
- [IVF-PQ](/user-guide/api-guides/indexing-guide/ivf-pq): combine inverted-file partitioning with product quantization for compact indexes.
- [ScaNN](/user-guide/api-guides/indexing-guide/sca-nn): combine partitioning, quantization, and refinement for high-quality approximate search.
- [Vamana](/user-guide/api-guides/indexing-guide/vamana): build graph indexes for large-scale and disk-backed search workflows.
- [All-neighbors](/user-guide/api-guides/indexing-guide/all-neighbors): compute all-neighbors graph structures.

## Preprocessing Guide

- [Binary Quantizer](preprocessing/binary_quantizer.md): compress vectors into binary representations for compact storage and fast comparisons.
- [PCA](preprocessing/pca.md): reduce dimensionality with a linear projection while preserving as much variance as possible.
- [Product Quantization](preprocessing/product_quantization.md): split vectors into subvectors and encode each part with compact codebooks.
- [Scalar Quantizer](preprocessing/scalar_quantizer.md): compress each vector dimension independently with scalar quantization.
- [Spectral Embedding](preprocessing/spectral_embedding.md): create lower-dimensional embeddings from graph structure.
- [Binary Quantizer](/user-guide/api-guides/preprocessing-guide/binary-quantizer): compress vectors into binary representations for compact storage and fast comparisons.
- [PCA](/user-guide/api-guides/preprocessing-guide/pca): reduce dimensionality with a linear projection while preserving as much variance as possible.
- [Product Quantization](/user-guide/api-guides/preprocessing-guide/product-quantization): split vectors into subvectors and encode each part with compact codebooks.
- [Scalar Quantizer](/user-guide/api-guides/preprocessing-guide/scalar-quantizer): compress each vector dimension independently with scalar quantization.
- [Spectral Embedding](/user-guide/api-guides/preprocessing-guide/spectral-embedding): create lower-dimensional embeddings from graph structure.

## Other APIs

- [Pairwise Distances](other/pairwise_distances.md): compute distances between vectors for analysis, validation, or algorithm building blocks.
- [K-selection](other/select_k.md): select the top `k` values or nearest candidates from larger result sets.
- [Dynamic Batching](/user-guide/api-guides/other-ap-is/dynamic-batching): collect many concurrent small ANN searches into larger GPU search batches.
- [K-selection](/user-guide/api-guides/other-ap-is/k-selection): select the top `k` values or nearest candidates from larger result sets.
- [Pairwise Distances](/user-guide/api-guides/other-ap-is/pairwise-distances): compute distances between vectors for analysis, validation, or algorithm building blocks.
6 changes: 3 additions & 3 deletions fern/pages/api_interoperability.md
@@ -34,11 +34,11 @@ dataset_tensor.dl_tensor.strides = nullptr;
cuvsRMMFree(dataset_dev);
```

Please refer to [cuVS C API documentation](/api-reference/c-api-documentation) to learn more.
Please refer to [NVIDIA cuVS C API documentation](/api-reference/c-api-documentation) to learn more.

## Multi-dimensional span (C++)

cuVS is built on top of the GPU-accelerated machine learning and data mining primitives in the [RAFT](https://github.com/rapidsai/raft) library. Most of the C++ APIs in cuVS accept [mdspan](https://arxiv.org/abs/2010.06474) multi-dimensional array view for representing data in higher dimensions similar to the `ndarray` in the Numpy Python library. RAFT also contains the corresponding owning `mdarray` structure, which simplifies the allocation and management of multi-dimensional data in both host and device (GPU) memory.
NVIDIA cuVS is built on top of the GPU-accelerated machine learning and data mining primitives in the [RAFT](https://github.com/rapidsai/raft) library. Most of the C++ APIs in NVIDIA cuVS accept [mdspan](https://arxiv.org/abs/2010.06474) multi-dimensional array view for representing data in higher dimensions similar to the `ndarray` in the Numpy Python library. RAFT also contains the corresponding owning `mdarray` structure, which simplifies the allocation and management of multi-dimensional data in both host and device (GPU) memory.

The `mdarray` is an owning object that forms a convenience layer over RMM and can be constructed in RAFT using a number of different helper functions:

@@ -102,4 +102,4 @@ Please refer to RAFT's [mdspan documentation](https://docs.rapids.ai/api/raft/st

The Python APIs accept objects that expose the CUDA Array Interface, which enables interoperability with GPU array libraries such as CuPy, Numba, PyTorch, and TensorFlow without copying device memory when compatible layouts are used.

See [Using APIs in Python](working_with_ann_indexes.md#python) for examples of building and searching cuVS indexes from Python arrays.
See [Using APIs in Python](/getting-started/using-cu-vs-ap-is#python) for examples of building and searching NVIDIA cuVS indexes from Python arrays.
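
The zero-copy behavior described above depends on the metadata an array publishes through `__cuda_array_interface__`. The following is a minimal sketch of how that metadata can be inspected to tell whether a layout is C-contiguous (row-major). It is not how NVIDIA cuVS itself validates inputs: `FakeDeviceArray`, `describe_device_array`, and the placeholder pointer are hypothetical names used only for illustration, so the sketch runs without a GPU. The dictionary keys (`shape`, `typestr`, `data`, `strides`, `version`) follow version 3 of the CUDA Array Interface.

```python
from math import prod


def itemsize_from_typestr(typestr):
    """Return the element size in bytes from a typestr such as '<f4'."""
    # Layout: byte-order character, type-kind character, then size in bytes.
    return int(typestr[2:])


def c_contiguous_strides(shape, itemsize):
    """Compute the byte strides a C-contiguous (row-major) array would have."""
    strides = []
    step = itemsize
    for extent in reversed(shape):
        strides.append(step)
        step *= extent
    return tuple(reversed(strides))


def describe_device_array(obj):
    """Summarize __cuda_array_interface__ metadata without touching device memory."""
    iface = obj.__cuda_array_interface__
    shape = tuple(iface["shape"])
    itemsize = itemsize_from_typestr(iface["typestr"])
    strides = iface.get("strides")  # None means C-contiguous per the protocol
    expected = c_contiguous_strides(shape, itemsize)
    return {
        "shape": shape,
        "nbytes": prod(shape) * itemsize,
        "device_pointer": iface["data"][0],
        "read_only": iface["data"][1],
        "c_contiguous": strides is None or tuple(strides) == expected,
    }


class FakeDeviceArray:
    """Hypothetical stand-in for a CuPy or Numba array; the pointer is not real."""

    def __init__(self, shape, typestr, strides=None):
        self.__cuda_array_interface__ = {
            "shape": shape,
            "typestr": typestr,
            "data": (0xDEADBEEF, False),  # (device pointer, read-only flag)
            "strides": strides,
            "version": 3,
        }


# 1000 float32 vectors of dimension 128, row-major and column-major.
row_major = describe_device_array(FakeDeviceArray((1000, 128), "<f4"))
col_major = describe_device_array(
    FakeDeviceArray((1000, 128), "<f4", strides=(4, 4000))
)
```

Here `row_major["c_contiguous"]` is `True` and `col_major["c_contiguous"]` is `False`. A consumer that expects row-major data would typically need to copy or reject the second array, which is the case the "compatible layouts" caveat refers to.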
4 changes: 2 additions & 2 deletions fern/pages/benchmarking_guide.md
@@ -6,5 +6,5 @@ slug: user-guide/benchmarking-guide

Use these guides to compare vector-search algorithms fairly and run reproducible benchmark workflows with cuVS Bench.

- [Methodologies](comparing_indexes.md): compare indexes using quality buckets, Pareto curves, and consistent build and search metrics.
- [cuVS Bench Tool](cuvs_bench/index.md): use cuVS Bench to run reproducible benchmark workflows, prepare datasets, and extend benchmark backends.
- [Methodologies](/user-guide/benchmarking-guide/methodologies): compare indexes using quality buckets, Pareto curves, and consistent build and search metrics.
- [cuVS Bench Tool](/user-guide/benchmarking-guide/cu-vs-bench-tool): use cuVS Bench to run reproducible benchmark workflows, prepare datasets, and extend benchmark backends.
6 changes: 3 additions & 3 deletions fern/pages/build.md
@@ -15,9 +15,9 @@ Source builds and package installs require a supported NVIDIA GPU. For current s
- [C](/installation/c): install or build the C API and `libcuvs_c`.
- [C++](/installation/cpp): install or build the C++ headers and `libcuvs`.
- [Python](/installation/python): install Python wheels or conda packages, or build the Python package from source.
- [Java](/installation/java): build the Java API and connect it to matching native cuVS libraries.
- [Go](/installation/go): install the Go module and configure CGO against native cuVS libraries.
- [Rust](/installation/rust): install the Rust crate and configure native cuVS dependencies.
- [Java](/installation/java): build the Java API and connect it to matching native NVIDIA cuVS libraries.
- [Go](/installation/go): install the Go module and configure CGO against native NVIDIA cuVS libraries.
- [Rust](/installation/rust): install the Rust crate and configure native NVIDIA cuVS dependencies.

## Build From Source

22 changes: 11 additions & 11 deletions fern/pages/c_guidelines.md
@@ -4,19 +4,19 @@ slug: developer-guide/coding-guidelines/c-guidelines

# C Guidelines

This page collects the engineering conventions that keep cuVS C APIs stable, predictable, and easy to use from downstream projects and language bindings. Start with the [Contributor Guide](/developer-guide/contributing), then use this page when designing public C APIs, C wrappers, or C-facing documentation.
This page collects the engineering conventions that keep NVIDIA cuVS C APIs stable, predictable, and easy to use from downstream projects and language bindings. Start with the [Contributor Guide](/developer-guide/contributing), then use this page when designing public C APIs, C wrappers, or C-facing documentation.

## Local Development

Most C API changes can be developed directly in this repository. Cross-project work may also require a local RAFT build or a downstream project that consumes the installed cuVS C headers and shared libraries.
Most C API changes can be developed directly in this repository. Cross-project work may also require a local RAFT build or a downstream project that consumes the installed NVIDIA cuVS C headers and shared libraries.

If source builds are not being used, install the local cuVS C artifacts into the consuming project's environment before testing the downstream change.
If source builds are not being used, install the local NVIDIA cuVS C artifacts into the consuming project's environment before testing the downstream change.

## Public Interface

### General Guidelines

Public C APIs should be thin, ABI-stable wrappers around cuVS implementation code. Keep C headers free of C++ types, templates, namespaces, exceptions, and RAII-only ownership patterns.
Public C APIs should be thin, ABI-stable wrappers around NVIDIA cuVS implementation code. Keep C headers free of C++ types, templates, namespaces, exceptions, and RAII-only ownership patterns.

Expose only C-compatible types:

@@ -29,7 +29,7 @@ Prefer explicit create and destroy functions for every opaque object that owns m

### API Stability

The C API is the stable boundary used by downstream integrations and cuVS language bindings. Add new functions or fields before removing old ones, avoid changing the meaning of existing parameters, and keep [ABI compatibility](../developer_guide/abi_stability.md) in mind when changing public structs or exported symbols.
The C API is the stable boundary used by downstream integrations and NVIDIA cuVS language bindings. Add new functions or fields before removing old ones, avoid changing the meaning of existing parameters, and keep [ABI compatibility](/developer-guide/advanced-topics/abi-stability) in mind when changing public structs or exported symbols.

### Stateless C APIs

@@ -78,7 +78,7 @@ cuvsError_t cuvsIvfPqDeserialize(cuvsResources_t res,

### Performance

Keep C wrappers thin. Validate inputs and translate handles at the boundary, but leave expensive work in the underlying cuVS implementation.
Keep C wrappers thin. Validate inputs and translate handles at the boundary, but leave expensive work in the underlying NVIDIA cuVS implementation.

Avoid hidden host-device copies and hidden synchronization. If a wrapper needs to synchronize, document that behavior clearly.

@@ -90,7 +90,7 @@ Avoid mutable process-wide state in C wrappers. If shared state is unavoidable,

### Asynchronous Operations And Stream Ordering

C APIs should preserve the stream-ordering behavior of the underlying cuVS implementation. Do not add hidden synchronization only to simplify wrapper code.
C APIs should preserve the stream-ordering behavior of the underlying NVIDIA cuVS implementation. Do not add hidden synchronization only to simplify wrapper code.

When a C function accepts `cuvsResources_t`, use the stream and resources associated with that handle. Work queued by the caller before the API should complete before internal work starts, and work queued by the caller after the API returns should wait for internal work that affects the result.

@@ -131,13 +131,13 @@ Single-GPU C APIs should not require communication libraries or multi-GPU setup.

C APIs may call implementations that use JIT link-time optimization, but the C wrapper should not duplicate JIT LTO policy or expose C++ implementation details. Keep runtime behavior documented at the API level when JIT compilation can affect first-call latency or cache behavior.

For runtime and cache behavior, see [JIT Compilation](jit_compilation.md). For implementation guidance, see [Link-time Optimization](jit_lto_guide.md).
For runtime and cache behavior, see [JIT Compilation](/developer-guide/advanced-topics/jit-compilation). For implementation guidance, see [Link-time Optimization](/developer-guide/advanced-topics/link-time-optimization).

## Coding Style

### Formatting

cuVS uses [pre-commit](https://pre-commit.com/) to run formatting, linting, spelling, and copyright checks. Install it with conda:
NVIDIA cuVS uses [pre-commit](https://pre-commit.com/) to run formatting, linting, spelling, and copyright checks. Install it with conda:

```bash
conda install -c conda-forge pre-commit
@@ -163,7 +163,7 @@ pre-commit install

### Core Hooks

C headers and C wrapper implementation files are checked by the same formatting, spelling, Doxygen, and copyright hooks used by the rest of cuVS.
C headers and C wrapper implementation files are checked by the same formatting, spelling, Doxygen, and copyright hooks used by the rest of NVIDIA cuVS.

Run Doxygen checks for public C API documentation:

@@ -179,7 +179,7 @@ codespell -i 3 -w .

### Include Style

Use `#include <cuvs/...>` for public cuVS C headers. Keep public C headers minimal and avoid including private C++ implementation headers from the public C interface.
Use `#include <cuvs/...>` for public NVIDIA cuVS C headers. Keep public C headers minimal and avoid including private C++ implementation headers from the public C interface.

### Copyright
