2 changes: 1 addition & 1 deletion CHANGELOG.md
@@ -576,7 +576,7 @@
- Vendor RAPIDS.cmake ([#816](https://github.com/rapidsai/cuvs/pull/816)) [@bdice](https://github.com/bdice)
- Update libcuvs libraft ver to 25.06 in conda env ([#808](https://github.com/rapidsai/cuvs/pull/808)) [@jinsolp](https://github.com/jinsolp)
- Moving NN Descent class and struct declarations to `nn_descent_gnnd.hpp` ([#803](https://github.com/rapidsai/cuvs/pull/803)) [@jinsolp](https://github.com/jinsolp)
- - Remove `[@rapidsai/cuvs-build-codeowners` ([#783](https://github.com/rapidsai/cuvs/pull/783)) @KyleFromNVIDIA](https://github.com/rapidsai/cuvs-build-codeowners` ([#783](https://github.com/rapidsai/cuvs/pull/783)) @KyleFromNVIDIA)
+ - Remove @rapidsai/cuvs-build-codeowners ([#783](https://github.com/rapidsai/cuvs/pull/783)) [@KyleFromNVIDIA](https://github.com/KyleFromNVIDIA)
- Moving wheel builds to specified location and uploading build artifacts to Github ([#777](https://github.com/rapidsai/cuvs/pull/777)) [@VenkateshJaya](https://github.com/VenkateshJaya)
- Remove unused raft cagra header in add_nodes.cuh ([#741](https://github.com/rapidsai/cuvs/pull/741)) [@jiangyinzuo](https://github.com/jiangyinzuo)
- Expose kmeans to python ([#729](https://github.com/rapidsai/cuvs/pull/729)) [@benfred](https://github.com/benfred)
2 changes: 1 addition & 1 deletion ci/release/update-version.sh
@@ -146,7 +146,7 @@ elif [[ "${RUN_CONTEXT}" == "release" ]]; then
fi

# Update cuvs-bench Docker image references (version-only, not branch-related)
- sed_runner "s|rapidsai/cuvs-bench:[0-9][0-9].[0-9][0-9]|rapidsai/cuvs-bench:${NEXT_SHORT_TAG}|g" docs/source/cuvs_bench/index.rst
+ sed_runner "s|rapidsai/cuvs-bench:[0-9][0-9].[0-9][0-9]|rapidsai/cuvs-bench:${NEXT_SHORT_TAG}|g" docs/source/cuvs_bench/index.md

# Version references (not branch-related)
sed_runner "s|=[0-9][0-9].[0-9][0-9]|=${NEXT_SHORT_TAG}|g" README.md
3 changes: 1 addition & 2 deletions conda/environments/all_cuda-129_arch-aarch64.yaml
@@ -35,6 +35,7 @@ dependencies:
- libopenblas<=0.3.30
- librmm==26.6.*,>=0.0.0a0
- make
+ - myst-parser
- nccl>=2.19
- ninja
- numpy>=1.23,<3.0
@@ -45,12 +46,10 @@ dependencies:
- pytest
- pytest-cov
- rapids-build-backend>=0.4.0,<0.5.0
- - recommonmark
- rust
- scikit-build-core>=0.11.0
- scikit-learn>=1.5
- sphinx-copybutton
- - sphinx-markdown-tables
- sphinx>=8.0.0
- sysroot_linux-aarch64==2.28
- pip:
3 changes: 1 addition & 2 deletions conda/environments/all_cuda-129_arch-x86_64.yaml
@@ -34,6 +34,7 @@ dependencies:
- libnvjitlink-dev
- librmm==26.6.*,>=0.0.0a0
- make
+ - myst-parser
- nccl>=2.19
- ninja
- numpy>=1.23,<3.0
@@ -44,12 +45,10 @@ dependencies:
- pytest
- pytest-cov
- rapids-build-backend>=0.4.0,<0.5.0
- - recommonmark
- rust
- scikit-build-core>=0.11.0
- scikit-learn>=1.5
- sphinx-copybutton
- - sphinx-markdown-tables
- sphinx>=8.0.0
- sysroot_linux-64==2.28
- pip:
3 changes: 1 addition & 2 deletions conda/environments/all_cuda-131_arch-aarch64.yaml
@@ -35,6 +35,7 @@ dependencies:
- libopenblas<=0.3.30
- librmm==26.6.*,>=0.0.0a0
- make
+ - myst-parser
- nccl>=2.19
- ninja
- numpy>=1.23,<3.0
@@ -45,12 +46,10 @@ dependencies:
- pytest
- pytest-cov
- rapids-build-backend>=0.4.0,<0.5.0
- - recommonmark
- rust
- scikit-build-core>=0.11.0
- scikit-learn>=1.5
- sphinx-copybutton
- - sphinx-markdown-tables
- sphinx>=8.0.0
- sysroot_linux-aarch64==2.28
- pip:
3 changes: 1 addition & 2 deletions conda/environments/all_cuda-131_arch-x86_64.yaml
@@ -34,6 +34,7 @@ dependencies:
- libnvjitlink-dev
- librmm==26.6.*,>=0.0.0a0
- make
+ - myst-parser
- nccl>=2.19
- ninja
- numpy>=1.23,<3.0
@@ -44,12 +45,10 @@ dependencies:
- pytest
- pytest-cov
- rapids-build-backend>=0.4.0,<0.5.0
- - recommonmark
- rust
- scikit-build-core>=0.11.0
- scikit-learn>=1.5
- sphinx-copybutton
- - sphinx-markdown-tables
- sphinx>=8.0.0
- sysroot_linux-64==2.28
- pip:
3 changes: 1 addition & 2 deletions dependencies.yaml
@@ -450,11 +450,10 @@ dependencies:
- doxygen>=1.8.20
- graphviz
- ipython
+ - myst-parser
- numpydoc
- - recommonmark
- sphinx>=8.0.0
- sphinx-copybutton
- - sphinx-markdown-tables
- pip:
- nvidia-sphinx-theme
rust:
22 changes: 22 additions & 0 deletions docs/source/advanced_topics.md
@@ -0,0 +1,22 @@
# Advanced Topics

- [Just-in-Time Compilation](#just-in-time-compilation)

## Just-in-Time Compilation
cuVS uses Just-in-Time (JIT) [Link-Time Optimization (LTO)](https://developer.nvidia.com/blog/cuda-12-0-compiler-support-for-runtime-lto-using-nvjitlink-library/) to compile certain kernels. When JIT compilation is triggered, cuVS compiles the kernel for your architecture and automatically caches it both in memory and on disk. Each cache remains valid as follows:

1. In-memory cache is valid for the lifetime of the process.
2. On-disk cache is valid until a CUDA driver upgrade is performed. The cache is portable and can be shared between machines through network or cloud storage; we strongly recommend storing it in a persistent location. For details on configuring the on-disk cache, see the CUDA documentation on [JIT Compilation](https://docs.nvidia.com/cuda/cuda-programming-guide/05-appendices/environment-variables.html#jit-compilation). The relevant environment variables are `CUDA_CACHE_PATH` and `CUDA_CACHE_MAX_SIZE`.


Thus, JIT compilation is a one-time cost, and you can expect no performance loss after the first compilation. We recommend running a "warmup" to trigger JIT compilation before actual use.
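
As a hedged illustration of the on-disk cache settings described above, the following Python sketch sets `CUDA_CACHE_PATH` and `CUDA_CACHE_MAX_SIZE` for the current process. The helper name `configure_jit_cache` is an assumption made for illustration and is not part of cuVS. Per the CUDA documentation the size is given in bytes, and the variables most likely need to be set before the CUDA driver is initialized, that is, before any GPU library is first used in the process.

```python
import os
from pathlib import Path


def configure_jit_cache(cache_dir: str, max_size_bytes: int) -> dict:
    """Point the CUDA JIT cache at a persistent directory (illustrative helper).

    Assumption: this runs before anything in the process initializes CUDA,
    since the driver is expected to read these variables at startup.
    """
    path = Path(cache_dir).expanduser()
    # Create the directory up front so a missing path is caught early.
    path.mkdir(parents=True, exist_ok=True)

    os.environ["CUDA_CACHE_PATH"] = str(path)
    os.environ["CUDA_CACHE_MAX_SIZE"] = str(max_size_bytes)

    # Return the effective settings so callers can log or verify them.
    return {
        name: os.environ[name]
        for name in ("CUDA_CACHE_PATH", "CUDA_CACHE_MAX_SIZE")
    }
```

A call such as `configure_jit_cache("/mnt/shared/cuvs_jit_cache", 2 * 1024**3)` (a hypothetical path and size) would then be placed at the very top of the program, ahead of any cuVS import.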

Currently, the following capabilities will trigger a JIT compilation:
- IVF Flat search APIs: [cuvs::neighbors::ivf_flat::search()](cpp_api/neighbors_ivf_flat.md)
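
The warmup recommended above can be sketched as follows. This is a minimal, library-free illustration: `search_stub` is a hypothetical stand-in for a JIT-compiled call such as an IVF Flat search, and its `time.sleep` only simulates the one-time compilation cost. With real cuVS code, the stub would be replaced by one small search issued before the latency-sensitive work begins.

```python
import time

_compiled = False  # stand-in for "kernel already JIT-compiled and cached"


def search_stub(queries):
    """Hypothetical stand-in for a search call that JIT-compiles on first use."""
    global _compiled
    if not _compiled:
        time.sleep(0.05)  # simulate the one-time compilation cost
        _compiled = True
    return [q * 2 for q in queries]


def timed(fn, *args):
    """Run fn(*args) and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    out = fn(*args)
    return out, time.perf_counter() - start


# Warmup: pay the one-time cost before the measured or production path.
_, warmup_seconds = timed(search_stub, [1, 2, 3])

# Later calls reuse the cached kernel and skip the compilation step.
result, steady_seconds = timed(search_stub, [1, 2, 3])
```

Timing the first and second calls separately, as done here, is also a simple way to check that a warmup has actually taken effect.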

```{toctree}
:maxdepth: 2

jit_lto_guide
```

22 changes: 0 additions & 22 deletions docs/source/advanced_topics.rst

This file was deleted.

81 changes: 81 additions & 0 deletions docs/source/api_basics.md
@@ -0,0 +1,81 @@
# cuVS API Basics

- [Memory management](#memory-management)
- [Resource management](#resource-management)

## Memory management

Centralized memory management allows flexible configuration of allocation strategies, such as sharing the same CUDA memory pool across library boundaries. cuVS uses the [RMM](https://github.com/rapidsai/rmm) library, which eases the burden of configuring different allocation strategies globally across GPU-accelerated libraries.

RMM currently has APIs for C++ and Python.

### C++

Here's an example of configuring RMM to use a pool allocator in C++ (derived from the RMM example [here](https://github.com/rapidsai/rmm?tab=readme-ov-file#example)):

```c++
rmm::mr::cuda_memory_resource cuda_mr;
// Construct a resource that uses a coalescing best-fit pool allocator
// With the pool initially half of available device memory
auto initial_size = rmm::percent_of_free_device_memory(50);
rmm::mr::pool_memory_resource pool_mr{cuda_mr, initial_size};
rmm::mr::set_current_device_resource(pool_mr);
auto mr = rmm::mr::get_current_device_resource_ref();
```

### Python

And the corresponding code in Python (derived from the RMM example [here](https://github.com/rapidsai/rmm?tab=readme-ov-file#memoryresource-objects)):

```python
import rmm
pool = rmm.mr.PoolMemoryResource(
    rmm.mr.CudaMemoryResource(),
    initial_pool_size=2**30,
    maximum_pool_size=2**32)
rmm.mr.set_current_device_resource(pool)
```

## Resource management

cuVS uses an API from the [RAFT](https://github.com/rapidsai/raft) library of ML and data mining primitives to centralize and reuse expensive resources, such as memory management. The code examples below demonstrate how to create these resources for use throughout this guide.

See RAFT's [resource API documentation](https://docs.rapids.ai/api/raft/nightly/cpp_api/core_resources/) for more information.

### C

```c
#include <cuda_runtime.h>
#include <cuvs/core/c_api.h>

cuvsResources_t res;
cuvsResourcesCreate(&res);

// ... do some processing ...

cuvsResourcesDestroy(res);
```

### C++

```c++
#include <raft/core/device_resources.hpp>

raft::device_resources res;
```

### Python

```python
import pylibraft

res = pylibraft.common.DeviceResources()
```

### Rust

```rust
let res = cuvs::Resources::new()?;
```

90 changes: 0 additions & 90 deletions docs/source/api_basics.rst

This file was deleted.

13 changes: 13 additions & 0 deletions docs/source/api_docs.md
@@ -0,0 +1,13 @@
# API Reference

```{toctree}
:maxdepth: 3

c_api.md
cpp_api.md
python_api.md
rust_api/index.md
```

* [Index](genindex.html)
* [Search](search.html)
13 changes: 0 additions & 13 deletions docs/source/api_docs.rst

This file was deleted.
