Background
NN-Descent currently supports internal_distance_dtype = CUDA_R_16F for fp32 inputs.
When the input lives in host memory, we downcast it to fp16 while copying it to the device, so the on-device copy is fp16 rather than the input's native fp32.
Doing this in NN-Descent is a special case relative to the rest of cuVS.
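To make the cost of this downcast concrete, here is a minimal sketch in pure Python. It is not the cuVS code path and does not reproduce convert_copy_kernel. The helper names (pack_fp32, downcast_to_fp16, unpack_fp16) and the sample values are hypothetical, chosen only to show that an fp16 copy takes half the bytes of the fp32 original and no longer holds the original values exactly.

```python
import struct


def pack_fp32(values):
    """Pack values as little-endian IEEE 754 single precision (4 bytes each)."""
    return struct.pack(f"<{len(values)}f", *values)


def downcast_to_fp16(values):
    """Pack values as little-endian IEEE 754 half precision (2 bytes each).

    Hypothetical stand-in for a host-fp32 -> device-fp16 conversion copy.
    """
    return struct.pack(f"<{len(values)}e", *values)


def unpack_fp16(buf):
    """Read a half-precision buffer back into Python floats."""
    return list(struct.unpack(f"<{len(buf) // 2}e", buf))


# Illustrative sample data (an assumption, not taken from cuVS).
host_values = [0.1, 1.0, 3.14159, 1024.5]

fp32_bytes = pack_fp32(host_values)
fp16_bytes = downcast_to_fp16(host_values)
round_trip = unpack_fp16(fp16_bytes)

# Largest absolute difference introduced by storing the data as fp16.
max_abs_err = max(abs(a - b) for a, b in zip(host_values, round_trip))

print("fp32 bytes:", len(fp32_bytes))
print("fp16 bytes:", len(fp16_bytes))
print("round trip:", round_trip)
print("max abs error:", max_abs_err)
```

The fp16 buffer is half the size, which is the memory benefit the downcast buys, but values such as 0.1 and 1024.5 are not recovered exactly after the round trip.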
Proposal
Once cuML UMAP and HDBSCAN can natively accept fp16 input (tracked in rapidsai/cuml#8102), remove the host-fp32 → device-fp16 downcast path from NN-Descent:
- Drop the d_data_half_ buffer and convert_copy_kernel from cpp/src/neighbors/detail/nn_descent.cuh.
- Remove the internal_distance_dtype parameter from the C++, C, and Python index params.
Related