C abi checker test gha by benfred · Pull Request #2 · benfred/cuvs

benfred · 2026-03-02T21:42:37Z

No description provided.

This adds a script that can check for changes that would break our C-ABI. t works by comparing two sets of header files: the `c/include` header files for the published ABI, as well as the `c/include` header files inside a new pull request. Each set of header files is parsed using libclang, and the differences between the old and new header files are examined for changes that would cause a breaking ABI change. Currently the following checks are made and flagged by this tool: * Functions that have been removed from the C-ABI * Functions that have extra parameters added * Functions that have parameters removed * Functions that have the type of any parameter changed * Structs that have been removed * Structs that have have members removed * Structs that have the types of members changed * Enum values that have been removed * Enum values that have their definition changed

- Add check-c-abi.yaml workflow for PRs - Add store-c-abi-baseline.yaml workflow for releases - Integrate ABI check into pr.yaml pipeline Co-authored-by: Cursor <cursoragent@cursor.com>

- Extract and store main baseline on push to main (never expires) - PRs download and compare against main baseline - Release workflow archives the main baseline with version tag - Eliminates duplicate ABI extraction work Co-authored-by: Cursor <cursoragent@cursor.com>

Enables bootstrapping the initial c-abi-baseline-main artifact Co-authored-by: Cursor <cursoragent@cursor.com>

- Store baselines with commit SHA for precise comparisons - Cascade: try merge-base → latest main → extract fresh - Add concurrency control to prevent race conditions - Report which baseline source was used for transparency Co-authored-by: Cursor <cursoragent@cursor.com>

Move the check_c_abi.py script to a python package, and add tests and type hints

add write contents permissions so that we can leave a comment on the PR

This reverts commit 252a9f8.

github-actions · 2026-03-02T21:43:26Z

⚠️ C ABI Breaking Changes Detected

This PR introduces breaking changes to the C ABI. Please review the changes carefully.

Breaking ABI changes are only allowed in major releases. If this is intentional for a major release,
the baseline ABI will need to be updated after merge.

See the job logs for details on what specific changes were detected.

### What are breaking ABI changes?

Breaking ABI changes include:
- Removing functions from the public API
- Changing function signatures (parameters or return types)
- Removing struct members or changing their types
- Removing or changing enum values

### Next steps

1. Review the changes flagged in the CI logs
2. If these changes are unintentional, update your PR to maintain ABI compatibility
3. If these changes are required, ensure:
   - This is part of a major version release
   - The changes are documented in the changelog
   - Migration guide is provided for users

For more information, see the [C ABI documentation](../docs/source/c_developer_guide.md).

…AGRA search switches kernel variants (NVIDIA#1851) Fix a bug in `safely_launch_kernel_with_smem_size` where `cudaFuncSetAttribute` was skipped for kernels that needed it. The function tracked the max shared memory in a single static variable per KernelT type, but `cudaFuncSetAttribute` applies per function pointer value — and the single-CTA CAGRA [search](https://github.com/rapidsai/cuvs/blob/d7a28aa1cb7648fa61037ed0459df0ec0e9db841/cpp/src/neighbors/detail/cagra/search_single_cta_kernel-inl.cuh#L1373C4-L1375C78) dispatches multiple kernel instantiations that share the same pointer type. When one kernel bumped the tracked max, a different kernel whose smem fell between its own previous max and the global max would skip `cudaFuncSetAttribute`, causing `cudaErrorInvalidValue`. The fix tracks the kernel pointer identity alongside a monotonically growing smem high-water mark: when the pointer changes, the new kernel is brought up to the high-water mark; when smem exceeds it, the mark is grown. ## Error in question ```c++ $ CUVS_CAGRA_ANN_BENCH --search --data_prefix='<DATA_DIR>/' --benchmark_out_format=csv --benchmark_out=res_search_iter_cagra.csv --benchmark_counters_tabular=true --override_kv=dataset_memory_type:\"device\" <CONFIG_DIR>/laion_1M_cagra_iterative.json [I] [12:28:52.095261] Using the query file '<DATA_DIR>/laion_1M/queries.fbin' [I] [12:28:52.096141] Using the ground truth file '<DATA_DIR>/laion_1M/groundtruth.1M.neighbors.ibin' 2026-02-25T12:28:52+00:00 Running CUVS_CAGRA_ANN_BENCH Run on (224 X 800 MHz CPU s) CPU Caches: L1 Data 48 KiB (x112) L1 Instruction 32 KiB (x112) L2 Unified 2048 KiB (x112) L3 Unified 307200 KiB (x2) Load Average: 0.70, 0.44, 0.28 dataset: laion_1M dim: 768 distance: euclidean ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Benchmark Time CPU Iterations GPU Latency Recall end_to_end items_per_second itopk k max_iterations n_queries refine_ratio search_width total_queries ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- cuvs_cagra_iterative/0/process_time/real_time 5.70 ms 5.70 ms 121 5.68808m 5.69994m 0.96424 0.689692 1.75441M/s 64 10 8 10k 1 2 1.21M dataset_memory_type="device" cuvs_cagra_iterative/1/process_time/real_time 5.70 ms 5.70 ms 121 5.6863m 5.69879m 0.96424 0.689553 1.75477M/s 64 10 8 10k 1 2 1.21M dataset_memory_type="device" cuvs_cagra_iterative/2/process_time/real_time 4.92 ms 4.92 ms 140 4.90351m 4.91567m 0.96046 0.688193 2.03432M/s 128 10 12 10k 1 1 1.4M dataset_memory_type="device" cuvs_cagra_iterative/3/process_time/real_time 5.99 ms 5.99 ms 115 5.97476m 5.98617m 0.97519 0.688409 1.67052M/s 128 10 16 10k 1 1 1.15M dataset_memory_type="device" cuvs_cagra_iterative/4/process_time/real_time 6.97 ms 6.97 ms 99 6.95873m 6.9703m 0.98129 0.690059 1.43466M/s 256 10 16 10k 1 1 990k dataset_memory_type="device" cuvs_cagra_iterative/5/process_time/real_time 10.5 ms 10.5 ms 66 0.010479 0.0104908 0.98548 0.692391 953.222k/s 512 10 10 10k 1 2 660k dataset_memory_type="device" ----------------------------------------------------------------------------------------- Benchmark Time CPU Iterations ----------------------------------------------------------------------------------------- cuvs_cagra_iterative/6/process_time/real_time ERROR OCCURRED: 'Benchmark loop: CUDA error encountered at: file=cpp/src/neighbors/detail/cagra/search_single_cta_kernel-inl.cuh line=2348: call='cudaPeekAtLastError()', Reason=cudaErrorInvalidValue:invalid argument Obtained 19 stack frames #1 in CUVS_CAGRA_ANN_BENCH: raft::cuda_error::cuda_error(std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&) #2 in libcuvs.so: void cuvs::neighbors::cagra::detail::single_cta_search::select_and_run<float, unsigned int, float, unsigned int, cuvs::neighbors::filtering::none_sample_filter>(...) NVIDIA#3 in libcuvs.so: cuvs::neighbors::cagra::detail::single_cta_search::search<float, unsigned int, float, cuvs::neighbors::filtering::none_sample_filter, unsigned int, long>::operator()(...) NVIDIA#4 in libcuvs.so(+0x18fd0f1) NVIDIA#5 in libcuvs.so: void cuvs::neighbors::cagra::search<float, unsigned int, long>(...) NVIDIA#6-NVIDIA#19 in CUVS_CAGRA_ANN_BENCH / libc.so.6 ' ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Benchmark Time CPU Iterations GPU Latency Recall end_to_end items_per_second itopk k max_iterations n_queries refine_ratio search_width total_queries ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- cuvs_cagra_iterative/7/process_time/real_time 10.5 ms 10.5 ms 66 0.0105088 0.0105202 0.98663 0.694332 950.555k/s 32 10 32 10k 1 1 660k dataset_memory_type="device" cuvs_cagra_iterative/8/process_time/real_time 12.8 ms 12.8 ms 54 0.012796 0.0128079 0.98807 0.691628 780.768k/s 32 10 64 10k 1 1 540k dataset_memory_type="device" ----------------------------------------------------------------------------------------- Benchmark Time CPU Iterations ----------------------------------------------------------------------------------------- cuvs_cagra_iterative/9/process_time/real_time ERROR OCCURRED: 'Benchmark loop: CUDA error encountered at: file=cpp/src/neighbors/detail/cagra/search_single_cta_kernel-inl.cuh line=2348: call='cudaPeekAtLastError()', Reason=cudaErrorInvalidValue:invalid argument [same stack trace as above] ' cuvs_cagra_iterative/10/process_time/real_time ERROR OCCURRED: 'Benchmark loop: CUDA error encountered at: file=cpp/src/neighbors/detail/cagra/search_single_cta_kernel-inl.cuh line=2348: call='cudaPeekAtLastError()', Reason=cudaErrorInvalidValue:invalid argument [same stack trace as above] ' ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- Benchmark Time CPU Iterations GPU Latency Recall end_to_end items_per_second itopk k max_iterations n_queries refine_ratio search_width total_queries ---------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------- cuvs_cagra_iterative/11/process_time/real_time 46.1 ms 46.2 ms 15 0.0461323 0.0461439 0.99131 0.692158 216.714k/s 256 10 10 10k 1 16 150k dataset_memory_type="device" cuvs_cagra_iterative/12/process_time/real_time 142 ms 142 ms 5 0.141713 0.141725 0.99198 0.708627 70.5591k/s 512 10 32 10k 1 16 50k dataset_memory_type="device" ``` ## Config ``` { "dataset": { "name": "laion_1M", "base_file": "laion_1M/base.1M.fbin", "subset_size": 1000000, "query_file": "laion_1M/queries.fbin", "groundtruth_neighbors_file": "laion_1M/groundtruth.1M.neighbors.ibin", "distance": "euclidean" }, "search_basic_param": { "batch_size": 10000, "k": 10 }, "index": [ { "name": "cuvs_cagra_iterative", "algo": "cuvs_cagra", "build_param": { "graph_degree": 64, "intermediate_graph_degree": 128, "search_width": 1 }, "file": "laion_1M/cagra/q_coarse_iterative.ibin", "search_params": [ {"itopk": 64, "search_width": 2, "max_iterations": 8, "refine_ratio": 1}, {"itopk": 64, "search_width": 2, "max_iterations": 8, "refine_ratio": 1}, {"itopk": 128, "search_width": 1, "max_iterations": 12, "refine_ratio": 1}, {"itopk": 128, "search_width": 1, "max_iterations": 16, "refine_ratio": 1}, {"itopk": 256, "search_width": 1, "max_iterations": 16, "refine_ratio": 1}, {"itopk": 512, "search_width": 2, "max_iterations": 10, "refine_ratio": 1}, {"itopk": 256, "search_width": 2, "max_iterations": 12, "refine_ratio": 1}, {"itopk": 32, "search_width": 1, "max_iterations": 32, "refine_ratio": 1}, {"itopk": 32, "search_width": 1, "max_iterations": 64, "refine_ratio": 1}, {"itopk": 192, "search_width": 4, "max_iterations": 12, "refine_ratio": 1}, {"itopk": 256, "search_width": 4, "max_iterations": 12, "refine_ratio": 1}, {"itopk": 256, "search_width": 16, "max_iterations": 10, "refine_ratio": 1}, {"itopk": 512, "search_width": 16, "max_iterations": 32, "refine_ratio": 1} ] } ] } ``` Authors: - https://github.com/irina-resh-nvda Approvers: - Artem M. Chirkin (https://github.com/achirkin) URL: NVIDIA#1851

benfred and others added 23 commits February 10, 2026 11:29

Split extracting abi from analyzing abi

0746c12

Add GitHub Actions for C ABI checking

0f89228

- Add check-c-abi.yaml workflow for PRs - Add store-c-abi-baseline.yaml workflow for releases - Integrate ABI check into pr.yaml pipeline Co-authored-by: Cursor <cursoragent@cursor.com>

Add workflow_dispatch to allow manual baseline creation

20ce7c9

Enables bootstrapping the initial c-abi-baseline-main artifact Co-authored-by: Cursor <cursoragent@cursor.com>

code review fixes

5e763a4

Move to python package

c9dfd5c

Move the check_c_abi.py script to a python package, and add tests and type hints

remove old

eee7e23

fix typehint for root_path

1aec04d

Merge branch 'main' into c_abi_checker

08f72ba

update gha to use python package

365f83a

Merge branch 'main' into c_abi_checker

2ba3e43

fix yaml syntax

3a66d65

fix yaml syntax

bc7daf1

update permissions

038678e

add write contents permissions so that we can leave a comment on the PR

don't require a build just to get dlpack header file

0b5e572

don't require a build just to get dlpack header file

e47fc42

support python 3.11 w/ generator type hint

edd62fd

Revert

9268e4b

run in container to pick up nvcc

252a9f8

Revert "run in container to pick up nvcc"

098c283

This reverts commit 252a9f8.

Merge branch 'main' into c_abi_checker

2a7d564

github-actions Bot added the ci label Mar 2, 2026

don't require cmake/nvcc etc just for getting dlpack header

71c58c5

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

C abi checker test gha#2

C abi checker test gha#2
benfred wants to merge 24 commits into
mainfrom
c_abi_checker_test_gha

benfred commented Mar 2, 2026

Uh oh!

github-actions Bot commented Mar 2, 2026

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants

Conversation

benfred commented Mar 2, 2026

Uh oh!

github-actions Bot commented Mar 2, 2026

⚠️ C ABI Breaking Changes Detected

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants