fix(build): probe CUDA toolkit layouts in a shared openinfer-build crate #343
FeathBow wants to merge 3 commits into
Review notes from a build-discovery pass:
Thanks for tackling this. The motivation makes sense: CUDA discovery is currently duplicated across build scripts, and the classic
The new
I do not think this fully closes #342 yet, though. The issue lists several CUDA discovery sites, but this PR only migrates part of them. A few active build paths still keep their own CUDA assumptions:
There is also one partial fix inside
My suggestion is to make
I verified:
I could not run the CUDA build locally because
Thanks for the careful review (and verification)! Agreed on these points, and the include-side catch is a real gap. I'll push a patch to this PR later :)
a71b566 to 007ddcb
Pushed the follow-up covering these points. Discovery now lives in one
One behavior change to flag: kvbm-kernels used to check
Before (conda toolkit, the include-side gap you predicted, reproduced with the
Error logs — Single GPU (x86_64, sm_89), conda toolkit
After — with the
Verification logs
f0a6138 to affc3bd
affc3bd to b0191a8
Description
Fixes #342
CUDA toolkit discovery was duplicated across build scripts, each assuming the classic /usr/local/cuda layout (lib64/ for libs, include/ for headers). Two real layouts break it: conda/micromamba (libs in lib/, headers in targets/<arch>-linux/include/) and the NVIDIA HPC SDK (cuBLAS in a math_libs/<ver> sibling tree). This PR concentrates discovery in a shared openinfer-build crate: find_package probes several check files per root, and cuda_libs probes lib64, lib, and targets/<arch>/lib plus the math_libs sibling, emitting only dirs that exist. The evidenced sites (openinfer-kernels, cuda-sys, cudart-sys) migrate to it; gdrapi-sys/libibverbs-sys keep their behavior through the same helper.
Before
LIBRARY_PATH export.
cargo test --workspace down with them.
Error logs
After
LIBRARY_PATH unset; the workaround is deleted.
Verification logs
Type of Change
Checklist
docs/conventions/coding-style.md).
CLAUDE.md).
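For readers skimming the Description, the library-directory probing it outlines might look roughly like the sketch below. This is an illustration only, not the code in this PR: the function names (`candidate_lib_dirs`, `existing_lib_dirs`, `link_search_directives`) are hypothetical, the `target_dir` argument stands in for the `<arch>` path component, and the HPC SDK shape (`<sdk>/cuda/<ver>` sitting next to `<sdk>/math_libs/<ver>`) is an assumption about how the sibling tree is located.

```rust
use std::path::{Path, PathBuf};

/// Candidate library directories for a CUDA toolkit root, in probe order.
/// Hypothetical helper for illustration; not the PR's actual `cuda_libs`.
fn candidate_lib_dirs(root: &Path, target_dir: &str) -> Vec<PathBuf> {
    let mut dirs = vec![
        root.join("lib64"), // classic /usr/local/cuda layout
        root.join("lib"),   // conda / micromamba layout
        root.join("targets").join(target_dir).join("lib"),
    ];
    // Assumption: an HPC SDK root looks like <sdk>/cuda/<ver>, with the math
    // libraries (cuBLAS etc.) in the sibling tree <sdk>/math_libs/<ver>.
    let version = root.file_name();
    let sdk = root.parent().and_then(Path::parent);
    if let (Some(version), Some(sdk)) = (version, sdk) {
        let math_libs = sdk.join("math_libs").join(version);
        dirs.push(math_libs.join("lib64"));
        dirs.push(math_libs.join("lib"));
    }
    dirs
}

/// Keep only the candidates that actually exist on disk.
fn existing_lib_dirs(root: &Path, target_dir: &str) -> Vec<PathBuf> {
    candidate_lib_dirs(root, target_dir)
        .into_iter()
        .filter(|dir| dir.is_dir())
        .collect()
}

/// Render the `cargo:rustc-link-search` lines a build script would print.
fn link_search_directives(dirs: &[PathBuf]) -> Vec<String> {
    dirs.iter()
        .map(|dir| format!("cargo:rustc-link-search=native={}", dir.display()))
        .collect()
}
```

A `build.rs` could call `existing_lib_dirs` with whatever root it resolved and print each string from `link_search_directives`. Filtering on `is_dir` is what lets one probe list serve all three layouts without emitting search paths that do not exist.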