Add floating support to BLAS/LAPACK backends #109

blackwer · 2025-07-23T18:15:44Z

Description

This PR aims to add the same level of backend support for float types as double types. This work is branched from #103 and addresses issue #108.

- Add all level BLAS routines currently implemented, including MKL batching
- Test all level BLAS routines properly
- Add all level LAPACK routines currently implemented
- Test all level LAPACK routines properly
- Ensure all tests are not sneakily doing type promotion
- Incorporate solid concept requirements, including their template macros (this is currently verbose, and somewhat inconsistent)
- Document all changes, ensuring each API member properly reports the expected type requirements and behaviors
- Resolve issues with answer tolerances in various tests (absolute tolerances on broad values, some randomness giving spontaneously failing tests, etc.)
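One way to catch silent type promotion in a test is to assert on result types at compile time. The following is a minimal sketch in plain C++; `scaled_sum` is a hypothetical stand-in for a routine under test and is not part of nda.

```cpp
#include <cassert>
#include <complex>
#include <type_traits>

// Hypothetical routine under test: returns alpha * x + y in the value type T.
template <typename T>
T scaled_sum(T alpha, T x, T y) {
  return alpha * x + y;
}

// The result type must match the input value type exactly.
static_assert(std::is_same_v<decltype(scaled_sum(1.0f, 2.0f, 3.0f)), float>);
static_assert(std::is_same_v<decltype(scaled_sum(1.0, 2.0, 3.0)), double>);
static_assert(std::is_same_v<decltype(scaled_sum(std::complex<float>{}, std::complex<float>{},
                                                 std::complex<float>{})),
                             std::complex<float>>);

// Mixing a float with a double literal promotes to double, so a float test
// that writes 2.0 instead of 2.0f no longer exercises the float code path.
static_assert(std::is_same_v<decltype(1.0f * 2.0), double>);
static_assert(std::is_same_v<decltype(1.0f * 2.0f), float>);
```

Because the checks are `static_assert`s, a test that accidentally promotes fails to compile instead of passing with double precision.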

Notes

I had to change the tolerances on the tests to work for both float and double types. I did some very basic error analysis to get rough bounds on the error for a given matrix. Someone who is more knowledgeable about this kind of thing should check that it is sensible.
Some matrices are generated randomly, so changing the seed might increase their error. This notably affects anything using the syhe_matrix calls in the lapack/linear_algebra tests. Those matrices aren't generally applied, so they can't really amplify error, and I wasn't sure what to use to bound the error. I chose 100*machine_eps.
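As an illustration of a tolerance that scales with the value type, here is a minimal sketch of the 100*machine_eps choice. The names `rough_tolerance` and `is_close` are hypothetical and are not the helpers used in the actual tests; scaling by the magnitude of the reference value is also an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cmath>
#include <limits>

// Hypothetical helper: 100 * machine epsilon for the value type T.
template <typename T>
constexpr T rough_tolerance() {
  return T(100) * std::numeric_limits<T>::epsilon();
}

// Hypothetical comparison: the tolerance is scaled by the magnitude of the
// reference value, but never below 1, so values near zero are still comparable.
template <typename T>
bool is_close(T value, T reference) {
  T scale = std::max(T(1), std::abs(reference));
  return std::abs(value - reference) <= rough_tolerance<T>() * scale;
}
```

With this sketch a perturbation of about 1e-6 is accepted for float but rejected for double, which is the behavior a type-aware tolerance is meant to give.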

- move the generic dot routines from the blas to the linalg namespace
- blas::dot and blas::dotc
  - thin wrappers around the BLAS routines
  - restrict to nda::MemoryVectors
- linalg::dot and linalg::dotc
  - allow scalar-scalar and vector-vector dot products
  - only call blas::dot and blas::dotc if it is possible without making a copy
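For readers unfamiliar with the two variants, the difference between an unconjugated and a conjugated dot product can be sketched in plain C++. The functions `dot_u` and `dot_c` below are hypothetical illustrations, not the nda wrappers; they compute what the BLAS routines cdotu and cdotc compute.

```cpp
#include <cassert>
#include <complex>
#include <cstddef>
#include <vector>

using cfloat = std::complex<float>;

// Unconjugated dot product: sum_i x[i] * y[i] (the result of BLAS cdotu).
cfloat dot_u(const std::vector<cfloat>& x, const std::vector<cfloat>& y) {
  assert(x.size() == y.size());
  cfloat sum{0.0f, 0.0f};
  for (std::size_t i = 0; i < x.size(); ++i) sum += x[i] * y[i];
  return sum;
}

// Conjugated dot product: sum_i conj(x[i]) * y[i] (the result of BLAS cdotc).
cfloat dot_c(const std::vector<cfloat>& x, const std::vector<cfloat>& y) {
  assert(x.size() == y.size());
  cfloat sum{0.0f, 0.0f};
  for (std::size_t i = 0; i < x.size(); ++i) sum += std::conj(x[i]) * y[i];
  return sum;
}
```

For real value types the two coincide; for complex vectors only the conjugated form yields the squared norm when both arguments are the same vector.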

- move matvecmul into linalg namespace and into its own linalg/matvecmul.hpp header
- move blas::gemv_generic into linalg::detail namespace in linalg/matvecmul.hpp
- linalg::matvecmul
  - allow non-contiguous matrices
  - move various helper lambdas into linalg::detail namespace
  - avoid unnecessary vector copies

- lapack::getrf
  - allow C-layout matrices
- lapack::getri
  - restrict to host (there is no getri for CUDA)
- lapack::getrs
  - allow conjugate lazy expressions
  - allow B to be either a vector or matrix
  - restrict B to F-layout

- rename inverse to inv, inverse_in_place to inv_in_place, inverse1_in_place to inv_in_place_1d, inverse2_in_place to inv_in_place_2d, inverse3_in_place to inv_in_place_3d
- allow in place functions for host compatible address spaces
- use getrf + getrs to allow device matrices in inv

- rename determinant to det, determinant_in_place to det_in_place
- introduce det_1d, det_2d and det_3d
- only allow matrices on host compatible address spaces and with double or complex value types
- call optimized routines for matrices with fewer than 4 rows/columns
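To illustrate why small matrices get dedicated routines, the closed-form determinants can be written without any factorization. This is a minimal sketch; `det_2x2` and `det_3x3` are hypothetical names operating on row-major `std::array` storage, which is an assumption and not the nda interface.

```cpp
#include <array>
#include <cassert>

// Determinant of a 2x2 matrix stored row-major as {a00, a01, a10, a11}.
template <typename T>
T det_2x2(const std::array<T, 4>& a) {
  return a[0] * a[3] - a[1] * a[2];
}

// Determinant of a 3x3 matrix stored row-major, by cofactor expansion
// along the first row.
template <typename T>
T det_3x3(const std::array<T, 9>& a) {
  return a[0] * (a[4] * a[8] - a[5] * a[7])
       - a[1] * (a[3] * a[8] - a[5] * a[6])
       + a[2] * (a[3] * a[7] - a[4] * a[6]);
}
```

These formulas avoid the pivoting and workspace handling of an LU-based determinant, which is the usual motivation for special-casing very small sizes.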

This was a bit hairier than the blas patch before, as there were a lot of implicit assumptions about a double underlying type in many of the temporaries and null returns. For tests I factored out some logic to make the tests run in float and double, which also required grabbing separate tolerances and passing them to the checkers. I've added some more tests to the blas and linear algebra test runners. Also, one of the eigenvector/value decompositions gives eigenvectors with a different sign, which is something that probably needs to be handled better than my approach. This isn't wrong, and there are means to ensure the same eigenvector sign, if that's of interest in the future. Some other functions were touched, but these are the main functions in this patch: geqp3, heev, orgqr, syev, ungqr, eigh.
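One common way to make eigenvector signs comparable between float and double runs is to fix a sign convention before comparing. The sketch below is one possible convention, not what the PR implements: it flips a real eigenvector so that its largest-magnitude component is positive. The name `fix_sign` is hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <vector>

// Flip the sign of v in place so that its largest-magnitude component is
// positive. Ties resolve to the first such component. Real value types only;
// complex eigenvectors are defined up to a phase, which this does not handle.
template <typename T>
void fix_sign(std::vector<T>& v) {
  std::size_t imax = 0;
  for (std::size_t i = 1; i < v.size(); ++i)
    if (std::abs(v[i]) > std::abs(v[imax])) imax = i;
  if (!v.empty() && v[imax] < T(0))
    for (auto& x : v) x = -x;
}
```

After applying the same convention to both the computed and the reference eigenvectors, an element-wise comparison no longer depends on which sign the backend happened to return.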

Thoemi09 added 30 commits July 25, 2025 16:19

Generalize get_ld and get_ncols in blas/tools.hpp

3bbed50

- Allow general nda::MemoryArray types of rank 1 or 2

Update blas/scal.hpp

8b0d8a3

- do not allow conjugate expressions
- update docs and tests

Update blas/ger.hpp

afca7ec

- Remove outer_product from nda::blas namespace
- Update docs and tests

Add an outer product function to linalg

31d8304

- move outer_product from blas/ger.hpp to linalg/outer_product.hpp

Add blas::get_array to blas/tools.hpp

ceb52f9

Allow nda::Array types in mem::common_addr_space

5a86f16

Update blas/gemm.hpp and linalg/matmul.hpp

19a66d2

- move blas::gemm_generic into linalg::detail namespace in linalg/matmul.hpp
- linalg::matvecmul
  - allow non-contiguous matrices
  - move various helper lambdas into linalg::detail namespace

Update blas/gemm_batch.hpp

14c0941

Update lapack/geqp3.hpp, lapack/orgqr.hpp and lapack/ungqr.hpp

be5709a

- remove explicit throw statements --> always return the LAPACK info
- consistent checks and function layout
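The "return the info" convention can be illustrated with a toy factorization that reports failure through its return value and leaves the decision to the caller. This is only a sketch: `toy_getrf` is a hypothetical, self-contained LU with partial pivoting, not the nda wrapper and not a call into LAPACK. It follows the getrf convention that a positive return value i means the i-th pivot (1-based) is exactly zero.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>
#include <utility>
#include <vector>

// Toy in-place LU factorization with partial pivoting of an n x n
// column-major matrix. Returns 0 on success, i > 0 if the i-th pivot
// (1-based) is exactly zero, and a negative value for invalid arguments.
template <typename T>
int toy_getrf(std::vector<T>& a, int n, std::vector<int>& ipiv) {
  if (n < 0 || a.size() != static_cast<std::size_t>(n) * n) return -1;
  ipiv.assign(n, 0);
  int info = 0;
  auto at = [&](int i, int j) -> T& { return a[i + static_cast<std::size_t>(j) * n]; };
  for (int k = 0; k < n; ++k) {
    // Pick the row with the largest entry in column k as the pivot row.
    int p = k;
    for (int i = k + 1; i < n; ++i)
      if (std::abs(at(i, k)) > std::abs(at(p, k))) p = i;
    ipiv[k] = p + 1;  // 1-based pivot index, as in LAPACK
    if (at(p, k) == T(0)) {
      if (info == 0) info = k + 1;  // remember the first zero pivot
      continue;
    }
    if (p != k)
      for (int j = 0; j < n; ++j) std::swap(at(k, j), at(p, j));
    for (int i = k + 1; i < n; ++i) {
      at(i, k) /= at(k, k);
      for (int j = k + 1; j < n; ++j) at(i, j) -= at(i, k) * at(k, j);
    }
  }
  return info;
}
```

A caller checks the returned value and decides whether a singular factor is an error in its context, instead of having the wrapper throw on its behalf.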

Update lapack/gesvd.hpp

3c120fc

- remove explicit throw statement --> always return the LAPACK info
- allow C-layout matrices

Add linalg::svd and linalg::svd_in_place

42609d8

Update lapack/gtsv.hpp

12bcfb5

- require F-layout for B matrices

Update lapack/gelss.hpp and lapack/gelss_worker.hpp

d2dfd70

Add linalg::solve and linalg::solve_in_place

a29b571

Update linalg/norm.hpp

2ce350c

- move norm into linalg namespace

Update linalg/cross_product.hpp

551c2f8

- allow vectors with different value types
- require vectors to have a host compatible address space

Move is_matrix_square and is_matrix_diagonal from linalg/det_and_inverse.hpp to matrix_functions.hpp

6806401

Split linalg/det_and_inverse.hpp into linalg/det.hpp and linalg/inv.hpp

e798dcd

Remove compiler warning in mem and update docs

910b62d

Do not throw in assign_from_ndarray and fill_with_scalar

e9978ed

- replace throw statements with std::terminate()

Add blas/gerc.hpp

5e030e1

- wrap the BLAS gerc routine
- restrict to Fortran layout matrices

Add lapack::syev and lapack::heev wrappers

28d1025

Add linalg::eigh and linalg::eigvalsh

7ee182c

- rename linalg::eigenelements to linalg::eigh
- rename linalg::eigenvalues to linalg::eigvalsh
- rename linalg/eigenelements.hpp to linalg/eigh.hpp
- call the LAPACK wrappers lapack::syev and lapack::heev

Add lapack::sygv and lapack::hegv wrappers

49ef113

Overload linalg::eigh and linalg::eigvalsh for generalized eigenvalue problems

daaa5b1

Thoemi09 and others added 19 commits July 25, 2025 16:19

Update tests in nda_cublas.cpp

b856252

- increase overall test coverage
- add tests for unified memory

Remove getri declarations from lapack/interface/cusolver_interface.hpp

a9d0268

Add missing include to lapack/gelss_worker.hpp

483738b

Add additional check to lapack/gesvd.hpp for device implementation

5736f3b

- cusolverDn?gesvd only supports matrices with m >= n

Update tests in nda_culapack.cpp

37f4ec1

- increase overall test coverage
- add (some) tests for unified memory

Minor test updates in nda_linear_algebra.cpp

18c5bb5

Add compile time check to assign_from_ndarray

229ac12

- try to get compile time errors instead of runtime errors whenever possible
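A compile-time check of this kind can be sketched with a type trait and a `static_assert`. The trait `is_blas_value_type_v` and the function `checked_scale` are hypothetical illustrations, not nda names; in C++20 the same restriction could be expressed as a concept.

```cpp
#include <cassert>
#include <complex>
#include <type_traits>

// Hypothetical trait: value types with a BLAS/LAPACK backend (s, d, c, z).
template <typename T>
inline constexpr bool is_blas_value_type_v =
    std::is_same_v<T, float> || std::is_same_v<T, double> ||
    std::is_same_v<T, std::complex<float>> || std::is_same_v<T, std::complex<double>>;

// Hypothetical wrapper: an unsupported value type is rejected when the
// template is instantiated, not when the program runs.
template <typename T>
T checked_scale(T alpha, T x) {
  static_assert(is_blas_value_type_v<T>, "value type has no BLAS backend");
  return alpha * x;
}

static_assert(is_blas_value_type_v<float>);
static_assert(is_blas_value_type_v<std::complex<double>>);
static_assert(!is_blas_value_type_v<int>);
```

Calling `checked_scale(2, 3)` with integer arguments would fail to compile with the message above, so the mistake never reaches a runtime code path.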

Add test for linalg routines on the device

70e5830

[doc] Update docs of BLAS, LAPACK and linalg routines

84b4b98

[doc] Use sections instead of subsections in examples

f236859

[ghactions] Update to gcc-15 in macos runner

e725e82

Add float32 support for main blas routines

bc17721

Supported functions in this patch: sdot, cdotu, cdotc, sgemv, cgemv, sgemm, cgemm.
Still need to add LAPACK routines, and device support.

test_linear_algebra: handle promotion and precision issues in libc++

06380eb

test_linear_algebra: more precision check improvements

9e50a7e

add float overloads for MKL blas functions, including batches

ed0b15b

Add float overloads for more LAPACK functions

0601056

gelss, gesvd, hegv, sygv, svd

tests: more eps_close for fp32 lapack tests

ad18c62

tests: small tweaks to linear algebra tests

a6e3088

blackwer force-pushed the float-support branch from 7d63385 to a6e3088 Compare July 28, 2025 14:06

blackwer added 2 commits July 28, 2025 12:29

tests: Set 'eps_close' in linalg/lapack based on error analysis

4934c16

I'm not a numerics guy, but calculating the condition number of the input matrix and bounding the error by coefficient_magnitude * condition_number * machine_epsilon gives very reasonable results.
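The bound described above can be sketched for a 2x2 matrix, where the condition number is available in closed form. Everything here is an assumption for illustration: the function names are hypothetical, the 1-norm is used for the condition number, and coefficient_magnitude is taken to be the largest absolute entry of the matrix. The actual tests may define these differently.

```cpp
#include <algorithm>
#include <array>
#include <cassert>
#include <cmath>
#include <limits>

// 1-norm (maximum absolute column sum) of a row-major 2x2 matrix.
template <typename T>
T norm1(const std::array<T, 4>& a) {
  return std::max(std::abs(a[0]) + std::abs(a[2]), std::abs(a[1]) + std::abs(a[3]));
}

// 1-norm condition number of an invertible 2x2 matrix via its explicit inverse.
template <typename T>
T cond1(const std::array<T, 4>& a) {
  T det = a[0] * a[3] - a[1] * a[2];
  assert(det != T(0));
  std::array<T, 4> inv{a[3] / det, -a[1] / det, -a[2] / det, a[0] / det};
  return norm1(a) * norm1(inv);
}

// Rough bound: largest |coefficient| * condition number * machine epsilon.
template <typename T>
T error_bound(const std::array<T, 4>& a) {
  T coeff = T(0);
  for (T x : a) coeff = std::max(coeff, std::abs(x));
  return coeff * cond1(a) * std::numeric_limits<T>::epsilon();
}
```

For the identity matrix the bound reduces to machine epsilon, and it grows with the condition number, so ill-conditioned test matrices automatically get a looser tolerance in both float and double.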

Merge remote-tracking branch 'origin/unstable' into float-support

05eadcb

Wentzell force-pushed the unstable branch from 3b0bdae to 77327e3 Compare November 3, 2025 17:51
