Block latest CUDSS #640

Merged 1 commit on Jul 22, 2025

ChrisRackauckas
Member

No description provided.

@ChrisRackauckas ChrisRackauckas merged commit 63a90fa into main Jul 22, 2025
39 of 42 checks passed
@ChrisRackauckas ChrisRackauckas deleted the ChrisRackauckas-patch-1 branch July 22, 2025 06:18
@imciner2

What's the reason for the incompatibility?

CC @amontoison

@amontoison
Contributor

amontoison commented Jul 22, 2025

I don't know. I barely changed the API between v0.4 and v0.5.
All I know is that CUDSS gives different results and that it can sometimes be less numerically stable.

@ChrisRackauckas
Member Author

#638 (comment)

CUDSS: Error During Test at /var/lib/buildkite-agent/builds/gpuci-12/julialang/linearsolve-dot-jl/test/gpu/cuda.jl:96
  | Got exception outside of a @test
  | CUDSSError: an invalid value was used as an argument (code 3, CUDSS_STATUS_INVALID_VALUE)
  | Stacktrace:
  | [1] throw_api_error(res::CUDSS.cudssStatus_t)
  | @ CUDSS ~/.cache/julia-buildkite-plugin/depots/42a562d3-f0dd-4024-a23d-12553727650e/packages/CUDSS/5Tf0Y/src/error.jl:37
  | [2] check
  | @ ~/.cache/julia-buildkite-plugin/depots/42a562d3-f0dd-4024-a23d-12553727650e/packages/CUDSS/5Tf0Y/src/error.jl:48 [inlined]
  | [3] cudssExecute
  | @ ~/.cache/julia-buildkite-plugin/depots/42a562d3-f0dd-4024-a23d-12553727650e/packages/GPUToolbox/cZlg7/src/ccalls.jl:33 [inlined]
  | [4] cudss(phase::String, solver::CUDSS.CudssSolver{Float32}, X::CUDSS.CudssMatrix{Float32}, B::CUDSS.CudssMatrix{Float32})
  | @ CUDSS ~/.cache/julia-buildkite-plugin/depots/42a562d3-f0dd-4024-a23d-12553727650e/packages/CUDSS/5Tf0Y/src/interfaces.jl:329
  | [5] #lu!#101
  | @ ~/.cache/julia-buildkite-plugin/depots/42a562d3-f0dd-4024-a23d-12553727650e/packages/CUDSS/5Tf0Y/src/generic.jl:36 [inlined]
  | [6] lu!
  | @ ~/.cache/julia-buildkite-plugin/depots/42a562d3-f0dd-4024-a23d-12553727650e/packages/CUDSS/5Tf0Y/src/generic.jl:31 [inlined]
  | [7] solve!(cache::LinearSolve.LinearCache{CUDA.CUSPARSE.CuSparseMatrixCSR{Float32, Int32}, CUDA.CuArray{Float32, 1, CUDA.DeviceMemory}, CUDA.CuArray{Float32, 1, CUDA.DeviceMemory}, SciMLBase.NullParameters, LinearSolve.DefaultLinearSolver, LinearSolve.DefaultLinearSolverInit{CUDSS.CudssSolver{Float32}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Krylov.GmresWorkspace{Float32, Float32, CUDA.CuArray{Float32, 1, CUDA.DeviceMemory}}, Nothing, Tuple{Nothing, Nothing}, Nothing, Nothing, Nothing, Nothing, LinearAlgebra.Cholesky{Float32, CUDA.CUSPARSE.CuSparseMatrixCSR{Float32, Int32}}, LinearAlgebra.Cholesky{Float32, CUDA.CUSPARSE.CuSparseMatrixCSR{Float32, Int32}}, Tuple{LinearAlgebra.LU{Float32, Matrix{Float32}, Vector{Int32}}, Base.RefValue{Int32}}, Tuple{LinearAlgebra.LU{Float32, Matrix{Float32}, Vector{Int64}}, Base.RefValue{Int64}}, Nothing, Krylov.CraigmrWorkspace{Float32, Float32, CUDA.CuArray{Float32, 1, CUDA.DeviceMemory}}, Krylov.LsmrWorkspace{Float32, Float32, CUDA.CuArray{Float32, 1, CUDA.DeviceMemory}}}, SciMLOperators.IdentityOperator, SciMLOperators.IdentityOperator, Float32, Bool, LinearSolve.LinearSolveAdjoint{Missing}}, alg::LinearSolve.LUFactorization{LinearAlgebra.RowMaximum}; kwargs::@Kwargs{})
  | @ LinearSolve /var/lib/buildkite-agent/builds/gpuci-12/julialang/linearsolve-dot-jl/src/factorization.jl:136
  | [8] solve!
  | @ /var/lib/buildkite-agent/builds/gpuci-12/julialang/linearsolve-dot-jl/src/factorization.jl:125 [inlined]
  | [9] macro expansion
  | @ /var/lib/buildkite-agent/builds/gpuci-12/julialang/linearsolve-dot-jl/src/default.jl:365 [inlined]
  | [10] solve!(::LinearSolve.LinearCache{CUDA.CUSPARSE.CuSparseMatrixCSR{Float32, Int32}, CUDA.CuArray{Float32, 1, CUDA.DeviceMemory}, CUDA.CuArray{Float32, 1, CUDA.DeviceMemory}, SciMLBase.NullParameters, LinearSolve.DefaultLinearSolver, LinearSolve.DefaultLinearSolverInit{CUDSS.CudssSolver{Float32}, Nothing, Nothing, Nothing, Nothing, Nothing, Nothing, Krylov.GmresWorkspace{Float32, Float32, CUDA.CuArray{Float32, 1, CUDA.DeviceMemory}}, Nothing, Tuple{Nothing, Nothing}, Nothing, Nothing, Nothing, Nothing, LinearAlgebra.Cholesky{Float32, CUDA.CUSPARSE.CuSparseMatrixCSR{Float32, Int32}}, LinearAlgebra.Cholesky{Float32, CUDA.CUSPARSE.CuSparseMatrixCSR{Float32, Int32}}, Tuple{LinearAlgebra.LU{Float32, Matrix{Float32}, Vector{Int32}}, Base.RefValue{Int32}}, Tuple{LinearAlgebra.LU{Float32, Matrix{Float32}, Vector{Int64}}, Base.RefValue{Int64}}, Nothing, Krylov.CraigmrWorkspace{Float32, Float32, CUDA.CuArray{Float32, 1, CUDA.DeviceMemory}}, Krylov.LsmrWorkspace{Float32, Float32, CUDA.CuArray{Float32, 1, CUDA.DeviceMemory}}}, SciMLOperators.IdentityOperator, SciMLOperators.IdentityOperator, Float32, Bool, LinearSolve.LinearSolveAdjoint{Missing}}, ::LinearSolve.DefaultLinearSolver; assump::LinearSolve.OperatorAssumptions{Nothing}, kwargs::@Kwargs{})
  | @ LinearSolve /var/lib/buildkite-agent/builds/gpuci-12/julialang/linearsolve-dot-jl/src/default.jl:354
  | [11] solve!
  | @ /var/lib/buildkite-agent/builds/gpuci-12/julialang/linearsolve-dot-jl/src/default.jl:354 [inlined]
  | [12] #solve!#11
  | @ /var/lib/buildkite-agent/builds/gpuci-12/julialang/linearsolve-dot-jl/src/common.jl:299 [inlined]
  | [13] solve!
  | @ /var/lib/buildkite-agent/builds/gpuci-12/julialang/linearsolve-dot-jl/src/common.jl:298 [inlined]
  | [14] #solve#10
  | @ /var/lib/buildkite-agent/builds/gpuci-12/julialang/linearsolve-dot-jl/src/common.jl:295 [inlined]
  | [15] solve
  | @ /var/lib/buildkite-agent/builds/gpuci-12/julialang/linearsolve-dot-jl/src/common.jl:293 [inlined]
  | [16] #solve#9
  | @ /var/lib/buildkite-agent/builds/gpuci-12/julialang/linearsolve-dot-jl/src/common.jl:290 [inlined]
  | [17] solve
  | @ /var/lib/buildkite-agent/builds/gpuci-12/julialang/linearsolve-dot-jl/src/common.jl:288 [inlined]
  | [18] #solve#8
  | @ /var/lib/buildkite-agent/builds/gpuci-12/julialang/linearsolve-dot-jl/src/common.jl:285 [inlined]
  | [19] solve(::SciMLBase.LinearProblem{Nothing, true, CUDA.CUSPARSE.CuSparseMatrixCSR{Float32, Int32}, CUDA.CuArray{Float32, 1, CUDA.DeviceMemory}, SciMLBase.NullParameters, Nothing, @Kwargs{}})
  | @ LinearSolve /var/lib/buildkite-agent/builds/gpuci-12/julialang/linearsolve-dot-jl/src/common.jl:284
  | [20] macro expansion
  | @ /var/lib/buildkite-agent/builds/gpuci-12/julialang/linearsolve-dot-jl/test/gpu/cuda.jl:107 [inlined]
  | [21] macro expansion
  | @ ~/.cache/julia-buildkite-plugin/julia_installs/bin/linux/x64/1.11/julia-1.11-latest-linux-x86_64/share/julia/stdlib/v1.11/Test/src/Test.jl:1709 [inlined]
  | [22] top-level scope
  | @ /var/lib/buildkite-agent/builds/gpuci-12/julialang/linearsolve-dot-jl/test/gpu/cuda.jl:97
  | [23] include(mod::Module, _path::String)
  | @ Base ./Base.jl:562
  | [24] include(x::String)
  | @ Main.var"##CUDA#246" ~/.cache/julia-buildkite-plugin/depots/42a562d3-f0dd-4024-a23d-12553727650e/packages/SafeTestsets/raUNr/src/SafeTestsets.jl:28
  | [25] macro expansion
  | @ ~/.cache/julia-buildkite-plugin/depots/42a562d3-f0dd-4024-a23d-12553727650e/packages/SafeTestsets/raUNr/src/SafeTestsets.jl:24 [inlined]
  | [26] macro expansion
  | @ ~/.cache/julia-buildkite-plugin/julia_installs/bin/linux/x64/1.11/julia-1.11-latest-linux-x86_64/share/julia/stdlib/v1.11/Test/src/Test.jl:1709 [inlined]
  | [27] top-level scope
  | @ ~/.cache/julia-buildkite-plugin/depots/42a562d3-f0dd-4024-a23d-12553727650e/packages/SafeTestsets/raUNr/src/SafeTestsets.jl:24
  | [28] eval(m::Module, e::Any)
  | @ Core ./boot.jl:430
  | [29] macro expansion
  | @ ~/.cache/julia-buildkite-plugin/depots/42a562d3-f0dd-4024-a23d-12553727650e/packages/SafeTestsets/raUNr/src/SafeTestsets.jl:28 [inlined]
  | [30] macro expansion
  | @ ./timing.jl:581 [inlined]
  | [31] top-level scope
  | @ /var/lib/buildkite-agent/builds/gpuci-12/julialang/linearsolve-dot-jl/test/runtests.jl:42
  | [32] include(fname::String)
  | @ Main ./sysimg.jl:38
  | [33] top-level scope
  | @ none:6
  | Test Summary:                                                                                                                          | Pass  Error  Total     Time
  | CUDA                                                                                                                                   |   32      1     33  1m28.8s
  |   LinearSolve.CudaOffloadFactorization()                                                                                               |    4             4    30.4s
  |   LinearSolve.NormalCholeskyFactorization{LinearAlgebra.NoPivot}(LinearAlgebra.NoPivot())                                              |    4             4     0.6s
  |   Simple GMRES: restart = true                                                                                                         |    4             4     1.6s
  |   Simple GMRES: restart = false                                                                                                        |    4             4     0.0s
  |   Block Diagonal Specialization                                                                                                        |    5             5    22.6s
  |   Adjoint/Transpose Type: LinearSolve.NormalCholeskyFactorization{LinearAlgebra.NoPivot}(LinearAlgebra.NoPivot())                      |    2             2     3.4s
  |   Adjoint/Transpose Type: LinearSolve.CholeskyFactorization{LinearAlgebra.NoPivot, Nothing}(LinearAlgebra.NoPivot(), 16, 0.0, nothing) |    2             2     0.1s
  |   Adjoint/Transpose Type: LinearSolve.LUFactorization{LinearAlgebra.RowMaximum}(LinearAlgebra.RowMaximum(), true, true)                |    2             2     0.6s
  |   Adjoint/Transpose Type: LinearSolve.QRFactorization{LinearAlgebra.NoPivot}(LinearAlgebra.NoPivot(), 16, true)                        |    2             2     2.0s
  |   Adjoint/Transpose Type: nothing                                                                                                      |    2             2     7.2s
  |   CUDSS                                                                                                                                |           1      1    11.0s
  | ERROR: LoadError: Some tests did not pass: 32 passed, 0 failed, 1 errored, 0 broken.
  | in expression starting at /var/lib/buildkite-agent/builds/gpuci-12/julialang/linearsolve-dot-jl/test/runtests.jl:38
  | ERROR: Package LinearSolve errored during testing
  | Stacktrace:
  | [1] pkgerror(msg::String)
  | @ Pkg.Types ~/.cache/julia-buildkite-plugin/julia_installs/bin/linux/x64/1.11/julia-1.11-latest-linux-x86_64/share/julia/stdlib/v1.11/Pkg/src/Types.jl:68
  | [2] test(ctx::Pkg.Types.Context, pkgs::Vector{Pkg.Types.PackageSpec}; coverage::Bool, julia_args::Cmd, test_args::Cmd, test_fn::Nothing, force_latest_compatible_version::Bool, allow_earlier_backwards_compatible_versions::Bool, allow_reresolve::Bool)
  | @ Pkg.Operations ~/.cache/julia-buildkite-plugin/julia_installs/bin/linux/x64/1.11/julia-1.11-latest-linux-x86_64/share/julia/stdlib/v1.11/Pkg/src/Operations.jl:2128
  | [3] test
  | @ ~/.cache/julia-buildkite-plugin/julia_installs/bin/linux/x64/1.11/julia-1.11-latest-linux-x86_64/share/julia/stdlib/v1.11/Pkg/src/Operations.jl:2011 [inlined]
  | [4] test(ctx::Pkg.Types.Context, pkgs::Vector{Pkg.Types.PackageSpec}; coverage::Bool, test_fn::Nothing, julia_args::Cmd, test_args::Cmd, force_latest_compatible_version::Bool, allow_earlier_backwards_compatible_versions::Bool, allow_reresolve::Bool, kwargs::@Kwargs{io::IOContext{IO}})
  | @ Pkg.API ~/.cache/julia-buildkite-plugin/julia_installs/bin/linux/x64/1.11/julia-1.11-latest-linux-x86_64/share/julia/stdlib/v1.11/Pkg/src/API.jl:481
  | [5] test(pkgs::Vector{Pkg.Types.PackageSpec}; io::IOContext{IO}, kwargs::@Kwargs{coverage::Bool, julia_args::Cmd, test_args::Cmd})
  | @ Pkg.API ~/.cache/julia-buildkite-plugin/julia_installs/bin/linux/x64/1.11/julia-1.11-latest-linux-x86_64/share/julia/stdlib/v1.11/Pkg/src/API.jl:159
  | [6] test(; name::Nothing, uuid::Nothing, version::Nothing, url::Nothing, rev::Nothing, path::Nothing, mode::Pkg.Types.PackageMode, subdir::Nothing, kwargs::@Kwargs{coverage::Bool, julia_args::Cmd, test_args::Cmd})
  | @ Pkg.API ~/.cache/julia-buildkite-plugin/julia_installs/bin/linux/x64/1.11/julia-1.11-latest-linux-x86_64/share/julia/stdlib/v1.11/Pkg/src/API.jl:174
  | [7] top-level scope

@ChrisRackauckas
Member Author

CUDSSError: an invalid value was used as an argument (code 3, CUDSS_STATUS_INVALID_VALUE) looks like a bug in the wrapper, most likely in the lu! wrapper 😅. Whatever the cause, our tests fail with it, so for now we're avoiding the latest release until we find a way to fix the tests.
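
For context, a Julia package usually blocks a dependency release through the [compat] section of its Project.toml. The fragment below is only a sketch of what such a bound could look like. It assumes the 0.4.x series is the last one that passes these tests (the thread mentions only v0.4 and v0.5). The exact bound changed in this PR is not shown in the conversation.

```toml
[compat]
# Assumption: "0.4" admits versions in [0.4.0, 0.5.0), so v0.5 is excluded.
CUDSS = "0.4"
```

With a bound like this, the package resolver would refuse to install CUDSS v0.5 alongside the package until the compat entry is widened again.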
