Change type annotation and `similar` to `zero` #30
ReverseDiff's `similar` does not return a `TrackedArray`; it returns an uninitialized `Matrix` of `TrackedReal`s, whereas `zero` does return a `TrackedArray`:

```julia
julia> s
(2, 2)
julia> u
2×2 ReverseDiff.TrackedArray{Float64, Float64, 2, Matrix{Float64}, Matrix{Float64}}:
TrackedReal<73Q>(4.275532424545035, 0.0, LZl, 1, Les) TrackedReal<27L>(1.4210350814697517, 0.0, LZl, 3, Les)
TrackedReal<Gra>(3.986400929377556, 0.0, LZl, 2, Les) TrackedReal<Af6>(2.732510481205003, 0.0, LZl, 4, Les)
julia> similar(u, s)
2×2 Matrix{ReverseDiff.TrackedReal{Float64, Float64, ReverseDiff.TrackedArray{Float64, Float64, 2, Matrix{Float64}, Matrix{Float64}}}}:
#undef #undef
#undef #undef
julia> zero(u)
2×2 ReverseDiff.TrackedArray{Float64, Float64, 2, Matrix{Float64}, Matrix{Float64}}:
TrackedReal<8lT>(0.0, 0.0, C0z, 1, I6e) TrackedReal<ICq>(0.0, 0.0, C0z, 3, I6e)
 TrackedReal<2o6>(0.0, 0.0, C0z, 2, I6e)  TrackedReal<1jB>(0.0, 0.0, C0z, 4, I6e)
```

How about this?

```julia
function Base.getindex(b::LazyBufferCache, u::T) where {T <: AbstractArray}
    s = b.sizemap(size(u)) # required buffer size
    buf = get!(b.bufs, (T, s)) do
        similar(u, s) # buffer to allocate if it was not found in b.bufs
    end::T # declare type since b.bufs dictionary is untyped
    return buf
end

function Base.getindex(b::LazyBufferCache, u::T) where {T <: ReverseDiff.TrackedArray}
    s = b.sizemap(size(u)) # required buffer size
    buf = get!(b.bufs, (T, s)) do
        zero(u) # buffer to allocate if it was not found in b.bufs
    end::T # declare type since b.bufs dictionary is untyped
    return buf
end
```
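For context, a minimal sketch of the caching behavior this `getindex` implements (assuming the `LazyBufferCache` API from PreallocationTools.jl; the variable names are illustrative):

```julia
using PreallocationTools

# The cache allocates on the first lookup for a given (type, size) key
# and returns the already-allocated buffer on subsequent lookups.
lbc = LazyBufferCache()

u = rand(2, 2)
buf1 = lbc[u]          # first access: allocates a buffer shaped like u
buf2 = lbc[u]          # second access: returns the cached buffer
@assert buf1 === buf2  # same object, no new allocation
```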
I think so. @mohamed82008 do you think overloading that `getindex` method could be breaking?
It might be breaking, since code that worked before might not work anymore. Especially …
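A hypothetical illustration of that concern: downstream code that dispatched on the buffer type returned by the old `similar`-based method would no longer match once `zero` returns a `TrackedArray`:

```julia
using ReverseDiff

# Before the change, a TrackedArray input produced a Matrix of TrackedReal;
# after it, the buffer is itself a TrackedArray, so the first method below
# would stop being selected for cached buffers.
handle(buf::Matrix{<:ReverseDiff.TrackedReal}) = "similar-based buffer"
handle(buf::ReverseDiff.TrackedArray) = "zero-based buffer"
```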
I couldn't make the … work. Defining a …
It looks like this broke CUDA support?
Strange; I don't see how it could break it. Should I try a rebase?
```
LoadError: ArgumentError: cannot take the CPU address of a CUDA.CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}
Stacktrace:
 [1] unsafe_convert(#unused#::Type{Ptr{Float32}}, x::CUDA.CuArray{Float32, 2, CUDA.Mem.DeviceBuffer})
   @ CUDA ~/.cache/julia-buildkite-plugin/depots/76057d1d-9065-4525-8f99-c59b2e38dd89/packages/CUDA/DfvRa/src/array.jl:319
 [2] getrf!(A::CUDA.CuArray{Float32, 2, CUDA.Mem.DeviceBuffer})
   @ LinearAlgebra.LAPACK ~/.cache/julia-buildkite-plugin/julia_installs/bin/linux/x64/1.7/julia-1.7-latest-linux-x86_64/share/julia/stdlib/v1.7/LinearAlgebra/src/lapack.jl:575
 [3] lu!(A::CUDA.CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, ::LinearAlgebra.RowMaximum; check::Bool)
   @ LinearAlgebra ~/.cache/julia-buildkite-plugin/julia_installs/bin/linux/x64/1.7/julia-1.7-latest-linux-x86_64/share/julia/stdlib/v1.7/LinearAlgebra/src/lu.jl:81
 [4] lu(A::CUDA.CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, pivot::LinearAlgebra.RowMaximum; check::Bool)
   @ LinearAlgebra ~/.cache/julia-buildkite-plugin/julia_installs/bin/linux/x64/1.7/julia-1.7-latest-linux-x86_64/share/julia/stdlib/v1.7/LinearAlgebra/src/lu.jl:279
 [5] lu(A::CUDA.CuArray{Float32, 2, CUDA.Mem.DeviceBuffer}, pivot::LinearAlgebra.RowMaximum) (repeats 2 times)
   @ LinearAlgebra ~/.cache/julia-buildkite-plugin/julia_installs/bin/linux/x64/1.7/julia-1.7-latest-linux-x86_64/share/julia/stdlib/v1.7/LinearAlgebra/src/lu.jl:278
 [6] lu_instance(A::CUDA.CuArray{Float32, 2, CUDA.Mem.DeviceBuffer})
   @ ArrayInterfaceGPUArrays ~/.cache/julia-buildkite-plugin/depots/76057d1d-9065-4525-8f99-c59b2e38dd89/packages/ArrayInterfaceGPUArrays/lwGfo/src/ArrayInterfaceGPUArrays.jl:23
```

https://github.com/JuliaGPU/CUDA.jl/blob/master/lib/cusolver/linalg.jl#L314-L316

@maleadt how come LU factorization is only for v1.8 and above?
The last time this CI ran, the default factorization for GPU was QR. It has since changed to LU, which fails because of the version bound. But that's unrelated to this repo, so I'll merge.
I only implemented it after we got the generalized factorizations in 1.8, and didn't bother porting it to older versions (which is a hassle, because it requires CUDA-specific factorization objects, i.e. CuLU, etc.).
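For reference, a minimal sketch of the kind of version gate this implies (illustrative only; `foo` is a made-up function, not CUDA.jl's actual code):

```julia
# Methods that depend on the generalized factorization machinery added in
# Julia 1.8 can be defined conditionally, leaving older versions without them.
@static if VERSION >= v"1.8"
    foo(x) = "uses the 1.8+ factorization support"
else
    foo(x) = error("requires Julia 1.8 or newer")
end
```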
This seems to solve #29 with `ReverseDiffVJP()`.
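For reference, `ReverseDiffVJP` is the ReverseDiff-based vector-Jacobian product choice in the SciML adjoint machinery; a sketch of how it is typically selected (assuming the SciMLSensitivity API):

```julia
using SciMLSensitivity

# Use ReverseDiff for the vector-Jacobian products in the adjoint pass;
# `true` asks ReverseDiff to compile the tape for speed.
sensealg = InterpolatingAdjoint(autojacvec = ReverseDiffVJP(true))
```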