Do not require DenseArray #2
The problem is that all the methods are implemented by assuming that, in the end, there is a dense memory region underlying the data: all indices are ultimately transformed into a linear index into this dense parent array. I guess I could add a generic fallback for …; I should check whether that would work with the …
Why not simply call …? If you want to fall back to an implementation for any …
My apologies for not responding sooner. Maybe this is a good strategy nowadays; I think there used to be a time that … Still, in the end the current implementation needs an object into which it can index linearly, using the strides to compute the linear index. Maybe that fails for some cases which are not …
I don't see what …
This addresses #2. In particular, while this allows `StridedView`s to be used for non-`DenseArray` types, it does not define a constructor for them. This reflects the fact that in general, `StridedView`s are only well-defined for `DenseArray`s, but allows users to manually tap into the machinery of `Strided`, in which case it is up to them to ensure correctness.
My apologies for the late response. I am afraid that what you say is not true. Consider the following:

julia> A = randn(10,10);
julia> B = view(A, 1:2:10, 1:2:10); strides(B)
(2, 20)
julia> B[3,2], B[1 + (3-1)*2 + (2-1)*20], B.parent[1 + (3-1)*2+(2-1)*20]
(-1.3798464342130103, -1.7417661865291487, -1.3798464342130103)
julia> B[3,4], B[1 + (3-1)*2 + (4-1)*20]
ERROR: BoundsError: attempt to access 5×5 view(::Matrix{Float64}, 1:2:9, 1:2:9) with eltype Float64 at index [65]
Stacktrace:
[1] throw_boundserror(A::SubArray{Float64, 2, Matrix{Float64}, Tuple{StepRange{Int64, Int64}, StepRange{Int64, Int64}}, false}, I::Tuple{Int64})
@ Base ./abstractarray.jl:744
[2] checkbounds
@ ./abstractarray.jl:709 [inlined]
[3] _getindex
@ ./abstractarray.jl:1340 [inlined]
[4] getindex(A::SubArray{Float64, 2, Matrix{Float64}, Tuple{StepRange{Int64, Int64}, StepRange{Int64, Int64}}, false}, I::Int64)
@ Base ./abstractarray.jl:1296
[5] top-level scope
@ REPL[10]:1

As you can see, it is only because I can do linear indexing in the parent array associated with the view, or more generally, in the memory associated with that array and its strides, that I can use the strides to compute all the index information. Hence, I take …
To add to that, we are loosening the restriction of …
I don't follow the code in the issue above (what's "B"?). If the point is that an array can be strided but still not contiguous, well, of course. That is, if I have a 5×6 array, then strides (1, 5) and (6, 1) indicate a contiguous layout in memory, and any other strides indicate a "Swiss cheese" array, e.g. (2, 10). It is reasonable that some operations will only work for strided arrays that are also contiguous. Still, not all such arrays happen to be … Am I missing something?
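The contiguity condition described here can be checked from sizes and strides alone. A sketch (my own, not part of the package): sort dimensions by stride and verify each stride equals the product of the sizes of all faster-varying dimensions.

```julia
# Sketch: a strided layout is contiguous iff, after sorting dimensions by
# stride, each stride equals the product of the faster dimensions' sizes.
function iscontiguous(sz::Dims, st::Dims)
    p = sortperm(collect(st))
    expected = 1
    for k in p
        st[k] == expected || return false
        expected *= sz[k]
    end
    return true
end

iscontiguous((5, 6), (1, 5))   # true  (column-major)
iscontiguous((5, 6), (6, 1))   # true  (row-major)
iscontiguous((5, 6), (2, 10))  # false (the "Swiss cheese" case)
```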
My example is indeed missing a line, my apologies for that; not sure how that happened. I have now corrected it. The point is that linear indexing into an array does not mean that you are accessing the underlying memory. Julia's indexing behaviour makes it such that you can do linear indexing into any `AbstractArray`. There is no general way to directly index into the memory that backs this `AbstractArray`, for which one would compute a "linear memory index" using the actual stride information. For the Base wrapper types, this is only possible by recursively calling `parent` …
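The recursive unwrapping alluded to here can be sketched as follows (a hypothetical helper; `memparent` is an invented name, not the package's actual code):

```julia
# Sketch: peel off Base's wrapper types via `parent` until reaching a
# memory-backed DenseArray, into which a stride-computed linear index is valid.
memparent(A::DenseArray) = A
memparent(A::AbstractArray) =
    A === parent(A) ? error("no dense parent found") : memparent(parent(A))

A = randn(10, 10)
B = view(A, 1:2:10, 1:2:10)
memparent(B) === A   # true: the view's memory is A's memory
```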
I thought the whole point of the … Consider this session: …
Yes, one can write both … Looking at the …, it seems to me that the only safe way to distinguish between the cases is by checking whether …
I have a different use case with a different solution. I was hoping to compose multiple StridedViews, similar to what they do in tinygrad. So to index into a composition of strided views, you would do outer multi-index -> linear index -> inner multi-index -> linear index, and then finally use that to index into memory. In other words, we could pretend that the parent is a dense block of memory, and in the cases where it actually is, hopefully the compiler would recognize that. Sounds like maybe that's not the point of this package, though.
The point is that I still need to know, if I compute a linear index using the strides, what to index into. Consider the following:

julia> A = randn(5,5)
5×5 Matrix{Float64}:
-1.83299 0.373954 -0.0752648 0.504695 1.99652
-0.442616 0.411342 1.24364 -0.0993078 -0.670033
0.394759 0.672543 1.53753 -0.569871 -1.55445
-0.714782 0.910227 0.0708801 -0.317265 0.563627
-0.0564112 -1.16758 -1.93046 2.44655 1.28567
julia> B = view(A, 1:3, 1:5);
julia> strides(B)
(1, 5)
julia> (x, y) = (3, 3)
(3, 3)
julia> B[x, y]
1.537533054119266
julia> B[1 + (x-1) * stride(B, 1) + (y-1) * stride(B, 2)]
1.9965179361688334
julia> parent(B)[1 + (x-1) * stride(B, 1) + (y-1) * stride(B, 2)]
1.537533054119266

The only reason I can get the correct element …
I don't fully understand the considerations here, but I was imagining we could define linear indexing into sparse arrays to agree with their multi-indexing behavior. So we could define … instead of just treating …
Yes, that is what Julia's linear indexing already allows you to do. The point is that Strided.jl is specifically about data that is laid out according to a strided pattern in memory, and that it will optimize the data access pattern using that information.
I think I have identified a confusion (in my head at least) between two aspects of strides. Making this explicit would help clarify the discussion. The way I see it, there are two possible meanings for "strides":

One is that if you look at the memory layout, the physical addresses of where the data is, then it is possible to compute the address of the element at (i, j) by a formula of the form `base + (i-1)*s1 + (j-1)*s2`.

Another applies to a "strided view" and says that if you have some parent array and a strided view, it is possible to compute the index-in-the-parent of the element at (i, j) in the view by a formula of the same form.

Note that one doesn't imply the other: one can have virtual strides and not physical strides. I think that in practice physical strides imply virtual strides, but I suppose one could construct a pathological case where this isn't the case? Either way, I think that in the discussion, the term "strides" sometimes means "physical strides" and sometimes means "virtual strides", depending on the author and the context.

As a concrete example of the difference between the two, consider a subset of just the even-indexed columns of a …

There is a second distinction between cases where one of …

Is it correct to say that at the end of the day, what …

The original motivation for this issue is that one can have contiguous, physically strided arrays which are not …

I assumed that the …
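To make the two notions concrete, a small sketch of my own (with a dense column-major parent, so both notions exist and happen to coincide):

```julia
A = collect(reshape(1.0:24.0, 4, 6))   # dense 4×6 matrix
V = view(A, :, 1:2:6)                  # odd-indexed columns: 1, 3, 5

# "virtual" strides: the index-in-the-parent of V[i, j] is
# 1 + (i-1)*1 + (j-1)*8, where (1, 8) are the view's strides
@assert parent(V)[1 + (3-1)*1 + (2-1)*8] == V[3, 2]

# because the parent is dense, the same numbers are also "physical"
# memory strides (in units of elements):
strides(V)   # (1, 8)
```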
I was indeed talking about physical strides, and this is also the meaning of …:

help?> strides
search: strides stride trues string otimes strip sortslices

  strides(A)

  Return a tuple of the memory strides in each dimension.

Unfortunately, that interface is incomplete since, once you have computed the physical memory address or offset, Julia does not provide a standardized interface to request the corresponding element at that position, other than by pointer conversion, pointer arithmetic and other unsafe operations. In that sense, requiring a …
Perhaps this might be interesting to you: Julia already offers various tools that work by simply "mapping indices", often also by combining some wrapper types. Some examples of things you might be interested in are below, which can be used to obtain the mapping behaviour you like.

julia> using SparseArrays
julia> v = sprand(100, 0.2)
100-element SparseVector{Float64, Int64} with 26 stored entries:
[...]
julia> b = view(v, 3:60)
58-element view(::SparseVector{Float64, Int64}, 3:60) with eltype Float64:
[...]
julia> c = reshape(b, 2, 29)
2×29 reshape(view(::SparseVector{Float64, Int64}, 3:60), 2, 29) with eltype Float64:
[...]
julia> d = PermutedDimsArray(c, (2, 1))
29×2 PermutedDimsArray(reshape(view(::SparseVector{Float64, Int64}, 3:60), 2, 29), (2, 1)) with eltype Float64:
[...]
julia> c[2, 2] == c[4] == b[4] == v[3 + 4 - 1] == d[2, 2]
true

Note that these constructions are completely generic, in the sense that they are designed to work with any `AbstractArray`. If you could elaborate a bit more on what you would like to achieve, given that I'm not too familiar with tinygrad, we might be able to help out better?
@Jutho - It seems you can:

1. Call …
2. Call …
3. Call …
4. Use the formula …

Basically, if the array you were given is not a `DenseArray`, you create a `DenseArray` wrapper object that is "functionally equivalent" to it. The rest of the code would work exactly as it does today, using the public API of the `DenseArray`. Admittedly the call to … This is a change only in the constructor (and adding a data member to hold on to the original parent). The rest of the code would be unaffected. Makes sense?
Probably; there was an …
Interesting - so there's a case where …
I'm not sure under which category the … The current state of things is counter-intuitive: …
As stated above, I agree that it would be nice if there were a standard Julia interface to complement … Nonetheless, I would also like to argue that …
Isn't the combination of …
I made my own:

struct View{T,N,P<:AbstractArray{T}} <: AbstractArray{T,N}
parent::P
dims::NTuple{N,Int}
strides::NTuple{N,Int}
offset::Int
    function View{T,N,P}(parent, dims, strides, offset) where {T,N,P<:AbstractArray{T}}
        @inline
        # reject empty dims and any index that could fall outside the
        # parent's linear index range [1, length(parent)]
        if any(dims .< 1) ||
           sum(max.(0, strides) .* (dims .- 1)) + offset >= length(parent) ||
           sum(min.(0, strides) .* (dims .- 1)) + offset < 0
            throw(ArgumentError("unsafe view, parentdims: $(size(parent)) dims: $dims strides: $strides offset: $offset"))
        end
        new(parent, dims, strides, offset)
    end
new(parent, dims, strides, offset)
end
end
View(parent::P, dims::NTuple{N,Int}, strides::NTuple{N,Int}, offset) where {T,N,P<:AbstractArray{T}} = View{T,N,P}(parent, dims, strides, offset)
Base.size(A::View) = A.dims
function Base.getindex(A::View{T,N}, I::Vararg{Int,N}) where {T,N}
@inline
@boundscheck checkbounds(A, I...)
idx = sum(A.strides .* (I .- 1)) + A.offset + 1
@inbounds out = A.parent[idx]
out
end
function Base.setindex!(A::View{T,N}, val, I::Vararg{Int,N}) where {T,N}
@inline
@boundscheck checkbounds(A, I...)
idx = sum(A.strides .* (I .- 1)) + A.offset + 1
@inbounds A.parent[idx] = val
A
end
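For reference, a hypothetical usage example of the `View` type defined above (assuming its definitions are loaded):

```julia
A = collect(1.0:20.0)             # dense backing storage, 20 elements
V = View(A, (2, 5), (1, 4), 2)    # 2×5 window: strides (1, 4), offset 2

# V[i, j] maps to A[(i-1)*1 + (j-1)*4 + offset + 1]
V[1, 1] == A[3]    # true
V[2, 3] == A[12]   # true: (2-1)*1 + (3-1)*4 + 2 + 1 == 12

V[2, 3] = 99.0     # writes through to the parent
A[12] == 99.0      # true
```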
No it's not:

julia> supertype(Array)
DenseArray

julia> supertype(DenseArray)
AbstractArray

@akriegman, as I mentioned above, this doesn't work. The index …
If the proposal is to use the pointer to the beginning of the parent, and then …:

julia> a = rand(5)
5-element Vector{Float64}:
0.9133291929143837
0.17935218848356715
0.31691884142854054
0.7247547504715043
0.7583702853388546
julia> pa = pointer(a)
Ptr{Float64} @0x000000010dee8460
julia> unsafe_load(pa, 3), a[3]
(0.31691884142854054, 0.31691884142854054)
julia> b = [randn(5) for n=1:5]
5-element Vector{Vector{Float64}}:
[-0.39054832410530715, 1.2970022449051175, -1.1243745025700178, 1.2447151441862772, -0.4948897579954707]
[0.035007918114120774, -1.4004292420450568, -0.17782930468583014, 0.4504752715516825, 0.07151919367908639]
[1.3622083111074377, -0.6179471078219791, 0.7859175178496625, 0.07098911546608531, -0.5013870379927964]
[-1.693626877184436, 0.11838104727817299, 0.27541815011285736, 1.1682348106310123, 0.7477141794602888]
[0.03151197005787201, 0.5780909060892557, 0.8253062475739976, -1.7227093088929948, -0.10541693001699494]
julia> pb = pointer(b)
Ptr{Vector{Float64}} @0x000000010f195020
julia> unsafe_load(pb, 3)
ERROR: pointerref: invalid pointer type
Stacktrace:
[1] unsafe_load(p::Ptr{Vector{Float64}}, i::Int64)
@ Base ./pointer.jl:153
[2] top-level scope
@ REPL[25]:1
julia> b[3]
5-element Vector{Float64}:
1.3622083111074377
-0.6179471078219791
0.7859175178496625
0.07098911546608531
-0.5013870379927964
You are right, I misspoke, sorry. I do want to nitpick about "Linear indexing with idx will only be correct if parent has its data laid out contiguously in memory": that's not technically correct. The entries of each column must be contiguous, but the start of one column does not have to be immediately after the end of the previous column. It is sufficient that the offset between the start of each column and the previous one is constant. That is, any strides of the form … Your claim that this would only work for …
Also, your counter-example in #2 (comment) isn't one. It merely demonstrates that one can't take the physical strides and use them as logical strides (hence my rambling about the difference between the two). But that's irrelevant: the proposed implementation would not pass the linear address from the strides to the original array, it would pass it to a different "physical array" constructed by … Other examples of arrays that could be accepted by … At the end of the day, the question is whether the … A separate question is whether …
I feel like I am repeating myself here. Yes, StridedViews.jl assumes … So, when claiming that the proposed approach does not work, I was specifically referring to the implementation put forward by @akriegman, which uses linear indexing with … I fully understand that with … Out of curiosity, have you also checked out https://github.com/JuliaSIMD/StrideArrays.jl and whether this could do what you want?
Fair enough, I didn't mean to get this thread into a rut, sorry. I have some bandwidth issues myself (who doesn't :-) but I'll see if I can find some for a PR. I'll also take a look at StrideArrays, it seems to be along the lines I described - thanks for the pointer! |
Right now the constructor of `StridedView` requires the base array to be a `DenseArray`. However, there are types which support `strides` and are not `DenseArray`, so they would work in a `StridedView` if it were not for the requirement.

How about removing the `<:DenseArray` from the signature? If the argument has `strides`, it "should" just work; if it does not, then it would fail with a more informative error message (missing `strides`) than today (not having a `StridedView` constructor).

Right now `StridedView` knows about some special cases (e.g. `Transpose`) which can be made strided without being `DenseArray`, but this doesn't work with non-`Base` types. I encountered this trying to pass `PyArray` from `PythonCall` to `@strided`, but this is just one example.