Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Missing distributed array creation methods #1278

Open
1 of 5 tasks
stevenrbrandt opened this issue Oct 6, 2020 · 21 comments
Open
1 of 5 tasks

Missing distributed array creation methods #1278

stevenrbrandt opened this issue Oct 6, 2020 · 21 comments

Comments

@stevenrbrandt
Copy link
Member

stevenrbrandt commented Oct 6, 2020

  • array_d(array) : split up an existing array across localities (we have tile(), see below)

  • fromfunction(func, shape) : like numpy's fromfunction (see Adding indices and fromfunction primitives #1289)

  • file_read_hdf5_d(filename) : like file_read_csv_d except it reads from hdf5 files

Related methods that would be helpful...

  • file_write_hdf5_d(filename) : like file_read_hdf5_d except it writes files

  • file_write_csv_d(filename) : like file_write_hdf5_d except it writes csv files

@ct-clmsn
Copy link
Contributor

ct-clmsn commented Oct 7, 2020

@stevenrbrandt,

  • Would array_d accept an array shape and raw data as an input argument?

  • If the input is a shape, would the shape be generated across multiple machines symmetrically? by that, I mean, an input of 4x4 would create 4x4 tiles across N-machines.

  • Could a shape argument be capable of defining the size, how many machines to tile across, and the memory layout of the tile (row/column)?

example:

array_d([[0,0,0,0],[0,0,0,0],[0,0,0,0],[0,0,0,0]]) -> each machine creates a 4x4 matrix tile of 0's.

array_d(shape=((4,4), row)) -> all machines create a 4x4 row order tile

array_d(shape=( (4,4), [ (0, row) , (1, col) ] ) ) -> each machine makes a 4x4 tile, machine 0 creates row order tile, machine 1 creates a column order tile; if a machine isn't listed, it creates a 4x4 using a default layout.

@hkaiser
Copy link
Member

hkaiser commented Oct 7, 2020

array_d( array ) : split up an existing array across localities

We have tile(array, ...) that is doing exactly that (see: https://github.com/STEllAR-GROUP/phylanx/blob/master/src/plugins/matrixops/tile_operation.cpp#L32-L45)

@hkaiser
Copy link
Member

hkaiser commented Oct 7, 2020

@stevenrbrandt How should the distributed array be initialized? We have available the equivalents for numpy.zeros, numpy.ones, numpy.empty, etc. (all of those are mapped onto our constant primitive). We should however make sure that our distributed constant_d primitive supports all variations.

@stevenrbrandt
Copy link
Member Author

@hkaiser array_d(arg) should not take a shape arg. It should work kind of like numpy.array(arg) does. It should convert a non-distributed array into a distributed one. The value arg should keep both its data and its shape, but that shape should get split up.

The 'tile()' primitive creates a bigger array by replicating an input array. I assume you meant retile_d()? As discussed in the meeting, retile_d() does not create a distributed array, it retiles an existing distributed array.

array_d() could take an optional tiling argument and default to sym.

@stevenrbrandt
Copy link
Member Author

@ct-clmsn see above about what I think array_d() ought to do.

@hkaiser
Copy link
Member

hkaiser commented Oct 8, 2020

@hkaiser array_d(arg) should not take a shape arg. It should work kind of like numpy.array(arg) does. It should convert a non-distributed array into a distributed one. The value arg should keep both its data and its shape, but that shape should get split up.

The 'tile()' primitive creates a bigger array by replicating an input array. I assume you meant retile_d()? As discussed in the meeting, retile_d() does not create a distributed array, it retiles an existing distributed array.

array_d() could take an optional tiling argument and default to sym.

In all of this you are assuming a single locality loading a file (or otherwise generating the array) which should then be distributed across other localities. The execution model we target at this point however is pure SPMD, i.e. every locality executes the same code. How should the array_d primitive proposed operate when executed on the localities that do not have the data? In the end the (re-)tile primitive would do the right thing if one of the localities has all of the data and all the other localities just have an empty array.

@stevenrbrandt
Copy link
Member Author

@hkaiser -- if we assume pure SPMD, then any non-distributed array must have an identical copy on each locality (unless it's a random array). I guess that means array_d() would throw away all but it's part of the data and provide a unifying id for the distinct pieces.

@hkaiser
Copy link
Member

hkaiser commented Oct 8, 2020

@hkaiser -- if we assume pure SPMD, then any non-distributed array must have an identical copy on each locality (unless it's a random array). I guess that means array_d() would throw away all but it's part of the data and provide a unifying id for the distinct pieces.

Ok. You can do that using slice_d already, I think. If that doesn't work (yet), using slice/annotate should do the trick as well.

@hkaiser
Copy link
Member

hkaiser commented Oct 8, 2020

array_d([[0,0,0,0],[0,0,0,0],[0,0,0,0],[0,0,0,0]]) -> each machine creates a 4x4 matrix tile of 0's.

array_d(shape=((4,4), row)) -> all machines create a 4x4 row order tile

array_d(shape=( (4,4), [ (0, row) , (1, col) ] ) ) -> each machine makes a 4x4 tile, machine 0 creates row order tile, machine 1 creates a column order tile; if a machine isn't listed, it creates a 4x4 using a default layout.

@ct-clmsn I like this approach as it provides a concise way to specify tiles. We currently have some means for that, but it's difficult to use.

@ct-clmsn
Copy link
Contributor

ct-clmsn commented Oct 8, 2020

@hkaiser - in reference to tile - cool! i recalled ya'll had some mechanism defined earlier, couldn't remember the name. As for the creation routines - thanks, we've collectively talked about doing these sorts of things, not sure if we've ever placed it into text before; hope it captures the discussion lines.

@stevenrbrandt - I like the concept of registering local data into a distributed AGAS container (matrix/array/vector). did you have any comment on the examples in the response I posted earlier? Do those fit into your model or are they separate conceptually and build on what you are thinking?

@stevenrbrandt
Copy link
Member Author

@hkaiser I don't see how slice_column_d() or slice_row_d() provide this functionality. As for annotate_d(), sure, that works. However, it means the user has to figure out exactly which piece of which array they want on which processor rather than just getting a standard tiling. I think annotate_d() is too low level for most users.

@stevenrbrandt
Copy link
Member Author

@ct-clmsn I'm not sure I understand your idea. Normally, when we think of a distributed array, I think of a large array with some definite shape that gets chopped into pieces and distributed. You seem to want to have a distributed array that exists as an identical copy on each processor, but with different tilings. Is that correct? I'm a little unclear about how such an array is kept in sync during a calculation.

@hkaiser
Copy link
Member

hkaiser commented Oct 20, 2020

@stevenrbrandt I don't think having fromfunction_d would be appropriate. The numpy docs say that fromfunction returns whatever the function object passed to it returns, which could possibly be not an array. As such, the distributed operation should be performed by that function object:

    define(dist_array, random_d(list(20, 20))),
    fromfunction(lambda(i, j, slice(dist_array, i, j)), list(2, 2))

or similar. Having a local fromfunction would be very useful, however (along with a local version of indices).

@ct-clmsn
Copy link
Contributor

ct-clmsn commented Oct 20, 2020

I'm not sure I understand your idea. Normally, when we think of a distributed array, I think of a large array with some definite shape that gets chopped into pieces and distributed. You seem to want to have a distributed array that exists as an identical copy on each processor but with different tilings. Is that correct? I'm a little unclear about how such an array is kept in sync during a calculation.

@stevenrbrandt My bad! You are on point; I misstated something...this may not lend itself to a text exchange, will save for a conference call. For clarity, The 'local copy' in the question posed was about a tile that composes a matrix portion. If you consider the matrix as a globally addressable Java/C++ container type, each tile is globally addressable, and the data in the tile is stored on a local node - that last bit feels a bit regressive to state but is posted for clarity.

array_d([[0,0,0,0],[0,0,0,0],[0,0,0,0],[0,0,0,0]]) -> each machine creates a 4x4 matrix tile of 0's.

This feature intends that a user knows how big each tile should be on each of the N machines ahead of time. The final matrix size is 4x4 * N machines. For a 4 node computation, you get a 16x16 matrix. The 16x16 matrix is broken up into 4 tiles of 4x4 mini-matrices.

array_d(shape=((4,4), row)) -> all machines create a 4x4 row order tile

A similar way of saying the same thing in the previous example adds in a row or column layout for the data. The 4x4 mini-matrix on each machine is stored in memory using a row layout.

array_d(shape=( (4,4), [ (0, row) , (1, col) ] ) ) -> each machine makes a 4x4 tile, machine 0 creates row order tile, machine 1 creates a column order tile; if a machine isn't listed, it creates a 4x4 using a default layout.

A similar way of saying the same thing in the previous example provides users with a finer level of granularity for memory layout. The 0'th machine's 4x4 matrix is stored in a row order fashion, the 1st machine's 4x4 matrix is stored in a column order fashion.

@stevenrbrandt
Copy link
Member Author

@hkaiser I'm a little confused. What do you mean it might not be an array? The lambda should always return a scalar. Array creation with fromfunction() looks like this:

>>> np.fromfunction(lambda i,j : i*10+j,(3,4))
array([[ 0.,  1.,  2.,  3.],
       [10., 11., 12., 13.],
       [20., 21., 22., 23.]])

from_function_d() would possibly need a list of localities and/or a tiling, but should otherwise work the same. Yes?

@stevenrbrandt
Copy link
Member Author

@ct-clmsn I think it's better to pass the size of the global array and the number of localities and have some algorithm figure out how to make it work. What happens with your routine if you ask for 4x4 and want 3 localities? If I passed a 6x6 and asked for 3 localities, I could get a 2x6 and two 3x4's. I don't want to have to figure that kind of thing out for myself, normally.

@hkaiser
Copy link
Member

hkaiser commented Oct 20, 2020

@hkaiser I'm a little confused. What do you mean it might not be an array? The lambda should always return a scalar. Array creation with fromfunction() looks like this:

>>> np.fromfunction(lambda i,j : i*10+j,(3,4))
array([[ 0.,  1.,  2.,  3.],
       [10., 11., 12., 13.],
       [20., 21., 22., 23.]])

from_function_d() would possibly need a list of localities and/or a tiling, but should otherwise work the same. Yes?

I think this is not quite how it works. The supplied function is called with arrays of indices, one array per dimension:

import numpy as np 
np.fromfunction(lambda i, j: print("i: ", i, "\nj: ", j), (2, 3))  

will print:

i:  [[0. 0. 0.] 
 [1. 1. 1.]]    
j:  [[0. 1. 2.] 
 [0. 1. 2.]]    

Also, for instance:

print(np.fromfunction(lambda i, j: "boo!", (2, 3)) )

will output:

'boo!'

@ct-clmsn
Copy link
Contributor

ct-clmsn commented Oct 20, 2020

@stevenrbrandt - Understood, I've been thinking about your suggestion to have a Python function that does the partitioning automatically, and that's where I believe we (could) all agree with Avah's automatic tiling and data layout work fits into the picture. That said, something like what I'm proposing will be needed by Avah's contributions for building that automatic capability - think of the proposal as an entry point for Avah's contributions. The proposal can also allow people who need or want that sort of granular control to express or compute layouts they may require.

@stevenrbrandt
Copy link
Member Author

stevenrbrandt commented Oct 21, 2020

@hkaiser, so I see I was mistaken. It passes an array of integers--it just happens to give the same answer I was expecting. I think it would be okay for fromfunction_d() to require the output to match the shape of the input.

@stevenrbrandt
Copy link
Member Author

@ct-clmsn one can, in principle do what you propose with annotate_d(), I believe. It's a question of level of user friendliness.

@hkaiser
Copy link
Member

hkaiser commented Oct 21, 2020

@hkaiser, so I see I was mistaken. It passes an array of integers--it just happens to give the same answer I was expecting. I think it would be okay for fromfunction_d() to require the output to match the shape of the input.

I merged a local-only version of fromfunction just now (see #1289), which should be sufficient as it will return whatever the supplied function object returns. So all of the distributed handlings can be done there.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

3 participants