Description
The current default for these operations on Spark arrays is axis=(0,), which may incur a swap to distribute along that axis (if the array isn't already distributed along it). The default could instead be axis=None, which would mean apply over the distributed axes (whatever they are) and would never incur a swap.
Suggested by @shoyer, thanks!
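To make the swap concern concrete, here is a rough sketch (not the library's actual internals) that models a distributed array as (key, value) records, where the key indexes the distributed axes and the value is the local block over the remaining axes:

import numpy as np

data = np.ones((2, 3, 4))

# Distributed along axis 0: keys index axis 0, values are local (3, 4) blocks
dist_axis0 = {(i,): data[i] for i in range(2)}

# Distributed along axes (0, 1): keys index axes 0 and 1, values are (4,) blocks
dist_axis01 = {(i, j): data[i, j] for i in range(2) for j in range(3)}

In this picture, a map with axis=(0,) on the second layout would first have to swap (regroup the records so that only axis 0 is in the key) before applying the function, whereas a map with axis=None would simply apply the function to each value as it stands.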
This generally seems like a friendlier default; the only issue arises not with map but with reduce, when considering sequences of mixed operations. For example, in the following two cases, where the map is a no-op,
data = ones((2, 3, 4), sc)
data.map(lambda x: x, axis=(0,)).reduce(add)
data.map(lambda x: x, axis=(0,1)).reduce(add)
if the default for reduce is over the partitioned axes, the answer will be different in the two cases, whereas if the default is over axis=(0,) it will be the same.
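Here is a rough local analogy (assuming add is e.g. operator.add and that reduce(add) aggregates over the partitioned axes) of why the two chains would then give different results:

import numpy as np
from operator import add
from functools import reduce

data = np.ones((2, 3, 4))

# reduce over the partitioned axis (0,): result has shape (3, 4), filled with 2
out1 = reduce(add, [data[i] for i in range(2)])

# reduce over the partitioned axes (0, 1): result has shape (4,), filled with 6
out2 = reduce(add, [data[i, j] for i in range(2) for j in range(3)])

With a fixed default of axis=(0,), both chains would instead return the first result.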
I can see an argument that these really should give the same result with the default parameters, but I'm curious to get other opinions. Another option is to use different defaults for map/filter and reduce.
cc @andrewosh