
Integrate with TensorFlow.jl #29

Open
davidanthoff opened this issue May 3, 2017 · 6 comments

@davidanthoff
Member

If possible.

davidanthoff added this to the Backlog milestone May 3, 2017
@oxinabox

I'm confused as to what this would even mean.

@davidanthoff
Member Author

I think I had the idea that TensorFlow probably needs some data input, and that it would be nice if it could consume an iterable table. But I have no idea how TensorFlow works, so this might indeed be a nonsense idea :)

@oxinabox

The main way to get data into TensorFlow is passing it in via run: you load up your data in Julia, then pass it as an argument to the TensorFlow run call, which I assume already works just fine.
E.g. run(sess, Ycancerrisk, Dict(Xage => df[:age], Xoccupation => df[:occupation])).
This is the only way I've ever loaded data into TensorFlow myself.
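
For reference, a minimal sketch of that pattern (the tiny graph below is just an illustrative stand-in, not a real model):

using TensorFlow, DataFrames

sess = Session(Graph())
Xage = placeholder(Float32)
Xoccupation = placeholder(Float32)
Ycancerrisk = Xage + Xoccupation  # stand-in for a real model

df = DataFrame(age = Float32[30, 45, 60], occupation = Float32[1, 2, 3])

# all of the data lives in Julia and is only handed to the C backend at run time
risk = run(sess, Ycancerrisk, Dict(Xage => df[:age], Xoccupation => df[:occupation]))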

The other way is to use various things like readers,
so that the TensorFlow C backend loads the data directly.
I'm not sure how queues fit into the picture.
It might be that there is a place for IterableTables.jl in integrating with queues.

@davidanthoff
Member Author

I guess one thing that would work is that one could pass a table directly as the third argument, instead of the Dict, and then it would use the data from the columns by name. For example, with your line it would be

run(sess, Ycancerrisk, df)

And the df variable could be any iterable table that has columns Xage and Xoccupation. Would that be useful?
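
Just to make that concrete, here is a purely hypothetical sketch of what such a method could look like (columntable, get_tensor_by_name and the graph field are assumptions for illustration, not existing TensorFlow.jl or IterableTables.jl API):

function Base.run(sess::Session, output, table)
    cols = columntable(table)  # hypothetical helper: iterable table -> named column vectors
    # resolve each column name to the placeholder tensor with the same name,
    # then dispatch to the existing Dict-based run method
    feed = Dict(get_tensor_by_name(sess.graph, String(n)) => getproperty(cols, n)
                for n in propertynames(cols))
    return run(sess, output, feed)
end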

It would also mean that you could pass in a Query.jl result directly, without materializing it first:

q = @from i in df begin
    @select {Xage=i.age, Xoccupation=i.occupation}
end
run(sess, Ycancerrisk, q)

We could also enable the piping syntax that I showed at JuliaCon, so that things like this would work:

df |> run(sess, Ycancerrisk)

df |>
@query(i, begin
    @select {Xage=i.age, Xoccupation=i.occupation}
end) |>
run(sess, Ycancerrisk)
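
One way this could be sketched (purely hypothetical, with a made-up helper name) is a curried wrapper that captures the session and the fetch and takes the table last:

runwith(sess, output) = table -> run(sess, output, table)  # hypothetical; builds on the table-accepting method above

df |> runwith(sess, Ycancerrisk)

Whether this should literally be a method of run is an open question, since run(sess, output) already means running output without a feed dict.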

I have no idea how useful any of this would be :)

@oxinabox

And the df variable could be any iterable table that has columns Xage and Xoccupation. Would that be useful?

Maybe.
It would basically come for free if the feed_dict (e.g. Dict(Xage => df[:age], Xoccupation => df[:occupation]))
were changed from Associative{Tensor, Any} to also be allowed to be Associative{Symbol, Any}.
That requires the Symbols to be resolved to the Tensor nodes they represent,
which can generally be done right now.
I have had some general thoughts in this direction about revamping run
(probably also with a macro form).
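
A rough sketch of what that resolution step could look like (resolve_feed is a made-up helper, and get_tensor_by_name / the graph field are assumptions about TensorFlow.jl internals, only for illustration):

function resolve_feed(sess, feed)
    # map Symbol keys to the Tensor nodes with that name; Tensor keys pass through unchanged
    Dict((k isa Symbol ? get_tensor_by_name(sess.graph, String(k)) : k) => v
         for (k, v) in feed)
end

run(sess, Ycancerrisk, resolve_feed(sess, Dict(:Xage => df[:age], :Xoccupation => df[:occupation])))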

It would also mean that you could pass in a Query.jl result directly, without materializing it first:

If you mean delaying the computation via lazy evaluation, then that is not possible (without substantial additional magic).
Not far behind run, the data is collected so that it can be sent over the FFI to the TensorFlow C/C++ backend.
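
So in practice the query result would have to be materialized before feeding, for example by collecting it into a DataFrame first (sketch):

res = DataFrame(q)  # forces the query to actually run
run(sess, Ycancerrisk, Dict(Xage => res[:Xage], Xoccupation => res[:Xoccupation]))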

@davidanthoff
Member Author

Ah, I had not looked carefully enough at the Dict there. Another option would be to pass in an iterable table, and then a list of pairs that map tensor objects to symbols that refer to the columns. Not a huge improvement, but it would save a few characters of typing, I guess.
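
In call form that suggestion might look something like this (purely hypothetical, no such method exists):

run(sess, Ycancerrisk, df, [Xage => :age, Xoccupation => :occupation])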
