
Conversation

@carusyte (Contributor) commented Aug 6, 2018

I ran into a couple of issues when I tried to use the DNC code, and after about a month of tuning I've managed to incorporate this model into my existing model. The major modifications in this PR:

  1. In order to reuse this code I had to copy-paste it, because it can't be installed with pip, which runs counter to best practice. So I made it pip-installable, with reference to the Convert to a package installable via pip #20 PR. One can generate an installable package by running python setup.py sdist under the root directory of this repo and then install it with pip install ./dist/dnc-0.0.1.tar.gz. Some versioning and author information may not be correct; it can be edited or enriched in setup.py.
  2. tf.unstack can't process tensors whose shape is only partially known, which I think is quite common in many DL application scenarios. So I refactored the affected methods in util.py to use a combination of other tensor operations as a workaround, while ensuring those operations can run on the GPU.
  3. Performance. Some operations (or their gradients) can't run on the GPU as of TensorFlow 1.9 (e.g. tf.reduce_prod, see #3957 and #8841, and tf.nn.top_k, see #5719). I worked around tf.reduce_prod by using tf.cumprod instead (see the sketch after this list), and made some modifications to tf.nn.top_k's gradient method.
  4. Deprecated APIs. Some methods and function arguments are deprecated as of TF v1.9. I've replaced them with the recommended substitutions so that no warnings are shown during startup.
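
For reference, here is the shape of the tf.cumprod workaround from item 3. It's a minimal sketch assembled from the fragments quoted in the review below (it assumes reduction over axis 1 of a tensor with a leading batch dimension; the exact merged code may differ):

import tensorflow as tf

def reduce_prod(x, axis, name=None):
  """Efficient reduce product over axis.

  Uses tf.cumprod and tf.gather_nd as a workaround to the poor performance
  of calculating tf.reduce_prod's gradient on CPU.
  """
  with tf.name_scope(name, 'util_reduce_prod', values=[x]):
    # A reversed cumulative product places the product of the entire axis
    # at position 0 of that axis.
    cp = tf.cumprod(x, axis, reverse=True)
    # Gather element 0 of the reduced axis for every batch row (assumes
    # axis == 1), i.e. indices [[0, 0], [1, 0], [2, 0], ...].
    size = tf.shape(cp)[0]
    idx1 = tf.range(size)
    idx2 = tf.zeros([size], tf.int32)
    indices = tf.stack([idx1, idx2], 1)
    return tf.gather_nd(cp, indices)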

@dm-jrae (Contributor) left a comment

Thanks for the modernization! Some small comments that should be easy to implement :)

dnc/access.py Outdated
reset_weights = tf.expand_dims(reset_weights, 2)
weighted_resets = expand_address * reset_weights
reset_gate = tf.reduce_prod(1 - weighted_resets, [1])
# back prop of tf.reduce_prod runs on CPU

Could you remove the commented lines? (55 & 56)

dnc/access.py Outdated

import addressing
import util
from dnc import addressing, util

Nit, but we always place individual imports on a new line. Would you be able to replace these with:

from dnc import addressing
from dnc import util

"""
with tf.name_scope('link'):
batch_size = prev_link.get_shape()[0].value
# batch_size = prev_link.get_shape()[0].value

Can you remove the commented line?

# Calculate the aggregated effect of all write heads
write_weights = 1 - tf.reduce_prod(1 - write_weights, [1])
# back prop of reduce_prod runs on CPU
# write_weights = 1 - tf.reduce_prod(1 - write_weights, [1])

Can you remove the commented line here too, please?

free_read_weights = free_gate * read_weights
phi = tf.reduce_prod(1 - free_read_weights, [1], name='phi')
# back prop of reduce_prod runs on CPU
# phi = tf.reduce_prod(1 - free_read_weights, [1], name='phi')

^ and here

@@ -0,0 +1,82 @@
# Copyright 2017 Google Inc.

g4 mv hasn't quite worked with util.py, please fix :-)

dnc/util.py Outdated
size = tf.cast(tf.shape(perm)[0], tf.float32)
delta = tf.cast(tf.shape(perm)[-1], tf.float32)
rg = tf.range(0, size*delta, delta, dtype=tf.float32)
rg = tf.reshape(rg, [-1, 1])

nit: I think tf.expand_dims(rg, 1) is a bit clearer here, as you're specifically trying to expand out a dimension.
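
For context, here is a minimal sketch of the tf.unstack-free batch_invert_permutation under review in this thread, folding in the tf.expand_dims suggestion and keeping everything in int32 (as suggested later in the review). The PR's actual version routes through float32 casts for GPU compatibility, so treat this as illustrative only:

def batch_invert_permutation(permutations):
  """Returns batched `tf.invert_permutation` for every row in `permutations`."""
  with tf.name_scope('batch_invert_permutation', values=[permutations]):
    perm = tf.cast(permutations, tf.int32)
    dim = int(perm.get_shape()[-1])
    size = tf.shape(perm)[0]
    # Offset row i by i * dim so the whole batch flattens into one big
    # permutation of [0, size * dim), invertible in a single call.
    rg = tf.range(0, size * dim, dim, dtype=tf.int32)
    rg = tf.expand_dims(rg, 1)
    flat = tf.reshape(perm + rg, [-1])
    inverted = tf.invert_permutation(flat)
    # Undo the flattening and the per-row offsets.
    return tf.reshape(inverted, [-1, dim]) - rg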

dnc/util.py Outdated
def batch_invert_permutation(permutations):
  """Returns batched `tf.invert_permutation` for every row in `permutations`."""
  with tf.name_scope('batch_invert_permutation', values=[permutations]):
    # unpacked = tf.unstack(permutations)

Could you remove these commented lines?

dnc/util.py Outdated
# return tf.stack(result)

# fix unknown shape issue when using tf.unstack
idxf = tf.expand_dims(tf.cast(indices, tf.float32), -1)

Unclear why you cast idxf to a float32, and instantiate rg as float32 only to move everything back to int32. May as well keep everything as int32s.

@carusyte (Contributor, Author) replied:

Sorry for the lack of explanation. Some TensorFlow issues showed that certain operations do not support running on the GPU for the int64 type; see #13164, #13163, and such. I went straight to the type-casting workaround without giving it much thought or experimentation at the time. I will discard these casts as soon as I can confirm TensorFlow v1.9 has fixed those issues.
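
For illustration, an all-int32 variant of batch_gather along the lines suggested above. This is a sketch only (the PR kept the float32 casts pending confirmation that the int GPU kernels were fixed), and it assumes indices is rank 2 with a statically known last dimension:

def batch_gather(values, indices):
  """Returns batched `tf.gather` for every row in the input."""
  with tf.name_scope('batch_gather', values=[values, indices]):
    # Pair each index with its row number and use tf.gather_nd, avoiding
    # tf.unstack (which needs a statically known batch size).
    idx = tf.expand_dims(tf.cast(indices, tf.int32), -1)  # [batch, k, 1]
    size = tf.shape(indices)[0]
    rg = tf.range(size, dtype=tf.int32)                   # [batch]
    rg = tf.expand_dims(rg, -1)                           # [batch, 1]
    rg = tf.tile(rg, [1, int(indices.get_shape()[-1])])   # [batch, k]
    rg = tf.expand_dims(rg, -1)                           # [batch, k, 1]
    gidx = tf.concat([rg, idx], -1)                       # [batch, k, 2]
    return tf.gather_nd(values, gidx)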

dnc/util.py Outdated
def batch_gather(values, indices):
  """Returns batched `tf.gather` for every row in the input."""
  with tf.name_scope('batch_gather', values=[values, indices]):
    # unpacked = zip(tf.unstack(values), tf.unstack(indices))

Could you remove these commented lines?

@dm-jrae (Contributor) left a comment

That's great, a few more style nitpicks. Can you confirm that the training script runs without issue?

dnc/util.py Outdated
  return result

def reduce_prod(x, axis, name=None):
  '''

Final nit: can you format the docstring to match the rest of the package's docstring style?

"""Efficient reduce product over axis.

Uses tf.cumprod and tf.gather_nd as a workaround to the poor performance of calculating tf.reduce_prod's gradient on CPU.
"""

dnc/util.py Outdated
  Uses tf.cumprod and tf.gather_nd as a workaround to the poor performance of calculating tf.reduce_prod's gradient
  on CPU.
  '''
  with tf.variable_scope(name or "c_reduce_prod"):

Could you amend this variable scope to a name scope? Preferably replace with this line:

with tf.name_scope(name, 'util_reduce_prod', values=[x]):

dnc/util.py Outdated
  on CPU.
  '''
  with tf.variable_scope(name or "c_reduce_prod"):
    cp=tf.cumprod(x, axis, reverse=True)

Can you add spaces around your assignment operators?

size = tf.shape(cp)[0]
idx1 = ... etc.

You do not need to do this for keyword args however:

dtype=tf.float32 is fine!

dnc/util.py Outdated
dim = int(perm.get_shape()[-1])
size = tf.cast(tf.shape(perm)[0], tf.float32)
delta = tf.cast(tf.shape(perm)[-1], tf.float32)
rg = tf.range(0, size*delta, delta, dtype=tf.float32)

Can you place a space around this multiply operator?

size * delta

@carusyte carusyte force-pushed the fix_unstack_issue branch from 446652e to 3e116d3 on August 6, 2018, 11:30
@carusyte (Contributor, Author) commented Aug 6, 2018

Sure. Please be patient and don't merge yet, as this is still a work in progress. I'm validating my modifications on Google Cloud Platform, so I'll have to push many unfinished changes to GitHub in order to sync them to my cloud compute engine... I'll post a notification in the PR when I'm done. Thanks!

Maybe I should have created this PR from a tag rather than a branch...

@carusyte carusyte closed this Aug 6, 2018
@carusyte carusyte force-pushed the fix_unstack_issue branch from 0649cc2 to 8598342 on August 6, 2018, 12:24
@carusyte (Contributor, Author) commented Aug 6, 2018

It seems I messed up the PR while trying to squash all the minor commits...
Let me create a new one; sorry for the inconvenience.
