TensorFlow: optimizing a sparse_tensor_dense_matmul operation on GPU


Is it possible to optimize a sparse_tensor_dense_matmul operation in TensorFlow on the GPU? I use TensorFlow 1.2.1 with CUDA 8. Here is an example that produces the error:

    import tensorflow as tf

    with tf.device('/gpu:0'):
        # 2x2 sparse matrix with entries at (0, 0) and (1, 1)
        st = tf.SparseTensor(
            tf.constant([[0, 0], [1, 1]], dtype=tf.int64),
            tf.constant([1.2, 3.4], dtype=tf.float32),
            tf.constant([2, 2], dtype=tf.int64)
        )

        v = tf.Variable([[1.0, 0.0], [0.0, 1.0]], dtype=tf.float32)
        st = tf.sparse_tensor_dense_matmul(st, v)
        st = tf.reduce_min(st)
        optimizer = tf.train.AdamOptimizer()
        trainer = optimizer.minimize(st)

    with tf.Session() as sess:
        print(sess.run(trainer))

This results in the following error:

    Traceback (most recent call last):
      File "test_tf3.py", line 18, in <module>
        print(sess.run(trainer))
      File "/media/awork/home/astepochkin/drecs/repo/env/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 895, in run
        run_metadata_ptr)
      File "/media/awork/home/astepochkin/drecs/repo/env/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1124, in _run
        feed_dict_tensor, options, run_metadata)
      File "/media/awork/home/astepochkin/drecs/repo/env/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1321, in _do_run
        options, run_metadata)
      File "/media/awork/home/astepochkin/drecs/repo/env/local/lib/python2.7/site-packages/tensorflow/python/client/session.py", line 1340, in _do_call
        raise type(e)(node_def, op, message)
    tensorflow.python.framework.errors_impl.InvalidArgumentError: Cannot assign a device for operation 'gradients/SparseTensorDenseMatMul/SparseTensorDenseMatMul_grad/strided_slice_1': Could not satisfy explicit device specification '/device:GPU:0' because no supported kernel for GPU devices is available.
         [[Node: gradients/SparseTensorDenseMatMul/SparseTensorDenseMatMul_grad/strided_slice_1 = StridedSlice[Index=DT_INT32, T=DT_INT64, begin_mask=1, ellipsis_mask=0, end_mask=1, new_axis_mask=0, shrink_axis_mask=2, _device="/device:GPU:0"](Const, gradients/SparseTensorDenseMatMul/SparseTensorDenseMatMul_grad/strided_slice_1/stack, gradients/SparseTensorDenseMatMul/SparseTensorDenseMatMul_grad/strided_slice_1/stack_1, gradients/SparseTensorDenseMatMul/SparseTensorDenseMatMul_grad/strided_slice_1/stack_2)]]

It may make sense to disable hard device placement, so that ops without a GPU kernel fall back to the CPU:

    import tensorflow as tf

    with tf.device('/gpu:0'):
        st = tf.SparseTensor(
            tf.constant([[0, 0], [1, 1]], dtype=tf.int64),
            tf.constant([1.2, 3.4], dtype=tf.float32),
            tf.constant([2, 2], dtype=tf.int64)
        )

        v = tf.Variable([[1.0, 0.0], [0.0, 1.0]], dtype=tf.float32)
        st = tf.sparse_tensor_dense_matmul(st, v)
        st = tf.reduce_min(st)
        optimizer = tf.train.AdamOptimizer()
        trainer = optimizer.minimize(st)

    # allow_soft_placement lets TensorFlow move ops that have no GPU kernel to the CPU
    with tf.Session(config=tf.ConfigProto(allow_soft_placement=True)) as sess:
        # Variables (including Adam's slots) must be initialized before training
        sess.run(tf.global_variables_initializer())
        print(sess.run(trainer))

You can also log device placements, which may be useful for figuring out whether the kernels you care about are on the GPU.

There are host memory "fake" GPU kernels registered for int32 strided slice, but not for int64. Feel free to open a pull request / feature request on GitHub to add int64 host memory kernels (effectively copying the int32 versions) if you need/want hard device placement.

For background, strided slice is getting used in the gradient of SparseTensorDenseMatMul, where it operates on the sparse tensor's int64 indices. There's no benefit to running these kinds of indexing operations on the GPU, but there are registered GPU kernels which run on the CPU in order to avoid the kinds of hard device placement bookkeeping issues you've run into.
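To illustrate why index slicing shows up in that gradient, here is a rough pure-Python sketch of the math for C = A @ B with a sparse A stored as (indices, values). It is not TensorFlow's implementation; the function name and the grad_out values are hypothetical. Splitting the int64 index pairs into a row column and a column column is the step that presumably corresponds to the strided_slice ops named in the error.

```python
def sparse_dense_matmul_grads(indices, values, dense, grad_out):
    """Gradients of C = A @ B, with A sparse (COO format) and B dense.

    Hypothetical helper for illustration only.
    """
    # Analogue of indices[:, 0] and indices[:, 1]: pure index bookkeeping
    rows = [idx[0] for idx in indices]
    cols = [idx[1] for idx in indices]
    n_cols = len(dense[0])

    # dL/dvalues[k] = sum_j grad_out[row_k][j] * dense[col_k][j]
    grad_values = [
        sum(grad_out[r][j] * dense[c][j] for j in range(n_cols))
        for r, c in zip(rows, cols)
    ]

    # dL/dB = A^T @ grad_out, accumulated one sparse entry at a time
    grad_dense = [[0.0] * n_cols for _ in dense]
    for r, c, v in zip(rows, cols, values):
        for j in range(n_cols):
            grad_dense[c][j] += v * grad_out[r][j]

    return grad_values, grad_dense


# Sparse and dense operands from the question; grad_out is made up
indices = [[0, 0], [1, 1]]
values = [1.2, 3.4]
dense = [[1.0, 0.0], [0.0, 1.0]]
grad_out = [[1.0, 2.0], [4.0, 8.0]]

grad_values, grad_dense = sparse_dense_matmul_grads(
    indices, values, dense, grad_out)
print(grad_values)
print(grad_dense)
```

The slicing touches only a handful of integers per sparse entry, which is consistent with the point above that there is little to gain from running it on the GPU.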

