machine learning - CheckNumerics finds NaNs in "dense_1/kernel/read:0" after training MDN for a while


I am training a mixture density network, and after a while (57 epochs) I get an error about NaN values from tf.add_check_numerics_ops().

The error message is:

dense_1/kernel/read:0 : Tensor had NaN values  [[Node: CheckNumerics_9 = CheckNumerics[T=DT_FLOAT, message="dense_1/kernel/read:0", _device="/job:localhost/replica:0/task:0/gpu:0"](dense_1/kernel/read, ^CheckNumerics_8)]]

If I check the weights of dense_1 using layer.get_weights(), I can see that they are not NaN.

When I try sess.run([graph.get_tensor_by_name('dense_1/kernel/read:0')], feed_dict=stuff), I get an array the size of the weights that is all NaNs.

I don't understand what the read operation is doing. Is there some sort of caching that is having issues?

Details of the network:

(I've tried many combinations of these and still find NaNs, although at different epochs.)

  • 3 hidden layers: 32, 16, 32
  • Nonlinearity = SELU; I've tried tanh, ReLU, ELU, and SELU
  • Gradient clipping
  • Dropout
  • Happens with or without batchnorm
  • Validation error is still improving when the NaNs appear
  • Input: 128 dimensions
  • Output: mixture of 3 beta distributions in each of 64 dimensions
  • Occurs with or without adversarial examples
  • I use eps=1e-7 and clip values to [eps, 1-eps]
  • I use the logsumexp trick for numerical stability
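
For readers unfamiliar with the last two items, here is a minimal sketch of what clipping to [eps, 1-eps] and the logsumexp trick look like for a mixture of beta distributions. It is not the code from the gist: the function names (`clip`, `beta_log_pdf`, `logsumexp`, `mixture_nll`) are hypothetical, it handles a single scalar target, and it runs in Python's float64 rather than the float32 a GPU graph would normally use.

```python
import math

EPS = 1e-7  # the clip value mentioned in the post


def clip(x, lo=EPS, hi=1.0 - EPS):
    """Clamp x into [lo, hi] so that log(x) and log(1 - x) stay finite."""
    return min(max(x, lo), hi)


def beta_log_pdf(x, a, b):
    """Log-density of Beta(a, b) at x, computed entirely in log space."""
    x = clip(x)
    log_norm = math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
    return log_norm + (a - 1.0) * math.log(x) + (b - 1.0) * math.log1p(-x)


def logsumexp(values):
    """log(sum(exp(v))) with the maximum factored out to avoid overflow."""
    m = max(values)
    if math.isinf(m):
        # Avoid computing inf - inf, which would itself produce NaN.
        return m
    return m + math.log(sum(math.exp(v - m) for v in values))


def mixture_nll(x, weights, alphas, betas):
    """Negative log-likelihood of x under a mixture of beta distributions."""
    log_terms = [
        math.log(clip(w)) + beta_log_pdf(x, a, b)
        for w, a, b in zip(weights, alphas, betas)
    ]
    return -logsumexp(log_terms)


# A target exactly on the boundary still gives a finite loss after clipping.
loss_at_boundary = mixture_nll(0.0, [0.2, 0.3, 0.5], [2.0, 5.0, 0.5], [2.0, 1.0, 0.5])
print(loss_at_boundary)
```

Note that this only keeps the forward value finite. It says nothing about whether the gradients of these expressions stay finite, which is a separate question.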

Most of the relevant code can be found here:

https://gist.github.com/marvint/29bbeda2aecee17858e329745881cc7c

This is caused by an unsolved bug in TensorFlow:

https://github.com/tensorflow/tensorflow/issues/2288

I still don't know how the NaN is getting into the gradient, though...
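
The sketch below does not answer where the first NaN comes from. It only illustrates why a whole kernel can end up NaN once a single gradient entry is NaN, even with gradient clipping enabled. It assumes clipping by global norm (the post does not say which kind of clipping is used), and `clip_by_global_norm` here is a simplified pure-Python stand-in written for this example, not TensorFlow's implementation.

```python
import math


def clip_by_global_norm(grads, clip_norm):
    """Rescale a flat list of gradients so their L2 norm is at most clip_norm."""
    global_norm = math.sqrt(sum(g * g for g in grads))
    # A NaN norm fails the <= comparison, so the scale itself becomes NaN.
    scale = 1.0 if global_norm <= clip_norm else clip_norm / global_norm
    return [g * scale for g in grads], global_norm


# Healthy gradients: the norm is 5, so everything is scaled down by 1/5.
clipped_ok, norm_ok = clip_by_global_norm([3.0, 4.0], clip_norm=1.0)

# One bad entry: the norm is NaN, so every clipped gradient is NaN.
clipped_bad, norm_bad = clip_by_global_norm(
    [0.1, -0.2, float("nan"), 0.3], clip_norm=1.0
)

# A plain SGD step then writes NaN into every weight.
weights = [0.5, 0.5, 0.5, 0.5]
learning_rate = 0.01
updated = [w - learning_rate * g for w, g in zip(weights, clipped_bad)]
print(clipped_ok, clipped_bad, updated)
```

Under this assumption, clipping spreads a single NaN to every parameter that shares the norm instead of containing it, which would be consistent with seeing an entire weight array of NaNs.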

