I have a TensorFlow multiclass classifier that is generating nan or inf while computing probabilities using tf.nn.softmax. See the following snippet (logits is of shape batch_size x 6, since I have 6 classes and the output is one-hot encoded). batch_size is 1024.
# Passes, so the incoming logits are finite.
logits = tf.debugging.check_numerics(logits, message='bad logits', name=None)
probabilities = tf.nn.softmax(logits=logits, name='Softmax')
# Fails here: probabilities contain nan or inf.
probabilities = tf.debugging.check_numerics(probabilities, message='bad probabilities', name=None)
The classifier fails on the last statement, as it finds nan or inf in probabilities. The logits are clean; otherwise the first statement would have failed.
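For context, tf.debugging.check_numerics raises an InvalidArgumentError when its input contains nan or inf. A toy reproduction of that failure mode, separate from my actual model:

import tensorflow as tf  # 1.15.x

with tf.Session() as s:
    bad = tf.constant([[0.5, float('nan')]])
    checked = tf.debugging.check_numerics(bad, message='bad probabilities')
    # Raises tf.errors.InvalidArgumentError with a message along the lines of
    # "bad probabilities : Tensor had NaN values" (wording from memory).
    s.run(checked)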
From what I read about tf.nn.softmax, it can handle very large and very small values in logits. I have verified this in interactive mode:
>>> with tf.Session() as s:
...     a = tf.constant([[1000, 10], [-100, -200], [3, 4.0]])
...     sm = tf.nn.softmax(logits=a, name='Softmax')
...     print(a.eval())
...     print(sm.eval())
...
[[1000.   10.]
 [-100. -200.]
 [   3.    4.]]
[[1.         0.        ]
 [1.         0.        ]
 [0.26894143 0.7310586 ]]
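As far as I understand (my assumption about the implementation, not something I have confirmed in the TF source), this robustness comes from the standard max-subtraction trick. A minimal NumPy sketch of that idea:

import numpy as np

def stable_softmax(x):
    # Subtract the per-row max so the largest argument to exp() is 0;
    # exp() then never overflows, and the result is mathematically unchanged.
    z = x - np.max(x, axis=-1, keepdims=True)
    e = np.exp(z)
    return e / np.sum(e, axis=-1, keepdims=True)

a = np.array([[1000, 10], [-100, -200], [3, 4.0]], dtype=np.float32)
print(stable_softmax(a))  # rows with extreme logits come out as ~[1, 0], matching TF above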
I then tried clipping the values in logits, and the whole thing now works. See the modified snippet below.
logits = tf.debugging.check_numerics(logits, message='logits', name=None)
# Clamp every logit into [-15, 15] before the softmax.
safe_logits = tf.clip_by_value(logits, -15.0, 15.0)
probabilities = tf.nn.softmax(logits=safe_logits, name='Softmax')
# No longer fails: probabilities are finite.
probabilities = tf.debugging.check_numerics(probabilities, message='bad probabilities', name=None)
In the second statement, I am clipping the values in logits to [-15, 15], and that somehow prevents nan/inf in the softmax computation. So I was able to fix the issue at hand.
However, I still don't understand why this clipping works. (I should mention that clipping between -20 and 20 does not work, and the model fails with nan or inf in probabilities.)
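For what it's worth, plain float32 overflow in exp() can't be the whole story at these magnitudes, which is part of my confusion. A quick NumPy sanity check (my own arithmetic, not taken from the model):

import numpy as np

print(np.exp(np.float32(15.0)))   # ~3.27e6
print(np.exp(np.float32(20.0)))   # ~4.85e8
print(np.finfo(np.float32).max)   # ~3.40e38, so exp(20) is nowhere near overflow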
Could someone help me understand why this is the case?
I am using TensorFlow 1.15.0, running on a 64-bit instance.