According to the softmax function, you need to iterate over all elements in the array, compute the exponential of each element, and divide it by the sum of the exponentials of all elements:
import numpy as np
a = [1, 3, 5]
for i in a:
    # exponential of one element divided by the sum of all exponentials
    print(np.exp(i) / np.sum(np.exp(a)))
0.015876239976466765
0.11731042782619837
0.8668133321973349
However, if the numbers are too large, the exponentials overflow: np.exp of anything above roughly 709 exceeds the largest float64 value and becomes inf, so the division yields nan:
a = [2345, 3456, 6543]
for i in a:
    print(np.exp(i) / np.sum(np.exp(a)))
__main__:2: RuntimeWarning: invalid value encountered in double_scalars
nan
nan
nan
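To see where the nan comes from, here is a minimal sketch (the variable names `big` and `ratio` are just for illustration). It assumes the default float64 type and silences the warnings so only the values are shown:

```python
import numpy as np

# Suppress the overflow / invalid-value warnings for this demonstration only.
with np.errstate(over="ignore", invalid="ignore"):
    big = np.exp(np.float64(2345))  # far beyond the float64 range, so this is inf
    ratio = big / big               # inf / inf is undefined, so this is nan

print(np.isinf(big), np.isnan(ratio))  # True True
```

Every term in the softmax above ends up as inf divided by inf, which is why all three outputs are nan.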
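To avoid this, first shift the values so that the highest value in the array becomes zero, then compute the softmax. For example, to compute the softmax of [1, 3, 5], use [1-5, 3-5, 5-5], which is [-4, -2, 0]. The shift does not change the result, because the common factor exp(-max) cancels between the numerator and the denominator. You may also choose to implement it in a vectorized way (as you intended to do in the question):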
def softmax(x):
    # after the shift the largest exponent is 0, so np.exp cannot overflow
    f = np.exp(x - np.max(x))  # shift values
    return f / f.sum(axis=0)
softmax([1,3,5])
# prints: array([0.01587624, 0.11731043, 0.86681333])
softmax([2345,3456,6543,-6789,-9234])
# prints: array([0., 0., 1., 0., 0.])
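As a quick sanity check that the shift leaves the result unchanged, the sketch below compares the direct definition with the shifted version on a small input, then runs the shifted version on the large input. The names `softmax_naive` and `softmax_shifted` are hypothetical helpers written for this comparison, not part of NumPy:

```python
import numpy as np

def softmax_naive(x):
    # Hypothetical helper: direct definition, overflows for large inputs.
    e = np.exp(np.asarray(x, dtype=np.float64))
    return e / e.sum()

def softmax_shifted(x):
    # Hypothetical helper: same computation after subtracting the maximum.
    x = np.asarray(x, dtype=np.float64)
    e = np.exp(x - x.max())
    return e / e.sum()

small = [1, 3, 5]
# Both versions agree where the naive one does not overflow.
print(np.allclose(softmax_naive(small), softmax_shifted(small)))  # True

large = [2345, 3456, 6543]
result = softmax_shifted(large)
# The shifted version stays finite and still sums to 1.
print(np.isfinite(result).all(), result.sum())  # True 1.0
```

The very negative shifted exponents simply underflow to 0, which is harmless here, whereas the overflow in the naive version destroys the result.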
For more detail, check out the cs231n course page. The section headed "Practical issues: Numeric stability" covers exactly what I'm trying to explain.