I use the following code to compute Shannon's entropy for an array that represents a probability distribution.
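import numpy as np
A = np.random.randint(10, size=10)  # random counts in the range 0..9
pA = A / A.sum()  # normalise the counts to probabilities
Shannon2 = -np.sum(pA*np.log2(pA))  # entropy in bits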
This works fine if the array doesn't contain any zeros.
Example:
Input: [2 3 3 3 2 1 5 3 3 4]
Output: 3.2240472715
However, if the array does contain zeros, the calculation produces nan.
Example:
Input: [7 6 6 8 8 2 8 3 0 7]
Output: nan
I do get two RuntimeWarnings:
1) RuntimeWarning: divide by zero encountered in log2
2) RuntimeWarning: invalid value encountered in multiply
Is there a way to alter the code so that it handles zeros? I'm not sure whether removing them completely would influence the result, specifically whether the variation would be greater due to the greater frequency in the distribution.
pA = A / A.sum(). The result of A.sum() is smaller due to the zeroes being present. – Encode
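One possible way to handle this, sketched under the usual convention that 0 * log2(0) is taken to be 0: the nan arises because np.log2(0) is -inf and 0 * -inf is nan, so the zero probabilities can be masked out before taking the logarithm. A zero count adds nothing to A.sum(), so the probabilities of the other entries stay the same. The function name shannon_entropy is hypothetical, chosen here for illustration only.

```python
import numpy as np

def shannon_entropy(counts):
    """Shannon entropy in bits of an array of counts.

    Assumes the convention 0 * log2(0) = 0, so zero-probability
    entries contribute nothing to the sum.
    """
    counts = np.asarray(counts, dtype=float)
    p = counts / counts.sum()      # normalise counts to probabilities
    nonzero = p[p > 0]             # skip zeros so log2 never sees 0
    return -np.sum(nonzero * np.log2(nonzero))

# The zero-free input from the question and the input that gave nan
print(shannon_entropy([2, 3, 3, 3, 2, 1, 5, 3, 3, 4]))
print(shannon_entropy([7, 6, 6, 8, 8, 2, 8, 3, 0, 7]))
```

With this masking, the first call should reproduce the value from the question, and the second should return a finite number equal to the entropy of the same array with the zero entry left out.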