Shannon's Entropy on an array containing zeros

Asked 23/4, 2018 at 4:44 Answered 23/4, 2018 at 5:1

I use the following code to return Shannon's Entropy on an array that represents a probability distribution.

import numpy as np

A = np.random.randint(10, size=10)

pA = A / A.sum()
Shannon2 = -np.sum(pA*np.log2(pA))

This works fine if the array doesn't contain any zeros.

Example:

Input: [2 3 3 3 2 1 5 3 3 4]
Output: 3.2240472715

However, if the array does contain zeros, the result is nan.

Example:

Input: [7 6 6 8 8 2 8 3 0 7]
Output: nan

I do get two RuntimeWarnings:

1) RuntimeWarning: divide by zero encountered in log2

2) RuntimeWarning: invalid value encountered in multiply
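
Both warnings trace back to the zero entry: log2(0) evaluates to -inf, and 0 * -inf is nan, which then propagates through np.sum. A minimal sketch reproducing this with the second input above (the variable names logs, terms and zero_idx are only illustrative):

```python
import numpy as np

A = np.array([7, 6, 6, 8, 8, 2, 8, 3, 0, 7])
pA = A / A.sum()

# Silence the two RuntimeWarnings locally so only the values are shown.
with np.errstate(divide="ignore", invalid="ignore"):
    logs = np.log2(pA)   # log2(0) -> -inf for the zero entry
    terms = pA * logs    # 0 * -inf -> nan for that entry

zero_idx = 8  # position of the 0 in A
print(logs[zero_idx], terms[zero_idx], np.sum(terms))
```

A single nan term is enough to make the whole sum nan.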

Is there a way to alter the code so that it handles zeros? I'm just not sure whether removing them completely would influence the result, specifically whether the variation would be greater because of the greater frequency in the distribution.

Awl answered 23/4, 2018 at 4:44 Comment(1)

Removing the zeros in the later part of the calculation does not amount to ignoring the zeroes. The influence of the zero is from the pA = A / A.sum(). The result of A.sum() is smaller due to the zeroes being present. – Encode 23/4, 2018 at 5:17

I think you want to use np.nansum, which treats nans as zero when summing:

A = np.random.randint(10, size=10)
pA = A / A.sum()
Shannon2 = -np.nansum(pA*np.log2(pA))
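
Note that the nansum version should still emit the two RuntimeWarnings, because the nan is created before the sum. A minimal sketch that wraps the same idea in a function and silences the warnings locally, assuming that is what you want; the name shannon_entropy_nansum is hypothetical:

```python
import numpy as np

def shannon_entropy_nansum(counts):
    """Hypothetical helper: entropy in bits, with nan terms counted as 0."""
    counts = np.asarray(counts, dtype=float)
    p = counts / counts.sum()
    # log2(0) and 0 * -inf would otherwise trigger RuntimeWarnings.
    with np.errstate(divide="ignore", invalid="ignore"):
        return -np.nansum(p * np.log2(p))

H = shannon_entropy_nansum([7, 6, 6, 8, 8, 2, 8, 3, 0, 7])
H_no_zero = shannon_entropy_nansum([7, 6, 6, 8, 8, 2, 8, 3, 7])
print(H, H_no_zero)
```

One caveat with this approach: nansum would also hide nan values that come from genuinely bad input, not only those produced by zero entries.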

Sheeran answered 23/4, 2018 at 5:1 Comment(1)

For Python 2.7 this code also needs from __future__ import division to force non-integer division. See: #1268369 – Encode 23/4, 2018 at 5:15

The easiest and most common approach is to ignore the zero probabilities and calculate the Shannon entropy on the remaining values.

Try the following:

import numpy as np
A = np.array([1.0, 2.0, 0.0, 5.0, 0.0, 9.0])
A = A[A != 0]  # keep only non-zero entries; np.array(filter(...)) does not build a numeric array on Python 3
pA = A / A.sum()
Shannon2 = -np.sum(pA * np.log2(pA))
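
On the worry that removing zeros might change the result: a zero adds nothing to A.sum(), so the remaining probabilities are the same whether the zeros are dropped before or after normalising. This matches the usual convention that 0 * log2(0) is taken as 0. A minimal sketch to check this, where the function name shannon_entropy is hypothetical:

```python
import numpy as np

def shannon_entropy(counts):
    """Hypothetical helper: entropy in bits, using the convention 0*log2(0) = 0."""
    counts = np.asarray(counts, dtype=float)
    p = counts / counts.sum()
    p = p[p > 0]  # zero-probability outcomes contribute nothing to the sum
    return -np.sum(p * np.log2(p))

with_zeros = shannon_entropy([1.0, 2.0, 0.0, 5.0, 0.0, 9.0])
without_zeros = shannon_entropy([1.0, 2.0, 5.0, 9.0])
print(with_zeros, without_zeros)
```

Both calls should print the same value, and no RuntimeWarning is raised because log2 is never evaluated at 0.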

Commensal answered 23/4, 2018 at 4:58 Comment(0)
