scipy.stats attribute `entropy` for continuous distributions doesn't work manually

Each continuous distribution in scipy.stats comes with an attribute that calculates its differential entropy: .entropy. Some distributions, such as the normal distribution (norm), have a closed-form solution for the entropy; the others have to rely on numerical integration.

Trying to find out which function the .entropy attribute is calling in those cases, I found a function called _entropy in scipy.stats._distn_infrastructure.py that does so with integrate.quad(pdf) (numerical integration).

But when I try to compare the two approaches (the attribute .entropy vs. numerical integration with the function _entropy), the function gives an error:

AttributeError: 'rv_frozen' object has no attribute '_pdf'

Why does the distribution's attribute .entropy calculate fine, but the function _entropy gives an error?

import numpy as np
from scipy import integrate 
from scipy.stats import norm, johnsonsu
from scipy.special import entr

def _entropy(self, *args): #from _distn_infrastructure.py
    def integ(x):
        val = self._pdf(x, *args)
        return entr(val)

    # upper limit is often inf, so suppress warnings when integrating
    # _a, _b = self._get_support(*args)
    _a, _b = -np.inf, np.inf   
    with np.errstate(over='ignore'):
        h = integrate.quad(integ, _a, _b)[0]

    if not np.isnan(h):
        return h
    else:
        # try with different limits if integration problems
        low, upp = self.ppf([1e-10, 1. - 1e-10], *args)
        if np.isinf(_b):
            upper = upp
        else:
            upper = _b
        if np.isinf(_a):
            lower = low
        else:
            lower = _a
    return integrate.quad(integ, lower, upper)[0]

Using the attribute works fine:

print(johnsonsu(a=2.55,b=2.55).entropy())

returns 0.9503703091220894

But the function does not:

print(_entropy(johnsonsu(a=2.55,b=2.55)))

raises AttributeError: 'rv_frozen' object has no attribute '_pdf', even though johnsonsu does have this attribute:

def _pdf(self, x, a, b):
    # johnsonsu.pdf(x, a, b) = b / sqrt(x**2 + 1) *
    #                          phi(a + b * log(x + sqrt(x**2 + 1)))
    x2 = x*x
    trm = _norm_pdf(a + b * np.log(x + np.sqrt(x2+1)))
    return b*1.0/np.sqrt(x2+1.0)*trm

Which function is the attribute .entropy calling then in the case of the johnsonsu?

Halfmast asked 9/1, 2021 at 8:27

Call either johnsonsu(a=2.55, b=2.55).entropy() if you are using a frozen distribution, or johnsonsu.entropy(a=2.55, b=2.55) otherwise.

As for the why: the leading underscore in _entropy means "implementation detail, don't call directly". The longer answer is that a frozen distribution wraps a distribution instance (self.dist) and delegates the calls to _pdf, _pmf, etc. to it. The rv_frozen object itself has no _pdf, which is why your copy of _entropy fails when it receives the frozen object as self.
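To make the wrapping concrete, here is a small sketch that inspects a frozen distribution. It assumes the shape parameters are passed positionally, so that they end up in frozen.args; with keywords (a=..., b=...) they are stored in frozen.kwds instead. The exact class name printed may differ between SciPy versions.

```python
import numpy as np
from scipy.stats import johnsonsu

# Calling the distribution with shape parameters returns a frozen wrapper.
frozen = johnsonsu(2.55, 2.55)

print(type(frozen).__name__)         # class name of the wrapper
print(frozen.args)                   # shape parameters stored on the wrapper
print(hasattr(frozen.dist, "_pdf"))  # the wrapped distribution implements _pdf

# Public methods on the wrapper forward to the wrapped distribution,
# with the stored parameters filled in.
same = np.isclose(frozen.pdf(0.0), johnsonsu.pdf(0.0, 2.55, 2.55))
print(same)
```

The wrapper only stores the parameters and forwards public calls; the private methods such as _pdf live on frozen.dist.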

EDIT: executing johnsonsu(a=2.55, b=2.55) creates a frozen distribution, rv_frozen. Don't do that unless you want to reuse the instance multiple times; just pass the a, b shape parameters as arguments to the entropy function.
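If you still want to reproduce the numerical integration by hand, one option is to pass the unfrozen distribution plus the shape parameters, so that self._pdf resolves. The sketch below is a simplified stand-in for the question's _entropy, not SciPy's actual code: manual_entropy is a hypothetical helper, it omits the retry with narrower limits, and it relies on the private _pdf method, which may change between SciPy versions.

```python
import numpy as np
from scipy import integrate
from scipy.special import entr
from scipy.stats import johnsonsu

def manual_entropy(dist, *shapes):
    """Hypothetical helper: integrate -p(x) * log(p(x)) over the real line."""
    def integrand(x):
        # _pdf is private; it takes x followed by the shape parameters.
        return entr(dist._pdf(x, *shapes))

    # Suppress floating-point warnings from the far tails.
    with np.errstate(all="ignore"):
        return integrate.quad(integrand, -np.inf, np.inf)[0]

# Reference value from the public API.
frozen = johnsonsu(2.55, 2.55)
h_builtin = frozen.entropy()

# Works: the unfrozen distribution has _pdf, and the shapes are passed along.
h_unfrozen = manual_entropy(johnsonsu, 2.55, 2.55)

# Also works: unwrap the frozen object first (shapes were given positionally).
h_from_frozen = manual_entropy(frozen.dist, *frozen.args)

print(h_builtin, h_unfrozen, h_from_frozen)
```

Both manual calls should agree with the value from .entropy() up to integration error. The integration limits are hard-coded to the whole real line, which matches the support of johnsonsu but not that of every distribution.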

Cristiecristin answered 9/1, 2021 at 10:24 Comment(1)
So how can I get the manual _entropy function to work? And is it really what the attribute .entropy is calling? I don't know whether I'm using frozen distributions, since I don't know what those are. All I thought I was doing was estimating johnsonsu. The two code examples you wrote don't help with the distinction between frozen and non-frozen, since they both work just as well as each other.
Halfmast
