How to get the mode of distribution in scipy.stats

The scipy.stats library has functions to find the mean and median of a fitted distribution but not mode.

If I have the parameters of a distribution after fitting to data, how can I find the mode of the fitted distribution?

Cipango answered 9/1, 2020 at 11:25 Comment(1)
@MayowaAyodele: Why do you refer to this same post (#59663420) in your comment? :-) - Dollhouse

If I understand you correctly, you want the mode of a fitted distribution rather than the mode of the data itself. This can be done in the following 3 steps.

Step 1: generate a dataset from a distribution

import numpy as np
from scipy import stats
from scipy.optimize import minimize

# generate normal data with mean 0 and variance 1
# (random_state fixes the seed so the run is reproducible)
data = stats.norm.rvs(loc=0, scale=1, size=100, random_state=0)
data[0:5]

Output:

array([1.76405235, 0.40015721, 0.97873798, 2.2408932 , 1.86755799])

Step 2: fit the parameters

# fit the parameters of norm distribution
params = stats.norm.fit(data)
params

Output:

(0.059808015534485, 1.0078822447165796)

Note that stats.norm has 2 parameters, loc and scale. Other distributions in scipy.stats take different parameters (shape parameters, if any, come before loc and scale), so it is convenient to keep the fitted parameters in a tuple and unpack it in the next step.
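To illustrate how the fitted tuple changes between distributions, here is a minimal sketch. The sample is generated only for this illustration and is not the data from the question.

```python
from scipy import stats

# Hypothetical sample, generated here only for illustration
sample = stats.gamma.rvs(a=2.0, loc=0.0, scale=3.0, size=500, random_state=0)

# norm has no shape parameters, so fit returns (loc, scale)
norm_params = stats.norm.fit(sample)

# gamma has one shape parameter `a`, so fit returns (a, loc, scale)
gamma_params = stats.gamma.fit(sample)

# `shapes` names the shape parameters (None when there are none)
print(stats.norm.shapes, len(norm_params))
print(stats.gamma.shapes, len(gamma_params))

# Freezing the distribution keeps the fitted parameters attached to it
fitted = stats.gamma(*gamma_params)
print(fitted.mean(), fitted.median())
```

Unpacking with * works for any of these tuples because the order returned by fit matches the positional order that pdf, ppf and the frozen constructor expect.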

Step 3: get the mode (the maximum of the density function) of the fitted distribution

# continuous case: minimize the negative pdf to find its maximum
def your_density(x):
    return -stats.norm.pdf(x, *params)  # unpack the fitted (loc, scale)
minimize(your_density, 0).x

Output:

0.05980794

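Note that the mode of a normal distribution equals its mean, so the result matches the fitted loc. That is a property of the normal distribution, not of the method; for skewed distributions the mode and the mean differ.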
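If you would rather not pick a starting point by hand, one possible variant is a bounded scalar search between two quantiles of the fitted distribution. This is only a sketch: the helper name continuous_mode and the quantile cut-offs are my own choices, and it assumes the density is unimodal with its peak inside the searched interval (for a density that peaks at the edge of its support, such as the exponential, it will only return a point near the lower bound).

```python
from scipy import stats
from scipy.optimize import minimize_scalar

def continuous_mode(frozen_dist, lower_q=1e-3, upper_q=1 - 1e-3):
    """Numerically locate the pdf maximum of a frozen continuous distribution.

    Assumes a unimodal density; the search is bounded by two quantiles.
    """
    lo, hi = frozen_dist.ppf([lower_q, upper_q])
    result = minimize_scalar(
        lambda x: -frozen_dist.pdf(x),  # maximize pdf by minimizing its negative
        bounds=(lo, hi),
        method="bounded",
    )
    return result.x

# Parameters taken from the fit output shown in Step 2
fitted_norm = stats.norm(0.059808015534485, 1.0078822447165796)
mode_norm = continuous_mode(fitted_norm)

# Skewed example: the gamma mode is (a - 1) * scale when loc = 0 and a >= 1
mode_gamma = continuous_mode(stats.gamma(3.0, scale=2.0))

print(mode_norm, mode_gamma)
```

Bounding the search by quantiles keeps the optimizer inside the region where the density is non-zero, which avoids the flat-objective problem that a poorly chosen starting point can cause.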

One more thing: scipy treats continuous and discrete distributions differently (they derive from different base classes, rv_continuous and rv_discrete). For a discrete distribution you can maximize the pmf over a grid of values instead.

## discrete dist, example for poisson
import numpy as np
x = np.arange(0, 100)  # the range of x has to be specified
# find the x value that maximizes the pmf
# (for mu = 2 the values 1 and 2 tie mathematically)
x[stats.poisson.pmf(x, mu=2).argmax()]

Output:

1
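The grid above needs a range chosen by hand. One hedged way around that is to let the distribution's own quantile function (ppf) suggest the bounds. The helper name discrete_mode and the tail probability are assumptions of this sketch, and it assumes an integer-valued distribution whose extreme quantiles are finite.

```python
import numpy as np
from scipy import stats

def discrete_mode(frozen_dist, tail=1e-9):
    """Grid-search the pmf maximum of a frozen integer-valued distribution.

    The grid spans the quantiles at `tail` and `1 - tail`, so no range
    has to be supplied by the caller.
    """
    lo = int(frozen_dist.ppf(tail))
    hi = int(frozen_dist.ppf(1 - tail))
    k = np.arange(lo, hi + 1)
    # argmax returns the first maximum, so ties resolve to the smaller value
    return k[np.argmax(frozen_dist.pmf(k))]

print(discrete_mode(stats.poisson(mu=3.5)))
print(discrete_mode(stats.binom(n=10, p=0.3)))
```

Because the whole grid is evaluated, this does not rely on the pmf being unimodal, only on the mode lying between the two quantiles.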

You can try it with your own data and distributions!

Zebadiah answered 9/1, 2020 at 11:39 Comment(7)
Thanks for the detailed answer! I understand the logic you have used here, but I am not sure why, in my case, the pdf function always returns 0. I have a gamma distribution fitted to data. Other functions like ppf, mean and var return the correct values, but pdf returns 0. And if I try to minimize it, the solution returned is the initial starting point of the optimization. Not sure what's going wrong here. - Cipango
@AdnanTamimi show me your data and code; I think you may be misusing the .pdf function. - Zebadiah
The parameters of the distribution:
p = [1.0903919789648953, 186586.34341665, 102313.74542487558]
from scipy.stats import gamma
def your_density(x):
    return -gamma.pdf(x, *p)
minimize(your_density, 0).x
- Cipango
Unable to post the data here due to the character limit. - Cipango
Your x should lie inside the support of the distribution: fit returns (a, loc, scale), so the second value, 186586, is loc, and x0 should be larger than 186586. - Zebadiah
This only works for discrete distributions if you know the possible range of values a priori. How do you conduct a search if your distribution is unknown, and might be unbounded or non-integer? It would be nice to have a general method. - Friedafriedberg
In parametric statistics, we assume the data follow some distribution and only the parameters of that distribution are unknown. Besides parametric statistics, there are many nonparametric methods that can estimate the population median from samples. The simplest one uses the sample median to estimate the population median. - Zebadiah
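The gamma case from the comments can be sketched as follows. Since p unpacks as (a, loc, scale), the density is zero for x below loc, so a search started at 0 sees a flat objective. The density values are also very small at this scale (the pdf is proportional to 1/scale), which may let a gradient-based optimizer stop at its starting point even inside the support, so this sketch uses a bounded scalar search between two quantiles instead. The quantile cut-offs are my own choice, and the closed-form gamma mode loc + (a - 1) * scale (valid for a >= 1) is included only as a cross-check.

```python
from scipy import stats
from scipy.optimize import minimize_scalar

# Parameters quoted in the comments, in the order (a, loc, scale)
p = [1.0903919789648953, 186586.34341665, 102313.74542487558]
a, loc, scale = p
fitted = stats.gamma(a, loc=loc, scale=scale)

# x = 0 lies below loc, outside the support, so the density there is zero
print(fitted.pdf(0.0))

# Search only where the density is non-zero: between two quantiles
lo, hi = fitted.ppf([1e-3, 1 - 1e-3])
result = minimize_scalar(lambda x: -fitted.pdf(x), bounds=(lo, hi), method="bounded")
numeric_mode = result.x

# Closed-form mode of a gamma distribution with shape a >= 1
analytic_mode = loc + (a - 1.0) * scale

print(numeric_mode, analytic_mode)
```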
