Python: how to fit a gamma distribution from data?

I have a dataset and I am trying to find out which distribution it follows best.

In the first attempt I tried to fit it with a Rayleigh distribution:

y, x = np.histogram(data, bins=45, normed=True)
param = rayleigh.fit(y) # distribution fitting
# fitted distribution
xx = linspace(0,45,1000)
pdf_fitted = rayleigh.pdf(xx,loc=param[0],scale=param[1])
pdf = rayleigh.pdf(xx,loc=0,scale=8.5)

fig,ax = plt.subplots(figsize=(7,5))
plot(xx,pdf,'r-', lw=5, alpha=0.6, label='rayleigh pdf')
plot(xx,pdf,'k-', label='Data')
plt.bar(x[1:], y)
ax.set_xlabel('Distance, '+r'$x [km]$',size = 15)
ax.set_ylabel('Frequency, '+r'$P(x)$',size=15)
ax.legend(loc='best', frameon=False)

I am trying to do the same with a gamma distribution, without success:

y, x = np.histogram(net1['distance'], bins=45, normed=True)
xx = linspace(0,45,1000)
ag,bg,cg = gamma.fit(y)  
pdf_gamma = gamma.pdf(xx, ag, bg,cg)  
fig,ax = plt.subplots(figsize=(7,5))
# fitted distribution
plot(xx,pdf_gamma,'r-', lw=5, alpha=0.6, label='gamma pdf')
plot(xx,pdf_gamma,'k-')
plt.bar(x[1:], y, label='Data')
ax.set_xlabel('Distance, '+r'$x [km]$',size = 15)
ax.set_ylabel('Frequency, '+r'$P(x)$',size=15)
ax.legend(loc='best', frameon=False)

Underbred answered 5/12, 2016 at 17:10 Comment(5)
What do you get for ag, bg, cg? - Thelen
For scaling the curve. - Underbred
I mean, what are the numbers? - Thelen
ag = 0.3487208935787377, bg = 2.642125282765536e-05, cg = 0.028141488538963526 - Underbred
The answer to #29208940 might be what you need. - Durrace

Unfortunately, scipy.stats.gamma is not well documented.

Suppose you have some "raw" data in the form data = array([a1, a2, a3, ...]), for example the results of an experiment.

You can pass these raw values to the fit method: a, b, c = gamma.fit(data) returns three parameters, the shape, the location (loc) and the scale of the gamma distribution. They are estimated from the raw samples themselves (by maximum likelihood, by default), so the resulting PDF is the curve that should match the normalized histogram of your data.

Judging from the questions online, many people get this wrong: they compute the histogram of their data (the bin heights) and pass that to gamma.fit. This is wrong.

The method gamma.fit expects your raw data, not the histogram of your data. In the question, that means calling gamma.fit(net1['distance']) instead of gamma.fit(y). Hopefully this solves the problem for a few of us.
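As a minimal sketch of the point above (not the asker's actual data): the snippet below draws synthetic samples as a stand-in for net1['distance'], fits the raw samples, and only then builds a normalized histogram for comparison. The true parameters (3.0, 2.5), the sample size and the choice of floc=0 to pin the location at zero are assumptions made for illustration.

```python
import numpy as np
from scipy.stats import gamma

# Synthetic stand-in for the raw distances (assumption: the real data
# would be something like net1['distance']).
rng = np.random.default_rng(0)
data = gamma.rvs(a=3.0, loc=0.0, scale=2.5, size=5000, random_state=rng)

# Fit the raw samples, not the histogram heights.
# floc=0 fixes the location at zero (an assumption that suits
# non-negative quantities such as distances; drop it to fit loc too).
shape, loc, scale = gamma.fit(data, floc=0)

# Build the normalized histogram only for comparison with the fitted PDF.
heights, edges = np.histogram(data, bins=45, density=True)
centers = 0.5 * (edges[:-1] + edges[1:])
pdf_at_centers = gamma.pdf(centers, shape, loc=loc, scale=scale)

print(shape, loc, scale)
```

To plot the comparison, centers and heights can go to plt.bar and centers and pdf_at_centers to plt.plot. Note that np.histogram now takes density=True; the normed=True argument used in the question is no longer available in recent NumPy versions.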


Dianetics answered 24/1, 2019 at 6:17 Comment(0)

My guess is that much of your original data sits at 0, so the fitted shape parameter (alpha) ends up below 1 (about 0.35) and you get a decreasing curve with a singularity at 0. The bar plot does not include zero (it uses x[1:]), so you don't see the huge bar on the left.

Could that be it?
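To illustrate what a shape parameter below 1 does to the curve, here is a small sketch that evaluates the standard gamma PDF (loc=0, scale=1) at a few points approaching zero. The value 0.35 is the fitted shape reported in the comments, rounded; 2.0 and the evaluation points are arbitrary choices for comparison.

```python
import numpy as np
from scipy.stats import gamma

# Points approaching zero from the right (arbitrary illustrative choice).
x = np.array([1e-6, 1e-3, 1e-1, 1.0])

# Shape < 1: the density grows without bound as x -> 0 and decreases in x.
pdf_small_shape = gamma.pdf(x, 0.35)

# Shape > 1: the density goes to 0 as x -> 0 and has an interior peak.
pdf_large_shape = gamma.pdf(x, 2.0)

for xi, p_small, p_large in zip(x, pdf_small_shape, pdf_large_shape):
    print(f"x={xi:g}  shape=0.35: {p_small:.4g}  shape=2: {p_large:.4g}")
```

A fitted shape below 1 therefore always gives the monotonically decreasing curve seen in the question, whatever the loc and scale turn out to be.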

Thelen answered 5/12, 2016 at 20:13 Comment(2)
I double checked and I do not have any zeros in the original dataset. - Underbred
It's really awkward. But I don't have your data to test. Can you do the same fit in R, for instance? It would be a nice quick test to see if you get the same estimates. - Thelen
