python plot and powerlaw fit
F

4

11

I have the following list:

[6, 4, 0, 0, 0, 0, 0, 1, 3, 1, 0, 3, 3, 0, 0, 0, 0, 1, 1, 0, 0, 0, 3, 2, 3, 3, 2, 5, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 2, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 2, 0, 0, 0, 2, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 3, 1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 2, 2, 3, 2, 1, 0, 0, 0, 1, 2]

I want to plot the frequency of each value in Python and run a power-law analysis on it.

But I cannot figure out how to plot the list with the frequency on the y-axis and the numbers from the list on the x-axis.

I thought of creating a dict with the frequencies and plotting the values of the dictionary, but that way I cannot put the numbers on the x-axis.

Any advice?

Filariasis answered 19/5, 2013 at 23:16 Comment(0)
V
5

I think you're right about the dictionary:

>>> import matplotlib.pyplot as plt
>>> from collections import Counter
>>> c = Counter([6, 4, 0, 0, 0, 0, 0, 1, 3, 1, 0, 3, 3, 0, 0, 0, 0, 1, 1, 0, 0, 0, 3, 2, 3, 3, 2, 5, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 2, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 2, 0, 0, 0, 2, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 3, 1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 2, 2, 3, 2, 1, 0, 0, 0, 1, 2])
>>> sorted(c.items())
[(0, 50), (1, 30), (2, 9), (3, 8), (4, 1), (5, 1), (6, 1)]
>>> plt.plot(*zip(*sorted(c.items()))
... )
[<matplotlib.lines.Line2D object at 0x36a9990>]
>>> plt.show()

A few pieces here are of interest. zip(*sorted(c.items())) yields something like [(0,1,2,3,4,5,6),(50,30,9,8,1,1,1)] (in Python 3, zip returns an iterator over those two tuples). We can unpack that with the * operator so that plt.plot sees two arguments, (0,1,2,3,4,5,6) and (50,30,9,8,1,1,1), which are used as the x and y values respectively.
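As a small standalone sketch of that unpacking step (the names pairs, xs and ys are only illustrative; the pairs are the sorted counts from the session above):

```python
# Sorted (value, frequency) pairs, as produced by sorted(c.items())
pairs = [(0, 50), (1, 30), (2, 9), (3, 8), (4, 1), (5, 1), (6, 1)]

# zip(*pairs) transposes the pairs into one tuple of values and one of frequencies
xs, ys = zip(*pairs)

print(xs)  # (0, 1, 2, 3, 4, 5, 6)
print(ys)  # (50, 30, 9, 8, 1, 1, 1)
```

Calling plt.plot(xs, ys) is then equivalent to the starred call.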

As for fitting the data, scipy will probably be of some help here. Specifically, have a look at the following examples. (one of the examples even uses a power law).
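If you only want a rough estimate without scipy, one option is an ordinary least-squares line through the log-log points. This is a sketch under assumptions, not what the scipy examples do: fit_power_law is a hypothetical helper, the value 0 is dropped because log(0) is undefined, and a straight-line fit on log-log counts is generally considered a crude estimator of a power-law exponent.

```python
import math

def fit_power_law(xs, ys):
    """Least-squares fit of y = C * x**(-alpha) in log-log space.

    Hypothetical helper for illustration; returns (C, alpha).
    All xs and ys must be strictly positive.
    """
    lx = [math.log(x) for x in xs]
    ly = [math.log(y) for y in ys]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((a - mean_x) ** 2 for a in lx)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # log y = log C - alpha * log x, so C = exp(intercept) and alpha = -slope
    return math.exp(intercept), -slope

# Frequencies taken from the Counter output above
freq = {0: 50, 1: 30, 2: 9, 3: 8, 4: 1, 5: 1, 6: 1}

# Drop the value 0: it cannot be placed on a logarithmic axis
xs = [v for v in sorted(freq) if v > 0]
ys = [freq[v] for v in xs]

C, alpha = fit_power_law(xs, ys)
print("C =", C, "alpha =", alpha)
```

With only six points and several counts of 1 in the tail, treat the result as a sanity check rather than a proper fit.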

Verse answered 19/5, 2013 at 23:22 Comment(1)
I just saw your edits. Thank you. This will probably solve my problem. - Filariasis
C
13

Use the package: powerlaw

import numpy
import powerlaw
d=[6, 4, 0, 0, 0, 0, 0, 1, 3, 1, 0, 3, 3, 0, 0, 0, 0, 1, 1, 0, 0, 0, 3,2,  3, 3, 2, 5, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 2, 1, 0, 1, 0, 0, 0, 0, 1,0, 1, 2, 0, 0, 0, 2, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1,3, 1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 2, 2, 3, 2, 1, 0, 0, 0, 1, 2]
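# Shift by 1 so the zeros become 1; the fit needs strictly positive values
fit = powerlaw.Fit(numpy.array(d) + 1, xmin=1, discrete=True)
# Dashed line: fitted power-law PDF; solid line: empirical PDF
fit.power_law.plot_pdf(color='b', linestyle='--', label='fit pdf')
fit.plot_pdf(color='b')

print('alpha= ', fit.power_law.alpha, '  sigma= ', fit.power_law.sigma)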

alpha= 1.85885487521 sigma= 0.0858854875209


It allows you to plot, fit and analyse the data correctly. It has a special method for fitting power-law distributions to discrete data.

It can be installed with: pip install powerlaw

Caprifig answered 31/3, 2016 at 18:48 Comment(1)
Do you know, by any chance, how to get the scaling factor C? - Heikeheil
A
4
import numpy as np
import matplotlib.pyplot as plt
y = np.bincount([6, 4, 0, 0, 0, 0, 0, 1, 3, 1, 0, 3, 3, 0, 0, 0, 0, 1, 1, 0, 0, 0, 3, 2, 3, 3, 2, 5, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 2, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 2, 0, 0, 0, 2, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 3, 1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 2, 2, 3, 2, 1, 0, 0, 0, 1, 2])
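x = np.nonzero(y)[0]  # the values that actually occur in the list
plt.bar(x, y[x])      # index y by x so positions and heights always match in length
plt.show()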


Apocope answered 19/5, 2013 at 23:28 Comment(0)
S
-1
import matplotlib.pyplot as plt
data = [6, 4, 0, 0, 0, 0, 0, 1, 3, 1, 0, 3, 3, 0, 0, 0, 0, 1, 1, 0, 0, 0, 3, 2, 3, 3, 2, 5, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0, 0, 1, 2, 1, 0, 1, 0, 0, 0, 0, 1, 0, 1, 2, 0, 0, 0, 2, 1, 1, 0, 0, 0, 1, 0, 1, 0, 0, 1, 1, 1, 1, 0, 1, 1, 1, 3, 1, 1, 0, 1, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 2, 2, 3, 2, 1, 0, 0, 0, 1, 2]

plt.hist(data, bins=range(max(data)+2))
plt.show()


Surefire answered 19/5, 2013 at 23:30 Comment(0)
