How to have logarithmic bins in a Python histogram
B

5

104

As far as I know, the option log=True in the histogram function only affects the y-axis.

P.hist(d,bins=50,log=True,alpha=0.5,color='b',histtype='step')

I need the bins to be equally spaced in log10. Is there something that can do this?

Bohon answered 28/7, 2011 at 7:55 Comment(1)
You must divide the count in each bin by the bin width if you do so! – Esau
A
158

Use np.logspace() to create a geometric sequence and pass it to the bins parameter, then set the x-axis to a log scale.

import pylab as pl
import numpy as np

data = np.random.normal(size=10000)
pl.hist(data, bins=np.logspace(np.log10(0.1),np.log10(1.0), 50))
pl.gca().set_xscale("log")
pl.show()


Ahner answered 28/7, 2011 at 8:37 Comment(5)
Note that np.logspace(0.1,1.0,...) will create a range from 10**0.1 to 10**1.0, not from 0.1 to 1.0. – Sapers
It should be np.logspace(np.log10(0.1),np.log10(1.0),50). – Misguided
See my answer for how to use bins='auto'. – Goolsby
@AndreHolzner @OrangeSherbet One can use np.geomspace to specify endpoints directly. – Gwennie
Note that the photo doesn't match the code. The limits on the x-axis should be from 10**-1 to 10**0. – Upton
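The point raised in the comments above can be checked numerically. This is a minimal sketch; the variable names are illustrative and the endpoints 0.1 and 1.0 are taken from the answer's code.

```python
import numpy as np

# np.logspace takes exponents, so the endpoints must be given as log10 values.
edges_logspace = np.logspace(np.log10(0.1), np.log10(1.0), 50)

# np.geomspace takes the endpoints themselves and yields the same edges.
edges_geomspace = np.geomspace(0.1, 1.0, 50)

# Passing the raw endpoints to logspace spans 10**0.1 .. 10**1.0 instead.
edges_wrong = np.logspace(0.1, 1.0, 50)

print(edges_logspace[0], edges_logspace[-1])
print(edges_wrong[0], edges_wrong[-1])
```

Either of the first two arrays can be passed as bins to hist.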
L
28

The most direct way is to just compute the log10 of the limits, compute linearly spaced bins, and then convert back by raising to the power of 10, as below:

import pylab as pl
import numpy as np

data = np.random.normal(size=10000)

MIN, MAX = .01, 10.0

pl.figure()
pl.hist(data, bins = 10 ** np.linspace(np.log10(MIN), np.log10(MAX), 50))
pl.gca().set_xscale("log")
pl.show()

log10 spaced bins

Lozar answered 4/8, 2014 at 0:52 Comment(0)
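If the limits are not known in advance, they could be taken from the data instead of hard-coding MIN and MAX. The sketch below assumes that zero and negative values should simply be left out, since they have no logarithm; log_bin_edges is a hypothetical helper, not part of NumPy or matplotlib.

```python
import numpy as np

def log_bin_edges(values, n_bins=50):
    """Hypothetical helper: log-spaced bin edges spanning the positive values."""
    values = np.asarray(values, dtype=float)
    positive = values[values > 0]          # log is undefined for values <= 0
    if positive.size == 0:
        raise ValueError("need at least one positive value")
    # n_bins bins require n_bins + 1 edges
    return np.geomspace(positive.min(), positive.max(), n_bins + 1)

rng = np.random.default_rng(0)             # fixed seed, for illustration only
data = rng.normal(size=10_000)             # roughly half the samples are <= 0

edges = log_bin_edges(data, n_bins=50)
counts, _ = np.histogram(data, bins=edges)

# Samples outside the edges (here, mostly the non-positive ones) are not counted.
n_dropped = data.size - counts.sum()
print(n_dropped)
```

The resulting edges can be passed as bins to hist in the same way as the fixed-limit version.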
G
18

The following code shows how to use bins='auto' with a log scale.

import numpy as np
import matplotlib.pyplot as plt

data = 10**np.random.normal(size=500)

_, bins = np.histogram(np.log10(data + 1), bins='auto')
plt.hist(data, bins=10**bins);
plt.gca().set_xscale("log")

chart

Goolsby answered 21/4, 2018 at 11:45 Comment(1)
Why did you add "+1" (in np.log10(data + 1))? I see that it regularizes the case of log(0), but doesn't this create a problem when the binning has to represent data < 1? – Unpopular
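To illustrate the concern in this comment: with the +1 offset the lowest bin edge cannot fall below 1, so samples smaller than 1 land outside the bins. A minimal sketch of a variant without the offset follows, assuming the data are strictly positive (as they are for 10**normal); the variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)               # fixed seed, for illustration only
data = 10 ** rng.normal(size=500)            # strictly positive, many values < 1

# Choose the bin edges automatically in log space, then map them back.
log_edges = np.histogram_bin_edges(np.log10(data), bins="auto")
edges = 10 ** log_edges

# With the +1 offset, log10(data + 1) > 0, so every edge is at least about 1.
edges_offset = 10 ** np.histogram_bin_edges(np.log10(data + 1), bins="auto")

counts, _ = np.histogram(data, bins=edges)
print(edges[0], edges_offset[0])
```

If the data can contain zeros, they would need separate handling (for example, filtering them out first), which this sketch does not attempt.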
J
1

In addition to what was stated, performing this on pandas dataframes works as well:

some_column_hist = dataframe['some_column'].plot(bins=np.logspace(-2, np.log10(max_value), 100), kind='hist', loglog=True, xlim=(1e-2, max_value))

A word of caution: the bins may need normalizing. Each bin is wider than the previous one, so the count in each bin must be divided by its width to normalize the frequencies before plotting. Neither my solution nor HYRY's accounts for this.

Source: https://arxiv.org/pdf/cond-mat/0412004.pdf
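The normalization described above could be sketched as follows. The sample data, seed, and variable names are assumptions for illustration; the division by bin width is checked against NumPy's own density=True option.

```python
import numpy as np

rng = np.random.default_rng(0)                 # fixed seed, for illustration only
data = 10 ** rng.normal(size=10_000)           # strictly positive sample data

edges = np.geomspace(data.min(), data.max(), 51)   # 51 edges give 50 log-spaced bins
counts, _ = np.histogram(data, bins=edges)

# Divide each count by its bin width (and by the total) to get a density.
widths = np.diff(edges)
density = counts / (counts.sum() * widths)

# np.histogram performs the same normalization when density=True.
density_np, _ = np.histogram(data, bins=edges, density=True)
print(np.allclose(density, density_np))
```

Passing density=True to hist together with the log-spaced bins applies the same width normalization when plotting.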

Jackhammer answered 26/10, 2016 at 5:37 Comment(0)
F
0

A variation of the proposed answers for the case where we need to include zero or even negative values:

import numpy as np
import matplotlib.pyplot as plt

data = np.random.normal(size=10000)
cutoff = 0.01
bins = np.logspace(np.log10(cutoff),np.log10(1.0), 50)
bins = np.concatenate((-bins[::-1], bins))
plt.figure()
plt.hist(data, bins=bins)
plt.xscale("symlog", linthresh=cutoff)
plt.show()


Fitly answered 17/4, 2024 at 12:11 Comment(0)
