Matplotlib/Pandas error using histogram

I'm having trouble making histograms from pandas Series objects, and I can't work out why it fails. The code worked fine before, but now it does not.

Here is a bit of my code (specifically, a pandas series object I'm trying to make a histogram of):

type(dfj2_MARKET1['VSPD2_perc'])

which outputs the result: pandas.core.series.Series

Here's my plotting code:

fig, axes = plt.subplots(1, 7, figsize=(30,4))
axes[0].hist(dfj2_MARKET1['VSPD1_perc'],alpha=0.9, color='blue')
axes[0].grid(True)
axes[0].set_title(MARKET1 + '  5-40 km / h')

Error message:

    AttributeError                            Traceback (most recent call last)
    <ipython-input-75-3810c361db30> in <module>()
      1 fig, axes = plt.subplots(1, 7, figsize=(30,4))
      2 
    ----> 3 axes[1].hist(dfj2_MARKET1['VSPD2_perc'],alpha=0.9, color='blue')
      4 axes[1].grid(True)
      5 axes[1].set_xlabel('Time spent [%]')

    C:\Python27\lib\site-packages\matplotlib\axes.pyc in hist(self, x, bins, range, normed, weights, cumulative, bottom, histtype, align, orientation, rwidth, log, color, label, stacked, **kwargs)
   8322             # this will automatically overwrite bins,
   8323             # so that each histogram uses the same bins
-> 8324             m, bins = np.histogram(x[i], bins, weights=w[i], **hist_kwargs)
   8325             m = m.astype(float) # causes problems later if it's an int
   8326             if mlast is None:

    C:\Python27\lib\site-packages\numpy\lib\function_base.pyc in histogram(a, bins, range, normed, weights, density)
    158         if (mn > mx):
    159             raise AttributeError(
--> 160                 'max must be larger than min in range parameter.')
    161 
    162     if not iterable(bins):

AttributeError: max must be larger than min in range parameter.
Rohrer answered 18/12, 2013 at 11:17 Comment(3)
Hmm, it works for me. Can you show your dataframe? - Harmonics
Hmm, strange: when I do this, I can actually produce a histogram: s = dfj2_MARKET1['VSPD1_perc']; s.hist() - Rohrer
Yes, but then you are using the pandas hist function, not matplotlib's, and it handles e.g. NaNs as expected. See my update. - Intercostal

Among other causes, this error occurs when the Series contains NaN values. Could that be the case?
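A quick way to check is to count the missing values. This is only a minimal sketch: the Series s below is a hypothetical stand-in for dfj2_MARKET1['VSPD2_perc'], and the name nan_count is my own. It also shows why the problem can go unnoticed, since pandas skips NaN in min/max by default while plain NumPy does not.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in for dfj2_MARKET1['VSPD2_perc'] (assumption, not the real data).
s = pd.Series([1.0, 2.0, np.nan, 3.0, np.nan])

# Count the missing entries in the Series.
nan_count = int(s.isna().sum())
print(f"{nan_count} of {len(s)} values are NaN")

# pandas ignores NaN in min/max by default ...
print(s.min(), s.max())

# ... while NumPy propagates it, so the computed range becomes NaN.
print(np.min(s.to_numpy()), np.max(s.to_numpy()))
```

If nan_count is greater than zero, the NaN explanation is at least plausible for your data.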

Matplotlib's hist function does not handle NaNs well. For example:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
s = pd.Series([1,2,3,2,2,3,5,2,3,2,np.nan])
fig, ax = plt.subplots()
ax.hist(s, alpha=0.9, color='blue')

This produces the same error, AttributeError: max must be larger than min in range parameter. One option is to remove the NaNs before plotting. This will work:

ax.hist(s.dropna(), alpha=0.9, color='blue')
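For anyone who wants to try the dropna() approach in isolation, here is a self-contained sketch. It uses the same example values; the Agg backend line and the names clean, counts and bin_edges are my additions so the snippet also runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; leave this out in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

s = pd.Series([1, 2, 3, 2, 2, 3, 5, 2, 3, 2, np.nan])

# Drop the missing values so matplotlib only sees finite numbers.
clean = s.dropna()

fig, ax = plt.subplots()
# ax.hist returns the bin counts, the bin edges and the drawn patches.
counts, bin_edges, _ = ax.hist(clean, alpha=0.9, color="blue")
ax.grid(True)

print(int(counts.sum()), "values plotted out of", len(s))
```

Note that dropna() returns a new Series, so the original data keeps its NaN entries.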

Another option is to use the pandas hist method on your Series and pass axes[0] to the ax keyword:

dfj2_MARKET1['VSPD1_perc'].hist(ax=axes[0], alpha=0.9, color='blue')
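Since the question draws several subplots, the same pattern can be applied in a loop. This is a rough sketch under assumptions: the DataFrame df and its values are made up, only the column names are borrowed from the question, and the Agg backend line is there just so it runs without a display.

```python
import matplotlib
matplotlib.use("Agg")  # non-interactive backend; leave this out in a notebook
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# Made-up data containing NaNs; only the column names come from the question.
df = pd.DataFrame({
    "VSPD1_perc": [1.0, 2.0, np.nan, 4.0],
    "VSPD2_perc": [np.nan, 3.0, 3.5, 5.0],
})

fig, axes = plt.subplots(1, 2, figsize=(8, 4))
for ax, column in zip(axes, df.columns):
    # Series.hist drops NaN values itself before drawing on the given axes.
    df[column].hist(ax=ax, alpha=0.9, color="blue")
    ax.set_title(column)
```

Each subplot then shows only the non-missing values of its column, with no explicit dropna() call needed.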
Intercostal answered 18/12, 2013 at 11:58 Comment(0)

The error is indeed due to NaN values, as explained above. If the column is not numeric, first convert it:

df['column_name'] = pd.to_numeric(df['column_name'])

and then replace the NaN values with a value of your choice:

df['column_name'] = df['column_name'].replace(np.nan, your_value)
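As a runnable sketch of this idea, assuming a column of strings in which some entries cannot be parsed: errors="coerce" is a pandas option that turns unparseable entries into NaN instead of raising. The DataFrame contents and the names n_missing, filled and dropped are made up for illustration.

```python
import numpy as np
import pandas as pd

# Made-up column mixing numeric strings, junk and a missing value.
df = pd.DataFrame({"column_name": ["1.5", "2", "n/a", None, "4"]})

# Convert to numbers; anything that cannot be parsed becomes NaN.
df["column_name"] = pd.to_numeric(df["column_name"], errors="coerce")
n_missing = int(df["column_name"].isna().sum())

# Either replace the NaNs with a chosen value ...
filled = df["column_name"].fillna(0.0)

# ... or drop them, which avoids adding an artificial spike to the histogram.
dropped = df["column_name"].dropna()

print(n_missing, filled.tolist(), dropped.tolist())
```

Filling with a placeholder changes the shape of the histogram, so dropping is often the safer choice when the goal is only to plot.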
Stick answered 28/9, 2018 at 20:34 Comment(0)
